Machine Learning Project Framework Generator

Whisk ML Project Framework

Tying together the tools required to release a machine learning model can be daunting. Whisk makes building and releasing ML models easy and fun. It sets up a logical, flexible project structure that produces reproducible results and lets you release your model to the world without becoming a software engineer.

Whisk doesn't lock you into a particular ML framework or require you to learn yet another ML packaging API. Instead, it leverages the Python ecosystem, which is available to any project structured in a Pythonic way. Whisk does the structuring while you focus on the data science.

Read more about our beliefs.

Quickstart

Replace demo with the name of your ML project in the examples below.

Create the project:

pip install whisk
echo "Generate the directory structure, set up a venv, initialize a Git repo, and more."
whisk create demo
cd demo
source venv/bin/activate

Check out the end-to-end notebook example:

jupyter-notebook notebooks/example.ipynb

The notebook shows how to save your trained model to disk, use the saved model to generate predictions, and load Python functions and classes from the project's src directory for a cleaner notebook. It provides the guide rails for your own ML project.
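The save, load, and predict pattern the notebook walks through can be sketched in plain Python. This is only an illustration, not the notebook's code: `MeanModel` is a hypothetical placeholder model, and storing its state as JSON is an assumption made to keep the example dependency-free. A real project would use whatever serialization its ML framework provides.

```python
import json
import tempfile
from pathlib import Path


class MeanModel:
    """Hypothetical placeholder model: predicts the mean training target for every row."""

    def __init__(self, mean=0.0):
        self.mean = mean

    def fit(self, X, y):
        # "Training" here is just remembering the average target value.
        self.mean = sum(y) / len(y)
        return self

    def predict(self, X):
        # One prediction per input row.
        return [self.mean for _ in X]

    def save(self, path):
        # Persist the learned state to disk (JSON is an assumption of this sketch).
        Path(path).write_text(json.dumps({"mean": self.mean}))

    @classmethod
    def load(cls, path):
        # Rebuild the model from the saved state.
        return cls(**json.loads(Path(path).read_text()))


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.json"
    MeanModel().fit([[0, 1], [2, 3]], [1.0, 3.0]).save(artifact)

    # Later (for example in a CLI or web service): load the artifact and predict.
    restored = MeanModel.load(artifact)
    predictions = restored.predict([[0, 1], [2, 3]])
```

Keeping a class like this in the project's src directory, rather than in the notebook itself, is what lets the notebook, the CLI, and the web service share the same model code.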

There's a placeholder model you can invoke immediately from the command line:

whisk predict "[[0,1],[2,3]]"

...and a ready-to-go Flask web service:

whisk app start
curl --location --request POST 'http://localhost:5000/predict' \
--header 'Content-Type: application/json' \
--data-raw '{"data":[[0, 1], [2, 3]]}'
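The same request can be sent from Python with only the standard library. This is a minimal sketch that mirrors the curl command above. The helper names `build_predict_request` and `predict` are hypothetical, and it assumes that the service responds with JSON, which the command above does not confirm.

```python
import json
import urllib.request

# Same URL as the curl example; assumes `whisk app start` is running locally.
PREDICT_URL = "http://localhost:5000/predict"


def build_predict_request(rows, url=PREDICT_URL):
    """Build the POST request that the curl command sends."""
    body = json.dumps({"data": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, url=PREDICT_URL):
    """Send the request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(build_predict_request(rows, url)) as response:
        return json.loads(response.read().decode("utf-8"))


# With the web service running:
#   predict([[0, 1], [2, 3]])
```

Building the request separately from sending it makes the payload easy to inspect without a running server.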

Deploy the web service to Heroku (a free account is fine):

whisk app create demo-[INSERT YOUR NAME]

Create a Python package containing your model and share it with the world:

whisk model build
echo "Installing the generated Python package"
pip install dist/demo-0.1.0.tar.gz

Invoke the model via the CLI:

demo predict "[[0,1],[2,3]]"

...and within Python code:

from demo import model
model.predict([[0,1],[2,3]])

Beliefs

  • A notebook isn't enough - A data science notebook is where experimentation starts, but you can't create a reproducible, collaborative ML project with just a *.ipynb file.
  • A reproducible, collaborative project is a solved problem for classical software - We don't need to re-invent the wheel for machine learning projects. Instead, we need guide rails to help data scientists structure projects without forcing them to also become software engineers.
  • Python already has a good package manager - We don't need overly abstracted solutions to package a trained ML model. A properly structured ML project makes it easy to use pip for packaging a model, making it easy for anyone to benefit from your work.
  • Version control is a requirement - You can't have a reproducible project if the code and training data aren't in version control.
  • Docker is a heavyweight and fragile option for solving reproducibility - When we explicitly declare and isolate dependencies, we don't need to rely on the implicit existence of packages installed in a Docker container. Docker also slows down development: repeatedly restarting Docker containers is far slower than doing the same in pure Python. Python already has solid native tools for this problem.
  • Optimize for debugging - 90% of writing software is fixing bugs. It should be easy to debug your model logic locally.
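
The point about explicitly declaring and isolating dependencies can be illustrated with Python's standard library alone. The sketch below is not Whisk's own setup code. The helper `create_isolated_env` and the pinned requirement are hypothetical, and `with_pip=False` is used only to keep the example fast (a real project would want pip available in the environment).

```python
import tempfile
import venv
from pathlib import Path


def create_isolated_env(project_dir, requirements):
    """Declare dependencies in requirements.txt and create a venv next to it."""
    project_dir = Path(project_dir)
    project_dir.mkdir(parents=True, exist_ok=True)

    # Explicit declaration: every dependency is written down and pinned.
    (project_dir / "requirements.txt").write_text("\n".join(requirements) + "\n")

    # Isolation: a dedicated environment instead of system-wide packages.
    env_dir = project_dir / "venv"
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    # "numpy==1.19.0" is an illustrative pin, not a Whisk requirement.
    env_dir = create_isolated_env(Path(tmp) / "demo", ["numpy==1.19.0"])
    env_created = (env_dir / "pyvenv.cfg").exists()
```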

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template. The project template is heavily inspired by Cookiecutter Data Science.
