A Higher Level API to BigML.io, the public BigML API

BigMLer - A Higher-Level API to BigML’s API

BigMLer makes BigML even easier.

BigMLer wraps BigML’s API Python bindings to offer a high-level command-line script to easily create and publish datasets and models, create ensembles, make local predictions from multiple models, and simplify many other machine learning tasks.

BigMLer is open sourced under the Apache License, Version 2.0.

Support

Please report problems and bugs to our BigML.io issue tracker.

Discussions about the different bindings take place in the general BigML mailing list, or you can join us in our Campfire chatroom.

Requirements

Python 2.7 is currently supported by BigMLer.

BigMLer requires bigml 0.7.0 or higher.

BigMLer Installation

To install the latest stable release with pip:

$ pip install bigmler

You can also install the development version of bigmler directly from the Git repository:

$ pip install -e git://github.com/bigmlcom/bigmler.git#egg=bigmler

BigML Authentication

All requests to BigML.io must be authenticated using your username and API key, and they are always transmitted over HTTPS.

The BigML module looks for your username and API key in the environment variables BIGML_USERNAME and BIGML_API_KEY, respectively. You can add the following lines to your .bashrc or .bash_profile to set those variables automatically when you log in:

export BIGML_USERNAME=myusername
export BIGML_API_KEY=ae579e7e53fb9abd646a6ff8aa99d4afe83ac291

Otherwise, you can pass your credentials directly when running the BigMLer script:

bigmler --train data/iris.csv --username myusername --api_key ae579e7e53fb9abd646a6ff8aa99d4afe83ac291
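
The lookup described above can be sketched in a few lines of standalone Python 3. The function name resolve_credentials is hypothetical, and the rule that explicitly passed values take precedence over the environment is an assumption made for illustration; this is not BigMLer's own code.

```python
import os


def resolve_credentials(username=None, api_key=None, environ=os.environ):
    """Return (username, api_key), preferring explicit values over the environment.

    Hypothetical helper: the precedence order is an assumption, not
    documented BigMLer behaviour.
    """
    username = username or environ.get("BIGML_USERNAME")
    api_key = api_key or environ.get("BIGML_API_KEY")

    # Collect the names of whatever is still missing for a clear error message.
    missing = [
        name
        for name, value in (("BIGML_USERNAME", username), ("BIGML_API_KEY", api_key))
        if not value
    ]
    if missing:
        raise ValueError("missing credentials: " + ", ".join(missing))
    return username, api_key
```

Calling resolve_credentials() with no arguments reads both values from the current environment and raises ValueError if either one is unset, which makes a misconfigured shell easy to spot before any request is sent.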

BigML Development Mode

You can also instruct BigMLer to work in BigML’s Sandbox environment by using the --dev flag:

bigmler --train data/iris.csv --dev

With the development flag, you can run tasks under 1 MB without spending any of your BigML credits.

Using BigMLer

To run BigMLer you can use the console script directly. The --help option will describe all the available options:

bigmler --help

Alternatively, you can run the bigmler script with the Python interpreter:

python bigmler.py --help

This will display the full list of optional arguments. You can read a brief explanation for each option below.

Quick Start

Let’s see some basic usage examples. Check the installation and authentication sections in BigMLer on Read the Docs if you are not familiar with BigML.

Basics

You can create a new model with just:

bigmler --train data/iris.csv

If you check your dashboard at BigML, you will see a new source, dataset, and model. Isn’t it magic?

You can generate predictions for a test set using:

bigmler --train data/iris.csv --test data/test_iris.csv

You can also specify a file name to save the newly created predictions:

bigmler --train data/iris.csv --test data/test_iris.csv --output predictions

If you do not specify the path to an output file, BigMLer will auto-generate one for you under a new directory named after the current date and time (e.g., MonNov1212_174715/predictions.csv).
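
The directory names in these examples are consistent with a weekday, month, day, two-digit year, and time stamp. As a minimal Python 3 sketch, the pattern can be reproduced with strftime. The format string below is inferred from the example names in this document, not taken from BigMLer's source, and the function name default_output_path is hypothetical.

```python
from datetime import datetime

# Assumed format, inferred from names such as MonNov1212_174715:
# abbreviated weekday, abbreviated month, day, two-digit year,
# an underscore, then hours, minutes, and seconds.
# %a and %b are locale dependent; the examples use English names.
DIRECTORY_FORMAT = "%a%b%d%y_%H%M%S"


def default_output_path(now=None, file_name="predictions.csv"):
    """Build a path like MonNov1212_174715/predictions.csv for the given time."""
    now = now or datetime.now()
    return "%s/%s" % (now.strftime(DIRECTORY_FORMAT), file_name)
```

Knowing the pattern is mostly useful for locating the output of an earlier run from a script, for example by sorting candidate directories by their parsed time stamp.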

A different objective field (the field that you want to predict) can be selected using:

bigmler --train data/iris.csv --test data/test_iris.csv --objective 'sepal length'

If you do not explicitly specify an objective field, BigML will default to the last column in your dataset.
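
To see which field that default would select before uploading a file, you can inspect the header row locally. The following Python 3 sketch assumes a CSV file whose first row holds the column names; the sample header is an assumed layout for the iris data, and default_objective is a hypothetical helper, not part of BigMLer.

```python
import csv
import io


def default_objective(csv_text):
    """Return the name of the last column in the header row of csv_text."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return header[-1]


# Assumed iris-style layout, used only for illustration.
sample = (
    "sepal length,sepal width,petal length,petal width,species\n"
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
)
print(default_objective(sample))  # prints: species
```

If the last column is not the one you want to predict, pass the desired field name with --objective as shown above.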

BigMLer will try to use the locale of the model both to create a new source (if the --train flag is used) and to interpret test data. If that fails, it will try en_US.UTF-8 or English_United States.1252 and print a warning message. If you want to change this behaviour, you can specify your preferred locale:

bigmler --train data/iris.csv --test data/test_iris.csv \
--locale "English_United States.1252"
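
Whether a given locale name is usable depends on the operating system, which is why a fallback chain is needed at all. The Python 3 sketch below shows one way to find the first locale from a list that the current system accepts. It illustrates the idea only and is not BigMLer's implementation; the function name first_usable_locale and the choice of the LC_NUMERIC category are assumptions.

```python
import locale

# The two fallback names mentioned above (Unix style and Windows style).
FALLBACK_LOCALES = ["en_US.UTF-8", "English_United States.1252"]


def first_usable_locale(candidates, category=locale.LC_NUMERIC):
    """Return the first locale name the system accepts, or None if none works."""
    saved = locale.setlocale(category)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(category, name)
            except locale.Error:
                continue  # not installed on this system, try the next one
            return name
        return None
    finally:
        locale.setlocale(category, saved)  # leave the process locale untouched
```

For example, first_usable_locale(["de_DE.UTF-8"] + FALLBACK_LOCALES) returns the preferred name when it is installed and otherwise falls through to the defaults.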

If you check your working directory, you will see that BigMLer creates a file with the ids of the models that have been generated (e.g., FriNov0912_223645/models). This file is handy if you later want to use those model ids to generate local predictions. BigMLer also creates a file with the id of the generated dataset (e.g., TueNov1312_003451/dataset) and another file, bigmler_sessions, that summarizes the steps taken in each session. You can also store a copy of every created or retrieved resource in your output directory (e.g., TueNov1312_003451/model_50c23e5e035d07305a00004f) by setting the --store flag.
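
A script that wants to reuse the generated ids can read them back from the models file. This Python 3 sketch assumes the file lists one id per line, which this document does not state explicitly, and read_model_ids is a hypothetical helper name.

```python
def read_model_ids(path):
    """Return the non-empty lines of the file at path, stripped of whitespace.

    Assumes one model id per line; check the actual file layout first.
    """
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

For example, read_model_ids("FriNov0912_223645/models") would return the list of ids stored by that session, provided the one-id-per-line assumption holds.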

Prior Versions Compatibility Issues

BigMLer accepts flags written with an underscore as the word separator, such as --clear_logs, for compatibility with prior versions. The --field-names flag is also accepted, although the more complete --field-attributes flag is preferred. The --stat_pruning and --no_stat_pruning flags are discontinued; their effects can be achieved by setting the --pruning flag to statistical or no-pruning, respectively.
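
If you maintain older scripts, the rules above can be applied mechanically to an argument list. The Python 3 sketch below rewrites underscore-separated flag names to the hyphenated form and replaces the two discontinued pruning flags with their --pruning equivalents. The function name modernize_args is hypothetical, and the code is an illustration of the mapping described here, not BigMLer's own argument handling.

```python
# Discontinued flags and the replacement described in this section.
DEPRECATED = {
    "--stat-pruning": ["--pruning", "statistical"],
    "--no-stat-pruning": ["--pruning", "no-pruning"],
}


def modernize_args(args):
    """Return a copy of args with hyphenated flag names and current pruning flags."""
    result = []
    for arg in args:
        if arg.startswith("--"):
            # Only touch the flag name, never a value attached with "=".
            name, sep, value = arg.partition("=")
            arg = name.replace("_", "-") + sep + value
            if arg in DEPRECATED:
                result.extend(DEPRECATED[arg])
                continue
        result.append(arg)
    return result
```

Values such as data/test_iris.csv are left untouched because only arguments that start with two hyphens are treated as flags.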

Additional Information

For additional information, see the full documentation for BigMLer on Read the Docs.

History

0.3.1 (2013-05-14)

  • Adding delete for ensembles

  • Creating ensembles when the number of models is greater than one

  • Remote predictions using ensembles

0.3.0 (2013-04-30)

  • Adding cross-validation feature

  • Using user locale to create new resources in BigML

  • Adding --ensemble flag to use ensembles in predictions and evaluations

0.2.1 (2013-03-03)

  • Deep refactoring of main resources management

  • Fixing bug in batch_predict for no headers test sets

  • Fixing bug for wide dataset’s models that need query-string to retrieve all fields

  • Fixing bug in test asserts to catch subprocess raise

  • Adding default missing tokens to models

  • Adding stdin input for --train flag

  • Fixing bug when reading descriptions in --field-attributes

  • Refactoring to get status from api function

  • Adding confidence to combined predictions

0.2.0 (2013-01-21)

  • Evaluations management

  • Console monitoring of process advance

  • Resume option

  • User defaults

  • Refactoring to improve readability

0.1.4 (2012-12-21)

  • Improved locale management.

  • Adds progressive handling for large numbers of models.

  • More options in field attributes update feature.

  • New flag to combine local existing predictions.

  • More methods in local predictions: plurality, confidence weighted.

0.1.3 (2012-12-06)

  • New flag for locale settings configuration.

  • Filtering only finished resources.

0.1.2 (2012-12-06)

  • Fix to ensure windows compatibility.

0.1.1 (2012-11-07)

  • Initial release.
