Simplified interfaces for assignments on Mechanical Turk.

Project description

Dead simple Python interfaces for Amazon Mechanical Turk.

Installation

Use pip to install the package from PyPI:

$ pip install turkleton

Features

The existing Python APIs for Mechanical Turk are thin wrappers at best - we can do better.

Turkleton aims to use Python's expressiveness to make common Mechanical Turk workflows simpler. It is still under active development; the main features are:

  • Simple interface for defining tasks from pre-built layouts.

  • Simple interface for defining schema of assignment results.

  • Easily upload tasks in batches.

  • Easily download and validate assignments.

Examples

In turkleton there are three objects to be aware of: Tasks, HITs, and Assignments. A Task is a template from which HITs are created. A HIT corresponds to a HIT in the Amazon Mechanical Turk API and represents an uploaded Task. Assignments are contained within HITs: an individual Assignment represents the set of answers submitted by a single worker, and a HIT can have many Assignments.
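The relationship between the three objects can be sketched with plain dataclasses. This is only a conceptual model: the class and field names below are illustrative assumptions, not turkleton's own classes, and nothing is sent to Mechanical Turk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assignment:
    """One worker's submitted answers for a HIT."""
    worker_id: str
    answers: Dict[str, str]


@dataclass
class HIT:
    """An uploaded Task; collects the Assignments submitted for it."""
    hit_id: str
    assignments: List[Assignment] = field(default_factory=list)


@dataclass
class Task:
    """A template; each upload of it yields a new HIT."""
    title: str
    reward: float

    def upload(self, hit_id: str) -> HIT:
        # Stand-in for a real upload: just builds an empty HIT locally.
        return HIT(hit_id=hit_id)


task = Task(title='Guess How Old From Picture', reward=0.25)
hit = task.upload(hit_id='HIT-1')

# Two different workers answer the same HIT.
hit.assignments.append(Assignment('worker-a', {'guess': '29'}))
hit.assignments.append(Assignment('worker-b', {'guess': '31'}))
print(len(hit.assignments))
```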

Setting Up Your Connection

Turkleton uses a per-process global connection. It should be initialized before you attempt to upload or download anything. You can initialize it like so:

from turkleton import connection
connection.setup(AWS_ACCESS_KEY, AWS_SECRET_ACCESS_KEY)

That’s it!
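To keep credentials out of source code, one option is to read them from environment variables before calling connection.setup. The sketch below is an assumption about how you might wire this up: the helper load_credentials is hypothetical, and the variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY follow the usual AWS convention rather than anything turkleton requires.

```python
import os


def load_credentials(env=os.environ):
    """Return (access_key, secret_key) read from a mapping of environment variables."""
    try:
        return env['AWS_ACCESS_KEY_ID'], env['AWS_SECRET_ACCESS_KEY']
    except KeyError as missing:
        # Fail early with the name of the variable that was not set.
        raise RuntimeError(
            'Missing environment variable: {}'.format(missing.args[0])
        )


# Usage with turkleton (requires the package and real credentials):
#   from turkleton import connection
#   connection.setup(*load_credentials())
```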

Creating A Task And Uploading It

To define a HIT you create a Task representing the template of the assignment you want a worker to complete. For example:

import datetime

from turkleton.assignment import task

class MyTask(task.BaseTask):
    __layout_id__ = 'MY LAYOUT ID'
    __reward__ = 0.25
    __title__ = 'Guess How Old From Picture'
    __description__ = 'Look at a picture and guess how old the person is.'
    __keywords__ = ['image', 'categorization']
    __time_per_assignment__ = datetime.timedelta(minutes=5)

Here we’ve created a Task from an existing layout. Now that we’ve defined our task we can easily upload HITs by filling out the layout parameters:

# Named my_task so it does not shadow the imported `task` module.
my_task = MyTask(image_url='http://test.com/img.png', first_guess='29')
hit = my_task.upload(batch_id='1234')

This will create a new HIT from the task template and upload it to Mechanical Turk. The optional batch_id parameter sets the HIT's annotation to an arbitrary string, which you can later use to retrieve HITs in batches.
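Because the batch id is an arbitrary string, you may want to generate one per run instead of hard-coding '1234'. A minimal sketch, assuming a prefix-date-random format; the helper make_batch_id and its format are hypothetical and not part of turkleton:

```python
import datetime
import uuid


def make_batch_id(prefix):
    """Build a batch id of the form '<prefix>-<YYYYMMDD>-<8 hex chars>'."""
    day = datetime.date.today().strftime('%Y%m%d')
    suffix = uuid.uuid4().hex[:8]  # random part keeps same-day batches distinct
    return '{}-{}-{}'.format(prefix, day, suffix)


batch_id = make_batch_id('age-guess')
print(batch_id)
```

Store the generated id somewhere durable, since you need the same string again to retrieve the batch.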

You can upload many tasks in a loop easily as follows:

for image_url in all_image_urls:
    MyTask.create_and_upload(
        image_url=image_url, first_guess='29', batch_id='1234'
    )

If you'd rather not pass the batch id on every call, you can use the batched_upload context manager instead:

with task.batched_upload(batch_id='1234'):
    for image_url in all_image_urls:
        MyTask.create_and_upload(image_url=image_url, first_guess='29')

Downloading The Results

To download results for a HIT you first need to define an assignment. The assignment defines what values are expected and their types. These are used to automatically parse answers to the various questions:

from turkleton.assignment import assignment
from turkleton.assignment import answers

class MyAssignment(assignment.BaseAssignment):
    categories = answers.MultiChoiceAnswer(question_name='Categories')
    notes = answers.TextAnswer(question_name='AdditionalNotes', default='')
    does_not_match_any = answers.BooleanAnswer(
        question_name='DoesNotMatchAnyCategories', default=False
    )

You can then download all of the reviewable HITs in a given batch as follows:

from turkleton.assignment import hit
reviewable_hits = hit.get_reviewable_by_batch_id('1234')

Each HIT may then have multiple assignments associated with it. You can download the assignments, review them, and then dispose of the HIT as follows:

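for each_hit in reviewable_hits:
    for each in MyAssignment.get_by_hit_id(each_hit.hit_id):
        print('{} - {} - {}'.format(each.categories, each.notes, each.does_not_match_any))
        # is_valid_assignment is your own validation function.
        if is_valid_assignment(each):
            each.accept('Good job!')
        else:
            each.reject('Assignment does not follow instructions.')
    # Dispose of the HIT once all of its assignments have been reviewed.
    each_hit.dispose()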
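The is_valid_assignment function used above is not provided by turkleton; you write it yourself. A minimal sketch of one possible rule, assuming categories parses to a list-like value and does_not_match_any to a boolean. It is exercised here with stand-in objects rather than real assignments:

```python
from types import SimpleNamespace


def is_valid_assignment(assignment):
    """Hypothetical rule: the worker picked categories or flagged no match, not both."""
    picked_any = bool(assignment.categories)
    flagged_none = bool(assignment.does_not_match_any)
    return picked_any != flagged_none


# Stand-in objects with the same attribute names as MyAssignment.
good = SimpleNamespace(categories=['portrait'], notes='', does_not_match_any=False)
bad = SimpleNamespace(categories=[], notes='', does_not_match_any=False)
print(is_valid_assignment(good), is_valid_assignment(bad))
```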

History

1.2.1 (2015-06-15)

  • Bugfix, error when retrieving hits by batch id

1.2.0 (2015-06-11)

  • More answer types

  • Bugfix where answers retained single value

1.1.0 (2015-06-06)

  • Improvements to connection management

  • More convenient syntax for uploading batches

1.0.0 (2015-06-05)

  • Major version revisions and updates

0.1.0 (2015-01-11)

  • First release on PyPI.
