A library for parallel execution of Python code in the Ufora runtime

pyfora - Compiled, parallel python
==================================

pyfora is the client package for Ufora_ - a compiled, automatically parallel Python for data science
and numerical computing.

Ufora achieves speed and scale by reasoning about your python code to compile
it to machine code (so it's fast) and find parallelism in it (so that it scales). The Ufora
runtime is fully fault tolerant, and handles all the details of data
management and task scheduling transparently to the user.

The Ufora runtime is invoked by enclosing code in a "ufora.remote" block. Code
and objects are shipped to a Ufora cluster and executed in parallel across
those machines. Results are then injected back into the host python
environment either as native python objects, or as handles (in case the
objects are very large). This allows you to pick the subset of your code that
will benefit from running in Ufora - the remainder can run in your regular
python environment.

For all of this to work properly, Ufora places one major restriction on
the code that it runs: it must be "pure", meaning that it cannot modify data
structures or have side effects. This restriction allows the Ufora runtime to
aggressively reorder calculations, which is crucial for
parallelism, and allows it to perform compile-time
optimizations that would not be possible otherwise. For more on the subset of python
that Ufora supports, see `python restrictions`_.

.. _python restrictions: https://ufora.github.io/ufora/documentation/python-restrictions.html
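
As a rough illustration of the purity restriction, the sketch below contrasts a function that mutates its argument with one that returns a new value. It is plain Python that runs locally and does not use any pyfora API; the function names are hypothetical and chosen only for this example.

```python
def scale_in_place(values, factor):
    # Impure: overwrites elements of the caller's list (a side effect).
    for i in range(len(values)):
        values[i] = values[i] * factor
    return values


def scale_pure(values, factor):
    # Pure: reads its inputs and returns a new list, leaving them untouched.
    return [v * factor for v in values]


data = [1, 2, 3]
scaled = scale_pure(data, 10)
# `data` is unchanged here; `scaled` is a separate list.
```

Code written in the second style does not depend on the order in which other calculations run, which is the property the restriction is meant to guarantee.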


Installation
============

The pyfora client is a pure python package and can be installed by running:

.. code::

    pip install pyfora


Getting Started with Ufora
==========================

The Ufora backend is available as a Docker image that can be run locally on your machine, or on a
cluster of machines on a local network or in the cloud.

- `Getting started with local Ufora`_
- `Getting started with Ufora on AWS`_
- `Running Ufora on a local cluster`_


.. _Getting started with local Ufora: https://ufora.github.io/ufora/tutorials/getting-started-local.html
.. _Getting started with Ufora on AWS: https://ufora.github.io/ufora/tutorials/getting-started-aws.html
.. _Running Ufora on a local cluster: https://ufora.github.io/ufora/tutorials/getting-started-cluster.html


Credits
-------

Pyfora is developed and maintained by the Ufora_ team. Find us on Github_.


- `Distribute`_

.. _Distribute: http://pypi.python.org/pypi/distribute

.. _Ufora: https://ufora.github.io/ufora
.. _Github: https://github.com/ufora/ufora


Pyfora News
===========

0.2
----

*Release date:* TBD

* [bug #127]: Correctly propagating communication errors up to Executor.
* [feature]: Support @property decorator.
* [feature]: Improved download performance of large lists of small objects.
* [bug #122]: Wrong exception type from `list + non_list`.
* [bug #120]: Failure when trying to convert a list of mapped functions.
* [bug #119]: Can't convert bound instance methods.
* [bug #116]: Builtin "reduce" function is not parallelizable when applied over lists, xrange, etc.
* [bug #115]: Fixing __getitem__ for strings and tuples
* [bug #111]: Wrong exception when accessing unbound variables.
* [bug #110]: Incorrect conversion of class functions in user-defined classes.
* [bug #109]: list __getitem__ doesn't throw with step 0
* [feature]: Implement `map` builtin
* [feature]: Support `isinstance` on user-defined classes.
* [feature]: Add versioning scheme to socket.io protocol.
* [feature]: Add support for the python REPL.
* [bug #90]: Improved error message for unbound free variables.
* [bug #89]: Ctrl+C doesn't break out of `with` block.
* [bug #68]: Disallow `return` statements in pyfora `with` blocks.
* [bug #67]: tuple unpacking doesn't work
* [feature]: basic linear regression on data-frames
* [feature]: basic CSV parsing
* [feature]: basic data-frames
* [bug #59]: `sequence(0)` not iterable
* [bug #47]: int/float mismatch in `**` operator
* [bug #21]: certain python variables "survive" longer than fora values


*Known Issues:*

* `def` order is important in non-module function definitions (closures). If functions
  `g()` and `h()` are defined inside of function `f` and `g()` calls `h()`, then `def h():` must
  appear BEFORE `def g():`.
  This also implies that mutually-recursive functions are only possible at module or class level.

* Class static methods cannot be used as values. They can be invoked, but it's not possible
  to pass a class static method as an argument to another function.

* Named argument calls are not supported. If you have a function `def f(x):...` you can call it as
  `f(42)` but you can't use `f(x=42)`.

* Keyword arguments are not supported.

* Class members can only be initialized inside of `__init__`. If `__init__` calls another function
  that initializes members, those members will not be seen by pyfora.

* `return` statements are not allowed in `__init__()`.

* Wrong exception when calling a function with too many arguments.

* @classmethod decorator is not supported.

* No support for `*args`.

* `assert` is not implemented.

* Bad error message when using `self` inside of `__init__` for things other than setting or getting
  members. For example, calling `str(self)` inside of `__init__` results in
  "PythonToForaConversionError: An internal error occurred: we didn't provide a definition for the following variables: ['self'].
  Most likely, there is a mismatch between our analysis of the python code and the generated FORA code underneath. Please file a bug report."

* parsing.csv ignores the first row if there are leading spaces. For example, the following code
  leaves out the first row::

      let s = """
      A,B,C
      1,2,3
      4,5,6
      7,8,9
      """
      let res = parsing.csv(s)
      res

* No support for object inheritance.
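
The sketch below shows one way code might be arranged to sidestep several of the issues listed above: nested `def` order, positional-only calls, members assigned directly in `__init__`, and no `*args`. It is plain Python with hypothetical names (`f`, `Point`, `total`) and has not been verified against a Ufora cluster.

```python
def f(x):
    # Nested helpers: h is defined BEFORE g because g calls h.
    def h(y):
        return y + 1

    def g(y):
        return h(y) * 2

    return g(x)


class Point:
    def __init__(self, x, y):
        # All members are assigned directly in __init__ (no helper methods),
        # and __init__ contains no return statement.
        self.x = x
        self.y = y

    def norm_squared(self):
        return self.x * self.x + self.y * self.y


def total(values):
    # Takes a single list argument instead of relying on *args.
    return sum(values)


# Positional calls only: f(20), not f(x=20).
result = f(20)
p = Point(3, 4)
```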


0.1
-----

*Release date:* Nov-06-2015

* Initial release of pyfora!
* Includes support for core language features and builtin types.
* Some support for builtin functions like all, any, sum, etc.
* pyfora.aws module and pyfora_aws script help set up a Ufora cluster in EC2.
