mycloud 0.26
Work distribution for small clusters.
mycloud
Leverage small clusters of machines to increase your productivity.
mycloud requires no prior setup; if you can SSH to your machines, it works out of the box. mycloud currently exports a simple MapReduce API with several common input formats, and adding support for your own format is straightforward.
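Since SSH access is the only stated prerequisite, it can help to confirm that each machine accepts a non-interactive login before starting a cluster. The following is a minimal sketch of such a check and is not part of mycloud; the helper names `ssh_check_command` and `can_ssh` are hypothetical, and it assumes an OpenSSH client is on your PATH.

```python
import subprocess

def ssh_check_command(host):
    """Build a non-interactive ssh command that runs `true` on host."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ['ssh', '-o', 'BatchMode=yes', '-o', 'ConnectTimeout=5',
            host, 'true']

def can_ssh(host):
    """Return True if the ssh command exits with status 0."""
    try:
        return subprocess.call(ssh_check_command(host)) == 0
    except OSError:
        # The ssh binary could not be started (for example, not installed).
        return False
```

Calling `can_ssh('machine1')` for each host you plan to list gives a quick indication of whether key-based login is set up.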
usage
Starting your cluster:
import mycloud

# list each machine and the number of cores to use
cluster = mycloud.Cluster([('machine1', 4),
                           ('machine2', 4)],
                          tmp_prefix='/path/to/store/results')
Invoke a function over a list of inputs:
result = cluster.map(my_expensive_function, range(1000))
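To try out a function locally before sending it to a cluster, a single-machine stand-in can be useful. The sketch below uses only the standard library and is not mycloud's implementation; `local_map` and `square` are hypothetical names, and it assumes that results should come back as a list in input order, which this page does not confirm for `cluster.map`.

```python
from concurrent.futures import ThreadPoolExecutor

def local_map(fn, inputs, workers=4):
    """Apply fn to every input using a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, inputs))

def square(x):
    # Placeholder for an expensive function.
    return x * x

result = local_map(square, range(10))
```

Threads are used here only to keep the sketch simple; CPU-bound work on one machine would more likely use processes.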
Use the MapReduce interface to easily handle processing of larger datasets:
from mycloud.resource import CSV
input_desc = [CSV('/path/to/my_input_%d.csv' % i) for i in range(100)]
output_desc = [CSV('/path/to/my_output_file.csv')]
def map_identity(k, v, output):
    output(k, int(v[0]))

def reduce_sum(k, values, output):
    output(k, sum(values))
mr = mycloud.mapreduce.MapReduce(cluster,
                                 map_identity,
                                 reduce_sum,
                                 input_desc,
                                 output_desc)
result = mr.run()
for k, v in result[0].reader():
    print('%s %s' % (k, v))
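The data flow of the map and reduce functions above can be sketched in memory, without a cluster. This is an illustration of the general MapReduce pattern, not mycloud's implementation: `run_local_mapreduce` is a hypothetical helper, and the shape of the sample records (a key paired with a list of string fields) is an assumption about what a CSV input yields.

```python
from collections import defaultdict

def run_local_mapreduce(records, mapper, reducer):
    """Run mapper over (key, value) records, group by key, then reduce."""
    grouped = defaultdict(list)

    def emit_mapped(k, v):
        # Collect every mapped value under its key.
        grouped[k].append(v)

    for k, v in records:
        mapper(k, v, emit_mapped)

    results = []

    def emit_reduced(k, v):
        results.append((k, v))

    # Reduce each key once, in sorted key order for repeatable output.
    for k in sorted(grouped):
        reducer(k, grouped[k], emit_reduced)
    return results

def map_identity(k, v, output):
    output(k, int(v[0]))

def reduce_sum(k, values, output):
    output(k, sum(values))

# Assumed record shape: (key, list of string fields).
records = [('a', ['1']), ('b', ['2']), ('a', ['3'])]
totals = run_local_mapreduce(records, map_identity, reduce_sum)
```

Here `totals` holds one `(key, sum)` pair per distinct key, which mirrors what `reduce_sum` emits for each group.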
| File | Type | Py Version | Uploaded on | Size | # downloads |
|---|---|---|---|---|---|
| mycloud-0.26.tar.gz (md5) | Source | | 2011-12-02 | 10KB | 195 |
- Author: Russell Power
- Home Page: http://rjpower.org/mycloud
- License: BSD
Categories
- Development Status :: 3 - Alpha
- Intended Audience :: Developers
- Intended Audience :: System Administrators
- License :: OSI Approved :: BSD License
- Operating System :: POSIX
- Programming Language :: Python :: 2.5
- Programming Language :: Python :: 2.6
- Programming Language :: Python :: 2.7
- Programming Language :: Python :: 3
- Programming Language :: Python :: 3.0
- Programming Language :: Python :: 3.1
- Programming Language :: Python :: 3.2
- Topic :: Software Development :: Libraries
- Topic :: System :: Clustering
- Topic :: System :: Distributed Computing
- Package Index Owner: rjpower
- DOAP record: mycloud-0.26.xml
