A distributed hybrid (multi-node), heterogeneous (CPU + multi-GPU) computing library. Requires the CUDA toolkit, OpenMP, and OpenMPI.

Project description

Distributed_compy is a distributed computing library that offers multi-threaded, heterogeneous (CPU + multi-GPU), and multi-node (hybrid cluster: more than one machine, each with a CPU and GPUs) paradigms to leverage the processing power of a cluster. Cython generates the glue code for the core C/C++ functions and provides the wrappers that make them callable from Python. Requires numpy, CUDA toolkit>=2.0, OpenMP, and OpenMPI. Note: this library does not use the popular mpi4py library.

Features:

  • Get, set, and configure the bandwidths of the local node or the entire cluster, either from a supplied numpy array or from binary data files
  • Code generator that writes the temporary binary data files or Python files to be executed on each node
  • Execute the mpirun command from the master node using the default environment variable or a configurable hostfile
  • Reduction sum at increasing levels of scaling: naive Python sum, multi-threaded reduction sum, multi-GPU reduction sum, heterogeneous reduction sum, and hybrid heterogeneous reduction sum
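
To illustrate how configured bandwidths could drive a heterogeneous reduction sum, here is a minimal sketch in plain numpy. It is not the Distributed_compy API: the function names `split_by_bandwidth` and `weighted_reduction_sum` are hypothetical, and Python threads stand in for the CPU and GPU devices. The assumption is that each device receives a share of the input proportional to its bandwidth, reduces that share locally, and the partial sums are then combined.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def split_by_bandwidth(n_items, bandwidths):
    """Return chunk boundaries proportional to each device's bandwidth.

    Hypothetical helper for illustration; not part of Distributed_compy.
    """
    weights = np.asarray(bandwidths, dtype=np.float64)
    if weights.ndim != 1 or weights.size == 0 or np.any(weights <= 0):
        raise ValueError("bandwidths must be a non-empty 1-D list of positive numbers")
    # The cumulative fraction of total bandwidth gives each chunk's end index.
    fractions = np.cumsum(weights) / weights.sum()
    ends = np.rint(fractions * n_items).astype(np.int64)
    ends[-1] = n_items  # guard against rounding on the final boundary
    return np.concatenate(([0], ends))


def weighted_reduction_sum(data, bandwidths):
    """Sum `data` by reducing bandwidth-sized chunks in parallel workers."""
    flat = np.asarray(data).ravel()
    bounds = split_by_bandwidth(flat.size, bandwidths)
    chunks = [flat[bounds[i]:bounds[i + 1]] for i in range(len(bounds) - 1)]
    # One worker per "device"; each worker produces a partial sum.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = list(pool.map(np.sum, chunks))
    # Final reduction over the partial sums.
    return sum(partials)


if __name__ == "__main__":
    values = np.arange(1000)
    # Assumed relative bandwidths: two slower devices and one faster device.
    print(weighted_reduction_sum(values, [1.0, 1.0, 2.0]))
```

In this sketch the device with twice the bandwidth is handed twice as many elements, which is the general idea behind balancing work across unequal devices. How the library itself partitions work is not specified in this description.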

Additional features such as other reduction operations, dot product, matrix multiplication, image processing kernels, neural networks, and finite element method functions are under consideration for future releases.

Download files

Source Distribution

distributed_compy-1.0.50.tar.gz (825.2 kB)