A software package for containerizing HPC applications and managing job workflows

Project description

BEE is a workflow orchestration system designed to build containerized HPC applications and orchestrate workflows across HPC and cloud systems. BEE has adopted the Common Workflow Language (CWL) for specifying workflows. Complex scientific workflows specified in CWL are managed and visualized through a graph database, giving the user the ability to monitor the state of each task in the workflow. BEE runs jobs using the workload scheduler (e.g. Slurm or LSF) of the HPC system on which the tasks are specified to run.

BEE workflows can be archived for provenance and reproducibility. BEE can orchestrate workflows with containerized applications or those built locally on a system. However, there are advantages to containerizing an application.

A container is a package of code (usually binaries) and all of that code’s dependencies (libraries, etc.). Once built, this container can be run on many different platforms.

Containers provide many benefits:

  • Users can choose their own software stack (libraries, compilers, etc.) and not be bound by the currently installed environment on any one machine.

  • Codes can be run portably across numerous platforms; all dependencies will be downloaded and installed at run time.

  • Entire workflow environments can be built into one or more containers. A user can include visualization and analysis tools along with the application. They will all work together as the application runs.

  • Provenance and history can be tracked by storing containers in a historical repository. At any time, an older container can be rerun (all of its dependencies are stored with it). Execution is repeatable and interactions between software components can be tracked.

  • Functional testing can be performed on smaller, dissimilar machines; there is no real need to test on the actual HPC platform (performance testing, of course, requires the target hardware).

BEE Sites

Contact

To report bugs and problems, or for suggestions and other general questions regarding the BEE project, email bee-dev@lanl.gov.

Contributors:

Concept and Design Contributors

  • James Ahrens

  • Allen McPherson

  • Li-Ta Lo

  • Louis Vernon

Contributing

The BEE project adheres to the style guidelines specified in setup.cfg. Before attempting to commit and push changes, please install our pre-commit git hooks by running one of the following commands in the project root:

If using git --version >= 2.9:

git config core.hooksPath .githooks

Otherwise:

cp .githooks/* .git/hooks/

Using these git hooks will ensure your contributions adhere to style guidelines required for contribution. You will need to repeat these steps for every BEE repo you clone.
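
The choice between the two commands above can be automated. The following is a minimal sketch, not part of BEE itself: the helper names version_at_least and install_hooks are hypothetical, and it assumes that git --version prints the version number as its third whitespace-separated field (as in "git version 2.39.2").

```shell
#!/bin/sh
# Sketch only: version_at_least and install_hooks are hypothetical helpers,
# not commands shipped with BEE.

# Succeeds (returns 0) when dotted version $1 is >= dotted version $2.
# Only the major and minor components are compared.
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    if [ "$have_major" -gt "$want_major" ]; then return 0; fi
    if [ "$have_major" -lt "$want_major" ]; then return 1; fi
    [ "$have_minor" -ge "$want_minor" ]
}

# Installs the hooks with whichever method the installed git supports.
# Must be called from the project root.
install_hooks() {
    # Assumption: output looks like "git version 2.39.2".
    git_version=$(git --version | awk '{print $3}')
    if version_at_least "$git_version" "2.9"; then
        git config core.hooksPath .githooks
    else
        cp .githooks/* .git/hooks/
    fi
}
```

After sourcing this file, calling install_hooks from the project root applies the core.hooksPath setting on git 2.9 or newer and falls back to copying the hook files otherwise.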

Release

This software has been approved for open source release and has been assigned BEE C17056.

Publications

  • “An HPC-Container Based Continuous Integration Tool for Detecting Scaling and Performance Issues in HPC Applications”, IEEE Transactions on Services Computing, 2024, DOI: 10.1109/TSC.2023.3337662

  • “BEE Orchestrator: Running Complex Scientific Workflows on Multiple Systems”, HiPC, 2021, DOI: 10.1109/HiPC53243.2021.00052

  • “BeeSwarm: Enabling Parallel Scaling Performance Measurement in Continuous Integration for HPC Applications”, ASE, 2021, DOI: 10.1109/ASE51524.2021.9678805

  • “BeeFlow: A Workflow Management System for In Situ Processing across HPC and Cloud Systems”, ICDCS, 2018, DOI: 10.1109/ICDCS.2018.00103

  • “Build and execution environment (BEE): an encapsulated environment enabling HPC applications running everywhere”, IEEE BigData, 2018, DOI: 10.1109/BigData.2018.8622572

Download files

Source Distribution

hpc_beeflow-0.1.8.tar.gz (633.5 kB)

Built Distribution

hpc_beeflow-0.1.8-py3-none-any.whl (759.8 kB)
