Multi-Agent Reinforcement Learning with JAX

These details have not been verified by PyPI

Project description

JaxMARL

Installation | Quick Start | Environments | Algorithms | Citation

Multi-Agent Reinforcement Learning in JAX

JaxMARL combines ease of use with GPU-enabled efficiency, and supports a wide range of commonly used MARL environments as well as popular baseline algorithms. Our aim is to provide one library that enables thorough evaluation of MARL methods across a wide range of tasks and against relevant baselines. We also introduce SMAX, a vectorised, simplified version of the popular StarCraft Multi-Agent Challenge, which removes the need to run the StarCraft II game engine.

For more details, take a look at our blog post, or at this notebook, which walks through the basic usage. LINKS TODO

Environments 🌍

Environment	Reference	README	Summary
🔴 MPE	Paper	Source	Communication-oriented tasks in a multi-agent particle world
🍲 Overcooked	Paper	Source	Fully-cooperative human-AI coordination tasks based on the video game of the same name
🦾 Multi-Agent Brax	Paper	Source	Continuous multi-agent robotic control based on Brax, analogous to Multi-Agent MuJoCo
🎆 Hanabi	Paper	Source	Fully-cooperative partially-observable multiplayer card game
👾 SMAX	Novel	Source	Simplified cooperative StarCraft micro-management environment
🧮 STORM: Spatial-Temporal Representations of Matrix Games	Paper	Source	Matrix games represented as grid world scenarios
🪙 Coin Game	Paper	Source	Two-player grid world environment which emulates social dilemmas
💡 Switch Riddle	Paper	Source	Simple cooperative communication game included for debugging

Baseline Algorithms 🦉

We follow CleanRL's philosophy of providing single-file implementations, which can be found within the baselines directory.

Algorithm	Reference	README
IPPO	Paper	Source
MAPPO	Paper	Source
IQL	Paper	Source
VDN	Paper	Source
QMIX	Paper	Source

Installation 🧗

Before installing, ensure you have the correct JAX version for your hardware accelerator. JaxMARL can then be installed directly from PyPI:

pip install jaxmarl  # NOTE: this does not work yet; use: pip install -e .
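
Because the right JAX build depends on your accelerator, it can help to check what is already installed before running the command above. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and not part of JaxMARL.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing.

    Hypothetical helper for illustration; it is not provided by JaxMARL.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report which of the relevant packages are present in this environment.
for name in ("jax", "jaxlib", "jaxmarl"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If JAX is installed, calling `jax.devices()` lists the devices it can see, which is a quick way to confirm that the accelerator build is the one in use.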

We have tested JaxMARL on Python 3.8 and 3.9. Running our test scripts requires some additional dependencies (for comparisons against existing implementations), which can be installed with:

pip install jaxmarl[dev]

Quick Start 🚀

We take inspiration from the PettingZoo and Gymnax interfaces. You can try out training an agent on XX in this Colab TODO. Further introduction scripts can be found here.

Basic JaxMARL API Usage 🖥️

Actions, observations, rewards and done values are passed as dictionaries keyed by agent name, allowing for differing action and observation spaces. The done dictionary contains an additional "__all__" key, specifying whether the episode has ended. We follow a parallel structure, with each agent passing an action at each timestep. For asynchronous games, such as Hanabi, a dummy action is passed for agents not acting at a given timestep.

import jax
from jaxmarl import make

key = jax.random.PRNGKey(0)
key, key_reset, key_act, key_step = jax.random.split(key, 4)

# Initialise environment.
env = make('MPE_simple_world_comm_v3')

# Reset the environment.
obs, state = env.reset(key_reset)

# Sample random actions.
key_act = jax.random.split(key_act, env.num_agents)
actions = {agent: env.action_space(agent).sample(key_act[i]) for i, agent in enumerate(env.agents)}

# Perform the step transition.
obs, state, reward, done, infos = env.step(key_step, state, actions)
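
To make the dictionary conventions concrete, including the "__all__" key and the dummy action used in turn-based games, here is a minimal pure-Python sketch. `ToyTurnEnv`, `NOOP` and `rollout` are hypothetical names invented for this illustration; the toy environment only mimics the shape of the interface described above and is not JaxMARL's implementation of any environment.

```python
import random

NOOP = 0  # dummy action submitted by agents that are not acting this timestep


class ToyTurnEnv:
    """Hypothetical two-player, turn-based toy environment (not part of JaxMARL)."""

    def __init__(self, max_steps=4):
        self.agents = ["agent_0", "agent_1"]
        self.max_steps = max_steps

    def _obs(self, state):
        # Each agent observes 1 if it is its turn to act, otherwise 0.
        return {a: int(i == state["turn"]) for i, a in enumerate(self.agents)}

    def reset(self):
        state = {"t": 0, "turn": 0}
        return self._obs(state), state

    def step(self, state, actions):
        acting = self.agents[state["turn"]]
        # Only the acting agent's action matters; the reward is shared by all agents.
        reward = {a: float(actions[acting]) for a in self.agents}
        new_state = {
            "t": state["t"] + 1,
            "turn": (state["turn"] + 1) % len(self.agents),
        }
        finished = new_state["t"] >= self.max_steps
        done = {a: finished for a in self.agents}
        done["__all__"] = finished  # extra key: has the whole episode ended?
        return self._obs(new_state), new_state, reward, done, {}


def rollout(env, seed=0):
    """Run one episode with random actions and return (steps, final done dict)."""
    rng = random.Random(seed)
    obs, state = env.reset()
    done = {"__all__": False}
    steps = 0
    while not done["__all__"]:
        # Every agent passes an action each step; non-acting agents pass NOOP.
        actions = {
            a: (rng.randint(1, 3) if obs[a] == 1 else NOOP) for a in env.agents
        }
        obs, state, reward, done, info = env.step(state, actions)
        steps += 1
    return steps, done


steps, done = rollout(ToyTurnEnv(max_steps=4))
print(steps, sorted(done))
```

The state is passed in and returned explicitly rather than stored on the environment, which mirrors the functional style of the `env.step(key_step, state, actions)` call shown earlier.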

Contributing 🔨

Contributions are welcome! Please take a look at our contributing guide for how to add an environment or algorithm, or submit a bug report.

Citing JaxMARL 📜

If you use JaxMARL in your work, please cite us as follows:

TODO

Project details

Release history

Version	Released
0.0.3	Apr 2, 2024
0.0.2	Nov 16, 2023
0.0.1 (this version)	Nov 16, 2023

Download files

Source Distribution

jaxmarl-0.0.1.tar.gz (137.7 kB)

Uploaded Nov 16, 2023 Source

Built Distribution

jaxmarl-0.0.1-py3-none-any.whl (130.6 kB)

Uploaded Nov 16, 2023 Python 3

Hashes for jaxmarl-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`0a6c9b38ea2624deb62497b60af3f4ef6f086c9c8d0950320ada9b6ac7ea5568`
MD5	`a85a401bace608dfd4d30fb70c81b14a`
BLAKE2b-256	`1233951e9fe2aee97029af695953e68254d9c12f6e5d01a09a5d2e00d00ab5da`

Hashes for jaxmarl-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7fadd11a57177d0bde7525fc790852e40d27ee9b4b9ef63d6667c1cd20565e06`
MD5	`24090cf6c537085e07a93cfdadc73561`
BLAKE2b-256	`9cd62a77bf521462e80906f9c4b45e7587143fbad3fb9b876ef0ef3a0f67d590`