A set of utilities to create and manage configuration files effectively, built on top of OmegaConf.

Springs

A set of utilities that turn OmegaConf into a fully fledged configuration utility. Just like the springs inside an Omega watch, they help you move with your experiments.

Springs overlaps in functionality with Hydra, but without all the unnecessary boilerplate.

The current logo for Springs was generated using DALL·E 2.

To install Springs, simply run:

pip install springs

Philosophy

OmegaConf supports creating configurations in all sorts of ways, but we believe there are benefits to defining configurations from structured objects, namely dataclasses. Springs is built around that notion: write one or more dataclasses to compose a configuration (with appropriate defaults), then parse the remaining options or missing values from the command line or a YAML file.

Let's look at an example. Imagine we are building a configuration for a machine learning (ML) experiment, and we want to provide information about the model and data to use. We start by writing the following structured configuration:

import springs as sp
from dataclasses import dataclass, field

# this sub-config is for
# data settings
@dataclass
class DataConfig:
    # sp.MISSING is an alias to
    # omegaconf.MISSING
    path: str = sp.MISSING
    split: str = 'train'

# this sub-config is for
# model settings
@dataclass
class ModelConfig:
    name: str = sp.MISSING
    num_labels: int = 2


# this sub-config is for
# experiment settings
@dataclass
class ExperimentConfig:
    batch_size: int = 16
    seed: int = 42


# this is our overall config;
# default_factory avoids the "mutable default"
# error that Python 3.11+ raises for
# dataclass instances used as defaults
@dataclass
class Config:
    data: DataConfig = field(default_factory=DataConfig)
    model: ModelConfig = field(default_factory=ModelConfig)
    exp: ExperimentConfig = field(default_factory=ExperimentConfig)

Note how, in keeping with OmegaConf syntax, we use MISSING to mark any value that has no default and must be provided at runtime.
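To make the role of MISSING concrete, here is a minimal standard-library sketch that lists which dotted keys still need a value. The `REQUIRED` sentinel and the `find_missing` helper are hypothetical names used only for illustration; they are not part of Springs or OmegaConf (OmegaConf's own MISSING is the string `'???'`, which the sentinel imitates).

```python
from dataclasses import dataclass, field, fields, is_dataclass

# Hypothetical sentinel standing in for sp.MISSING (illustration only).
REQUIRED = "???"


@dataclass
class DataConfig:
    path: str = REQUIRED
    split: str = "train"


@dataclass
class ModelConfig:
    name: str = REQUIRED
    num_labels: int = 2


@dataclass
class Config:
    data: DataConfig = field(default_factory=DataConfig)
    model: ModelConfig = field(default_factory=ModelConfig)


def find_missing(node, prefix=""):
    """Return the dotted keys whose value is still the REQUIRED sentinel."""
    missing = []
    for f in fields(node):
        value = getattr(node, f.name)
        key = f"{prefix}{f.name}"
        if is_dataclass(value):
            # Recurse into nested sub-configs, extending the dotted prefix.
            missing.extend(find_missing(value, prefix=f"{key}."))
        elif value == REQUIRED:
            missing.append(key)
    return missing


missing_keys = find_missing(Config())
print(missing_keys)
```

The dotted keys it reports are the same ones you would pass as overrides at runtime.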

If we want to use this configuration with a function that actually runs this experiment, we can use sp.cli as follows:

@sp.cli(Config)
def main(config: Config):
    # this will print the configuration
    # like a dict
    print(config)
    # you can use dot notation to
    # access attributes...
    config.exp.seed
    # ...or treat it like a dictionary!
    config['exp']['seed']


if __name__ == '__main__':
    main()

Notice how, in the configuration Config above, some parameters are missing. We can specify them from the command line...

python main.py \
    data.path=/path/to/data \
    model.name=bert-base-uncased

...or from one or more YAML config files (if there are several, later files override earlier ones).

data:
    path: /path/to/data

model:
    name: bert-base-uncased

# you can override any part of
# the config via YAML or CLI
# CLI takes precedence over YAML.
exp:
    seed: 1337

To run with a YAML file, do:

python main.py -c config.yaml

Easy, right?
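The precedence described above (dataclass defaults, then YAML, then command-line overrides) can be pictured as a chain of nested-dictionary merges. The following is a minimal standard-library sketch of that idea, not Springs's actual implementation; `deep_merge` and `parse_dotlist` are hypothetical helpers, and unlike a real structured config this sketch leaves CLI values as plain strings.

```python
import copy


def deep_merge(base, override):
    """Return a new dict in which values from `override` win over `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def parse_dotlist(args):
    """Turn ['exp.seed=7'] into {'exp': {'seed': '7'}} (values stay strings)."""
    nested = {}
    for arg in args:
        dotted_key, _, raw_value = arg.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = raw_value
    return nested


defaults = {
    "data": {"path": None, "split": "train"},
    "exp": {"batch_size": 16, "seed": 42},
}
from_yaml = {"data": {"path": "/path/to/data"}, "exp": {"seed": 1337}}
from_cli = parse_dotlist(["exp.seed=7"])

# Later sources override earlier ones: defaults < YAML < CLI.
config = deep_merge(deep_merge(defaults, from_yaml), from_cli)
print(config)
```

Untouched keys such as `exp.batch_size` keep their defaults, while `exp.seed` ends up with the command-line value.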

Fine, We Do Support Unstructured Configurations

You are not required to use a structured config with Springs. To use our CLI with a bunch of YAML files and/or command line arguments, simply decorate your main function with sp.cli() and pass it no arguments.

@sp.cli()
def main(config):
    # do stuff
    ...

Initializing Objects from Configurations

Sometimes a configuration contains all the necessary information to instantiate an object from it. Springs supports this use case, and it is as easy as providing a _target_ node in a configuration:

@dataclass
class ModelConfig:
    _target_: str = (
        'transformers.'
        'AutoModelForSequenceClassification.'
        'from_pretrained'
    )
    pretrained_model_name_or_path: str = \
        'bert-base-uncased'
    # from_pretrained expects num_labels
    # (not num_classes) for the number of classes
    num_labels: int = 2

In your experiment code, run:

def run_model(model_config: ModelConfig):
    ...
    model = sp.init.now(model_config, ModelConfig)

Note: Previous versions of Springs supported specifying the return type; doing so is now actively encouraged. Running sp.init.now(model_config) without a type will raise a warning. To suppress this warning, call sp.toggle_warnings(False) before calling sp.init.now or sp.init.later.
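Conceptually, initializing from a `_target_` node means importing the object named by the dotted path and calling it with the remaining keys as keyword arguments. Below is a minimal standard-library sketch of that mechanism; it is not how Springs necessarily implements sp.init.now, and `resolve_target` and `init_from_target` are hypothetical helper names. The sketch uses `fractions.Fraction` as a target so it runs without extra dependencies.

```python
import importlib


def resolve_target(path):
    """Resolve a dotted path such as 'fractions.Fraction' to a Python object."""
    parts = path.split(".")
    # Try the longest importable module prefix first, then walk attributes.
    for split_at in range(len(parts), 0, -1):
        module_name = ".".join(parts[:split_at])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue
        for attr in parts[split_at:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError(f"Cannot resolve {path!r}")


def init_from_target(config):
    """Call the object named by '_target_' with the other keys as kwargs."""
    kwargs = {k: v for k, v in config.items() if k != "_target_"}
    return resolve_target(config["_target_"])(**kwargs)


config = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
value = init_from_target(config)
print(value)  # 3/4

# Paths may also point at a method on a class, as in the example above.
from_float = resolve_target("fractions.Fraction.from_float")
print(from_float(0.5))  # 1/2
```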

init.now vs init.later

init.now is used to immediately initialize a class or run a method. But what if you are not yet ready to run the _target_ you want to initialize? This is common, for example, if you receive a configuration in the init method of a class but do not have all the parameters needed to run it until later in the object's lifetime. In that case, you might want to use init.later. Example:

from typing import Callable

import springs as sp

config = sp.from_dict({'_target_': 'str.lower'})
fn = sp.init.later(config, Callable[..., str])

... # much computation occurs

# returns `this to lowercase`
fn('THIS TO LOWERCASE')

Note that, for convenience, sp.init.now is aliased to sp.init.
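One way to picture the difference between the two modes is partial application: deferred initialization binds the configured arguments now and postpones the actual call. The sketch below illustrates this with `functools.partial` from the standard library. It is an analogy under that assumption, not Springs's implementation; `init_later_sketch` is a hypothetical name, and the simplified resolver only handles module-level names.

```python
import functools
import importlib


def resolve_target(path):
    """Resolve 'module.name' to an object (module-level names only)."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def init_later_sketch(config):
    """Bind the configured kwargs now; return a callable to invoke later."""
    kwargs = {k: v for k, v in config.items() if k != "_target_"}
    return functools.partial(resolve_target(config["_target_"]), **kwargs)


config = {"_target_": "builtins.int", "base": 2}
parse_binary = init_later_sketch(config)  # nothing has been called yet

# ... later, once the remaining argument is available:
number = parse_binary("101")
print(number)  # 5
```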

Path as _target_

If, for some reason, you cannot specify the path to a class as a string, you can use sp.Target.to_string to resolve a function, class, or method to its path:

import transformers

@dataclass
class ModelConfig:
    _target_: str = sp.Target.to_string(
        transformers.
        AutoModelForSequenceClassification.
        from_pretrained
    )
    pretrained_model_name_or_path: str = \
        'bert-base-uncased'
    # from_pretrained expects num_labels
    # (not num_classes) for the number of classes
    num_labels: int = 2
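For intuition, a dotted path can usually be rebuilt from an object's `__module__` and `__qualname__` attributes. The following standard-library sketch shows that idea; `target_to_string` is a hypothetical helper, and sp.Target.to_string may well handle more cases than this.

```python
import fractions
import json


def target_to_string(obj):
    """Build a dotted import path from a function, class, or method."""
    return f"{obj.__module__}.{obj.__qualname__}"


print(target_to_string(json.dumps))                     # json.dumps
print(target_to_string(fractions.Fraction))             # fractions.Fraction
print(target_to_string(fractions.Fraction.from_float))  # fractions.Fraction.from_float
```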

Static and Dynamic Type Checking

Springs supports both static and dynamic (at runtime) type checking when initializing objects. To enable it, pass the expected return type when initializing an object:

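from transformers import PreTrainedTokenizerBase

# TokenizerConfig is assumed to be a dataclass
# with a _target_ field, defined elsewhere
@sp.cli(TokenizerConfig)
def main(config: TokenizerConfig):
    tokenizer = sp.init(config, PreTrainedTokenizerBase)
    print(tokenizer)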

This will raise an error when the tokenizer is not an instance of PreTrainedTokenizerBase or one of its subclasses. Further, if you use a static type checker in your workflow (e.g., Pylance in Visual Studio Code), springs.init will also annotate its return type accordingly.
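The combination of a runtime check and a static annotation can be expressed with a type variable: the expected type is passed in as a value, checked with `isinstance`, and reused as the return annotation. Here is a minimal sketch of that pattern using only the standard library; `checked` is a hypothetical helper, not a Springs function.

```python
from typing import Type, TypeVar

T = TypeVar("T")


def checked(value: object, expected: Type[T]) -> T:
    """Return `value` unchanged, but fail fast if it has the wrong type."""
    if not isinstance(value, expected):
        raise TypeError(
            f"expected {expected.__name__}, got {type(value).__name__}"
        )
    return value


count = checked(3, int)  # static checkers infer `int` for `count`

try:
    checked("three", int)
    error_message = None
except TypeError as error:
    error_message = str(error)

print(error_message)  # expected int, got str
```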

Flexible Configurations

Sometimes a configuration has some default parameters, but others are optional and depend on other factors, such as the _target_ class. In these cases, it is convenient to set up a flexible dataclass, using make_flexy after the dataclass decorator.

@sp.make_flexy
@dataclass
class MetricConfig:
    _target_: str = sp.MISSING
    average: str = 'macro'

config = sp.from_flexyclass(MetricConfig)
overrides = {
    # we override the _target_
    '_target_': 'torchmetrics.F1Score',
    # this attribute does not exist in the
    # structured config
    'num_classes': 2
}

config = sp.merge(config, sp.from_dict(overrides))
print(config)
# this will print the following:
# {
#    '_target_': 'torchmetrics.F1Score',
#    'average': 'macro',
#    'num_classes': 2
# }

Note: In previous versions of Springs, the canonical way to create a flexible class was to decorate a class with @sp.flexyclass. This method is still available, but it is discouraged because it creates issues with mypy (and potentially other type checkers). Please consider switching to dataclass followed by make_flexy. To suppress the warning raised for this, call sp.toggle_warnings(False) before calling sp.flexyclass.
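The difference between a strict structured config and a flexible one comes down to how unknown keys are treated during a merge. The sketch below illustrates that distinction with plain dataclasses and dictionaries; it is an illustration under that assumption rather than Springs's implementation, and `merge_sketch` and its `flexible` flag are hypothetical.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class MetricConfig:
    _target_: str = "???"
    average: str = "macro"


def merge_sketch(schema, overrides, flexible=False):
    """Merge overrides into the schema defaults.

    Strict mode rejects keys the dataclass does not declare;
    flexible mode keeps them as extra entries.
    """
    known = {f.name for f in fields(schema)}
    unknown = set(overrides) - known
    if unknown and not flexible:
        raise KeyError(f"unknown keys: {sorted(unknown)}")
    return {**asdict(schema()), **overrides}


overrides = {"_target_": "torchmetrics.F1Score", "num_classes": 2}

flexible_config = merge_sketch(MetricConfig, overrides, flexible=True)
print(flexible_config)

try:
    merge_sketch(MetricConfig, overrides)
    strict_rejected = False
except KeyError:
    strict_rejected = True
print(strict_rejected)  # True
```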

Resolvers

Guide coming soon!

Tips and Tricks

This section includes a bunch of tips and tricks for working with OmegaConf and YAML.

Tip 1: Repeating nodes in YAML input

When setting up YAML configuration files for ML experiments, it is common to have almost-repeated sections. In these cases, you can take advantage of YAML's built-in anchor and alias mechanism, together with merge keys, to remove duplicated entries:

# &tc defines an anchor for this node
train_config: &tc
  path: /path/to/data
  src_field: full_text
  tgt_field: summary
  split_name: train

test_config:
  # << is the merge key; *tc is an alias
  # that refers to the anchor above
  <<: *tc
  split_name: test

This will resolve to:

train_config:
  path: /path/to/data
  split_name: train
  src_field: full_text
  tgt_field: summary

test_config:
  path: /path/to/data
  split_name: test
  src_field: full_text
  tgt_field: summary
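For readers more comfortable with Python than YAML, the merge key behaves roughly like dictionary unpacking: every key from the anchored node is copied, and keys written explicitly in the new mapping win. A small sketch of that analogy, using plain dictionaries rather than a YAML parser:

```python
train_config = {
    "path": "/path/to/data",
    "src_field": "full_text",
    "tgt_field": "summary",
    "split_name": "train",
}

# Equivalent of `<<: *tc` followed by `split_name: test`:
# copy all keys from train_config, then override split_name.
test_config = {**train_config, "split_name": "test"}

print(test_config["split_name"])   # test
print(train_config["split_name"])  # train
```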
