Provides a decorator for caching a function and an equivalent command-line util.
Project description
Provides a decorator for caching a function. Whenever the function is called with the same arguments, the result is loaded from the cache instead of computed. If the arguments, source code, or enclosing environment have changed, the cache recomputes the data transparently (no need for manual invalidation).
The use case is meant for iterative development, especially on scientific experiments. Many times a developer will tweak some of the code but not all. Often, reusing prior intermediate computations saves a significant amount of time every run.
Quickstart
If you don’t have pip installed, see the pip install guide. Then run:
$ pip install charmonium.cache
>>> from charmonium.cache import memoize
>>> i = 0
>>> @memoize()
... def square(x):
...     print("recomputing")
...     return x**2 + i
...
>>> square(4)
recomputing
16
>>> square(4) # no need to recompute
16
>>> i = 1
>>> square(4) # global i changed; must recompute
recomputing
17
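The third call above recomputes because the result depends on the global i. The following toy sketch, using only the standard library, illustrates one way such a dependency could be detected. It is not how charmonium.cache is implemented: the names naive_memoize, fingerprint, and misses are hypothetical, and fingerprinting the bytecode, constants, referenced non-callable globals, and arguments is a simplifying assumption.

```python
import hashlib
from functools import wraps


def naive_memoize(func):
    """Toy in-memory memoizer keyed on code, referenced globals, and arguments.

    Illustration only; this is NOT charmonium.cache's implementation.
    """
    cache = {}

    def fingerprint(args, kwargs):
        code = func.__code__
        # Non-callable globals whose names appear in the body (like `i` in the
        # quickstart). co_names also lists attribute names, so this may
        # over-approximate the real dependencies.
        referenced = sorted(
            (name, repr(func.__globals__[name]))
            for name in code.co_names
            if name in func.__globals__ and not callable(func.__globals__[name])
        )
        payload = repr(
            (code.co_code, code.co_consts, referenced, args, sorted(kwargs.items()))
        )
        return hashlib.sha256(payload.encode()).hexdigest()

    @wraps(func)
    def wrapper(*args, **kwargs):
        key = fingerprint(args, kwargs)
        if key not in cache:
            wrapper.misses += 1  # count real computations
            cache[key] = func(*args, **kwargs)
        return cache[key]

    wrapper.misses = 0
    return wrapper


offset = 0


@naive_memoize
def square(x):
    return x**2 + offset


first = square(4)   # computed
second = square(4)  # same fingerprint, served from the cache
offset = 1
third = square(4)   # global changed, so the fingerprint differs and it recomputes
```

Editing the body of square would change its bytecode and therefore the fingerprint, which is the same reason no manual invalidation is needed in this sketch. A persistent cache would additionally need to store entries on disk.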
Advantages
While there are other libraries and techniques for memoization, I believe this one is unique because it is:
Correct with respect to source-code changes: The cache detects if you edit the source code or change a file which the program reads (provided the program reads that file through this library’s file abstraction). Users never need to manually invalidate the cache, so long as the functions are pure.
It is precise enough that it will ignore changes in unrelated functions in the file, but it will detect changes in relevant functions in other files. It even detects changes in global variables (as in the example above). See Detecting Changes in Functions for details.
Useful between runs and across machines: A cache can be shared on the network, so that if any machine has computed the function for the same source code and arguments, this value can be reused by any other machine.
Easy to adopt: Only requires adding one line (decorator) to the function definition.
Bounded in size: The cache won’t take up too much space. This space is partitioned across all memoized functions according to the heuristic.
Supports smart heuristics: They can take into account time-to-recompute and storage-size in addition to recency, unlike naive LRU.
Overhead aware: The library measures the time saved versus overhead. It warns the user if the overhead of caching is not worth it.
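To make the "smart heuristics" point concrete, here is a minimal sketch of a size-bounded eviction policy that weighs time-to-recompute and storage size alongside recency. The scoring formula, the Entry fields, and the function names are all assumptions made for illustration; the library's actual heuristic is not described here and may differ.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    size_bytes: int
    compute_seconds: float  # how long the value took to compute
    last_used: float        # timestamp of the most recent access


def keep_score(entry, now):
    """Higher means more worth keeping: seconds saved per byte, discounted by age.

    Hypothetical formula, chosen only to illustrate the idea.
    """
    age = max(now - entry.last_used, 1.0)
    return entry.compute_seconds / (entry.size_bytes * age)


def evict_until_fits(entries, limit_bytes, now):
    """Drop the lowest-scoring entries until the total size fits the limit."""
    kept = sorted(entries, key=lambda e: keep_score(e, now), reverse=True)
    total = sum(e.size_bytes for e in kept)
    evicted = []
    while kept and total > limit_bytes:
        victim = kept.pop()  # last element has the lowest score
        total -= victim.size_bytes
        evicted.append(victim.key)
    return kept, evicted


entries = [
    Entry("cheap_big", size_bytes=1000, compute_seconds=0.1, last_used=99.0),
    Entry("costly_small", size_bytes=100, compute_seconds=60.0, last_used=10.0),
    Entry("recent_medium", size_bytes=500, compute_seconds=1.0, last_used=100.0),
]
kept, evicted = evict_until_fits(entries, limit_bytes=700, now=100.0)
```

In this example a plain LRU policy would evict costly_small first because it is the least recently used, even though it is the most expensive entry to recompute and the cheapest to store. The score above evicts cheap_big instead.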
Memoize CLI
memoize -- command arg1 arg2 ...
- memoize memoizes command arg1 arg2 .... If the command, its arguments, or its input files change, then command arg1 arg2 ... will be rerun. Otherwise, the output files (including stderr and stdout) will be produced from a prior run.
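The behavior described above can be sketched in a few lines of Python. This is an illustration of the idea, not the memoize CLI's implementation: run_memoized is a hypothetical helper, the input files are passed explicitly (how the real tool discovers them is not covered here), and only stdout and stderr are replayed, not other output files. The key parameter plays the role of the --key option.

```python
import hashlib
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_memoized(cmd, cache_dir, input_files=(), key=""):
    """Run cmd, or replay its saved stdout/stderr if nothing relevant changed.

    Returns (stdout, stderr, was_cache_hit). Hypothetical helper for illustration.
    """
    digest = hashlib.sha256()
    digest.update(json.dumps([list(cmd), key]).encode())
    for path in sorted(str(p) for p in input_files):
        digest.update(path.encode())
        digest.update(Path(path).read_bytes())  # file contents are part of the key

    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / (digest.hexdigest() + ".json")
    if entry.exists():
        saved = json.loads(entry.read_text())
        return saved["stdout"], saved["stderr"], True

    proc = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    entry.write_text(json.dumps({"stdout": proc.stdout, "stderr": proc.stderr}))
    return proc.stdout, proc.stderr, False


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "input.txt"
    data.write_text("hello")
    cache = Path(tmp) / "cache"
    script = "import sys; print(open(sys.argv[1]).read().upper())"
    cmd = [sys.executable, "-c", script, str(data)]

    out1, _, hit1 = run_memoized(cmd, cache, input_files=[data])  # first run
    out2, _, hit2 = run_memoized(cmd, cache, input_files=[data])  # replayed
    data.write_text("changed")
    out3, _, hit3 = run_memoized(cmd, cache, input_files=[data])  # input changed
    out4, _, hit4 = run_memoized(cmd, cache, input_files=[data], key="2")  # key changed
```

Only the second call is served from the cache; changing either the input file or the key produces a different digest and forces a rerun.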
Make is good, but it has a hard time with dependencies that are not files. Many dependencies are not well-contained in files. For example, you may want to recompute some command every time some status command returns a different value.
To get correct results you would have to incorporate every key you depend on into the filename, which can be messy, so most people don’t do that. memoize is easier to use correctly, for example:
# `make status=$(status)` will not do the right thing.
make var=1
make var=2 # usually, nothing recompiles here, contrary to user's intent

# `memoize --key=$(status) -- command args` will do the right thing.
memoize --key=1 -- command args
memoize --key=2 -- command args # key changed, command is recomputed.
memoize also makes it easy to memoize commands within existing shell scripts.
Code quality
The code base is strictly and statically typed with pyright. I export type annotations in accordance with PEP 561; clients will benefit from the type annotations in this library.
I have unit tests with >95% coverage.
I use pylint with few disabled warnings.
All of the above methods are incorporated into per-commit continuous testing and are required for merging into the main branch; this way, they won’t be easily forgotten.
Hashes for charmonium.cache-1.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 7ea0f228c8cccf173a472261f3528760a978b7699ffbf43d982d5448f021866f
MD5 | 62a755038bd16f3a7ae9ed7808f8ec4f
BLAKE2b-256 | 5151cb5c25b1d91d7b762e8d3576d2811219aab52be5e4ea48a8b5d1b607b11d