FastCDC for large git files

git-fastcdc

Split certain files using content-defined chunking for faster deduplication. It has a similar use case to git-lfs, but the blobs stay in the repository. git-fastcdc mitigates some of the speed penalties of keeping large files in the repository. For most use cases you are probably better off with git-lfs. If your focus is on archival and deduplication, git-fastcdc might be right for you.
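To illustrate why content-defined chunking helps deduplication, here is a minimal sketch of a gear-style rolling-hash chunker. It is a simplified stand-in, not the FastCDC algorithm as published and not git-fastcdc's actual code; `MIN_SIZE`, `MAX_SIZE`, `MASK`, `chunk` and `digests` are names and values assumed for this example only.

```python
import hashlib
import random
from typing import List, Set

# Hypothetical parameters for illustration; not git-fastcdc's real settings.
MIN_SIZE = 64          # never cut a chunk shorter than this
MAX_SIZE = 1024        # always cut once a chunk reaches this length
MASK = 0xFF000000      # cut when the top 8 bits of the fingerprint are zero

# Fixed pseudo-random table mapping each byte value to a 32-bit number.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes) -> List[bytes]:
    """Split data at cut points chosen by a rolling hash of the content."""
    chunks = []
    start = 0
    fingerprint = 0
    for i, byte in enumerate(data):
        # Older bytes are shifted out, so the fingerprint only depends
        # on the most recent 32 bytes.
        fingerprint = ((fingerprint << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if length < MIN_SIZE:
            continue
        if (fingerprint & MASK) == 0 or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            fingerprint = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk may be shorter than MIN_SIZE
    return chunks


def digests(data: bytes) -> Set[str]:
    """Return the set of chunk hashes, i.e. what a deduplicating store keeps."""
    return {hashlib.sha256(c).hexdigest() for c in chunk(data)}


original = bytes(_rng.getrandbits(8) for _ in range(20_000))
edited = b"inserted bytes" + original  # shifts every following byte offset
shared = digests(original) & digests(edited)
print(f"{len(shared)} of {len(digests(original))} chunks are shared after the edit")
```

Because cut points depend on the bytes themselves rather than on fixed offsets, an insertion near the start only disturbs the chunks around it; later chunks line up again and can be stored once. Fixed-size splitting would shift every block and share almost nothing.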

Enable

git fastcdc install

Config

Edit .gitattributes:

*.wav binary filter=git_fastcdc
/.gitattributes text -binary -filter
/.gitignore text -binary -filter

By default git-fastcdc runs in-memory. Switch to on-disk:

git config --local fastcdc.ondisk true

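If you have a pure git-fastcdc repository, you probably want to disable delta compression to benefit from the speedups fastcdc provides. Git does not attempt delta compression on files larger than core.bigFileThreshold, so setting the threshold to 1 byte effectively turns it off for all files: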

git config --local core.bigFileThreshold 1

How

The filter splits matching files into chunks when you add them. The chunks go into the git-fastcdc branch. You need to push this branch to your remotes too!

In the working copy you still see the actual file data (*.wav in the example above). The blobs git stores for these files, however, are just lists of chunks.
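The idea of a blob that is "just a list of chunks" can be sketched as follows. This is only an illustration: the source does not describe the real chunk-list format or how chunks are laid out in the git-fastcdc branch, so `chunk_store`, `clean` and `smudge` are hypothetical names (the latter two borrowed from git's filter terminology), and the manifest format is an assumption.

```python
import hashlib
from typing import Dict, List

# Hypothetical in-memory stand-in for the chunk storage
# (the git-fastcdc branch in the text above).
chunk_store: Dict[str, bytes] = {}


def clean(chunks: List[bytes]) -> str:
    """On add: store each chunk under its hash, return the chunk list."""
    ids = []
    for data in chunks:
        chunk_id = hashlib.sha1(data).hexdigest()
        chunk_store[chunk_id] = data  # identical chunks are stored only once
        ids.append(chunk_id)
    return "\n".join(ids) + "\n"


def smudge(manifest: str) -> bytes:
    """On checkout: rebuild the file content from its chunk list."""
    return b"".join(chunk_store[chunk_id] for chunk_id in manifest.split())


# Two versions of a file that differ only in the middle chunk.
v1 = clean([b"intro", b"verse", b"outro"])
v2 = clean([b"intro", b"bridge", b"outro"])

print(len(chunk_store))  # distinct chunks stored for both versions
print(smudge(v2))        # reconstructed content of the second version
```

The small manifest is what would stand in for the file's blob, while the chunk data lives elsewhere. This also shows why the chunk storage has to travel with the repository: without it, a manifest cannot be turned back into a file.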

Download files

Source Distribution

git_fastcdc-0.1.0.tar.gz (17.0 kB)

Built Distribution

git_fastcdc-0.1.0-py3-none-any.whl (17.6 kB, Python 3)
