
concurrent, pipelined, platform-agnostic Git utilities for managing a large number of Git repositories


GitMC -- concurrent asynchronous Git Utilities for operations on massive numbers of Git repos


Platform-independent (Linux/Mac/Windows) Git utilities, useful for managing large (100+) numbers of Git repos. Speed is emphasized throughout: concurrency via Python stdlib asyncio (using asyncio.create_subprocess_exec) together with pipelining makes operations in effect 100x faster overall, because the coroutines wait simultaneously for Git operations (particularly remote operations like "fetch" and "pull"). Each concurrent subprocess has its own timeout, implemented with asyncio.wait_for, so that one hanging Git operation doesn't cause the other Git operations to fail. This helps when, for example, a Git login popup goes unnoticed by the user.
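
The per-subprocess timeout idea described above can be sketched roughly as follows. This is a minimal illustration and not GitMC's actual code: the function names run_with_timeout and run_all are hypothetical, and only asyncio.create_subprocess_exec, asyncio.wait_for and asyncio.gather are taken from the Python stdlib.

```python
from __future__ import annotations

import asyncio


async def run_with_timeout(cmd: list[str], timeout: float) -> tuple[int | None, str]:
    """Run one command; return (returncode, stdout), or (None, "") on timeout.

    Hypothetical helper for illustration, not part of GitMC's API.
    """
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.DEVNULL,
    )
    try:
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Only this subprocess is killed; the other coroutines keep running.
        proc.kill()
        await proc.wait()
        return None, ""
    return proc.returncode, out.decode(errors="replace")


async def run_all(cmds: list[list[str]], timeout: float) -> list[tuple[int | None, str]]:
    """Start every command at once and wait for all of them together."""
    return await asyncio.gather(*(run_with_timeout(c, timeout) for c in cmds))
```

For example, asyncio.run(run_all([["git", "-C", d, "fetch"] for d in dirs], timeout=30)) would fetch every directory in a (hypothetical) list dirs concurrently, reporting None as the return code for any fetch that exceeded 30 seconds.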

GitMC uses command-line Git because PyGit also requires command-line Git to be installed, and we don't need its advanced functionality.


Also see PyGit-bulk for managing large (100+) numbers of users / teams.

This repo contains a Git pre-commit script with explanation.

Install

Install Git in a way accessible from the command line:

  • Mac: brew install git
  • Linux: apt install git
  • Windows: command line Git.

Then, from the top-level directory of this repo, install GitMC in development mode:

python -m pip install -e .

Usage

gitbranch : reports any non-master branches in repos under the directory ~/code

python -m gitutils.email : lists all contributor email addresses. To fix unwanted emails, use git-filter-repo

find_missing_file : find directories missing exact fullpath to file

find_matching_file : find directories matching exact fullpath to file

Sync large number of git repos

These assume numerous subdirectories under ~/code. They work very quickly for large numbers (100+) of repos.

  • gitstat: checks whether any local repos have pending changes
  • gitpull: Git pulls all repos (gitfetch is suggested instead)
  • gitfetch: Git fetches all repos, printing a summary of files changed on the remote

Place an empty file .nogit in a subdirectory to skip it.
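
The .nogit skip rule can be pictured with a short sketch like the one below. It rests on assumptions: the function name find_repos is hypothetical, only the immediate subdirectories of the root are scanned, and a directory counts as a repo if it contains a .git entry. GitMC's real discovery logic may differ.

```python
from pathlib import Path


def find_repos(root: Path) -> list[Path]:
    """Return immediate subdirectories of root that look like Git repos
    and do not contain a .nogit marker file.

    Hypothetical helper for illustration, not part of GitMC's API.
    """
    repos = []
    for d in sorted(root.iterdir()):
        if not d.is_dir():
            continue
        if (d / ".nogit").is_file():
            continue  # marker file present: skip this directory
        if (d / ".git").exists():  # assumption: a .git entry marks a repo
            repos.append(d)
    return repos
```

Calling find_repos(Path("~/code").expanduser()) would then list the repos to operate on, leaving out any subdirectory that holds a .nogit file.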

[optional] speedup with https pull

For public repos, Git remote checking can go at least twice as fast, with a significantly lower computational burden, if fetches and pulls use HTTPS while SSH is still used for git push (as is recommended). The "pushInsteadOf" global Git config makes this possible. When cloning a public repo (including ones you're a collaborator on), use git clone https://. Then set this global config once so that pushes from HTTPS-cloned repos go over SSH:

git config --global url."ssh://github.com/".pushInsteadOf https://github.com/

The pattern can be made to match all sites by omitting github.com from the command above, or it can be refined per site or even per username by editing the command. For private repos, simply clone with SSH as usual.

Preview all changed Jekyll or Hugo files

This is for a website made using Jekyll or Hugo:

ActOnChanged . -p

It shows web page previews of all pages changed locally. Start the Jekyll or Hugo debug server first, e.g. hugo serve


