- Extract the database to `data/git.duckdb`:

      mkdir data/
      mv ~/Downloads/git.duckdb.gz data/
      gunzip data/git.duckdb.gz
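If `gunzip` is not available (for example on Windows), the same extraction can be sketched with Python's standard library. This is only an illustration: `extract_db` is a hypothetical helper, not part of this repository, and unlike `gunzip` it leaves the `.gz` archive in place.

```python
import gzip
import shutil
from pathlib import Path


def extract_db(archive, target):
    """Decompress a gzip archive to `target`, creating parent directories."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stream the data so a large database is never held fully in memory.
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


if __name__ == "__main__":
    # Paths taken from the step above.
    extract_db(Path.home() / "Downloads" / "git.duckdb.gz", "data/git.duckdb")
```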
- Install dependencies and start Streamlit:

      poetry install
      poetry run streamlit run Recent.py

  Or, if you don't want to install Python, there's a `Dockerfile` and a `docker-compose.yml` too, although it seems to run somewhat slower, perhaps due to the default cgroup limits. DuckDB seems rather hungry for resources.

      docker compose build
      docker compose up
- Update your SSH config (`~/.ssh/config`). I use this config for multiplexing to speed up cloning:

      Host redmine-git
          User git
          HostName redmine.mgmtprod
          Port 2223
          IdentityFile ~/.ssh/id_rsa
          ControlPath ~/.ssh/connections/%r@%h.ctl
          ControlMaster auto
          ControlPersist 10m
          IdentitiesOnly yes
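One thing to watch with the `ControlPath` above: OpenSSH does not create the directory that holds the control sockets, so multiplexing fails if `~/.ssh/connections` is missing. A minimal sketch for creating it up front follows; `ensure_control_dir` is a hypothetical helper name, not something this repository ships.

```python
from pathlib import Path


def ensure_control_dir(path="~/.ssh/connections"):
    """Create the directory used by ControlPath if it does not exist yet."""
    directory = Path(path).expanduser()
    # 0o700 keeps the control sockets private to the current user
    # (the effective mode is still subject to the process umask).
    directory.mkdir(mode=0o700, parents=True, exist_ok=True)
    return directory


if __name__ == "__main__":
    print(ensure_control_dir())
```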
- Create a `.env` in this directory:

      export GITLAB_HOST=gitlab.mgmtprod
      export GITLAB_USER=<username>
      export GITLAB_TOKEN=<personal access token>
      export GITLAB_ROOT="${HOME}/repos/gitlab"
      export GITOLITE_HOST="redmine-git"
      export GITOLITE_ROOT="${HOME}/repos/gitolite"

  If necessary, create the GitLab personal access token first.
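Before running the scripts, it can help to check that the `.env` is complete. The sketch below parses the simple `export KEY=value` form shown above and reports missing variables. It is an assumption-laden illustration: `parse_env` and `missing_keys` are hypothetical helpers, and how the project's own scripts actually read these variables is not shown here.

```python
import os

# Variable names taken from the .env example above.
REQUIRED = (
    "GITLAB_HOST", "GITLAB_USER", "GITLAB_TOKEN",
    "GITLAB_ROOT", "GITOLITE_HOST", "GITOLITE_ROOT",
)


def parse_env(text):
    """Parse simple `export KEY=value` lines into a dict (no full shell syntax)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue
        value = value.strip()
        expand = True
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            # Single quotes suppress expansion in a shell, so mirror that here.
            expand = value[0] == '"'
            value = value[1:-1]
        values[key.strip()] = os.path.expandvars(value) if expand else value
    return values


def missing_keys(values):
    """Return the required variables that are absent or empty."""
    return [key for key in REQUIRED if not values.get(key)]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as fh:
        print("missing:", missing_keys(parse_env(fh.read())))
```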
- The indexing process happens in four steps:
  - repository discovery (`poetry run python discover_gitlab.py` and `discover_gitolite.py`)
    - this produces `data/repos-*.csv`
  - cloning (or fetching) the repositories (`fetch_known_repos.py`)
    - this produces bare repositories in `GITLAB_ROOT` and `GITOLITE_ROOT`
    - I have, in the past, used `git worktree` to work with bare repos locally too:

          git -C ~/repos/gitlab/odoo/odoo.git worktree add ~/work/odoo main
          git -C ~/work/odoo commit
          rm -rf ~/work/odoo
          git -C ~/repos/gitlab/odoo/odoo.git worktree prune

  - indexing the repositories by parsing the output of `git ls-tree` and `git log --numstat`
    - produces `data/git_*.csv`
  - and lastly loading the CSVs into a DuckDB database
    - produces `data/git.duckdb`
    - to be compressed into `data/git.duckdb.gz` using `gzip -k data/git.duckdb`
    - originally, this project ran on PostgreSQL
    - but DuckDB is useful for a workshop format, and for sharing the DB index in general
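To give a feel for the indexing step, here is a minimal sketch of turning `git log --numstat` output into CSV rows. It is not the project's actual indexer. It assumes the log is produced with `--format=@%H`, so that each commit starts with a line like `@<hash>`; that marker, the column names, and the function names are all choices made for this example. The numstat lines themselves have the form `added<TAB>deleted<TAB>path`, with `-` in both count columns for binary files.

```python
import csv
import io
import subprocess

COMMIT_PREFIX = "@"  # assumed marker, produced by --format=@%H


def git_numstat_lines(repo):
    """Run git log on a (bare) repository and return its output lines."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--numstat", "--format=@%H"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def parse_numstat(lines):
    """Yield (commit, added, deleted, path) for every changed file."""
    commit = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # blank separator between the header and the stats
        if line.startswith(COMMIT_PREFIX):
            commit = line[len(COMMIT_PREFIX):]
            continue
        added, deleted, path = line.split("\t", 2)
        # Binary files report "-" instead of line counts.
        yield (
            commit,
            None if added == "-" else int(added),
            None if deleted == "-" else int(deleted),
            path,
        )


def to_csv(rows):
    """Render rows as CSV text; None becomes an empty field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["commit", "added", "deleted", "path"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Something like `to_csv(parse_numstat(git_numstat_lines(repo)))` would then give CSV text for a single repository. Renames show up in the path column in git's `old => new` notation and are passed through unchanged here.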