git operations take a lot of time #7358
@andinus I am assuming that you've done a shallow clone? If you have and you're still suffering from performance issues, I can submit a patch (PR) for this issue. What I have in mind is a simple cleanup script, added to a GH workflow that runs it on a schedule.

Thoughts, @manwar?

P.S: @andinus if upstream does not want the proposed PR, please note that you can still do this to your local clone.

---
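For reference, a shallow clone keeps only the newest commit, so none of the historic churn is fetched. A minimal sketch (`shallow_clone` is a made-up helper name; the URL argument would be whatever clone URL you use for this repo or your fork):

```shell
#!/bin/sh
# Sketch: clone only the most recent commit of a repository.
# shallow_clone URL DIR -- the helper name is hypothetical.
shallow_clone() {
    url=$1
    dir=$2
    # --depth 1 limits history to the latest commit.
    git clone --depth 1 "$url" "$dir"
}
```

Later pulls can stay shallow as well, e.g. `git pull --depth 1`.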
I've just seen that the …

---
I am also in favour of doing some housekeeping. I use zsh with some git integration, and the by-now ~90k files slow down the shell. Could the "historic" commits maybe be squashed automatically, so we have perhaps only a single commit per week on master?

---
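A local squash of the kind suggested could be sketched like this (my sketch, not an agreed procedure; `squash_history` is a hypothetical helper, and this rewrites history, so it's only suitable for a local clone or a coordinated upstream change):

```shell
#!/bin/sh
# Sketch: collapse the current branch's entire history into a single commit.
# squash_history [MESSAGE] -- hypothetical helper, rewrites history.
squash_history() {
    msg=${1:-"Squash history"}
    branch=$(git rev-parse --abbrev-ref HEAD)   # remember the branch name
    git checkout -q --orphan _squash            # new branch with no parents
    git add -A                                  # stage the full working tree
    git commit -q -m "$msg"                     # one commit holding today's tree
    git branch -q -M "$branch"                  # replace the original branch
}
```

Everyone else would then have to re-clone, which is the usual cost of history rewriting.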
@andinus I think your recommendation is the best and quickest approach (i.e. deleting the stale dirs). Before the cleanup I ran `git status` (aliased to `gs`):

```
╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time gs
Refresh index: 100% (88731/88731), done.
On branch issue/7358
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        script/cleanup_readme_only

nothing added to commit but untracked files present (use "git add" to track)

real    0m3.350s
user    0m1.562s
sys     0m2.123s
```

This is how long the shell script took to run. It may be a one-time cost, though, since it had to delete the stale directories accumulated over the entirety of the repo's history:

```
╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time bash -c script/cleanup_readme_only

real    2m1.530s
user    4m33.764s
sys     3m12.803s
```

It got rid of 39k files (see below), but we could do better:

```
╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ git diff --name-only HEAD~ | wc -l
39066
```

Running `git status` again after the cleanup:

```
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        script/cleanup_readme_only

no changes added to commit (use "git add" and/or "git commit -a")

real    0m1.082s
user    0m0.602s
sys     0m0.817s
```

A great improvement, but the shell script itself is too slow (even with parallel execution), so I rewrote it in Go:

```
╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time bin/cleanup

real    0m2.658s
user    0m2.780s
sys     0m5.675s
```

Night and day! @manwar: let me know if this is a desirable action, and I'll submit the PR (all the code and local tests are complete). See the GH Action workflow below:

```yaml
name: Cleanup Readmes From Repository

on:
  schedule:
    - cron: '0 0 * * 0' # Run at midnight every Sunday

jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Setup Go
        uses: actions/setup-go@v2
        with:
          go-version: 1.17
      - name: Build Go Script
        run: go build -o bin/cleanup bin/main.go
      - name: Execute Cleanup
        run: ./bin/cleanup
```

---
IIRC it was slow even after a shallow clone, running this ^.

@ealvar3z Can you share the script? I'll try running it and report back.

---
Please be advised that I ran this on a separate repo. Here's the Go version:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"sync"
)

// isReadmeOnly reports whether dir's only entry is a README or README.md.
func isReadmeOnly(dir string) bool {
	files, err := os.ReadDir(dir)
	if err != nil {
		return false
	}
	return len(files) == 1 &&
		(files[0].Name() == "README" || files[0].Name() == "README.md")
}

// cleanupReadmeOnly is a worker: it consumes directory paths from pathChan
// and removes the ones that contain nothing but a README.
func cleanupReadmeOnly(wg *sync.WaitGroup, pathChan <-chan string) {
	defer wg.Done()
	for path := range pathChan {
		if isReadmeOnly(path) {
			os.RemoveAll(path)
		}
	}
}

func main() {
	var wg sync.WaitGroup
	ncores := runtime.NumCPU()
	pathChan := make(chan string)
	// Spawn one worker per core.
	for i := 0; i < ncores; i++ {
		wg.Add(1)
		go cleanupReadmeOnly(&wg, pathChan)
	}
	err := filepath.WalkDir(".", func(path string, d os.DirEntry, err error) error {
		if err != nil {
			// The directory may already have been removed by a worker;
			// skip it rather than aborting the walk.
			return nil
		}
		if d.IsDir() {
			pathChan <- path
		}
		return nil
	})
	if err != nil {
		fmt.Println("Error:", err)
	}
	close(pathChan)
	wg.Wait()
}
```

And the shell script:

```bash
#!/bin/bash

cleanup_readme_only() {
	num_cores=$(nproc)
	find . -type d -print0 | xargs -0 -I {} -P "$num_cores" bash -c \
		'if [ "$(ls -A {})" = "README" ] || [ "$(ls -A {})" = "README.md" ]; \
		then rm -rf {}; fi'
}

cleanup_readme_only
```

---
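As a side note on the shell version: substituting `{}` directly into the `bash -c` command string can misquote unusual directory names. A variant (a sketch, not part of the proposed PR; `cleanup_readme_only_safe` is a made-up name) passes the path as a positional argument instead:

```shell
#!/bin/sh
# Sketch: same cleanup, but the directory name reaches the subshell as $1
# rather than via textual {} substitution, so quoting stays correct even
# for directory names containing spaces or shell metacharacters.
cleanup_readme_only_safe() {
    find . -type d -print0 |
        xargs -0 -P "$(nproc)" -I {} sh -c '
            dir=$1
            # Remove the directory only if its sole entry is a README.
            if [ "$(ls -A "$dir")" = "README" ] || [ "$(ls -A "$dir")" = "README.md" ]; then
                rm -rf "$dir"
            fi
        ' sh {}
}
```

The trailing `sh {}` sets `$0` and `$1` for the inline script; `-P "$(nproc)"` keeps the same per-core parallelism as the original.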
It does improve performance; previously these took 71 and 16 seconds, and now they take about 8 and 4 seconds.

---
Maybe this issue depends on the workflow in use. In my setup I don't experience such performance issues. I'm operating on three branches in my fork of perlweeklychallenge-club and synchronize them so that updates are always fast-forward / incremental.

---
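That fast-forward flow can be sketched as follows (the remote name `upstream`, the branch `master`, and the helper name are assumptions; adjust them to your own fork's setup):

```shell
#!/bin/sh
# Sketch of a fast-forward-only update of a fork's branch.
sync_fork() {
    git fetch -q upstream
    # --ff-only refuses to create a merge commit, keeping the local
    # branch a strict fast-forward of upstream.
    git merge -q --ff-only upstream/master
}
```

Because no merge commits are ever created, `git status` and friends never have to reconcile divergent histories.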
Currently there are over 70,000 files in this repository, and every week we're adding hundreds more (every week a directory is created for every user, and the previous "README" is copied).

I started participating with challenge-076. According to my records I've submitted solutions for 25 challenges, so there are ~100 useless directories with my name and a README file. With around 300 users, I believe this adds up.

My primary machine is not very fast, and it takes 70 seconds to run `git status`.
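The "~100 useless directories" estimate is easy to check locally. A sketch (`count_readme_only` is a made-up helper) that counts directories whose only entry is a README:

```shell
#!/bin/sh
# Sketch: count directories containing nothing but a README/README.md,
# i.e. the stale per-user directories described above. Run from the
# repository root.
count_readme_only() {
    find . -type d | while read -r dir; do
        only=$(ls -A "$dir")
        if [ "$only" = "README" ] || [ "$only" = "README.md" ]; then
            echo "$dir"
        fi
    done | wc -l
}
```

Scoping the `find` to a single participant's name would give the per-user figure quoted above.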