Inconsistent gix status performance with clean trees #1771

Open
Byron opened this issue Jan 16, 2025 Discussed in #1767 · 5 comments

Comments

@Byron
Member

Byron commented Jan 16, 2025

Discussed in #1767

Originally posted by eatradish January 14, 2025
Hi all, I have written a git status plugin for Bash called bash-git-status, which previously leveraged git-status to detect and prompt users about a Git repository's status:

  • Red if the repository contains uncommitted changes.
  • Purple if the repository contains untracked files.
  • Green if the repository is clean.
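The colour rules above can be sketched as a small pure function. This is only an illustration: the `RepoState` and `PromptColor` names are hypothetical and not taken from bash-git-status or gitoxide, and the precedence of red over purple when both conditions hold is an assumption, since the list does not say which one wins.

```rust
/// Hypothetical summary of a repository's state. The field names are
/// illustrative and do not come from bash-git-status or gitoxide.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RepoState {
    pub has_uncommitted_changes: bool,
    pub has_untracked_files: bool,
}

/// The three prompt colours named in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PromptColor {
    Red,
    Purple,
    Green,
}

/// Maps a repository state to a prompt colour.
/// Assumption: uncommitted changes take precedence over untracked files.
pub fn prompt_color(state: RepoState) -> PromptColor {
    if state.has_uncommitted_changes {
        PromptColor::Red
    } else if state.has_untracked_files {
        PromptColor::Purple
    } else {
        PromptColor::Green
    }
}
```

Note that only the green case requires knowing that nothing at all has changed, which is why a clean tree is the most expensive answer to compute.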

Now that gix status is ready, I followed starship#6476 to re-implement this function in hope of better performance. You may find the current implementation here:

AOSC-Dev/bash-git-status#1

However, I found that the performance improvements, while significant, are not consistent across all cases. For instance, in a "clean" (unchanged) aosc-os-abbs copy, I observed a 1.3x improvement in response time; however, in a "clean" linux copy, the response time worsened by about 5x.

The benchmark results are as follows, using

hyperfine -i --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' '/home/saki/bash-git-status/target/release/bash-git-status' 'bash-git-status'

linux gets about 5x slower when the tree is unchanged or has only untracked files:

Benchmark 1: /home/saki/bash-git-status/target/release/bash-git-status
  Time (mean ± σ):      1.567 s ±  0.009 s    [User: 0.185 s, System: 1.061 s]
  Range (min … max):    1.547 s …  1.580 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: bash-git-status
  Time (mean ± σ):     316.9 ms ±   8.2 ms    [User: 108.7 ms, System: 699.8 ms]
  Range (min … max):   307.1 ms … 332.7 ms    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  bash-git-status ran
    4.95 ± 0.13 times faster than /home/saki/bash-git-status/target/release/bash-git-status

changed:

Benchmark 1: /home/saki/bash-git-status/target/release/bash-git-status
  Time (mean ± σ):      87.1 ms ±   2.1 ms    [User: 24.4 ms, System: 153.6 ms]
  Range (min … max):    84.1 ms …  91.1 ms    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: bash-git-status
  Time (mean ± σ):     314.9 ms ±  10.1 ms    [User: 110.4 ms, System: 683.4 ms]
  Range (min … max):   299.4 ms … 327.3 ms    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  /home/saki/bash-git-status/target/release/bash-git-status ran
    3.62 ± 0.15 times faster than bash-git-status

Test Environment:

  • OS: AOSC OS (Linux Kernel 6.12.9)
  • CPU: AMD Ryzen 8845H @ 5.14 GHz (8c16t)
  • RAM: 32GiB
  • Storage: Samsung Electronics Co Ltd NVMe SSD Controller PM9C1a (DRAM-less)
@davidkna
Contributor

I also noticed that the repo used by zsh-bench regressed a bit:

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| starship module git_status | 24.4 ± 2.1 | 21.3 | 39.1 | 1.00 |
| starship-gix module git_status | 34.2 ± 1.6 | 31.0 | 37.6 | 1.40 ± 0.14 |

@Byron
Member Author

Byron commented Jan 17, 2025

Thanks for checking!

Which platform was that on? I thought Linux was more sensitive to regressing with whatever gitoxide is doing, even though I also occasionally see runs that are all of a sudden 4x-5x slower.

My feeling is that this might be related to tuning thread counts better, but so far I haven't gotten to dig into this.

@davidkna
Contributor

That test was run natively on macOS (M1) to take advantage of hyperfine. zsh-bench with the Docker runner on the same machine runs at 17.755 ms with git vs. 36.504 ms for gitoxide.

@TomPridham

TomPridham commented Jan 29, 2025

A reason it is so much slower for clean repos is that there is logic to exit early in the case of changes. If I make a change to a top level file in the linux repo gix-status exits in 45ms, but if I make another change a couple of directories deep it takes 110ms. Without clearing the cache, gitoxide was faster than git, but it is significantly slower if you do clear the cache.
It is spending a bunch of time in this function:

pub fn depthfirst<StateMut, Find, V>(

The rest of the threads complete in ~500ms for me; I hid them to make the screenshot smaller.

[Image: profiler screenshot]

More specifically, here:

[Image: profiler screenshot]
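The early-exit effect described in this comment can be illustrated with a toy scan. This is a sketch, not gitoxide's code: `Entry` and `is_dirty` are hypothetical names, and the `modified` flag stands in for whatever stat or content comparison a real implementation performs.

```rust
/// Hypothetical stand-in for one tracked file. In a real implementation
/// `modified` would come from comparing stat data or content.
pub struct Entry {
    pub path: &'static str,
    pub modified: bool,
}

/// Returns whether any entry is modified, together with the number of
/// entries that had to be inspected before the answer was known.
pub fn is_dirty(entries: &[Entry]) -> (bool, usize) {
    let mut inspected = 0;
    for entry in entries {
        inspected += 1;
        if entry.modified {
            // Early exit: one change is enough to report "dirty".
            return (true, inspected);
        }
    }
    // Clean tree: every entry had to be inspected to prove it.
    (false, inspected)
}
```

A change in an entry visited early ends the scan almost immediately, a change deeper in the list takes longer, and a clean tree always pays for the full scan.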

@Byron
Copy link
Member Author

Byron commented Jan 30, 2025

Thanks so much for profiling this, that's very helpful!

And I think it makes sense that this is the cause of the slowdown, because it's part of the tree-index comparison that was recently added, and before that these performance slowdowns weren't known.

It's a single-threaded tree-traversal that is used to create an index from the HEAD^{tree} for comparison against .git/index. Creating an intermediate index was a matter of simplicity at about 15% extra cost compared to just traversing a tree for comparison against .git/index directly.
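To make the shape of that step concrete, here is a minimal sketch of flattening a tree into an index-like map with a single-threaded depth-first walk, then comparing it against a second map. It uses a toy in-memory `Node` type with integer ids; all names here are hypothetical, and none of this is the gix API or its actual data structures.

```rust
use std::collections::BTreeMap;

/// Toy tree object: either a blob with an id, or a subtree of named children.
pub enum Node {
    Blob(u32),
    Tree(Vec<(&'static str, Node)>),
}

/// Single-threaded depth-first walk that flattens a tree into `path -> id`.
/// Every subtree must be visited, whether or not anything changed.
pub fn flatten(node: &Node, prefix: &str, out: &mut BTreeMap<String, u32>) {
    match node {
        Node::Blob(id) => {
            out.insert(prefix.to_string(), *id);
        }
        Node::Tree(children) => {
            for (name, child) in children {
                let path = if prefix.is_empty() {
                    name.to_string()
                } else {
                    format!("{prefix}/{name}")
                };
                flatten(child, &path, out);
            }
        }
    }
}

/// Paths whose id differs between the two maps, or that exist in only one.
pub fn changed_paths(
    head: &BTreeMap<String, u32>,
    index: &BTreeMap<String, u32>,
) -> Vec<String> {
    let mut changed: Vec<String> = head
        .iter()
        .filter(|(path, id)| index.get(*path) != Some(*id))
        .map(|(path, _)| path.clone())
        .collect();
    changed.extend(index.keys().filter(|p| !head.contains_key(*p)).cloned());
    changed.sort();
    changed
}
```

In this toy version the flattening is pure computation; in a real repository each subtree visit also means loading an object from disk, which is where cold-cache IO would enter the picture.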

However, no matter what, HEAD^{tree} has to be traversed, so all objects related to the tree (subtrees) have to be loaded. Usually these are in a pack, though some of them might also be in loose object files if they are more recent.

On cold cache, it appears the IO needed to access loose objects and packs on top of traversing the directory and statting all files is causing huge slowdowns.

The best I can imagine doing is to run the tree-index comparison sequentially.

The code for that is here, but it's not super trivial to make it sequential, even though it's probably OK to 'hack' it just to get an idea of the performance implications of sequentially running this part.

The question here is if these slowdowns go away once the parallelism is reduced.

Something else that could be done is to use strace, particularly to see how long certain fopen/mmap calls take. I'd think that some of these hang for a while or are much slower than usual, simply because the user time (i.e. the computation done by non-kernel code) should remain the same with hot and cold FS caches.
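As a coarse, in-process complement to the strace idea (it swaps syscall tracing for wall-clock timing and is not a replacement for it), one could time how long opening and reading a given file takes with hot and cold caches. This is a sketch using only the Rust standard library; `time_open_and_read` is a hypothetical helper, not part of gitoxide.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;
use std::time::{Duration, Instant};

/// Measures the wall-clock time spent opening and fully reading one file,
/// and returns it together with the number of bytes read.
/// With a cold FS cache the duration should be dominated by IO wait.
pub fn time_open_and_read(path: &Path) -> io::Result<(Duration, usize)> {
    let start = Instant::now();
    let mut file = File::open(path)?;
    let mut buf = Vec::new();
    let bytes = file.read_to_end(&mut buf)?;
    Ok((start.elapsed(), bytes))
}
```

Running it against a pack file or a loose object before and after dropping caches would give a rough idea of how much of the slowdown is plain IO latency, though it says nothing about which individual syscalls are slow.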
