update: Rewrite update script #372

Open
wants to merge 4 commits into
base: master
from

Conversation

fstachura
Collaborator

No description provided.

tleb and others added 2 commits December 21, 2024 20:12
Avoid calling this parse-docs script that is expensive. This heuristic
avoids running it on most files, and is almost free.

Signed-off-by: Théo Lebrun <[email protected]>
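
The heuristic itself is not shown in this thread. As a rough sketch only, one cheap pre-filter of this kind is a substring check for a doc-comment opener before handing the file to the expensive parser. The `/**` marker and the names `may_contain_docs`, `docs_for_blob` and `parse_docs` are assumptions for illustration, not code from the PR.

```python
from typing import Callable, List

# Assumption: documentation comments start with "/**" (kernel-doc style).
DOC_MARKER = b"/**"


def may_contain_docs(blob: bytes) -> bool:
    """Cheap pre-filter: a single substring scan, no subprocess."""
    return DOC_MARKER in blob


def docs_for_blob(blob: bytes, parse_docs: Callable[[bytes], List[str]]) -> List[str]:
    """Call the expensive parser only when the pre-filter matches.

    `parse_docs` is a hypothetical stand-in for the parse-docs script.
    """
    if not may_contain_docs(blob):
        return []
    return parse_docs(blob)
```

A false positive (a `/**` inside a string literal, say) only costs one unnecessary parser run, so a filter like this can stay very simple.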
By default ctags sorts entries. This is not useful to the update script,
but takes time.
user time for `update.py 16` on musl v1.2.5 went from 1m21.613s to
1m11.849s.
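
For illustration, disabling the sort comes down to one flag. The sketch below assumes a `ctags` binary on `PATH` that accepts `--sort=no` and `-f -` (Exuberant and Universal Ctags both do). The helper names are hypothetical, and the real invocation in the repository may pass other options.

```python
import subprocess
from typing import List


def ctags_command(path: str) -> List[str]:
    """Build a ctags invocation that writes unsorted tags to stdout."""
    return ["ctags", "--sort=no", "-f", "-", path]


def run_ctags(path: str) -> List[str]:
    """Run ctags on one file and return its tag lines as emitted."""
    result = subprocess.run(
        ctags_command(path), check=True, capture_output=True, text=True
    )
    return result.stdout.splitlines()
```

Since the update script inserts each entry into its own database, the order in which ctags emits the entries should not matter to it.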
fstachura marked this pull request as draft December 29, 2024 22:43
@fstachura
Collaborator Author

fstachura commented Dec 29, 2024

I noticed that deduplicating definitions from references doesn't work properly.

The new update script uses futures to dynamically schedule many smaller
tasks between a constant number of threads, instead of statically
assigning a single long-running task to each thread.
This results in better CPU saturation.

Database handles are no longer shared between threads; instead,
the main thread commits the results of the other threads into the
database.
This trades locking on database access for serialization costs: since
multiprocessing is used, values returned from futures are pickled
(although in practice that depends on the ProcessPool configuration).
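
The pattern described above can be sketched roughly as follows. This is not the code from this PR: `extract_idents`, `index_blobs` and the `refs` table are hypothetical, and `sqlite3` stands in for whatever database the real script uses. The point is only that workers return picklable values through futures and a single caller writes them to the database.

```python
import sqlite3
from concurrent.futures import ProcessPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple


def extract_idents(job: Tuple[str, str]) -> Tuple[str, List[str]]:
    """Worker task: small, pure, and returns only picklable values."""
    name, text = job
    return name, sorted({w for w in text.split() if w.isidentifier()})


def index_blobs(
    blobs: Dict[str, str],
    db: sqlite3.Connection,
    make_executor: Callable = ProcessPoolExecutor,
    workers: int = 4,
) -> None:
    """Schedule one future per blob; only the caller writes to `db`."""
    db.execute("CREATE TABLE IF NOT EXISTS refs (ident TEXT, blob TEXT)")
    with make_executor(max_workers=workers) as pool:
        futures = [pool.submit(extract_idents, job) for job in blobs.items()]
        # Results arrive in completion order, so a slow blob does not
        # hold up the commit of faster ones.
        for future in as_completed(futures):
            name, idents = future.result()
            db.executemany(
                "INSERT INTO refs VALUES (?, ?)", [(i, name) for i in idents]
            )
    db.commit()
```

Passing `make_executor=ThreadPoolExecutor` (also from `concurrent.futures`) runs the same code without pickling, which may help when checking whether serialization is the bottleneck. With the process pool, the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn workers instead of forking.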
fstachura marked this pull request as ready for review December 30, 2024 00:20
@Daniil159x

Hi, I think these changes are good and work better than update.py on master.

However, I see the CPU sitting idle while futures are processed in the main thread.
[image]

Maybe async would make better use of the CPU?

@tleb
Member

tleb commented Feb 1, 2025

I cannot reproduce the good performance. I compared the original update.py against my own PoC (called update-ng.py below) and against yours (called update-franek.py below). Everything is in a single branch to simplify testing (sorry for the crappy commit messages).

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `update.py` | 40.472 ± 0.196 | 40.250 | 40.617 | 4.72 ± 0.04 |
| `update-ng.py` | 8.578 ± 0.055 | 8.531 | 8.639 | 1.00 |
| `update-franek.py` | 80.363 ± 0.164 | 80.204 | 80.531 | 9.37 ± 0.06 |

Here is what it looks like:

```
⟩ hyperfine --min-runs 3 --export-markdown benchmark-table.md \
    --parameter-list update update.py,update-ng.py,update-franek.py \
    --prepare 'rm -rf data/musl/data/*' \
    'TLEB_UPDATE={update} TLEB_NO_FETCH=1 ./utils/index ./data musl'
Benchmark 1: TLEB_UPDATE=update.py TLEB_NO_FETCH=1 ./utils/index ./data musl
  Time (mean ± σ):     40.472 s ±  0.196 s    [User: 71.356 s, System: 39.680 s]
  Range (min … max):   40.250 s … 40.617 s    3 runs

Benchmark 2: TLEB_UPDATE=update-ng.py TLEB_NO_FETCH=1 ./utils/index ./data musl
  Time (mean ± σ):      8.578 s ±  0.055 s    [User: 72.419 s, System: 38.537 s]
  Range (min … max):    8.531 s …  8.639 s    3 runs

Benchmark 3: TLEB_UPDATE=update-franek.py TLEB_NO_FETCH=1 ./utils/index ./data musl
  Time (mean ± σ):     80.363 s ±  0.164 s    [User: 78.747 s, System: 49.339 s]
  Range (min … max):   80.204 s … 80.531 s    3 runs

Summary
  TLEB_UPDATE=update-ng.py TLEB_NO_FETCH=1 ./utils/index ./data musl ran
    4.72 ± 0.04 times faster than TLEB_UPDATE=update.py TLEB_NO_FETCH=1 ./utils/index ./data musl
    9.37 ± 0.06 times faster than TLEB_UPDATE=update-franek.py TLEB_NO_FETCH=1 ./utils/index ./data musl
```
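
For readers checking the "times faster" figures, the sketch below recomputes them from the reported means and standard deviations. It assumes first-order error propagation for a ratio, which reproduces the numbers above; the function name `relative_speed` is made up for this example.

```python
import math
from typing import Tuple


def relative_speed(
    mean: float, sd: float, ref_mean: float, ref_sd: float
) -> Tuple[float, float]:
    """Ratio mean/ref_mean with first-order propagated uncertainty."""
    ratio = mean / ref_mean
    err = ratio * math.sqrt((sd / mean) ** 2 + (ref_sd / ref_mean) ** 2)
    return ratio, err


# Mean and standard deviation in seconds, copied from the benchmark output.
runs = {
    "update.py": (40.472, 0.196),
    "update-franek.py": (80.363, 0.164),
}
reference = (8.578, 0.055)  # update-ng.py, the fastest run

for name, (mean, sd) in runs.items():
    ratio, err = relative_speed(mean, sd, *reference)
    print(f"{name}: {ratio:.2f} ± {err:.2f} times slower than update-ng.py")
```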
  • script.sh limits indexing to the first 10 tags because my laptop is on battery.
  • TLEB_NO_FETCH=1 makes sure that utils/index does not attempt any Git fetches (this avoids network operations in the benchmark).
  • Not run in a Docker container, so that hyperfine reports valid usr and sys timings.
  • I have a weird but reproducible issue with my update-ng.py, which shows these timings: wallclock 7.040s, usr 18.589s, sys 178.334s. On the same system, update.py takes wallclock 22.012s, usr 23.667s, sys 46.578s. Notice the massive sys time, which I cannot explain.
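
To compare such wallclock/usr/sys triples outside hyperfine, a small helper along these lines could be used. It is a sketch under assumptions: `time_command` and `busy_cores` are hypothetical names, and the child CPU times from `os.times()` are only meaningful on Unix-like systems.

```python
import os
import subprocess
import time
from typing import List, Tuple


def time_command(argv: List[str]) -> Tuple[float, float, float]:
    """Run argv and return (wall, usr, sys) seconds for the child process tree."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    wall = time.perf_counter() - start
    after = os.times()
    return (
        wall,
        after.children_user - before.children_user,
        after.children_system - before.children_system,
    )


def busy_cores(wall: float, usr: float, sys_time: float) -> float:
    """Average number of cores kept busy: CPU time divided by wall time."""
    return (usr + sys_time) / wall
```

With the figures from the last bullet, `busy_cores(7.040, 18.589, 178.334)` comes to about 28 and `busy_cores(22.012, 23.667, 46.578)` to about 3.2. In the update-ng.py case, roughly 90% of that CPU time is sys.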


3 participants