
Feature request: sleep between concurrent groups #481

Open
1 of 2 tasks
nwsparks opened this issue May 23, 2024 · 8 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@nwsparks

The feature request

A flag to sleep for X seconds between concurrent groups or between PRs would be very helpful to prevent build storms.

For example, if concurrency were 5, you could set --second-between-executions 60 and it would process 5 repositories, wait 60 seconds, then process another 5 repositories.

Alternatively, it could be done between each PR, but I suspect that is harder to accomplish. Controlling it via concurrency seems fine to me.
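
For illustration only, a minimal Go sketch of the grouping semantics described above; the function and all names are hypothetical and not part of multi-gitter's actual code:

```go
// Minimal sketch: process repositories in groups of `concurrency`,
// sleeping between groups but not before the first one.
package main

import (
	"fmt"
	"sync"
	"time"
)

func runInGroups(repos []string, concurrency int, pause time.Duration, process func(string)) {
	for start := 0; start < len(repos); start += concurrency {
		if start > 0 {
			time.Sleep(pause) // wait between concurrent groups
		}
		end := start + concurrency
		if end > len(repos) {
			end = len(repos)
		}

		var wg sync.WaitGroup
		for _, repo := range repos[start:end] {
			wg.Add(1)
			go func(r string) {
				defer wg.Done()
				process(r) // clone, run the script, open the PR, etc.
			}(repo)
		}
		wg.Wait()
	}
}

func main() {
	repos := []string{"repo-1", "repo-2", "repo-3", "repo-4", "repo-5", "repo-6", "repo-7"}
	// Concurrency 5 with a 60-second pause: 5 repos, wait, then the rest.
	runInGroups(repos, 5, 60*time.Second, func(r string) {
		fmt.Println("processing", r)
	})
}
```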

Implementation

  • I would like to contribute this feature if it's a suitable addition to multi-gitter
  • I have no intention of adding this feature myself.
nwsparks added the enhancement label on May 23, 2024
@lindell
Owner

lindell commented May 23, 2024

Could you please add context on why you want to do this, and which platform you are using (GitHub, GitLab, etc.)?

@nwsparks
Author

Sure, the platform is GitHub. I'm doing a simple change, updating .dockerignore, which is a fast operation. I'm doing it against quite a few repositories, so without a pause between pull requests it will cause a build storm and clog up CI for quite a while.

I settled for just adding a sleep at the end of the script and using low concurrency, which I guess is what most people could do in lieu of a flag to control it.

@lindell
Owner

lindell commented May 23, 2024

Seems like a valid use case, but since it has a simple workaround I will not prioritize it. If anyone else wants to pick it up, feel free.

I think the easiest option would be to add X seconds before starting any new run (except the first batch).
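
A small sketch of that option in illustrative Go (names and structure are hypothetical, not multi-gitter's code): the first `concurrency` runs start immediately, and every later run sleeps before it starts.

```go
// Sketch of "sleep X seconds before starting any new run, except the first batch".
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	repos := []string{"a", "b", "c", "d", "e", "f", "g"}
	concurrency := 3
	delay := 60 * time.Second

	sem := make(chan struct{}, concurrency) // caps the number of concurrent runs
	var wg sync.WaitGroup

	for i, repo := range repos {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot
		if i >= concurrency {
			time.Sleep(delay) // every run after the first batch waits before starting
		}
		go func(r string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			fmt.Println("processing", r)
		}(repo)
	}
	wg.Wait()
}
```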

lindell added the good first issue label on May 23, 2024
@tplass-ias

tplass-ias commented Jan 10, 2025

For the close command, no script is run, so the workaround doesn't work and I get a secondary rate limit error from GitHub:

failed to get pull request batch: could not make GitHub GraphQL request: You have exceeded a secondary rate limit.

More details on GitHub secondary rate limits here: https://docs.github.com/en/graphql/overview/rate-limits-and-node-limits-for-the-graphql-api#secondary-rate-limits

The close case could potentially be solved by refactoring API usage; e.g., one big GraphQL query is generally better than sending many GraphQL API calls. Please let me know and I can file a separate issue for either or both: this specific close case, and a generic GitHub throttling handling issue.

For context, I am trying to operate on ~1k repos within a single GitHub org. Thanks for the great tool! Handling GitHub rate limits would make this my preferred tool over https://github.com/gruntwork-io/git-xargs

@lindell
Owner

lindell commented Jan 13, 2025

@tplass-ias Multi-gitter fetches all statuses with one API call:
https://github.com/lindell/multi-gitter/blob/master/internal/scm/github/github.go#L693

It is unfortunate that you hit it during that call, but even if you do, it should be opaque from a user perspective and not something you should have to resolve by manually specifying sleeps.

Secondary rate limit detection, backoff, and retry are already implemented, but only for V3 (non-GraphQL) calls.
This should be implemented for the GraphQL calls as well, and that should solve your problem.
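
For illustration, a hedged sketch of what wrapping the GraphQL calls in the same kind of detection/backoff/retry could look like; the helper name and the string-based detection are assumptions, not multi-gitter's existing V3 implementation.

```go
// Hedged sketch of applying secondary-rate-limit backoff/retry to GraphQL
// calls. Detection here matches on the error message, which is an assumption
// for illustration only.
package githubretry

import (
	"context"
	"fmt"
	"strings"
	"time"
)

// retryOnSecondaryRateLimit runs fn and retries it with exponential backoff
// whenever GitHub reports a secondary rate limit, up to maxRetries attempts.
func retryOnSecondaryRateLimit(ctx context.Context, maxRetries int, fn func(context.Context) error) error {
	backoff := 30 * time.Second
	for attempt := 0; ; attempt++ {
		err := fn(ctx)
		if err == nil || !strings.Contains(err.Error(), "secondary rate limit") {
			return err // success, or an unrelated error we don't retry
		}
		if attempt >= maxRetries {
			return fmt.Errorf("giving up after %d retries: %w", maxRetries, err)
		}
		select {
		case <-time.After(backoff):
			backoff *= 2 // back off harder on each consecutive hit
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
```

Assuming a GraphQL client with a Query(ctx, query, variables) method (as in shurcooL/githubv4), a batch query could then be wrapped as retryOnSecondaryRateLimit(ctx, 3, func(ctx context.Context) error { return gqlClient.Query(ctx, &q, variables) }).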

@tplass-ias

tplass-ias commented Jan 13, 2025

Thanks for the reply @lindell, that should indeed solve the problem. I wanted to call out something that has tripped up my team when dealing with GitHub APIs: the rate limit is shared between REST and GraphQL calls.

We ended up making a meta-client with a single lock, which dispatches calls through the respective REST or GraphQL client 😓

Edit after actually reading some of the code: I see you already have a shared lock, but only use it for rate-limiting PUT/POST calls. These cost more against the secondary rate limit, but GET calls also spend against the same limit. A simple solution could be to subject READ calls to the modLock 🤷
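
For illustration, a rough sketch of the "meta-client with a single lock" shape described above; the interfaces and names are stand-ins, not multi-gitter's actual types:

```go
// Rough sketch: funnel both API styles through one mutex, because GitHub's
// secondary rate limit is shared between REST and GraphQL requests.
package ghmeta

import (
	"context"
	"sync"
)

type restClient interface {
	Do(ctx context.Context, method, path string, body, out any) error
}

type graphqlClient interface {
	Query(ctx context.Context, query any, variables map[string]any) error
}

// metaClient dispatches to the REST or GraphQL client behind a single lock.
type metaClient struct {
	mu      sync.Mutex
	rest    restClient
	graphql graphqlClient
}

func (c *metaClient) DoREST(ctx context.Context, method, path string, body, out any) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.rest.Do(ctx, method, path, body, out)
}

func (c *metaClient) DoGraphQL(ctx context.Context, query any, variables map[string]any) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.graphql.Query(ctx, query, variables)
}
```

Note that this fully serializes API access, which is exactly the concurrency trade-off lindell raises in the next comment; subjecting READ calls to the existing modLock would be the equivalent change inside multi-gitter.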

@lindell
Owner

lindell commented Jan 13, 2025

I see now that they have a way more detailed explanation for the secondary rate limit than before.

Previously, the description only said that it existed, not really how it worked; only that concurrent POST/PUT calls would be limited.

I don't think disabling all concurrency against the API would be good, as it may slow down most use cases by a lot. But using the details that now exist, we should be able to proactively limit the calls.
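
For illustration, one way to proactively limit calls would be a point budget per minute, e.g. with golang.org/x/time/rate. The point costs below (1 for reads, 5 for mutations) follow GitHub's published secondary-rate-limit guidance at the time of writing and should be treated as assumptions, not multi-gitter code.

```go
// Hedged sketch of a proactive, points-per-minute limiter that keeps
// concurrency while spending against a shared budget.
package ghlimit

import (
	"context"

	"golang.org/x/time/rate"
)

// pointLimiter spends "points" from a per-minute budget before each
// REST or GraphQL call, instead of serializing everything behind one lock.
type pointLimiter struct {
	limiter *rate.Limiter
}

func newPointLimiter(pointsPerMinute int) *pointLimiter {
	// Refill at pointsPerMinute/60 points per second, with a burst of one minute's budget.
	return &pointLimiter{
		limiter: rate.NewLimiter(rate.Limit(float64(pointsPerMinute)/60.0), pointsPerMinute),
	}
}

// Wait blocks until the call's point cost fits within the budget.
func (p *pointLimiter) Wait(ctx context.Context, mutating bool) error {
	cost := 1 // reads (GET, plain GraphQL queries) are assumed to cost 1 point
	if mutating {
		cost = 5 // mutating calls (POST/PUT/PATCH/DELETE, GraphQL mutations) cost more
	}
	return p.limiter.WaitN(ctx, cost)
}
```

Callers would do limiter.Wait(ctx, false) before reads and limiter.Wait(ctx, true) before mutations, keeping parallelism while staying under the documented budget.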

@lindell
Owner

lindell commented Jan 19, 2025

@tplass-ias This is the more basic version that just refactors the retry mechanic to cover GraphQL calls as well. It should also make it easier to take the step of using the point system etc. in the future.

But for now, I think this should fix your problem: #524

PTAL
