Add a way to respect API rate limits and timeouts #66

AntonLydike · 2024-09-19T10:07:51Z

According to the API docs, responses may include the following headers, which ask clients to limit their own request rate:

X-Rate-Limit-Limit: 50
X-Rate-Limit-Interval: 1s

It would be neat if this library supported a mode that throttles its own requests to conform to these limits, or offered a way to surface the limits to the caller.
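To make the idea concrete, here is a minimal sketch of reading the two headers above into numbers. The function name `parse_rate_limit` and the fallback of 50 requests per second are my own assumptions, not part of this library, and I'm assuming the interval is always given in seconds with a trailing `s`, as in the example.

```python
def parse_rate_limit(headers, default=(50, 1.0)):
    """Return (max_requests, interval_seconds) from response headers.

    Hypothetical helper: assumes the interval looks like "1s" (seconds).
    Falls back to `default` when a header is missing or malformed.
    """
    try:
        limit = int(headers["X-Rate-Limit-Limit"])
        # Drop the trailing "s" unit before converting to a float.
        interval = float(headers["X-Rate-Limit-Interval"].rstrip("s"))
    except (KeyError, ValueError):
        return default
    if limit <= 0 or interval <= 0:
        return default
    return limit, interval


example = {"X-Rate-Limit-Limit": "50", "X-Rate-Limit-Interval": "1s"}
limit, interval = parse_rate_limit(example)
```

The resulting pair could either drive an internal throttle or simply be exposed to the caller.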

Happy to submit a patch, if this is a welcome feature.

Also, please let me know if something like this is already implemented here; in that case I'm happy to write some documentation!

fabiobatalha · 2024-09-19T14:34:14Z

Hello @AntonLydike

There is a polite mode in the API. In fact, this API takes a synchronous approach, so it usually never makes many requests at once. One improvement that is intended to increase the API's performance is to issue requests in parallel, using multiprocessing or something similar, while iterating over pages.

You can review the polite mode.

fabiobatalha · 2024-09-19T14:42:15Z

I took a look at the implementation and it seems to be broken; it should be improved for better performance.

Take a look at:

crossrefapi/crossref/restful.py

Line 58 in 53a0c77

def do_http_request( # noqa: PLR0913

AntonLydike · 2024-09-19T17:16:23Z

Basically, what I'm doing is sharing a single Works object between multiple threads. I implemented rate limiting on top of that, but I have to guess the current limits (which seem to vary daily; some days I get away with more requests per second than others).

It would be cool to have an internal mechanism inside the API that handles this rate limiting even in a multi-threaded workload (no need for multiprocessing here, as Python's multithreading works fine for IO-bound workloads like this one).
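Roughly, the kind of thing I have in mind is a small lock-protected limiter that every thread calls before making a request. This is only a sketch under my own assumptions: the `RateLimiter` class, its method names, and the even-spacing strategy are hypothetical and not anything that exists in this library today.

```python
import threading
import time


class RateLimiter:
    """Hypothetical thread-safe limiter: spaces calls evenly so that at most
    `limit` of them start within any `interval` seconds."""

    def __init__(self, limit, interval, clock=time.monotonic, sleep=time.sleep):
        self._lock = threading.Lock()
        self._min_gap = interval / limit  # seconds between two requests
        self._next_slot = 0.0             # earliest time the next request may start
        self._clock = clock
        self._sleep = sleep

    def update(self, limit, interval):
        """Apply new limits, e.g. values read from the X-Rate-Limit-* headers."""
        with self._lock:
            self._min_gap = interval / limit

    def acquire(self):
        """Reserve the next free slot, wait until it arrives, return the delay."""
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self._min_gap
        # Sleep outside the lock so other threads can reserve later slots.
        delay = slot - now
        if delay > 0:
            self._sleep(delay)
        return delay


limiter = RateLimiter(limit=50, interval=1.0)
limiter.acquire()  # each worker thread would call this before using the shared Works object
```

Because each thread only holds the lock long enough to reserve a slot, waiting threads don't block one another, and `update()` lets the limits follow whatever the server reports instead of being guessed.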
