Default retry behavior causes rate limited events to be lost and swallows exceptions #86
Comments
Thanks @tommedema for the feedback. Sorry about the friction you've experienced using the SDK; we'll use it to make improvements. As for
The goal of the lines handling error 400 for events sent to the HTTP v2 API is to find and remove the events that caused the API to return an invalid payload response, like this utility that's used to prune out events. This case covers an invalid payload of length 1, where retrying afterwards would not make sense because that event could never be uploaded successfully. Is
Right, we don't read from
This doesn't address my point. You're talking about an invalid state, yet you're not throwing any exceptions. This requires basic error handling rather than simply swallowing these errors. I don't disagree that you can't retry them, but you shouldn't swallow the errors either. This should throw an exception so the consumer can handle them.
Perhaps I'm not explaining myself correctly. The fact that you return a response doesn't help if that response is not a good representation of what actually happened. For example, if I log an event without a userId, this is clearly an application error and I should know about it, yet you simply remove this event. I.e., you say it's successful, yet clearly it was not, because you just lost an event.
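To make the request concrete, here is a minimal sketch of what bubbling up could look like, instead of silently pruning. Everything in it is hypothetical and not part of the SDK: the AnalyticsEvent shape, the InvalidEventsError class, and the requireIdentifiedEvents helper are illustrative names, and the rule that an event needs a user_id or a device_id is assumed for the example.

```typescript
// Hypothetical event shape for illustration; not the SDK's own type.
interface AnalyticsEvent {
  event_type: string;
  user_id?: string;
  device_id?: string;
}

// Hypothetical error that carries the rejected events to the caller.
class InvalidEventsError extends Error {
  readonly rejected: AnalyticsEvent[];

  constructor(rejected: AnalyticsEvent[], reason: string) {
    super(`${rejected.length} event(s) rejected: ${reason}`);
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, InvalidEventsError.prototype);
    this.name = 'InvalidEventsError';
    this.rejected = rejected;
  }
}

// Throws instead of dropping events that carry no identifier
// (assumed rule: at least one of user_id or device_id is required).
function requireIdentifiedEvents(events: AnalyticsEvent[]): AnalyticsEvent[] {
  const rejected = events.filter((e) => !e.user_id && !e.device_id);
  if (rejected.length > 0) {
    throw new InvalidEventsError(rejected, 'missing both user_id and device_id');
  }
  return events;
}
```

A consumer can then catch InvalidEventsError, inspect rejected, and decide whether to log, alert, or fail the request, rather than being told the upload succeeded.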
In the docs it says you should retry after 30 seconds in these cases:
The client itself should return a
Sure, do note that you're doing this in many other places too. For example here, here, here, here, here, here, etc. We ended up replacing the entire class with something that doesn't retry but at least bubbles up what is going wrong: https://gist.github.com/tommedema/04de4004ee5b0f17fcb3fd29b2ad92b1 This then allows the consumer to actually retry or throw an application error: https://gist.github.com/tommedema/0f80ae4c7334e6531c56c598643f0eb7
This one is hard to explain so I just made a screen recording going over the latest source code:
https://app.usebubbles.com/bpPaAzXS3vWMpZh8tZU3N9/amplitude-retry-behavior-issues
Please let me know your thoughts. We're seeing 429s on our Amplitude dashboard, and we want to catch them so that we can push the affected events to a retry queue for later processing.
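Roughly what we have in mind on our side, as a minimal sketch under stated assumptions: RetryQueue, handleResponse, and RETRY_DELAY_MS are hypothetical names of our own, the queue is in-memory only, and the 30 second delay is taken from the docs guidance mentioned in this thread. It only works if the SDK surfaces the HTTP status instead of swallowing it.

```typescript
// Assumed delay before a rate limited batch is retried (30 seconds).
const RETRY_DELAY_MS = 30 * 1000;

interface QueuedBatch<E> {
  events: E[];
  notBefore: number; // epoch milliseconds
}

// Hypothetical in-memory retry queue; a real one would be durable.
class RetryQueue<E> {
  private batches: QueuedBatch<E>[] = [];

  enqueue(events: E[], now: number, delayMs: number): void {
    this.batches.push({ events, notBefore: now + delayMs });
  }

  // Removes and returns every batch whose delay has elapsed.
  takeDue(now: number): E[][] {
    const due = this.batches.filter((b) => b.notBefore <= now);
    this.batches = this.batches.filter((b) => b.notBefore > now);
    return due.map((b) => b.events);
  }
}

// Queues the batch on 429, returns on 2xx, and throws on anything else
// so that no failure is reported as a success.
function handleResponse<E>(status: number, events: E[], queue: RetryQueue<E>, now: number): void {
  if (status >= 200 && status < 300) {
    return;
  }
  if (status === 429) {
    queue.enqueue(events, now, RETRY_DELAY_MS);
    return;
  }
  throw new Error(`Upload failed with HTTP ${status}; ${events.length} event(s) not delivered`);
}
```

A worker would call takeDue(Date.now()) periodically and resend whatever comes back.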
Also, in your readme you say we can create our own retry class with a sendEventsWithRetry method, but then how do we access the underlying transport method?
This would require something like:
But your setupDefaultTransport function is not exported as part of the npm module as far as we can see.
Similarly, to create a retry handler we'll need to implement RetryClass, but RetryClass is not exported:
Additionally, it seems like the retry logic is currently ignoring the most common kind of throttling. An example of a response payload for a throttled device is:
So in addition to looking at exceededDailyQuotaDevices and exceededDailyQuotaUsers, the properties throttledDevices and throttledUsers should be considered. We are now going to write custom logic to handle this case in our own retry class, but it's not ideal. Unfortunately, the "plug and play" promise of Amplitude is far from a reality at the moment.
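The custom logic we have in mind looks roughly like the sketch below. Only the four property names come from this issue; the assumption that each one is a map keyed by user or device id, as well as the names RateLimitBody, IdentifiedEvent, and splitByRateLimit, are ours and may not match the SDK's actual types.

```typescript
// Assumed shape of a 429 body: only the property names come from this issue,
// the id -> number maps are an assumption.
interface RateLimitBody {
  throttledDevices?: Record<string, number>;
  throttledUsers?: Record<string, number>;
  exceededDailyQuotaDevices?: Record<string, number>;
  exceededDailyQuotaUsers?: Record<string, number>;
}

interface IdentifiedEvent {
  user_id?: string;
  device_id?: string;
}

interface RateLimitSplit<E> {
  retryLater: E[]; // throttled right now, worth retrying after a delay
  overQuota: E[]; // daily quota exceeded, a short retry will not help
  unaffected: E[]; // not named in the response
}

function hasKey(map: Record<string, number> | undefined, key: string | undefined): boolean {
  return map !== undefined && key !== undefined && Object.prototype.hasOwnProperty.call(map, key);
}

// Sorts events by what the 429 body says about their user or device.
// Daily quota takes precedence over throttling.
function splitByRateLimit<E extends IdentifiedEvent>(events: E[], body: RateLimitBody): RateLimitSplit<E> {
  const split: RateLimitSplit<E> = { retryLater: [], overQuota: [], unaffected: [] };
  for (const event of events) {
    if (hasKey(body.exceededDailyQuotaUsers, event.user_id) || hasKey(body.exceededDailyQuotaDevices, event.device_id)) {
      split.overQuota.push(event);
    } else if (hasKey(body.throttledUsers, event.user_id) || hasKey(body.throttledDevices, event.device_id)) {
      split.retryLater.push(event);
    } else {
      split.unaffected.push(event);
    }
  }
  return split;
}
```

Keeping the three groups separate lets the caller retry throttled events soon, hold over-quota events until the next day, and resend the unaffected ones immediately.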