SDK Sends Repetitive Requests (no replays allowed) #3
I've experienced this with queryBlocks calls similar to:
Interestingly, I've ONLY experienced it when querying one specific managed L2 node (public ID: ); other managed L3s ( ) have shown no issue. In fact, on the managed L3, I'm able to rapid-fire requests to pull all 15,000+ blocks 50 at a time, and have done so perhaps 15-20 times in testing with no error at any time. The managed L2, though, errors within the first 5 calls or so every time. The application experiencing this issue can be found here:
The problem area is, I think, this line: https://github.com/dragonchain/dragonchain-sdk-javascript/blob/master/src/services/dragonchain-client/DragonchainClient.ts#L1324. I suggest we use more precise Node.js timing functions, such as process.hrtime.bigint().
The date has to be sent as an ISO 8601 time string, so I don't know if getting a bigint unix timestamp would help. Additionally, the function you linked isn't compatible with Node 8, which we support, so we can't use it anyway. I'm also not 100% convinced it's that line with the issue, because I've seen this error happen when performing a query manually in the node REPL for the first time and I still got the error. Could maybe be the HTTP request library doing some weird re-request? Either way, this warrants more investigation.
`for (let i = 0; i < 100; i++) { console.log(new Date().toISOString()); }` yields long runs of identical timestamps.
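To make the collision concrete, here's a small variant of that loop (a sketch, not code from the SDK) that counts how often each ISO string repeats:

```ts
// Bucket 100 back-to-back ISO strings; with only millisecond precision,
// most iterations land on a timestamp that has already been seen.
const counts = new Map<string, number>();
for (let i = 0; i < 100; i++) {
  const ts = new Date().toISOString();
  counts.set(ts, (counts.get(ts) || 0) + 1);
}
console.log(counts); // e.g. Map { '2019-06-11T20:30:15.123Z' => 58, '2019-06-11T20:30:15.124Z' => 42 }
```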
If we had more precision in that string, we could solve this. But I actually agree that bigint might not be exactly what we need, after reading further on it.
Yeah, of course you can get a raw for loop to generate the same date string when there's no other logic. But I've also seen this error occur when performing a single request, so although getting the date ISO string may be a problem, I'm still not convinced that's the root cause.
#begindumbness I shouldn't ever be intentionally sending the same request through twice at all, and it continues to error if I DO reissue the same request, even seconds or minutes apart, until I issue a new (different) request.
Yeah, the replay protection only comes into play if the signature of the request is identical, which means all parts of the request, including query parameters and timestamp, would have to be identical to trigger replay protection.
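To illustrate the mechanics, here is a minimal sketch of signature-based replay protection, assuming an HMAC over method, path, timestamp, and body; the fields and encoding Dragonchain actually signs may differ, and `'my-secret'` / `/v1/blocks` are placeholders:

```ts
import { createHmac } from 'crypto';

// If every signed field matches a previous request byte-for-byte, the HMAC is
// identical and the server cannot distinguish the request from a replay.
function signRequest(secret: string, method: string, path: string, timestamp: string, body: string): string {
  return createHmac('sha256', secret)
    .update([method, path, timestamp, body].join('\n'))
    .digest('base64');
}

const ts = new Date().toISOString(); // millisecond precision only
const first = signRequest('my-secret', 'GET', '/v1/blocks', ts, '');
const second = signRequest('my-secret', 'GET', '/v1/blocks', ts, '');
console.log(first === second); // true -> the second request would trip replay protection
```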
Okay, pure speculation here: is it possible that the SDK client is reusing previous fetch options somehow if the query parameters match a previous failed request? If the timestamp is enough to get past the replay protection, I'd think reissuing the same failed request a few seconds or more later would DEFINITELY get it through, but it doesn't.
I'm not sure why that would happen, but I wouldn't immediately rule it out until I see proof otherwise. There's no code to 'cache' any sort of request data or anything, but yeah, if you're making a request seconds later, it absolutely should work fine.
Let me see if I can force-break it again and prove that idea.
Disregard that thought. It was a red herring. Once I get it to error, I can rerun that request over and over (with a few seconds between each request), and whether it succeeds appears to be random (no specific trigger from my end). I'll leave this to the experts. 😄 It IS still ONLY happening on that specific managed L2 for me though. Not a single issue with other nodes.
I am fairly confident that the timestamp precision is the issue. Python gives 6 digits of sub-second precision (microseconds); Node gives only 3 (milliseconds).

```python
# Python
import datetime

for x in range(0, 100):
    print(datetime.datetime.utcnow().isoformat() + 'Z')
```
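For comparison, here is a sketch of how extra sub-millisecond digits could be obtained on newer Node versions; as noted above, process.hrtime.bigint() needs Node 10.7+, so it isn't an option while Node 8 is supported:

```ts
// Pad the ISO string with digits taken from the monotonic clock. The extra
// digits are NOT wall-clock microseconds, just monotonic sub-millisecond ticks
// that make two timestamps generated in the same millisecond differ.
function isoWithExtraDigits(): string {
  const iso = new Date().toISOString(); // e.g. 2019-06-11T20:30:15.123Z
  const submilli = (process.hrtime.bigint() / 1000n) % 1000n;
  return iso.replace('Z', submilli.toString().padStart(3, '0') + 'Z');
}
```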
I know that the node SDK has less precision, BUT I have seen this happen when performing a SINGLE request in a REPL, which is why, even if the timestamp was an issue, there is at least one other thing also causing this problem.
Check this out --> https://github.com/wadey/node-microtime
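For reference, node-microtime reports wall-clock time at microsecond resolution, roughly like this (a sketch based on its README; check the project for the exact API):

```ts
import microtime = require('microtime');

console.log(microtime.now()); // microseconds since epoch, e.g. 1560284415123456
console.log(Date.now());      // milliseconds since epoch, e.g. 1560284415123
```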
Ahh, gotcha.
Check this out: I have crap for internet, and I've been having even more trouble with connection timeouts, etc., in recent days, and usually only to certain sites/addresses. I wonder if I'm causing the problem myself with failed requests that Node/my browser is automatically repeating, but only to that endpoint?
If it’s failed request handling on the nodejs/expressjs side from my end, that would explain how I could be sending two identical queries, but I’d still think it wouldn’t be sending them in the same thousandth of a second.
Yeah, I've seen that problem before with express as the webserver, but Dragonchain uses Python flask+gunicorn for the webserver, so I highly doubt that's the issue. I currently suspect some sort of hidden retry logic within node's HTTP request code itself, or maybe in the HTTP request library the SDK uses.
Yeah, unfortunately I haven't come across a consistent way to repro. Any consistent repro steps would be highly appreciated.
Attempted a fix with #5. Note this was released as a new version of the SDK.
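A sketch of what that fix plausibly looks like, based on the "3 random trailing digits" described in the next comment (the actual diff in #5 may differ):

```ts
// Append 3 random digits to the millisecond ISO timestamp so that two requests
// built in the same millisecond still produce different signatures.
function timestampWithRandomSuffix(): string {
  const suffix = Math.floor(Math.random() * 1000).toString().padStart(3, '0');
  return new Date().toISOString().replace('Z', suffix + 'Z');
}

console.log(timestampWithRandomSuffix()); // e.g. 2019-06-11T20:30:15.123042Z
```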
The same problem occurs both with and without the adjusted timestamp format. I manually added the new timestamp format (with 3 random trailing digits) to SDK version 3.2.0 and ran the same method that was causing the problem. In my tests, I've added code that should log ANY http/https request that's made. Sample requests with old timestamp format:
Sample requests with new timestamp format:
Yeah, this strengthens my suspicion that there must be some silent retry logic going on that we don't see. Warrants further investigation.
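For reference, here is a minimal sketch of the kind of catch-all request logging described in the previous comment (an assumption about the approach, not the actual instrumentation used):

```ts
import http = require('http');
import https = require('https');

// Wrap http(s).request so every call made through the standard modules is
// logged with a timestamp. Note the limitation: a retry issued below this
// layer (e.g. at the socket level) would NOT appear here, which is exactly
// the open question about silent retries.
function logRequests(mod: any, label: string): void {
  const original = mod.request;
  mod.request = function (...args: any[]) {
    console.log(`[${label}]`, new Date().toISOString(), args[0]);
    return original.apply(this, args);
  };
}

logRequests(http, 'http');
logRequests(https, 'https');
```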
Could even be an HTTP protocol-level matter (https://blogs.oracle.com/ravello/beware-http-requests-automatic-retries), but that seems to normally affect long-running requests, and I have no idea whether Node would see that as a separate request (and thus log it in my code) or not. In fact, I'm inclined to think I WOULDN'T see the retry. From the article:
Edit: though the actual requests are being made within my nodejs app (inside an expressjs request), not directly through the client SDK in the browser. The weirdest part is that this is only happening against the one managed L2 for me in my testing, and it happens extremely consistently (every attempt to pull a large number of blocks, and within the first 5-10 requests).
Yeah, the even weirder part is I haven't ever seen this issue in the Python SDK. At some point when I get some time, I'll proxy my requests through something that lets me watch my HTTP traffic, so I can see the actual packets going in/out and confirm whether this is happening or not.
Quick update on this issue: tried loading from the problematic managed L2 I originally noticed the problem on, and since the 4.0 update, it looks like it's working fine now. Pulled 38k+ blocks with no trouble at all. I'm still blaming elasticsearch. Will update here if I run into it again. |
For some reason, the JavaScript SDK occasionally sends repeat requests which trigger the replay protection on Dragonchain (`Previous matching request found (no replays allowed)`). This is a bug that only occurs with this SDK. The cause at this time is still unknown. If any details are discovered, this ticket should be updated.