Datasets seem to have 404 errors. #13
Comments
Hey, thanks for compiling a list of missing things! Indeed, some things are missing (and some of that was expected, as not all contractor data was uploaded to the servers). Do you have a rough estimate of how much data is returning 404 (as a percentage)? If it is more than expected, @brandonhoughton could take a look at it :)
Apologies, I'm still trying to get a real handle on that answer. If I go by hazy-thistle-chipmunk, about 20% is missing; if I go by woozy-ruby-ostrich, it's more like 2-5% so far. So I don't have a proper answer yet, but I think I had a tool that checks status codes without downloading, which might be able to 'ping' every file and give a definitive answer across all datasets. Just... trying to find that program.
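For anyone who wants to measure this without that tool: a minimal sketch of the "check status codes without downloading" idea, using HTTP HEAD requests from the Python standard library. The function names `head_status` and `missing_fraction` are made up for illustration, and it assumes the server answers HEAD requests with the same status code it would give for a GET.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return the HTTP status code for url via a HEAD request (no body is downloaded)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error statuses such as 404 are raised as exceptions; report the code instead.
        return err.code


def missing_fraction(urls, check=head_status):
    """Return (urls whose status is not 200, fraction of urls that are missing)."""
    urls = list(urls)
    missing = [url for url in urls if check(url) != 200]
    fraction = len(missing) / len(urls) if urls else 0.0
    return missing, fraction
```

Feeding it the full URL list built from an index file would give one percentage per dataset. The `check` and `opener` parameters exist only so the logic can be exercised without network access.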
Hi! This was an issue with my indexing code. I checked whether both files exist, but then ignored the result and included all files =P
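A rough sketch of what the intended check might look like, not the actual indexing code: a recording is only indexed when both its video and its action file are present. The function name, the `exists` callback, and the assumption that the two files share a basename with `.mp4` and `.jsonl` extensions are all guesses for illustration.

```python
def complete_recordings(basenames, exists):
    """Return only the basenames for which both the .mp4 and the .jsonl file exist.

    `exists` is a hypothetical callback that takes a file name and returns True
    if that file is available (for example, a lookup in a storage listing).
    """
    index = []
    for base in basenames:
        has_video = exists(base + ".mp4")
        has_actions = exists(base + ".jsonl")
        # Use the result of the check instead of discarding it.
        if has_video and has_actions:
            index.append(base)
    return index
```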
Still working on this. However, can someone verify that there is a good copy of https://openaipublic.blob.core.windows.net/minecraft-rl/data/10.0/thirsty-lavender-koala-f153ac423f61-20220414-104227.jsonl? I keep getting an invalid output that ends in the middle of line 442.
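One way to confirm a truncation like this on a local copy is to parse the file line by line and report the first line that is not valid JSON. This is a small sketch with a made-up function name; it assumes the file is UTF-8 and holds one JSON value per line.

```python
import json


def first_bad_jsonl_line(path):
    """Return (line_number, error_message) for the first invalid line, or None if all parse."""
    with open(path, "r", encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines rather than flagging them
            try:
                json.loads(line)
            except json.JSONDecodeError as err:
                return number, str(err)
    return None
```

If the downloaded file is cut off partway through a record, the reported line number should be the last line of the file.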
@brandonhoughton I wonder if something happened in the upload process and these files broke halfway through the upload?
Hi, I have been fetching chunks of https://openaipublic.blob.core.windows.net/minecraft-rl/snapshots/all_7xx_Apr_6.json and have noticed that each username is missing chunks of data.
Attached is what comes up as 404 for the user hazy-thistle-chipmunk, but all users seem to be missing chunks (both mp4 and jsonl).
hazy-thistle-chipmunk-404.txt