dataset download not working #7

Closed
nikit-srivastava opened this issue Jul 16, 2024 · 2 comments

@nikit-srivastava

Hello,

Thank you for the nice work.

I would like to benchmark your system for my research, but I am getting an error when running the data_download.sh script.

The problem seems to be at lines 21 and 26 of data_download.sh: the zip files downloaded by these lines are corrupted. Looking further into it, I found that the wget statement does not download the zip files correctly, and the Google Drive link at line 26 does not work. To fix the wget issue, I suggest using the following at line 21:
curl -L "https://drive.usercontent.google.com/download?id=1QbT5FDOJtdVd7AqZ-ekwUh2_pn6nNpb3&confirm=xxx" -o output_pred21_8_30.zip

For wiki-news-300d-1M.zip, you probably need to check the sharing permissions of the zip file on your Drive, or re-upload it.
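
A quick way to tell whether a download is a real archive or something else saved under a .zip name (for example an HTML page) is to look at its first bytes, since zip archives begin with the signature "PK". The sketch below is only an illustration: looks_like_zip is a hypothetical helper, not part of data_download.sh, and it checks the signature only, not the integrity of the whole archive.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of data_download.sh).
# Succeeds only if the file exists and starts with the "PK" zip signature.
looks_like_zip() {
  local file="$1"
  # A missing path cannot be a valid archive.
  [ -f "$file" ] || return 1
  # Compare the first two bytes with the zip signature.
  [ "$(head -c 2 "$file")" = "PK" ]
}
```

After a download it could be used as: looks_like_zip output_pred21_8_30.zip || echo "download is not a zip archive". For a full integrity check, unzip -t on the file is the stricter option.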

@Reham-Osama
Collaborator

Hello,

Can you please try to download the files from these two links and let me know if they work:

  1. wiki-news-300d-1M.zip: https://drive.google.com/file/d/1UTPGv8QUgqSVQ2JeX9QVW0YhbGRxONLL/view?usp=sharing
  2. output_pred21_8_30.zip: https://drive.google.com/file/d/1QbT5FDOJtdVd7AqZ-ekwUh2_pn6nNpb3/view?usp=sharing

It looks like a recent Google Drive update prevents large files from being downloaded the way the script does it.
For now, you can comment out lines 21 and 26 and download the files manually using the links above.
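
For anyone who prefers to keep the download scripted, the file ID in a share link can be dropped into the same direct-download URL pattern used in the curl command earlier in this thread. This is a sketch under assumptions: drive_direct_url is a hypothetical helper, the confirm=xxx value is copied as-is from that curl command, and there is no guarantee that Google Drive keeps serving this URL form or that it works for every file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a Drive share link of the form
#   https://drive.google.com/file/d/<ID>/view?usp=sharing
# into the direct-download URL pattern from the curl command in this thread.
drive_direct_url() {
  local share_url="$1"
  local id
  # Extract the ID between "/file/d/" and the next "/" or "?".
  id="$(printf '%s\n' "$share_url" | sed -n 's#.*/file/d/\([^/?]*\).*#\1#p')"
  # Fail if the link does not have the expected shape.
  [ -n "$id" ] || return 1
  printf 'https://drive.usercontent.google.com/download?id=%s&confirm=xxx\n' "$id"
}
```

A possible use, following the earlier suggestion: curl -L "$(drive_direct_url "<share link>")" -o output_pred21_8_30.zip. If the file's sharing permissions are restricted, the result will still not be a usable archive.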

@nikit-srivastava
Author

@Reham-Osama Thank you for making the links available; the dataset download now completes normally. However, I then hit an error where the word_embedding Docker container could not find the embedding file. I resolved it by changing line 8 of docker-compose-server.yml to:
- ./data/wiki-news-300d-1M.txt:/app/data/wiki-news-300d-1M.txt
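
Since the container only sees the file if the host side of the mount exists, a small pre-flight check before starting the services can make this kind of error easier to spot. The following is a minimal sketch; require_file is a hypothetical helper and is not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the repository).
# Succeeds only if the given path exists and is non-empty.
require_file() {
  local path="$1"
  if [ ! -s "$path" ]; then
    echo "missing or empty: $path" >&2
    return 1
  fi
}
```

For example: require_file ./data/wiki-news-300d-1M.txt && docker compose -f docker-compose-server.yml up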
