Checksum for PubLayNet_PDF.tar.gz #45

Open
conjuncts opened this issue Feb 28, 2024 · 1 comment

Comments

@conjuncts

Hello,
I tried downloading the PDF dataset, but I had only extracted around 10% of it before I ran into a data corruption error. Are checksums or data splits available for PubLayNet_PDF.tar.gz?

@themanoftalent

It sounds like you're having trouble downloading the PubLayNet dataset. Without knowing where you're downloading it from, it's hard for me to give a precise solution, but I can offer some general advice.

  1. Check for Official Sources: Make sure you're downloading the dataset from the official source rather than a third-party mirror.
  2. Checksums: Check whether the dataset provider publishes checksums for the files, and compare them against the checksum of your downloaded copy.
  3. Data Splits: Some datasets are split into multiple parts for easier downloading. Make sure you've downloaded all of the parts.
  4. Redownload: If you suspect the downloaded file is corrupted, try downloading it again. An interrupted or incomplete transfer can leave a truncated archive, so a fresh download sometimes fixes it.
    Akif, the outlier
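
As a rough sketch of points 2 and 4 above, the snippet below hashes a local copy of the archive and checks whether its gzip stream decompresses cleanly. It uses only the Python standard library. The helper names `sha256_of_file` and `gzip_stream_is_intact` are made up for illustration, and SHA-256 is an assumption: nothing in this thread says which checksum algorithm (if any) the maintainers publish, so the digest is only useful once there is a reference value to compare it against.

```python
import gzip
import hashlib
import zlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    Chunked reads keep memory use flat even for a very large archive.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def gzip_stream_is_intact(path, chunk_size=1024 * 1024):
    """Decompress the whole gzip stream and report whether it reads cleanly.

    Returns False if the stream is truncated or fails gzip's own CRC check.
    This does not prove the file matches the original upload; only a
    published checksum can do that.
    """
    try:
        with gzip.open(path, "rb") as f:
            while f.read(chunk_size):
                pass  # discard the data, we only care whether reading fails
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

For example, calling `sha256_of_file("PubLayNet_PDF.tar.gz")` gives a digest that two people can compare to see whether their downloads are byte-identical, and `gzip_stream_is_intact("PubLayNet_PDF.tar.gz")` returning `False` suggests the download itself is damaged or incomplete rather than the extraction tool being at fault.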
