Hi, I noticed that the train.jsonl file downloaded from Hugging Face under DocDownStream1.0 contains a lot of duplicated data, and the amount of duplication depends on the data source. For example, each record in DocVQA appears 3 times, while each record in Deepform, InfographicsVQA, KleisterCharity, and WikiTableQuestions appears 2 times. Is this intentional?
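For anyone who wants to reproduce the counts, here is a minimal sketch of how the duplication could be measured per data source. It assumes each line of train.jsonl is a JSON object and that the source is stored under a field called `dataset_name`. That field name is a guess on my part and not something confirmed from the file's schema, so adjust `source_key` as needed. Two records count as duplicates only if all of their fields match.

```python
import json
from collections import Counter, defaultdict


def duplication_by_source(lines, source_key="dataset_name"):
    """Return {source: (total_records, unique_records)} for JSONL lines.

    `source_key` is a hypothetical field name; change it to whatever
    field identifies the data source in the real file.
    """
    counts = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        source = record.get(source_key, "<unknown>")
        # Serialize with sorted keys so key order does not hide duplicates.
        fingerprint = json.dumps(record, sort_keys=True, ensure_ascii=False)
        counts[source][fingerprint] += 1
    return {
        source: (sum(per_record.values()), len(per_record))
        for source, per_record in counts.items()
    }
```

To run it on the real file, pass an open file handle, e.g. `duplication_by_source(open("train.jsonl", encoding="utf-8"))`. A source where the total is 3 times the unique count would correspond to the 3x duplication described above.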