Data creation scripts for markdown #82

Open
shubhamagarwal92 opened this issue Sep 7, 2023 · 0 comments


Hi!

Thank you for open-sourcing the code. Could you please also provide the scripts for the GROBID-based processing and for storing documents in markdown format, as mentioned in the Appendix of the paper:

We use a modified version of the GROBID library for converting PDFs to text, as well as obtaining titles,
authors and citations. 

The final paper documents are stored in a markdown format, as opposed to full LaTeX. We use markdown as
the standard format for all documents in the corpus to support knowledge blending between sources. Papers
are citation processed, following the title-based approach
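
To make the request concrete, below is a minimal sketch of the kind of conversion script meant here. It assumes TEI XML as produced by stock, unmodified GROBID (for example from a GROBID server's full-text endpoint), not the modified version the paper describes. The function name `tei_to_markdown`, the heading levels, and the reference list layout are illustrative assumptions, and the sketch does not attempt the title-based citation processing mentioned above.

```python
import xml.etree.ElementTree as ET

# GROBID emits TEI XML; all elements live in the TEI namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def _text(node, sep=""):
    """Join all text under a node and normalize whitespace."""
    if node is None:
        return ""
    return " ".join(sep.join(node.itertext()).split())


def tei_to_markdown(tei_xml: str) -> str:
    """Hypothetical helper: turn a GROBID-style TEI document into markdown.

    Assumption: the layout (title as '#', sections as '##', numbered
    references) is a guess, not the format used for the paper's corpus.
    """
    root = ET.fromstring(tei_xml)
    lines = []

    # Title and authors come from the header, so that authors of cited
    # works (which sit under <text>) are not picked up here.
    header = root.find("tei:teiHeader", TEI_NS)
    if header is not None:
        title = _text(header.find(".//tei:titleStmt/tei:title", TEI_NS))
        if title:
            lines += [f"# {title}", ""]
        authors = [
            _text(pers, sep=" ")
            for pers in header.findall(".//tei:author/tei:persName", TEI_NS)
        ]
        if authors:
            lines += [", ".join(authors), ""]

    # Body: one markdown section per top-level <div>.
    body = root.find("tei:text/tei:body", TEI_NS)
    if body is not None:
        for div in body.findall("tei:div", TEI_NS):
            head = _text(div.find("tei:head", TEI_NS))
            if head:
                lines += [f"## {head}", ""]
            for para in div.findall("tei:p", TEI_NS):
                lines += [_text(para), ""]

    # Citations: list each reference by the first title found in its entry.
    refs = root.findall(".//tei:listBibl/tei:biblStruct", TEI_NS)
    if refs:
        lines += ["## References", ""]
        for i, bib in enumerate(refs, 1):
            lines.append(f"{i}. {_text(bib.find('.//tei:title', TEI_NS))}")

    return "\n".join(lines).strip() + "\n"
```

Something along these lines covers titles, authors, body text and a flat reference list, but the actual scripts would still be needed to reproduce the corpus format and the citation handling.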