Awesome Scholarly Data Analysis

A list of resources on scholarly data analysis, including datasets, papers, and code for bibliometrics, citation analysis, and other scholarly commons resources. Available online at https://shubhanshu.com/awesome-scholarly-data-analysis/

Datasets

Publication and Citation

Peer Review

Grants and Funding

Academic Genealogy

Author Profiles

Author name disambiguation

Thesis datasets

Information Extraction and NLP

Networks

ACL Anthology Network
I³ Open Innovation Dataset Index - Multiple datasets related to patent networks, inventor careers, etc.
Science4cast Competition - capture the evolution of scientific concepts and predict which research topics will emerge in the coming years

Taxonomies and Ontologies of Research Concepts

SciGraph Springer Nature
Medical Subject Headings maintained by the National Library of Medicine of the United States
Computer Science Ontology maintained by Scholarly Knowledge: Modeling, Mining and Sense Making
Physics Subject Headings (PhySH) maintained by the American Physical Society (APS) - GitHub
Open Biological and Biomedical Ontology (OBO) maintained by the OBO Foundry
ACM Computing Classification System maintained by the Association for Computing Machinery
Physics and Astronomy Classification Scheme (PACS) maintained by the American Institute of Physics (AIP); discontinued in 2010 and replaced by Physics Subject Headings
Mathematics Subject Classification (MSC) maintained by Mathematical Reviews and zbMATH
Journal of Economic Literature (JEL) maintained by the American Economic Association
STW Thesaurus for Economics maintained by ZBW - Leibniz Information Centre for Economics
Australian and New Zealand Standard Research Classification (ANZSRC) maintained by the Australian Bureau of Statistics; it consists of 3 sub-classification schemes:
- Fields of Research (FoR) classification
- Research Fields, Courses and Disciplines (RFCD) classification
- Socio-Economic Objective (SEO) classification
Library of Congress Classification (LCC) maintained by the Library of Congress
Fields of Study (FoS) maintained by Microsoft Academic
Crossref Open Funder Registry
Scientific Keyphrase Extraction Datasets - KP20k, NUS, MAG_KP
Grounding Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources
XL-BEL: A benchmark for cross-lingual biomedical entity linking, spanning 10 typologically diverse languages
IteraTeR: Understanding Iterative Revision from Human-Written Text based on ArXiv abstract edit versions
CiteSum: Citation Text-guided Scientific Extreme Summarization and Low-resource Domain Adaptation
AckExtract: Extraction of acknowledgements and their named entities from scholarly papers
The MSVEC Dataset: Multi-Domain Scientific Claim Verification Evaluation Corpus (MSVEC)
GIANT: The 1-Billion Annotated Synthetic Bibliographic-Reference-String Dataset for Deep Citation Parsing - dataverse

Affiliations

Altmetrics and Dimensions

Tools

User interface to publication datasets and analysis

Tools for collecting open access papers

Tools for classifying research papers

Visualizations

Language Processing and Information Extraction

Citation and metadata extraction

Publication and Publisher Info

Interactive sheet for deciding publication strategy and open science - Tweet

Author Name Disambiguation

Bibliographic Entity Automatic Recognition and Disambiguation - paper

Community

Journals

Conferences

Workshops

Summer Schools

Courses

SI 710: Science of Science - University of Michigan School of Information

Associations & Community

Research Groups

Science of Science and Computational Discovery Lab - University of Colorado Boulder

Blogs

Contributions

The following people have contributed to the items on this list.

Shubhanshu Mishra - Maintainer of the list.
Angelo Antonio Salatino
Philipp Zumstein
Ali (Aliakbar Akbaritabar)
Andrea Mannocci


napsternxg/awesome-scholarly-data-analysis
