Auto-compute Citations to Software From Replication Files

recite/softverse

Softverse: Auto-compute Citations to Software From Replication Files

We analyze replication files posted to the Harvard Dataverse by 34 social science journals, including the APSR, AJPS, JoP, BJPolS, Political Analysis, World Politics, and Political Behavior, to tally the libraries they use. These tallies can serve as a citation metric for software.

See: https://gojiberries.io/2023/07/02/hard-problems-about-research-software/

Scripts

  1. Datasets by Dataverse: produces a list of datasets for each dataverse (.gz).

  2. List And Download All (R) Scripts Per Dataset: takes the files from step #1, produces a list of files per dataset (.gz), and downloads those scripts (dump here).

  3. Regex the files to tally imports: takes the output from step #2 and produces imports per file and imports per package (if a package is imported more than once within a repository, we count it only once). A snippet of the imports-per-package file can be seen below.
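
The tallying in step #3 can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the regular expressions, the function names (`extract_imports`, `tally_packages`), and the input layout (a dict mapping a dataset identifier to the text of its R scripts) are all assumptions made for the example. It treats `library()`, `require()`, and `pkg::fun` as imports, and counts each package at most once per dataset.

```python
import re
from collections import Counter
from typing import Dict, List, Set

# Assumed patterns (not taken from the repository):
# library(pkg), require(pkg), requireNamespace("pkg"), with optional quotes.
IMPORT_RE = re.compile(
    r"""\b(?:library|require|requireNamespace|loadNamespace)\s*\(\s*["']?([A-Za-z][A-Za-z0-9.]*)"""
)
# Namespace-qualified calls such as dplyr::select or pkg:::internal.
NAMESPACE_RE = re.compile(r"\b([A-Za-z][A-Za-z0-9.]*):::?[A-Za-z._]")


def extract_imports(source: str) -> Set[str]:
    """Return the set of package names referenced in one R script."""
    packages: Set[str] = set()
    for line in source.splitlines():
        # Naive comment stripping: cut at the first '#'. This also cuts
        # a '#' that appears inside a string literal.
        code = line.split("#", 1)[0]
        packages.update(IMPORT_RE.findall(code))
        packages.update(NAMESPACE_RE.findall(code))
    return packages


def tally_packages(scripts_by_dataset: Dict[str, List[str]]) -> Counter:
    """Count, for each package, the number of datasets that import it."""
    counts: Counter = Counter()
    for scripts in scripts_by_dataset.values():
        seen: Set[str] = set()
        for source in scripts:
            seen |= extract_imports(source)
        # Updating with a set adds 1 per package, so repeated imports
        # within the same dataset are counted only once.
        counts.update(seen)
    return counts
```

Calling `tally_packages(...).most_common(30)` on such a dict would give a ranked list in the same shape as the table below. A regex pass like this misses packages loaded indirectly (for example via `pacman::p_load` or `library(x, character.only = TRUE)`, where it would record the variable name instead), so the counts are best read as approximate.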

p.s. Deprecated R Files here

Top R Package Imports

package count
ggplot2 1322
foreign 1009
stargazer 901
dplyr 789
tidyverse 720
xtable 608
plyr 485
lmtest 451
MASS 442
gridExtra 420
sandwich 394
haven 356
car 342
readstata13 339
reshape2 324
stringr 318
texreg 273
data.table 263
scales 257
tidyr 253
grid 247
lme4 241
Hmisc 236
lubridate 223
readxl 218
broom 195
lfe 190
RColorBrewer 188
ggpubr 188
estimatr 174

Authors

Gaurav Sood and Daniel Weitzel
