Managing all data checks
Install bdchecks with:
install.packages("bdchecks")
Or install the development version:
devtools::install_github("bd-R/bdchecks")
Load with:
library(bdchecks)
Perform data checks (not exported yet):
result <- bdchecks::perform_dc(data_bats)
Check what data checks were performed (default show method):
result
Quick glance at data check result (% of records that passed) (not exported yet):
# Nice summary
summary_dc(result)
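To make the "% of records that passed" figure concrete, here is a minimal base-R sketch of how such a pass rate can be computed. The `checks` list, the check names, and the `percent_passed()` helper are hypothetical and only illustrate the arithmetic; they are not the structure returned by `perform_dc()` or the implementation of `summary_dc()`.

```r
# Hypothetical results: one logical vector per data check (TRUE = record passed).
# Assumption for illustration only; this is NOT the object returned by perform_dc().
checks <- list(
  check_a = c(TRUE, TRUE, FALSE, TRUE),
  check_b = c(TRUE, NA, FALSE, FALSE)
)

# Percentage of records that passed a check.
# NA is treated here as "record could not be evaluated" and is left out.
percent_passed <- function(flags) {
  100 * mean(flags, na.rm = TRUE)
}

# One pass rate per check, as a named numeric vector
pass_rates <- vapply(checks, percent_passed, numeric(1))
round(pass_rates, 1)
```

Whether missing values should count as failures or be excluded is a design choice; the sketch excludes them.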
Recommended workflow for adding new data checks:
- Load libraries
library(bdchecks)
library(devtools) # To install new version of a package
library(usethis) # To export data.checks object
- Check if original build works
check()
- Create new data check
3.1 Add meta information to the ./inst/extdata/data_check.yaml file
3.2 Add the data check function to the ./R/ directory (the file should be named after the data check, e.g. dc_checkthis.R). The first argument of a data check function must be the vector (column) on which the data check is performed.
3.3 Add test data to ./inst/extdata/data_test.yaml
- Export new data check
install() # To have new check in `system.file("extdata/data_check.yaml")`
data.checks <- bdchecks:::datacheck_info_export() # exports documentation and combines the new check with the old ones
use_data(data_taxonrank, data.checks, data_bats, overwrite = TRUE, internal = TRUE) # exports old (and new) data checks
document() # document new check to `Rd`
install() # install package with a new check
- Test if everything works
perform_test_dc() # perform tests for data checks
check() # perform general check
- Post-incorporation
6.1 Increase version (and date) in DESCRIPTION
6.2 Run git add and git commit with message: "v0.x.x Added name_of_check data check"
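For step 3.2, the function file might contain something like the following minimal sketch. The name `dc_checkthis` is taken from the example file name above. The body and the return value (one logical per record, TRUE meaning the record passed) are assumptions for illustration, not the package's documented contract.

```r
# Hypothetical data check: a record passes if its value is present and non-empty.
# Assumption: a data check returns one logical value per record (TRUE = passed).
dc_checkthis <- function(x) {
  # x is a single column (vector) of the data set, as required for the first argument
  !is.na(x) & nzchar(trimws(as.character(x)))
}

# Try it on a small made-up column
dc_checkthis(c("Myotis", "", NA, " "))
# returns TRUE FALSE FALSE FALSE
```

Keeping the function vectorised over the column means it can be applied to any field without a loop over records.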