Commit

Merge pull request #18 from lcpilling/v0.2.8
v0.2.8
lcpilling authored Oct 6, 2024
2 parents c19bff4 + 3c19007 commit e0c5ab3
Showing 13 changed files with 355 additions and 130 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -48,3 +48,4 @@ po/*~
# RStudio Connect folder
rsconnect/
docs
inst/doc
6 changes: 5 additions & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: ukbrapR
Title: R functions to use in the UK Biobank Research Analysis Platform (RAP)
Version: 0.2.7.9000
Version: 0.2.8
Authors@R: c(person("Luke", "Pilling",
email = "L.Pilling@exeter.ac.uk",
role = c("aut", "cre"),
@@ -28,3 +28,7 @@ Encoding: UTF-8
LazyData: true
RoxygenNote: 7.2.3
BugReports: https://github.com/lcpilling/ukbrapR/issues
Suggests:
knitr,
rmarkdown
VignetteBuilder: knitr
4 changes: 3 additions & 1 deletion NEWS.md
@@ -1,11 +1,13 @@
# ukbrapR v0.2.7.9000 (05 October 2024)
# ukbrapR v0.2.8 (05 October 2024)

### Bug fixes
- Baseline dates TSV is now correctly located even if user changes working directory
- HES operations dates were sometimes parsed as character - this is now fixed to parse as dates

### Updates
- Warnings relating to parsing issues during grepping that are safe to ignore are now suppressed
- Updates to documentation / examples / pkgdown site
- New website articles for `ascertain_diagnoses`, `label_fields` and `spark_functions`


# ukbrapR v0.2.7 (30 September 2024)
8 changes: 7 additions & 1 deletion R/get_emr_spark.R
@@ -1,6 +1,10 @@
#' Get UK Biobank participant Electronic Medical Records (EMR) data in a RAP Spark environment
#'
#' @description Using a Spark node/cluster on the UK Biobank Research Analysis Platform (DNAnexus), use R to get medical records for a specific list of diagnostic codes
#' @description
#'
#' This function is no longer maintained; use `get_diagnoses()` instead.
#'
#' Using a Spark node/cluster on the UK Biobank Research Analysis Platform (DNAnexus), use R to get medical records for a specific list of diagnostic codes
#'
#' @return Returns a list of data frames (the participant data for the requested diagnosis codes: `death_cause`, `hesin_diag`, and `gp_clinical`. Also includes the original codes list)
#'
@@ -36,6 +40,8 @@ get_emr_spark <- function(
verbose=FALSE
) {

lifecycle::deprecate_warn("0.2.0", "get_emr_spark()", "get_diagnoses()", details="Spark functions are no longer maintained and may contain bugs compared to newer functions.")

start_time <- Sys.time()

vocab_col = "vocab_id"
152 changes: 25 additions & 127 deletions README.md
@@ -1,40 +1,40 @@
# ukbrapR <a href="https://lcpilling.github.io/ukbrapR/"><img src="man/figures/ukbrapR.png" align="right" width="150" /></a>

<!-- badges: start -->
[![](https://img.shields.io/badge/version-0.2.7.9000-informational.svg)](https://github.com/lcpilling/ukbrapR)
[![](https://img.shields.io/github/last-commit/lcpilling/ukbrapR.svg)](https://github.com/lcpilling/ukbrapR/commits/master)
[![](https://img.shields.io/badge/version-0.2.8-informational.svg)](https://github.com/lcpilling/ukbrapR)
[![](https://img.shields.io/github/last-commit/lcpilling/ukbrapR.svg)](https://github.com/lcpilling/ukbrapR/commits/main)
[![](https://img.shields.io/badge/lifecycle-experimental-orange)](https://www.tidyverse.org/lifecycle/#experimental)
[![DOI](https://zenodo.org/badge/709765135.svg)](https://zenodo.org/doi/10.5281/zenodo.11517716)
<!-- badges: end -->

ukbrapR (phonetically: 'U-K-B-wrapper') is an R package for working in the UK Biobank Research Analysis Platform (RAP). The aim is to make analysis on the RAP quicker, easier, and more reproducible.

> Since version `0.2.0` the package works best in a "normal" cluster using RStudio and raw UK Biobank data from the table-exporter. Prior versions were designed with Spark clusters in mind. These functions are still available but are not updated.
> Since `v0.2.0` ukbrapR works best on a "normal" cluster using RStudio and raw data from the table-exporter. Old Spark functions are still available but are not updated.
<sub>Wrapped server icon by DALL-E</sub>

## Installation

In the DNAnexus Tools menu launch an RStudio environment on a normal priority instance. Install {ukbrapR} as below:
In the DNAnexus Tools menu launch an RStudio environment on a normal priority instance.

```r
# install latest release (recommended)
remotes::install_github("lcpilling/ukbrapR@*release")

# development version
# remotes::install_github("lcpilling/ukbrapR")

# previous release (see tags)
# remotes::install_github("lcpilling/ukbrapR@v0.1.7")
```

## Export tables of raw data
## Ascertain diagnoses

Diagnoses of conditions in UK Biobank participants come from multiple data sources. {ukbrapR} makes it fast and easy to ascertain diagnoses from these sources in the DNAnexus Research Analysis Platform (RAP). Follow the steps below. See the website article for more details.


This only needs to happen once per project. Running `ukbrapR::export_tables()` will submit the necessary `table-exporter` jobs to save the raw medical records files to the RAP persistent storage for the project. ~10 GB of text files are created. This will cost ~£0.15 per month to store in the RAP standard storage.
### 1. Export tables of raw data

Once the files are exported (~15mins) these can then be used by the below functions to extract diagnoses based on codes lists.
This only needs to happen once per project. Run `export_tables()` to submit the `table-exporter` jobs that save the required files to the RAP persistent storage. ~10 GB of text files are created, costing ~£0.15 per month to store.

## Get GP, HES, cancer registry, and self-reported illness data
### 2. Get diagnoses from all data sources

For a given set of diagnostic codes, get the participant Electronic Medical Records (EMR) and self-reported illness data. Returns a list containing up to 6 data frames: the subset of the clinical files with matched codes.

@@ -54,22 +54,20 @@ head(codes_df_ckd)
#> 1 ckd ICD10 N18.3
#> 2 ckd ICD10 N18.4
#> 3 ckd ICD10 N18.5
#> 4 ckd ICD10 N18.6
#> 5 ckd ICD10 N18.9
#> 6 ckd ICD10 N19
#> ...

# get diagnosis data - returns list of data frames (one per source)
diagnosis_list <- get_diagnoses(codes_df_ckd)
#> 7 ICD10 codes, 40 Read2 codes, 37 CTV3 codes
#> ~3 minutes
#> ~2 minutes

# N records for each source
nrow(diagnosis_list$gp_clinical) # 29,083
nrow(diagnosis_list$hesin_diag) # 206,390
nrow(diagnosis_list$death_cause) # 1,962
```

## Get date first diagnosed
### 3. Get date first diagnosed

Identify the date first diagnosed for each participant from any of the datasets searched with `get_diagnoses()` (cause of death, HES diagnoses, GP clinical, cancer registry, HES operations, and self-reported illness fields).

@@ -81,43 +79,14 @@ Also included are:

```r
# for each participant, get Date First diagnosed with the condition
diagnosis_df <- get_df(diagnosis_list)
#> ~2 seconds

# skim data
skimr::skim(diagnosis_df)
#> ── Data Summary ────────────────────────
#> Values
#> Name diagnosis_df
#> Number of rows 502269
#> Number of columns 8
#>
#> ── Variable type: character ─────────────────────────────────────────────────────
#> skim_variable n_missing complete_rate min max empty n_unique whitespace
#> 1 src 470334 0.0636 2 5 0 3 0
#>
#> ── Variable type: Date ──────────────────────────────────────────────────────────
#> skim_variable n_missing complete_rate min max median n_unique
#> 1 gp_df 489522 0.0254 1958-01-01 2017-09-06 2009-09-15 3263
#> 2 hes_df 477568 0.0492 1995-08-29 2022-10-31 2018-05-15 5562
#> 3 death_df 500342 0.00384 2008-02-20 2022-12-15 2020-03-03 1429
#> 4 df 0 1 1958-01-01 2022-12-01 2022-10-30 6367
#>
#> ── Variable type: numeric ───────────────────────────────────────────────────────
#> skim_variable n_missing complete_rate mean sd
#> 1 bin 0 1 0.0636 0.244
#> 2 bin_prev 0 1 0.0131 0.114
```

You can add a prefix to all the variable names by specifying the "prefix" option:

```r
# {optional} add a prefix to the variable names with "prefix"
diagnosis_df <- get_df(diagnosis_list, prefix="ckd")
#> ~2 seconds

# how many cases ascertained?
table(diagnosis_df$ckd_bin)
#> 0 1
#>470334 31935
#> 0 1
#> 470334 31935

# source of earliest diagnosis date
table(diagnosis_df$ckd_src)
@@ -130,7 +99,7 @@ summary(diagnosis_df$ckd_df[ diagnosis_df$ckd_bin_prev == 1 ])
#> "1958-01-01" "2006-06-21" "2007-01-12" "2006-06-24" "2007-11-19" "2010-06-16"
```

## Ascertaining multiple conditions at once
### Ascertaining multiple conditions at once

The default `get_df()` behaviour is to use all available codes. However, the most time-efficient way to get multiple conditions is to run `get_diagnoses()` once for all codes for the conditions you wish to ascertain, then get the "date first diagnosed" for each condition separately. In the codes data frame you just need a field indicating the condition name, which will become the variable prefix.

@@ -152,88 +121,17 @@ table(diagnosis_df$hh_bin)
#> 500254 2015

table(diagnosis_df$ckd_bin)
#> 0 1
#>470334 31935
```

The above example also includes a UK Biobank self-reported illness code for haemochromatosis, which was ascertained too. The date first diagnosed is computed for each condition separately, so conditions do not all need to have the same data sources.

## Label UK Biobank data fields

Categorical fields are exported as integers but are encoded with labels. For example [20116 "Smoking status"](https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=20116):

| Coding | Meaning |
|--------|----------------------|
| -3 | Prefer not to answer |
| 0 | Never |
| 1 | Previous |
| 2 | Current |

This package includes two functions to label a single UK Biobank field or a data frame of them using the [UK Biobank encoding schema](https://biobank.ctsu.ox.ac.uk/crystal/schema.cgi). Examples:

```r
# update the Smoking status field
ukb <- label_ukb_field(ukb, field="p20116_i0")

table(ukb$p20116_i0) # tabulates the values
#> -3 0 1 2
#> 2057 273405 172966 52949

table(haven::as_factor(ukb$p20116_i0)) # tabulates the labels
#> Prefer not to answer Never Previous Current
#> 2057 273405 172966 52949

haven::print_labels(ukb$p20116_i0) # show the value:label mapping for this variable
#> Labels:
#> value label
#> -3 Prefer not to answer
#> 0 Never
#> 1 Previous
#> 2 Current

#
# if you have a whole data frame of exported fields, you can use the wrapper function label_ukb_fields()

# say the `ukb` data frame contains 4 variables: `eid`, `p54_i0`, `p31` and `age_at_assessment`

# update the variables that look like UK Biobank fields with titles and, where categorical, labels
# i.e., `p54_i0` and `p31` only -- `eid` and `age_at_assessment` are ignored
ukb <- label_ukb_fields(ukb)

table(ukb$p31) # tabulates the values
#> 0 1
#> 273238 229031

table(haven::as_factor(ukb$p31)) # tabulates the labels
#> Female Male
#> 273238 229031
#> 470334 31935
```

The above example also includes a UK Biobank self-reported illness code for haemochromatosis, which was ascertained too. The date first diagnosed is computed for each condition separately, so conditions do not all need to have the same data sources.

## Pull phenotype data from Spark environment

**Pull phenotypes from Apache Spark on DNAnexus to an R data frame.** We recommend launching a Spark cluster with at least `mem1_hdd1_v2_x16` and **2 nodes**; otherwise this can fail with the error "...ensure that workers...have sufficient resources".

The underlying code is mostly from the [UK Biobank GitHub](https://github.com/UK-Biobank/UKB-RAP-Notebooks/blob/main/NBs_Prelim/105_export_participant_data_to_r.ipynb).

```r
# get phenotype data (participant ID, sex, baseline age, and baseline assessment date)
ukb <- get_rap_phenos(c("eid", "p31", "p21003_i0", "p53_i0"))
#> 48.02 sec elapsed

# summary of data
table(ukb$p31)
#> Female Male
#> 273297 229067
summary(ukb$p21003_i0)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 37.00 50.00 58.00 56.53 63.00 73.00
```

### Previous Spark functionality

If you need to see the previous release documentation follow the tags to the version required: https://github.com/lcpilling/ukbrapR/tree/v0.1.7
## Other functions

* Label UK Biobank data fields with `label_ukb_fields()`
* Upload/download files between worker and RAP with `upload_to_rap()` and `download_from_rap()`
* Pull phenotypes from Spark instance with `get_rap_phenos()`

## Questions and comments

15 changes: 15 additions & 0 deletions _pkgdown.yml
@@ -1,6 +1,7 @@
url: https://lcpilling.github.io/ukbrapR/
template:
bootstrap: 5
light-switch: true

authors:
Luke Pilling:
@@ -30,3 +31,17 @@ reference:
- get_rap_phenos
- get_emr_spark
- get_selfrep_illness_spark

articles:
- title: Get started
navbar: ~
contents:
- ascertain_diagnoses
- label_fields
- spark_functions

- title: Upcoming functions
desc: Ideas for functions or those in development
contents:
- extract_fields
- extract_variants
2 changes: 2 additions & 0 deletions man/get_emr_spark.Rd

2 changes: 2 additions & 0 deletions vignettes/.gitignore
@@ -0,0 +1,2 @@
*.html
*.R
139 changes: 139 additions & 0 deletions vignettes/ascertain_diagnoses.Rmd
@@ -0,0 +1,139 @@
---
title: "Ascertain diagnoses"
description: >
Ascertain UK Biobank participant diagnoses from all sources (medical records and self-report data).
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Ascertain diagnoses}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(ukbrapR)
```

Diagnoses of conditions in UK Biobank participants come from multiple data sources:

* Self-report during assessment

* Hospital inpatient records (HES)

* Primary care (GP)

* Cancer registry

* Cause of death

The {ukbrapR} package makes it fast and easy to ascertain diagnoses from multiple UK Biobank data sources in the DNAnexus Research Analysis Platform (RAP).


## Requires exported files

This only needs to happen once per project. Running `export_tables()` will submit the necessary `table-exporter` jobs to save the raw medical records files to the RAP persistent storage for the project. ~10 GB of text files are created. This will cost ~£0.15 per month to store in the RAP standard storage.

Once the files are exported (~15 mins), the functions below can use them to extract diagnoses based on code lists.


## Input

Depending on the data source different coding vocabularies are required:

* `ICD10` (for searching HES diagnoses, cause of death, and cancer registry)

* `ICD9` (for searching older HES diagnosis data)

* `Read2` and `CTV3` (for GP clinical events)

* `OPCS3` and `OPCS4` (for HES operations)

* `ukb_cancer` and `ukb_noncancer` (for self-reported illness at UK Biobank assessments - all instances will be searched)

Ascertaining diagnoses typically takes two steps:


## 1. Get medical records and self-reported illness data for provided codes

For a given set of diagnostic codes, get the participant medical events and self-reported data. Returns a list of up to 6 data frames, each the subset of a long clinical file with matched codes.

Codes need to be provided as a data frame with two fields: `vocab_id` and `code`. Valid code vocabularies are listed above. Other columns (such as condition and description) are ignored.
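
As a rough illustration of the expected input, a codes data frame can be built by hand. This is only a sketch: `codes_df_example` and `valid_vocabs` are hypothetical names, the ICD10 codes are three of those printed in the README example, and `valid_vocabs` simply restates the vocabulary list above rather than being an object exported by the package.

```r
# Hypothetical hand-built codes list; column names follow the text above
codes_df_example <- data.frame(
  condition = "ckd",    # optional column, only used when grouping by condition
  vocab_id  = "ICD10",
  code      = c("N18.3", "N18.4", "N18.5"),
  stringsAsFactors = FALSE
)

# Vocabularies listed in the Input section (restated here, not a package object)
valid_vocabs <- c("ICD10", "ICD9", "Read2", "CTV3", "OPCS3", "OPCS4",
                  "ukb_cancer", "ukb_noncancer")

# Basic sanity checks before passing the data frame on
stopifnot(all(c("vocab_id", "code") %in% names(codes_df_example)))
stopifnot(all(codes_df_example$vocab_id %in% valid_vocabs))
```

Checking the vocabulary names up front is a cheap way to catch typos such as `icd10` before starting a search that takes minutes.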

```{r}
# example diagnostic codes for Chronic Kidney Disease
codes_df_ckd <- ukbrapR:::codes_df_ckd
head(codes_df_ckd)
# get diagnosis data - returns list of data frames (one per source)
diagnosis_list <- get_diagnoses(codes_df_ckd)
# N records for each source
nrow(diagnosis_list$gp_clinical)
nrow(diagnosis_list$hesin_diag)
nrow(diagnosis_list$death_cause)
```

If providing primary care codes for measures (BMI etc.), these are also returned (the `gp_clinical` object in the returned list contains all columns for matched codes).


## 2. Get date first diagnosed

Usually the user is interested in combining the separate data sources into a combined phenotype: the date first diagnosed for each participant from the data/codes in step 1 (cause of death, HES diagnoses, GP clinical, cancer registry, HES operations, and self-reported illness fields).

In addition to the "date first" `df` field, the output includes:

- a `src` field indicating the source of the date of first diagnosis.
- a `bin` field indicating the cases [1] and controls [0]. This relies on a small number of baseline fields also exported. The `df` field for the controls is the date of censoring (currently 30 October 2022).
- a `bin_prev` field indicating whether the case was before the UK Biobank baseline assessment
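
To make the meaning of these fields concrete, the sketch below derives them from a toy table of per-source dates in base R. It is an illustration under stated assumptions, not the logic `get_df()` actually uses: the data are made up, the `baseline` column and the source labels `"gp"`, `"hes"` and `"death"` are assumed, and only three sources are shown.

```r
# Toy per-source first-diagnosis dates (made-up values, three participants)
toy <- data.frame(
  eid      = 1:3,
  baseline = as.Date(c("2010-01-01", "2008-06-01", "2009-03-01")),  # assumed baseline assessment dates
  gp_df    = as.Date(c("2009-09-15", NA, NA)),
  hes_df   = as.Date(c("2012-01-10", "2018-05-15", NA)),
  death_df = as.Date(c(NA, "2020-03-03", NA))
)
censor_date <- as.Date("2022-10-30")  # censoring date quoted above

# Earliest date across sources; stays NA when no source has a record
toy$df <- pmin(toy$gp_df, toy$hes_df, toy$death_df, na.rm = TRUE)

# Source of the earliest date (assumed labels; first source wins on ties)
src_labels <- c("gp", "hes", "death")
toy$src <- vapply(seq_len(nrow(toy)), function(i) {
  dates <- c(toy$gp_df[i], toy$hes_df[i], toy$death_df[i])
  if (all(is.na(dates))) NA_character_ else src_labels[which.min(as.numeric(dates))]
}, character(1))

# Cases have a date from at least one source; controls get the censoring date
toy$bin <- as.integer(!is.na(toy$df))
toy$df[toy$bin == 0] <- censor_date

# Prevalent cases were diagnosed before the baseline assessment
toy$bin_prev <- as.integer(toy$bin == 1 & toy$df < toy$baseline)

toy[, c("eid", "df", "src", "bin", "bin_prev")]
```

In this toy data, participant 3 has no record in any source, so they are a control with `df` set to the censoring date and `src` left missing.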

```{r}
# for each participant, get Date First diagnosed with the condition
diagnosis_df <- get_df(diagnosis_list)
names(diagnosis_df)
summary(diagnosis_df)
```

You can add a prefix to all the variable names by specifying the "prefix" option:

```{r}
diagnosis_df <- get_df(diagnosis_list, prefix="ckd")
# how many cases ascertained?
table(diagnosis_df$ckd_bin)
# source of earliest diagnosis date
table(diagnosis_df$ckd_src)
# date of diagnosis for prevalent cases (i.e., before UK Biobank baseline assessment)
summary(diagnosis_df$ckd_df[ diagnosis_df$ckd_bin_prev == 1 ])
```

## Ascertaining multiple conditions at once

The default `get_df()` behaviour is to use all available codes. However, the most time-efficient way to get multiple conditions is to run `get_diagnoses()` once for all codes for the conditions you wish to ascertain, then get the "date first diagnosed" for each condition separately. In the codes data frame you just need a field indicating the condition name, which will become the variable prefix.

```{r}
# combine haemochromatosis and CKD codes together
# each contains three columns: condition, vocab_id, and code
# where `condition` is either "hh" or "ckd" and will become the variable prefix
codes_df_combined <- rbind(ukbrapR:::codes_df_hh, ukbrapR:::codes_df_ckd)
# get diagnosis data - returns list of data frames (one per source)
diagnosis_list <- get_diagnoses(codes_df_combined)
# for each participant, get Date First diagnosed with the condition
diagnosis_df <- get_df(diagnosis_list, group_by="condition")
# each condition has full set of output
table(diagnosis_df$hh_bin)
table(diagnosis_df$ckd_bin)
```

The above example also includes a UK Biobank self-reported illness code for haemochromatosis, which was ascertained too. The date first diagnosed is computed for each condition separately, so conditions do not all need to have the same data sources.
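
The grouping idea can be pictured with a small base R sketch. It only shows how a `condition` column partitions a codes list and how the prefixes map onto variable names; it is not how `get_df()` is implemented. The toy codes are illustrative, and `codes_toy`, `codes_by_condition` and `out_vars` are hypothetical names.

```r
# Toy combined codes list (illustrative codes only)
codes_toy <- data.frame(
  condition = c("hh", "ckd", "ckd"),
  vocab_id  = c("ICD10", "ICD10", "ICD10"),
  code      = c("E83.1", "N18.3", "N18.4"),
  stringsAsFactors = FALSE
)

# One codes data frame per condition; the list names are the prefixes
codes_by_condition <- split(codes_toy, codes_toy$condition)
names(codes_by_condition)

# Summary variables each condition would contribute, following the
# prefix pattern shown earlier (e.g. ckd_bin, ckd_src)
out_vars <- unlist(lapply(names(codes_by_condition), function(p) {
  paste(p, c("df", "src", "bin", "bin_prev"), sep = "_")
}))
out_vars
```

Because each condition is handled on its own subset of codes, one condition can rely on self-report codes while another uses only hospital codes.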


18 changes: 18 additions & 0 deletions vignettes/extract_fields.Rmd
@@ -0,0 +1,18 @@
---
title: "Extract fields"
description: >
Get participant data for specific list of fields from the cohort database.
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Extract fields}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(ukbrapR)
```
18 changes: 18 additions & 0 deletions vignettes/extract_variants.Rmd
@@ -0,0 +1,18 @@
---
title: "Extract variants"
description: >
Pull specific variants from whole genome sequence DRAGEN variant call files (pVCFs) into R.
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Extract variants}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(ukbrapR)
```
70 changes: 70 additions & 0 deletions vignettes/label_fields.Rmd
@@ -0,0 +1,70 @@
---
title: "Label fields"
description: >
Assign categorical UK Biobank fields the labels from the showcase schema.
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Label fields}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(ukbrapR)
```

Categorical fields are exported as integers but are encoded with labels.

For example [20116 "Smoking status"](https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=20116):

| Coding | Meaning |
|--------|----------------------|
| -3 | Prefer not to answer |
| 0 | Never |
| 1 | Previous |
| 2 | Current |

This package includes two functions to label a single UK Biobank field or a data frame of them using the [UK Biobank encoding schema](https://biobank.ctsu.ox.ac.uk/crystal/schema.cgi). Examples:

```{r, eval=FALSE, echo=TRUE}
# update the Smoking status field
ukb <- label_ukb_field(ukb, field="p20116_i0")
table(ukb$p20116_i0) # tabulates the values
#> -3 0 1 2
#> 2057 273405 172966 52949
table(haven::as_factor(ukb$p20116_i0)) # tabulates the labels
#> Prefer not to answer Never Previous Current
#> 2057 273405 172966 52949
haven::print_labels(ukb$p20116_i0) # show the value:label mapping for this variable
#> Labels:
#> value label
#> -3 Prefer not to answer
#> 0 Never
#> 1 Previous
#> 2 Current
#
# if you have a whole data frame of exported fields, you can use the wrapper function label_ukb_fields()
# say the `ukb` data frame contains 4 variables: `eid`, `p54_i0`, `p31` and `age_at_assessment`
# update the variables that look like UK Biobank fields with titles and, where categorical, labels
# i.e., `p54_i0` and `p31` only -- `eid` and `age_at_assessment` are ignored
ukb <- label_ukb_fields(ukb)
table(ukb$p31) # tabulates the values
#> 0 1
#> 273238 229031
table(haven::as_factor(ukb$p31)) # tabulates the labels
#> Female Male
#> 273238 229031
```
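
For readers without access to the data, the value-to-label mapping itself can be sketched in base R. This is not how `label_ukb_field()` works internally (the examples above rely on {haven} labelled vectors); it just applies the coding table for field 20116 to a made-up vector, and `smoking_raw`, `coding` and `smoking_lab` are hypothetical names.

```r
# Integer codes as they might be exported for field 20116 (made-up values)
smoking_raw <- c(0L, 1L, 2L, -3L, 0L)

# Coding table from above: label = value
coding <- c("Prefer not to answer" = -3, "Never" = 0, "Previous" = 1, "Current" = 2)

# Map the integer codes to their labels
smoking_lab <- factor(smoking_raw, levels = coding, labels = names(coding))

table(smoking_lab)
```

A plain factor drops the original integer values, which is one reason labelled vectors are convenient: they keep both the value and the label.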

50 changes: 50 additions & 0 deletions vignettes/spark_functions.Rmd
@@ -0,0 +1,50 @@
---
title: "Spark functions"
description: >
Pull phenotype data from Spark environment.
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Spark functions}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(ukbrapR)
```


## Pull phenotype data from Spark environment to an R data frame

**Needs to be run in an Apache Spark environment on the UK Biobank DNAnexus RAP.**

We recommend launching a Spark cluster with at least `mem1_hdd1_v2_x16` and **2 nodes**; otherwise this can fail with the error "...ensure that workers...have sufficient resources".

The underlying code is mostly from the [UK Biobank GitHub](https://github.com/UK-Biobank/UKB-RAP-Notebooks/blob/main/NBs_Prelim/105_export_participant_data_to_r.ipynb).

```{r, eval=FALSE, echo=TRUE}
# get phenotype data (participant ID, sex, baseline age, and baseline assessment date)
ukb <- get_rap_phenos(c("eid", "p31", "p21003_i0", "p53_i0"))
#> 48.02 sec elapsed
# summary of data
table(ukb$p31)
#> Female Male
#> 273297 229067
summary(ukb$p21003_i0)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 37.00 50.00 58.00 56.53 63.00 73.00
```

### No more updates...

I am moving away from using Spark as the default environment, mostly due to the cost implications: it is significantly cheaper (and quicker!) to store and search exported raw text files in the RAP persistent storage than to do everything in a Spark environment. An added benefit is that the RStudio interface is available in "normal" instances.

The Spark functions are available as before but all updates are to improve functionality in "normal" instances using RStudio, as we move to the new era of RAP-only UK Biobank analysis.

If you need to see the previous release documentation follow the tags to the version required: https://github.com/lcpilling/ukbrapR/tree/v0.1.7
