1 change: 1 addition & 0 deletions .gitignore
Expand Up @@ -58,3 +58,4 @@ inst/doc
/doc/
/Meta/
/data-raw/data
docs
2 changes: 1 addition & 1 deletion DESCRIPTION
Expand Up @@ -27,7 +27,7 @@ Description: What the package does (one paragraph).
License: GPL (>= 3)
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
RoxygenNote: 7.3.3
Suggests:
here,
knitr,
1 change: 1 addition & 0 deletions NAMESPACE
Expand Up @@ -6,6 +6,7 @@ export(htr_calc_mean)
export(htr_change_freq)
export(htr_create_ensemble)
export(htr_download_ESM)
export(htr_fill_missing)
export(htr_fix_calendar)
export(htr_integrate_levels)
export(htr_make_folder)
38 changes: 35 additions & 3 deletions R/htr_download_ESM.R
Expand Up @@ -12,7 +12,7 @@
#' The process involves:
#' 1. Finding all wget script files in the input directory
#' 2. For each script, changing to the output directory
#' 3. Executing the wget script with the `-s` flag (silent mode)
#' 3. Executing the wget script with the `-q` flag (quiet mode)
#' 4. Restoring the original working directory
#'
#' All wget scripts are processed in parallel using multiple workers for efficient
Expand All @@ -26,6 +26,16 @@
#' remote repositories (e.g., ESGF data nodes).
#' @param outdir Character string. Directory where the downloaded NetCDF files will
#' be saved. The function will change to this directory before executing wget scripts.
#' @param quiet Logical. If `TRUE` (default), the wget script is run with the `-q`
#' flag and the download progress is not shown. If `FALSE`, the download progress is shown.
#' @param security Logical. If `FALSE` (default), the wget script is run with the `-s`
#' flag to skip security checks. This only works if the data are not secured at all.
#' If `TRUE`, the user must supply a character string in either the `openid` argument
#' or the `certificate` argument.
#' @param openid Character string. String of the OpenID that can be used
#' to download secure files.
#' @param certificate Character string. String of the certificate that can be used
#' to download secure files.
#'
#' @return
#' No return value. The function downloads NetCDF files to the specified output
Expand Down Expand Up @@ -55,7 +65,11 @@
#' }
htr_download_ESM <- function(hpc = NA, # if run on an HPC; possible values are "array", "parallel"
indir, # where wget files are located
outdir) { # where .nc files should be downloaded
outdir, # where .nc files should be downloaded
quiet = TRUE,
security = FALSE,
openid = NA,
certificate = NA) {

# Create output folder if it doesn't exist
htr_make_folder(outdir)
Expand All @@ -73,7 +87,25 @@ htr_download_ESM <- function(hpc = NA, # if ran in the HPC, possible values are

wget_files <- function(script) {
setwd(outdir)
system(paste0("bash ", script, " -s")) # Change the path to where you want the data stored, then run wget from there

if(isTRUE(quiet)) {
system_code <- paste0("bash ", script, " -q")
} else {
system_code <- paste0("bash ", script)
}

if(isFALSE(security)) {
system_code <- paste0(system_code, " -s")
} else if(isTRUE(security) && !is.na(openid)) { # length(NA) is 1, so test for NA instead
system_code <- paste0(system_code, " -o ", openid)
} else if(isTRUE(security) && !is.na(certificate)) {
system_code <- paste0(system_code, " -c ", certificate)
} else {
# Stop here so the script is not run without credentials
stop("You need to input your openid or a certificate to download a secure file.")
}

system(system_code)

setwd(pth) # change back the working directory
}
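
The flag logic in `wget_files()` can be checked without touching the network if the command string is built separately from the `system()` call. The following is a minimal sketch of that idea, not the package's implementation: `build_wget_command()` is a hypothetical helper, and the only flags it uses (`-q`, `-s`, `-o`, `-c`) are the ones that appear in this diff. Quoting with `shQuote()` is an added assumption to guard against paths containing spaces.

```r
# Hypothetical helper (not part of hotrstuff): builds the command string
# without running it, so the flag logic can be inspected or unit-tested.
build_wget_command <- function(script, quiet = TRUE, security = FALSE,
                               openid = NA, certificate = NA) {
  cmd <- paste("bash", shQuote(script, type = "sh"))

  # -q suppresses the download progress
  if (isTRUE(quiet)) cmd <- paste(cmd, "-q")

  if (isFALSE(security)) {
    # -s skips the security checks (only works for unsecured data)
    cmd <- paste(cmd, "-s")
  } else if (!is.na(openid)) {
    cmd <- paste(cmd, "-o", shQuote(openid, type = "sh"))
  } else if (!is.na(certificate)) {
    cmd <- paste(cmd, "-c", shQuote(certificate, type = "sh"))
  } else {
    stop("Supply `openid` or `certificate` when `security = TRUE`.")
  }

  cmd
}

# Inspect the command that would be passed to system()
print(build_wget_command("wget_tos.sh"))
print(build_wget_command("wget_tos.sh", quiet = FALSE, security = TRUE,
                         certificate = "credentials.pem"))
```

Testing against `NA` with `is.na()` matters here because the defaults are `NA`, and `length(NA) > 0` is always `TRUE`.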

160 changes: 160 additions & 0 deletions R/htr_fill_missing.R
@@ -0,0 +1,160 @@
#' Replace missing values
#'
#' This function replaces missing values in the climate model files using Climate Data
#' Operators (CDO), with values depending on the method chosen. Missing values can be
#' replaced by the nearest neighbor's values, the distance-weighted average of the neighbor's
#' values, or a constant.
#'
#' @details
#' The CDO command used depends on the chosen method:
#' - **Nearest neighbor**: Uses `cdo setmisstonn input output`
#' - **Distance-weighted average**: Uses `cdo setmisstodis,neighbors input output`
#' - **Constant**: Uses `cdo setmisstoc,constant input output`
#'
#' @author Tin Buenafe
#'
#' @inheritParams htr_slice_period
#' @param method Character string. Method used to calculate the missing value. Accepted methods are:
#' - `"setmisstonn"`: Set missing values to the nearest neighbor's value
#' - `"setmisstodis"`: Set missing values to the distance-weighted average of the neighbors.
#' The default number of neighbors is 4, but this can be changed by changing the `neighbors` parameter.
#' - `"setmisstoc"`: Set missing values to a defined constant (requires `constant` parameter)
#' @param neighbors Numeric. Number of neighbors used to calculate missing values. Default is 4.
#' @param constant Numeric. Constant value used to replace all missing values.
#'
#' @return
#' No return value. The function writes files with the missing values filled to the
#' specified output directory, with the same base file names as the input.
#'
#' @note
#' - Requires CDO (Climate Data Operators) to be installed and accessible from the system PATH
#' - Uses parallel processing when `hpc` is not set to "array"
#'
#' @references
#' CDO User Guide: https://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf
#' CDO setmiss operator: https://code.mpimet.mpg.de/projects/cdo/embedded/index.html#x1-3610002.6.15
#'
#' @export
#'
#' @examples
#' \dontrun{
#' # Fill missing values using distance-weighted average of 7 neighbors
#' htr_fill_missing(hpc = NA,
#' file = NA,
#' indir = file.path(base_dir, "data", "proc", "integrated", "tos"),
#' outdir = file.path(base_dir, "data", "proc", "filled", "tos"),
#' method = "setmisstodis",
#' neighbors = 7,
#' constant = NA
#' )
#'
#' # Fill missing values using the nearest neighbor's value
#' htr_fill_missing(hpc = NA,
#' file = NA,
#' indir = file.path(base_dir, "data", "proc", "integrated", "tos"),
#' outdir = file.path(base_dir, "data", "proc", "filled", "tos")
#' )
#' }
htr_fill_missing <- function(hpc = NA,
file = NA,
indir,
outdir,
method = "setmisstonn", # default is setmisstonn
neighbors = NA, # if method = setmisstodis, default set is 4 if no number is given
constant = NA # if method = setmisstoc, this is required
) {

# Create output folder if it doesn't exist
htr_make_folder(outdir)

# Define workers
if(is.na(hpc)) {
w <- parallelly::availableCores(methods = "system", omit = 2)
} else {
w <- parallelly::availableCores(methods = "Slurm", omit = 2)
}

##############

#TODO: Change these to assert that

fill_missing <- function(file,
method,
constant,
neighbors) {

# Naming new file
out_file <- file %>%
stringr::str_replace(stringr::fixed(indir), outdir) # match the path literally, not as a regex

# Filling missing values using chosen method

if(stringr::str_to_lower(method) %in% c("setmisstoc", "setmisstonn", "setmisstodis")) {

method_name <- stringr::str_to_lower(method)

if(method_name == "setmisstoc") {

if(is.numeric(constant)) {

system_code <- paste0("cdo ", method_name, ",", constant, " ", file, " ", out_file)

} else {

stop("Please provide a numeric constant value.") # stop so system() is not called with an undefined command

}

} else if(method_name == "setmisstonn") {

system_code <- paste0("cdo ", method_name, " ", file, " ", out_file) # setmisstonn takes no parameter, so no trailing comma

} else if(method_name == "setmisstodis") {

if(is.numeric(neighbors)) {

system_code <- paste0("cdo ", method_name, ",", neighbors, " ", file, " ", out_file)

} else {

system_code <- paste0("cdo ", method_name, " ", file, " ", out_file)

}

}

system(system_code)

} else {

print("Please input a valid method.")

}

}


##############

# TODO: Change this

if (hpc %in% c("array")) { # For hpc == "array", use the specific files as the starting point

netCDF <- dir(indir, pattern = file, full.names = TRUE)

fill_missing(netCDF,
method,
constant,
neighbors) # run function

} else { # For hpc == "parallel" and non-hpc work, use the input directory as the starting point and run jobs in parallel

netCDFs <- dir(indir, full.names = TRUE)

future::plan(future::multisession, workers = w)
furrr::future_walk(netCDFs, fill_missing, method, constant, neighbors)
future::plan(future::sequential)

}

}
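
The nested `if`/`else` that assembles the CDO call can also be written as one small function that returns the command string. This is a sketch under stated assumptions rather than the code in this PR: `build_cdo_command()` is a hypothetical name, the three operators are the ones documented above (`setmisstonn`, `setmisstodis`, `setmisstoc`), and `match.arg()` plus `shQuote()` are swapped in for the manual validation and unquoted paths.

```r
# Hypothetical helper (not part of hotrstuff): returns the CDO command
# for the chosen fill method without executing it.
build_cdo_command <- function(file, out_file, method = "setmisstonn",
                              neighbors = NA, constant = NA) {
  methods <- c("setmisstonn", "setmisstodis", "setmisstoc")
  # Errors with an informative message if the method is not recognised
  method <- match.arg(tolower(method), methods)

  operator <- method
  if (method == "setmisstoc") {
    if (!is.numeric(constant)) {
      stop("`constant` must be numeric when method = 'setmisstoc'.")
    }
    operator <- paste0(method, ",", constant)
  } else if (method == "setmisstodis" && is.numeric(neighbors)) {
    # Without a numeric `neighbors`, CDO falls back to its own default
    operator <- paste0(method, ",", neighbors)
  }

  paste("cdo", operator,
        shQuote(file, type = "sh"), shQuote(out_file, type = "sh"))
}

# Inspect the commands for each method
print(build_cdo_command("in.nc", "out.nc"))
print(build_cdo_command("in.nc", "out.nc", method = "setmisstodis", neighbors = 7))
print(build_cdo_command("in.nc", "out.nc", method = "setmisstoc", constant = 0))
```

Returning the string instead of calling `system()` directly keeps invalid input from ever reaching the shell and makes each branch easy to test.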
4 changes: 2 additions & 2 deletions R/utils.R
Expand Up @@ -132,7 +132,7 @@ htr_make_folder <- function(folder) {
#' The function creates a global raster covering -180 to 180 degrees longitude
#' and -90 to 90 degrees latitude at the specified resolution. All cells are
#' set to value 1, and the raster is converted to netCDF4 format using
#' [`htr_mask2netCDF4()`] for compatibility with CDO regridding operations.
#' `htr_mask2netCDF4()` for compatibility with CDO regridding operations.
#'
#' @author David Schoeman and Tin Buenafe
#'
Expand Down Expand Up @@ -236,7 +236,7 @@ htr_get_Years <- function(nc_file, yr1, yr2, infold, outfold, overwrite) {
#'
#' @details
#' The function processes all files in a directory, extracts CMIP6 metadata
#' using [`htr_get_CMIP6_bits()`], and returns unique combinations of the
#' using `htr_get_CMIP6_bits()`, and returns unique combinations of the
#' requested metadata elements. This is essential for organizing batch
#' processing operations where files need to be grouped by their characteristics.
#'
7 changes: 4 additions & 3 deletions README.Rmd
Expand Up @@ -23,15 +23,16 @@ knitr::opts_chunk$set(

<!-- badges: end -->

The goal of `hotrstuff` is to facilitate the rapid download, wrangling and processing of Earth System Model (ESM) output from the Coupled Model Intercomparison Project (CMIP).
`hotrstuff` facilitates the rapid download, wrangling and processing of Earth System Model (ESM) output from the Coupled Model Intercomparison Project (CMIP).

To get started, you will need to download the wget scripts from your chosen CMIP6 repository. We use: <https://aims2.llnl.gov/search>. From there `hotrstuff` makes it easy to:

- Download,\
- Merge files,\
- Regrid to chosen resolution,\
- Slice to required timeframe,\
- Crop to requested spatial area,\
- Change frequency of data (e.g., changing from monthly to yearly),\
- Calculate the vertical mean for depth-resolved ESMs, and\
- Create mean/median ensembles of variables, scenarios etc.

## Requirements
Expand Down Expand Up @@ -83,4 +84,4 @@ To get started with `hotrstuff`, follow the vignette [here](https://snbuenafe.gi

## Citation

Buenafe K, Schoeman D, Everett J (2024). hotrstuff: Facilitate the rapid download, wrangling and processing of Earth System Model (ESM) output from the Coupled Model Intercomparison Project (CMIP).. R package version 0.0.1, <https://github.com/SnBuenafe/hotrstuff>.
Buenafe K, Schoeman D, Everett J (2024). hotrstuff: Facilitate the rapid download, wrangling and processing of Earth System Model (ESM) outputs from the Coupled Model Intercomparison Project (CMIP). R package version 0.0.2, <https://github.com/SnBuenafe/hotrstuff>.
16 changes: 9 additions & 7 deletions README.md
Expand Up @@ -12,11 +12,12 @@ experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](h
[![Windows](https://github.com/SnBuenafe/hotrstuff/actions/workflows/Windows.yaml/badge.svg)](https://github.com/SnBuenafe/hotrstuff/actions/workflows/Windows.yaml)

<!--[![Codecov test coverage](https://github.com/SnBuenafe/hotrstuff/actions/workflows/test-coverage.yaml/badge.svg)](https://github.com/SnBuenafe/hotrstuff/actions/workflows/test-coverage.yaml) -->

<!-- badges: end -->

The goal of `hotrstuff` is to facilitate the rapid download, wrangling
and processing of Earth System Model (ESM) output from the Coupled Model
Intercomparison Project (CMIP).
`hotrstuff` facilitates the rapid download, wrangling and processing of
Earth System Model (ESM) output from the Coupled Model Intercomparison
Project (CMIP).

To get started, you will need to download the wget scripts from your
chosen CMIP6 repository. We use: <https://aims2.llnl.gov/search>. From
Expand All @@ -26,7 +27,8 @@ there `hotrstuff` makes it easy to:
- Merge files,
- Regrid to chosen resolution,
- Slice to required timeframe,
- Crop to requested spatial area,
- Change frequency of data (e.g., changing from monthly to yearly),
- Calculate the vertical mean for depth-resolved ESMs, and
- Create mean/median ensembles of variables, scenarios etc.

## Requirements
Expand Down Expand Up @@ -86,6 +88,6 @@ To get started with `hotrstuff`, follow the vignette
## Citation

Buenafe K, Schoeman D, Everett J (2024). hotrstuff: Facilitate the rapid
download, wrangling and processing of Earth System Model (ESM) output
from the Coupled Model Intercomparison Project (CMIP).. R package
version 0.0.1, <https://github.com/SnBuenafe/hotrstuff>.
download, wrangling and processing of Earth System Model (ESM) outputs
from the Coupled Model Intercomparison Project (CMIP). R package version
0.0.2, <https://github.com/SnBuenafe/hotrstuff>.
23 changes: 23 additions & 0 deletions _pkgdown.yml
@@ -1,10 +1,29 @@
url: ~

template:
bootstrap: 5
bslib:
bootswatch: yeti
navbar-brand-font-size: 2rem
#nav-link-font-size: 1.5rem

development:
destination:
version_label: info
version_tooltip: "The package is in the early stages of development. Use with caution."

navbar:
bg: dark
structure:
left: [home, reference]
components:
home:
text: "Get Started"
href: articles/hotrstuff.html
reference:
text: "Reference"
href: reference/index.html

reference:

- title: Data Acquisition
Expand All @@ -26,6 +45,10 @@ reference:
- title: Spatial Processing
- contents:
- htr_regrid_esm
- htr_fill_missing

- title: Depth-resolved Processing
- contents:
- htr_integrate_levels
- htr_show_levels

4 changes: 3 additions & 1 deletion data-raw/DATASET.R
@@ -1,3 +1,5 @@
## code to prepare `DATASET` dataset goes here
## DESCRIPTION: Code to prepare data



usethis::use_data(DATASET, overwrite = TRUE)
4 changes: 2 additions & 2 deletions docs/404.html
