Merge pull request #525 from agmurray/package-name-backtick-audit-rsample

Package name backtick audit rsample
hfrick authored Sep 4, 2024
2 parents 2c0def3 + 0806b5a commit 3b15c34
Showing 11 changed files with 30 additions and 29 deletions.
30 changes: 16 additions & 14 deletions NEWS.md
@@ -10,6 +10,8 @@

* Improved documentation for `initial_split()` and friends (@laurabrianna, #519).

* Formatting improvement: package names are now not in backticks anymore (@agmurray, #525).

## Bug fixes

* `vfold_cv()` now utilizes the `breaks` argument correctly for repeated cross-validation (@ZWael, #471).
@@ -150,7 +152,7 @@

* The `reg_intervals()` function is a convenience function for `lm()`, `glm()`, `survreg()`, and `coxph()` models (#206).

* A few internal functions were exported so that `rsample`-adjacent packages can use the same underlying code.
* A few internal functions were exported so that rsample-adjacent packages can use the same underlying code.

* The `obj_sum()` method for `rsplit` objects was updated (#215).

@@ -171,11 +173,11 @@

* The `print()` methods for `rsplit` and `val_split` objects were adjusted to show `"<Analysis/Assess/Total>"` and `<Training/Validation/Total>`, respectively.

* The `drinks`, `attrition`, and `two_class_dat` data sets were removed. They are in the `modeldata` package.
* The `drinks`, `attrition`, and `two_class_dat` data sets were removed. They are in the modeldata package.

* Compatability with `dplyr` 1.0.0.
* Compatability with dplyr 1.0.0.

# `rsample` 0.0.6
# rsample 0.0.6

* Added `validation_set()` for making a single resample.

@@ -187,24 +189,24 @@

* `initial_time_split()` and `rolling_origin()` now have a `lag` parameter that ensures that previous data are available so that lagged variables can be calculated. (#135, #136)

# `rsample` 0.0.5
# rsample 0.0.5

* Added three functions to compute different bootstrap confidence intervals.
* A new function (`add_resample_id()`) augments a data frame with columns for the resampling identifier.
* Updated `initial_split()`, `mc_cv()`, `vfold_cv()`, `bootstraps()`, and `group_vfold_cv()` to use tidyselect on the stratification variable.
* Updated `initial_split()`, `mc_cv()`, `vfold_cv()`, `bootstraps()` with new `breaks` parameter that specifies the number of bins to stratify by for a numeric stratification variable.


# `rsample` 0.0.4
# rsample 0.0.4

Small maintenance release.

## Minor improvements and fixes

* `fill()` was removed per the deprecation warning.
* Small changes were made for the new version of `tibble`.
* Small changes were made for the new version of tibble.

# `rsample` 0.0.3
# rsample 0.0.3

## New features

@@ -216,25 +218,25 @@ Small maintenance release.

* Changed the R version requirement to be R >= 3.1 instead of 3.3.3.

* The `recipes`-related `prepper` function was [moved to the `recipes` package](https://github.com/tidymodels/rsample/issues/48). This makes the `rsample` install footprint much smaller.
* The recipes-related `prepper()` function was [moved to the recipes package](https://github.com/tidymodels/rsample/issues/48). This makes the rsample install footprint much smaller.

* `rsplit` objects are shown differently inside of a tibble.

* Moved from the `broom` package to the `generics` package.
* Moved from the broom package to the generics package.


# `rsample` 0.0.2
# rsample 0.0.2

* `initial_split`, `training`, and `testing` were added to do training/testing splits prior to resampling.
* Another resampling method, `group_vfold_cv`, was added.
* `caret2rsample` and `rsample2caret` can convert `rset` objects to those used by `caret::trainControl` and vice-versa.
* A function called `form_pred` can be used to determine the original names of the predictors in a formula or `terms` object.
* A vignette and a function (`prepper`) were included to facilitate using the `recipes` with `rsample`.
* A vignette and a function (`prepper`) were included to facilitate using the recipes with rsample.
* A `gather` method was added for `rset` objects.
* A `labels` method was added for `rsplit` objects. This can help identify which resample is being used even when the whole `rset` object is not available.
* A variety of `dplyr` methods were added (e.g. `filter`, `mutate`, etc) that work without dropping classes or attributes of the `rsample` objects.
* A variety of dplyr methods were added (e.g. `filter()`, `mutate()`, etc) that work without dropping classes or attributes of the rsample objects.

# `rsample` 0.0.1 (2017-07-08)
# rsample 0.0.1 (2017-07-08)

Initial public version on CRAN

1 change: 0 additions & 1 deletion R/make_strata.R
@@ -51,7 +51,6 @@
#' table(x3)
#' table(make_strata(x3))
#'
#' # `oilType` data from `caret`
#' x4 <- rep(LETTERS[1:7], c(37, 26, 3, 7, 11, 10, 2))
#' table(x4)
#' table(make_strata(x4))
2 changes: 1 addition & 1 deletion R/misc.R
@@ -125,7 +125,7 @@ split_unnamed <- function(x, f) {
#' @param x An `rset` or `tune_results` object.
#' @param ... Not currently used.
#' @return A character value or `NA_character_` if the object was created prior
#' to `rsample` version 0.1.0.
#' to rsample version 0.1.0.
#' @rdname get_fingerprint
#' @aliases .get_fingerprint
#' @examples
2 changes: 1 addition & 1 deletion R/nest.R
@@ -2,7 +2,7 @@
#'
#' `nested_cv` can be used to take the results of one resampling procedure
#' and conduct further resamples within each split. Any type of resampling
#' used in `rsample` can be used.
#' used in rsample can be used.
#'
#' @details
#' It is a bad idea to use bootstrapping as the outer resampling procedure (see
4 changes: 2 additions & 2 deletions R/permutations.R
@@ -5,12 +5,12 @@
#' by permuting/shuffling one or more columns. This results in analysis
#' samples where some columns are in their original order and some columns
#' are permuted to a random order. Unlike other sampling functions in
#' `rsample`, there is no assessment set and calling `assessment()` on a
#' rsample, there is no assessment set and calling `assessment()` on a
#' permutation split will throw an error.
#'
#' @param data A data frame.
#' @param permute One or more columns to shuffle. This argument supports
#' `tidyselect` selectors. Multiple expressions can be combined with `c()`.
#' tidyselect selectors. Multiple expressions can be combined with `c()`.
#' Variable names can be used as if they were positions in the data frame, so
#' expressions like `x:y` can be used to select a range of variables.
#' See \code{\link[tidyselect]{language}} for more details.
2 changes: 1 addition & 1 deletion man/get_fingerprint.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/make_strata.Rd


2 changes: 1 addition & 1 deletion man/nested_cv.Rd


4 changes: 2 additions & 2 deletions man/permutations.Rd


2 changes: 1 addition & 1 deletion vignettes/Applications/Intervals.Rmd
@@ -193,7 +193,7 @@ intervals %>% split(intervals$term)

For bias-corrected and accelerated (BCa) intervals, an additional argument is required. The `.fn` argument is a function that computes the statistic of interest. The first argument should be for the `rsplit` object and other arguments can be passed in using the ellipses.

These intervals use an internal leave-one-out resample to compute the Jackknife statistic and will recompute the statistic for _every bootstrap resample_. If the statistic is expensive to compute, this may take some time. For those calculations, we use the `furrr` package so these can be computed in parallel if you have set up a parallel processing plan (see `?future::plan`).
These intervals use an internal leave-one-out resample to compute the Jackknife statistic and will recompute the statistic for _every bootstrap resample_. If the statistic is expensive to compute, this may take some time. For those calculations, we use the furrr package so these can be computed in parallel if you have set up a parallel processing plan (see `?future::plan`).

The user-facing function takes an argument for the function and the ellipses.

8 changes: 4 additions & 4 deletions vignettes/Working_with_rsets.Rmd
@@ -72,7 +72,7 @@ Now let's write a function that will, for each resample:

1. obtain the analysis data set (i.e. the 90% used for modeling)
1. fit a logistic regression model
1. predict the assessment data (the other 10% not used for the model) using the `broom` package
1. predict the assessment data (the other 10% not used for the model) using the broom package
1. determine if each sample was predicted correctly.

Here is our function:
@@ -109,7 +109,7 @@ example[1:10, setdiff(names(example), names(attrition))]

For this model, the `.fitted` value is the linear predictor in log-odds units.

To compute this data set for each of the 100 resamples, we'll use the `map` function from the `purrr` package:
To compute this data set for each of the 100 resamples, we'll use the `map` function from the purrr package:

```{r model_purrr, warning=FALSE}
library(purrr)
@@ -182,7 +182,7 @@ The calculated 95% confidence interval contains zero, so we don't have evidence

## Bootstrap Estimates of Model Coefficients

Unless there is already a column in the resample object that contains the fitted model, a function can be used to fit the model and save all of the model coefficients. The [`broom` package](https://cran.r-project.org/package=broom) package has a `tidy` function that will save the coefficients in a data frame. Instead of returning a data frame with a row for each model term, we will save a data frame with a single row and columns for each model term. As before, `purrr::map` can be used to estimate and save these values for each split.
Unless there is already a column in the resample object that contains the fitted model, a function can be used to fit the model and save all of the model coefficients. The [broom package](https://cran.r-project.org/package=broom) package has a `tidy` function that will save the coefficients in a data frame. Instead of returning a data frame with a row for each model term, we will save a data frame with a single row and columns for each model term. As before, `purrr::map()` can be used to estimate and save these values for each split.


```{r coefs}
@@ -200,7 +200,7 @@ bt_resamples$betas[[1]]

## Keeping Tidy

As previously mentioned, the [`broom` package](https://cran.r-project.org/package=broom) contains a class called `tidy` that created representations of objects that can be easily used for analysis, plotting, etc. rsample contains `tidy` methods for `rset` and `rsplit` objects. For example:
As previously mentioned, the [broom package](https://cran.r-project.org/package=broom) contains a class called `tidy` that created representations of objects that can be easily used for analysis, plotting, etc. rsample contains `tidy` methods for `rset` and `rsplit` objects. For example:

```{r tidy_rsplit}
first_resample <- bt_resamples$splits[[1]]