From 83f8eca3ed2798de6819ec2fcedbacc95179029f Mon Sep 17 00:00:00 2001 From: Nicholas Spyrison Date: Fri, 3 Nov 2023 14:43:11 -0500 Subject: [PATCH] tour to horizontal layout, update figures and documentation. --- DESCRIPTION | 7 +- NEWS.md | 126 ++--- R/1_cheem_lists.r | 4 +- R/2_visualization.r | 13 +- R/7_presaved_attribution.r | 492 +++++++++--------- R/9_data.r | 434 +++++++-------- .../_readme_output_radial_cheem_tour.r | 97 ++-- buildignore/tour_penguins.gif | Bin 129605 -> 187295 bytes data/ames_rf_pred.rda | Bin 3403 -> 3209 bytes data/ames_rf_shap.rda | Bin 25306 -> 24473 bytes data/chocolates_svm_pred.rda | Bin 239 -> 239 bytes data/chocolates_svm_shap.rda | Bin 6998 -> 7000 bytes data/penguin_xgb_pred.rda | Bin 978 -> 373 bytes data/penguin_xgb_shap.rda | Bin 4682 -> 1743 bytes inst/preprocessing/_main_preprocess.r | 2 +- .../cheem/data/preprocess_ames2018.rds | Bin 71373 -> 71383 bytes .../shiny_apps/cheem/data/preprocess_fifa.rds | Bin 118155 -> 118167 bytes .../preprocess_toy_mixture_regression.rds | Bin 39113 -> 39129 bytes .../data/preprocess_toy_quad_regression.rds | Bin 31353 -> 31364 bytes .../data/preprocess_toy_trig_regression.rds | Bin 29630 -> 29638 bytes man/amesHousing2018.Rd | 322 ++++++------ man/ames_rf_pred.Rd | 179 +++---- man/cheem.Rd | 56 +- man/cheem_ls.Rd | 4 +- man/chocolates.Rd | 168 +++--- man/chocolates_svm_pred.Rd | 198 +++---- man/penguin_xgb_pred.Rd | 175 +++---- man/proto_basis1d_distribution.Rd | 182 +++---- 28 files changed, 1238 insertions(+), 1221 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 6f42533..6378207 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,14 +1,13 @@ Package: cheem Title: Interactively Explore Local Explanations with the Radial Tour -Version: 0.3.1.9000 +Version: 0.4.0.0 Authors@R: c( person("Nicholas", "Spyrison", role = c("aut", "cre"), email = "spyrison@gmail.com", comment = c(ORCID = "https://orcid.org/0000-0002-8417-0212")) ) -Description: Given a tree-based machine learning model, calculate the tree SHAP - ; - local explanation of every observation. View the data space, explanation +Description: Given a non-linear model, calculate the local explanation. + We purpose view the data space, explanation space, and model residuals as ensemble graphic interactive on a shiny application. After an observation of interest is identified, the normalized variable importance of the local explanation is used as a 1D projection diff --git a/NEWS.md b/NEWS.md index c0178c3..71946f6 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,61 +1,65 @@ -# cheem v0.3.1 - -- Repaired packagedown site! -- Fixed news on packagedown site. -- Recreate the classification saved cases, they fit too well to work for illustrations. -- Trying to include more axis context in `global_view()`. Will explore text and facet panel titles. -- Minor documentation and code clean up and clarifications. - - -# cheem v0.3.0 -- Generalized for any attribution - -- Rebase all functions from expecting a unified `treeshap::shap()` to generalized -data frame or matrix format for arbitrary attribution spaces. -- Rework vignette and examples to reflect this change. -- Added precomputed predictions and attributions for the Ames, Chocolates, and Penguins datasets. This allows users to run attribution-agnostic functions without dependencies. -- Add `subset_cheem()`, a convenience function for subsetting cheem lists after construction. -- Removed plotly subplot variations of visuals: `global_view_subplots()`, `radial_cheem_tour_subplots()`. These were development variations never used in the shiny app. -- Minor function renames for parsimony and consistency. - - -# cheem v0.2.0 (CRAN releases here on out) - -## App related changes - -- Added vignette: _Getting started with cheem_. -- Added __pkgdown__ site: https://nspyrison.github.io/cheem/. -- Added global model performance metrics to shiny app. -- In `global_view()`, added yhaty panel (residual plot/confusion matrix). -- In `global_view()`, added color options: log_maha.data and cor_attr.y. -- In `cheem_radial_tour()`, added regression case panel with additional fixed y of residual. -- In app radial tour inputs, added inclusion variable, subsetting variables used in radial tour. -- `plotly::subplot()` variants of `global_view()` & `cheem_radial_tour()`. -- Added AmesHousing data, chocolates, and new toy simulated datasets (shiny app only). -- Reduced shiny app wording. - -## Internal & utilities - -- Major rebase of `cheem_ls()`. -- Added `linear/logistic_tform()` to suggest an alpha as a function of the number of observations. - - -## Sourcing __treeshap__ - -- __drat__ repository hosting __treeshap__ did not work with debian and window rhub platforms; -- Minimally ported functions and cpp source files with @author & @source. Changed examples for consistency and smaller code base support. -- as of v0.3.0, cheem was generalized to all local variable attributions, so this is not a concern. - - -# cheem v0.1.0 (GitHub only, commit 283da4) - -## Primary preprocessing functions - -- `default_rf()` create a `randomForest::randomForest()` with more conservative defaults. -- `attr_df_treeshap()` create `treeshap::treeshap()` local explanations of each observation. -- `cheem_ls()` create a cheem list of prepared tables for use in `run_app()`. - -## Primary visual functions - -- `run_app()` which is a shiny app consuming the following two: -- `global_view()` linked 'plotly' of approximations of data- and attribution-spaces with model information. -- `cheem_radial_tour()` create `spinifex::ggtour` of the specified radial tour. Consumed by animate_plotly, animate_gganimate, or filmstrip. +# cheem v0.4.0 + +- Repaired packagedown site! +- Fixed news on packagedown site. +- Shiny app has go buttons rather than waiting after every input change. +- Shiny app text, plot dimensions, and text cleaned up. +- Classification tour now uses a horizontal layout. +- Cleaned up the text on the facet panels for `global_tour()` and `radial_cheem_tour()`. +- Recreate the saved classification model, they fit too well to work as illustrations. +- Set seed more consistently. All model and attribution shifted a bit, but will be more replicable going forward. +- Minor documentation and code clean up and clarifications. + + +# cheem v0.3.0 -- Generalized for any attribution + +- Rebase all functions from expecting a unified `treeshap::shap()` to generalized +data frame or matrix format for arbitrary attribution spaces. +- Rework vignette and examples to reflect this change. +- Added precomputed predictions and attributions for the Ames, Chocolates, and Penguins datasets. This allows users to run attribution-agnostic functions without dependencies. +- Add `subset_cheem()`, a convenience function for subsetting cheem lists after construction. +- Removed plotly subplot variations of visuals: `global_view_subplots()`, `radial_cheem_tour_subplots()`. These were development variations never used in the shiny app. +- Minor function renames for parsimony and consistency. + + +# cheem v0.2.0 (CRAN releases here on out) + +## App related changes + +- Added vignette: _Getting started with cheem_. +- Added __pkgdown__ site: https://nspyrison.github.io/cheem/. +- Added global model performance metrics to shiny app. +- In `global_view()`, added yhaty panel (residual plot/confusion matrix). +- In `global_view()`, added color options: log_maha.data and cor_attr.y. +- In `cheem_radial_tour()`, added regression case panel with additional fixed y of residual. +- In app radial tour inputs, added inclusion variable, subsetting variables used in radial tour. +- `plotly::subplot()` variants of `global_view()` & `cheem_radial_tour()`. +- Added AmesHousing data, chocolates, and new toy simulated datasets (shiny app only). +- Reduced shiny app wording. + +## Internal & utilities + +- Major rebase of `cheem_ls()`. +- Added `linear/logistic_tform()` to suggest an alpha as a function of the number of observations. + + +## Sourcing __treeshap__ + +- __drat__ repository hosting __treeshap__ did not work with debian and window rhub platforms; +- Minimally ported functions and cpp source files with @author & @source. Changed examples for consistency and smaller code base support. +- as of v0.3.0, cheem was generalized to all local variable attributions, so this is not a concern. + + +# cheem v0.1.0 (GitHub only, commit 283da4) + +## Primary preprocessing functions + +- `default_rf()` create a `randomForest::randomForest()` with more conservative defaults. +- `attr_df_treeshap()` create `treeshap::treeshap()` local explanations of each observation. +- `cheem_ls()` create a cheem list of prepared tables for use in `run_app()`. + +## Primary visual functions + +- `run_app()` which is a shiny app consuming the following two: +- `global_view()` linked 'plotly' of approximations of data- and attribution-spaces with model information. +- `cheem_radial_tour()` create `spinifex::ggtour` of the specified radial tour. Consumed by animate_plotly, animate_gganimate, or filmstrip. diff --git a/R/1_cheem_lists.r b/R/1_cheem_lists.r index 418bb1d..40b9cd4 100644 --- a/R/1_cheem_lists.r +++ b/R/1_cheem_lists.r @@ -166,7 +166,7 @@ global_view_df_1layer <- function( #' comp <- 2 #' global_view(peng_chm, primary_obs = prim, comparison_obs = comp) #' bas <- sug_basis(peng_xgb_shap, prim, comp) -#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = 1, comparison_obs = 2) +#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = prim, comp) #' ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv) #' animate_plotly(ggt) #' } @@ -203,7 +203,7 @@ global_view_df_1layer <- function( #' comp <- 2 #' global_view(ames_rf_chm, primary_obs = prim, comparison_obs = comp) #' bas <- sug_basis(ames_rf_shap, prim, comp) -#' mv <- sug_manip_var(ames_rf_shap, primary_obs = 1, comparison_obs = 2) +#' mv <- sug_manip_var(ames_rf_shap, primary_obs = prim, comp) #' ggt <- radial_cheem_tour(ames_rf_chm, basis = bas, manip_var = mv) #' animate_plotly(ggt) #' } diff --git a/R/2_visualization.r b/R/2_visualization.r index e3d3508..ce39a7d 100644 --- a/R/2_visualization.r +++ b/R/2_visualization.r @@ -584,26 +584,25 @@ radial_cheem_tour <- function( ### Classification case ----- if(.prob_type == "classification"){ .pred_clas <- decode_df$predicted_class + .facet_fore <- rep("attribution projection", each = .n) ## ggtour ggt <- spinifex::ggtour(.mt_path, .dat, angle = angle, do_center_frame = do_center_frame) + + spinifex::facet_wrap_tour(facet_var = .facet_fore, nrow = 1) + ## Density spinifex::proto_density( aes_args = list(color = .pred_clas, fill = .pred_clas), row_index = row_index, rug_shape = pcp_shape) + - - #Warning message: - #In Ops.factor(yscale, x[, 2]) : '*' not meaningful for factors ## PCP on Basis, 1D proto_basis1d_distribution( cheem_ls$attr_df, primary_obs = .prim_obs, comparison_obs = .comp_obs, - position = "bottom1d", group_by = .pred_clas, + position = "floor1d", group_by = .pred_clas, do_add_pcp_segments = as.logical(do_add_pcp_segments), pcp_shape = pcp_shape, inc_var_nms = inc_var_nms, row_index = row_index) + ## Basis 1D - spinifex::proto_basis1d(position = "bottom1d", manip_col = "black") + + spinifex::proto_basis1d(position = "floor1d", manip_col = "black") + spinifex::proto_origin1d() + ## Highlight comparison obs spinifex::proto_highlight1d( @@ -641,7 +640,7 @@ radial_cheem_tour <- function( ## Foreground: .dat_fore <- rbind(.dat, .dat) .idx_fore <- c(row_index, row_index) - .facet_fore <- factor(rep(c("observed y", "residual"), each = 2 * .n)) + .facet_fore <- factor(rep(c("attribution projection by observed y", "attribution projection by residual"), each = 2 * .n)) .fixed_y <- c(.y, .resid) } else { ## not doubled up data; just fixed_observed y @@ -652,7 +651,7 @@ radial_cheem_tour <- function( ## Foreground: .dat_fore <- .dat .idx_fore <- row_index - .facet_fore <- rep("observed y", each = .n) + .facet_fore <- rep("attribution projection by observed y", each = .n) .class_fore <- .class .fixed_y <- .y } diff --git a/R/7_presaved_attribution.r b/R/7_presaved_attribution.r index 5fa4770..e93f036 100644 --- a/R/7_presaved_attribution.r +++ b/R/7_presaved_attribution.r @@ -1,246 +1,248 @@ -## penguin_xgb_: pred, shap ----- -#' Penguins xgb model predictions and shap values -#' -#' Predictions and shapviz attribution of an xgb model of Penguin data -#' classifying penguin species. -#' -#' @format `penguin_xgb_pred` is a n=333 length vector of the prediction of an -#' xgb model predicting the number of the factor level of the species of penguin. -#' `penguin_xgb_shap` is a (333 x 4) data frame of the shapviz SHAP attribution of -#' the xgb model for each observation. -#' -#' __Replicating__ -#' ``` -#' library(cheem) -#' library(xgboost) -#' library(shapviz) -#' -#' ## Classification setup -#' X <- spinifex::penguins_na.rm[, 1:4] -#' Y <- spinifex::penguins_na.rm$species -#' clas <- spinifex::penguins_na.rm$species -#' -#' ## Model and predict -#' peng_train <- data.matrix(X) %>% -#' xgb.DMatrix(label = Y) -#' peng_xgb_fit <- xgboost(data = peng_train, max.depth = 3, nrounds = 25) -#' penguin_xgb_pred <- predict(peng_xgb_fit, newdata = peng_train) -#' -#' ## shapviz -#' penguin_xgb_shap <- shapviz(peng_xgb_fit, X_pred = peng_train, X = X) -#' penguin_xgb_shap <- penguin_xgb_shap$S -#' -#' if(F){ ## Don't accidentally save -#' save(penguin_xgb_pred, file = "./data/penguin_xgb_pred.rda") -#' save(penguin_xgb_shap, file = "./data/penguin_xgb_shap.rda") -#' #usethis::use_data(penguin_xgb_pred) -#' #usethis::use_data(penguin_xgb_shap) -#' } -#' ``` -#' @keywords datasets -#' @examples -#' library(cheem) -#' -#' ## Classification setup -#' X <- spinifex::penguins_na.rm[, 1:4] -#' Y <- spinifex::penguins_na.rm$species -#' clas <- spinifex::penguins_na.rm$species -#' -#' ## Precomputed predictions and shap attribtion -#' str(penguin_xgb_pred) -#' str(penguin_xgb_shap) -#' -#' ## Cheem -#' peng_chm <- cheem_ls(X, Y, penguin_xgb_shap, penguin_xgb_pred, clas, -#' label = "Penguins, xgb, shapviz") -#' -#' ## Save for use with shiny app (expects an rds file) -#' if(FALSE){ ## Don't accidentally save. -#' saveRDS(peng_chm, "./peng_xgb_shapviz.rds") -#' run_app() ## Select the saved rds file from the data dropdown. -#' } -#' -#' ## Cheem visuals -#' if(interactive()){ -#' prim <- 1 -#' comp <- 2 -#' global_view(peng_chm, primary_obs = prim, comparison_obs = comp) -#' bas <- sug_basis(peng_xgb_shap, prim, comp) -#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = 1, comparison_obs = 2) -#' ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv) -#' animate_plotly(ggt) -#' } -"penguin_xgb_pred" - -#' @rdname penguin_xgb_pred -"penguin_xgb_shap" - - -## chocolate_svm_: pred, shap ----- -#' Chocolate svm model predictions and shap values -#' -#' Predictions and DALEX shap attribution of an svm model of Chocolate data -#' classifying type of chocolate (light/dark). -#' -#' @format `chocolate_svm_pred` is a n=88 length vector of the prediction of an -#' svm model predicting the number of the factor level of the species of penguin. -#' `chocolate_svm_shap` is a (88 x 10) data frame of the DALEX SHAP attribution -#' of the svm model for each observation. -#' -#' __Replicating__ -#' ``` -#' library(cheem) -#' library(e1071) -#' library(DALEX) -#' -#' ## Classification setup -#' X <- chocolates[, 5:14] -#' Y <- chocolates$Type -#' clas <- chocolates$Type -#' -#' ## Model and predict -#' choc_svm_fit <- svm( -#' formula = Y ~ ., data = data.frame(Y, X), -#' type = 'C-classification', kernel = 'linear', probability = TRUE) -#' chocolates_svm_pred <- predict(choc_svm_fit, data.frame(Y, X)) -#' -#' ## SHAP via DALEX, versatile but slow -#' choc_svm_exp <- explain(choc_svm_fit, data = X, y = Y, -#' label = "Chocolates, svm") -#' ## Note that cheem expects a full [n, p] attribution space -#' chocolates_svm_shap <- matrix(NA, nrow(X), ncol(X)) ## init a df of the same structure -#' tictoc::tic("choc svm DALEX shap") -#' sapply(1:nrow(X), function(i){ -#' pps <- predict_parts_shap(choc_svm_exp, new_observation = X[i, ]) -#' ## Keep just the [n, p] local explanations -#' chocolates_svm_shap[i, ] <<- tapply( -#' pps$contribution, pps$variable, mean, na.rm = TRUE) %>% as.vector() -#' }) -#' chocolates_svm_shap <- as.data.frame(chocolates_svm_shap) -#' tictoc::toc() ## ~35-40 sec for me -#' -#' if(F){ ## Don't accidentally save -#' save(chocolates_svm_pred, file = "./data/chocolates_svm_pred.rda") -#' save(chocolates_svm_shap, file = "./data/chocolates_svm_shap.rda") -#' #usethis::use_data(chocolates_svm_pred) -#' #usethis::use_data(chocolates_svm_shap) -#' } -#' ``` -#' @keywords datasets -#' @examples -#' library(cheem) -#' -#' ## Classification setup -#' X <- chocolates[, 5:14] -#' Y <- chocolates$Type -#' clas <- chocolates$Type -#' -#' ## Precomputed predictions and shap attribution -#' str(chocolates_svm_pred) -#' str(chocolates_svm_shap) -#' -#' ## Cheem -#' choc_chm <- cheem_ls(X, Y, chocolates_svm_shap, -#' chocolates_svm_pred, clas, -#' label = "Chocolates, SVM, shap") -#' -#' ## Save for use with shiny app (expects an rds file) -#' if(FALSE){ ## Don't accidentally save. -#' saveRDS(choc_chm, "./chocolates_svm_shap.rds") -#' run_app() ## Select the saved rds file from the data dropdown. -#' } -#' -#' ## Cheem visuals -#' if(interactive()){ -#' prim <- 1 -#' comp <- 2 -#' global_view(peng_chm, primary_obs = prim, comparison_obs = comp) -#' bas <- sug_basis(peng_xgb_shap, prim, comp) -#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = 1, comparison_obs = 2) -#' ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv) -#' animate_plotly(ggt) -#' } -"chocolates_svm_pred" - -#' @rdname chocolates_svm_pred -"chocolates_svm_shap" - - -## ames_rf_: pred, shap ----- -#' Ames random forest model predictions and shap values -#' -#' Predictions and treeshap attribution of a random forest model of North Ames -#' house sales data regressing Sales Price from house and lot variables. -#' -#' @format `ames_rf_pred` is a n=338 length vector of the prediction of an -#' random forest model predicting the numeric House Sales in North Ames. -#' `ames_rf_shap` is a (338 x 9) data frame of the treeshap SHAP attribution of -#' the random forest model for each observation. -#' -#' __Replicating__ -#' ``` -#' library(cheem) -#' library(randomForest) -#' library(treeshap) -#' -#' ## Regression setup -#' dat <- amesHousing2018_NorthAmes -#' X <- dat[, 1:9] -#' Y <- dat$SalePrice -#' clas <- dat$SubclassMS -#' -#' ## Model and treeSHAP -#' ames_rf_fit <- randomForest::randomForest( -#' X, Y, ntree = 125, -#' mtry = ifelse(is_discrete(Y), sqrt(ncol(X)), ncol(X) / 3), -#' nodesize = max(ifelse(is_discrete(Y), 1, 5), nrow(X) / 500)) -#' ames_rf_pred <- predict(ames_rf_fit, X) -#' ames_rf_shap <- treeshap::treeshap( -#' treeshap::randomForest.unify(ames_rf_fit, X), X, FALSE, FALSE) -#' ames_rf_shap <- ames_rf_shap$shaps -#' -#' if(F){ ## Don't accidentally save -#' save(ames_rf_pred, file = "./data/ames_rf_pred.rda") -#' save(ames_rf_shap, file = "./data/ames_rf_shap.rda") -#' #usethis::use_data(ames_rf_pred) -#' #usethis::use_data(ames_rf_shap) -#' } -#' ``` -#' @keywords datasets -#' @examples -#' library(cheem) -#' -#' ## Regression setup -#' dat <- amesHousing2018_NorthAmes -#' X <- dat[, 1:9] -#' Y <- dat$SalePrice -#' clas <- dat$SubclassMS -#' -#' ## Precomputed predictions and shap attribution -#' str(ames_rf_pred) -#' str(ames_rf_shap) -#' -#' ## Cheem -#' ames_chm <- cheem_ls(X, Y, ames_rf_shap, ames_rf_pred, clas, -#' label = "Ames, random forest, treeshap") -#' -#' ## Save for use with shiny app (expects an rds file) -#' if(FALSE){ ## Don't accidentally save. -#' saveRDS(ames_chm, "./ames_rf_tshap.rds") -#' run_app() ## Select the saved rds file from the data dropdown. -#' } -#' -#' ## Cheem visuals -#' if(interactive()){ -#' prim <- 1 -#' comp <- 2 -#' global_view(ames_chm, primary_obs = prim, comparison_obs = comp) -#' bas <- sug_basis(ames_rf_shap, prim, comp) -#' mv <- sug_manip_var(ames_rf_shap, primary_obs = 1, comparison_obs = 2) -#' ggt <- radial_cheem_tour(ames_chm, basis = bas, manip_var = mv) -#' animate_plotly(ggt) -#' } -"ames_rf_pred" - -#' @rdname ames_rf_pred +## penguin_xgb_: pred, shap ----- +#' Penguins xgb model predictions and shap values +#' +#' Predictions and shapviz attribution of an xgb model of Penguin data +#' classifying penguin species. +#' +#' @format `penguin_xgb_pred` is a n=333 length vector of the prediction of an +#' xgb model predicting the number of the factor level of the species of penguin. +#' `penguin_xgb_shap` is a (333 x 4) data frame of the shapviz SHAP attribution of +#' the xgb model for each observation. +#' +#' __Replicating__ +#' ``` +#' library(cheem) +#' library(xgboost) +#' library(shapviz) +#' set.seed(135) +#' +#' ## Classification setup +#' X <- spinifex::penguins_na.rm[, 1:4] +#' Y <- spinifex::penguins_na.rm$species +#' clas <- spinifex::penguins_na.rm$species +#' +#' ## Model and predict +#' peng_train <- data.matrix(X) %>% +#' xgb.DMatrix(label = Y) +#' peng_xgb_fit <- xgboost(data = peng_train, max.depth = 3, nrounds = 5) +#' penguin_xgb_pred <- predict(peng_xgb_fit, newdata = peng_train) +#' +#' ## shapviz +#' penguin_xgb_shap <- shapviz(peng_xgb_fit, X_pred = peng_train, X = X) +#' penguin_xgb_shap <- penguin_xgb_shap$S +#' +#' if(F){ ## Don't accidentally save +#' save(penguin_xgb_pred, file = "./data/penguin_xgb_pred.rda") +#' save(penguin_xgb_shap, file = "./data/penguin_xgb_shap.rda") +#' #usethis::use_data(penguin_xgb_pred) +#' #usethis::use_data(penguin_xgb_shap) +#' } +#' ``` +#' @keywords datasets +#' @examples +#' library(cheem) +#' +#' ## Classification setup +#' X <- spinifex::penguins_na.rm[, 1:4] +#' Y <- spinifex::penguins_na.rm$species +#' clas <- spinifex::penguins_na.rm$species +#' +#' ## Precomputed predictions and shap attribtion +#' str(penguin_xgb_pred) +#' str(penguin_xgb_shap) +#' +#' ## Cheem +#' peng_chm <- cheem_ls(X, Y, penguin_xgb_shap, penguin_xgb_pred, clas, +#' label = "Penguins, xgb, shapviz") +#' +#' ## Save for use with shiny app (expects an rds file) +#' if(FALSE){ ## Don't accidentally save. +#' saveRDS(peng_chm, "./peng_xgb_shapviz.rds") +#' run_app() ## Select the saved rds file from the data dropdown. +#' } +#' +#' ## Cheem visuals +#' if(interactive()){ +#' prim <- 1 +#' comp <- 2 +#' global_view(peng_chm, primary_obs = prim, comparison_obs = comp) +#' bas <- sug_basis(peng_xgb_shap, prim, comp) +#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = prim, comp) +#' ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv) +#' animate_plotly(ggt) +#' } +"penguin_xgb_pred" + +#' @rdname penguin_xgb_pred +"penguin_xgb_shap" + + +## chocolate_svm_: pred, shap ----- +#' Chocolate svm model predictions and shap values +#' +#' Predictions and DALEX shap attribution of an svm model of Chocolate data +#' classifying type of chocolate (light/dark). +#' +#' @format `chocolate_svm_pred` is a n=88 length vector of the prediction of an +#' svm model predicting the number of the factor level of the species of penguin. +#' `chocolate_svm_shap` is a (88 x 10) data frame of the DALEX SHAP attribution +#' of the svm model for each observation. +#' +#' __Replicating__ +#' ``` +#' library(cheem) +#' library(e1071) +#' library(DALEX) +#' set.seed(135) +#' +#' ## Classification setup +#' X <- chocolates[, 5:14] +#' Y <- chocolates$Type +#' clas <- chocolates$Type +#' +#' ## Model and predict +#' choc_svm_fit <- svm( +#' formula = Y ~ ., data = data.frame(Y, X), +#' type = 'C-classification', kernel = 'linear', probability = TRUE) +#' chocolates_svm_pred <- predict(choc_svm_fit, data.frame(Y, X)) +#' +#' ## SHAP via DALEX, versatile but slow +#' choc_svm_exp <- explain(choc_svm_fit, data = X, y = Y, +#' label = "Chocolates, svm") +#' ## Note that cheem expects a full [n, p] attribution space +#' ## Shap takes about ~30-40 sec for me +#' chocolates_svm_shap <- matrix(NA, nrow(X), ncol(X)) ## init a df of the same structure +#' sapply(1:nrow(X), function(i){ +#' pps <- predict_parts_shap(choc_svm_exp, new_observation = X[i, ]) +#' ## Keep just the [n, p] local explanations +#' chocolates_svm_shap[i, ] <<- tapply( +#' pps$contribution, pps$variable, mean, na.rm = TRUE) %>% as.vector() +#' }) +#' chocolates_svm_shap <- as.data.frame(chocolates_svm_shap) +#' +#' if(F){ ## Don't accidentally save +#' save(chocolates_svm_pred, file = "./data/chocolates_svm_pred.rda") +#' save(chocolates_svm_shap, file = "./data/chocolates_svm_shap.rda") +#' #usethis::use_data(chocolates_svm_pred) +#' #usethis::use_data(chocolates_svm_shap) +#' } +#' ``` +#' @keywords datasets +#' @examples +#' library(cheem) +#' +#' ## Classification setup +#' X <- chocolates[, 5:14] +#' Y <- chocolates$Type +#' clas <- chocolates$Type +#' +#' ## Precomputed predictions and shap attribution +#' str(chocolates_svm_pred) +#' str(chocolates_svm_shap) +#' +#' ## Cheem +#' choc_chm <- cheem_ls(X, Y, chocolates_svm_shap, +#' chocolates_svm_pred, clas, +#' label = "Chocolates, SVM, shap") +#' +#' ## Save for use with shiny app (expects an rds file) +#' if(FALSE){ ## Don't accidentally save. +#' saveRDS(choc_chm, "./chocolates_svm_shap.rds") +#' run_app() ## Select the saved rds file from the data dropdown. +#' } +#' +#' ## Cheem visuals +#' if(interactive()){ +#' prim <- 1 +#' comp <- 2 +#' global_view(choc_chm, primary_obs = prim, comparison_obs = comp) +#' bas <- sug_basis(chocolates_svm_shap, prim, comp) +#' mv <- sug_manip_var(chocolates_svm_shap, primary_obs = prim, comp) +#' ggt <- radial_cheem_tour(choc_chm, basis = bas, manip_var = mv) +#' animate_plotly(ggt) +#' } +"chocolates_svm_pred" + +#' @rdname chocolates_svm_pred +"chocolates_svm_shap" + + +## ames_rf_: pred, shap ----- +#' Ames random forest model predictions and shap values +#' +#' Predictions and treeshap attribution of a random forest model of North Ames +#' house sales data regressing Sales Price from house and lot variables. +#' +#' @format `ames_rf_pred` is a n=338 length vector of the prediction of an +#' random forest model predicting the numeric House Sales in North Ames. +#' `ames_rf_shap` is a (338 x 9) data frame of the treeshap SHAP attribution of +#' the random forest model for each observation. +#' +#' __Replicating__ +#' ``` +#' library(cheem) +#' library(randomForest) +#' library(treeshap) +#' set.seed(135) +#' +#' ## Regression setup +#' dat <- amesHousing2018_NorthAmes +#' X <- dat[, 1:9] +#' Y <- dat$SalePrice +#' clas <- dat$SubclassMS +#' +#' ## Model and treeSHAP +#' ames_rf_fit <- randomForest::randomForest( +#' X, Y, ntree = 125, +#' mtry = ifelse(is_discrete(Y), sqrt(ncol(X)), ncol(X) / 3), +#' nodesize = max(ifelse(is_discrete(Y), 1, 5), nrow(X) / 500)) +#' ames_rf_pred <- predict(ames_rf_fit, X) +#' ames_rf_shap <- treeshap::treeshap( +#' treeshap::randomForest.unify(ames_rf_fit, X), X, FALSE, FALSE) +#' ames_rf_shap <- ames_rf_shap$shaps +#' +#' if(F){ ## Don't accidentally save +#' save(ames_rf_pred, file = "./data/ames_rf_pred.rda") +#' save(ames_rf_shap, file = "./data/ames_rf_shap.rda") +#' #usethis::use_data(ames_rf_pred) +#' #usethis::use_data(ames_rf_shap) +#' } +#' ``` +#' @keywords datasets +#' @examples +#' library(cheem) +#' +#' ## Regression setup +#' dat <- amesHousing2018_NorthAmes +#' X <- dat[, 1:9] +#' Y <- dat$SalePrice +#' clas <- dat$SubclassMS +#' +#' ## Precomputed predictions and shap attribution +#' str(ames_rf_pred) +#' str(ames_rf_shap) +#' +#' ## Cheem +#' ames_chm <- cheem_ls(X, Y, ames_rf_shap, ames_rf_pred, clas, +#' label = "Ames, random forest, treeshap") +#' +#' ## Save for use with shiny app (expects an rds file) +#' if(FALSE){ ## Don't accidentally save. +#' saveRDS(ames_chm, "./ames_rf_tshap.rds") +#' run_app() ## Select the saved rds file from the data dropdown. +#' } +#' +#' ## Cheem visuals +#' if(interactive()){ +#' prim <- 1 +#' comp <- 2 +#' global_view(ames_chm, primary_obs = prim, comparison_obs = comp) +#' bas <- sug_basis(ames_rf_shap, prim, comp) +#' mv <- sug_manip_var(ames_rf_shap, primary_obs = prim, comp) +#' ggt <- radial_cheem_tour(ames_chm, basis = bas, manip_var = mv) +#' animate_plotly(ggt) +#' } +"ames_rf_pred" + +#' @rdname ames_rf_pred "ames_rf_shap" \ No newline at end of file diff --git a/R/9_data.r b/R/9_data.r index e4650b9..30f122f 100644 --- a/R/9_data.r +++ b/R/9_data.r @@ -1,218 +1,218 @@ -## Ames housing 2018 ---- -#' Ames housing data 2018 -#' -#' House sales prices from Ames, Iowa, USA between 2006 and 2010. Only complete -#' numeric observations remain. -#' -#' \describe{ -#' \item{amesHousing2018}{Complete data.frame, n = 2291, 18 numeric variable -#' (including 2 temporal: MoSold, YrSold ), response variable SalePrice, -#' 3 class factors.} -#' \item{amesHousing2018_NorthAmes}{A simplified subsample, just North Ames -#' (largest neighborhood). Complete data.frame, n = 338, 9 numeric variables, -#' response variable SalePrice, 1 class factor SubclassMS, a zoning subclass.} -#' \item{amesHousing2018_raw}{Original data from Kaggle, 2930 rows of 82 -#' variables. Sparse rows (639) and sparse/defaulted columns (64) are removed.} -#' } -#' -#' No data dictionary is provided on Kaggle, but amesHousing2018 variables -#' are inferred to be: -#' \itemize{ -#' \item LotFrontage, Length of the front (street facing) side of the lot -#' in yards (0.914m) -#' \item LotArea, Area of the lot in square yards (0.836m^2) -#' \item OverallQual, Overall quality (of the house?), integer in (1, 10) -#' \item OverallCond, Overall condition (of the lot?), integer in (1, 10) -#' \item YearBuild, The year the house was originally built -#' \item BsmtUnfArea, Unfinished basement area, in square yards (0.836m^2) -#' \item TotBsmtArea, Total basement area, in square yards (0.836m^2) -#' \item 1stFlrArea, First (ground) floor living area in square yards (0.836m^2) -#' \item LivingArea, Total living area in square yards (0.836m^2) -#' \item Bathrms, The number of bathrooms -#' \item Bedrms, The number of bedrooms -#' \item TotRms, The total number of rooms -#' \item GarageYrBlt, The year the garage was build -#' \item GarageCars, The number of car spaces in the garage -#' \item GarageArea, The area of the garage in square yards (0.836m^2) -#' \item MoSold, The number of the month of the house sale -#' \item YrSold, The number of the year of the house sale -#' \item SalePrice, The sale of the house in USD (as of the year of sale?) -#' \item SubclassMS, Factor subclass of construction zone, 16 levels -#' \item SubclassMS, Factor major class of construction zone, 7 levels -#' \item Neighborhd, Factor neighborhood of Ames, IA, 28 levels -#' } -#' -#' @format complete data.frame with 2291 rows and 18 numeric variables, -#' SalesPrice, the response variable, and 3 class variables -#' @source De Cock, D. (2011). "Ames, Iowa: Alternative to the Boston Housing Data as an End of Semester Regression Project," \emph{Journal of Statistics Education}, Volume 19, Number 3. -#' \url{http://jse.amstat.org/v19n3/decock/DataDocumentation.txt} -#' \url{http://jse.amstat.org/v19n3/decock.pdf} -#' @source {Kaggle, Ames Housing Dataset} -#' \url{https://www.kaggle.com/prevek18/ames-housing-dataset} -#' -#' __Replicating this dataset:__ -#' ``` -#' if(FALSE) ## Don't accidentally open the URL. -#' browseURL("https://www.kaggle.com/prevek18/ames-housing-dataset") -#' ames <- readr::read_csv("./buildignore/AmesHousing.csv") -#' amesHousing2018_raw <- data.frame(ames) -#' ## save(amesHousing2018_raw, file = "./data/amesHousing2018_raw.rda") -#' -#' ## Complete rows and numeric variables -#' ames1 <- ames[, unlist(lapply(ames, is.numeric))] -#' ames1$Bathrooms <- ames1$`Full Bath` + ames1$`Half Bath` -#' ames1 <- ames1[, c(1:18, 38, 19:37)] -#' col_idx <- !(colnames(ames1) %in% c( -#' "Order", "Mas Vnr Area", "BsmtFin SF 1", "BsmtFin SF 2", -#' "Bsmt Full Bath", "Bsmt Half Bath", "Fireplaces", -#' "Wood Deck SF", "Open Porch SF", "Enclosed Porch", -#' "3Ssn Porch", "Screen Porch", "Pool Area", "Misc Val", "2nd Flr SF", -#' "Low Qual Fin SF", "Full Bath", "Half Bath", "Kitchen AbvGr")) -#' row_idx <- !is.na(ames1$"Garage Yr Blt") & -#' !is.na(ames1$"Lot Frontage") & -#' !is.na(ames1$"Bsmt Unf SF") & -#' !is.na(ames1$"Total Bsmt SF") -#' ames2 <- as.data.frame(ames1[row_idx, col_idx]) -#' -#' ## Looking for character classes to keep: -#' ames_char <- ames[, unlist(lapply(ames, is.character))] -#' ames_clas <- as.data.frame(lapply(ames_char, factor))[, -1] -#' ames_clasint <- data.frame(lapply(ames_clas, as.integer)) -#' col_idx_char <- which(names(ames_clas) %in% -#' c("MS.SubClass", "MS.Zoning", "Neighborhood")) -#' classes <- ames_clas[row_idx, col_idx_char] -#' -#' amesHousing2018 <- cbind(ames2, classes) -#' names(amesHousing2018) <- c( -#' "LotFrontage", "LotArea","OverallQual", "OverallCond", "YearBuild", -#' "YearRemod", "BsmtUnfArea", "TotBsmtArea", "1stFlrArea", "LivingArea", -#' "Bathrms", "Bedrms", "TotRms", "GarageYrBlt", "GarageCars", "GarageArea", -#' "MoSold", "YrSold", "SalePrice", "SubclassMS", "ZoneMS", "Neighborhd") -#' ## save(amesHousing2018, file = "./data/amesHousing2018.rda") -#' -#' .thin_col_idx <- names(amesHousing2018) %in% c( -#' "LotArea", "OverallQual", "YearBuild", -#' "LivingArea", "Bathrms", "Bedrms", "TotRms", -#' "GarageYrBlt", "GarageArea", "SalePrice", "SubclassMS") -#' amesHousing2018_thin <- amesHousing2018[, .thin_col_idx] -#' -#' ## subset to north ames, and only 5 largest subclasses -#' r_idx <- amesHousing2018$Neighborhd == "NAmes" & -#' amesHousing2018$SubclassMS %in% c("020", "050", "080", "090", "060") -#' amesHousing2018_NorthAmes <- amesHousing2018_thin[r_idx, ] -#' amesHousing2018_NorthAmes$SubclassMS <- factor( -#' amesHousing2018_NorthAmes$SubclassMS, -#' unique(amesHousing2018_NorthAmes$SubclassMS)) -#' if(F){ ## Don't accidentally save -#' save(amesHousing2018_NorthAmes, file = "./data/amesHousing2018_NorthAmes.rda") -#' ``` -#' @keywords datasets -#' @examples -#' library(cheem) -#' -#' ## Regression setup: -#' dat <- amesHousing2018_NorthAmes -#' X <- dat[, 1:9] -#' Y <- dat$SalePrice -#' clas <- dat$SubclassMS -#' -#' ## Cheem list -#' ames_rf_chm <- cheem_ls(X, Y, ames_rf_shap, ames_rf_pred, clas, -#' label = "North Ames, RF, SHAP") -#' ## Cheem visuals -#' if(interactive()){ -#' prim <- 1 -#' comp <- 2 -#' global_view(ames_rf_chm, primary_obs = prim, comparison_obs = comp) -#' bas <- sug_basis(ames_rf_chm, prim, comp) -#' mv <- sug_manip_var(ames_rf_chm, primary_obs = 1, comparison_obs = 2) -#' ggt <- radial_cheem_tour(ames_rf_chm, basis = bas, manip_var = mv) -#' animate_plotly(ggt) -#' } -#' -#' ## Save for use with shiny app (expects an rds file) -#' if(FALSE){ ## Don't accidentally save. -#' saveRDS(ames_rf_chm, "./NAmes_rf_tshap.rds") -#' run_app() ## Select the saved rds file from the data drop down. -#' } -"amesHousing2018" - -#' @rdname amesHousing2018 -"amesHousing2018_raw" - -#' @rdname amesHousing2018 -"amesHousing2018_NorthAmes" - - -## Chocolates ----- -#' Chocolates dataset -#' -#' The chocolates data was compiled by students at Iowa State University of -#' STAT503 (circa 2015) taught by Dianne Cook. Nutrition label information -#' on the chocolates as listed on manufacturer websites. -#' All numbers were normalized to be equivalent to a 100g serving. -#' Units of measurement are listed in the variable name. -#' -#' @format A complete data.frame with 88 observations and 10 numeric variables, -#' name of the chocolate, manufacturer, country, and type of the chocolate. -#' \itemize{ -#' \item Name, the name of the chocolate -#' \item MFR, chocolate manufacturer -#' \item Country, the country the manufacturer is incorporated. -#' \item Type, the type of chocolate according to the website, either 'Dark' -#' or 'Milk" -#' \item Calories, the number of calories per 100 grams -#' \item CalFat, calories from fat per 100 grams -#' \item TotFat_g, grams of total fat per 100 grams -#' \item SatFat_g, grams of saturated fat per 100 grams -#' \item Chol_mg, milligrams of cholesterol per 100 grams -#' \item Na_mg, milligrams of sodium (salt) per 100 grams -#' \item Carbs_g, grams of carbohydrates per 100 grams -#' \item Fiber_g, grams of fiber per 100 grams -#' \item Sugars_g, grams of sugar per 100 grams -#' \item Protein_g, grams of sugar per 100 grams -#' } -#' @source {Monash University, Introduction to Machine Learning course} \url{https://iml.numbat.space/} -#' -#' __Replicating this dataset:__ -#' ``` -#' if(FALSE) ## Don't accidentally open the URL. -#' browseURL("https://iml.numbat.space/") -#' ## Accessed Jan 2022 -#' chocolates <- readr::read_csv("https://iml.numbat.space/data/chocolates.csv") -#' chocolates <- data.frame(chocolates) -#' chocolates[, 2] <- factor(chocolates[, 2]) -#' chocolates[, 3] <- factor(chocolates[, 3]) -#' chocolates[, 4] <- factor(chocolates[, 4]) -#' if(F){ ## Don't accidentally save -#' save(chocolates, file = "./data/chocolates.rda") -#' ``` -#' @examples -#' library(cheem) -#' -#' ## Classification setup -#' X <- chocolates[, 5:14] -#' Y <- chocolates$Type -#' clas <- chocolates$Type -#' -#' ## Cheem -#' choc_chm <- cheem_ls(X, Y, chocolates_svm_shap, chocolates_svm_pred, clas, -#' label = "Chocolates, LM, shap") -#' -#' ## Save for use with shiny app (expects an rds file) -#' if(FALSE){ ## Don't accidentally save. -#' saveRDS(choc_chm, "./chocolates_svm_shap.rds") -#' run_app() ## Select the saved rds file from the data dropdown. -#' } -#' -#' ## Cheem visuals -#' if(interactive()){ -#' prim <- 1 -#' comp <- 2 -#' global_view(peng_chm, primary_obs = prim, comparison_obs = comp) -#' bas <- sug_basis(peng_xgb_shap, prim, comp) -#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = 1, comparison_obs = 2) -#' ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv) -#' animate_plotly(ggt) -#' } +## Ames housing 2018 ---- +#' Ames housing data 2018 +#' +#' House sales prices from Ames, Iowa, USA between 2006 and 2010. Only complete +#' numeric observations remain. +#' +#' \describe{ +#' \item{amesHousing2018}{Complete data.frame, n = 2291, 18 numeric variable +#' (including 2 temporal: MoSold, YrSold ), response variable SalePrice, +#' 3 class factors.} +#' \item{amesHousing2018_NorthAmes}{A simplified subsample, just North Ames +#' (largest neighborhood). Complete data.frame, n = 338, 9 numeric variables, +#' response variable SalePrice, 1 class factor SubclassMS, a zoning subclass.} +#' \item{amesHousing2018_raw}{Original data from Kaggle, 2930 rows of 82 +#' variables. Sparse rows (639) and sparse/defaulted columns (64) are removed.} +#' } +#' +#' No data dictionary is provided on Kaggle, but amesHousing2018 variables +#' are inferred to be: +#' \itemize{ +#' \item LotFrontage, Length of the front (street facing) side of the lot +#' in yards (0.914m) +#' \item LotArea, Area of the lot in square yards (0.836m^2) +#' \item OverallQual, Overall quality (of the house?), integer in (1, 10) +#' \item OverallCond, Overall condition (of the lot?), integer in (1, 10) +#' \item YearBuild, The year the house was originally built +#' \item BsmtUnfArea, Unfinished basement area, in square yards (0.836m^2) +#' \item TotBsmtArea, Total basement area, in square yards (0.836m^2) +#' \item 1stFlrArea, First (ground) floor living area in square yards (0.836m^2) +#' \item LivingArea, Total living area in square yards (0.836m^2) +#' \item Bathrms, The number of bathrooms +#' \item Bedrms, The number of bedrooms +#' \item TotRms, The total number of rooms +#' \item GarageYrBlt, The year the garage was build +#' \item GarageCars, The number of car spaces in the garage +#' \item GarageArea, The area of the garage in square yards (0.836m^2) +#' \item MoSold, The number of the month of the house sale +#' \item YrSold, The number of the year of the house sale +#' \item SalePrice, The sale of the house in USD (as of the year of sale?) +#' \item SubclassMS, Factor subclass of construction zone, 16 levels +#' \item SubclassMS, Factor major class of construction zone, 7 levels +#' \item Neighborhd, Factor neighborhood of Ames, IA, 28 levels +#' } +#' +#' @format complete data.frame with 2291 rows and 18 numeric variables, +#' SalesPrice, the response variable, and 3 class variables +#' @source De Cock, D. (2011). "Ames, Iowa: Alternative to the Boston Housing Data as an End of Semester Regression Project," \emph{Journal of Statistics Education}, Volume 19, Number 3. +#' \url{http://jse.amstat.org/v19n3/decock/DataDocumentation.txt} +#' \url{http://jse.amstat.org/v19n3/decock.pdf} +#' @source {Kaggle, Ames Housing Dataset} +#' \url{https://www.kaggle.com/prevek18/ames-housing-dataset} +#' +#' __Replicating this dataset:__ +#' ``` +#' if(FALSE) ## Don't accidentally open the URL. +#' browseURL("https://www.kaggle.com/prevek18/ames-housing-dataset") +#' ames <- readr::read_csv("./buildignore/AmesHousing.csv") +#' amesHousing2018_raw <- data.frame(ames) +#' ## save(amesHousing2018_raw, file = "./data/amesHousing2018_raw.rda") +#' +#' ## Complete rows and numeric variables +#' ames1 <- ames[, unlist(lapply(ames, is.numeric))] +#' ames1$Bathrooms <- ames1$`Full Bath` + ames1$`Half Bath` +#' ames1 <- ames1[, c(1:18, 38, 19:37)] +#' col_idx <- !(colnames(ames1) %in% c( +#' "Order", "Mas Vnr Area", "BsmtFin SF 1", "BsmtFin SF 2", +#' "Bsmt Full Bath", "Bsmt Half Bath", "Fireplaces", +#' "Wood Deck SF", "Open Porch SF", "Enclosed Porch", +#' "3Ssn Porch", "Screen Porch", "Pool Area", "Misc Val", "2nd Flr SF", +#' "Low Qual Fin SF", "Full Bath", "Half Bath", "Kitchen AbvGr")) +#' row_idx <- !is.na(ames1$"Garage Yr Blt") & +#' !is.na(ames1$"Lot Frontage") & +#' !is.na(ames1$"Bsmt Unf SF") & +#' !is.na(ames1$"Total Bsmt SF") +#' ames2 <- as.data.frame(ames1[row_idx, col_idx]) +#' +#' ## Looking for character classes to keep: +#' ames_char <- ames[, unlist(lapply(ames, is.character))] +#' ames_clas <- as.data.frame(lapply(ames_char, factor))[, -1] +#' ames_clasint <- data.frame(lapply(ames_clas, as.integer)) +#' col_idx_char <- which(names(ames_clas) %in% +#' c("MS.SubClass", "MS.Zoning", "Neighborhood")) +#' classes <- ames_clas[row_idx, col_idx_char] +#' +#' amesHousing2018 <- cbind(ames2, classes) +#' names(amesHousing2018) <- c( +#' "LotFrontage", "LotArea","OverallQual", "OverallCond", "YearBuild", +#' "YearRemod", "BsmtUnfArea", "TotBsmtArea", "1stFlrArea", "LivingArea", +#' "Bathrms", "Bedrms", "TotRms", "GarageYrBlt", "GarageCars", "GarageArea", +#' "MoSold", "YrSold", "SalePrice", "SubclassMS", "ZoneMS", "Neighborhd") +#' ## save(amesHousing2018, file = "./data/amesHousing2018.rda") +#' +#' .thin_col_idx <- names(amesHousing2018) %in% c( +#' "LotArea", "OverallQual", "YearBuild", +#' "LivingArea", "Bathrms", "Bedrms", "TotRms", +#' "GarageYrBlt", "GarageArea", "SalePrice", "SubclassMS") +#' amesHousing2018_thin <- amesHousing2018[, .thin_col_idx] +#' +#' ## subset to north ames, and only 5 largest subclasses +#' r_idx <- amesHousing2018$Neighborhd == "NAmes" & +#' amesHousing2018$SubclassMS %in% c("020", "050", "080", "090", "060") +#' amesHousing2018_NorthAmes <- amesHousing2018_thin[r_idx, ] +#' amesHousing2018_NorthAmes$SubclassMS <- factor( +#' amesHousing2018_NorthAmes$SubclassMS, +#' unique(amesHousing2018_NorthAmes$SubclassMS)) +#' if(F){ ## Don't accidentally save +#' save(amesHousing2018_NorthAmes, file = "./data/amesHousing2018_NorthAmes.rda") +#' ``` +#' @keywords datasets +#' @examples +#' library(cheem) +#' +#' ## Regression setup: +#' dat <- amesHousing2018_NorthAmes +#' X <- dat[, 1:9] +#' Y <- dat$SalePrice +#' clas <- dat$SubclassMS +#' +#' ## Cheem list +#' ames_rf_chm <- cheem_ls(X, Y, ames_rf_shap, ames_rf_pred, clas, +#' label = "North Ames, RF, SHAP") +#' ## Cheem visuals +#' if(interactive()){ +#' prim <- 1 +#' comp <- 2 +#' global_view(ames_rf_chm, primary_obs = prim, comparison_obs = comp) +#' bas <- sug_basis(ames_rf_chm, prim, comp) +#' mv <- sug_manip_var(ames_rf_chm, primary_obs = prim, comp) +#' ggt <- radial_cheem_tour(ames_rf_chm, basis = bas, manip_var = mv) +#' animate_plotly(ggt) +#' } +#' +#' ## Save for use with shiny app (expects an rds file) +#' if(FALSE){ ## Don't accidentally save. +#' saveRDS(ames_rf_chm, "./NAmes_rf_tshap.rds") +#' run_app() ## Select the saved rds file from the data drop down. +#' } +"amesHousing2018" + +#' @rdname amesHousing2018 +"amesHousing2018_raw" + +#' @rdname amesHousing2018 +"amesHousing2018_NorthAmes" + + +## Chocolates ----- +#' Chocolates dataset +#' +#' The chocolates data was compiled by students at Iowa State University of +#' STAT503 (circa 2015) taught by Dianne Cook. Nutrition label information +#' on the chocolates as listed on manufacturer websites. +#' All numbers were normalized to be equivalent to a 100g serving. +#' Units of measurement are listed in the variable name. +#' +#' @format A complete data.frame with 88 observations and 10 numeric variables, +#' name of the chocolate, manufacturer, country, and type of the chocolate. +#' \itemize{ +#' \item Name, the name of the chocolate +#' \item MFR, chocolate manufacturer +#' \item Country, the country the manufacturer is incorporated. +#' \item Type, the type of chocolate according to the website, either 'Dark' +#' or 'Milk" +#' \item Calories, the number of calories per 100 grams +#' \item CalFat, calories from fat per 100 grams +#' \item TotFat_g, grams of total fat per 100 grams +#' \item SatFat_g, grams of saturated fat per 100 grams +#' \item Chol_mg, milligrams of cholesterol per 100 grams +#' \item Na_mg, milligrams of sodium (salt) per 100 grams +#' \item Carbs_g, grams of carbohydrates per 100 grams +#' \item Fiber_g, grams of fiber per 100 grams +#' \item Sugars_g, grams of sugar per 100 grams +#' \item Protein_g, grams of sugar per 100 grams +#' } +#' @source {Monash University, Introduction to Machine Learning course} \url{https://iml.numbat.space/} +#' +#' __Replicating this dataset:__ +#' ``` +#' if(FALSE) ## Don't accidentally open the URL. +#' browseURL("https://iml.numbat.space/") +#' ## Accessed Jan 2022 +#' chocolates <- readr::read_csv("https://iml.numbat.space/data/chocolates.csv") +#' chocolates <- data.frame(chocolates) +#' chocolates[, 2] <- factor(chocolates[, 2]) +#' chocolates[, 3] <- factor(chocolates[, 3]) +#' chocolates[, 4] <- factor(chocolates[, 4]) +#' if(F){ ## Don't accidentally save +#' save(chocolates, file = "./data/chocolates.rda") +#' ``` +#' @examples +#' library(cheem) +#' +#' ## Classification setup +#' X <- chocolates[, 5:14] +#' Y <- chocolates$Type +#' clas <- chocolates$Type +#' +#' ## Cheem +#' choc_chm <- cheem_ls(X, Y, chocolates_svm_shap, chocolates_svm_pred, clas, +#' label = "Chocolates, LM, shap") +#' +#' ## Save for use with shiny app (expects an rds file) +#' if(FALSE){ ## Don't accidentally save. +#' saveRDS(choc_chm, "./chocolates_svm_shap.rds") +#' run_app() ## Select the saved rds file from the data dropdown. +#' } +#' +#' ## Cheem visuals +#' if(interactive()){ +#' prim <- 1 +#' comp <- 2 +#' global_view(peng_chm, primary_obs = prim, comparison_obs = comp) +#' bas <- sug_basis(peng_xgb_shap, prim, comp) +#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = prim, comp) +#' ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv) +#' animate_plotly(ggt) +#' } "chocolates" \ No newline at end of file diff --git a/buildignore/_readme_output_radial_cheem_tour.r b/buildignore/_readme_output_radial_cheem_tour.r index 4d20c83..302fd77 100644 --- a/buildignore/_readme_output_radial_cheem_tour.r +++ b/buildignore/_readme_output_radial_cheem_tour.r @@ -1,46 +1,51 @@ -# Create some output for Readme ---- - -library(cheem) -library(spinifex) - -if(F) ## Attempting to parallelize, small test but didn't look 11x. - browseURL( - "https://stackoverflow.com/questions/67321487/how-to-use-multiple-cores-to-make-gganimate-faster") -n_cores <- future::availableCores() ## 12 on Acer laptop -future::plan("multiprocess", workers = n_cores - 1) - -## Classification: -X <- penguins_na.rm[, 1:4] -clas <- penguins_na.rm$species -Y <- as.integer(clas) -colnames(X) <- c("bl", "bd", "fl", "bm") - -rf_fit <- default_rf(X, Y) -## Long runtime for full datasets or complex models: -shap_df <- stop("REPLACE ME") -this_ls <- cheem_ls(X, Y, class = clas, - model = rf_fit, - attr_df = shap_df) - -bas <- sug_basis(shap_df, rownum = 1) -mv <- which(colnames(penguins_ls$attr_df) == "fl") -ggt <- radial_cheem_tour(this_ls, basis = bas, manip_var = mv, - primary_obs = 243, comparison_obs = 169, angle = .25) -if(interactive()){ - ## Render gif - gif <- animate_gganimate( - ggt, height = 2, width = 4.5, units = "in", res = 150) #, render = gganimate::av_renderer()) - ## Save gif - gganimate::anim_save("tour_penguins.gif", animation = gif, path = "./buildignore") - beepr::beep() - - ## Render mp4 - # mp4 <- animate_gganimate( - # ggt, height = 2, width = 4.5, units = "in", res = 150, - # render = gganimate::av_renderer()) - ## Save mp4 - # gganimate::anim_save("tour_penguins.mp4", animation = mp4, path = "./ignore") - # beepr::beep() -} -tictoc::toc() -## , YMMV +# Create some output for Readme ---- + +library(cheem) +library(spinifex) + +if(F) ## Attempting to parallelize, small test but didn't look 11x. + browseURL( + "https://stackoverflow.com/questions/67321487/how-to-use-multiple-cores-to-make-gganimate-faster") +n_cores <- future::availableCores() ## 12 on Acer laptop +future::plan("multiprocess", workers = n_cores - 1) + +## Classification: +X <- penguins_na.rm[, 1:4] +clas <- penguins_na.rm$species +Y <- as.integer(clas) +colnames(X) <- c("bl", "bd", "fl", "bm") + +## Cheem +peng_chm <- cheem_ls(X, Y, penguin_xgb_shap, penguin_xgb_pred, clas, + label = "Penguins, xgb, shapviz") + +bas <- sug_basis(penguin_xgb_shap, rownum = 1) +mv <- which(colnames(X) == "fl") +ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv, + primary_obs = 243, comparison_obs = 169, angle = .10) + +prim <- 243 +comp <- 256 +global_view(peng_chm, primary_obs = prim, comparison_obs = comp) +bas <- sug_basis(penguin_xgb_shap, prim) +mv <- sug_manip_var(penguin_xgb_shap, primary_obs = prim, comparison_obs = comp) +ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv, prim, comp) +animate_plotly(ggt) +if(interactive()){ + ## Render gif + gif <- animate_gganimate( + ggt, height = 2, width = 6, units = "in", res = 150) #, render = gganimate::av_renderer()) + ## Save gif + gganimate::anim_save("tour_penguins.gif", animation = gif, path = "./buildignore") + beepr::beep() + + ## Render mp4 + # mp4 <- animate_gganimate( + # ggt, height = 2, width = 4.5, units = "in", res = 150, + # render = gganimate::av_renderer()) + ## Save mp4 + # gganimate::anim_save("tour_penguins.mp4", animation = mp4, path = "./ignore") + # beepr::beep() +} +tictoc::toc() +## , YMMV diff --git a/buildignore/tour_penguins.gif b/buildignore/tour_penguins.gif index 547986d158c7d1a7ea40641832c234d3e0e0295e..8711fc9b9570590aea8301c96708e37731fde07c 100644 GIT binary patch literal 187295 zcmXV0cTm&M)BdDE=%IHEJ@kNpfS^ID6h)*r145*UfTE~~2{nWkdJ!>5@4X|2-iwHI zkSIq`lRGC$-=P|N(69RUFW%WKbm%=i5Lx|{JhZP}Mcg!1^?cx_NXUB%Ma z+~LmG-u$ht0cErzQICIUXN5*v%*rY!YXS3ENJm!I=;$a8hoh}@)u_`jAGuQj_>=^TXCq*|IyMLrcz9;&nZK{;cxx&FPMWPO zjW~Y((~bA*QJ8qr+X159W~>p47k=(V@cl=%{QP>U8QGt5acPg0@Wb^TD>N+t;Q!Bq zt*z{=+=}tTtpQ?IR>zBs!P)V;=G7Sh(E2Yn;=tBGfDY~w%XXGa>zgIoBKZ&PaIfr% zTmfD(KLkrC+(@wr4{g&_`tLB@w{NFS|D}Vh<(rCi)I>h)t1QpT{#;L_(V8eh(mN|7 z3nAzkKmKWVCL93J4nLr!CHwENk2fZe7ir_=nVV&I=3)$JO}NJ98MUMs9zpRc9II7R zmL74SpZZF(Jf?s$hO2qcdI@)%bUmwM<^9aY^Czv9e#l8TR*iHL{W(4@@?zc7^BuR9 zZ4+!)n#s7%f`ABvZk+tw#!Owr7$j7 za!lH7ULJ9NvahbLOJ?@nnKW!{V{3KoRKd{D;CipNmf8KXu&nZqJ{z!n)Z?Z_8Z`y0 zofMiJZ&g_~US9NA2UnX$?%P@(iN#&O#Z5LOndy?sah{aRxVV&-*<`<{l|x!u{)5P} z&i3*7&3H2m4oCFQVT3c|Fu26>s`Bv_P7Yy53q5BS*Om`el_cU|gc~k4mQ9%B`&`!| zt!az4+(cW1(*O`$$hd~m6%OZ=bsDV6>xo8)o8%bRzV3_VJMXlD@-_}ut`J7yL{smS?A$~()Yy{`6e$- z8p}SE+tqt74>y+2R%3fe!lq3X^YxzZ%bZ7=Di@pmH-=xDHdigT2Y=gJ9%-)r_!bXg z5;1G3S?eKk%DIfT)NTxr#Z7a~TI;q(GSB<0jJDQ)nxN>#ikP=G?7lC#S?)5{*0?`g zb9*G$yuIo3V$1#gm9h5bFCVGlOy{n3w0zwdOp&`a-qHH)(`3Hst7~uCzVFY~`+OXK z)Bf|zYELZf-1W|mU*C4#m*1M`eDmk$(Zu)>%{yF`&|8e5&TN(|ZL#^VWJQ^Gz zxKzn%7_<7$YB<6WwHCo;M+=$oMF4;)X{;{)>h@q3&BY}q!Es%Tj*+pnhLc71Akd0t3VH?#Ouq?&0cLy`p_AbJ=_7E0yI z+;1MCD%ZZPC?Gyp_Zd_j%mu+nD$x``QBeS)uz4ikC6h=nZT+J~LZtuNtV*WrPF{ zYX~}hXV!18X!ro)s*M`4;uY#qQ0Q^GU7wa27bswR;dA7$OL%1<03P@YoWI`_JUZ=r zhxV5`^S;~j(cEhC-@wtvZpXz59mu>dT6%U=z2AUW!L%#wvj4Bv(DsnW8xATfp6|0p z)_=acYQ*#lqHN6B$XRZxsNQD!hee~&oA;T)N8^lZx9bn}|EYW_zy6~A;5$k5>SKDvX7TUh}fSM-u$bWYSPLA9}Am0gIrir zh{B1Rz;vND6RtL1O?H)WZ?;cKJ>YiflQmyAPyKk!u6VvhXCO7BHNgCJ;I@T*JF9(z z{*N0bA^KdRaz<3CVP%qMN)lJ6ai58IfWz<~r*9fp+gTeBh761D0*Blt{AzBG_5xkD zqh?J6hHBIEog`;#jOBEOrIHhKZb;oO7>2irGcJ*%>fIj1Rkx1DfqilBviU}(K=l^= z`b8=_S$*@a$BEezTOZnFdL@V|K=rhT^z&vjH`0Cb_5(@z44IM9&#W2Q@~)Z_S+sGv z?BgeF<+x-hPS0uThG9A-Yqy0bKf?POWhN(0w>9|kv%}*cwg@-=p1EV1}1IGyQP@B;IE*9E%XrSrhHjI4iOa~U7bsBJC6h?iXca>L@;H0j0HmOF|zbFSj>x_V-9SMH~#l-Ss>2T@J0aD{A+_CvtgC)^C_C{UQq5moJ20 zFu(VK)58ZznJH-QD06vGkjQi(&8DuSJF{Qs;FJ4BL~ed+At>j~LAhIagnA=Q7kMw- zalmlQpgtpS;)T|?&s{iEiI=Rh&FvoDvwS5^N398@Gt41?3+m*49rkE6T#r`LHtAUL6lCwXdtuk*0T*3~DifY}W+ zM>pFxe!eCKigoW2L=_o{RtY{l1LicV`q6*{`7GD zjaI^AOQ&lG;Nr9GbJfN}Aphf(5GMdkJ{_j7;>!BizhjiwYIDgCUnB@@!ZDa}>AG*Z z8JnLlY;D6^uOIY3cqwjNZ9W%!{JmU%Y1f>l0e!9by;84W4|L&bZ+&i1b5G$xOxIJx z{!4f(C^(JSBdBBZwLC1W3QR5!fxR~!7E46 z#ZRi(+=7xum%k>yV%lP=|4qXFQ>&(3-G5Q{xR~wVH;U1j9@q1r>5DEulcb+my4~Ws z#RcX3DK&V3XDWg)4Fbk4Gk$%l@~3cm<;TmPXP@R3gJO^Z7m8h-p$hrA^x@f$b_0qnSbHgkUX+}y0Ej;vazJiCLyUn2|7oH!}17B^Qv)sVzCOwDLJ|7+k8K-{+`w9iH zARd6jAA`7#hC@K%+5n(Ngo%0LAv-lMm_?Q{7h1aRXqv)0&l&70Yu3rSNC_XmW<^a%%8r1ZrkH`4urp;RrzOuld@jU=ZrJNo}p8|;8Hu*)9-1< zY3(LH;3rhdJgSySt(D=ar+M+(4>HgNz(cVw+8EMq%5ZQW#-Hzgsg9109cAQ3X~ zYNz1Z0=+J6p~Oly8Ne4hEYJnx8C4K?G8Vxw2@XPonvNi+Sf(8`v-u>60bpgYfEpZ? zblD@UewQgsf;lK40SvPi1(bk>My$(rpi=CgXd-*dp-PaKPnhOigda=+ z>?>K5iogMtb5Dp10>vT$q4Y;&2zzEYL)xQ-T*-i{6R!dzG81b}m0f8KE57xE4S+@iK+07-vBOYOVHa zIWSNGfL;rLSzzE2K6PAJXt!*1I3B6wSV6=9R!8AJs0h_^E;nyZBVvSROno08k*TYm ziEv+ow_*K4<$7x+pKAjWL1!|QdQGNE#U$*F7l(ap74uQC@sq|wQnh{?la*|BJHEiR zy}+fEQ_#gR$XvF&!l7F_2l%ND z9>X(TwqSdYX1Sm9Ml}vta_r{4*r{Cs@}<(X(inTMoRu@y_87(Wvgh^6C-s#-l(`_+ zc4s)HA_h^Ifw1=NTaD?fGUZCu?&2B+c(kFIny%-IT>zrvo;<|dp~KLmiFFP58O8ia z8{BBY^n=PgdgRYrp3H!$Wg#QhMmZ7*_Kyc`n$&3IM)21r2cp-E!4|CiNBc7_E;{ior0e_Hn{)CVzRbX1`H@7%N$}?~C=0~+ zFNE&dJ%&j-)@)#Mb@G128xG+St~WhApGWyW!)p;t@ur}7raG;oC_2$8ah8WiIS-_l zdp)wJ@<)c`&7UYf8^6o=-ZqXX`=qeGK_N(aE#1?@5HXFrH z&&t8=7a7c07$$%Mi+=)1cCb-=%wTNoG|70 zCWXyz{A}d@{jt1oI5;51W2zgaE8@m=m1oyE~8#hlrNWkv}DQXxW%S?Zv96=FIZ zk}o+_mpN92#%mQ8<%G$Ne}`a%@m6Po%Ge?wXo<0EwY$={@-BPd-6cH>h|22w;jsd9 z2T1fwgyRv@+ZqNEe)_|+mxgUJ+1-a{Y(n3%9CBfYXCbZy z8vYEL7HbiomoT3T1Va%_Opv`A2H#7}68|Q3J~NJ&udgV4uyUXe!qmdCpso0p1Pr}D zhPf95*ZkLVab@Ehjb-!8(-{ffP3feMLxVGOS(~PzglmcjsK^$yW6`n8(>ipEk{e;~ z_x>FFT0hHnbavz5SOFhp?Uc05ygxVnjHw#2?Emk-BJv3|vTX5razhZ7>%h2ezVQIL zQ;XR-_}k(;xbgKbJtS|((99>nkAwVg2kO6hy_v)ej+CXu>@|T)1B@>Kg@ZW(EJz~d0Q}+Q5c_#?1U#Apvn)xUsuNd9UKaNmR0^NdE@hR=;u$J_`f>iX?~9~a(CAg zp3qluZ|5GW(B9(nuig7B!v1Bvk=eh?k-S$S3;AjPO&m%X8<&1?`rc(^FDfa7u4?=*gZv5Ydg5c*fx?>5^;}!ekyU&gnhCTz>Hb~d8 zf>MJQ`?vpEB&F&;)t~$=d-{OJ zsEiM&^p#Zl2N?R--b_pYV`MHunf2pw{Me+(kfQ4k`flF2+|LWS$8?~dV$whF(~5$w zZ+SoeY1eu9^xVX=>vJR>s0j14?yHIFC#Ty_zr%HkxdRB|R zZ0r4*%ImG6FhQFGy9yX1ALa#5UHb$fi~f@ z;NzR(wm3Q4zrR5bzN{-yGQQZpjsE;`fW~|O@xQpfXq^ca5NxUXGK5Wj=Q8xdw*=*B z0R5RVWhDr{f>b;cRgYIimE?zG7#Rzomf-btt0y~M{A&+1BB2}{c49n?99JX4?A%0m zG|gUzljyZd>RH?Od3wuq#dL2T8A}c@lIu?9!lx8|9;(Q7hW&V5-FeouY55kM0ZpMu zw0Z|(AnWM+3UWQ}=j)Xh&xVcbjLXLeWY3!l8BFcFhD~sj*h%#zeXFZsA8B=>TPc_) z7SJd$k-#MIRT+z|EOjISivL6j9}YVgTULb>O$y^gX_^Y}n_Jl`;CUP-pQ@OkiVFhG zff|W=yyBy+;}L5%E46YJdj0d}w+f#W;O&ckBO`=Myl3bHUKg&})`c~ym)-37y`#4z zbcD?lJw2e~jsJImwzzoDv^hVOc}y&Jl|m7J;b+Efn4pqADKo|E6d);Js{DaVOT^$e zSIlK&7F z&*>O2uO4IYs#%aWAAXmtVPYlvyp!~)Y6%xawNt|--afrFZf%?VQ;HeaqV`R*%oX;) z-p{c#FrNuLZU_DJB7fo6muAW8vd^^~=->JGbS(7QoyaRa$@i{{XIrE>BtIf$R&=)I zOP;<->er|HolU{ddTmUZKa$w-m1q!5TO$#QoJQAsgtRMVchf)kjrxBuaqleVrGD9{ z_K^QI%qV?yRC>~~k*jD3Doavioa|-3!=&QMQlHrsh4dD`NBesk%cd>M$!c+hSZcus zf?Hw`03%uUvIf9=<+f&xb}vN5ZuX_tarn5JiEuX-6RB~mO1lUec2=mtbl^ETQFi7K zw7>JL#CPzORDD*Z3Sr&X?D6WVd93E1cis0I)i2Z`5Xu!1@9W&L5>8{@{IUktK3G9RjEj?i1y7c?HY?#FAIkrqHEU*t;d_)}Af#JHzrBe#5Y!Ya*Qviomo~f0SQa#iYU_Ptc#;6imm_3SnT7P+qoN&Mhd!?`7KtRqV-J)tzsh% z{VK8xwI+5jy!uvir+ei)`MC2wL6xxs^L1&@?_r|Ou>xS@0zHY-hcy|L8&k@r2Q<5y|l$7#K4Zaz`5~6z@rplxGx*4^3tf zJFPIV8WgIE<7zx-gZD1Q6SEf@h;?b70tRkM#a+euwZ${ti}|geFV={h5ldY(GZMe^a51A6CeX4I@8ZdBg9)zv>k@)N1dhc}0+R_n3-!mDXfL9UgZ;nA_ zt_-2Wyr>fe2P64O7J@3HIB62YKkEO{84HA|0asXrfC$ml063ULUU!DX#Xr@3ncB+#M=9czaG2R1y;JF@To@H#7!1 z7L-dkQ=PWx=C7J_GYE{J6dL^A#{tGmbzyEZ*iX~#Hd)a$R@W_Mis-9*21-Dm@5+VL zwP6RvsgQ4F#)&95A+mM`|AEgPHp;ph@6`R7lT@s?`x@-)m2RdYG~B=^;+^oWfvA|m zLhUAHsyNb!N=sAr_Xx>;ud!U%<+1F2duwKTPd5l;#RAwE1+`R5*;nu$ZY*XG&7{=2 z)b%=$uc8xPZ2P8by!c=qhUYBDIk#ST%Mae!KAv+$3<>fD06F)5&EtLL{#1nBEgKxF z%3SnOs(Z&sfb&IjeHJGvV16gk0%rt(zzN&Bmcb1#-$;Z}Ervh#Hv7*g;8bng-# zo`8kwOe5f%mwvX&Z8PwkHhaOY9Vt&Yz1O~Z+WlO*wYK}U($5aXgrfwP4D zHh{BrzM+X0?LMx>kU^476g%=Xl9vzL+#ZAgg~F-e^g(pH80N>srZb2aKR~_ESL*)n zuRDSM`zph&V4`?v38`6mL+S9;SyyB_`Blgkd&aoyd+qXK|5sA~YA=PgbUlN?=^k>i9SdDoMARKOK&U0+P8puRpo_mE%<42k#5nTY4$O3i&I>KT#1*s42^!-Ad(YAB zzmGC5!GB&%Ma@ZmDd#)ds7*L8bS!r{FO2PkN%6bmc@l0TazzU|NAVCFENuU2v=$039m2|k5*$jp5|h`x!SU9zr~8M0E*M28@JB(= zM;pDwp{%W_BHweU+XK6yr_RN2;Y9OQICZ|f=kF6XMyG3IY|NM}<33VNRY= z%=TbQ&0c*ceGyOkde?B7Onp`c5**px&qr_04CS?l)3roCoCI5Wf>m~UDRW#rHAXi& zN*$N_4zjeZoecP&_LY>h*cTX_)$}cN_c?LZJyS)fndu~X%4tXtXCWk-Ye9J!<~*E$ zytpO@*RAM%LSgL%yiW<7VW1)zh`v=XV=&lHpHU+d>{bH}fHDRa;Kd|jp&CXulVF88 zXpmE+(%ua13u=P!4_jZ3g_n{Y zHDjHmFrrMV0E5GnOv{kWsU^v;-wI1}V$1aLZl304_^~kb6W`5XbQqRmsu`F#@#~EA+nw-7Xl(!8gxLD^(vu0=^c9dotdgol zxX_5R)g-L3h|XN^izWDoLkSUF{f$(v2mxryK}x4jOmCktB9wbO3M82MP>GWxPM+7r zkl=Vgl0_pj#g{=)azN+zV&ihhuA2|vC_92fPR2t$U!0ZG1^?8}n`d3c)c zhZ{?CVG3xx-uD|6iX^Ll(gen2GK3YS&u!AkV_GO=HfCk+(I=xoHzH?Yp%8ada!l?? zU4W!*pn-Mj*LKO}8-{yu{z=C@M}E!aEI$^TSd zyjNYb;VsR%l6DGksFAT?US(mSfhGE?gVp>(3j150rw%1*mrFm)m9g|xEH{vYobHb~ zjKiIdg6Fi}wUGyV6n4VvxY@)0o4VLV@&v!+0q~bsZCi9igD7;ar?6LV92-h|p`r^z zk<(ADlHV$%4W(svJxincRNA$JAZT=!pS4qOi-X8U%YgNY8p&zMe70W$jvORh48XI$ zoUP|hn&d8&G>*AB2$mpou~F!}vG2@x2-i=cD77sgZx*<>dLt?W5q-^ZnM^-7B1Bk8)EP}%Lp!; z7lo(=OHBt_inXnn307&ePw<~n!mp3QMGGQ4aiUF{F4?!MoNiyR$I@NHv!@YKOaQ5e zZb-Fk1UuC5KH$9yb2W~-ZX97X33ekgfF?mxjn-l{z)CDW+km9vlJ+}?uF8C)(c!k> z!;K4}i5fgP+C{gO`0W!euAW=jC_%Z=$#|TY0oviVf4$d#Kr`NQ#5632iQB;oensjN zVVOiNtza%27@)Jfd0or3wK8*|(P((g~mQQUro|@bxWEXF5cb$Rxms4#O3N&;H^}Wr^8G4wzr!ZRHk;jO08` zyG5HMyQ;TtsSn-y8`N^|@S{InctAp5AkXUk9}P$;#iUEKq;ck@XYiOE4}MJK6-5^Q zNfrYq__K6KdJC%lJxC(>Wv&bO;4eiq24}wQ?(=ZRGGL7k?>d&bgVy!h)Eol^q*;Y- zU^za4Wk6(BFNcS+_B?J?E_afGz56Y8*(wQdP3AMIKBQd&Ydafho<@fL4bMck%L8~` z$pOA+!iL97sjEI=WLzXJJD=6V!Z$p11@~ZlA=XCGVeVG^iYK&XuV={HHq^t;(7IaD ztA^EQHep1>cpNJ#8Gs>L`21D!yf==+RlYq!1N*Hs|vTOgc=)kLVfdaHAEs2a{ z@lWcw=Bm!_QAc-|y1!kbt~Z#@tdaZu~&jqdV` z)jgE6I;1(h9%@Sa6m|F~b8Kna_l$zy**+K-@voaaj6r@@pg=Y4?xrA#Vu%2Bf9M@_ zz;ov>yfxeHKW0x4-4EtI8?O8CzPtI_#WPQRqaY-KMi&d*N@g-b^dAnyT{>#Y9kbQ+Jcoe~%p3_=g7%%K9C4uCLr!-=Na|qgHGg5VfZ_ zIA%e@{!!Mba930~*Ykip{1-R|$|F7T(AAneO+9#r~Q$$q7iujZj(@~bW6(1}-+Uy+1jRz|a8PBPRLrA@@(tOy&`0z zRZ?eV_4p!O)cy43NrMqBfh6S#K4;5;ynqn3ts-WewqcF~s*;Tdg3I6dL2|?1 z(w|XzLl_Zf-L$9Y`boG_&7_@(-SrMNl}B18n){@X;a4HzKIvGEb6%noVaIT7pQLOLu=6d9LC zjK9YBi$H&{i2>9Z_b$V=zn2iim9G@+)8@+@EnnD(#ZPGu65Fpc8_)iE^F?q!*XHeo zGZ|vTi@1hwMj>T$vN(of`A2kVDQEr*1n(Q-HGANDCE~wL#dp9&Hc`enFo0PV_N7~X zFb@LAdEkqlA$-#Bko|E27Xl9klkfv%A#)+rV>~B`9cfAQ-A%qdl)>+TEn$cNjprTNb_NVQ2{b3y`b@;>_d85t)xi!7 z3JQGgfApoC0C_vr62}WBf^p{Ay>a_Lm9X(h2#r(GcP`%wLgGE|w`8}6A550MS>^jn z1vi91A^`{;Xr~k?U_{;ht-3c|W?S#~S)g&cr|wRVuv6xV&wH*wCZYnzYrRDHH$G@6 zI)a5HdWd-LIVjqL{XObA-E(&OJ?^FV{9YitTVPyD-FqId_L@k@rwG3P^sHwiok~7# zz!e$kYnpVAq)=`QJ0dzK*i=>_$r>(Yk~X)s4U3lN2V1NGO4QIjtq|nEZcO1Z@&&E( zgz(R+iMap=lDCL5oey$J3RCK7x0UHcTJ0!x?s{uS=K>pJy{9Lp(`~p)pSi#SdR+nC-kfIJx)