tour to horizontal layout, update figures and documentation.
nspyrison committed Nov 3, 2023
1 parent 5a110b2 commit 83f8eca
Showing 28 changed files with 1,238 additions and 1,221 deletions.
7 changes: 3 additions & 4 deletions DESCRIPTION
@@ -1,14 +1,13 @@
Package: cheem
Title: Interactively Explore Local Explanations with the Radial Tour
-Version: 0.3.1.9000
+Version: 0.4.0.0
Authors@R: c(
person("Nicholas", "Spyrison", role = c("aut", "cre"),
email = "[email protected]",
comment = c(ORCID = "https://orcid.org/0000-0002-8417-0212"))
)
-Description: Given a tree-based machine learning model, calculate the tree SHAP
-<arXiv:1802.03888>; <https://github.com/ModelOriented/treeshap>
-local explanation of every observation. View the data space, explanation
+Description: Given a non-linear model, calculate the local explanation.
+We purpose view the data space, explanation
space, and model residuals as ensemble graphic interactive on a shiny
application. After an observation of interest is identified, the normalized
variable importance of the local explanation is used as a 1D projection
126 changes: 65 additions & 61 deletions NEWS.md
@@ -1,61 +1,65 @@
-# cheem v0.3.1
-
-- Repaired packagedown site!
-- Fixed news on packagedown site.
-- Recreate the classification saved cases, they fit too well to work for illustrations.
-- Trying to include more axis context in `global_view()`. Will explore text and facet panel titles.
-- Minor documentation and code clean up and clarifications.
-
-
-# cheem v0.3.0 -- Generalized for any attribution
-
-- Rebase all functions from expecting a unified `treeshap::shap()` to generalized
-data frame or matrix format for arbitrary attribution spaces.
-- Rework vignette and examples to reflect this change.
-- Added precomputed predictions and attributions for the Ames, Chocolates, and Penguins datasets. This allows users to run attribution-agnostic functions without dependencies.
-- Add `subset_cheem()`, a convenience function for subsetting cheem lists after construction.
-- Removed plotly subplot variations of visuals: `global_view_subplots()`, `radial_cheem_tour_subplots()`. These were development variations never used in the shiny app.
-- Minor function renames for parsimony and consistency.
-
-
-# cheem v0.2.0 (CRAN releases here on out)
-
-## App related changes
-
-- Added vignette: _Getting started with cheem_.
-- Added __pkgdown__ site: https://nspyrison.github.io/cheem/.
-- Added global model performance metrics to shiny app.
-- In `global_view()`, added yhaty panel (residual plot/confusion matrix).
-- In `global_view()`, added color options: log_maha.data and cor_attr.y.
-- In `cheem_radial_tour()`, added regression case panel with additional fixed y of residual.
-- In app radial tour inputs, added inclusion variable, subsetting variables used in radial tour.
-- `plotly::subplot()` variants of `global_view()` & `cheem_radial_tour()`.
-- Added AmesHousing data, chocolates, and new toy simulated datasets (shiny app only).
-- Reduced shiny app wording.
-
-## Internal & utilities
-
-- Major rebase of `cheem_ls()`.
-- Added `linear/logistic_tform()` to suggest an alpha as a function of the number of observations.
-
-
-## Sourcing __treeshap__
-
-- __drat__ repository hosting __treeshap__ did not work with debian and window rhub platforms;
-- Minimally ported functions and cpp source files with @author & @source. Changed examples for consistency and smaller code base support.
-- as of v0.3.0, cheem was generalized to all local variable attributions, so this is not a concern.
-
-
-# cheem v0.1.0 (GitHub only, commit 283da4)
-
-## Primary preprocessing functions
-
-- `default_rf()` create a `randomForest::randomForest()` with more conservative defaults.
-- `attr_df_treeshap()` create `treeshap::treeshap()` local explanations of each observation.
-- `cheem_ls()` create a cheem list of prepared tables for use in `run_app()`.
-
-## Primary visual functions
-
-- `run_app()` which is a shiny app consuming the following two:
-- `global_view()` linked 'plotly' of approximations of data- and attribution-spaces with model information.
-- `cheem_radial_tour()` create `spinifex::ggtour` of the specified radial tour. Consumed by animate_plotly, animate_gganimate, or filmstrip.
+# cheem v0.4.0
+
+- Repaired packagedown site!
+- Fixed news on packagedown site.
+- Shiny app has go buttons rather than waiting after every input change.
+- Shiny app text, plot dimensions, and text cleaned up.
+- Classification tour now uses a horizontal layout.
+- Cleaned up the text on the facet panels for `global_tour()` and `radial_cheem_tour()`.
+- Recreate the saved classification model, they fit too well to work as illustrations.
+- Set seed more consistently. All model and attribution shifted a bit, but will be more replicable going forward.
+- Minor documentation and code clean up and clarifications.
+
+
+# cheem v0.3.0 -- Generalized for any attribution
+
+- Rebase all functions from expecting a unified `treeshap::shap()` to generalized
+data frame or matrix format for arbitrary attribution spaces.
+- Rework vignette and examples to reflect this change.
+- Added precomputed predictions and attributions for the Ames, Chocolates, and Penguins datasets. This allows users to run attribution-agnostic functions without dependencies.
+- Add `subset_cheem()`, a convenience function for subsetting cheem lists after construction.
+- Removed plotly subplot variations of visuals: `global_view_subplots()`, `radial_cheem_tour_subplots()`. These were development variations never used in the shiny app.
+- Minor function renames for parsimony and consistency.
+
+
+# cheem v0.2.0 (CRAN releases here on out)
+
+## App related changes
+
+- Added vignette: _Getting started with cheem_.
+- Added __pkgdown__ site: https://nspyrison.github.io/cheem/.
+- Added global model performance metrics to shiny app.
+- In `global_view()`, added yhaty panel (residual plot/confusion matrix).
+- In `global_view()`, added color options: log_maha.data and cor_attr.y.
+- In `cheem_radial_tour()`, added regression case panel with additional fixed y of residual.
+- In app radial tour inputs, added inclusion variable, subsetting variables used in radial tour.
+- `plotly::subplot()` variants of `global_view()` & `cheem_radial_tour()`.
+- Added AmesHousing data, chocolates, and new toy simulated datasets (shiny app only).
+- Reduced shiny app wording.
+
+## Internal & utilities
+
+- Major rebase of `cheem_ls()`.
+- Added `linear/logistic_tform()` to suggest an alpha as a function of the number of observations.
+
+
+## Sourcing __treeshap__
+
+- __drat__ repository hosting __treeshap__ did not work with debian and window rhub platforms;
+- Minimally ported functions and cpp source files with @author & @source. Changed examples for consistency and smaller code base support.
+- as of v0.3.0, cheem was generalized to all local variable attributions, so this is not a concern.
+
+
+# cheem v0.1.0 (GitHub only, commit 283da4)
+
+## Primary preprocessing functions
+
+- `default_rf()` create a `randomForest::randomForest()` with more conservative defaults.
+- `attr_df_treeshap()` create `treeshap::treeshap()` local explanations of each observation.
+- `cheem_ls()` create a cheem list of prepared tables for use in `run_app()`.
+
+## Primary visual functions
+
+- `run_app()` which is a shiny app consuming the following two:
+- `global_view()` linked 'plotly' of approximations of data- and attribution-spaces with model information.
+- `cheem_radial_tour()` create `spinifex::ggtour` of the specified radial tour. Consumed by animate_plotly, animate_gganimate, or filmstrip.
4 changes: 2 additions & 2 deletions R/1_cheem_lists.r
@@ -166,7 +166,7 @@ global_view_df_1layer <- function(
#' comp <- 2
#' global_view(peng_chm, primary_obs = prim, comparison_obs = comp)
#' bas <- sug_basis(peng_xgb_shap, prim, comp)
-#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = 1, comparison_obs = 2)
+#' mv <- sug_manip_var(peng_xgb_shap, primary_obs = prim, comp)
#' ggt <- radial_cheem_tour(peng_chm, basis = bas, manip_var = mv)
#' animate_plotly(ggt)
#' }
@@ -203,7 +203,7 @@ global_view_df_1layer <- function(
#' comp <- 2
#' global_view(ames_rf_chm, primary_obs = prim, comparison_obs = comp)
#' bas <- sug_basis(ames_rf_shap, prim, comp)
-#' mv <- sug_manip_var(ames_rf_shap, primary_obs = 1, comparison_obs = 2)
+#' mv <- sug_manip_var(ames_rf_shap, primary_obs = prim, comp)
#' ggt <- radial_cheem_tour(ames_rf_chm, basis = bas, manip_var = mv)
#' animate_plotly(ggt)
#' }
13 changes: 6 additions & 7 deletions R/2_visualization.r
@@ -584,26 +584,25 @@ radial_cheem_tour <- function(
### Classification case -----
if(.prob_type == "classification"){
.pred_clas <- decode_df$predicted_class
+.facet_fore <- rep("attribution projection", each = .n)
## ggtour
ggt <- spinifex::ggtour(.mt_path, .dat, angle = angle,
do_center_frame = do_center_frame) +
+spinifex::facet_wrap_tour(facet_var = .facet_fore, nrow = 1) +
## Density
spinifex::proto_density(
aes_args = list(color = .pred_clas, fill = .pred_clas),
row_index = row_index, rug_shape = pcp_shape) +

-#Warning message:
-#In Ops.factor(yscale, x[, 2]) : '*' not meaningful for factors
## PCP on Basis, 1D
proto_basis1d_distribution(
cheem_ls$attr_df,
primary_obs = .prim_obs, comparison_obs = .comp_obs,
-position = "bottom1d", group_by = .pred_clas,
+position = "floor1d", group_by = .pred_clas,
do_add_pcp_segments = as.logical(do_add_pcp_segments),
pcp_shape = pcp_shape, inc_var_nms = inc_var_nms,
row_index = row_index) +
## Basis 1D
-spinifex::proto_basis1d(position = "bottom1d", manip_col = "black") +
+spinifex::proto_basis1d(position = "floor1d", manip_col = "black") +
spinifex::proto_origin1d() +
## Highlight comparison obs
spinifex::proto_highlight1d(
@@ -641,7 +640,7 @@ radial_cheem_tour <- function(
## Foreground:
.dat_fore <- rbind(.dat, .dat)
.idx_fore <- c(row_index, row_index)
-.facet_fore <- factor(rep(c("observed y", "residual"), each = 2 * .n))
+.facet_fore <- factor(rep(c("attribution projection by observed y", "attribution projection by residual"), each = 2 * .n))
.fixed_y <- c(.y, .resid)
} else {
## not doubled up data; just fixed_observed y
@@ -652,7 +651,7 @@ radial_cheem_tour <- function(
## Foreground:
.dat_fore <- .dat
.idx_fore <- row_index
-.facet_fore <- rep("observed y", each = .n)
+.facet_fore <- rep("attribution projection by observed y", each = .n)
.class_fore <- .class
.fixed_y <- .y
}