Commit
Correct names for all loss/deviance functions/variables; fixes #52 and fixes #53
pbreheny committed Jun 13, 2024
1 parent a858c71 commit 973f9e6
Showing 20 changed files with 143 additions and 127 deletions.
2 changes: 1 addition & 1 deletion .version.json
@@ -1,6 +1,6 @@
{
"schemaVersion": 1,
"label": "GitHub",
"message": "3.4.0.5",
"message": "3.4.0.6",
"color": "blue"
}
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: grpreg
Title: Regularization Paths for Regression Models with Grouped Covariates
Version: 3.4.0.5
Date: 2023-11-20
Version: 3.4.0.6
Date: 2024-06-13
Authors@R: c(
person("Patrick", "Breheny", role=c("aut","cre"), email="[email protected]", comment=c(ORCID="0000-0002-0650-1119")),
person("Yaohui", "Zeng", role="ctb"),
5 changes: 0 additions & 5 deletions R/calcNullDev.R

This file was deleted.

5 changes: 5 additions & 0 deletions R/calc_null_dev.R
@@ -0,0 +1,5 @@
calc_null_dev <- function(X, y, group, family) {
form <- if (any(group==0)) formula(y~X[, group==0]) else formula(y~1)
fit <- glm(form, family=family)
mean(deviance_grpreg(y, predict(fit, type="response"), family))
}
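The renamed helper above fits the null model (intercept-only, or including any unpenalized group-0 covariates) and returns its mean deviance. As a rough illustration only — `deviance_grpreg()` is internal to grpreg, and this sketch assumes the Gaussian per-observation deviance is the squared residual:

```r
# Recompute what calc_null_dev() would return for family = "gaussian":
# the mean deviance of the intercept-only model, i.e. the mean of (y - ybar)^2.
set.seed(1)
y <- rnorm(50)
fit0 <- glm(y ~ 1, family = "gaussian")      # null model: intercept only
pred0 <- predict(fit0, type = "response")    # every fitted value is mean(y)
mean((y - pred0)^2)                          # mean null deviance
```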
62 changes: 33 additions & 29 deletions R/cv-grpreg.R
@@ -4,59 +4,61 @@
#' grouped covariates over a grid of values for the regularization parameter
#' lambda.
#'
#' The function calls \code{grpreg}/\code{cv.grpsurv} \code{nfolds} times, each
#' time leaving out 1/\code{nfolds} of the data. The cross-validation error is
#' The function calls [grpreg()] or [grpsurv()] `nfolds` times, each
#' time leaving out 1/`nfolds` of the data. The cross-validation error is
#' based on the deviance;
#' [see here for more details](https://pbreheny.github.io/grpreg/articles/web/models.html).
#' [see here for more details](https://pbreheny.github.io/grpreg/articles/models.html).
#'
#' For Gaussian and Poisson responses, the folds are chosen according to simple
#' random sampling. For binomial responses, the numbers for each outcome class
#' are balanced across the folds; i.e., the number of outcomes in which
#' \code{y} is equal to 1 is the same for each fold, or possibly off by 1 if
#' `y` is equal to 1 is the same for each fold, or possibly off by 1 if
#' the numbers do not divide evenly. This approach is used for Cox regression
#' as well to balance the amount of censoring across each fold.
#'
#' For Cox models, \code{cv.grpsurv} uses the approach of calculating the full
#' For Cox models, `cv.grpsurv` uses the approach of calculating the full
#' Cox partial likelihood using the cross-validated set of linear predictors.
#' Other approaches to cross-validation for the Cox regression model have been
#' proposed in the literature; the strengths and weaknesses of the various
#' methods for penalized regression in the Cox model are the subject of current
#' research. A simple approximation to the standard error is provided,
#' although an option to bootstrap the standard error (\code{se='bootstrap'})
#' although an option to bootstrap the standard error (`se='bootstrap'`)
#' is also available.
#'
#' As in \code{grpreg}, seemingly unrelated regressions/multitask learning can
#' be carried out by setting \code{y} to be a matrix, in which case groups are
#' set up automatically (see \code{\link{grpreg}} for details), and
#' cross-validation is carried out with respect to rows of \code{y}. As
#' As in [grpreg()], seemingly unrelated regressions/multitask learning can
#' be carried out by setting `y` to be a matrix, in which case groups are
#' set up automatically (see [grpreg()] for details), and
#' cross-validation is carried out with respect to rows of `y`. As
#' mentioned in the details there, it is recommended to standardize the
#' responses prior to fitting.
#'
#' @aliases cv.grpreg cv.grpsurv
#' @param X The design matrix, as in \code{grpreg}/\code{grpsurv}.
#' @param y The response vector (or matrix), as in \code{grpreg}\code{grpsurv}.
#' @param group The grouping vector, as in \code{grpreg}\code{grpsurv}.
#' @param ... Additional arguments to \code{grpreg}\code{grpsurv}.
#'
#' @param X The design matrix, as in [grpreg()]/[grpsurv()].
#' @param y The response vector (or matrix), as in [grpreg()]/[grpsurv()].
#' @param group The grouping vector, as in [grpreg()]/[grpsurv()].
#' @param ... Additional arguments to [grpreg()]/[grpsurv()].
#' @param nfolds The number of cross-validation folds. Default is 10.
#' @param seed You may set the seed of the random number generator in order to
#' obtain reproducible results.
#' @param fold Which fold each observation belongs to. By default the
#' observations are randomly assigned.
#' @param returnY Should \code{cv.grpreg}\code{grpsurv} return the fitted
#' @param returnY Should `cv.grpreg()`/`cv.grpsurv()` return the fitted
#' values from the cross-validation folds? Default is FALSE; if TRUE, this
#' will return a matrix in which the element for row i, column j is the fitted
#' value for observation i from the fold in which observation i was excluded
#' from the fit, at the jth value of lambda. NOTE: For \code{cv.grpsurv}, the
#' rows of \code{Y} are ordered by time on study, and therefore will not
#' correspond to the original order of observations pased to \code{cv.grpsurv}.
#' from the fit, at the jth value of lambda. NOTE: For `cv.grpsurv()`, the
#' rows of `Y` are ordered by time on study, and therefore will not
#' correspond to the original order of observations passed to `cv.grpsurv()`.
#' @param trace If set to TRUE, cv.grpreg will inform the user of its progress
#' by announcing the beginning of each CV fold. Default is FALSE.
#' @param se For \code{cv.grpsurv}, the method by which the cross-valiation
#' @param se For `cv.grpsurv()`, the method by which the cross-validation
#' standard error (CVSE) is calculated. The 'quick' approach is based on a
#' rough approximation, but can be calculated more or less instantly. The
#' 'bootstrap' approach is more accurate, but requires additional computing
#' time.
#' @return An object with S3 class \code{"cv.grpreg"} containing:
#'
#' @returns An object with S3 class \code{"cv.grpreg"} containing:
#' \item{cve}{The error for each value of \code{lambda}, averaged across the
#' cross-validation folds.} \item{cvse}{The estimated standard error associated
#' with each value of \code{cve}.} \item{lambda}{The sequence of
@@ -67,12 +69,14 @@
#' observations, not the original observations.} \item{min}{The index of
#' \code{lambda} corresponding to \code{lambda.min}.} \item{lambda.min}{The
#' value of \code{lambda} with the minimum cross-validation error.}
#' \item{null.dev}{The deviance for the intercept-only model.} \item{pe}{If
#' \code{family="binomial"}, the cross-validation prediction error for each
#' value of \code{lambda}.}
#' \item{null.dev}{The deviance for the intercept-only model.}
#' \item{pe}{If `family="binomial"`, the cross-validation prediction error for
#' each value of `lambda`.}
#'
#' @author Patrick Breheny
#' @seealso \code{\link{grpreg}}, \code{\link{plot.cv.grpreg}},
#' \code{\link{summary.cv.grpreg}}, \code{\link{predict.cv.grpreg}}
#'
#' @seealso [grpreg()], [plot.cv.grpreg()], [summary.cv.grpreg()],
#' [predict.cv.grpreg()]
#'
#' @examples
#' \dontshow{set.seed(1)}
@@ -159,7 +163,7 @@ cv.grpreg <- function(X, y, group=1:ncol(X), ..., nfolds=10, seed, fold, returnY
if (trace) cat("Starting CV fold #", i, sep="","\n")
res <- cvf(i, X, y, fold, cv.args)
Y[fold==i, 1:res$nl] <- res$yhat
E[fold==i, 1:res$nl] <- res$loss
E[fold==i, 1:res$nl] <- res$deviance
if (fit$family=="binomial") PE[fold==i, 1:res$nl] <- res$pe
}

@@ -173,7 +177,7 @@ cv.grpreg <- function(X, y, group=1:ncol(X), ..., nfolds=10, seed, fold, returnY
cve <- apply(E, 2, mean)
cvse <- apply(E, 2, sd) / sqrt(n)
min <- which.min(cve)
null.dev <- calcNullDev(X, y, group=XG$g, family=fit$family)
null.dev <- calc_null_dev(X, y, group=XG$g, family=fit$family)

val <- list(cve=cve, cvse=cvse, lambda=lambda, fit=fit, fold=fold, min=min, lambda.min=lambda[min], null.dev=null.dev)
if (fit$family=="binomial") val$pe <- apply(PE[, ind], 2, mean)
@@ -195,7 +199,7 @@ cvf <- function(i, X, y, fold, cv.args) {
X2 <- X[fold==i, , drop=FALSE]
y2 <- y[fold==i]
yhat <- matrix(predict(fit.i, X2, type="response"), length(y2))
loss <- loss.grpreg(y2, yhat, fit.i$family)
deviance <- deviance_grpreg(y2, yhat, fit.i$family)
pe <- if (fit.i$family=="binomial") {(yhat < 0.5) == y2} else NULL
list(loss=loss, pe=pe, nl=length(fit.i$lambda), yhat=yhat)
list(deviance=deviance, pe=pe, nl=length(fit.i$lambda), yhat=yhat)
}
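A short usage sketch of the refactored interface, using the Birthwt data that ships with grpreg; after this commit the returned object stores deviance-based CV error in `cve`, with the field names documented above:

```r
library(grpreg)
data(Birthwt)
cvfit <- cv.grpreg(Birthwt$X, Birthwt$bwt, Birthwt$group, seed = 1)
cvfit$cve[cvfit$min]   # minimum cross-validated error (mean deviance)
cvfit$lambda.min       # value of lambda achieving it
cvfit$null.dev         # deviance of the intercept-only model
```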
8 changes: 4 additions & 4 deletions R/cv-grpsurv.R
@@ -61,12 +61,12 @@ cv.grpsurv <- function(X, y, group=1:ncol(X), ..., nfolds=10, seed, fold, se=c(

# Return
if (se == "quick") {
L <- loss.grpsurv(y, Y, total=FALSE)
L <- deviance_grpsurv(y, Y, total=FALSE)
cve <- apply(L, 2, sum)/sum(fit$fail)
cvse <- apply(L, 2, sd)*sqrt(nrow(L))/sum(fit$fail)
} else {
cve <- as.double(loss.grpsurv(y, Y))/sum(fit$fail)
cvse <- se.grpsurv(y, Y)/sum(fit$fail)
cve <- as.double(deviance_grpsurv(y, Y))/sum(fit$fail)
cvse <- se_grpsurv(y, Y)/sum(fit$fail)
}
min <- which.min(cve)

@@ -84,5 +84,5 @@ cvf.surv <- function(i, XX, y, fold, cv.args) {
nl <- length(fit.i$lambda)
yhat <- predict(fit.i, X2)

list(nl=length(fit.i$lambda), yhat=yhat)#, loss=loss)
list(nl=length(fit.i$lambda), yhat=yhat)
}
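The two CVSE methods shown above can be compared directly. A hedged sketch using the Lung data from grpreg — the 'bootstrap' result involves resampling, so the gap will vary from run to run:

```r
library(grpreg)
data(Lung)
cv_quick <- cv.grpsurv(Lung$X, Lung$y, Lung$group, se = 'quick', seed = 1)
cv_boot  <- cv.grpsurv(Lung$X, Lung$y, Lung$group, se = 'bootstrap', seed = 1)
# Only the standard-error estimate is computed differently between the two
range(cv_quick$cvse - cv_boot$cvse)
```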
2 changes: 1 addition & 1 deletion R/gBridge.R
@@ -151,7 +151,7 @@ gBridge <- function(X, y, group=1:ncol(X), family=c("gaussian", "binomial", "poi
group = group,
lambda = lambda,
alpha = alpha,
loss = loss,
deviance = 2 * loss,
n = n,
penalty = "gBridge",
df = df,
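The change above stores the deviance rather than the raw loss; the two differ only by a factor of two, since the deviance is -2 times the log-likelihood relative to the saturated model. A quick numeric check for a single binomial observation:

```r
# For y = 1 with fitted probability p, the saturated log-likelihood is 0,
# so the deviance contribution is -2*log(p), i.e. exactly 2 * loss.
p <- 0.7; y <- 1
loss <- -(y * log(p) + (1 - y) * log(1 - p))  # negative log-likelihood
c(loss = loss, deviance = 2 * loss)
```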
29 changes: 15 additions & 14 deletions R/grpreg.R
@@ -10,13 +10,13 @@
#' the level of individual covariates (i.e., selecting important groups as well
#' as important members of those groups). Group selection selects important
#' groups, and not members within the group -- i.e., within a group,
#' coefficients will either all be zero or all nonzero. The \code{grLasso},
#' \code{grMCP}, and \code{grSCAD} penalties carry out group selection, while
#' the \code{gel} and \code{cMCP} penalties carry out bi-level selection. For
#' bi-level selection, see also the \code{\link{gBridge}} function. For
#' coefficients will either all be zero or all nonzero. The `grLasso`,
#' `grMCP`, and `grSCAD` penalties carry out group selection, while
#' the `gel` and `cMCP` penalties carry out bi-level selection. For
#' bi-level selection, see also the [gBridge()] function. For
#' historical reasons and backwards compatibility, some of these penalties have
#' aliases; e.g., \code{gLasso} will do the same thing as \code{grLasso}, but
#' users are encouraged to use \code{grLasso}.
#' aliases; e.g., `gLasso` will do the same thing as `grLasso`, but
#' users are encouraged to use `grLasso`.
#'
#' Please note the distinction between \code{grMCP} and \code{cMCP}. The
#' former involves an MCP penalty being applied to an L2-norm of each group.
@@ -39,12 +39,12 @@
#' \deqn{Q(\beta|X, y) = \frac{1}{n} L(\beta|X, y) + P_\lambda(\beta)}{Q(\beta|X, y) = (1/n)*L(\beta|X, y) + P(\beta, \lambda)}
#' where the loss function L is
#' the deviance (-2 times the log likelihood) for the specified outcome
#' the negative log-likelihood (half the deviance) for the specified outcome
#' distribution (gaussian/binomial/poisson). For more details, refer to the
#' following:
#' \itemize{
#' \item [Models and loss functions](https://pbreheny.github.io/grpreg/articles/web/models.html)
#' \item [Penalties](https://pbreheny.github.io/grpreg/articles/web/penalties.html)
#' \item [Models and loss functions](https://pbreheny.github.io/grpreg/articles/models.html)
#' \item [Penalties](https://pbreheny.github.io/grpreg/articles/penalties.html)
#' }
#'
#' For the bi-level selection methods, a locally approximated coordinate
@@ -151,7 +151,7 @@
#' \item{group}{Same as above.}
#' \item{lambda}{The sequence of \code{lambda} values in the path.}
#' \item{alpha}{Same as above.}
#' \item{loss}{A vector containing either the residual sum of squares (`"gaussian"`) or negative log-likelihood (`"binomial"`) of the fitted model at each value of `lambda`.}
#' \item{deviance}{A vector containing the deviance of the fitted model at each
#' value of `lambda`.}
#' \item{n}{Number of observations.}
#' \item{penalty}{Same as above.}
#' \item{df}{A vector of length `nlambda` containing estimates of effective number of model parameters all the points along the regularization path. For details on how this is calculated, see Breheny and Huang (2009).}
@@ -299,15 +300,15 @@ grpreg <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD"
if (bilevel) fit <- .Call("lcdfit_gaussian", XG$X, yy, penalty, K1, K0, lambda, alpha, eps, 0, gamma, tau, as.integer(max.iter), XG$m, as.integer(dfmax), as.integer(gmax), as.integer(user.lambda))
else fit <- .Call("gdfit_gaussian", XG$X, yy, penalty, K1, K0, lambda, lam.max, alpha, eps, as.integer(max.iter), gamma, XG$m, as.integer(dfmax), as.integer(gmax), as.integer(user.lambda))
b <- rbind(mean(y), matrix(fit[[1]], nrow=p))
loss <- fit[[2]]
dev <- fit[[2]]
Eta <- matrix(fit[[3]], nrow=n) + mean(y)
df <- fit[[4]] + 1 # Intercept
iter <- fit[[5]]
} else {
if (bilevel) fit <- .Call("lcdfit_glm", XG$X, yy, family, penalty, K1, K0, lambda, alpha, eps, 0, gamma, tau, as.integer(max.iter), XG$m, as.integer(dfmax), as.integer(gmax), as.integer(warn), as.integer(user.lambda))
else fit <- .Call("gdfit_glm", XG$X, yy, family, penalty, K1, K0, lambda, alpha, eps, as.integer(max.iter), gamma, XG$m, as.integer(dfmax), as.integer(gmax), as.integer(warn), as.integer(user.lambda))
b <- rbind(fit[[1]], matrix(fit[[2]], nrow=p))
loss <- fit[[3]]
dev <- fit[[3]]
Eta <- matrix(fit[[4]], nrow=n)
df <- fit[[5]]
iter <- fit[[6]]
@@ -319,7 +320,7 @@ grpreg <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD"
iter <- iter[ind]
lambda <- lambda[ind]
df <- df[ind]
loss <- loss[ind]
dev <- dev[ind]
Eta <- Eta[, ind, drop=FALSE]
if (iter[1] == max.iter) stop("Algorithm failed to converge for any values of lambda. This indicates a combination of (a) an ill-conditioned feature matrix X and (b) insufficient penalization. You must fix one or the other for your model to be identifiable.", call.=FALSE)
if (warn & any(iter==max.iter)) warning("Algorithm failed to converge for all values of lambda", call.=FALSE)
@@ -347,7 +348,7 @@ grpreg <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD"
group = factor(group),
lambda = lambda,
alpha = alpha,
loss = loss,
deviance = dev,
linear.predictors = Eta,
n = n,
penalty = penalty,
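The documentation above defines the objective Q(β|X, y) = (1/n) L(β|X, y) + P_λ(β), with L now described as the negative log-likelihood (half the deviance). A minimal sketch of this objective for the Gaussian/grLasso case — illustrative only, since grpreg evaluates it internally in C; the sqrt(K_g) scaling assumes the default group multipliers:

```r
# Q(beta) = (1/n) * L(beta) + P_lambda(beta) for the group lasso, where L is
# the Gaussian negative log-likelihood (up to an additive constant).
grp_objective <- function(beta, X, y, group, lambda) {
  L <- 0.5 * sum((y - X %*% beta)^2)
  P <- sum(vapply(unique(group), function(g) {
    bg <- beta[group == g]
    lambda * sqrt(length(bg)) * sqrt(sum(bg^2))  # ||beta_g|| scaled by sqrt(K_g)
  }, numeric(1)))
  L / length(y) + P
}
```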
56 changes: 31 additions & 25 deletions R/grpsurv.R
@@ -10,9 +10,9 @@
#' defined to be
#' \deqn{Q(\beta|X, y) = \frac{1}{n} L(\beta|X, y) + P_\lambda(\beta)}{Q(\beta|X, y) = (1/n)*L(\beta|X, y) + P(\beta, \lambda)}
#' where the loss function L is the deviance (-2 times the partial
#' log-likelihood) from the Cox regression mode.
#' [See here for more details](https://pbreheny.github.io/ncvreg/articles/web/models.html).
#' where the loss function L is the negative partial log-likelihood (half the
#' deviance) from the Cox regression model.
#' [See here for more details](https://pbreheny.github.io/ncvreg/articles/models.html).
#'
#' Presently, ties are not handled by \code{grpsurv} in a particularly
#' sophisticated manner. This will be improved upon in a future release of
@@ -70,29 +70,35 @@
#' saturation? Default is TRUE.
#' @param returnX Return the standardized design matrix? Default is FALSE.
#' @param ... Not used.
#' @return An object with S3 class \code{"grpsurv"} containing: \item{beta}{The
#' fitted matrix of coefficients. The number of rows is equal to the number of
#' coefficients, and the number of columns is equal to \code{nlambda}.}
#' \item{group}{Same as above.} \item{lambda}{The sequence of \code{lambda}
#' values in the path.} \item{penalty}{Same as above.} \item{gamma}{Same as
#' above.} \item{alpha}{Same as above.} \item{loss}{The negative partial
#' log-likelihood of the fitted model at each value of \code{lambda}.}
#' \item{n}{The number of observations.} \item{df}{A vector of length
#' \code{nlambda} containing estimates of effective number of model parameters
#' all the points along the regularization path. For details on how this is
#' calculated, see Breheny and Huang (2009).} \item{iter}{A vector of length
#' \code{nlambda} containing the number of iterations until convergence at each
#' value of \code{lambda}.} \item{group.multiplier}{A named vector containing
#' the multiplicative constant applied to each group's penalty.}
#'
#' @returns An object with S3 class `"grpsurv"` containing:
#' \item{beta}{The fitted matrix of coefficients. The number of rows is equal to
#' the number of coefficients, and the number of columns is equal to `nlambda`.}
#' \item{group}{Same as above.}
#' \item{lambda}{The sequence of `lambda` values in the path.}
#' \item{penalty}{Same as above.}
#' \item{gamma}{Same as above.}
#' \item{alpha}{Same as above.}
#' \item{deviance}{The deviance of the fitted model at each value of `lambda`.}
#' \item{n}{The number of observations.}
#' \item{df}{A vector of length `nlambda` containing estimates of effective
#' number of model parameters all the points along the regularization path. For
#' details on how this is calculated, see Breheny and Huang (2009).}
#' \item{iter}{A vector of length `nlambda` containing the number of iterations
#' until convergence at each value of `lambda`.}
#' \item{group.multiplier}{A named vector containing the multiplicative constant
#' applied to each group's penalty.}
#'
#' For Cox models, the following objects are also returned (and are necessary
#' to estimate baseline survival conditonal on the estimated regression
#' coefficients), all of which are ordered by time on study. I.e., the ith row
#' of \code{W} does not correspond to the ith row of \code{X}):
#' to estimate baseline survival conditional on the estimated regression
#' coefficients), all of which are ordered by time on study (i.e., the ith row
#' of `W` does not correspond to the ith row of `X`):
#'
#' \item{W}{Matrix of `exp(beta)` values for each subject over all `lambda`
#' values.}
#' \item{time}{Times on study.}
#' \item{fail}{Failure event indicator.}
#'
#' \item{W}{Matrix of \code{exp(beta)} values for each subject over all
#' \code{lambda} values.} \item{time}{Times on study.} \item{fail}{Failure
#' event indicator.}
#' @author Patrick Breheny
#'
#' @seealso [plot.grpreg()], [predict.grpsurv()], [cv.grpsurv()]
@@ -200,7 +206,7 @@ grpsurv <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD"
b <- matrix(res[[1]], p, nlambda)
iter <- res[[2]]
df <- res[[3]]
loss <- -2*res[[4]]
loss <- -res[[4]]
Eta <- matrix(res[[5]], n, nlambda)

# Eliminate saturated lambda values, if any
Expand Down Expand Up @@ -231,7 +237,7 @@ grpsurv <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD
penalty = penalty,
gamma = gamma,
alpha = alpha,
loss = loss,
deviance = 2 * loss,
n = n,
df = df,
iter = iter,
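The cv.grpsurv documentation earlier describes computing the full Cox partial likelihood from cross-validated linear predictors. A sketch of that idea under simplifying assumptions — no ties, Breslow-style risk sets — with an invented function name for illustration:

```r
# Deviance-style CV criterion for Cox regression: -2 times the full partial
# log-likelihood, evaluated at held-out linear predictors eta.
cv_cox_deviance <- function(time, status, eta) {
  o <- order(time)
  eta <- eta[o]; status <- status[o]
  # For subject i, the risk set is everyone with time >= time[i]
  log_riskset <- log(rev(cumsum(rev(exp(eta)))))
  -2 * sum(status * (eta - log_riskset))
}
```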