Commit
Correct names for all loss/deviance functions/variables; fixes #52 and fixes #53
pbreheny committed Jun 13, 2024
1 parent a858c71 commit 973f9e6
Showing 20 changed files with 143 additions and 127 deletions.
2 changes: 1 addition & 1 deletion .version.json
@@ -1,6 +1,6 @@
{
"schemaVersion": 1,
"label": "GitHub",
"message": "3.4.0.5",
"message": "3.4.0.6",
"color": "blue"
}
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: grpreg
Title: Regularization Paths for Regression Models with Grouped Covariates
Version: 3.4.0.5
Date: 2023-11-20
Version: 3.4.0.6
Date: 2024-06-13
Authors@R: c(
person("Patrick", "Breheny", role=c("aut","cre"), email="[email protected]", comment=c(ORCID="0000-0002-0650-1119")),
person("Yaohui", "Zeng", role="ctb"),
5 changes: 0 additions & 5 deletions R/calcNullDev.R

This file was deleted.

5 changes: 5 additions & 0 deletions R/calc_null_dev.R
@@ -0,0 +1,5 @@
calc_null_dev <- function(X, y, group, family) {
form <- if (any(group==0)) formula(y~X[, group==0]) else formula(y~1)
fit <- glm(form, family=family)
mean(deviance_grpreg(y, predict(fit, type="response"), family))
}
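The renamed helper above fits the null model (intercept-only, or including any unpenalized group-0 covariates) and returns its mean deviance. As a rough illustration only — `deviance_grpreg()` is internal to grpreg, and this sketch assumes the Gaussian per-observation deviance is the squared residual:

```r
# Recompute what calc_null_dev() would return for family = "gaussian":
# the mean deviance of the intercept-only model, i.e. the mean of (y - ybar)^2.
set.seed(1)
y <- rnorm(50)
fit0 <- glm(y ~ 1, family = "gaussian")      # null model: intercept only
pred0 <- predict(fit0, type = "response")    # every fitted value is mean(y)
mean((y - pred0)^2)                          # mean null deviance
```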
62 changes: 33 additions & 29 deletions R/cv-grpreg.R
@@ -4,59 +4,61 @@
#' grouped covariates over a grid of values for the regularization parameter
#' lambda.
#'
#' The function calls \code{grpreg}/\code{cv.grpsurv} \code{nfolds} times, each
#' time leaving out 1/\code{nfolds} of the data. The cross-validation error is
#' The function calls [grpreg()] or [grpsurv()] `nfolds` times, each
#' time leaving out 1/`nfolds` of the data. The cross-validation error is
#' based on the deviance;
#' [see here for more details](https://pbreheny.github.io/grpreg/articles/web/models.html).
#' [see here for more details](https://pbreheny.github.io/grpreg/articles/models.html).
#'
#' For Gaussian and Poisson responses, the folds are chosen according to simple
#' random sampling. For binomial responses, the numbers for each outcome class
#' are balanced across the folds; i.e., the number of outcomes in which
#' \code{y} is equal to 1 is the same for each fold, or possibly off by 1 if
#' `y` is equal to 1 is the same for each fold, or possibly off by 1 if
#' the numbers do not divide evenly. This approach is used for Cox regression
#' as well to balance the amount of censoring across each fold.
#'
#' For Cox models, \code{cv.grpsurv} uses the approach of calculating the full
#' For Cox models, `cv.grpsurv` uses the approach of calculating the full
#' Cox partial likelihood using the cross-validated set of linear predictors.
#' Other approaches to cross-validation for the Cox regression model have been
#' proposed in the literature; the strengths and weaknesses of the various
#' methods for penalized regression in the Cox model are the subject of current
#' research. A simple approximation to the standard error is provided,
#' although an option to bootstrap the standard error (\code{se='bootstrap'})
#' although an option to bootstrap the standard error (`se='bootstrap'`)
#' is also available.
#'
#' As in \code{grpreg}, seemingly unrelated regressions/multitask learning can
#' be carried out by setting \code{y} to be a matrix, in which case groups are
#' set up automatically (see \code{\link{grpreg}} for details), and
#' cross-validation is carried out with respect to rows of \code{y}. As
#' As in [grpreg()], seemingly unrelated regressions/multitask learning can
#' be carried out by setting `y` to be a matrix, in which case groups are
#' set up automatically (see [grpreg()] for details), and
#' cross-validation is carried out with respect to rows of `y`. As
#' mentioned in the details there, it is recommended to standardize the
#' responses prior to fitting.
#'
#' @aliases cv.grpreg cv.grpsurv
#' @param X The design matrix, as in \code{grpreg}/\code{grpsurv}.
#' @param y The response vector (or matrix), as in \code{grpreg}\code{grpsurv}.
#' @param group The grouping vector, as in \code{grpreg}\code{grpsurv}.
#' @param ... Additional arguments to \code{grpreg}\code{grpsurv}.
#'
#' @param X The design matrix, as in [grpreg()]/[grpsurv()].
#' @param y The response vector (or matrix), as in [grpreg()]/[grpsurv()].
#' @param group The grouping vector, as in [grpreg()]/[grpsurv()].
#' @param ... Additional arguments to [grpreg()]/[grpsurv()].
#' @param nfolds The number of cross-validation folds. Default is 10.
#' @param seed You may set the seed of the random number generator in order to
#' obtain reproducible results.
#' @param fold Which fold each observation belongs to. By default the
#' observations are randomly assigned.
#' @param returnY Should \code{cv.grpreg}\code{grpsurv} return the fitted
#' @param returnY Should `cv.grpreg()`/`cv.grpsurv()` return the fitted
#' values from the cross-validation folds? Default is FALSE; if TRUE, this
#' will return a matrix in which the element for row i, column j is the fitted
#' value for observation i from the fold in which observation i was excluded
#' from the fit, at the jth value of lambda. NOTE: For \code{cv.grpsurv}, the
#' rows of \code{Y} are ordered by time on study, and therefore will not
#' correspond to the original order of observations pased to \code{cv.grpsurv}.
#' from the fit, at the jth value of lambda. NOTE: For `cv.grpsurv()`, the
#' rows of `Y` are ordered by time on study, and therefore will not
#' correspond to the original order of observations passed to `cv.grpsurv()`.
#' @param trace If set to TRUE, cv.grpreg will inform the user of its progress
#' by announcing the beginning of each CV fold. Default is FALSE.
#' @param se For \code{cv.grpsurv}, the method by which the cross-valiation
#' @param se For `cv.grpsurv()`, the method by which the cross-validation
#' standard error (CVSE) is calculated. The 'quick' approach is based on a
#' rough approximation, but can be calculated more or less instantly. The
#' 'bootstrap' approach is more accurate, but requires additional computing
#' time.
#' @return An object with S3 class \code{"cv.grpreg"} containing:
#'
#' @returns An object with S3 class \code{"cv.grpreg"} containing:
#' \item{cve}{The error for each value of \code{lambda}, averaged across the
#' cross-validation folds.} \item{cvse}{The estimated standard error associated
#' with each value of \code{cve}.} \item{lambda}{The sequence of
@@ -67,12 +69,14 @@
#' observations, not the original observations.} \item{min}{The index of
#' \code{lambda} corresponding to \code{lambda.min}.} \item{lambda.min}{The
#' value of \code{lambda} with the minimum cross-validation error.}
#' \item{null.dev}{The deviance for the intercept-only model.} \item{pe}{If
#' \code{family="binomial"}, the cross-validation prediction error for each
#' value of \code{lambda}.}
#' \item{null.dev}{The deviance for the intercept-only model.}
#' \item{pe}{If `family="binomial"`, the cross-validation prediction error for
#' each value of `lambda`.}
#'
#' @author Patrick Breheny
#' @seealso \code{\link{grpreg}}, \code{\link{plot.cv.grpreg}},
#' \code{\link{summary.cv.grpreg}}, \code{\link{predict.cv.grpreg}}
#'
#' @seealso [grpreg()], [plot.cv.grpreg()], [summary.cv.grpreg()],
#' [predict.cv.grpreg()]
#'
#' @examples
#' \dontshow{set.seed(1)}
@@ -159,7 +163,7 @@ cv.grpreg <- function(X, y, group=1:ncol(X), ..., nfolds=10, seed, fold, returnY
if (trace) cat("Starting CV fold #", i, sep="","\n")
res <- cvf(i, X, y, fold, cv.args)
Y[fold==i, 1:res$nl] <- res$yhat
E[fold==i, 1:res$nl] <- res$loss
E[fold==i, 1:res$nl] <- res$deviance
if (fit$family=="binomial") PE[fold==i, 1:res$nl] <- res$pe
}

@@ -173,7 +177,7 @@ cv.grpreg <- function(X, y, group=1:ncol(X), ..., nfolds=10, seed, fold, returnY
cve <- apply(E, 2, mean)
cvse <- apply(E, 2, sd) / sqrt(n)
min <- which.min(cve)
null.dev <- calcNullDev(X, y, group=XG$g, family=fit$family)
null.dev <- calc_null_dev(X, y, group=XG$g, family=fit$family)

val <- list(cve=cve, cvse=cvse, lambda=lambda, fit=fit, fold=fold, min=min, lambda.min=lambda[min], null.dev=null.dev)
if (fit$family=="binomial") val$pe <- apply(PE[, ind], 2, mean)
@@ -195,7 +199,7 @@ cvf <- function(i, X, y, fold, cv.args) {
X2 <- X[fold==i, , drop=FALSE]
y2 <- y[fold==i]
yhat <- matrix(predict(fit.i, X2, type="response"), length(y2))
loss <- loss.grpreg(y2, yhat, fit.i$family)
deviance <- deviance_grpreg(y2, yhat, fit.i$family)
pe <- if (fit.i$family=="binomial") {(yhat < 0.5) == y2} else NULL
list(loss=loss, pe=pe, nl=length(fit.i$lambda), yhat=yhat)
list(deviance=deviance, pe=pe, nl=length(fit.i$lambda), yhat=yhat)
}
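A short usage sketch of the refactored interface, using the Birthwt data that ships with grpreg; after this commit the returned object stores deviance-based CV error in `cve`, with the field names documented above:

```r
library(grpreg)
data(Birthwt)
cvfit <- cv.grpreg(Birthwt$X, Birthwt$bwt, Birthwt$group, seed = 1)
cvfit$cve[cvfit$min]   # minimum cross-validated error (mean deviance)
cvfit$lambda.min       # value of lambda achieving it
cvfit$null.dev         # deviance of the intercept-only model
```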
8 changes: 4 additions & 4 deletions R/cv-grpsurv.R
@@ -61,12 +61,12 @@ cv.grpsurv <- function(X, y, group=1:ncol(X), ..., nfolds=10, seed, fold, se=c(

# Return
if (se == "quick") {
L <- loss.grpsurv(y, Y, total=FALSE)
L <- deviance_grpsurv(y, Y, total=FALSE)
cve <- apply(L, 2, sum)/sum(fit$fail)
cvse <- apply(L, 2, sd)*sqrt(nrow(L))/sum(fit$fail)
} else {
cve <- as.double(loss.grpsurv(y, Y))/sum(fit$fail)
cvse <- se.grpsurv(y, Y)/sum(fit$fail)
cve <- as.double(deviance_grpsurv(y, Y))/sum(fit$fail)
cvse <- se_grpsurv(y, Y)/sum(fit$fail)
}
min <- which.min(cve)

@@ -84,5 +84,5 @@ cvf.surv <- function(i, XX, y, fold, cv.args) {
nl <- length(fit.i$lambda)
yhat <- predict(fit.i, X2)

list(nl=length(fit.i$lambda), yhat=yhat)#, loss=loss)
list(nl=length(fit.i$lambda), yhat=yhat)
}
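The two CVSE methods shown above can be compared directly. A hedged sketch using the Lung data from grpreg — the 'bootstrap' result involves resampling, so the gap will vary from run to run:

```r
library(grpreg)
data(Lung)
cv_quick <- cv.grpsurv(Lung$X, Lung$y, Lung$group, se = 'quick', seed = 1)
cv_boot  <- cv.grpsurv(Lung$X, Lung$y, Lung$group, se = 'bootstrap', seed = 1)
# Only the standard-error estimate is computed differently between the two
range(cv_quick$cvse - cv_boot$cvse)
```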
2 changes: 1 addition & 1 deletion R/gBridge.R
@@ -151,7 +151,7 @@ gBridge <- function(X, y, group=1:ncol(X), family=c("gaussian", "binomial", "poi
group = group,
lambda = lambda,
alpha = alpha,
loss = loss,
deviance = 2 * loss,
n = n,
penalty = "gBridge",
df = df,
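The change above stores the deviance rather than the raw loss; the two differ only by a factor of two, since the deviance is -2 times the log-likelihood relative to the saturated model. A quick numeric check for a single binomial observation:

```r
# For y = 1 with fitted probability p, the saturated log-likelihood is 0,
# so the deviance contribution is -2*log(p), i.e. exactly 2 * loss.
p <- 0.7; y <- 1
loss <- -(y * log(p) + (1 - y) * log(1 - p))  # negative log-likelihood
c(loss = loss, deviance = 2 * loss)
```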
29 changes: 15 additions & 14 deletions R/grpreg.R
@@ -10,13 +10,13 @@
#' the level of individual covariates (i.e., selecting important groups as well
#' as important members of those groups). Group selection selects important
#' groups, and not members within the group -- i.e., within a group,
#' coefficients will either all be zero or all nonzero. The \code{grLasso},
#' \code{grMCP}, and \code{grSCAD} penalties carry out group selection, while
#' the \code{gel} and \code{cMCP} penalties carry out bi-level selection. For
#' bi-level selection, see also the \code{\link{gBridge}} function. For
#' coefficients will either all be zero or all nonzero. The `grLasso`,
#' `grMCP`, and `grSCAD` penalties carry out group selection, while
#' the `gel` and `cMCP` penalties carry out bi-level selection. For
#' bi-level selection, see also the [gBridge()] function. For
#' historical reasons and backwards compatibility, some of these penalties have
#' aliases; e.g., \code{gLasso} will do the same thing as \code{grLasso}, but
#' users are encouraged to use \code{grLasso}.
#' aliases; e.g., `gLasso` will do the same thing as `grLasso`, but
#' users are encouraged to use `grLasso`.
#'
#' Please note the distinction between \code{grMCP} and \code{cMCP}. The
#' former involves an MCP penalty being applied to an L2-norm of each group.
@@ -39,12 +39,12 @@
#' \deqn{Q(\beta|X, y) = \frac{1}{n} L(\beta|X, y) + P_\lambda(\beta)}{Q(\beta|X, y) = (1/n)*L(\beta|X, y) + P(\beta, \lambda)}
#' where the loss function L is
#' the deviance (-2 times the log likelihood) for the specified outcome
#' the negative log-likelihood (half the deviance) for the specified outcome
#' distribution (gaussian/binomial/poisson). For more details, refer to the
#' following:
#' \itemize{
#' \item [Models and loss functions](https://pbreheny.github.io/grpreg/articles/web/models.html)
#' \item [Penalties](https://pbreheny.github.io/grpreg/articles/web/penalties.html)
#' \item [Models and loss functions](https://pbreheny.github.io/grpreg/articles/models.html)
#' \item [Penalties](https://pbreheny.github.io/grpreg/articles/penalties.html)
#' }
#'
#' For the bi-level selection methods, a locally approximated coordinate
@@ -151,7 +151,7 @@
#' \item{group}{Same as above.}
#' \item{lambda}{The sequence of \code{lambda} values in the path.}
#' \item{alpha}{Same as above.}
#' \item{loss}{A vector containing either the residual sum of squares (`"gaussian"`) or negative log-likelihood (`"binomial"`) of the fitted model at each value of `lambda`.}
#' \item{deviance}{A vector containing the deviance of the fitted model at each
#' value of `lambda`.}
#' \item{n}{Number of observations.}
#' \item{penalty}{Same as above.}
#' \item{df}{A vector of length `nlambda` containing estimates of effective number of model parameters all the points along the regularization path. For details on how this is calculated, see Breheny and Huang (2009).}
@@ -299,15 +300,15 @@ grpreg <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD"
if (bilevel) fit <- .Call("lcdfit_gaussian", XG$X, yy, penalty, K1, K0, lambda, alpha, eps, 0, gamma, tau, as.integer(max.iter), XG$m, as.integer(dfmax), as.integer(gmax), as.integer(user.lambda))
else fit <- .Call("gdfit_gaussian", XG$X, yy, penalty, K1, K0, lambda, lam.max, alpha, eps, as.integer(max.iter), gamma, XG$m, as.integer(dfmax), as.integer(gmax), as.integer(user.lambda))
b <- rbind(mean(y), matrix(fit[[1]], nrow=p))
loss <- fit[[2]]
dev <- fit[[2]]
Eta <- matrix(fit[[3]], nrow=n) + mean(y)
df <- fit[[4]] + 1 # Intercept
iter <- fit[[5]]
} else {
if (bilevel) fit <- .Call("lcdfit_glm", XG$X, yy, family, penalty, K1, K0, lambda, alpha, eps, 0, gamma, tau, as.integer(max.iter), XG$m, as.integer(dfmax), as.integer(gmax), as.integer(warn), as.integer(user.lambda))
else fit <- .Call("gdfit_glm", XG$X, yy, family, penalty, K1, K0, lambda, alpha, eps, as.integer(max.iter), gamma, XG$m, as.integer(dfmax), as.integer(gmax), as.integer(warn), as.integer(user.lambda))
b <- rbind(fit[[1]], matrix(fit[[2]], nrow=p))
loss <- fit[[3]]
dev <- fit[[3]]
Eta <- matrix(fit[[4]], nrow=n)
df <- fit[[5]]
iter <- fit[[6]]
@@ -319,7 +320,7 @@ grpreg <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD"
iter <- iter[ind]
lambda <- lambda[ind]
df <- df[ind]
loss <- loss[ind]
dev <- dev[ind]
Eta <- Eta[, ind, drop=FALSE]
if (iter[1] == max.iter) stop("Algorithm failed to converge for any values of lambda. This indicates a combination of (a) an ill-conditioned feature matrix X and (b) insufficient penalization. You must fix one or the other for your model to be identifiable.", call.=FALSE)
if (warn & any(iter==max.iter)) warning("Algorithm failed to converge for all values of lambda", call.=FALSE)
@@ -347,7 +348,7 @@ grpreg <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD"
group = factor(group),
lambda = lambda,
alpha = alpha,
loss = loss,
deviance = dev,
linear.predictors = Eta,
n = n,
penalty = penalty,
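The documentation above defines the objective Q(β|X, y) = (1/n) L(β|X, y) + P_λ(β), with L now described as the negative log-likelihood (half the deviance). A minimal sketch of this objective for the Gaussian/grLasso case — illustrative only, since grpreg evaluates it internally in C; the sqrt(K_g) scaling assumes the default group multipliers:

```r
# Q(beta) = (1/n) * L(beta) + P_lambda(beta) for the group lasso, where L is
# the Gaussian negative log-likelihood (up to an additive constant).
grp_objective <- function(beta, X, y, group, lambda) {
  L <- 0.5 * sum((y - X %*% beta)^2)
  P <- sum(vapply(unique(group), function(g) {
    bg <- beta[group == g]
    lambda * sqrt(length(bg)) * sqrt(sum(bg^2))  # ||beta_g|| scaled by sqrt(K_g)
  }, numeric(1)))
  L / length(y) + P
}
```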
56 changes: 31 additions & 25 deletions R/grpsurv.R
@@ -10,9 +10,9 @@
#' defined to be
#' \deqn{Q(\beta|X, y) = \frac{1}{n} L(\beta|X, y) + P_\lambda(\beta)}{Q(\beta|X, y) = (1/n)*L(\beta|X, y) + P(\beta, \lambda)}
#' where the loss function L is the deviance (-2 times the partial
#' log-likelihood) from the Cox regression mode.
#' [See here for more details](https://pbreheny.github.io/ncvreg/articles/web/models.html).
#' where the loss function L is the negative partial log-likelihood (half the
#' deviance) from the Cox regression model.
#' [See here for more details](https://pbreheny.github.io/ncvreg/articles/models.html).
#'
#' Presently, ties are not handled by \code{grpsurv} in a particularly
#' sophisticated manner. This will be improved upon in a future release of
@@ -70,29 +70,35 @@
#' saturation? Default is TRUE.
#' @param returnX Return the standardized design matrix? Default is FALSE.
#' @param ... Not used.
#' @return An object with S3 class \code{"grpsurv"} containing: \item{beta}{The
#' fitted matrix of coefficients. The number of rows is equal to the number of
#' coefficients, and the number of columns is equal to \code{nlambda}.}
#' \item{group}{Same as above.} \item{lambda}{The sequence of \code{lambda}
#' values in the path.} \item{penalty}{Same as above.} \item{gamma}{Same as
#' above.} \item{alpha}{Same as above.} \item{loss}{The negative partial
#' log-likelihood of the fitted model at each value of \code{lambda}.}
#' \item{n}{The number of observations.} \item{df}{A vector of length
#' \code{nlambda} containing estimates of effective number of model parameters
#' all the points along the regularization path. For details on how this is
#' calculated, see Breheny and Huang (2009).} \item{iter}{A vector of length
#' \code{nlambda} containing the number of iterations until convergence at each
#' value of \code{lambda}.} \item{group.multiplier}{A named vector containing
#' the multiplicative constant applied to each group's penalty.}
#'
#' @returns An object with S3 class `"grpsurv"` containing:
#' \item{beta}{The fitted matrix of coefficients. The number of rows is equal to
#' the number of coefficients, and the number of columns is equal to `nlambda`.}
#' \item{group}{Same as above.}
#' \item{lambda}{The sequence of `lambda` values in the path.}
#' \item{penalty}{Same as above.}
#' \item{gamma}{Same as above.}
#' \item{alpha}{Same as above.}
#' \item{deviance}{The deviance of the fitted model at each value of `lambda`.}
#' \item{n}{The number of observations.}
#' \item{df}{A vector of length `nlambda` containing estimates of effective
#' number of model parameters all the points along the regularization path. For
#' details on how this is calculated, see Breheny and Huang (2009).}
#' \item{iter}{A vector of length `nlambda` containing the number of iterations
#' until convergence at each value of `lambda`.}
#' \item{group.multiplier}{A named vector containing the multiplicative constant
#' applied to each group's penalty.}
#'
#' For Cox models, the following objects are also returned (and are necessary
#' to estimate baseline survival conditonal on the estimated regression
#' coefficients), all of which are ordered by time on study. I.e., the ith row
#' of \code{W} does not correspond to the ith row of \code{X}):
#' to estimate baseline survival conditional on the estimated regression
#' coefficients), all of which are ordered by time on study (i.e., the ith row
#' of `W` does not correspond to the ith row of `X`):
#'
#' \item{W}{Matrix of `exp(beta)` values for each subject over all `lambda`
#' values.}
#' \item{time}{Times on study.}
#' \item{fail}{Failure event indicator.}
#'
#' \item{W}{Matrix of \code{exp(beta)} values for each subject over all
#' \code{lambda} values.} \item{time}{Times on study.} \item{fail}{Failure
#' event indicator.}
#' @author Patrick Breheny
#'
#' @seealso [plot.grpreg()], [predict.grpsurv()], [cv.grpsurv()]
@@ -200,7 +206,7 @@ grpsurv <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD"
b <- matrix(res[[1]], p, nlambda)
iter <- res[[2]]
df <- res[[3]]
loss <- -2*res[[4]]
loss <- -res[[4]]
Eta <- matrix(res[[5]], n, nlambda)

# Eliminate saturated lambda values, if any
Expand Down Expand Up @@ -231,7 +237,7 @@ grpsurv <- function(X, y, group=1:ncol(X), penalty=c("grLasso", "grMCP", "grSCAD
penalty = penalty,
gamma = gamma,
alpha = alpha,
loss = loss,
deviance = 2 * loss,
n = n,
df = df,
iter = iter,
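The cv.grpsurv documentation earlier describes computing the full Cox partial likelihood from cross-validated linear predictors. A sketch of that idea under simplifying assumptions — no ties, Breslow-style risk sets — with an invented function name for illustration:

```r
# Deviance-style CV criterion for Cox regression: -2 times the full partial
# log-likelihood, evaluated at held-out linear predictors eta.
cv_cox_deviance <- function(time, status, eta) {
  o <- order(time)
  eta <- eta[o]; status <- status[o]
  # For subject i, the risk set is everyone with time >= time[i]
  log_riskset <- log(rev(cumsum(rev(exp(eta)))))
  -2 * sum(status * (eta - log_riskset))
}
```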