Updates to documentation
pbreheny committed Sep 26, 2018
1 parent 571ffe8 commit c821619
Showing 16 changed files with 246 additions and 92 deletions.
60 changes: 38 additions & 22 deletions NEWS
@@ -1,23 +1,39 @@
3.1-4 (2018-06-15)
# grpreg 3.2-0 (2018-XX-XX)
* New: cv.grpsurv now calculates SE, with bootstrap option
* Change: R^2 now consistently uses the Cox-Snell definition for all types
of models
* Change: Survival loss now uses deviance
* Change: cv.grpsurv now uses 'fold', not 'cv.ind', to declare assignments
* Fixed: cv.grpreg now correctly handles out-of-order groups for Poisson
* Fixed: cv.grpsurv now correctly standardizes out-of-order groups
* Fixed: grpreg no longer returns loss=NA with family='binomial' for some
lambda values
* Internal: SSR-BEDPP optimization reinstated after bug fix
* Internal: C code for binom/pois combined into gdfit_glm, lcdfit_glm
* Documentation: Lots of updates
* Documentation: vignette now html (used to be pdf)
* Documentation: pkgdown website
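
An illustrative sketch of the cv.grpsurv changes listed above (not part of the NEWS file itself; the se value 'bootstrap' and Lung$group are assumptions inferred from usage shown elsewhere in this commit):

```r
library(grpreg)
data(Lung)
X <- Lung$X
y <- Lung$y
group <- Lung$group                              # assumed grouping for this data set

# fold assignments are now declared via 'fold' (formerly 'cv.ind')
fold <- sample(rep(1:10, length.out = nrow(X)))
cvfit <- cv.grpsurv(X, y, group, fold = fold, se = 'bootstrap')
head(cvfit$cvse)                                 # per-lambda standard errors (see man/cv-grpreg.Rd)
```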

# grpreg 3.1-4 (2018-06-15)
* Fixed: Works with arbitrarily "messy" group structures now (constant
columns, out of order groups, etc.) due to restructuring of standardization/
orthogonalization
* Internal: SSR-BEDPP rule turned off due to bug

3.1-3 (2018-04-07)
# grpreg 3.1-3 (2018-04-07)
* Internal: C code now uses || instead of |

3.1-2 (2017-07-05)
# grpreg 3.1-2 (2017-07-05)
* Fixed: Bug in applying screening rules with group lasso for linear
regression with user-specified lambda sequence (thank you very much to
Natasha Sahr for pointing this out)

3.1-1 (2017-06-07)
# grpreg 3.1-1 (2017-06-07)
* Fixed: Cross-validation no longer fails when constant columns are present
(thank you to Matthew Rosenberg for pointing this out)
* Fixed: Cross-validation no longer fails when group.multiplier is specified

3.1-0 (2017-05-18)
# grpreg 3.1-0 (2017-05-18)
* New: Additional tests and support for coercion of various types with
respect to both X and y
* Change: Convergence criterion now based on RMSD of linear predictors
@@ -32,16 +48,16 @@
* Fixed: The binding of X and G fixes several potential bugs, including
Issue #12 (GitHub)

3.0-2
# grpreg 3.0-2
* Fixed bug involving mismatch between group.multiplier and group if
group is given out of order.

3.0-1 (2016-06-06)
# grpreg 3.0-1 (2016-06-06)
* Fixed: memory allocation bug
* Deprecation: Re-introduced 'birthwt.grpreg' for backwards compatibility,
but this is deprecated

3.0-0 (2016-06-02)
# grpreg 3.0-0 (2016-06-02)
* New: methods for survival analysis (Cox modeling): grpsurv, cv.grpsurv, AUC,
predict.grpsurv
* New: option to return fitted values from cross-validation folds
@@ -54,14 +70,14 @@
* Documentation: Added vignettes (a quick-start guide and a detailed
description of available penalties)

2.8-1 (2015-05-30)
# grpreg 2.8-1 (2015-05-30)
* New: cv.grpreg now allows user to specify lambda (thanks to Vincent
Arel-Bundock for suggesting this change)
* Fixed: bug for predict.grpreg(fit, type="nvars") or type="ngroups" when
scalar lambda value is passed
* Documentation: Updated citations

2.8-0 (2014-11-15)
# grpreg 2.8-0 (2014-11-15)
* New: More flexible interface through the 'group' argument; groups may now be
out of order, and may be named rather than only consecutive integers
* New: 'X' can now be a matrix of integers (previously this would result in
@@ -78,7 +94,7 @@
* Fixed: bug in cv.grpreg when attempting to use leave-one-out
cross-validation

2.7-1 (2014-08-13)
# grpreg 2.7-1 (2014-08-13)
* Fixed: More rigorous initialization at C level to prevent possible memory
access problems
* Fixed: predict() for types 'vars', 'nvars', and 'ngroups' with multivariate
@@ -87,23 +103,23 @@
multivariate outcomes (thank you to Cajo ter Braak for pointing out that
this was broken)

2.7-0 (2014-08-13)
# grpreg 2.7-0 (2014-08-13)
* New: support for Poisson regression
* Internal: .Call now used instead of .C
* Fixed: bug in cv.grpreg when attempting to use leave-one-out
cross-validation (thank you to Cajo ter Braak for pointing this out)

2.6-0 (2014-03-21)
# grpreg 2.6-0 (2014-03-21)
* Internal: Various internal changes to make the package more efficient for
large data sets

2.5-0 (2013-12-24)
# grpreg 2.5-0 (2013-12-24)
* New: group exponential lasso 'gel' method
* New: 'gmax' option
* New: 'nvars' and 'ngroups' options for predict
* Change: appearance of summary.cv.grpreg display

2.4-0 (2013-06-07)
# grpreg 2.4-0 (2013-06-07)
* New: options in plot.cv.grpreg to plot estimates of r-squared,
signal-to-noise ratio, scale parameter, and prediction error in addition to
cross-validation error (deviance)
@@ -114,20 +130,20 @@
* New: 'summary' method for cv.grpreg objects
* New: 'coef' and 'predict' methods for cv.grpreg objects
* Change: Brought gBridge up to date so that it now handles constant columns,
etc. (see 2.2-0)
etc. (see # grpreg 2.2-0)
* Fixed: bug in predict type='coefficients' when 'lambda' argument specified
* Fixed: bug in cv.grpreg with user-defined lambda values

2.3-0 (2013-02-10)
# grpreg 2.3-0 (2013-02-10)
* Internal: Switched to SVD-based orthogonalization to allow for linear
dependency within groups

2.2-1 (2012-11-16)
# grpreg 2.2-1 (2012-11-16)
* Fixed: compilation error for 32-bit Windows
* Fixed: bug in calculation of binomial deviance when fitted probabilities
are close to 0 or 1

2.2-0 (2012-10-09)
# grpreg 2.2-0 (2012-10-09)
* New: 'select' now allows '...' options to be passed to logLik
* New: Added option to plot norm of each group, rather than individual
coefficients
@@ -139,12 +155,12 @@
* Fixed: bug for returning group when some groups were eliminated due to
constant columns

2.1-0 (2012-07-28)
# grpreg 2.1-0 (2012-07-28)
* New: grpreg can now handle constant columns (they produce beta=0)
* Fixed: Bug involving orthogonalization with unpenalized groups
* Internal: restructuring of C code

2.0-0 (2012-07-21)
# grpreg 2.0-0 (2012-07-21)
* New: Group MCP, group SCAD methods added
* New: Added 'cv.grpreg' to facilitate cross-validation
* New: 'dfmax' option
@@ -156,7 +172,7 @@
* Internal: standardize and orthogonalize functions added
* Internal: Much more extensive and reproducible code testing

1.2-0 (2011-06-22)
# grpreg 1.2-0 (2011-06-22)
* New: grpreg now returns 'loss'
* New: Added logLik method
* Change: Syntax of 'select' modified (no longer requires X, y to be passed)
12 changes: 7 additions & 5 deletions index.Rmd
@@ -12,13 +12,13 @@ knitr::knit_hooks$set(small.mar = function(before, options, envir) {

`grpreg` is an R package for fitting the regularization path of linear regression, GLM, and Cox regression models with grouped penalties. This includes group selection methods such as group lasso, group MCP, and group SCAD as well as bi-level selection methods such as the group exponential lasso, the composite MCP, and the group bridge. Utilities for carrying out cross-validation as well as post-fitting visualization, summarization, and prediction are also provided.

This vignette offers a brief introduction to the basic use of `grpreg`. For more details on the package, visit the `grpreg` website at <http://pbreheny.github.io/grpreg>. For more on the algorithms used by `grpreg`, see the original articles:
This site focuses on illustrating the usage and syntax of `grpreg`. For more on the algorithms used by `grpreg`, see the original articles:

* [Breheny, P. and Huang, J. (2009) Penalized methods for bi-level variable selection. *Statistics and its interface*, **2**: 369-380.](http://myweb.uiowa.edu/pbreheny/pdf/Breheny2009.pdf)

* [Breheny, P. and Huang, J. (2015) Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. *Statistics and Computing*, **25**: 173-187.](http://www.springerlink.com/openurl.asp?genre=article&id=doi:10.1007/s11222-013-9424-2)

For more information on specific penalties, see <http://pbreheny.github.io/grpreg/articles/web/penalties.html>.
For more information on specific penalties, including references describing the methods implemented by `grpreg` in greater detail, see <http://pbreheny.github.io/grpreg/articles/web/penalties.html>.

## Installation

@@ -44,13 +44,15 @@ head(X)
group
```

To fit a group lasso model to this data:
We can fit a penalized regression model to this data with:

```{r fit}
fit <- grpreg(X, y, group, penalty="grLasso")
fit <- grpreg(X, y, group)
```

We can then plot the coefficient paths with
By default, `grpreg` fits a linear regression model with a group lasso penalty. For more detail on other types of models available, see [here](articles/web/models.html). For more detail on other types of penalties available, see [here](articles/web/penalties.html).
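
For instance (a sketch only, not part of the vignette; `y.bin` below is a hypothetical binary outcome), other models and penalties are selected through the `family` and `penalty` arguments:

```{r other-models, eval=FALSE}
# Hypothetical sketch: y.bin is a binary outcome, not part of the example data
fit.mcp   <- grpreg(X, y, group, penalty="grMCP")        # group MCP instead of group lasso
fit.logit <- grpreg(X, y.bin, group, family="binomial")  # penalized logistic regression
```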

Fitting a penalized regression model produces a path of coefficients, which we can plot with

```{r plot, h=4, w=6, small.mar=TRUE}
plot(fit)
6 changes: 4 additions & 2 deletions man/auc.Rd
@@ -20,13 +20,15 @@
predictors are used to guard against overfitting. Thus, the values
returned by \code{AUC.cv.grpsurv} will be lower than those
you would obtain with \code{survConcordance} if you fit the full
(unpenalized) model.}
(unpenalized) model.
}
\references{van Houwelingen H, Putter H (2011). Dynamic Prediction in
Clinical Survival Analysis. CRC Press.}
\author{Patrick Breheny}
\seealso{\code{\link{cv.grpsurv}},
\code{\link[survival]{survConcordance}}}
\examples{
\dontshow{set.seed(1)}
data(Lung)
X <- Lung$X
y <- Lung$y
Expand All @@ -36,5 +38,5 @@ cvfit <- cv.grpsurv(X, y, group, returnY=TRUE)
head(AUC(cvfit))
ll <- log(cvfit$fit$lambda)
plot(ll, AUC(cvfit), xlim=rev(range(ll)), lwd=3, type='l',
xlab=expression(log(lambda)), ylab='AUC')
xlab=expression(log(lambda)), ylab='AUC', las=1)
}
File renamed without changes.
8 changes: 4 additions & 4 deletions man/cv.grpreg.Rd → man/cv-grpreg.Rd
@@ -74,8 +74,8 @@ cv.grpsurv(X, y, group, ..., nfolds=10, seed, fold, se=c('quick',
An object with S3 class \code{"cv.grpreg"} containing:
\item{cve}{The error for each value of \code{lambda}, averaged
across the cross-validation folds.}
\item{cvse}{The estimated standard error associated with each value of
for \code{cve}.}
\item{cvse}{The estimated standard error associated with each value
of \code{cve}.}
\item{lambda}{The sequence of regularization parameter values along
which the cross-validation error was calculated.}
\item{fit}{The fitted \code{grpreg} object for the whole data.}
@@ -87,8 +87,8 @@ cv.grpsurv(X, y, group, ..., nfolds=10, seed, fold, se=c('quick',
\item{lambda.min}{The value of \code{lambda} with the minimum
cross-validation error.}
\item{null.dev}{The deviance for the intercept-only model.}
\item{pe}{If \code{family="binomial"}, the cross-validation prediction
error for each value of \code{lambda}.}
\item{pe}{If \code{family="binomial"}, the cross-validation
prediction error for each value of \code{lambda}.}
}
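
A brief sketch of how the components listed above are typically accessed (illustrative only; `X`, `y`, and `group` stand for any grouped design, and the `coef` default of returning estimates at `lambda.min` is assumed):

```r
cvfit <- cv.grpreg(X, y, group)
cvfit$lambda.min                             # lambda minimizing the cross-validation error
cbind(cvfit$lambda, cvfit$cve, cvfit$cvse)   # error and its standard error along the path
coef(cvfit)                                  # coefficients (assumed to default to lambda.min)
```
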
\author{Patrick Breheny}
\seealso{\code{\link{grpreg}}, \code{\link{plot.cv.grpreg}},
2 changes: 1 addition & 1 deletion man/gBridge.Rd
@@ -66,7 +66,7 @@ gamma=0.5, group.multiplier, warn=TRUE)
bi-level variable selection. \emph{Statistics and its interface},
\strong{2}: 369-380.
}}
\author{Patrick Breheny <patrick-breheny@uiowa.edu>}
\author{Patrick Breheny}
\seealso{\code{\link{grpreg}}}
\examples{
data(Birthwt)
61 changes: 32 additions & 29 deletions man/grpreg.Rd
@@ -108,14 +108,15 @@ tau = 1/3, group.multiplier, warn=TRUE, returnX = FALSE, ...)
parameter of the MCP penalty is denoted 'gamma'. Note, however, that
in Breheny and Huang (2009), \code{gamma} is denoted 'a'.
The objective function is defined to be
\deqn{\frac{1}{2n}RSS + penalty}{RSS/(2*n) + penalty}
for \code{"gaussian"} and
\deqn{-\frac{1}{n} loglik + penalty}{-loglik/n + penalty}
for \code{"binomial"}, where the likelihood is from a traditional
generalized linear model for the log-odds of an event. For logistic
regression models, some care is taken to avoid model saturation; the
algorithm may exit early in this setting.
The objective function for \code{grpreg} optimization is defined to be
\deqn{Q(\beta|X, y) = \frac{1}{n} L(\beta|X, y) +
P_\lambda(\beta)}{Q(\beta|X, y) = (1/n)*L(\beta|X, y) +
P(\beta, \lambda),}
where the loss function L is the deviance (-2 times the log
likelihood) for the specified outcome distribution
(gaussian/binomial/poisson).
\href{http://pbreheny.github.io/ncvreg/articles/web/models.html}{See
here for more details}.
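
For concreteness (an illustrative note, not text from the package files): with the group lasso penalty, the penalty term takes the form
\deqn{P_\lambda(\beta) = \lambda \sum_j m_j \|\beta_j\|,}{P(\beta, \lambda) = \lambda * sum_j m_j * ||\beta_j||,}
where \eqn{\beta_j} is the coefficient vector for group j and \eqn{m_j} is its \code{group.multiplier} (assumed here to default to the square root of the group size).
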
For the bi-level selection methods, a locally approximated coordinate
descent algorithm is employed. For the group selection methods, group
@@ -163,26 +164,28 @@
}
\value{
An object with S3 class \code{"grpreg"} containing:
\item{beta}{The fitted matrix of coefficients. The number of rows is
equal to the number of coefficients, and the number of columns is
equal to \code{nlambda}.}
\item{family}{Same as above.}
\item{group}{Same as above.}
\item{lambda}{The sequence of \code{lambda} values in the path.}
\item{alpha}{Same as above.}
\item{loss}{A vector containing either the residual sum of squares
(\code{"gaussian"}) or negative log-likelihood (\code{"binomial"})
of the fitted model at each value of \code{lambda}.}
\item{n}{Number of observations.}
\item{penalty}{Same as above.}
\item{df}{A vector of length \code{nlambda} containing estimates of
effective number of model parameters at all points along the
regularization path. For details on how this is calculated, see
Breheny and Huang (2009).}
\item{iter}{A vector of length \code{nlambda} containing the number
of iterations until convergence at each value of \code{lambda}.}
\item{group.multiplier}{A named vector containing the multiplicative
constant applied to each group's penalty.}
\describe{
\item{beta}{The fitted matrix of coefficients. The number of rows
is equal to the number of coefficients, and the number of columns
is equal to \code{nlambda}.}
\item{family}{Same as above.}
\item{group}{Same as above.}
\item{lambda}{The sequence of \code{lambda} values in the path.}
\item{alpha}{Same as above.}
\item{loss}{A vector containing either the residual sum of squares
(\code{"gaussian"}) or negative log-likelihood (\code{"binomial"})
of the fitted model at each value of \code{lambda}.}
\item{n}{Number of observations.}
\item{penalty}{Same as above.}
\item{df}{A vector of length \code{nlambda} containing estimates of
effective number of model parameters at all points along the
regularization path. For details on how this is calculated, see
Breheny and Huang (2009).}
\item{iter}{A vector of length \code{nlambda} containing the number
of iterations until convergence at each value of \code{lambda}.}
\item{group.multiplier}{A named vector containing the multiplicative
constant applied to each group's penalty.}
}
}
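
As a quick illustrative sketch (not part of the documentation file) of inspecting these components after a fit:

```r
fit <- grpreg(X, y, group, penalty="grLasso")
dim(fit$beta)          # coefficients x nlambda
fit$lambda[1:5]        # first few lambda values on the path
fit$df[1:5]            # estimated effective number of parameters
fit$group.multiplier   # per-group penalty multipliers
```
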
\references{
\itemize{
@@ -207,7 +210,7 @@
}
}

\author{Patrick Breheny <patrick-breheny@uiowa.edu>}
\author{Patrick Breheny}
\seealso{\code{\link{cv.grpreg}}, as well as
\code{\link[=plot.grpreg]{plot}} and
\code{\link[=select.grpreg]{select}} methods.}
18 changes: 11 additions & 7 deletions man/grpsurv.Rd
@@ -72,16 +72,20 @@ group.multiplier, warn=TRUE, returnX=FALSE, ...)
The sequence of models indexed by the regularization parameter
\code{lambda} is fit using a coordinate descent algorithm. In order
to accomplish this, the second derivative (Hessian) of the Cox partial
log-likelihood is diagonalized (see references for details). The
objective function is defined to be
\deqn{-\frac{1}{n}L(\beta|X,y) + \textrm{penalty},}{-(1/n) L(beta|X,y)
+ penalty(beta),}
where L is the partial log-likelihood from the Cox regression
model.
\deqn{Q(\beta|X, y) = \frac{1}{n} L(\beta|X, y) +
P_\lambda(\beta)}{Q(\beta|X, y) = (1/n)*L(\beta|X, y) +
P(\beta, \lambda),}
where the loss function L is the deviance (-2 times the partial
log-likelihood) from the Cox regression model.
\href{http://pbreheny.github.io/ncvreg/articles/web/models.html}{See
here for more details}.
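
A minimal usage sketch of the model described above (illustrative only; `Lung$group` is assumed, based on the Lung data used elsewhere in this commit):

```r
library(grpreg)
data(Lung)
fit <- grpsurv(Lung$X, Lung$y, Lung$group)   # penalized Cox regression over a lambda path
plot(fit)                                    # grouped coefficient paths
```
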
Presently, ties are not handled by \code{grpsurv} in a particularly
sophisticated manner. This will be improved upon in a future release
of \code{grpreg}.}
of \code{grpreg}.
}
\value{
An object with S3 class \code{"grpsurv"} containing:
\item{beta}{The fitted matrix of coefficients. The number of rows is
@@ -142,7 +146,7 @@ not correspond to the ith row of \code{X}):
\url{http://www.jstatsoft.org/v39/i05}
}
}
\author{Patrick Breheny <patrick-breheny@uiowa.edu>}
\author{Patrick Breheny}
\seealso{\code{\link{plot.grpreg}},
\code{\link{predict.grpsurv}},
\code{\link{cv.grpsurv}},
2 changes: 1 addition & 1 deletion man/logLik.grpreg.Rd → man/logLik-grpreg.Rd
@@ -30,7 +30,7 @@ REML=FALSE, ...)
display correctly. However, it works with 'AIC' and 'BIC' without any
glitches and returns the expected vectorized output.
}
\author{Patrick Breheny <patrick-breheny@uiowa.edu>}
\author{Patrick Breheny}
\seealso{\code{grpreg}}
\examples{
data(Birthwt)
2 changes: 1 addition & 1 deletion man/plot.cv.grpreg.Rd → man/plot-cv-grpreg.Rd
@@ -34,7 +34,7 @@
\code{lambda}. For \code{rsq} and \code{snr}, these confidence
intervals are quite crude, especially near zero, and will hopefully be
improved upon in later versions of \code{grpreg}.}
\author{Patrick Breheny <patrick-breheny@uiowa.edu>}
\author{Patrick Breheny}
\seealso{\code{\link{grpreg}}, \code{\link{cv.grpreg}}}
\examples{
# Birthweight data