glum 2.0.0
Breaking changes:
- Renamed the package to `glum`! Hurray! Celebration.
- `GeneralizedLinearRegressor` and `GeneralizedLinearRegressorCV` lose the `fit_dispersion` parameter. Please use the `dispersion` method of the appropriate family instance instead.
- All functions now use `sample_weight` as a keyword instead of `weights`, in line with scikit-learn.
- All functions now use `dispersion` as a keyword instead of `phi`.
- Several methods of `GeneralizedLinearRegressor` and `GeneralizedLinearRegressorCV` that should have been private now have an underscore prefixed to their names: `_tear_down_from_fit`, `_set_up_for_fit`, `_set_up_and_check_fit_args`, `_get_start_coef`, `_solve` and `_solve_regularization_path`.
- `glum.GeneralizedLinearRegressor.report_diagnostics` and `glum.GeneralizedLinearRegressor.get_formatted_diagnostics` are now public.
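
The two keyword renames are mechanical, so calling code can be migrated with a small shim. The following is a minimal sketch only: `migrate_kwargs` and `RENAMED_KEYWORDS` are hypothetical names introduced here for illustration and are not part of glum.

```python
# Hypothetical migration helper (not part of glum): maps the keyword names
# used before 2.0.0 to the names used from 2.0.0 onwards.
RENAMED_KEYWORDS = {"weights": "sample_weight", "phi": "dispersion"}


def migrate_kwargs(kwargs):
    """Return a copy of ``kwargs`` with pre-2.0.0 keyword names replaced."""
    migrated = {}
    for name, value in kwargs.items():
        new_name = RENAMED_KEYWORDS.get(name, name)
        if new_name in migrated:
            # Both the old and the new spelling were passed; refuse to guess.
            raise TypeError(f"got both old and new spelling of {new_name!r}")
        migrated[new_name] = value
    return migrated


# Keyword arguments as they might have been written against the old names.
old_call = {"weights": [1.0, 2.0], "phi": 1.5, "alpha": 0.1}
new_call = migrate_kwargs(old_call)
print(new_call)
```

Keywords that were not renamed (here `alpha`) pass through unchanged, and mixing old and new spellings raises an error instead of silently picking one.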
New features:
- `P1` and `P2` now accept a 1d array with as many elements as there are columns in the unexpanded design matrix. In this case, the penalty associated with a categorical feature will be expanded to as many elements as there are levels, all with the same value.
- `ExponentialDispersionModel` gains a `dispersion` method.
- `BinomialDistribution` and `TweedieDistribution` gain a `log_likelihood` method.
- The `fit` method of `GeneralizedLinearRegressor` and `GeneralizedLinearRegressorCV` now saves the column types of pandas data frames.
- `GeneralizedLinearRegressor` and `GeneralizedLinearRegressorCV` gain two properties: `family_instance` and `link_instance`.
- `GeneralizedLinearRegressor.std_errors` and `GeneralizedLinearRegressor.covariance_matrix` have been added and support non-robust, robust (HC-1), and clustered covariance matrices.
- `GeneralizedLinearRegressor` and `GeneralizedLinearRegressorCV` now accept `family='gaussian'` as an alternative to `family='normal'`.
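
The penalty expansion described for `P1` and `P2` can be illustrated with plain NumPy. This is a sketch of the idea, not glum's implementation: `expand_penalty` is a hypothetical helper, and it assumes that every level of a categorical feature gets its own column in the expanded design matrix.

```python
import numpy as np


def expand_penalty(penalty, levels_per_feature):
    """Repeat each per-feature penalty once per expanded column.

    Hypothetical helper for illustration. ``levels_per_feature`` holds 1 for
    a numeric feature and the number of levels for a categorical feature.
    """
    penalty = np.asarray(penalty, dtype=float)
    levels = np.asarray(levels_per_feature, dtype=int)
    if penalty.ndim != 1 or penalty.shape != levels.shape:
        raise ValueError("need one penalty per unexpanded feature")
    return np.repeat(penalty, levels)


# Three features: numeric, categorical with three levels, numeric.
p1 = [0.1, 0.5, 0.2]
levels = [1, 3, 1]
expanded = expand_penalty(p1, levels)
print(expanded.tolist())  # [0.1, 0.5, 0.5, 0.5, 0.2]
```

The categorical feature's single penalty of 0.5 is applied to each of its three level columns, so the expanded vector has five elements.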
Bug fix:
- The `score` method of `GeneralizedLinearRegressor` and `GeneralizedLinearRegressorCV` now accepts data frames.
- Upgraded the code to use tabmat 3.0.0.
Other:
- A major overhaul of the documentation. Everything is better!
- The methods of the link classes now return scalars when given scalar inputs. Previously, under certain circumstances, they returned zero-dimensional arrays.
- There is a new benchmark available for `glm_benchmarks_run`, based on the Boston housing dataset. See here.
- `glm_benchmarks_analyze` now includes `offset` in the index. See here.
- `glmnet_python` was removed from the benchmarks suite.
- The innermost coordinate descent was optimized. This speeds up coordinate-descent-dominated problems like LASSO by about 1.5-2x. See here.
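
The scalar-in, scalar-out behaviour of the link methods can be sketched as follows. This is not glum's code: `log_link` is a hypothetical stand-in for a link method, written only to show the difference between a scalar and a zero-dimensional array.

```python
import numpy as np


def log_link(mu):
    """Hypothetical log link: scalar in, scalar out; array in, array out."""
    out = np.log(np.asarray(mu, dtype=float))
    # np.isscalar is True for Python and NumPy scalars, False for arrays.
    return float(out) if np.isscalar(mu) else out


# A zero-dimensional array looks like a scalar but is still an ndarray.
zero_dim = np.asarray(2.0)
print(type(zero_dim).__name__, zero_dim.ndim)

scalar_result = log_link(1.0)
array_result = log_link(np.array([1.0, 4.0]))
print(type(scalar_result).__name__, array_result.shape)
```

Returning a plain `float` for scalar input keeps downstream code from having to special-case objects that report `ndim == 0`.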