I'm trying to understand the stopping criterion on the (sub)gradient norm, in order to scale `gradient_tol` correctly.
If I understood correctly, for a quadratic datafit (`family="gaussian"`) and vanishing coefficients, the `grad` argument of `_norm_min_subgrad` should be `X.T @ y / len(y)`.
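For reference, here is a minimal plain-numpy sketch (independent of glum internals) of the least-squares datafit gradient at zero coefficients; up to sign, it matches the `X.T @ y / len(y)` expression above:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 5))
y = rng.standard_normal(10)
n = len(y)

# Least-squares datafit: f(b) = 0.5 / n * ||y - X @ b||**2
# Its gradient: grad_f(b) = -X.T @ (y - X @ b) / n
beta = np.zeros(X.shape[1])
grad = -X.T @ (y - X @ beta) / n

# At beta = 0 this reduces to -X.T @ y / n
assert np.allclose(grad, -X.T @ y / n)
```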
When I debug, I see this being true only when I have previously centered and normalized `X`. Otherwise, even though I have set `fit_intercept=False` in my `GeneralizedLinearRegressor`, I see:

- a `P1` array which is not constant;
- in `update_quadratics()`, `gradient_rows` equal to `y / len(y)` as expected, but `grad = gradient_rows @ data.X` not equal to `X.T @ y / len(y)`.
This seems to be because `X` is a `Mat: <class 'tabmat.dense_matrix.DenseMatrix'> of shape (10, 5). Shift: [0. 0. 0. 0. 0.] Mult: [1.03404146 1.22010398 0.81522932 1.0686763 1.27198591]`, so column scaling happens.

Why is this the case even though `clf.scale_predictors` and `clf._center_predictors` are `False`?
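To illustrate how per-column multipliers change the reported gradient, here is a plain-numpy sketch (assuming, as the printout suggests, that the `Mult` vector scales the columns of `X` before the matrix-vector product):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 5))
y = rng.standard_normal(10)
n = len(y)

# Hypothetical per-column multipliers, playing the role of tabmat's "Mult"
mult = rng.uniform(0.8, 1.3, size=X.shape[1])
X_scaled = X * mult  # column-wise scaling, zero shift

gradient_rows = y / n
grad_scaled = X_scaled.T @ gradient_rows

# The gradient on the scaled matrix is the unscaled gradient,
# rescaled column-wise by mult...
assert np.allclose(grad_scaled, mult * (X.T @ y / n))
# ...so it generally differs from X.T @ y / n unless mult is all ones
assert not np.allclose(grad_scaled, X.T @ y / n)
```

This would explain why the discrepancy vanishes once `X` is normalized beforehand: the multipliers then become (approximately) one.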
To reproduce, print `gradient_rows` and `grad` in `update_quadratics`, and run:

```python
from glum import GeneralizedLinearRegressor
import numpy as np

np.random.seed(0)
X = np.random.randn(10, 5)
y = np.random.randn(10)
X -= X.mean(axis=0)

alpha = 0.001
clf = GeneralizedLinearRegressor(
    alpha=alpha, gradient_tol=1000, fit_intercept=False, family="gaussian",
    l1_ratio=1, verbose=10,
).fit(X, y)

print("grad rows should be", y / len(y))   # it is
print("grad should be", X.T @ y / len(y))  # it is not!
```
If you center and scale `X` beforehand, the problem disappears.