Overlap with sklearn #361

@amueller


Hey folks!
This looks like a really cool library that I hadn't seen before.
It looks like there are a couple of things we now have upstream in sklearn, and I wanted to note that so you could maybe add it to the docs.

Also, I'd be really curious to see which parts could be upstreamed. Contributing to sklearn can be a bit of a lengthy process, but if some of your tools are generally useful it might be nice to put them into sklearn.

Btw, TrainOnlyTransformerMixin is quite interesting. I was wondering how you do that because the scikit-learn API doesn't allow it (which I dislike), but it seems you're hashing and checking for equality? That's certainly an interesting approach. Ideally I'd like to fix the API so this isn't as awkward.
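To make the "hash and compare" idea concrete, here is a minimal sketch. It is not scikit-lego's actual implementation: the class `TrainOnlyNoise`, the helper `array_hash`, and the noise-adding behaviour are all hypothetical, chosen only to show how a transformer could tell training data from new data by hashing the array seen in `fit`.

```python
import hashlib

import numpy as np


def array_hash(X):
    """Hypothetical helper: hash the raw bytes of X after casting to float."""
    X = np.ascontiguousarray(np.asarray(X, dtype=float))
    return hashlib.sha256(X.tobytes()).hexdigest()


class TrainOnlyNoise:
    """Hypothetical transformer: adds noise only to the data seen in fit."""

    def __init__(self, scale=0.1, seed=0):
        self.scale = scale
        self.seed = seed

    def fit(self, X, y=None):
        # Remember a fingerprint of the training data.
        self.train_hash_ = array_hash(X)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        if array_hash(X) == self.train_hash_:
            # Same bytes as at fit time: treat as training data.
            rng = np.random.default_rng(self.seed)
            return X + rng.normal(scale=self.scale, size=X.shape)
        # Anything else is treated as unseen data and left untouched.
        return X


X_train = np.array([[1.0, 2.0], [3.0, 4.0]])
X_test = np.array([[5.0, 6.0]])

tf = TrainOnlyNoise().fit(X_train)
train_out = tf.transform(X_train)  # perturbed
test_out = tf.transform(X_test)    # returned unchanged
```

The awkward part is visible here: the decision depends on byte-level equality of the input, so a copy of the training data with even one changed value is treated as new data.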

I think EstimatorTransformer is the same as the sklearn VotingClassifier, right? Though maybe that's not super clear from the name?

And you're probably aware that ColumnSelector + FeatureUnion has finally after a long wait been implemented in ColumnTransformer. So this example is actually significantly shorter in sklearn now:
https://scikit-lego.readthedocs.io/en/latest/preprocessing.html#Example-2
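For comparison, a minimal sketch of the `ColumnTransformer` version. The toy DataFrame and its column names are made up for illustration and are not taken from the linked example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data, assumed purely for illustration.
df = pd.DataFrame({
    "height": [1.6, 1.8, 1.7, 1.9],
    "weight": [60.0, 80.0, 70.0, 90.0],
    "city": ["a", "b", "a", "b"],
})

# Each entry is (name, transformer, columns); outputs are concatenated
# column-wise, which replaces the ColumnSelector + FeatureUnion pattern.
ct = ColumnTransformer([
    ("num", StandardScaler(), ["height", "weight"]),
    ("cat", OneHotEncoder(), ["city"]),
])

X = ct.fit_transform(df)  # 2 scaled columns + 2 one-hot columns
```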

Most cases of IdentityTransformer are now handled with the special 'passthrough' string.
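A small sketch of what that looks like inside a `ColumnTransformer`, using a made-up array: one column is scaled and the other is forwarded unchanged via `'passthrough'`.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Toy data, assumed for illustration.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

ct = ColumnTransformer([
    ("scaled", StandardScaler(), [0]),  # column 0 is standardized
    ("raw", "passthrough", [1]),        # column 1 is kept as-is
])

Xt = ct.fit_transform(X)
```

`ColumnTransformer` also accepts `remainder="passthrough"` to forward every column not listed explicitly.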

I thought RandomRegressor is the same as DummyRegressor but apparently we don't have a sampling strategy for DummyRegressor. I think that would be a nice addition if you're interested.
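To illustrate the gap, here is a rough sketch. `DummyRegressor(strategy="mean")` is real sklearn API; `SamplingRegressor` is a hypothetical stand-in for what a sampling strategy might do, and is neither scikit-lego's `RandomRegressor` nor an existing sklearn option.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.dummy import DummyRegressor

# Toy data, assumed for illustration.
X = np.zeros((5, 1))
y = np.array([1.0, 2.0, 3.0, 4.0, 10.0])

# Existing behaviour: every prediction is the training mean.
dummy = DummyRegressor(strategy="mean").fit(X, y)
mean_pred = dummy.predict(X)


class SamplingRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical: predict by sampling from the observed training targets."""

    def __init__(self, random_state=None):
        self.random_state = random_state

    def fit(self, X, y):
        self.y_ = np.asarray(y)
        return self

    def predict(self, X):
        rng = np.random.default_rng(self.random_state)
        # Draw one training target (with replacement) per input row.
        return rng.choice(self.y_, size=len(X))


sampled = SamplingRegressor(random_state=0).fit(X, y).predict(X)
```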
