[BUG] Unable to run MFE for datasets of more than ~500 features #130

Open
schmitcn opened this issue Jul 8, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@schmitcn

schmitcn commented Jul 8, 2023

Describe the bug
Running MFE with the general group on a dataset with more than roughly 500 features raises RecursionError: maximum recursion depth exceeded while calling a Python object.

To Reproduce
Steps to reproduce the behavior:

        from pymfe.mfe import MFE

        mfe = MFE(groups=["general"])
        mfe.fit(X, y)  # X has more than ~500 features; this call raises the RecursionError

Expected behavior
Generate the general meta-features.

Screenshots
N/A

Desktop (please complete the following information):

  • OS: macOS
  • Version: Ventura 13.4

Additional context
The stack trace is as follows:

  File "[...]/lib/python3.8/site-packages/patsy/desc.py", line 400, in eval
    result = self._evaluators[key](self, tree)
  File "[...]/lib/python3.8/site-packages/patsy/desc.py", line 233, in _eval_binary_plus
    left_expr = evaluator.eval(tree.args[0])
  File "[...]/lib/python3.8/site-packages/patsy/desc.py", line 400, in eval
    result = self._evaluators[key](self, tree)
  File "[...]/lib/python3.8/site-packages/patsy/desc.py", line 233, in _eval_binary_plus
    left_expr = evaluator.eval(tree.args[0])
  File "[...]/lib/python3.8/site-packages/patsy/desc.py", line 394, in eval
    assert isinstance(tree, ParseNode)
RecursionError: maximum recursion depth exceeded while calling a Python object

The failure comes from patsy and seems to be related to an issue already reported in their repo. It has not been fixed and the maintainers do not intend to fix it, since formulaic, the successor of patsy, already solves it. My suggestion would be to migrate to formulaic, as patsy is no longer under active development (as stated in its README).
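
To illustrate why the threshold sits near 500 features, here is a minimal sketch of the pattern visible in the stack trace: an evaluator that alternates between an eval step and a binary-plus step while walking down the left side of a long a + b + c + ... chain. This is a toy model, not patsy's actual code; Node, build_left_nested_sum, eval_node, eval_binary_plus and fits_in_stack are hypothetical names. Under that assumption each extra term costs two Python frames, so a recursion limit of 1000 (the CPython default) would be exhausted at around 500 terms.

```python
import sys
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Node:
    """Toy parse-tree node (hypothetical): a leaf name or a binary '+'."""
    kind: str                      # "name" or "+"
    name: str = ""
    args: Tuple["Node", ...] = ()


def build_left_nested_sum(n_terms):
    """Build ((x0 + x1) + x2) + ... iteratively, so building never recurses."""
    tree = Node("name", name="x0")
    for i in range(1, n_terms):
        tree = Node("+", args=(tree, Node("name", name=f"x{i}")))
    return tree


def eval_node(tree):
    """Dispatch on the node kind, mirroring the 'eval' frames in the trace."""
    if tree.kind == "+":
        return eval_binary_plus(tree)
    return [tree.name]


def eval_binary_plus(tree):
    """Evaluate both operands; the left one is where the depth accumulates."""
    left = eval_node(tree.args[0])
    right = eval_node(tree.args[1])
    return left + right


def fits_in_stack(n_terms):
    """Return True if a sum of n_terms evaluates without a RecursionError."""
    try:
        eval_node(build_left_nested_sum(n_terms))
        return True
    except RecursionError:
        return False


if __name__ == "__main__":
    limit = sys.getrecursionlimit()
    print("recursion limit:", limit)
    print("10 terms fit:", fits_in_stack(10))
    print(f"{limit} terms fit:", fits_in_stack(limit))
```

The exact cut-off depends on how many frames are already on the stack when the formula is evaluated, which would explain why the limit is "around" 500 rather than a fixed number.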

schmitcn added the bug label on Jul 8, 2023
@schmitcn
Author

Hi @ealcobaca, @FelSiq,

Are there any plans to address this anytime soon? If not, that's fine; I just need to know for project planning purposes (so that we can look for a different tool).

Best regards.

@FelSiq
Collaborator

FelSiq commented Jul 20, 2023

Hi @schmitcn

Sorry for the delay. We won't be addressing this issue soon, but there might be a solution.

Did you try using mfe.fit(X, ..., transform_cat="one-hot")?
This should avoid using the patsy dependency, and will provide an alternative method for encoding categorical variables.

Thanks for your feedback.

Best regards,
Felipe.
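
For readers who want to try the suggested workaround, a minimal sketch follows. It assumes the package is pymfe (imported as pymfe.mfe) and wraps the call in a hypothetical helper, extract_general_one_hot, which is not part of the library. Whether this avoids the RecursionError on wide datasets has not been confirmed in this thread.

```python
def extract_general_one_hot(X, y, groups=("general",)):
    """Fit MFE with one-hot categorical encoding and return {name: value}.

    Hypothetical helper for illustration; X and y are the caller's feature
    matrix and target, in whatever array-like form MFE.fit accepts.
    """
    # Lazy import, so defining this helper does not itself require pymfe.
    from pymfe.mfe import MFE

    mfe = MFE(groups=list(groups))
    # transform_cat="one-hot" is the option suggested in the comment above;
    # it is expected to sidestep the patsy-based encoding.
    mfe.fit(X, y, transform_cat="one-hot")
    # extract() returns the meta-feature names and their values.
    names, values = mfe.extract()
    return dict(zip(names, values))
```

Calling extract_general_one_hot(X, y) with the same X and y as in the reproduction steps would show whether the general meta-features can be computed this way.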
