Modify Automatic “IN” Filter to Support “LIKE” Search for Categorical Filters #32276

Open
chrisdukeLlama opened this issue Feb 16, 2025 · 15 comments
Labels
dashboard:filtersets Related to the filtersets of the Dashboard

Comments

@chrisdukeLlama

Description:

Currently, Superset automatically applies an “IN” filter when filtering categorical values in dashboards. I want to modify this behavior to use a “LIKE” search instead, but I can’t find where this logic is implemented in the codebase.

Question:
• Where in the Superset code is the logic for automatically applying “IN” filters to categorical searches?
• Is there an existing way to override this behavior, or would this require a code modification?

Why This Matters:
• The current implementation forces exact matches, which is not suitable for my use case.
• A “LIKE” search would allow for partial matching, making filters more flexible, especially for datasets with non-standardized values.
• I understand the performance concerns with LIKE searches on large datasets, but for smaller datasets, this is a much-needed feature.
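To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column name, and values are hypothetical and only illustrate exact versus partial matching. Note that SQLite's LIKE is case-insensitive for ASCII characters by default, so it behaves roughly like ILIKE in PostgreSQL; other databases may differ.

```python
import sqlite3

# Hypothetical table and values, used only to illustrate the two operators.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (vendor TEXT)")
conn.executemany(
    "INSERT INTO tickets (vendor) VALUES (?)",
    [("Acme Corp",), ("ACME corp.",), ("Acme Corporation",), ("Globex",)],
)


def vendors(sql, params):
    """Run a query and return the matching vendor names, sorted."""
    return sorted(row[0] for row in conn.execute(sql, params))


# IN: only rows whose value equals one of the selected values.
exact = vendors("SELECT vendor FROM tickets WHERE vendor IN (?)", ["Acme Corp"])

# LIKE with wildcards: any row that contains the search term.
partial = vendors("SELECT vendor FROM tickets WHERE vendor LIKE ?", ["%acme%"])

print(exact)
print(partial)
```

With non-standardized values like these, the IN filter finds a single spelling, while the LIKE filter picks up all three variants of the vendor name.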

Additional Context:
• I read that the Filter Box would have allowed this, but it has been removed.
• I followed previous discussions about keeping it and understand the arguments, but right now, there is no alternative that works for my case.
• I appreciate Apache Superset because it displays text much better than Grafana or Metabase, but for text data, the lack of flexible filtering is a big limitation.

Would appreciate any pointers on which file or function controls this behavior.

@chrisdukeLlama chrisdukeLlama added the sip Superset Improvement Proposal label Feb 16, 2025

dosubot bot commented Feb 16, 2025

Hey @chrisdukeLlama! I'm here to help you with any bugs, questions, or contributions you have for Apache Superset. Let me know how I can assist you!

The logic for automatically applying filters to categorical searches in Superset is located in the superset-frontend/src/filters/components/Select/buildQuery.ts file. The code applies an ILIKE filter for string types, which is similar to LIKE but case-insensitive. If you want to use LIKE instead of ILIKE, you would need to modify the code to change the op value from 'ILIKE' to 'LIKE' [1].

Additionally, the backend logic in superset/models/helpers.py handles different filter operators, including LIKE and ILIKE, in the SQLAlchemy query construction. This suggests that the backend already supports both operators, and you might be able to leverage this by adjusting the frontend logic to specify the desired operator [2].
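As a rough illustration of what "handling different filter operators" can look like, here is a minimal sketch that maps an operator name to a parameterized SQL fragment. This is not Superset's actual code: the function build_clause, its signature, and the LOWER()-based emulation of ILIKE are assumptions made for this example.

```python
def build_clause(column, op, value):
    """Return (sql_fragment, params) for one filter.

    Hypothetical helper for illustration only. `column` is interpolated
    into the SQL text, so it must come from a trusted allow-list of
    column names; only `value` is passed as a bound parameter.
    """
    if op == "IN":
        values = list(value)
        if not values:
            raise ValueError("IN requires at least one value")
        placeholders = ", ".join("?" for _ in values)
        return f"{column} IN ({placeholders})", values
    if op == "LIKE":
        return f"{column} LIKE ?", [value]
    if op == "ILIKE":
        # Portable emulation for databases without a native ILIKE.
        return f"LOWER({column}) LIKE LOWER(?)", [value]
    raise ValueError(f"unsupported operator: {op}")


print(build_clause("vendor", "IN", ["Acme Corp", "Globex"]))
print(build_clause("vendor", "ILIKE", "%acme%"))
```

The point of the sketch is that once the backend can build all three clauses, the choice between exact and partial matching comes down to which operator the frontend sends.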

To continue talking to Dosu, mention @dosu.



@dosubot dosubot bot added the dashboard:filtersets Related to the filtersets of the Dashboard label Feb 16, 2025
@chrisdukeLlama
Author

@dosu

Apologies for my imprecise previous question.
Problem:
• The current filtering system defaults to categorical "IN" searches.
• I need a text search filter that supports "LIKE" or "ILIKE".
• "ILIKE" does not seem to be applied automatically as mentioned in previous responses.

Questions:
• Where in the code does Superset apply "IN" for filters?
• How can I modify the filtering logic to allow text-based LIKE searches instead of IN?
• Is there any planned feature or workaround for full-text search in dashboards?


dosubot bot commented Feb 16, 2025

The logic that defaults to using "IN" for categorical filters in Superset is implemented in the onSubjectChange function within the useSimpleTabFilterProps hook. If the operator is not defined, it defaults to using the "IN" operator [1].

To modify the filtering logic to allow text-based "LIKE" searches instead of "IN", you would need to adjust this logic to set the operator to "LIKE" or "ILIKE" based on your requirements. This would involve changing the default operator in the relevant part of the code where the filter is applied.

As for planned features or workarounds for full-text search in dashboards, I wasn't able to find specific information in the repository. You might need to implement custom logic or modifications to achieve full-text search capabilities.


@chrisdukeLlama
Author

@dosu Thanks for confirming where "IN" is set!

  1. Why does Superset default to "IN" for categorical filters, even when filtering text fields?

  2. Shouldn’t it automatically detect TEXT columns and apply "LIKE" or "ILIKE" instead?

  3. Would a PR to introduce an optional "LIKE" search mode be accepted?

    Many users need partial matching for text data. The old Filter Box allowed it, but since its removal, there’s no built-in way to filter text dynamically.


dosubot bot commented Feb 16, 2025

  1. The rationale behind Superset defaulting to "IN" for categorical filters, even when filtering text fields, is not explicitly documented in the repository. However, using "IN" is generally efficient for exact matches, which is often suitable for categorical data.

  2. Superset does have logic to handle different data types and apply appropriate operators. The useSimpleTabFilterProps function includes logic to determine relevant operators based on column types, but it doesn't automatically switch to "LIKE" or "ILIKE" for text columns by default [1].

  3. While I can't predict the acceptance of a pull request, introducing an optional "LIKE" search mode could be a valuable enhancement for users needing partial matching for text data. It would be beneficial to align with the project's goals and guidelines when proposing such a feature.


@chrisdukeLlama
Author

@dosu Thanks. I decided that the easiest way to solve my problem might be to downgrade to an older version of Superset that still includes the Filter Box. Which version would that be? That would work with LIKE search on text columns, right?


dosubot bot commented Feb 16, 2025

The Filter Box component was removed in Apache Superset version 4.0.0. Therefore, you would need to downgrade to a version prior to 4.0.0 to use the Filter Box [1][2]. The Filter Box supported both 'LIKE' and 'ILIKE' searches on text columns, allowing for partial and case-insensitive matching [3][4][5].


@chrisdukeLlama
Author

@dosu So the elderly are right when they say that everything was better back in the day.


dosubot bot commented Feb 16, 2025

It's often the case that older versions have features that some users find indispensable. If the Filter Box meets your needs, reverting to a version before 4.0.0 could be a practical solution for now. Just keep in mind that older versions might lack newer features or security updates. If you decide to go this route, ensure your environment is secure and consider contributing to the project if you have ideas for improvements. If you need further assistance with the downgrade or any other questions, feel free to ask!


@chrisdukeLlama
Author

@dosu The data world would be much easier for me as well if there were only categorical data, but there isn't.


dosubot bot commented Feb 16, 2025

Indeed, working with diverse data types can present challenges, especially when it comes to filtering and searching. If you have any more questions or need further assistance with Superset or data handling, feel free to reach out. I'm here to help!


@chrisdukeLlama
Author

chrisdukeLlama commented Feb 17, 2025

@dosu you said: The Filter Box component was removed in Apache Superset version 4.0.0. Therefore, you would need to downgrade to a version prior to 4.0.0 to use the Filter Box [1][2]. The Filter Box supported both 'LIKE' and 'ILIKE' searches on text columns, allowing for partial and case-insensitive matching [3][4][5].

But in 3.0.2 I only get these options:
• Date filter
• Instant filtering
• Show time grain dropdown
• Show time column
• Limit selector values

So how do I get a free-text filter?


dosubot bot commented Feb 17, 2025

The Filter Box component was last available in Apache Superset version 3.x before its removal in version 4.0.0 [1]. If you're using version 3.0.2 and not seeing the expected Filter Box options, it might be due to configuration settings or specific dashboard setups.

To enable free text filtering, you might need to ensure that the Filter Box is properly configured in your dashboard. If the Filter Box is not appearing as expected, consider checking the dashboard's configuration or exploring other available filter components that might offer similar functionality. If the Filter Box is not meeting your needs, you may need to explore custom solutions or modifications to achieve the desired filtering capabilities.


@rusackas
Member

In previous discussions about this, the various implementations of LIKE/ILIKE were shot down because they can cause a huge performance regression on large datasets. I can't recall how hard it would be, but it may be possible to add a checkbox when configuring a filter to enable/disable the fuzzier search options, including patterns like %x, %x%, and x%. It may warrant a bit more discussion, though, as I remember there being some strong opinions about it. I assure you it'd be better to find a way to add the feature to the current-era Superset rather than going back major versions and losing all the progress the project has made!
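For reference, the three wildcard placements could be sketched roughly as below. This is only an illustration under stated assumptions, not an existing Superset API: the helper like_pattern and its mode names are hypothetical, and the demo table is made up. The sketch also escapes any % or _ typed by the user so they are matched literally, which relies on the database supporting the ESCAPE clause (SQLite, used here through Python's built-in sqlite3 module, does).

```python
import sqlite3


def like_pattern(term, mode="contains", escape="\\"):
    """Build a LIKE pattern for a user-supplied search term.

    Hypothetical helper. Modes: "ends_with" gives %x, "contains" gives
    %x%, and "starts_with" gives x%.
    """
    # Escape the escape character first, then the LIKE wildcards.
    escaped = (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )
    if mode == "ends_with":
        return "%" + escaped
    if mode == "contains":
        return "%" + escaped + "%"
    if mode == "starts_with":
        return escaped + "%"
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical demo data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (label TEXT)")
conn.executemany(
    "INSERT INTO items (label) VALUES (?)",
    [("50% off",), ("5000 units",), ("under_score",)],
)

# The user's "%" is escaped, so it matches only a literal percent sign.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT label FROM items WHERE label LIKE ? ESCAPE '\\'",
        [like_pattern("50%")],
    )
]
print(matches)
```

A leading wildcard (%x and %x%) usually prevents the database from using an ordinary index on the column, which is the main source of the performance concern on large datasets; the x% form is typically the cheapest of the three.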

@chrisdukeLlama
Author

@rusackas
Thanks a lot for all that information. I can imagine the difficulties with huge datasets, but for small datasets an option would be great.
I am of course not keen on going back major versions, but getting things to work in 3.0.2 with the Filter Box took about an hour. I am working with a framework of several components, Apache Superset being only one of them, so I am not sure I can free up the time to understand its structure well enough, even partially, to modify it myself.
It is great software, don't get me wrong, and I understand that people do not want to go down the path towards more unstructured data, even though I think that in times of LLMs it would be a great addition (but of course an extremely laborious one).

@betodealmeida betodealmeida removed the sip Superset Improvement Proposal label Feb 21, 2025