Add an example on how to achieve live filtering #709

Open
wants to merge 11 commits into
base: main
43 changes: 43 additions & 0 deletions content/develop/concepts/app-design/dataframes.md
@@ -255,6 +255,49 @@ In addition to column configuration, `st.dataframe` and `st.data_editor` have a
- `column_order` : Pass a list of column labels to specify the order of display.
- `disabled` : Pass a list of column labels to disable them from editing. This lets you avoid disabling them individually.

## Live filtering

You can filter a dataset live by combining `st.dataframe` with input widgets such as `st.select_slider`, `st.text_input`, or `st.multiselect`.
The example below filters a sample DataFrame with all three widgets.
Each widget produces a boolean mask over the rows; the name and skills masks are built with the pandas `apply` method so the matching logic can be customized.
When a filter is left empty, its `lambda` function returns `True` for every row.
As a result, users don't need to provide a value for every filter.

```python
import pandas as pd
import streamlit as st

# Some sample data:
employees = pd.DataFrame([
    {"Name": "Ava Reynolds", "Age": 38, "Skills": ["Python", "Javascript"]},
    {"Name": "Caleb Roberts", "Age": 29, "Skills": ["juggling", "karate", "Python"]},
    {"Name": "Harper Anderson", "Age": 51, "Skills": ["sailing", "French", "Javascript"]},
])

# Create an input element and build a boolean mask over the employees.
# Use >= so that an employee whose age equals the selected minimum is kept.
age_input = st.sidebar.select_slider("Minimum age", options=range(0, 100))
age_filter = employees["Age"] >= age_input

# Filter the name field (case-sensitive), but default to True if the filter is not used
name_input = st.sidebar.text_input("Name")
name_filter = employees["Name"].apply(lambda name: name_input in name if name_input else True)

# Filter the skills, but default to True if no skills are selected
# Options contains all unique values in the multilabel column Skills
skills_input = st.sidebar.multiselect("Skills", options=employees["Skills"].explode().unique())
skills_filter = employees["Skills"].apply(
    # Check whether any of the selected skills are in the row, defaulting to True if nothing is selected
    # To require all of the selected skills instead, replace `any` with `all`
    lambda skills: any(skill in skills for skill in skills_input) if skills_input else True
)

# Combine the three masks and display the data
# Because an unused filter is True for every row, the masks can be combined with the & operator
employees_filtered = employees[age_filter & name_filter & skills_filter]
st.dataframe(employees_filtered, hide_index=True)
```
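
The filtering logic itself does not depend on Streamlit, so it can be moved into a plain function and checked outside the app. The block below is a minimal sketch of that idea, not part of the proposed documentation change: `filter_employees` and its parameters (`min_age`, `name`, `skills`, `match_all`) are hypothetical names, and the case-insensitive name match is an assumption that differs from the case-sensitive check in the example above.

```python
import pandas as pd

# Same sample data as in the example above.
employees = pd.DataFrame([
    {"Name": "Ava Reynolds", "Age": 38, "Skills": ["Python", "Javascript"]},
    {"Name": "Caleb Roberts", "Age": 29, "Skills": ["juggling", "karate", "Python"]},
    {"Name": "Harper Anderson", "Age": 51, "Skills": ["sailing", "French", "Javascript"]},
])


def filter_employees(df, min_age=0, name="", skills=(), match_all=False):
    """Hypothetical helper: return the rows of df that pass every active filter."""
    # Start from a mask that keeps every row; unused filters leave it untouched.
    mask = pd.Series(True, index=df.index)

    # Keep employees at or above the minimum age.
    mask &= df["Age"] >= min_age

    if name:
        # Literal, case-insensitive substring match (assumption: case is ignored).
        mask &= df["Name"].str.contains(name, case=False, regex=False)

    if skills:
        # `any` keeps rows with at least one selected skill, `all` requires every one.
        combine = all if match_all else any
        mask &= df["Skills"].apply(
            lambda row_skills: combine(skill in row_skills for skill in skills)
        )

    return df[mask]


# Require both skills at once instead of either one.
both_skills = filter_employees(employees, skills=["Python", "Javascript"], match_all=True)
```

In an app, the widget values would be passed straight through, for example `st.dataframe(filter_employees(employees, age_input, name_input, skills_input), hide_index=True)`.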


## Handling large datasets

`st.dataframe` and `st.data_editor` have been designed to theoretically handle tables with millions of rows thanks to their highly performant implementation using the glide-data-grid library and HTML canvas. However, the maximum amount of data that an app can realistically handle will depend on several other factors, including: