Added description of app
bqd39 committed Aug 13, 2024
1 parent 17c0404 commit 432b981
Showing 1 changed file with 24 additions and 7 deletions.
31 changes: 24 additions & 7 deletions examples/dash/plotly-large-dataset/README.md
@@ -1,23 +1,40 @@
# Plotting large datasets in Dash

Interactive Dash applications that display a figure of flight date and time (24h) versus flight delay (in minutes). You can select the date range you want to visualize in the `resampler` and `combined` apps.

The applications plot large datasets using one of:
- [**WebGL**](https://plotly.com/python/webgl-vs-svg/) (in `webgl` folder): a technology that uses the GPU to accelerate rendering, so figures with many points draw faster. This method is generally suited to figures with up to 100,000-200,000 markers (the term for data points in charts), depending on the power of your GPU. For larger figures, it is usually better to aggregate the data points first.

- [**`plotly-resampler`**](https://github.com/predict-idlab/plotly-resampler) (in `resampler` folder): an external library that dynamically aggregates time-series data according to the current graph view. This approach downsamples your dataset at the cost of losing some detail.

- Combined approach (in `combined` folder).
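
To illustrate what "aggregating according to the current graph view" means, here is a minimal sketch in plain Python. It is not the algorithm `plotly-resampler` uses internally; it is a simple min/max bucket aggregation, and the function name `downsample_view` and its parameters are hypothetical. The idea is that only the points inside the visible x-range are considered, and each bucket keeps its lowest and highest point so spikes survive the downsampling.

```python
from bisect import bisect_left, bisect_right


def downsample_view(xs, ys, x_min, x_max, n_buckets):
    """Min/max bucket aggregation of the points inside [x_min, x_max].

    Hypothetical helper for illustration only. `xs` must be sorted in
    ascending order. Returns at most 2 * n_buckets (x, y) pairs.
    """
    # Restrict to the points that are visible in the current view.
    lo = bisect_left(xs, x_min)
    hi = bisect_right(xs, x_max)
    n = hi - lo
    if n <= 2 * n_buckets:
        # Few enough points in view: return them all, no detail is lost.
        return list(zip(xs[lo:hi], ys[lo:hi]))

    kept = []
    for b in range(n_buckets):
        # Split the visible points into n_buckets contiguous slices.
        start = lo + b * n // n_buckets
        end = lo + (b + 1) * n // n_buckets
        idx = range(start, end)
        i_min = min(idx, key=ys.__getitem__)
        i_max = max(idx, key=ys.__getitem__)
        # Keep the extremes of the bucket, in x order.
        for i in sorted({i_min, i_max}):
            kept.append((xs[i], ys[i]))
    return kept
```

Zooming in shrinks the visible range, so the same point budget covers fewer raw points and more detail reappears; once the view holds few enough points, nothing is aggregated at all.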

We will be using a commercial flight dataset that documents information such as flight delays in the first half of 2006 (January 1 to June 30). You can find it [here](https://github.com/vega/falcon/blob/master/data/flights-3m.csv). For the purpose of this project, we will focus on plotting departure delays.

Once you download the dataset, run `python csv-clean.py flights-3m.csv` to obtain the cleaned CSV file `flights-3m-cleaned.csv`. Move the cleaned file to the `data` folder in any of the project folders (`webgl`, `resampler` or `combined`) you want to test.

## Description

On their home pages, the apps display a scatter plot of departure delay (in minutes) for around 3 million flights, as shown in the screenshots below. You can select the date range you want to visualize in the `resampler` and `combined` apps.

- `webgl`

![](static/app_webgl.png)

- `resampler`

![](static/app_resampler.png)

- `combined`

![](static/app_combined.png)
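
The date-range selection offered by the `resampler` and `combined` apps amounts to filtering the flights before plotting. A minimal sketch of that step follows, assuming each record is a dictionary with a `departure` timestamp and a `dep_delay` value; both field names and the function `filter_by_date` are hypothetical and are not taken from the apps' source code.

```python
from datetime import date, datetime


def filter_by_date(rows, start, end):
    """Return the rows whose departure date lies in [start, end], inclusive.

    Hypothetical helper: `rows` is a list of dicts with a `departure`
    datetime, an assumed layout used only for this illustration.
    """
    return [r for r in rows if start <= r["departure"].date() <= end]


# Example records (made-up values, for illustration only).
flights = [
    {"departure": datetime(2006, 1, 15, 8, 30), "dep_delay": 12},
    {"departure": datetime(2006, 3, 2, 17, 5), "dep_delay": -3},
    {"departure": datetime(2006, 6, 30, 23, 50), "dep_delay": 45},
]

# Keep only the flights that departed in March 2006.
march = filter_by_date(flights, date(2006, 3, 1), date(2006, 3, 31))
```

Filtering first keeps the number of markers sent to the figure as small as the selected range allows.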

You can also click and drag on the graph to zoom into any region.

![](static/zoom_in.gif)

To revert the figure to its original state, click the `Reset axes` button in the upper-right corner of the figure.

![](static/zoom_out.gif)


## Local testing
