A Python tool for multi-dimensional Pareto frontier selection with advanced data processing (folder-based CSV loading, groupby processing, IDX-based multi-value selection) and an optional interactive Plotly dashboard.
- Folder-Based Data Loading: Load one or many CSV files from a directory and concatenate automatically
- Groupby Processing: Optimize the Pareto frontier per group (e.g., by FP, STD)
- Multi-Value Variables with IDXs: Select from list-of-lists using IDX1/IDX2 bin edges (value- or index-based)
- Per-Feature Tolerances: Absolute or relative epsilon values
- Pareto Selection: Platypus EpsilonBoxArchive (see the sketch after this list)
- Optional Dashboard: 2D/3D scatter, tolerance overlays, interactive tooltips
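For orientation, the sketch below shows how a Platypus `EpsilonBoxArchive` can be fed candidates with one epsilon per objective, mirroring the per-feature tolerance idea. It is illustrative only, not this project's internal code, and the objective values are made up.

```python
# Illustrative only: per-objective epsilons with Platypus's EpsilonBoxArchive.
from platypus import EpsilonBoxArchive, Problem, Solution

# Two minimized objectives (think AREA and DYNAMIC); values below are made up.
problem = Problem(0, 2)                   # 0 decision variables, 2 objectives
problem.directions[:] = Problem.MINIMIZE

archive = EpsilonBoxArchive([1.0, 0.15])  # one epsilon (tolerance) per objective

for objectives in [[10.0, 0.50], [10.5, 0.48], [12.0, 0.30]]:
    solution = Solution(problem)
    solution.objectives[:] = objectives
    archive.add(solution)

for solution in archive:                  # surviving epsilon-non-dominated candidates
    print(list(solution.objectives))
```

Candidates that fall into the same epsilon box collapse to a single representative, which is why coarser tolerances yield a smaller reported front.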
No manual setup is required. The tcsh scripts in scripts/ will create and manage the environment for you.
- Open `scripts/run_pareto.sh` and set the `config` variable to your config path.
- Run the pipeline using tcsh: `tcsh scripts/run_pareto.sh`
Notes:
- `scripts/run_pareto.sh` is the stable entry point. It sets up the Python environment and executes the pipeline for you.
- `scripts/run_pareto_uv.sh` is experimental/unstable at the moment; do not use it.
- If you need to enable the dashboard (e.g., `--visualize` or `--dashboard-only`), edit the `python_cmd` line inside the script to append the desired flags.
Run only via the tcsh scripts: `tcsh scripts/run_pareto.sh`. Optional flags like `--visualize`, `--dashboard-only`, or `--bind-all` should be appended inside the script by editing its `python_cmd` line. Direct `python main.py` commands are not recommended and have been removed from this README.
```yaml
run:
  # Input path: a single CSV file or a directory of CSV files
  input_dir: path/to/data_or_dir
  # Use input filename as prefix for output files
  use_input_prefix: true
  # Output settings
  output_processed_data: true

# Optional: dashboard settings (server does NOT auto-start from config)
visualization:
  dashboard_port: 8050
  dashboard_host: localhost

# Optional: groupby columns for grouped Pareto selection
groupby_columns:
  - FP
  - STD

data:
  AREA:
    objective: minimize
    variable:
      type: single
    tolerance:
      type: absolute
      value: 1.0
  DYNAMIC:
    objective: minimize
    variable:
      type: multi
      selection_strategy: value   # or: index
      idx1_value: 0.25            # value- or index-based depending on strategy
      idx2_value: 2.5
    tolerance:
      type: relative
      value: 0.15
```

Notes:
- The dashboard never auto-starts from config; use `--visualize` or `--dashboard-only`.
- `run.input_dir` may point to a directory; all CSVs within are loaded and concatenated (see the sketch below).
- For legacy configs, `run.input_file` is still accepted.
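For reference, the directory handling described in the notes above boils down to concatenating every CSV under `run.input_dir`. A minimal pandas sketch of that behavior (the path is a placeholder; this is not the tool's internal loader):

```python
# Illustrative sketch of folder-based CSV loading (not the tool's internal loader).
from pathlib import Path
import pandas as pd

input_path = Path("path/to/data_or_dir")  # placeholder; mirrors run.input_dir
csv_paths = sorted(input_path.glob("*.csv")) if input_path.is_dir() else [input_path]

# Read each CSV and concatenate into one frame for downstream Pareto processing.
df = pd.concat((pd.read_csv(p) for p in csv_paths), ignore_index=True)
print(f"loaded {len(csv_paths)} file(s), {len(df)} rows")
```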
- Expected CSV columns (example): `FP,STD,AREA,OUT_CAP,LEAKAGE,DYNAMIC_IDX1,DYNAMIC_IDX2,DYNAMIC,TRANSITION_IDX1,TRANSITION_IDX2,TRANSITION,DELAY_IDX1,DELAY_IDX2,DELAY`
- FP, STD: Groupby columns (strings)
- AREA, OUT_CAP, LEAKAGE: Single-value variables (numeric)
- DYNAMIC, TRANSITION, DELAY: Multi-value variables (list of lists)
- _IDX1, _IDX2: Bin edge arrays for IDX-based value selection
- IDX-based selection uses the IDX1/IDX2 bin edges to find the appropriate element `variable[i][j]`
- Supports both index and value strategies (`selection_strategy`); see the sketch after this list
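To make the value-based strategy concrete, here is a small illustrative sketch using numpy `searchsorted` lookups. The bin-edge arrays and matrix are invented placeholders, and the tool's exact edge-handling rules may differ.

```python
# Illustrative value-based IDX selection; the tool's exact edge handling may differ.
import numpy as np

idx1_edges = np.array([0.0, 0.1, 0.25, 0.5, 1.0])  # placeholder DYNAMIC_IDX1 edges
idx2_edges = np.array([0.5, 1.0, 2.5, 5.0])        # placeholder DYNAMIC_IDX2 edges
matrix = np.arange(len(idx1_edges) * len(idx2_edges), dtype=float).reshape(
    len(idx1_edges), len(idx2_edges)
)                                                  # stand-in for the DYNAMIC list of lists

def select_value(idx1_value: float, idx2_value: float) -> float:
    """Map requested IDX1/IDX2 values to a row/column via the bin edges."""
    i = int(np.clip(np.searchsorted(idx1_edges, idx1_value), 0, matrix.shape[0] - 1))
    j = int(np.clip(np.searchsorted(idx2_edges, idx2_value), 0, matrix.shape[1] - 1))
    return float(matrix[i, j])

print(select_value(0.25, 2.5))  # the idx1_value/idx2_value from the sample config
```

With `selection_strategy: index`, the requested numbers would instead be used directly as row/column indices rather than looked up against the edges.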
- `vis_data.npz`: Saved next to the config or in `--run-dir`; used by `--dashboard-only`
- `pareto_results.csv`: Pareto front summary with columns ordered as `candidate_index`, then, for each feature in `feature_names`: `FEATURE`, `FEATURE_IDX1`, `FEATURE_IDX2` (IDX columns appear immediately after their feature when applicable)
- `processed_data.csv`: Written when `--processed-output` is provided or `run.output_processed_data` is true
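To inspect these outputs afterwards, a short snippet like the one below works. The paths follow the sample run layout shown in the project structure; the arrays inside `vis_data.npz` are simply listed because their names are not documented here.

```python
# Quick look at the generated artifacts (paths follow the sample run layout).
import numpy as np
import pandas as pd

run_dir = "runs/sample_run_1"
pareto = pd.read_csv(f"{run_dir}/sample_run_1_pareto_results.csv")
print(pareto.columns.tolist())  # candidate_index, then FEATURE[, _IDX1, _IDX2] per feature
print(pareto.head())

vis = np.load(f"{run_dir}/vis_data.npz", allow_pickle=True)
print(vis.files)                # names of the arrays consumed by --dashboard-only
```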
- 2D/3D scatter plots, feature selection, tolerance visualization, interactive tooltips, statistics panel
- Use `--bind-all` to expose the dashboard on your network (0.0.0.0)
```
ParetoFilter/
├── requirements.txt
├── README.md
├── main.py                  # CLI and entry point
├── src/
│   ├── preprocess.py        # Data preprocessing and tolerance handling
│   ├── pareto.py            # Pareto frontier selection logic
│   ├── visualization.py     # Plotly dashboard implementation
│   └── utils.py             # Utility functions
├── runs/
│   ├── sample_run_1/
│   │   ├── config.yaml      # Sample config
│   │   ├── sample_run_1_processed_data.csv
│   │   ├── sample_run_1_pareto_results.csv
│   │   └── vis_data.npz
│   └── ...
```
This project targets Python 3.11 and uses type hints throughout. Follow the coding standards in the Cursor Rules for consistency.