A Python tool for analyzing and visualizing water quality data to support environmental monitoring and community awareness.
The Water Quality Analyzer helps environmental scientists, researchers, and activists analyze water quality datasets. It provides statistical analysis, trend detection, and professional visualizations to support data-driven environmental decisions.
- Statistical Analysis: Calculate mean, median, standard deviation, and percentiles
- Trend Visualization: Plot time-series data with automatic trend lines
- Standards Compliance: Compare measurements against EPA water quality standards
- Location Comparison: Compare water quality across multiple monitoring sites
- Outlier Detection: Identify unusual measurements using box plots
- Flexible Input: Supports both CSV and Excel file formats
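To make the statistical analysis feature concrete, here is a minimal sketch of the kind of summary it describes (mean, median, standard deviation, percentiles). It uses only the standard library and is not the analyzer's own code: the `summarize` helper and the sample readings are hypothetical, and `statistics.quantiles` needs Python 3.8 or newer.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numeric readings."""
    # With n=4, quantiles() returns the three quartile cut points (25%, 50%, 75%).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "p25": q1,
        "p75": q3,
    }


# Made-up pH readings, for illustration only.
ph_readings = [6.8, 7.0, 7.1, 7.2, 7.4, 7.5, 7.9]
summary = summarize(ph_readings)
print(summary)
```

The tool itself may compute percentiles with a different interpolation rule, so small differences from this sketch are possible.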
- Python 3.7 or higher
- pip package manager
- Clone this repository:
git clone https://github.com/yourusername/water-quality-analyzer.git
cd water-quality-analyzer
- Install required dependencies:
pip install -r requirements.txt

from water_quality_analyzer import WaterQualityAnalyzer
# Initialize the analyzer with your data file
analyzer = WaterQualityAnalyzer('water_data.csv')
# Explore your dataset
columns = analyzer.explore_data()
# Generate a comprehensive report
analyzer.generate_report(
    parameter_column='pH',
    date_column='ActivityStartDate',
    location_column='MonitoringLocationName'
)

- Prepare Your Data: Ensure your CSV or Excel file contains:
  - Water quality measurements (pH, dissolved oxygen, temperature, etc.)
  - Date/time stamps
  - Location identifiers (optional)
- Explore the Dataset:
analyzer = WaterQualityAnalyzer('your_data.csv')
columns = analyzer.explore_data()
- Analyze Specific Parameters:
# Clean and analyze data
df = analyzer.clean_data('pH', date_column='Date')
stats = analyzer.analyze_parameter(df, 'pH')
- Create Visualizations:
# Distribution plots
analyzer.plot_distribution(df, 'pH')
# Time series trends
analyzer.plot_trends(df, 'pH', 'Date')
# Location comparisons
analyzer.plot_comparison(df, 'pH', 'Location')

Your input file should contain columns such as:
| Column Name | Description | Example |
|---|---|---|
| pH | pH measurement | 7.2 |
| Temperature | Water temperature (°C) | 18.5 |
| Dissolved Oxygen | DO level (mg/L) | 8.3 |
| ActivityStartDate | Measurement date | 2024-01-15 |
| MonitoringLocationName | Sample location | River Site A |
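As an illustration of the layout above, the following sketch parses a small CSV with those column names using only the standard library. The first data row reuses the example values from the table; the second row and the `read_rows` helper are made up for this example and are not part of the analyzer.

```python
import csv
import io
from datetime import date

# Sample rows following the column layout in the table above.
# The second row is invented for illustration.
SAMPLE_CSV = """pH,Temperature,Dissolved Oxygen,ActivityStartDate,MonitoringLocationName
7.2,18.5,8.3,2024-01-15,River Site A
6.9,17.8,7.9,2024-01-16,River Site B
"""


def read_rows(text):
    """Parse CSV text into a list of dicts with typed values."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "pH": float(raw["pH"]),
            "Temperature": float(raw["Temperature"]),
            "Dissolved Oxygen": float(raw["Dissolved Oxygen"]),
            # Dates are expected in ISO format (YYYY-MM-DD), as in the table.
            "ActivityStartDate": date.fromisoformat(raw["ActivityStartDate"]),
            "MonitoringLocationName": raw["MonitoringLocationName"],
        })
    return rows


rows = read_rows(SAMPLE_CSV)
print(rows[0])
```

Parsing a file this way before running the analyzer is a quick check that the numeric and date columns are in the expected format.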
The analyzer includes EPA water quality standards for:
- pH: 6.5 - 8.5
- Dissolved Oxygen: ≥ 5.0 mg/L
- Temperature: 0 - 30°C
- Turbidity: 0 - 5 NTU
- Nitrate: 0 - 10 mg/L
- Phosphorus: 0 - 0.1 mg/L
These can be customized in the standards dictionary.
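The exact shape of the standards dictionary is not documented here, so the sketch below assumes a simple mapping from parameter name to a `(lower, upper)` pair, with `None` meaning no limit on that side. The names `STANDARDS`, `is_compliant`, and `compliance_rate` are hypothetical; the limits are the ones listed above.

```python
# Assumed layout: parameter -> (lower limit, upper limit); None means "no limit".
STANDARDS = {
    "pH": (6.5, 8.5),
    "Dissolved Oxygen": (5.0, None),  # mg/L, minimum only
    "Temperature": (0.0, 30.0),       # degrees Celsius
    "Turbidity": (0.0, 5.0),          # NTU
    "Nitrate": (0.0, 10.0),           # mg/L
    "Phosphorus": (0.0, 0.1),         # mg/L
}


def is_compliant(parameter, value, standards=STANDARDS):
    """Return True if a single reading falls within the limits for its parameter."""
    low, high = standards[parameter]
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def compliance_rate(parameter, values, standards=STANDARDS):
    """Return the fraction of readings that fall within the limits."""
    if not values:
        raise ValueError("values must not be empty")
    passed = sum(is_compliant(parameter, v, standards) for v in values)
    return passed / len(values)


# Customizing a limit: copy the defaults and override one entry.
custom = {**STANDARDS, "pH": (6.0, 9.0)}

print(compliance_rate("pH", [6.0, 7.0, 8.0, 9.0]))          # default limits
print(compliance_rate("pH", [6.0, 7.0, 8.0, 9.0], custom))  # widened limits
```

Copying the defaults before overriding keeps the built-in limits intact, which is useful when comparing results under different regulatory thresholds.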
The analyzer generates high-resolution PNG images:
- {parameter}_distribution.png - Histogram and box plot
- {parameter}_trend.png - Time series with trend line
- {parameter}_comparison.png - Multi-location comparison
All images are saved at 300 DPI for publication quality.
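For readers who want to reproduce this kind of output outside the tool, here is a rough sketch of writing a histogram and box plot to a 300 DPI PNG. It assumes Matplotlib is installed (the dependency list is not shown in this README), and `save_distribution_plot` is a hypothetical helper, not the analyzer's own function.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt


def save_distribution_plot(values, parameter, out_dir):
    """Save a histogram and box plot side by side as a 300 DPI PNG."""
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

    ax_hist.hist(values, bins=10)
    ax_hist.set_title(f"{parameter} distribution")
    ax_hist.set_xlabel(parameter)
    ax_hist.set_ylabel("Count")

    ax_box.boxplot(values)
    ax_box.set_title(f"{parameter} box plot")

    # File name follows the {parameter}_distribution.png pattern described above.
    path = os.path.join(out_dir, f"{parameter}_distribution.png")
    fig.savefig(path, dpi=300, bbox_inches="tight")
    plt.close(fig)  # release the figure's memory
    return path


# Made-up readings, written to a temporary directory for illustration.
out_dir = tempfile.mkdtemp()
saved_path = save_distribution_plot([6.8, 7.0, 7.1, 7.2, 7.4, 7.5, 7.9], "pH", out_dir)
print(saved_path)
```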
The tool creates three types of plots:
- Distribution Analysis: Shows the frequency distribution and identifies outliers
- Trend Analysis: Displays changes over time with regression lines
- Location Comparison: Compares measurements across different monitoring sites
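The calculations behind the first two plot types can be sketched as follows, under two assumptions that this README does not confirm: that the trend line is an ordinary least-squares fit, and that outliers follow the usual box-plot rule of 1.5 times the interquartile range. The helpers `fit_trend` and `iqr_outliers` are hypothetical, the data are made up, and `statistics.quantiles` needs Python 3.8 or newer.

```python
import statistics


def fit_trend(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def iqr_outliers(values, k=1.5):
    """Return readings outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# x is expressed as days since the first sample so dates become plain numbers.
days = [0, 1, 2, 3, 4]
ph = [7.0, 7.1, 7.2, 7.3, 7.4]
slope, intercept = fit_trend(days, ph)
print(f"trend: {slope:.3f} pH units per day, starting near {intercept:.2f}")

readings = [7.0, 7.1, 7.2, 7.3, 7.4, 9.8]
outliers = iqr_outliers(readings)
print("outliers:", outliers)
```

A positive slope indicates rising values over the sampling period; readings flagged as outliers are worth checking against field notes before they are excluded.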
Contributions are welcome! Please feel free to submit a Pull Request. For major changes:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This repository includes example_data.csv for demonstration purposes.
To analyze your own water quality data:
- Place your CSV file in the project directory
- The .gitignore file will prevent your data from being committed to GitHub
- Run the analyzer with your filename:
WaterQualityAnalyzer('your_data.csv')
Note: Only example_data.csv is tracked in this repository. Your actual data files remain private on your local machine.
This project is licensed under the MIT License - see the LICENSE file for details.
- EPA water quality standards documentation
- Environmental monitoring community
- Open source data visualization libraries
If you encounter any issues or have questions:
- Open an issue on GitHub
- Check existing issues for solutions
- Review the documentation
- Add support for more water quality parameters
- Implement seasonal analysis
- Add interactive dashboard
- Export reports to PDF
- Integration with real-time monitoring systems
- Multi-language support
If you use this tool in your research, please cite:
Water Quality Analyzer (2024). Available at: https://github.com/yourusername/water-quality-analyzer
Made for environmental conservation