A professional Python package for analyzing vegetation indices and detecting hotspots in satellite imagery using advanced geospatial techniques and Microsoft Planetary Computer.
This package provides a comprehensive suite of tools for vegetation analysis using satellite imagery. It specializes in:
- NDVI (Normalized Difference Vegetation Index) Calculation: Compute vegetation indices from multi-spectral satellite data
- Temporal Analysis: Analyze vegetation changes over time with statistical insights
- Hotspot Detection: Identify spatial clusters of high/low vegetation using Getis-Ord Gi* statistics
- Advanced Visualization: Create publication-ready maps and time series plots
- Data Integration: Seamless access to Landsat data via Microsoft Planetary Computer
- Automated Landsat data retrieval from Microsoft Planetary Computer
- Cloud masking and quality filtering
- Multi-temporal data stacking and alignment
- NDVI calculation with temporal statistics
- Getis-Ord Gi* hotspot analysis
- Focal statistics and neighborhood operations
- Edge detection using Sobel filters
- Hotspot persistence tracking
- Interactive time series plots
- Hotspot maps with custom colormaps
- True and false color composites
- Statistical distribution plots
- Publication-ready figures
- Python 3.8 or higher
- Git
# Clone the repository
git clone https://github.com/your-username/NDVI-Hotspot.git
cd NDVI-Hotspot
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
pip install -e .
from ndvi_hotspot import LandsatDataLoader, NDVIProcessor, HotspotAnalyzer, NDVIVisualizer
from datetime import datetime
import matplotlib.pyplot as plt
# Define area of interest (bbox: [minx, miny, maxx, maxy])
bbox = [-120.5, 35.5, -120.0, 36.0] # Central California
start_date = datetime(2023, 4, 1)
end_date = datetime(2023, 9, 30)
# Load satellite data
loader = LandsatDataLoader()
landsat_data = loader.search_and_load(
    bbox=bbox,
    start_date=start_date,
    end_date=end_date,
    max_cloud_cover=20
)
# Calculate NDVI
processor = NDVIProcessor()
ndvi_data = processor.calculate_ndvi(landsat_data)
# Analyze temporal statistics
ndvi_stats = processor.calculate_temporal_stats(ndvi_data)
# Detect hotspots
analyzer = HotspotAnalyzer()
hotspots = analyzer.calculate_hotspots(ndvi_stats['mean'])
# Visualize results
visualizer = NDVIVisualizer()
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
# Plot NDVI time series
visualizer.plot_ndvi_timeseries(ndvi_data, ax=axes[0,0])
# Plot mean NDVI
visualizer.plot_ndvi_map(ndvi_stats['mean'], ax=axes[0,1], title='Mean NDVI')
# Plot hotspots
visualizer.plot_hotspot_map(hotspots, ax=axes[1,0])
# Plot NDVI histogram
visualizer.plot_ndvi_histogram(ndvi_stats['mean'], ax=axes[1,1])
plt.tight_layout()
plt.show()
from ndvi_hotspot import HotspotAnalyzer
import numpy as np
# Initialize analyzer with custom parameters
analyzer = HotspotAnalyzer(
    neighborhood_size=5,
    significance_level=0.05
)
# Calculate hotspots with different kernel types
hotspots_circular = analyzer.calculate_hotspots(
    ndvi_stats['mean'],
    kernel_type='circular'
)
hotspots_square = analyzer.calculate_hotspots(
    ndvi_stats['mean'],
    kernel_type='square'
)
# Analyze hotspot persistence over time
ndvi_timeseries = processor.get_temporal_data(ndvi_data)
persistence = analyzer.analyze_hotspot_persistence(ndvi_timeseries)
# Get summary statistics
stats = analyzer.get_hotspot_stats(hotspots_circular)
print(f"Hot spots: {stats['hot_spots']} pixels")
print(f"Cold spots: {stats['cold_spots']} pixels")
print(f"Significant ratio: {stats['significant_ratio']:.2%}")
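How the summary above relates to a significance level can be illustrated with a small sketch. It assumes the hotspot result is a grid of Gi* z-scores and that significance is judged with a two-sided normal test; the function names `classify_hotspots`, `summarize`, and `persistence` are hypothetical and are not the package's API. Only the dictionary keys mirror the ones printed above.

```python
import numpy as np
from statistics import NormalDist

def classify_hotspots(z_scores, significance_level=0.05):
    """Label pixels 1 (hot), -1 (cold) or 0 (not significant), two-sided test."""
    # Critical z value, about 1.96 for a significance level of 0.05.
    critical = NormalDist().inv_cdf(1 - significance_level / 2)
    z = np.asarray(z_scores, dtype="float64")
    labels = np.zeros(z.shape, dtype="int8")
    labels[z >= critical] = 1
    labels[z <= -critical] = -1
    return labels

def summarize(labels):
    """Count hot and cold pixels and the share of significant pixels."""
    hot = int((labels == 1).sum())
    cold = int((labels == -1).sum())
    return {
        "hot_spots": hot,
        "cold_spots": cold,
        "significant_ratio": (hot + cold) / labels.size,
    }

def persistence(label_stack):
    """Fraction of time steps in which each pixel is hot; input is (time, y, x)."""
    return (np.asarray(label_stack) == 1).mean(axis=0)

# Two hypothetical time steps of Gi* z-scores on a 2x2 grid.
labels_t0 = classify_hotspots([[2.5, 0.3], [-3.0, 1.0]])
labels_t1 = classify_hotspots([[2.1, 0.0], [-1.0, 0.5]])
summary = summarize(labels_t0)
hot_fraction = persistence([labels_t0, labels_t1])
```

A pixel with a hot fraction of 1.0 is hot in every time step; values near 0 indicate a transient or absent cluster.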
Handles data acquisition from Microsoft Planetary Computer's STAC catalog.
loader = LandsatDataLoader()
# Search by coordinates and date range
data = loader.search_and_load(
    bbox=[-120.5, 35.5, -120.0, 36.0],
    start_date=datetime(2023, 4, 1),
    end_date=datetime(2023, 9, 30),
    max_cloud_cover=20
)
# Search by specific scene IDs
scene_ids = ['LC08_L2SP_043034_20230401_02_T1']
data = loader.load_by_scene_ids(scene_ids)
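For readers who want to see what a bbox and date search against the Planetary Computer STAC catalog can look like, here is a minimal sketch using `pystac-client` and `planetary-computer`. It is not the loader's actual implementation: the helper names `build_search_params` and `search_landsat` are hypothetical, and the use of the `landsat-c2-l2` collection with an `eo:cloud_cover` filter is an assumption about how `max_cloud_cover` might be applied.

```python
from datetime import datetime

STAC_URL = "https://planetarycomputer.microsoft.com/api/stac/v1"

def build_search_params(bbox, start_date, end_date, max_cloud_cover):
    """Translate loader-style arguments into STAC search keyword arguments."""
    return {
        "collections": ["landsat-c2-l2"],  # assumed collection
        "bbox": list(bbox),
        "datetime": f"{start_date:%Y-%m-%d}/{end_date:%Y-%m-%d}",
        "query": {"eo:cloud_cover": {"lt": max_cloud_cover}},
    }

def search_landsat(params):
    """Run the search; needs network access plus pystac-client and planetary-computer."""
    # Imported lazily so that building the parameters needs no third-party packages.
    import planetary_computer
    import pystac_client

    catalog = pystac_client.Client.open(
        STAC_URL, modifier=planetary_computer.sign_inplace
    )
    return catalog.search(**params).item_collection()

params = build_search_params(
    [-120.5, 35.5, -120.0, 36.0],
    datetime(2023, 4, 1),
    datetime(2023, 9, 30),
    max_cloud_cover=20,
)
# items = search_landsat(params)  # uncomment to query the live catalog
```

The scene-level cloud filter only removes whole scenes; pixel-level cloud masking is a separate step.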
Calculates vegetation indices and temporal statistics.
processor = NDVIProcessor()
# Calculate NDVI from Landsat data
ndvi = processor.calculate_ndvi(landsat_data)
# Calculate temporal statistics
stats = processor.calculate_temporal_stats(ndvi)
# Returns: mean, std, min, max, median, percentile_25, percentile_75
# Detect vegetation anomalies
anomalies = processor.detect_vegetation_anomalies(ndvi, threshold=2.0)
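The statistics listed above can be reproduced on a plain NumPy stack, which may help when checking results. This sketch assumes an NDVI array of shape (time, y, x) with NaN for masked pixels, and it assumes that `threshold=2.0` means "more than two standard deviations from the per-pixel temporal mean"; the package may define anomalies differently. The function names are hypothetical.

```python
import numpy as np

def temporal_stats(stack):
    """Per-pixel statistics over the time axis, ignoring NaN (masked) values."""
    return {
        "mean": np.nanmean(stack, axis=0),
        "std": np.nanstd(stack, axis=0),
        "min": np.nanmin(stack, axis=0),
        "max": np.nanmax(stack, axis=0),
        "median": np.nanmedian(stack, axis=0),
        "percentile_25": np.nanpercentile(stack, 25, axis=0),
        "percentile_75": np.nanpercentile(stack, 75, axis=0),
    }

def zscore_anomalies(stack, threshold=2.0):
    """Flag observations whose per-pixel z-score exceeds the threshold in magnitude."""
    mean = np.nanmean(stack, axis=0)
    std = np.nanstd(stack, axis=0)
    # Pixels with zero variance produce NaN or inf here and are never flagged.
    with np.errstate(invalid="ignore", divide="ignore"):
        z = (stack - mean) / std
        return np.abs(z) > threshold

# Six time steps over a 1x2 grid; the first pixel jumps to 0.9 in the last step.
stack = np.array([
    [[0.5, 0.2]], [[0.5, 0.3]], [[0.5, 0.2]],
    [[0.5, 0.3]], [[0.5, 0.2]], [[0.9, 0.3]],
])
stats_example = temporal_stats(stack)
anomalies_example = zscore_anomalies(stack, threshold=2.0)
```

With short time series a single outlier can only reach a limited z-score, so a threshold of 2.0 is fairly strict for stacks of fewer than about ten scenes.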
Performs spatial hotspot analysis using Getis-Ord Gi* statistics.
analyzer = HotspotAnalyzer(neighborhood_size=3)
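# Calculate hotspots (ndvi_mean is the 'mean' entry returned by calculate_temporal_stats)
hotspots = analyzer.calculate_hotspots(ndvi_mean)
# Analyze persistence over time (ndvi_timeseries comes from processor.get_temporal_data)
persistence = analyzer.analyze_hotspot_persistence(ndvi_timeseries)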
# Apply Sobel edge detection
edges = analyzer.sobel_edge_detection(ndvi_mean)
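As background for the Sobel step, the sketch below computes a gradient magnitude with the standard 3x3 Sobel kernels in plain NumPy. It is an illustration, not the code behind `sobel_edge_detection`; the function name `sobel_magnitude`, the edge-replicating padding, and the idea of thresholding the result with `EDGE_DETECTION_THRESHOLD` are assumptions.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype="float64")
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(image):
    """Gradient magnitude of a 2-D array using 3x3 Sobel kernels."""
    img = np.asarray(image, dtype="float64")
    rows, cols = img.shape
    # Replicate border pixels so the output has the same shape as the input.
    padded = np.pad(img, 1, mode="edge")
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Accumulate the weighted, shifted copies of the image (cross-correlation).
    for di in range(3):
        for dj in range(3):
            patch = padded[di:di + rows, dj:dj + cols]
            gx += SOBEL_X[di, dj] * patch
            gy += SOBEL_Y[di, dj] * patch
    return np.hypot(gx, gy)

# A vertical step from 0 to 1 between columns 2 and 3.
image = np.zeros((5, 6))
image[:, 3:] = 1.0
edges_example = sobel_magnitude(image)
edge_mask = edges_example > 0.1  # 0.1 mirrors EDGE_DETECTION_THRESHOLD in config.py
```

On an NDVI map, high magnitudes typically trace boundaries between vegetated and bare surfaces, such as field edges.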
Creates comprehensive visualizations for analysis results.
visualizer = NDVIVisualizer()
# Plot NDVI time series
visualizer.plot_ndvi_timeseries(ndvi_data)
# Create hotspot maps
visualizer.plot_hotspot_map(hotspots, title='NDVI Hotspots')
# Generate true color composites
visualizer.plot_true_color(landsat_data, bands=['red', 'green', 'blue'])
# Plot statistical distributions
visualizer.plot_ndvi_histogram(ndvi_data)
NDVI-Hotspot/
├── README.md               # This file
├── requirements.txt        # Package dependencies
├── example_analysis.py     # Complete workflow example
├── Untitled.ipynb          # Original analysis notebook
└── ndvi_hotspot/           # Main package
    ├── __init__.py         # Package initialization
    ├── data_loader.py      # Landsat data acquisition
    ├── ndvi_processor.py   # NDVI calculation and analysis
    ├── hotspot_analyzer.py # Spatial hotspot detection
    ├── visualizer.py       # Plotting and visualization
    ├── utils.py            # Utility functions
    └── config.py           # Configuration and constants
The package includes configurable parameters in ndvi_hotspot/config.py:
# Hotspot analysis parameters
HOTSPOT_SIGNIFICANCE_LEVELS = [0.01, 0.05, 0.1]
DEFAULT_NEIGHBORHOOD_SIZE = 3
EDGE_DETECTION_THRESHOLD = 0.1
# Color schemes for visualization
NDVI_COLORMAP = 'RdYlGn'
HOTSPOT_COLORS = {
    'hot': '#d7191c',
    'cold': '#2c7bb6',
    'not_significant': '#ffffbf'
}
# Memory management
MAX_MEMORY_GB = 8
CHUNK_SIZE_MB = 100
The package generates various types of analysis outputs:
- NDVI Time Series: Track vegetation changes over time
- Hotspot Maps: Spatial clusters of high/low vegetation
- Statistical Summaries: Comprehensive vegetation statistics
- Color Composites: True and false color satellite imagery
- Persistence Analysis: Temporal stability of hotspots
- Track deforestation and reforestation
- Monitor agricultural crop health
- Assess drought impacts on vegetation
- Precision agriculture and yield prediction
- Irrigation management
- Crop stress detection
- Climate change impact studies
- Biodiversity assessments
- Land use change analysis
NDVI is calculated as:
NDVI = (NIR - Red) / (NIR + Red)
Where NIR is near-infrared and Red is red band reflectance.
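The formula translates directly into array code. The following is a minimal NumPy sketch, not the package's `calculate_ndvi`; the function name `ndvi` and the choice to return NaN where NIR + Red is zero are assumptions made for illustration.

```python
import numpy as np

def ndvi(nir, red):
    """Return (NIR - Red) / (NIR + Red), with NaN where the denominator is zero."""
    nir = np.asarray(nir, dtype="float64")
    red = np.asarray(red, dtype="float64")
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical surface reflectance values for a 2x2 patch.
nir_band = np.array([[0.50, 0.30], [0.00, 0.40]])
red_band = np.array([[0.10, 0.30], [0.00, 0.20]])
ndvi_example = ndvi(nir_band, red_band)
```

Values range from -1 to 1; dense green vegetation usually sits well above 0.5, while bare soil and water fall near or below zero.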
The Gi* statistic identifies spatial clusters:
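Gi* = (Σ wij * xj - X̄ * Σ wij) / (S * √[(n * Σ wij² - (Σ wij)²) / (n-1)])
Where wij are the spatial weights between pixels i and j, xj are the attribute values, X̄ and S are the mean and standard deviation of all n values, and each sum runs over j.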
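To make the statistic concrete, here is a small sketch that evaluates Gi* on a raster with binary weights: wij = 1 for every pixel inside a square window around pixel i (including i itself) and 0 elsewhere. This is one possible weighting scheme, chosen for simplicity; the function name `getis_ord_gi_star` and the `radius` parameter are hypothetical, and the package's own kernels may differ.

```python
import numpy as np

def getis_ord_gi_star(values, radius=1):
    """Gi* z-scores on a 2-D grid using binary weights in a square window."""
    x = np.asarray(values, dtype="float64")
    n = x.size
    mean = x.mean()
    s = x.std()  # population standard deviation, as in the Gi* definition
    rows, cols = x.shape
    z = np.zeros_like(x)
    for i in range(rows):
        for j in range(cols):
            # Window is clipped at the grid border, so edge pixels have fewer neighbors.
            window = x[max(i - radius, 0):i + radius + 1,
                       max(j - radius, 0):j + radius + 1]
            w_sum = window.size  # equals both the sum of wij and the sum of wij squared
            numerator = window.sum() - mean * w_sum
            denominator = s * np.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
            z[i, j] = numerator / denominator if denominator > 0 else 0.0
    return z

# An 8x8 grid with a 3x3 block of high values in the top-left corner.
grid = np.zeros((8, 8))
grid[0:3, 0:3] = 1.0
gi_star = getis_ord_gi_star(grid, radius=1)
```

Large positive values mark clusters of high NDVI and large negative values mark clusters of low NDVI. The explicit double loop keeps the correspondence with the formula visible; for large rasters a convolution-based version would be much faster.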
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request
# Install development dependencies
pip install -r requirements.txt
# Run tests
pytest tests/
# Format code
black ndvi_hotspot/
# Type checking
mypy ndvi_hotspot/
This project is licensed under the MIT License - see the LICENSE.txt file for details.
- Microsoft Planetary Computer for providing free access to Landsat data
- stackstac team for efficient satellite data processing tools
- xrspatial developers for spatial analysis capabilities
- Python geospatial community for the excellent ecosystem of tools
For questions, issues, or feature requests:
- Check the Issues page
- Create a new issue with detailed description
- Contact the maintainers
Happy analyzing! 🛰️🌱
This package demonstrates professional Python development practices for geospatial analysis and remote sensing applications.