Real-time edge inference system for species detection and compliance monitoring in harsh deep-sea environments
Features • Installation • Quick Start • API Reference • Architecture • Contributing
DeepSeaGuard Multimodal Edge is an advanced AI system designed for real-time marine species detection and environmental monitoring in challenging deep-sea conditions. This repository provides the core AI model and API service that complements the DeepSeaGuard Dashboard frontend, enabling autonomous underwater vehicles (AUVs) and remotely operated vehicles (ROVs) to make intelligent decisions in real-time.
- Multimodal Data Fusion – Combines video, audio, and sensor data for robust detection
- Species Classification – Identifies fish, cephalopods, marine mammals, and other marine life
- Environmental Context – Incorporates real-time sensor readings for enhanced accuracy
- Edge-Optimized – Designed for deployment on resource-constrained hardware
- Compliance Ready – Outputs align with International Seabed Authority (ISA) standards
- Multimodal Fusion Model – Robust to low visibility and noisy conditions
- Species Detection – Identifies fish, cephalopods, marine mammals, and other tagged species
- Environmental Context – Incorporates turbidity, oxygen, and temperature readings into inference
- Quality Scoring – Predicts "visibility confidence" for each prediction
- Real-time Processing – Sub-second inference on edge hardware
- Edge Ready – Designed for deployment on NVIDIA Jetson and other low-power hardware
- API-first – FastAPI microservice with comprehensive REST endpoints
- Docker Support – Containerized deployment for easy scaling
- Compliance Integration – Outputs align with ISA reporting standards (ISBA/21/LTC/15)
- Modular Architecture – Easy to extend and customize for specific use cases
- Video Feeds – AUV/ROV camera streams with low-light optimization
- Hydrophone Audio – Acoustic detection and species identification
- Environmental Sensors – Turbidity, dissolved oxygen, temperature, pressure
- Marine Life Monitoring – Species proximity and impact alerts
- Python 3.8 or higher
- CUDA-compatible GPU (recommended for training)
- Docker (optional, for containerized deployment)
```bash
# Clone the repository
git clone https://github.com/your-org/deepseaguard-multimodal-edge.git
cd deepseaguard-multimodal-edge

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

```bash
# Build the Docker image
docker build -t deepseaguard-edge .

# Run the container
docker run -p 8000:8000 deepseaguard-edge
```

```bash
# Install development dependencies
pip install -r requirements-dev.txt

# Install pre-commit hooks
pre-commit install

# Run tests
pytest tests/
```
```python
from src.models import MultimodalModel
from src.infer import DeepSeaInference

# Initialize the model
model = MultimodalModel()
inference = DeepSeaInference(model)

# Run inference on new data
result = inference.predict(
    video_path="data/sample_video.mp4",
    audio_path="data/sample_audio.wav",
    sensors={"turbidity": 0.5, "oxygen": 8.2, "temperature": 4.1}
)

print(f"Detected species: {result['species']}")
print(f"Confidence: {result['confidence']}")
```

```bash
# Start the FastAPI server
python src/server.py

# The API will be available at http://localhost:8000
# Interactive docs at http://localhost:8000/docs
```
```bash
# Train with default configuration
python src/train.py

# Train with custom config
python src/train.py --config config/custom_config.yaml
```

Main inference endpoint for species detection.
Request Body:

```json
{
  "video_path": "path/to/video.mp4",
  "audio_path": "path/to/audio.wav",
  "sensors": {
    "turbidity": 0.5,
    "dissolved_oxygen": 8.2,
    "temperature": 4.1,
    "pressure": 101.3
  }
}
```

Response:
```json
{
  "species": "Sebastes mentella",
  "confidence": 0.87,
  "quality_score": 0.92,
  "environmental_context": {
    "visibility_rating": "good",
    "noise_level": "low"
  },
  "timestamp": "2024-01-15T10:30:00Z"
}
```

Health check endpoint.
Response:

```json
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime": "2h 15m"
}
```

Model performance metrics and statistics.
The API returns standard HTTP status codes:
- `200` – Success
- `400` – Bad Request (invalid input)
- `422` – Validation Error
- `500` – Internal Server Error
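For quick manual checks against a running server, the inference endpoint can be called from Python. The snippet below is an illustrative sketch only: the `/predict` route and the `requests` dependency are assumptions rather than something documented in this repository, so adjust the URL and payload to match your deployment.

```python
import requests

# Assumed route for the main inference endpoint; adjust if your deployment differs.
API_URL = "http://localhost:8000/predict"

payload = {
    "video_path": "path/to/video.mp4",
    "audio_path": "path/to/audio.wav",
    "sensors": {
        "turbidity": 0.5,
        "dissolved_oxygen": 8.2,
        "temperature": 4.1,
        "pressure": 101.3,
    },
}

response = requests.post(API_URL, json=payload, timeout=30)
response.raise_for_status()  # Raises for 4xx/5xx responses (e.g. 422 validation errors)

result = response.json()
print(f"{result['species']} ({result['confidence']:.2f}, quality {result['quality_score']:.2f})")
```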
```
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│   Video Input   │   │   Audio Input   │   │  Sensor Input   │
│    (AUV/ROV)    │   │  (Hydrophone)   │   │ (Environmental) │
└────────┬────────┘   └────────┬────────┘   └────────┬────────┘
         │                     │                     │
         ▼                     ▼                     ▼
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│  Video Encoder  │   │  Audio Encoder  │   │ Sensor Encoder  │
│  (CNN/3D CNN)   │   │  (SpectroCNN)   │   │      (MLP)      │
└────────┬────────┘   └────────┬────────┘   └────────┬────────┘
         │                     │                     │
         └─────────────────────┼─────────────────────┘
                               │
                               ▼
                      ┌─────────────────┐
                      │   Fusion Head   │
                      │  (Attention +   │
                      │ Classification) │
                      └────────┬────────┘
                               │
                               ▼
                      ┌─────────────────┐
                      │  Output Layer   │
                      │   (Species +    │
                      │   Confidence)   │
                      └─────────────────┘
```
Video Encoder:
- Architecture: 3D CNN with temporal attention
- Input: Video frames (H×W×T×C)
- Output: Video features (D_video)

Audio Encoder:
- Architecture: Spectrogram CNN with frequency attention
- Input: Audio spectrograms (F×T)
- Output: Audio features (D_audio)

Sensor Encoder:
- Architecture: Multi-layer perceptron
- Input: Environmental sensor readings
- Output: Sensor features (D_sensor)

Fusion Head:
- Architecture: Multi-head attention + classification
- Input: Concatenated features from all encoders
- Output: Species predictions + confidence scores
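As a rough illustration of the fusion stage described above, the following PyTorch-style sketch projects per-modality features to a common size, applies multi-head attention across the modality tokens, and classifies the pooled result. Module names, feature dimensions, and the number of classes are hypothetical and do not mirror the actual `src.models` implementation.

```python
import torch
import torch.nn as nn

class FusionHeadSketch(nn.Module):
    """Illustrative fusion head: attention over modality tokens, then classification."""

    def __init__(self, d_video=512, d_audio=256, d_sensor=64, hidden_dim=512,
                 attention_heads=8, num_classes=150):
        super().__init__()
        # Project each modality to a common hidden size so they can attend to one another.
        self.proj = nn.ModuleList([
            nn.Linear(d, hidden_dim) for d in (d_video, d_audio, d_sensor)
        ])
        self.attn = nn.MultiheadAttention(hidden_dim, attention_heads, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, video_feat, audio_feat, sensor_feat):
        # Stack the three modality embeddings as a short token sequence: (B, 3, hidden_dim)
        tokens = torch.stack(
            [p(f) for p, f in zip(self.proj, (video_feat, audio_feat, sensor_feat))], dim=1
        )
        fused, _ = self.attn(tokens, tokens, tokens)   # self-attention across modalities
        logits = self.classifier(fused.mean(dim=1))    # pool tokens, then classify
        confidence = logits.softmax(dim=-1).max(dim=-1).values
        return logits, confidence

# Example with random features for a batch of 2
head = FusionHeadSketch()
logits, conf = head(torch.randn(2, 512), torch.randn(2, 256), torch.randn(2, 64))
```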
- Accuracy: 94.2% on test set
- Inference Time: <100ms on NVIDIA Jetson Xavier
- Memory Usage: <2GB RAM
- Power Consumption: <15W
- Fish: 150+ species including cod, haddock, redfish
- Cephalopods: Squid, octopus, cuttlefish
- Marine Mammals: Whales, dolphins, seals
- Other: Crustaceans, echinoderms, cnidarians
The system uses YAML configuration files for easy customization:
```yaml
# config.yaml
model:
  video_encoder:
    backbone: "resnet3d"
    pretrained: true
  audio_encoder:
    sample_rate: 44100
    n_mels: 128
  fusion:
    attention_heads: 8
    hidden_dim: 512

training:
  batch_size: 16
  learning_rate: 0.001
  epochs: 100
  optimizer: "adamw"

inference:
  confidence_threshold: 0.7
  max_batch_size: 4
  device: "cuda"
```
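A minimal sketch of reading such a file, assuming PyYAML is available; the path and the way the loaded dictionary is consumed are illustrative rather than a description of how `src/train.py` actually parses its `--config` argument.

```python
import yaml

# Load the YAML configuration into a plain nested dictionary.
with open("config/custom_config.yaml") as f:
    cfg = yaml.safe_load(f)

# Access nested sections with ordinary dict lookups.
batch_size = cfg["training"]["batch_size"]            # 16 in the example above
threshold = cfg["inference"]["confidence_threshold"]  # 0.7 in the example above
print(f"batch_size={batch_size}, confidence_threshold={threshold}")
```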
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run specific test categories
pytest tests/unit/
pytest tests/integration/
pytest tests/performance/
```

The system includes comprehensive monitoring capabilities:
- Performance Metrics: Inference time, accuracy, throughput
- System Health: CPU, memory, GPU utilization
- Model Drift: Detection of performance degradation (see the sketch after this list)
- Alerting: Automated notifications for anomalies
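As one illustration of how model-drift detection could sit on top of the prediction stream, the sketch below tracks a rolling mean of confidence scores and flags a sustained drop. The window size and thresholds are made-up values, and this is not the monitoring code shipped with the project.

```python
from collections import deque

class ConfidenceDriftMonitor:
    """Flags possible model drift when mean confidence falls below a baseline."""

    def __init__(self, window=200, baseline=0.85, tolerance=0.10):
        self.scores = deque(maxlen=window)   # rolling window of recent confidences
        self.baseline = baseline             # expected mean confidence
        self.tolerance = tolerance           # allowed drop before alerting

    def update(self, confidence: float) -> bool:
        """Record one prediction's confidence; return True if drift is suspected."""
        self.scores.append(confidence)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough data yet
        mean = sum(self.scores) / len(self.scores)
        return mean < self.baseline - self.tolerance

monitor = ConfidenceDriftMonitor()
# monitor.update(result["confidence"]) would be called after each inference.
```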
We welcome contributions! Please see our Contributing Guidelines for details.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Add tests for new functionality
- Run the test suite (`pytest`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Follow PEP 8 guidelines
- Use type hints
- Write comprehensive docstrings
- Include unit tests for new features
This project is licensed under the MIT License - see the LICENSE file for details.
- Research Partners: Marine Biology Institute, Ocean Technology Center
- Hardware Partners: NVIDIA, Intel, ARM
- Open Source: PyTorch, FastAPI, OpenCV community
- Data Sources: Oceanographic databases, marine life catalogs
- Documentation: docs.deepseaguard.ai
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
- DeepSeaGuard Dashboard - Web interface
- DeepSeaGuard Mobile - Mobile app
- DeepSeaGuard Cloud - Cloud services
Made with ❤️ for marine conservation

Website • Documentation • Community