Skip to content

Commit eca4f3c

Browse files
Kamal Sai DevarapalliKamal Sai Devarapalli
authored andcommitted
feat: Add health check utility and improve project documentation
- Add comprehensive health check script for monitoring all microservices - Add CHANGELOG.md for better project tracking - Update README with recent improvements and health check usage - Enhance .gitignore with additional patterns for better dev experience - Improve code documentation in utility scripts
1 parent fe4c462 commit eca4f3c

File tree

5 files changed

+338
-3
lines changed

5 files changed

+338
-3
lines changed

.gitignore

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -99,3 +99,59 @@ Archive/
9999
# Test results (keep directory structure but ignore JSON files)
100100
results/*.json
101101
!results/README.md
102+
103+
# Bazel
104+
.bazel/
105+
bazel-*
106+
107+
# Coverage reports
108+
.coverage.*
109+
coverage.xml
110+
*.cover
111+
.hypothesis/
112+
113+
# Profiling
114+
*.prof
115+
116+
# mypy
117+
.mypy_cache/
118+
.dmypy.json
119+
dmypy.json
120+
121+
# Pyre type checker
122+
.pyre/
123+
124+
# pytype static type analyzer
125+
.pytype/
126+
127+
# Celery
128+
celerybeat-schedule
129+
celerybeat.pid
130+
131+
# SageMath parsed files
132+
*.sage.py
133+
134+
# Environments
135+
.env.venv
136+
.env.*.local
137+
138+
# mkdocs documentation
139+
/site
140+
141+
# IDEs
142+
*.sublime-project
143+
*.sublime-workspace
144+
.vscode/*
145+
!.vscode/settings.json
146+
!.vscode/tasks.json
147+
!.vscode/launch.json
148+
!.vscode/extensions.json
149+
150+
# macOS
151+
.AppleDouble
152+
.LSOverride
153+
154+
# Windows
155+
Thumbs.db
156+
ehthumbs.db
157+
Desktop.ini

CHANGELOG.md

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
# Changelog
2+
3+
All notable changes to the EventStreamMonitor project will be documented in this file.
4+
5+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7+
8+
## [Unreleased]
9+
10+
### Added
11+
- Comprehensive health check utility script for monitoring all services
12+
- Enhanced documentation with CHANGELOG.md for better project tracking
13+
- Improved .gitignore patterns for better development experience
14+
- Better code documentation and type hints in utility scripts
15+
16+
### Improved
17+
- Enhanced README with better formatting and additional information
18+
- Improved error handling in test scripts
19+
- Better service health monitoring capabilities
20+
21+
### Changed
22+
- Updated project structure documentation
23+
- Enhanced development workflow documentation
24+
25+
## [1.0.0] - 2024-12-19
26+
27+
### Added
28+
- Initial release of EventStreamMonitor
29+
- Real-time microservices monitoring platform
30+
- Kafka-based event streaming for log processing
31+
- Live monitoring dashboard with real-time updates
32+
- Redis caching for performance optimization
33+
- Database connection pooling optimized for high-throughput workloads
34+
- Gunicorn-based deployment with multi-worker configuration
35+
- Docker containerization for all services
36+
- Service-level error breakdown and statistics
37+
- Comprehensive test suite
38+
- Load testing scripts
39+
- Integration tests for all services
40+
41+
### Services
42+
- User Management Service (Port 5001)
43+
- Task Processing Service (Port 5002)
44+
- Notification Service (Port 5003)
45+
- Log Monitor Service (Port 5004)
46+
47+
### Documentation
48+
- Architecture documentation
49+
- Setup guides for all components
50+
- Performance configuration guides
51+
- Bazel build system documentation
52+
- Contributing guidelines
53+

README.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -162,6 +162,9 @@ docker-compose up
162162
# Run dry run tests
163163
python3 scripts/dry_run_tests.py
164164

165+
# Run health check
166+
python3 scripts/health_check.py
167+
165168
# Run integration tests
166169
cd tests
167170
python3 -m pytest integration/
@@ -182,6 +185,13 @@ Contributions are welcome! Please read our [Contributing Guidelines](CONTRIBUTIN
182185

183186
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
184187

188+
## Recent Updates
189+
190+
- **Enhanced Health Monitoring**: New comprehensive health check utility script
191+
- **Improved Documentation**: Added CHANGELOG.md for better project tracking
192+
- **Better Development Experience**: Enhanced .gitignore and development tools
193+
- **Code Quality**: Improved documentation and type hints across the codebase
194+
185195
## Acknowledgments
186196

187197
- Apache Kafka for event streaming
@@ -192,3 +202,7 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
192202
## Contact
193203

194204
For questions or suggestions, please open an issue on GitHub.
205+
206+
## Changelog
207+
208+
See [CHANGELOG.md](CHANGELOG.md) for a detailed list of changes and updates.

scripts/health_check.py

Lines changed: 194 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,194 @@
1+
#!/usr/bin/env python3
2+
"""
3+
Health Check Utility for EventStreamMonitor Services
4+
5+
This script provides comprehensive health checking for all microservices,
6+
including connectivity, health endpoints, and service status monitoring.
7+
"""
8+
import requests
9+
import socket
10+
import time
11+
import sys
12+
from typing import Dict, List
13+
from datetime import datetime
14+
from dataclasses import dataclass
15+
16+
17+
@dataclass
18+
class ServiceHealth:
19+
"""Service health status container"""
20+
name: str
21+
url: str
22+
port: int
23+
is_reachable: bool
24+
is_healthy: bool
25+
response_time: float
26+
status_code: int = 0
27+
error_message: str = ""
28+
29+
30+
SERVICES = {
31+
"usermanagement": {
32+
"url": "http://localhost:5001",
33+
"port": 5001,
34+
"health_endpoint": "/health",
35+
"name": "User Management"
36+
},
37+
"taskprocessing": {
38+
"url": "http://localhost:5002",
39+
"port": 5002,
40+
"health_endpoint": "/health",
41+
"name": "Task Processing"
42+
},
43+
"notification": {
44+
"url": "http://localhost:5003",
45+
"port": 5003,
46+
"health_endpoint": "/health",
47+
"name": "Notification"
48+
},
49+
"logmonitor": {
50+
"url": "http://localhost:5004",
51+
"port": 5004,
52+
"health_endpoint": "/health",
53+
"name": "Log Monitor"
54+
}
55+
}
56+
57+
58+
def check_port_connectivity(host: str, port: int, timeout: int = 3) -> bool:
59+
"""
60+
Check if a port is open and accessible
61+
62+
Args:
63+
host: Hostname or IP address
64+
port: Port number
65+
timeout: Connection timeout in seconds
66+
67+
Returns:
68+
True if port is accessible, False otherwise
69+
"""
70+
try:
71+
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
72+
sock.settimeout(timeout)
73+
result = sock.connect_ex((host, port))
74+
sock.close()
75+
return result == 0
76+
except (socket.error, OSError):
77+
return False
78+
79+
80+
def check_service_health(service_config: Dict) -> ServiceHealth:
81+
"""
82+
Perform comprehensive health check for a service
83+
84+
Args:
85+
service_config: Service configuration dictionary
86+
87+
Returns:
88+
ServiceHealth object with health status
89+
"""
90+
name = service_config.get("name", "Unknown")
91+
url = service_config["url"]
92+
port = service_config["port"]
93+
health_endpoint = service_config.get("health_endpoint", "/health")
94+
95+
# Check port connectivity
96+
is_reachable = check_port_connectivity("localhost", port)
97+
98+
# Check health endpoint
99+
is_healthy = False
100+
response_time = 0.0
101+
status_code = 0
102+
error_message = ""
103+
104+
if is_reachable:
105+
try:
106+
start_time = time.time()
107+
response = requests.get(
108+
f"{url}{health_endpoint}",
109+
timeout=5
110+
)
111+
response_time = (time.time() - start_time) * 1000 # Convert to ms
112+
status_code = response.status_code
113+
is_healthy = response.status_code == 200
114+
except requests.exceptions.Timeout:
115+
error_message = "Request timeout"
116+
except requests.exceptions.ConnectionError:
117+
error_message = "Connection refused"
118+
except (requests.exceptions.RequestException, ValueError) as e:
119+
error_message = str(e)
120+
else:
121+
error_message = "Port not accessible"
122+
123+
return ServiceHealth(
124+
name=name,
125+
url=url,
126+
port=port,
127+
is_reachable=is_reachable,
128+
is_healthy=is_healthy,
129+
response_time=response_time,
130+
status_code=status_code,
131+
error_message=error_message
132+
)
133+
134+
135+
def print_health_status(health: ServiceHealth):
136+
"""
137+
Print formatted health status for a service
138+
139+
Args:
140+
health: ServiceHealth object
141+
"""
142+
status_icon = "✓" if health.is_healthy else "✗"
143+
status_text = "HEALTHY" if health.is_healthy else "UNHEALTHY"
144+
145+
print(f"\n{status_icon} {health.name}")
146+
print(f" URL: {health.url}")
147+
print(f" Port: {health.port}")
148+
print(f" Status: {status_text}")
149+
150+
if health.is_reachable:
151+
print(f" Response Time: {health.response_time:.2f}ms")
152+
print(f" Status Code: {health.status_code}")
153+
else:
154+
print(f" Error: {health.error_message}")
155+
156+
157+
def main():
158+
"""Main health check function"""
159+
print("=" * 70)
160+
print("EVENTSTREAMMONITOR - SERVICE HEALTH CHECK")
161+
print("=" * 70)
162+
print(f"Timestamp: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
163+
print("=" * 70)
164+
165+
health_results: List[ServiceHealth] = []
166+
167+
# Check all services
168+
for service_config in SERVICES.values():
169+
health = check_service_health(service_config)
170+
health_results.append(health)
171+
print_health_status(health)
172+
173+
# Summary
174+
print("\n" + "=" * 70)
175+
print("SUMMARY")
176+
print("=" * 70)
177+
178+
healthy_count = sum(1 for h in health_results if h.is_healthy)
179+
total_count = len(health_results)
180+
181+
print(f"Services Healthy: {healthy_count}/{total_count}")
182+
183+
if healthy_count == total_count:
184+
print("\n✓ All services are healthy and operational!")
185+
return 0
186+
else:
187+
print("\n✗ Some services are not healthy. Check logs with:")
188+
print(" docker-compose logs [service-name]")
189+
return 1
190+
191+
192+
if __name__ == "__main__":
193+
sys.exit(main())
194+

scripts/quick_stream_errors.py

Lines changed: 21 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,17 @@
11
#!/usr/bin/env python3
22
"""
3-
Quick script to stream error events directly to Kafka
4-
No dependencies on local imports - pure Kafka producer
3+
Quick script to stream error events directly to Kafka.
4+
5+
This script sends predefined error and critical log events to Kafka for testing
6+
and demonstration purposes. It connects directly to Kafka without requiring
7+
local service dependencies.
8+
9+
Usage:
10+
python3 scripts/quick_stream_errors.py
11+
12+
Requirements:
13+
- Kafka must be running (docker-compose up -d kafka)
14+
- kafka-python package installed
515
"""
616
from kafka import KafkaProducer
717
import json
@@ -54,7 +64,15 @@
5464

5565

5666
def stream_errors():
57-
"""Stream error events to Kafka"""
67+
"""
68+
Stream predefined error events to Kafka.
69+
70+
Connects to Kafka broker, sends error events to the 'application-logs-errors'
71+
topic, and provides feedback on the streaming process.
72+
73+
Returns:
74+
int: Exit code (0 for success, 1 for failure)
75+
"""
5876
print("=" * 60)
5977
print("Streaming Error Events to Kafka")
6078
print("=" * 60)

0 commit comments

Comments
 (0)