TaskGrid+ is a containerized, distributed task processing system that allows dynamic task distribution across multiple worker nodes. The system uses gRPC for communication and provides a name service for dynamic worker discovery.
┌──────────────┐     ┌──────────────┐     ┌───────────────┐
│              │     │              │     │    Worker     │
│    Client    ├────►│  Dispatcher  ├────►│   (reverse)   │
│              │     │              │     │               │
└──────────────┘     └───────┬──────┘     └───────────────┘
                             │            ┌───────────────┐
                             │            │    Worker     │
                     ┌───────▼──────┐     │     (sum)     │
                     │              │     │               │
                     │  Nameservice ├────►└───────────────┘
                     │              │     ┌───────────────┐
                     └───────┬──────┘     │    Worker     │
                             │            │    (hash)     │
                     ┌───────▼──────┐     │               │
                     │              │     └───────────────┘
                     │  Monitoring  │     ┌───────────────┐
                     │              │     │    Worker     │
                     └──────────────┘     │    (upper)    │
                                          │               │
                                          └───────────────┘
                                          ┌───────────────┐
                                          │    Worker     │
                                          │    (wait)     │
                                          │               │
                                          └───────────────┘
Client:
- Port: N/A (runs as needed)
- Responsibilities:
  - Sends task requests to dispatcher
  - Retrieves task results
  - Provides task status monitoring
- Key Methods:
  - send_task(type, payload)
  - request_result(task_id)
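A minimal sketch of how a client wrapper around these two methods might look. The `DispatcherStub` interface and its `SubmitTask`/`GetResult` calls are hypothetical names chosen for illustration (the real stub would be generated from the project's proto file), and `InMemoryStub` is only a local stand-in so the example runs without a dispatcher.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class DispatcherStub(Protocol):
    """Hypothetical interface; the real stub would be generated by gRPC."""

    def SubmitTask(self, task_type: str, payload: str) -> int: ...
    def GetResult(self, task_id: int) -> Optional[str]: ...


@dataclass
class Client:
    stub: DispatcherStub

    def send_task(self, task_type: str, payload: str) -> int:
        """Submit a task to the dispatcher and return its task id."""
        return self.stub.SubmitTask(task_type, payload)

    def request_result(self, task_id: int) -> Optional[str]:
        """Return the stored result, or None if the id is unknown or pending."""
        return self.stub.GetResult(task_id)


class InMemoryStub:
    """Stand-in dispatcher that completes "upper" tasks immediately."""

    def __init__(self) -> None:
        self._results = {}

    def SubmitTask(self, task_type: str, payload: str) -> int:
        task_id = len(self._results) + 1
        self._results[task_id] = payload.upper() if task_type == "upper" else payload
        return task_id

    def GetResult(self, task_id: int) -> Optional[str]:
        return self._results.get(task_id)
```

With the stand-in, `Client(InMemoryStub()).send_task("upper", "hello")` returns an id that can then be passed to `request_result`.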
Dispatcher:
- Port: 50052
- Responsibilities:
  - Task queue management
  - Worker discovery via nameservice
  - Task distribution
  - Result collection and storage
- Key Features:
  - Asynchronous task processing
  - Load balancing
  - Error handling and retries
  - Task status tracking
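The queueing, load balancing, and status tracking described above can be sketched roughly as follows. This is a simplified, synchronous illustration, not the project's implementation: workers are plain callables instead of gRPC calls, round-robin is assumed as the balancing strategy, retries are omitted, and the status strings `pending`, `completed`, and `failed` are assumed values.

```python
import itertools
from collections import deque
from typing import Callable, Dict, List

WorkerCall = Callable[[str], str]  # stand-in for a remote gRPC call to one worker


class Dispatcher:
    def __init__(self, workers: Dict[str, List[WorkerCall]]) -> None:
        # One round-robin iterator per task type spreads load across its workers.
        self._round_robin = {t: itertools.cycle(ws) for t, ws in workers.items() if ws}
        self._queue = deque()
        self._ids = itertools.count(1)
        self.tasks: Dict[int, dict] = {}

    def submit(self, task_type: str, payload: str) -> int:
        """Queue a task and return its id; the task starts out as "pending"."""
        task_id = next(self._ids)
        self.tasks[task_id] = {"type": task_type, "payload": payload,
                               "result": "", "status": "pending"}
        self._queue.append(task_id)
        return task_id

    def run_once(self) -> None:
        """Drain the queue, handing each task to the next worker of its type."""
        while self._queue:
            task = self.tasks[self._queue.popleft()]
            workers = self._round_robin.get(task["type"])
            try:
                if workers is None:
                    raise LookupError(f"no worker registered for {task['type']!r}")
                task["result"] = next(workers)(task["payload"])
                task["status"] = "completed"
            except Exception as exc:
                # Record the failure so the client can see what went wrong.
                task["result"] = str(exc)
                task["status"] = "failed"
```

Keeping one iterator per task type means two `reverse` workers alternate independently of how busy the `sum` workers are.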
Nameservice:
- Port: 50051
- Responsibilities:
  - Worker registration
  - Service discovery
  - Address resolution
- Key Features:
  - Dynamic worker registration/deregistration
  - Type-based worker lookup
  - Health checking (planned)
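As an illustration of type-based lookup, the registry could be modelled as a mapping from worker type to a set of addresses. The class and method names below are assumptions for this sketch, as are the example addresses; the real service exposes these operations over gRPC.

```python
from collections import defaultdict
from typing import Dict, List, Set


class NameService:
    """In-memory registry mapping a worker type to the addresses serving it."""

    def __init__(self) -> None:
        self._workers: Dict[str, Set[str]] = defaultdict(set)

    def register(self, worker_type: str, address: str) -> None:
        self._workers[worker_type].add(address)

    def deregister(self, worker_type: str, address: str) -> None:
        self._workers[worker_type].discard(address)
        if not self._workers[worker_type]:
            # Drop empty entries so lookups only see types with live workers.
            del self._workers[worker_type]

    def lookup(self, worker_type: str) -> List[str]:
        """Return the addresses registered for a type, sorted for stable output."""
        return sorted(self._workers.get(worker_type, ()))


ns = NameService()
ns.register("reverse", "worker-reverse:50060")  # hypothetical host name
print(ns.lookup("reverse"))
```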
Workers:
- Ports: 50060-50064
- Types:
  - Reverse Worker (50060)
    - String reversal operations
    - Example: "hello" → "olleh"
  - Sum Worker (50061)
    - Numerical array summation
    - Example: [1,2,3] → 6
  - Hash Worker (50062)
    - SHA256 hash calculation
    - Example: "text" → "982d9e3..."
  - Upper Worker (50063)
    - Text uppercase conversion
    - Example: "hello" → "HELLO"
  - Wait Worker (50064)
    - Simulated processing delays
    - Example: "2" → wait 2 seconds
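The five worker operations are small enough to sketch as plain functions. This is an illustration under a few assumptions: payloads and results are strings (matching the task structure), the sum worker receives its array JSON-encoded, and the wait worker's return message is invented for the example.

```python
import hashlib
import json
import time


def reverse(payload: str) -> str:
    return payload[::-1]


def total(payload: str) -> str:
    # Assumes the array arrives as a JSON-encoded string such as "[1,2,3]".
    return str(sum(json.loads(payload)))


def sha256_hex(payload: str) -> str:
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def upper(payload: str) -> str:
    return payload.upper()


def wait(payload: str) -> str:
    # The payload is the delay in seconds, e.g. "2".
    seconds = float(payload)
    time.sleep(seconds)
    return f"waited {seconds:g} seconds"


# Task type -> handler, mirroring the worker types listed above.
HANDLERS = {"reverse": reverse, "sum": total, "hash": sha256_hex,
            "upper": upper, "wait": wait}

print(HANDLERS["reverse"]("hello"))
print(HANDLERS["sum"]("[1,2,3]"))
```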
Monitoring:
- Ports:
  - gRPC: 50053
  - REST: 8080
- Responsibilities:
  - System statistics collection
  - Performance monitoring
  - Health checking
- Metrics:
  - Active workers
  - Pending tasks
  - Average processing time
  - Worker-specific statistics
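One way these metrics could be derived from task records and the worker registry is sketched below. The function name, the output keys, and the status strings `pending` and `completed` are assumptions; the timestamp fields follow the task structure used elsewhere in this document, and their unit is taken to be seconds.

```python
from collections import Counter
from typing import Dict, Iterable, List


def summarize(tasks: Iterable[dict], workers: Dict[str, List[str]]) -> dict:
    """Aggregate the listed metrics from task records and registered workers."""
    tasks = list(tasks)
    done = [t for t in tasks if t["status"] == "completed"]
    durations = [t["timestamp_completed"] - t["timestamp_created"] for t in done]
    return {
        "active_workers": sum(len(addrs) for addrs in workers.values()),
        "pending_tasks": sum(1 for t in tasks if t["status"] == "pending"),
        "avg_processing_time": sum(durations) / len(durations) if durations else 0.0,
        # Per-type completion counts as a simple worker-specific statistic.
        "completed_by_type": dict(Counter(t["type"] for t in done)),
    }
```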
- Task Submission:
    Client → Dispatcher → Nameservice → Worker → Dispatcher → Client
- Worker Registration:
    Worker → Nameservice (register)
    Dispatcher → Nameservice (lookup)
- Monitoring:
    Monitoring ← Nameservice (worker stats)
    Monitoring ← Dispatcher (task stats)
All services operate within a Docker network (taskgrid) with the following characteristics:
- Network Type: Bridge
- Service Discovery: Docker DNS
- Inter-container Communication: gRPC
- External Access: Exposed ports
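To illustrate the difference between Docker DNS inside the network and exposed ports outside it, the helper below builds a gRPC target string for either case. The service host names are hypothetical (they would be whatever names the Compose file assigns), and the port mapping assumes the ports listed for each component in this document.

```python
# Ports as listed per component; host names are assumed Compose service names.
SERVICES = {"nameservice": 50051, "dispatcher": 50052, "monitoring": 50053}


def grpc_target(service: str, in_docker: bool) -> str:
    """Return "host:port" for a service.

    Inside the taskgrid network the service name resolves through Docker DNS;
    from the host machine the exposed port is reached via localhost.
    """
    host = service if in_docker else "localhost"
    return f"{host}:{SERVICES[service]}"


print(grpc_target("dispatcher", in_docker=True))
print(grpc_target("dispatcher", in_docker=False))
```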
Task structure:
{
  "id": int,
  "type": string,
  "payload": string,
  "result": string,
  "status": string,
  "timestamp_created": int64,
  "timestamp_completed": int64
}
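The task structure maps naturally onto a Python dataclass. This is a sketch rather than the project's actual type: the default values, the `complete` helper, and the status strings are assumptions, and timestamps are taken to be Unix seconds.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass
class Task:
    id: int
    type: str
    payload: str
    result: str = ""
    status: str = "pending"          # assumed initial status value
    timestamp_created: int = 0
    timestamp_completed: int = 0

    def complete(self, result: str) -> None:
        """Store the result and stamp the completion time."""
        self.result = result
        self.status = "completed"    # assumed status value
        self.timestamp_completed = int(time.time())

    def to_json(self) -> str:
        """Serialize with the same field names as the structure above."""
        return json.dumps(asdict(self))


task = Task(id=1, type="upper", payload="hello", timestamp_created=int(time.time()))
task.complete("HELLO")
print(task.to_json())
```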
- Worker Failures:
  - Automatic task requeuing
  - Worker deregistration
  - Error reporting to client
- Network Issues:
  - Connection retry mechanism
  - Timeout handling
  - Circuit breaker pattern (planned)
- Invalid Tasks:
  - Input validation
  - Type checking
  - Error response to client
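Two of these mechanisms, input validation and connection retries, can be sketched in a few lines. The function names, the exponential backoff policy, and the retry count are assumptions for illustration; the document does not specify how the real system schedules its retries.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")
KNOWN_TYPES = {"reverse", "sum", "hash", "upper", "wait"}


def validate_task(task_type: str, payload: object) -> None:
    """Reject invalid tasks before they reach the queue."""
    if task_type not in KNOWN_TYPES:
        raise ValueError(f"unknown task type: {task_type!r}")
    if not isinstance(payload, str):
        raise TypeError("payload must be a string")


def call_with_retry(fn: Callable[[], T], attempts: int = 3,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying connection and timeout errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: let the caller requeue or report the error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the backoff testable without real delays.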
- Build and start all services:
    docker-compose up --build
- Run test client:
    docker-compose up test-client
- Monitor system:
    curl http://localhost:8080/stats
- Update Proto Definition:
    // Add new task type
    message NewTaskType {
      string input = 1;
      string output = 2;
    }
- Implement Worker:
    def process_new_task(self, payload):
        # Implementation
        return result
- Update Docker Configuration:
    worker-new:
      build:
        args:
          - WORKER_TYPE=new
          - PORT=50065
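To show where a method such as `process_new_task` could plug in, the sketch below routes each task to a method named after its type. The `process_<type>_task` lookup convention is an assumption made for this example, not a confirmed detail of the project, and the body of `process_new_task` is a placeholder.

```python
class Worker:
    """Routes a task to the method named process_<type>_task, if one exists."""

    def process(self, task_type: str, payload: str) -> str:
        handler = getattr(self, f"process_{task_type}_task", None)
        if handler is None:
            raise ValueError(f"unsupported task type: {task_type!r}")
        return handler(payload)

    def process_upper_task(self, payload: str) -> str:
        return payload.upper()

    def process_new_task(self, payload: str) -> str:
        # Placeholder behaviour for the new type; replace with real logic.
        return f"new:{payload}"


worker = Worker()
print(worker.process("new", "example"))
```

Under this convention, adding a task type needs no change to the routing code, only a new method.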
The system includes a test client that:
- Generates random tasks
- Sends them every 30 seconds
- Logs results to taskgrid_tests.log
- Provides real-time feedback
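The behaviour of the test client can be approximated as follows. This is a sketch under stated assumptions: `send` is any callable that submits a task and returns its result (a hypothetical stand-in for the gRPC round trip), the sample payloads are invented, and the log line format is illustrative only.

```python
import logging
import random
import time
from typing import Callable, Optional, Tuple

SAMPLES = {
    "reverse": ["hello", "taskgrid"],
    "sum": ["[1,2,3]", "[10,20]"],
    "hash": ["text"],
    "upper": ["hello"],
    "wait": ["2"],
}


def random_task(rng: random.Random) -> Tuple[str, str]:
    """Pick a random task type and one of its sample payloads."""
    task_type = rng.choice(sorted(SAMPLES))
    return task_type, rng.choice(SAMPLES[task_type])


def run_test_client(send: Callable[[str, str], str], rounds: int,
                    interval: float = 30.0, log_path: str = "taskgrid_tests.log",
                    seed: Optional[int] = None) -> None:
    logger = logging.getLogger("taskgrid.tests")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    rng = random.Random(seed)
    try:
        for i in range(rounds):
            task_type, payload = random_task(rng)
            result = send(task_type, payload)
            logger.info("type=%s payload=%s result=%s", task_type, payload, result)
            # Real-time feedback on the console in addition to the log file.
            print(f"[{i + 1}/{rounds}] {task_type}({payload!r}) -> {result!r}")
            if i < rounds - 1:
                time.sleep(interval)
    finally:
        logger.removeHandler(handler)
        handler.close()
```

For a quick local check, `run_test_client(lambda t, p: p.upper(), rounds=3, interval=0)` writes three lines to the log without waiting between tasks.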
Access system statistics through:
- REST API: http://localhost:8080/stats
- gRPC Interface: Port 50053
- Log Files: Container logs via Docker
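For scripted access to the REST endpoint, a small helper using only the standard library might look like this. It assumes the endpoint returns a JSON object, which the document does not confirm; the `opener` parameter is an addition for this sketch so the function can be exercised without a running system.

```python
import json
import urllib.request


def fetch_stats(url: str = "http://localhost:8080/stats", timeout: float = 5.0,
                opener=urllib.request.urlopen) -> dict:
    """GET the stats endpoint and decode the body as JSON (assumed format)."""
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the services running, `fetch_stats()` would be the Python counterpart of `curl http://localhost:8080/stats`.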