AI Service

Francois edited this page Feb 6, 2026 · 4 revisions

The AI Service is a standalone microservice responsible for providing an intelligent opponent in the "Player vs AI" game mode. It uses Deep Reinforcement Learning to predict the optimal paddle movement based on the current state of the game.

| Property  | Value                              |
|-----------|------------------------------------|
| Port      | 3005                               |
| Language  | Python 3.10+                       |
| Framework | FastAPI                            |
| Model     | PPO (Proximal Policy Optimization) |
| Library   | Stable Baselines3                  |

Architecture

The service operates as a stateless API. The Game Service (Node.js) acts as the client, sending the game state to the AI Service every frame (or tick) and receiving a move in response.

sequenceDiagram
    participant Game as 🏓 Game Service
    participant AI as 🤖 AI Service
    
    loop Every Game Tick
        Game->>AI: POST /predict { ball_x, ball_y, paddle_y, ... }
        AI->>AI: Normalize State & Run Model
        AI-->>Game: JSON { move: "UP" | "DOWN" | "STOP" }
    end
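The per-tick exchange above can be sketched in plain Python. This is a minimal illustration, not the service's actual handler: the field set is limited to the three names shown in the diagram, and the court dimensions `COURT_WIDTH` and `COURT_HEIGHT`, the action ordering in `MOVES`, and the `follow_ball_policy` stand-in for the trained model are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical court size used for normalization; the real values live in the game code.
COURT_WIDTH = 800.0
COURT_HEIGHT = 600.0

# Assumed mapping from the model's discrete action index to the move label.
MOVES = ("STOP", "UP", "DOWN")


@dataclass(frozen=True)
class GameState:
    """Subset of the request body: only the fields named in the diagram."""
    ball_x: float
    ball_y: float
    paddle_y: float


def normalize(state: GameState) -> list[float]:
    """Scale raw coordinates into [0, 1] so the policy sees bounded inputs."""
    return [
        state.ball_x / COURT_WIDTH,
        state.ball_y / COURT_HEIGHT,
        state.paddle_y / COURT_HEIGHT,
    ]


def follow_ball_policy(obs: Sequence[float]) -> int:
    """Stand-in for the trained model: move the paddle toward the ball."""
    _, ball_y, paddle_y = obs
    if abs(ball_y - paddle_y) < 0.02:
        return 0  # STOP
    # Screen coordinates are assumed to grow downward, so a smaller y means "up".
    return 1 if ball_y < paddle_y else 2


def predict_move(state: GameState, policy: Callable[[Sequence[float]], int]) -> dict:
    """Stateless handler: everything needed for the decision is in the request."""
    action = policy(normalize(state))
    return {"move": MOVES[action]}
```

In the service, a function like `predict_move` would sit behind the `POST /predict` route, with the trained PPO model in place of `follow_ball_policy`.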

Tech Stack & Implementation

Core Libraries

  • FastAPI: High-performance web server for serving the model.
  • Stable Baselines3: Reliable implementation of the PPO algorithm.
  • Gymnasium: Used to create the Pong environment for training.
  • NumPy: Efficient array manipulation for state normalization.
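For illustration, a training environment of the kind mentioned above might look roughly like the following. This sketch mirrors Gymnasium's `reset`/`step` contract in plain Python so that it runs without dependencies; a real environment would subclass `gymnasium.Env` and declare `observation_space` and `action_space`. The class name `TinyPongEnv`, the physics, and the reward values are hypothetical and are not taken from the project.

```python
import random
from typing import Optional


class TinyPongEnv:
    """One paddle on the right wall, ball bouncing in a unit square."""

    PADDLE_HALF = 0.1    # assumed paddle half-height
    PADDLE_SPEED = 0.04  # assumed paddle travel per step

    def __init__(self, seed: Optional[int] = None):
        self.rng = random.Random(seed)
        self.reset()

    def _obs(self) -> list[float]:
        return [self.ball_x, self.ball_y, self.vel_x, self.vel_y, self.paddle_y]

    def reset(self):
        """Start a new episode; returns (observation, info) like Gymnasium."""
        self.ball_x, self.ball_y = 0.5, self.rng.random()
        self.vel_x = 0.03
        self.vel_y = self.rng.uniform(-0.02, 0.02)
        self.paddle_y = 0.5
        return self._obs(), {}

    def step(self, action: int):
        """Advance one tick. Actions: 0 = stay, 1 = up (smaller y), 2 = down."""
        if action == 1:
            self.paddle_y = max(0.0, self.paddle_y - self.PADDLE_SPEED)
        elif action == 2:
            self.paddle_y = min(1.0, self.paddle_y + self.PADDLE_SPEED)

        self.ball_x += self.vel_x
        self.ball_y += self.vel_y
        if self.ball_y < 0.0 or self.ball_y > 1.0:  # bounce off top/bottom
            self.vel_y = -self.vel_y
            self.ball_y = min(1.0, max(0.0, self.ball_y))
        if self.ball_x < 0.0:                        # bounce off the far wall
            self.vel_x = -self.vel_x
            self.ball_x = 0.0

        reward, terminated = 0.0, False
        if self.ball_x >= 1.0:                       # ball reached the paddle side
            if abs(self.ball_y - self.paddle_y) <= self.PADDLE_HALF:
                reward = 1.0                         # successful return
                self.vel_x = -self.vel_x
                self.ball_x = 1.0
            else:
                reward, terminated = -1.0, True      # missed: episode ends
        # Returns (observation, reward, terminated, truncated, info).
        return self._obs(), reward, terminated, False, {}
```

Because the observation is a short list of coordinates and velocities, an environment shaped like this pairs naturally with an MLP policy.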

Model Configuration

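The agent uses PPO (Proximal Policy Optimization) with a Multi-Layer Perceptron (MLP) policy. An MLP was chosen over a CNN (Convolutional Neural Network) because the service passes coordinate data rather than raw pixels: there is no image for convolutional layers to process, and a small dense network is cheaper to evaluate on every tick.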
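To give a rough sense of scale for that choice, the sketch below counts the weights of a small MLP on coordinate input. The two hidden layers of 64 units match the Stable Baselines3 default for `MlpPolicy`; the observation size of 6 is an assumption (the actual feature list is not given here), and the 3 outputs correspond to the UP, DOWN, and STOP moves. The 84×84 frame is only an illustrative pixel-input size.

```python
def mlp_param_count(layer_sizes: list[int]) -> int:
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


OBS_DIM = 6    # assumed number of coordinate features
N_ACTIONS = 3  # UP, DOWN, STOP

actor_params = mlp_param_count([OBS_DIM, 64, 64, N_ACTIONS])
pixel_inputs = 84 * 84  # values in one illustrative grayscale frame

print(f"actor parameters on coordinates: {actor_params}")
print(f"inputs per observation: {OBS_DIM} coordinates vs {pixel_inputs} pixels")
```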

Training Hyperparameters:

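from stable_baselines3 import PPO

# `env` is the Pong training environment (a Gymnasium environment) created beforehand.
model = PPO(
    'MlpPolicy',           # Actor-critic policy built from dense (fully connected) layers
    env,                   # Pong environment
    learning_rate=0.001,   # Optimizer step size
    n_steps=2048,          # Environment steps collected per rollout before each update
    batch_size=64,         # Minibatch size for each gradient step
    n_epochs=10,           # Passes over the rollout buffer per update
    verbose=1              # Print training progress
)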
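As a worked example of how these values interact, the arithmetic below derives the number of minibatches and gradient steps per policy update. It assumes a single training environment (the number of parallel environments is not stated) and uses the roughly 2 million timestep budget mentioned in the training section as `TOTAL_TIMESTEPS`.

```python
import math

N_STEPS = 2048               # rollout length per environment (from the configuration)
BATCH_SIZE = 64
N_EPOCHS = 10
N_ENVS = 1                   # assumption: one environment
TOTAL_TIMESTEPS = 2_000_000  # approximate training budget

rollout_size = N_STEPS * N_ENVS                     # transitions collected per update
minibatches_per_epoch = rollout_size // BATCH_SIZE  # 2048 / 64
gradient_steps_per_update = minibatches_per_epoch * N_EPOCHS
policy_updates = math.ceil(TOTAL_TIMESTEPS / rollout_size)

print(minibatches_per_epoch)      # 32
print(gradient_steps_per_update)  # 320
print(policy_updates)             # 977
```

Under these assumptions, each update reuses the same 2048 transitions for 320 gradient steps, and the full run performs just under a thousand updates.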

API Reference

See the API Documentation page.


🏋️ Training the Model

The model is pre-trained and saved as a .zip file (e.g., ppo_pong_final.zip). To retrain the model, you can run the training script inside the container.

  1. Enter the container:
docker exec -it pong-ai sh
  2. Run training:
python3 train.py

This will launch the Gymnasium environment and train for ~2 million timesteps.


🚀 Running the Service

The service is part of the main docker-compose.yml stack.

  pong-ai:
    build: ./srcs/pong-ai
    ports:
      - "3005:3005"
    volumes:
      - ./srcs/ai:/app
    networks:
      - transcendence-network

To start it individually:

docker compose up -d --build pong-ai
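Once the container is running, a quick smoke test of the endpoint could be scripted as below. This is a sketch under assumptions: the service is reachable at `localhost` on port 3005, and the payload carries only the three fields named in the architecture diagram, whereas the real endpoint may require more. The helper names `build_predict_request` and `request_move` are illustrative.

```python
import json
import urllib.request

# Assumed host; the port is the one listed on this page.
AI_SERVICE_URL = "http://localhost:3005/predict"


def build_predict_request(state: dict, url: str = AI_SERVICE_URL) -> urllib.request.Request:
    """Wrap a game-state dict in a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(state).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_move(state: dict, timeout: float = 1.0) -> str:
    """Send the state to the AI Service and return the move from the response."""
    with urllib.request.urlopen(build_predict_request(state), timeout=timeout) as resp:
        return json.load(resp)["move"]
```

Calling `request_move({"ball_x": 400, "ball_y": 300, "paddle_y": 250})` should return one of `"UP"`, `"DOWN"`, or `"STOP"` if the service accepts that payload.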

See also:

Resources

Introduction to Proximal Policy Optimization algorithm (PPO)
