Troubleshooting Guide

This guide helps resolve common issues when working with Trinity.

Build Issues

"zig build test" fails

Symptoms: Compilation errors in test files

Solutions:

Ensure Zig 0.15.x is installed:
```
zig version  # Should be 0.15.x
```
Clean build artifacts:
```
zig build clean
zig build test
```
Check for Zig 0.15 API migration issues:
- ArrayList.init() → ArrayList.empty()
- append(item) → append(allocator, item)
- std.time.sleep → std.Thread.sleep

"Test command not yet implemented"

Symptoms: tri test shows limited functionality message

Solution: Use zig build test instead:

zig build test  # Full test suite
tri test         # Limited, use zig build test

Format errors on commit

Symptoms: Pre-commit hook fails with formatting issues

Solution:

zig fmt src/
git add src/
git commit

FPGA Issues

FPGA programming fails

Symptoms: openFPGALoader cannot detect board or fails to program

Critical: JTAG cable MUST be in JTAG mode (PID 0x0008), not bootloader mode (PID 0x0013)

Solution:

# First, switch cable to JTAG mode
fxload -t fx2 -I ./fpga/openxc7-synth/xc7a-xc7s-ftdi.hex -d 0x0013

# Then program
tri fpga flash
# OR
./fpga/tools/flash_no_sudo.sh hslm_full_top.bit

Never skip fxload step — programming will fail.

UART communication fails (no echo)

Symptoms: tri fpga uart runs but no response from board

Possible Causes:

UART headers not soldered — Hardware issue, cannot be fixed in software
Wrong baud rate — Check UART_README.md for correct rate
CPLD issue — Abnormal CPLD version indicates hardware problem

Diagnosis:

# Check CPLD version
tri fpga probe

# If CPLD shows 0xFFFE consistently, this is a DLC10 clone with known behavior

Solution: Do not debug software if UART headers are not soldered. Hardware modification required.

"openFPGALoader --cable xpc" fails

Symptoms: Error when trying to use xpc cable type

Cause: openFPGALoader does not support --cable xpc

Solution: Use fxload first, then:

openFPGALoader --cable ft232 --bitstream hslm_full_top.bit

Training Issues

Training stalls at low steps

Symptoms: Loss stops improving before 10K steps

Cause: Using flat LR schedule instead of cosine

Solution:

# In Railway environment variables
HSLM_LR_SCHEDULE: cosine  # NOT flat

Never use flat schedule — training dead by 20K steps.

Early kill at 30K steps

Symptoms: Worker stops automatically around 30K steps

Cause: Old binary bug (fixed in recent versions)

Solution:

# Restart worker with latest binary
tri farm recycle --service <service-id>

PPL not improving

Symptoms: Loss oscillating or not decreasing

Diagnosis:

Check context length (should be ≥ 81 for NTP)
Verify LR schedule is cosine
Check for repetition rate anomalies
Review 5-gate record verification

Solution:

# Check SEVO configuration
tri farm evolve --auto

Cloud & Deployment Issues

Railway deployment fails

Symptoms: Service fails to start on Railway

Checks:

Dockerfile is being used:

# Check service config
railway service instance get --service-id <id>
# Should show: builder: NIXPACKS

Environment variables are set:

railway variable list --service-id <id>
# Minimum required: HSLM_OPTIMIZER, HSLM_LR, HSLM_LR_SCHEDULE

Dockerfile path is correct:

# Must set in service config, NOT just env var
dockerfilePath: "Dockerfile.hslm-train"

Solution:

# Update service config
tri deploy update --service-id <id> --dockerfile-path "Dockerfile.hslm-train" --builder nixpacks

"startCommand cannot be set" error

Symptoms: Deployment fails when startCommand is set

Cause: Training services must use Dockerfile ENTRYPOINT, not Railway's startCommand

Solution:

# Remove startCommand from service config
railway service update --service-id <id> --start-command null

Agent Issues

"Agent timeout" in logs

Symptoms: Agent stops after 1 hour without completing task

Cause: Default AGENT_TIMEOUT is 3600s (1 hour)

Solution: Adjust timeout in .ralph/config.json:

{
  "timeout_seconds": 7200  // 2 hours
}

Agent fails to create PR

Symptoms: Agent reports success but no PR is created

Diagnosis:

Check PAT permissions (must have repo scope)
Verify branch exists on remote
Check for merge conflicts

Solution:

# Verify agent token
gh auth status

# Check agent logs
tri cloud history <issue-number>

Documentation Issues

Broken links

Symptoms: Clicking links in README or docs results in 404

Solution:

Report the broken link in a GitHub issue
CI automatically checks for broken links on PR

Outdated command reference

Symptoms: Command in README doesn't match actual behavior

Solution:

Check command registry: tri help
Report discrepancy in issue

Performance Issues

Slow compilation

Symptoms: zig build takes very long

Solutions:

Use zig build --summary all to see what takes longest
Incremental builds help after first full build
Consider using zig build-exe cache if available

High memory usage

Symptoms: Process OOMs during build or training

Solutions:

Reduce batch size for training
Close other applications
Increase swap space (Linux) or check memory limits (macOS)

Error Messages Reference

Error	Category	Solution	Link
E0501	Memory management	Check allocators	src/vsa/README.md
E0502	Allocator leak	Verify cleanup	Memory Guide
E0601	UART timeout	Check hardware connection	UART README
E0701	Training config	Verify env vars	Farm Guide
E0801	Agent token expired	Refresh PAT	Cloud Pipeline

Getting Help

Where to Report Issues

GitHub Issues: https://github.com/gHashTag/trinity/issues
GitHub Discussions: For questions and general discussion
Telegram: Community notifications (see README for link)

What to Include in Bug Reports

Trinity version: tri --version
Zig version: zig version
OS and version
Full error message or stack trace
Steps to reproduce
Expected vs actual behavior

Before Reporting

Search existing issues (your problem may already be reported)
Check documentation index: docs/DOCUMENTATION_INDEX.md
Try latest version (issue may be fixed)

Last updated: 2026-03-24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Troubleshooting Guide

Build Issues

"zig build test" fails

"Test command not yet implemented"

Format errors on commit

FPGA Issues

FPGA programming fails

UART communication fails (no echo)

"openFPGALoader --cable xpc" fails

Training Issues

Training stalls at low steps

Early kill at 30K steps

PPL not improving

Cloud & Deployment Issues

Railway deployment fails

"startCommand cannot be set" error

Agent Issues

"Agent timeout" in logs

Agent fails to create PR

Documentation Issues

Broken links

Outdated command reference

Performance Issues

Slow compilation

High memory usage

Error Messages Reference

Getting Help

Where to Report Issues

What to Include in Bug Reports

Before Reporting

Uh oh!

FilesExpand file tree

troubleshooting.md

Latest commit

History

troubleshooting.md

File metadata and controls

Troubleshooting Guide

Build Issues

"zig build test" fails

"Test command not yet implemented"

Format errors on commit

FPGA Issues

FPGA programming fails

UART communication fails (no echo)

"openFPGALoader --cable xpc" fails

Training Issues

Training stalls at low steps

Early kill at 30K steps

PPL not improving

Cloud & Deployment Issues

Railway deployment fails

"startCommand cannot be set" error

Agent Issues

"Agent timeout" in logs

Agent fails to create PR

Documentation Issues

Broken links

Outdated command reference

Performance Issues

Slow compilation

High memory usage

Error Messages Reference

Getting Help

Where to Report Issues

What to Include in Bug Reports

Before Reporting