This guide helps resolve common issues when working with Trinity.
Symptoms: Compilation errors in test files
Solutions:
- Ensure Zig 0.15.x is installed:
zig version # Should be 0.15.x
- Clean build artifacts:
zig build clean
zig build test
- Check for Zig 0.15 API migration issues:
  - ArrayList.init() → ArrayList.empty
  - append(item) → append(allocator, item)
  - std.time.sleep → std.Thread.sleep
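The version check in the first step can be scripted. The following is a minimal sketch, assuming a POSIX shell; the function name `is_zig_015` is hypothetical and not part of Trinity or Zig.

```shell
# Return 0 when the given version string is a 0.15.x release (or 0.15 dev build).
# is_zig_015 is a hypothetical helper name used only for illustration.
is_zig_015() {
  case "$1" in
    0.15.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires zig on PATH):
#   is_zig_015 "$(zig version)" || echo "expected Zig 0.15.x" >&2
```

Passing the version in as an argument keeps the check usable in CI logs or scripts where the string comes from somewhere other than the local compiler.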
Symptoms: tri test shows limited functionality message
Solution: Use zig build test instead:
zig build test # Full test suite
tri test # Limited; use zig build test
Symptoms: Pre-commit hook fails with formatting issues
Solution:
zig fmt src/
git add src/
git commit
Symptoms: openFPGALoader cannot detect the board or fails to program it
Critical: JTAG cable MUST be in JTAG mode (PID 0x0008), not bootloader mode (PID 0x0013)
Solution:
# First, switch cable to JTAG mode
fxload -t fx2 -I ./fpga/openxc7-synth/xc7a-xc7s-ftdi.hex -d 0x0013
# Then program
tri fpga flash
# OR
./fpga/tools/flash_no_sudo.sh hslm_full_top.bit
Never skip the fxload step: programming will fail without it.
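Before running fxload, it can help to confirm which mode the cable is in. The sketch below classifies `lsusb`-style output using the two PIDs stated above. The function name `cable_mode` is hypothetical, and the vendor ID `03fd` (Xilinx) is an assumption about this cable; adjust it if your device enumerates differently.

```shell
# Classify the cable from lsusb-style text.
# PIDs 0008 (JTAG) and 0013 (bootloader) come from this guide;
# the vendor ID 03fd is an assumption, not confirmed by the guide.
cable_mode() {
  # $1: text in lsusb format, e.g. "$(lsusb)"
  if printf '%s\n' "$1" | grep -qi 'ID 03fd:0008'; then
    echo jtag
  elif printf '%s\n' "$1" | grep -qi 'ID 03fd:0013'; then
    echo bootloader
  else
    echo absent
  fi
}

# Usage (Linux, requires lsusb):
#   cable_mode "$(lsusb)"
```

A result of `bootloader` means the fxload step is still required; `absent` points to a cabling or permissions problem rather than a mode problem.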
Symptoms: tri fpga uart runs but no response from board
Possible Causes:
- UART headers not soldered — Hardware issue, cannot be fixed in software
- Wrong baud rate — Check UART_README.md for correct rate
- CPLD issue — Abnormal CPLD version indicates hardware problem
Diagnosis:
# Check CPLD version
tri fpga probe
# If CPLD shows 0xFFFE consistently, this is a DLC10 clone with known behavior
Solution: Do not debug software if the UART headers are not soldered. A hardware modification is required.
Symptoms: Error when trying to use xpc cable type
Cause: openFPGALoader does not support --cable xpc
Solution: Use fxload first, then:
openFPGALoader --cable ft232 --bitstream hslm_full_top.bit
Symptoms: Loss stops improving before 10K steps
Cause: Using flat LR schedule instead of cosine
Solution:
# In Railway environment variables
HSLM_LR_SCHEDULE: cosine # NOT flat
Never use the flat schedule: training is dead by 20K steps.
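One way to catch a wrong schedule early is a guard that refuses to start when the variable is not `cosine`. This is a sketch only: the function name `require_cosine_schedule` is hypothetical, and it assumes the variable is visible to a shell script that runs before training starts.

```shell
# Fail fast when HSLM_LR_SCHEDULE is unset or anything other than "cosine".
# require_cosine_schedule is a hypothetical helper, not part of Trinity.
require_cosine_schedule() {
  if [ "${HSLM_LR_SCHEDULE:-}" != "cosine" ]; then
    echo "HSLM_LR_SCHEDULE must be 'cosine', got '${HSLM_LR_SCHEDULE:-<unset>}'" >&2
    return 1
  fi
}

# Usage, e.g. at the top of a launch script:
#   require_cosine_schedule || exit 1
```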
Symptoms: Worker stops automatically around 30K steps
Cause: Old binary bug (fixed in recent versions)
Solution:
# Restart worker with latest binary
tri farm recycle --service <service-id>
Symptoms: Loss oscillating or not decreasing
Diagnosis:
- Check context length (should be ≥ 81 for NTP)
- Verify LR schedule is cosine
- Check for repetition rate anomalies
- Review 5-gate record verification
Solution:
# Check SEVO configuration
tri farm evolve --auto
Symptoms: Service fails to start on Railway
Checks:
- Dockerfile is being used:
# Check service config
railway service instance get --service-id <id>
# Should show: builder: NIXPACKS
- Environment variables are set:
railway variable list --service-id <id>
# Minimum required: HSLM_OPTIMIZER, HSLM_LR, HSLM_LR_SCHEDULE
- Dockerfile path is correct:
# Must be set in the service config, NOT just as an env var
dockerfilePath: "Dockerfile.hslm-train"
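The environment variable check above can also be run from inside the service itself. The sketch below lists which of the three required variables are unset or empty. The function name `missing_hslm_vars` is hypothetical, and it assumes the script runs in the same environment as the training process.

```shell
# Print the name of each required variable that is unset or empty.
# The list of names comes from this guide; the helper itself is hypothetical.
missing_hslm_vars() {
  for name in HSLM_OPTIMIZER HSLM_LR HSLM_LR_SCHEDULE; do
    # Indirect lookup; names come from the fixed list above, so eval is safe here.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Usage:
#   missing=$(missing_hslm_vars)
#   [ -z "$missing" ] || { echo "missing: $missing" >&2; exit 1; }
```

Printing names rather than returning a single status makes the failure message say exactly which variable to set in the Railway dashboard.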
Solution:
# Update service config
tri deploy update --service-id <id> --dockerfile-path "Dockerfile.hslm-train" --builder nixpacks
Symptoms: Deployment fails when startCommand is set
Cause: Training services must use Dockerfile ENTRYPOINT, not Railway's startCommand
Solution:
# Remove startCommand from service config
railway service update --service-id <id> --start-command null
Symptoms: Agent stops after 1 hour without completing task
Cause: Default AGENT_TIMEOUT is 3600s (1 hour)
Solution: Adjust the timeout in .ralph/config.json (7200 seconds is 2 hours):
{
  "timeout_seconds": 7200
}
Symptoms: Agent reports success but no PR is created
Diagnosis:
- Check PAT permissions (must have repo scope)
- Verify branch exists on remote
- Check for merge conflicts
Solution:
# Verify agent token
gh auth status
# Check agent logs
tri cloud history <issue-number>
Symptoms: Clicking links in README or docs results in 404
Solution:
- Report the broken link in a GitHub issue
- CI automatically checks for broken links on PR
Symptoms: Command in README doesn't match actual behavior
Solution:
- Check command registry:
tri help
- Report the discrepancy in an issue
Symptoms: zig build takes very long
Solutions:
- Use zig build --summary all to see what takes longest
- Incremental builds help after first full build
- Consider using zig build-exe cache if available
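To compare a first full build against a later incremental one, a small timing wrapper is enough. This is a sketch with a hypothetical helper name, `timed`; it reports whole seconds on stderr and passes the wrapped command's exit status through.

```shell
# Run a command, report elapsed wall-clock seconds on stderr,
# and return the command's own exit status.
timed() {
  start=$(date +%s)
  status=0
  "$@" || status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s ($*)" >&2
  return "$status"
}

# Usage:
#   timed zig build        # first full build
#   timed zig build        # incremental build, expected to be faster
```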
Symptoms: Process OOMs during build or training
Solutions:
- Reduce batch size for training
- Close other applications
- Increase swap space (Linux) or check memory limits (macOS)
| Error | Category | Solution | Link |
|---|---|---|---|
| E0501 | Memory management | Check allocators | src/vsa/README.md |
| E0502 | Allocator leak | Verify cleanup | Memory Guide |
| E0601 | UART timeout | Check hardware connection | UART README |
| E0701 | Training config | Verify env vars | Farm Guide |
| E0801 | Agent token expired | Refresh PAT | Cloud Pipeline |
- GitHub Issues: https://github.com/gHashTag/trinity/issues
- GitHub Discussions: For questions and general discussion
- Telegram: Community notifications (see README for link)
When reporting an issue, include:
- Trinity version: tri --version
- Zig version: zig version
- OS and version
- Full error message or stack trace
- Steps to reproduce
- Expected vs actual behavior
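The version and OS details can be gathered in one step. The following is a minimal sketch; the helper names `tool_version` and `collect_report_info` are hypothetical, and a tool that is not installed is reported as "not found" rather than causing an error.

```shell
# Print "<label>: <first line of output>" for a tool, or "not found".
tool_version() {
  # $1: label; remaining arguments: the command to run
  label=$1
  shift
  if command -v "$1" >/dev/null 2>&1; then
    echo "$label: $("$@" 2>&1 | head -n 1)"
  else
    echo "$label: not found"
  fi
}

# Collect the three environment facts worth pasting into a report.
collect_report_info() {
  tool_version "Trinity version" tri --version
  tool_version "Zig version" zig version
  echo "OS: $(uname -sr)"
}

# Usage:
#   collect_report_info
```

The error message, reproduction steps, and expected versus actual behavior still need to be written by hand.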
Before opening an issue:
- Search existing issues (your problem may already be reported)
- Check documentation index: docs/DOCUMENTATION_INDEX.md
- Try latest version (issue may be fixed)
Last updated: 2026-03-24