1 | 1 | # Changelog |
2 | 2 |
| 3 | +## [0.2.0](https://github.com/PlatformNetwork/term-challenge/compare/v0.1.0...v0.2.0) (2026-01-12) |
| 4 | + |
| 5 | + |
| 6 | +### ⚠ BREAKING CHANGES |
| 7 | + |
| 8 | +* **sdk:** SDK API completely redesigned |
| 9 | + |
| 10 | +### Features |
| 11 | + |
| 12 | +* 3-validator task distribution, cancel command, and improved error handling ([e18083b](https://github.com/PlatformNetwork/term-challenge/commit/e18083b7a555280cd6e8d0c2978c00c303651b48)) |
| 13 | +* add assignment monitor for stale validator reassignment ([31fbb15](https://github.com/PlatformNetwork/term-challenge/commit/31fbb15e6fc6138d082d5b0be62ff4769844fd86)) |
| 14 | +* add binary caching to validator worker ([bbf237e](https://github.com/PlatformNetwork/term-challenge/commit/bbf237ebd8d5b0fa3a4ede246cf19e96430c67ad)) |
| 15 | +* add DirectDockerBackend and binary agent runner for local bench testing ([d84ed75](https://github.com/PlatformNetwork/term-challenge/commit/d84ed7586fe97158f6f6d94b293055e6f355463c)) |
| 16 | +* add disable_decay and disable_public_code fields ([172223f](https://github.com/PlatformNetwork/term-challenge/commit/172223f5cf94289b98fd35845921fd171e4004eb)) |
| 17 | +* add epoch calculation with custom start block ([ebe42fa](https://github.com/PlatformNetwork/term-challenge/commit/ebe42fad75bae76ea5982a820648c2fe0e91fdb9)) |
| 18 | +* add multi-file package submission support ([d1d8cba](https://github.com/PlatformNetwork/term-challenge/commit/d1d8cba2b8b97c83e4e0b43322dfe8b47fa761f4)) |
| 19 | +* add real-time task logging to platform server ([54b1b42](https://github.com/PlatformNetwork/term-challenge/commit/54b1b422f0c7fc746d6baddbad499fc1f4de36af)) |
| 20 | +* add status, total_cost_usd and success_rate to leaderboard ([5716384](https://github.com/PlatformNetwork/term-challenge/commit/5716384cfcefca812c7ba76a4e1ef7212931f788)) |
| 21 | +* add Terminus-2 agent adapted for Term SDK 2.0 ([e72c7eb](https://github.com/PlatformNetwork/term-challenge/commit/e72c7ebb147a5ebf91f917dbc4e2202a154274a5)) |
| 22 | +* add time-based reward decay system ([20d978d](https://github.com/PlatformNetwork/term-challenge/commit/20d978d522eb9c52f1ea1942a12f2ac26297fa4a)) |
| 23 | +* add verbose agent logging and evaluation resume support ([4415307](https://github.com/PlatformNetwork/term-challenge/commit/4415307a549464b8d0e3b957a984914c92a95505)) |
| 24 | +* add verbose logging for container backend and compilation ([9886e1f](https://github.com/PlatformNetwork/term-challenge/commit/9886e1f5a86fd7ef1bea5e0e386b48cb5d48b143)) |
| 25 | +* add weight and submitted_at to leaderboard responses ([d6d8e37](https://github.com/PlatformNetwork/term-challenge/commit/d6d8e37442ca30426d846e80a968369e44f9c347)) |
| 26 | +* automatic cleanup of orphan Docker volumes ([cf148a3](https://github.com/PlatformNetwork/term-challenge/commit/cf148a3b2026d20b9a7b84bb0c75caeb3488b75c)) |
| 27 | +* cleanup stale task containers at validator startup ([8da0f7b](https://github.com/PlatformNetwork/term-challenge/commit/8da0f7bd4fe38c4477ae24bebcbc1d183bcdec45)) |
| 28 | +* distributed task evaluation and validator readiness system ([bdcf46d](https://github.com/PlatformNetwork/term-challenge/commit/bdcf46d911e65c45906073b8068603e3e9f923fb)) |
| 29 | +* Docker-in-Docker fixes and glibc compatibility ([75a81c6](https://github.com/PlatformNetwork/term-challenge/commit/75a81c6c2944e9c11fd8ee9fa2301c407dd49107)) |
| 30 | +* Implement StaticX for portable agent binaries ([90652ea](https://github.com/PlatformNetwork/term-challenge/commit/90652ead65478526df664f738f949d6bf77c9958)) |
| 31 | +* improve LLM proxy cost tracking and add Grok provider support ([395fd9b](https://github.com/PlatformNetwork/term-challenge/commit/395fd9bfcfa2ee32a5108e90d5197e876ab5dc4b)) |
| 32 | +* install full SDK with LLM support during compilation ([8674eac](https://github.com/PlatformNetwork/term-challenge/commit/8674eacc4d687d09d76a991dd20d37d31b616082)) |
| 33 | +* LLM proxy with cost tracking, task observability APIs, streaming support ([2eb5fb0](https://github.com/PlatformNetwork/term-challenge/commit/2eb5fb0d506a0f4f95d92d267858bcc1778f05eb)) |
| 34 | +* **maintenance:** add periodic maintenance task + require all validators for consensus ([b0e1713](https://github.com/PlatformNetwork/term-challenge/commit/b0e171329c1f081adf765106be9717bfad9abc5a)) |
| 35 | +* migrate bench run to use binary agent system ([1915444](https://github.com/PlatformNetwork/term-challenge/commit/1915444513a3a2314fbcc18a12127488791e238d)) |
| 36 | +* move validator and task assignment to compile_worker ([7958323](https://github.com/PlatformNetwork/term-challenge/commit/7958323f8344084680eaf5624a8bc335bd80c964)) |
| 37 | +* replace epoch-based submission rate limit with time-based (3.6h cooldown) ([6216f33](https://github.com/PlatformNetwork/term-challenge/commit/6216f3300815c39fd6b3edcc97fa60b6b3363a23)) |
| 38 | +* replace validator whitelist with stake-based auth via metagraph ([bfb91f0](https://github.com/PlatformNetwork/term-challenge/commit/bfb91f09d57e34d338c1dd6e21fb360fcadbe917)) |
| 39 | +* **sdk:** SDK 2.0 with agent-controlled execution model ([41b86a4](https://github.com/PlatformNetwork/term-challenge/commit/41b86a474a8f3f8052901b380010567d79d4d65d)) |
| 40 | +* use ContainerBackend for validator worker task execution ([31d7022](https://github.com/PlatformNetwork/term-challenge/commit/31d7022084ab9544f9b561bb5de9bb16f85c145c)) |
| 41 | +* use secure broker for building compiler image ([be617a2](https://github.com/PlatformNetwork/term-challenge/commit/be617a205dc182038de301afdf16d006f81cf010)) |
| 42 | +* winner-takes-all weight calculation with manual validation ([6915096](https://github.com/PlatformNetwork/term-challenge/commit/691509640d36d285390b78c54d1e39baaed6bb97)) |
| 43 | + |
| 44 | + |
| 45 | +### Bug Fixes |
| 46 | + |
| 47 | +* add --break-system-packages flag to pip install in compiler ([7dcbdec](https://github.com/PlatformNetwork/term-challenge/commit/7dcbdec071ffd116a7b7df711c48f889d5aa66e3)) |
| 48 | +* add --break-system-packages to httpx pip install ([f228ba6](https://github.com/PlatformNetwork/term-challenge/commit/f228ba65fc489d870d24e6e9b522ebaf0d0a7228)) |
| 49 | +* add FLOAT8 cast to RETURNING clause in update_submission_cost ([c514f2c](https://github.com/PlatformNetwork/term-challenge/commit/c514f2cf15b5494a3d5206f5a7184a03859c04bc)) |
| 50 | +* add FLOAT8 casts for all REAL column reads in pg_storage ([8ec0efd](https://github.com/PlatformNetwork/term-challenge/commit/8ec0efdca638a29984fe0b8822964a2e6ad8824d)) |
| 51 | +* add httpx to PyInstaller hidden imports ([b7d25a6](https://github.com/PlatformNetwork/term-challenge/commit/b7d25a6a1729abb80c438cb6aff4cb5b78ffe5e3)) |
| 52 | +* add LLM_MODEL env var support and reduce log noise from /status requests ([f487693](https://github.com/PlatformNetwork/term-challenge/commit/f487693a853806005d67eb071793ccfee239fa3b)) |
| 53 | +* add migration 009 for validator_assignment status column ([17886de](https://github.com/PlatformNetwork/term-challenge/commit/17886decbbda47264780c0be2f486a72e0772580)) |
| 54 | +* add Pong variant to BrokerResponse for auth success parsing ([dad55b4](https://github.com/PlatformNetwork/term-challenge/commit/dad55b43c56e338b7a52351d547118317ecea4c4)) |
| 55 | +* add validator_assignments table and use claude-haiku-4.5 for reviews ([97fdff7](https://github.com/PlatformNetwork/term-challenge/commit/97fdff7d36662da90daf36b445e14461a6b09854)) |
| 56 | +* align default timeout with Harbor/terminal-bench (180s) ([2b41e9c](https://github.com/PlatformNetwork/term-challenge/commit/2b41e9ccebf67a5811050b1bbf7c4ec57c8c74d2)) |
| 57 | +* align LLM proxy signature format with central server ([ca40138](https://github.com/PlatformNetwork/term-challenge/commit/ca401386bcf7108c760b6fd68a0a705fe5c87f20)) |
| 58 | +* always build compiler image, never pull from Docker Hub ([337d345](https://github.com/PlatformNetwork/term-challenge/commit/337d3455ffeacc6ee08733f146879e44f7d0a750)) |
| 59 | +* **broker:** add retry logic for WS connection failures ([1188c30](https://github.com/PlatformNetwork/term-challenge/commit/1188c3037589bc85ef29695262ad00040d5e5f8e)) |
| 60 | +* build compiler image on demand if not found during compilation ([12de066](https://github.com/PlatformNetwork/term-challenge/commit/12de0663f55ab05087face7bab9b7cf5c422beaa)) |
| 61 | +* calculate evaluation costs from llm_usage table ([e5ac0aa](https://github.com/PlatformNetwork/term-challenge/commit/e5ac0aa632a87d4c09629e269a911e3d7f3de4e3)) |
| 62 | +* cast f64 to f32 for PostgreSQL REAL columns in cost updates ([08c3613](https://github.com/PlatformNetwork/term-challenge/commit/08c36131b9e11f7842b53f975185e13b5ac09035)) |
| 63 | +* check if PyInstaller exists before installing ([78a648d](https://github.com/PlatformNetwork/term-challenge/commit/78a648deb53134ca8174dab34106b8e281a12501)) |
| 64 | +* check multiple SDK paths for full SDK installation ([cd9ddb0](https://github.com/PlatformNetwork/term-challenge/commit/cd9ddb040f5bbae9aa79259e72b6c8659b2c3e94)) |
| 65 | +* **ci:** separate coverage job to prevent cancellation ([7ba740d](https://github.com/PlatformNetwork/term-challenge/commit/7ba740d3578f4565c53985b749b48b7d5c6b39e9)) |
| 66 | +* cleanup orphan compiler containers at startup and use UUID in names ([ec2c026](https://github.com/PlatformNetwork/term-challenge/commit/ec2c0260729ee404382cc850352a038ff783c7de)) |
| 67 | +* copy docker directory into images for compiler image building ([ffb42fb](https://github.com/PlatformNetwork/term-challenge/commit/ffb42fb32c2c24be83c2432e0efeb732aa8c5ccc)) |
| 68 | +* correct iteration increment in terminus_2 agent loop ([ddca36c](https://github.com/PlatformNetwork/term-challenge/commit/ddca36cff56f4863469af33f735106290f2dde1a)) |
| 69 | +* correct signature message for my_jobs endpoint ([cd079d7](https://github.com/PlatformNetwork/term-challenge/commit/cd079d7fe4501a65799222fd7b9ec0b6daca7d5a)) |
| 70 | +* decrypt API key before sending to OpenRouter ([4e78be0](https://github.com/PlatformNetwork/term-challenge/commit/4e78be088f043bfb470a53bc6d0a8385073239d1)) |
| 71 | +* deduplicate agent logs by tracking last printed line ([6d6abcd](https://github.com/PlatformNetwork/term-challenge/commit/6d6abcdda4e9e68e14e5cb051c3a85b46a210d8f)) |
| 72 | +* detect and abort stuck agents with consecutive empty responses ([848a3cc](https://github.com/PlatformNetwork/term-challenge/commit/848a3cc620c226fb243aedfde09daf8102ea6b5c)) |
| 73 | +* ensure binutils is installed before PyInstaller ([af6a776](https://github.com/PlatformNetwork/term-challenge/commit/af6a776298e86c428c496a2b57f1a2ad5f25f159)) |
| 74 | +* Harbor-compatible test verification and dynamic challenge_id ([319fdd6](https://github.com/PlatformNetwork/term-challenge/commit/319fdd6a37a19afa6a5a1f49df26afc43d5700be)) |
| 75 | +* improve broker WS error message to include URL ([b8f7877](https://github.com/PlatformNetwork/term-challenge/commit/b8f7877929a75ff8e57c3e8f27ee883a5768db71)) |
| 76 | +* improve Docker error logging for debugging task container failures ([1bffd2a](https://github.com/PlatformNetwork/term-challenge/commit/1bffd2abc2b981c2193143e7132484c1ccbdacf2)) |
| 77 | +* include all migrations (006-009) in embedded migrations list ([83c4245](https://github.com/PlatformNetwork/term-challenge/commit/83c42459acec0b4f0a851e569ac6dfbb3515aa40)) |
| 78 | +* increase limits and reduce validators ([dca4dd5](https://github.com/PlatformNetwork/term-challenge/commit/dca4dd58291463a5b4cc8be31780c4dab49c0cde)) |
| 79 | +* **leaderboard:** show only fully evaluated submissions (status='completed') ([7b7ec1c](https://github.com/PlatformNetwork/term-challenge/commit/7b7ec1c8a305a19eb5909cb475652256643c7e46)) |
| 80 | +* map cache directory paths for Docker-in-Docker mounts ([5c4979d](https://github.com/PlatformNetwork/term-challenge/commit/5c4979d4a210848ec73cca1277be5f7593f91394)) |
| 81 | +* parse pending_jobs field correctly in validator_worker ([146860e](https://github.com/PlatformNetwork/term-challenge/commit/146860e614f22d2bb454778754c9f1ccfb7f4759)) |
| 82 | +* pass LLM proxy env vars to agent binary process ([d630d36](https://github.com/PlatformNetwork/term-challenge/commit/d630d369c26d57c2abe89debf5840fd1635fd981)) |
| 83 | +* preserve HTTP status codes in LLM proxy error handling ([f6aa7bb](https://github.com/PlatformNetwork/term-challenge/commit/f6aa7bbf569cefb87a40741e77ba1e6074519348)) |
| 84 | +* prevent duplicate jobs and add container concurrency limit ([b3e0276](https://github.com/PlatformNetwork/term-challenge/commit/b3e02766e57909c62c4053c3b6df4eccfd68d5af)) |
| 85 | +* PyInstaller extraction issues in task containers ([f73650a](https://github.com/PlatformNetwork/term-challenge/commit/f73650a4c3c7c5e6893ea7515734ce066e87877c)) |
| 86 | +* re-declare TERM_REPO_PATH ARG in Dockerfile.server runtime stage ([5bad625](https://github.com/PlatformNetwork/term-challenge/commit/5bad6252fbd5f511d70157d9089cd631a4c5feb9)) |
| 87 | +* remove global timeout from SDK - let agent builders define their own ([f0ee67f](https://github.com/PlatformNetwork/term-challenge/commit/f0ee67f58c596366f5efdc469045dbac14c8e614)) |
| 88 | +* remove max_steps and timeout_secs from SDK - let agents manage their own limits ([108d262](https://github.com/PlatformNetwork/term-challenge/commit/108d2623a73ae17fa9f921ad030d3e50e3d1a337)) |
| 89 | +* remove restrictive cap_drop, run containers as root ([8bc2f75](https://github.com/PlatformNetwork/term-challenge/commit/8bc2f7578427d882cb14125678991951e2430d6a)) |
| 90 | +* Remove unnecessary borrow in clippy lint ([5277a64](https://github.com/PlatformNetwork/term-challenge/commit/5277a64299b02f30be7faf91414bc02a3b27ceb9)) |
| 91 | +* run verification tests from /workspace directory ([5059f5a](https://github.com/PlatformNetwork/term-challenge/commit/5059f5ac184c54930e9dbe6308f187c7e792dfe1)) |
| 92 | +* **sdk:** add remaining_steps and remaining_secs to AgentContext ([eb6fd06](https://github.com/PlatformNetwork/term-challenge/commit/eb6fd067079d395b6ec28512092af4845ed23369)) |
| 93 | +* send all required fields to log_task API ([f23ec72](https://github.com/PlatformNetwork/term-challenge/commit/f23ec72aba9e98521f6b15e775da60711d620ccf)) |
| 94 | +* set total_validators=2 when queueing submissions + reset window on requeue ([3b0d75f](https://github.com/PlatformNetwork/term-challenge/commit/3b0d75f796001b573cdab4490a7717843aa792d1)) |
| 95 | +* stop agent loop on cost_limit_exceeded and empty responses ([f685359](https://github.com/PlatformNetwork/term-challenge/commit/f685359311cf2d24aae19eaad2c28eddb320e487)) |
| 96 | +* support both 'done' and 'task_complete' in agent response ([9243cbd](https://github.com/PlatformNetwork/term-challenge/commit/9243cbdd88fc2bcf37714d2f09aceb2031d999fd)) |
| 97 | +* update BrokerError to match platform's ContainerError enum format ([496a582](https://github.com/PlatformNetwork/term-challenge/commit/496a58218fb6b86102883fd8227546c55c64f709)) |
| 98 | +* update secure-container-runtime to remove cap_drop restrictions ([a10b952](https://github.com/PlatformNetwork/term-challenge/commit/a10b9523289026d60db30f8260f49359177ecef5)) |
| 99 | +* use /app as standard working directory (matching harbor) ([d58c349](https://github.com/PlatformNetwork/term-challenge/commit/d58c349b35ebf2da4c2db5e006b51443e26b6a34)) |
| 100 | +* use /workspace as default working directory instead of /app ([546af74](https://github.com/PlatformNetwork/term-challenge/commit/546af7413c992d63e4749324568381f2591ec12c)) |
| 101 | +* use bash instead of sh for Harbor test scripts ([0892f5d](https://github.com/PlatformNetwork/term-challenge/commit/0892f5db490df1b7135f86fb88adafcfdc45dc16)) |
| 102 | +* use CHALLENGE_UUID for broker authentication ([2e429a7](https://github.com/PlatformNetwork/term-challenge/commit/2e429a72dc3f503069e0aafb7612774b9f139858)) |
| 103 | +* use correct timeouts from task config ([6b1c812](https://github.com/PlatformNetwork/term-challenge/commit/6b1c8129e048fd718b3a0629c0558ea6224640be)) |
| 104 | +* use exec_shell instead of exec to avoid double shell wrapping ([df0cd46](https://github.com/PlatformNetwork/term-challenge/commit/df0cd46846197b6583ee6885c69156dceb602678)) |
| 105 | +* use fixed 30 task count and deterministic task selection ([c1210ac](https://github.com/PlatformNetwork/term-challenge/commit/c1210ac0a0316c2c074704eefe038bdcf69c5fc0)) |
| 106 | +* use miner's API key directly for LLM security review ([36eff85](https://github.com/PlatformNetwork/term-challenge/commit/36eff853873a941bce24337e50d0ef85de214bef)) |
| 107 | +* use python:3.11 full image for PyInstaller (includes binutils) ([a062d3e](https://github.com/PlatformNetwork/term-challenge/commit/a062d3e5e5711e6a5c1ce4b52761cc7b1006e6b4)) |
| 108 | +* use simple release type with manifest config ([4876e3c](https://github.com/PlatformNetwork/term-challenge/commit/4876e3c4f00cf9d6a923d58f655fc34363e79f2f)) |
| 109 | +* use snake_case serde rename for BrokerResponse to match platform protocol ([999f9ba](https://github.com/PlatformNetwork/term-challenge/commit/999f9bae391d447b3be846c29b74fcf75c3ae437)) |
| 110 | + |
| 111 | + |
| 112 | +### Code Refactoring |
| 113 | + |
| 114 | +* remove direct Docker backend, use container names for HTTP communication ([79120ea](https://github.com/PlatformNetwork/term-challenge/commit/79120ea694e3d4b06f32d5b312d2a37310adcdb5)) |
| 115 | +* remove local platform-repo copying, use git dependency from Cargo.toml ([e52d711](https://github.com/PlatformNetwork/term-challenge/commit/e52d711fb310028a426fd01bdb27f3b8990162c2)) |
| 116 | +* standardize challenge ID to term-challenge, remove CHALLENGE_UUID ([635e53c](https://github.com/PlatformNetwork/term-challenge/commit/635e53c74b8f8276dc4e0c8d3603f7d3a617d717)) |
| 117 | +* use secure-container-runtime types from platform ([c3bfc22](https://github.com/PlatformNetwork/term-challenge/commit/c3bfc22c366faed8a0de5e428569e26ddbe837d6)) |
| 118 | + |
| 119 | + |
| 120 | +### Documentation |
| 121 | + |
| 122 | +* remove remaining_steps/remaining_secs from documentation and examples ([40197be](https://github.com/PlatformNetwork/term-challenge/commit/40197be9f982adcbc6f50ce53db0fe69abe3cd44)) |
| 123 | +* update README with missing features and architecture ([1ecd09f](https://github.com/PlatformNetwork/term-challenge/commit/1ecd09fcc27efaca28aefe13c203ef3e8a3b2152)) |
| 124 | + |
| 125 | + |
| 126 | +### Miscellaneous |
| 127 | + |
| 128 | +* restart CI pipeline ([73a1a6e](https://github.com/PlatformNetwork/term-challenge/commit/73a1a6e1e00c70ed8ff7b3fb838797fdb865d8ab)) |
| 129 | +* update platform dependency with auth fix ([7c70308](https://github.com/PlatformNetwork/term-challenge/commit/7c70308990074a9f412e516530dbdd7a4912423c)) |
| 130 | +* update platform dependency with debug logging ([3750c3b](https://github.com/PlatformNetwork/term-challenge/commit/3750c3bc0f157e78372b9d7362511f3f0626aea1)) |
| 131 | +* update secure-container-runtime dependency to latest build image support ([f020b6d](https://github.com/PlatformNetwork/term-challenge/commit/f020b6d443834b5904489c3ffa4b34045a7c9d0b)) |
| 132 | +* update secure-container-runtime to latest with JWT fix ([8e8de66](https://github.com/PlatformNetwork/term-challenge/commit/8e8de663a2fe0f2e008873a01f364290f540b03b)) |
| 133 | + |
| 134 | + |
| 135 | +### Tests |
| 136 | + |
| 137 | +* add SDK compilation integration tests ([18cbf2d](https://github.com/PlatformNetwork/term-challenge/commit/18cbf2d6018cd5fa38c50ced3c55b5702762c5b5)) |
| 138 | +* add serialization test to verify broker request uses lowercase type ([8181359](https://github.com/PlatformNetwork/term-challenge/commit/8181359d66395c62ebf010077b97e1ab29cb58cc)) |
| 139 | + |
3 | 140 | ## 0.1.0 (2026-01-04) |
4 | 141 |
5 | 142 |