on https://labs.scale.com/leaderboard/swe_bench_pro_public
it states that glm-4.6 scores about 9.67%
but when i check the trajectories via https://docent.transluce.org/dashboard/032fb63d-4992-4bfc-911d-3b7dafcb931f
there is no glm-4.6 entry, only glm-4.5-10222025.
glm-4.5-10222025 shows 259 resolved instances out of 731 public instances.
259/731 is definitely a lot higher than 9%.
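To make the gap concrete, here is a quick sanity check of the arithmetic. It assumes that both numbers use the same 731-instance denominator, which I have not confirmed; the variable names are only for illustration.

```python
# Numbers taken from the dashboard and the leaderboard as quoted above.
resolved = 259          # resolved instances shown for glm-4.5-10222025
total = 731             # public instances (assumed denominator for both figures)
leaderboard_pct = 9.67  # leaderboard score reported for glm-4.6

# Resolve rate implied by the dashboard counts.
dashboard_pct = 100 * resolved / total

# Number of resolved instances the leaderboard score would imply
# if it were computed over the same 731 instances.
implied_resolved = leaderboard_pct / 100 * total

print(f"dashboard resolve rate: {dashboard_pct:.2f}%")
print(f"instances implied by leaderboard score: {implied_resolved:.1f}")
```

So the dashboard count works out to roughly 35%, while a 9.67% score over the same set would correspond to only about 71 resolved instances.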
What's the issue?