Poster
Ranking Reasoning LLMs under Test-Time Scaling
Mohsen Hariri, Michael Hinczewski, Jing Ma, Vipin Chaudhary
ACL 2026 Main
Compares 72 methods for ranking reasoning LLMs under repeated sampling across four Olympiad-style math benchmarks, and packages the methods in Scorio.