News

April 6, 2026

🎉 Ranking Reasoning LLMs under Test-Time Scaling Accepted to ACL 2026 Main

April 6, 2026

🎉 Quantize What Counts: More for Keys, Less for Values Accepted to ACL 2026 Findings

January 25, 2026

🎲 Don’t Pass@𝑘: A Bayesian Framework for Large Language Model Evaluation Accepted to ICLR 2026

October 18, 2025

📦 Julia & Python packages for the Bayesian framework are out!

October 15, 2025

📦 vLLM × DFloat11: run your model with 30% less memory!

September 17, 2025

✨ 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Accepted to NeurIPS 2025