Post
Inference
Tag: Inference
3 items tagged with "Inference"
Paper
Don't Pass@k: A Bayesian Framework for Large Language Model Evaluation
Mohsen Hariri, Amirhossein Samandar, Michael Hinczewski, Vipin Chaudhary
A Bayesian framework for evaluating large language models that replaces unstable Pass@k metrics with robust posterior estimates and credible intervals. This method improves sample efficiency, supports graded outcomes, and enables statistically sound model comparisons.
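As a rough illustration of what a posterior estimate with a credible interval looks like, here is a minimal sketch for the simplest case: binary pass/fail outcomes with a uniform Beta(1, 1) prior. The prior, the function names, and the Monte Carlo interval are assumptions made for this example and are not taken from the paper, which also covers graded outcomes.

```python
import random

def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Return (a, b) of the Beta posterior over a model's success rate.

    Assumes binary outcomes and a Beta(prior_a, prior_b) prior
    (uniform by default); both are illustrative choices.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return prior_a + successes, prior_b + (trials - successes)

def posterior_mean(a, b):
    """Closed-form mean of a Beta(a, b) distribution."""
    return a / (a + b)

def credible_interval(a, b, level=0.95, draws=20000, seed=0):
    """Equal-tailed credible interval, estimated from posterior samples."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo_idx = int((1 - level) / 2 * draws)
    hi_idx = int((1 + level) / 2 * draws) - 1
    return samples[lo_idx], samples[hi_idx]

def prob_a_beats_b(post_a, post_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_A > rate_B) under independent posteriors."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(*post_a) > rng.betavariate(*post_b)
        for _ in range(draws)
    )
    return wins / draws

if __name__ == "__main__":
    a, b = beta_posterior(successes=7, trials=10)
    print("posterior mean:", posterior_mean(a, b))
    print("95% credible interval:", credible_interval(a, b))
```

Comparing two models then reduces to calling `prob_a_beats_b` on their two posteriors, which gives a direct probability that one model's success rate exceeds the other's instead of a bare difference in point estimates.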
Paper
Quantize What Counts: More For Keys, Less For Values
Mohsen Hariri, Alan Luo, Weicong Chen, Shaochen Zhong, Tianyi Zhang, Qifan Wang, Xia Hu, Xiaotian Han, Vipin Chaudhary
Key-favored KV-cache quantization for LLMs: theory shows that keys have larger norms than values and should therefore get more bits; empirically, 4-bit keys with 2-bit values (4b-K/2b-V) preserve up to 98.3% accuracy while cutting memory.
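To make the bit allocation concrete, the sketch below applies plain per-vector min-max uniform quantization at two bit widths and counts the bits a key/value pair would occupy. The quantizer, the 16-bit baseline, and the function names are assumptions chosen for illustration; the paper's actual quantization scheme may differ, and the memory figure ignores the storage for scales and offsets.

```python
import random

def quantize(vec, bits):
    """Uniform min-max quantization of one vector, returned dequantized.

    A generic textbook scheme used only for illustration.
    """
    levels = (1 << bits) - 1          # number of steps between min and max
    lo, hi = min(vec), max(vec)
    if hi == lo:                      # constant vector: nothing to quantize
        return list(vec)
    scale = (hi - lo) / levels
    return [lo + round((x - lo) / scale) * scale for x in vec]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def kv_memory_ratio(key_bits, value_bits, baseline_bits=16):
    """Bits per key/value element pair relative to an assumed 16-bit baseline."""
    return (key_bits + value_bits) / (2 * baseline_bits)

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: keys drawn with a larger spread than values.
    key = [rng.gauss(0.0, 4.0) for _ in range(64)]
    value = [rng.gauss(0.0, 1.0) for _ in range(64)]
    print("key error at 4 bits:  ", mse(key, quantize(key, 4)))
    print("key error at 2 bits:  ", mse(key, quantize(key, 2)))
    print("value error at 2 bits:", mse(value, quantize(value, 2)))
    print("4b-K/2b-V memory vs 16-bit:", kv_memory_ratio(4, 2))
```

With this quantizer the step size grows with the vector's range, so a larger-magnitude vector incurs a larger absolute error at the same bit width, which is the intuition behind spending the extra bits on keys.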