🎞️ Slide
Test-Time Scaling Under Budget
M.Sc. Thesis in Computer Science
2 items tagged with "Quantization"
M.Sc. Thesis in Computer Science
Key-favored KV-cache quantization for LLMs: theory shows that keys have larger norms than values and should therefore get more bits; experiments show that 4-bit keys with 2-bit values (4b-K/2b-V) preserve up to 98.3% accuracy while cutting memory.
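
The intuition behind giving keys more bits can be illustrated with a small toy sketch. It is not the thesis implementation. The tensors are synthetic, the 4x scale gap between keys and values is an assumption chosen only to mimic "keys have larger norms", and `quantize_uniform` and `total_error` are hypothetical helper names. It compares reconstruction error (not model accuracy) for two allocations that use the same total bit budget.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Per-row asymmetric uniform quantization, then dequantization."""
    levels = 2 ** bits - 1
    lo = x.min(axis=-1, keepdims=True)
    hi = x.max(axis=-1, keepdims=True)
    # Guard against constant rows, where the range is zero.
    scale = np.where(hi > lo, (hi - lo) / levels, 1.0)
    codes = np.round((x - lo) / scale)
    return codes * scale + lo

rng = np.random.default_rng(0)
# Assumption: keys are drawn with 4x the scale of values (synthetic data).
keys = 4.0 * rng.standard_normal((128, 64))
values = rng.standard_normal((128, 64))

def total_error(k_bits, v_bits):
    """Summed squared reconstruction error over keys and values."""
    k_err = np.sum((keys - quantize_uniform(keys, k_bits)) ** 2)
    v_err = np.sum((values - quantize_uniform(values, v_bits)) ** 2)
    return k_err + v_err

# Both allocations spend 6 bits per key/value element pair.
key_favored = total_error(k_bits=4, v_bits=2)
value_favored = total_error(k_bits=2, v_bits=4)
print(f"4b-K/2b-V error: {key_favored:.1f}")
print(f"2b-K/4b-V error: {value_favored:.1f}")
```

Under these assumptions the key-favored split yields the lower total error, because the quantization step grows with the range of the tensor being quantized, so the larger-scale tensor benefits more from each extra bit.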