🎞️ Slide
Test-Time Scaling Under Budget
M.Sc. Thesis in Computer Science
4 items tagged with "Compression"
M.Sc. Thesis in Computer Science
DFloat11 is a lossless compression framework that shrinks LLMs to about 70% of their original size while keeping outputs bit-for-bit identical. It supports efficient GPU inference, which makes it possible to run Llama 3.1 405B on a single node.
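As a rough back-of-the-envelope check of the single-node claim, the sketch below estimates the weight footprint before and after compression. The 70% ratio and the 405B parameter count come from the blurb above; the 2 bytes per parameter (BF16 weights) and the 8 × 80 GB node are assumptions made for illustration, and the helper `weights_gb` is hypothetical. The estimate covers weights only and ignores KV cache, activations, and framework overhead.

```python
def weights_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


N_PARAMS = 405e9          # Llama 3.1 405B (from the text)
BYTES_PER_PARAM = 2       # assumption: weights stored in BF16
COMPRESSION_RATIO = 0.70  # "70% of original size" (from the text)
NODE_GB = 8 * 80          # assumption: one node with 8 GPUs of 80 GB each

bf16_gb = weights_gb(N_PARAMS, BYTES_PER_PARAM)
compressed_gb = bf16_gb * COMPRESSION_RATIO

print(f"uncompressed weights: {bf16_gb:.0f} GB")
print(f"compressed weights:   {compressed_gb:.0f} GB")
print(f"node GPU memory:      {NODE_GB} GB")
print("fits uncompressed:", bf16_gb <= NODE_GB)
print("fits compressed:  ", compressed_gb <= NODE_GB)
```

Under these assumptions the uncompressed weights exceed the node's total GPU memory, while the compressed weights fit with some headroom left for runtime state.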