Poster
Quantize What Counts: More for Keys, Less for Values
Mohsen Hariri, Alan Luo, Weicong Chen, Tianyi Zhang, Qifan Wang, Xiaotian Han, Vipin Chaudhary
This poster presents a geometry-driven, mixed-precision approach to KV-cache quantization. It shows that keys carry more information than values, so allocating more bits to keys than to values preserves accuracy while reducing memory.
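To make the memory side of key-favored bit allocation concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: the model shape, the 4-bit key / 2-bit value split, and the function names `kv_cache_bytes` and `quantize_uniform` are all assumptions chosen for this example, and the simple min-max quantizer stands in for whatever scheme the poster actually uses.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, key_bits, value_bits):
    """Approximate KV-cache payload in bytes for one sequence.

    Ignores quantization metadata (scales, zero-points), so real usage is
    slightly higher. All arguments are hypothetical example parameters.
    """
    # Keys and values each hold this many elements across all layers.
    elems_per_tensor = num_layers * num_kv_heads * head_dim * seq_len
    total_bits = elems_per_tensor * (key_bits + value_bits)
    return total_bits // 8


def quantize_uniform(xs, bits):
    """Min-max uniform quantization of a list of floats.

    Returns the dequantized values so the rounding error can be inspected.
    """
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return list(xs)  # constant input: nothing to quantize
    levels = (1 << bits) - 1          # number of steps between lo and hi
    scale = (hi - lo) / levels
    return [lo + round((x - lo) / scale) * scale for x in xs]


# Hypothetical model shape: 32 layers, 8 KV heads, head_dim 128, 4096 tokens.
shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=4096)

fp16_bytes = kv_cache_bytes(**shape, key_bits=16, value_bits=16)  # 536870912
k4v2_bytes = kv_cache_bytes(**shape, key_bits=4, value_bits=2)    # 100663296

# More bits give a finer grid, so the worst-case rounding error shrinks.
sample = [0.0, 0.1, 0.5, 1.0]
err_4bit = max(abs(a - b) for a, b in zip(sample, quantize_uniform(sample, 4)))
err_2bit = max(abs(a - b) for a, b in zip(sample, quantize_uniform(sample, 2)))
```

Under these assumed settings, the 4-bit key / 2-bit value cache takes 96 MiB where the 16-bit cache takes 512 MiB. The sketch says nothing about accuracy; that depends on the poster's own analysis of keys versus values.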