Posters

Medical Image Spatial Grounding with Semantic Sampling

miccai-2026•September 28, 2026 Andrew Seohwan Yu†, Mohsen Hariri†, Kunio Nakamura, Mingrui Yang, Xiaojuan Li, Vipin Chaudhary (†Equal contribution)

MICCAI 2026 poster: MIS-Ground measures 3D anatomical spatial grounding in vision-language models under a controlled factorial design, and MIS-SemSam fixes language-side brittleness at decode time for free.

Multimodal AI Vision-Language Models Medical Imaging Spatial Grounding Benchmarking Evaluation

Quantize What Counts: More for Keys, Less for Values

acl-2026•June 4, 2026 Mohsen Hariri, Alan Luo, Weicong Chen, Tianyi Zhang, Qifan Wang, Xiaotian Han, Vipin Chaudhary

A geometry-driven mixed-precision KV-cache quantization poster showing that keys carry more information than values, so key-favored bit allocation preserves accuracy while reducing memory.

Compression LLMs Efficiency Inference Quantization KV Cache Theory

Ranking Reasoning LLMs under Test-Time Scaling

acl-2026•April 6, 2026 Mohsen Hariri, Michael Hinczewski, Jing Ma, Vipin Chaudhary

Compares 72 ranking methods for reasoning LLMs under repeated sampling across four Olympiad-style math benchmarks, and packages them in Scorio.

Statistics Bayesian LLM Ranking Test-Time Scaling Benchmarking Scorio

Don't Pass@k: A Bayesian Framework for Large Language Model Evaluation

iclr-2026•January 25, 2026 Mohsen Hariri, Amirhossein Samandar, Michael Hinczewski, Vipin Chaudhary

A Bayesian framework that estimates models' success probabilities with quantified uncertainty, yielding more reliable rankings and enabling categorical evaluation of LLMs.

Statistics Bayesian LLM Evaluation Test-Time Scaling Benchmarking Scorio

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float

neurips-2025•December 2, 2025 Tianyi Zhang, Mohsen Hariri, Shaochen Zhong, Vipin Chaudhary, Yang Sui, Xia Hu, Anshumali Shrivastava

NeurIPS 2025 poster on DFloat11: a lossless compression framework that shrinks LLMs and diffusion transformers to ~70% of their size with bit-for-bit identical outputs, plus a GPU kernel that decompresses on the fly.

Compression Efficiency LLMs GPU Lossless Inference Information Theory