
Hi, I'm Mohsen,
and I love math ❤️. I work on these things.
I'm always excited to start new collaborations, especially when it's something new to me. Feel free to reach out by email; you'll find it in the footer.
Recent Research
Quantize What Counts: More for Keys, Less for Values
A poster on geometry-driven mixed-precision KV-cache quantization, showing that keys carry more information than values, so key-favored bit allocation preserves accuracy while reducing memory.
Ranking Reasoning LLMs under Test-Time Scaling
Ranking reasoning LLMs under repeated sampling, comparing 72 ranking methods across four Olympiad-style math benchmarks and packaging them in Scorio.
Don't Pass@k: A Bayesian Framework for Large Language Model Evaluation
A Bayesian framework that estimates models' success probabilities with quantified uncertainty, yielding more reliable rankings and enabling categorical evaluation of LLMs.
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
NeurIPS 2025 poster on DFloat11: a lossless compression framework that shrinks LLMs and diffusion transformers to ~70% of their size with bit-for-bit identical outputs, plus a GPU kernel that decompresses on the fly.
Recent Posts
Entropy of bfloat16 During Training: How Optimizers Shape Weight Distributions
Entropy of bfloat16: 8 Bits Are Doing 2.6 Bits of Work
Simulating LLM Evaluation Datasets Using Psychometric Models
Recent Slides
Serving Reasoning LLMs Efficiently and Reliably [No Anime]
Serving reasoning LLMs efficiently and reliably: lossless DFloat11 compression, KV-cache quantization, and Bayes@N evaluation and ranking under test-time scaling.
Serving Reasoning LLMs Efficiently and Reliably
Serving reasoning LLMs efficiently and reliably: lossless DFloat11 compression, KV-cache quantization, and Bayes@N evaluation and ranking under test-time scaling.
Python Environments
Python environments, how to create and reproduce them, and when to use pip, conda, micromamba, uv, pipx, lockfiles, and containers.