Simulating LLM Evaluation Datasets Using Psychometric Models
Explore how Item Response Theory (IRT) and other psychometric models can simulate and analyze LLM evaluation datasets. Learn how difficulty, discrimination, and guessing parameters reveal model reasoning patterns, with interactive examples across multiple reading levels.