large reasoning model (LRM)

A large reasoning model (LRM) is a language model optimized for multi-step problem solving that allocates extra computation and uses structured intermediate steps at inference to plan, verify, and refine its answers.
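For example, some reasoning models emit their intermediate steps between explicit delimiters before giving a final answer. The sketch below is illustrative only: it assumes a hypothetical <think>...</think> convention and simply separates the reasoning trace from the answer:

```python
# Illustrative only: assumes a model whose output wraps intermediate
# reasoning in <think>...</think> delimiters before the final answer.

raw_output = (
    "<think>12 apples, give away 5, so 12 - 5 = 7.</think>"
    "The answer is 7."
)

# Split the reasoning trace from the user-facing answer.
reasoning, _, answer = raw_output.partition("</think>")
reasoning = reasoning.removeprefix("<think>")

print("Reasoning trace:", reasoning)
print("Final answer:", answer)
```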

LRMs extend standard large language models (LLMs) with training and inference techniques, which may include:

  • Reinforcement learning on reasoning traces
  • Explicit reasoning tokens for test-time thinking
  • Search or self-consistency to improve correctness (sketched below)
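
As a concrete illustration of self-consistency, the sketch below samples several candidate answers and keeps the majority vote. Here, sample_answer is a hypothetical stand-in for a real model call with nonzero temperature:

```python
import random
from collections import Counter

def sample_answer(question):
    """Hypothetical stand-in for one sampled model completion."""
    # A real implementation would call a model with temperature > 0.
    return random.choice(["42", "42", "41"])

def self_consistent_answer(question, num_samples=5):
    """Sample several reasoning paths and keep the majority answer."""
    answers = [sample_answer(question) for _ in range(num_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistent_answer("What is 6 * 7?"))
```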

Many LRMs also coordinate external tools, such as code execution, as part of their reasoning loop.
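
A minimal sketch of such coordination, assuming a hypothetical ask_model call that returns a tool request, might look like this:

```python
# Minimal sketch of a reason-act loop that lets a model delegate arithmetic
# to a code tool. ask_model is a hypothetical stand-in for a real LRM call.

def ask_model(prompt):
    """Hypothetical model call that requests a tool when it needs one."""
    return {"tool": "python", "code": "result = sum(range(1, 101))"}

def run_tool(code):
    """Execute the model-provided snippet in a scratch namespace."""
    namespace = {}
    exec(code, {}, namespace)  # Real systems sandbox this step.
    return namespace.get("result")

response = ask_model("What is the sum of the integers from 1 to 100?")
if response.get("tool") == "python":
    print(run_tool(response["code"]))  # 5050
```

In a real system, the tool result would be fed back to the model so it can verify or refine its answer before responding.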


By Leodanis Pozo Ramos • Updated Nov. 18, 2025