The Effect of Sampling Temperature on Problem Solving in Large Language Models

Abstract

In this research study, we empirically investigate the effect of sampling temperature on the performance of Large Language Models (LLMs) on various problem-solving tasks.

We created a multiple-choice question-and-answer (MCQA) exam by randomly sampling problems from standard LLM benchmarks. Then, we used nine popular LLMs with five prompt-engineering techniques to solve the MCQA problems while increasing the sampling temperature from 0.0 to 1.6.

Despite anecdotal reports to the contrary, our empirical results indicate that changes in temperature from 0.0 to 1.0 do not have a statistically significant impact on LLM performance for problem-solving tasks. In addition, these results appear to generalize across LLMs, prompt-engineering techniques, and problem domains.
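For readers unfamiliar with the parameter under study, the following is a minimal sketch of how sampling temperature is commonly applied to a model's output logits before a token is drawn. It is an illustration only and is not taken from this repository's source code. The function names `apply_temperature` and `sample_token` and the example logits are hypothetical. Treating a temperature of 0.0 as greedy (argmax) selection is a common convention assumed here; individual LLM providers may handle it differently.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Turn raw logits into sampling probabilities at the given temperature.

    Hypothetical helper for illustration; not part of this repository.
    """
    if temperature < 0.0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0.0:
        # Assumed convention: temperature 0.0 means greedy decoding,
        # so all probability mass goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    # Divide logits by the temperature, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = apply_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
    for t in (0.0, 0.5, 1.0, 1.6):
        probs = apply_temperature(example_logits, t)
        print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution and make less likely tokens more probable. The study summarized above varies this one parameter while holding the rest of the setup fixed.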

Documents

Research paper
Pre-print paper
Research poster
Presentation video
Presentation slides
[Research homepage](https://matthewrenze.com/research/the-effect-of-sampling-temperature-on-llms/)

Code

Source - contains all source code
Models - contains the model-specific code
Prompts - contains LLM agent prompt code
Exams - contains the code to load exams

Data

Exams - contains the test dataset
Results - contains the high-level test results
Details - contains the low-level test results
Responses - contains the LLM response text
Logs - contains the experiment event logs

Analysis

Plots - contains all data visualizations

Notes

Source contains all scripts for experiments, processing, and analysis.
See Requirements.txt for a list of packages used in this experiment.
GitHub Copilot was used in the creation of this experiment.
