Unnecessary Extraction

How often models extract trivial single-use functions instead of keeping code inline

This benchmark measures how often models pull trivial, single-use code into a separate function instead of keeping it inline. A function is flagged as an unnecessary extraction if its body is at most 3 lines long (excluding docstrings) and it is called exactly once.
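The flagging rule can be sketched with Python's standard `ast` module. This is a minimal illustration, not the benchmark's actual checker: the names `body_line_count`, `find_unnecessary_extractions`, and `SAMPLE` are hypothetical. It assumes body length is the span of source lines after any leading docstring, and that calls are matched by bare name only.

```python
import ast
from collections import Counter


def body_line_count(func):
    """Count source lines spanned by a function body, skipping a leading docstring."""
    body = func.body
    if (
        body
        and isinstance(body[0], ast.Expr)
        and isinstance(body[0].value, ast.Constant)
        and isinstance(body[0].value.value, str)
    ):
        body = body[1:]  # drop the docstring statement
    if not body:
        return 0
    return body[-1].end_lineno - body[0].lineno + 1


def find_unnecessary_extractions(source, max_body_lines=3):
    """Return (flagged function names, total number of function definitions)."""
    tree = ast.parse(source)
    funcs = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]

    # Assumption: a call site is attributed to a function purely by its name,
    # so obj.helper() and helper() both count toward "helper".
    calls = Counter()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                calls[node.func.id] += 1
            elif isinstance(node.func, ast.Attribute):
                calls[node.func.attr] += 1

    flagged = [
        f.name
        for f in funcs
        if body_line_count(f) <= max_body_lines and calls[f.name] == 1
    ]
    return flagged, len(funcs)


# Illustrative input only; it is parsed, never executed.
SAMPLE = '''
def read_lines(path):
    """Return the lines of a file."""
    with open(path) as fh:
        return fh.read().splitlines()

def normalize(line):
    return line.strip().lower()

def main(path):
    lines = read_lines(path)
    cleaned = [normalize(x) for x in lines]
    again = [normalize(x) for x in cleaned]
    total = 0
    for item in again:
        total += len(item)
    return total

main("input.txt")
'''

flagged, total = find_unnecessary_extractions(SAMPLE)
print(flagged, total)
```

In this sample, `read_lines` is short and called once, so it is flagged. `normalize` is short but called twice, and `main` is called once but has a longer body, so neither is flagged.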

Ratio — fraction of all function definitions that are unnecessary extractions
Total Functions — average number of function definitions per file
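As a rough sketch of how the two columns could be derived from per-file counts: the page does not say whether Ratio is pooled across all files or averaged per file, so the version below assumes pooling. The function name `summarize` and the sample counts are hypothetical and are not benchmark data.

```python
def summarize(per_file_counts):
    """Compute (ratio, average functions per file).

    per_file_counts: list of (flagged, total) tuples, one per generated file.
    Assumption: the ratio is pooled over all function definitions.
    """
    flagged = sum(f for f, _ in per_file_counts)
    total = sum(t for _, t in per_file_counts)
    ratio = flagged / total if total else 0.0
    avg_functions = total / len(per_file_counts) if per_file_counts else 0.0
    return ratio, avg_functions


# Made-up counts for three files, purely for illustration.
counts = [(1, 4), (0, 2), (2, 6)]
ratio, avg_functions = summarize(counts)
print(f"Ratio: {ratio:.1%}  Total Functions: {avg_functions:.2f}")
```

Reading the two columns together matters: a model that defines very few functions per file can show a high ratio from a single flagged helper.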

50 Python prompts, each requesting a multi-stage utility script with enough distinct steps to tempt extraction into tiny single-use helpers.

Model	Ratio	Total Functions
GPT-5.4	24.9%	3.30
GPT OSS 20B	16.0%	4.37
GPT OSS 120B	14.7%	5.67
Llama 3.3 70B	14.2%	1.13
Claude Sonnet 4.6	13.4%	10.63
MiniMax-M2.1	10.6%	2.44
Kimi K2.5	7.7%	3.65
DeepSeek V3.2	5.4%	6.27
GLM 4.7	5.4%	1.77
GLM-5	3.2%	3.67

Last updated 12 March 2026 at 02:28