Recent studies show that LLMs with chain-of-thought (CoT) reasoning achieve impressive problem-solving performance. However, they sometimes produce overly short reasoning, which lowers accuracy even on simple problems. We identify that reasoning length is encoded as a linear direction in the hidden space, and propose ThinkEdit, a lightweight weight-editing method that suppresses overly short reasoning by modifying only 0.1% of model parameters. By targeting a small subset of attention heads (~2%), ThinkEdit improves accuracy on short-reasoning cases (+5.44%) and enhances overall performance (+2.43%) across math benchmarks. Our work offers new insights into controlling reasoning behavior inside LLMs.
Figure 1: Overview of ThinkEdit.
We observe a consistent issue across DeepSeek-distilled reasoning models: significantly lower accuracy when the reasoning length is short. This pattern holds across datasets such as GSM8K and MATH-Level5. As shown in the following figure, cumulative accuracy drops sharply for responses with reasoning length below 2000 tokens, contrary to the intuition that shorter reasoning should correspond to easier problems. Instead of solving simple tasks efficiently, models often fail when generating overly brief chains of thought.
Figure 2: The performance of all DeepSeek-distilled reasoning models degrades significantly when the reasoning length is too short. The x-axis represents the cutoff threshold on reasoning length, and the y-axis shows the corresponding cumulative accuracy.
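Concretely, cumulative accuracy at a cutoff T is the accuracy computed over only those responses whose reasoning length is at most T tokens. A minimal Python sketch of this analysis (variable names are illustrative, not from the official codebase):

```python
def cumulative_accuracy(lengths, correct, cutoff):
    """Accuracy over responses whose reasoning is at most `cutoff` tokens.

    lengths: list of reasoning lengths in tokens, one per response.
    correct: list of 0/1 correctness flags, one per response.
    """
    kept = [c for l, c in zip(lengths, correct) if l <= cutoff]
    return sum(kept) / len(kept) if kept else float("nan")

# e.g. accuracy of responses with reasoning below 2000 tokens:
# acc = cumulative_accuracy(lengths, correct, cutoff=2000)
```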
Building on our discovery of the reasoning-length direction, we propose ThinkEdit, which removes the short-reasoning direction from the output projections of the identified attention heads as follows:
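The exact editing rule is given in the paper; below is a minimal PyTorch sketch of the idea, assuming `d` is the unit-norm short-reasoning direction in the residual-stream (hidden) space and `W` is the slice of a layer's `o_proj` weight that maps one identified head's output into the residual stream (the [hidden_dim, head_dim] layout follows the Hugging Face convention and is our assumption, not the official implementation):

```python
import torch

def edit_head_o_proj(W: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
    """Remove the short-reasoning direction from one head's output projection.

    W: [hidden_dim, head_dim] o_proj slice for a short-reasoning head.
    d: [hidden_dim] short-reasoning direction.
    Returns (I - d d^T) W, so the head can no longer write along d.
    """
    d = d / d.norm()                  # ensure unit norm
    return W - torch.outer(d, d @ W)  # project d out of every column
```

Because only the o_proj slices of the ~2% identified heads are rewritten, the edit touches roughly 0.1% of the model's parameters.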
Experiments demonstrate that ThinkEdit effectively mitigates overly short reasoning in reasoning models, boosting accuracy on short-reasoning cases by up to +5.44% and improving overall performance by +2.43% across multiple math benchmarks.
Figure 5: Heatmap illustrating the short reasoning contribution for each attention head. Heads with higher values (in red) show stronger alignment with short reasoning behavior. The short-reasoning heads are sparse in reasoning models.
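A natural way to obtain such a heatmap, sketched below under our assumptions, is to score each head by how strongly its average contribution to the residual stream aligns with the short-reasoning direction; this is an illustrative reconstruction, not the paper's exact formula (in particular, we use a single direction `d`, whereas the direction may be extracted per layer):

```python
import torch

def short_reasoning_scores(head_outputs: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
    """Score every attention head's alignment with the short-reasoning direction.

    head_outputs: [num_layers, num_heads, hidden_dim] mean per-head
                  contribution to the residual stream (assumed layout,
                  e.g. averaged over short-reasoning responses).
    d: [hidden_dim] short-reasoning direction.
    Returns a [num_layers, num_heads] grid; large values (red in the
    heatmap) mark heads that push the model toward short reasoning.
    """
    d = d / d.norm()
    return head_outputs @ d
```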
Table 1: Overall accuracy (%) of each model before and after applying ThinkEdit. ThinkEdit improves the reasoning models across all benchmarks.
Table 2: Accuracy (%) of the top 5% / 10% / 20% shortest reasoning responses. ThinkEdit significantly improves correctness when the reasoning is short.
Table 3: Average reasoning length (in tokens) for the top 5% / 10% / 20% shortest responses. ThinkEdit slightly increases reasoning length when the reasoning is overly short.
ThinkEdit demonstrates that small-scale weight editing can correct overly short reasoning in LLMs, leading to substantial improvements in accuracy. This work provides new mechanistic insights into reasoning-length control and opens up avenues for further fine-grained model interventions.
Chung-En Sun, Ge Yan, Tsui-Wei Weng. "ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models", arXiv Preprint 2025.
@article{thinkedit,
title={ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models},
author={Sun, Chung-En and Yan, Ge and Weng, Tsui-Wei},
journal={arXiv preprint},
year={2025},
url={https://github.com/Trustworthy-ML-Lab/ThinkEdit}
}