Least-to-Most Prompting — laranevans.com

Least-to-most prompting is a decomposition strategy: the model first breaks a hard problem into a sequence of simpler sub-problems, then solves the sub-problems in order, using each answer as context for the next. Zhou et al. (2022), in "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models", introduced the technique and showed that it improved easy-to-hard generalization on compositional reasoning benchmarks.

The motivating observation: chain-of-thought prompting struggles when the test problem is meaningfully harder than the exemplars. The model imitates the rationale style on similar-complexity problems but fails on problems that require more decomposition steps. Least-to-most addresses this by treating decomposition as the first task and solving as the second.

The mechanism

Two stages, each its own prompt:

1. Decomposition

The model is asked to list the sub-problems that make up the original problem, in dependency order. For "If Amy eats 3 apples on Monday and 5 on Tuesday, and she started the week with a dozen, how many apples remain on Wednesday morning?", the decomposition might be:

How many apples did Amy eat total?
How many apples remain after Tuesday?

The decomposition is itself a chain-of-thought-shaped task. Few-shot exemplars in the decomposition prompt show the model what sub-problem lists look like.
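The decomposition stage can be sketched as a prompt builder plus a parser for the model's reply. This is a minimal illustration, not the prompt from the paper: the exemplar, the instruction wording, and the names `build_decomposition_prompt` and `parse_subproblems` are all assumptions made for the example.

```python
# Hypothetical few-shot exemplars: (problem, sub-problems in dependency order).
DECOMPOSITION_EXEMPLARS = [
    (
        "Tom has 4 pens and buys 2 packs of 3 pens. How many pens does he have?",
        ["How many pens did Tom buy?", "How many pens does Tom have in total?"],
    ),
]


def build_decomposition_prompt(problem, exemplars=DECOMPOSITION_EXEMPLARS):
    """Assemble a few-shot prompt that asks for an ordered sub-problem list."""
    parts = [
        "Break each problem into simpler sub-problems, "
        "listed in the order they must be solved."
    ]
    for ex_problem, ex_subs in exemplars:
        numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(ex_subs, start=1))
        parts.append(f"Problem: {ex_problem}\nSub-problems:\n{numbered}")
    # End on "1." so the model continues the numbered list for the new problem.
    parts.append(f"Problem: {problem}\nSub-problems:\n1.")
    return "\n\n".join(parts)


def parse_subproblems(completion):
    """Turn the model's numbered-list completion into a list of strings."""
    subs = []
    for line in completion.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop a leading "2." or "2)" marker if present.
        head, sep, rest = line.partition(" ")
        if sep and head[-1] in ".)" and head[:-1].isdigit():
            line = rest.strip()
        subs.append(line)
    return subs
```

The parser tolerates a first line with no marker because the prompt already supplies the leading "1.". A real pipeline would probably also need to handle replies that ignore the list format.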

2. Sequential solving

The model solves each sub-problem in order. The answer to sub-problem K becomes part of the context for sub-problem K+1. The final sub-problem's answer is the answer to the original problem.
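A minimal sketch of the solving loop follows, using the Amy example. The `llm` argument is any callable that maps a prompt string to a completion. `fake_llm` is a hypothetical stand-in with canned answers so the example runs without a model. The `Q:`/`A:` layout is likewise an assumption, not a format prescribed by the paper.

```python
from typing import Callable, List, Tuple

AMY_PROBLEM = (
    "If Amy eats 3 apples on Monday and 5 on Tuesday, and she started the week "
    "with a dozen, how many apples remain on Wednesday morning?"
)


def solve_least_to_most(
    problem: str,
    subproblems: List[str],
    llm: Callable[[str], str],
) -> Tuple[str, List[Tuple[str, str]]]:
    """Solve sub-problems in order, carrying every earlier Q/A pair forward."""
    context = f"Problem: {problem}\n"
    solved = []
    for sub in subproblems:
        prompt = f"{context}\nQ: {sub}\nA:"
        answer = llm(prompt).strip()
        solved.append((sub, answer))
        # The answer to sub-problem K becomes context for sub-problem K+1.
        context = f"{prompt} {answer}\n"
    # The last sub-problem's answer is the answer to the original problem.
    final_answer = solved[-1][1] if solved else ""
    return final_answer, solved


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: looks up the last question asked."""
    last_question = prompt.rsplit("Q: ", 1)[1].split("\n", 1)[0]
    canned = {
        "How many apples did Amy eat total?": "3 + 5 = 8",
        "How many apples remain after Tuesday?": "12 - 8 = 4",
    }
    return canned[last_question]


final, steps = solve_least_to_most(
    AMY_PROBLEM,
    ["How many apples did Amy eat total?", "How many apples remain after Tuesday?"],
    fake_llm,
)
print(final)  # 12 - 8 = 4
```

The whole transcript is re-sent on every call, so prompt length grows with the number of sub-problems. That is the token and latency cost the technique trades for accuracy.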

When it helps

Least-to-most shines on tasks where:

The problem has a natural decomposition into smaller independent or sequentially-dependent steps.
The model can solve the sub-problems but cannot solve the composed problem in one pass.
The benchmark or production task involves problems harder than the available exemplars (the easy-to-hard generalization case).

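The original paper reported large gains on the SCAN compositional-generalization benchmark and on last-letter concatenation, and smaller gains on multi-step math word problems, concentrated in the problems that need the most reasoning steps. Practitioner write-ups since then report the same pattern: a task that fails under direct CoT often succeeds when explicitly decomposed.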

When it does not help

The decomposition is wrong. A bad decomposition locks in errors. Sub-problems that omit a critical step or include irrelevant ones propagate into the final answer.
The dependencies are not linear. When sub-problem K needs a quantity that is only computed in a later sub-problem such as K+2, or when the sub-problems depend on each other in a branching pattern, a single ordered sequence breaks down. Tree- or graph-structured decomposition is more appropriate. See Tree of Thoughts.
The task is one-step. Decomposition adds tokens and latency without buying accuracy when the underlying problem has no decomposition.
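The dependency failure above can be checked mechanically if the decomposition step also reports which earlier answers each sub-problem needs. The sketch below assumes that extra output exists as a list of index lists called `deps`, which is a hypothetical format and not part of the original technique. It flags forward references and, when the dependencies are acyclic, uses the standard-library `graphlib` module (Python 3.9+) to propose an order in which every dependency is solved first.

```python
from graphlib import TopologicalSorter


def forward_dependencies(deps):
    """Return (k, d) pairs where sub-problem k needs a sub-problem d that is not earlier.

    deps[k] lists the indices of the sub-problems whose answers k needs.
    """
    return [(k, d) for k, needed in enumerate(deps) for d in needed if d >= k]


def dependency_order(deps):
    """Return an order with every dependency first; raises CycleError on a cycle."""
    graph = {k: set(needed) for k, needed in enumerate(deps)}
    return list(TopologicalSorter(graph).static_order())


# Sub-problem 0 needs the answer to sub-problem 2, which comes later in the list.
deps = [[2], [0], []]
print(forward_dependencies(deps))  # [(0, 2)]
print(dependency_order(deps))      # [2, 0, 1]
```

An empty result from `forward_dependencies` only says the listed order is consistent with the stated dependencies. It says nothing about whether the decomposition itself is complete or correct.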

Comparison to related patterns

Plain CoT produces one rationale-and-answer pair. Least-to-most produces a planned sequence of rationale-and-answer pairs, each contributing to the next.
Self-consistency samples multiple chains for the same problem. Least-to-most samples one chain for each of several sub-problems.
Tree of Thoughts generalizes decomposition into search over a tree of candidate intermediate states.

The three patterns compose. A production system can decompose with least-to-most, sample multiple chains per sub-problem with self-consistency, and reach for Tree of Thoughts when the decomposition itself requires search.
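As one illustration of that composition, the sketch below combines least-to-most with self-consistency by taking a majority vote per sub-problem. It assumes a hypothetical `sample` callable that returns one stochastic completion per call, already reduced to a bare answer string. A real system would need to extract the answer from each sampled rationale before voting.

```python
from collections import Counter


def solve_with_voting(problem, subproblems, sample, n_samples=5):
    """Least-to-most solving with a majority vote over samples at each step."""
    context = f"Problem: {problem}\n"
    answers = []
    for sub in subproblems:
        prompt = f"{context}\nQ: {sub}\nA:"
        # Self-consistency: draw several answers and keep the most common one.
        votes = Counter(sample(prompt).strip() for _ in range(n_samples))
        best, _count = votes.most_common(1)[0]
        answers.append(best)
        # Only the winning answer is carried forward as context.
        context = f"{prompt} {best}\n"
    return answers
```

Voting at every step multiplies the number of model calls by `n_samples`, so this variant is only worth its cost on sub-problems where single samples are unreliable.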

Related

Chain of Thought: the underlying reasoning mechanism.
Tree of Thoughts: the search generalization of decomposition.
Self-Consistency: the voting wrapper that pairs naturally with decomposition.
Prompt Engineering: the broader cluster.
