The Idea

Here's a surprising finding: when AI gets a multi-step problem wrong and you simply ask "try again," performance actually gets worse with each retry. Vague feedback like "that's wrong, fix it" causes the model to second-guess correct steps while failing to fix the actual error.

Recursive Chain-of-Feedback takes a targeted approach: find the exact step that went wrong, extract it as its own simpler problem, solve that smaller problem (which is much easier to get right), then plug the corrected answer back into the full solution. If the sub-problem is still too hard, decompose it further. It's self-correction that actually works.

Building Blocks

This composition builds on:

- Check Your Work
- Loop Until Done

R-CoF combines self-evaluation (finding errors) with iterative improvement, but adds a crucial element: recursive decomposition. Instead of retrying the whole problem, it isolates and fixes the specific broken piece.

Why "Try Again" Doesn't Work

Naive Retry

Attempt 1: 70% correct
"Try again" → Attempt 2: 65% correct
"Try again" → Attempt 3: 58% correct

Performance degrades because the model overthinks correct steps while missing the real error.

Recursive Chain-of-Feedback

Attempt 1: 70% correct
Find error → Fix step → 85% correct
Find error → Fix step → 92% correct

Performance improves because each correction is targeted and precise.

See It in Action

Question: "A store sells apples for $2 each. John buys 5 apples and pays with a $20 bill. How much change does he get?"

1. AI attempts a solution

Initial attempt:
Step 1: Cost = 5 × $2 = $12
Step 2: Change = $20 − $12 = $8

↓ evaluate — find the error

2. Pinpoint the exact broken step

Error identified: Step 1 is wrong. "5 × $2 = $12" is an incorrect multiplication.

↓ isolate as a simpler problem

3. Solve the sub-problem

Sub-problem: What is 5 × 2?
Sub-answer: 5 × 2 = 10

This simpler question is much easier to get right than retrying the whole problem.

↓ plug the fix back in

4. Reconstruct the corrected solution

Step 1: Cost = 5 × $2 = $10
Step 2: Change = $20 − $10 = $10
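The four steps above can be sketched in code. This is a minimal, self-contained illustration: `check_step` and `solve_subproblem` are hypothetical stand-ins for model calls, implemented here with plain arithmetic so the sketch runs.

```python
def check_step(step):
    """Hypothetical verifier: re-derive the arithmetic and compare."""
    expr, claimed = step
    return eval(expr) == claimed        # e.g. ("5 * 2", 12) -> False

def solve_subproblem(expr):
    """Isolated sub-problem: just the broken expression on its own."""
    return eval(expr)

# 1. Initial (flawed) solution as a list of (expression, claimed) steps
solution = [("5 * 2", 12),              # cost of 5 apples -- wrong
            ("20 - 12", 8)]             # change from a $20 bill

# 2. Pinpoint the first broken step
broken = next(i for i, s in enumerate(solution) if not check_step(s))

# 3. Solve the isolated sub-problem
fix = solve_subproblem(solution[broken][0])

# 4. Plug the fix back in and recompute the downstream step
solution[broken] = (solution[broken][0], fix)       # ("5 * 2", 10)
solution[broken + 1] = (f"20 - {fix}", 20 - fix)    # ("20 - 10", 10)

print(solution[-1][1])  # change John receives -> 10
```

In a real setting, the verifier and sub-problem solver would each be a model call; the structure of the loop is the same.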

The Recursive Part

What if the sub-problem is also too hard? The technique applies itself recursively. Imagine a complex physics problem where the error is in a calculus step, and the calculus sub-problem has an algebra error. R-CoF would:

1. Isolate the broken calculus step as its own sub-problem.
2. Attempt it, find the algebra error, and isolate that as a sub-sub-problem.
3. Solve the algebra, plug it into the calculus step, then plug the corrected calculus step back into the physics solution.

Each level of recursion makes the problem simpler, until it's easy enough to solve correctly.

Why This Works

The insight is that AI errors are usually local, not global. When a 10-step solution goes wrong, typically one or two specific steps contain mistakes while the rest are fine. Retrying the whole thing risks breaking the good steps. Targeted correction fixes only what's broken.

Making the sub-problem simpler is equally important. AI is much more reliable on easy, focused questions than on complex multi-step ones. By extracting "What is 5 × 2?" from a larger problem, you're playing to the model's strengths.

The Composition

Find the exact step that's wrong. Extract it as a simpler problem. Solve it. Plug the fix back in. If the sub-problem is still too hard, go deeper. Surgical precision instead of blind retrying.
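As prompts, the composition is three model calls per round. The templates below are illustrative wording, not taken from the R-CoF paper, and `llm` is a stand-in for any chat-completion call.

```python
# Hypothetical prompt templates for locate -> solve -> rebuild.
LOCATE = (
    "Here is a step-by-step solution:\n{solution}\n"
    "Identify the single incorrect step and restate it as a "
    "standalone question."
)

SOLVE = "Answer this simpler question directly:\n{subproblem}"

REBUILD = (
    "The original solution was:\n{solution}\n"
    "Step {n} should be corrected to: {fix}\n"
    "Rewrite the full solution with only that step changed."
)

def one_round(llm, solution):
    """One locate -> solve -> rebuild pass; repeat while a checker
    still flags an error, recursing if the sub-problem is too hard."""
    subproblem = llm(LOCATE.format(solution=solution))
    fix = llm(SOLVE.format(subproblem=subproblem))
    # n would be parsed from the locate response; fixed at 1 here
    # to keep the sketch short.
    return llm(REBUILD.format(solution=solution, n=1, fix=fix))
```

Keeping the three calls separate is the point: each prompt asks one narrow question, so each answer is easy to verify on its own.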

When to Use This

Use it on multi-step problems (math word problems, logic chains, code) where individual steps can be checked, and where most of a solution is right but one or two steps are broken.

When to Skip This

Skip it for single-step or holistic tasks with no discrete steps to isolate, and for solutions that are wrong from the outset; there, a fresh attempt beats surgical repair.

How It Relates

R-CoF is a more structured approach to the same goal as Reflexion: improving through self-correction. Reflexion works across entire episodes (attempt, reflect, retry), while R-CoF works within a single solution (find the broken step, fix just that piece). They can even be combined: Reflexion for episode-level learning, R-CoF for within-episode step-level correction.

It also relates to Least-to-Most prompting, which decomposes problems into easier sub-problems from the start. R-CoF uses decomposition after failure — only breaking things down when and where errors actually occur, rather than preemptively.