The Problem with One-Shot AI Code Generation

One of the most frustrating experiences when using LLMs for code generation is hallucinated code: output that looks correct but crashes the moment it runs. In 2025, the solution is not just a better prompt. It is a better architecture.

The Reflexion pattern gives your agent a retry loop with feedback:

  • Generate — the agent writes the code
  • Execute — the code is tested against a real runtime
  • Evaluate — success or failure is detected
  • Reflect — on failure, the error is fed back to the agent
  • Retry — the agent produces a corrected version

Evaluations back this up: the original Reflexion paper (Shinn et al., 2023) reports GPT-4's pass@1 on the HumanEval benchmark rising from roughly 80% to 91% once this self-correction loop is applied. In n8n, the same loop turns fragile automations into resilient, self-healing systems.
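
Condensed to pseudocode, the whole pattern is a bounded retry loop. In the sketch below, generate, execute, and reflect are hypothetical helpers standing in for the n8n nodes described in the rest of this article:

async function solveWithReflexion(problem) {
  let code = await generate(problem);           // Generate
  for (let attempt = 1; attempt <= 3; attempt++) {
    const result = await execute(code);         // Execute against a real runtime
    if (result.stderr === '') return code;      // Evaluate: empty stderr means success
    code = await reflect(code, result.stderr);  // Reflect: feed the exact error back,
  }                                             // then Retry on the next pass
  throw new Error('Failed after 3 reflection attempts.');
}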

The Architecture: A Conditional Loop

Unlike a linear flow, this workflow uses a conditional loop. The four actors are:

• Generator Agent — writes the initial code
• Executor (Code Node / HTTP Request) — runs the code
• Evaluator (If Node) — routes to the success or failure path
• Reflector Agent — reads the error and generates a corrected version

Step 1: The Generator Agent

Start with an AI Agent Node (or Basic LLM Chain).

Set the System Prompt:

You are a Python expert.
Generate only the code, no markdown explanations.
The code must solve this problem: {{$json.problem}}

This node outputs raw Python code as a string.
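
In practice, models often wrap their answer in markdown fences despite the instruction. A small Code node right after the agent can strip them defensively. This is a minimal sketch; it assumes the agent's text arrives in a field named output and stores the result as output_code for the next step:

// n8n Code node (sketch): normalize the agent's reply into a bare code string
const raw = $json.output ?? '';
const output_code = raw
  .replace(/^```[a-zA-Z]*\s*\n?/, '')   // drop a leading ```python / ``` fence
  .replace(/\n?```\s*$/, '')            // drop a trailing ``` fence
  .trim();
return { ...$json, output_code };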

Step 2: The Execution Test

Connect the agent's output to an HTTP Request Node calling an external code execution API. We use the Piston API for safety:

{
  "url": "https://emkc.org/api/v2/piston/execute",
  "method": "POST",
  "body": {
    "language": "python",
    "version": "3.10.0",
    "files": [{ "content": {{ JSON.stringify($json.output_code) }} }]
  }
}

Note the JSON.stringify: inserting the raw string as "{{$json.output_code}}" breaks the request body as soon as the generated code contains quotes or newlines.

The response includes both run.stdout and run.stderr; a failed execution shows up as a non-empty run.stderr.
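
To see the response shape outside n8n, here is a minimal sketch of the same call using the built-in fetch in Node 18+, with a deliberately failing snippet:

// Standalone check of the Piston execute endpoint (Node 18+, run as an ES module)
const res = await fetch('https://emkc.org/api/v2/piston/execute', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    language: 'python',
    version: '3.10.0',
    files: [{ content: 'print(1 / 0)' }],   // guaranteed ZeroDivisionError
  }),
});
const data = await res.json();
console.log(data.run.stderr);               // the traceback the Reflector will read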

Step 3: The Evaluator (If Node)

Add an If Node with the condition:

• True path (success): run.stderr is empty → deliver the working code
• False path (failure): run.stderr is not empty → trigger reflection
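
In expression form, the same check is a single comparison on the incoming item (assuming the Piston response is the If Node's direct input):

{{ $json.run.stderr === "" }}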

Step 4: The Reflector Agent

On the False path, add a second AI Agent Node with these dynamic inputs:

• {{$json.output_code}} — the code that failed
• {{$json.run.stderr}} — the exact error message

System Prompt:

You are a debugging agent.
The following code failed: {{$json.output_code}}
The error was: {{$json.run.stderr}}
Analyze why it failed.
Output ONLY the corrected code — no explanations.
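
One wrinkle: the HTTP Request node's output replaces the item, so the failed code is no longer on $json by the time execution fails. A small Code node on the False path can stitch both pieces back together before the Reflector; the node name 'Generator Agent' below is an assumption, so match it to your own workflow:

// n8n Code node (sketch): combine the failed code with its runtime error
const output_code = $('Generator Agent').item.json.output_code;  // assumed node name
const stderr = $json.run.stderr;                                 // from the Piston response
return { output_code, run: { stderr } };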

Step 5: Closing the Loop

Connect the Reflector Agent's output back to the Executor's input. Critical: add a loop counter to prevent infinite retries.

// At the start of the loop: count attempts and bail out after three
const attempts = $json.attempts || 0;
if (attempts >= 3) {
  throw new Error('Failed after 3 reflection attempts.');
}
return { ...$json, attempts: attempts + 1 };

This caps the loop at three attempts. After the third failure, the workflow surfaces the error for human review rather than spinning endlessly.

Why This Works

The key insight is that the feedback signal — the actual runtime error — contains far more information than any static prompt improvement. The model does not guess what went wrong. It reads the exact traceback and fixes the precise issue.

This pattern is especially powerful for:

• Data transformation scripts — where schema mismatches cause subtle errors
• API integration code — where authentication or endpoint formats change
• JSON generation — where structural validation fails on the first attempt

Implementing Reflexion in n8n turns your automation from a one-shot gamble into a resilient, self-healing system that improves with every iteration.