System Prompt and Identity Engineering

Defining who an agent is before deciding what it can do.

Overview

A system prompt is not a list of rules. It is the agent's identity: its role, its tone, its boundaries, and the format it is expected to respond in. Most unreliable agents trace back to an identity that is either too rigid (breaks on the first edge case) or too vague (the model fills the gaps with its own assumptions, inconsistently, turn after turn). The goal of identity engineering is to write a prompt specific enough to constrain behavior, while general enough to let the model use judgment.

Structure over prose

Treat the system prompt as a document with sections, not a paragraph of instructions. Use clear delimiters — XML tags or Markdown headers — to separate role definition, behavioral rules, tool guidance, and output format. A model parses structured sections far more reliably than it parses a wall of mixed instructions, because each section becomes a distinct unit it can attend to independently rather than one undifferentiated block of text.

Minimal but complete

"Minimal" does not mean short. It means the prompt should contain the smallest set of information that fully specifies the desired behavior — no redundant restatement of the same rule in three different ways, no defensive over-specification for edge cases that will never occur. Start with the leanest version that could plausibly work, observe where the agent's behavior diverges from what you wanted, and add precision only where a real failure occurred. Prompts grown this way stay maintainable; prompts grown by pre-emptively listing every conceivable edge case become brittle and contradictory.

Few-shot examples over exhaustive rules

Rather than enumerating every rule the agent should follow, provide a small number of canonical examples that demonstrate the expected behavior end to end. A handful of well-chosen examples teaches a pattern; a long list of if-then rules teaches the model to pattern-match on phrasing instead of understanding intent.

Identity drift over long sessions

In long-running conversations, an agent's adherence to its system prompt degrades as context fills with tool outputs, retrieved documents, and accumulated history — the original identity gets diluted by sheer token volume. Mitigate this by periodically reinforcing key identity constraints in injected context (not by repeating the entire system prompt, but by surfacing a condensed reminder of the highest-priority constraints at decision points), and by treating identity as something that needs active maintenance across a session rather than something fixed once at the start.

Structured system prompt skeleton placeholder

{
  "system_prompt": {
    "role": "<role>You are a senior automation architect for n8n and Make workflows.</role>",
    "constraints": "<constraints>Never invent API endpoints. Ask before destructive actions.</constraints>",
    "tool_guidance": "<tool_guidance>Use search_docs before answering integration questions.</tool_guidance>",
    "output_format": "<output_format>Respond in Markdown with numbered steps.</output_format>",
    "few_shot_examples": [
      {
        "user": "How do I retry a failed webhook?",
        "assistant": "1. Open the workflow... 2. Add an Error Trigger node..."
      }
    ]
  }
}

Part II — XML section architecture

Production system prompts should read like structured documents, not paragraphs of mixed instructions. Anthropic's cookbook patterns demonstrate that XML-delimited sections — <role>, <constraints>, <tool_guidance>, <output_format>, and <examples> — give the model discrete units to attend to independently. Sections that must never be violated belong in <constraints>; sections that shape tone belong in <role>; sections that define callable behavior belong in <tool_guidance>.

Separating reasoning from output is equally important. Patterns like <thinking> (internal) versus <response> (user-facing) prevent chain-of-thought leakage while still allowing the model to deliberate. The OpenAI prompt taxonomy — role, context, task, format — maps cleanly onto this structure: role and constraints are stable; context and task vary per turn; format is non-negotiable across turns.

Part II — Canonical example design

Two or three end-to-end examples outperform twenty conditional rules. Each canonical example should demonstrate: the user input shape you expect, the reasoning style you want (without exposing raw chain-of-thought if your product forbids it), the tool calls that should or should not fire, and the exact output format — including edge-case handling visible in the example itself.

Choose examples that cover different branches of behavior, not variations of the same happy path. For an n8n advisor agent, one example might show a safe read-only diagnostic; another might show refusing a destructive workflow change and asking for confirmation. The model generalizes from pattern, not from rule phrasing — examples teach intent; rules teach keyword matching.

Part II — Identity reinforcement schedule

Identity drift is predictable: it accelerates after long tool-output chains, after context compaction, and after retrieval injects large document blocks. A harness should reinforce identity at decision points, not by re-injecting the full system prompt (expensive and noisy), but by surfacing a condensed constraint block — typically three to five bullets covering non-negotiables: scope limits, output format, and safety boundaries.

Practical triggers for reinforcement: before any destructive tool call, after compaction events, when the session exceeds N turns, and when the agent switches task domains mid-session. Treat identity as operational state the harness maintains, not static text written once at session start.

Part II — Prompt versioning and the eval loop

Mature teams version system prompts like application code: semantic versioning, changelog entries, and regression evals before every deploy. A prompt change that fixes one failure mode often introduces another — the only safe workflow is edit one section → run regression suite → inspect trajectory diffs → promote or rollback.

Case study: An n8n support agent began suggesting webhook URLs that did not exist after turn twelve. Root cause: identity constraints were buried mid-context beneath tool outputs. Fix: move non-negotiable constraints to a cached static prefix, add a pre-tool-call identity reminder, and add a regression eval that runs a twelve-turn scenario with injected tool noise. Pass rate went from 61% to 94% without lengthening the base prompt.

Production system prompt — n8n workflow advisor

{
  "version": "1.3.0",
  "sections": {
    "role": "<role>You are a senior n8n automation architect. Be direct, precise, and conservative about destructive changes.</role>",
    "constraints": "<constraints>Never invent node types or API endpoints. Refuse unsandboxed credential access. Ask before deleting workflows.</constraints>",
    "tool_guidance": "<tool_guidance>Call search_templates before recommending patterns. Use get_node_docs for version-specific parameters.</tool_guidance>",
    "output_format": "<output_format>Respond in Markdown: Summary, Steps, Risks, Verification checklist.</output_format>",
    "examples": [
      {
        "user": "My HTTP node keeps timing out on retries.",
        "assistant": "## Summary\nLikely retry config or upstream latency...\n## Steps\n1. Add Error Trigger...\n## Risks\nUnbounded retries can stall the queue.\n## Verification\n- Confirm P95 latency < 30s on test webhook"
      }
    ]
  }
}