Skip to content

Chapter 4

The Inner Critic: How Agents Learn to Check Themselves

Interactive Graph (beta)

Toggle graph

Inspired by Error Monitoring and Meta-Cognition in the Brain

Have you ever sent an email and then, seconds later, thought: “Wait… did I attach the file?” That moment of self-checking — catching an error before it causes damage — is your brain’s inner critic at work.

We rely on this mechanism constantly: rereading a message, proofreading an essay, double-checking a math solution. It’s not about generating the first draft of action, but about monitoring and refining what’s already been produced.

Agents need the same. If they only ever produce a first response and never check their own work, they’ll forever be brittle. But if they can look back, evaluate, and revise, they start to feel more intelligent — more like a system that’s aware of its own limits.

The Brain’s Analogy: The Error Detector

Neuroscience points to the anterior cingulate cortex (ACC) as one of the brain’s “oops detectors.” It lights up when we make mistakes or when something doesn’t quite add up. Pair that with the prefrontal cortex, which steps in to adjust behavior, and you have the neural basis for self-correction.

For agents, this means building a feedback loop: not just generating an answer, but pausing to ask, “Does this make sense? Could this be better?”

Why Agents Need an Inner Critic

Example task: “Write a two-paragraph summary of climate change policies.” With an inner critic:

  1. Write a first draft.
  2. Review against criteria (coverage, balance, tone).
  3. Revise based on critique.

Real-World Scenarios

  • Creative Writing: Draft → critique → rewrite till narrative flows.
  • Code Generation: Write → test → fix → repeat until green.
  • Strategic Planning: Evaluate whether each step advances the goal; adjust if not.
  • Conversation: Detect misunderstandings and correct next turn.

Sketch of the Pattern

def agent_with_inner_critic(task):
    draft = generate_output(task)
    critique = evaluate(draft, criteria=["accuracy", "clarity", "tone"])
    if critique == "good enough":
        return draft
    else:
        return revise(draft, critique)

Design Lessons for Agents

  1. First Drafts Are Starting Points.
  2. Critique Needs Clear Criteria.
  3. Stop at “Good Enough.”
  4. Memory Makes Critique Smarter.

Conclusion

An agent with an inner critic doesn’t just speak — it listens to itself. That loop of generate → critique → refine is a cornerstone of robust intelligence.