The Best Agent Is a Boring Agent
The clever-agent trap
- First-time teams want clever → multi-step planning, reflection loops, dynamic tool selection, growing memory.
- Exciting to build → almost always falls over in production.
The pattern that actually works
- Fixed, short list of tools.
- Loop capped at ~10 steps → then hands back to a human.
- No reflection phase → no inner monologue, no self-grading.
- Deterministic failure mode → when unsure, stop and ask.
- Boring → also reliable, cheap, auditable.
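A minimal sketch of the pattern above, in Python. Everything named here is an assumption for illustration: `Action`, `Result`, `TOOLS`, `run_agent`, and the `decide` callable (a stand-in for whatever model call picks the next step) are hypothetical and not taken from any particular framework. It shows the fixed tool list, the step cap, and the stop-and-ask failure mode, and nothing else.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_STEPS = 10  # assumed hard cap, matching the ~10 steps in the notes


@dataclass
class Action:
    kind: str          # "tool", "done", or anything else (treated as unsure)
    tool: str = ""
    arg: str = ""
    answer: str = ""


@dataclass
class Result:
    status: str        # "done" or "needs_human"
    answer: str = ""
    reason: str = ""
    trace: list = field(default_factory=list)  # audit log: (tool, arg, output)


# Fixed, short tool list: nothing gets registered at runtime.
# Both tools are toy placeholders.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_order": lambda order_id: f"order {order_id}: shipped",
    "echo": lambda text: text,
}


def run_agent(decide: Callable[[list], Action],
              max_steps: int = MAX_STEPS) -> Result:
    """Run a bounded loop; hand back to a human on anything unexpected."""
    trace: list = []
    for _ in range(max_steps):
        action = decide(trace)  # hypothetical model call
        if action.kind == "done":
            return Result("done", answer=action.answer, trace=trace)
        if action.kind != "tool" or action.tool not in TOOLS:
            # Deterministic failure mode: unsure or unknown tool -> stop and ask.
            return Result("needs_human", reason="unsure or unknown tool",
                          trace=trace)
        try:
            output = TOOLS[action.tool](action.arg)
        except Exception as exc:
            return Result("needs_human", reason=f"tool error: {exc}",
                          trace=trace)
        trace.append((action.tool, action.arg, output))
    # No reflection pass and no retry: the cap is the exit.
    return Result("needs_human", reason="step limit reached", trace=trace)


def scripted(trace: list) -> Action:
    """Toy decide function: one lookup, then finish with its output."""
    if not trace:
        return Action("tool", tool="lookup_order", arg="42")
    return Action("done", answer=trace[-1][2])


if __name__ == "__main__":
    print(run_agent(scripted))
```

Every exit path returns either `done` or `needs_human`, and the `trace` list is the whole audit trail, so there is no hidden state to inspect when something goes wrong.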
Why “fancy” hurts
- Every fancy feature → also a failure mode.
- Reflection loops → add cost, can drift.
- Dynamic tool selection → adds latency + new error surfaces.
- Long-running memory → adds state you debug at 2am.
The rule
- Start with the dumbest thing that could possibly work.
- Add complexity only with a specific, measured reason.
- Most of the time → you won’t need any.
Lesson: production reliability is bought by saying no to features, not by stacking them.