
Day 11: The Blind CEO — When Your AI Agent Can't See Its Own Team

2026-04-12 · MoneyMachine

Date: 2026-04-12 · Author: Jeff (written with AI assistance from Claude Opus 4.6) · Phase: v5.1 Foundation Fix


The Symptom

Adrian, my CEO agent, had been sending me status reports via Telegram for days. They all said the same thing:

Agent            Status
Adrian           Running
Scout            Not deployed
Builder          Not deployed
Revenue Ops      Not deployed
Site Builder     Not deployed
Content Writer   Not deployed

Five agents “not deployed.” ThinkPad “pending OpenClaw install.” Revenue at $0. Only Adrian running.

Here’s the thing: all six agents WERE deployed. They’d been running cron jobs for days. Scout had collected 960+ demand signals. Builder had 23 deliverables ready for review. Marketer had 53 pieces of content waiting. Revenue Ops had flagged a cost anomaly.

Adrian couldn’t see any of it.

Root Cause: Three Layers of Blindness

Layer 1: Sandbox “all” mode

OpenClaw’s sandbox feature isolates agents into their own directory during execution. We’d set sandbox.mode: "all" — which means every agent, including Adrian, gets sandboxed. Adrian literally could not read other agents’ workspace directories. Every heartbeat, he’d try ls /home/agentops/.openclaw/workspaces/scout/ready-for-review/ and get “Path escapes sandbox root.”

The gateway log was full of these errors:

[tools] read failed: Path escapes sandbox root (~/.openclaw/sandboxes/agent-main-f331f052):
  /home/agentops/.openclaw/workspaces/scout/tasks.json

The CEO agent couldn’t see any of his team’s work. So he reported what was in his memory files, which were weeks old.
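
To make the error concrete: a message like “Path escapes sandbox root” is what a path-containment check produces when the requested path falls outside the sandbox directory. The sketch below is not OpenClaw’s implementation. It is a minimal Python illustration, `resolve_in_sandbox` is a hypothetical name, and I’m assuming the `~` in the log expands to `/home/agentops`.

```python
import posixpath


def resolve_in_sandbox(sandbox_root: str, requested: str) -> str:
    """Return the normalized path if it stays under sandbox_root, else raise.

    Hypothetical helper for illustration only; not OpenClaw's actual code.
    This is a purely lexical check and does not follow symlinks.
    """
    root = posixpath.normpath(sandbox_root)
    # Relative paths are taken relative to the sandbox root. Joining with an
    # absolute path simply yields that absolute path.
    candidate = posixpath.normpath(posixpath.join(root, requested))
    # The candidate is inside the sandbox only if the root is a full
    # path-component prefix of it.
    if posixpath.commonpath([root, candidate]) != root:
        raise PermissionError(f"Path escapes sandbox root ({root}): {requested}")
    return candidate


# Assumed expansion of the sandbox root shown in the gateway log.
root = "/home/agentops/.openclaw/sandboxes/agent-main-f331f052"
try:
    resolve_in_sandbox(root, "/home/agentops/.openclaw/workspaces/scout/tasks.json")
except PermissionError as err:
    print(err)
```

Under a check like this, every read of another agent’s workspace fails the same way, no matter how often the agent retries.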

Layer 2: Stale SOUL.md and MEMORY.md

Adrian’s SOUL.md was from March 7. It was out of date in three ways:

  • Still said “Replace Jeff’s $10,000/month work income” (we’d updated to $15k)
  • Still referenced “Site Builder” and “Content Writer” (retired in v3)
  • Had no mention of Mac Mini, Ollama Cloud, or GLM-5.1

His MEMORY.md was worse. It contained:

  • “OpenClaw version: 2026.3.2” (we were on 2026.4.10)
  • “ThinkPad P16: OpenClaw NOT YET installed” (ThinkPad was decommissioned a month ago)
  • Old v2 agent roster with trading-research agent

Adrian was making strategic decisions based on a reality that hadn’t existed for over a month.

Layer 3: Persistent Sessions

Even after I updated the files on disk, Adrian kept reporting old information. Why? OpenClaw sessions are persistent. His Telegram session had cached the old SOUL.md and MEMORY.md in its context window when the session started. Updating files on disk doesn’t force a running session to re-read them.

I had to kill the old session and force Adrian to start a completely fresh one before he’d read the updated files.
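
One way to catch this earlier is to compare each context file’s modification time against the time the session started. This is a rough sketch under assumptions: `stale_context_files` is a hypothetical helper, and how you obtain the session start time from OpenClaw is left open because I haven’t verified where it is recorded.

```python
from pathlib import Path


def stale_context_files(session_started_at: float, files: list[Path]) -> list[Path]:
    """Return the files that were modified after the session started.

    session_started_at is epoch seconds. Where that value comes from is an
    assumption left to the caller. Missing files are skipped.
    """
    return [
        f for f in files
        if f.exists() and f.stat().st_mtime > session_started_at
    ]
```

If this returns anything for SOUL.md or MEMORY.md, the running session is working from an older copy than what is on disk and needs a reset.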

The Fix

Sandbox: Changed from "all" to "off". I tried "non-main" first (which should exempt the main agent), but a pre-existing sandbox state directory was being reused. Nuking it entirely was the only thing that worked reliably.

SOUL.md: Complete rewrite. New v5 agent roster, correct infrastructure, correct revenue targets, explicit note that he’s sandbox-exempt with full filesystem access.

MEMORY.md: Complete rewrite. Current state of all 6 agents, infrastructure, accounts, cron jobs, scoring framework.

DIRECTIVES.md: Updated to remove retired agent references and add v5 infrastructure context.

Session: Archived the old session transcript, cleared all locks, forced a fresh session.

The Moment It Worked

I sent Adrian a message: “Your session has been reset. Read SOUL.md and MEMORY.md now. Give me a full status report with what you can actually see in each agent workspace.”

His response was night and day:

Agent            Ready-for-Review Files
Scout            34 files
Builder          23 files
Marketer         53 files
Revenue Ops      11 files
Domain Analyst   11 files

“Sandbox exemption confirmed. I have full filesystem access to all agent workspaces.”

Within minutes, Adrian:

  • Reviewed Revenue Ops’ cost anomaly report (OpenRouter spending spike)
  • Issued token budgets for all agents
  • Scored and routed 3 opportunities to Builder
  • Authorized deployment of two approved products
  • Updated the entire approval queue

More productive output in one session than the previous two weeks combined.

The Lesson

Your AI agent is only as good as its view of the world. It doesn’t matter how smart the model is — if it can’t read the files it needs, it will confidently report fiction based on stale memory. And it will sound completely authoritative while doing it.

Three things I’m doing differently:

  1. Agent health checks should verify filesystem access, not just “is the process running.” A heartbeat that returns “all workspaces inaccessible” should be a CRITICAL alert, not a status line.

  2. Session resets after major config changes. An updated SOUL.md doesn’t help if the running session cached the old one. Build this into the maintenance procedure.

  3. Memory freshness matters more than memory volume. Adrian had memory. It was just wrong. A small, accurate MEMORY.md beats a large, stale one.
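
The first item can be sketched as a small probe. This is a minimal Python illustration, not a finished health check: `check_workspace_access` is a hypothetical name, the `workspaces/<agent>` layout is taken from the paths in the gateway log, and the "WARNING" and "OK" severity labels are my own assumption (only CRITICAL comes from the point above).

```python
import os
from pathlib import Path


def check_workspace_access(
    workspaces_root: Path, agents: list[str]
) -> tuple[str, dict[str, bool]]:
    """Probe each agent's workspace and return (severity, per-agent readability)."""
    readable: dict[str, bool] = {}
    for agent in agents:
        try:
            # Listing the directory exercises the access a review pass needs.
            os.listdir(workspaces_root / agent)
            readable[agent] = True
        except OSError:
            readable[agent] = False

    if not any(readable.values()):
        # Nothing is readable (this also covers an empty agent list).
        severity = "CRITICAL"
    elif not all(readable.values()):
        severity = "WARNING"
    else:
        severity = "OK"
    return severity, readable
```

For this to catch a sandbox problem, the probe has to run under the same restrictions as the agent itself. A separate process outside the sandbox would happily report "OK" while the agent stays blind.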

Stats

  • Time to diagnose: ~30 minutes (once I started looking at the gateway error logs)
  • Time to fix: ~2 hours (including the sandbox false start with “non-main”)
  • Root cause: sandbox config + stale agent files + persistent sessions
  • Impact: Unlocked 132 unreviewed deliverables across 5 agents (34 + 23 + 53 + 11 + 11 from Adrian’s report)
