Everyone talks about Clawdbot (openClaw), but here's how it works:

I dug into Clawdbot's internals to understand how it actually handles agent execution, tool use, and browser automation. Turns out there's a lot worth learning here if you're building AI systems yourself.
The whole thing started because I wanted to know how reliable its memory system actually is. Understanding the architecture gives you a much clearer picture of where it shines and where it breaks down.
What Clawdbot Actually Is Under the Hood
Most people think of Clawdbot as just a personal assistant you can run locally or hit through APIs. But technically? It's a TypeScript CLI app. Not Python, not a web framework, just a process running on your machine that exposes a gateway server for channel connections, makes LLM API calls, and executes tools locally.
The Message Flow
When you send a message, here's the path it takes.
First, a Channel Adapter picks it up and normalizes everything (formatting, attachments). Each messenger gets its own adapter.
Then the Gateway Server routes it to the right session. This is really the core of the whole system. It coordinates multiple overlapping requests using a lane-based command queue. Each session gets its own lane, and only low-risk tasks run in parallel lanes.
This is way better than async/await chaos everywhere. If you've built agents before, you know what happens with naive parallelization: interleaved garbage logs, shared state nightmares, race conditions you have to constantly think about. The lane abstraction flips the mental model from "what do I need to lock?" to "what can I safely run in parallel?"
Cognition made the same point in their "Don't Build Multi-Agents" post. Serial by default, parallel only when you explicitly choose it.
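To make the lane idea concrete, here's a minimal sketch of how such a queue could look. This is my reconstruction in TypeScript, not Clawdbot's actual code: each lane is just a promise chain, so tasks in one lane run strictly in order while separate lanes proceed concurrently.

```typescript
// Sketch of a lane-based command queue (illustrative, not Clawdbot's source).
// Tasks submitted to the same lane run serially; different lanes run in parallel.
type Task<T> = () => Promise<T>;

class LaneQueue {
  private lanes = new Map<string, Promise<unknown>>();

  enqueue<T>(lane: string, task: Task<T>): Promise<T> {
    const tail = this.lanes.get(lane) ?? Promise.resolve();
    // Chain onto the lane's tail; a failed earlier task doesn't poison the lane.
    const next = tail.catch(() => {}).then(task);
    this.lanes.set(lane, next);
    return next;
  }
}

async function handleMessage(text: string): Promise<void> { console.log("handled", text); }
async function reindexMemory(): Promise<void> { console.log("reindexed"); }

const queue = new LaneQueue();
// Two messages in the same session stay ordered...
queue.enqueue("session:alice", () => handleMessage("first"));
queue.enqueue("session:alice", () => handleMessage("second"));
// ...while a low-risk background task runs in its own lane, concurrently.
queue.enqueue("background:indexing", () => reindexMemory());
```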
Where the AI Actually Happens
The Agent Runner handles model selection, API key rotation (marking keys in cooldown if they fail), and fallbacks. It builds the system prompt dynamically from available tools, skills, and memory, then appends session history from a .jsonl file.
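The key rotation part is easy to picture. Here's a hypothetical sketch (the names and the 60-second cooldown are mine, not Clawdbot's):

```typescript
// Illustrative key pool with cooldown; field and method names are hypothetical.
interface ApiKey { value: string; cooldownUntil: number }

class KeyPool {
  constructor(private keys: ApiKey[]) {}

  next(): ApiKey {
    const now = Date.now();
    const usable = this.keys.find((k) => k.cooldownUntil <= now);
    if (!usable) throw new Error("all keys cooling down");
    return usable;
  }

  markFailed(key: ApiKey, cooldownMs = 60_000): void {
    // A failed call (rate limit, auth error) sidelines the key for a while.
    key.cooldownUntil = Date.now() + cooldownMs;
  }
}
```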
Before the call goes out, a context window guard checks if there's room. If context is nearly full, it either compresses the session or fails gracefully.
The LLM layer streams responses and abstracts over different providers. If the response includes a tool call, Clawdbot executes it locally, adds the result to the conversation, and loops. This continues until the model returns final text or hits the turn limit (around 20 by default).
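Put together, the loop looks roughly like this. It's a sketch under my assumptions: callLLM, executeTool, fitsContext, and compressSession are stand-ins for the real internals, and only the shape of the loop is the point.

```typescript
// Sketch of the tool-use loop; the declared functions are hypothetical stand-ins.
interface Message { role: string; content: string }
interface LLMResponse { text: string; toolCall?: { name: string; args: unknown } }
declare function callLLM(history: Message[]): Promise<LLMResponse>;
declare function executeTool(call: { name: string; args: unknown }): Promise<string>;
declare function fitsContext(history: Message[]): boolean;
declare function compressSession(history: Message[]): Promise<Message[]>;

const MAX_TURNS = 20;

async function runAgent(history: Message[]): Promise<string> {
  for (let turn = 0; turn < MAX_TURNS; turn++) {
    if (!fitsContext(history)) {
      history = await compressSession(history); // the context window guard
    }
    const response = await callLLM(history); // streams under the hood
    if (response.toolCall) {
      const result = await executeTool(response.toolCall); // runs locally
      history.push({ role: "tool", content: result });
      continue; // feed the result back to the model
    }
    return response.text; // final answer, no more tool calls
  }
  throw new Error("turn limit reached");
}
```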
How Memory Works
Without memory, any AI assistant is basically useless across sessions. Clawdbot handles this two ways: JSONL transcripts for session history, and markdown files in MEMORY.md or a memory/ folder for longer-term storage.
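The transcript side is as plain as it sounds: one JSON object appended per turn. Something like this, though the exact schema and path are my guess:

```typescript
import { appendFileSync } from "node:fs";

// Appending one turn to a session transcript; schema and path are my guesses.
const entry = { role: "user", content: "fix the auth bug", ts: Date.now() };
appendFileSync("sessions/current.jsonl", JSON.stringify(entry) + "\n");
```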
Search uses both vector and keyword matching, so "authentication bug" finds semantic matches like "auth issues" plus exact phrase hits. SQLite handles the vectors, FTS5 handles keywords. A file watcher triggers smart syncing when things change.
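Hybrid search in SQLite is simpler than it sounds. Here's a sketch using better-sqlite3 as the driver (my stand-in; I haven't checked which driver Clawdbot uses): FTS5 covers the keyword side, and the vector side can be as basic as scoring stored embeddings with cosine similarity.

```typescript
import Database from "better-sqlite3";

const db = new Database("memory.db");
db.exec(`CREATE VIRTUAL TABLE IF NOT EXISTS mem_fts USING fts5(path, body)`);

// Keyword side: FTS5 gives exact and phrase hits, ranked by relevance.
function keywordSearch(query: string): { path: string }[] {
  return db
    .prepare(`SELECT path FROM mem_fts WHERE mem_fts MATCH ? ORDER BY rank LIMIT 10`)
    .all(query) as { path: string }[];
}

// Vector side: score stored embeddings against the query embedding.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

Clawdbot may well use a SQLite vector extension instead of in-process scoring; this just shows the shape of combining the two result sets.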
The interesting part: the agent writes memory using a standard file tool. No special API. It just writes to memory/*.md. When a new conversation starts, a hook summarizes the previous one into markdown.
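So the "memory write" is literally a file write. A hypothetical version of that hook might look like this (summarizeWithLLM and the paths are my inventions):

```typescript
import { readFileSync, writeFileSync } from "node:fs";

declare function summarizeWithLLM(text: string): Promise<string>;

// Hypothetical new-session hook: distill the previous transcript into markdown.
async function onNewSession(previousTranscriptPath: string): Promise<void> {
  const transcript = readFileSync(previousTranscriptPath, "utf8");
  const summary = await summarizeWithLLM(transcript); // an ordinary LLM call
  // Same file tool the agent uses for everything else; no special memory API.
  writeFileSync(`memory/${Date.now()}-session.md`, summary);
}
```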
The whole system is surprisingly simple. No memory merging, no periodic compression. Old memories stick around forever with equal weight. I tend to prefer this kind of explainable simplicity over complex systems that are hard to debug.
Computer Access
This is where Clawdbot gets interesting. It gives the agent real computer access through an exec tool that can run shell commands in a Docker sandbox (default), directly on host, or on remote machines. Plus filesystem tools, a Playwright-based browser, and process management for long-running tasks.
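Conceptually the exec tool is one interface with three backends. A sketch of the dispatch (the mode names, image name, and host are mine):

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);
type ExecMode = "sandbox" | "host" | "remote"; // hypothetical mode names

// Illustrative dispatch: same command, three execution targets.
async function exec(cmd: string, args: string[], mode: ExecMode): Promise<string> {
  switch (mode) {
    case "sandbox": {
      // Default: run inside a throwaway Docker container.
      const { stdout } = await run("docker", ["run", "--rm", "sandbox-image", cmd, ...args]);
      return stdout;
    }
    case "host": {
      const { stdout } = await run(cmd, args); // directly on the machine
      return stdout;
    }
    case "remote": {
      const { stdout } = await run("ssh", ["build-box", cmd, ...args]); // over SSH
      return stdout;
    }
  }
}
```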
Safety Approach
Like Claude Code, there's an allowlist system where you approve commands once, always, or deny them.
```json
// ~/.clawdbot/exec-approvals.json
{
  "agents": {
    "main": {
      "allowlist": [
        { "pattern": "/usr/bin/npm", "lastUsedAt": 1706644800 },
        { "pattern": "/opt/homebrew/bin/git", "lastUsedAt": 1706644900 }
      ]
    }
  }
}
```

Basic commands like grep, sort, head, and tail are pre-approved. Dangerous shell patterns get blocked:
```bash
# rejected before execution:
npm install $(cat /etc/passwd)   # command substitution
cat file > /etc/hosts            # redirection
rm -rf / || echo "failed"        # chained operators
(sudo rm -rf /)                  # subshell
```

The philosophy is maximum autonomy within whatever boundaries you set.
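Blocking those patterns doesn't require a full shell parser; a coarse first pass can be regex-based. A sketch of what that screen might look like (Clawdbot's actual checks may be stricter):

```typescript
// Illustrative pre-execution screen for risky shell syntax.
const DANGEROUS = [
  /\$\(.*\)/,    // command substitution: $(...)
  /`[^`]*`/,     // command substitution: backticks
  /[<>]/,        // redirection
  /(\|\||&&|;)/, // chained operators
  /^\s*\(/,      // subshell
];

function isDangerous(command: string): boolean {
  return DANGEROUS.some((re) => re.test(command));
}

isDangerous("npm install $(cat /etc/passwd)"); // true
isDangerous("git status");                     // false
```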
Browser Automation
Instead of screenshots, the browser tool uses semantic snapshots, basically a text representation of the page's accessibility tree:
```yaml
- button "Sign In" [ref=1]
- textbox "Email" [ref=2]
- textbox "Password" [ref=3]
- link "Forgot password?" [ref=4]
- heading "Welcome back"
- list
  - listitem "Dashboard"
  - listitem "Settings"
```

Way more token-efficient, and the model can interact through reference numbers instead of coordinates. Much more reliable for actual automation.