The Hidden Knowledge Graph in Your ~/.claude/ Directory

And what it means for context engineering

April 15, 2026 · 16 min read

Your agent knows more about you than you think

Run this command right now:

find ~/.claude -type f | wc -l

Most developers who have used Claude Code for a few months have hundreds of files in that directory. Not just session logs. Memory files where Claude has written down what it learned about you. Plans it created for multi-step tasks. Subagent session files showing how it decomposed your work. Task tracking with numbered JSON files.

Now try:

cat ~/.claude/projects/*/memory/MEMORY.md

That file is Claude's notebook about YOU. Your coding preferences, your architecture decisions, your project conventions, all captured automatically during sessions and loaded into every future conversation.

Anthropic recently published its thinking on context engineering: the art of curating the right tokens for LLM inference. But the agents are already doing context engineering on their own. They are building persistent knowledge stores on your machine, and the quality of that stored context directly determines how useful the agent is in your next session.

An academic study accepted at MSR 2026 found CLAUDE.md and AGENTS.md files in hundreds of open-source repos, calling them "AI configuration files" that represent a new form of documentation negotiated between humans and AI. These files are becoming a standard part of project infrastructure, and most developers have never opened theirs.

Claude Code: the most comprehensive knowledge store

Sessions

The obvious part. JSONL files in ~/.claude/projects/<encoded-path>/. Each line is a discrete event: user message, assistant response, tool call, tool result, file-history-snapshot. Parent-child UUID chains link main sessions to subagent sessions. The append-only design means crash recovery is built in. Only the last partial line can be lost.
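
A minimal Python sketch of reading one of these session files. It rests on assumptions: using a `type` field as the event discriminator is not confirmed above, and `count_event_types` is a hypothetical helper name. An unparseable line is counted rather than raised, since a partial final line is the one kind of damage an append-only log should suffer after a crash.

```python
import json
from collections import Counter
from pathlib import Path


def count_event_types(session_file: Path) -> Counter:
    """Tally the events in one session JSONL file by their (assumed) "type" field."""
    counts: Counter = Counter()
    with session_file.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                # A crash can leave a partial last line; count it and move on.
                counts["<unparseable>"] += 1
                continue
            # "type" is an assumed field name; fall back if it is absent.
            if isinstance(event, dict):
                counts[str(event.get("type", "<no type>"))] += 1
            else:
                counts["<no type>"] += 1
    return counts


if __name__ == "__main__":
    projects = Path.home() / ".claude" / "projects"
    # Print a tally for the first few session files found, if any exist.
    for session in sorted(projects.glob("*/*.jsonl"))[:5]:
        print(session.name, dict(count_event_types(session)))
```

Reading line by line keeps memory use flat even for long sessions, which matters once a project has accumulated months of logs.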

Memory: the part most developers do not know about

This is where it gets interesting. Claude Code has a layered memory architecture, all stored as local Markdown files:

Layer 1: CLAUDE.md. Static rules the developer writes manually. Project conventions, build commands, architecture decisions. Loaded at the start of every session. Survives compaction (when the context window fills up, Claude re-reads this file from disk). A layered priority system: /etc/claude-code/CLAUDE.md (system-wide) → ~/.claude/CLAUDE.md (user global) → project CLAUDE.md → .claude/rules/*.md → CLAUDE.local.md (most specific wins).
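
To see which of these files actually exist for a given project, a short Python sketch can walk the same order. The paths are the ones listed above; the function name `claude_md_layers` and the labels are illustrative, and the sketch only reports files, it does not reproduce how Claude Code merges them.

```python
from pathlib import Path
from typing import Optional


def claude_md_layers(project_root: Path, home: Optional[Path] = None) -> list[tuple[str, Path]]:
    """Return the instruction files that exist, ordered from broadest to most specific."""
    home = home if home is not None else Path.home()
    candidates = [
        ("system-wide", Path("/etc/claude-code/CLAUDE.md")),
        ("user global", home / ".claude" / "CLAUDE.md"),
        ("project", project_root / "CLAUDE.md"),
    ]
    # Every Markdown file under .claude/rules/ counts as its own entry.
    rules_dir = project_root / ".claude" / "rules"
    candidates += [("project rule", path) for path in sorted(rules_dir.glob("*.md"))]
    candidates.append(("local", project_root / "CLAUDE.local.md"))
    return [(label, path) for label, path in candidates if path.is_file()]


if __name__ == "__main__":
    for label, path in claude_md_layers(Path.cwd()):
        print(f"{label:>12}  {path}")
```

Run from a project root, it prints one line per file found, which makes a missing or forgotten layer easy to spot.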

Layer 2: Auto Memory. Notes Claude writes itself during sessions, organized into four typed categories:

user — who you are: role, technical level, areas of knowledge
feedback — your corrections: "don't summarize at the end of messages," "always use real databases in tests, no mocks"
project — current project state: active decisions, initiatives, constraints
reference — where to find things: "pipeline bugs tracked in Linear project INGEST"

Each is a separate Markdown file in ~/.claude/projects/<project>/memory/. The entry point is MEMORY.md, an index where each line points to a detailed file. Claude reads the index, then pulls specific files when relevant. Auto memory has been on by default since Claude Code v2.1.59. The first 200 lines of MEMORY.md get loaded into every session. Anything beyond that is invisible.
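
If only the first 200 lines of the index are loaded, it is worth checking whether yours has outgrown that window. The following is a rough Python sketch: the 200-line figure is taken from the paragraph above, and `memory_index_report` is a hypothetical helper, not part of Claude Code.

```python
from pathlib import Path

LOADED_LINE_LIMIT = 200  # figure quoted in this article, not read from Claude Code


def memory_index_report(index_file: Path, limit: int = LOADED_LINE_LIMIT) -> dict:
    """Summarize how much of a MEMORY.md index falls inside the loaded window."""
    lines = index_file.read_text(encoding="utf-8", errors="replace").splitlines()
    return {
        "total_lines": len(lines),
        "loaded": min(len(lines), limit),
        "hidden": lines[limit:],  # entries past the limit
    }


if __name__ == "__main__":
    projects = Path.home() / ".claude" / "projects"
    for index in sorted(projects.glob("*/memory/MEMORY.md")):
        report = memory_index_report(index)
        print(f"{index.parent.parent.name}: {report['loaded']}/{report['total_lines']} lines loaded")
        for entry in report["hidden"]:
            print("  not loaded:", entry)
```

Any line listed as not loaded is a candidate for pruning or for moving higher up the index.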

Auto Dream. A background consolidation process identified in third-party analysis of the Claude Code memory system. It cleans up stale memories: replaces vague time references with exact dates, merges contradictions, removes references to deleted files. Not documented in the official Claude Code docs but observable in the behavior of the memory files over time.

Plans and tasks

~/.claude/plans/ contains Markdown plans that survive across sessions. These represent the agent's long-term thinking, decomposed into steps. Task tracking uses numbered JSON files per session (~/.claude/tasks/<session-id>/1.json, 2.json, etc.), which capture planning artifacts with their dependencies and status.
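
A small Python sketch for reading one session's task files in order. It assumes nothing about the JSON schema beyond each file being valid JSON; `load_session_tasks` is a hypothetical helper. The numeric sort matters because a plain alphabetical sort would place 10.json before 2.json.

```python
import json
from pathlib import Path


def load_session_tasks(session_dir: Path) -> list[tuple[int, object]]:
    """Load 1.json, 2.json, ... from one task directory in numeric order."""
    numbered = [path for path in session_dir.glob("*.json") if path.stem.isdecimal()]
    tasks = []
    for path in sorted(numbered, key=lambda p: int(p.stem)):
        # The schema is not assumed; each file is returned as parsed JSON.
        tasks.append((int(path.stem), json.loads(path.read_text(encoding="utf-8"))))
    return tasks
```

Calling `load_session_tasks(Path.home() / ".claude" / "tasks" / session_id)` with a real session ID returns `(number, payload)` pairs that can be printed or inspected further.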

Subagent sessions

Separate JSONL files for each subagent spawned: agent-<agentId>.jsonl. The parent-child UUID chain shows how complex tasks were decomposed. This is architectural thinking captured in data. When the agent broke down "refactor the auth module" into discrete subtasks, each subtask's conversation is preserved separately with a link back to the parent.
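
The naming pattern alone is enough to inventory subagent runs. This Python sketch relies only on the agent-<agentId>.jsonl convention described above; where those files sit inside the project directory, and which field links a subagent to its parent, are not assumed here, so the search is recursive and the link is left unresolved.

```python
import re
from pathlib import Path

# Matches the agent-<agentId>.jsonl naming convention.
AGENT_FILE = re.compile(r"^agent-(?P<agent_id>.+)\.jsonl$")


def subagent_sessions(directory: Path) -> dict[str, Path]:
    """Map each agent ID found under a directory to its session file."""
    found = {}
    for path in sorted(directory.rglob("agent-*.jsonl")):
        match = AGENT_FILE.match(path.name)
        if match:
            found[match.group("agent_id")] = path
    return found
```

Pointing it at one directory under ~/.claude/projects/ gives a quick count of how many subagents that project has spawned.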

The limitation that matters

Claude Code's memory is locked to Claude Code. There is no export format or cross-agent protocol. Switch to Cursor or Codex, and you start from scratch. Your memory files are rich, but siloed.

Three knowledge graphs, zero integration

Before diving into Cursor and Codex, here is why this matters. A developer using all three agents (the "2-3 tool" pattern from our comparison post) has three separate knowledge graphs about the same codebase:

Claude Code knows your preferences and decisions (auto-memory, four typed categories, plans, subagent chains)
Cursor knows which lines you wrote and which the AI wrote (code tracking at the commit level)
Codex knows your automation patterns (scheduled tasks, triggers, event-driven workflows)

None of them can read each other's data. That matters for context engineering: Anthropic's published thinking on the topic describes just-in-time context retrieval as how effective agents work, yet right now each agent can retrieve context only from its own silo. The developer's complete engineering history is fragmented.

The MSR 2026 study found that CLAUDE.md, AGENTS.md, copilot-instructions.md, and .cursorrules all serve the same purpose (giving the agent persistent context) but use incompatible formats with no interoperability.

Cursor: the silent tracker

The SQLite database

Cursor's primary storage is a SQLite database, located on macOS at ~/Library/Application Support/Cursor/User/globalStorage/state.vscdb. Size varies by usage, from a few megabytes for light users to gigabytes for heavy ones. It contains conversations, completions, and agent interactions. Each workspace also gets its own smaller DB at ~/Library/Application Support/Cursor/User/workspaceStorage/<hash>/state.vscdb.
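
Because this is ordinary SQLite, the Python standard library can inspect it. The sketch below opens the file read-only and lists each table with its row count; it assumes nothing about Cursor's table names or schema, and `list_tables` is a hypothetical helper.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: Path) -> list[tuple[str, int]]:
    """Return (table name, row count) pairs, opening the database read-only."""
    # mode=ro asks SQLite to refuse writes, so the inspection cannot modify the file.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        result = []
        for name in names:
            quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier safely
            count = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
            result.append((name, count))
        return result
    finally:
        conn.close()


if __name__ == "__main__":
    db = (Path.home() / "Library" / "Application Support" / "Cursor"
          / "User" / "globalStorage" / "state.vscdb")
    if db.is_file():
        for name, rows in list_tables(db):
            print(f"{name}: {rows} rows")
```

Working on a copy of the file is the safer choice while Cursor is running, since the editor may hold the database open.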

AI Code Tracking: Cursor already has git blame for AI

This is the buried revelation. Cursor tracks which lines of committed code were AI-generated vs human-written. At commit time, Cursor compares saved code signatures to the committed code and marks matches as AI-generated. The process runs entirely locally. No source code leaves the machine.

Current limitations:

AI Code Tracking does not attribute lines from Background Agents or the Cursor CLI to specific commits yet
Automated code formatting can invalidate diff signatures and break attribution
The developer must commit from the same machine where AI code was written
Enterprise plan required for the tracking API
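
The formatting limitation is easier to see with a toy model. The Python sketch below is an illustration only: it hashes whole lines with SHA-256, which is an assumption made for the example and not Cursor's actual signature scheme, and both function names are hypothetical.

```python
import hashlib


def line_signature(line: str) -> str:
    """Hash one line of code exactly as written (toy stand-in for a code signature)."""
    return hashlib.sha256(line.encode("utf-8")).hexdigest()


def attribute_lines(committed: list[str], ai_signatures: set[str]) -> list[tuple[str, str]]:
    """Label each committed line "ai" if its signature was saved earlier, else "human"."""
    return [
        ("ai" if line_signature(line) in ai_signatures else "human", line)
        for line in committed
    ]


if __name__ == "__main__":
    # Signatures saved when the AI produced these two lines.
    saved = {line_signature("x = compute(a,b)"), line_signature("return x")}
    # A formatter later added a space after the comma in the first line.
    for label, line in attribute_lines(["x = compute(a, b)", "return x"], saved):
        print(f"{label:>5}  {line}")
```

In this toy model the reformatted first line no longer matches its saved signature and is labeled human, while the untouched second line is still labeled ai. That is the same failure mode as the formatting limitation listed above, whatever the real matching logic looks like.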

Terminal session capture

Cursor captures running terminal processes with metadata: PID, working directory, command, start time, exit code, full output. The AI reads these to understand what you have been running without you pasting terminal output into chat.

MCP configurations

Per-project MCP server instructions and metadata stored locally. These define what tools the AI can access, and like CLAUDE.md, they represent context engineering that persists across sessions.

Codex: the automation engine

Sessions and threads

Codex stores session data as JSONL, similar to Claude Code. Thread titles, workspace roots, and prompt history are stored in a global state JSON file. The /resume command can jump directly to a session by ID.

The SQLite automation database

Where Codex diverges: it stores automations, inbox items, and automation run history in SQLite. This captures not just what you did interactively, but what you scheduled the agent to do autonomously. GitHub Triggers, scheduled tasks, event-driven responses. The automation history is a record of how you delegate to AI.

Skills and plugins

Installed skill packs with SKILL.md files, scripts, and agent configurations. These are the equivalent of Claude Code's memory, but focused on capabilities rather than preferences. The skills system follows the AGENTS.md open standard, making it the most interoperable of the three.

Global state

The Codex desktop app maintains a comprehensive model of how you organize your work: workspace roots, project labels, prompt history (every prompt you have ever typed, stored chronologically), window state. This is the most complete record of your AI interaction patterns.

What this means for you

Your agent's memory is only as good as what it has seen

If you have been using Claude Code for six months and switch to Cursor, you lose six months of accumulated context. Your new agent does not know your conventions, your past failures, or your architectural decisions. The cold start problem is not just about LLM training data. It is about the local knowledge your agent has built.

Context engineering is happening whether you manage it or not

Claude Code's auto-memory is writing files about you right now. If you have never opened ~/.claude/projects/*/memory/, you have unreviewed context shaping every future session. Run /memory in your next session and audit what is there. Delete what is wrong. Add what is missing.

The files your agent creates are engineering artifacts

CLAUDE.md files are showing up in Git repos across GitHub. The MSR 2026 study found them in hundreds of open-source repositories. They are becoming a standard part of project documentation, written not for humans but for AI agents. Your CLAUDE.md is as important as your README.md.

The export problem is real and unsolved

There is no standard format for exporting agent memory, no protocol for cross-agent context transfer, and no tool that reads all three knowledge stores. (This is part of why we built promptarc — to read sessions across agents and make the data portable.)

Five things to do today

Audit your Claude Code memory. Run /memory in your next session. Read what Claude has stored about you. Delete entries that are wrong. Add entries that are missing. This directly improves every future session.
Read your CLAUDE.md hierarchy. Check every level: /etc/claude-code/CLAUDE.md (system-wide, if present), ~/.claude/CLAUDE.md (global), your project's CLAUDE.md, .claude/rules/*.md, and CLAUDE.local.md. Understand what Claude sees when it starts a session.
Check your Cursor database. On macOS, look at ~/Library/Application Support/Cursor/User/globalStorage/state.vscdb. That is months of conversation history, terminal captures, and completion data, all local and readable with any SQLite client.
Export what matters. If you switch agents or machines, your knowledge graph does not follow you. Copy your CLAUDE.md and memory files manually. Or use a tool that reads the data and makes it portable.
Try reading your data across all three agents. Each agent stores valuable context that the others cannot see. If you use more than one agent, consider tools that can read all three formats and surface what you have done across your entire AI workflow, not just one silo.
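
The manual copy suggested in the fourth item can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it copies only the global CLAUDE.md and the Markdown files under each project's memory directory, using the paths described earlier, and `backup_claude_context` is a hypothetical helper. Project-level CLAUDE.md files live in your repositories and are not touched.

```python
import shutil
from pathlib import Path


def backup_claude_context(claude_home: Path, destination: Path) -> list[Path]:
    """Copy the global CLAUDE.md and per-project memory files, preserving layout."""
    sources = [claude_home / "CLAUDE.md"]
    sources += sorted(claude_home.glob("projects/*/memory/**/*.md"))
    copied = []
    for source in sources:
        if not source.is_file():
            continue  # skip anything that does not exist on this machine
        target = destination / source.relative_to(claude_home)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also keeps modification times
        copied.append(target)
    return copied
```

Calling `backup_claude_context(Path.home() / ".claude", Path("claude-backup"))` writes the copies into a claude-backup folder and returns the list of files written. Session logs are deliberately left out, since they are far larger and are not what gets loaded into new sessions.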

Your AI agent has been doing context engineering on your behalf for months. The question is whether you are curating it, or leaving it to chance.

Last updated: April 2026. If we got something wrong → promptarc.dev/feedback