wip: [01-stabilize] paused at task 1/1 - OCR Hallucination Immune logic via Semantic delta window and fret-isolation

2026-03-29 22:08:40 +09:00
parent aca7bf592a
commit 2507de45d3
4289 changed files with 732689 additions and 28672 deletions
--- a/.agent/services/claude-mem/docs/public/CLAUDE.md
+++ b/.agent/services/claude-mem/docs/public/CLAUDE.md
@@ -0,0 +1,88 @@
+# Claude-Mem Public Documentation
+
+## What This Folder Is
+
+This `docs/public/` folder contains the **Mintlify documentation site** - the official user-facing documentation for claude-mem. It's a structured documentation platform with a specific file format and organization.
+
+## Folder Structure
+
+```
+docs/
+├── public/          ← You are here (Mintlify MDX files)
+│   ├── *.mdx       - User-facing documentation pages
+│   ├── docs.json   - Mintlify configuration and navigation
+│   ├── architecture/ - Technical architecture docs
+│   ├── usage/      - User guides and workflows
+│   └── *.webp, *.gif - Assets (logos, screenshots)
+└── context/        ← Internal documentation (DO NOT put here)
+    └── *.md        - Planning docs, audits, references
+```
+
+## File Requirements
+
+### Mintlify Documentation Files (.mdx)
+All official documentation files must be:
+- Written in `.mdx` format (Markdown with JSX support)
+- Listed in `docs.json` navigation structure
+- Compliant with Mintlify's schema and conventions
+
+The documentation is organized into these sections:
+- **Get Started**: Introduction, installation, usage guides
+- **Best Practices**: Context engineering, progressive disclosure
+- **Configuration & Development**: Settings, dev workflow, troubleshooting
+- **Architecture**: System design, components, technical details
+
+### Configuration File
+`docs.json` defines:
+- Site metadata (name, description, theme)
+- Navigation structure
+- Branding (logos, colors)
+- Footer links and social media
+
+## What Does NOT Belong Here
+
+**Planning documents, design docs, and reference materials go in `/docs/context/` instead:**
+
+Files that belong in `/docs/context/` (NOT here):
+- Planning documents (`*-plan.md`, `*-outline.md`)
+- Implementation analysis (`*-audit.md`, `*-code-reference.md`)
+- Error tracking (`typescript-errors.md`)
+- Internal design documents
+- PR review responses
+- Reference materials (like `agent-sdk-ref.md`)
+- Work-in-progress documentation
+
+## How to Add Official Documentation
+
+1. Create a new `.mdx` file in the appropriate subdirectory
+2. Add the file path to `docs.json` navigation
+3. Use Mintlify's frontmatter and components
+4. Follow the existing documentation style
+5. Test locally: `npx mintlify dev`
+
+## Development Workflow
+
+**For contributors working on claude-mem:**
+- Read `/CLAUDE.md` in the project root for development instructions
+- Place planning/design docs in `/docs/context/`
+- Only add user-facing documentation to `/docs/public/`
+- Test documentation locally with Mintlify CLI before committing
+
+## Testing Documentation
+
+```bash
+# Validate docs structure
+npx mintlify validate
+
+# Check for broken links
+npx mintlify broken-links
+
+# Run local dev server
+npx mintlify dev
+```
+
+## Summary
+
+**Simple Rule**:
+- `/docs/public/` = Official user documentation (Mintlify .mdx files) ← YOU ARE HERE
+- `/docs/context/` = Internal docs, plans, references, audits
--- a/.agent/services/claude-mem/docs/public/architecture-evolution.mdx
+++ b/.agent/services/claude-mem/docs/public/architecture-evolution.mdx
--- a/.agent/services/claude-mem/docs/public/architecture/database.mdx
+++ b/.agent/services/claude-mem/docs/public/architecture/database.mdx
@@ -0,0 +1,309 @@
+---
+title: "Database Architecture"
+description: "SQLite schema, FTS5 search, and data storage"
+---
+
+# Database Architecture
+
+Claude-Mem uses SQLite 3 with the bun:sqlite native module for persistent storage and FTS5 for full-text search.
+
+## Database Location
+
+**Path**: `~/.claude-mem/claude-mem.db`
+
+The database uses SQLite's WAL (Write-Ahead Logging) mode for concurrent reads/writes.
+
+## Database Implementation
+
+**Primary Implementation**: bun:sqlite (native SQLite module)
+- Used by: SessionStore and SessionSearch
+- Format: Synchronous API with better performance
+- **Note**: Database.ts (using bun:sqlite) is legacy code
+
+## Core Tables
+
+### 1. sdk_sessions
+
+Tracks active and completed sessions.
+
+```sql
+CREATE TABLE sdk_sessions (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  sdk_session_id TEXT UNIQUE NOT NULL,
+  claude_session_id TEXT,
+  project TEXT NOT NULL,
+  prompt_counter INTEGER DEFAULT 0,
+  status TEXT NOT NULL DEFAULT 'active',
+  created_at TEXT NOT NULL,
+  created_at_epoch INTEGER NOT NULL,
+  completed_at TEXT,
+  completed_at_epoch INTEGER,
+  last_activity_at TEXT,
+  last_activity_epoch INTEGER
+);
+```
+
+**Indexes**:
+- `idx_sdk_sessions_claude_session` on `claude_session_id`
+- `idx_sdk_sessions_project` on `project`
+- `idx_sdk_sessions_status` on `status`
+- `idx_sdk_sessions_created_at` on `created_at_epoch DESC`
+
+### 2. observations
+
+Individual tool executions with hierarchical structure.
+
+```sql
+CREATE TABLE observations (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  session_id TEXT NOT NULL,
+  sdk_session_id TEXT NOT NULL,
+  claude_session_id TEXT,
+  project TEXT NOT NULL,
+  prompt_number INTEGER,
+  tool_name TEXT NOT NULL,
+  correlation_id TEXT,
+
+  -- Hierarchical fields
+  title TEXT,
+  subtitle TEXT,
+  narrative TEXT,
+  text TEXT,
+  facts TEXT,
+  concepts TEXT,
+  type TEXT,
+  files_read TEXT,
+  files_modified TEXT,
+
+  created_at TEXT NOT NULL,
+  created_at_epoch INTEGER NOT NULL,
+
+  FOREIGN KEY (sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
+);
+```
+
+**Observation Types**:
+- `decision` - Architectural or design decisions
+- `bugfix` - Bug fixes and corrections
+- `feature` - New features or capabilities
+- `refactor` - Code refactoring and cleanup
+- `discovery` - Learnings about the codebase
+- `change` - General changes and modifications
+
+**Indexes**:
+- `idx_observations_session` on `session_id`
+- `idx_observations_sdk_session` on `sdk_session_id`
+- `idx_observations_project` on `project`
+- `idx_observations_tool_name` on `tool_name`
+- `idx_observations_created_at` on `created_at_epoch DESC`
+- `idx_observations_type` on `type`
+
+### 3. session_summaries
+
+AI-generated session summaries (multiple per session).
+
+```sql
+CREATE TABLE session_summaries (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  sdk_session_id TEXT NOT NULL,
+  claude_session_id TEXT,
+  project TEXT NOT NULL,
+  prompt_number INTEGER,
+
+  -- Summary fields
+  request TEXT,
+  investigated TEXT,
+  learned TEXT,
+  completed TEXT,
+  next_steps TEXT,
+  notes TEXT,
+
+  created_at TEXT NOT NULL,
+  created_at_epoch INTEGER NOT NULL,
+
+  FOREIGN KEY (sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
+);
+```
+
+**Indexes**:
+- `idx_session_summaries_sdk_session` on `sdk_session_id`
+- `idx_session_summaries_project` on `project`
+- `idx_session_summaries_created_at` on `created_at_epoch DESC`
+
+### 4. user_prompts
+
+Raw user prompts with FTS5 search (as of v4.2.0).
+
+```sql
+CREATE TABLE user_prompts (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  sdk_session_id TEXT NOT NULL,
+  claude_session_id TEXT,
+  project TEXT NOT NULL,
+  prompt_number INTEGER,
+  prompt_text TEXT NOT NULL,
+  created_at TEXT NOT NULL,
+  created_at_epoch INTEGER NOT NULL,
+
+  FOREIGN KEY (sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
+);
+```
+
+**Indexes**:
+- `idx_user_prompts_sdk_session` on `sdk_session_id`
+- `idx_user_prompts_project` on `project`
+- `idx_user_prompts_created_at` on `created_at_epoch DESC`
+
+### Legacy Tables
+
+- **sessions**: Legacy session tracking (v3.x)
+- **memories**: Legacy compressed memory chunks (v3.x)
+- **overviews**: Legacy session summaries (v3.x)
+
+## FTS5 Full-Text Search
+
+SQLite FTS5 (Full-Text Search) virtual tables enable fast full-text search across observations, summaries, and user prompts.
+
+### FTS5 Virtual Tables
+
+#### observations_fts
+
+```sql
+CREATE VIRTUAL TABLE observations_fts USING fts5(
+  title,
+  subtitle,
+  narrative,
+  text,
+  facts,
+  concepts,
+  content='observations',
+  content_rowid='id'
+);
+```
+
+#### session_summaries_fts
+
+```sql
+CREATE VIRTUAL TABLE session_summaries_fts USING fts5(
+  request,
+  investigated,
+  learned,
+  completed,
+  next_steps,
+  notes,
+  content='session_summaries',
+  content_rowid='id'
+);
+```
+
+#### user_prompts_fts
+
+```sql
+CREATE VIRTUAL TABLE user_prompts_fts USING fts5(
+  prompt_text,
+  content='user_prompts',
+  content_rowid='id'
+);
+```
+
+### Automatic Synchronization
+
+FTS5 tables stay in sync via triggers:
+
+```sql
+-- Insert trigger example
+CREATE TRIGGER observations_ai AFTER INSERT ON observations BEGIN
+  INSERT INTO observations_fts(rowid, title, subtitle, narrative, text, facts, concepts)
+  VALUES (new.id, new.title, new.subtitle, new.narrative, new.text, new.facts, new.concepts);
+END;
+
+-- Update trigger example
+CREATE TRIGGER observations_au AFTER UPDATE ON observations BEGIN
+  INSERT INTO observations_fts(observations_fts, rowid, title, subtitle, narrative, text, facts, concepts)
+  VALUES('delete', old.id, old.title, old.subtitle, old.narrative, old.text, old.facts, old.concepts);
+  INSERT INTO observations_fts(rowid, title, subtitle, narrative, text, facts, concepts)
+  VALUES (new.id, new.title, new.subtitle, new.narrative, new.text, new.facts, new.concepts);
+END;
+
+-- Delete trigger example
+CREATE TRIGGER observations_ad AFTER DELETE ON observations BEGIN
+  INSERT INTO observations_fts(observations_fts, rowid, title, subtitle, narrative, text, facts, concepts)
+  VALUES('delete', old.id, old.title, old.subtitle, old.narrative, old.text, old.facts, old.concepts);
+END;
+```
+
+### FTS5 Query Syntax
+
+FTS5 supports rich query syntax:
+
+- **Simple**: `error handling` (bare terms are implicitly combined with AND)
+- **AND**: `"error" AND "handling"`
+- **OR**: `"bug" OR "fix"`
+- **NOT**: `"bug" NOT "feature"`
+- **Phrase**: `"exact phrase"`
+- **Column**: `title:"authentication"`
+
+### Security
+
+As of v4.2.3, all FTS5 queries are properly escaped to prevent SQL injection:
+- Double quotes are escaped: `query.replace(/"/g, '""')`
+- Comprehensive test suite with 332 injection attack tests
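+
+A minimal sketch of the quoting step described above, assuming the escaped text is then wrapped in double quotes so that FTS5 reads it as a single string. The helper name `escapeFts5Phrase` is hypothetical and not taken from the claude-mem source, and the result would still be passed to SQLite as a bound parameter rather than concatenated into SQL.
+
+```typescript
+// Hypothetical helper (not from the claude-mem source): turn raw user input
+// into one FTS5 string by doubling embedded double quotes and wrapping the
+// result in double quotes.
+function escapeFts5Phrase(query: string): string {
+  return '"' + query.replace(/"/g, '""') + '"';
+}
+```
+
+Quoting the whole input this way means operators such as `AND` or `NOT` typed by a user are matched as plain words instead of being interpreted as query syntax.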
+
+## Database Classes
+
+### SessionStore
+
+CRUD operations for sessions, observations, summaries, and user prompts.
+
+**Location**: `src/services/sqlite/SessionStore.ts`
+
+**Methods**:
+- `createSession()`
+- `getSession()`
+- `updateSession()`
+- `createObservation()`
+- `getObservations()`
+- `createSummary()`
+- `getSummaries()`
+- `createUserPrompt()`
+
+### SessionSearch
+
+FTS5 full-text search with 8 specialized search methods.
+
+**Location**: `src/services/sqlite/SessionSearch.ts`
+
+**Methods**:
+- `searchObservations()` - Full-text search across observations
+- `searchSessions()` - Full-text search across summaries
+- `searchUserPrompts()` - Full-text search across user prompts
+- `findByConcept()` - Find by concept tags
+- `findByFile()` - Find by file references
+- `findByType()` - Find by observation type
+- `getRecentContext()` - Get recent session context
+- `advancedSearch()` - Combined filters
+
+## Migrations
+
+Database schema is managed via migrations in `src/services/sqlite/migrations.ts`.
+
+**Migration History**:
+- Migration 001: Initial schema (sessions, memories, overviews, diagnostics, transcript_events)
+- Migration 002: Hierarchical memory fields (title, subtitle, facts, concepts, files_touched)
+- Migration 003: SDK sessions and observations
+- Migration 004: Session summaries
+- Migration 005: Multi-prompt sessions (prompt_counter, prompt_number)
+- Migration 006: FTS5 virtual tables and triggers
+- Migration 007-010: Various improvements and user prompts table
+
+## Performance Considerations
+
+- **Indexes**: All foreign keys and frequently queried columns are indexed
+- **FTS5**: Full-text search is significantly faster than LIKE queries
+- **Triggers**: Automatic synchronization has minimal overhead
+- **Connection Pooling**: bun:sqlite reuses connections efficiently
+- **Synchronous API**: bun:sqlite uses a synchronous API for better performance
+
+## Troubleshooting
+
+See [Troubleshooting - Database Issues](../troubleshooting.md#database-issues) for common problems and solutions.
--- a/.agent/services/claude-mem/docs/public/architecture/hooks.mdx
+++ b/.agent/services/claude-mem/docs/public/architecture/hooks.mdx
@@ -0,0 +1,955 @@
+---
+title: "Hook Lifecycle"
+description: "Complete guide to the 5-stage memory agent lifecycle for platform implementers"
+---
+
+# Hook Lifecycle
+
+Claude-Mem implements a **5-stage hook system** that captures development work across Claude Code sessions. This document provides a complete technical reference for developers implementing this pattern on other platforms.
+
+## Architecture Overview
+
+### System Architecture
+
+This two-process architecture works in both Claude Code and VS Code:
+
+```mermaid
+graph TB
+    subgraph EXT["Extension Process (runs in IDE)"]
+        direction TB
+        ACT[Extension Activation]
+        HOOKS[Hook Event Handlers]
+        ACT --> HOOKS
+
+        subgraph HOOK_HANDLERS["5 Lifecycle Hooks"]
+            H1[SessionStart<br/>activate function]
+            H2[UserPromptSubmit<br/>command handler]
+            H3[PostToolUse<br/>middleware]
+            H4[Stop<br/>idle timeout]
+            H5[SessionEnd<br/>deactivate function]
+        end
+
+        HOOKS --> HOOK_HANDLERS
+    end
+
+    HOOK_HANDLERS -->|"HTTP<br/>(fire-and-forget<br/>2s timeout)"| HTTP[Worker HTTP API<br/>Port 37777]
+
+    subgraph WORKER["Worker Process (separate Node.js)"]
+        direction TB
+        HTTP --> API[Express Server]
+        API --> SESS[Session Manager]
+        API --> AGENT[SDK Agent]
+        API --> DB[Database Manager]
+
+        AGENT -->|Event-Driven| CLAUDE[Claude Agent SDK]
+        CLAUDE --> SQLITE[(SQLite + FTS5)]
+        CLAUDE --> CHROMA[(Chroma Vectors)]
+    end
+
+    style EXT fill:#e1f5ff
+    style WORKER fill:#fff4e1
+    style HOOK_HANDLERS fill:#f0f0f0
+```
+
+**Key Principles:**
+- Extension process never blocks (fire-and-forget HTTP)
+- Worker processes observations asynchronously
+- Session state persists across IDE restarts
+
+### VS Code Extension API Integration Points
+
+For developers porting to VS Code, here's where to hook into the VS Code Extension API:
+
+```mermaid
+graph LR
+    subgraph VSCODE["VS Code Extension API"]
+        direction TB
+        A["activate(context)"]
+        B["commands.registerCommand()"]
+        C["chat.createChatParticipant()"]
+        D["workspace.onDidSaveTextDocument()"]
+        E["window.onDidChangeActiveTextEditor()"]
+        F["deactivate()"]
+    end
+
+    subgraph HOOKS["Hook Equivalents"]
+        direction TB
+        G[SessionStart]
+        H[UserPromptSubmit]
+        I[PostToolUse]
+        J[Stop/Summary]
+        K[SessionEnd]
+    end
+
+    subgraph WORKER_API["Worker HTTP Endpoints"]
+        direction TB
+        L[GET /api/context/inject]
+        M[POST /sessions/init]
+        N[POST /api/sessions/observations]
+        O[POST /api/sessions/summarize]
+        P[POST /api/sessions/complete]
+    end
+
+    A --> G
+    B --> H
+    C --> H
+    D --> I
+    E --> I
+    F --> K
+
+    G --> L
+    H --> M
+    I --> N
+    J --> O
+    K --> P
+
+    style VSCODE fill:#007acc,color:#fff
+    style HOOKS fill:#f0f0f0
+    style WORKER_API fill:#4caf50,color:#fff
+```
+
+**Implementation Examples:**
+
+```typescript
+import * as vscode from 'vscode'
+
+// generateSessionId() and injectContextToChat() are placeholders for
+// platform-specific helpers that are not shown here.
+const WORKER = 'http://localhost:37777'
+const JSON_HEADERS = { 'Content-Type': 'application/json' }
+let sessionId: string
+let project: string
+
+// VS Code Extension - SessionStart Hook
+export async function activate(context: vscode.ExtensionContext) {
+  sessionId = generateSessionId()
+  project = vscode.workspace.name || 'default'
+
+  // Fetch context from worker (project name is URL-encoded)
+  const response = await fetch(`${WORKER}/api/context/inject?project=${encodeURIComponent(project)}`)
+  const priorContext = await response.text()
+
+  // Inject into chat or UI panel
+  injectContextToChat(priorContext)
+
+  context.subscriptions.push(
+    // VS Code Extension - UserPromptSubmit Hook
+    vscode.commands.registerCommand('extension.command', async (prompt: string) => {
+      await fetch(`${WORKER}/sessions/init`, {
+        method: 'POST',
+        headers: JSON_HEADERS,
+        body: JSON.stringify({ sessionId, project, userPrompt: prompt })
+      })
+    }),
+
+    // VS Code Extension - PostToolUse Hook (middleware pattern)
+    vscode.workspace.onDidSaveTextDocument(async (document) => {
+      await fetch(`${WORKER}/api/sessions/observations`, {
+        method: 'POST',
+        headers: JSON_HEADERS,
+        body: JSON.stringify({
+          claudeSessionId: sessionId,
+          tool_name: 'FileSave',
+          tool_input: { path: document.uri.path },
+          tool_response: 'File saved successfully'
+        })
+      })
+    })
+  )
+}
+```
+
+### Async Processing Pipeline
+
+How observations flow from extension to database without blocking the IDE:
+
+```mermaid
+graph TB
+    A["Extension: Tool Use Event"] --> B{"Skip List?<br/>(TodoWrite, AskUserQuestion, etc.)"}
+    B -->|"Skip"| X["Discard"]
+    B -->|"Keep"| C["Strip Privacy Tags<br/>&lt;private&gt;...&lt;/private&gt;"]
+    C --> D["HTTP POST to Worker<br/>Port 37777"]
+    D --> E["2s timeout<br/>fire-and-forget"]
+    E --> F["Extension continues<br/>(non-blocking)"]
+
+    D -.Async Path.-> G["Worker: Queue Observation"]
+    G --> H["SDK Agent picks up<br/>(event-driven)"]
+    H --> I["Call Claude API<br/>(compress observation)"]
+    I --> J["Parse XML response"]
+    J --> K["Save to SQLite<br/>(observations table)"]
+    K --> L["Sync to Chroma<br/>(vector embeddings)"]
+
+    style F fill:#90EE90,stroke:#2d6b2d,stroke-width:3px
+    style L fill:#87CEEB,stroke:#2d5f8d,stroke-width:3px
+    style E fill:#ffeb3b,stroke:#c6a700,stroke-width:2px
+```
+
+**Critical Pattern:** The extension's HTTP call has a 2-second timeout and doesn't wait for AI processing. The worker handles compression asynchronously using an event-driven queue.
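+
+The fire-and-forget call described above could be sketched as follows. This is an illustration under stated assumptions, not the actual hook code: `postWithTimeout` and the `PostFn` type are hypothetical names, and the injectable `post` parameter exists only so the sketch can be exercised with a stub instead of a running worker.
+
+```typescript
+// Minimal fetch-like signature so the sketch can be exercised with a stub.
+type PostFn = (
+  url: string,
+  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal }
+) => Promise<{ ok: boolean }>;
+
+// Hypothetical helper: POST a JSON payload, give up after `timeoutMs`,
+// and never throw, so the calling hook continues regardless of the outcome.
+async function postWithTimeout(
+  url: string,
+  payload: unknown,
+  timeoutMs = 2000,
+  post: PostFn = (u, init) => fetch(u, init)
+): Promise<boolean> {
+  try {
+    const res = await post(url, {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify(payload),
+      signal: AbortSignal.timeout(timeoutMs),
+    });
+    return res.ok;
+  } catch {
+    // Timeout, refused connection, or worker down: swallow and report failure.
+    return false;
+  }
+}
+```
+
+Returning a boolean instead of throwing keeps the caller's control flow identical whether the worker is healthy, slow, or absent.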
+
+## The 5 Lifecycle Stages
+
+| Stage | Hook | Trigger | Purpose |
+|-------|------|---------|---------|
+| **1. SessionStart** | `context-hook.js` | User opens Claude Code | Inject prior context silently |
+| **2. UserPromptSubmit** | `new-hook.js` | User submits a prompt | Create/get session, save prompt, init worker |
+| **3. PostToolUse** | `save-hook.js` | Claude uses any tool | Queue observation for AI compression |
+| **4. Stop** | `summary-hook.js` | User stops asking questions | Generate session summary |
+| **5. SessionEnd** | `cleanup-hook.js` | Session closes | Mark session completed |
+
+## Hook Configuration
+
+Hooks are configured in `plugin/hooks/hooks.json`:
+
+```json
+{
+  "hooks": {
+    "SessionStart": [{
+      "matcher": "startup|clear|compact",
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/smart-install.js",
+        "timeout": 300
+      }, {
+        "type": "command",
+        "command": "bun ${CLAUDE_PLUGIN_ROOT}/scripts/worker-service.cjs start",
+        "timeout": 60
+      }, {
+        "type": "command",
+        "command": "bun ${CLAUDE_PLUGIN_ROOT}/scripts/context-hook.js",
+        "timeout": 60
+      }]
+    }],
+    "UserPromptSubmit": [{
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/new-hook.js",
+        "timeout": 120
+      }]
+    }],
+    "PostToolUse": [{
+      "matcher": "*",
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/save-hook.js",
+        "timeout": 120
+      }]
+    }],
+    "Stop": [{
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/summary-hook.js",
+        "timeout": 120
+      }]
+    }],
+    "SessionEnd": [{
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/cleanup-hook.js",
+        "timeout": 120
+      }]
+    }]
+  }
+}
+```
+
+---
+
+## Stage 1: SessionStart
+
+**Timing**: When user opens Claude Code or resumes session
+
+**Hooks Triggered** (in order):
+1. `smart-install.js` - Ensures dependencies are installed
+2. `worker-service.cjs start` - Starts the worker service
+3. `context-hook.js` - Fetches and silently injects prior session context
+
+<Note>
+As of Claude Code 2.1.0 (ultrathink update), SessionStart hooks no longer display user-visible messages. Context is silently injected via `hookSpecificOutput.additionalContext`.
+</Note>
+
+### Sequence Diagram
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant IDE as IDE/Extension
+    participant ContextHook as context-hook.js
+    participant Worker as Worker Service
+    participant DB as SQLite Database
+
+    User->>IDE: Opens workspace / resumes session
+    IDE->>ContextHook: Trigger SessionStart hook
+    ContextHook->>ContextHook: Generate/reuse session_id
+    ContextHook->>Worker: Health check (max 10s retry)
+
+    alt Worker Ready
+        ContextHook->>Worker: GET /api/context/inject?project=X
+        Worker->>DB: SELECT * FROM observations<br/>WHERE project=X<br/>ORDER BY created_at DESC<br/>LIMIT 50
+        DB-->>Worker: Last 50 observations
+        Worker-->>ContextHook: Context markdown
+        ContextHook-->>IDE: hookSpecificOutput.additionalContext
+        IDE->>IDE: Inject context to Claude's prompt
+        IDE-->>User: Session ready with context
+    else Worker Not Ready
+        ContextHook-->>IDE: Empty context (graceful degradation)
+        IDE-->>User: Session ready without context
+    end
+
+    Note over User,DB: Total time: <300ms (with health check)
+```
+
+### Context Hook (`context-hook.js`)
+
+**Purpose**: Inject context from previous sessions into Claude's initial context.
+
+**Input** (via stdin):
+```json
+{
+  "session_id": "claude-session-123",
+  "cwd": "/path/to/project",
+  "source": "startup"
+}
+```
+
+**Processing**:
+1. Wait for worker to be available (health check, max 10 seconds)
+2. Call: `GET http://127.0.0.1:37777/api/context/inject?project={project}`
+3. Return formatted context as `additionalContext` in `hookSpecificOutput`
+
+**Output** (via stdout):
+```json
+{
+  "hookSpecificOutput": {
+    "hookEventName": "SessionStart",
+    "additionalContext": "<<formatted context markdown>>"
+  }
+}
+```
+
+**Implementation**: `src/hooks/context-hook.ts`
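+
+The input and output shapes above can be tied together in a short sketch. The helper names `contextUrl` and `sessionStartOutput` are hypothetical, and deriving the project name from `basename(cwd)` is an assumption carried over from the UserPromptSubmit stage; the real logic lives in `src/hooks/context-hook.ts`.
+
+```typescript
+import * as path from 'node:path';
+
+// Shape of the stdin payload shown above.
+interface SessionStartInput {
+  session_id: string;
+  cwd: string;
+  source: string;
+}
+
+// Hypothetical helper: build the worker URL for the project derived from cwd.
+function contextUrl(input: SessionStartInput, port = 37777): string {
+  const project = path.basename(input.cwd);
+  return `http://127.0.0.1:${port}/api/context/inject?project=${encodeURIComponent(project)}`;
+}
+
+// Hypothetical helper: wrap context markdown in the SessionStart output envelope.
+function sessionStartOutput(additionalContext: string) {
+  return {
+    hookSpecificOutput: { hookEventName: 'SessionStart', additionalContext },
+  };
+}
+```
+
+When the worker is unavailable, passing an empty string to `sessionStartOutput` would correspond to the graceful-degradation branch in the sequence diagram.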
+
+---
+
+## Stage 2: UserPromptSubmit
+
+**Timing**: When user submits any prompt in a session
+
+**Hook**: `new-hook.js`
+
+### Sequence Diagram
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant IDE as IDE/Extension
+    participant NewHook as new-hook.js
+    participant DB as Direct SQLite Access
+    participant Worker as Worker Service
+
+    User->>IDE: Submits prompt: "Add login feature"
+    IDE->>NewHook: Trigger UserPromptSubmit<br/>{ session_id, cwd, prompt }
+
+    NewHook->>NewHook: Extract project = basename(cwd)
+    NewHook->>NewHook: Strip privacy tags<br/>&lt;private&gt;...&lt;/private&gt;
+
+    alt Prompt fully private (empty after stripping)
+        NewHook-->>IDE: Skip (don't save)
+    else Prompt has content
+        NewHook->>DB: INSERT OR IGNORE INTO sdk_sessions<br/>(claude_session_id, project, first_user_prompt)
+        DB-->>NewHook: sessionDbId (new or existing)
+
+        NewHook->>DB: UPDATE sdk_sessions<br/>SET prompt_counter = prompt_counter + 1<br/>WHERE id = sessionDbId
+        DB-->>NewHook: promptNumber (e.g., 1 for first, 2 for continuation)
+
+        NewHook->>DB: INSERT INTO user_prompts<br/>(session_id, prompt_number, prompt)
+
+        NewHook->>Worker: POST /sessions/{sessionDbId}/init<br/>{ project, userPrompt, promptNumber }<br/>(fire-and-forget, 2s timeout)
+        Worker-->>NewHook: 200 OK (or timeout)
+
+        NewHook-->>IDE: { continue: true, suppressOutput: true }
+        IDE-->>User: Prompt accepted
+    end
+
+    Note over NewHook,DB: Idempotent: Same session_id → same sessionDbId
+```
+
+**Key Pattern:** The `INSERT OR IGNORE` ensures the same `session_id` always maps to the same `sessionDbId`, enabling conversation continuations.
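+
+To make the idempotency property concrete, here is a small in-memory sketch. The `SessionTable` class is hypothetical and only stands in for the `sdk_sessions` table; the real hook performs these steps against SQLite.
+
+```typescript
+// In-memory stand-in for the sdk_sessions table, used only to illustrate
+// the idempotent "create or get" behaviour.
+class SessionTable {
+  private nextId = 1;
+  private rows = new Map<string, { id: number; promptCounter: number }>();
+
+  // Equivalent in spirit to INSERT OR IGNORE followed by a lookup.
+  createOrGet(claudeSessionId: string): number {
+    let row = this.rows.get(claudeSessionId);
+    if (!row) {
+      row = { id: this.nextId++, promptCounter: 0 };
+      this.rows.set(claudeSessionId, row);
+    }
+    return row.id;
+  }
+
+  // Equivalent in spirit to UPDATE ... SET prompt_counter = prompt_counter + 1.
+  incrementPromptCounter(claudeSessionId: string): number {
+    const row = this.rows.get(claudeSessionId);
+    if (!row) throw new Error(`unknown session: ${claudeSessionId}`);
+    return ++row.promptCounter;
+  }
+}
+```
+
+Calling `createOrGet` twice with the same `session_id` yields the same row id, while each call to `incrementPromptCounter` advances the prompt number for that conversation.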
+
+**Input** (via stdin):
+```json
+{
+  "session_id": "claude-session-123",
+  "cwd": "/path/to/project",
+  "prompt": "User's actual prompt text"
+}
+```
+
+**Processing Steps**:
+
+```typescript
+// 1. Extract project name from working directory
+project = path.basename(cwd)
+
+// 2. Create or get database session (IDEMPOTENT)
+sessionDbId = db.createSDKSession(session_id, project, prompt)
+// INSERT OR IGNORE: Creates new row if first prompt, returns existing if continuation
+
+// 3. Increment prompt counter
+promptNumber = db.incrementPromptCounter(sessionDbId)
+// Returns 1 for first prompt, 2 for continuation, etc.
+
+// 4. Strip privacy tags
+cleanedPrompt = stripMemoryTagsFromPrompt(prompt)
+// Removes <private>...</private> and <claude-mem-context>...</claude-mem-context>
+
+// 5. Skip if fully private
+if (!cleanedPrompt || cleanedPrompt.trim() === '') {
+  return  // Don't save, don't call worker
+}
+
+// 6. Save user prompt to database
+db.saveUserPrompt(session_id, promptNumber, cleanedPrompt)
+
+// 7. Initialize session via worker HTTP
+POST http://127.0.0.1:37777/sessions/{sessionDbId}/init
+Body: { project, userPrompt, promptNumber }
+```
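+
+Steps 4 and 5 above could look roughly like this. `stripTags` is a hypothetical stand-in for `stripMemoryTagsFromPrompt`, whose real implementation is not shown in this document; the regular expressions assume the tags are not nested.
+
+```typescript
+// Hypothetical stand-in for stripMemoryTagsFromPrompt: removes
+// <private>...</private> and <claude-mem-context>...</claude-mem-context>
+// blocks (including multi-line ones) and trims the remainder.
+function stripTags(prompt: string): string {
+  return prompt
+    .replace(/<private>[\s\S]*?<\/private>/g, '')
+    .replace(/<claude-mem-context>[\s\S]*?<\/claude-mem-context>/g, '')
+    .trim();
+}
+
+// True when nothing is left to store after stripping.
+function isFullyPrivate(prompt: string): boolean {
+  return stripTags(prompt) === '';
+}
+```
+
+The non-greedy `[\s\S]*?` pattern keeps two separate tagged regions in one prompt from being merged into a single match.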
+
+**Output**:
+```json
+{ "continue": true, "suppressOutput": true }
+```
+
+**Implementation**: `src/hooks/new-hook.ts`
+
+<Note>
+The same `session_id` flows through ALL hooks in a conversation. The `createSDKSession` call is idempotent - it returns the existing session for continuation prompts.
+</Note>
+
+---
+
+## Stage 3: PostToolUse
+
+**Timing**: After Claude uses any tool (Read, Bash, Grep, Write, etc.)
+
+**Hook**: `save-hook.js`
+
+### Sequence Diagram
+
+```mermaid
+sequenceDiagram
+    participant Claude as Claude AI
+    participant IDE as IDE/Extension
+    participant SaveHook as save-hook.js
+    participant Worker as Worker Service
+    participant Agent as SDK Agent
+    participant DB as SQLite + Chroma
+
+    Claude->>IDE: Uses tool: Read("/src/auth.ts")
+    IDE->>SaveHook: PostToolUse hook triggered<br/>{ session_id, tool_name, tool_input, tool_response }
+
+    SaveHook->>SaveHook: Check skip list<br/>(TodoWrite, AskUserQuestion, etc.)
+
+    alt Tool in skip list
+        SaveHook-->>IDE: Discard (low-value tool)
+    else Tool allowed
+        SaveHook->>SaveHook: Strip privacy tags from input/response
+
+        SaveHook->>SaveHook: Ensure worker running<br/>(health check)
+
+        SaveHook->>Worker: POST /api/sessions/observations<br/>{ claudeSessionId, tool_name, tool_input, tool_response, cwd }<br/>(fire-and-forget, 2s timeout)
+
+        SaveHook-->>IDE: { continue: true, suppressOutput: true }
+        IDE-->>Claude: Tool execution complete
+
+        Note over Worker,DB: Async path (doesn't block IDE)
+
+        Worker->>Worker: createSDKSession(claudeSessionId)<br/>→ returns sessionDbId
+        Worker->>Worker: Check if prompt was private<br/>(skip if fully private)
+        Worker->>Agent: Queue observation for processing
+        Agent->>Agent: Call Claude SDK to compress<br/>observation into structured format
+        Agent->>DB: Save compressed observation<br/>to observations table
+        Agent->>DB: Sync to Chroma vector DB
+    end
+
+    Note over SaveHook,DB: Total sync time: ~2ms<br/>AI processing: 1-3s (async)
+```
+
+**Key Pattern:** The hook returns immediately after HTTP POST. AI compression happens asynchronously in the worker without blocking Claude's tool execution.
+
+**Input** (via stdin):
+```json
+{
+  "session_id": "claude-session-123",
+  "cwd": "/path/to/project",
+  "tool_name": "Read",
+  "tool_input": { "file_path": "/src/index.ts" },
+  "tool_response": "file contents..."
+}
+```
+
+**Processing Steps**:
+
+```typescript
+// 1. Check blocklist - skip low-value tools
+const SKIP_TOOLS = new Set([
+  'ListMcpResourcesTool',  // MCP infrastructure noise
+  'SlashCommand',          // Command invocation
+  'Skill',                 // Skill invocation
+  'TodoWrite',             // Task management meta-tool
+  'AskUserQuestion'        // User interaction
+])
+
+if (SKIP_TOOLS.has(tool_name)) return
+
+// 2. Ensure worker is running
+await ensureWorkerRunning()
+
+// 3. Send to worker (fire-and-forget HTTP)
+POST http://127.0.0.1:37777/api/sessions/observations
+Body: {
+  claudeSessionId: session_id,
+  tool_name,
+  tool_input,
+  tool_response,
+  cwd
+}
+Timeout: 2000ms
+```
+
+**Worker Processing**:
+1. Looks up or creates session: `createSDKSession(claudeSessionId, '', '')`
+2. Gets prompt counter
+3. Checks privacy (skips if user prompt was entirely private)
+4. Strips memory tags from `tool_input` and `tool_response`
+5. Queues observation for SDK agent processing
+6. SDK agent calls Claude to compress into structured observation
+7. Stores observation in database and syncs to Chroma
+
+**Output**:
+```json
+{ "continue": true, "suppressOutput": true }
+```
+
+**Implementation**: `src/hooks/save-hook.ts`
+
+---
+
+## Stage 4: Stop
+
+**Timing**: When user stops or pauses asking questions
+
+**Hook**: `summary-hook.js`
+
+### Sequence Diagram
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant IDE as IDE/Extension
+    participant SummaryHook as summary-hook.js
+    participant Worker as Worker Service
+    participant Agent as SDK Agent
+    participant DB as SQLite Database
+
+    User->>IDE: Stops asking questions<br/>(pause, idle, or explicit stop)
+    IDE->>SummaryHook: Stop hook triggered<br/>{ session_id, cwd, transcript_path }
+
+    SummaryHook->>SummaryHook: Read transcript JSONL file
+    SummaryHook->>SummaryHook: Extract last user message<br/>(type: "user")
+    SummaryHook->>SummaryHook: Extract last assistant message<br/>(type: "assistant", filter &lt;system-reminder&gt;)
+
+    SummaryHook->>Worker: POST /api/sessions/summarize<br/>{ claudeSessionId, last_user_message, last_assistant_message }<br/>(fire-and-forget, 2s timeout)
+
+    SummaryHook->>Worker: POST /api/processing<br/>{ isProcessing: false }<br/>(stop spinner)
+
+    SummaryHook-->>IDE: { continue: true, suppressOutput: true }
+    IDE-->>User: Session paused/stopped
+
+    Note over Worker,DB: Async path
+
+    Worker->>Worker: Lookup sessionDbId from claudeSessionId
+    Worker->>Agent: Queue summarization request
+    Agent->>Agent: Call Claude SDK with prompt:<br/>"Summarize: request, investigated, learned, completed, next_steps"
+    Agent->>Agent: Parse XML response
+    Agent->>DB: INSERT INTO session_summaries<br/>{ session_id, request, investigated, learned, completed, next_steps }
+    Agent->>DB: Sync to Chroma (for semantic search)
+
+    Note over SummaryHook,DB: Total sync time: ~2ms<br/>AI summarization: 2-5s (async)
+```
+
+**Key Pattern:** The summary is generated asynchronously and doesn't block the user from resuming work or closing the session.
+
+**Input** (via stdin):
+```json
+{
+  "session_id": "claude-session-123",
+  "cwd": "/path/to/project",
+  "transcript_path": "/path/to/transcript.jsonl"
+}
+```
+
+**Processing Steps**:
+
+```typescript
+// 1. Extract last messages from transcript JSONL
+const lines = fs.readFileSync(transcript_path, 'utf-8').split('\n')
+// Find last user message (type: "user")
+// Find last assistant message (type: "assistant", filter <system-reminder> tags)
+
+// 2. Ensure worker is running
+await ensureWorkerRunning()
+
+// 3. Send summarization request (fire-and-forget HTTP)
+POST http://127.0.0.1:37777/api/sessions/summarize
+Body: {
+  claudeSessionId: session_id,
+  last_user_message: string,
+  last_assistant_message: string
+}
+Timeout: 2000ms
+
+// 4. Stop processing spinner
+POST http://127.0.0.1:37777/api/processing
+Body: { isProcessing: false }
+```
+
+**Worker Processing**:
+1. Queues summarization for SDK agent
+2. Agent calls Claude to generate structured summary
+3. Summary stored in database with fields: `request`, `investigated`, `learned`, `completed`, `next_steps`
+
+**Output**:
+```json
+{ "continue": true, "suppressOutput": true }
+```
+
+**Implementation**: `src/hooks/summary-hook.ts`
+
+---
+
+## Stage 5: SessionEnd
+
+**Timing**: When the Claude Code session closes (exit, clear, logout, etc.)
+
+**Hook**: `cleanup-hook.js`
+
+### Sequence Diagram
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant IDE as IDE/Extension
+    participant CleanupHook as cleanup-hook.js
+    participant Worker as Worker Service
+    participant DB as SQLite Database
+    participant SSE as SSE Clients (Viewer UI)
+
+    User->>IDE: Closes session<br/>(exit, clear, logout)
+    IDE->>CleanupHook: SessionEnd hook triggered<br/>{ session_id, cwd, transcript_path, reason }
+
+    CleanupHook->>Worker: POST /api/sessions/complete<br/>{ claudeSessionId, reason }<br/>(fire-and-forget, 2s timeout)
+
+    CleanupHook-->>IDE: { continue: true, suppressOutput: true }
+    IDE-->>User: Session closed
+
+    Note over Worker,SSE: Async path
+
+    Worker->>Worker: Lookup sessionDbId from claudeSessionId
+    Worker->>DB: UPDATE sdk_sessions<br/>SET status = 'completed', completed_at = CURRENT_TIMESTAMP<br/>WHERE claude_session_id = claudeSessionId
+    Worker->>SSE: Broadcast session completion event<br/>(for live viewer UI updates)
+
+    SSE-->>SSE: Update UI to show session as completed
+
+    Note over CleanupHook,SSE: Total sync time: ~2ms
+```
+
+**Key Pattern:** Session completion is tracked for analytics and UI updates, but doesn't prevent the user from closing the IDE.
+
+**Input** (via stdin):
+```json
+{
+  "session_id": "claude-session-123",
+  "cwd": "/path/to/project",
+  "transcript_path": "/path/to/transcript.jsonl",
+  "reason": "exit"
+}
+```
+
+**Processing Steps**:
+
+```typescript
+// Send session complete (fire-and-forget HTTP)
+POST http://127.0.0.1:37777/api/sessions/complete
+Body: {
+  claudeSessionId: session_id,
+  reason: string  // 'exit' | 'clear' | 'logout' | 'prompt_input_exit' | 'other'
+}
+Timeout: 2000ms
+```
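+
+The request above could be issued along the following lines. This is a sketch under stated assumptions rather than the code in `cleanup-hook.ts`: `FetchLike`, `buildCompleteBody`, and `notifySessionComplete` are hypothetical names, the fetch function is injected so the sketch can run without a worker, and defaulting a missing `reason` to `'other'` is an assumption.
+
+```typescript
+// Minimal fetch-like signature so the sketch can be exercised without a network.
+type FetchLike = (
+  url: string,
+  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal },
+) => Promise<{ ok: boolean }>;
+
+interface SessionEndInput { session_id: string; reason?: string }
+
+// Map the hook's stdin payload to the request body shown above.
+function buildCompleteBody(input: SessionEndInput): { claudeSessionId: string; reason: string } {
+  return { claudeSessionId: input.session_id, reason: input.reason ?? 'other' };
+}
+
+// Resolves to true on a successful response and never throws,
+// so the hook can always return { continue: true } promptly.
+async function notifySessionComplete(
+  input: SessionEndInput,
+  fetchImpl: FetchLike,
+  timeoutMs = 2000,
+): Promise<boolean> {
+  const controller = new AbortController();
+  const timer = setTimeout(() => controller.abort(), timeoutMs);
+  try {
+    const res = await fetchImpl('http://127.0.0.1:37777/api/sessions/complete', {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify(buildCompleteBody(input)),
+      signal: controller.signal,
+    });
+    return res.ok;
+  } catch {
+    return false; // worker down or request aborted by the timeout
+  } finally {
+    clearTimeout(timer);
+  }
+}
+```
+
+In a real hook the global `fetch` would be passed as `fetchImpl`. Swallowing every failure is deliberate here: a missing worker should never stop the session from closing.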
+
+**Worker Processing**:
+1. Finds session by `claudeSessionId`
+2. Marks session as 'completed' in database
+3. Broadcasts session completion event to SSE clients
+
+**Output**:
+```json
+{ "continue": true, "suppressOutput": true }
+```
+
+**Implementation**: `src/hooks/cleanup-hook.ts`
+
+---
+
+## Session State Machine
+
+Understanding session lifecycle and state transitions:
+
+```mermaid
+stateDiagram-v2
+    [*] --> Initialized: SessionStart hook<br/>(generate session_id)
+
+    Initialized --> Active: UserPromptSubmit<br/>(first prompt)
+
+    Active --> Active: UserPromptSubmit<br/>(continuation prompts)<br/>promptNumber++
+
+    Active --> ObservationQueued: PostToolUse hook<br/>(tool execution captured)
+
+    ObservationQueued --> Active: Observation processed<br/>(async, non-blocking)
+
+    Active --> Summarizing: Stop hook<br/>(user pauses/stops)
+
+    Summarizing --> Active: User resumes<br/>(new prompt submitted)
+
+    Summarizing --> Completed: SessionEnd hook<br/>(session closes)
+
+    Active --> Completed: SessionEnd hook<br/>(session closes)
+
+    Completed --> [*]
+
+    note right of Active
+        session_id: constant (e.g., "claude-session-abc123")
+        sessionDbId: constant (e.g., 42)
+        promptNumber: increments (1, 2, 3, ...)
+        All operations use same sessionDbId
+    end note
+
+    note right of ObservationQueued
+        Fire-and-forget HTTP
+        AI compression happens async
+        IDE never blocks
+    end note
+```
+
+**Key Insights:**
+- `session_id` never changes during a conversation
+- `sessionDbId` is the database primary key for the session
+- `promptNumber` increments with each user prompt
+- State transitions are non-blocking (fire-and-forget pattern)
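+
+The transitions in the diagram can be written down as a lookup table. The sketch below is illustrative only: state names follow the diagram, while the `ObservationProcessed` event label, the `Session` shape, and the `step` function are assumptions introduced for the example.
+
+```typescript
+type SessionState = 'Initialized' | 'Active' | 'ObservationQueued' | 'Summarizing' | 'Completed';
+type SessionEvent = 'UserPromptSubmit' | 'PostToolUse' | 'ObservationProcessed' | 'Stop' | 'SessionEnd';
+
+interface Session { state: SessionState; promptNumber: number }
+
+// Allowed transitions, taken from the state diagram above.
+const transitions: Record<SessionState, Partial<Record<SessionEvent, SessionState>>> = {
+  Initialized: { UserPromptSubmit: 'Active' },
+  Active: {
+    UserPromptSubmit: 'Active',
+    PostToolUse: 'ObservationQueued',
+    Stop: 'Summarizing',
+    SessionEnd: 'Completed',
+  },
+  ObservationQueued: { ObservationProcessed: 'Active' },
+  Summarizing: { UserPromptSubmit: 'Active', SessionEnd: 'Completed' },
+  Completed: {},
+};
+
+// Apply one event; promptNumber increments only on UserPromptSubmit.
+function step(session: Session, event: SessionEvent): Session {
+  const next = transitions[session.state][event];
+  if (next === undefined) {
+    throw new Error(`Invalid transition: ${session.state} + ${event}`);
+  }
+  const promptNumber =
+    event === 'UserPromptSubmit' ? session.promptNumber + 1 : session.promptNumber;
+  return { state: next, promptNumber };
+}
+```
+
+Keeping the table separate from `step` makes it easy to compare against the diagram when a new hook or state is added.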
+
+---
+
+## Database Schema
+
+The session-centric data model that enables cross-session memory:
+
+```mermaid
+erDiagram
+    SDK_SESSIONS ||--o{ USER_PROMPTS : "has many"
+    SDK_SESSIONS ||--o{ OBSERVATIONS : "has many"
+    SDK_SESSIONS ||--o{ SESSION_SUMMARIES : "has many"
+
+    SDK_SESSIONS {
+        integer id PK "Auto-increment primary key"
+        text claude_session_id UK "From IDE (e.g., 'claude-session-123')"
+        text project "Project name from cwd basename"
+        text first_user_prompt "Initial prompt that started session"
+        integer prompt_counter "Increments with each UserPromptSubmit"
+        text status "initialized | active | completed"
+        datetime created_at
+        datetime completed_at
+    }
+
+    USER_PROMPTS {
+        integer id PK
+        integer session_id FK "References SDK_SESSIONS.id"
+        integer prompt_number "1, 2, 3, ... matches prompt_counter"
+        text prompt "User's actual prompt (tags stripped)"
+        datetime created_at
+    }
+
+    OBSERVATIONS {
+        integer id PK
+        integer session_id FK "References SDK_SESSIONS.id"
+        integer prompt_number "Which prompt this observation belongs to"
+        text tool_name "Read, Bash, Grep, Write, etc."
+        text tool_input_json "Stripped of privacy tags"
+        text tool_response_text "Stripped of privacy tags"
+        text compressed_observation "AI-generated structured observation"
+        datetime created_at
+    }
+
+    SESSION_SUMMARIES {
+        integer id PK
+        integer session_id FK "References SDK_SESSIONS.id"
+        text request "What user requested"
+        text investigated "What was explored"
+        text learned "What was discovered"
+        text completed "What was accomplished"
+        text next_steps "What remains to be done"
+        datetime created_at
+    }
+```
+
+**Idempotency Pattern:**
+
+```sql
+-- Ensures the same session_id always maps to the same sessionDbId
+INSERT OR IGNORE INTO sdk_sessions (claude_session_id, project, first_user_prompt)
+VALUES (?, ?, ?);
+
+-- RETURNING yields no row when the insert is ignored, so fetch the id with a
+-- separate lookup; it returns the existing row or the one just created
+SELECT id FROM sdk_sessions WHERE claude_session_id = ?;
+```
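+
+The get-or-create behavior can be modeled in memory to make the invariant explicit. This sketch uses a `Map` as a stand-in for the `sdk_sessions` table; `InMemorySessionTable` and `getOrCreate` are hypothetical names and no SQLite driver is involved.
+
+```typescript
+// In-memory stand-in for sdk_sessions, keyed by claude_session_id.
+class InMemorySessionTable {
+  private nextId = 1;
+  private readonly byClaudeSessionId = new Map<string, number>();
+
+  // Insert if absent, then look up: repeated calls return the same id.
+  getOrCreate(claudeSessionId: string): number {
+    const existing = this.byClaudeSessionId.get(claudeSessionId);
+    if (existing !== undefined) return existing;
+    const id = this.nextId++;
+    this.byClaudeSessionId.set(claudeSessionId, id);
+    return id;
+  }
+}
+```
+
+Every hook can call the same operation with the IDE-provided `session_id` and receive the same sessionDbId, regardless of which hook happens to run first.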
+
+**Foreign Key Cascade:**
+
+All child tables (`user_prompts`, `observations`, `session_summaries`) carry a `session_id` foreign key that references `sdk_sessions.id` (the sessionDbId, not the IDE-provided `session_id`). This ensures:
+- All data for a session is queryable by sessionDbId
+- Session deletions cascade to child tables
+- Efficient joins for context injection
+
+<Warning>
+Never generate your own session IDs. Always use the `session_id` provided by the IDE - this is the source of truth for linking all data together.
+</Warning>
+
+---
+
+## Privacy & Tag Stripping
+
+### Dual-Tag System
+
+```typescript
+// User-Level Privacy Control (manual)
+<private>sensitive data</private>
+
+// System-Level Recursion Prevention (auto-injected)
+<claude-mem-context>...</claude-mem-context>
+```
+
+### Processing Pipeline
+
+**Location**: `src/utils/tag-stripping.ts`
+
+```typescript
+// Called by: new-hook.js (user prompts)
+stripMemoryTagsFromPrompt(prompt: string): string
+
+// Called by: save-hook.js (tool_input, tool_response)
+stripMemoryTagsFromJson(jsonString: string): string
+```
+
+**Execution Order** (Edge Processing):
+1. `new-hook.js` strips tags from user prompt before saving
+2. `save-hook.js` strips tags from tool data before sending to worker
+3. Worker strips tags again (defense in depth) before storing
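+
+A stripping function along these lines would cover both tags. It is a minimal sketch, not the contents of `src/utils/tag-stripping.ts`: `countTags` and `stripTagsSketch` are hypothetical names, the cap of 100 comes from the implementation checklist in this document, and how the real code reacts when the cap is exceeded is not specified here, so the sketch only reports it.
+
+```typescript
+const MAX_TAG_COUNT = 100; // cap listed in the checklist as ReDoS protection
+
+// Count opening tags of either kind.
+function countTags(text: string): number {
+  const matches = text.match(/<private>|<claude-mem-context>/g);
+  return matches === null ? 0 : matches.length;
+}
+
+// Remove both tag kinds together with their contents.
+function stripTagsSketch(text: string): { stripped: string; overLimit: boolean } {
+  const overLimit = countTags(text) > MAX_TAG_COUNT;
+  const stripped = text
+    .replace(/<claude-mem-context>[\s\S]*?<\/claude-mem-context>/g, '')
+    .replace(/<private>[\s\S]*?<\/private>/g, '')
+    .trim();
+  return { stripped, overLimit };
+}
+```
+
+The non-greedy `[\s\S]*?` matches across newlines and stops at the first closing tag, so two separate `<private>` blocks are removed individually instead of swallowing the text between them.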
+
+---
+
+## SDK Agent Processing
+
+### Query Loop (Event-Driven)
+
+**Location**: `src/services/worker/SDKAgent.ts`
+
+```typescript
+async startSession(session: ActiveSession, worker?: any) {
+  // 1. Create event-driven message generator
+  const messageGenerator = this.createMessageGenerator(session)
+
+  // 2. Run Agent SDK query loop
+  const queryResult = query({
+    prompt: messageGenerator,
+    options: {
+      model: 'claude-sonnet-4-5',
+      disallowedTools: ['Bash', 'Read', 'Write', ...],  // Observer-only
+      abortController: session.abortController
+    }
+  })
+
+  // 3. Process responses
+  for await (const message of queryResult) {
+    if (message.type === 'assistant') {
+      // `text` stands for the assistant message's text content (extraction omitted here)
+      await this.processSDKResponse(session, text, worker)
+    }
+  }
+}
+```
+
+### Message Types
+
+The message generator yields four types of prompts:
+
+1. **Initial Prompt** (prompt #1): Full instructions for starting observation
+2. **Continuation Prompt** (prompt #2+): Context-only for continuing work
+3. **Observation Prompts**: Tool use data to compress into observations
+4. **Summary Prompts**: Session data to summarize
+
+---
+
+## Implementation Checklist
+
+For developers implementing this pattern on other platforms:
+
+### Hook Registration
+- [ ] Define hook entry points in platform config
+- [ ] 5 hook types: SessionStart (2 hooks), UserPromptSubmit, PostToolUse, Stop, SessionEnd
+- [ ] Pass `session_id`, `cwd`, and context-specific data
+
+### Database Schema
+- [ ] SQLite with WAL mode
+- [ ] 4 main tables: `sdk_sessions`, `user_prompts`, `observations`, `session_summaries`
+- [ ] Indices for common queries
+
+### Worker Service
+- [ ] HTTP server on configurable port (default 37777)
+- [ ] Bun runtime for process management
+- [ ] 3 core services: SessionManager, SDKAgent, DatabaseManager
+
+### Hook Implementation
+- [ ] context-hook: `GET /api/context/inject` (with health check)
+- [ ] new-hook: createSDKSession, saveUserPrompt, `POST /sessions/{id}/init`
+- [ ] save-hook: Skip low-value tools, `POST /api/sessions/observations`
+- [ ] summary-hook: Parse transcript, `POST /api/sessions/summarize`
+- [ ] cleanup-hook: `POST /api/sessions/complete`
+
+### Privacy & Tags
+- [ ] Implement `stripMemoryTagsFromPrompt()` and `stripMemoryTagsFromJson()`
+- [ ] Process tags at hook layer (edge processing)
+- [ ] Max tag count = 100 (ReDoS protection)
+
+### SDK Integration
+- [ ] Call Claude Agent SDK to process observations/summaries
+- [ ] Parse XML responses for structured data
+- [ ] Store to database + sync to vector DB
+
+---
+
+## Key Design Principles
+
+1. **Session ID is Source of Truth**: Never generate your own session IDs
+2. **Idempotent Database Operations**: Use `INSERT OR IGNORE` for session creation
+3. **Edge Processing for Privacy**: Strip tags at hook layer before data reaches worker
+4. **Fire-and-Forget for Non-Blocking**: HTTP timeouts prevent IDE blocking
+5. **Event-Driven, Not Polling**: Zero-latency queue notification to SDK agent
+6. **Everything Saves Always**: No "orphaned" sessions
+
+---
+
+## Common Pitfalls
+
+| Problem | Root Cause | Solution |
+|---------|-----------|----------|
+| Session ID mismatch | Different `session_id` used in different hooks | Always use ID from hook input |
+| Duplicate sessions | Creating new session instead of using existing | Use `INSERT OR IGNORE` with `session_id` as key |
+| Blocking IDE | Waiting for full response | Use fire-and-forget with short timeouts |
+| Memory tags in DB | Stripping tags in wrong layer | Strip at hook layer, before HTTP send |
+| Worker not found | Health check too fast | Add retry loop with exponential backoff |
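+
+The retry loop suggested in the last row might look like the following. This is a sketch with assumed defaults (100 ms base delay, 2000 ms cap, five attempts); `backoffDelays` and `waitForWorker` are hypothetical names, and the health probe is passed in rather than hard-coded.
+
+```typescript
+// Delay schedule: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
+function backoffDelays(attempts: number, baseMs = 100, capMs = 2000): number[] {
+  return Array.from({ length: attempts }, (_, i) => Math.min(capMs, baseMs * 2 ** i));
+}
+
+// Probe, wait, and probe again until the worker answers or attempts run out.
+async function waitForWorker(
+  isHealthy: () => Promise<boolean>,
+  attempts = 5,
+): Promise<boolean> {
+  for (const delay of backoffDelays(attempts)) {
+    if (await isHealthy()) return true;
+    await new Promise<void>((resolve) => setTimeout(resolve, delay));
+  }
+  return isHealthy(); // one final probe after the last wait
+}
+```
+
+Capping the delay keeps the total wait bounded, which matters because the hook runs in the IDE's critical path.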
+
+---
+
+## Related Documentation
+
+- [Worker Service](/architecture/worker-service) - HTTP API and async processing
+- [Database Schema](/architecture/database) - SQLite tables and FTS5 search
+- [Privacy Tags](/usage/private-tags) - Using `<private>` tags
+- [Troubleshooting](/troubleshooting) - Common hook issues
--- a/.agent/services/claude-mem/docs/public/architecture/overview.mdx
+++ b/.agent/services/claude-mem/docs/public/architecture/overview.mdx
@@ -0,0 +1,240 @@
+---
+title: "Architecture Overview"
+description: "System components and data flow in Claude-Mem"
+---
+
+# Architecture Overview
+
+## System Components
+
+Claude-Mem operates as a Claude Code plugin with six core components:
+
+1. **Plugin Hooks** - Capture lifecycle events (6 hook files)
+2. **Smart Install** - Cached dependency checker (pre-hook script, runs before context-hook)
+3. **Worker Service** - Process observations via Claude Agent SDK + HTTP API (10 search endpoints)
+4. **Database Layer** - Store sessions and observations (SQLite + FTS5 + ChromaDB)
+5. **mem-search Skill** - Skill-based search with progressive disclosure (v5.4.0+)
+6. **Viewer UI** - Web-based real-time memory stream visualization
+
+## Technology Stack
+
+| Layer                  | Technology                                |
+|------------------------|-------------------------------------------|
+| **Language**           | TypeScript (ES2022, ESNext modules)       |
+| **Runtime**            | Node.js 18+                               |
+| **Database**           | SQLite 3 with bun:sqlite driver           |
+| **Vector Store**       | ChromaDB (optional, for semantic search)  |
+| **HTTP Server**        | Express.js 4.18                           |
+| **Real-time**          | Server-Sent Events (SSE)                  |
+| **UI Framework**       | React + TypeScript                        |
+| **AI SDK**             | @anthropic-ai/claude-agent-sdk            |
+| **Build Tool**         | esbuild (bundles TypeScript)              |
+| **Process Manager**    | Bun                                       |
+| **Testing**            | Node.js built-in test runner              |
+
+## Data Flow
+
+### Memory Pipeline
+```
+Hook (stdin) → Database → Worker Service → SDK Processor → Database → Next Session Hook
+```
+
+1. **Input**: Claude Code sends tool execution data via stdin to hooks
+2. **Storage**: Hooks write observations to SQLite database
+3. **Processing**: Worker service reads observations, processes via SDK
+4. **Output**: Processed summaries written back to database
+5. **Retrieval**: Next session's context hook reads summaries from database
+
+### Search Pipeline
+```
+User Query → MCP Tools Invoked → HTTP API → SessionSearch Service → FTS5 Database → Search Results → Claude
+```
+
+1. **User Query**: User asks naturally: "What bugs did we fix?"
+2. **MCP Tools Invoked**: Claude recognizes intent and invokes MCP search tools
+3. **HTTP API**: MCP tools call HTTP endpoint (e.g., `/api/search/observations`)
+4. **SessionSearch**: Worker service queries FTS5 virtual tables
+5. **Format**: Results formatted and returned via MCP
+6. **Return**: Claude presents formatted results to user
+
+The search flow uses 3-layer progressive disclosure: search → timeline → get_observations.
+
+## Session Lifecycle
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│ 0. Smart Install Pre-Hook Fires                                 │
+│    Checks dependencies (cached), only runs on version changes   │
+│    Not a lifecycle hook - runs before context-hook starts       │
+└─────────────────────────────────────────────────────────────────┘
+                              ↓
+┌─────────────────────────────────────────────────────────────────┐
+│ 1. Session Starts → Context Hook Fires                          │
+│    Starts Bun worker if needed, injects context from previous   │
+│    sessions (configurable observation count)                    │
+└─────────────────────────────────────────────────────────────────┘
+                              ↓
+┌─────────────────────────────────────────────────────────────────┐
+│ 2. User Types Prompt → UserPromptSubmit Hook Fires              │
+│    Creates session in database, saves raw user prompt for FTS5  │
+└─────────────────────────────────────────────────────────────────┘
+                              ↓
+┌─────────────────────────────────────────────────────────────────┐
+│ 3. Claude Uses Tools → PostToolUse Hook Fires (100+ times)      │
+│    Captures tool executions, sends to worker for AI compression │
+└─────────────────────────────────────────────────────────────────┘
+                              ↓
+┌─────────────────────────────────────────────────────────────────┐
+│ 4. Worker Processes → Claude Agent SDK Analyzes                 │
+│    Extracts structured learnings via iterative AI processing    │
+└─────────────────────────────────────────────────────────────────┘
+                              ↓
+┌─────────────────────────────────────────────────────────────────┐
+│ 5. Claude Stops → Summary Hook Fires                            │
+│    Generates final summary with request, completions, learnings │
+└─────────────────────────────────────────────────────────────────┘
+                              ↓
+┌─────────────────────────────────────────────────────────────────┐
+│ 6. Session Ends → Cleanup Hook Fires                            │
+│    Marks session complete (graceful, not DELETE), ready for     │
+│    next session context. Skips on /clear to preserve ongoing    │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+## Directory Structure
+
+```
+claude-mem/
+├── src/
+│   ├── hooks/                  # Hook implementations (6 hooks)
+│   │   ├── context-hook.ts     # SessionStart
+│   │   ├── user-message-hook.ts # UserMessage (for debugging)
+│   │   ├── new-hook.ts         # UserPromptSubmit
+│   │   ├── save-hook.ts        # PostToolUse
+│   │   ├── summary-hook.ts     # Stop
+│   │   ├── cleanup-hook.ts     # SessionEnd
+│   │   └── hook-response.ts    # Hook response utilities
+│   │
+│   ├── sdk/                    # Claude Agent SDK integration
+│   │   ├── prompts.ts          # XML prompt builders
+│   │   ├── parser.ts           # XML response parser
+│   │   └── worker.ts           # Main SDK agent loop
+│   │
+│   ├── services/
+│   │   ├── worker-service.ts   # Express HTTP + SSE service
+│   │   └── sqlite/             # Database layer
+│   │       ├── SessionStore.ts # CRUD operations
+│   │       ├── SessionSearch.ts # FTS5 search service
+│   │       ├── migrations.ts
+│   │       └── types.ts
+│   │
+│   ├── ui/                     # Viewer UI
+│   │   └── viewer/             # React + TypeScript web interface
+│   │       ├── components/     # UI components
+│   │       ├── hooks/          # React hooks
+│   │       ├── utils/          # Utilities
+│   │       └── assets/         # Fonts, logos
+│   │
+│   ├── shared/                 # Shared utilities
+│   │   ├── config.ts
+│   │   ├── paths.ts
+│   │   └── storage.ts
+│   │
+│   └── utils/
+│       ├── logger.ts
+│       ├── platform.ts
+│       └── port-allocator.ts
+│
+├── scripts/                    # Build and utility scripts
+│   └── smart-install.js        # Cached dependency checker (pre-hook)
+│
+├── plugin/                     # Plugin distribution
+│   ├── .claude-plugin/
+│   │   └── plugin.json
+│   ├── hooks/
+│   │   └── hooks.json
+│   ├── scripts/                # Built executables
+│   │   ├── context-hook.js
+│   │   ├── user-message-hook.js
+│   │   ├── new-hook.js
+│   │   ├── save-hook.js
+│   │   ├── summary-hook.js
+│   │   ├── cleanup-hook.js
+│   │   └── worker-service.cjs  # Background worker + HTTP API
+│   │
+│   ├── skills/                 # Agent skills (v5.4.0+)
+│   │   ├── mem-search/         # Search skill with progressive disclosure (v5.5.0)
+│   │   │   ├── SKILL.md        # Skill frontmatter (~250 tokens)
+│   │   │   ├── operations/     # 12 detailed operation docs
+│   │   │   └── principles/     # 2 principle guides
+│   │   ├── troubleshoot/       # Troubleshooting skill
+│   │   │   ├── SKILL.md
+│   │   │   └── operations/     # 6 operation docs
+│   │   └── version-bump/       # Version management skill (deprecated)
+│   │
+│   └── ui/                     # Built viewer UI
+│       └── viewer.html         # Self-contained bundle
+│
+├── tests/                      # Test suite
+├── docs/                       # Documentation
+└── ecosystem.config.cjs        # Process configuration (deprecated)
+```
+
+## Component Details
+
+### 1. Plugin Hooks (6 Hooks)
+- **context-hook.js** - SessionStart: Starts Bun worker, injects context
+- **user-message-hook.js** - UserMessage: Debugging hook
+- **new-hook.js** - UserPromptSubmit: Creates session, saves prompt
+- **save-hook.js** - PostToolUse: Captures tool executions
+- **summary-hook.js** - Stop: Generates session summary
+- **cleanup-hook.js** - SessionEnd: Marks session complete
+
+**Note**: smart-install.js is a pre-hook dependency checker (not a lifecycle hook). It's called before context-hook via command chaining in hooks.json and only runs when dependencies need updating.
+
+See [Plugin Hooks](/architecture/hooks) for detailed hook documentation.
+
+### 2. Worker Service
+Express.js HTTP server on port 37777 (configurable) with:
+- 10 search HTTP API endpoints (v5.4.0+)
+- 8 viewer UI HTTP/SSE endpoints
+- Async observation processing via Claude Agent SDK
+- Real-time updates via Server-Sent Events
+- Auto-managed by Bun
+
+See [Worker Service](/architecture/worker-service) for HTTP API and endpoints.
+
+### 3. Database Layer
+SQLite3 with bun:sqlite driver featuring:
+- FTS5 virtual tables for full-text search
+- SessionStore for CRUD operations
+- SessionSearch for FTS5 queries
+- Location: `~/.claude-mem/claude-mem.db`
+
+See [Database Architecture](/architecture/database) for schema and FTS5 search.
+
+### 4. mem-search Skill (v5.4.0+)
+Skill-based search with progressive disclosure providing 10 search operations:
+- Search observations, sessions, prompts (full-text FTS5)
+- Filter by type, concept, file
+- Get recent context, timeline, timeline by query
+- API help documentation
+
+**Token Savings**: ~2,250 tokens per session vs MCP approach
+- Skill frontmatter: ~250 tokens (loaded at session start)
+- Full instructions: ~2,500 tokens (loaded on-demand when invoked)
+- HTTP API endpoints instead of MCP tools
+
+**Skill Enhancement (v5.5.0)**: Renamed from "search" to "mem-search" for better scope differentiation. Effectiveness increased from 67% to 100% with enhanced triggers and comprehensive documentation.
+
+See [Search Architecture](/architecture/search-architecture) for technical details and examples.
+
+### 5. Viewer UI
+React + TypeScript web interface at http://localhost:37777 featuring:
+- Real-time memory stream via Server-Sent Events
+- Infinite scroll pagination with automatic deduplication
+- Project filtering and settings persistence
+- GPU-accelerated animations
+- Self-contained HTML bundle (viewer.html)
+
+Built with esbuild into a single file deployment.
--- a/.agent/services/claude-mem/docs/public/architecture/pm2-to-bun-migration.mdx
+++ b/.agent/services/claude-mem/docs/public/architecture/pm2-to-bun-migration.mdx
@@ -0,0 +1,559 @@
+---
+title: "PM2 to Bun Migration"
+description: "Complete technical documentation for the process management and database driver migration in v7.1.0"
+---
+
+<Note>
+**Historical Migration Documentation**
+
+This document describes the PM2 to Bun migration that occurred in v7.1.0 (December 2025). If you're installing claude-mem for the first time, this migration has already been completed and you can use the current Bun-based system documented in the main guides.
+
+This documentation is preserved for users upgrading from versions older than v7.1.0.
+</Note>
+
+# PM2 to Bun Migration: Complete Technical Documentation
+
+**Version**: 7.1.0
+**Date**: December 2025
+**Migration Type**: Process Management (PM2 → Bun) + Database Driver (better-sqlite3 → bun:sqlite)
+
+## Executive Summary
+
+Claude-mem version 7.1.0 introduces two major architectural migrations:
+
+1. **Process Management**: PM2 → Custom Bun-based ProcessManager
+2. **Database Driver**: better-sqlite3 npm package → bun:sqlite runtime module
+
+Both migrations are **automatic** and **transparent** to end users. The first time a hook fires after updating to 7.1.0+, the system performs a one-time cleanup of legacy PM2 processes and transitions to the new architecture.
+
+### Key Benefits
+
+- **Simplified Dependencies**: Removes PM2 and better-sqlite3 npm packages
+- **Improved Cross-Platform Support**: Better Windows compatibility
+- **Faster Installation**: No native module compilation required
+- **Built-in Runtime**: Leverages Bun's built-in process management and SQLite
+- **Reduced Complexity**: Custom ProcessManager is simpler than PM2 integration
+
+### Migration Impact
+
+- **Data Preservation**: User data, settings, and database remain unchanged
+- **Automatic Cleanup**: Old PM2 processes automatically terminated (all platforms)
+- **No User Action Required**: Migration happens automatically on first hook trigger
+- **Backward Compatible**: SQLite database format unchanged (only driver changed)
+
+## Architecture Comparison
+
+### Old System (PM2-based)
+
+<AccordionGroup>
+<Accordion title="Process Management (PM2)">
+**Component**: PM2 (Process Manager 2)
+- **Package**: `pm2` npm dependency
+- **Process Name**: `claude-mem-worker`
+- **Management**: External PM2 daemon manages lifecycle
+- **Discovery**: `pm2 list`, `pm2 describe` commands
+- **Auto-restart**: PM2 automatically restarts on crash
+- **Logs**: `~/.pm2/logs/claude-mem-worker-*.log`
+- **PID File**: `~/.pm2/pids/claude-mem-worker.pid`
+
+**Lifecycle Commands**:
+```bash
+pm2 start <script>           # Start worker
+pm2 stop claude-mem-worker   # Stop worker
+pm2 restart claude-mem-worker # Restart worker
+pm2 delete claude-mem-worker  # Remove from PM2
+pm2 logs claude-mem-worker    # View logs
+```
+
+**Pain Points**:
+- Additional npm dependency required
+- PM2 daemon must be running
+- Potential conflicts with other PM2 processes
+- Windows compatibility issues
+- Complex configuration for simple use case
+</Accordion>
+
+<Accordion title="Database Driver (better-sqlite3)">
+**Component**: better-sqlite3
+- **Package**: `better-sqlite3` npm package (native module)
+- **Installation**: Requires native compilation (node-gyp)
+- **Windows**: Requires Visual Studio build tools + Python
+- **Import**: `import Database from 'better-sqlite3'`
+
+**Installation Requirements**:
+- Node.js development headers
+- C++ compiler (gcc/clang on Mac/Linux, MSVC on Windows)
+- Python (for node-gyp)
+- Windows: Visual Studio Build Tools
+</Accordion>
+</AccordionGroup>
+
+### New System (Bun-based)
+
+<AccordionGroup>
+<Accordion title="Process Management (Custom ProcessManager)">
+**Component**: Custom ProcessManager (`src/services/process/ProcessManager.ts`)
+- **Package**: Built-in Bun APIs (no external dependency)
+- **Process Spawn**: `Bun.spawn()` with detached mode
+- **Management**: Direct process control via PID file
+- **Discovery**: PID file + process existence check + HTTP health check
+- **Auto-restart**: Hook-triggered restart on failure detection
+- **Logs**: `~/.claude-mem/logs/worker-YYYY-MM-DD.log`
+- **PID File**: `~/.claude-mem/.worker.pid`
+- **Port File**: `~/.claude-mem/.worker.port` (new)
+
+**Lifecycle Commands**:
+```bash
+npm run worker:start    # Start worker
+npm run worker:stop     # Stop worker
+npm run worker:restart  # Restart worker
+npm run worker:status   # Check status
+npm run worker:logs     # View logs
+```
+
+**Core Mechanisms**:
+
+1. **PID File Management**:
+   - File: `~/.claude-mem/.worker.pid`
+   - Content: Process ID (e.g., "35557")
+   - Validation: Process existence via `kill(pid, 0)` signal
+
+2. **Port File Management**:
+   - File: `~/.claude-mem/.worker.port`
+   - Content: Two lines (port number, PID)
+   - Purpose: Track port binding and validate PID match
+
+3. **Health Checking**:
+   - Layer 1: PID file exists?
+   - Layer 2: Process alive? (`kill(pid, 0)`)
+   - Layer 3: HTTP health check (`GET /health`)
+   - All three must pass for "healthy" status
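+
+The three layers can be sketched as follows. This is not the code in `ProcessManager.ts`: the function names are hypothetical, the 1000 ms probe timeout is an assumption, and the port-file parser assumes line 1 holds the port and line 2 the PID, as described above.
+
+```typescript
+import { existsSync, readFileSync } from 'node:fs';
+
+// Layer 2: signal 0 checks for existence without delivering a signal.
+function isProcessAlive(pid: number): boolean {
+  if (!Number.isInteger(pid) || pid <= 0) return false;
+  try {
+    process.kill(pid, 0);
+    return true;
+  } catch (err) {
+    // EPERM means the process exists but belongs to another user.
+    return (err as { code?: string }).code === 'EPERM';
+  }
+}
+
+// Port file: line 1 = port, line 2 = PID.
+function parsePortFile(content: string): { port: number; pid: number } | null {
+  const [portLine, pidLine] = content.trim().split('\n');
+  const port = Number(portLine);
+  const pid = Number(pidLine);
+  return Number.isInteger(port) && Number.isInteger(pid) ? { port, pid } : null;
+}
+
+// All three layers must pass for the worker to count as healthy.
+async function isWorkerHealthy(pidFile: string, port: number): Promise<boolean> {
+  if (!existsSync(pidFile)) return false; // Layer 1: PID file exists
+  const pid = Number(readFileSync(pidFile, 'utf-8').trim());
+  if (!isProcessAlive(pid)) return false; // Layer 2: process alive
+  try {
+    const res = await fetch(`http://127.0.0.1:${port}/health`, {
+      signal: AbortSignal.timeout(1000),
+    });
+    return res.ok; // Layer 3: HTTP health check
+  } catch {
+    return false;
+  }
+}
+```
+
+Ordering the checks from cheapest to most expensive means a stale or missing PID file is detected without making an HTTP request.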
+
+**Advantages**:
+- No external dependencies
+- Simpler codebase (direct control)
+- Better error handling and validation
+- Platform-agnostic (Bun handles platform differences)
+</Accordion>
+
+<Accordion title="Database Driver (bun:sqlite)">
+**Component**: bun:sqlite
+- **Package**: Built into Bun runtime (no npm package)
+- **Installation**: None required (comes with Bun ≥1.0)
+- **Platform**: Works anywhere Bun works
+- **Import**: `import { Database } from 'bun:sqlite'`
+- **API**: Similar to better-sqlite3 (synchronous)
+
+**Installation Requirements**:
+- Bun ≥1.0 (automatically installed if missing)
+- No native compilation required
+- No platform-specific build tools needed
+
+**Compatibility**:
+- SQLite database format: **Unchanged**
+- Database file: `~/.claude-mem/claude-mem.db` (same location)
+- Query syntax: **Identical** (both use SQLite SQL)
+</Accordion>
+</AccordionGroup>
+
+## Migration Mechanics
+
+### One-Time PM2 Cleanup
+
+The migration system uses a marker-based approach to perform PM2 cleanup exactly once.
+
+**Implementation**: `src/shared/worker-utils.ts:73-86`
+
+```typescript
+// Clean up legacy PM2 (one-time migration)
+const pm2MigratedMarker = join(DATA_DIR, '.pm2-migrated');
+
+if (!existsSync(pm2MigratedMarker)) {
+  try {
+    spawnSync('pm2', ['delete', 'claude-mem-worker'], { stdio: 'ignore' });
+    // Mark migration as complete
+    writeFileSync(pm2MigratedMarker, new Date().toISOString(), 'utf-8');
+    logger.debug('SYSTEM', 'PM2 cleanup completed and marked');
+  } catch {
+    // PM2 not installed or process doesn't exist - still mark as migrated
+    writeFileSync(pm2MigratedMarker, new Date().toISOString(), 'utf-8');
+  }
+}
+```
+
+### Migration Trigger Points
+
+<Steps>
+<Step title="Hook Execution">
+SessionStart, UserPromptSubmit, or PostToolUse hooks execute using new 7.1.0 code
+</Step>
+<Step title="Worker Status Check">
+`ensureWorkerRunning()` checks whether `~/.claude-mem/.worker.pid` exists (it does not on the first run after the update)
+</Step>
+<Step title="Start Worker Decision">
+Worker not running → Call `startWorker()`
+</Step>
+<Step title="Migration Check">
+Check if `~/.claude-mem/.pm2-migrated` exists
+</Step>
+<Step title="PM2 Cleanup">
+Execute `pm2 delete claude-mem-worker` (errors ignored), create marker file
+</Step>
+<Step title="New Worker Start">
+Spawn new Bun-managed worker process with PID and port files
+</Step>
+</Steps>
+
+### Marker File
+
+**Location**: `~/.claude-mem/.pm2-migrated`
+
+**Content**: ISO 8601 timestamp
+```
+2025-12-13T00:18:39.673Z
+```
+
+**Purpose**:
+- One-time migration flag
+- Prevents repeated PM2 cleanup on every start
+- Persists across restarts and reboots
+
+**Lifecycle**:
+- Created: First hook trigger after update to 7.1.0+ (all platforms)
+- Updated: Never
+- Deleted: Never (user could manually delete to force re-migration)
+
+## User Experience Timeline
+
+### First Session After Update
+
+<Note>
+This is the critical migration moment. The process takes approximately 2-5 seconds.
+</Note>
+
+**Step-by-Step Execution**:
+
+1. **Hook fires** (SessionStart most common)
+2. **Worker status check**: No PID file → worker not running
+3. **Migration check**: No marker file → run PM2 cleanup
+4. **PM2 cleanup**: `pm2 delete claude-mem-worker` (old worker terminated)
+5. **Marker creation**: `~/.claude-mem/.pm2-migrated` with timestamp
+6. **New worker start**: Bun process spawned, PID/port files created
+7. **Verification**: Process check + HTTP health check
+8. **Hook completes**: Claude Code session starts normally
+
+**User Observable Behavior**:
+- Slight delay on first startup (PM2 cleanup + new worker spawn)
+- No error messages (cleanup failures silently handled)
+- Worker appears running via `npm run worker:status`
+- Old PM2 worker no longer in `pm2 list`
+
+### Subsequent Sessions
+
+After migration completes, every hook trigger follows the fast path:
+
+1. PID file exists? **YES**
+2. Process alive? **YES**
+3. HTTP health check? **SUCCESS**
+4. Result: Worker already running, done (~50ms)
+
+No migration logic runs on subsequent sessions.
+
+## Platform-Specific Behavior
+
+### Platform Comparison
+
+| Feature | macOS | Linux | Windows |
+|---------|-------|-------|---------|
+| PM2 Cleanup | Attempted | Attempted | Attempted |
+| Marker File | Created | Created | Created |
+| Process Signals | POSIX (native) | POSIX (native) | Bun abstraction |
+| Bun Support | Full | Full | Full |
+| PID File | Yes | Yes | Yes |
+| Port File | Yes | Yes | Yes |
+| Health Check | HTTP | HTTP | HTTP |
+| Migration Delay | ~2-5s first time | ~2-5s first time | ~2-5s first time |
+
+### Platform Notes
+
+<Tabs>
+<Tab title="macOS">
+- POSIX signal handling works natively
+- Bun fully supported
+- No platform-specific workarounds needed
+</Tab>
+<Tab title="Linux">
+- Identical behavior to macOS
+- POSIX signal handling
+- Works on Ubuntu, Debian, RHEL, CentOS, Arch
+- Alpine may require glibc (not musl)
+</Tab>
+<Tab title="Windows">
+- PM2 cleanup now runs (safe due to try/catch)
+- Bun abstracts signal handling differences
+- Path module handles Windows separators
+- File locking handled by SQLite
+</Tab>
+</Tabs>
+
+## Observable Changes
+
+### Command Changes
+
+| Old (PM2) | New (Bun) | Notes |
+|-----------|-----------|-------|
+| `pm2 list` | `npm run worker:status` | Shows worker status |
+| `pm2 start <script>` | `npm run worker:start` | Start worker |
+| `pm2 stop claude-mem-worker` | `npm run worker:stop` | Stop worker |
+| `pm2 restart claude-mem-worker` | `npm run worker:restart` | Restart worker |
+| `pm2 delete claude-mem-worker` | `npm run worker:stop` | Remove worker |
+| `pm2 logs claude-mem-worker` | `npm run worker:logs` | View logs |
+| `pm2 describe claude-mem-worker` | `npm run worker:status` | Detailed status |
+| `pm2 monit` | No equivalent | PM2-specific monitoring |
+
+### File Location Changes
+
+**Logs**:
+```
+Old: ~/.pm2/logs/claude-mem-worker-out.log
+     ~/.pm2/logs/claude-mem-worker-error.log
+
+New: ~/.claude-mem/logs/worker-YYYY-MM-DD.log
+```
+
+**PID Files**:
+```
+Old: ~/.pm2/pids/claude-mem-worker.pid
+
+New: ~/.claude-mem/.worker.pid
+```
+
+**Process State**:
+```
+Old: PM2 daemon memory (pm2 save)
+
+New: ~/.claude-mem/.worker.pid
+     ~/.claude-mem/.worker.port
+     ~/.claude-mem/.pm2-migrated (all platforms)
+```
+
+**Database** (unchanged):
+```
+Same: ~/.claude-mem/claude-mem.db
+```
+
+### User-Visible Changes
+
+**Before Update**:
+```bash
+$ pm2 list
+┌────┬────────────────────┬─────────┬─────────┬──────────┐
+│ id │ name               │ status  │ restart │ uptime   │
+├────┼────────────────────┼─────────┼─────────┼──────────┤
+│ 0  │ claude-mem-worker  │ online  │ 0       │ 2d 5h    │
+└────┴────────────────────┴─────────┴─────────┴──────────┘
+```
+
+**After Update**:
+```bash
+$ pm2 list
+# Empty - worker no longer managed by PM2
+
+$ npm run worker:status
+Worker is running
+PID: 35557
+Port: 37777
+Uptime: 2h 15m
+```
+
+### Orphaned Files
+
+After migration, these PM2 files may remain (safe to delete):
+
+```
+~/.pm2/                    # Entire PM2 directory
+~/.pm2/logs/               # Old logs
+~/.pm2/pids/               # Old PID files
+~/.pm2/pm2.log             # PM2 daemon log
+~/.pm2/dump.pm2            # PM2 process dump
+```
+
+**Cleanup (optional)**:
+```bash
+# Remove PM2 entirely (if not used for other processes)
+pm2 kill
+rm -rf ~/.pm2
+
+# Or remove only the claude-mem PM2 logs and PID file
+rm -f ~/.pm2/logs/claude-mem-worker-*.log
+rm -f ~/.pm2/pids/claude-mem-worker.pid
+```
+
+## File System State
+
+### State Directory Structure
+
+**Before Migration** (PM2 system):
+```
+~/.claude-mem/
+├── claude-mem.db          # Database (unchanged)
+├── chroma/                # Vector embeddings (unchanged)
+├── logs/                  # Application logs (unchanged)
+└── settings.json          # User settings (unchanged)
+
+~/.pm2/
+├── logs/
+│   ├── claude-mem-worker-out.log
+│   └── claude-mem-worker-error.log
+├── pids/
+│   └── claude-mem-worker.pid
+└── pm2.log
+```
+
+**After Migration** (Bun system):
+```
+~/.claude-mem/
+├── claude-mem.db          # Database (same file)
+├── chroma/                # Vector embeddings (unchanged)
+├── logs/
+│   └── worker-2025-12-13.log  # New log format
+├── settings.json          # User settings (unchanged)
+├── .worker.pid            # NEW: Process ID
+├── .worker.port           # NEW: Port + PID
+└── .pm2-migrated          # NEW: Migration marker (all platforms)
+
+~/.pm2/                    # Orphaned (safe to delete)
+├── logs/                  # Old logs (no longer written)
+├── pids/                  # Old PID (no longer updated)
+└── pm2.log                # PM2 daemon log (not used)
+```
+
+## Edge Cases and Troubleshooting
+
+### Scenario 1: Migration Fails (PM2 Still Running)
+
+<Warning>
+This is rare but can happen if PM2 has watch mode enabled or the process is manually restarted.
+</Warning>
+
+**Symptoms**:
+- `pm2 list` still shows `claude-mem-worker`
+- Port conflict errors in logs
+- Worker fails to start
+
+**Resolution**:
+```bash
+# Manual cleanup
+pm2 delete claude-mem-worker
+pm2 save  # Persist the deletion
+
+# Force re-migration (optional)
+rm ~/.claude-mem/.pm2-migrated
+
+# Restart worker
+npm run worker:restart
+```
+
+### Scenario 2: Stale PID File (Process Dead)
+
+**Symptoms**:
+- `npm run worker:status` shows "not running"
+- `.worker.pid` file exists
+- Process ID doesn't exist
+
+**Automatic Recovery**: Next hook trigger detects dead process and starts a fresh worker.
+
+**Manual Resolution**:
+```bash
+rm ~/.claude-mem/.worker.pid
+rm ~/.claude-mem/.worker.port
+npm run worker:start
+```
+
+### Scenario 3: Port Already in Use
+
+**Error**: `EADDRINUSE: address already in use`
+
+**Resolution**:
+```bash
+# Check what's using the port
+lsof -i :37777
+
+# Kill the process
+kill -9 <PID>
+
+# Restart worker
+npm run worker:restart
+```
+
+### Common Error Messages
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `EADDRINUSE` | Port already in use | `lsof -i :37777` then kill conflicting process |
+| `No such process` | Stale PID file | Automatic cleanup on next hook trigger |
+| `pm2: command not found` | PM2 not installed | None needed (error is caught and ignored) |
+| `Invalid port X` | Port validation failed | Update `CLAUDE_MEM_WORKER_PORT` in settings |
+
+## Developer Notes
+
+### Testing the Migration
+
+```bash
+# 1. Install old version (with PM2)
+git checkout <pre-7.1.0-tag>
+npm install && npm run build && npm run sync-marketplace
+
+# 2. Start PM2 worker
+pm2 start plugin/scripts/worker-cli.js --name claude-mem-worker
+
+# 3. Update to new version
+git checkout main
+npm install && npm run build && npm run sync-marketplace
+
+# 4. Trigger hook
+node plugin/scripts/session-start-hook.js
+
+# 5. Verify migration
+pm2 list  # Should NOT show claude-mem-worker
+cat ~/.claude-mem/.pm2-migrated  # Should exist
+npm run worker:status  # Should show Bun worker running
+```
+
+### Architecture Decisions
+
+**Why Custom ProcessManager Instead of PM2?**
+1. **Simplicity**: Direct control, no external daemon
+2. **Dependencies**: Removes the PM2 npm dependency
+3. **Cross-platform**: Bun handles platform differences
+4. **Bundle Size**: Reduce plugin package size
+5. **Control**: Fine-grained error handling and validation
+
+**Why One-Time Marker Instead of Always Running PM2 Delete?**
+1. **Performance**: Avoid unnecessary process spawning
+2. **Idempotency**: Migration runs exactly once
+3. **Debugging**: Timestamp shows when migration occurred
+4. **Simplicity**: Clear migration state
+
+**Why Run PM2 Cleanup on All Platforms?**
+1. **Quality Migration**: Clean up orphaned processes
+2. **Consistency**: Same behavior across all platforms
+3. **Safety**: Error handling already in place (try/catch)
+4. **No Downside**: If PM2 not installed, error is caught and ignored
+
+## Summary
+
+The migration from PM2 to Bun-based ProcessManager is a **one-time, automatic, transparent** transition that:
+
+1. **Removes external dependencies** (PM2, better-sqlite3)
+2. **Simplifies architecture** (direct process control)
+3. **Improves cross-platform support** (especially Windows)
+4. **Preserves user data** (database, settings, logs unchanged)
+5. **Requires no user action** (automatic on first hook trigger)
+
+**Key Migration Moment**: First hook trigger after update to 7.1.0+
+**Duration**: ~2-5 seconds (one-time delay)
+**Impact**: Seamless transition, user-invisible
+**Rollback**: Not needed (migration is forward-only, safe)
+
+For most users, the migration will be completely transparent - they'll see no errors, no data loss, and experience improved reliability and simpler troubleshooting going forward.
--- a/.agent/services/claude-mem/docs/public/architecture/search-architecture.mdx
+++ b/.agent/services/claude-mem/docs/public/architecture/search-architecture.mdx
@@ -0,0 +1,497 @@
+---
+title: "Search Architecture"
+description: "MCP tools with 3-layer workflow for token-efficient memory retrieval"
+---
+
+# Search Architecture
+
+Claude-mem uses an **MCP-based search architecture** that provides intelligent memory retrieval through 4 streamlined tools following a 3-layer workflow pattern.
+
+## Overview
+
+**Architecture**: MCP Tools → MCP Protocol → HTTP API → Worker Service
+
+**Key Components**:
+1. **MCP Tools** (4 tools) - `search`, `timeline`, `get_observations`, `__IMPORTANT`
+2. **MCP Server** (`plugin/scripts/mcp-server.cjs`) - Thin wrapper over HTTP API
+3. **HTTP API Endpoints** - Fast search operations on Worker Service (port 37777)
+4. **Worker Service** - Express.js server with FTS5 full-text search
+5. **SQLite Database** - Persistent storage with FTS5 virtual tables
+6. **Chroma Vector DB** - Semantic search with hybrid retrieval
+
+**Token Efficiency**: ~10x savings through 3-layer workflow pattern
+
+## How It Works
+
+### 1. User Query
+
+Claude has access to 4 MCP tools. When searching memory, Claude follows the 3-layer workflow:
+
+```
+Step 1: search(query="authentication bug", type="bugfix", limit=10)
+Step 2: timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
+Step 3: get_observations(ids=[123, 456, 789])
+```
+
+### 2. MCP Protocol
+
+The MCP server receives the tool call via JSON-RPC over stdio (shown here without the `jsonrpc` and `id` envelope fields):
+
+```json
+{
+  "method": "tools/call",
+  "params": {
+    "name": "search",
+    "arguments": {
+      "query": "authentication bug",
+      "type": "bugfix",
+      "limit": 10
+    }
+  }
+}
+```
+
+### 3. HTTP API Call
+
+MCP server translates to HTTP request:
+
+```typescript
+const url = `http://localhost:37777/api/search?query=authentication%20bug&type=bugfix&limit=10`;
+const response = await fetch(url);
+```
+
+### 4. Worker Processing
+
+Worker service executes FTS5 query:
+
+```sql
+SELECT * FROM observations_fts
+WHERE observations_fts MATCH ?
+AND type = 'bugfix'
+ORDER BY rank
+LIMIT 10
+```
+
+### 5. Results Returned
+
+Worker returns structured data → MCP server → Claude:
+
+```json
+{
+  "content": [{
+    "type": "text",
+    "text": "| ID | Time | Title | Type |\n|---|---|---|---|\n| #123 | 2:15 PM | Fixed auth token expiry | bugfix |"
+  }]
+}
+```
+
+### 6. Claude Processes Results
+
+Claude reviews the index, decides which observations are relevant, and can:
+- Use `timeline` to get context
+- Use `get_observations` to fetch full details for selected IDs
+
+## The 4 MCP Tools
+
+### `__IMPORTANT` - Workflow Documentation
+
+Always visible to Claude. Explains the 3-layer workflow pattern.
+
+**Description:**
+```
+3-LAYER WORKFLOW (ALWAYS FOLLOW):
+1. search(query) → Get index with IDs (~50-100 tokens/result)
+2. timeline(anchor=ID) → Get context around interesting results
+3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs
+NEVER fetch full details without filtering first. 10x token savings.
+```
+
+**Purpose:** Ensures Claude follows the token-efficient pattern
+
+### `search` - Search Memory Index
+
+**Tool Definition:**
+```typescript
+{
+  name: 'search',
+  description: 'Step 1: Search memory. Returns index with IDs. Params: query, limit, project, type, obs_type, dateStart, dateEnd, offset, orderBy',
+  inputSchema: {
+    type: 'object',
+    properties: {},
+    additionalProperties: true  // Accepts any parameters
+  }
+}
+```
+
+**HTTP Endpoint:** `GET /api/search`
+
+**Parameters:**
+- `query` - Full-text search query
+- `limit` - Maximum results (default: 20)
+- `type` - Filter by observation type
+- `project` - Filter by project name
+- `dateStart`, `dateEnd` - Date range filters
+- `offset` - Pagination offset
+- `orderBy` - Sort order
+
+**Returns:** Compact index with IDs, titles, dates, types (~50-100 tokens per result)
+
+### `timeline` - Get Chronological Context
+
+**Tool Definition:**
+```typescript
+{
+  name: 'timeline',
+  description: 'Step 2: Get context around results. Params: anchor (observation ID) OR query (finds anchor automatically), depth_before, depth_after, project',
+  inputSchema: {
+    type: 'object',
+    properties: {},
+    additionalProperties: true
+  }
+}
+```
+
+**HTTP Endpoint:** `GET /api/timeline`
+
+**Parameters:**
+- `anchor` - Observation ID to center timeline around (optional if query provided)
+- `query` - Search query to find anchor automatically (optional if anchor provided)
+- `depth_before` - Number of observations before anchor (default: 3)
+- `depth_after` - Number of observations after anchor (default: 3)
+- `project` - Filter by project name
+
+**Returns:** Chronological view showing what happened before/during/after
+
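The `depth_before` and `depth_after` parameters can be read as a window around the anchor in a chronologically ordered list. The sketch below assumes that reading. `timelineWindow` is a hypothetical helper, not the worker's implementation, which may also apply project filtering and mix in other record types.

```typescript
// Illustrative only: selects the anchor plus its neighbors from an ordered ID list.
function timelineWindow(
  orderedIds: number[],
  anchor: number,
  depthBefore = 3,
  depthAfter = 3
): number[] {
  const index = orderedIds.indexOf(anchor);
  if (index === -1) return []; // Anchor is not in this list
  // Clamp the window to the bounds of the list
  const start = Math.max(0, index - depthBefore);
  const end = Math.min(orderedIds.length, index + depthAfter + 1);
  return orderedIds.slice(start, end);
}
```

With the defaults, an anchor in the middle of the list yields up to seven entries. An anchor near either end yields fewer because the window is clamped.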
+### `get_observations` - Fetch Full Details
+
+**Tool Definition:**
+```typescript
+{
+  name: 'get_observations',
+  description: 'Step 3: Fetch full details for filtered IDs. Params: ids (array of observation IDs, required), orderBy, limit, project',
+  inputSchema: {
+    type: 'object',
+    properties: {
+      ids: {
+        type: 'array',
+        items: { type: 'number' },
+        description: 'Array of observation IDs to fetch (required)'
+      }
+    },
+    required: ['ids'],
+    additionalProperties: true
+  }
+}
+```
+
+**HTTP Endpoint:** `POST /api/observations/batch`
+
+**Body:**
+```json
+{
+  "ids": [123, 456, 789],
+  "orderBy": "date_desc",
+  "project": "my-app"
+}
+```
+
+**Returns:** Complete observation details (~500-1,000 tokens per observation)
+
+## MCP Server Implementation
+
+**Location:** `/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs`
+
+**Role:** Thin wrapper that translates MCP protocol to HTTP API calls
+
+**Key Characteristics:**
+- ~312 lines of code (reduced from ~2,718 lines in old implementation)
+- No business logic - just protocol translation
+- Single source of truth: Worker HTTP API
+- Simple schemas with `additionalProperties: true`
+
+**Handler Example:**
+```typescript
+{
+  name: 'search',
+  handler: async (args: any) => {
+    const endpoint = '/api/search';
+    const searchParams = new URLSearchParams();
+
+    // Forward every tool argument unchanged as a query parameter
+    for (const [key, value] of Object.entries(args)) {
+      searchParams.append(key, String(value));
+    }
+
+    // The worker performs the search; the MCP server only relays its JSON response
+    const url = `http://localhost:37777${endpoint}?${searchParams}`;
+    const response = await fetch(url);
+    return await response.json();
+  }
+}
+```
+
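Because the handler above forwards every argument as-is, an unset optional parameter would be sent as the literal string `undefined`. The following sketch shows one way the URL construction could skip unset values. `buildSearchUrl` and `ToolArgs` are hypothetical names, and the default base URL is taken from the port documented above.

```typescript
// Sketch only: these names are illustrative and not part of claude-mem.
type ToolArgs = Record<string, string | number | boolean | undefined | null>;

function buildSearchUrl(args: ToolArgs, baseUrl = "http://localhost:37777"): string {
  const pairs: string[] = [];
  for (const key of Object.keys(args)) {
    const value = args[key];
    // Skip unset parameters so they are not sent as "undefined" or "null"
    if (value === undefined || value === null) continue;
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  }
  const queryString = pairs.join("&");
  return queryString ? `${baseUrl}/api/search?${queryString}` : `${baseUrl}/api/search`;
}
```

Called with `{ query: "authentication bug", type: "bugfix", limit: 10 }`, this produces the same URL as the HTTP API call shown earlier in this page.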
+## Worker HTTP API
+
+**Location:** `src/services/worker-service.ts`
+
+**Port:** 37777
+
+**Search Endpoints:**
+```
+GET  /api/search           # Main search (used by MCP search tool)
+GET  /api/timeline         # Timeline context (used by MCP timeline tool)
+POST /api/observations/batch  # Fetch by IDs (used by MCP get_observations tool)
+GET  /api/health           # Health check
+```
+
+**Database Access:**
+- Uses `SessionSearch` service for FTS5 queries
+- Uses `SessionStore` for structured queries
+- Hybrid search with ChromaDB for semantic similarity
+
+**FTS5 Full-Text Search:**
+```sql
+-- search tool → HTTP GET → FTS5 query
+SELECT * FROM observations_fts
+WHERE observations_fts MATCH ?
+AND type = ?
+AND date >= ? AND date <= ?
+ORDER BY rank
+LIMIT ? OFFSET ?
+```
+
+## The 3-Layer Workflow Pattern
+
+### Design Philosophy
+
+The 3-layer workflow embodies **progressive disclosure** - a core principle of claude-mem's architecture.
+
+**Layer 1: Index (Search)**
+- **What:** Compact table with IDs, titles, dates, types
+- **Cost:** ~50-100 tokens per result
+- **Purpose:** Survey what exists before committing tokens
+- **Decision Point:** "Which observations are relevant?"
+
+**Layer 2: Context (Timeline)**
+- **What:** Chronological view of observations around a point
+- **Cost:** Variable based on depth
+- **Purpose:** Understand narrative arc, see what led to/from a point
+- **Decision Point:** "Do I need full details?"
+
+**Layer 3: Details (Get Observations)**
+- **What:** Complete observation data (narrative, facts, files, concepts)
+- **Cost:** ~500-1,000 tokens per observation
+- **Purpose:** Deep dive on validated, relevant observations
+- **Decision Point:** "Apply knowledge to current task"
+
+### Token Efficiency
+
+**Traditional RAG Approach:**
+```
+Fetch 20 observations upfront: 10,000-20,000 tokens
+Relevance: ~10% (only 2 observations actually useful)
+Waste: 9,000-18,000 tokens on irrelevant context
+```
+
+**3-Layer Workflow:**
+```
+Step 1: search (20 results)        ~1,000-2,000 tokens
+Step 2: Review index, filter to 3 relevant IDs
+Step 3: get_observations (3 IDs)   ~1,500-3,000 tokens
+Total: 2,500-5,000 tokens (~75% savings vs. 10,000-20,000 upfront)
+```
+
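+**Savings:** Filtering at the index level before fetching full details cuts the cost of this example roughly 4x. The ratio approaches 10x as fewer results need full details, since the index alone costs about a tenth of fetching everything.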
+
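The arithmetic behind this comparison can be checked with a few lines of code. The per-item costs come from the figures quoted above. The helper names are illustrative, and the variable cost of the optional `timeline` step is left out.

```typescript
// Token cost ranges quoted in this document; helper names are hypothetical.
interface TokenRange { min: number; max: number; }

const INDEX_COST: TokenRange = { min: 50, max: 100 };    // per search result
const DETAIL_COST: TokenRange = { min: 500, max: 1000 }; // per full observation

function scale(range: TokenRange, count: number): TokenRange {
  return { min: range.min * count, max: range.max * count };
}

// Cost of indexing `resultsIndexed` results, then fetching `detailsFetched` in full
function layeredCost(resultsIndexed: number, detailsFetched: number): TokenRange {
  const index = scale(INDEX_COST, resultsIndexed);
  const details = scale(DETAIL_COST, detailsFetched);
  return { min: index.min + details.min, max: index.max + details.max };
}

const upfront = scale(DETAIL_COST, 20);           // fetch all 20 observations
const layered = layeredCost(20, 3);               // index 20, fetch 3
const savings = 1 - layered.max / upfront.max;    // fraction saved at the upper bound
const bestRatio = upfront.max / layeredCost(20, 0).max; // index only, nothing fetched
```

Under these figures `upfront` is 10,000-20,000 tokens, `layered` is 2,500-5,000 tokens, `savings` is 0.75, and `bestRatio` is 10.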
+## Architecture Evolution
+
+### Before: Complex MCP Implementation
+
+**Approach:** 9 MCP tools with detailed parameter schemas
+
+**Token Cost:** ~2,500 tokens in tool definitions per session
+- `search_observations` - Full-text search
+- `find_by_type` - Filter by type
+- `find_by_file` - Filter by file
+- `find_by_concept` - Filter by concept
+- `get_recent_context` - Recent sessions
+- `get_observation` - Fetch single observation
+- `get_session` - Fetch session
+- `get_prompt` - Fetch prompt
+- `help` - API documentation
+
+**Problems:**
+- Overlapping operations (search_observations vs find_by_type)
+- Complex parameter schemas
+- No built-in workflow guidance
+- High token cost at session start
+
+**Code Size:** ~2,718 lines in mcp-server.ts
+
+### After: Streamlined MCP Implementation
+
+**Approach:** 4 MCP tools following 3-layer workflow
+
+**Token Cost:** Lower than before, thanks to simplified tool definitions
+
+**Tools:**
+1. `__IMPORTANT` - Workflow guidance (always visible)
+2. `search` - Step 1 (index)
+3. `timeline` - Step 2 (context)
+4. `get_observations` - Step 3 (details)
+
+**Benefits:**
+- Progressive disclosure built into tool design
+- No overlapping operations
+- Simple schemas (`additionalProperties: true`)
+- Clear workflow pattern
+- ~10x token savings
+
+**Code Size:** ~312 lines in mcp-server.ts (88% reduction)
+
+### Key Insight
+
+**Before:** Progressive disclosure was something Claude had to remember
+
+**After:** Progressive disclosure is enforced by tool design itself
+
+The 3-layer workflow pattern makes it structurally difficult to waste tokens:
+- Can't fetch details without first getting IDs from search
+- Can't search without seeing workflow reminder (`__IMPORTANT`)
+- Timeline provides middle ground between index and full details
+
+## Configuration
+
+### Claude Desktop
+
+Add to `claude_desktop_config.json`:
+
+```json
+{
+  "mcpServers": {
+    "mcp-search": {
+      "command": "node",
+      "args": [
+        "/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs"
+      ]
+    }
+  }
+}
+```
+
+### Claude Code
+
+MCP server is automatically configured via plugin installation. No manual setup required.
+
+**Both clients use the same MCP tools** - the architecture works identically for Claude Desktop and Claude Code.
+
+## Security
+
+### FTS5 Injection Prevention
+
+All search queries are escaped before FTS5 processing:
+
+```typescript
+function escapeFTS5Query(query: string): string {
+  return query.replace(/"/g, '""');
+}
+```
+
+**Testing:** 332 injection attack tests covering special characters, SQL keywords, quote escaping, and boolean operators.
+
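Doubling quotes neutralizes operators only when the result is also wrapped in double quotes: FTS5 treats a double-quoted string as a single phrase and uses a doubled quote as the escape inside it. The sketch below pairs the documented `escapeFTS5Query` with a hypothetical `toFts5Phrase` wrapper. Whether claude-mem wraps queries this way is not shown in this document, so treat the wrapper as an assumption.

```typescript
// Escaping function as documented above
function escapeFTS5Query(query: string): string {
  return query.replace(/"/g, '""');
}

// Hypothetical wrapper: quoting the escaped text makes FTS5 read it as one phrase,
// so words such as OR or NEAR inside it are matched as text, not run as operators.
function toFts5Phrase(query: string): string {
  return `"${escapeFTS5Query(query)}"`;
}
```

The resulting string would still be passed as a bound parameter to `MATCH ?`, as in the queries shown earlier, rather than concatenated into the SQL text.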
+### MCP Protocol Security
+
+- Stdio transport (no network exposure)
+- Local-only HTTP API (localhost:37777)
+- No authentication needed (local development only)
+
+## Performance
+
+**FTS5 Full-Text Search:** Sub-10ms for typical queries
+
+**MCP Overhead:** Minimal - simple protocol translation
+
+**Caching:** HTTP layer allows response caching (future enhancement)
+
+**Pagination:** Efficient with offset/limit
+
+**Batching:** `get_observations` accepts multiple IDs in single call
+
+## Benefits Over Alternative Approaches
+
+### vs. Traditional RAG
+
+**Traditional RAG:**
+- Fetches everything upfront
+- High token cost
+- Low relevance ratio
+
+**3-Layer MCP:**
+- Fetches only what's needed
+- ~10x token savings
+- Higher relevance (Claude chooses what to fetch)
+
+### vs. Previous MCP Implementation (v5.x)
+
+**Previous (9 tools):**
+- Complex schemas
+- Overlapping operations
+- No workflow guidance
+- ~2,500 tokens in definitions
+
+**Current (4 tools):**
+- Simple schemas
+- Clear workflow
+- Built-in guidance
+- ~312 lines of code (down from ~2,718)
+
+### vs. Skill-Based Approach (Previously)
+
+**Skill approach:**
+- Required separate skill files
+- HTTP API called directly via curl
+- Progressive disclosure through skill loading
+
+**MCP approach:**
+- Native MCP protocol (better Claude integration)
+- Cleaner architecture (protocol translation layer)
+- Works with both Claude Desktop and Claude Code
+- Simpler to maintain (no skill files)
+
+**Migration:** Skill-based search was removed in favor of streamlined MCP architecture.
+
+## Troubleshooting
+
+### MCP Server Not Connected
+
+**Symptoms:** Tools not appearing in Claude
+
+**Solution:**
+1. Check MCP server path in configuration
+2. Verify worker service is running: `curl http://localhost:37777/api/health`
+3. Restart Claude Desktop/Code
+
+### Worker Service Not Running
+
+**Symptoms:** MCP tools fail with connection errors
+
+**Solution:**
+```bash
+npm run worker:status       # Check status
+npm run worker:restart      # Restart worker
+npm run worker:logs         # View logs
+```
+
+### Empty Search Results
+
+**Symptoms:** `search` returns no results
+
+**Troubleshooting:**
+1. Test API directly: `curl "http://localhost:37777/api/search?query=test"`
+2. Check that the database file exists: `ls ~/.claude-mem/claude-mem.db`
+3. Verify observations exist: `curl "http://localhost:37777/api/stats"`
+
+## Next Steps
+
+- [Memory Search Usage](/usage/search-tools) - User guide with examples
+- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow
+- [Worker Service Architecture](/architecture/worker-service) - HTTP API details
+- [Database Schema](/architecture/database) - FTS5 tables and indexes
--- a/.agent/services/claude-mem/docs/public/architecture/worker-service.mdx
+++ b/.agent/services/claude-mem/docs/public/architecture/worker-service.mdx
@@ -0,0 +1,695 @@
+---
+title: "Worker Service"
+description: "HTTP API and Bun process management"
+---
+
+# Worker Service
+
+The worker service is a long-running HTTP API built with Express.js and managed natively by Bun. It processes observations through the Claude Agent SDK separately from hook execution to prevent timeout issues.
+
+## Overview
+
+- **Technology**: Express.js HTTP server
+- **Runtime**: Bun (auto-installed if missing)
+- **Process Manager**: Native Bun process management via ProcessManager
+- **Port**: 37777 by default (configurable via `CLAUDE_MEM_WORKER_PORT`)
+- **Location**: `src/services/worker-service.ts`
+- **Built Output**: `plugin/scripts/worker-service.cjs`
+- **Model**: Configurable via `CLAUDE_MEM_MODEL` environment variable (default: sonnet)
+
+## REST API Endpoints
+
+The worker service exposes 22 HTTP endpoints organized into six categories:
+
+### Viewer & Health Endpoints
+
+#### 1. Viewer UI
+```
+GET /
+```
+
+**Purpose**: Serves the web-based viewer UI (v5.1.0+)
+
+**Response**: HTML page with embedded React application
+
+**Features**:
+- Real-time memory stream visualization
+- Infinite scroll pagination
+- Project filtering
+- SSE-based live updates
+- Theme toggle (light/dark mode) as of v5.1.2
+
+#### 2. Health Check
+```
+GET /health
+```
+
+**Purpose**: Worker health status check
+
+**Response**:
+```json
+{
+  "status": "ok",
+  "uptime": 12345,
+  "port": 37777
+}
+```
+
+#### 3. Server-Sent Events Stream
+```
+GET /stream
+```
+
+**Purpose**: Real-time updates for viewer UI
+
+**Response**: SSE stream with events:
+- `observation-created`: New observation added
+- `session-summary-created`: New summary generated
+- `user-prompt-created`: New prompt recorded
+
+**Event Format**:
+```
+event: observation-created
+data: {"id": 123, "title": "...", ...}
+```
+
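The event format above follows the standard Server-Sent Events framing: an `event:` line, a `data:` line, and a blank line that ends the event. A minimal sketch of a formatter, assuming the payload is JSON-serializable (`formatSseEvent` is an illustrative name, not the worker's function):

```typescript
// Illustrative formatter for one SSE message
function formatSseEvent(event: string, data: unknown): string {
  // JSON.stringify emits a single line, so one "data:" field is enough.
  // The trailing blank line tells the client that the event is complete.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

The viewer UI would receive such messages through a browser `EventSource` subscribed to `/stream`, with one listener per event name.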
+### Data Retrieval Endpoints
+
+#### 4. Get Prompts
+```
+GET /api/prompts?project=my-project&limit=20&offset=0
+```
+
+**Purpose**: Retrieve paginated user prompts
+
+**Query Parameters**:
+- `project` (optional): Filter by project name
+- `limit` (default: 20): Number of results
+- `offset` (default: 0): Pagination offset
+
+**Response**:
+```json
+{
+  "prompts": [{
+    "id": 1,
+    "session_id": "abc123",
+    "prompt": "User's prompt text",
+    "prompt_number": 1,
+    "created_at": "2025-11-06T10:30:00Z"
+  }],
+  "total": 150,
+  "hasMore": true
+}
+```
+
+#### 5. Get Observations
+```
+GET /api/observations?project=my-project&limit=20&offset=0
+```
+
+**Purpose**: Retrieve paginated observations
+
+**Query Parameters**:
+- `project` (optional): Filter by project name
+- `limit` (default: 20): Number of results
+- `offset` (default: 0): Pagination offset
+
+**Response**:
+```json
+{
+  "observations": [{
+    "id": 123,
+    "title": "Fix authentication bug",
+    "type": "bugfix",
+    "narrative": "...",
+    "created_at": "2025-11-06T10:30:00Z"
+  }],
+  "total": 500,
+  "hasMore": true
+}
+```
+
+#### 6. Get Summaries
+```
+GET /api/summaries?project=my-project&limit=20&offset=0
+```
+
+**Purpose**: Retrieve paginated session summaries
+
+**Query Parameters**:
+- `project` (optional): Filter by project name
+- `limit` (default: 20): Number of results
+- `offset` (default: 0): Pagination offset
+
+**Response**:
+```json
+{
+  "summaries": [{
+    "id": 456,
+    "session_id": "abc123",
+    "request": "User's original request",
+    "completed": "Work finished",
+    "created_at": "2025-11-06T10:30:00Z"
+  }],
+  "total": 100,
+  "hasMore": true
+}
+```
+
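The three paginated endpoints above share the same `limit`, `offset`, `total`, and `hasMore` fields. The sketch below shows one plausible way `hasMore` relates to the others. It is an assumption about the semantics, uses a generic `items` key instead of `prompts`, `observations`, or `summaries`, and works on an in-memory array, not a database query.

```typescript
// Hypothetical pagination helper; field semantics are assumed, not confirmed.
interface Page<T> {
  items: T[];
  total: number;
  hasMore: boolean;
}

function paginate<T>(all: T[], limit = 20, offset = 0): Page<T> {
  const items = all.slice(offset, offset + limit);
  // More pages remain while the rows returned so far fall short of the total
  return { items, total: all.length, hasMore: offset + items.length < all.length };
}
```

A client can keep requesting with `offset` increased by `limit` until `hasMore` is `false`, which is how infinite scroll in the viewer could be driven.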
+#### 7. Get Observation by ID
+```
+GET /api/observation/:id
+```
+
+**Purpose**: Retrieve a single observation by its ID
+
+**Path Parameters**:
+- `id` (required): Observation ID
+
+**Response**:
+```json
+{
+  "id": 123,
+  "sdk_session_id": "abc123",
+  "project": "my-project",
+  "type": "bugfix",
+  "title": "Fix authentication bug",
+  "narrative": "...",
+  "created_at": "2025-11-06T10:30:00Z",
+  "created_at_epoch": 1762425000000
+}
+```
+
+**Error Response** (404):
+```json
+{
+  "error": "Observation #123 not found"
+}
+```
+
+#### 8. Get Observations by IDs (Batch)
+```
+POST /api/observations/batch
+```
+
+**Purpose**: Retrieve multiple observations by their IDs in a single request
+
+**Request Body**:
+```json
+{
+  "ids": [123, 456, 789],
+  "orderBy": "date_desc",
+  "limit": 10,
+  "project": "my-project"
+}
+```
+
+**Body Parameters**:
+- `ids` (required): Array of observation IDs
+- `orderBy` (optional): Sort order - `date_desc` or `date_asc` (default: `date_desc`)
+- `limit` (optional): Maximum number of results to return
+- `project` (optional): Filter by project name
+
+**Response**:
+```json
+[
+  {
+    "id": 789,
+    "sdk_session_id": "abc123",
+    "project": "my-project",
+    "type": "feature",
+    "title": "Add new feature",
+    "narrative": "...",
+    "created_at": "2025-11-06T12:00:00Z",
+    "created_at_epoch": 1762430400000
+  },
+  {
+    "id": 456,
+    "sdk_session_id": "abc124",
+    "project": "my-project",
+    "type": "bugfix",
+    "title": "Fix authentication bug",
+    "narrative": "...",
+    "created_at": "2025-11-06T10:30:00Z",
+    "created_at_epoch": 1762425000000
+  }
+]
+```
+
+**Error Responses**:
+- `400 Bad Request`: `{"error": "ids must be an array of numbers"}`
+- `400 Bad Request`: `{"error": "All ids must be integers"}`
+
+**Use Case**: This endpoint is used by the `get_observations` MCP tool to efficiently retrieve multiple observations in a single request, avoiding the overhead of multiple individual requests.
+
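The two 400 responses above suggest a two-stage check on `ids`. The sketch below is one way to produce them. Which condition triggers which message is an assumption, and `validateBatchIds` is a hypothetical name.

```typescript
// Returns the error message for a bad `ids` value, or null when it is valid.
function validateBatchIds(ids: unknown): string | null {
  if (!Array.isArray(ids)) return "ids must be an array of numbers";
  for (const id of ids) {
    // Assumed mapping: non-numeric entries get the first message...
    if (typeof id !== "number") return "ids must be an array of numbers";
    // ...and numeric but non-integer entries (1.5, NaN, Infinity) get the second
    if (!isFinite(id) || Math.floor(id) !== id) return "All ids must be integers";
  }
  return null;
}
```

A route handler could call this first and respond with status 400 and `{ "error": message }` whenever the result is not `null`.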
+#### 9. Get Session by ID
+```
+GET /api/session/:id
+```
+
+**Purpose**: Retrieve a single session by its ID
+
+**Path Parameters**:
+- `id` (required): Session ID
+
+**Response**:
+```json
+{
+  "id": 456,
+  "sdk_session_id": "abc123",
+  "project": "my-project",
+  "request": "User's original request",
+  "completed": "Work finished",
+  "created_at": "2025-11-06T10:30:00Z"
+}
+```
+
+**Error Response** (404):
+```json
+{
+  "error": "Session #456 not found"
+}
+```
+
+#### 10. Get Prompt by ID
+```
+GET /api/prompt/:id
+```
+
+**Purpose**: Retrieve a single user prompt by its ID
+
+**Path Parameters**:
+- `id` (required): Prompt ID
+
+**Response**:
+```json
+{
+  "id": 1,
+  "session_id": "abc123",
+  "prompt": "User's prompt text",
+  "prompt_number": 1,
+  "created_at": "2025-11-06T10:30:00Z"
+}
+```
+
+**Error Response** (404):
+```json
+{
+  "error": "Prompt #1 not found"
+}
+```
+
+#### 12. Get Stats
+```
+GET /api/stats
+```
+
+**Purpose**: Get database statistics by project
+
+**Response**:
+```json
+{
+  "byProject": {
+    "my-project": {
+      "observations": 245,
+      "summaries": 12,
+      "prompts": 48
+    },
+    "other-project": {
+      "observations": 156,
+      "summaries": 8,
+      "prompts": 32
+    }
+  },
+  "total": {
+    "observations": 401,
+    "summaries": 20,
+    "prompts": 80,
+    "sessions": 20
+  }
+}
+```
+
+#### 13. Get Projects
+```
+GET /api/projects
+```
+
+**Purpose**: Get list of distinct projects from observations
+
+**Response**:
+```json
+{
+  "projects": ["my-project", "other-project", "test-project"]
+}
+```
+
+### Settings Endpoints
+
+#### 14. Get Settings
+```
+GET /api/settings
+```
+
+**Purpose**: Retrieve user settings
+
+**Response**:
+```json
+{
+  "sidebarOpen": true,
+  "selectedProject": "my-project",
+  "theme": "dark"
+}
+```
+
+#### 15. Save Settings
+```
+POST /api/settings
+```
+
+**Purpose**: Persist user settings
+
+**Request Body**:
+```json
+{
+  "sidebarOpen": false,
+  "selectedProject": "other-project",
+  "theme": "light"
+}
+```
+
+**Response**:
+```json
+{
+  "success": true
+}
+```
+
+### Queue Management Endpoints
+
+#### 16. Get Pending Queue Status
+```
+GET /api/pending-queue
+```
+
+**Purpose**: View current processing queue status and identify stuck messages
+
+**Response**:
+```json
+{
+  "queue": {
+    "messages": [
+      {
+        "id": 123,
+        "session_db_id": 45,
+        "claude_session_id": "abc123",
+        "message_type": "observation",
+        "status": "pending",
+        "retry_count": 0,
+        "created_at_epoch": 1730886600000,
+        "started_processing_at_epoch": null,
+        "completed_at_epoch": null
+      }
+    ],
+    "totalPending": 5,
+    "totalProcessing": 2,
+    "totalFailed": 0,
+    "stuckCount": 1
+  },
+  "recentlyProcessed": [
+    {
+      "id": 122,
+      "session_db_id": 44,
+      "status": "processed",
+      "completed_at_epoch": 1730886500000
+    }
+  ],
+  "sessionsWithPendingWork": [44, 45, 46]
+}
+```
+
+**Status Definitions**:
+- `pending`: Message queued, not yet processed
+- `processing`: Message currently being processed by SDK agent
+- `processed`: Message completed successfully
+- `failed`: Message failed after max retry attempts (3 by default)
+
+**Stuck Detection**: Messages in `processing` status for >5 minutes are considered stuck and included in `stuckCount`
+
+**Use Case**: Check queue health after worker crashes or restarts to identify unprocessed observations
+
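The stuck-detection rule can be sketched as a filter over queue messages. This assumes the `*_epoch` fields are in milliseconds, as the example values suggest. `QueueMessage` is trimmed to the fields the rule needs, and `countStuck` is an illustrative name.

```typescript
// Minimal shape of a queue message for this check (other fields omitted)
interface QueueMessage {
  id: number;
  status: "pending" | "processing" | "processed" | "failed";
  started_processing_at_epoch: number | null;
}

const STUCK_THRESHOLD_MS = 5 * 60 * 1000; // 5 minutes

function countStuck(messages: QueueMessage[], nowEpochMs: number): number {
  return messages.filter(
    (m) =>
      m.status === "processing" &&
      m.started_processing_at_epoch !== null &&
      // Strictly longer than the threshold, matching ">5 minutes"
      nowEpochMs - m.started_processing_at_epoch > STUCK_THRESHOLD_MS
  ).length;
}
```

Passing the current time in as a parameter, instead of reading the clock inside the function, keeps the rule easy to test.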
+#### 17. Trigger Manual Recovery
+```
+POST /api/pending-queue/process
+```
+
+**Purpose**: Manually trigger processing of pending queues (replaces automatic recovery in v5.x+)
+
+**Request Body**:
+```json
+{
+  "sessionLimit": 10
+}
+```
+
+**Body Parameters**:
+- `sessionLimit` (optional): Maximum number of sessions to process (default: 10, max: 100)
+
+**Response**:
+```json
+{
+  "success": true,
+  "totalPendingSessions": 15,
+  "sessionsStarted": 10,
+  "sessionsSkipped": 2,
+  "startedSessionIds": [44, 45, 46, 47, 48, 49, 50, 51, 52, 53]
+}
+```
+
+**Response Fields**:
+- `totalPendingSessions`: Total sessions with pending messages in database
+- `sessionsStarted`: Number of sessions whose processing was started by this request
+- `sessionsSkipped`: Sessions already actively processing (not restarted)
+- `startedSessionIds`: Database IDs of sessions started
+
+**Behavior**:
+- Processes up to `sessionLimit` sessions with pending work
+- Skips sessions already actively processing (prevents duplicate agents)
+- Starts non-blocking SDK agents for each session
+- Returns immediately with status (processing continues in background)
+
+**Use Case**: Manually recover stuck observations after a worker crash, or whenever automatic recovery is disabled
+
+**Recovery Strategy Note**: As of v5.x, automatic recovery on worker startup is disabled by default. To keep reprocessing under explicit user control, trigger recovery manually with this endpoint or the CLI tool (`bun scripts/check-pending-queue.ts`).
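+
+As an illustration of the `sessionLimit` parameter, the sketch below builds the request for this endpoint. `buildRecoveryRequest` and `RecoveryRequest` are hypothetical names. Clamping on the client side and the lower bound of 1 are assumptions; the source documents only the default (10) and the maximum (100), not how the server treats out-of-range values.
+
+```typescript
+const DEFAULT_SESSION_LIMIT = 10; // documented default
+const MAX_SESSION_LIMIT = 100;    // documented maximum
+
+// Hypothetical descriptor for the HTTP call; pass it to any HTTP client.
+interface RecoveryRequest {
+  method: "POST";
+  path: string;
+  body: string;
+}
+
+function buildRecoveryRequest(sessionLimit: number = DEFAULT_SESSION_LIMIT): RecoveryRequest {
+  // Assumption: non-finite input falls back to the default.
+  const requested = Number.isFinite(sessionLimit) ? Math.floor(sessionLimit) : DEFAULT_SESSION_LIMIT;
+  // Assumption: clamp to [1, 100] before sending.
+  const clamped = Math.min(MAX_SESSION_LIMIT, Math.max(1, requested));
+  return {
+    method: "POST",
+    path: "/api/pending-queue/process",
+    body: JSON.stringify({ sessionLimit: clamped }),
+  };
+}
+```
+
+The returned `path` is relative to the worker address (port 37777 by default).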
+
+### Session Management Endpoints
+
+#### 19. Initialize Session
+```
+POST /sessions/:sessionDbId/init
+```
+
+**Request Body**:
+```json
+{
+  "sdk_session_id": "abc-123",
+  "project": "my-project"
+}
+```
+
+**Response**:
+```json
+{
+  "success": true,
+  "session_id": "abc-123"
+}
+```
+
+#### 20. Add Observation
+```
+POST /sessions/:sessionDbId/observations
+```
+
+**Request Body**:
+```json
+{
+  "tool_name": "Read",
+  "tool_input": {...},
+  "tool_result": "...",
+  "correlation_id": "xyz-789"
+}
+```
+
+**Response**:
+```json
+{
+  "success": true,
+  "observation_id": 123
+}
+```
+
+#### 21. Generate Summary
+```
+POST /sessions/:sessionDbId/summarize
+```
+
+**Request Body**:
+```json
+{
+  "trigger": "stop"
+}
+```
+
+**Response**:
+```json
+{
+  "success": true,
+  "summary_id": 456
+}
+```
+
+#### 22. Session Status
+```
+GET /sessions/:sessionDbId/status
+```
+
+**Response**:
+```json
+{
+  "session_id": "abc-123",
+  "status": "active",
+  "observation_count": 42,
+  "summary_count": 1
+}
+```
+
+#### 23. Delete Session
+```
+DELETE /sessions/:sessionDbId
+```
+
+**Response**:
+```json
+{
+  "success": true
+}
+```
+
+**Note**: As of v4.1.0, the cleanup hook no longer calls this endpoint. Sessions are marked complete instead of deleted to allow graceful worker shutdown.
+
+## Bun Process Management
+
+### Overview
+
+The worker is managed by the native `ProcessManager` class which handles:
+- Process spawning with Bun runtime
+- PID file tracking at `~/.claude-mem/worker.pid`
+- Health checks with automatic retry
+- Graceful shutdown with SIGTERM/SIGKILL fallback
+
+### Commands
+
+```bash
+# Start worker (auto-starts on first session)
+npm run worker:start
+
+# Stop worker
+npm run worker:stop
+
+# Restart worker
+npm run worker:restart
+
+# View logs
+npm run worker:logs
+
+# Check status
+npm run worker:status
+```
+
+### Auto-Start Behavior
+
+The worker service auto-starts when the SessionStart hook fires. Manual start is optional.
+
+### Bun Requirement
+
+Bun is required to run the worker service. If Bun is not installed, the smart-install script will automatically install it on first run:
+
+- **Windows**: `powershell -c "irm bun.sh/install.ps1 | iex"`
+- **macOS/Linux**: `curl -fsSL https://bun.sh/install | bash`
+
+You can also install manually via:
+- `winget install Oven-sh.Bun` (Windows)
+- `brew install oven-sh/bun/bun` (macOS)
+
+## Claude Agent SDK Integration
+
+The worker service routes observations to the Claude Agent SDK for AI-powered processing:
+
+### Processing Flow
+
+1. **Observation Queue**: Observations accumulate in memory
+2. **SDK Processing**: Observations sent to Claude via Agent SDK
+3. **XML Parsing**: Responses parsed for structured data
+4. **Database Storage**: Processed observations stored in SQLite
+
+### SDK Components
+
+- **Prompts** (`src/sdk/prompts.ts`): Builds XML-structured prompts
+- **Parser** (`src/sdk/parser.ts`): Parses Claude's XML responses
+- **Worker** (`src/sdk/worker.ts`): Main SDK agent loop
+
+### Model Configuration
+
+Set the AI model used for processing via environment variable:
+
+```bash
+export CLAUDE_MEM_MODEL=sonnet
+```
+
+Available shorthand models (forward to latest version):
+- `haiku` - Fast, cost-efficient
+- `sonnet` - Balanced (default)
+- `opus` - Most capable
+
+## Port Allocation
+
+The worker uses a fixed port (37777 by default) for consistent communication:
+
+- **Default**: Port 37777
+- **Override**: Set `CLAUDE_MEM_WORKER_PORT` environment variable
+- **Port File**: `${CLAUDE_PLUGIN_ROOT}/data/worker.port` tracks current port
+
+If port 37777 is in use, the worker will fail to start. Set a custom port via environment variable.
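+
+The default-plus-override rule can be sketched as follows. `resolveWorkerPort` is a hypothetical helper, not the worker's code. Falling back to the default for malformed or out-of-range values is an assumption, since the source does not say how invalid values are handled.
+
+```typescript
+const DEFAULT_WORKER_PORT = 37777;
+
+// Takes the environment as a plain object so the function stays runtime-agnostic.
+function resolveWorkerPort(env: Record<string, string | undefined>): number {
+  const raw = env["CLAUDE_MEM_WORKER_PORT"];
+  if (raw === undefined || raw.trim() === "") return DEFAULT_WORKER_PORT;
+  const port = Number(raw);
+  // Assumption: values outside the valid TCP port range fall back to the default.
+  if (!Number.isInteger(port) || port < 1 || port > 65535) return DEFAULT_WORKER_PORT;
+  return port;
+}
+```
+
+In a Bun or Node.js process this would typically be called as `resolveWorkerPort(process.env)`.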
+
+## Data Storage
+
+The worker service stores data in the user data directory:
+
+```
+~/.claude-mem/
+├── claude-mem.db           # SQLite database (bun:sqlite)
+├── worker.pid              # PID file for process tracking
+├── settings.json           # User settings
+└── logs/
+    └── worker-YYYY-MM-DD.log  # Daily rotating logs
+```
+
+## Error Handling
+
+The worker implements graceful degradation:
+
+- **Database Errors**: Logged but don't crash the service
+- **SDK Errors**: Retried with exponential backoff
+- **Network Errors**: Logged and skipped
+- **Invalid Input**: Validated and rejected with error response
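+
+For readers unfamiliar with exponential backoff, the sketch below shows how such retry delays are commonly computed. The base delay, the cap, and the function name `backoffDelayMs` are hypothetical; the worker's actual values are not documented here.
+
+```typescript
+// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
+// Assumption: baseMs and maxMs are illustrative defaults, not the worker's settings.
+function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 30000): number {
+  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
+}
+
+// Delays for three retries with the illustrative defaults.
+const retryDelays = [0, 1, 2].map((attempt) => backoffDelayMs(attempt));
+```
+
+With these illustrative defaults, three retries wait 1, 2, and 4 seconds, and later attempts are capped at 30 seconds.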
+
+## Performance
+
+- **Async Processing**: Observations processed asynchronously
+- **In-Memory Queue**: Fast observation accumulation
+- **Batch Processing**: Multiple observations processed together
+- **Connection Pooling**: SQLite connections reused
+
+## Troubleshooting
+
+See [Troubleshooting - Worker Issues](../troubleshooting.md#worker-service-issues) for common problems and solutions.
--- a/.agent/services/claude-mem/docs/public/beta-features.mdx
+++ b/.agent/services/claude-mem/docs/public/beta-features.mdx
@@ -0,0 +1,151 @@
+---
+title: "Beta Features"
+description: "Try experimental features like Endless Mode before they're released"
+---
+
+# Beta Features
+
+<Warning>
+**Endless Mode is experimental and not included in the stable release.** You must manually switch to the beta branch to try it. The efficiency projections below are based on theoretical modeling, not production measurements. Expect slower performance than standard mode and potential bugs.
+</Warning>
+
+Claude-Mem offers a beta channel for users who want to try experimental features before they're released to the stable channel.
+
+## Version Channel Switching
+
+You can switch between stable and beta versions directly from the web viewer UI at http://localhost:37777.
+
+### How to Access
+
+1. Open the Claude-Mem viewer at http://localhost:37777
+2. Click the **Settings** gear icon in the top-right
+3. Find the **Version Channel** section
+4. Click **Try Beta (Endless Mode)** to switch to beta, or **Switch to Stable** to return
+
+### What Happens When You Switch
+
+When switching versions:
+
+1. **Local changes are discarded** - Any modifications in the plugin directory are reset
+2. **Git fetch and checkout** - The installed plugin switches to the target branch
+3. **Dependencies reinstall** - `npm install` runs to ensure correct dependencies
+4. **Worker restarts automatically** - The background service restarts with the new version
+
+**Your memory data is always preserved.** The database at `~/.claude-mem/claude-mem.db` is not affected by version switching. All your observations, sessions, and summaries remain intact.
+
+### Version Indicators
+
+The Version Channel section shows your current status:
+
+- **Stable** (green badge) - You're running the production release
+- **Beta** (orange badge) - You're running the beta with experimental features
+
+You'll also see the exact branch name (e.g., `main` for stable, `beta/7.0` for beta).
+
+## Endless Mode (Beta)
+
+The flagship experimental feature in beta is **Endless Mode** - a biomimetic memory architecture intended to extend how long Claude can maintain context in a session.
+
+### The Problem Endless Mode Solves
+
+In standard Claude Code sessions:
+
+- Tool outputs (file reads, bash output, search results) accumulate in the context window
+- Each tool can add 1-10k+ tokens to the context
+- After ~50 tool uses, the context window fills up (~200k tokens)
+- You're forced to start a new session, losing conversational continuity
+
+Worse, Claude **re-synthesizes all previous tool outputs** on every response. This is O(N²) complexity: total tokens and compute grow quadratically with the number of tool uses.
+
+### How Endless Mode Works
+
+Endless Mode applies a biomimetic memory architecture inspired by how human memory works:
+
+**Two-Tier Memory System:**
+
+```
+Working Memory (Context Window):
+  → Compressed observations only (~500 tokens each)
+  → Fast, efficient, manageable
+
+Archive Memory (Transcript File):
+  → Full tool outputs preserved on disk
+  → Perfect recall, searchable
+```
+
+**The Key Innovation**: After each tool use, Endless Mode:
+1. Waits for the worker to generate a compressed observation (blocking)
+2. Transforms the transcript file on disk
+3. Replaces the full tool output with the compressed observation
+4. Claude resumes with the compressed context
+
+This transforms O(N²) scaling into O(N) - linear instead of quadratic.
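+
+To make the capacity effect concrete, here is a back-of-the-envelope sketch using the figures quoted above (a ~200k-token window and ~500 tokens per compressed observation). The 4,000-token average per raw tool output is an assumption chosen from the stated 1-10k range so that it matches the "~50 tool uses" figure. The calculation ignores system prompts and conversation text, so treat it as a rough projection, not a measurement.
+
+```typescript
+// Number of tool uses that fit in the window at a fixed average cost per tool use.
+function toolUsesUntilFull(windowTokens: number, tokensPerToolUse: number): number {
+  return Math.floor(windowTokens / tokensPerToolUse);
+}
+
+const WINDOW_TOKENS = 200000;      // "~200k tokens" from the text above
+const RAW_TOKENS_PER_TOOL = 4000;  // assumption: average raw tool output size
+const COMPRESSED_TOKENS = 500;     // "~500 tokens each" from the text above
+
+const standardUses = toolUsesUntilFull(WINDOW_TOKENS, RAW_TOKENS_PER_TOOL); // 50
+const endlessUses = toolUsesUntilFull(WINDOW_TOKENS, COMPRESSED_TOKENS);    // 400
+```
+
+Under these assumptions, the same window holds roughly eight times as many tool uses.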
+
+### Projected Results
+
+Based on theoretical modeling (not production measurements):
+
+- **Token savings**: Significant reduction in context window usage
+- **Efficiency gain**: More tool uses before context exhaustion
+- **Quality preservation**: Observations cache the synthesis result, so no information is lost
+
+### Important Caveats
+
+Endless Mode is experimental and has significant limitations:
+
+- **Not in stable release** - You must manually switch to the beta branch to use this feature
+- **Still in development** - May have bugs, breaking changes, or incomplete functionality
+- **Slower than standard mode** - Blocking observation generation adds latency to each tool use
+- **Theoretical projections** - The efficiency claims above are based on simulations, not real-world production data
+- **Requires working database** - Observations must save successfully for transformation
+- **New architecture** - Less battle-tested than standard mode
+
+### When to Use Beta
+
+Consider switching to beta if you:
+
+- Frequently hit context window limits
+- Work on long, complex sessions with many tool uses
+- Want to help test and provide feedback on new features
+- Are comfortable with experimental software
+
+### When to Stay on Stable
+
+Stay on stable if you:
+
+- Need maximum reliability for critical work
+- Prefer battle-tested, production-ready features
+- Don't frequently hit context limits
+- Want the smoothest, fastest experience
+
+## Checking for Updates
+
+On either channel (beta or stable), you can check for updates:
+
+1. Open Settings in the viewer
+2. In the Version Channel section, click **Check for Updates**
+3. The plugin will pull the latest changes and restart
+
+## Switching Back
+
+If you encounter issues on beta:
+
+1. Open Settings in the viewer
+2. Click **Switch to Stable**
+3. Wait for the worker to restart
+
+Your memory data is preserved, and you'll be back on the stable release.
+
+## Providing Feedback
+
+If you encounter bugs or have feedback about beta features:
+
+- Open an issue at [GitHub Issues](https://github.com/thedotmack/claude-mem/issues)
+- Include your branch (`beta/7.0` etc.) in the report
+- Describe what you expected vs. what happened
+
+## Next Steps
+
+- [Configuration](configuration) - Customize other Claude-Mem settings
+- [Troubleshooting](troubleshooting) - Common issues and solutions
+- [Architecture Overview](architecture/overview) - Understand how Claude-Mem works
--- a/.agent/services/claude-mem/docs/public/claude-mem-logo-for-dark-mode.webp
+++ b/.agent/services/claude-mem/docs/public/claude-mem-logo-for-dark-mode.webp
--- a/.agent/services/claude-mem/docs/public/claude-mem-logo-for-light-mode.webp
+++ b/.agent/services/claude-mem/docs/public/claude-mem-logo-for-light-mode.webp
--- a/.agent/services/claude-mem/docs/public/claude-mem-logomark.webp
+++ b/.agent/services/claude-mem/docs/public/claude-mem-logomark.webp
--- a/.agent/services/claude-mem/docs/public/cm-preview.gif
+++ b/.agent/services/claude-mem/docs/public/cm-preview.gif
--- a/.agent/services/claude-mem/docs/public/configuration.mdx
+++ b/.agent/services/claude-mem/docs/public/configuration.mdx
@@ -0,0 +1,505 @@
+---
+title: "Configuration"
+description: "Environment variables and settings for Claude-Mem"
+---
+
+# Configuration
+
+## Settings File
+
+Settings are managed in `~/.claude-mem/settings.json`. The file is auto-created with defaults on first run.
+
+### Core Settings
+
+| Setting                       | Default                         | Description                           |
+|-------------------------------|---------------------------------|---------------------------------------|
+| `CLAUDE_MEM_MODEL`            | `sonnet`                        | AI model for processing observations (when using Claude) |
+| `CLAUDE_MEM_PROVIDER`         | `claude`                        | AI provider: `claude`, `gemini`, or `openrouter` |
+| `CLAUDE_MEM_MODE`             | `code`                          | Active mode profile (e.g., `code--es`, `email-investigation`) |
+| `CLAUDE_MEM_CONTEXT_OBSERVATIONS` | `50`                        | Number of observations to inject      |
+| `CLAUDE_MEM_WORKER_PORT`      | `37777`                         | Worker service port                   |
+| `CLAUDE_MEM_WORKER_HOST`      | `127.0.0.1`                     | Worker service host address           |
+| `CLAUDE_MEM_SKIP_TOOLS`       | `ListMcpResourcesTool,SlashCommand,Skill,TodoWrite,AskUserQuestion` | Comma-separated tools to exclude from observations |
+
+### Gemini Provider Settings
+
+| Setting                       | Default                         | Description                           |
+|-------------------------------|---------------------------------|---------------------------------------|
+| `CLAUDE_MEM_GEMINI_API_KEY`   | —                               | Gemini API key ([get free key](https://aistudio.google.com/app/apikey)) |
+| `CLAUDE_MEM_GEMINI_MODEL`     | `gemini-2.5-flash-lite`          | Gemini model: `gemini-2.5-flash-lite`, `gemini-2.5-flash`, `gemini-3-flash-preview` |
+
+See [Gemini Provider](usage/gemini-provider) for detailed configuration and free tier information.
+
+### OpenRouter Provider Settings
+
+| Setting                                      | Default                     | Description                           |
+|----------------------------------------------|-----------------------------|---------------------------------------|
+| `CLAUDE_MEM_OPENROUTER_API_KEY`              | —                           | OpenRouter API key ([get key](https://openrouter.ai/keys)) |
+| `CLAUDE_MEM_OPENROUTER_MODEL`                | `xiaomi/mimo-v2-flash:free` | Model identifier (supports 100+ models) |
+| `CLAUDE_MEM_OPENROUTER_MAX_CONTEXT_MESSAGES` | `20`                        | Max messages in conversation history  |
+| `CLAUDE_MEM_OPENROUTER_MAX_TOKENS`           | `100000`                    | Token budget safety limit             |
+| `CLAUDE_MEM_OPENROUTER_SITE_URL`             | —                           | Optional: URL for analytics           |
+| `CLAUDE_MEM_OPENROUTER_APP_NAME`             | `claude-mem`                | Optional: App name for analytics      |
+
+See [OpenRouter Provider](usage/openrouter-provider) for detailed configuration, free model list, and usage guide.
+
+### System Configuration
+
+| Setting                       | Default                         | Description                           |
+|-------------------------------|---------------------------------|---------------------------------------|
+| `CLAUDE_MEM_DATA_DIR`         | `~/.claude-mem`                 | Data directory location               |
+| `CLAUDE_MEM_LOG_LEVEL`        | `INFO`                          | Log verbosity (DEBUG, INFO, WARN, ERROR, SILENT) |
+| `CLAUDE_MEM_PYTHON_VERSION`   | `3.13`                          | Python version for chroma-mcp         |
+| `CLAUDE_CODE_PATH`            | _(auto-detect)_                 | Path to Claude Code CLI (for Windows) |
+
+## Model Configuration
+
+Configure which AI model processes your observations.
+
+### Available Models
+
+Shorthand model names automatically forward to the latest version:
+
+- `haiku` - Fast, cost-efficient
+- `sonnet` - Balanced (default)
+- `opus` - Most capable
+
+### Using the Interactive Script
+
+```bash
+./claude-mem-settings.sh
+```
+
+This script manages settings in `~/.claude-mem/settings.json`.
+
+### Manual Configuration
+
+Edit `~/.claude-mem/settings.json`:
+
+```json
+{
+  "CLAUDE_MEM_MODEL": "sonnet"
+}
+```
+
+## Mode Configuration
+
+Configure the active workflow mode and language.
+
+### Settings
+
+| Setting | Default | Description |
+|---------|---------|-------------|
+| `CLAUDE_MEM_MODE` | `code` | Defines behavior and language. See [Modes & Languages](modes). |
+
+### Examples
+
+**Spanish Code Mode:**
+```json
+{
+  "CLAUDE_MEM_MODE": "code--es"
+}
+```
+
+**Email Investigation Mode:**
+```json
+{
+  "CLAUDE_MEM_MODE": "email-investigation"
+}
+```
+
+## Files and Directories
+
+### Data Directory Structure
+
+The data directory location depends on the environment:
+- **Production (installed plugin)**: `~/.claude-mem/` (always, regardless of CLAUDE_PLUGIN_ROOT)
+- **Development**: Can be overridden with `CLAUDE_MEM_DATA_DIR`
+
+```
+~/.claude-mem/
+├── claude-mem.db           # SQLite database
+├── .install-version        # Cached version for smart installer
+├── worker.port             # Current worker port file
+└── logs/
+    ├── worker-out.log      # Worker stdout logs
+    └── worker-error.log    # Worker stderr logs
+```
+
+### Plugin Directory Structure
+
+```
+${CLAUDE_PLUGIN_ROOT}/
+├── .claude-plugin/
+│   └── plugin.json         # Plugin metadata
+├── .mcp.json               # MCP server configuration
+├── hooks/
+│   └── hooks.json          # Hook configuration
+├── scripts/                # Built executables
+│   ├── smart-install.js    # Smart installer script
+│   ├── context-hook.js     # Context injection hook
+│   ├── new-hook.js         # Session creation hook
+│   ├── save-hook.js        # Observation capture hook
+│   ├── summary-hook.js     # Summary generation hook
+│   ├── worker-service.cjs  # Worker service (CJS)
+│   └── mcp-server.cjs      # MCP search server (CJS)
+└── ui/
+    └── viewer.html         # Web viewer UI bundle
+```
+
+## Plugin Configuration
+
+### Hooks Configuration
+
+Hooks are configured in `plugin/hooks/hooks.json`:
+
+```json
+{
+  "description": "Claude-mem memory system hooks",
+  "hooks": {
+    "SessionStart": [{
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/smart-install.js && node ${CLAUDE_PLUGIN_ROOT}/scripts/context-hook.js",
+        "timeout": 120
+      }]
+    }],
+    "UserPromptSubmit": [{
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/new-hook.js",
+        "timeout": 120
+      }]
+    }],
+    "PostToolUse": [{
+      "matcher": "*",
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/save-hook.js",
+        "timeout": 120
+      }]
+    }],
+    "Stop": [{
+      "hooks": [{
+        "type": "command",
+        "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/summary-hook.js",
+        "timeout": 120
+      }]
+    }]
+  }
+}
+```
+
+### Search Configuration
+
+Claude-Mem provides MCP search tools for querying your project history.
+
+**No configuration required** - MCP tools are automatically available in Claude Code sessions.
+
+Search operations are provided via:
+- **MCP Server**: 3 tools (search, timeline, get_observations) with progressive disclosure
+- **HTTP API**: 10 endpoints on worker service port 37777
+- **Auto-Invocation**: Claude recognizes natural language queries about past work
+
+## Version Channel
+
+Claude-Mem supports switching between stable and beta versions via the web viewer UI.
+
+### Accessing Version Channel
+
+1. Open the viewer at http://localhost:37777
+2. Click the Settings gear icon
+3. Find the **Version Channel** section
+
+### Switching Versions
+
+- **Try Beta**: Click "Try Beta (Endless Mode)" to switch to the beta branch with experimental features
+- **Switch to Stable**: Click "Switch to Stable" to return to the production release
+- **Check for Updates**: Pull the latest changes for your current branch
+
+**Your memory data is preserved** when switching versions. Only the plugin code changes.
+
+<Note>
+Endless Mode is experimental and slower than standard mode. See [Beta Features](beta-features) for full details and important limitations.
+</Note>
+
+## Worker Service Management
+
+The worker service is managed by Bun as a background process. It auto-starts on the first session and keeps running in the background.
+
+## Folder Context Files
+
+Claude-mem can automatically generate `CLAUDE.md` files in your project folders with activity timelines. This feature is disabled by default.
+
+| Setting | Default | Description |
+|---------|---------|-------------|
+| `CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED` | `false` | Enable auto-generation of folder CLAUDE.md files |
+
+See [Folder Context Files](usage/folder-context) for full documentation on how this feature works, configuration options, and git integration recommendations.
+
+## Context Injection Configuration
+
+Claude-Mem injects past observations into each new session, giving Claude awareness of recent work. You can configure exactly what gets injected using the **Context Settings Modal**.
+
+### Context Settings Modal
+
+Access the settings modal from the web viewer at http://localhost:37777:
+
+1. Click the **gear icon** in the header
+2. Adjust settings in the right panel
+3. See changes reflected live in the **Terminal Preview** on the left
+4. Settings auto-save as you change them
+
+The Terminal Preview shows exactly what will be injected at the start of your next Claude Code session for the selected project.
+
+### Loading Settings
+
+Control how many observations are injected:
+
+| Setting | Default | Range | Description |
+|---------|---------|-------|-------------|
+| **Observations** | 50 | 1-200 | Total number of recent observations to include |
+| **Sessions** | 10 | 1-50 | Number of recent sessions to pull observations from |
+
+**Considerations**:
+- **Higher values** = More context but slower SessionStart and more tokens used
+- **Lower values** = Faster SessionStart but less historical awareness
+- Default of 50 observations from 10 sessions balances context richness with performance
+
+### Filter Settings
+
+Control which observation types and concepts are included:
+
+**Types** (select any combination):
+- `bugfix` - Bug fixes and error resolutions
+- `feature` - New functionality additions
+- `refactor` - Code restructuring
+- `discovery` - Learnings about how code works
+- `decision` - Architectural or design decisions
+- `change` - General code changes
+
+**Concepts** (select any combination):
+- `how-it-works` - System behavior explanations
+- `why-it-exists` - Rationale for code/design
+- `what-changed` - Change summaries
+- `problem-solution` - Problem/solution pairs
+- `gotcha` - Edge cases and pitfalls
+- `pattern` - Recurring patterns
+- `trade-off` - Design trade-offs
+
+Use "All" or "None" buttons to quickly select/deselect all options.
+
+### Display Settings
+
+Control how observations appear in the context:
+
+**Full Observations**:
+| Setting | Default | Options | Description |
+|---------|---------|---------|-------------|
+| **Count** | 5 | 0-20 | How many observations show expanded details |
+| **Field** | narrative | narrative, facts | Which field to expand |
+
+The most recent N observations (set by Count) show their full narrative or facts. Remaining observations show only title, type, and token counts in a compact table format.
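+
+The split between expanded and compact observations can be sketched as below. The `Observation` shape, the `Rendered` type, and `renderObservations` are hypothetical; in particular, representing `facts` as a string array and joining it with `; ` are assumptions. The input is assumed to be sorted most recent first.
+
+```typescript
+// Hypothetical observation shape for illustration only.
+interface Observation {
+  title: string;
+  type: string;
+  narrative: string;
+  facts: string[];
+}
+
+// Compact rows carry no `detail`; expanded rows carry the chosen field.
+interface Rendered {
+  title: string;
+  type: string;
+  detail?: string;
+}
+
+function renderObservations(
+  observations: Observation[],
+  fullCount: number,
+  field: "narrative" | "facts",
+): Rendered[] {
+  return observations.map((obs, index): Rendered => {
+    // Beyond the first `fullCount` entries, only the compact fields are shown.
+    if (index >= fullCount) return { title: obs.title, type: obs.type };
+    const detail = field === "narrative" ? obs.narrative : obs.facts.join("; ");
+    return { title: obs.title, type: obs.type, detail };
+  });
+}
+```
+
+Setting `fullCount` to 0 yields a fully compact listing, matching the lower end of the Count range.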
+
+**Token Economics** (toggles):
+| Setting | Default | Description |
+|---------|---------|-------------|
+| **Read cost** | true | Show tokens to read each observation |
+| **Work investment** | true | Show tokens spent creating the observation |
+| **Savings** | true | Show total tokens saved by reusing context |
+
+Token economics help you understand the value of cached observations vs. re-reading files.
+
+### Advanced Settings
+
+| Setting | Default | Description |
+|---------|---------|-------------|
+| **Model** | sonnet | AI model for generating observations |
+| **Worker Port** | 37777 | Port for background worker service |
+| **MCP search server** | true | Enable Model Context Protocol search tools |
+| **Include last summary** | false | Add previous session's summary to context |
+| **Include last message** | false | Add previous session's final message |
+
+### Manual Configuration
+
+Settings are stored in `~/.claude-mem/settings.json`:
+
+```json
+{
+  "CLAUDE_MEM_CONTEXT_OBSERVATIONS": "100",
+  "CLAUDE_MEM_CONTEXT_SESSION_COUNT": "20",
+  "CLAUDE_MEM_CONTEXT_OBSERVATION_TYPES": "bugfix,decision,discovery",
+  "CLAUDE_MEM_CONTEXT_OBSERVATION_CONCEPTS": "how-it-works,gotcha",
+  "CLAUDE_MEM_CONTEXT_FULL_COUNT": "10",
+  "CLAUDE_MEM_CONTEXT_FULL_FIELD": "narrative",
+  "CLAUDE_MEM_CONTEXT_SHOW_READ_TOKENS": "true",
+  "CLAUDE_MEM_CONTEXT_SHOW_WORK_TOKENS": "true",
+  "CLAUDE_MEM_CONTEXT_SHOW_SAVINGS_AMOUNT": "true",
+  "CLAUDE_MEM_CONTEXT_SHOW_LAST_SUMMARY": "false",
+  "CLAUDE_MEM_CONTEXT_SHOW_LAST_MESSAGE": "false"
+}
+```
+
+**Note**: The Context Settings Modal (at http://localhost:37777) is the recommended way to configure these settings, as it provides live preview of changes.
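+
+Because every value in the example above is stored as a string, a consumer has to convert them before use. The sketch below shows one way to do that for a subset of the keys. The helper names and the returned object shape are hypothetical, and the fallback values are taken from the defaults listed in the tables above.
+
+```typescript
+type RawSettings = Record<string, string | undefined>;
+
+// Parses an integer setting, falling back when the value is missing or malformed.
+function toInt(value: string | undefined, fallback: number): number {
+  const parsed = Number.parseInt(value ?? "", 10);
+  return Number.isNaN(parsed) ? fallback : parsed;
+}
+
+// Accepts only the literal strings "true" and "false".
+function toBool(value: string | undefined, fallback: boolean): boolean {
+  if (value === "true") return true;
+  if (value === "false") return false;
+  return fallback;
+}
+
+// Splits a comma-separated setting into trimmed, non-empty entries.
+function toList(value: string | undefined): string[] {
+  return (value ?? "").split(",").map((item) => item.trim()).filter((item) => item !== "");
+}
+
+function parseContextSettings(raw: RawSettings) {
+  return {
+    observations: toInt(raw["CLAUDE_MEM_CONTEXT_OBSERVATIONS"], 50),
+    sessionCount: toInt(raw["CLAUDE_MEM_CONTEXT_SESSION_COUNT"], 10),
+    fullCount: toInt(raw["CLAUDE_MEM_CONTEXT_FULL_COUNT"], 5),
+    types: toList(raw["CLAUDE_MEM_CONTEXT_OBSERVATION_TYPES"]),
+    showLastSummary: toBool(raw["CLAUDE_MEM_CONTEXT_SHOW_LAST_SUMMARY"], false),
+  };
+}
+```
+
+How the real loader treats a missing type filter is not stated in the source; this sketch simply returns an empty list in that case.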
+
+## Customization
+
+Settings can be customized in `~/.claude-mem/settings.json`.
+
+### Custom Data Directory
+
+Edit `~/.claude-mem/settings.json`:
+```json
+{
+  "CLAUDE_MEM_DATA_DIR": "/custom/path"
+}
+```
+
+### Custom Worker Port
+
+Edit `~/.claude-mem/settings.json`:
+```json
+{
+  "CLAUDE_MEM_WORKER_PORT": "38000"
+}
+```
+
+Then restart the worker:
+```bash
+npm run worker:restart
+```
+
+### Custom Model
+
+Edit `~/.claude-mem/settings.json`:
+```json
+{
+  "CLAUDE_MEM_MODEL": "opus"
+}
+```
+
+Then restart the worker:
+```bash
+export CLAUDE_MEM_MODEL=opus
+npm run worker:restart
+```
+
+### Custom Skip Tools
+
+Control which tools are excluded from observations. Edit `~/.claude-mem/settings.json`:
+```json
+{
+  "CLAUDE_MEM_SKIP_TOOLS": "ListMcpResourcesTool,SlashCommand,Skill"
+}
+```
+
+**Default excluded tools:**
+- `ListMcpResourcesTool`
+- `SlashCommand`
+- `Skill`
+- `TodoWrite`
+- `AskUserQuestion`
+
+**Common customizations:**
+- Include TodoWrite: Remove from skip list to track task planning
+- Include AskUserQuestion: Remove to capture decision-making conversations
+- Skip additional tools: Add tool names to reduce observation noise
+
+Changes take effect on the next tool execution (no worker restart needed).
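+
+The comma-separated format can be interpreted as in the sketch below. `parseSkipTools` and `shouldSkipTool` are hypothetical names. Exact, case-sensitive matching and whitespace trimming are assumptions about how the list is read.
+
+```typescript
+// Turns the comma-separated setting into a set of tool names.
+function parseSkipTools(setting: string): Set<string> {
+  return new Set(
+    setting
+      .split(",")
+      .map((name) => name.trim())
+      .filter((name) => name.length > 0),
+  );
+}
+
+// Assumption: a tool is skipped only on an exact, case-sensitive name match.
+function shouldSkipTool(toolName: string, setting: string): boolean {
+  return parseSkipTools(setting).has(toolName);
+}
+```
+
+With the shortened list from the example above, `TodoWrite` is no longer skipped and would therefore be captured as an observation.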
+
+## Advanced Configuration
+
+### Hook Timeouts
+
+Modify the `timeout` value (in seconds) of each hook in `plugin/hooks/hooks.json`. The default is 120 seconds:
+
+```json
+{
+  "timeout": 120
+}
+```
+
+Recommended values:
+- SessionStart: 120s (needs time for smart install check and context retrieval)
+- UserPromptSubmit: 60s
+- PostToolUse: 120s (can process many observations)
+- Stop: 60s
+- SessionEnd: 60s
+
+**Note**: With smart install caching (v5.0.3+), SessionStart typically completes in about 10 ms unless dependencies need installation.
+
+### Worker Memory Limit
+
+The worker service is managed by Bun and will automatically restart if it encounters issues. Memory usage is typically low (~100-200MB).
+
+### Logging Verbosity
+
+Enable debug logging:
+
+```bash
+export DEBUG=claude-mem:*
+npm run worker:restart
+npm run worker:logs
+```
+
+## Configuration Best Practices
+
+1. **Use defaults**: Default configuration works for most use cases
+2. **Override selectively**: Only change what you need
+3. **Document changes**: Keep track of custom configurations
+4. **Test after changes**: Verify worker restarts successfully
+5. **Monitor logs**: Check worker logs after configuration changes
+
+## Troubleshooting Configuration
+
+### Configuration Not Applied
+
+1. Restart worker after changes:
+   ```bash
+   npm run worker:restart
+   ```
+
+2. Verify environment variables:
+   ```bash
+   echo $CLAUDE_MEM_MODEL
+   echo $CLAUDE_MEM_WORKER_PORT
+   ```
+
+3. Check worker logs:
+   ```bash
+   npm run worker:logs
+   ```
+
+### Invalid Model Name
+
+If you specify an invalid model name, the worker will fall back to `sonnet` and log a warning.
+
+Valid shorthand models (forward to latest version):
+- haiku
+- sonnet
+- opus
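+
+The fallback behavior described above can be sketched as follows. `resolveModel` and `ResolvedModel` are hypothetical names, and the warning text is illustrative. The sketch covers only the three shorthand names and returns the warning instead of logging it, which keeps the function free of side effects.
+
+```typescript
+const VALID_MODELS = ["haiku", "sonnet", "opus"] as const;
+type ModelName = (typeof VALID_MODELS)[number];
+const DEFAULT_MODEL: ModelName = "sonnet";
+
+interface ResolvedModel {
+  model: ModelName;
+  warning?: string; // present only when a fallback happened
+}
+
+function resolveModel(requested: string | undefined): ResolvedModel {
+  const match = VALID_MODELS.find((name) => name === requested);
+  if (match !== undefined) return { model: match };
+  return {
+    model: DEFAULT_MODEL,
+    warning: `Invalid model "${String(requested)}", falling back to "${DEFAULT_MODEL}"`,
+  };
+}
+```
+
+A caller would write `warning` to the worker log when it is present.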
+
+### Port Already in Use
+
+If port 37777 is already in use:
+
+1. Set custom port:
+   ```bash
+   export CLAUDE_MEM_WORKER_PORT=38000
+   ```
+
+2. Restart worker:
+   ```bash
+   npm run worker:restart
+   ```
+
+3. Verify new port:
+   ```bash
+   cat ~/.claude-mem/worker.port
+   ```
+
+## Next Steps
+
+- [Architecture Overview](architecture/overview) - Understand the system
+- [Troubleshooting](troubleshooting) - Common issues
+- [Development](development) - Building from source
--- a/.agent/services/claude-mem/docs/public/context-engineering.mdx
+++ b/.agent/services/claude-mem/docs/public/context-engineering.mdx
@@ -0,0 +1,227 @@
+---
+title: "Context Engineering"
+description: "Best practices for curating optimal token sets for AI agents"
+---
+
+# Context Engineering for AI Agents
+
+## Core Principle
+**Find the smallest possible set of high-signal tokens that maximize the likelihood of your desired outcome.**
+
+---
+
+## Context Engineering vs Prompt Engineering
+
+**Prompt Engineering**: Writing and organizing LLM instructions for optimal outcomes (one-time task)
+
+**Context Engineering**: Curating and maintaining the optimal set of tokens during inference across multiple turns (iterative process)
+
+Context engineering manages:
+- System instructions
+- Tools
+- Model Context Protocol (MCP)
+- External data
+- Message history
+- Runtime data retrieval
+
+---
+
+## The Problem: Context Rot
+
+**Key Insight**: LLMs have an "attention budget" that gets depleted as context grows
+
+- Every token attends to every other token (n² relationships)
+- As context length increases, model accuracy decreases
+- Models have less training experience with longer sequences
+- Context must be treated as a finite resource with diminishing marginal returns
+
+---
+
+## System Prompts: Find the "Right Altitude"
+
+### The Goldilocks Zone
+
+**Too Prescriptive** ❌
+- Hardcoded if-else logic
+- Brittle and fragile
+- High maintenance complexity
+
+**Too Vague** ❌
+- High-level guidance without concrete signals
+- Falsely assumes shared context
+- Lacks actionable direction
+
+**Just Right** ✅
+- Specific enough to guide behavior effectively
+- Flexible enough to provide strong heuristics
+- Minimal set of information that fully outlines expected behavior
+
+### Best Practices
+- Use simple, direct language
+- Organize into distinct sections (`<background_information>`, `<instructions>`, `## Tool guidance`, etc.)
+- Use XML tags or Markdown headers for structure
+- Start with minimal prompt, add based on failure modes
+- Note: Minimal ≠ short (provide sufficient information upfront)
+
+---
+
+## Tools: Minimal and Clear
+
+### Design Principles
+- **Self-contained**: Each tool has a single, clear purpose
+- **Robust to error**: Handle edge cases gracefully
+- **Extremely clear**: Intended use is unambiguous
+- **Token-efficient**: Returns relevant information without bloat
+- **Descriptive parameters**: Unambiguous input names (e.g., `user_id` not `user`)
+
+### Critical Rule
+**If a human engineer can't definitively say which tool to use in a given situation, an AI agent can't be expected to do better.**
+
+### Common Failure Modes to Avoid
+- Bloated tool sets covering too much functionality
+- Tools with overlapping purposes
+- Ambiguous decision points about which tool to use
+
+---
+
+## Examples: Diverse, Not Exhaustive
+
+**Do** ✅
+- Curate a set of diverse, canonical examples
+- Show expected behavior effectively
+- Think "pictures worth a thousand words"
+
+**Don't** ❌
+- Stuff in a laundry list of edge cases
+- Try to articulate every possible rule
+- Overwhelm with exhaustive scenarios
+
+---
+
+## Context Retrieval Strategies
+
+### Just-In-Time Context (Recommended for Agents)
+**Approach**: Maintain lightweight identifiers (file paths, queries, links) and dynamically load data at runtime
+
+**Benefits**:
+- Avoids context pollution
+- Enables progressive disclosure
+- Mirrors human cognition (we don't memorize everything)
+- Leverages metadata (file names, folder structure, timestamps)
+- Agents discover context incrementally
+
+**Trade-offs**:
+- Slower than pre-computed retrieval
+- Requires proper tool guidance to avoid dead-ends
+
+### Pre-Inference Retrieval (Traditional RAG)
+**Approach**: Use embedding-based retrieval to surface context before inference
+
+**When to Use**: Static content that won't change during interaction
+
+### Hybrid Strategy (Best of Both)
+**Approach**: Retrieve some data upfront, enable autonomous exploration as needed
+
+**Example**: Claude Code loads CLAUDE.md files upfront, uses glob/grep for just-in-time retrieval
+
+**Rule of Thumb**: "Do the simplest thing that works"
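+
+The just-in-time approach can be sketched roughly as follows. This is a minimal sketch under our own assumptions: `ContextRef` and `JustInTimeContext` are hypothetical names, not part of claude-mem or Claude Code, and the loader is a plain function standing in for a file read or query.
+
+```typescript
+// Hypothetical types for illustration only.
+interface ContextRef {
+  id: string;          // lightweight identifier, e.g. a file path or query
+  load: () => string;  // runs only when the agent asks for this item
+}
+
+class JustInTimeContext {
+  private cache: { [id: string]: string } = {};
+
+  constructor(private refs: ContextRef[]) {}
+
+  // Cheap to keep in the prompt: identifiers only, no content.
+  listIds(): string[] {
+    return this.refs.map(ref => ref.id);
+  }
+
+  // Load content on first request and reuse it afterwards.
+  fetch(id: string): string | undefined {
+    if (!Object.prototype.hasOwnProperty.call(this.cache, id)) {
+      const ref = this.refs.filter(r => r.id === id)[0];
+      if (!ref) return undefined;
+      this.cache[id] = ref.load();
+    }
+    return this.cache[id];
+  }
+}
+```
+
+Only the identifiers from `listIds()` need to sit in the context window; content enters it when the agent decides to call `fetch`.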
+
+---
+
+## Long-Horizon Tasks: Three Techniques
+
+### 1. Compaction
+**What**: Summarize conversation nearing context limit, reinitiate with summary
+
+**Implementation**:
+- Pass message history to model for compression
+- Preserve critical details (architectural decisions, bugs, implementation)
+- Discard redundant outputs
+- Continue with compressed context + recently accessed files
+
+**Tuning Process**:
+1. **First**: Maximize recall (capture all relevant information)
+2. **Then**: Improve precision (eliminate superfluous content)
+
+**Low-Hanging Fruit**: Clear old tool calls and results
+
+**Best For**: Tasks requiring extensive back-and-forth
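+
+The "low-hanging fruit" step, clearing old tool results, might look like the sketch below. The `Message` shape and the placeholder text are assumptions made for illustration; a real agent loop would use its own message format and would usually follow this with a model-written summary.
+
+```typescript
+type Role = 'user' | 'assistant' | 'tool';
+
+interface Message {
+  role: Role;
+  content: string;
+}
+
+// Replace the content of all but the most recent `keepRecent` tool results
+// with a short placeholder, leaving user and assistant turns untouched.
+// Returns a new array; the input history is not modified.
+function clearOldToolResults(history: Message[], keepRecent: number): Message[] {
+  const toolIndexes: number[] = [];
+  history.forEach((message, index) => {
+    if (message.role === 'tool') toolIndexes.push(index);
+  });
+  const staleCount = Math.max(0, toolIndexes.length - keepRecent);
+  const stale = toolIndexes.slice(0, staleCount);
+  return history.map((message, index) =>
+    stale.indexOf(index) !== -1
+      ? { role: message.role, content: '[tool result cleared]' }
+      : message
+  );
+}
+```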
+
+### 2. Structured Note-Taking (Agentic Memory)
+**What**: Agent writes notes persisted outside context window, retrieved later
+
+**Examples**:
+- To-do lists
+- NOTES.md files
+- Game state tracking (Pokémon example: tracking 1,234 steps of training)
+- Project progress logs
+
+**Benefits**:
+- Persistent memory with minimal overhead
+- Maintains critical context across tool calls
+- Enables multi-hour coherent strategies
+
+**Best For**: Iterative development with clear milestones
+
+### 3. Sub-Agent Architectures
+**What**: Specialized sub-agents handle focused tasks with clean context windows
+
+**How It Works**:
+- Main agent coordinates high-level plan
+- Sub-agents perform deep technical work
+- Sub-agents explore extensively (tens of thousands of tokens)
+- Return condensed summaries (1,000-2,000 tokens)
+
+**Benefits**:
+- Clear separation of concerns
+- Parallel exploration
+- Detailed context remains isolated
+
+**Best For**: Complex research and analysis tasks
+
+---
+
+## Quick Decision Framework
+
+| Scenario | Recommended Approach |
+|----------|---------------------|
+| Static content | Pre-inference retrieval or hybrid |
+| Dynamic exploration needed | Just-in-time context |
+| Extended back-and-forth | Compaction |
+| Iterative development | Structured note-taking |
+| Complex research | Sub-agent architectures |
+| Rapid model improvement | "Do the simplest thing that works" |
+
+---
+
+## Key Takeaways
+
+1. **Context is finite**: Treat it as a precious resource with an attention budget
+2. **Think holistically**: Consider the entire state available to the LLM
+3. **Stay minimal**: More context isn't always better
+4. **Be iterative**: Context curation happens each time you pass to the model
+5. **Design for autonomy**: As models improve, let them act intelligently
+6. **Start simple**: Test with minimal setup, add based on failure modes
+
+---
+
+## Anti-Patterns to Avoid
+
+- ❌ Cramming everything into prompts
+- ❌ Creating brittle if-else logic
+- ❌ Building bloated tool sets
+- ❌ Stuffing exhaustive edge cases as examples
+- ❌ Assuming larger context windows solve everything
+- ❌ Ignoring context pollution over long interactions
+
+---
+
+## Remember
+
+> "Even as models continue to improve, the challenge of maintaining coherence across extended interactions will remain central to building more effective agents."
+
+Context engineering will evolve, but the core principle stays the same: **optimize signal-to-noise ratio in your token budget**.
+
+---
+
+*Based on Anthropic's "Effective context engineering for AI agents" (September 2025)*
--- a/.agent/services/claude-mem/docs/public/cursor/gemini-setup.mdx
+++ b/.agent/services/claude-mem/docs/public/cursor/gemini-setup.mdx
@@ -0,0 +1,191 @@
+---
+title: "Cursor + Gemini Setup"
+description: "Use Claude-Mem in Cursor with Google's free Gemini API"
+---
+
+# Cursor + Gemini Setup
+
+This guide walks you through setting up Claude-Mem in Cursor using Google's Gemini API. Gemini offers a generous free tier that handles typical individual usage.
+
+<Info>
+**Free Tier:** 1,500 requests per day with `gemini-2.5-flash-lite`. No credit card required.
+</Info>
+
+## Step 1: Get a Gemini API Key
+
+1. Go to [Google AI Studio](https://aistudio.google.com/apikey)
+2. Sign in with your Google account
+3. Accept the Terms of Service
+4. Click **Create API key**
+5. Choose or create a Google Cloud project
+6. Copy your API key - you'll need it in Step 3
+
+<Tip>
+**Higher rate limits:** Enable billing on your Google Cloud project to unlock 4,000 RPM (vs 10 RPM without billing). You won't be charged unless you exceed the free quota.
+</Tip>
+
+## Step 2: Clone and Build Claude-Mem
+
+```bash
+# Clone the repository
+git clone https://github.com/thedotmack/claude-mem.git
+cd claude-mem
+
+# Install dependencies
+bun install
+
+# Build the project
+bun run build
+```
+
+## Step 3: Configure Gemini Provider
+
+### Option A: Interactive Setup (Recommended)
+
+Run the setup wizard, which guides you through everything:
+
+```bash
+bun run cursor:setup
+```
+
+The wizard will:
+1. Detect you don't have Claude Code
+2. Ask you to choose Gemini as your provider
+3. Prompt for your API key
+4. Install hooks automatically
+5. Start the worker
+
+### Option B: Manual Configuration
+
+Create the settings file manually:
+
+```bash
+# Create settings directory
+mkdir -p ~/.claude-mem
+
+# Create settings file with Gemini configuration
+cat > ~/.claude-mem/settings.json << 'EOF'
+{
+  "CLAUDE_MEM_PROVIDER": "gemini",
+  "CLAUDE_MEM_GEMINI_API_KEY": "YOUR_GEMINI_API_KEY"
+}
+EOF
+```
+
+Replace `YOUR_GEMINI_API_KEY` with your actual API key.
+
+Then install hooks and start the worker:
+
+```bash
+bun run cursor:install
+bun run worker:start
+```
+
+## Step 4: Restart Cursor
+
+Close and reopen Cursor IDE for the hooks to take effect.
+
+## Step 5: Verify Installation
+
+```bash
+# Check worker is running
+bun run worker:status
+
+# Check hooks are installed
+bun run cursor:status
+```
+
+Open http://localhost:37777 to see the memory viewer.
+
+## Available Gemini Models
+
+| Model | Free Tier RPM | Notes |
+|-------|---------------|-------|
+| `gemini-2.5-flash-lite` | 10 (4,000 with billing) | **Default.** Fastest, highest free tier RPM |
+| `gemini-2.5-flash` | 5 (1,000 with billing) | Higher capability |
+| `gemini-3-flash-preview` | 5 (1,000 with billing) | Latest model |
+
+To change the model, update your settings:
+
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "gemini",
+  "CLAUDE_MEM_GEMINI_API_KEY": "your-key",
+  "CLAUDE_MEM_GEMINI_MODEL": "gemini-2.5-flash"
+}
+```
+
+## Rate Limiting
+
+Claude-mem automatically handles rate limiting for free tier usage:
+
+- Requests are spaced to stay within limits
+- Processing may be slightly slower but stays within quota
+- No errors or lost observations
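+
+Claude-mem's actual scheduler is not shown in this guide. As a minimal sketch of what "spacing requests" can mean, assuming a simple fixed-interval approach and the free-tier RPM values from the table above, the helpers below (hypothetical names) compute the delay between calls:
+
+```typescript
+// Minimum delay between requests so that `rpm` requests per minute is never exceeded.
+function minIntervalMs(rpm: number): number {
+  if (rpm <= 0) throw new Error('rpm must be positive');
+  return Math.ceil(60000 / rpm);
+}
+
+// Given the time of the previous request, return how long to wait before the next one.
+function waitBeforeNextMs(lastRequestAtMs: number, nowMs: number, rpm: number): number {
+  const earliestNextMs = lastRequestAtMs + minIntervalMs(rpm);
+  return Math.max(0, earliestNextMs - nowMs);
+}
+
+console.log(minIntervalMs(10)); // 6000 (one request every 6 seconds at 10 RPM)
+console.log(minIntervalMs(5));  // 12000
+```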
+
+**To remove rate limiting:** Enable billing on your Google Cloud project, then add to settings:
+
+```json
+{
+  "CLAUDE_MEM_GEMINI_BILLING_ENABLED": true
+}
+```
+
+You'll still use the free quota but with much higher rate limits.
+
+## Troubleshooting
+
+### "Gemini API key not configured"
+
+Ensure your settings file exists and has the correct format:
+
+```bash
+cat ~/.claude-mem/settings.json
+```
+
+Should output something like:
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "gemini",
+  "CLAUDE_MEM_GEMINI_API_KEY": "AIza..."
+}
+```
+
+### Rate limit errors (HTTP 429)
+
+You're exceeding the free tier limits. Options:
+1. Wait a few minutes for the rate limit to reset
+2. Enable billing on Google Cloud to unlock higher limits
+3. Switch to OpenRouter for higher volume needs
+
+### API key invalid
+
+1. Verify your key at [Google AI Studio](https://aistudio.google.com/apikey)
+2. Ensure there are no extra spaces or newlines in your settings.json
+3. Try generating a new API key
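+
+A quick way to rule out formatting problems in the key itself is a check along these lines. `findKeyProblems` is a hypothetical helper written for this guide, not part of claude-mem; it only inspects the string and never contacts the API.
+
+```typescript
+// Hypothetical helper: reports common formatting problems in an API key string.
+function findKeyProblems(key: string): string[] {
+  const problems: string[] = [];
+  if (key.length === 0) problems.push('key is empty');
+  if (key !== key.trim()) problems.push('leading or trailing whitespace');
+  if (/[\r\n]/.test(key)) problems.push('contains a newline');
+  if (key === 'YOUR_GEMINI_API_KEY') problems.push('placeholder was not replaced');
+  return problems;
+}
+```
+
+An empty result means the key is at least well-formed; whether it is valid still has to be confirmed in Google AI Studio.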
+
+### Worker not processing observations
+
+Check the worker logs:
+
+```bash
+bun run worker:logs
+```
+
+Look for error messages related to Gemini API calls.
+
+## Switching Providers Later
+
+You can switch between Gemini, OpenRouter, and Claude SDK at any time by updating your settings. No restart required - changes take effect on the next observation.
+
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "openrouter"
+}
+```
+
+## Next Steps
+
+- [Cursor Integration Overview](/cursor/index) - All Cursor features
+- [OpenRouter Setup](/cursor/openrouter-setup) - Alternative provider with 100+ models
+- [Configuration Reference](../configuration) - All settings options
--- a/.agent/services/claude-mem/docs/public/cursor/index.mdx
+++ b/.agent/services/claude-mem/docs/public/cursor/index.mdx
@@ -0,0 +1,180 @@
+---
+title: "Cursor Integration"
+description: "Persistent AI memory for Cursor IDE - free tier options available"
+---
+
+# Cursor Integration
+
+> **Your AI stops forgetting. Give Cursor persistent memory.**
+
+Every Cursor session starts fresh - your AI doesn't remember what it worked on yesterday. Claude-mem changes that. Your agent builds cumulative knowledge about your codebase, decisions, and patterns over time.
+
+<CardGroup cols={2}>
+  <Card title="Free to Start" icon="dollar-sign">
+    Works with Gemini's free tier (1,500 req/day) - no subscription required
+  </Card>
+  <Card title="Automatic Capture" icon="bolt">
+    MCP tools, shell commands, and file edits logged without effort
+  </Card>
+  <Card title="Smart Context" icon="brain">
+    Relevant history injected into every chat session
+  </Card>
+  <Card title="Works Everywhere" icon="check">
+    With or without Claude Code subscription
+  </Card>
+</CardGroup>
+
+<Info>
+**No Claude Code subscription required.** Use Gemini (free tier) or OpenRouter as your AI provider.
+</Info>
+
+## How It Works
+
+Claude-mem integrates with Cursor through native hooks:
+
+1. **Session hooks** capture tool usage, file edits, and shell commands
+2. **AI extraction** compresses observations into semantic summaries
+3. **Context injection** loads relevant history into each new session
+4. **Memory viewer** at http://localhost:37777 shows your knowledge base
+
+## Installation Paths
+
+Choose the installation method that fits your setup:
+
+### Path A: Cursor-Only Users (No Claude Code)
+
+If you're using Cursor without a Claude Code subscription:
+
+```bash
+# Clone and build
+git clone https://github.com/thedotmack/claude-mem.git
+cd claude-mem && bun install && bun run build
+
+# Run interactive setup wizard
+bun run cursor:setup
+```
+
+The setup wizard will:
+- Detect you don't have Claude Code
+- Help you choose and configure a free AI provider (Gemini recommended)
+- Install hooks automatically
+- Start the worker service
+
+**Detailed guides:**
+- [Gemini Setup](/cursor/gemini-setup) - Recommended free option (1,500 req/day)
+- [OpenRouter Setup](/cursor/openrouter-setup) - 100+ models including free options
+
+### Path B: Claude Code Users
+
+If you have Claude Code installed:
+
+```bash
+# Install the plugin (if not already)
+/plugin marketplace add thedotmack/claude-mem
+/plugin install claude-mem
+
+# Install Cursor hooks
+claude-mem cursor install
+```
+
+The plugin uses Claude's SDK by default but you can switch to Gemini or OpenRouter anytime.
+
+## Prerequisites
+
+<AccordionGroup>
+  <Accordion title="macOS">
+    - [Bun](https://bun.sh): `curl -fsSL https://bun.sh/install | bash`
+    - Cursor IDE
+    - jq and curl: `brew install jq curl`
+  </Accordion>
+  <Accordion title="Linux">
+    - [Bun](https://bun.sh): `curl -fsSL https://bun.sh/install | bash`
+    - Cursor IDE
+    - jq and curl: `apt install jq curl` or `dnf install jq curl`
+  </Accordion>
+  <Accordion title="Windows">
+    - [Bun](https://bun.sh): `powershell -c "irm bun.sh/install.ps1 | iex"`
+    - Cursor IDE
+    - PowerShell 5.1+ (included in Windows 10/11)
+    - Git for Windows
+  </Accordion>
+</AccordionGroup>
+
+## Quick Commands Reference
+
+After installation, these commands are available from the claude-mem directory:
+
+| Command | Description |
+|---------|-------------|
+| `bun run cursor:setup` | Interactive setup wizard |
+| `bun run cursor:install` | Install Cursor hooks |
+| `bun run cursor:uninstall` | Remove Cursor hooks |
+| `bun run cursor:status` | Check hook installation status |
+| `bun run worker:start` | Start the worker service |
+| `bun run worker:stop` | Stop the worker service |
+| `bun run worker:status` | Check worker status |
+
+## Verifying Installation
+
+After setup, verify everything is working:
+
+1. **Check worker status:**
+   ```bash
+   bun run worker:status
+   ```
+
+2. **Check hook installation:**
+   ```bash
+   bun run cursor:status
+   ```
+
+3. **Open the memory viewer:**
+   Open http://localhost:37777 in your browser
+
+4. **Restart Cursor** and start a coding session - you should see context being captured
+
+## Provider Comparison
+
+| Provider | Cost | Rate Limit | Best For |
+|----------|------|------------|----------|
+| Gemini | Free tier | 1,500 req/day | Individual use, getting started |
+| OpenRouter | Pay-per-use + free models | Varies by model | Model variety, high volume |
+| Claude SDK | Included with Claude Code | Unlimited | Claude Code subscribers |
+
+<Tip>
+**Recommendation:** Start with Gemini's free tier. It handles typical individual usage well. Switch to OpenRouter or Claude SDK if you need higher limits.
+</Tip>
+
+## Troubleshooting
+
+### Worker not starting
+
+```bash
+# Check if port is in use
+lsof -i :37777
+
+# Force restart
+bun run worker:stop && bun run worker:start
+
+# Check logs
+bun run worker:logs
+```
+
+### Hooks not firing
+
+1. Restart Cursor IDE after installation
+2. Check hooks are installed: `bun run cursor:status`
+3. Verify hooks.json exists in `.cursor/` directory
+
+### No context appearing
+
+1. Ensure worker is running: `bun run worker:status`
+2. Check that you have observations: visit http://localhost:37777
+3. Verify your API key is configured correctly
+
+## Next Steps
+
+- [Gemini Setup Guide](/cursor/gemini-setup) - Detailed free tier setup
+- [OpenRouter Setup Guide](/cursor/openrouter-setup) - Configure OpenRouter
+- [Configuration Reference](../configuration) - All settings options
+- [Troubleshooting](../troubleshooting) - Common issues and solutions
--- a/.agent/services/claude-mem/docs/public/cursor/openrouter-setup.mdx
+++ b/.agent/services/claude-mem/docs/public/cursor/openrouter-setup.mdx
@@ -0,0 +1,191 @@
+---
+title: "Cursor + OpenRouter Setup"
+description: "Use Claude-Mem in Cursor with OpenRouter's 100+ AI models"
+---
+
+# Cursor + OpenRouter Setup
+
+This guide walks you through setting up Claude-Mem in Cursor using OpenRouter. OpenRouter provides access to 100+ AI models from various providers, including several free options.
+
+<Info>
+**Model variety:** Access Claude, GPT-4, Gemini, Llama, Mistral, and many more through a single API key.
+</Info>
+
+## Step 1: Get an OpenRouter API Key
+
+1. Go to [OpenRouter](https://openrouter.ai)
+2. Sign up or sign in
+3. Navigate to [API Keys](https://openrouter.ai/keys)
+4. Click **Create Key**
+5. Copy your API key - you'll need it in Step 3
+
+<Tip>
+**Free models available:** OpenRouter offers free versions of several models including Gemini Flash and others. Check the [model list](https://openrouter.ai/models?show_free=true) for current free options.
+</Tip>
+
+## Step 2: Clone and Build Claude-Mem
+
+```bash
+# Clone the repository
+git clone https://github.com/thedotmack/claude-mem.git
+cd claude-mem
+
+# Install dependencies
+bun install
+
+# Build the project
+bun run build
+```
+
+## Step 3: Configure OpenRouter Provider
+
+### Option A: Interactive Setup (Recommended)
+
+Run the setup wizard, which guides you through everything:
+
+```bash
+bun run cursor:setup
+```
+
+When prompted for provider, select **OpenRouter**.
+
+### Option B: Manual Configuration
+
+Create the settings file manually:
+
+```bash
+# Create settings directory
+mkdir -p ~/.claude-mem
+
+# Create settings file with OpenRouter configuration
+cat > ~/.claude-mem/settings.json << 'EOF'
+{
+  "CLAUDE_MEM_PROVIDER": "openrouter",
+  "CLAUDE_MEM_OPENROUTER_API_KEY": "YOUR_OPENROUTER_API_KEY"
+}
+EOF
+```
+
+Replace `YOUR_OPENROUTER_API_KEY` with your actual API key.
+
+Then install hooks and start the worker:
+
+```bash
+bun run cursor:install
+bun run worker:start
+```
+
+## Step 4: Restart Cursor
+
+Close and reopen Cursor IDE for the hooks to take effect.
+
+## Step 5: Verify Installation
+
+```bash
+# Check worker is running
+bun run worker:status
+
+# Check hooks are installed
+bun run cursor:status
+```
+
+Open http://localhost:37777 to see the memory viewer.
+
+## Recommended Models
+
+### Free Models
+
+| Model | Provider | Notes |
+|-------|----------|-------|
+| `google/gemini-2.0-flash-exp:free` | Google | Fast, capable |
+| `xiaomi/mimo-v2-flash:free` | Xiaomi | Good general purpose |
+
+### Paid Models (Low Cost)
+
+| Model | Approx. Cost | Notes |
+|-------|--------------|-------|
+| `anthropic/claude-3-haiku` | ~$0.25/1M tokens | Fast, efficient |
+| `google/gemini-flash-1.5` | ~$0.075/1M tokens | Great value |
+| `mistralai/mistral-7b-instruct` | ~$0.07/1M tokens | Budget option |
+
+To specify a model, add to your settings:
+
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "openrouter",
+  "CLAUDE_MEM_OPENROUTER_API_KEY": "your-key",
+  "CLAUDE_MEM_OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
+}
+```
+
+## Cost Management
+
+OpenRouter charges per token. To manage costs:
+
+1. **Use free models:** Several high-quality free models are available
+2. **Monitor usage:** Check your [OpenRouter dashboard](https://openrouter.ai/activity)
+3. **Set spending limits:** Configure limits in OpenRouter settings
+
+<Warning>
+**Cost awareness:** Unlike Gemini's free tier, OpenRouter paid models charge per token. Monitor your usage if using paid models.
+</Warning>
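+
+For a back-of-the-envelope estimate, per-token pricing reduces to simple arithmetic. The sketch below uses the approximate per-1M-token price listed above for `anthropic/claude-3-haiku`; it assumes a single blended price, whereas actual OpenRouter pricing may differ between input and output tokens and changes over time.
+
+```typescript
+// Approximate cost in USD for a number of tokens at a given price per 1M tokens.
+function estimateCostUsd(tokens: number, pricePerMillionUsd: number): number {
+  return (tokens / 1000000) * pricePerMillionUsd;
+}
+
+// 2M tokens on a model priced at about $0.25 per 1M tokens
+console.log(estimateCostUsd(2000000, 0.25)); // 0.5
+```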
+
+## Troubleshooting
+
+### "OpenRouter API key not configured"
+
+Ensure your settings file exists with the correct format:
+
+```bash
+cat ~/.claude-mem/settings.json
+```
+
+Should output something like:
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "openrouter",
+  "CLAUDE_MEM_OPENROUTER_API_KEY": "sk-or-..."
+}
+```
+
+### Model not found
+
+1. Check the model ID is correct at [OpenRouter Models](https://openrouter.ai/models)
+2. Some models may require payment - check if you have credits
+3. Free models have `:free` suffix in their ID
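+
+If you want to guard against accidentally configuring a paid model, the suffix rule can be checked with a one-line helper. `isFreeModelId` is a hypothetical name used for illustration; it only looks at the ID string and does not query OpenRouter.
+
+```typescript
+// True when an OpenRouter model ID carries the `:free` suffix.
+function isFreeModelId(modelId: string): boolean {
+  const suffix = ':free';
+  return modelId.length >= suffix.length &&
+    modelId.slice(-suffix.length) === suffix;
+}
+```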
+
+### Rate limits
+
+OpenRouter rate limits vary by model and your account tier. If you hit limits:
+1. Wait briefly and retry
+2. Consider upgrading your OpenRouter account tier
+3. Switch to a less popular model
+
+### API errors
+
+Check the worker logs for details:
+
+```bash
+bun run worker:logs
+```
+
+Common issues:
+- Invalid API key (regenerate at OpenRouter)
+- Insufficient credits for paid models
+- Model temporarily unavailable
+
+## Switching Providers Later
+
+You can switch between OpenRouter, Gemini, and Claude SDK at any time by updating your settings. No restart required - changes take effect on the next observation.
+
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "gemini"
+}
+```
+
+## Next Steps
+
+- [Cursor Integration Overview](/cursor/index) - All Cursor features
+- [Gemini Setup](/cursor/gemini-setup) - Alternative free provider
+- [Configuration Reference](../configuration) - All settings options
--- a/.agent/services/claude-mem/docs/public/development.mdx
+++ b/.agent/services/claude-mem/docs/public/development.mdx
@@ -0,0 +1,775 @@
+---
+title: "Development"
+description: "Build from source, run tests, and contribute to Claude-Mem"
+---
+
+# Development Guide
+
+## Building from Source
+
+### Prerequisites
+
+- Node.js 18.0.0 or higher
+- npm (comes with Node.js)
+- Git
+
+### Clone and Build
+
+```bash
+# Clone repository
+git clone https://github.com/thedotmack/claude-mem.git
+cd claude-mem
+
+# Install dependencies
+npm install
+
+# Build all components
+npm run build
+```
+
+### Build Process
+
+The build process uses esbuild to compile TypeScript:
+
+1. Compiles TypeScript to JavaScript
+2. Creates standalone executables for each hook in `plugin/scripts/`
+3. Bundles MCP search server to `plugin/scripts/mcp-server.cjs`
+4. Bundles worker service to `plugin/scripts/worker-service.cjs`
+5. Bundles web viewer UI to `plugin/ui/viewer.html`
+
+**Build Output**:
+- Hook executables: `*-hook.js` (ESM format)
+- Smart installer: `smart-install.js` (ESM format)
+- Worker service: `worker-service.cjs` (CJS format)
+- MCP server: `mcp-server.cjs` (CJS format)
+- Viewer UI: `viewer.html` (self-contained HTML bundle)
+
+### Build Scripts
+
+```bash
+# Build everything
+npm run build
+
+# Build only hooks
+npm run build:hooks
+
+# The build script is defined in scripts/build-hooks.js
+```
+
+## Development Workflow
+
+### 1. Make Changes
+
+Edit TypeScript source files in `src/`:
+
+```
+src/
+├── hooks/           # Hook implementations (entry points + logic)
+├── services/        # Worker service and database
+├── servers/         # MCP search server
+├── sdk/             # Claude Agent SDK integration
+├── shared/          # Shared utilities
+├── ui/
+│   └── viewer/      # React web viewer UI components
+└── utils/           # General utilities
+```
+
+### 2. Build
+
+```bash
+npm run build
+```
+
+### 3. Test
+
+```bash
+# Run all tests
+npm test
+
+# Test specific file
+node --test tests/session-lifecycle.test.ts
+
+# Test context injection
+npm run test:context
+
+# Verbose context test
+npm run test:context:verbose
+```
+
+### 4. Manual Testing
+
+```bash
+# Start worker manually
+npm run worker:start
+
+# Check worker status
+npm run worker:status
+
+# View logs
+npm run worker:logs
+
+# Test hooks manually
+echo '{"session_id":"test-123","cwd":"'$(pwd)'","source":"startup"}' | node plugin/scripts/context-hook.js
+```
+
+### 5. Iterate
+
+Repeat steps 1-4 until your changes work as expected.
+
+## Viewer UI Development
+
+### Working with the React Viewer
+
+The web viewer UI is a React application built into a self-contained HTML bundle.
+
+**Location**: `src/ui/viewer/`
+
+**Structure**:
+```
+src/ui/viewer/
+├── index.tsx              # Entry point
+├── App.tsx                # Main application component
+├── components/            # React components
+│   ├── Header.tsx         # Header with logo and actions
+│   ├── Sidebar.tsx        # Project filter sidebar
+│   ├── Feed.tsx           # Main feed with infinite scroll
+│   ├── cards/             # Card components
+│   │   ├── ObservationCard.tsx
+│   │   ├── PromptCard.tsx
+│   │   ├── SummaryCard.tsx
+│   │   └── SkeletonCard.tsx
+├── hooks/                 # Custom React hooks
+│   ├── useSSE.ts          # Server-Sent Events connection
+│   ├── usePagination.ts   # Infinite scroll pagination
+│   ├── useSettings.ts     # Settings persistence
+│   └── useStats.ts        # Database statistics
+├── utils/                 # Utilities
+│   ├── constants.ts       # Constants (API URLs, etc.)
+│   ├── formatters.ts      # Date/time formatting
+│   └── merge.ts           # Data merging and deduplication
+└── assets/                # Static assets (fonts, logos)
+```
+
+### Building Viewer UI
+
+```bash
+# Build everything including viewer
+npm run build
+
+# The viewer is built to plugin/ui/viewer.html
+# It's a self-contained HTML file with inlined JS and CSS
+```
+
+### Testing Viewer Changes
+
+1. Make changes to React components in `src/ui/viewer/`
+2. Build: `npm run build`
+3. Sync to installed plugin: `npm run sync-marketplace`
+4. Restart worker: `npm run worker:restart`
+5. Refresh browser at http://localhost:37777
+
+**Hot Reload**: Not currently supported. Full rebuild + restart required for changes.
+
+### Adding New Viewer Features
+
+**Example: Adding a new card type**
+
+1. Create component in `src/ui/viewer/components/cards/YourCard.tsx`:
+
+```tsx
+import React from 'react';
+
+export interface YourCardProps {
+  // Your data structure
+}
+
+export const YourCard: React.FC<YourCardProps> = (props) => {
+  return (
+    <div className="card">
+      {/* Your UI */}
+    </div>
+  );
+};
+```
+
+2. Import and use in `Feed.tsx`:
+
+```tsx
+import { YourCard } from './cards/YourCard';
+
+// In render logic:
+{item.type === 'your_type' && <YourCard {...item} />}
+```
+
+3. Update types if needed in `src/ui/viewer/types.ts`
+
+4. Rebuild and test
+
+### Viewer UI Architecture
+
+**Data Flow**:
+1. Worker service exposes HTTP + SSE endpoints
+2. React app fetches initial data via HTTP (paginated)
+3. SSE connection provides real-time updates
+4. Custom hooks handle state management and data merging
+5. Components render cards based on item type
+
+**Key Patterns**:
+- **Infinite Scroll**: `usePagination` hook with Intersection Observer
+- **Real-Time Updates**: `useSSE` hook with auto-reconnection
+- **Deduplication**: `merge.ts` utilities prevent duplicate items
+- **Settings Persistence**: `useSettings` hook with localStorage
+- **Theme Support**: CSS variables with light/dark/system themes
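+
+The deduplication pattern is worth spelling out, because the same item can arrive through both the paginated HTTP fetch and the SSE stream. The following is a minimal sketch of the idea, not the contents of `merge.ts`; the `FeedItem` shape and the `mergeUnique` name are assumptions.
+
+```typescript
+// Assumed minimal item shape for illustration.
+interface FeedItem {
+  id: number;
+  type: string;
+}
+
+// Merge live (SSE) items with paginated items, keeping the first occurrence of
+// each type+id pair so an item delivered by both channels is shown once.
+function mergeUnique(live: FeedItem[], paginated: FeedItem[]): FeedItem[] {
+  const seen: { [key: string]: boolean } = {};
+  const merged: FeedItem[] = [];
+  for (const item of live.concat(paginated)) {
+    const key = item.type + ':' + item.id;
+    if (!seen[key]) {
+      seen[key] = true;
+      merged.push(item);
+    }
+  }
+  return merged;
+}
+```
+
+Keying on both `type` and `id` avoids collisions when different record types share the same numeric ID.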
+
+## Adding New Features
+
+### Adding a New Hook
+
+1. Create hook implementation in `src/hooks/your-hook.ts`:
+
+```typescript
+#!/usr/bin/env node
+import { readStdin } from '../shared/stdin';
+
+async function main() {
+  const input = await readStdin();
+
+  // Hook implementation
+  const result = {
+    hookSpecificOutput: 'Optional output'
+  };
+
+  console.log(JSON.stringify(result));
+}
+
+main().catch(console.error);
+```
+
+**Note**: As of v4.3.1, hooks are self-contained files. The shebang will be added automatically by esbuild during the build process.
+
+2. Add to `plugin/hooks/hooks.json`:
+
+```json
+{
+  "YourHook": [{
+    "hooks": [{
+      "type": "command",
+      "command": "node ${CLAUDE_PLUGIN_ROOT}/scripts/your-hook.js",
+      "timeout": 120
+    }]
+  }]
+}
+```
+
+3. Rebuild:
+
+```bash
+npm run build
+```
+
+### Modifying Database Schema
+
+1. Add migration to `src/services/sqlite/migrations.ts`:
+
+```typescript
+export const migration011: Migration = {
+  version: 11,
+  up: (db: Database) => {
+    db.run(`
+      ALTER TABLE observations ADD COLUMN new_field TEXT;
+    `);
+  },
+  down: (db: Database) => {
+    // Optional: define rollback
+  }
+};
+```
+
+2. Update types in `src/services/sqlite/types.ts`:
+
+```typescript
+export interface Observation {
+  // ... existing fields
+  new_field?: string;
+}
+```
+
+3. Update database methods in `src/services/sqlite/SessionStore.ts`:
+
+```typescript
+createObservation(obs: Observation) {
+  // Include new_field in INSERT
+}
+```
+
+4. Test migration:
+
+```bash
+# Backup database first!
+cp ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.backup
+
+# Run tests
+npm test
+```
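+
+To show how versioned migrations are typically applied, here is a small self-contained sketch. It is not the project's migration runner: `FakeDatabase`, `MigrationSketch`, and `applyPending` are stand-ins invented for this example, and the real `Migration` and `Database` types may differ.
+
+```typescript
+// Minimal stand-ins for illustration only.
+interface FakeDatabase {
+  statements: string[];
+  run(sql: string): void;
+}
+
+interface MigrationSketch {
+  version: number;
+  up: (db: FakeDatabase) => void;
+}
+
+// Apply every migration newer than `currentVersion`, in ascending version order,
+// and return the resulting schema version.
+function applyPending(
+  db: FakeDatabase,
+  migrations: MigrationSketch[],
+  currentVersion: number
+): number {
+  const pending = migrations
+    .filter(m => m.version > currentVersion)
+    .sort((a, b) => a.version - b.version);
+  let version = currentVersion;
+  for (const migration of pending) {
+    migration.up(db);
+    version = migration.version;
+  }
+  return version;
+}
+```
+
+Tracking the applied version is what makes re-running the migrations safe: anything at or below the current version is skipped.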
+
+### Extending SDK Prompts
+
+1. Modify prompts in `src/sdk/prompts.ts`:
+
+```typescript
+export function buildObservationPrompt(observation: Observation): string {
+  return `
+    <observation>
+      <!-- Add new XML structure -->
+    </observation>
+  `;
+}
+```
+
+2. Update parser in `src/sdk/parser.ts`:
+
+```typescript
+export function parseObservation(xml: string): ParsedObservation {
+  // Parse new XML fields
+}
+```
+
+3. Test:
+
+```bash
+npm test
+```
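+
+For simple, flat XML fields, the parsing step can be as small as the sketch below. This is an assumption about one workable approach rather than the code in `src/sdk/parser.ts`; `extractTag` is a hypothetical helper, the `title` field in the usage note is made up, and the regular expression only handles non-nested tags without attributes.
+
+```typescript
+// Return the trimmed text of the first <tag>...</tag> pair, or null if absent.
+// Assumes `tag` contains only letters, digits, or underscores.
+function extractTag(xml: string, tag: string): string | null {
+  const pattern = new RegExp('<' + tag + '>([\\s\\S]*?)</' + tag + '>');
+  const match = pattern.exec(xml);
+  return match ? match[1].trim() : null;
+}
+```
+
+For example, `extractTag('<observation><title> Fix bug </title></observation>', 'title')` returns `'Fix bug'`, and a missing tag returns `null` so the caller can decide how to handle incomplete model output.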
+
+### Adding MCP Search Tools
+
+1. Add tool definition in `src/servers/mcp-server.ts`:
+
+```typescript
+server.setRequestHandler(CallToolRequestSchema, async (request) => {
+  if (request.params.name === 'your_new_tool') {
+    // Implement tool logic
+    const results = await search.yourNewSearch(params);
+    return formatResults(results);
+  }
+});
+```
+
+2. Add search method in `src/services/sqlite/SessionSearch.ts`:
+
+```typescript
+yourNewSearch(params: YourParams): SearchResult[] {
+  // Implement FTS5 search
+}
+```
+
+3. Rebuild and test:
+
+```bash
+npm run build
+npm test
+```
+
+## Testing
+
+### Testing Philosophy
+
+Claude-mem relies on **real-world usage and manual testing** rather than traditional unit tests. The project philosophy prioritizes:
+
+1. **Manual verification** - Testing features in actual Claude Code sessions
+2. **Integration testing** - Running the full system end-to-end
+3. **Database inspection** - Verifying data correctness via SQLite queries
+4. **CLI tools** - Interactive tools for checking system state
+5. **Observability** - Comprehensive logging and worker health checks
+
+This approach was chosen because:
+- Hook behavior depends heavily on Claude Code's runtime environment
+- SDK interactions require real API calls and responses
+- SQLite and Bun runtime provide stability guarantees
+- Manual testing catches integration issues that unit tests miss
+
+### Manual Testing Workflow
+
+When developing new features:
+
+1. **Build and sync**:
+   ```bash
+   npm run build
+   npm run sync-marketplace
+   npm run worker:restart
+   ```
+
+2. **Test in real session**:
+   - Start Claude Code
+   - Trigger the feature you're testing
+   - Verify expected behavior
+
+3. **Check database state**:
+   ```bash
+   sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM your_table;"
+   ```
+
+4. **Monitor worker logs**:
+   ```bash
+   npm run worker:logs
+   ```
+
+5. **Verify queue health** (for recovery features):
+   ```bash
+   bun scripts/check-pending-queue.ts
+   ```
+
+### Testing Tools
+
+**Health Checks**:
+```bash
+# Worker status
+npm run worker:status
+
+# Queue inspection
+curl http://localhost:37777/api/pending-queue
+
+# Database integrity
+sqlite3 ~/.claude-mem/claude-mem.db "PRAGMA integrity_check;"
+```
+
+**Hook Testing**:
+```bash
+# Test context hook manually
+echo '{"session_id":"test-123","cwd":"'$(pwd)'","source":"startup"}' | node plugin/scripts/context-hook.js
+
+# Test new hook
+echo '{"session_id":"test-123","cwd":"'$(pwd)'","prompt":"test"}' | node plugin/scripts/new-hook.js
+```
+
+**Data Verification**:
+```bash
+# Check recent observations
+sqlite3 ~/.claude-mem/claude-mem.db "
+  SELECT id, tool_name, created_at
+  FROM observations
+  ORDER BY created_at_epoch DESC
+  LIMIT 10;
+"
+
+# Check summaries
+sqlite3 ~/.claude-mem/claude-mem.db "
+  SELECT id, request, completed
+  FROM session_summaries
+  ORDER BY created_at_epoch DESC
+  LIMIT 5;
+"
+```
+
+### Recovery Feature Testing
+
+For manual recovery features specifically:
+
+1. **Simulate stuck messages**:
+   ```bash
+   # Manually create stuck message (for testing only)
+   sqlite3 ~/.claude-mem/claude-mem.db "
+     UPDATE pending_messages
+     SET status = 'processing',
+         started_processing_at_epoch = strftime('%s', 'now', '-10 minutes') * 1000
+     WHERE id = 123;
+   "
+   ```
+
+2. **Test recovery**:
+   ```bash
+   bun scripts/check-pending-queue.ts
+   ```
+
+3. **Verify results**:
+   ```bash
+   curl http://localhost:37777/api/pending-queue | jq '.queue'
+   ```
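+
+The condition being simulated above can be expressed as a small predicate. This is a sketch under assumptions: the field names mirror the `pending_messages` columns used in the SQL, but the `'pending'` status value, the threshold, and the `isStuck` helper itself are illustrative and not taken from the recovery code.
+
+```typescript
+// Field names follow the columns in the SQL above; the type itself is hypothetical.
+interface PendingMessageSketch {
+  id: number;
+  status: string;                              // e.g. 'pending' or 'processing'
+  started_processing_at_epoch: number | null;  // milliseconds since the Unix epoch
+}
+
+// A message counts as stuck when it has been 'processing' longer than the threshold.
+function isStuck(msg: PendingMessageSketch, nowMs: number, thresholdMs: number): boolean {
+  return msg.status === 'processing' &&
+    msg.started_processing_at_epoch !== null &&
+    nowMs - msg.started_processing_at_epoch > thresholdMs;
+}
+```
+
+With a five-minute threshold, the message back-dated by ten minutes in step 1 would be reported as stuck.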
+
+### Regression Testing
+
+Before releasing:
+
+1. **Test all hook triggers**:
+   - SessionStart: Start new Claude Code session
+   - UserPromptSubmit: Submit a prompt
+   - PostToolUse: Use a tool like Read
+   - Summary: Let session complete
+   - SessionEnd: Close Claude Code
+
+2. **Test core features**:
+   - Context injection (recent sessions appear)
+   - Observation processing (summaries generated)
+   - MCP search tools (search returns results)
+   - Viewer UI (loads at http://localhost:37777)
+   - Manual recovery (stuck messages recovered)
+
+3. **Test edge cases**:
+   - Worker crash recovery
+   - Database locks
+   - Port conflicts
+   - Large databases
+
+4. **Cross-platform** (if applicable):
+   - macOS
+   - Linux
+   - Windows
+
+## Code Style
+
+### TypeScript Guidelines
+
+- Use TypeScript strict mode
+- Define interfaces for all data structures
+- Use async/await for asynchronous code
+- Handle errors explicitly
+- Add JSDoc comments for public APIs
+
+### Formatting
+
+- Follow existing code formatting
+- Use 2-space indentation
+- Use single quotes for strings
+- Add trailing commas in objects/arrays
+
+### Example
+
+```typescript
+/**
+ * Create a new observation in the database
+ */
+export async function createObservation(
+  obs: Observation
+): Promise<number> {
+  try {
+    const result = await db.insert('observations', {
+      session_id: obs.session_id,
+      tool_name: obs.tool_name,
+      // ...
+    });
+    return result.id;
+  } catch (error) {
+    logger.error('Failed to create observation', error);
+    throw error;
+  }
+}
+```
+
+## Debugging
+
+### Enable Debug Logging
+
+```bash
+export DEBUG=claude-mem:*
+npm run worker:restart
+npm run worker:logs
+```
+
+### Inspect Database
+
+```bash
+sqlite3 ~/.claude-mem/claude-mem.db
+
+# View schema
+.schema observations
+
+# Query data
+SELECT * FROM observations LIMIT 10;
+```
+
+### Trace Observations
+
+Use correlation IDs to trace observations through the pipeline:
+
+```bash
+sqlite3 ~/.claude-mem/claude-mem.db "
+  SELECT correlation_id, tool_name, created_at
+  FROM observations
+  WHERE session_id = 'YOUR_SESSION_ID'
+  ORDER BY created_at;
+"
+```
+
+### Debug Hooks
+
+Run hooks manually with test input:
+
+```bash
+# Test context hook
+echo '{"session_id":"test-123","cwd":"'$(pwd)'","source":"startup"}' | node plugin/scripts/context-hook.js
+
+# Test new hook
+echo '{"session_id":"test-123","cwd":"'$(pwd)'","prompt":"test"}' | node plugin/scripts/new-hook.js
+```
+
+## Publishing
+
+### NPM Publishing
+
+```bash
+# Update version in package.json
+npm version patch  # or minor, or major
+
+# Build
+npm run build
+
+# Publish to NPM
+npm run release
+```
+
+The `release` script:
+1. Runs tests
+2. Builds all components
+3. Publishes to NPM registry
+
+### Creating a Release
+
+1. Update version in `package.json`
+2. Update `CHANGELOG.md`
+3. Commit changes
+4. Create git tag
+5. Push to GitHub
+6. Publish to NPM
+
+```bash
+# Manual version bump:
+# 1. Update version in package.json
+# 2. Update version in plugin/.claude-plugin/plugin.json
+# 3. Update version at top of CLAUDE.md
+# 4. Update version badge in README.md
+# 5. Run: npm run build && npm run sync-marketplace
+
+# Or use npm version command:
+npm version 4.3.2
+
+# Update changelog
+# Edit CHANGELOG.md manually
+
+# Commit
+git add .
+git commit -m "chore: Release v4.3.2"
+
+# Tag
+git tag v4.3.2
+
+# Push
+git push origin main --tags
+
+# Publish to NPM
+npm run release
+```
+
+## Contributing
+
+### Contribution Workflow
+
+1. Fork the repository
+2. Create a feature branch (`git checkout -b feature/amazing-feature`)
+3. Make your changes
+4. Write tests
+5. Update documentation
+6. Commit your changes (`git commit -m 'Add amazing feature'`)
+7. Push to the branch (`git push origin feature/amazing-feature`)
+8. Open a Pull Request
+
+### Pull Request Guidelines
+
+- **Clear title**: Describe what the PR does
+- **Description**: Explain why the change is needed
+- **Tests**: Include tests for new features
+- **Documentation**: Update docs as needed
+- **Changelog**: Add entry to CHANGELOG.md
+- **Commits**: Use clear, descriptive commit messages
+
+### Code Review Process
+
+1. Automated tests must pass
+2. Code review by maintainer
+3. Address feedback
+4. Final approval
+5. Merge to main
+
+## Development Tools
+
+### Recommended VSCode Extensions
+
+- TypeScript
+- ESLint
+- Prettier
+- SQLite Viewer
+
+### Useful Commands
+
+```bash
+# Check TypeScript types
+npx tsc --noEmit
+
+# Lint code (if configured)
+npm run lint
+
+# Format code (if configured)
+npm run format
+
+# Clean build artifacts
+rm -rf plugin/scripts/*.js plugin/scripts/*.cjs
+```
+
+## Troubleshooting Development
+
+### Build Fails
+
+1. Clean node_modules:
+   ```bash
+   rm -rf node_modules
+   npm install
+   ```
+
+2. Check Node.js version:
+   ```bash
+   node --version  # Should be >= 18.0.0
+   ```
+
+3. Check for syntax errors:
+   ```bash
+   npx tsc --noEmit
+   ```
+
+### Tests Fail
+
+1. Reset the database (this deletes all stored data, so back it up first if you need it):
+   ```bash
+   rm ~/.claude-mem/claude-mem.db
+   npm test
+   ```
+
+2. Check worker status:
+   ```bash
+   npm run worker:status
+   ```
+
+3. View logs:
+   ```bash
+   npm run worker:logs
+   ```
+
+### Worker Won't Start
+
+1. Kill existing process:
+   ```bash
+   npm run worker:stop
+   ```
+
+2. Check port:
+   ```bash
+   lsof -i :37777
+   ```
+
+3. Try custom port:
+   ```bash
+   export CLAUDE_MEM_WORKER_PORT=38000
+   npm run worker:start
+   ```
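+
+The custom-port step relies on the worker reading `CLAUDE_MEM_WORKER_PORT` and falling back to 37777 otherwise. A minimal sketch of that lookup is shown here; the helper names and the validation rules are assumptions for illustration, and the worker's actual parsing may differ.
+
+```typescript
+// Default taken from the URLs used elsewhere in this guide (http://localhost:37777).
+const DEFAULT_WORKER_PORT = 37777;
+
+// Hypothetical helper: picks the port from an env-like map, falling back to the
+// default when the variable is missing or is not a valid TCP port number.
+export function resolveWorkerPort(env: Record<string, string | undefined>): number {
+  const raw = env['CLAUDE_MEM_WORKER_PORT'];
+  const port = Number(raw);
+  if (!raw || !Number.isInteger(port) || port < 1 || port > 65535) {
+    return DEFAULT_WORKER_PORT;
+  }
+  return port;
+}
+
+// Builds a worker URL for a given path, e.g. '/api/pending-queue'.
+export function workerUrl(env: Record<string, string | undefined>, path: string): string {
+  return `http://localhost:${resolveWorkerPort(env)}${path}`;
+}
+```
+
+Any script that calls the worker (for example the `curl` checks above) needs the same port, so resolving it in one place avoids mismatches after a port change.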
+
+## Next Steps
+
+- [Architecture Overview](architecture/overview) - Understand the system
+- [Configuration](configuration) - Customize Claude-Mem
+- [Troubleshooting](troubleshooting) - Common issues
--- a/.agent/services/claude-mem/docs/public/docs.json
+++ b/.agent/services/claude-mem/docs/public/docs.json
@@ -0,0 +1,148 @@
+{
+  "$schema": "https://mintlify.com/schema.json",
+  "name": "Claude-Mem",
+  "description": "Persistent memory compression system that preserves context across Claude Code sessions",
+  "theme": "mint",
+  "favicon": "/claude-mem-logomark.webp",
+  "logo": {
+    "light": "/claude-mem-logo-for-light-mode.webp",
+    "dark": "/claude-mem-logo-for-dark-mode.webp",
+    "href": "https://github.com/thedotmack/claude-mem"
+  },
+  "colors": {
+    "primary": "#3B82F6",
+    "light": "#EFF6FF",
+    "dark": "#1E40AF"
+  },
+  "navbar": {
+    "links": [
+      {
+        "label": "GitHub",
+        "href": "https://github.com/thedotmack/claude-mem"
+      }
+    ],
+    "primary": {
+      "type": "button",
+      "label": "Install",
+      "href": "https://github.com/thedotmack/claude-mem#quick-start"
+    }
+  },
+  "navigation": {
+    "groups": [
+      {
+        "group": "Get Started",
+        "icon": "rocket",
+        "pages": [
+          "introduction",
+          "installation",
+          "usage/getting-started",
+          "usage/openrouter-provider",
+          "usage/gemini-provider",
+          "usage/search-tools",
+          "usage/claude-desktop",
+          "usage/private-tags",
+          "usage/export-import",
+          "usage/manual-recovery",
+          "usage/folder-context",
+          "beta-features",
+          "endless-mode"
+        ]
+      },
+      {
+        "group": "Cursor Integration",
+        "icon": "wand-magic-sparkles",
+        "pages": [
+          "cursor/index",
+          "cursor/gemini-setup",
+          "cursor/openrouter-setup"
+        ]
+      },
+      {
+        "group": "Best Practices",
+        "icon": "lightbulb",
+        "pages": [
+          "context-engineering",
+          "progressive-disclosure",
+          "smart-explore-benchmark"
+        ]
+      },
+      {
+        "group": "Configuration & Development",
+        "icon": "gear",
+        "pages": [
+          "configuration",
+          "modes",
+          "development",
+          "troubleshooting",
+          "platform-integration",
+          "openclaw-integration"
+        ]
+      },
+      {
+        "group": "Architecture",
+        "icon": "diagram-project",
+        "pages": [
+          "architecture/overview",
+          "architecture-evolution",
+          "hooks-architecture",
+          "architecture/hooks",
+          "architecture/worker-service",
+          "architecture/database",
+          "architecture/search-architecture",
+          "architecture/pm2-to-bun-migration"
+        ]
+      }
+    ]
+  },
+  "footer": {
+    "socials": {
+      "github": "https://github.com/thedotmack/claude-mem"
+    },
+    "links": [
+      {
+        "header": "Resources",
+        "items": [
+          {
+            "label": "Documentation",
+            "href": "https://github.com/thedotmack/claude-mem"
+          },
+          {
+            "label": "Issues",
+            "href": "https://github.com/thedotmack/claude-mem/issues"
+          }
+        ]
+      },
+      {
+        "header": "Legal",
+        "items": [
+          {
+            "label": "License (AGPL-3.0)",
+            "href": "https://github.com/thedotmack/claude-mem/blob/main/LICENSE"
+          }
+        ]
+      }
+    ]
+  },
+  "seo": {
+    "indexing": "all",
+    "metatags": {
+      "og:type": "website",
+      "og:site_name": "Claude-Mem Documentation",
+      "og:description": "Persistent memory compression system that preserves context across Claude Code sessions"
+    }
+  },
+  "contextual": {
+    "options": [
+      "copy",
+      "view",
+      "chatgpt",
+      "claude",
+      "cursor"
+    ]
+  },
+  "integrations": {
+    "telemetry": {
+      "enabled": false
+    }
+  }
+}
--- a/.agent/services/claude-mem/docs/public/endless-mode.mdx
+++ b/.agent/services/claude-mem/docs/public/endless-mode.mdx
@@ -0,0 +1,111 @@
+---
+title: "Endless Mode (Beta)"
+description: "Experimental biomimetic memory architecture for extended sessions"
+---
+
+# Current State of Endless Mode
+
+## Core Concept
+
+Endless Mode is a **biomimetic memory architecture** that solves Claude's context window exhaustion problem. Instead of keeping full tool outputs in the context window (O(N²) complexity), it:
+
+- Captures compressed observations after each tool use
+- Replaces transcripts with low-token summaries
+- Achieves O(N) linear complexity
+- Maintains two-tier memory: working memory (compressed) + archive memory (full transcript on disk, maintained by Claude Code's default functionality)
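+
+One way to read the quadratic claim: if every tool output stays in the context window, turn `i` re-reads all `i` outputs so far, and the cumulative tokens read over N tool uses grow with N(N+1)/2. The sketch below computes that sum. It is a simplified model that ignores the prompt and system text, and the per-item sizes (2,000 tokens for a full output, 100 for a compressed observation) are made-up illustration values, not measurements from Claude-Mem.
+
+```typescript
+// Cumulative tokens re-read over n tool uses when each stored item costs
+// `tokensPerItem` and every earlier item is still in context on each turn:
+// tokensPerItem * (1 + 2 + ... + n) = tokensPerItem * n * (n + 1) / 2.
+export function cumulativeContextTokens(n: number, tokensPerItem: number): number {
+  return (tokensPerItem * n * (n + 1)) / 2;
+}
+
+// Illustration values only (assumptions, not measurements).
+const FULL_OUTPUT_TOKENS = 2000;
+const COMPRESSED_OBSERVATION_TOKENS = 100;
+
+const toolUses = 50;
+const full = cumulativeContextTokens(toolUses, FULL_OUTPUT_TOKENS);
+const compressed = cumulativeContextTokens(toolUses, COMPRESSED_OBSERVATION_TOKENS);
+console.log({ full, compressed });
+```
+
+Running the same formula with a smaller per-item size gives a rough feel for what replacing full outputs with compressed observations changes under this model.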
+
+## Implementation Status
+
+**Status**: FUNCTIONAL BUT EXPERIMENTAL
+
+**Current Branch**: `beta/endless-mode` (ahead of main)
+
+**Recent Activity**:
+- Merged main branch changes
+- Resolved merge conflicts in save-hook, SessionStore, SessionRoutes
+- Updated documentation to remove misleading token reduction claims
+- Added important caveats about beta status
+
+## Key Architecture Components
+
+1. **Pre-Tool-Use Hook** - Tracks tool execution start, sends tool_use_id to worker
+2. **Save Hook (PostToolUse)** - **CRITICAL**: Blocks until observation is generated (110s timeout), injects compressed observation back into context
+3. **SessionManager.waitForNextObservation()** - Event-driven wait mechanism (no polling)
+4. **SDKAgent** - Generates observations via Agent SDK, emits completion events
+5. **Database** - Added `tool_use_id` column for observation correlation
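+
+The event-driven wait in component 3 can be pictured as a promise that resolves when a matching observation event arrives, or with `null` when the timeout expires. The following is a standalone sketch of that idea, not the project's `SessionManager` code: the `'observation'` event name, the payload shape, and the function signature are assumptions.
+
+```typescript
+import { EventEmitter } from 'node:events';
+
+// Hypothetical payload emitted when an observation has been generated.
+interface Observation {
+  toolUseId: string;
+  markdown: string;
+}
+
+// Resolves with the observation for `toolUseId`, or null after `timeoutMs`.
+// No polling: the promise settles from the event listener or the timer.
+export function waitForNextObservation(
+  emitter: EventEmitter,
+  toolUseId: string,
+  timeoutMs: number,
+): Promise<Observation | null> {
+  return new Promise((resolve) => {
+    function onObservation(obs: Observation): void {
+      if (obs.toolUseId !== toolUseId) return; // belongs to a different tool call
+      clearTimeout(timer);
+      emitter.off('observation', onObservation);
+      resolve(obs);
+    }
+    const timer = setTimeout(() => {
+      emitter.off('observation', onObservation);
+      resolve(null); // caller falls back to leaving the context unchanged
+    }, timeoutMs);
+    emitter.on('observation', onObservation);
+  });
+}
+```
+
+Resolving with `null` on timeout, rather than rejecting, lets the calling hook treat a slow observation as a normal fallback path instead of an error.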
+
+## Configuration
+
+```json
+{
+  "CLAUDE_MEM_ENDLESS_MODE": "false",
+  "CLAUDE_MEM_ENDLESS_WAIT_TIMEOUT_MS": "90000"
+}
+```
+
+Endless Mode is disabled by default (`"false"`), and the wait timeout is 90 seconds (`90000` ms).
+
+**Enable via**: Manual checkout of beta branch (see instructions below)
+
+## Flow
+
+```
+Tool Executes → Pre-Hook (track ID) → Tool Completes →
+Save-Hook (BLOCKS) → Worker processes → SDK generates observation →
+Event fired → Hook receives observation → Injects markdown →
+Clears input → Context reduced
+```
+
+## Known Limitations
+
+From the documentation:
+- ⚠️ **Slower than standard mode** - Blocking adds latency
+- ⚠️ **Still in development** - May have bugs
+- ⚠️ **Not battle-tested** - New architecture
+- ⚠️ **Theoretical projections** - Efficiency gains not yet validated in production
+
+## What's Working
+
+- ✅ Synchronous observation injection
+- ✅ Event-driven wait mechanism
+- ✅ Token reduction via input clearing
+- ✅ Database schema with tool_use_id
+- ✅ Web UI for version switching
+- ✅ Graceful timeout fallbacks
+
+## What's Not Ready
+
+- ❌ Production validation of token savings
+- ❌ Comprehensive test coverage
+- ❌ Stable channel release
+- ❌ Performance benchmarks
+- ❌ Long-running session data
+
+## How to Try Endless Mode
+
+Endless Mode is currently only available on the beta branch. To try it:
+
+```bash
+# Navigate to your claude-mem installation
+cd ~/.claude/plugins/marketplaces/thedotmack/
+
+# Checkout the beta branch
+git checkout beta/endless-mode
+
+# Install dependencies
+npm install
+
+# Restart the worker
+npm run worker:restart
+```
+
+**To return to stable:**
+
+```bash
+cd ~/.claude/plugins/marketplaces/thedotmack/
+git checkout main
+npm install
+npm run worker:restart
+```
+
+## Summary
+
+The implementation is architecturally complete and functional, but remains experimental pending production validation of the theoretical efficiency gains.
--- a/.agent/services/claude-mem/docs/public/hooks-architecture.mdx
+++ b/.agent/services/claude-mem/docs/public/hooks-architecture.mdx
@@ -0,0 +1,811 @@
+# How Claude-Mem Uses Hooks: A Lifecycle-Driven Architecture
+
+## Core Principle
+**Observe the main Claude Code session from the outside, process observations in the background, inject context at the right time.**
+
+---
+
+## The Big Picture
+
+Claude-Mem is fundamentally a **hook-driven system**. Every piece of functionality happens in response to lifecycle events:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│              CLAUDE CODE SESSION                         │
+│  (Main session - user interacting with Claude)          │
+│                                                          │
+│  SessionStart → UserPromptSubmit → Tool Use → Stop      │
+│     ↓ ↓ ↓            ↓               ↓          ↓        │
+│  [3 Hooks]        [Hook]          [Hook]     [Hook]     │
+└─────────────────────────────────────────────────────────┘
+    ↓ ↓ ↓             ↓               ↓          ↓
+┌─────────────────────────────────────────────────────────┐
+│                  CLAUDE-MEM SYSTEM                       │
+│                                                          │
+│  Smart      Worker      Context    New        Obs       │
+│  Install    Start       Inject     Session    Capture   │
+└─────────────────────────────────────────────────────────┘
+```
+
+<Note>
+As of Claude Code 2.1.0 (ultrathink update), SessionStart hooks no longer display user-visible messages. Context is silently injected via `hookSpecificOutput.additionalContext`.
+</Note>
+
+**Key insight:** Claude-Mem doesn't interrupt or modify Claude Code's behavior. It observes from the outside and provides value through lifecycle hooks.
+
+---
+
+## Why Hooks?
+
+### The Non-Invasive Requirement
+
+Claude-Mem had several architectural constraints:
+
+1. **Can't modify Claude Code**: It's a closed-source binary
+2. **Must be fast**: Can't slow down the main session
+3. **Must be reliable**: Can't break Claude Code if it fails
+4. **Must be portable**: Works on any project without configuration
+
+**Solution:** External command hooks configured via settings.json
+
+### The Hook System Advantage
+
+Claude Code's hook system provides exactly what we need:
+
+<CardGroup cols={2}>
+  <Card title="Lifecycle Events" icon="clock">
+    SessionStart, UserPromptSubmit, PostToolUse, Stop
+  </Card>
+
+  <Card title="Non-Blocking" icon="forward">
+    Hooks run in parallel, don't wait for completion
+  </Card>
+
+  <Card title="Context Injection" icon="upload">
+    SessionStart and UserPromptSubmit can add context
+  </Card>
+
+  <Card title="Tool Observation" icon="eye">
+    PostToolUse sees all tool inputs and outputs
+  </Card>
+</CardGroup>
+
+---
+
+## The Hook Scripts
+
+Claude-Mem registers hook scripts for 5 lifecycle events: SessionStart, UserPromptSubmit, PostToolUse, Stop, and SessionEnd. SessionStart runs 3 hooks in sequence: smart-install, worker-service start, and context-hook.
+
+### Pre-Hook: Smart Install (Before SessionStart)
+
+**Purpose:** Intelligently manage dependencies and start worker service
+
+**Note:** This is NOT a lifecycle hook - it's a pre-hook script executed via command chaining before context-hook runs.
+
+**When:** Claude Code starts (startup, clear, or compact)
+
+**What it does:**
+1. Checks if dependencies need installation (version marker)
+2. Only runs `npm install` when necessary:
+   - First-time installation
+   - Version changed in package.json
+3. Provides Windows-specific error messages
+4. Starts Bun worker service
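+
+The install decision in steps 1 and 2 comes down to comparing the version in `package.json` with the version recorded in the marker file. A minimal sketch of that comparison follows; the `InstallState` shape and the `needsInstall` name are hypothetical, and `scripts/smart-install.js` may check more than this.
+
+```typescript
+// Inputs are passed in explicitly so the decision stays a pure function;
+// reading package.json and the marker file is left to the caller.
+interface InstallState {
+  packageVersion: string; // "version" field from package.json
+  markerVersion: string | null; // contents of the version marker, or null if missing
+  dependenciesPresent: boolean; // e.g. whether node_modules exists
+}
+
+export function needsInstall(state: InstallState): boolean {
+  if (!state.dependenciesPresent) return true; // first-time installation
+  if (state.markerVersion === null) return true; // no marker written yet
+  return state.markerVersion.trim() !== state.packageVersion; // version changed
+}
+```
+
+When this returns `false`, the script can skip `npm install` entirely, which is what keeps the cached path fast.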
+
+**Configuration:**
+```json
+{
+  "hooks": {
+    "SessionStart": [{
+      "matcher": "startup|clear|compact",
+      "hooks": [{
+        "type": "command",
+        "command": "node \"${CLAUDE_PLUGIN_ROOT}/../scripts/smart-install.js\" && node \"${CLAUDE_PLUGIN_ROOT}/scripts/context-hook.js\"",
+        "timeout": 300
+      }]
+    }]
+  }
+}
+```
+
+**Key Features:**
+- ✅ Version caching (`.install-version` file)
+- ✅ Fast when already installed (~10ms vs 2-5 seconds)
+- ✅ Cross-platform compatible
+- ✅ Helpful Windows error messages for build tools
+
+**v5.0.3 Enhancement:** Smart caching eliminates redundant installs
+
+**Source:** `scripts/smart-install.js`
+
+---
+
+### Hook 1: SessionStart - Context Injection
+
+**Purpose:** Inject relevant context from previous sessions
+
+**When:** Claude Code starts (runs after smart-install pre-hook)
+
+**What it does:**
+1. Extracts project name from current working directory
+2. Queries SQLite for recent session summaries (last 10)
+3. Queries SQLite for recent observations (configurable, default 50)
+4. Formats as progressive disclosure index
+5. Outputs to stdout (automatically injected into context)
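+
+Steps 1 and 5 can be sketched as two small functions: one that derives a project name from the working directory, and one that wraps the rendered index in the JSON a SessionStart hook prints to stdout. This is a sketch under assumptions: taking the last path segment as the project name is a guess at the behavior, both function names are hypothetical, and the `hookEventName` field should be checked against the Claude Code hooks reference.
+
+```typescript
+// Assumption: the project name is the last path segment of the working directory.
+export function projectNameFromCwd(cwd: string): string {
+  const segments = cwd.split(/[\\/]+/).filter((s) => s.length > 0);
+  return segments.length > 0 ? segments[segments.length - 1] : 'unknown';
+}
+
+// Wraps the rendered context index in a hookSpecificOutput.additionalContext payload.
+export function buildSessionStartOutput(contextMarkdown: string): string {
+  return JSON.stringify({
+    hookSpecificOutput: {
+      hookEventName: 'SessionStart',
+      additionalContext: contextMarkdown,
+    },
+  });
+}
+```
+
+A hook built this way would print the returned string as its only stdout output, since anything else on stdout would make the JSON unparseable.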
+
+**Key decisions:**
+- ✅ Runs on startup, clear, and compact
+- ✅ 300-second timeout (allows for npm install if needed)
+- ✅ Progressive disclosure format (index, not full details)
+- ✅ Configurable observation count via `CLAUDE_MEM_CONTEXT_OBSERVATIONS`
+
+**Output format:**
+```markdown
+# [claude-mem] recent context
+
+**Legend:** 🎯 session-request | 🔴 gotcha | 🟡 problem-solution ...
+
+### Oct 26, 2025
+
+**General**
+| ID | Time | T | Title | Tokens |
+|----|------|---|-------|--------|
+| #2586 | 12:58 AM | 🔵 | Context hook file empty | ~51 |
+
+*Use MCP search tools to access full details*
+```
+
+**Source:** `src/hooks/context-hook.ts` → `plugin/scripts/context-hook.js`
+
+---
+
+### Hook 2: UserPromptSubmit (New Session Hook)
+
+**Purpose:** Initialize session tracking when user submits a prompt
+
+**When:** Before Claude processes the user's message
+
+**What it does:**
+1. Reads user prompt and session ID from stdin
+2. Creates new session record in SQLite
+3. Saves raw user prompt for full-text search (v4.2.0+)
+4. Starts Bun worker service if not running
+5. Returns immediately (non-blocking)
+
+**Configuration:**
+```json
+{
+  "hooks": {
+    "UserPromptSubmit": [{
+      "hooks": [{
+        "type": "command",
+        "command": "${CLAUDE_PLUGIN_ROOT}/scripts/new-hook.js"
+      }]
+    }]
+  }
+}
+```
+
+**Key decisions:**
+- ✅ No matcher (runs for all prompts)
+- ✅ Creates session record immediately
+- ✅ Stores raw prompts for search (privacy note: local SQLite only)
+- ✅ Auto-starts worker service
+- ✅ Suppresses output (`suppressOutput: true`)
+
+**Database operations:**
+```sql
+INSERT INTO sdk_sessions (claude_session_id, project, user_prompt, ...)
+VALUES (?, ?, ?, ...)
+
+INSERT INTO user_prompts (session_id, prompt, prompt_number, ...)
+VALUES (?, ?, ?, ...)
+```
+
+**Source:** `src/hooks/new-hook.ts` → `plugin/scripts/new-hook.js`
+
+---
+
+### Hook 3: PostToolUse (Save Observation Hook)
+
+**Purpose:** Capture tool execution observations for later processing
+
+**When:** Immediately after any tool completes successfully
+
+**What it does:**
+1. Receives tool name, input, output from stdin
+2. Finds active session for current project
+3. Enqueues observation in observation_queue table
+4. Returns immediately (processing happens in worker)
+
+**Configuration:**
+```json
+{
+  "hooks": {
+    "PostToolUse": [{
+      "matcher": "*",
+      "hooks": [{
+        "type": "command",
+        "command": "${CLAUDE_PLUGIN_ROOT}/scripts/save-hook.js"
+      }]
+    }]
+  }
+}
+```
+
+**Key decisions:**
+- ✅ Matcher: `*` (captures all tools)
+- ✅ Non-blocking (just enqueues, doesn't process)
+- ✅ Worker processes observations asynchronously
+- ✅ Parallel execution safe (each hook gets own stdin)
+
+**Database operations:**
+```sql
+INSERT INTO observation_queue (session_id, tool_name, tool_input, tool_output, ...)
+VALUES (?, ?, ?, ?, ...)
+```
+
+**What gets queued:**
+```json
+{
+  "session_id": "abc123",
+  "tool_name": "Edit",
+  "tool_input": {
+    "file_path": "/path/to/file.ts",
+    "old_string": "...",
+    "new_string": "..."
+  },
+  "tool_output": {
+    "success": true,
+    "linesChanged": 5
+  },
+  "created_at_epoch": 1698765432
+}
+```
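+
+Turning the hook's stdin into a record like the one above is mostly validation plus a timestamp. The sketch below shows one way to do it, assuming the input carries the four fields in the example; the type and function names are hypothetical and `src/hooks/save-hook.ts` may handle more fields.
+
+```typescript
+// Shape of the hook input, limited to the fields shown in the example above.
+interface PostToolUseInput {
+  session_id: string;
+  tool_name: string;
+  tool_input: unknown;
+  tool_output: unknown;
+}
+
+interface QueuedObservation extends PostToolUseInput {
+  created_at_epoch: number; // unit is chosen by the caller to match the schema
+}
+
+// Parses raw stdin text and returns the record to enqueue, or null when the
+// input is malformed so the hook can skip the observation instead of failing.
+export function toQueuedObservation(
+  stdinText: string,
+  nowEpoch: number,
+): QueuedObservation | null {
+  let parsed: Partial<PostToolUseInput> | null;
+  try {
+    parsed = JSON.parse(stdinText);
+  } catch {
+    return null;
+  }
+  if (!parsed || typeof parsed.session_id !== 'string' || typeof parsed.tool_name !== 'string') {
+    return null;
+  }
+  return {
+    session_id: parsed.session_id,
+    tool_name: parsed.tool_name,
+    tool_input: parsed.tool_input ?? null,
+    tool_output: parsed.tool_output ?? null,
+    created_at_epoch: nowEpoch,
+  };
+}
+```
+
+Returning `null` for bad input fits the non-blocking design: a malformed payload costs one observation and never interrupts the session.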
+
+**Source:** `src/hooks/save-hook.ts` → `plugin/scripts/save-hook.js`
+
+---
+
+### Hook 4: Stop Hook (Summary Generation)
+
+**Purpose:** Generate AI-powered session summaries during the session
+
+**When:** When Claude stops (triggered by Stop lifecycle event)
+
+**What it does:**
+1. Gathers session observations from database
+2. Sends to Claude Agent SDK for summarization
+3. Processes response and extracts structured summary
+4. Stores in session_summaries table
+
+**Configuration:**
+```json
+{
+  "hooks": {
+    "Stop": [{
+      "hooks": [{
+        "type": "command",
+        "command": "${CLAUDE_PLUGIN_ROOT}/scripts/summary-hook.js"
+      }]
+    }]
+  }
+}
+```
+
+**Key decisions:**
+- ✅ Triggered by Stop lifecycle event
+- ✅ Multiple summaries per session (v4.2.0+)
+- ✅ Summaries are checkpoints, not endings
+- ✅ Uses Claude Agent SDK for AI compression
+
+**Summary structure:**
+```xml
+<summary>
+  <request>User's original request</request>
+  <investigated>What was examined</investigated>
+  <learned>Key discoveries</learned>
+  <completed>Work finished</completed>
+  <next_steps>Remaining tasks</next_steps>
+  <files_read>
+    <file>path/to/file1.ts</file>
+    <file>path/to/file2.ts</file>
+  </files_read>
+  <files_modified>
+    <file>path/to/file3.ts</file>
+  </files_modified>
+  <notes>Additional context</notes>
+</summary>
+```
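+
+Step 3 (extracting a structured summary from the response) can be illustrated with a small regex-based reader for the structure above. This is a sketch only: it assumes well-formed, un-nested tags with no attributes or XML entities, it is not a general XML parser, and it is not necessarily how `summary-hook.ts` parses the response. The `parseSummary` name and `SessionSummary` type are hypothetical.
+
+```typescript
+interface SessionSummary {
+  request: string;
+  investigated: string;
+  learned: string;
+  completed: string;
+  next_steps: string;
+  files_read: string[];
+  files_modified: string[];
+  notes: string;
+}
+
+// Returns the trimmed text of the first <tag>...</tag> pair, or '' when absent.
+function tagText(xml: string, tag: string): string {
+  const match = xml.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
+  return match ? match[1].trim() : '';
+}
+
+// Collects every <file> entry nested inside the given container tag.
+function fileList(xml: string, container: string): string[] {
+  const inner = tagText(xml, container);
+  return Array.from(inner.matchAll(/<file>([\s\S]*?)<\/file>/g), (m) => m[1].trim());
+}
+
+export function parseSummary(xml: string): SessionSummary {
+  return {
+    request: tagText(xml, 'request'),
+    investigated: tagText(xml, 'investigated'),
+    learned: tagText(xml, 'learned'),
+    completed: tagText(xml, 'completed'),
+    next_steps: tagText(xml, 'next_steps'),
+    files_read: fileList(xml, 'files_read'),
+    files_modified: fileList(xml, 'files_modified'),
+    notes: tagText(xml, 'notes'),
+  };
+}
+```
+
+Missing tags come back as empty strings or empty arrays, so a partially filled response still yields a storable row.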
+
+**Source:** `src/hooks/summary-hook.ts` → `plugin/scripts/summary-hook.js`
+
+---
+
+### Hook 5: SessionEnd (Cleanup Hook)
+
+**Purpose:** Mark sessions as completed when they end
+
+**When:** Claude Code session ends (not on `/clear`)
+
+**What it does:**
+1. Marks session as completed in database
+2. Allows worker to finish processing
+3. Performs graceful cleanup
+
+**Configuration:**
+```json
+{
+  "hooks": {
+    "SessionEnd": [{
+      "hooks": [{
+        "type": "command",
+        "command": "${CLAUDE_PLUGIN_ROOT}/scripts/cleanup-hook.js"
+      }]
+    }]
+  }
+}
+```
+
+**Key decisions:**
+- ✅ Graceful completion (v4.1.0+)
+- ✅ No longer sends DELETE to workers
+- ✅ Skips cleanup on `/clear` commands
+- ✅ Preserves ongoing sessions
+
+**Why graceful cleanup?**
+
+**Old approach (v3):**
+```
+// ❌ Aggressive cleanup
+SessionEnd → DELETE /worker/session → Worker stops immediately
+```
+
+**Problems:**
+- Interrupted summary generation
+- Lost pending observations
+- Race conditions
+
+**New approach (v4.1.0+):**
+```
+// ✅ Graceful completion
+SessionEnd → UPDATE sessions SET completed_at = NOW()
+Worker sees completion → Finishes processing → Exits naturally
+```
+
+**Benefits:**
+- Worker finishes important operations
+- Summaries complete successfully
+- Clean state transitions
+
+**Source:** `src/hooks/cleanup-hook.ts` → `plugin/scripts/cleanup-hook.js`
+
+---
+
+## Hook Execution Flow
+
+### Session Lifecycle
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant Claude
+    participant Hooks
+    participant Worker
+    participant DB
+
+    User->>Claude: Start Claude Code
+    Claude->>Hooks: SessionStart hook
+    Hooks->>DB: Query recent context
+    DB-->>Hooks: Session summaries + observations
+    Hooks-->>Claude: Inject context
+    Note over Claude: Context available for session
+
+    User->>Claude: Submit prompt
+    Claude->>Hooks: UserPromptSubmit hook
+    Hooks->>DB: Create session record
+    Hooks->>Worker: Start worker (if not running)
+    Worker-->>DB: Ready to process
+
+    Claude->>Claude: Execute tools
+    Claude->>Hooks: PostToolUse (multiple times)
+    Hooks->>DB: Queue observations
+    Note over Worker: Polls queue, processes observations
+
+    Worker->>Worker: AI compression
+    Worker->>DB: Store compressed observations
+    Worker->>Hooks: Trigger summary hook
+    Hooks->>DB: Store session summary
+
+    User->>Claude: Finish
+    Claude->>Hooks: SessionEnd hook
+    Hooks->>DB: Mark session complete
+    Worker->>DB: Check completion
+    Worker->>Worker: Finish processing
+    Worker->>Worker: Exit gracefully
+```
+
+### Hook Timing
+
+| Event | Timing | Blocking | Timeout | Output Handling |
+|-------|--------|----------|---------|-----------------|
+| **SessionStart (smart-install)** | Before session | No | 300s | stderr (log only) |
+| **SessionStart (worker-start)** | Before session | No | 60s | stderr (log only) |
+| **SessionStart (context)** | Before session | No | 60s | JSON → additionalContext (silent) |
+| **UserPromptSubmit** | Before processing | No | 60s | stdout → context |
+| **PostToolUse** | After tool | No | 120s | Transcript only |
+| **Summary** | Worker triggered | No | 120s | Database |
+| **SessionEnd** | On exit | No | 120s | Log only |
+
+<Note>
+As of Claude Code 2.1.0 (ultrathink update), SessionStart hooks no longer display user-visible messages. Context is silently injected via `hookSpecificOutput.additionalContext`.
+</Note>
+
+---
+
+## The Worker Service Architecture
+
+### Why a Background Worker?
+
+**Problem:** Hooks must be fast (< 1 second)
+
+**Reality:** AI compression takes 5-30 seconds per observation
+
+**Solution:** Hooks enqueue observations, worker processes async
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                   HOOK (Fast)                            │
+│  1. Read stdin (< 1ms)                                  │
+│  2. Insert into queue (< 10ms)                          │
+│  3. Return success (< 20ms total)                       │
+└─────────────────────────────────────────────────────────┘
+                        ↓ (queue)
+┌─────────────────────────────────────────────────────────┐
+│                 WORKER (Slow)                            │
+│  1. Poll queue every 1s                                 │
+│  2. Process observation via Claude SDK (5-30s)          │
+│  3. Parse and store results                             │
+│  4. Mark observation processed                          │
+└─────────────────────────────────────────────────────────┘
+```
+
+### Bun Process Management
+
+**Technology:** Bun (JavaScript runtime and process manager)
+
+**Why Bun:**
+- Auto-restart on failure
+- Fast startup and low memory footprint
+- Built-in TypeScript support
+- Cross-platform (works on macOS, Linux, Windows)
+- No separate process manager needed
+
+**Worker lifecycle:**
+```bash
+# Started by hooks automatically (if not running)
+npm run worker:start
+
+# Status check
+npm run worker:status
+
+# View logs
+npm run worker:logs
+
+# Restart
+npm run worker:restart
+
+# Stop
+npm run worker:stop
+```
+
+### Worker HTTP API
+
+**Technology:** Express.js REST API on port 37777
+
+**Endpoints:**
+
+| Endpoint | Method | Purpose |
+|----------|--------|---------|
+| `/health` | GET | Health check |
+| `/sessions` | POST | Create session |
+| `/sessions/:id` | GET | Get session status |
+| `/sessions/:id` | PATCH | Update session |
+| `/observations` | POST | Enqueue observation |
+| `/observations/:id` | GET | Get observation |
+
+**Why HTTP API?**
+- Language-agnostic (hooks can be any language)
+- Easy debugging (curl commands)
+- Standard error handling
+- Proper async handling
+
+---
+
+## Design Patterns
+
+### Pattern 1: Fire-and-Forget Hooks
+
+**Principle:** Hooks should return immediately, not wait for completion
+
+```typescript
+// ❌ Bad: Hook waits for processing
+export async function saveHook(stdin: HookInput) {
+  const observation = parseInput(stdin);
+  await processObservation(observation);  // BLOCKS!
+  return success();
+}
+
+// ✅ Good: Hook enqueues and returns
+export async function saveHook(stdin: HookInput) {
+  const observation = parseInput(stdin);
+  await enqueueObservation(observation);  // Fast
+  return success();  // Immediate
+}
+```
+
+### Pattern 2: Queue-Based Processing
+
+**Principle:** Decouple capture from processing
+
+```
+Hook (capture) → Queue (buffer) → Worker (process)
+```
+
+**Benefits:**
+- Parallel hook execution safe
+- Worker failure doesn't affect hooks
+- Retry logic centralized
+- Backpressure handling
+
+### Pattern 3: Graceful Degradation
+
+**Principle:** Memory system failure shouldn't break Claude Code
+
+```typescript
+try {
+  await captureObservation();
+} catch (error) {
+  // Log error, but don't throw
+  console.error('Memory capture failed:', error);
+  return { continue: true, suppressOutput: true };
+}
+```
+
+**Failure modes:**
+- Database locked → Skip observation, log error
+- Worker crashed → Auto-restart via Bun
+- Network issue → Retry with exponential backoff
+- Disk full → Warn user, disable memory
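+
+The network failure mode above mentions retrying with exponential backoff. A minimal sketch of that technique follows; the base delay, cap, attempt count, and function names are assumptions chosen for illustration, not values from Claude-Mem.
+
+```typescript
+// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
+export function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
+  return Math.min(maxMs, baseMs * 2 ** attempt);
+}
+
+// Runs `operation` until it succeeds or `maxAttempts` is reached, sleeping
+// between failures. `sleep` is injectable so tests do not have to wait.
+export async function retryWithBackoff<T>(
+  operation: () => Promise<T>,
+  maxAttempts = 5,
+  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
+): Promise<T> {
+  let lastError: unknown;
+  for (let attempt = 0; attempt < maxAttempts; attempt++) {
+    try {
+      return await operation();
+    } catch (error) {
+      lastError = error;
+      if (attempt < maxAttempts - 1) {
+        await sleep(backoffDelayMs(attempt));
+      }
+    }
+  }
+  throw lastError; // every attempt failed
+}
+```
+
+In line with the graceful-degradation principle, a caller inside a hook would still catch the final error and return `{ continue: true }` rather than let it propagate.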
+
+### Pattern 4: Progressive Enhancement
+
+**Principle:** Core functionality works without memory, memory enhances it
+
+```
+Without memory: Claude Code works normally
+With memory:    Claude Code + context from past sessions
+Memory broken:  Falls back to working normally
+```
+
+---
+
+## Hook Debugging
+
+### Debug Mode
+
+Enable detailed hook execution logs:
+
+```bash
+claude --debug
+```
+
+**Output:**
+```
+[DEBUG] Executing hooks for PostToolUse:Write
+[DEBUG] Getting matching hook commands for PostToolUse with query: Write
+[DEBUG] Found 1 hook matchers in settings
+[DEBUG] Matched 1 hooks for query "Write"
+[DEBUG] Found 1 hook commands to execute
+[DEBUG] Executing hook command: ${CLAUDE_PLUGIN_ROOT}/scripts/save-hook.js with timeout 60000ms
+[DEBUG] Hook command completed with status 0: {"continue":true,"suppressOutput":true}
+```
+
+### Common Issues
+
+<AccordionGroup>
+  <Accordion title="Hook not executing">
+    **Symptoms:** Hook command never runs
+
+    **Debugging:**
+    1. Check `/hooks` menu - is hook registered?
+    2. Verify matcher pattern (case-sensitive!)
+    3. Test command manually: `echo '{}' | node save-hook.js`
+    4. Check file permissions (executable?)
+  </Accordion>
+
+  <Accordion title="Hook times out">
+    **Symptoms:** Hook execution exceeds timeout
+
+    **Debugging:**
+    1. Check timeout setting (default 60s)
+    2. Identify slow operation (database? network?)
+    3. Move slow operation to worker
+    4. Increase timeout if necessary
+  </Accordion>
+
+  <Accordion title="Context not injecting">
+    **Symptoms:** SessionStart hook runs but context missing
+
+    **Debugging:**
+    1. Check stdout (must be valid JSON or plain text)
+    2. Verify no stderr output (pollutes JSON)
+    3. Check exit code (must be 0)
+    4. Look for stray npm install output mixed into the hook's output (fixed in v4.3.1)
+  </Accordion>
+
+  <Accordion title="Observations not captured">
+    **Symptoms:** PostToolUse hook runs but observations missing
+
+    **Debugging:**
+    1. Check database: `sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM observation_queue"`
+    2. Verify session exists: `SELECT * FROM sdk_sessions`
+    3. Check worker status: `npm run worker:status`
+    4. View worker logs: `npm run worker:logs`
+  </Accordion>
+</AccordionGroup>
+
+### Testing Hooks Manually
+
+```bash
+# Test context hook
+echo '{
+  "session_id": "test123",
+  "cwd": "/Users/alex/projects/my-app",
+  "hook_event_name": "SessionStart",
+  "source": "startup"
+}' | node plugin/scripts/context-hook.js
+
+# Test save hook
+echo '{
+  "session_id": "test123",
+  "tool_name": "Edit",
+  "tool_input": {"file_path": "test.ts"},
+  "tool_output": {"success": true}
+}' | node plugin/scripts/save-hook.js
+
+# Test with actual Claude Code
+claude --debug
+# Inside the session, run the /hooks slash command to view registered hooks,
+# then submit a prompt and watch the debug output
+```
+
+---
+
+## Performance Considerations
+
+### Hook Execution Time
+
+**Target:** < 100ms per hook
+
+**Actual measurements:**
+
+| Hook | Average | p95 | p99 |
+|------|---------|-----|-----|
+| SessionStart (smart-install, cached) | 10ms | 20ms | 40ms |
+| SessionStart (smart-install, first run) | 2500ms | 5000ms | 8000ms |
+| SessionStart (context) | 45ms | 120ms | 250ms |
+| SessionStart (user-message) | 5ms | 10ms | 15ms |
+| UserPromptSubmit | 12ms | 25ms | 50ms |
+| PostToolUse | 8ms | 15ms | 30ms |
+| SessionEnd | 5ms | 10ms | 20ms |
+
+**Why smart-install is sometimes slow:**
+- First-time: Full npm install (2-5 seconds)
+- Cached: Version check only (~10ms)
+- Version change: Full npm install + worker restart
+
+**Optimization (v5.0.3):**
+- Version caching with `.install-version` marker
+- Only install on version change or missing deps
+- Windows-specific error messages with build tool help
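+
+A minimal sketch of the version-marker check described above, assuming the marker lives at `<pluginRoot>/.install-version` and dependencies at `<pluginRoot>/node_modules`. The names `needsInstall` and `recordInstall` are hypothetical and are not the actual smart-install API:
+
+```javascript
+// Hypothetical sketch of a cached install check (not the real smart-install code).
+const fs = require("node:fs");
+const path = require("node:path");
+
+// Returns true when a full `npm install` should run.
+function needsInstall(pluginRoot, currentVersion) {
+  const marker = path.join(pluginRoot, ".install-version");
+  const deps = path.join(pluginRoot, "node_modules"); // assumed dependency location
+  if (!fs.existsSync(deps)) return true; // missing deps
+  if (!fs.existsSync(marker)) return true; // first run, nothing cached yet
+  // Version change: the marker no longer matches the plugin version.
+  return fs.readFileSync(marker, "utf8").trim() !== currentVersion;
+}
+
+// Call after a successful install so the next session takes the fast path.
+function recordInstall(pluginRoot, version) {
+  fs.writeFileSync(path.join(pluginRoot, ".install-version"), `${version}\n`);
+}
+```
+
+On the cached path this is only a few filesystem calls, which is why the version check can stay in the millisecond range.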
+
+### Database Performance
+
+**Schema optimizations:**
+- Indexes on `project`, `created_at_epoch`, `claude_session_id`
+- FTS5 virtual tables for full-text search
+- WAL mode for concurrent reads/writes
+
+**Query patterns:**
+```sql
+-- Fast: Uses index on (project, created_at_epoch)
+SELECT * FROM session_summaries
+WHERE project = ?
+ORDER BY created_at_epoch DESC
+LIMIT 10;
+
+-- Fast: Uses index on claude_session_id
+SELECT * FROM sdk_sessions
+WHERE claude_session_id = ?
+LIMIT 1;
+
+-- Fast: FTS5 full-text search
+SELECT * FROM observations_fts
+WHERE observations_fts MATCH ?
+ORDER BY rank
+LIMIT 20;
+```
+
+### Worker Throughput
+
+**Bottleneck:** Claude API latency (5-30s per observation)
+
+**Mitigation:**
+- Process observations sequentially (simpler, more predictable)
+- Skip low-value observations (TodoWrite, ListMcpResourcesTool)
+- Batch summaries (generate every N observations, not every observation)
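+
+The skip and batch rules above can be sketched as a small filter. This is an illustration only: `createQueueFilter`, the `tool_name` field, and the default interval of 5 are assumptions, since the actual value of N is not stated here:
+
+```javascript
+// Hypothetical sketch: decide per observation whether to process it
+// and whether a summary is due. Not the real worker implementation.
+const SKIPPED_TOOLS = new Set(["TodoWrite", "ListMcpResourcesTool"]);
+
+function createQueueFilter(summaryEvery = 5) {
+  let processed = 0; // counts only observations that were not skipped
+  return function accept(observation) {
+    if (SKIPPED_TOOLS.has(observation.tool_name)) {
+      return { process: false, summarize: false }; // low-value, no API call
+    }
+    processed += 1;
+    // Summarize on every Nth processed observation, not on every one.
+    return { process: true, summarize: processed % summaryEvery === 0 };
+  };
+}
+```
+
+Skipped observations do not advance the counter, so the summary cadence tracks the work the worker actually performed.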
+
+**Future optimization:**
+- Parallel processing (multiple workers)
+- Smart batching (combine related observations)
+- Lazy summarization (summarize only when needed)
+
+---
+
+## Security Considerations
+
+### Hook Command Safety
+
+**Risk:** Hooks execute arbitrary commands with user permissions
+
+**Mitigations:**
+1. **Frozen at startup:** Hook configuration captured at start, changes require review
+2. **User review required:** `/hooks` menu shows changes, requires approval
+3. **Plugin isolation:** `${CLAUDE_PLUGIN_ROOT}` prevents path traversal
+4. **Input validation:** Hooks validate stdin schema before processing
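+
+A minimal sketch of stdin validation, assuming the payload fields used in the manual test commands earlier (`session_id`, `tool_name`). `parseHookInput` is a hypothetical helper, and the real hooks may check more fields:
+
+```javascript
+// Hypothetical sketch of validating a PostToolUse payload read from stdin.
+function parseHookInput(raw) {
+  let payload;
+  try {
+    payload = JSON.parse(raw);
+  } catch {
+    return { ok: false, error: "stdin is not valid JSON" };
+  }
+  if (payload === null || typeof payload !== "object" || Array.isArray(payload)) {
+    return { ok: false, error: "payload must be a JSON object" };
+  }
+  if (typeof payload.session_id !== "string" || payload.session_id === "") {
+    return { ok: false, error: "session_id must be a non-empty string" };
+  }
+  if (typeof payload.tool_name !== "string" || payload.tool_name === "") {
+    return { ok: false, error: "tool_name must be a non-empty string" };
+  }
+  return { ok: true, payload };
+}
+```
+
+Returning a result object instead of throwing lets a hook exit cleanly on bad input, in line with the graceful-degradation goal.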
+
+### Data Privacy
+
+**What gets stored:**
+- User prompts (raw text) - v4.2.0+
+- Tool inputs and outputs
+- File paths read/modified
+- Session summaries
+
+**Privacy guarantees:**
+- All data stored locally in `~/.claude-mem/claude-mem.db`
+- No cloud storage: data leaves the machine only in API calls made for AI compression
+- SQLite file permissions: user-only read/write
+- No analytics or telemetry
+
+### API Key Protection
+
+**Configuration:**
+- Anthropic API key in `~/.anthropic/api_key` or `ANTHROPIC_API_KEY` env var
+- Worker inherits environment from Claude Code
+- Never logged or stored in database
+
+---
+
+## Key Takeaways
+
+1. **Hooks are interfaces**: They define clean boundaries between systems
+2. **Non-blocking is critical**: Hooks must return fast, workers do the heavy lifting
+3. **Graceful degradation**: Memory system can fail without breaking Claude Code
+4. **Queue-based decoupling**: Capture and processing happen independently
+5. **Progressive disclosure**: Context injection uses index-first approach
+6. **Lifecycle alignment**: Each hook has a clear, single purpose
+
+---
+
+## Further Reading
+
+- [Claude Code Hooks Reference](https://docs.claude.com/claude-code/hooks) - Official documentation
+- [Progressive Disclosure](progressive-disclosure) - Context priming philosophy
+- [Architecture Evolution](architecture-evolution) - v3 to v4 journey
+- [Worker Service Design](architecture/worker-service) - Background processing details
+
+---
+
+*The hook-driven architecture enables Claude-Mem to be both powerful and invisible. Users never notice the memory system working - it just makes Claude smarter over time.*
--- a/.agent/services/claude-mem/docs/public/installation.mdx
+++ b/.agent/services/claude-mem/docs/public/installation.mdx
@@ -0,0 +1,116 @@
+---
+title: "Installation"
+description: "Install Claude-Mem plugin for persistent memory across sessions"
+---
+
+# Installation Guide
+
+## Quick Start
+
+Install Claude-Mem directly from the plugin marketplace:
+
+```bash
+/plugin marketplace add thedotmack/claude-mem
+/plugin install claude-mem
+```
+
+That's it! The plugin will automatically:
+- Download prebuilt binaries (no compilation needed)
+- Install all dependencies (including SQLite binaries)
+- Configure hooks for session lifecycle management
+- Auto-start the worker service on first session
+
+Start a new Claude Code session and you'll see context from previous sessions automatically loaded.
+
+> **Important:** Claude-Mem is published on npm, but running `npm install -g claude-mem` installs the
+> **SDK/library only**. It does **not** register plugin hooks or start the worker service.
+> To use Claude-Mem as a persistent memory plugin, always install via the `/plugin` commands above.
+
+## System Requirements
+
+- **Node.js**: 18.0.0 or higher
+- **Claude Code**: Latest version with plugin support
+- **Bun**: JavaScript runtime and process manager (auto-installed if missing)
+- **SQLite 3**: For persistent storage (bundled)
+
+## Advanced Installation
+
+For development or testing, you can clone and build from source:
+
+### Clone and Build
+
+```bash
+# Clone the repository
+git clone https://github.com/thedotmack/claude-mem.git
+cd claude-mem
+
+# Install dependencies
+npm install
+
+# Build hooks and worker service
+npm run build
+
+# Worker service will auto-start on first Claude Code session
+# Or manually start with:
+npm run worker:start
+
+# Verify worker is running
+npm run worker:status
+```
+
+### Post-Installation Verification
+
+#### 1. Automatic Dependency Installation
+
+Dependencies are installed automatically during plugin installation. The SessionStart hook also ensures dependencies are up-to-date on each session start (this is fast and idempotent). Works cross-platform on Windows, macOS, and Linux.
+
+#### 2. Verify Plugin Installation
+
+Check that hooks are configured in Claude Code:
+```bash
+cat plugin/hooks/hooks.json
+```
+
+#### 3. Data Directory Location
+
+Data is stored in `~/.claude-mem/`:
+- Database: `~/.claude-mem/claude-mem.db`
+- PID file: `~/.claude-mem/.worker.pid`
+- Port file: `~/.claude-mem/.worker.port`
+- Logs: `~/.claude-mem/logs/worker-YYYY-MM-DD.log`
+- Settings: `~/.claude-mem/settings.json`
+
+Override with environment variable:
+```bash
+export CLAUDE_MEM_DATA_DIR=/custom/path
+```
+
+#### 4. Check Worker Logs
+
+```bash
+npm run worker:logs
+```
+
+#### 5. Test Context Retrieval
+
+```bash
+npm run test:context
+```
+
+## Upgrading
+
+Upgrades are automatic when updating via the plugin marketplace. Key changes in recent versions:
+
+**v7.1.0**: PM2 replaced with native Bun process management. Migration is automatic on first hook trigger.
+
+**v7.0.0+**: 11 configuration settings, dual-tag privacy system.
+
+**v5.4.0+**: Skill-based search replaces MCP tools, saving ~2,250 tokens per session.
+
+See [CHANGELOG](https://github.com/thedotmack/claude-mem/blob/main/CHANGELOG.md) for complete version history.
+
+## Next Steps
+
+- [Getting Started Guide](usage/getting-started) - Learn how Claude-Mem works automatically
+- [MCP Search Tools](usage/search-tools) - Query your project history
+- [Configuration](configuration) - Customize Claude-Mem behavior
--- a/.agent/services/claude-mem/docs/public/introduction.mdx
+++ b/.agent/services/claude-mem/docs/public/introduction.mdx
@@ -0,0 +1,112 @@
+---
+title: "Introduction"
+description: "Persistent memory compression system for Claude Code"
+---
+
+# Claude-Mem
+
+**Persistent memory compression system for Claude Code**
+
+Claude-Mem seamlessly preserves context across sessions by automatically capturing tool usage observations, generating semantic summaries, and making them available to future sessions. This enables Claude to maintain continuity of knowledge about projects even after sessions end or reconnect.
+
+## Quick Start
+
+Start a new Claude Code session in the terminal and enter the following commands:
+
+```bash
+/plugin marketplace add thedotmack/claude-mem
+/plugin install claude-mem
+```
+
+Restart Claude Code. Context from previous sessions will automatically appear in new sessions.
+
+## Key Features
+
+- 🧠 **Persistent Memory** - Context survives across sessions
+- 📁 **Folder Context Files** - Auto-generated `CLAUDE.md` in project folders with activity timelines
+- 🌐 **Multilingual Modes** - Supports 29 languages (Spanish, Chinese, French, Japanese, etc.)
+- 🎭 **Mode System** - Switch between workflows (Code, Email Investigation, Chill)
+- 🔍 **MCP Search Tools** - Query your project history with natural language
+- 🌐 **Web Viewer UI** - Real-time memory stream visualization at http://localhost:37777
+- 🔒 **Privacy Control** - Use `<private>` tags to exclude sensitive content from storage
+- ⚙️ **Context Configuration** - Fine-grained control over what context gets injected
+- 🤖 **Automatic Operation** - No manual intervention required
+- 📊 **FTS5 Search** - Fast full-text search across observations
+- 🔗 **Citations** - Reference past observations with IDs
+
+## How It Works
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│ Session Start → Inject context from last 10 sessions       │
+└─────────────────────────────────────────────────────────────┘
+                            ↓
+┌─────────────────────────────────────────────────────────────┐
+│ User Prompts → Create session, save user prompts           │
+└─────────────────────────────────────────────────────────────┘
+                            ↓
+┌─────────────────────────────────────────────────────────────┐
+│ Tool Executions → Capture observations (Read, Write, etc.)  │
+└─────────────────────────────────────────────────────────────┘
+                            ↓
+┌─────────────────────────────────────────────────────────────┐
+│ Worker Processes → Extract learnings via Claude Agent SDK   │
+└─────────────────────────────────────────────────────────────┘
+                            ↓
+┌─────────────────────────────────────────────────────────────┐
+│ Session Ends → Generate summary, ready for next session     │
+└─────────────────────────────────────────────────────────────┘
+```
+
+**Core Components:**
+1. **4 Lifecycle Hooks** - SessionStart, UserPromptSubmit, PostToolUse, Stop
+2. **Smart Install** - Cached dependency checker (pre-hook script)
+3. **Worker Service** - HTTP API on port 37777 managed by Bun
+4. **SQLite Database** - Stores sessions, observations, summaries with FTS5 search
+5. **MCP Search Tools** - Query historical context with natural language
+6. **Web Viewer UI** - Real-time visualization with SSE and infinite scroll
+
+See [Architecture Overview](architecture/overview) for details.
+
+## System Requirements
+
+- **Node.js**: 18.0.0 or higher
+- **Claude Code**: Latest version with plugin support
+- **Bun**: JavaScript runtime and process manager (auto-installed if missing)
+- **SQLite 3**: For persistent storage (bundled)
+
+## What's New
+
+**v9.0.0 - Live Context:**
+- **Folder Context Files**: Auto-generated `CLAUDE.md` in project folders with activity timelines
+- **Worktree Support**: Unified context from parent repos and git worktrees
+- **Configurable Observation Limits**: Control how many observations appear in context
+- **Windows Fixes**: Resolved IPC detection and hook execution issues
+- **Settings Auto-Creation**: `settings.json` now auto-creates on first run
+- **MCP Tools Naming**: Updated from "mem-search skill" to "MCP tools" terminology
+
+**v7.1.0 - Bun Migration:**
+- Replaced PM2 with native Bun process management
+- Switched from better-sqlite3 to bun:sqlite for faster database access
+- Simplified cross-platform support
+
+**v7.0.0 - Context Configuration:**
+- 11 settings for fine-grained control over context injection
+- Dual-tag privacy system (`<private>` tags)
+
+## Next Steps
+
+<CardGroup cols={2}>
+  <Card title="Installation" icon="download" href="/installation">
+    Quick start & advanced installation
+  </Card>
+  <Card title="Getting Started" icon="rocket" href="/usage/getting-started">
+    Learn how Claude-Mem works automatically
+  </Card>
+  <Card title="Folder Context" icon="folder-open" href="/usage/folder-context">
+    Auto-generated folder CLAUDE.md files
+  </Card>
+  <Card title="Search Tools" icon="magnifying-glass" href="/usage/search-tools">
+    Query your project history
+  </Card>
+</CardGroup>
--- a/.agent/services/claude-mem/docs/public/modes.mdx
+++ b/.agent/services/claude-mem/docs/public/modes.mdx
@@ -0,0 +1,104 @@
+---
+title: "Modes & Languages"
+description: "Configure Claude-Mem behavior and language with the Mode System"
+---
+
+# Modes & Languages
+
+Claude-Mem uses a flexible **Mode System** to adapt its behavior, observation types, and output language. This allows you to switch between different workflows (like coding vs. email investigation) or languages without reinstalling the plugin.
+
+## What is a Mode?
+
+A "mode" is a configuration profile that defines:
+1.  **Observer Role**: How Claude should analyze your work (e.g., "Software Engineer" vs. "Forensic Analyst").
+2.  **Observation Types**: Valid categories for memory (e.g., "Bug Fix", "Feature" vs. "Person", "Organization").
+3.  **Concepts**: Semantic tags for indexing (e.g., "Pattern", "Trade-off").
+4.  **Language**: The language used for generating observations and summaries.
+
+## Configuration
+
+Set the active mode using the `CLAUDE_MEM_MODE` setting in `~/.claude-mem/settings.json`:
+
+```json
+{
+  "CLAUDE_MEM_MODE": "code--es"
+}
+```
+
+Or via environment variable:
+
+```bash
+export CLAUDE_MEM_MODE="code--fr"
+```
+
+## Available Modes
+
+### Code Mode (Default)
+The standard mode for software development. Captures bug fixes, features, refactors, and architectural decisions.
+
+**ID:** `code`
+
+### Code Mode Variants
+
+Behavioral variants that change how the code mode operates:
+
+| Variant | Mode ID | Description |
+|---------|---------|-------------|
+| **Chill** | `code--chill` | Produces fewer observations. Only records things "painful to rediscover" - shipped features, architectural decisions, and non-obvious gotchas. Skips routine work and obvious changes. |
+
+### Multilingual Code Modes
+Inherits all behavior from Code Mode but instructs Claude to generate **all** memory artifacts (titles, narratives, facts, summaries) in the target language.
+
+| Language | Mode ID | Native Name |
+|----------|---------|-------------|
+| **Arabic** | `code--ar` | العربية |
+| **Bengali** | `code--bn` | বাংলা |
+| **Chinese** | `code--zh` | 中文 |
+| **Czech** | `code--cs` | Čeština |
+| **Danish** | `code--da` | Dansk |
+| **Dutch** | `code--nl` | Nederlands |
+| **Finnish** | `code--fi` | Suomi |
+| **French** | `code--fr` | Français |
+| **German** | `code--de` | Deutsch |
+| **Greek** | `code--el` | Ελληνικά |
+| **Hebrew** | `code--he` | עברית |
+| **Hindi** | `code--hi` | हिन्दी |
+| **Hungarian** | `code--hu` | Magyar |
+| **Indonesian** | `code--id` | Bahasa Indonesia |
+| **Italian** | `code--it` | Italiano |
+| **Japanese** | `code--ja` | 日本語 |
+| **Korean** | `code--ko` | 한국어 |
+| **Norwegian** | `code--no` | Norsk |
+| **Polish** | `code--pl` | Polski |
+| **Portuguese (Brazil)** | `code--pt-br` | Português Brasileiro |
+| **Romanian** | `code--ro` | Română |
+| **Russian** | `code--ru` | Русский |
+| **Spanish** | `code--es` | Español |
+| **Swedish** | `code--sv` | Svenska |
+| **Thai** | `code--th` | ภาษาไทย |
+| **Turkish** | `code--tr` | Türkçe |
+| **Ukrainian** | `code--uk` | Українська |
+| **Urdu** | `code--ur` | اردو |
+| **Vietnamese** | `code--vi` | Tiếng Việt |
+
+### Email Investigation Mode
+A specialized mode for analyzing email dumps (e.g., FOIA releases, corporate archives). Focuses on identifying entities, relationships, timeline events, and key topics.
+
+**ID:** `email-investigation`
+
+**Observation Types:**
+- `entity`: Person, organization, or email address
+- `relationship`: Connection between entities
+- `timeline-event`: Time-stamped event in communication sequence
+- `evidence`: Supporting documentation or proof
+- `anomaly`: Suspicious pattern or irregularity
+- `conclusion`: Investigative finding or determination
+
+## Mode Inheritance
+
+The system supports inheritance using the `--` separator. For example, `code--es` means:
+1.  Load `code` (Parent) configuration.
+2.  Load `code--es` (Child) configuration.
+3.  Merge Child into Parent (Child overrides).
+
+This allows for lightweight "remix" modes that only change specific aspects (like the language prompt) while keeping the core definitions intact.
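+
+The three steps above can be sketched as follows. This assumes mode configurations are plain objects keyed by mode ID and that a shallow merge is enough; `resolveMode` and the example fields are hypothetical, and the real loader may merge nested fields differently:
+
+```javascript
+// Hypothetical sketch of "--" inheritance: load parent, load child, merge.
+function resolveMode(modeId, configs) {
+  const load = (id) => {
+    const config = configs[id];
+    if (config === undefined) throw new Error(`Unknown mode: ${id}`);
+    return config;
+  };
+  const sep = modeId.indexOf("--");
+  if (sep === -1) return { ...load(modeId) }; // no parent, e.g. "code"
+  const parent = load(modeId.slice(0, sep)); // "code--es" -> "code"
+  const child = load(modeId);
+  return { ...parent, ...child }; // child keys override parent keys
+}
+
+// Example data (illustrative values only).
+const exampleConfigs = {
+  code: { role: "Software Engineer", language: "en" },
+  "code--es": { language: "es" },
+};
+const resolved = resolveMode("code--es", exampleConfigs);
+```
+
+Splitting on the first `--` only means IDs such as `code--pt-br` still resolve to the `code` parent, because the single hyphen in `pt-br` is not a separator.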
--- a/.agent/services/claude-mem/docs/public/openclaw-integration.mdx
+++ b/.agent/services/claude-mem/docs/public/openclaw-integration.mdx
@@ -0,0 +1,381 @@
+---
+title: OpenClaw Integration
+description: Persistent memory for OpenClaw agents — observation recording, system prompt context injection, and real-time observation feeds
+icon: dragon
+---
+
+## Overview
+
+The OpenClaw plugin brings claude-mem's persistent memory to agents running on the [OpenClaw](https://openclaw.ai) gateway. It handles three things:
+
+1. **Observation recording** — Captures tool usage from OpenClaw's embedded runner and sends it to the claude-mem worker for AI processing
+2. **System prompt context injection** — Injects the observation timeline into each agent's system prompt via the `before_prompt_build` hook, keeping `MEMORY.md` free for agent-curated memory
+3. **Observation feed** — Streams new observations to messaging channels (Telegram, Discord, Slack, etc.) in real-time via SSE
+
+<Info>
+OpenClaw's embedded runner (`pi-embedded`) calls the Anthropic API directly without spawning a `claude` process, so claude-mem's standard hooks never fire. This plugin bridges that gap by using OpenClaw's event system to capture the same data.
+</Info>
+
+## How It Works
+
+```plaintext
+OpenClaw Gateway
+  │
+  ├── before_agent_start ───→ Init session
+  ├── before_prompt_build ──→ Inject context into system prompt
+  ├── tool_result_persist ──→ Record observation
+  ├── agent_end ────────────→ Summarize + Complete session
+  └── gateway_start ────────→ Reset session tracking + context cache
+                    │
+                    ▼
+         Claude-Mem Worker (localhost:37777)
+           ├── POST /api/sessions/init
+           ├── POST /api/sessions/observations
+           ├── POST /api/sessions/summarize
+           ├── POST /api/sessions/complete
+           ├── GET  /api/context/inject ──→ System prompt context
+           └── GET  /stream ─────────────→ SSE → Messaging channels
+```
+
+### Event Lifecycle
+
+<Steps>
+  <Step title="Agent starts (before_agent_start)">
+    When an OpenClaw agent starts, the plugin initializes a session by sending the user prompt to `POST /api/sessions/init` so the worker can create a new session and start processing.
+  </Step>
+  <Step title="Context injected (before_prompt_build)">
+    Before each LLM call, the plugin fetches the observation timeline from the worker's `/api/context/inject` endpoint and returns it as `appendSystemContext`. This injects cross-session context directly into the agent's system prompt without writing any files.
+
+    The context is cached for 60 seconds to avoid re-fetching on every LLM turn within a session.
+  </Step>
+  <Step title="Tool use recorded (tool_result_persist)">
+    Every time the agent uses a tool (Read, Write, Bash, etc.), the plugin sends the observation to `POST /api/sessions/observations` with the tool name, input, and truncated response (max 1000 chars). This is fire-and-forget — it doesn't block the agent from continuing work.
+
+    Tools prefixed with `memory_` are skipped to avoid recursive recording.
+  </Step>
+  <Step title="Agent finishes (agent_end)">
+    When the agent completes, the plugin extracts the last assistant message and sends it to `POST /api/sessions/summarize`, then calls `POST /api/sessions/complete` to close the session. Both are fire-and-forget.
+  </Step>
+  <Step title="Gateway restarts (gateway_start)">
+    Clears all session tracking (session IDs, context cache) so agents get fresh state after a gateway restart.
+  </Step>
+</Steps>
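+
+As a rough illustration of the recording step, the sketch below builds an observation payload, skips `memory_` tools, and truncates the response to 1000 characters. The function name and the payload field names are assumptions; only the prefix rule and the length limit come from the description above:
+
+```javascript
+// Hypothetical sketch of preparing a tool result for POST /api/sessions/observations.
+const MAX_RESPONSE_CHARS = 1000;
+
+function buildObservation(contentSessionId, toolName, toolInput, toolResponse) {
+  // Skip memory tools so the plugin never records its own lookups.
+  if (toolName.startsWith("memory_")) return null;
+  const text =
+    typeof toolResponse === "string"
+      ? toolResponse
+      : (JSON.stringify(toolResponse) ?? ""); // stringify(undefined) yields undefined
+  return {
+    contentSessionId, // assumed field name
+    tool_name: toolName, // assumed field name
+    tool_input: toolInput, // assumed field name
+    tool_response: text.slice(0, MAX_RESPONSE_CHARS),
+  };
+}
+```
+
+A caller would send the returned object without awaiting the response, which keeps the step fire-and-forget.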
+
+### System Prompt Context Injection
+
+The plugin injects cross-session observation context into each agent's system prompt via OpenClaw's `before_prompt_build` hook. The content comes from the worker's `GET /api/context/inject?projects=<project>` endpoint, which generates a formatted markdown timeline from the SQLite database.
+
+This approach keeps `MEMORY.md` under the agent's control for curated long-term memory (decisions, preferences, durable facts), while the observation timeline is delivered through the system prompt where it belongs.
+
+<Info>
+Context is cached for 60 seconds per project to avoid re-fetching on every LLM turn. The cache is cleared on gateway restart. Use `syncMemoryFileExclude` to opt specific agents out of context injection entirely.
+</Info>
+
+### Observation Feed (SSE → Messaging)
+
+The plugin runs a background service that connects to the worker's SSE stream (`GET /stream`) and forwards `new_observation` events to a configured messaging channel. This lets you monitor what your agents are learning in real-time from Telegram, Discord, Slack, or any supported OpenClaw channel.
+
+The SSE connection uses exponential backoff (1s → 30s) for automatic reconnection.
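+
+One way to compute such a reconnect delay is sketched below. The doubling factor is an assumption; only the 1s floor and the 30s ceiling come from the text above, and `backoffDelayMs` is a hypothetical name:
+
+```javascript
+// Hypothetical sketch: delay before reconnect attempt N (0-based),
+// doubling from 1s and capped at 30s.
+function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
+  return Math.min(maxMs, baseMs * 2 ** attempt);
+}
+
+// Delays for the first seven attempts.
+const delays = [0, 1, 2, 3, 4, 5, 6].map((n) => backoffDelayMs(n));
+```
+
+With these defaults the delay reaches the 30s cap on the sixth attempt and stays there, so a long outage never pushes the wait beyond 30 seconds.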
+
+## Setting Up the Observation Feed
+
+The observation feed sends a formatted message to your OpenClaw channel every time claude-mem creates a new observation. Each message includes the observation title and subtitle so you can follow along as your agents work.
+
+Messages look like this in your channel:
+
+```
+🧠 Claude-Mem Observation
+**Implemented retry logic for API client**
+Added exponential backoff with configurable max retries to handle transient failures
+```
+
+### Step 1: Choose your channel
+
+The observation feed works with any channel that your OpenClaw gateway has configured. You need two pieces of information:
+
+- **Channel type** — The name of the channel plugin registered with OpenClaw (e.g., `telegram`, `discord`, `slack`, `signal`, `whatsapp`, `line`)
+- **Target ID** — The chat ID, channel ID, or user ID where messages should be sent
+
+<AccordionGroup>
+  <Accordion title="Telegram" icon="telegram">
+    **Channel type:** `telegram`
+
+    **Target ID:** Your Telegram chat ID (numeric). To find it:
+    1. Message [@userinfobot](https://t.me/userinfobot) on Telegram
+    2. It will reply with your chat ID (e.g., `123456789`)
+    3. For group chats, the ID is negative (e.g., `-1001234567890`)
+
+    ```json
+    "observationFeed": {
+      "enabled": true,
+      "channel": "telegram",
+      "to": "123456789"
+    }
+    ```
+  </Accordion>
+
+  <Accordion title="Discord" icon="discord">
+    **Channel type:** `discord`
+
+    **Target ID:** The Discord channel ID. To find it:
+    1. Enable Developer Mode in Discord (Settings → Advanced → Developer Mode)
+    2. Right-click the channel → Copy Channel ID
+
+    ```json
+    "observationFeed": {
+      "enabled": true,
+      "channel": "discord",
+      "to": "1234567890123456789"
+    }
+    ```
+  </Accordion>
+
+  <Accordion title="Slack" icon="slack">
+    **Channel type:** `slack`
+
+    **Target ID:** The Slack channel ID (not the channel name). To find it:
+    1. Open the channel in Slack
+    2. Click the channel name at the top
+    3. Scroll to the bottom of the channel details — the ID looks like `C01ABC2DEFG`
+
+    ```json
+    "observationFeed": {
+      "enabled": true,
+      "channel": "slack",
+      "to": "C01ABC2DEFG"
+    }
+    ```
+  </Accordion>
+
+  <Accordion title="Signal" icon="signal-messenger">
+    **Channel type:** `signal`
+
+    **Target ID:** The Signal phone number or group ID configured in your OpenClaw gateway.
+
+    ```json
+    "observationFeed": {
+      "enabled": true,
+      "channel": "signal",
+      "to": "+1234567890"
+    }
+    ```
+  </Accordion>
+
+  <Accordion title="WhatsApp" icon="whatsapp">
+    **Channel type:** `whatsapp`
+
+    **Target ID:** The WhatsApp phone number or group JID configured in your OpenClaw gateway.
+
+    ```json
+    "observationFeed": {
+      "enabled": true,
+      "channel": "whatsapp",
+      "to": "+1234567890"
+    }
+    ```
+  </Accordion>
+
+  <Accordion title="LINE" icon="line">
+    **Channel type:** `line`
+
+    **Target ID:** The LINE user ID or group ID from the LINE Developer Console.
+
+    ```json
+    "observationFeed": {
+      "enabled": true,
+      "channel": "line",
+      "to": "U1234567890abcdef"
+    }
+    ```
+  </Accordion>
+</AccordionGroup>
+
+### Step 2: Add the config to your gateway
+
+Add the `observationFeed` block to your claude-mem plugin config in your OpenClaw gateway configuration:
+
+```json
+{
+  "plugins": {
+    "claude-mem": {
+      "enabled": true,
+      "config": {
+        "project": "my-project",
+        "observationFeed": {
+          "enabled": true,
+          "channel": "telegram",
+          "to": "123456789"
+        }
+      }
+    }
+  }
+}
+```
+
+<Warning>
+The `channel` value must match a channel plugin that is already configured and running on your OpenClaw gateway. If the channel isn't registered, you'll see `Unknown channel type: <channel>` in the logs.
+</Warning>
+
+### Step 3: Verify the connection
+
+After starting the gateway, check that the feed is connected:
+
+1. **Check the logs** — You should see:
+   ```
+   [claude-mem] Observation feed starting — channel: telegram, target: 123456789
+   [claude-mem] Connecting to SSE stream at http://localhost:37777/stream
+   [claude-mem] Connected to SSE stream
+   ```
+
+2. **Use the status command** — Run `/claude_mem_feed` in any OpenClaw chat to see:
+   ```
+   Claude-Mem Observation Feed
+   Enabled: yes
+   Channel: telegram
+   Target: 123456789
+   Connection: connected
+   ```
+
+3. **Trigger a test** — Have an agent do some work. When the worker processes the tool usage into an observation, you'll receive a message in your configured channel.
+
+<Info>
+The feed only sends `new_observation` events — not raw tool usage. Observations are generated asynchronously by the worker's AI agent, so there's a 1-2 second delay between tool use and the observation message appearing in your channel.
+</Info>
+
+### Troubleshooting the Feed
+
+| Symptom | Cause | Fix |
+|---------|-------|-----|
+| `Connection: disconnected` | Worker not running or wrong port | Check `workerPort` config, run `npm run worker:status` |
+| `Connection: reconnecting` | Worker was running but connection dropped | The plugin auto-reconnects with backoff — wait up to 30s |
+| `Unknown channel type` in logs | Channel plugin not loaded on gateway | Verify your OpenClaw gateway has the channel plugin configured |
+| No messages appearing | Feed connected but no observations being created | Check that agents are running and the worker is processing observations |
+| `Observation feed disabled` in logs | `enabled` is `false` or missing | Set `observationFeed.enabled` to `true` |
+| `Observation feed misconfigured` in logs | Missing `channel` or `to` | Both `channel` and `to` are required |
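+
+The last two rows of the table correspond to a simple configuration check, sketched here under the assumption that the feed config is a plain object with `enabled`, `channel`, and `to`. The function name and return values are hypothetical:
+
+```javascript
+// Hypothetical sketch of classifying an observationFeed config block.
+function checkFeedConfig(feed) {
+  if (!feed || feed.enabled !== true) return "disabled"; // false or missing
+  if (!feed.channel || !feed.to) return "misconfigured"; // both are required
+  return "ok";
+}
+```
+
+Running a check like this against your gateway config before starting the gateway can catch a missing `to` value early.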
+
+## Installation
+
+Run this one-liner to install everything automatically:
+
+```bash
+curl -fsSL https://install.cmem.ai/openclaw.sh | bash
+```
+
+The installer handles dependency checks (Bun, uv), plugin installation, memory slot configuration, AI provider setup, worker startup, and optional observation feed configuration.
+
+You can also pre-select options:
+
+```bash
+# With a specific AI provider
+curl -fsSL https://install.cmem.ai/openclaw.sh | bash -s -- --provider=gemini --api-key=YOUR_KEY
+
+# Fully unattended (defaults to Claude Max Plan)
+curl -fsSL https://install.cmem.ai/openclaw.sh | bash -s -- --non-interactive
+
+# Upgrade existing installation
+curl -fsSL https://install.cmem.ai/openclaw.sh | bash -s -- --upgrade
+```
+
+### Manual Configuration
+
+Add `claude-mem` to your OpenClaw gateway's plugin configuration:
+
+```json
+{
+  "plugins": {
+    "claude-mem": {
+      "enabled": true,
+      "config": {
+        "project": "my-project",
+        "syncMemoryFile": true,
+        "workerPort": 37777,
+        "observationFeed": {
+          "enabled": true,
+          "channel": "telegram",
+          "to": "your-chat-id"
+        }
+      }
+    }
+  }
+}
+```
+
+<Note>
+The claude-mem worker service must be running on the same machine as the OpenClaw gateway. The plugin communicates with it via HTTP on `localhost:37777`.
+</Note>
+
+## Configuration
+
+<ParamField body="project" type="string" default="openclaw">
+  Project name for scoping observations in the memory database. All observations from this gateway will be stored under this project name.
+</ParamField>
+
+<ParamField body="syncMemoryFile" type="boolean" default={true}>
+  Inject observation context into the agent system prompt via `before_prompt_build` hook. When `true`, agents receive cross-session context automatically. Set to `false` to disable context injection entirely (observations are still recorded).
+</ParamField>
+
+<ParamField body="syncMemoryFileExclude" type="string[]" default={[]}>
+  Agent IDs excluded from automatic context injection. Useful for agents that curate their own memory and don't need the observation timeline (e.g., `["snarf", "debugger"]`). Observations are still recorded for excluded agents — only the context injection is skipped.
+</ParamField>
+
+<ParamField body="workerPort" type="number" default={37777}>
+  Port for the claude-mem worker service. Override if your worker runs on a non-default port.
+</ParamField>
+
+<ParamField body="observationFeed.enabled" type="boolean" default={false}>
+  Enable live observation streaming to messaging channels.
+</ParamField>
+
+<ParamField body="observationFeed.channel" type="string">
+  Channel type: `telegram`, `discord`, `signal`, `slack`, `whatsapp`, `line`
+</ParamField>
+
+<ParamField body="observationFeed.to" type="string">
+  Target chat/user/channel ID to send observations to.
+</ParamField>
+
+## Commands
+
+### /claude_mem_feed
+
+Show or toggle the observation feed status.
+
+```
+/claude_mem_feed        # Show current status
+/claude_mem_feed on     # Request enable
+/claude_mem_feed off    # Request disable
+```
+
+### /claude_mem_status
+
+Check worker health and session status.
+
+```
+/claude_mem_status
+```
+
+Returns worker status, port, active session count, and observation feed connection state.
+
+## Architecture
+
+The plugin uses HTTP calls to the already-running claude-mem worker service rather than spawning subprocesses. This means:
+
+- No `bun` dependency required on the gateway
+- No process spawn overhead per event
+- Uses the same worker API that Claude Code hooks use
+- All operations are non-blocking (fire-and-forget where possible)
+
+### Session Tracking
+
+Each OpenClaw agent session gets a unique `contentSessionId` (format: `openclaw-<sessionKey>-<timestamp>`) that maps to a claude-mem session in the worker. The plugin tracks:
+
+- `sessionIds` — Maps OpenClaw session keys to content session IDs
+- `contextCache` — TTL cache (60s) for context injection responses, keyed by project
+
+Both are cleared on `gateway_start`.
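+
+The bookkeeping described above could be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `SessionTracker` class, its method names, and the use of an epoch-millisecond timestamp in the ID are all assumptions made for the example.
+
+```typescript
+// Hypothetical sketch; SessionTracker and its methods are illustrative names,
+// not the plugin's real API.
+const CONTEXT_TTL_MS = 60_000; // 60s TTL, as described above
+
+interface CachedContext {
+  value: string;
+  expiresAt: number;
+}
+
+class SessionTracker {
+  private sessionIds = new Map<string, string>();
+  private contextCache = new Map<string, CachedContext>();
+
+  // Returns the existing content session ID for a key, or creates one.
+  getContentSessionId(sessionKey: string, now: number = Date.now()): string {
+    let id = this.sessionIds.get(sessionKey);
+    if (id === undefined) {
+      id = `openclaw-${sessionKey}-${now}`; // timestamp unit is an assumption
+      this.sessionIds.set(sessionKey, id);
+    }
+    return id;
+  }
+
+  // Returns cached context for a project, or undefined when missing or expired.
+  getContext(project: string, now: number = Date.now()): string | undefined {
+    const entry = this.contextCache.get(project);
+    if (entry === undefined || entry.expiresAt <= now) {
+      this.contextCache.delete(project);
+      return undefined;
+    }
+    return entry.value;
+  }
+
+  setContext(project: string, value: string, now: number = Date.now()): void {
+    this.contextCache.set(project, { value, expiresAt: now + CONTEXT_TTL_MS });
+  }
+
+  // Mirrors the reset that happens on gateway_start.
+  clear(): void {
+    this.sessionIds.clear();
+    this.contextCache.clear();
+  }
+}
+```
+
+Keying the cache by project rather than by session means several sessions in the same project can share one context response until the TTL lapses.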
+
+## Requirements
+
+- Claude-mem worker service running on `localhost:37777` (or configured port)
+- OpenClaw gateway with plugin support
+- Network access between gateway and worker (localhost only)
--- a/.agent/services/claude-mem/docs/public/platform-integration.mdx
+++ b/.agent/services/claude-mem/docs/public/platform-integration.mdx
--- a/.agent/services/claude-mem/docs/public/progressive-disclosure.mdx
+++ b/.agent/services/claude-mem/docs/public/progressive-disclosure.mdx
@@ -0,0 +1,672 @@
+# Progressive Disclosure: Claude-Mem's Context Priming Philosophy
+
+## Core Principle
+**Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.**
+
+---
+
+## What is Progressive Disclosure?
+
+Progressive disclosure is an information architecture pattern where you reveal complexity gradually rather than all at once. In the context of AI agents, it means:
+
+1. **Layer 1 (Index)**: Show lightweight metadata (titles, dates, types, token counts)
+2. **Layer 2 (Details)**: Fetch full content only when needed
+3. **Layer 3 (Deep Dive)**: Read original source files if required
+
+This mirrors how humans work: we scan headlines before reading articles, review the table of contents before diving into chapters, and check file names before opening files.
+
+---
+
+## The Problem: Context Pollution
+
+Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:
+
+```
+❌ Traditional Approach:
+┌─────────────────────────────────────┐
+│ Session Start                        │
+│                                      │
+│ [15,000 tokens of past sessions]    │
+│ [8,000 tokens of observations]      │
+│ [12,000 tokens of file summaries]   │
+│                                      │
+│ Total: 35,000 tokens                │
+│ Relevant: ~2,000 tokens (6%)        │
+└─────────────────────────────────────┘
+```
+
+**Problems:**
+- Wastes 94% of attention budget on irrelevant context
+- User prompt gets buried under mountain of history
+- Agent must process everything before understanding task
+- No way to know what's actually useful until after reading
+
+---
+
+## Claude-Mem's Solution: Progressive Disclosure
+
+```
+✅ Progressive Disclosure Approach:
+┌─────────────────────────────────────┐
+│ Session Start                        │
+│                                      │
+│ Index of 50 observations: ~800 tokens│
+│ ↓                                    │
+│ Agent sees: "🔴 Hook timeout issue"  │
+│ Agent decides: "Relevant!"           │
+│ ↓                                    │
+│ Fetch observation #2543: ~155 tokens│
+│                                      │
+│ Total: 955 tokens                   │
+│ Relevant: 955 tokens (100%)         │
+└─────────────────────────────────────┘
+```
+
+**Benefits:**
+- Agent controls its own context consumption
+- Directly relevant to current task
+- Can fetch more if needed
+- Can skip everything if not relevant
+- Clear cost/benefit for each retrieval decision
+
+---
+
+## How It Works in Claude-Mem
+
+### The Index Format
+
+Every SessionStart hook provides a compact index:
+
+```markdown
+### Oct 26, 2025
+
+**General**
+| ID | Time | T | Title | Tokens |
+|----|------|---|-------|--------|
+| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
+| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
+| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |
+
+**src/hooks/context-hook.ts**
+| ID | Time | T | Title | Tokens |
+|----|------|---|-------|--------|
+| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
+| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
+```
+
+**What the agent sees:**
+- **What exists**: Observation titles give semantic meaning
+- **When it happened**: Timestamps for temporal context
+- **What type**: Icons indicate observation category
+- **Retrieval cost**: Token counts for informed decisions
+- **Where to get it**: MCP search tools referenced at bottom
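+
+To make the row structure concrete, here is a small sketch of how a consumer might parse one index row into a typed record. The `IndexRow` shape and `parseIndexRow` function are hypothetical names for this example, and the parser assumes titles never contain a `|` character.
+
+```typescript
+// Hypothetical types and names, for illustration only.
+interface IndexRow {
+  id: number;
+  time: string;
+  icon: string;
+  title: string;
+  tokens: number; // approximate retrieval cost
+}
+
+// Parses a row such as "| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |".
+// Returns null for header rows, separator rows, and anything malformed.
+function parseIndexRow(line: string): IndexRow | null {
+  const cells = line.split("|").map((cell) => cell.trim());
+  // Five columns produce seven cells: an empty one on each side of the row.
+  if (cells.length !== 7) return null;
+  const [, idCell, time, icon, title, tokenCell] = cells;
+  const id = /^#(\d+)$/.exec(idCell);
+  const tokens = /^~(\d+)$/.exec(tokenCell);
+  if (!id || !tokens) return null;
+  return { id: Number(id[1]), time, icon, title, tokens: Number(tokens[1]) };
+}
+```
+
+The header and separator rows fail the `#<digits>` check on the first column, so they are skipped without special handling.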
+
+### The Legend System
+
+```
+🎯 session-request  - User's original goal
+🔴 gotcha          - Critical edge case or pitfall
+🟡 problem-solution - Bug fix or workaround
+🔵 how-it-works    - Technical explanation
+🟢 what-changed    - Code/architecture change
+🟣 discovery       - Learning or insight
+🟠 why-it-exists   - Design rationale
+🟤 decision        - Architecture decision
+⚖️ trade-off       - Deliberate compromise
+```
+
+**Purpose:**
+- Visual scanning (humans and AI both benefit)
+- Semantic categorization
+- Priority signaling (🔴 gotchas are more critical)
+- Pattern recognition across sessions
+
+### Progressive Disclosure Instructions
+
+The index includes usage guidance:
+
+```markdown
+💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
+- Use MCP search tools to fetch full observation details on-demand
+- Prefer searching observations over re-reading code for past decisions
+- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
+```
+
+**What this does:**
+- Teaches the agent the pattern
+- Suggests when to fetch (critical types)
+- Recommends search over code re-reading (efficiency)
+- Makes the system self-documenting
+
+---
+
+## The Philosophy: Context as Currency
+
+### Mental Model: Token Budget as Money
+
+Think of context window as a bank account:
+
+| Approach | Metaphor | Outcome |
+|----------|----------|---------|
+| **Dump everything** | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can't afford what you actually need |
+| **Fetch nothing** | Refusing to spend any money | Starvation, can't accomplish tasks |
+| **Progressive disclosure** | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |
+
+### The Attention Budget
+
+LLMs have finite attention:
+- Every token attends to every other token (n² relationships)
+- 100,000 token window ≠ 100,000 tokens of useful attention
+- Context "rot" happens as window fills
+- Later tokens get less attention than earlier ones
+
+**Claude-Mem's approach:**
+- Start with ~1,000 tokens of index
+- Agent has ~99,000 tokens free for the task
+- Agent fetches ~200 tokens when needed
+- Final budget: ~98,800 tokens for actual work
+
+### Design for Autonomy
+
+> "As models improve, let them act intelligently"
+
+Progressive disclosure treats the agent as an **intelligent information forager**, not a passive recipient of pre-selected context.
+
+**Traditional RAG:**
+```
+System → [Decides relevance] → Agent
+        ↑
+   Hope this helps!
+```
+
+**Progressive Disclosure:**
+```
+System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
+                          ↑
+                   You know best!
+```
+
+The agent knows:
+- The current task context
+- What information would help
+- How much budget to spend
+- When to stop searching
+
+We don't.
+
+---
+
+## Implementation Principles
+
+### 1. Make Costs Visible
+
+Every item in the index shows token count:
+
+```
+| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
+                                                        ^^^^
+                                                    Retrieval cost
+```
+
+**Why:**
+- Agent can make informed ROI decisions
+- Small observations (~50 tokens) are "cheap" to fetch
+- Large observations (~500 tokens) require stronger justification
+- Matches how humans think about effort
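+
+Visible costs are what make budgeted retrieval possible. As a rough sketch, assuming the agent has already shortlisted candidate entries it considers relevant, a cheapest-first selection under a token budget might look like this. The `IndexEntry` type, the `pickWithinBudget` function, and the cheapest-first policy are all assumptions for illustration; claude-mem leaves the actual choice to the agent.
+
+```typescript
+// Hypothetical types and names, for illustration only.
+interface IndexEntry {
+  id: number;
+  title: string;
+  tokens: number; // approximate retrieval cost shown in the index
+}
+
+// Greedily picks candidates, cheapest first, without exceeding the budget.
+function pickWithinBudget(candidates: IndexEntry[], budget: number): IndexEntry[] {
+  const picked: IndexEntry[] = [];
+  let spent = 0;
+  // Sort a copy so the caller's array is left untouched.
+  for (const entry of [...candidates].sort((a, b) => a.tokens - b.tokens)) {
+    if (spent + entry.tokens > budget) break;
+    picked.push(entry);
+    spent += entry.tokens;
+  }
+  return picked;
+}
+```
+
+Without the token column, a selection step like this would have nothing to sort or sum over, which is the practical argument for showing costs in every row.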
+
+### 2. Use Semantic Compression
+
+Titles compress full observations into ~10 words:
+
+**Bad title:**
+```
+Observation about a thing
+```
+
+**Good title:**
+```
+🔴 Hook timeout issue: 60s default too short for npm install
+```
+
+**What makes a good title:**
+- Specific: Identifies exact issue
+- Actionable: Clear what to do
+- Self-contained: Doesn't require reading observation
+- Searchable: Contains key terms (hook, timeout, npm)
+- Categorized: Icon indicates type
+
+### 3. Group by Context
+
+Observations are grouped by:
+- **Date**: Temporal context
+- **File path**: Spatial context (work on specific files)
+- **Project**: Logical context
+
+```markdown
+**src/hooks/context-hook.ts**
+| ID | Time | T | Title | Tokens |
+|----|------|---|-------|--------|
+| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
+| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
+```
+
+**Benefit:** If the agent is working on `src/hooks/context-hook.ts`, related observations are already grouped together.
+
+### 4. Provide Retrieval Tools
+
+The index is useless without retrieval mechanisms:
+
+```markdown
+*Use claude-mem MCP search to access records with the given ID*
+```
+
+**Available MCP tools:**
+- `search` - Search memory index (Layer 1: Get IDs)
+- `timeline` - Get chronological context (Layer 2: See narrative arc)
+- `get_observations` - Fetch full details (Layer 3: Deep dive)
+
+The 3-layer workflow ensures progressive disclosure: index → context → details.
+
+---
+
+## Real-World Example
+
+### Scenario: Agent asked to fix a bug in hooks
+
+**Without progressive disclosure:**
+```
+SessionStart injects 25,000 tokens of past context
+Agent reads everything
+Agent finds 1 relevant observation (buried in middle)
+Total tokens consumed: 25,000
+Relevant tokens: ~200
+Efficiency: 0.8%
+```
+
+**With progressive disclosure:**
+```
+SessionStart shows index: ~800 tokens
+Agent sees title: "🔴 Hook timeout issue: 60s too short"
+Agent thinks: "This looks relevant to my bug!"
+Agent fetches observation #2543: ~155 tokens
+Total tokens consumed: 955
+Relevant tokens: 955
+Efficiency: 100%
+```
+
+### The Index Entry
+
+```markdown
+| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
+```
+
+**What the agent learns WITHOUT fetching:**
+- There's a known gotcha (🔴) about hook timeouts
+- It's related to npm install taking too long
+- Full details are ~155 tokens (cheap)
+- Happened at 2:14 PM (recent)
+
+**Decision tree:**
+```
+Is my task related to hooks? → YES
+Is my task related to timeouts? → YES
+Is my task related to npm? → YES
+155 tokens is cheap → FETCH IT
+```
+
+---
+
+## The Three-Layer Workflow
+
+Claude-Mem implements progressive disclosure through a 3-layer workflow pattern:
+
+### Layer 1: Search (Index)
+
+Start by searching to get a compact index with IDs:
+
+```typescript
+search({
+  query: "hook timeout",
+  limit: 10
+})
+```
+
+**Returns:**
+```
+Found 3 observations matching "hook timeout":
+
+| ID | Date | Type | Title |
+|----|------|------|-------|
+| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short |
+| #2891 | Oct 25 | how-it-works | Hook timeout configuration |
+| #2102 | Oct 20 | problem-solution | Fixed timeout in CI |
+```
+
+**Cost:** ~50-100 tokens per result
+**Value:** Agent can scan and decide which observations are relevant
+
+### Layer 2: Timeline (Context)
+
+Get chronological context around interesting observations:
+
+```typescript
+timeline({
+  anchor: 2543,  // Observation ID from search
+  depth_before: 3,
+  depth_after: 3
+})
+```
+
+**Returns:** Chronological view showing what happened before/during/after observation #2543
+
+**Cost:** Variable based on depth
+**Value:** Understand narrative arc and context
+
+### Layer 3: Get Observations (Details)
+
+Fetch full details only for relevant observations:
+
+```typescript
+get_observations({
+  ids: [2543, 2102]  // Selected from search results
+})
+```
+
+**Returns:**
+```
+#2543 🔴 Hook timeout: 60s too short for npm install
+─────────────────────────────────────────────────
+Date: Oct 26, 2025 2:14 PM
+Type: gotcha
+Project: claude-mem
+
+Narrative:
+Discovered that the default 60-second hook timeout is insufficient
+for npm install operations, especially with large dependency trees
+or slow network conditions. This causes SessionStart hook to fail
+silently, preventing context injection.
+
+Facts:
+- Default timeout: 60 seconds
+- npm install with cold cache: ~90 seconds
+- Configured timeout: 120 seconds in plugin/hooks/hooks.json:25
+
+Files Modified:
+- plugin/hooks/hooks.json
+
+Concepts: hooks, timeout, npm, configuration
+```
+
+**Cost:** ~155 tokens for full details
+**Value:** Complete understanding of the issue
+
+---
+
+## Cognitive Load Theory
+
+Progressive disclosure is grounded in **Cognitive Load Theory**:
+
+### Intrinsic Load
+The inherent difficulty of the task itself.
+
+**Example:** "Fix authentication bug"
+- Must understand auth system
+- Must understand the bug
+- Must write the fix
+
+This load is unavoidable.
+
+### Extraneous Load
+The cognitive burden of poorly presented information.
+
+**Traditional RAG adds extraneous load:**
+- Scanning irrelevant observations
+- Filtering out noise
+- Remembering what to ignore
+- Re-contextualizing after each section
+
+**Progressive disclosure minimizes extraneous load:**
+- Scan titles (low effort)
+- Fetch only relevant (targeted effort)
+- Full attention on current task
+
+### Germane Load
+The effort of building mental models and schemas.
+
+**Progressive disclosure supports germane load:**
+- Consistent structure (legend, grouping)
+- Clear categorization (types, icons)
+- Semantic compression (good titles)
+- Explicit costs (token counts)
+
+---
+
+## Anti-Patterns to Avoid
+
+### ❌ Verbose Titles
+
+**Bad:**
+```
+| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
+```
+
+**Good:**
+```
+| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
+```
+
+### ❌ Hiding Costs
+
+**Bad:**
+```
+| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
+```
+
+**Good:**
+```
+| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |
+```
+
+### ❌ No Retrieval Path
+
+**Bad:**
+```
+Here are 10 observations. [No instructions on how to get full details]
+```
+
+**Good:**
+```
+Here are 10 observations.
+*Use MCP search tools to fetch full observation details on-demand*
+```
+
+### ❌ Skipping the Index Layer
+
+**Bad:**
+```typescript
+// Fetching full details immediately
+get_observations({
+  ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  // Guessing which are relevant
+})
+```
+
+**Good:**
+```typescript
+// Follow the 3-layer workflow
+// Layer 1: Search for index
+search({
+  query: "hooks",
+  limit: 20
+})
+
+// Layer 2: Review index, identify 2-3 relevant IDs
+
+// Layer 3: Fetch only relevant observations
+get_observations({
+  ids: [2543, 2891]  // Just the most relevant
+})
+```
+
+---
+
+## Key Design Decisions
+
+### Why Token Counts?
+
+**Decision:** Show approximate token counts (~155, ~203) rather than exact counts.
+
+**Rationale:**
+- Communicates scale (50 vs 500) without false precision
+- Maps to human intuition (small/medium/large)
+- Allows agent to budget attention
+- Encourages cost-conscious retrieval
+
+### Why Icons Instead of Text Labels?
+
+**Decision:** Use emoji icons (🔴, 🟡, 🔵) rather than text (GOTCHA, PROBLEM, HOWTO).
+
+**Rationale:**
+- Visual scanning (pattern recognition)
+- Compact (one character instead of a ~10-character label)
+- Language-agnostic
+- Aesthetically distinct
+- Works for both humans and AI
+
+### Why Index-First, Not Smart Pre-Fetch?
+
+**Decision:** Always show index first, even if we "know" what's relevant.
+
+**Rationale:**
+- We can't know what's relevant better than the agent
+- Pre-fetching assumes we understand the task
+- Agent knows current context, we don't
+- Respects agent autonomy
+- Fails gracefully (can always fetch more)
+
+### Why Group by File Path?
+
+**Decision:** Group observations by file path in addition to date.
+
+**Rationale:**
+- Spatial locality: Work on file X likely needs context about file X
+- Reduces scanning effort
+- Matches how developers think
+- Clear semantic boundaries
+
+---
+
+## Measuring Success
+
+Progressive disclosure is working when:
+
+### ✅ Low Waste Ratio
+```
+Relevant Tokens / Total Context Tokens > 80%
+```
+
+Most of the context consumed is actually useful.
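+
+As a minimal sketch of how this ratio could be computed and checked against the 80% target, assuming relevant and total token counts are available from some session log (the function names here are hypothetical):
+
+```typescript
+// Hypothetical helpers, for illustration only.
+
+// Fraction of consumed context tokens that were relevant to the task.
+function relevanceRatio(relevantTokens: number, totalTokens: number): number {
+  if (totalTokens <= 0) return 0; // avoid division by zero for empty sessions
+  return relevantTokens / totalTokens;
+}
+
+// True when the ratio is strictly above the target (default 80%).
+function meetsTarget(relevantTokens: number, totalTokens: number, target = 0.8): boolean {
+  return relevanceRatio(relevantTokens, totalTokens) > target;
+}
+```
+
+Applied to the earlier examples, the 25,000-token dump with ~200 relevant tokens fails the check, while the 955-token progressive disclosure session passes it.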
+
+### ✅ Selective Fetching
+```
+Index Shown: 50 observations
+Details Fetched: 2-3 observations
+```
+
+Agent is being selective, not fetching everything.
+
+### ✅ Fast Task Completion
+```
+Session with index: 30 seconds to find relevant context
+Session without: 90 seconds scanning all context
+```
+
+Time-to-relevant-information is faster.
+
+### ✅ Appropriate Depth
+```
+Simple task: Only index needed
+Medium task: 1-2 observations fetched
+Complex task: 5-10 observations + code reads
+```
+
+Depth scales with task complexity.
+
+---
+
+## Future Enhancements
+
+### Adaptive Index Size
+
+```typescript
+// Hypothetical sketch: vary the number of indexed sessions by session source
+type SessionSource = "startup" | "resume" | "compact";
+
+function sessionsToIndex(source: SessionSource): number {
+  switch (source) {
+    case "startup":
+      return 10; // last 10 sessions (small index)
+    case "resume":
+      return 1; // current session only (micro index)
+    case "compact":
+      return 20; // last 20 sessions (larger index)
+  }
+}
+```
+
+### Relevance Scoring
+
+```typescript
+// Use embeddings to pre-sort index by relevance
+search({
+  query: "authentication bug",
+  orderBy: "relevance"  // Based on semantic similarity (future enhancement)
+})
+```
+
+### Cost Forecasting
+
+```markdown
+💡 **Budget Estimate:**
+- Fetching all 🔴 gotchas: ~450 tokens
+- Fetching all file-related: ~1,200 tokens
+- Fetching everything: ~8,500 tokens
+```
+
+### Progressive Detail Levels
+
+```
+Layer 1: Index (titles only)
+Layer 2: Summaries (2-3 sentences)
+Layer 3: Full details (complete observation)
+Layer 4: Source files (referenced code)
+```
+
+---
+
+## Key Takeaways
+
+1. **Show, don't tell**: Index reveals what exists without forcing consumption
+2. **Cost-conscious**: Make retrieval costs visible for informed decisions
+3. **Agent autonomy**: Let the agent decide what's relevant
+4. **Semantic compression**: Good titles make or break the system
+5. **Consistent structure**: Patterns reduce cognitive load
+6. **Two-tier everything**: Index first, details on-demand
+7. **Context as currency**: Spend wisely on high-value information
+
+---
+
+## Remember
+
+> "The best interface is one that disappears when not needed, and appears exactly when it is."
+
+Progressive disclosure respects the agent's intelligence and autonomy. We provide the map; the agent chooses the path.
+
+---
+
+## Further Reading
+
+- [Context Engineering for AI Agents](context-engineering) - Foundational principles
+- [Claude-Mem Architecture](architecture/overview) - How it all fits together
+- Cognitive Load Theory (Sweller, 1988)
+- Information Foraging Theory (Pirolli & Card, 1999)
+- Progressive Disclosure (Nielsen Norman Group)
+
+---
+
+*This philosophy emerged from real-world usage of Claude-Mem across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.*
--- a/.agent/services/claude-mem/docs/public/smart-explore-benchmark.mdx
+++ b/.agent/services/claude-mem/docs/public/smart-explore-benchmark.mdx
@@ -0,0 +1,196 @@
+---
+title: "Smart Explore Benchmark"
+description: "Token efficiency comparison between AST-based and traditional code exploration"
+---
+
+# Smart Explore Benchmark
+
+Smart Explore uses tree-sitter AST parsing to provide structural code navigation through three MCP tools: `smart_search`, `smart_outline`, and `smart_unfold`. This report documents a rigorous A/B comparison against the standard Explore agent (which uses Glob, Grep, and Read tools) to quantify the token savings and quality trade-offs.
+
+## Executive Summary
+
+| Metric | Smart Explore | Explore Agent | Advantage |
+|--------|:---:|:---:|---|
+| Discovery (cross-file search) | ~14,200 tokens | ~252,500 tokens | **17.8x cheaper** |
+| Targeted reads (specific symbols) | ~5,650 tokens | ~109,400 tokens | **19.4x cheaper** |
+| End-to-end (search + read) | ~4,200 tokens | ~45,000 tokens | **10-12x cheaper** |
+| Completeness | 5/5 full source returned | 4/5 (truncated longest method) | Smart Explore more reliable |
+| Speed | Under 2s per call | 5-66s per call | **10-30x faster** |
+
+## Methodology
+
+### Test Environment
+
+- **Codebase**: claude-mem (`src/` directory, 194 TypeScript files, 1,206 parsed symbols)
+- **Model**: Claude Opus 4.6 for both approaches
+- **Measurement**: Token counts from tool response metadata (`total_tokens` for Explore agents, self-reported `~N tokens for folded view` for Smart Explore)
+
+### Controls
+
+The Explore agents were explicitly instructed: *"Do NOT use smart_search, smart_outline, or smart_unfold tools. Only use Glob, Grep, and Read tools."* This instruction proved necessary after an initial round in which agents opportunistically used the Smart Explore tools, invalidating the comparison.
+
+### Queries
+
+Five queries were selected to represent common exploration tasks:
+
+1. **"session processing"** -- Cross-cutting feature spanning multiple services
+2. **"shutdown"** -- Infrastructure concern touching 6+ files
+3. **"hook registration"** -- Architecture question about plugin system
+4. **"sqlite database"** -- Technology-specific search across the data layer
+5. **"worker-service.ts outline"** -- Single large file (1,225 lines) structural understanding
+
+## Round 1: Discovery
+
+*"What exists and where is it?"* -- Finding relevant files and symbols across the codebase.
+
+### Results
+
+| Query | Smart Explore | Explore Agent | Ratio | Explore Tool Calls |
+|-------|:---:|:---:|:---:|:---:|
+| session processing | ~4,391 t | 51,659 t | **11.8x** | 15 |
+| shutdown | ~3,852 t | 51,523 t | **13.4x** | 18 |
+| hook registration | ~1,930 t | 51,688 t | **26.8x** | 37 |
+| sqlite database | ~2,543 t | 58,633 t | **23.1x** | 16 |
+| worker-service outline | ~1,500 t | 38,973 t | **26.0x** | 15 |
+| **Total** | **~14,216 t** | **252,476 t** | **17.8x** | **101** |
+
+### What Each Returned
+
+**Smart Explore** (1 tool call each): 10 ranked symbols with signatures, line numbers, and JSDoc summaries, plus folded structural views of all matching files showing every function/class/interface with bodies collapsed.
+
+**Explore Agent** (15-37 tool calls each): Synthesized narrative reports with architecture diagrams, design pattern analysis, data flow explanations, complete interface dumps, and file structure maps. Significantly more explanatory prose.
+
+### Analysis
+
+The token gap is widest for narrowly-scoped queries ("hook registration" at 26.8x) because the Explore agent reads multiple full files to find relatively few relevant symbols. For broad queries ("session processing" at 11.8x), more of the file content is relevant, narrowing the ratio.
+
+Smart Explore's consistent 1-tool-call pattern means its cost is predictable. The Explore agent's cost varies with how many files it reads and how much it synthesizes -- ranging from 15 to 37 tool calls for comparable scope.
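+
+The ratios in the results table can be reproduced directly from the reported token counts. The snippet below is only a worked check of that arithmetic; the `BenchmarkRow` type and variable names are illustrative and not part of the benchmark tooling.
+
+```typescript
+// Token counts copied from the Round 1 results table.
+interface BenchmarkRow {
+  query: string;
+  smartExplore: number;
+  exploreAgent: number;
+}
+
+const round1: BenchmarkRow[] = [
+  { query: "session processing", smartExplore: 4391, exploreAgent: 51659 },
+  { query: "shutdown", smartExplore: 3852, exploreAgent: 51523 },
+  { query: "hook registration", smartExplore: 1930, exploreAgent: 51688 },
+  { query: "sqlite database", smartExplore: 2543, exploreAgent: 58633 },
+  { query: "worker-service outline", smartExplore: 1500, exploreAgent: 38973 },
+];
+
+// Ratio rounded to one decimal place, as reported in the table.
+function ratio(smartExplore: number, exploreAgent: number): number {
+  return Math.round((exploreAgent / smartExplore) * 10) / 10;
+}
+
+const totalSmart = round1.reduce((sum, row) => sum + row.smartExplore, 0);
+const totalExplore = round1.reduce((sum, row) => sum + row.exploreAgent, 0);
+
+for (const row of round1) {
+  console.log(`${row.query}: ${ratio(row.smartExplore, row.exploreAgent)}x`);
+}
+console.log(`total: ${totalSmart} vs ${totalExplore} -> ${ratio(totalSmart, totalExplore)}x`);
+```
+
+Note that the overall 17.8x figure is the ratio of the totals, not the mean of the per-query ratios.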
+
+## Round 2: Targeted Reads
+
+*"Show me this specific function."* -- Reading the implementation of a known symbol after discovery.
+
+Based on the Round 1 results, five specific symbols were selected as natural drill-down targets:
+
+| Target Symbol | File | Lines |
+|---------------|------|:---:|
+| `SessionManager.initializeSession` | services/worker/SessionManager.ts | 135 |
+| `performGracefulShutdown` | services/infrastructure/GracefulShutdown.ts | 48 |
+| `hookCommand` | cli/hook-command.ts | 45 |
+| `DatabaseManager.initialize` | services/sqlite/Database.ts | 27 |
+| `WorkerService.startSessionProcessor` | services/worker-service.ts | 158 |
+
+### Results
+
+| Symbol | Smart Unfold | Explore Agent | Ratio | Completeness |
+|--------|:---:|:---:|:---:|---|
+| initializeSession (135 lines) | ~1,800 t | 27,816 t | **15.5x** | Both returned full source |
+| performGracefulShutdown (48 lines) | ~700 t | 19,621 t | **28.0x** | Both returned full source |
+| hookCommand (45 lines) | ~650 t | 18,680 t | **28.7x** | Both returned full source |
+| DatabaseManager.initialize (27 lines) | ~400 t | 22,334 t | **55.8x** | Both returned full source |
+| startSessionProcessor (158 lines) | ~2,100 t | 20,906 t | **10.0x** | Smart Unfold: complete. Explore: **truncated** |
+| **Total** | **~5,650 t** | **109,357 t** | **19.4x** | |
+
+### Analysis
+
+**The ratio scales inversely with symbol size.** The smallest function (`initialize`, 27 lines) shows the biggest gap at 55.8x because the Explore agent still reads the entire 235-line file to extract 27 lines. The largest method (`startSessionProcessor`, 158 lines) narrows to 10x since more of the file is "useful."
+
+**Smart Unfold returned more complete code.** For the longest method (158 lines), the Explore agent truncated the error handling section with "... error handling continues ...", while `smart_unfold` returned the complete implementation. This is because `smart_unfold` extracts by AST node boundaries, guaranteeing completeness regardless of symbol size.
+
+**Explore agents add almost no unique information for targeted reads.** When you already know the file path and symbol name, the agent's overhead is nearly pure waste -- it reads the file, locates the function, and echoes it back. The only addition is a brief explanatory paragraph.
+
+## Combined Workflow
+
+The realistic workflow is discovery followed by targeted reading. Here is the end-to-end cost comparison for understanding a single function:
+
+### Smart Explore: search + unfold
+
+```
+smart_search("shutdown", path="./src")     ~3,852 tokens
+smart_unfold("GracefulShutdown.ts", "performGracefulShutdown")  ~700 tokens
+────────────────────────────────────────────────────────────────
+Total: ~4,552 tokens (2 tool calls, under 3 seconds)
+```
+
+### Explore Agent: single query
+
+```
+"Find and explain the shutdown logic"      ~51,523 tokens
+────────────────────────────────────────────────────────────────
+Total: ~51,523 tokens (18 tool calls, ~43 seconds)
+```
+
+**End-to-end ratio: 11.3x** -- and the Smart Explore workflow gives you the actual source code, while the Explore agent gives you a prose summary that may paraphrase or truncate.
+
+## Quality Assessment
+
+Neither approach is universally better. They optimize for different outcomes.
+
+### Smart Explore Strengths
+
+- **Predictable cost**: 1 tool call per operation, consistent token ranges
+- **Complete source code**: AST-based extraction guarantees full symbol bodies
+- **Structural context**: Folded views show every symbol in matching files
+- **Speed**: Responses in under 2 seconds per call enable rapid iteration
+- **Composability**: Search, outline, and unfold chain naturally
+
+### Explore Agent Strengths
+
+- **Synthesized understanding**: Produces architecture narratives, data flow diagrams, and design pattern analysis
+- **Cross-cutting explanation**: Connects concepts across files that individual symbol reads cannot
+- **Onboarding quality**: Output reads like documentation, not raw code
+- **Error handling insight**: Identifies edge cases and design decisions that require reading multiple related functions
+- **No prior knowledge needed**: Can answer open-ended questions without knowing file paths or symbol names
+
+### Quality by Task Type
+
+| Task | Better Tool | Why |
+|------|-------------|-----|
+| "Where is X defined?" | Smart Explore | One call, exact answer |
+| "What functions are in this file?" | Smart Explore | Outline returns complete structural map |
+| "Show me this function" | Smart Explore | Unfold returns exact source, never truncates |
+| "How does feature X work end-to-end?" | Explore Agent | Reads multiple files and synthesizes narrative |
+| "What design patterns are used here?" | Explore Agent | Requires reading and interpreting, not just extracting |
+| "Help me understand this codebase" | Explore Agent | Produces onboarding-quality documentation |
+
+## When to Use Which
+
+**Use Smart Explore when:**
+- You know what you are looking for (function name, concept, file)
+- You need source code, not explanation
+- You are iterating quickly (read, modify, read again)
+- Token budget matters (large codebases, long sessions)
+- You need file structure at a glance
+
+**Use the Explore Agent when:**
+- You need synthesized cross-cutting understanding
+- The question is open-ended ("how does this system work?")
+- You are writing documentation or architecture reviews
+- You need to understand *why*, not just *what*
+- You are onboarding to an unfamiliar codebase
+
+**Use both when:**
+- Start with Smart Explore for discovery and navigation
+- Escalate to Explore Agent only for deep analysis that requires multi-file synthesis
+- This hybrid approach captures most of the token savings while preserving access to deep understanding when needed
+
+## Token Economics Reference
+
+| Operation | Tokens | Use Case |
+|-----------|:---:|----------|
+| `smart_search` | 2,000-6,000 | Cross-file symbol discovery |
+| `smart_outline` | 1,000-2,000 | Single file structural map |
+| `smart_unfold` | 400-2,100 | Single symbol full source |
+| `smart_search` + `smart_unfold` | 3,000-8,000 | End-to-end: find and read |
+| Explore Agent (targeted) | 18,000-28,000 | Single function with explanation |
+| Explore Agent (cross-cutting) | 39,000-59,000 | Architecture-level understanding |
+| Read (full file) | 8,000-15,000+ | Complete file contents |
+
+### Savings by Workflow
+
+| Workflow | Smart Explore | Traditional | Savings |
+|----------|:---:|:---:|:---:|
+| Understand one file | outline + unfold (~3,100 t) | Read full file (~12,000 t) | **4x** |
+| Find a function across codebase | search (~3,500 t) | Explore agent (~50,000 t) | **14x** |
+| Find and read a specific function | search + unfold (~4,500 t) | Explore agent (~50,000 t) | **11x** |
+| Navigate a 1,200-line file | outline (~1,500 t) | Read full file (~12,000 t) | **8x** |
--- a/.agent/services/claude-mem/docs/public/trendshift-badge-dark.svg
+++ b/.agent/services/claude-mem/docs/public/trendshift-badge-dark.svg
@@ -0,0 +1,16 @@
+<?xml version="1.0" encoding="UTF-8"?>
+  <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 250 53" width="250" height="55" data-date-format="longDate">
+    <rect xmlns="http://www.w3.org/2000/svg" stroke="#b5a0d9" stroke-width="1" fill="#1a1a1a" x="0.5" y="0.5" width="249" height="53" rx="10"/>
+    <foreignObject width="198" height="17" style="font-size: 9px;color: rgb(200, 180, 230);font-family: Arial;font-weight: 400;text-align: center;letter-spacing: 0em;line-height: 1.5;" x="6" y="10" selection="true">
+      <div xmlns="http://www.w3.org/1999/xhtml">GITHUB TRENDING</div>
+    </foreignObject>
+    <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" id="&#x421;&#x43B;&#x43E;&#x439;_1" viewBox="0 0 80 80" width="48" height="45" x="10" y="8">
+  <path fill="#b5a0d9" d="M70.71,40.31C75.74,44.3,80,37.86,80,37.86s-5.64-2.17-8.55,0.61c0.59-1.62,1.02-3.31,1.28-5.01  c4.08,2.16,6.44-2.95,6.44-2.95s-4.41-0.97-6.26,1.4c0.08-0.91,0.12-1.82,0.1-2.73c-0.01-0.36-0.02-0.73-0.05-1.09  c2.96-3.68-1.73-6.99-1.73-6.99s-2.14,5.09,0.98,7.09c0.02,0.33,0.03,0.66,0.03,1c0.01,0.76-0.03,1.52-0.1,2.27  c-0.85-2.69-4.91-3.69-4.91-3.69s-0.13,5.78,4.68,5.48c-0.28,1.69-0.73,3.35-1.34,4.95c-0.19-4.03-5.79-6.33-5.79-6.33  s-1.33,7.55,5.01,8.16c-0.38,0.8-0.8,1.57-1.25,2.32c-0.56,0.95-1.21,1.84-1.89,2.71c0.97-3.99-3.96-7.72-3.96-7.72  s-3.18,6.94,2.73,9.15c-0.38,0.43-0.8,0.81-1.2,1.21c-0.21,0.2-0.43,0.38-0.64,0.58l-0.32,0.29c-0.11,0.09-0.22,0.18-0.33,0.27  l-0.67,0.54l-0.7,0.51c-0.08,0.05-0.16,0.11-0.23,0.16c1.62-3.42-2.07-7.77-2.07-7.77s-4.21,5.55,0.49,8.78  c-1.34,0.79-2.74,1.45-4.2,1.98c1.91-2.59-0.23-6.89-0.23-6.89s-4.66,3.77-1.52,7.46c-1.15,0.33-2.33,0.57-3.51,0.74  c1.46-1.68,0.55-4.83,0.55-4.83s-3.7,2.03-2.18,5c-0.52,0.03-1.05,0.07-1.57,0.06c-0.29,0-0.57,0.01-0.86,0l-0.86-0.04  c-0.85-0.06-1.7-0.15-2.54-0.28l0.68-0.27l0.42-0.17l0.41-0.19l0.82-0.38c0,0,0.01,0,0.01,0c0.39-0.18,0.55-0.65,0.37-1.03  c-0.18-0.39-0.65-0.55-1.03-0.37l-0.04,0.02l-0.77,0.37l-0.39,0.18l-0.39,0.16l-0.79,0.33l-0.8,0.29l-0.4,0.14l-0.41,0.12L40,53.6  l-0.51-0.15l-0.41-0.12l-0.4-0.14l-0.8-0.29l-0.79-0.33l-0.39-0.16l-0.39-0.18l-0.77-0.37l-0.04-0.02c0,0,0,0-0.01,0  c-0.39-0.18-0.85-0.01-1.03,0.38c-0.18,0.39-0.01,0.85,0.38,1.03l0.82,0.38l0.41,0.19l0.42,0.17l0.68,0.27  c-0.84,0.14-1.69,0.22-2.54,0.28l-0.86,0.04c-0.29,0.01-0.57,0-0.86,0c-0.53,0.01-1.05-0.03-1.57-0.06c1.51-2.98-2.18-5-2.18-5  s-0.92,3.15,0.55,4.83c-1.19-0.16-2.36-0.41-3.51-0.74c3.15-3.7-1.52-7.46-1.52-7.46s-2.14,4.31-0.23,6.89  c-1.46-0.53-2.86-1.19-4.2-1.98c4.7-3.22,0.49-8.78,0.49-8.78s-3.69,4.34-2.07,7.77c-0.08-0.05-0.16-0.1-0.23-0.16l-0.7-0.51  l-0.67-0.54c-0.11-0.09-0.23-0.18-0.33-0.27l-0.32-0.29c-0.21-0.19-0.43-0.38-0.64-0.58c-0.4-0.4-0.82-0.79-1.2-1.21  
c5.91-2.21,2.73-9.15,2.73-9.15s-4.93,3.73-3.96,7.72c-0.68-0.86-1.33-1.76-1.89-2.71c-0.46-0.75-0.87-1.53-1.25-2.32  c6.33-0.61,5.01-8.16,5.01-8.16s-5.6,2.31-5.79,6.33c-0.61-1.6-1.06-3.26-1.34-4.95c4.81,0.3,4.68-5.48,4.68-5.48  s-4.05,0.99-4.91,3.69c-0.07-0.76-0.1-1.51-0.1-2.27c0-0.33,0.01-0.66,0.03-1c3.11-2.01,0.98-7.09,0.98-7.09s-4.69,3.31-1.73,6.99  C7,28.46,6.99,28.82,6.98,29.18c-0.02,0.91,0.01,1.82,0.1,2.73c-1.84-2.38-6.26-1.4-6.26-1.4s2.37,5.11,6.44,2.95  c0.26,1.71,0.69,3.39,1.28,5.01C5.64,35.69,0,37.86,0,37.86s4.26,6.43,9.29,2.45c0.39,0.87,0.83,1.72,1.31,2.54  c0.47,0.83,1.01,1.63,1.58,2.4C8.71,43.7,4.11,47,4.11,47s5.7,5.1,9.56,0.08c0.04,0.04,0.07,0.08,0.11,0.12  c0.39,0.45,0.82,0.87,1.24,1.3c0.21,0.21,0.44,0.41,0.66,0.61l0.33,0.3c0.11,0.1,0.23,0.19,0.34,0.29l0.69,0.57l0.23,0.17  c-3.34-0.34-6.58,3.29-6.58,3.29s6.19,3.47,8.69-1.83c1.2,0.75,2.47,1.41,3.78,1.96c-2.76,0.6-4.62,4.13-4.62,4.13  s5.89,1.62,6.98-3.26c1.03,0.32,2.07,0.58,3.13,0.78c-1.63,0.99-2.39,3.38-2.39,3.38s4.31,0.39,4.61-3.07  c0.07,0.01,0.14,0.02,0.21,0.02c0.6,0.04,1.2,0.1,1.8,0.09c0.3,0,0.6,0.02,0.9,0.01l0.9-0.03c1.2-0.07,2.41-0.18,3.59-0.42  l0.45-0.08c0.15-0.03,0.29-0.07,0.44-0.1L40,55.13l0.81,0.19c0.15,0.03,0.29,0.07,0.44,0.1l0.45,0.08c1.18,0.23,2.39,0.35,3.59,0.42  l0.9,0.03c0.3,0.01,0.6-0.01,0.9-0.01c0.6,0,1.2-0.06,1.8-0.09c0.07-0.01,0.14-0.02,0.21-0.02c0.31,3.45,4.61,3.07,4.61,3.07  s-0.76-2.39-2.39-3.38c1.06-0.2,2.11-0.46,3.13-0.78c1.09,4.88,6.98,3.26,6.98,3.26s-1.86-3.52-4.62-4.13  c1.31-0.55,2.57-1.21,3.78-1.96c2.5,5.3,8.69,1.83,8.69,1.83s-3.24-3.63-6.58-3.29l0.23-0.17l0.69-0.57  c0.11-0.1,0.23-0.19,0.34-0.29l0.33-0.3c0.22-0.2,0.45-0.4,0.66-0.61c0.42-0.43,0.85-0.84,1.24-1.3c0.03-0.04,0.07-0.08,0.11-0.12  C70.19,52.1,75.89,47,75.89,47s-4.6-3.31-8.08-1.75c0.57-0.77,1.11-1.56,1.58-2.4C69.88,42.03,70.31,41.18,70.71,40.31z"/>
+  </svg>
+    <foreignObject width="230" height="35" style="font-size: 14px;color: rgb(200, 180, 230);font-family: Arial;font-weight: 700;text-align: left;letter-spacing: 0em;line-height: 1.5;" x="64" y="24">
+      <div xmlns="http://www.w3.org/1999/xhtml">#1 Repository Of The Day</div>
+    </foreignObject>
+    <foreignObject width="141" height="36" style="font-size: 18px;color: rgb(180, 160, 217);font-family: Arial;font-weight: 400;text-align: center;letter-spacing: 0em;line-height: 1.5;" x="-36" y="9">
+      <div xmlns="http://www.w3.org/1999/xhtml">1</div>
+    </foreignObject>
+  </svg>
--- a/.agent/services/claude-mem/docs/public/trendshift-badge.svg
+++ b/.agent/services/claude-mem/docs/public/trendshift-badge.svg
@@ -0,0 +1,16 @@
+<?xml version="1.0" encoding="UTF-8"?>
+  <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 250 53" width="250" height="55" data-date-format="longDate">
+    <rect xmlns="http://www.w3.org/2000/svg" stroke="#4a0e99" stroke-width="1" fill="#FFFFFF" x="0.5" y="0.5" width="249" height="53" rx="10"/>
+    <foreignObject width="198" height="17" style="font-size: 9px;color: rgb(67, 39, 135);font-family: Arial;font-weight: 400;text-align: center;letter-spacing: 0em;line-height: 1.5;" x="6" y="10" selection="true">
+      <div xmlns="http://www.w3.org/1999/xhtml">GITHUB TRENDING</div>
+    </foreignObject>
+    <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" id="&#x421;&#x43B;&#x43E;&#x439;_1" viewBox="0 0 80 80" width="48" height="45" x="10" y="8">
+  <path fill="#49278e" d="M70.71,40.31C75.74,44.3,80,37.86,80,37.86s-5.64-2.17-8.55,0.61c0.59-1.62,1.02-3.31,1.28-5.01  c4.08,2.16,6.44-2.95,6.44-2.95s-4.41-0.97-6.26,1.4c0.08-0.91,0.12-1.82,0.1-2.73c-0.01-0.36-0.02-0.73-0.05-1.09  c2.96-3.68-1.73-6.99-1.73-6.99s-2.14,5.09,0.98,7.09c0.02,0.33,0.03,0.66,0.03,1c0.01,0.76-0.03,1.52-0.1,2.27  c-0.85-2.69-4.91-3.69-4.91-3.69s-0.13,5.78,4.68,5.48c-0.28,1.69-0.73,3.35-1.34,4.95c-0.19-4.03-5.79-6.33-5.79-6.33  s-1.33,7.55,5.01,8.16c-0.38,0.8-0.8,1.57-1.25,2.32c-0.56,0.95-1.21,1.84-1.89,2.71c0.97-3.99-3.96-7.72-3.96-7.72  s-3.18,6.94,2.73,9.15c-0.38,0.43-0.8,0.81-1.2,1.21c-0.21,0.2-0.43,0.38-0.64,0.58l-0.32,0.29c-0.11,0.09-0.22,0.18-0.33,0.27  l-0.67,0.54l-0.7,0.51c-0.08,0.05-0.16,0.11-0.23,0.16c1.62-3.42-2.07-7.77-2.07-7.77s-4.21,5.55,0.49,8.78  c-1.34,0.79-2.74,1.45-4.2,1.98c1.91-2.59-0.23-6.89-0.23-6.89s-4.66,3.77-1.52,7.46c-1.15,0.33-2.33,0.57-3.51,0.74  c1.46-1.68,0.55-4.83,0.55-4.83s-3.7,2.03-2.18,5c-0.52,0.03-1.05,0.07-1.57,0.06c-0.29,0-0.57,0.01-0.86,0l-0.86-0.04  c-0.85-0.06-1.7-0.15-2.54-0.28l0.68-0.27l0.42-0.17l0.41-0.19l0.82-0.38c0,0,0.01,0,0.01,0c0.39-0.18,0.55-0.65,0.37-1.03  c-0.18-0.39-0.65-0.55-1.03-0.37l-0.04,0.02l-0.77,0.37l-0.39,0.18l-0.39,0.16l-0.79,0.33l-0.8,0.29l-0.4,0.14l-0.41,0.12L40,53.6  l-0.51-0.15l-0.41-0.12l-0.4-0.14l-0.8-0.29l-0.79-0.33l-0.39-0.16l-0.39-0.18l-0.77-0.37l-0.04-0.02c0,0,0,0-0.01,0  c-0.39-0.18-0.85-0.01-1.03,0.38c-0.18,0.39-0.01,0.85,0.38,1.03l0.82,0.38l0.41,0.19l0.42,0.17l0.68,0.27  c-0.84,0.14-1.69,0.22-2.54,0.28l-0.86,0.04c-0.29,0.01-0.57,0-0.86,0c-0.53,0.01-1.05-0.03-1.57-0.06c1.51-2.98-2.18-5-2.18-5  s-0.92,3.15,0.55,4.83c-1.19-0.16-2.36-0.41-3.51-0.74c3.15-3.7-1.52-7.46-1.52-7.46s-2.14,4.31-0.23,6.89  c-1.46-0.53-2.86-1.19-4.2-1.98c4.7-3.22,0.49-8.78,0.49-8.78s-3.69,4.34-2.07,7.77c-0.08-0.05-0.16-0.1-0.23-0.16l-0.7-0.51  l-0.67-0.54c-0.11-0.09-0.23-0.18-0.33-0.27l-0.32-0.29c-0.21-0.19-0.43-0.38-0.64-0.58c-0.4-0.4-0.82-0.79-1.2-1.21  
c5.91-2.21,2.73-9.15,2.73-9.15s-4.93,3.73-3.96,7.72c-0.68-0.86-1.33-1.76-1.89-2.71c-0.46-0.75-0.87-1.53-1.25-2.32  c6.33-0.61,5.01-8.16,5.01-8.16s-5.6,2.31-5.79,6.33c-0.61-1.6-1.06-3.26-1.34-4.95c4.81,0.3,4.68-5.48,4.68-5.48  s-4.05,0.99-4.91,3.69c-0.07-0.76-0.1-1.51-0.1-2.27c0-0.33,0.01-0.66,0.03-1c3.11-2.01,0.98-7.09,0.98-7.09s-4.69,3.31-1.73,6.99  C7,28.46,6.99,28.82,6.98,29.18c-0.02,0.91,0.01,1.82,0.1,2.73c-1.84-2.38-6.26-1.4-6.26-1.4s2.37,5.11,6.44,2.95  c0.26,1.71,0.69,3.39,1.28,5.01C5.64,35.69,0,37.86,0,37.86s4.26,6.43,9.29,2.45c0.39,0.87,0.83,1.72,1.31,2.54  c0.47,0.83,1.01,1.63,1.58,2.4C8.71,43.7,4.11,47,4.11,47s5.7,5.1,9.56,0.08c0.04,0.04,0.07,0.08,0.11,0.12  c0.39,0.45,0.82,0.87,1.24,1.3c0.21,0.21,0.44,0.41,0.66,0.61l0.33,0.3c0.11,0.1,0.23,0.19,0.34,0.29l0.69,0.57l0.23,0.17  c-3.34-0.34-6.58,3.29-6.58,3.29s6.19,3.47,8.69-1.83c1.2,0.75,2.47,1.41,3.78,1.96c-2.76,0.6-4.62,4.13-4.62,4.13  s5.89,1.62,6.98-3.26c1.03,0.32,2.07,0.58,3.13,0.78c-1.63,0.99-2.39,3.38-2.39,3.38s4.31,0.39,4.61-3.07  c0.07,0.01,0.14,0.02,0.21,0.02c0.6,0.04,1.2,0.1,1.8,0.09c0.3,0,0.6,0.02,0.9,0.01l0.9-0.03c1.2-0.07,2.41-0.18,3.59-0.42  l0.45-0.08c0.15-0.03,0.29-0.07,0.44-0.1L40,55.13l0.81,0.19c0.15,0.03,0.29,0.07,0.44,0.1l0.45,0.08c1.18,0.23,2.39,0.35,3.59,0.42  l0.9,0.03c0.3,0.01,0.6-0.01,0.9-0.01c0.6,0,1.2-0.06,1.8-0.09c0.07-0.01,0.14-0.02,0.21-0.02c0.31,3.45,4.61,3.07,4.61,3.07  s-0.76-2.39-2.39-3.38c1.06-0.2,2.11-0.46,3.13-0.78c1.09,4.88,6.98,3.26,6.98,3.26s-1.86-3.52-4.62-4.13  c1.31-0.55,2.57-1.21,3.78-1.96c2.5,5.3,8.69,1.83,8.69,1.83s-3.24-3.63-6.58-3.29l0.23-0.17l0.69-0.57  c0.11-0.1,0.23-0.19,0.34-0.29l0.33-0.3c0.22-0.2,0.45-0.4,0.66-0.61c0.42-0.43,0.85-0.84,1.24-1.3c0.03-0.04,0.07-0.08,0.11-0.12  C70.19,52.1,75.89,47,75.89,47s-4.6-3.31-8.08-1.75c0.57-0.77,1.11-1.56,1.58-2.4C69.88,42.03,70.31,41.18,70.71,40.31z"/>
+  </svg>
+    <foreignObject width="230" height="35" style="font-size: 14px;color: rgb(67, 39, 135);font-family: Arial;font-weight: 700;text-align: left;letter-spacing: 0em;line-height: 1.5;" x="64" y="24">
+      <div xmlns="http://www.w3.org/1999/xhtml">#1 Repository Of The Day</div>
+    </foreignObject>
+    <foreignObject width="141" height="36" style="font-size: 18px;color: rgb(74, 14, 153);font-family: Arial;font-weight: 400;text-align: center;letter-spacing: 0em;line-height: 1.5;" x="-36" y="9">
+      <div xmlns="http://www.w3.org/1999/xhtml">1</div>
+    </foreignObject>
+  </svg>
--- a/.agent/services/claude-mem/docs/public/troubleshooting.mdx
+++ b/.agent/services/claude-mem/docs/public/troubleshooting.mdx
--- a/.agent/services/claude-mem/docs/public/usage/claude-desktop.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/claude-desktop.mdx
@@ -0,0 +1,149 @@
+---
+title: Claude Desktop MCP
+description: Use claude-mem memory search in Claude Desktop with MCP tools
+icon: desktop
+---
+
+<Note>
+**Availability:** Claude-mem MCP tools work with Claude Desktop on macOS and Windows.
+</Note>
+
+## Overview
+
+Claude Desktop can access your claude-mem memory database through **MCP tools**. This allows you to search past sessions, decisions, and observations directly from Claude Desktop conversations.
+
+## Prerequisites
+
+Before configuring MCP tools, ensure:
+
+1. **claude-mem is installed** and the worker service is running
+2. **MCP server is configured** in Claude Desktop (uses the `mcp-search` MCP server)
+
+### Verify Worker is Running
+
+```bash
+curl http://localhost:37777/api/health
+# Should return: {"status":"ok"}
+```
+
+## Installation
+
+### Step 1: Configure MCP Server
+
+Claude Desktop reaches claude-mem through the `mcp-search` MCP server. Add it to your Claude Desktop configuration:
+
+<Tabs>
+  <Tab title="macOS">
+    Edit `~/Library/Application Support/Claude/claude_desktop_config.json`:
+
+    ```json
+    {
+      "mcpServers": {
+        "mcp-search": {
+          "command": "node",
+          "args": [
+            "/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs"
+          ]
+        }
+      }
+    }
+    ```
+  </Tab>
+  <Tab title="Windows">
+    Edit `%APPDATA%\Claude\claude_desktop_config.json`:
+
+    ```json
+    {
+      "mcpServers": {
+        "mcp-search": {
+          "command": "node",
+          "args": [
+            "C:\\Users\\YOUR_USERNAME\\.claude\\plugins\\marketplaces\\thedotmack\\plugin\\scripts\\mcp-server.cjs"
+          ]
+        }
+      }
+    }
+    ```
+  </Tab>
+</Tabs>
+
+<Warning>
+Replace `YOUR_USERNAME` with your actual username. Restart Claude Desktop after editing the configuration.
+</Warning>
+
+### Step 2: Restart Claude Desktop
+
+Close and reopen Claude Desktop for the MCP server configuration to take effect.
+
+## Usage
+
+Once the MCP server is configured, Claude Desktop uses the tools automatically when you ask about past work:
+
+```
+"What did we do last session?"
+"Did we fix this bug before?"
+"How did we implement authentication?"
+"What decisions did we make about the API?"
+"Show me changes to worker-service.ts"
+```
+
+## Available MCP Tools
+
+The `mcp-search` server provides three core MCP tools that follow a 3-layer workflow:
+
+| Tool | Description |
+|------|-------------|
+| `search` | Search memory index. Returns compact results with IDs for filtering |
+| `timeline` | Get chronological context around a query or observation ID |
+| `get_observations` | Fetch full observation details by ID (use after filtering with search/timeline) |
+
+### Token-Efficient Workflow
+
+1. **Search** → Get index with IDs (~50-100 tokens/result)
+2. **Timeline** → Get context around interesting results
+3. **Get Observations** → Fetch full details ONLY for filtered IDs
+
+This 3-layer approach provides ~10x token savings compared to fetching full details upfront.
+
+## Troubleshooting
+
+### Skill Not Appearing
+
+1. Verify the zip file was properly installed
+2. Check Claude Desktop's skill installation logs
+3. Restart Claude Desktop
+
+### MCP Server Connection Failed
+
+1. Verify the worker is running: `curl http://localhost:37777/api/health`
+2. Check the MCP server path in configuration
+3. Look for errors in Claude Desktop logs
+
+<Tabs>
+  <Tab title="macOS">
+    ```bash
+    # View Claude Desktop logs
+    tail -f ~/Library/Logs/Claude/claude.log
+    ```
+  </Tab>
+  <Tab title="Windows">
+    Check `%APPDATA%\Claude\logs\`
+  </Tab>
+</Tabs>
+
+### Search Returns No Results
+
+1. Ensure claude-mem has recorded sessions (check http://localhost:37777)
+2. Verify the database exists: `ls ~/.claude-mem/claude-mem.db`
+3. Test the API directly: `curl "http://localhost:37777/api/search?query=test"`
+
+## Related
+
+<CardGroup cols={2}>
+  <Card title="Search Tools" icon="magnifying-glass" href="/usage/search-tools">
+    Complete search API reference
+  </Card>
+  <Card title="Platform Integration" icon="plug" href="/platform-integration">
+    Build custom integrations
+  </Card>
+</CardGroup>
--- a/.agent/services/claude-mem/docs/public/usage/export-import.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/export-import.mdx
@@ -0,0 +1,295 @@
+---
+title: "Memory Export/Import"
+description: "Share knowledge across claude-mem installations with duplicate prevention"
+---
+
+# Memory Export/Import Scripts
+
+Share your claude-mem knowledge with other users! These scripts allow you to export specific memories (observations, sessions, summaries, and prompts) and import them into another claude-mem installation.
+
+## Use Cases
+
+- **Share Windows compatibility knowledge** with Windows users
+- **Share bug fix patterns** with contributors
+- **Share project-specific learnings** across teams
+- **Backup specific memory sets** for safekeeping
+
+## How It Works
+
+### Export Script
+
+Searches the database using **hybrid search** (combines ChromaDB vector embeddings with FTS5 full-text search) and exports all matching:
+- **Observations** - Individual learnings and discoveries
+- **Sessions** - Session metadata
+- **Summaries** - Session summaries
+- **Prompts** - User prompts that led to the work
+
+Output is a portable JSON file that can be shared.
+
+> **Privacy Note:** Export files contain all matching memory data in plain text. Review exports before sharing to ensure no sensitive information (API keys, passwords, private paths) is included.
+
+### Import Script
+
+Imports memories with **duplicate prevention**:
+- Checks if each record already exists before inserting
+- Skips duplicates automatically
+- Maintains data integrity with transactional imports
+- Reports what was imported vs. skipped
+
+**Duplicate Detection Strategy:**
+- **Sessions**: By `claude_session_id` (unique)
+- **Summaries**: By `sdk_session_id` (unique)
+- **Observations**: By `sdk_session_id` + `title` + `created_at_epoch` (composite)
+- **Prompts**: By `claude_session_id` + `prompt_number` (composite)
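+
+The composite checks above can be sketched in TypeScript. This is a minimal in-memory illustration under stated assumptions, not the import script's actual code (which checks the database rather than an in-memory set); `ObservationRow`, `observationKey`, and `partitionNew` are hypothetical names, and only the key fields listed above are modeled.
+
+```typescript
+// Hypothetical row shape: only the fields used for duplicate detection.
+interface ObservationRow {
+  sdk_session_id: string;
+  title: string;
+  created_at_epoch: number;
+}
+
+// Build a composite key. Serializing a tuple avoids the ambiguity a plain
+// join("|") would have if a title itself contained the separator.
+function observationKey(o: ObservationRow): string {
+  return JSON.stringify([o.sdk_session_id, o.title, o.created_at_epoch]);
+}
+
+// Split incoming rows into those to insert and those already present.
+function partitionNew(
+  existing: ObservationRow[],
+  incoming: ObservationRow[],
+): { imported: ObservationRow[]; skipped: ObservationRow[] } {
+  const seen = new Set(existing.map(observationKey));
+  const imported: ObservationRow[] = [];
+  const skipped: ObservationRow[] = [];
+  for (const row of incoming) {
+    const key = observationKey(row);
+    if (seen.has(key)) {
+      skipped.push(row);
+    } else {
+      seen.add(key); // also de-duplicates within the incoming batch
+      imported.push(row);
+    }
+  }
+  return { imported, skipped };
+}
+```
+
+The same idea applies to prompts, with `claude_session_id` and `prompt_number` forming the key.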
+
+## Usage
+
+### Export Memories
+
+```bash
+# Export all Windows-related memories
+npx tsx scripts/export-memories.ts "windows" windows-memories.json
+
+# Export bug fixes
+npx tsx scripts/export-memories.ts "bugfix" bugfixes.json
+
+# Export specific feature work
+npx tsx scripts/export-memories.ts "progressive disclosure" progressive-disclosure.json
+```
+
+**Parameters:**
+1. `<query>` - Search query (uses hybrid semantic + full-text search)
+2. `<output-file>` - Output JSON file path
+3. `--project=name` - Optional: filter results to a specific project
+
+**Example Output:**
+```
+🔍 Searching for: "windows"
+✅ Found 54 observations
+✅ Found 12 sessions
+✅ Found 12 summaries
+✅ Found 7 prompts
+
+📦 Export complete!
+📄 Output: windows-memories.json
+📊 Stats:
+   • 54 observations
+   • 12 sessions
+   • 12 summaries
+   • 7 prompts
+```
+
+### Import Memories
+
+```bash
+# Import from an export file
+npx tsx scripts/import-memories.ts windows-memories.json
+```
+
+**Parameters:**
+1. `<input-file>` - Input JSON file (from export script)
+
+**Example Output:**
+```
+📦 Import file: windows-memories.json
+📅 Exported: 2025-12-10T23:45:00.000Z
+🔍 Query: "windows"
+📊 Contains:
+   • 54 observations
+   • 12 sessions
+   • 12 summaries
+   • 7 prompts
+
+🔄 Importing sessions...
+   ✅ Imported: 12, Skipped: 0
+🔄 Importing summaries...
+   ✅ Imported: 12, Skipped: 0
+🔄 Importing observations...
+   ✅ Imported: 54, Skipped: 0
+🔄 Importing prompts...
+   ✅ Imported: 7, Skipped: 0
+
+✅ Import complete!
+📊 Summary:
+   Sessions:     12 imported, 0 skipped
+   Summaries:    12 imported, 0 skipped
+   Observations: 54 imported, 0 skipped
+   Prompts:      7 imported, 0 skipped
+```
+
+### Re-importing (Duplicate Prevention)
+
+If you run the import again on the same file, duplicates are automatically skipped:
+
+```
+🔄 Importing sessions...
+   ✅ Imported: 0, Skipped: 12  ← All skipped (already exist)
+🔄 Importing summaries...
+   ✅ Imported: 0, Skipped: 12
+🔄 Importing observations...
+   ✅ Imported: 0, Skipped: 54
+🔄 Importing prompts...
+   ✅ Imported: 0, Skipped: 7
+```
+
+## Sharing Memories
+
+### For Export Authors
+
+1. **Export your memories:**
+   ```bash
+   npx tsx scripts/export-memories.ts "windows" windows-memories.json
+   ```
+
+2. **Share the JSON file** via:
+   - GitHub gist
+   - Project repository (`shared-memories/`)
+   - Direct file transfer
+   - Package in releases
+
+3. **Document what's included:**
+   - What query was used
+   - What knowledge is contained
+   - Who might benefit from it
+
+### For Import Users
+
+1. **Download the export file** to your local machine
+
+2. **Review what's in it** (optional):
+   ```bash
+   jq '.totalObservations, .totalSessions' windows-memories.json
+   ```
+
+3. **Import into your database:**
+   ```bash
+   npx tsx scripts/import-memories.ts windows-memories.json
+   ```
+
+4. **Verify import** by searching:
+   ```bash
+   curl "http://localhost:37777/api/search?query=windows&format=index&limit=10"
+   ```
+
+## JSON Export Format
+
+```json
+{
+  "exportedAt": "2025-12-10T23:45:00.000Z",
+  "exportedAtEpoch": 1765410300000,
+  "query": "windows",
+  "totalObservations": 54,
+  "totalSessions": 12,
+  "totalSummaries": 12,
+  "totalPrompts": 7,
+  "observations": [ /* array of observation objects */ ],
+  "sessions": [ /* array of session objects */ ],
+  "summaries": [ /* array of summary objects */ ],
+  "prompts": [ /* array of prompt objects */ ]
+}
+```
+
+## Safety Features
+
+- ✅ **Duplicate Prevention** - Won't re-import existing records
+- ✅ **Transactional** - All-or-nothing imports (database stays consistent)
+- ✅ **Read-only Export** - Export script opens database in read-only mode
+- ✅ **Dependency Ordering** - Sessions imported before observations/summaries
+- ✅ **Validation** - Checks database exists before starting
+
+## Advanced Usage
+
+### Export by Project
+
+```bash
+# Export only claude-mem project memories
+npx tsx scripts/export-memories.ts "bugfix" bugfixes.json --project=claude-mem
+
+# Export all memories for a specific project
+npx tsx scripts/export-memories.ts "" all-project.json --project=my-app
+```
+
+### Export by Type
+
+```bash
+# Export only discoveries
+npx tsx scripts/export-memories.ts "type:discovery" discoveries.json
+
+# Export only bug fixes
+npx tsx scripts/export-memories.ts "type:bugfix" bugfixes.json
+```
+
+### Export by Date Range
+
+To restrict an export to a date range, filter the JSON file after exporting:
+
+```bash
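+# Export all memories, then keep observations created after mid-November 2023
+# (created_at_epoch is in milliseconds). Only the observations array is filtered;
+# sessions, summaries, prompts, and the total* counters are left as exported.
+npx tsx scripts/export-memories.ts "" all-memories.json
+jq '.observations |= map(select(.created_at_epoch > 1700000000000))' all-memories.json > recent-memories.json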
+```
+
+### Combine Multiple Exports
+
+```bash
+# Export different topics
+npx tsx scripts/export-memories.ts "windows" windows.json
+npx tsx scripts/export-memories.ts "linux" linux.json
+
+# Import both
+npx tsx scripts/import-memories.ts windows.json
+npx tsx scripts/import-memories.ts linux.json
+```
+
+## Troubleshooting
+
+### Database Not Found
+
+```
+❌ Database not found at: /Users/you/.claude-mem/claude-mem.db
+```
+
+**Solution:** Make sure claude-mem is installed and has been run at least once.
+
+### Import File Not Found
+
+```
+❌ Input file not found: windows-memories.json
+```
+
+**Solution:** Check the file path. Use absolute paths if needed.
+
+### Partial Import
+
+If import fails mid-way, the transaction is rolled back - your database remains unchanged. Fix the issue and try again.
+
+## Contributing Memory Sets
+
+If you've exported valuable knowledge that others might benefit from:
+
+1. Create a PR to the `shared-memories/` directory
+2. Include a README describing what's in the export
+3. Tag with relevant keywords (windows, linux, bugfix, etc.)
+4. Community members can then import your knowledge!
+
+## Examples of Useful Exports
+
+**Windows Compatibility Knowledge:**
+```bash
+npx tsx scripts/export-memories.ts "windows compatibility installation" windows-fixes.json
+```
+
+**Progressive Disclosure Architecture:**
+```bash
+npx tsx scripts/export-memories.ts "progressive disclosure architecture token" pd-patterns.json
+```
+
+**Bug Fix Patterns:**
+```bash
+npx tsx scripts/export-memories.ts "bugfix error handling" bugfix-patterns.json
+```
+
+**Performance Optimization:**
+```bash
+npx tsx scripts/export-memories.ts "performance optimization caching" perf-tips.json
+```
--- a/.agent/services/claude-mem/docs/public/usage/folder-context.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/folder-context.mdx
@@ -0,0 +1,280 @@
+---
+title: "Folder Context Files"
+description: "Automatic per-folder CLAUDE.md files that provide directory-level context to Claude"
+---
+
+## Overview
+
+Claude-mem can automatically generate `CLAUDE.md` files in your project folders to give Claude directory-level context. Each file summarizes recent activity in its folder, helping Claude understand what work has been done and where.
+
+<Info>
+This feature is **disabled by default**. Enable it via settings if you want automatic folder-level context generation.
+</Info>
+
+## How It Works
+
+When you work with Claude Code in a project, claude-mem tracks which files are read and modified. After each observation is saved, it automatically:
+
+1. Identifies unique folder paths from touched files
+2. Queries recent observations relevant to each folder
+3. Generates a formatted timeline of activity
+4. Writes it to `CLAUDE.md` in that folder (inside `<claude-mem-context>` tags)
+
+### What Gets Generated
+
+Each folder's `CLAUDE.md` contains a "Recent Activity" section showing:
+
+- Observation IDs for reference
+- Timestamps of when work occurred
+- Type indicators (bug fixes, features, discoveries, etc.)
+- Brief titles describing the work
+- Estimated token counts
+
+```markdown
+<claude-mem-context>
+# Recent Activity
+
+<!-- This section is auto-generated by claude-mem. Edit content outside the tags. -->
+
+### Jan 4, 2026
+
+| ID | Time | T | Title | Read |
+|----|------|---|-------|------|
+| #1234 | 4:30 PM | 🟣 | Implemented user authentication | ~250 |
+| #1235 | " | 🔴 | Fixed login redirect bug | ~180 |
+</claude-mem-context>
+```
+
+### User Content Preservation
+
+The auto-generated content is wrapped in `<claude-mem-context>` tags. **Any content you write outside these tags is preserved** when the file is regenerated. This means you can:
+
+- Add your own documentation above or below the generated section
+- Write folder-specific instructions for Claude
+- Include architectural notes or conventions
+
+```markdown
+# Authentication Module
+
+This folder contains all authentication-related code.
+Follow the established patterns for new auth providers.
+
+<claude-mem-context>
+... auto-generated content ...
+</claude-mem-context>
+
+## Manual Notes
+
+- OAuth providers go in /providers/
+- Session handling uses Redis
+```
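+
+The preservation behavior described above can be sketched as a string replacement. This is a minimal illustration, assuming a single well-formed tagged section per file; `replaceTaggedSection` is a hypothetical name, not claude-mem's actual function.
+
+```typescript
+const OPEN_TAG = "<claude-mem-context>";
+const CLOSE_TAG = "</claude-mem-context>";
+
+// Replace the tagged section if present; otherwise append one.
+// Everything outside the tags is returned untouched.
+function replaceTaggedSection(existing: string, generated: string): string {
+  const block = `${OPEN_TAG}\n${generated}\n${CLOSE_TAG}`;
+  const start = existing.indexOf(OPEN_TAG);
+  const end = existing.indexOf(CLOSE_TAG);
+  if (start === -1 || end === -1 || end < start) {
+    // No well-formed section yet: keep user content and append the block.
+    return existing.trim() === ""
+      ? `${block}\n`
+      : `${existing.trimEnd()}\n\n${block}\n`;
+  }
+  return (
+    existing.slice(0, start) + block + existing.slice(end + CLOSE_TAG.length)
+  );
+}
+```
+
+Because only the span between the tags is rewritten, headings and notes above or below it survive every regeneration.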
+
+### Project Root Exclusion
+
+**Project roots** (folders containing a `.git` directory) are **excluded** from auto-generation. This is intentional:
+
+- Root `CLAUDE.md` files typically contain project-wide instructions you've written manually
+- Auto-generating at the root could overwrite important project documentation
+- Subfolders are where folder-level context is most useful
+
+<Note>
+Git submodules (which have a `.git` *file* instead of directory) are correctly detected and **not** excluded, so they receive auto-generated context.
+</Note>
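+
+The directory-versus-file distinction can be checked with Node's `fs` module. The sketch below is one way to express the rule, assuming a synchronous check is acceptable; `isProjectRoot` is a hypothetical name and may not match how claude-mem implements it.
+
+```typescript
+import * as fs from "node:fs";
+import * as path from "node:path";
+
+// A folder counts as a project root only when `.git` is a directory.
+// A `.git` *file* (as in submodules) does not make it a root here.
+function isProjectRoot(folder: string): boolean {
+  try {
+    return fs.statSync(path.join(folder, ".git")).isDirectory();
+  } catch {
+    return false; // no `.git` entry at all
+  }
+}
+```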
+
+## Configuration
+
+### Enabling the Feature
+
+To enable folder CLAUDE.md generation, edit your settings file:
+
+**1. Open `~/.claude-mem/settings.json`**
+
+**2. Add or update this setting:**
+```json
+{
+  "CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED": "true"
+}
+```
+
+**3. Save the file** - changes take effect immediately (no restart needed)
+
+| Value | Behavior |
+|-------|----------|
+| `"false"` (default) | Folder CLAUDE.md generation disabled |
+| `"true"` | Auto-generate folder CLAUDE.md files |
+
+<Tip>
+If the settings file doesn't exist, create it with just the settings you want to change. Claude-mem will use defaults for any missing settings.
+</Tip>
+
+## Cleanup Mode
+
+The regenerate script includes a `--clean` mode for removing auto-generated content:
+
+```bash
+# Preview what would be cleaned (dry run)
+bun scripts/regenerate-claude-md.ts --clean --dry-run
+
+# Actually clean files
+bun scripts/regenerate-claude-md.ts --clean
+```
+
+**What cleanup does:**
+1. Finds all `CLAUDE.md` files recursively
+2. Strips `<claude-mem-context>...</claude-mem-context>` sections
+3. **Deletes** files that become empty after stripping
+4. **Preserves** files that have user content outside the tags
+
+This is useful for:
+- Preparing a branch for PR (removing generated files)
+- Resetting folder context to start fresh
+- Removing context before sharing code
+
+## Git Integration
+
+### Should You Commit These Files?
+
+This is **your choice** based on your workflow. Here are the trade-offs:
+
+<Tabs>
+  <Tab title="Commit Them">
+    **Pros:**
+    - Team members see folder-level context and recent activity
+    - New contributors can understand what happened where
+    - Code reviewers get additional context about changes
+    - Historical record of work patterns in the repo
+
+    **Cons:**
+    - Adds files to your repository
+    - Files change frequently during development
+    - May create noise in diffs and commit history
+    - Different team members may generate different content
+  </Tab>
+  <Tab title="Gitignore Them">
+    **Pros:**
+    - Clean repository without generated files
+    - No commit noise from auto-generated content
+    - Each developer has their own local context
+    - Simpler git history
+
+    **Cons:**
+    - Team doesn't share folder context
+    - Context is lost when switching machines
+    - New team members don't benefit from existing context
+  </Tab>
+</Tabs>
+
+### Gitignore Pattern
+
+To exclude folder CLAUDE.md files from git:
+
+```gitignore
+# Ignore auto-generated folder context files
+**/CLAUDE.md
+
+# But keep the root CLAUDE.md if you want (the leading slash anchors the pattern to the repository root)
+!/CLAUDE.md
+```
+
+Or to ignore all CLAUDE.md files everywhere:
+```gitignore
+**/CLAUDE.md
+```
+
+### Recommended Workflows
+
+**For Solo Developers:**
+- Keep them local (gitignore) for personal context
+- Or commit them if you work across multiple machines
+
+**For Teams:**
+- Discuss with your team which approach works best
+- Consider committing them if onboarding is frequent
+- Use `--clean` before PRs if you prefer clean diffs
+
+**Before Merging PRs:**
+```bash
+# Clean up generated files before merge
+bun scripts/regenerate-claude-md.ts --clean
+git add -A
+git commit -m "chore: clean up generated CLAUDE.md files"
+```
+
+## Regenerating Context
+
+To manually regenerate all folder CLAUDE.md files from the database:
+
+```bash
+# Preview what would be regenerated
+bun scripts/regenerate-claude-md.ts --dry-run
+
+# Regenerate all folders
+bun scripts/regenerate-claude-md.ts
+
+# Regenerate for a specific project only
+bun scripts/regenerate-claude-md.ts --project=my-project
+```
+
+This is useful after:
+- Importing observations from another machine
+- Database recovery
+- Wanting to refresh all folder context
+
+## Worktree Support
+
+**New in v9.0**: Claude-mem now supports git worktrees with unified context.
+
+When you're working in a git worktree, context is automatically gathered from both:
+- The parent repository (where the worktree was created)
+- The worktree directory itself
+
+This means observations about shared code are visible regardless of which worktree you're in, giving you a complete picture of recent activity across all related directories.
+
+### How It Works
+
+1. When generating context, claude-mem detects if your project is a worktree
+2. It identifies the parent repository automatically
+3. Timeline queries include both locations
+4. Results are interleaved chronologically
+
+<Note>
+No configuration needed - worktree detection is automatic. If you're not using worktrees, this feature has no effect.
+</Note>
+
+## Technical Details
+
+### File Format
+
+Generated content uses a consistent markdown table format:
+
+| Column | Description |
+|--------|-------------|
+| ID | Observation ID (e.g., `#1234`) or session ID (`#S123`) |
+| Time | 12-hour format with AM/PM, ditto marks (`"`) for repeated times |
+| T | Type emoji indicator |
+| Title | Brief description of the observation |
+| Read | Estimated token count (e.g., `~250`) |
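+
+The ditto-mark rule for the Time column can be sketched in a few lines. This is an illustration only; `withDittoMarks` is a hypothetical helper that assumes the times are already formatted as strings.
+
+```typescript
+// Replace a time with a ditto mark when it equals the previous row's time.
+// The comparison uses the original values, so a run of three identical
+// times yields one time followed by two ditto marks.
+function withDittoMarks(times: string[]): string[] {
+  return times.map((t, i) => (i > 0 && t === times[i - 1] ? '"' : t));
+}
+```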
+
+### Type Indicators
+
+| Emoji | Type |
+|-------|------|
+| 🔴 | Bug fix |
+| 🟣 | Feature |
+| 🔄 | Refactor |
+| ✅ | Change |
+| 🔵 | Discovery |
+| ⚖️ | Decision |
+| 🎯 | Session |
+| 💬 | Prompt |
+
+### Atomic Writes
+
+Files are written atomically using a temp file + rename pattern. This prevents partial writes if the process is interrupted.
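+
+A temp file + rename write might look like the sketch below. It is a minimal version under stated assumptions (synchronous I/O, temp file in the same directory so the rename stays on one filesystem); `writeFileAtomic` is a hypothetical name, not necessarily the helper claude-mem uses.
+
+```typescript
+import * as fs from "node:fs";
+import * as path from "node:path";
+
+// Write to a temp file next to the target, then rename it over the target.
+function writeFileAtomic(target: string, content: string): void {
+  const tmp = path.join(
+    path.dirname(target),
+    `.${path.basename(target)}.${process.pid}.tmp`,
+  );
+  try {
+    fs.writeFileSync(tmp, content, "utf8");
+    fs.renameSync(tmp, target); // readers see either the old or the new file
+  } catch (err) {
+    fs.rmSync(tmp, { force: true }); // best-effort cleanup of the temp file
+    throw err;
+  }
+}
+```
+
+If the process dies before the rename, the target keeps its previous content and only a stray temp file is left behind.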
+
+### Performance
+
+- Updates happen asynchronously (fire-and-forget)
+- Failures are logged but don't block the main workflow
+- Only folders with actual file activity are updated
+- Deduplication prevents redundant updates for the same folder
--- a/.agent/services/claude-mem/docs/public/usage/gemini-provider.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/gemini-provider.mdx
@@ -0,0 +1,172 @@
+---
+title: "Gemini Provider"
+description: "Use Google's Gemini API as an alternative to Claude for observation extraction"
+---
+
+# Gemini Provider
+
+Claude-mem supports Google's Gemini API as an alternative to the Claude Agent SDK for extracting observations from your sessions. This can significantly reduce costs since Gemini offers a generous free tier.
+
+<Warning>
+**Free Tier Rate Limits**: Without billing enabled, Gemini has strict rate limits (5-10 RPM). Enable billing on your Google Cloud project to unlock 1000-4000 RPM while still using the free quota.
+</Warning>
+
+## Why Use Gemini?
+
+- **Cost savings**: The free tier covers most individual usage patterns
+- **Compatible output**: Gemini extracts observations using the same XML format as Claude
+- **Seamless fallback**: Automatically falls back to Claude if Gemini is unavailable
+- **Hot-swappable**: Switch providers without restarting the worker
+
+## Getting a Free API Key
+
+1. Go to the [Google AI Studio API Key page](https://aistudio.google.com/app/apikey)
+2. Sign in with your Google account
+3. Accept the Terms of Service and privacy policies
+4. Click the **Create API key** button
+5. Choose a Google Cloud project or create a new one
+6. Copy and securely store the generated API key
+
+<Tip>
+**No billing required** to get started, but we recommend enabling billing to unlock higher rate limits (1000-4000 RPM vs 5-10 RPM) while still using the free quota.
+</Tip>
+
+## Configuration
+
+### Settings
+
+| Setting | Values | Default | Description |
+|---------|--------|---------|-------------|
+| `CLAUDE_MEM_PROVIDER` | `claude`, `gemini` | `claude` | AI provider for observation extraction |
+| `CLAUDE_MEM_GEMINI_API_KEY` | string | — | Your Gemini API key |
+| `CLAUDE_MEM_GEMINI_MODEL` | `gemini-2.5-flash-lite`, `gemini-2.5-flash`, `gemini-3-flash-preview` | `gemini-2.5-flash-lite` | Gemini model to use |
+| `CLAUDE_MEM_GEMINI_BILLING_ENABLED` | `true`, `false` | `false` | Skip rate limiting if billing is enabled on Google Cloud |
+
+### Using the Settings UI
+
+1. Open the viewer at http://localhost:37777
+2. Click the **gear icon** to open Settings
+3. Under **AI Provider**, select **Gemini**
+4. Enter your Gemini API key
+5. Optionally select a different model
+
+Settings are applied immediately—no restart required.
+
+### Manual Configuration
+
+Edit `~/.claude-mem/settings.json`:
+
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "gemini",
+  "CLAUDE_MEM_GEMINI_API_KEY": "your-api-key-here",
+  "CLAUDE_MEM_GEMINI_MODEL": "gemini-2.5-flash-lite",
+  "CLAUDE_MEM_GEMINI_BILLING_ENABLED": "true"
+}
+```
+
+Alternatively, set the API key via environment variable:
+
+```bash
+export GEMINI_API_KEY="your-api-key-here"
+```
+
+The settings file takes precedence over the environment variable.
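+
+As a rough illustration of that precedence, the lookup could be sketched as follows. `resolveGeminiApiKey` and its parameters are hypothetical names, not claude-mem's actual API, and treating an empty string as "not set" is an assumption:
+
+```typescript
+// Hypothetical helper: illustrates the documented precedence only.
+type KeyValueSource = Record<string, string | undefined>;
+
+function resolveGeminiApiKey(
+  settings: KeyValueSource, // parsed ~/.claude-mem/settings.json
+  env: KeyValueSource,      // e.g. process.env
+): string | undefined {
+  // A non-empty value in the settings file wins over the environment variable.
+  const fromSettings = settings["CLAUDE_MEM_GEMINI_API_KEY"]?.trim();
+  if (fromSettings) return fromSettings;
+
+  // Otherwise fall back to GEMINI_API_KEY, or undefined if neither is set.
+  const fromEnv = env["GEMINI_API_KEY"]?.trim();
+  return fromEnv || undefined;
+}
+```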
+
+## Available Models
+
+| Model | Free Tier RPM | Notes |
+|-------|--------------|-------|
+| `gemini-2.5-flash-lite` | 10 | Default, recommended for free tier (highest RPM) |
+| `gemini-2.5-flash` | 5 | Higher capability, lower rate limit |
+| `gemini-3-flash-preview` | 5 | Latest model, lower rate limit |
+
+## Provider Switching
+
+You can switch between Claude and Gemini at any time:
+
+- **No restart required**: Changes take effect on the next observation
+- **Conversation history preserved**: When switching mid-session, the new provider sees the full conversation context
+- **Seamless transition**: Both providers use the same observation format
+
+### Switching via UI
+
+1. Open Settings in the viewer
+2. Change the **AI Provider** dropdown
+3. The next observation will use the new provider
+
+### Switching via Settings File
+
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "gemini"
+}
+```
+
+## Fallback Behavior
+
+If Gemini is selected but encounters errors, claude-mem automatically falls back to the Claude Agent SDK:
+
+**Triggers fallback:**
+- Rate limiting (HTTP 429)
+- Server errors (HTTP 5xx)
+- Network issues (connection refused, timeout)
+
+**Does not trigger fallback:**
+- Missing API key (logs warning, uses Claude from start)
+- Invalid API key (fails with error)
+
+When fallback occurs:
+1. A warning is logged
+2. Any in-progress messages are reset to pending
+3. Claude SDK takes over with the full conversation context
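+
+The trigger rules above can be read as a small classifier. The sketch below is only an illustration: the `ProviderError` shape and the function name are assumptions, and claude-mem's real error handling may be structured differently:
+
+```typescript
+// Hypothetical error shape, used only for this illustration.
+interface ProviderError {
+  httpStatus?: number;      // set when the API returned an HTTP response
+  networkFailure?: boolean; // connection refused, timeout, and similar
+}
+
+function shouldFallBackToClaude(err: ProviderError): boolean {
+  // Network issues always trigger fallback.
+  if (err.networkFailure) return true;
+
+  const status = err.httpStatus;
+  if (status === undefined) return false;
+
+  // 429 = rate limited, 5xx = server error. Other statuses, such as an
+  // auth failure caused by an invalid API key, surface as errors instead.
+  return status === 429 || (status >= 500 && status <= 599);
+}
+```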
+
+## Troubleshooting
+
+### "Gemini API key not configured"
+
+Either:
+- Set `CLAUDE_MEM_GEMINI_API_KEY` in `~/.claude-mem/settings.json`, or
+- Set the `GEMINI_API_KEY` environment variable
+
+### Rate Limiting
+
+Google has two rate limit tiers for free usage:
+
+**Without billing (API key only):**
+
+| Model | RPM | TPM |
+|-------|-----|-----|
+| gemini-2.5-flash-lite | 10 | 250K |
+| gemini-2.5-flash | 5 | 250K |
+| gemini-3-flash-preview | 5 | 250K |
+
+Claude-mem enforces these limits automatically with built-in delays between requests. Processing may be slower but stays within limits.
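+
+To get a feel for how long those delays are, the spacing needed to stay under a requests-per-minute cap is 60,000 ms divided by the RPM. The sketch below uses the free-tier values from the table above; `minDelayMs` is a hypothetical helper, and the delays claude-mem actually applies may differ:
+
+```typescript
+// Free-tier RPM values copied from the table above.
+const FREE_TIER_RPM: Record<string, number> = {
+  "gemini-2.5-flash-lite": 10,
+  "gemini-2.5-flash": 5,
+  "gemini-3-flash-preview": 5,
+};
+
+// Minimum spacing between requests that keeps a steady stream under the cap.
+function minDelayMs(model: string, billingEnabled: boolean): number {
+  // With CLAUDE_MEM_GEMINI_BILLING_ENABLED set, rate limiting is skipped.
+  if (billingEnabled) return 0;
+
+  const rpm = FREE_TIER_RPM[model];
+  if (rpm === undefined) throw new Error(`Unknown model: ${model}`);
+  return Math.ceil(60000 / rpm);
+}
+```
+
+Under these assumptions, `gemini-2.5-flash-lite` needs about 6 seconds between requests and the 5 RPM models about 12 seconds.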
+
+**With billing enabled (still free tier):**
+
+| Model | RPM | TPM |
+|-------|-----|-----|
+| gemini-2.5-flash-lite | 4,000 | 4M |
+| gemini-2.5-flash | 1,000 | 1M |
+| gemini-3-flash-preview | 1,000 | 1M |
+
+<Tip>
+**Recommended**: Enable billing on your Google Cloud project to unlock much higher rate limits. You won't be charged unless you exceed the generous free quota. This allows claude-mem to process observations instantly instead of waiting between requests.
+</Tip>
+
+If you hit rate limits:
+- Claude-mem automatically falls back to the Claude SDK
+- If it happens often, consider switching back to Claude as your primary provider
+
+### Observation Quality
+
+If observations seem lower quality with Gemini:
+- Note that Claude typically produces slightly higher quality observations
+- Consider using Gemini for cost savings and Claude for important projects
+
+## Next Steps
+
+- [Configuration](/configuration) - Full settings reference
+- [Getting Started](/usage/getting-started) - Basic usage guide
+- [Troubleshooting](/troubleshooting) - Common issues
--- a/.agent/services/claude-mem/docs/public/usage/getting-started.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/getting-started.mdx
@@ -0,0 +1,213 @@
+---
+title: "Getting Started"
+description: "Learn how Claude-Mem works automatically in the background"
+---
+
+# Getting Started with Claude-Mem
+
+## Automatic Operation
+
+Claude-Mem works automatically once installed. No manual intervention required!
+
+### The Full Cycle
+
+1. **Start Claude Code** - Context from last 10 sessions appears automatically
+2. **Work normally** - Every tool execution is captured
+3. **Claude finishes responding** - Stop hook automatically generates and saves a summary
+4. **Next session** - Previous work appears in context
+
+### What Gets Captured
+
+Every time Claude uses a tool, claude-mem captures it:
+
+- **Read** - File reads and content access
+- **Write** - New file creation
+- **Edit** - File modifications
+- **Bash** - Command executions
+- **Glob** - File pattern searches
+- **Grep** - Content searches
+- And all other Claude Code tools
+
+### What Gets Processed
+
+The worker service processes tool observations and extracts:
+
+- **Title** - Brief description of what happened
+- **Subtitle** - Additional context
+- **Narrative** - Detailed explanation
+- **Facts** - Key learnings as bullet points
+- **Concepts** - Relevant tags and categories
+- **Type** - Classification (decision, bugfix, feature, etc.)
+- **Files** - Which files were read or modified
+
+### Session Summaries
+
+When Claude finishes responding (triggering the Stop hook), a summary is automatically generated with:
+
+- **Request** - What you asked for
+- **Investigated** - What Claude explored
+- **Learned** - Key discoveries and insights
+- **Completed** - What was accomplished
+- **Next Steps** - What to do next
+
+### Context Injection
+
+When you start a new Claude Code session, the SessionStart hook:
+
+1. Queries the database for recent observations in your project (default: 50)
+2. Retrieves recent session summaries for context
+3. Displays observations in a chronological timeline with session markers
+4. Shows full summary details (Investigated, Learned, Completed, Next Steps) **only if the summary was generated after the last observation**
+5. Injects formatted context into Claude's initial context
+
+**Summary Display Logic:**
+
+The most recent summary's full details appear at the end of the context display **only when** the summary was generated after the most recent observation. This ensures you see summary details when they represent the latest state of your project, but not when new observations have been captured since the last summary.
+
+For example:
+- ✅ **Shows summary**: Last observation at 2:00 PM, summary generated at 2:05 PM → Summary details appear
+- ❌ **Hides summary**: Summary generated at 2:00 PM, new observation at 2:05 PM → Summary details hidden (outdated)
+
+This prevents showing stale summaries when new work has been captured but not yet summarized.
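+
+The display rule boils down to a timestamp comparison. A minimal sketch, assuming epoch-millisecond timestamps; the function name and the handling of missing values are assumptions rather than claude-mem's actual code:
+
+```typescript
+// Returns true when the latest summary should be shown in full.
+function shouldShowSummaryDetails(
+  summaryCreatedAtEpoch: number | null,
+  lastObservationAtEpoch: number | null,
+): boolean {
+  // No summary yet: nothing to show.
+  if (summaryCreatedAtEpoch === null) return false;
+  // Assumption: with no observations at all, the summary cannot be stale.
+  if (lastObservationAtEpoch === null) return true;
+  // Show only if the summary is strictly newer than the last observation.
+  return summaryCreatedAtEpoch > lastObservationAtEpoch;
+}
+```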
+
+Together, these steps let Claude "remember" what happened in previous sessions.
+
+## Manual Commands (Optional)
+
+### Worker Management
+
+v4.0+ auto-starts the worker on first session. Manual commands below are optional.
+
+```bash
+# Start worker service (optional; the worker auto-starts on first session)
+npm run worker:start
+
+# Stop worker service
+npm run worker:stop
+
+# Restart worker service
+npm run worker:restart
+
+# View worker logs
+npm run worker:logs
+
+# Check worker status
+npm run worker:status
+```
+
+### Testing
+
+```bash
+# Run all tests
+npm test
+
+# Test context injection
+npm run test:context
+
+# Verbose context test
+npm run test:context:verbose
+```
+
+### Development
+
+```bash
+# Build hooks and worker
+npm run build
+
+# Build only hooks
+npm run build:hooks
+
+# Publish to NPM (maintainers only)
+npm run publish:npm
+```
+
+## Viewing Stored Context
+
+Context is stored in a SQLite database at `~/.claude-mem/claude-mem.db`.
+
+Query the database directly:
+
+```bash
+# Open the database, then run the queries below at the sqlite3 prompt
+sqlite3 ~/.claude-mem/claude-mem.db
+
+# View recent sessions
+SELECT session_id, project, created_at, status
+FROM sdk_sessions
+ORDER BY created_at DESC
+LIMIT 10;
+
+# View session summaries
+SELECT session_id, request, completed, learned
+FROM session_summaries
+ORDER BY created_at DESC
+LIMIT 5;
+
+# View observations for a session
+SELECT tool_name, created_at
+FROM observations
+WHERE session_id = 'YOUR_SESSION_ID';
+```
+
+## Understanding Progressive Disclosure
+
+Context injection uses progressive disclosure for efficient token usage:
+
+### Layer 1: Index Display (Session Start)
+- Shows observation titles with token cost estimates
+- Displays session markers in chronological timeline
+- Groups observations by file for visual clarity
+- Shows full summary details **only if** generated after last observation
+- Token cost: ~50-200 tokens for index view
+
+### Layer 2: On-Demand Details (MCP Tools)
+- Ask naturally: "What bugs did we fix?" or "How did we implement X?"
+- Claude auto-invokes MCP search tools to fetch full details
+- Search by concept, file, type, or keyword
+- Timeline context around specific observations
+- Token cost: ~100-500 tokens per observation fetched
+- Uses a 3-layer workflow: search → timeline → get_observations
+
+### Layer 3: Perfect Recall (Code Access)
+- Read source files directly when needed
+- Access original transcripts and raw data
+- Full context available on-demand
+
+This ensures efficient token usage while maintaining access to complete history when needed.
+
+## Multi-Prompt Sessions & `/clear` Behavior
+
+Claude-Mem supports sessions that span multiple user prompts:
+
+- **prompt_counter**: Tracks total prompts in a session
+- **prompt_number**: Identifies specific prompt within session
+- **Session continuity**: Observations and summaries link across prompts
+
+### Important Note About `/clear`
+
+When you use `/clear`, the session doesn't end - it continues with a new prompt number. This means:
+
+- ✅ **Context is re-injected** from recent sessions (SessionStart hook fires with `source: "clear"`)
+- ✅ **Observations are still being captured** and added to the current session
+- ✅ **A summary will be generated** when Claude finishes responding (Stop hook fires)
+
+The `/clear` command clears the conversation context visible to Claude AND re-injects fresh context from recent sessions, while the underlying session continues tracking observations.
+
+## Searching Your History
+
+Claude-Mem provides MCP tools for querying your project history. Simply ask naturally:
+
+```
+"What bugs did we fix last session?"
+"How did we implement authentication?"
+"What changes were made to worker-service.ts?"
+"Show me recent work on this project"
+```
+
+Claude automatically recognizes your intent and invokes the MCP search tools, which use a 3-layer workflow (search → timeline → get_observations) for efficient token usage.
+
+## Next Steps
+
+- [Skill-Based Search](/usage/search-tools) - Learn how to search your project history
+- [Architecture Overview](/architecture/overview) - Understand how it works
+- [Troubleshooting](/troubleshooting) - Common issues and solutions
--- a/.agent/services/claude-mem/docs/public/usage/manual-recovery.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/manual-recovery.mdx
@@ -0,0 +1,450 @@
+---
+title: "Manual Recovery"
+description: "Recover stuck observations after worker crashes or restarts"
+---
+
+# Manual Recovery Guide
+
+## Overview
+
+Claude-mem's manual recovery system helps you recover observations that get stuck in the processing queue after worker crashes, system restarts, or unexpected shutdowns.
+
+**Key Change in v5.x**: Automatic recovery on worker startup is now disabled. This gives you explicit control over when reprocessing happens, preventing unexpected duplicate observations.
+
+## When Do You Need Manual Recovery?
+
+You should trigger manual recovery when:
+
+- **Worker crashed or restarted** - Observations were queued but worker stopped before processing
+- **No new summaries appearing** - Observations are being saved but not processed into summaries
+- **Stuck messages detected** - Messages showing as "processing" for >5 minutes
+- **System crashes** - Unexpected shutdowns left messages in incomplete states
+
+## Quick Start
+
+### Using the CLI Tool (Recommended)
+
+The interactive CLI tool is the safest and easiest way to recover stuck observations:
+
+```bash
+# Check status and prompt for recovery
+bun scripts/check-pending-queue.ts
+```
+
+This will:
+1. Check worker health
+2. Show queue summary (pending, processing, failed, stuck counts)
+3. Display sessions with pending work
+4. Prompt you to confirm recovery
+5. Show recently processed messages for feedback
+
+### Auto-Process Without Prompts
+
+For scripting or when you're confident recovery is needed:
+
+```bash
+# Auto-process without prompting
+bun scripts/check-pending-queue.ts --process
+
+# Limit to 5 sessions
+bun scripts/check-pending-queue.ts --process --limit 5
+```
+
+## Understanding Queue States
+
+Messages progress through these lifecycle states:
+
+1. **pending** → Queued, waiting to process
+2. **processing** → Currently being processed by SDK agent
+3. **processed** → Completed successfully
+4. **failed** → Failed after 3 retry attempts
+
+### Stuck Detection
+
+Messages in `processing` state for **>5 minutes** are considered stuck:
+
+- They're automatically reset to `pending` on worker startup
+- They're NOT automatically reprocessed (requires manual trigger)
+- They appear in the `stuckCount` field when checking queue status
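+
+As a minimal sketch of that rule, assuming the `status` and `started_processing_at_epoch` columns of `pending_messages` hold the values described here (the `isStuck` helper itself is hypothetical):
+
+```typescript
+// Messages in "processing" longer than this are considered stuck.
+const STUCK_THRESHOLD_MS = 5 * 60 * 1000;
+
+interface PendingMessage {
+  status: "pending" | "processing" | "processed" | "failed";
+  started_processing_at_epoch: number | null; // epoch milliseconds
+}
+
+function isStuck(msg: PendingMessage, nowMs: number): boolean {
+  // Only messages that are actively processing can be stuck.
+  if (msg.status !== "processing") return false;
+  if (msg.started_processing_at_epoch === null) return false;
+  return nowMs - msg.started_processing_at_epoch > STUCK_THRESHOLD_MS;
+}
+```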
+
+## Recovery Methods
+
+### Method 1: Interactive CLI Tool
+
+**Best for**: Regular users, interactive sessions, when you want visibility into what's happening
+
+```bash
+bun scripts/check-pending-queue.ts
+```
+
+**Example Output**:
+```
+Checking worker health...
+Worker is healthy ✓
+
+Queue Summary:
+  Pending: 12 messages
+  Processing: 2 messages (1 stuck)
+  Failed: 0 messages
+  Recently Processed: 5 messages in last 30 minutes
+
+Sessions with pending work: 3
+  Session 44: 5 pending, 1 processing (age: 2m)
+  Session 45: 4 pending, 1 processing (age: 7m - STUCK)
+  Session 46: 2 pending
+
+Would you like to process these pending queues? (y/n)
+```
+
+**Features**:
+- ✅ Pre-flight health check (verifies worker is running)
+- ✅ Detailed queue breakdown by session
+- ✅ Age tracking for stuck detection
+- ✅ Confirmation prompt (prevents accidental reprocessing)
+- ✅ Non-interactive mode with `--process` flag
+- ✅ Session limit control with `--limit N`
+
+### Method 2: HTTP API
+
+**Best for**: Automation, scripting, integration with monitoring systems
+
+#### Check Queue Status
+
+```bash
+curl http://localhost:37777/api/pending-queue
+```
+
+**Response**:
+```json
+{
+  "queue": {
+    "messages": [
+      {
+        "id": 123,
+        "session_db_id": 45,
+        "claude_session_id": "abc123",
+        "message_type": "observation",
+        "status": "pending",
+        "retry_count": 0,
+        "created_at_epoch": 1730886600000
+      }
+    ],
+    "totalPending": 12,
+    "totalProcessing": 2,
+    "totalFailed": 0,
+    "stuckCount": 1
+  },
+  "recentlyProcessed": [...],
+  "sessionsWithPendingWork": [44, 45, 46]
+}
+```
+
+**Key Fields**:
+- `totalPending` - Messages waiting to process
+- `totalProcessing` - Messages currently processing
+- `stuckCount` - Processing messages >5 minutes old
+- `sessionsWithPendingWork` - Session IDs needing recovery
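+
+For scripts that consume this response, the key fields can be typed and checked as in the sketch below. The interface covers only the fields listed above, and the `needsRecovery` policy (recover when anything is stuck or any session has pending work) is an assumption, not a rule claude-mem enforces:
+
+```typescript
+// Partial shape of GET /api/pending-queue, limited to the key fields.
+interface PendingQueueStatus {
+  queue: {
+    totalPending: number;
+    totalProcessing: number;
+    totalFailed: number;
+    stuckCount: number;
+  };
+  sessionsWithPendingWork: number[];
+}
+
+// Hypothetical policy: is it worth triggering manual recovery?
+function needsRecovery(status: PendingQueueStatus): boolean {
+  return (
+    status.queue.stuckCount > 0 ||
+    status.sessionsWithPendingWork.length > 0
+  );
+}
+```
+
+With the worker running, `status` would come from `GET http://localhost:37777/api/pending-queue`, as in the `curl` example above.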
+
+#### Trigger Recovery
+
+```bash
+curl -X POST http://localhost:37777/api/pending-queue/process \
+  -H "Content-Type: application/json" \
+  -d '{"sessionLimit": 10}'
+```
+
+**Response**:
+```json
+{
+  "success": true,
+  "totalPendingSessions": 15,
+  "sessionsStarted": 10,
+  "sessionsSkipped": 2,
+  "startedSessionIds": [44, 45, 46, 47, 48, 49, 50, 51, 52, 53]
+}
+```
+
+**Response Fields**:
+- `totalPendingSessions` - Total sessions with pending messages in database
+- `sessionsStarted` - Sessions started by this request
+- `sessionsSkipped` - Sessions already processing (prevents duplicate agents)
+- `startedSessionIds` - Database IDs of the sessions started by this request
+
+## Best Practices
+
+### 1. Always Check Before Recovery
+
+```bash
+# Check queue status first
+curl http://localhost:37777/api/pending-queue
+
+# Or use CLI tool which checks automatically
+bun scripts/check-pending-queue.ts
+```
+
+### 2. Start with Low Session Limits
+
+```bash
+# Process only 5 sessions at a time
+bun scripts/check-pending-queue.ts --process --limit 5
+```
+
+This prevents overwhelming the worker with too many concurrent SDK agents.
+
+### 3. Monitor During Recovery
+
+Watch worker logs while recovery runs:
+
+```bash
+npm run worker:logs
+```
+
+Look for:
+- SDK agent starts: `Starting SDK agent for session...`
+- Processing completions: `Processed observation...`
+- Errors: `ERROR` or `Failed to process...`
+
+### 4. Verify Recovery Success
+
+Check recently processed messages:
+
+```bash
+curl http://localhost:37777/api/pending-queue | jq '.recentlyProcessed'
+```
+
+Or use the CLI tool which shows this automatically.
+
+### 5. Handle Failed Messages
+
+Messages that fail 3 times are marked `failed` and won't auto-retry:
+
+```bash
+# View failed messages
+sqlite3 ~/.claude-mem/claude-mem.db "
+  SELECT id, session_db_id, message_type, retry_count
+  FROM pending_messages
+  WHERE status = 'failed'
+  ORDER BY completed_at_epoch DESC;
+"
+```
+
+You can manually reset them if needed:
+
+```bash
+sqlite3 ~/.claude-mem/claude-mem.db "
+  UPDATE pending_messages
+  SET status = 'pending', retry_count = 0
+  WHERE status = 'failed';
+"
+```
+
+## Troubleshooting
+
+### Recovery Not Working
+
+**Symptom**: Triggered recovery but messages still pending
+
+**Solutions**:
+
+1. **Verify worker health**:
+   ```bash
+   curl http://localhost:37777/health
+   ```
+
+2. **Check worker logs for errors**:
+   ```bash
+   npm run worker:logs | grep -i error
+   ```
+
+3. **Restart worker**:
+   ```bash
+   npm run worker:restart
+   ```
+
+4. **Check database integrity**:
+   ```bash
+   sqlite3 ~/.claude-mem/claude-mem.db "PRAGMA integrity_check;"
+   ```
+
+### Messages Stuck Forever
+
+**Symptom**: Messages show as "processing" for hours
+
+**Solution**: Force reset stuck messages
+
+```bash
+# Reset all stuck messages to pending
+sqlite3 ~/.claude-mem/claude-mem.db "
+  UPDATE pending_messages
+  SET status = 'pending', started_processing_at_epoch = NULL
+  WHERE status = 'processing';
+"
+
+# Then trigger recovery
+bun scripts/check-pending-queue.ts --process
+```
+
+### Worker Crashes During Recovery
+
+**Symptom**: Worker stops while processing recovered messages
+
+**Solutions**:
+
+1. **Check available memory**:
+   ```bash
+   npm run worker:status
+   ```
+
+2. **Reduce session limit**:
+   ```bash
+   bun scripts/check-pending-queue.ts --process --limit 3
+   ```
+
+3. **Check for SDK errors in logs**:
+   ```bash
+   npm run worker:logs | grep -i "sdk"
+   ```
+
+4. **Increase worker memory** (if using custom runner):
+   ```bash
+   export NODE_OPTIONS="--max-old-space-size=4096"
+   npm run worker:restart
+   ```
+
+## Advanced Usage
+
+### Direct Database Inspection
+
+View all pending messages:
+
+```bash
+sqlite3 ~/.claude-mem/claude-mem.db "
+  SELECT
+    id,
+    session_db_id,
+    message_type,
+    status,
+    retry_count,
+    datetime(created_at_epoch/1000, 'unixepoch') as created_at,
+    datetime(started_processing_at_epoch/1000, 'unixepoch') as started_at,
+    CAST((strftime('%s', 'now') * 1000 - started_processing_at_epoch) / 60000 AS INTEGER) as age_minutes
+  FROM pending_messages
+  WHERE status IN ('pending', 'processing')
+  ORDER BY created_at_epoch;
+"
+```
+
+### Count Messages by Status
+
+```bash
+sqlite3 ~/.claude-mem/claude-mem.db "
+  SELECT status, COUNT(*) as count
+  FROM pending_messages
+  GROUP BY status;
+"
+```
+
+### Find Sessions with Pending Work
+
+```bash
+sqlite3 ~/.claude-mem/claude-mem.db "
+  SELECT
+    session_db_id,
+    COUNT(*) as pending_count,
+    GROUP_CONCAT(message_type) as message_types
+  FROM pending_messages
+  WHERE status IN ('pending', 'processing')
+  GROUP BY session_db_id;
+"
+```
+
+### View Recent Failures
+
+```bash
+sqlite3 ~/.claude-mem/claude-mem.db "
+  SELECT
+    id,
+    session_db_id,
+    message_type,
+    retry_count,
+    datetime(completed_at_epoch/1000, 'unixepoch') as failed_at
+  FROM pending_messages
+  WHERE status = 'failed'
+  ORDER BY completed_at_epoch DESC
+  LIMIT 10;
+"
+```
+
+## Integration Examples
+
+### Cron Job for Automatic Recovery
+
+```bash
+#!/bin/bash
+# Run every hour to process stuck queues
+
+# Check if worker is healthy
+if curl -f http://localhost:37777/health > /dev/null 2>&1; then
+  # Auto-process up to 5 sessions
+  bun scripts/check-pending-queue.ts --process --limit 5
+else
+  echo "Worker not healthy, skipping recovery"
+  exit 1
+fi
+```
+
+### Monitoring Script
+
+```bash
+#!/bin/bash
+# Alert if stuck count exceeds threshold
+
+STUCK_COUNT=$(curl -s http://localhost:37777/api/pending-queue | jq -r '.queue.stuckCount // 0')
+
+# Fall back to 0 when the worker is unreachable and no count was returned
+if [ "${STUCK_COUNT:-0}" -gt 5 ]; then
+  echo "WARNING: $STUCK_COUNT stuck messages detected"
+  # Send alert (email, Slack, etc.)
+fi
+```
+
+### Pre-Shutdown Recovery
+
+```bash
+#!/bin/bash
+# Process pending queues before system shutdown
+
+echo "Processing pending queues before shutdown..."
+bun scripts/check-pending-queue.ts --process --limit 20
+
+echo "Waiting for processing to complete..."
+sleep 10
+
+echo "Stopping worker..."
+npm run worker:stop
+```
+
+## Migration Note
+
+If you're upgrading from v4.x to v5.x:
+
+**v4.x Behavior** (Automatic Recovery):
+- Worker automatically recovered stuck messages on startup
+- No user control over reprocessing timing
+
+**v5.x Behavior** (Manual Recovery):
+- Stuck messages detected but NOT automatically reprocessed
+- User must explicitly trigger recovery via CLI or API
+- Prevents unexpected duplicate observations
+- Provides explicit control over when processing happens
+
+**Migration Steps**:
+1. Upgrade to v5.x
+2. Check for stuck messages: `bun scripts/check-pending-queue.ts`
+3. Process if needed: `bun scripts/check-pending-queue.ts --process`
+4. Add recovery to your workflow (cron job, pre-shutdown script, etc.)
+
+## See Also
+
+- [Worker Service Architecture](../architecture/worker-service) - Technical details on queue processing
+- [Troubleshooting - Manual Recovery](../troubleshooting#manual-recovery-for-stuck-observations) - Common issues and solutions
+- [Database Schema](../architecture/database) - Pending messages table structure
--- a/.agent/services/claude-mem/docs/public/usage/openrouter-provider.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/openrouter-provider.mdx
@@ -0,0 +1,320 @@
+---
+title: "OpenRouter Provider"
+description: "Access 100+ AI models through OpenRouter's unified API, including free models for cost-effective observation extraction"
+---
+
+# OpenRouter Provider
+
+Claude-mem supports [OpenRouter](https://openrouter.ai) as an alternative provider for observation extraction. OpenRouter provides a unified API to access 100+ models from different providers including Google, Meta, Mistral, DeepSeek, and many others—often with generous free tiers.
+
+<Tip>
+**Free Models Available**: OpenRouter offers several completely free models, making it an excellent choice for reducing observation extraction costs to zero while maintaining quality.
+</Tip>
+
+## Why Use OpenRouter?
+
+- **Access to 100+ models**: Choose from models across multiple providers through one API
+- **Free tier options**: Several high-quality models are completely free to use
+- **Cost flexibility**: Pay-as-you-go pricing on premium models with no commitments
+- **Seamless fallback**: Automatically falls back to Claude if OpenRouter is unavailable
+- **Hot-swappable**: Switch providers without restarting the worker
+- **Multi-turn conversations**: Full conversation history maintained across API calls
+
+## Free Models on OpenRouter
+
+OpenRouter offers a number of free models that are suitable for observation extraction.
+
+### Featured Free Models
+
+| Model | ID | Parameters | Context | Best For |
+|-------|------|------------|---------|----------|
+| **Xiaomi MiMo-V2-Flash** | `xiaomi/mimo-v2-flash:free` | 309B (15B active, MoE) | 256K | Reasoning, coding, agents |
+| **Gemini 2.0 Flash** | `google/gemini-2.0-flash-exp:free` | — | 1M | General purpose |
+| **Gemini 2.5 Flash** | `google/gemini-2.5-flash-preview:free` | — | 1M | Latest capabilities |
+| **DeepSeek R1** | `deepseek/deepseek-r1:free` | 671B | 64K | Reasoning, analysis |
+| **Llama 3.1 70B** | `meta-llama/llama-3.1-70b-instruct:free` | 70B | 128K | General purpose |
+| **Llama 3.1 8B** | `meta-llama/llama-3.1-8b-instruct:free` | 8B | 128K | Fast, lightweight |
+| **Mistral Nemo** | `mistralai/mistral-nemo:free` | 12B | 128K | Efficient performance |
+
+<Note>
+**Default Model**: Claude-mem uses `xiaomi/mimo-v2-flash:free` by default. It is a 309B-parameter mixture-of-experts model (15B active) that reportedly ranks among the top open models on SWE-bench Verified and performs well on coding and reasoning tasks.
+</Note>
+
+### Free Model Considerations
+
+- **Rate limits**: Free models may have stricter rate limits than paid models
+- **Availability**: Free capacity depends on provider partnerships and demand
+- **Queue times**: During peak usage, requests may be queued briefly
+- **Max tokens**: Most free models support 65,536 completion tokens
+
+Most free models support the following, though it is worth confirming on each model's OpenRouter page:
+- Tool use and function calling
+- Temperature and sampling controls
+- Stop sequences
+- Streaming responses
+
+## Getting an API Key
+
+1. Go to [OpenRouter](https://openrouter.ai)
+2. Sign in with Google, GitHub, or email
+3. Navigate to [API Keys](https://openrouter.ai/keys)
+4. Click **Create Key**
+5. Copy and securely store your API key
+
+<Tip>
+**Free to start**: No credit card required to create an account or use free models. Add credits only if you want to use premium models.
+</Tip>
+
+## Configuration
+
+### Settings
+
+| Setting | Values | Default | Description |
+|---------|--------|---------|-------------|
+| `CLAUDE_MEM_PROVIDER` | `claude`, `gemini`, `openrouter` | `claude` | AI provider for observation extraction |
+| `CLAUDE_MEM_OPENROUTER_API_KEY` | string | — | Your OpenRouter API key |
+| `CLAUDE_MEM_OPENROUTER_MODEL` | string | `xiaomi/mimo-v2-flash:free` | Model identifier (see list above) |
+| `CLAUDE_MEM_OPENROUTER_MAX_CONTEXT_MESSAGES` | number | `20` | Max messages in conversation history |
+| `CLAUDE_MEM_OPENROUTER_MAX_TOKENS` | number | `100000` | Token budget safety limit |
+| `CLAUDE_MEM_OPENROUTER_SITE_URL` | string | — | Optional: URL for analytics attribution |
+| `CLAUDE_MEM_OPENROUTER_APP_NAME` | string | `claude-mem` | Optional: App name for analytics |
+
+### Using the Settings UI
+
+1. Open the viewer at http://localhost:37777
+2. Click the **gear icon** to open Settings
+3. Under **AI Provider**, select **OpenRouter**
+4. Enter your OpenRouter API key
+5. Optionally select a different model
+
+Settings are applied immediately—no restart required.
+
+### Manual Configuration
+
+Edit `~/.claude-mem/settings.json`:
+
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "openrouter",
+  "CLAUDE_MEM_OPENROUTER_API_KEY": "sk-or-v1-your-key-here",
+  "CLAUDE_MEM_OPENROUTER_MODEL": "xiaomi/mimo-v2-flash:free"
+}
+```
+
+Alternatively, set the API key via environment variable:
+
+```bash
+export OPENROUTER_API_KEY="sk-or-v1-your-key-here"
+```
+
+The settings file takes precedence over the environment variable.
+
+## Model Selection Guide
+
+### For Free Usage (No Cost)
+
+**Recommended**: `xiaomi/mimo-v2-flash:free`
+- Best-in-class performance on coding benchmarks
+- 256K context window handles large observations
+- 65K max completion tokens
+- Mixture-of-experts architecture (15B active parameters)
+
+**Alternatives**:
+- `google/gemini-2.0-flash-exp:free` - 1M context, Google's flagship
+- `deepseek/deepseek-r1:free` - Excellent reasoning capabilities
+- `meta-llama/llama-3.1-70b-instruct:free` - Strong general purpose
+
+### For Paid Usage (Higher Quality/Speed)
+
+| Model | Price (per 1M tokens) | Best For |
+|-------|----------------------|----------|
+| `anthropic/claude-3.5-sonnet` | $3 in / $15 out | Highest quality observations |
+| `google/gemini-2.0-flash` | $0.075 in / $0.30 out | Fast, cost-effective |
+| `openai/gpt-4o` | $2.50 in / $10 out | GPT-4 quality |
+
+## Context Window Management
+
+The OpenRouter agent manages the conversation context to prevent runaway costs:
+
+### Automatic Truncation
+
+The agent uses a sliding window strategy:
+1. Checks if message count exceeds `MAX_CONTEXT_MESSAGES` (default: 20)
+2. Checks if estimated tokens exceed `MAX_TOKENS` (default: 100,000)
+3. If either limit is exceeded, keeps only the most recent messages
+4. Logs warnings with dropped message counts
+
+### Token Estimation
+
+- Conservative estimate: 1 token ≈ 4 characters
+- Used for proactive context management
+- Actual usage logged from API response
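+
+The sliding window and the 4-characters-per-token estimate can be combined as in the sketch below. This is one possible reading of the steps above, not the agent's actual code: the names `estimateTokens` and `truncateHistory` are hypothetical, and the real agent may, for example, always keep at least one message.
+
+```typescript
+interface ChatMessage {
+  role: "system" | "user" | "assistant";
+  content: string;
+}
+
+// Rough estimate: 1 token is about 4 characters.
+function estimateTokens(text: string): number {
+  return Math.ceil(text.length / 4);
+}
+
+// Keep the newest messages that fit within both limits.
+function truncateHistory(
+  history: ChatMessage[],
+  maxMessages = 20,
+  maxTokens = 100000,
+): { kept: ChatMessage[]; dropped: number } {
+  const kept: ChatMessage[] = [];
+  let tokens = 0;
+
+  // Walk from newest to oldest so recent context is preserved first.
+  for (const msg of history.slice().reverse()) {
+    const cost = estimateTokens(msg.content);
+    if (kept.length >= maxMessages || tokens + cost > maxTokens) break;
+    kept.unshift(msg); // restore chronological order
+    tokens += cost;
+  }
+
+  return { kept, dropped: history.length - kept.length };
+}
+```
+
+The `dropped` count is what a warning log line would report when truncation happens.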
+
+### Cost Tracking
+
+Logs include detailed usage information:
+
+```
+OpenRouter API usage: {
+  model: "xiaomi/mimo-v2-flash:free",
+  inputTokens: 2500,
+  outputTokens: 1200,
+  totalTokens: 3700,
+  estimatedCostUSD: "0.00",
+  messagesInContext: 8
+}
+```
+
+## Provider Switching
+
+You can switch between providers at any time:
+
+- **No restart required**: Changes take effect on the next observation
+- **Conversation history preserved**: When switching mid-session, the new provider sees the full conversation context
+- **Seamless transition**: All providers use the same observation format
+
+### Switching via UI
+
+1. Open Settings in the viewer
+2. Change the **AI Provider** dropdown
+3. The next observation will use the new provider
+
+### Switching via Settings File
+
+```json
+{
+  "CLAUDE_MEM_PROVIDER": "openrouter"
+}
+```
+
+## Fallback Behavior
+
+If OpenRouter encounters errors, claude-mem automatically falls back to the Claude Agent SDK:
+
+**Triggers fallback:**
+- Rate limiting (HTTP 429)
+- Server errors (HTTP 500, 502, 503)
+- Network issues (connection refused, timeout)
+- Generic fetch failures
+
+**Does not trigger fallback:**
+- Missing API key (logs warning, uses Claude from start)
+- Invalid API key (fails with error)
+
+When fallback occurs:
+1. A warning is logged
+2. Any in-progress messages are reset to pending
+3. Claude SDK takes over with the full conversation context
+
+<Note>
+**Fallback is transparent**: Your observations continue processing without interruption. The fallback preserves all conversation context.
+</Note>
+
+## Multi-Turn Conversation Support
+
+The OpenRouter agent maintains the full conversation history across API calls:
+
+```
+Session Created
+  ↓
+Load Pending Messages (observations from queue)
+  ↓
+For each message:
+  → Add to conversation history
+  → Call OpenRouter API with FULL history
+  → Parse XML response
+  → Store observations in database
+  → Sync to Chroma vector DB
+  ↓
+Session complete
+```
+
+This enables:
+- Coherent multi-turn exchanges
+- Context preservation across observations
+- Seamless provider switching mid-session
+
+## Troubleshooting
+
+### "OpenRouter API key not configured"
+
+Either:
+- Set `CLAUDE_MEM_OPENROUTER_API_KEY` in `~/.claude-mem/settings.json`, or
+- Set the `OPENROUTER_API_KEY` environment variable
+
+### Rate Limiting
+
+Free models may have rate limits during peak usage. If you hit rate limits:
+- Claude-mem automatically falls back to Claude SDK
+- Consider switching to a different free model
+- Add credits for premium model access
+
+### Model Not Found
+
+Verify the model ID is correct:
+- Check [OpenRouter Models](https://openrouter.ai/models) for current availability
+- Use the `:free` suffix for free model variants
+- Model IDs are case-sensitive
+
+### High Token Usage Warning
+
+If you see warnings about high token usage (>50,000 per request):
+- Reduce `CLAUDE_MEM_OPENROUTER_MAX_CONTEXT_MESSAGES`
+- Reduce `CLAUDE_MEM_OPENROUTER_MAX_TOKENS`
+- Consider a model with larger context window
+
+### Connection Errors
+
+If you see connection errors:
+- Check your internet connection
+- Verify OpenRouter service status at [status.openrouter.ai](https://status.openrouter.ai)
+- The agent will automatically fall back to Claude
+
+## API Details
+
+OpenRouter uses an OpenAI-compatible REST API:
+
+**Endpoint**: `https://openrouter.ai/api/v1/chat/completions`
+
+**Headers**:
+```
+Authorization: Bearer {apiKey}
+HTTP-Referer: https://github.com/thedotmack/claude-mem
+X-Title: claude-mem
+Content-Type: application/json
+```
+
+**Request Format**:
+```json
+{
+  "model": "xiaomi/mimo-v2-flash:free",
+  "messages": [
+    {"role": "system", "content": "..."},
+    {"role": "user", "content": "..."}
+  ],
+  "temperature": 0.3,
+  "max_tokens": 4096
+}
+```
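+
+As a rough sketch, the endpoint, headers, and request body above could be assembled like this. `build_request` is a hypothetical helper, not part of claude-mem, and it only builds the pieces without sending anything.
+
+```python
+import json
+from typing import Dict, List, Tuple
+
+ENDPOINT = "https://openrouter.ai/api/v1/chat/completions"
+
+
+def build_request(
+    api_key: str,
+    messages: List[Dict[str, str]],
+    model: str = "xiaomi/mimo-v2-flash:free",
+) -> Tuple[str, Dict[str, str], bytes]:
+    """Return (url, headers, body) for one chat completion request."""
+    headers = {
+        "Authorization": f"Bearer {api_key}",
+        "HTTP-Referer": "https://github.com/thedotmack/claude-mem",
+        "X-Title": "claude-mem",
+        "Content-Type": "application/json",
+    }
+    payload = {
+        "model": model,
+        "messages": messages,
+        "temperature": 0.3,
+        "max_tokens": 4096,
+    }
+    return ENDPOINT, headers, json.dumps(payload).encode("utf-8")
+
+
+url, headers, body = build_request(
+    "YOUR_API_KEY",  # placeholder, not a real key
+    [{"role": "system", "content": "..."}, {"role": "user", "content": "..."}],
+)
+print(url, len(body))
+```
+
+The three return values can be passed to any HTTP client, for example `urllib.request.Request(url, data=body, headers=headers, method="POST")` from the standard library.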
+
+## Comparing Providers
+
+| Feature | Claude (SDK) | Gemini | OpenRouter |
+|---------|-------------|--------|------------|
+| **Cost** | Pay per token | Free tier + paid | Free models + paid |
+| **Models** | Claude only | Gemini only | 100+ models |
+| **Quality** | Highest | High | Varies by model |
+| **Rate limits** | Based on tier | 5-4000 RPM | Varies by model |
+| **Fallback** | N/A (primary) | → Claude | → Claude |
+| **Setup** | Automatic | API key required | API key required |
+
+<Tip>
+**Recommendation**: Start with OpenRouter's free `xiaomi/mimo-v2-flash:free` model for zero-cost observation extraction. If you need higher quality or encounter rate limits, switch to Claude or add OpenRouter credits for premium models.
+</Tip>
+
+## Next Steps
+
+- [Configuration](/configuration) - Full settings reference
+- [Gemini Provider](/usage/gemini-provider) - Alternative free provider
+- [Getting Started](/usage/getting-started) - Basic usage guide
+- [Troubleshooting](/troubleshooting) - Common issues
--- a/.agent/services/claude-mem/docs/public/usage/private-tags.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/private-tags.mdx
@@ -0,0 +1,195 @@
+---
+title: "Private Tags"
+description: "Control what gets stored in memory with privacy tags"
+---
+
+# Private Tags
+
+## Overview
+
+Use `<private>` tags to mark content you don't want persisted in claude-mem's observation database. This gives you fine-grained control over what gets remembered across sessions.
+
+## How It Works
+
+Wrap any content in `<private>` tags:
+
+```
+<private>
+This content will not be stored in memory
+</private>
+```
+
+Claude can see and use this content during the current session, but it won't be saved as an observation.
+
+## Use Cases
+
+### 1. Sensitive Information
+
+```
+Please analyze this error:
+
+<private>
+Error: Database connection failed
+Host: internal-db-prod.company.com
+Port: 5432
+User: admin_user
+</private>
+
+What might be causing this?
+```
+
+Claude sees the full error, but only the question gets stored.
+
+### 2. Temporary Context
+
+```
+<private>
+Here's some background context just for this session:
+- Project deadline is tomorrow
+- This is a hotfix for production
+- Manager asked for this specifically
+</private>
+
+Help me fix this bug quickly.
+```
+
+### 3. Debugging Information
+
+```
+<private>
+Debug output from previous run:
+[... 500 lines of logs ...]
+</private>
+
+Based on these logs, what's the root cause?
+```
+
+### 4. Exploratory Prompts
+
+```
+<private>
+I'm just brainstorming here, not making a final decision
+</private>
+
+What are some wild approaches to solving this?
+```
+
+## Technical Details
+
+### Tag Behavior
+
+- **Multiline support**: Tags can wrap multiple lines of content
+- **Multiple tags**: You can use multiple `<private>` sections in one message
+- **Nested tags**: Inner tags are removed together with the enclosing outer tag
+- **Always active**: No configuration needed - works automatically
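+
+The behaviors listed above can be illustrated with a small depth-counting scanner. This is a sketch only: `strip_private` is a hypothetical name, not the project's own function, and dropping everything after an unclosed `<private>` tag is an assumption made here to stay on the safe side.
+
+```python
+import re
+
+_TAG = re.compile(r"</?private>")
+
+
+def strip_private(text: str) -> str:
+    """Remove every <private>...</private> region, including nested ones."""
+    kept = []
+    depth = 0  # how many <private> tags are currently open
+    pos = 0    # end of the last tag seen
+    for match in _TAG.finditer(text):
+        if depth == 0:
+            # Text before this tag lies outside any private region: keep it.
+            kept.append(text[pos:match.start()])
+        if match.group() == "<private>":
+            depth += 1
+        elif depth > 0:
+            depth -= 1
+        pos = match.end()
+    if depth == 0:
+        kept.append(text[pos:])
+    # Assumption: if a tag was never closed, the trailing text is dropped.
+    return "".join(kept)
+
+
+print(strip_private("keep <private>outer <private>inner</private> more</private> end"))
+```
+
+Counting depth instead of using a single non-greedy pattern is what lets an inner `</private>` pass without ending the outer region early.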
+
+### What Gets Filtered
+
+The `<private>` tag filters content from storage and memory:
+- **User prompt storage** - Tags are stripped before saving to the user_prompts table
+- **Tool inputs** - Parameters passed to tools are filtered before observation creation
+- **Tool responses** - Output from tools is filtered before observation creation
+- **All searchable content** - Private content never reaches the database or search indices
+
+**Important**: Tags are stripped during storage, not from the live conversation. Claude sees the full content including `<private>` tags during the session, and they only disappear when content is persisted to the database.
+
+### What Doesn't Get Filtered
+
+- Session summaries (generated from non-private observations only)
+- Claude's responses (not captured by claude-mem)
+
+## Examples
+
+### Example 1: API Keys
+
+```
+<private>
+API_KEY=sk-proj-abc123xyz789
+</private>
+
+Test this API connection for me
+```
+
+The API key won't be stored, but Claude can use it during the session.
+
+### Example 2: Personal Notes
+
+```
+<private>
+Note to self: This is for the Smith project - the one we discussed
+last Tuesday. Don't confuse with the Jones project.
+</private>
+
+Review the authentication implementation and suggest improvements.
+```
+
+The personal context helps Claude understand your request without polluting your observation history.
+
+## Best Practices
+
+1. **Don't over-tag**: Only use `<private>` for content you genuinely don't want stored
+2. **Context matters**: Claude's understanding of your project comes from observations - excessive private tagging reduces future context quality
+3. **Secrets belong elsewhere**: While `<private>` prevents storage, sensitive data should still use proper secrets management
+4. **Test it works**: Check `~/.claude-mem/silent.log` if you're unsure whether tags are being stripped
+
+## Verification
+
+To verify tags are working:
+
+1. Submit a prompt with `<private>` tags
+2. Check the database to ensure private content is not stored:
+   ```bash
+   # Check user prompts
+   sqlite3 ~/.claude-mem/claude-mem.db "SELECT prompt_text FROM user_prompts ORDER BY created_at_epoch DESC LIMIT 1;"
+
+   # Check observations
+   sqlite3 ~/.claude-mem/claude-mem.db "SELECT narrative FROM observations ORDER BY created_at_epoch DESC LIMIT 1;"
+   ```
+3. The private content should NOT appear in either user_prompts or observations
+4. The `<private>` tags themselves should also be stripped
+
+## Architecture
+
+The `<private>` tag uses an **edge processing pattern**:
+
+- Content is filtered at the hook layer before any storage
+  - **UserPromptSubmit hook**: Strips tags from user prompts before saving to the user_prompts table (your typed prompts are cleaned before database storage)
+  - **PostToolUse hook**: Strips tags from serialized tool_input and tool_response JSON before observation creation
+- Filtering happens before data reaches the worker service or database
+- This keeps the worker simple and follows a one-way data stream
+- Tags remain visible in the live conversation but are stripped from all persistent storage
+
+**Tag Stripping Scope**: The implementation strips tags from the *serialized JSON representations* of tool inputs and tool responses, not from the original user prompt text in the conversation UI. The user prompt text you type is stored in a separate table (user_prompts) where tags are also stripped before storage.
+
+This design ensures that private content never reaches the database, search indices, or memory agent, maintaining a clean separation between ephemeral and persistent data.
+
+## Related Features
+
+- [Search Tools](/usage/search-tools) - How to search past observations
+- [Getting Started](/usage/getting-started) - Basic usage guide
+- [Configuration](/configuration) - System settings and environment variables
+
+## Troubleshooting
+
+### Tags Not Being Stripped
+
+1. Verify correct syntax: `<private>content</private>`
+2. Check `~/.claude-mem/silent.log` for errors
+3. Ensure worker is running: `npm run worker:status`
+4. Restart worker: `npm run worker:restart`
+
+### Partial Content Stored
+
+If content appears partially in observations:
+- Ensure tags are properly closed
+- Check for typos in tag names
+- Verify content is inside tool executions (not just in your prompt text)
+
+### Silent Log Shows Errors
+
+If you see errors in `~/.claude-mem/silent.log`:
+```
+[save-hook] stripMemoryTags received non-string: { type: 'object' }
+```
+
+This is usually harmless - it indicates defensive type checking is working. However, if you see these frequently, it may indicate a bug. Please report it at https://github.com/thedotmack/claude-mem/issues
--- a/.agent/services/claude-mem/docs/public/usage/search-tools.mdx
+++ b/.agent/services/claude-mem/docs/public/usage/search-tools.mdx
@@ -0,0 +1,454 @@
+---
+title: "Memory Search"
+description: "Search your project history with MCP tools"
+---
+
+# Memory Search with MCP Tools
+
+Claude-mem provides persistent memory across sessions through **4 MCP tools** that follow a token-efficient **3-layer workflow pattern**.
+
+## Overview
+
+Instead of fetching all historical data upfront (expensive), claude-mem uses a progressive disclosure approach:
+
+1. **Search** → Get a compact index with IDs (~50-100 tokens/result)
+2. **Timeline** → Get context around interesting results
+3. **Get Observations** → Fetch full details ONLY for filtered IDs
+
+This achieves **~10x token savings** compared to traditional RAG approaches.
+
+## The 3-Layer Workflow
+
+### Layer 1: Search (Index)
+
+Start by searching to get a lightweight index of results:
+
+```
+search(query="authentication bug", type="bugfix", limit=10)
+```
+
+**Returns:** Compact table with IDs, titles, dates, types
+**Cost:** ~50-100 tokens per result
+**Purpose:** Survey what exists before fetching details
+
+### Layer 2: Timeline (Context)
+
+Get chronological context around specific observations:
+
+```
+timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
+```
+
+Or search and get timeline in one step:
+
+```
+timeline(query="authentication", depth_before=2, depth_after=2)
+```
+
+**Returns:** Chronological view showing what was happening before/after
+**Cost:** Variable, depends on depth
+**Purpose:** Understand narrative arc and context
+
+### Layer 3: Get Observations (Details)
+
+Fetch full details only for relevant observations:
+
+```
+get_observations(ids=[123, 456, 789])
+```
+
+**Returns:** Complete observation details (narrative, facts, files, concepts)
+**Cost:** ~500-1000 tokens per observation
+**Purpose:** Deep dive on specific, validated items
+
+### Why This Works
+
+**Traditional Approach:**
+- Fetch everything upfront: 20,000 tokens
+- Relevance: ~10% (2,000 tokens actually useful)
+- Waste: 18,000 tokens on irrelevant context
+
+**3-Layer Approach:**
+- Search index: 1,000 tokens (10 results)
+- Timeline context: 500 tokens (around 2 key results)
+- Fetch details: 1,500 tokens (3 observations)
+- **Total: 3,000 tokens, 100% relevant**
+
+## Available Tools
+
+### `__IMPORTANT` - Workflow Documentation
+
+Always visible reminder of the 3-layer workflow pattern. Helps Claude understand how to use the search tools efficiently.
+
+**Usage:** Automatically shown, no need to invoke
+
+### `search` - Search Memory Index
+
+Search your memory and get a compact index with IDs.
+
+**Parameters:**
+- `query` - Full-text search query (supports AND, OR, NOT, phrase searches)
+- `limit` - Maximum results (default: 20)
+- `offset` - Skip first N results for pagination
+- `type` - Filter by observation type (bugfix, feature, decision, discovery, refactor, change)
+- `obs_type` - Filter by record type (observation, session, prompt)
+- `project` - Filter by project name
+- `dateStart` - Filter by start date (YYYY-MM-DD)
+- `dateEnd` - Filter by end date (YYYY-MM-DD)
+- `orderBy` - Sort order (date_desc, date_asc, relevance)
+
+**Returns:** Compact index table with IDs, titles, dates, types
+
+**Example:**
+```
+search(query="database migration", type="bugfix", limit=5, orderBy="date_desc")
+```
+
+### `timeline` - Get Chronological Context
+
+Get a chronological view of observations around a specific point or query.
+
+**Parameters:**
+- `anchor` - Observation ID to center timeline around (optional if query provided)
+- `query` - Search query to find anchor automatically (optional if anchor provided)
+- `depth_before` - Number of observations before anchor (default: 3)
+- `depth_after` - Number of observations after anchor (default: 3)
+- `project` - Filter by project name
+
+**Returns:** Chronological list showing what happened before/during/after
+
+**Example:**
+```
+timeline(anchor=12345, depth_before=5, depth_after=5)
+```
+
+Or search-based:
+```
+timeline(query="implemented JWT auth", depth_before=3, depth_after=3)
+```
+
+### `get_observations` - Fetch Full Details
+
+Fetch complete observation details by IDs. **Always batch multiple IDs in a single call for efficiency.**
+
+**Parameters:**
+- `ids` - Array of observation IDs (required)
+- `orderBy` - Sort order (date_desc, date_asc)
+- `limit` - Maximum observations to return
+- `project` - Filter by project name
+
+**Returns:** Complete observation details including narrative, facts, files, concepts
+
+**Example:**
+```
+get_observations(ids=[123, 456, 789, 1011])
+```
+
+**Important:** Always batch IDs instead of making separate calls per observation.
+
+## Common Use Cases
+
+### Debugging Issues
+
+**Scenario:** Find what went wrong with database connections
+
+```
+Step 1: search(query="error database connection", type="bugfix", limit=10)
+  → Review index, identify observations #245, #312, #489
+
+Step 2: timeline(anchor=312, depth_before=3, depth_after=3)
+  → See what was happening around the fix
+
+Step 3: get_observations(ids=[312, 489])
+  → Get full details on relevant fixes
+```
+
+### Understanding Decisions
+
+**Scenario:** Review architectural choices about authentication
+
+```
+Step 1: search(query="authentication", type="decision", limit=5)
+  → Find decision observations
+
+Step 2: get_observations(ids=[<relevant_ids>])
+  → Get full decision rationale, trade-offs, facts
+```
+
+### Code Archaeology
+
+**Scenario:** Find when a specific file was modified
+
+```
+Step 1: search(query="worker-service.ts", limit=20)
+  → Get all observations mentioning that file
+
+Step 2: timeline(query="worker-service.ts refactor", depth_before=2, depth_after=2)
+  → See what led to and followed from the refactor
+
+Step 3: get_observations(ids=[<specific_observation_ids>])
+  → Get implementation details
+```
+
+### Feature History
+
+**Scenario:** Track how a feature evolved
+
+```
+Step 1: search(query="dark mode", type="feature", orderBy="date_asc")
+  → Chronological view of feature work
+
+Step 2: timeline(anchor=<first_observation_id>, depth_after=10)
+  → See the full development timeline
+
+Step 3: get_observations(ids=[<key_milestones>])
+  → Deep dive on critical implementation points
+```
+
+### Learning from Past Work
+
+**Scenario:** Review refactoring patterns
+
+```
+Step 1: search(type="refactor", limit=10, orderBy="date_desc")
+  → Recent refactoring work
+
+Step 2: get_observations(ids=[<interesting_ids>])
+  → Study the patterns and approaches used
+```
+
+### Context Recovery
+
+**Scenario:** Restore context after time away from project
+
+```
+Step 1: search(query="project-name", limit=10, orderBy="date_desc")
+  → See recent work
+
+Step 2: timeline(anchor=<most_recent_id>, depth_before=10)
+  → Understand what led to current state
+
+Step 3: get_observations(ids=[<critical_observations>])
+  → Refresh memory on key decisions
+```
+
+## Search Query Syntax
+
+The `query` parameter supports SQLite FTS5 full-text search syntax:
+
+### Boolean Operators
+
+```
+query="authentication AND JWT"           # Both terms must appear
+query="OAuth OR JWT"                      # Either term can appear
+query="security NOT deprecated"           # Exclude deprecated items
+```
+
+### Phrase Searches
+
+```
+query='"database migration"'             # Exact phrase match
+```
+
+### Column-Specific Searches
+
+```
+query="title:authentication"             # Search in title only
+query="content:database"                  # Search in content only
+query="concepts:security"                 # Search in concepts only
+```
+
+### Combining Operators
+
+```
+query='"user auth" AND (JWT OR session) NOT deprecated'
+```
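+
+To try this syntax outside claude-mem, the sketch below builds a throwaway in-memory FTS5 table and runs a few of the queries above. The table name `obs`, its three columns, and the sample rows are assumptions for illustration and do not reflect the real database schema. It also assumes your Python's SQLite build includes FTS5.
+
+```python
+import sqlite3
+
+# Hypothetical sample data: (title, content, concepts).
+ROWS = [
+    ("Fix authentication bug", "JWT token refresh failed", "security"),
+    ("OAuth login", "Added OAuth provider support", "authentication"),
+    ("Database migration", "Removed deprecated security column", "database"),
+]
+
+
+def open_demo_db() -> sqlite3.Connection:
+    """Create an in-memory FTS5 table holding the sample rows."""
+    db = sqlite3.connect(":memory:")
+    db.execute("CREATE VIRTUAL TABLE obs USING fts5(title, content, concepts)")
+    db.executemany(
+        "INSERT INTO obs (title, content, concepts) VALUES (?, ?, ?)", ROWS
+    )
+    return db
+
+
+def match(db: sqlite3.Connection, query: str) -> list:
+    """Return the rowids matching an FTS5 query, in insertion order."""
+    sql = "SELECT rowid FROM obs WHERE obs MATCH ? ORDER BY rowid"
+    return [row[0] for row in db.execute(sql, (query,))]
+
+
+db = open_demo_db()
+for q in [
+    "authentication AND JWT",
+    "OAuth OR JWT",
+    "security NOT deprecated",
+    '"database migration"',
+    "title:authentication",
+]:
+    print(q, "->", match(db, q))
+```
+
+Unqualified terms match in any column, while a `title:` prefix restricts the term to that column.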
+
+## Token Management
+
+### Token Efficiency Best Practices
+
+1. **Always start with search** - Get index first (~50-100 tokens/result)
+2. **Use small limits** - Start with 3-5 results, increase if needed
+3. **Filter before fetching** - Use type, date, project filters
+4. **Batch get_observations** - Always group multiple IDs in one call
+5. **Use timeline strategically** - Get context only when narrative matters
+
+### Token Cost Estimates
+
+| Operation | Tokens per Result |
+|-----------|-------------------|
+| search (index) | 50-100 |
+| timeline (per observation) | 100-200 |
+| get_observations (full details) | 500-1,000 |
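+
+The per-result figures in the table can be turned into a rough budget before issuing any calls. The helper below is an illustrative sketch, not part of claude-mem: `estimate` is a hypothetical name, and the numbers are the table's estimates, not measurements.
+
+```python
+from typing import Tuple
+
+# Per-result token ranges (low, high) copied from the table above.
+COST = {
+    "search": (50, 100),
+    "timeline": (100, 200),
+    "get_observations": (500, 1000),
+}
+
+
+def estimate(**counts: int) -> Tuple[int, int]:
+    """Return the (low, high) token estimate for a planned set of calls.
+
+    Keyword names are operations from COST; values are result counts.
+    """
+    low = sum(COST[op][0] * n for op, n in counts.items())
+    high = sum(COST[op][1] * n for op, n in counts.items())
+    return low, high
+
+
+# Index 20 results, then fetch 3 in full, versus fetching all 20 in full.
+print(estimate(search=20, get_observations=3))
+print(estimate(get_observations=20))
+```
+
+Comparing the two estimates before fetching makes it easier to see when a full fetch is worth its cost.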
+
+**Example Comparison:**
+
+**Inefficient:**
+```
+# Fetching 20 full observations upfront: 10,000-20,000 tokens
+get_observations(ids=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
+```
+
+**Efficient:**
+```
+# Search index: ~1,000-2,000 tokens
+search(query="bug fix", limit=20)
+
+# Review IDs, identify 3 relevant observations
+
+# Fetch only relevant: ~1,500-3,000 tokens
+get_observations(ids=[5, 12, 18])
+
+# Total: 2,500-5,000 tokens (vs 10,000-20,000)
+```
+
+## Advanced Filtering
+
+### Date Ranges
+
+```
+search(
+  query="performance optimization",
+  dateStart="2025-10-01",
+  dateEnd="2025-10-31"
+)
+```
+
+### Multiple Types
+
+To find observations of multiple types, run one search per type or use a broader query:
+
+```
+search(query="database", type="bugfix", limit=10)
+search(query="database", type="feature", limit=10)
+```
+
+### Project-Specific
+
+```
+search(query="API", project="my-app", limit=15)
+```
+
+### Pagination
+
+```
+# First page
+search(query="refactor", limit=10, offset=0)
+
+# Second page
+search(query="refactor", limit=10, offset=10)
+
+# Third page
+search(query="refactor", limit=10, offset=20)
+```
+
+## Result Metadata
+
+All observations include rich metadata:
+
+- **ID** - Unique observation identifier
+- **Type** - bugfix, feature, decision, discovery, refactor, change
+- **Date** - When the work occurred
+- **Title** - Concise description
+- **Concepts** - Tagged themes (e.g., security, performance, architecture)
+- **Files Read** - Files examined during work
+- **Files Modified** - Files changed during work
+- **Narrative** - Story of what happened and why
+- **Facts** - Key factual points (decisions made, patterns used, metrics)
+
+## Troubleshooting
+
+### No Results Found
+
+1. **Broaden your search:**
+   ```
+   # Too specific
+   search(query="JWT authentication implementation with RS256")
+
+   # Better
+   search(query="authentication")
+   ```
+
+2. **Check database has data:**
+   ```bash
+   curl "http://localhost:37777/api/search?query=test"
+   ```
+
+3. **Try without filters:**
+   ```
+   # Remove type/date filters to see if data exists
+   search(query="your-search-term")
+   ```
+
+### IDs Not Found in get_observations
+
+**Error:** "Observation IDs not found: [123, 456]"
+
+**Causes:**
+- IDs belong to a different project (use the `project` parameter)
+- IDs were deleted
+- Typo in ID numbers
+
+**Solution:**
+```
+# Verify IDs exist
+search(query="<related-search>")
+
+# Use correct project filter
+get_observations(ids=[123, 456], project="correct-project-name")
+```
+
+### Token Limit Errors
+
+**Error:** Response exceeds token limits
+
+**Solution:** Use the 3-layer workflow to reduce upfront costs:
+
+```
+# Instead of fetching 50 full observations:
+# get_observations(ids=[1,2,3,...,50])  # 25,000-50,000 tokens!
+
+# Do this:
+search(query="<your-query>", limit=50)  # ~2,500-5,000 tokens
+# Review index, identify 5 relevant observations
+get_observations(ids=[<5-most-relevant>])  # ~2,500-5,000 tokens
+# Total: 5,000-10,000 tokens (roughly 60-90% savings)
+```
+
+### Search Performance
+
+If searches seem slow:
+1. Be more specific in queries (helps FTS5 index)
+2. Use date range filters to narrow scope
+3. Specify project filter when possible
+4. Use smaller limit values
+
+## Best Practices
+
+1. **Index First, Details Later** - Always start with search to survey options
+2. **Filter Before Fetching** - Use search parameters to narrow results
+3. **Batch ID Fetches** - Group multiple IDs in one get_observations call
+4. **Use Timeline for Context** - When narrative matters, timeline shows the story
+5. **Specific Queries** - More specific = better relevance
+6. **Small Limits Initially** - Start with 3-5 results, expand if needed
+7. **Review Before Deep Dive** - Check index before fetching full details
+
+## Technical Details
+
+**Architecture:** MCP tools are a thin wrapper over the Worker HTTP API (localhost:37777). The MCP server translates tool calls into HTTP requests to the worker service, which handles all business logic, database queries, and Chroma vector search.
+
+**MCP Server:** Located at `~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs`
+
+**Worker Service:** Express API on port 37777, managed by Bun
+
+**Database:** SQLite FTS5 full-text search on `~/.claude-mem/claude-mem.db`
+
+**Vector Search:** Chroma embeddings for semantic search (underlying implementation)
+
+## Next Steps
+
+- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow
+- [Architecture Overview](/architecture/overview) - System components
+- [Database Schema](/architecture/database) - Understanding the data structure
+- [Claude Desktop Setup](/usage/claude-desktop) - Installation and configuration