chore(docs): document ScoreExtractor tiling and refactor debug scripts (#563)

2026-03-29 17:57:40 +09:00
parent 39b55f2e9f
commit ac0c098259
698 changed files with 141180 additions and 195 deletions
--- a/.agent/services/claude-mem/.claude/commands/anti-pattern-czar.md
+++ b/.agent/services/claude-mem/.claude/commands/anti-pattern-czar.md
@@ -0,0 +1,121 @@
+# Anti-Pattern Czar
+
+You are the **Anti-Pattern Czar**, an expert at identifying and fixing error-handling anti-patterns.
+
+## Your Mission
+
+Help the user systematically fix error-handling anti-patterns detected by the automated scanner.
+
+## Process
+
+1. **Run the detector:**
+   ```bash
+   bun run scripts/anti-pattern-test/detect-error-handling-antipatterns.ts
+   ```
+
+2. **Analyze the results:**
+   - Count CRITICAL, HIGH, MEDIUM, and APPROVED_OVERRIDE issues
+   - Prioritize CRITICAL issues on critical paths first
+   - Group similar patterns together
+
+3. **For each CRITICAL issue:**
+
+   a. **Read the problematic code** using the Read tool
+
+   b. **Explain the problem:**
+      - Why is this dangerous?
+      - What debugging nightmare could this cause?
+      - What specific error is being swallowed?
+
+   c. **Determine the right fix:**
+      - **Option 1: Add proper logging** - If this is a real error that should be visible
+      - **Option 2: Add [APPROVED OVERRIDE]** - If this is expected/documented behavior
+      - **Option 3: Remove the try-catch entirely** - If the error should propagate
+      - **Option 4: Add specific error type checking** - If only certain errors should be caught
+
+   d. **Propose the fix** and ask for approval
+
+   e. **Apply the fix** after approval
+
+4. **Work through issues methodically:**
+   - Fix one at a time
+   - Re-run the detector after each batch of fixes
+   - Track progress: "Fixed 3/28 critical issues"
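Fix options 1 and 4 from step 3c can be sketched in TypeScript as follows. This is a minimal illustration, not code from the repository: `parseOptionalField`, `runStep`, and the `log` callback are hypothetical names introduced here.

```typescript
// Hypothetical sketch of fix options 1 and 4; none of these names come from the codebase.

// Option 4: catch only the error type that is expected, and let everything else propagate.
function parseOptionalField<T>(parse: () => T, fallback: T): T {
  try {
    return parse();
  } catch (error) {
    if (error instanceof SyntaxError) {
      // Expected: malformed optional data. Recover with an explicit fallback value.
      return fallback;
    }
    throw error; // Unexpected errors stay visible to the caller.
  }
}

// Option 1: make the failure visible with context, then rethrow so it is not swallowed.
function runStep<T>(name: string, step: () => T, log: (message: string) => void): T {
  try {
    return step();
  } catch (error) {
    const detail = error instanceof Error ? error.message : String(error);
    log(`${name} failed: ${detail}`);
    throw error;
  }
}
```

Both helpers keep the failure either handled with an explicit fallback or visible to the caller, which is the property the detector is checking for.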
+
+## Guidelines for Approved Overrides
+
+Only approve overrides when ALL of these are true:
+- The error is **expected and frequent** (e.g., JSON parse on optional fields)
+- Logging would create **too much noise** (high-frequency operations)
+- There's **explicit recovery logic** (fallback value, retry, graceful degradation)
+- The reason is **specific and technical** (not vague like "seems fine")
+
+## Valid Override Examples:
+
+✅ **GOOD:**
+- "Expected JSON parse failures for optional data fields, too frequent to log"
+- "Logger can't log its own failures, using stderr as last resort"
+- "Health check port scan, expected connection failures on free port detection"
+- "Git repo detection, expected failures when not in a git directory"
+
+❌ **BAD:**
+- "Error is not important" (why catch it then?)
+- "Happens sometimes" (when? why?)
+- "Works fine without logging" (works until it doesn't)
+- "Optional" (optional errors still need visibility)
+
+## Critical Path Rules
+
+For files in the CRITICAL_PATHS list (SDKAgent.ts, GeminiAgent.ts, OpenRouterAgent.ts, SessionStore.ts, worker-service.ts):
+
+- **NEVER** approve overrides on critical paths without exceptional justification
+- Errors on critical paths MUST be visible (logged) or fatal (thrown)
+- Catch-and-continue on critical paths is BANNED unless explicitly approved
+- If in doubt, make it throw - fail loud, not silent
+
+## Output Format
+
+After each fix:
+```
+✅ Fixed: src/utils/example.ts:42
+   Pattern: NO_LOGGING_IN_CATCH
+   Solution: Added logger.error() with context
+
+Progress: 3/28 critical issues remaining
+```
+
+After completing a batch:
+```
+🎯 Batch complete! Re-running detector...
+[shows new results]
+```
+
+## Important
+
+- **Read the code** before proposing fixes - understand what it's doing
+- **Ask the user** if you're uncertain about the right approach
+- **Don't blindly add overrides** - challenge each one
+- **Prefer logging** over overrides when in doubt
+- **Work incrementally** - small batches, frequent validation
+
+## When Complete
+
+Report final statistics:
+```
+🎉 Anti-pattern cleanup complete!
+
+Before:
+  🔴 CRITICAL: 28
+  🟠 HIGH: 47
+  🟡 MEDIUM: 76
+
+After:
+  🔴 CRITICAL: 0
+  🟠 HIGH: 47
+  🟡 MEDIUM: 76
+  ⚪ APPROVED OVERRIDES: 15
+
+All critical anti-patterns resolved!
+```
+
+Now, ask the user: "Ready to fix error handling anti-patterns? I'll start with the critical issues."
--- a/.agent/services/claude-mem/.claude/plans/animated-installer.md
+++ b/.agent/services/claude-mem/.claude/plans/animated-installer.md
@@ -0,0 +1,371 @@
+# Comprehensive Claude-Mem Installer with @clack/prompts
+
+## Overview
+
+Build a beautiful, animated CLI installer for claude-mem using `@clack/prompts` (v1.0.1). It is distributable via `npx claude-mem-installer` and `curl -fsSL https://install.cmem.ai | bash`, and it replaces the manual steps of cloning, building, configuring settings, and starting the worker.
+
+**Worktree**: `feat/animated-installer` at `.claude/worktrees/animated-installer`
+
+---
+
+## Phase 0: Documentation & API Reference
+
+### Allowed APIs (@clack/prompts v1.0.1, ESM-only)
+
+| API | Signature | Use Case |
+|-----|-----------|----------|
+| `intro(title?)` | `void` | Opening banner |
+| `outro(message?)` | `void` | Completion message |
+| `cancel(message?)` | `void` | User cancelled |
+| `isCancel(value)` | `boolean` | Check if user pressed Ctrl+C |
+| `text(opts)` | `Promise<string \| symbol>` | API key input, port, data dir |
+| `password(opts)` | `Promise<string \| symbol>` | API key input (masked) |
+| `select(opts)` | `Promise<Value \| symbol>` | Provider, model, auth method |
+| `multiselect(opts)` | `Promise<Value[] \| symbol>` | IDE selection, observation types |
+| `confirm(opts)` | `Promise<boolean \| symbol>` | Enable Chroma, start worker |
+| `spinner()` | `SpinnerResult` | Installing deps, building, starting worker |
+| `progress(opts)` | `ProgressResult` | Multi-step installation progress |
+| `tasks(tasks[])` | `Promise<void>` | Sequential install steps |
+| `group(prompts, opts)` | `Promise<Results>` | Chain prompts with shared results |
+| `note(message, title)` | `void` | Display settings summary, next steps |
+| `log.info/success/warn/error(msg)` | `void` | Status messages |
+| `box(message, title, opts)` | `void` | Welcome box, completion summary |
+
+### Anti-Patterns
+- Do NOT use `require()` — package is ESM-only
+- Do NOT call prompts without TTY check first — hangs indefinitely in non-TTY
+- Do NOT skip the `isCancel()` check after any prompt (or use `group()` with `onCancel`)
+- Do NOT use `chalk` — use `picocolors` (clack's dep) for consistency
+- `text()` has no numeric mode — validate manually for port numbers
+- `spinner.stop()` does not accept status codes — use `spinner.error()` for failures
+
+### Distribution Patterns
+- **npx**: `package.json` `bin` field → `"./dist/index.js"`, file needs `#!/usr/bin/env node`
+- **curl|bash**: Shell bootstrap downloads JS, runs `node script.js` directly (preserves TTY)
+- **esbuild**: Bundle to single ESM file, `platform: 'node'`, `banner` for shebang
+
+### Key Source Files to Reference
+- Settings defaults: `src/shared/SettingsDefaultsManager.ts` (lines 73-125)
+- Settings validation: `src/services/server/SettingsRoutes.ts`
+- Worker startup: `src/services/worker-service.ts` (lines 337-359)
+- Health check: `src/services/infrastructure/HealthMonitor.ts`
+- Plugin registration: `plugin/.claude-plugin/plugin.json`, `.claude-plugin/marketplace.json`
+- Marketplace sync: `scripts/sync-marketplace.cjs`
+- Cursor integration: `src/services/integrations/CursorHooksInstaller.ts`
+- Existing OpenClaw installer: `install/public/openclaw.sh` (reference for logic, not code to copy)
+
+---
+
+## Phase 1: Project Scaffolding
+
+**Goal**: Set up the installer package structure with build tooling.
+
+### Tasks
+
+1. **Create directory structure** in the worktree:
+   ```
+   installer/
+   ├── src/
+   │   ├── index.ts              # Entry point with TTY guard
+   │   ├── steps/
+   │   │   ├── welcome.ts        # intro + version check
+   │   │   ├── dependencies.ts   # bun, uv, git checks
+   │   │   ├── ide-selection.ts  # IDE picker + registration
+   │   │   ├── provider.ts       # AI provider + API key
+   │   │   ├── settings.ts       # Additional settings config
+   │   │   ├── install.ts        # Clone, build, register plugin
+   │   │   ├── worker.ts         # Start worker + health check
+   │   │   └── complete.ts       # Summary + next steps
+   │   └── utils/
+   │       ├── system.ts         # OS detection, command runner
+   │       ├── dependencies.ts   # bun/uv/git install helpers
+   │       └── settings-writer.ts # Write ~/.claude-mem/settings.json
+   ├── build.mjs                 # esbuild config
+   ├── package.json              # bin, type: module, deps
+   └── tsconfig.json
+   ```
+
+2. **Create `package.json`**:
+   ```json
+   {
+     "name": "claude-mem-installer",
+     "version": "1.0.0",
+     "type": "module",
+     "bin": { "claude-mem-installer": "./dist/index.js" },
+     "files": ["dist"],
+     "scripts": {
+       "build": "node build.mjs",
+       "dev": "node build.mjs && node dist/index.js"
+     },
+     "dependencies": {
+       "@clack/prompts": "^1.0.1",
+       "picocolors": "^1.1.1"
+     },
+     "devDependencies": {
+       "esbuild": "^0.24.0",
+       "typescript": "^5.7.0",
+       "@types/node": "^22.0.0"
+     },
+     "engines": { "node": ">=18.0.0" }
+   }
+   ```
+
+3. **Create `build.mjs`**:
+   - esbuild bundle: `entryPoints: ['src/index.ts']`, `format: 'esm'`, `platform: 'node'`, `target: 'node18'`
+   - Banner: `#!/usr/bin/env node`
+   - Output: `dist/index.js`
+
+4. **Create `tsconfig.json`**:
+   - `module: "ESNext"`, `target: "ES2022"`, `moduleResolution: "bundler"`
+
+5. **Run `npm install`** in installer/ directory
+
+### Verification
+- [ ] `node build.mjs` succeeds
+- [ ] `dist/index.js` exists with shebang
+- [ ] `node dist/index.js` runs (even if the installer is still empty)
+
+---
+
+## Phase 2: Entry Point + Welcome Screen
+
+**Goal**: Create the main entry point with TTY detection and a beautiful welcome screen.
+
+### Tasks
+
+1. **`src/index.ts`** — Entry point:
+   - TTY guard: if `!process.stdin.isTTY`, print error directing user to `npx claude-mem-installer`, exit 1
+   - Import and call `runInstaller()` from steps
+   - Top-level catch → `p.cancel()` + exit 1
+
+2. **`src/steps/welcome.ts`** — Welcome step:
+   - `p.intro()` with styled title using picocolors: `" claude-mem installer "`
+   - Display version info via `p.log.info()`
+   - Check if already installed (detect `~/.claude-mem/settings.json` and `~/.claude/plugins/marketplaces/thedotmack/`)
+   - If upgrade detected, `p.confirm()`: "claude-mem is already installed. Upgrade?"
+   - `p.select()` for install mode: Fresh Install vs Upgrade vs Configure Only
+
+3. **`src/utils/system.ts`** — System utilities:
+   - `detectOS()`: returns 'macos' | 'linux' | 'windows'
+   - `commandExists(cmd)`: checks if command is in PATH
+   - `runCommand(cmd, args)`: executes shell command, returns { stdout, stderr, exitCode }
+   - `expandHome(path)`: resolves `~` to home directory
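Two of the system utilities listed above could look roughly like this. It is a sketch under assumptions: unlike the plan's zero-argument `detectOS()`, the platform string and home directory are passed in explicitly so the helpers stay pure and testable, and the `InstallerOS` type name is invented here. `commandExists` and `runCommand` are left out because they depend on process spawning.

```typescript
// Sketch only: inputs are passed in explicitly so the helpers stay pure and easy to test.
type InstallerOS = "macos" | "linux" | "windows";

// Maps a Node.js platform string (for example process.platform) to the plan's three OS labels.
function detectOS(platform: string): InstallerOS {
  if (platform === "darwin") return "macos";
  if (platform === "win32") return "windows";
  if (platform === "linux") return "linux";
  // Assumption: unsupported platforms fail loudly rather than being guessed.
  throw new Error(`Unsupported platform: ${platform}`);
}

// Resolves a leading "~" against the given home directory (for example os.homedir()).
function expandHome(path: string, home: string): string {
  if (path === "~") return home;
  if (path.startsWith("~/") || path.startsWith("~\\")) return home + path.slice(1);
  // Paths such as "~otheruser/x" or "/tmp/x" are returned unchanged.
  return path;
}
```

In the installer, callers would presumably pass `process.platform` and the result of `os.homedir()` from `node:os`.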
+
+### Verification
+- [ ] Running `node dist/index.js` shows intro banner
+- [ ] Ctrl+C triggers cancel message
+- [ ] Non-TTY (piped) shows error and exits
+
+---
+
+## Phase 3: Dependency Checks
+
+**Goal**: Check and install required dependencies (Bun, uv, git, Node.js version).
+
+### Tasks
+
+1. **`src/steps/dependencies.ts`** — Dependency checker:
+   - Use `p.tasks()` to check each dependency sequentially with animated spinners:
+     - **Node.js**: Verify >= 18.0.0 via `process.version`
+     - **git**: `commandExists('git')`, show install instructions per OS if missing
+     - **Bun**: Check PATH + common locations (`~/.bun/bin/bun`, `/usr/local/bin/bun`, `/opt/homebrew/bin/bun`). Min version 1.1.14. Offer to auto-install from `https://bun.sh/install`
+     - **uv**: Check PATH + common locations (`~/.local/bin/uv`, `~/.cargo/bin/uv`). Offer to auto-install from `https://astral.sh/uv/install.sh`
+   - For missing deps: `p.confirm()` to auto-install, or show manual instructions
+   - After install attempts, re-verify each dep
+
+2. **`src/utils/dependencies.ts`** — Install helpers:
+   - `installBun()`: downloads and runs bun install script
+   - `installUv()`: downloads and runs uv install script
+   - `findBinary(name, extraPaths[])`: searches PATH + known locations
+   - `checkVersion(binary, minVersion)`: parses `--version` output
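The version comparison behind `checkVersion(binary, minVersion)` needs to be numeric per component, since a plain string comparison would rank `1.1.9` above `1.1.14`. A minimal sketch follows, assuming the `--version` output has already been captured as a string; `parseVersion` and `meetsMinVersion` are hypothetical names.

```typescript
// Extracts the first "major.minor.patch" triple from arbitrary --version output.
function parseVersion(output: string): [number, number, number] | null {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(output);
  if (!match) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Returns true when the version found in `output` is at least `minVersion`.
// Unparseable input on either side is treated as "requirement not met".
function meetsMinVersion(output: string, minVersion: string): boolean {
  const actual = parseVersion(output);
  const required = parseVersion(minVersion);
  if (!actual || !required) return false;
  for (let i = 0; i < 3; i++) {
    // The first differing component decides the comparison.
    if (actual[i] !== required[i]) return actual[i] > required[i];
  }
  return true; // All three components are equal.
}
```

For example, `meetsMinVersion("1.0.30", "1.1.14")` is false while `meetsMinVersion("v18.19.0", "18.0.0")` is true, because the regular expression ignores the leading `v`.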
+
+### Verification
+- [ ] Shows green checkmarks for found dependencies
+- [ ] Shows yellow warnings for missing deps with install option
+- [ ] Auto-install actually installs bun/uv when confirmed
+- [ ] Fails gracefully if git is missing (can't auto-install)
+
+---
+
+## Phase 4: IDE Selection & Provider Configuration
+
+**Goal**: Let user choose IDEs and configure AI provider with API keys.
+
+### Tasks
+
+1. **`src/steps/ide-selection.ts`** — IDE picker:
+   - `p.multiselect()` with options:
+     - Claude Code (default selected, hint: "recommended")
+     - Cursor
+     - Windsurf (hint: "coming soon", disabled: true)
+   - For Claude Code: explain plugin will be registered via marketplace
+   - For Cursor: explain hooks will be installed via CursorHooksInstaller pattern
+   - Store selections for later installation steps
+
+2. **`src/steps/provider.ts`** — AI provider configuration:
+   - `p.select()` for provider:
+     - **Claude** (hint: "recommended — uses your Claude subscription")
+     - **Gemini** (hint: "free tier available")
+     - **OpenRouter** (hint: "free models available")
+   - **If Claude selected**:
+     - `p.select()` for auth method: "CLI (Max Plan subscription)" vs "API Key"
+     - If API key: `p.password()` for key input
+   - **If Gemini selected**:
+     - `p.password()` for API key (required)
+     - `p.select()` for model: gemini-2.5-flash-lite (default), gemini-2.5-flash, gemini-3-flash-preview
+     - `p.confirm()` for rate limiting (default: true)
+   - **If OpenRouter selected**:
+     - `p.password()` for API key (required)
+     - `p.text()` for model (default: `xiaomi/mimo-v2-flash:free`)
+   - Validate API keys where possible (non-empty, format check)
+
+### Verification
+- [ ] Multiselect allows picking multiple IDEs
+- [ ] Provider selection shows correct follow-up prompts
+- [ ] API keys are masked during input
+- [ ] Cancel at any step triggers graceful exit
+
+---
+
+## Phase 5: Settings Configuration
+
+**Goal**: Configure additional settings with sensible defaults.
+
+### Tasks
+
+1. **`src/steps/settings.ts`** — Settings wizard:
+   - `p.confirm()`: "Use default settings?" (recommended) — if yes, skip detailed config
+   - If customizing, use `p.group()` for:
+     - **Worker port**: `p.text()` with default 37777, validate 1024-65535
+     - **Data directory**: `p.text()` with default `~/.claude-mem`
+     - **Context observations**: `p.text()` with default 50, validate 1-200
+     - **Log level**: `p.select()` — DEBUG, INFO (default), WARN, ERROR
+     - **Python version**: `p.text()` with default 3.13
+     - **Chroma vector search**: `p.confirm()` (default: true)
+       - If yes, `p.select()` mode: local (default) vs remote
+       - If remote: `p.text()` for host, port, `p.confirm()` for SSL
+   - Show settings summary via `p.note()` before proceeding
+
+2. **`src/utils/settings-writer.ts`** — Write settings:
+   - Build flat key-value settings object matching SettingsDefaultsManager schema
+   - Merge with existing settings if upgrading (preserve user customizations)
+   - Write to `~/.claude-mem/settings.json`
+   - Create `~/.claude-mem/` directory if it doesn't exist
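Because `text()` has no numeric mode, the port and observation-count checks have to be written by hand. The sketch below returns an error message for bad input and `undefined` otherwise, which is the shape a prompt `validate` callback is expected to have (worth confirming against the installed `@clack/prompts` version). The helper names and the merge precedence in `mergeSettings` are assumptions, not taken from the codebase.

```typescript
// Returns an error message for invalid input, or undefined when the value is acceptable.
function validateIntInRange(value: string | undefined, min: number, max: number): string | undefined {
  const text = (value ?? "").trim();
  if (!/^\d+$/.test(text)) return `Enter a whole number between ${min} and ${max}`;
  const n = Number(text);
  if (n < min || n > max) return `Value must be between ${min} and ${max}`;
  return undefined;
}

// Ranges come from the plan: worker port 1024-65535, context observations 1-200.
const validatePort = (value: string | undefined) => validateIntInRange(value, 1024, 65535);
const validateObservationCount = (value: string | undefined) => validateIntInRange(value, 1, 200);

// Assumed precedence for upgrades: defaults first, then what the user already had,
// then the answers from this run. Later spreads win on key conflicts.
function mergeSettings(
  defaults: Record<string, string>,
  existing: Record<string, string>,
  chosen: Record<string, string>,
): Record<string, string> {
  return { ...defaults, ...existing, ...chosen };
}
```

Keeping the settings object flat, as the plan requires, is what lets a simple spread preserve existing customizations without a deep merge.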
+
+### Verification
+- [ ] Default settings mode skips all detailed prompts
+- [ ] Custom settings validates all inputs
+- [ ] Settings file written matches SettingsDefaultsManager schema exactly
+- [ ] Existing settings preserved on upgrade
+
+---
+
+## Phase 6: Installation Execution
+
+**Goal**: Clone repo, build plugin, register with IDEs, start worker.
+
+### Tasks
+
+1. **`src/steps/install.ts`** — Installation runner:
+   - Use `p.tasks()` for visual progress:
+     - **"Cloning claude-mem repository"**: `git clone --depth 1 https://github.com/thedotmack/claude-mem.git` to temp dir
+     - **"Installing dependencies"**: `npm install` in cloned repo
+     - **"Building plugin"**: `npm run build` in cloned repo
+     - **"Registering plugin"**: Copy plugin files to `~/.claude/plugins/marketplaces/thedotmack/`
+       - Create marketplace.json, plugin.json structure
+       - Register in `~/.claude/plugins/known_marketplaces.json`
+       - Add to `~/.claude/plugins/installed_plugins.json`
+       - Enable in `~/.claude/settings.json` under `enabledPlugins`
+     - **"Installing dependencies"** (in marketplace dir): `npm install`
+   - For Cursor (if selected):
+     - **"Configuring Cursor hooks"**: Run Cursor hooks installer logic
+     - Write hooks.json to `~/.cursor/` or project-level `.cursor/`
+     - Configure MCP in `.cursor/mcp.json`
+
+2. **`src/steps/worker.ts`** — Worker startup:
+   - Use `p.spinner()` for worker startup:
+     - Start worker: `bun plugin/scripts/worker-service.cjs` (from marketplace dir)
+     - Write PID file to `~/.claude-mem/worker.pid`
+   - Two-stage health check (follow the pattern from the OpenClaw installer):
+     - Stage 1: Poll `/api/health` — spinner message: "Starting worker service..."
+     - Stage 2: Poll `/api/readiness` — spinner message: "Initializing database..."
+     - Budget: 30 attempts, 1 second apart
+     - On success: `spinner.stop("Worker running on port {port}")`
+     - On failure: `spinner.error("Worker failed to start")`, show log path
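The two-stage health check can be sketched independently of HTTP and spinner details by injecting the probes. This is an illustration under assumptions: `pollUntilReady` and `waitForWorker` are hypothetical names, and giving each stage its own budget of 30 attempts is one reading of the plan (a shared budget would be equally plausible).

```typescript
// Polls `probe` until it reports success or the attempt budget is spent.
async function pollUntilReady(
  probe: () => Promise<boolean>,
  attempts: number,
  delayMs: number,
): Promise<boolean> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      if (await probe()) return true;
    } catch {
      // Expected while the worker is still booting: a refused connection means "not ready yet".
    }
    if (attempt < attempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Stage 1 waits for the process to answer at all; stage 2 waits for it to finish initializing.
async function waitForWorker(
  isHealthy: () => Promise<boolean>,
  isReady: () => Promise<boolean>,
  onStage: (message: string) => void,
  attempts = 30,
  delayMs = 1000,
): Promise<boolean> {
  onStage("Starting worker service...");
  if (!(await pollUntilReady(isHealthy, attempts, delayMs))) return false;
  onStage("Initializing database...");
  return pollUntilReady(isReady, attempts, delayMs);
}
```

In the installer, `isHealthy` and `isReady` would presumably wrap requests to `/api/health` and `/api/readiness`, and `onStage` would update the spinner message. The empty `catch` is deliberate here: failures are expected and frequent during startup, and the function's `false` result is the explicit recovery path.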
+
+### Verification
+- [ ] Plugin files exist at `~/.claude/plugins/marketplaces/thedotmack/`
+- [ ] known_marketplaces.json updated
+- [ ] installed_plugins.json updated
+- [ ] settings.json has enabledPlugins entry
+- [ ] Worker responds to `/api/health` with 200
+- [ ] Worker responds to `/api/readiness` with 200
+
+---
+
+## Phase 7: Completion & Summary
+
+**Goal**: Show success screen with configuration summary and next steps.
+
+### Tasks
+
+1. **`src/steps/complete.ts`** — Completion screen:
+   - `p.note()` with configuration summary:
+     - Provider + model
+     - IDEs configured
+     - Data directory
+     - Worker port
+     - Chroma enabled/disabled
+   - `p.note()` with next steps:
+     - "Open Claude Code and start a conversation — memory is automatic!"
+     - "View your memories: http://localhost:{port}"
+     - "Search past work: use /mem-search in Claude Code"
+     - If Cursor: "Open Cursor — hooks are active in your projects"
+   - `p.outro()` with styled completion message
+
+### Verification
+- [ ] Summary accurately reflects chosen settings
+- [ ] URLs use correct port from settings
+- [ ] Next steps are relevant to selected IDEs
+
+---
+
+## Phase 8: curl|bash Bootstrap Script
+
+**Goal**: Create the shell bootstrap script for `curl -fsSL https://install.cmem.ai | bash`.
+
+### Tasks
+
+1. **`install/public/install.sh`** — Bootstrap script:
+   - Check for Node.js >= 18 (required to run the installer)
+   - Download bundled installer JS to temp file
+   - Execute with `node` directly (preserves TTY for @clack/prompts)
+   - Cleanup temp file on exit (trap)
+   - Support `--non-interactive` flag passthrough
+   - Support `--provider=X --api-key=Y` flag passthrough
+
+2. **Update `install/vercel.json`** to serve `install.sh` alongside `openclaw.sh`
+
+### Verification
+- [ ] `curl -fsSL https://install.cmem.ai | bash` downloads and runs installer
+- [ ] Interactive prompts work after curl download
+- [ ] Temp file cleaned up on success and failure
+- [ ] Flags pass through correctly
+
+---
+
+## Phase 9: Final Verification
+
+### Checks
+- [ ] `npm run build` in installer/ produces single-file `dist/index.js`
+- [ ] `node dist/index.js` runs full wizard flow
+- [ ] Fresh install on clean system works end-to-end
+- [ ] Upgrade path preserves existing settings
+- [ ] Ctrl+C at any step exits cleanly
+- [ ] Non-TTY shows error message
+- [ ] All settings written match SettingsDefaultsManager.ts defaults schema
+- [ ] Worker health check succeeds after install
+- [ ] Plugin appears in Claude Code plugin list
+- [ ] grep for deprecated/non-existent APIs returns 0 results
+- [ ] No `require()` calls in source (ESM-only)
+- [ ] No `chalk` imports (use picocolors)
--- a/.agent/services/claude-mem/.claude/reports/test-audit-2026-01-05.md
+++ b/.agent/services/claude-mem/.claude/reports/test-audit-2026-01-05.md
@@ -0,0 +1,290 @@
+# Test Quality Audit Report
+
+**Date**: 2026-01-05
+**Auditor**: Claude Code (Opus 4.5)
+**Methodology**: Deep analysis with focus on anti-pattern prevention, actual functionality testing, and regression prevention
+
+---
+
+## Executive Summary
+
+**Total Test Files Audited**: 41
+**Total Test Cases**: 450+
+
+### Score Distribution
+
+| Score | Category | Count | Percentage |
+|-------|----------|-------|------------|
+| 5 | Essential | 8 | 19.5% |
+| 4 | Valuable | 15 | 36.6% |
+| 3 | Marginal | 11 | 26.8% |
+| 2 | Weak | 5 | 12.2% |
+| 1 | Delete | 2 | 4.9% |
+
+### Key Findings
+
+**Strengths**:
+- SQLite database tests are exemplary - real database operations with proper setup/teardown
+- Infrastructure tests (WMIC parsing, token calculator) use pure unit testing with no mocks
+- Search strategy tests have comprehensive coverage of edge cases
+- Logger formatTool tests are thorough and test actual transformation logic
+
+**Critical Issues**:
+- **context-builder.test.ts** has incomplete mocks that pollute the module cache, causing 81 test failures when run with the full suite
+- Several tests verify mock behavior rather than actual functionality
+- Type validation tests (export-types.test.ts) provide minimal value - TypeScript already validates types at compile time
+- Some "validation" tests only verify code patterns exist, not that they work
+
+**Recommendations**:
+1. Fix or delete context-builder.test.ts - it actively harms the test suite
+2. Delete trivial type validation tests that duplicate TypeScript compiler checks
+3. Convert heavy-mock tests to integration tests where feasible
+4. Add integration tests for critical paths (hook execution, worker API endpoints)
+
+---
+
+## Detailed Scores
+
+### Score 5 - Essential (8 test files)
+
+These tests catch real bugs, use minimal mocking, and test actual behavior.
+
+| File | Test Count | Notes |
+|------|------------|-------|
+| `tests/sqlite/observations.test.ts` | 25+ | Real SQLite operations, in-memory DB, tests actual data persistence and retrieval |
+| `tests/sqlite/sessions.test.ts` | 20+ | Real database CRUD operations, status transitions, relationship integrity |
+| `tests/sqlite/transactions.test.ts` | 15+ | Critical transaction isolation tests, rollback behavior, error handling |
+| `tests/context/token-calculator.test.ts` | 35+ | Pure unit tests, no mocks, tests actual token estimation algorithms |
+| `tests/infrastructure/wmic-parsing.test.ts` | 20+ | Pure parsing logic tests, validates Windows process enumeration edge cases |
+| `tests/utils/logger-format-tool.test.ts` | 56 | Comprehensive formatTool tests, validates JSON parsing, tool output formatting |
+| `tests/server/server.test.ts` | 15+ | Real HTTP server integration tests, actual endpoint validation |
+| `tests/cursor-hook-outputs.test.ts` | 12+ | Integration tests running actual hook scripts, validates real output |
+
+**Why Essential**: These tests catch actual bugs before production. They test real behavior with minimal abstraction. The SQLite tests in particular are exemplary - they use an in-memory database but perform real SQL operations.
+
+---
+
+### Score 4 - Valuable (15 test files)
+
+Good tests with acceptable mocking that still verify meaningful behavior.
+
+| File | Test Count | Notes |
+|------|------------|-------|
+| `tests/sqlite/prompts.test.ts` | 15+ | Real DB operations for user prompts, timestamp handling |
+| `tests/sqlite/summaries.test.ts` | 15+ | Real DB operations for session summaries |
+| `tests/worker/search/search-orchestrator.test.ts` | 30+ | Comprehensive strategy selection logic, good edge case coverage |
+| `tests/worker/search/strategies/sqlite-search-strategy.test.ts` | 25+ | Filter logic tests, date range handling |
+| `tests/worker/search/strategies/hybrid-search-strategy.test.ts` | 20+ | Ranking preservation, merge logic |
+| `tests/worker/search/strategies/chroma-search-strategy.test.ts` | 20+ | Vector search behavior, doc_type filtering |
+| `tests/worker/search/result-formatter.test.ts` | 15+ | Output formatting validation |
+| `tests/gemini_agent.test.ts` | 20+ | Multi-turn conversation flow, rate limiting fallback |
+| `tests/infrastructure/health-monitor.test.ts` | 15+ | Health check logic, threshold validation |
+| `tests/infrastructure/graceful-shutdown.test.ts` | 15+ | Shutdown sequence, timeout handling |
+| `tests/infrastructure/process-manager.test.ts` | 12+ | Process lifecycle management |
+| `tests/cursor-mcp-config.test.ts` | 10+ | MCP configuration generation validation |
+| `tests/cursor-hooks-json-utils.test.ts` | 8+ | JSON parsing utilities |
+| `tests/shared/settings-defaults-manager.test.ts` | 27 | Settings validation, migration logic |
+| `tests/context/formatters/markdown-formatter.test.ts` | 15+ | Markdown generation, terminology consistency |
+
+**Why Valuable**: These tests have some mocking but still verify important business logic. The search strategy tests are particularly good at testing the decision-making logic for query routing.
+
+---
+
+### Score 3 - Marginal (11 test files)
+
+Tests with moderate value, often too much mocking or testing obvious behavior.
+
+| File | Test Count | Issues |
+|------|------------|--------|
+| `tests/worker/agents/observation-broadcaster.test.ts` | 15+ | Heavy mocking of SSE workers, tests mock behavior more than actual broadcasting |
+| `tests/worker/agents/fallback-error-handler.test.ts` | 10+ | Error message formatting tests, low complexity |
+| `tests/worker/agents/session-cleanup-helper.test.ts` | 10+ | Cleanup logic with mocked dependencies |
+| `tests/context/observation-compiler.test.ts` | 20+ | Mock database, tests query building not actual compilation |
+| `tests/server/error-handler.test.ts` | 8+ | Mock Express response, tests formatting only |
+| `tests/cursor-registry.test.ts` | 8+ | Registry pattern tests, low risk area |
+| `tests/cursor-context-update.test.ts` | 5+ | File format validation, could be stricter |
+| `tests/hook-constants.test.ts` | 5+ | Constant validation, low value |
+| `tests/session_store.test.ts` | 10+ | In-memory store tests, straightforward logic |
+| `tests/logger-coverage.test.ts` | 8+ | Coverage verification, not functionality |
+| `tests/scripts/smart-install.test.ts` | 25+ | Path array tests, replicates rather than imports logic |
+
+**Why Marginal**: These tests provide some regression protection but either mock too heavily or test low-risk areas. The smart-install tests notably replicate the path arrays from the source file rather than testing the actual module.
+
+---
+
+### Score 2 - Weak (5 test files)
+
+Tests that mostly verify mocks work or provide little value.
+
+| File | Test Count | Issues |
+|------|------------|--------|
+| `tests/worker/agents/response-processor.test.ts` | 20+ | **Heavy mocking**: >50% setup is mock configuration. Tests verify mocks are called, not that XML parsing actually works |
+| `tests/session_id_refactor.test.ts` | 10+ | **Code pattern validation**: Tests that certain patterns exist in code, not that they work |
+| `tests/session_id_usage_validation.test.ts` | 5+ | **Static analysis as tests**: Reads files and checks for string patterns. Should be a lint rule, not a test |
+| `tests/validate_sql_update.test.ts` | 5+ | **One-time validation**: Validated a migration, no ongoing value |
+| `tests/worker-spawn.test.ts` | 5+ | **Trivial mocking**: Tests spawn config exists, doesn't test actual spawning |
+
+**Why Weak**: These tests create false confidence. The response-processor tests in particular set up elaborate mocks and then verify those mocks were called - they don't verify actual XML parsing or database operations work correctly.
+
+---
+
+### Score 1 - Delete (2 test files)
+
+Tests that actively harm the codebase or provide zero value.
+
+| File | Test Count | Issues |
+|------|------------|--------|
+| `tests/context/context-builder.test.ts` | 20+ | **CRITICAL**: Incomplete logger mock pollutes module cache. Causes 81 test failures when run with full suite. Tests verify mocks, not actual context building |
+| `tests/scripts/export-types.test.ts` | 30+ | **Zero runtime value**: Tests TypeScript type definitions compile. TypeScript compiler already does this. These tests can literally never fail at runtime |
+
+**Why Delete**:
+- **context-builder.test.ts**: This test is actively harmful. It imports the logger module with an incomplete mock (only 4 of 13+ methods mocked), and this polluted mock persists in Bun's module cache. When other tests run afterwards, they get the broken logger singleton. The test itself only verifies that mocked methods were called with expected arguments - it doesn't test actual context building logic.
+- **export-types.test.ts**: These tests instantiate TypeScript interfaces and verify properties exist. TypeScript already validates this at compile time. If a type definition is wrong, the code won't compile. These runtime tests add overhead without catching any bugs that TypeScript wouldn't already catch.
+
+---
+
+## Missing Test Coverage
+
+### Critical Gaps
+
+| Area | Risk | Current Coverage | Recommendation |
+|------|------|------------------|----------------|
+| **Hook execution E2E** | HIGH | None | Add integration tests that run hooks with real Claude Code SDK |
+| **Worker API endpoints** | HIGH | Partial (server.test.ts) | Add tests for all REST endpoints: `/observe`, `/search`, `/health` |
+| **Chroma vector sync** | HIGH | None | Add tests for ChromaSync.ts embedding generation and retrieval |
+| **Database migrations** | MEDIUM | None | Add tests for schema migrations, especially version upgrades |
+| **Settings file I/O** | MEDIUM | Partial | Add tests for settings file creation, corruption recovery |
+| **Tag stripping** | MEDIUM | None | Add tests for `<private>` and `<meta-observation>` tag handling |
+| **MCP tool handlers** | MEDIUM | None | Add tests for search, timeline, get_observations MCP tools |
+| **Error recovery** | MEDIUM | Minimal | Add tests for worker crash recovery, database corruption handling |
+
+### Recommended New Tests
+
+1. **`tests/integration/hook-execution.test.ts`**
+   - Run actual hooks with mocked Claude Code environment
+   - Verify data flows correctly through SessionStart -> PostToolUse -> SessionEnd
+
+2. **`tests/integration/worker-api.test.ts`**
+   - Start actual worker server
+   - Make real HTTP requests to all endpoints
+   - Verify response formats and error handling
+
+3. **`tests/services/chroma-sync.test.ts`**
+   - Test embedding generation with real text
+   - Test semantic similarity retrieval
+   - Test sync between SQLite and Chroma
+
+4. **`tests/utils/tag-stripping.test.ts`**
+   - Test `<private>` tag removal
+   - Test `<meta-observation>` tag handling
+   - Test nested tag scenarios
+
+---
+
+## Recommendations
+
+### Immediate Actions
+
+1. **Delete or fix `tests/context/context-builder.test.ts`** (Priority: CRITICAL)
+   - This test causes 81 other tests to fail due to module cache pollution
+   - Either complete the logger mock (all 13+ methods) or delete entirely
+   - Recommended: Delete and rewrite as integration test without mocks
+
+2. **Delete `tests/scripts/export-types.test.ts`** (Priority: HIGH)
+   - Zero runtime value - TypeScript compiler already validates types
+   - Remove to reduce test suite noise
+
+3. **Delete or convert validation tests** (Priority: MEDIUM)
+   - `tests/session_id_refactor.test.ts` - Was useful during migration, no longer needed
+   - `tests/session_id_usage_validation.test.ts` - Convert to lint rule
+   - `tests/validate_sql_update.test.ts` - Was useful during migration, no longer needed
+
+### Architecture Improvements
+
+1. **Create test utilities for common mocks**
+   - Centralize logger mock in `tests/utils/mock-logger.ts` with ALL methods
+   - Centralize database mock with proper transaction support
+   - Prevent incomplete mocks from polluting module cache
+
+2. **Add integration test suite**
+   - Create `tests/integration/` directory
+   - Run with real worker server (separate database)
+   - Test actual data flow, not mock interactions
+
+3. **Implement test isolation**
+   - Use `beforeEach` to reset module state
+   - Consider test file ordering to prevent cache pollution
+   - Add cleanup hooks for database state
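+
+A centralized logger mock could be sketched as below. This is a minimal illustration, not the project's actual helper: `createMockLogger`, `LoggedCall`, and the method names in the usage lines are hypothetical, since this report does not list the real logger interface. The sketch uses a `Proxy` so that every method name resolves to a recording function, which means the mock cannot become incomplete when the logger gains a method.
+
+```typescript
+// Hypothetical helper (for example tests/utils/mock-logger.ts); the real logger API is assumed.
+type LoggedCall = { method: string; args: unknown[] };
+type AnyLogger = Record<string, (...args: unknown[]) => void>;
+
+function createMockLogger() {
+  const calls: LoggedCall[] = [];
+  // Every property access yields a function that records the call,
+  // so the mock has no missing methods by construction.
+  const logger = new Proxy({} as AnyLogger, {
+    get: (_target, prop) => (...args: unknown[]) => {
+      calls.push({ method: String(prop), args });
+    },
+  });
+  // Clearing in place keeps the `calls` reference valid across tests.
+  const reset = (): void => {
+    calls.length = 0;
+  };
+  return { logger, calls, reset };
+}
+
+// Usage: call reset() from a beforeEach hook to isolate tests.
+const mock = createMockLogger();
+mock.logger.info("worker started");
+mock.logger.debug("payload", { id: 1 });
+console.log(mock.calls.map((c) => c.method)); // recorded method names: info, debug
+```
+
+Because the proxy answers any property, a typo in a method name will not be caught by this mock; a stricter variant would enumerate the real interface's methods explicitly.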
+
+### Quality Guidelines
+
+For future tests, follow these principles:
+
+1. **Prefer real implementations over mocks**
+   - Use in-memory SQLite instead of mock database
+   - Use real HTTP requests instead of mock req/res
+   - Mock only external services (AI APIs, file system when needed)
+
+2. **Test behavior, not implementation**
+   - Bad: "verify function X was called with argument Y"
+   - Good: "verify output contains expected data after operation"
+
+3. **Each test should be able to fail**
+   - If a test cannot fail (like type validation tests), it's not testing anything
+   - Write tests that would catch real bugs
+
+4. **Keep test setup minimal**
+   - If more than 50% of a test is mock setup, consider integration testing instead
+   - Complex mock setup often indicates testing the wrong thing
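+
+The "test behavior, not implementation" guideline can be illustrated with a small sketch. The names here (`ObservationStore`, `saveObservation`) are hypothetical, and a `Map` stands in for the in-memory SQLite database recommended above; only the shape of the assertion matters.
+
+```typescript
+// Hypothetical store; a Map stands in for a real in-memory database.
+class ObservationStore {
+  private rows = new Map<number, string>();
+  private nextId = 1;
+
+  insert(text: string): number {
+    const id = this.nextId++;
+    this.rows.set(id, text);
+    return id;
+  }
+
+  get(id: number): string | undefined {
+    return this.rows.get(id);
+  }
+}
+
+// Code under test: trims the text before storing it.
+function saveObservation(store: ObservationStore, text: string): number {
+  return store.insert(text.trim());
+}
+
+// Behavior-focused check: exercise the real store and assert on what comes back,
+// instead of asserting that insert() was called with a particular argument.
+const observationStore = new ObservationStore();
+const savedId = saveObservation(observationStore, "  fixed flaky test  ");
+if (observationStore.get(savedId) !== "fixed flaky test") {
+  throw new Error("stored observation should be trimmed");
+}
+```
+
+A check written this way keeps passing if `saveObservation` is refactored internally, and fails only when the stored result actually changes.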
+
+---
+
+## Appendix: Full Test File Inventory
+
+| File | Score | Tests | LOC | Mock % |
+|------|-------|-------|-----|--------|
+| `tests/context/context-builder.test.ts` | 1 | 20+ | 400+ | 80% |
+| `tests/context/formatters/markdown-formatter.test.ts` | 4 | 15+ | 200+ | 10% |
+| `tests/context/observation-compiler.test.ts` | 3 | 20+ | 300+ | 60% |
+| `tests/context/token-calculator.test.ts` | 5 | 35+ | 400+ | 0% |
+| `tests/cursor-context-update.test.ts` | 3 | 5+ | 100+ | 20% |
+| `tests/cursor-hook-outputs.test.ts` | 5 | 12+ | 250+ | 10% |
+| `tests/cursor-hooks-json-utils.test.ts` | 4 | 8+ | 150+ | 0% |
+| `tests/cursor-mcp-config.test.ts` | 4 | 10+ | 200+ | 20% |
+| `tests/cursor-registry.test.ts` | 3 | 8+ | 150+ | 30% |
+| `tests/gemini_agent.test.ts` | 4 | 20+ | 400+ | 40% |
+| `tests/hook-constants.test.ts` | 3 | 5+ | 80+ | 0% |
+| `tests/infrastructure/graceful-shutdown.test.ts` | 4 | 15+ | 300+ | 40% |
+| `tests/infrastructure/health-monitor.test.ts` | 4 | 15+ | 250+ | 30% |
+| `tests/infrastructure/process-manager.test.ts` | 4 | 12+ | 200+ | 35% |
+| `tests/infrastructure/wmic-parsing.test.ts` | 5 | 20+ | 240+ | 0% |
+| `tests/logger-coverage.test.ts` | 3 | 8+ | 150+ | 20% |
+| `tests/scripts/export-types.test.ts` | 1 | 30+ | 350+ | 0% |
+| `tests/scripts/smart-install.test.ts` | 3 | 25+ | 230+ | 0% |
+| `tests/server/error-handler.test.ts` | 3 | 8+ | 150+ | 50% |
+| `tests/server/server.test.ts` | 5 | 15+ | 300+ | 20% |
+| `tests/session_id_refactor.test.ts` | 2 | 10+ | 200+ | N/A |
+| `tests/session_id_usage_validation.test.ts` | 2 | 5+ | 150+ | N/A |
+| `tests/session_store.test.ts` | 3 | 10+ | 180+ | 10% |
+| `tests/shared/settings-defaults-manager.test.ts` | 4 | 27 | 400+ | 20% |
+| `tests/sqlite/observations.test.ts` | 5 | 25+ | 400+ | 0% |
+| `tests/sqlite/prompts.test.ts` | 4 | 15+ | 250+ | 0% |
+| `tests/sqlite/sessions.test.ts` | 5 | 20+ | 350+ | 0% |
+| `tests/sqlite/summaries.test.ts` | 4 | 15+ | 250+ | 0% |
+| `tests/sqlite/transactions.test.ts` | 5 | 15+ | 300+ | 0% |
+| `tests/utils/logger-format-tool.test.ts` | 5 | 56 | 1000+ | 0% |
+| `tests/validate_sql_update.test.ts` | 2 | 5+ | 100+ | N/A |
+| `tests/worker/agents/fallback-error-handler.test.ts` | 3 | 10+ | 200+ | 40% |
+| `tests/worker/agents/observation-broadcaster.test.ts` | 3 | 15+ | 350+ | 60% |
+| `tests/worker/agents/response-processor.test.ts` | 2 | 20+ | 500+ | 70% |
+| `tests/worker/agents/session-cleanup-helper.test.ts` | 3 | 10+ | 200+ | 50% |
+| `tests/worker/search/result-formatter.test.ts` | 4 | 15+ | 250+ | 20% |
+| `tests/worker/search/search-orchestrator.test.ts` | 4 | 30+ | 500+ | 45% |
+| `tests/worker/search/strategies/chroma-search-strategy.test.ts` | 4 | 20+ | 350+ | 50% |
+| `tests/worker/search/strategies/hybrid-search-strategy.test.ts` | 4 | 20+ | 300+ | 45% |
+| `tests/worker/search/strategies/sqlite-search-strategy.test.ts` | 4 | 25+ | 350+ | 40% |
+| `tests/worker-spawn.test.ts` | 2 | 5+ | 100+ | 60% |
+
+---
+
+*Report generated by Claude Code (Opus 4.5) on 2026-01-05*
--- a/.agent/services/claude-mem/.claude/settings.json
+++ b/.agent/services/claude-mem/.claude/settings.json
@@ -0,0 +1,10 @@
+{
+  "env": {},
+  "permissions": {
+    "deny": [
+      "Read(./package-lock.json)",
+      "Read(./node_modules/**)",
+      "Read(./.DS_Store)"
+    ]
+  }
+}
--- a/.agent/services/claude-mem/.claude/skills/CLAUDE.md
+++ b/.agent/services/claude-mem/.claude/skills/CLAUDE.md
@@ -0,0 +1,29 @@
+# Project-Level Skills
+
+This directory contains skills **for developing and maintaining the claude-mem project itself**, not skills that are released as part of the plugin.
+
+## Distinction
+
+**Project Skills** (`.claude/skills/`):
+- Used by developers working on claude-mem
+- Not included in the plugin distribution
+- Project-specific workflows (version bumps, release management, etc.)
+- Not synced to `~/.claude/plugins/marketplaces/thedotmack/`
+
+**Plugin Skills** (`plugin/skills/`):
+- Released as part of the claude-mem plugin
+- Available to all users who install the plugin
+- General-purpose memory search functionality
+- Synced to user installations via `npm run sync-marketplace`
+
+## Skills in This Directory
+
+### version-bump
+Manages semantic versioning for the claude-mem project itself. Updates all three version files (package.json, marketplace.json, plugin.json), creates git tags, and creates GitHub releases.
+
+**Usage**: Only for claude-mem maintainers releasing new versions.
+
+## Adding New Skills
+
+- **For claude-mem development** → Add to `.claude/skills/`
+- **For end users** → Add to `plugin/skills/` (gets distributed with plugin)
--- a/.agent/services/claude-mem/.mcp.json
+++ b/.agent/services/claude-mem/.mcp.json
@@ -0,0 +1,3 @@
+{
+  "mcpServers": {}
+}
--- a/.agent/services/claude-mem/plugin/.mcp.json
+++ b/.agent/services/claude-mem/plugin/.mcp.json
@@ -0,0 +1,8 @@
+{
+  "mcpServers": {
+    "mcp-search": {
+      "type": "stdio",
+      "command": "${CLAUDE_PLUGIN_ROOT}/scripts/mcp-server.cjs"
+    }
+  }
+}