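feat(pipeline): initial implementation of the automatic YouTube Tab → PDF extraction pipeline

- 5-stage pipeline: download → frame extraction → pattern detection → deduplication → PDF generation
- Supports 3 patterns: overlay, split, scroll
- Frame deduplication via MSE-based pixel comparison
- split mode: 42% crop + brightness filter + Tab line validation
- overlay mode: 320x120 normalization + sliding-window comparison
- Initial project docs (architecture, tech-stack, STATUS, known-issues)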
2026-03-24 23:25:17 +09:00
commit 3d3f74b082
18 changed files with 1989 additions and 0 deletions
--- a/.agent/AGENT.md
+++ b/.agent/AGENT.md
@@ -0,0 +1,75 @@
+---
+description: Agent behavior rules applied automatically to every task. Always read this file first when starting a new conversation.
+---
+
+# Agent Rules
+
+## Identity
+
+You are a senior developer on this project. Follow instructions precisely and put evidence ahead of guesswork.
+
+## NEVER (strictly forbidden)
+
+1. NEVER start coding without reading relevant reference documents in `.agent/references/`
+2. NEVER guess when documentation exists — always check `.agent/references/` first
+3. NEVER repeat a failed approach — check `.agent/references/known-issues.md` first
+4. NEVER call APIs directly when helper scripts exist in `.agent/workflows/helpers/`
+5. NEVER skip the pre-task checklist defined in `.agent/workflows/pre-task.md`
+6. NEVER attempt the same failed approach more than 2 times
+7. NEVER truncate error messages — always show the full error output
+8. NEVER modify `.env`, secrets, or credential files without explicit user approval
+9. NEVER make changes exceeding 3 files without stating the blast radius first
+10. NEVER dump large outputs without summarizing — paginate or filter results
+
+## ALWAYS (required)
+
+1. ALWAYS run `.agent/workflows/pre-task.md` before any implementation task
+2. ALWAYS check `.agent/references/known-issues.md` before debugging
+3. ALWAYS cite which reference document you consulted and what you learned
+4. ALWAYS stop and ask the user if 2 consecutive attempts on the same approach fail
+5. ALWAYS use existing helper scripts instead of raw API calls
+6. ALWAYS read related existing code (minimum 3 files) before writing new code
+7. ALWAYS read `STATUS.md` before starting work to understand the big picture
+8. ALWAYS update `STATUS.md` at session end when any module changes
+9. ALWAYS state the blast radius (affected files/modules) before multi-file changes
+10. ALWAYS verify changes compile/run before reporting completion
+
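+## Security Boundaries (do not touch)
+
+- `.env`, `.env.*`: environment variables and secrets (read-only; any modification requires user approval)
+- `*.pem`, `*.key`: certificate and key files
+- `.git/`: Git internals
+- Never operate production services directly (always go through the helper scripts)
+
+## Context Management
+
+- Always summarize or filter large outputs (no full log dumps)
+- Previous-session context: load only the minimum, in the order `STATUS.md` → devlog → known-issues
+- During long tasks, record progress as a devlog entry along the way (guards against losing the session)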
+
+## Failure Protocol
+
+```
+1st failure → Re-read reference docs → Try DIFFERENT approach
+2nd failure (same issue) → STOP → Report diagnosis to user with:
+   - What was tried
+   - What failed
+   - Root cause hypothesis
+   - Suggested next steps
+3rd attempt on same approach → FORBIDDEN
+```
+
+## Reference Loading Order
+
+1. `.agent/AGENT.md` (this file — behavior rules)
+2. `.agent/references/STATUS.md` (big picture — system design & features)
+3. `.agent/references/known-issues.md` (past failure patterns)
+4. `.agent/references/` (project-specific knowledge)
+5. `.agent/workflows/services.md` (service credentials & protocols)
+6. `.agent/workflows/` (action procedures)
+
+## PowerShell Notes
+
+- `curl`: in Windows PowerShell this is an alias for `Invoke-WebRequest`. **Always use `curl.exe`**
+- `npm`: if the execution policy blocks it, use `cmd /c npm`
+- For JSON handling, prefer a `.py` script (avoids PowerShell escaping issues)
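
The Failure Protocol in AGENT.md can be read as a small per-approach failure counter. The following is a minimal sketch under that reading; `FailureTracker`, its method names, and the action strings it returns are hypothetical and do not exist in this repository.

```python
from collections import Counter


class FailureTracker:
    """Counts failures per approach and maps them to the protocol's actions."""

    MAX_FAILURES = 2  # a 3rd attempt on the same approach is forbidden

    def __init__(self):
        self._failures = Counter()

    def record_failure(self, approach: str) -> str:
        """Register one failure and return the action the protocol calls for."""
        self._failures[approach] += 1
        if self._failures[approach] >= self.MAX_FAILURES:
            # 2nd failure: stop and report what was tried, what failed,
            # the root-cause hypothesis, and suggested next steps.
            return "stop_and_report"
        # 1st failure: re-read the reference docs and try a different approach.
        return "reread_docs_and_switch"

    def may_attempt(self, approach: str) -> bool:
        """An approach that has already failed twice must not be retried."""
        return self._failures[approach] < self.MAX_FAILURES


tracker = FailureTracker()
first = tracker.record_failure("patch-regex")   # hypothetical approach name
second = tracker.record_failure("patch-regex")
```

Keying the counter by approach rather than by task keeps a switch to a different approach from being penalized, which matches the "Try DIFFERENT approach" step.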
--- a/.agent/GUIDE.md
+++ b/.agent/GUIDE.md
@@ -0,0 +1,154 @@
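+# AI Agent Workflow System Guide
+
+> This guide explains how to use a general-purpose workflow system designed to make AI coding agents work smarter.
+
+---
+
+## Why is this system needed?
+
+AI agents frequently cause the following problems:
+
+| Problem | Cause |
+|------|------|
+| 📋 Ignores the workflow | Rules are written as recommendations rather than requirements |
+| 🔄 Repeats the same mistakes | No mechanism for storing and consulting past failures |
+| 📖 Does not read reference docs | No mandatory "read this" instruction, and it is unclear which documents to check |
+| 🎲 Trial and error by guesswork | No pre-flight checklist before starting work |
+
+This system was designed from research covering **13 web searches**, **80+ analyzed sources**, and **7 major AI platforms** (Claude, GPT, Gemini, Cursor, Cline, Roo, Windsurf).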
+
+---
+
+## File structure overview
+
+```
+.agent/
+├── AGENT.md                          ← 🧠 Agent constitution (NEVER/ALWAYS rules)
+├── GUIDE.md                          ← 📖 This guide
+├── references/                       ← 📚 Project knowledge base
+│   ├── architecture.md               ← Architecture description
+│   ├── tech-stack.md                 ← Tech stack & versions
+│   ├── conventions.md                ← Coding conventions
+│   └── known-issues.md               ← 🔴 Past failure log (key file!)
+└── workflows/                        ← ⚙️ Action procedures
+    ├── start.md                      ← Session start (rule loading + devlog recovery)
+    ├── end.md                        ← Session end (devlog + known-issues + Vikunja + Git)
+    ├── pre-task.md                   ← Mandatory pre-task checklist
+    ├── debug.md                      ← Debugging-only procedure
+    ├── services.md                   ← Service integration + work protocol + dev/test commands
+    ├── check-gitea.md                ← Gitea status lookup
+    ├── check-vikunja.md              ← Vikunja task lookup
+    └── helpers/
+        ├── vikunja_helper.py         ← Safe Vikunja API wrapper
+        └── wiki_helper.py            ← Gitea Wiki wrapper
+```
+
+**Directories created automatically at the project root:**
+```
+docs/devlog/                          ← 📓 Per-session work log
+├── YYYY-MM-DD.md                     ← Index (one line per task, accumulated by day)
+└── entries/
+    └── YYYYMMDD-NNN.md               ← Entry (only for design decisions / unfinished work)
+```
+
+---
+
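+## Role of each file
+
+### 🧠 `AGENT.md`: agent constitution
+
+The **global rules the agent must follow in every conversation**.
+
+**Core mechanisms:**
+- **NEVER rules**: `"Never do ~"`. Research suggests that prohibitions are followed more reliably
+- **Failure Protocol**: after 2 failures of the same approach, stop automatically and report to the user
+- **Reference Loading Order**: states explicitly which documents to read first
+
+### 📋 `pre-task.md`: pre-task checklist
+
+A **4-step checklist** run before every implementation task:
+1. Clarify the requirements
+2. Check the references (no guessing)
+3. Make a plan
+4. Confirm with the user
+
+### 🔴 `known-issues.md`: past failure log
+
+**The most important file.** The root cause of agents repeating the same mistakes is that **they do not remember failures**. With this file:
+- The agent automatically adds new issues at session end
+- The agent must check it before debugging or implementing
+- Knowledge **accumulates** over time
+
+### 🔧 `debug.md`: debugging-only workflow
+
+A 5-step procedure that **forbids guess-driven debugging**:
+1. Gather information (read the full error text)
+2. Check known-issues
+3. Root-cause analysis (hypothesis → verification)
+4. Fix and verify
+5. Record (add to known-issues)
+
+### 📓 Devlog: per-session work log (managed by start.md / end.md)
+
+Where known-issues records **failures only**, the devlog records the **full session history**:
+- **Index** (`docs/devlog/YYYY-MM-DD.md`): one line per task (required)
+- **Entry** (`docs/devlog/entries/YYYYMMDD-NNN.md`): only for design decisions, unfinished work, or trial-and-error (optional)
+- **start.md** automatically reads today's and yesterday's devlog to restore context
+
+### ▶️ `start.md` / ⏹️ `end.md`: session management
+
+- **start**: load agent rules + restore devlog context + Git status + Vikunja TODO
+- **end**: update known-issues + write devlog + sync Vikunja + Git commit/push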
+
+---
+
+## Usage
+
+### Using it together with project-specific workflows
+
+This general-purpose workflow and the project-specific workflows (e.g. Vikunja sync, Gitea integration) are meant to be **used together**:
+
+```
+.agent/
+├── AGENT.md                    ← General (shared)
+├── references/                 ← General + project-specific
+│   ├── known-issues.md         ← General (shared)
+│   └── ...                     ← Written for each project
+└── workflows/
+    ├── pre-task.md             ← General (shared)
+    ├── debug.md                ← General (shared)
+    ├── start.md                ← General base + project-specific steps
+    ├── end.md                  ← General base + project-specific steps
+    ├── services.md             ← ⭐ Project-specific (services + protocol + dev/test)
+    ├── check-vikunja.md        ← ⭐ Project-specific
+    ├── check-gitea.md          ← ⭐ Project-specific
+    └── helpers/
+        ├── vikunja_helper.py   ← ⭐ Project-specific
+        └── wiki_helper.py      ← ⭐ Project-specific
+```
+
+### Using it in other AI IDEs
+
+| Target platform | How |
+|------------|------|
+| **Cursor** | `AGENT.md` → `.cursor/rules/agent.mdc` (alwaysApply) |
+| **Claude Code** | `AGENT.md` → `CLAUDE.md`, `@import` the references |
+| **Windsurf** | `AGENT.md` → `.windsurfrules` or `.windsurf/rules/agent.md` |
+| **Cline/Roo** | Copy to the root as `AGENTS.md` |
+| **Gemini** | `AGENT.md` → `.gemini/GEMINI.md` |
+
+---
+
+## Summary of supporting research
+
+Each design decision in this system is grounded in academic research and practical experience:
+
+| Design decision | Basis |
+|----------|------|
+| NEVER > ALWAYS (prohibitions first) | Community-validated: "NEVER use X" ≫ "always prefer Y" |
+| Automatic stop after 2 failures | Streak Breaker / Sentinel Check research |
+| Accumulating a failure log | Reflexion Framework (self-correction from textual feedback) |
+| Mandatory pre-task checklist | Claude Skills checklist + GPT Chain-of-Thought |
+| Progressive Disclosure | Anthropic Context Engineering (2025) |
+| Rules under 300 lines | Official Claude `CLAUDE.md` recommendation (token efficiency) |
+| Code examples > explanations | Common best practice in GitHub Copilot Agents and AGENTS.md |
--- a/.agent/references/STATUS.md
+++ b/.agent/references/STATUS.md
@@ -0,0 +1,29 @@
+# Project Status
+
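+## Feature status
+
+| Feature | Status | Notes |
+|------|------|------|
+| YouTube download | ✅ Done | yt-dlp + cookie authentication |
+| Frame extraction | ✅ Done | default fps=2 |
+| Pattern detection (overlay) | ✅ Done | includes Tab line validation |
+| Pattern detection (split) | ✅ Done | stricter brightness thresholds |
+| Pattern detection (scroll) | ✅ Done | default fallback |
+| MSE-based deduplication | ✅ Done | switched from histogram to MSE |
+| Normalized overlay comparison | ✅ Done | 320×120 normalization + sliding window |
+| PDF/PNG generation | ✅ Done | A4 + long image |
+
+## Recent changes
+
+| Date | Change |
+|------|-----------|
+| 2026-03-24 | Improved pattern detection: overlay→split→scroll priority |
+| 2026-03-24 | Switched from histogram comparison to MSE pixel comparison |
+| 2026-03-24 | split mode: 42% crop + brightness filter + Tab line validation |
+| 2026-03-24 | overlay mode: normalization + sliding-window deduplication |
+| 2026-03-24 | Stricter split detection conditions (top>180, bottom<100) |
+
+## Known limitations
+
+- For overlay-type videos (空奏列車), the number of extracted frames can still be high (the MSE threshold needs further tuning)
+- For songs whose Tab repeats within the video, the number of truly unique frames is small (expected behavior)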
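
The "sliding window" deduplication listed in STATUS.md can be sketched as follows. This is a minimal illustration that assumes a new frame is compared against the last few frames already kept; the commit text does not say whether the window runs over kept frames or raw frames, and `dedupe_sliding_window`, `threshold`, and the flat-list frame representation are hypothetical. The real pipeline compares OpenCV images.

```python
def mse(a, b):
    """Mean squared error between two equally sized, flattened grayscale frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dedupe_sliding_window(frames, threshold, window=5):
    """Keep a frame only if it differs from each of the last `window` kept frames."""
    kept = []
    for frame in frames:
        recent = kept[-window:]
        # all() over an empty list is True, so the first frame is always kept.
        if all(mse(frame, k) > threshold for k in recent):
            kept.append(frame)
    return kept


# Toy frames: A reappears after B, and A_NOISY is A with slight noise.
A = [0, 0, 0, 0]
B = [100, 100, 100, 100]
A_NOISY = [1, 0, 0, 0]
C = [200, 200, 200, 200]

unique = dedupe_sliding_window([A, B, A, A_NOISY, C], threshold=10)
previous_only = dedupe_sliding_window([A, B, A, A_NOISY, C], threshold=10, window=1)
```

With a window of 1 (compare with the previous kept frame only), the second `A` is kept again because it differs from `B`; a wider window catches the repeat.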
--- a/.agent/references/architecture.md
+++ b/.agent/references/architecture.md
@@ -0,0 +1,58 @@
+# Architecture
+
+> Automatic YouTube guitar TAB video → PDF extraction pipeline
+
+## Project overview
+
+A CLI tool that takes a YouTube guitar TAB tutorial video, uses computer vision to extract only the TAB score region, removes duplicates, and outputs a clean PDF/PNG.
+
+## Directory structure
+
+```
+project-root/
+├── youtube_tab_to_pdf.py  # Main pipeline (single file)
+├── .env                   # yt-dlp cookie path setting
+├── .gitignore
+├── output/                # Generated PDF/PNG/debug frames
+├── .agent/                # AI agent configuration
+│   ├── AGENT.md
+│   ├── references/        # Project documentation
+│   └── workflows/         # Workflow definitions
+└── docs/devlog/           # Development log
+```
+
+## Core modules (inside youtube_tab_to_pdf.py)
+
+| Module | Role | Notes |
+|------|------|------|
+| `download_video()` | Downloads the video with yt-dlp | Uses the cookie path from .env |
+| `extract_frames()` | Splits the video into frames (fps=2) | OpenCV VideoCapture |
+| `detect_pattern()` | Classifies the video pattern (overlay→split→scroll) | Priority: overlay > split > scroll |
+| `_detect_tab_overlay()` | Detects a white box with Tab lines | HoughLinesP-based |
+| `_detect_split_screen()` | Detects a top Tab + bottom hand cam split | Brightness thresholds (top>180, bottom<100) |
+| `_has_tab_lines()` | Checks for horizontal staff lines | Morphology + HoughLinesP |
+| `compare_frames()` | MSE-based frame similarity comparison | 320px resize, factor 8 |
+| `extract_unique_*()` | Extracts unique frames per pattern | 3 variants: scroll/overlay/split |
+| `generate_pdf()` | Frames → A4 PDF + long image | ReportLab + Pillow |
+
+## Data flow
+
+```mermaid
+graph LR
+    A[YouTube URL] --> B[yt-dlp download]
+    B --> C[Frame extraction<br>fps=2]
+    C --> D{Pattern detection}
+    D -->|overlay| E1[Tab overlay crop<br>+ normalized comparison]
+    D -->|split| E2[Top 42% crop<br>+ brightness/line filter]
+    D -->|scroll| E3[Top crop]
+    E1 --> F[Deduplication<br>MSE comparison]
+    E2 --> F
+    E3 --> F
+    F --> G[PDF + PNG generation]
+```
+
+## Pattern detection priority
+
+1. **overlay**: a Tab box floating over the video (most specific)
+2. **split**: Tab sheet on top + hand cam on the bottom (strict brightness thresholds)
+3. **scroll**: top crop (default fallback)
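
The priority order above can be sketched as a plain dispatch function. This is an illustration only: `FrameFeatures` and its fields are hypothetical pre-computed measurements (the real detectors work on pixels with OpenCV), and the split thresholds are the ones quoted in the project docs (top>180, bottom<100, diff>80, 4+ lines).

```python
from dataclasses import dataclass


@dataclass
class FrameFeatures:
    """Hypothetical per-frame measurements used only for this sketch."""
    has_overlay_box: bool     # a white box containing Tab lines was found
    top_brightness: float     # mean brightness of the top region (0-255)
    bottom_brightness: float  # mean brightness of the bottom region (0-255)
    tab_line_count: int       # horizontal staff lines found in the top region


def detect_pattern(f: FrameFeatures) -> str:
    # 1. overlay is the most specific pattern, so it is checked first.
    if f.has_overlay_box:
        return "overlay"
    # 2. split must pass every strict condition.
    is_split = (
        f.top_brightness > 180
        and f.bottom_brightness < 100
        and f.top_brightness - f.bottom_brightness > 80
        and f.tab_line_count >= 4
    )
    if is_split:
        return "split"
    # 3. scroll is the default fallback.
    return "scroll"


overlay_first = detect_pattern(FrameFeatures(True, 220, 60, 6))
split_frame = detect_pattern(FrameFeatures(False, 220, 60, 6))
bright_background = detect_pattern(FrameFeatures(False, 200, 150, 6))
```

Checking overlay before split means a frame that would satisfy both is still classified as overlay, which is the ordering the docs require.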
--- a/.agent/references/conventions.md
+++ b/.agent/references/conventions.md
@@ -0,0 +1,45 @@
+# Coding Conventions
+
+> AI agents check these conventions before writing code.
+
+## Naming
+
+| Target | Rule | Example |
+|------|------|------|
+| Variables/functions | camelCase | `getUserData()` |
+| Classes | PascalCase | `UserService` |
+| Constants | UPPER_SNAKE_CASE | `MAX_RETRY_COUNT` |
+| File names | kebab-case | `user-service.js` |
+| CSS classes | kebab-case | `.nav-header` |
+
+## Code style
+
+- Indentation: (2 spaces / 4 spaces / tab)
+- Semicolons: (used / not used)
+- Quotes: (single / double)
+- Line endings: LF (Unix style)
+
+## Commit messages
+
+```
+<type>(<scope>): <description>
+
+type: feat|fix|refactor|test|docs|chore|ci|infra
+scope: (optional)
+```
+
+**Examples:**
+- `feat(server): add WebSocket reconnection logic`
+- `fix(frontend): resolve button overlap on mobile`
+- `docs: update API documentation`
+
+## Comments
+
+- Korean and English may be mixed
+- TODO comments: `// TODO: description` format
+- Complex logic must have a WHY comment
+
+## Tests
+
+- Test file location: (e.g. `__tests__/` or `*.test.js`)
+- Test naming: `should [expected behavior] when [condition]`
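
The commit message convention in conventions.md can be checked mechanically. The sketch below is one possible validator, not part of the repository; `HEADER_RE` and `parse_header` are hypothetical names, and it assumes a scope contains no spaces or parentheses.

```python
import re

COMMIT_TYPES = ("feat", "fix", "refactor", "test", "docs", "chore", "ci", "infra")

# <type>(<scope>): <description>, where the "(<scope>)" part is optional.
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def parse_header(line: str):
    """Return (type, scope, description), or None if the header is malformed."""
    m = HEADER_RE.match(line)
    if m is None:
        return None
    return m.group("type"), m.group("scope"), m.group("description")


with_scope = parse_header("feat(server): add WebSocket reconnection logic")
without_scope = parse_header("docs: update API documentation")
invalid = parse_header("feature: add something")
```

A check like this could run in a commit hook, though the repository does not define one.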
--- a/.agent/references/known-issues.md
+++ b/.agent/references/known-issues.md
@@ -0,0 +1,57 @@
+# Known Issues & Lessons Learned
+
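+> **This file is the SSOT (Single Source of Truth).**
+> **Always** check this file before debugging or implementing.
+> At session end, add any newly discovered issues to this file.
+
+---
+
+## Format
+
+Each item follows the format below:
+
+```markdown
+### [date] [keyword] — one-line summary
+- **Symptom**: what went wrong
+- **Cause**: root cause
+- **Fix**: the correct solution
+- **Caution**: lesson for preventing recurrence
+```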
+
+---
+
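+## Common issues
+
+### [2026-03-08] PowerShell curl — conflict with Invoke-WebRequest
+- **Symptom**: the `curl` command returns a response in an unexpected format
+- **Cause**: in PowerShell, `curl` is an alias for `Invoke-WebRequest`
+- **Fix**: use **`curl.exe`** explicitly
+- **Caution**: `curl.exe` is required for every HTTP-related command
+
+### [2026-03-08] PowerShell npm — execution policy error
+- **Symptom**: `npm run` commands fail with an execution-policy (`실행 정책`) error
+- **Cause**: the PowerShell script execution policy is set restrictively
+- **Fix**: run through cmd, as in `cmd /c npm run dev`
+- **Caution**: always prefix npm commands with `cmd /c`
+
+---
+
+## Project-specific issues
+
+### [2026-03-24] Histogram comparison — cannot tell Tab frames apart
+- **Symptom**: different Tab pages are judged "identical", so too few frames are extracted
+- **Cause**: histogram correlation compares only the brightness distribution, so frames with the same Tab sheet background are judged identical even when the content differs
+- **Fix**: switched to MSE pixel comparison (320px resize, factor 8)
+- **Caution**: the MSE scaling factor may behave differently depending on the video type
+
+### [2026-03-24] split detection — bright background misclassified
+- **Symptom**: a guitar performance video with a bright background (空奏列車) is misclassified as split
+- **Cause**: a white shirt / bright background passes the top brightness condition, and the guitar strings pass `_has_tab_lines`
+- **Fix**: stricter split validation (top>180, bottom<100, diff>80, 4+ lines) + overlay checked first
+- **Caution**: the pattern detection order must remain overlay → split → scroll
+
+### [2026-03-24] overlay crop size mismatch
+- **Symptom**: when comparing overlay frames, every frame is judged "different" (1000+ extracted)
+- **Cause**: `_detect_tab_overlay` returns a bounding box of a different size for each frame (69~360px)
+- **Fix**: normalize onto a 320×120 white canvas before comparing + sliding window (5 frames)
+- **Caution**: optimizing the overlay frame count is still in progress (further tuning needed)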
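
The histogram issue above can be reproduced on toy data. The sketch below uses plain Python lists as 4×4 grayscale "frames" instead of OpenCV images, and `histogram` and `mse` are simplified stand-ins rather than the pipeline's own `compare_frames()`. Two pages with the same amount of ink in different places have identical histograms but a large MSE.

```python
def histogram(frame, bins=4, max_value=256):
    """Count pixels per brightness bin, ignoring where the pixels are."""
    counts = [0] * bins
    width = max_value // bins
    for row in frame:
        for px in row:
            counts[min(px // width, bins - 1)] += 1
    return counts


def mse(a, b):
    """Mean squared error between two equally sized grayscale frames."""
    total = 0
    n = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            n += 1
    return total / n


WHITE, BLACK = 255, 0
# Same white background and same number of black "notes", different positions.
page_1 = [
    [WHITE, BLACK, WHITE, WHITE],
    [WHITE, WHITE, WHITE, WHITE],
    [WHITE, WHITE, BLACK, WHITE],
    [WHITE, WHITE, WHITE, WHITE],
]
page_2 = [
    [WHITE, WHITE, WHITE, BLACK],
    [WHITE, WHITE, WHITE, WHITE],
    [BLACK, WHITE, WHITE, WHITE],
    [WHITE, WHITE, WHITE, WHITE],
]

same_histogram = histogram(page_1) == histogram(page_2)
pixel_difference = mse(page_1, page_2)
```

Because MSE compares pixels position by position, it separates pages that share a background, which is the property the fix relies on.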
--- a/.agent/references/tech-stack.md
+++ b/.agent/references/tech-stack.md
@@ -0,0 +1,38 @@
+# Tech Stack
+
+> AI agents check this document before implementing, so that they use the correct technologies and versions.
+
+## Language & runtime
+
+| Item | Version | Notes |
+|------|------|------|
+| Python | 3.x | `C:\ProgramData\miniforge3\envs\score\python.exe` |
+
+## Main libraries
+
+| Item | Purpose |
+|------|------|
+| OpenCV (cv2) | Frame extraction, image processing, HoughLinesP line detection |
+| NumPy | Array operations, brightness calculation, MSE comparison |
+| yt-dlp | YouTube video download |
+| Pillow (PIL) | Image conversion (BGR→RGB), long image generation |
+| ReportLab | PDF generation (A4 layout) |
+| python-dotenv | Loading the .env file (cookie path) |
+
+## Package management
+
+- Package manager: conda (miniforge3, `score` environment)
+- Virtual environment: `C:\ProgramData\miniforge3\envs\score`
+
+## Development tools
+
+| Tool | Command |
+|------|--------|
+| Run | `C:\ProgramData\miniforge3\envs\score\python.exe youtube_tab_to_pdf.py <URL>` |
+| Debug run | the command above + `--debug` |
+
+## Environment variables
+
+| Variable | Purpose | Where it is set |
+|--------|------|-----------|
+| COOKIE_PATH | Path to the yt-dlp cookie file | `.env` |
--- a/.agent/workflows/debug.md
+++ b/.agent/workflows/debug.md
@@ -0,0 +1,52 @@
+---
+description: Systematic debugging workflow for errors and bugs (triggers: 에러, 안돼요, 왜 안돼, 버그, 디버그, 수정)
+---
+
+# Debug Workflow
+
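+> [!IMPORTANT]
+> Do not modify code based on guesses. Always follow this order.
+
+## Step 1: Gather information (no guessing)
+
+- [ ] Read the **full** error message (never truncate it)
+- [ ] Check the relevant log files
+- [ ] Check the environment (OS, Node/Python version, dependency versions, etc.)
+- [ ] Identify the **exact input/conditions** that trigger the error
+
+## Step 2: Check Known Issues
+
+Read `.agent/references/known-issues.md` and check whether the same or a similar problem is already listed.
+
+> [!CAUTION]
+> **Do not start attempting a fix without checking known-issues.**
+> Struggling again with a problem that has already been solved is a waste of time.
+
+## Step 3: Root-cause analysis
+
+- [ ] Identify the **exact code location** where the error occurs
+- [ ] Form a hypothesis and run the **smallest test** that can verify it
+- [ ] If the hypothesis is wrong, **switch to another hypothesis immediately**
+
+> [!WARNING]
+> **Do not try the same approach more than 2 times.**
+> After 2 failures, report to the user and ask for a decision.
+> Report contents: what was tried / what failed / root-cause hypothesis / suggested next steps
+
+## Step 4: Fix and verify
+
+- [ ] Apply the fix
+- [ ] Confirm the same error no longer reproduces
+- [ ] Confirm there are no side effects on other features
+
+## Step 5: Record
+
+- [ ] Add a new item to `known-issues.md` (use the format below)
+
+```markdown
+### [date] [keyword] — one-line summary
+- **Symptom**: what went wrong
+- **Cause**: root cause
+- **Fix**: the correct solution
+- **Caution**: lesson for preventing recurrence
+```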
--- a/.agent/workflows/end.md
+++ b/.agent/workflows/end.md
@@ -0,0 +1,188 @@
+---
+description: At session end, write the devlog + git commit + sync Vikunja (triggers: 끝, 마무리, 커밋해, 완료)
+---
+
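+# Session End Protocol
+
+Run this workflow when the work is finished or the user says something like "끝" (done), "마무리" (wrap up), or "커밋해" (commit).
+
+// turbo-all
+
+## 0. Record lessons (save failures and trial-and-error)
+
+Summarize the failures, trial-and-error, and newly learned facts from this session:
+
+- [ ] Check whether anything should be added to `.agent/references/known-issues.md`
+- [ ] If so, add it in the format below:
+
+```markdown
+### [date] [keyword] — one-line summary
+- **Symptom**: ...
+- **Cause**: ...
+- **Fix**: ...
+- **Caution**: ...
+```
+
+## 1. Devlog
+
+### Update the index (required, every task)
+
+Add one line for the completed task to today's index file.
+
+- **File**: `docs/devlog/YYYY-MM-DD.md`
+- **Format**:
+```markdown
+| NNN | HH:MM | task description | `commit hash` | ✅ or 🔧 |
+```
+
+> [!TIP]
+> - ✅ = done, 🔧 = unfinished (to be picked up in the next session)
+> - If the file does not exist, create it (including the table header)
+
+### Write an entry (optional, only when needed)
+
+> [!IMPORTANT]
+> Write an entry only when there is **information that is not in git/Vikunja/wiki**.
+
+**When to write an entry:**
+- ✅ A design decision was made (why B was chosen over A)
+- ✅ Something is unfinished (context the next session must pick up)
+- ✅ There was trial-and-error or troubleshooting (to prevent the same mistake)
+
+**When an entry is unnecessary:**
+- ❌ Simple bug fixes (the commit message is enough)
+- ❌ Documentation updates (git diff is enough)
+- ❌ The Vikunja task already has a detailed description
+
+**Entry file**: `docs/devlog/entries/YYYYMMDD-NNN.md`
+```markdown
+# Task title
+
+- **Time**: YYYY-MM-DD HH:MM~HH:MM
+- **Commit**: `hash`
+- **Vikunja**: #task-number → done/in progress
+
+## Decisions
+- Why this approach was chosen
+
+## Unfinished
+- Remaining work (if any)
+```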
+
+---
+
+## 2. Vikunja sync
+
+> [!CAUTION]
+> **Always use `vikunja_helper.py`.** Direct API calls are forbidden.
+> On POST, the Vikunja API overwrites any field missing from the body with an empty value.
+
+### 2-1. Review every commit
+
+**Go through every commit** from this session one by one and map each to Vikunja.
+
+```powershell
+git log --oneline -20
+```
+
+| Commit type | Vikunja action |
+|-----------|-------------|
+| Work for an existing task is **done** | `C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py done {ID}` |
+| New work completed (no existing task) | `C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py create "title" "description" --done --labels Backend,Priority:High` |
+| **Unfinished TODO** found during the work | `C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py create "title" "description" --labels Backend,Priority:Mid` |
+
+> [!IMPORTANT]
+> Confirm that every commit is mapped to an existing or new task.
+
+### 2-2. Mark as done
+
+```powershell
+C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py done {TASK_ID}
+```
+
+### 2-3. Create a new task
+
+```powershell
+C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py create "title" "description" --labels Backend,Priority:High
+```
+
+### Label rules
+
+**Area (at least one required):** `Backend` / `Frontend` / `Engine` / `Infra` / `Test`
+**Priority (exactly one required):** `Priority:High` / `Priority:Mid` / `Priority:Low`
+
+---
+
+## 3. Update design documents (required)
+
+- [ ] `.agent/references/STATUS.md`: update the module status table (if anything changed)
+  - Feature added/removed → update the feature list table
+  - Module changed → update module status + recent changes
+  - Milestone reached → add it to the recent milestones
+- [ ] `.agent/references/architecture.md`: update if the structure changed
+
+## 4. Wiki sync (based on local changes)
+
+Check whether any local references files have changed:
+
+```powershell
+git diff --name-only .agent/references/
+```
+
+If any files changed, upload them to the Wiki **one file at a time**:
+
+| Local file | Wiki page |
+|-----------|------------|
+| `STATUS.md` | "Status" |
+| `architecture.md` | "Architecture" |
+
+```powershell
+# If STATUS.md changed
+C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\wiki_helper.py update "Status" .agent\references\STATUS.md
+```
+```powershell
+# If architecture.md changed
+C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\wiki_helper.py update "Architecture" .agent\references\architecture.md
+```
+
+> [!TIP]
+> Skip this step if no files changed.
+
+---
+
+## 5. Git Commit & Push
+
+```powershell
+git add -A
+git status --short
+```
+```powershell
+git commit -m "commit message"
+```
+```powershell
+git push origin main
+```
+
+**Commit message convention:**
+```
+<type>(<scope>): <description>
+
+type: feat|fix|refactor|test|docs|chore|ci|infra
+scope: (optional)
+```
+
+---
+
+## 6. Final checklist
+
+> [!WARNING]
+> The session cannot be closed if any of the items below is missing.
+
+- [ ] known-issues updated (if there were new issues)
+- [ ] devlog index updated
+- [ ] devlog entry written (only if needed)
+- [ ] Vikunja tasks created/marked done (based on the review of every commit)
+- [ ] STATUS.md updated (if modules/features changed)
+- [ ] Wiki synced (if any references files changed)
+- [ ] git push completed
+- [ ] Completion reported to the user
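
The devlog index update in step 1 can be sketched as a small helper. This is an assumption-laden illustration, not a script from the repository: `append_index_row` and the column names in `HEADER` are hypothetical (end.md only says the file gets a table header), and `NNN` is taken to be a per-day sequence number.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical header; end.md does not specify the column names.
HEADER = "| # | Time | Task | Commit | Status |\n|---|------|------|--------|--------|\n"


def append_index_row(devlog_dir, when, description, commit_hash, done=True):
    """Append one row to <devlog_dir>/YYYY-MM-DD.md, creating the file if needed."""
    path = Path(devlog_dir) / f"{when:%Y-%m-%d}.md"
    text = path.read_text(encoding="utf-8") if path.exists() else HEADER
    if not text.endswith("\n"):
        text += "\n"
    # Existing rows are the table lines after the two header lines.
    rows = [ln for ln in text.splitlines()[2:] if ln.startswith("|")]
    seq = len(rows) + 1
    status = "✅" if done else "🔧"
    row = f"| {seq:03d} | {when:%H:%M} | {description} | `{commit_hash}` | {status} |\n"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text + row, encoding="utf-8")
    return row
```

Calling it twice for the same day yields rows numbered `001` and `002` in the same file; passing `done=False` marks the row with 🔧 so that the next session can pick it up.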
--- a/.agent/workflows/helpers/vikunja_helper.py
+++ b/.agent/workflows/helpers/vikunja_helper.py
@@ -0,0 +1,328 @@
+"""Vikunja safe task updater — preserves existing fields when updating tasks.
+
+Usage:
+  python vikunja_helper.py done 75              # Mark task #75 as done
+  python vikunja_helper.py done 71 77 78        # Mark multiple tasks done
+  python vikunja_helper.py undone 75            # Mark task #75 as not done
+  python vikunja_helper.py comment 75 "text"    # Add comment to task #75
+  python vikunja_helper.py desc 75 "text"       # Set description (appends if exists)
+  python vikunja_helper.py create "title" "desc" --labels Backend,Priority:High
+  python vikunja_helper.py create "title" "desc" --done --labels Frontend,Priority:Mid
+  python vikunja_helper.py label 75 Backend Priority:High   # Add labels to task
+  python vikunja_helper.py list                 # List all tasks
+  python vikunja_helper.py list todo            # List TODO only
+  python vikunja_helper.py list done            # List DONE only
+  python vikunja_helper.py projects             # List all Vikunja projects
+  python vikunja_helper.py report               # Project status report (current)
+  python vikunja_helper.py report <project_id>  # Project status report (specific)
+"""
+
+import sys
+import json
+import urllib.request
+import urllib.error
+import io
+
+# Fix Windows console encoding (cp949 → utf-8); tolerate "UTF-8"/"utf8" spellings and a missing encoding
+if (sys.stdout.encoding or "").lower().replace("-", "") != "utf8":
+    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding="utf-8", errors="replace")
+
+# ============================================================
+# ⚙️ CONFIGURATION: change only PROJECT_ID per project
+# ============================================================
+API_BASE = "https://plan.variet.net/api/v1"
+TOKEN = "tk_070f8e0b715e818bb7178c3815ed5389040eddca"
+PROJECT_ID = 7                                  # Variet Agent project
+# ============================================================
+
+HEADERS = {
+    "Authorization": f"Bearer {TOKEN}",
+    "Content-Type": "application/json",
+}
+
+# Label name → Vikunja label ID mapping
+# Customize for your project's labels
+LABEL_MAP = {
+    "Backend": 1, "Frontend": 2, "Engine": 3, "Infra": 4, "Test": 5,
+    "Priority:High": 6, "Priority:Mid": 7, "Priority:Low": 8,
+    "Agent": 17, "Tool": 18, "AI/LLM": 19,
+}
+
+
+def api_get(path: str):
+    req = urllib.request.Request(f"{API_BASE}{path}", headers=HEADERS)
+    with urllib.request.urlopen(req) as resp:
+        return json.loads(resp.read().decode("utf-8"))
+
+
+def api_post(path: str, data: dict):
+    body = json.dumps(data).encode("utf-8")
+    req = urllib.request.Request(f"{API_BASE}{path}", data=body, headers=HEADERS, method="POST")
+    with urllib.request.urlopen(req) as resp:
+        return json.loads(resp.read().decode("utf-8"))
+
+
+def api_put(path: str, data: dict):
+    body = json.dumps(data).encode("utf-8")
+    req = urllib.request.Request(f"{API_BASE}{path}", data=body, headers=HEADERS, method="PUT")
+    with urllib.request.urlopen(req) as resp:
+        return json.loads(resp.read().decode("utf-8"))
+
+
+def get_task(task_id: int) -> dict:
+    return api_get(f"/tasks/{task_id}")
+
+
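+def safe_update_task(task_id: int, updates: dict) -> dict:
+    """Update a task without wiping fields that are not being changed.
+
+    The current task is fetched first and its core fields are re-sent together
+    with `updates`, because a POST that omits a field resets it to empty.
+    """
+    task = get_task(task_id)
+    safe_body = {
+        "title": task.get("title", ""),
+        "description": task.get("description", ""),
+        "priority": task.get("priority", 0),
+        "done": task.get("done", False),
+    }
+    safe_body.update(updates)
+    return api_post(f"/tasks/{task_id}", safe_body)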
+
+
+def mark_done(task_ids: list):
+    for tid in task_ids:
+        result = safe_update_task(tid, {"done": True})
+        title = result.get("title", "?")
+        print(f"  ✅ #{tid} → done=True  [{title}]")
+
+
+def mark_undone(task_ids: list):
+    for tid in task_ids:
+        result = safe_update_task(tid, {"done": False})
+        title = result.get("title", "?")
+        print(f"  ⬜ #{tid} → done=False  [{title}]")
+
+
+def add_comment(task_id: int, comment: str):
+    result = api_put(f"/tasks/{task_id}/comments", {"comment": comment})
+    print(f"  💬 #{task_id} comment added (id={result.get('id', '?')})")
+
+
+def set_description(task_id: int, desc: str, append: bool = True):
+    task = get_task(task_id)
+    existing = task.get("description", "") or ""
+    if append and existing:
+        new_desc = existing.rstrip() + "\n\n" + desc
+    else:
+        new_desc = desc
+    result = safe_update_task(task_id, {"description": new_desc})
+    print(f"  📝 #{task_id} description updated [{result.get('title', '?')}]")
+
+
+def list_tasks(filter_: str = "all"):
+    all_tasks = []
+    page = 1
+    while True:
+        batch = api_get(f"/projects/{PROJECT_ID}/tasks?per_page=50&page={page}")
+        if not batch:
+            break
+        all_tasks.extend(batch)
+        if len(batch) < 50:
+            break
+        page += 1
+
+    if filter_ == "todo":
+        all_tasks = [t for t in all_tasks if not t["done"]]
+    elif filter_ == "done":
+        all_tasks = [t for t in all_tasks if t["done"]]
+
+    all_tasks.sort(key=lambda t: t["id"])
+    for t in all_tasks:
+        status = "✅" if t["done"] else "⬜"
+        desc = (t.get("description") or "")[:50].replace("\n", " ")
+        labels = ", ".join(l["title"] for l in (t.get("labels") or []))
+        print(f"  {status} #{t['id']:3d}  {t['title'][:40]:<40}  [{labels}]  {desc}")
+    print(f"\n  Total: {len(all_tasks)} tasks")
+
+
+def add_labels(task_id: int, label_names: list):
+    for name in label_names:
+        label_id = LABEL_MAP.get(name)
+        if not label_id:
+            print(f"  ⚠️ Unknown label '{name}'. Valid: {', '.join(LABEL_MAP.keys())}")
+            continue
+        try:
+            api_put(f"/tasks/{task_id}/labels", {"label_id": label_id})
+            print(f"  🏷️ #{task_id} + {name} (id={label_id})")
+        except Exception as e:
+            if "already" in str(e).lower() or "409" in str(e):
+                print(f"  🏷️ #{task_id} already has {name}")
+            else:
+                print(f"  ⚠️ #{task_id} label {name} failed: {e}")
+
+
+def create_task(title: str, description: str = "", done: bool = False, labels: list = None):
+    payload = {"title": title, "description": description}
+    result = api_put(f"/projects/{PROJECT_ID}/tasks", payload)
+    task_id = result["id"]
+    print(f"  ✨ #{task_id} created: {result.get('title', '?')}")
+
+    if labels:
+        add_labels(task_id, labels)
+
+    if done:
+        result = safe_update_task(task_id, {"done": True})
+        print(f"  ✅ #{task_id} → done=True")
+
+    return result
+
+
+def list_projects():
+    """List all Vikunja projects with task statistics."""
+    projects = api_get("/projects")
+    print("📂 Projects:")
+    for p in projects:
+        pid = p["id"]
+        title = p["title"]
+        # Fetch the task counts for each project
+        try:
+            tasks = api_get(f"/projects/{pid}/tasks?per_page=200")
+            todo = sum(1 for t in tasks if not t["done"])
+            done = sum(1 for t in tasks if t["done"])
+            print(f"  #{pid:<3d} {title:<30s}  TODO: {todo}  DONE: {done}")
+        except Exception:
+            print(f"  #{pid:<3d} {title:<30s}  (lookup failed)")
+    print(f"\n  Total: {len(projects)} projects")
+
+
+def report(project_id: int = None):
+    """Overall project status report (tasks + git log + devlog)."""
+    import subprocess
+    from pathlib import Path
+    from datetime import datetime, timedelta
+
+    pid = project_id or PROJECT_ID
+
+    # 1) Look up the project name
+    try:
+        projects = api_get("/projects")
+        proj_name = next((p["title"] for p in projects if p["id"] == pid), f"Project #{pid}")
+    except Exception:
+        proj_name = f"Project #{pid}"
+
+    print(f"=== Project status: {proj_name} (#{pid}) ===")
+
+    # 2) Task status
+    try:
+        tasks = api_get(f"/projects/{pid}/tasks?per_page=200")
+        todo_tasks = [t for t in tasks if not t["done"]]
+        done_tasks = [t for t in tasks if t["done"]]
+        total = len(tasks)
+        rate = f"{len(done_tasks)/total*100:.0f}%" if total else "N/A"
+
+        print("\n[Tasks]")
+        print(f"  TODO: {len(todo_tasks)} | DONE: {len(done_tasks)} | Completion: {rate}")
+
+        if todo_tasks:
+            print("  Open:")
+            for t in todo_tasks:
+                labels = ", ".join(lbl["title"] for lbl in (t.get("labels") or []))
+                label_str = f"  [{labels}]" if labels else ""
+                desc = (t.get("description") or "")[:40].replace("\n", " ")
+                desc_str = f"  {desc}" if desc else ""
+                print(f"    ⬜ #{t['id']} {t['title'][:50]}{label_str}{desc_str}")
+
+        if done_tasks:
+            # Show only the 5 most recently completed tasks
+            recent_done = sorted(done_tasks, key=lambda t: t.get("done_at") or "", reverse=True)[:5]
+            print("  Recently completed (up to 5):")
+            for t in recent_done:
+                print(f"    ✅ #{t['id']} {t['title'][:50]}")
+    except Exception as e:
+        print(f"  Failed to fetch tasks: {e}")
+
+    # 3) Git log (relative to the current directory)
+    print("\n[Last 5 commits]")
+    try:
+        result = subprocess.run(
+            ["git", "log", "--oneline", "-5"],
+            capture_output=True, timeout=10,
+            encoding="utf-8", errors="replace",
+        )
+        if result.returncode == 0 and result.stdout.strip():
+            for line in result.stdout.strip().splitlines():
+                print(f"  {line}")
+        else:
+            print("  (no git log)")
+    except Exception:
+        print("  (could not run git)")
+
+    # 4) Devlog (today/yesterday)
+    print("\n[Devlog]")
+    today = datetime.now().strftime("%Y-%m-%d")
+    yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
+
+    devlog_found = False
+    for date_str in [today, yesterday]:
+        devlog_path = Path("docs/devlog") / f"{date_str}.md"
+        if devlog_path.exists():
+            content = devlog_path.read_text(encoding="utf-8").strip()
+            # At most 500 characters
+            if len(content) > 500:
+                content = content[:500] + "\n  ...(truncated)"
+            print(f"  [{date_str}]")
+            for line in content.splitlines():
+                print(f"  {line}")
+            devlog_found = True
+            break
+
+    if not devlog_found:
+        print("  (no recent devlog)")
+
+
+def main():
+    if len(sys.argv) < 2:
+        print(__doc__)
+        return
+
+    cmd = sys.argv[1].lower()
+
+    if cmd == "done":
+        ids = [int(x) for x in sys.argv[2:]]
+        mark_done(ids)
+    elif cmd == "undone":
+        ids = [int(x) for x in sys.argv[2:]]
+        mark_undone(ids)
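+    elif cmd == "comment":
+        # Guard against a missing argument instead of raising IndexError
+        if len(sys.argv) < 4:
+            print('Usage: vikunja_helper.py comment TASK_ID "text"')
+            return
+        add_comment(int(sys.argv[2]), sys.argv[3])
+    elif cmd == "desc":
+        if len(sys.argv) < 4:
+            print('Usage: vikunja_helper.py desc TASK_ID "text"')
+            return
+        set_description(int(sys.argv[2]), sys.argv[3])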
+    elif cmd == "list":
+        f = sys.argv[2] if len(sys.argv) > 2 else "all"
+        list_tasks(f)
+    elif cmd == "label":
+        if len(sys.argv) < 4:
+            print("Usage: vikunja_helper.py label TASK_ID Label1 Label2 ...")
+            return
+        add_labels(int(sys.argv[2]), sys.argv[3:])
+    elif cmd == "create":
+        title = sys.argv[2] if len(sys.argv) > 2 else ""
+        desc = sys.argv[3] if len(sys.argv) > 3 and not sys.argv[3].startswith("--") else ""
+        is_done = "--done" in sys.argv
+        labels = None
+        for i, arg in enumerate(sys.argv):
+            if arg == "--labels" and i + 1 < len(sys.argv):
+                labels = sys.argv[i + 1].split(",")
+                break
+        if not title:
+            print("Error: title is required")
+            return
+        create_task(title, desc, done=is_done, labels=labels)
+    elif cmd == "projects":
+        list_projects()
+    elif cmd == "report":
+        pid = int(sys.argv[2]) if len(sys.argv) > 2 else None
+        report(pid)
+    else:
+        print(f"Unknown command: {cmd}")
+        print(__doc__)
+
+
+if __name__ == "__main__":
+    main()
--- a/.agent/workflows/helpers/wiki_helper.py
+++ b/.agent/workflows/helpers/wiki_helper.py
@@ -0,0 +1,100 @@
+"""Gitea Wiki helper: list, read, create, update wiki pages.
+
+Usage:
+  wiki_helper.py list                     — list all pages
+  wiki_helper.py read <title>             — read a page
+  wiki_helper.py create <title> <file>    — create a page from file
+  wiki_helper.py update <title> <file>    — update a page from file
+"""
+import sys, io, json, base64, urllib.request, urllib.error
+
+sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
+
+# ============================================================
+# ⚙️ CONFIGURATION: change only GITEA_REPO for each project
+# ============================================================
+GITEA_BASE_URL = "https://git.variet.net"
+GITEA_OWNER = "Variet"
+GITEA_REPO = "variet-agent"                       # Variet Agent project
+GITEA_TOKEN = "3a01b4b15a39921572e64c413353e870d4d2161b"
+# ============================================================
+
+BASE = f"{GITEA_BASE_URL}/api/v1/repos/{GITEA_OWNER}/{GITEA_REPO}/wiki"
+HEADERS = {"Authorization": f"token {GITEA_TOKEN}", "Content-Type": "application/json"}
+
+def _req(method, path, data=None):
+    url = f"{BASE}{path}"
+    body = json.dumps(data).encode() if data else None
+    req = urllib.request.Request(url, data=body, headers=HEADERS, method=method)
+    try:
+        with urllib.request.urlopen(req) as resp:
+            return json.loads(resp.read().decode())
+    except urllib.error.HTTPError as e:
+        err = e.read().decode()
+        print(f"  ⚠️ HTTP {e.code}: {err}")
+        return None
+
+def _find_sub_url(title):
+    pages = _req("GET", "/pages")
+    if pages:
+        for p in pages:
+            if p.get("title", "").lower() == title.lower():
+                return p.get("sub_url", title)
+    return title
+
+def list_pages():
+    pages = _req("GET", "/pages")
+    if pages:
+        print(f"=== {len(pages)} Wiki Pages ===")
+        for p in pages:
+            print(f"  {p.get('title', '?')}")
+    return pages
+
+def read_page(title):
+    sub = _find_sub_url(title)
+    page = _req("GET", f"/page/{sub}")
+    if page and page.get("content_base64"):
+        content = base64.b64decode(page["content_base64"]).decode("utf-8")
+        return content
+    return None
+
+def create_page(title, content):
+    data = {
+        "title": title,
+        "content_base64": base64.b64encode(content.encode()).decode(),
+    }
+    result = _req("POST", "/new", data)
+    if result:
+        print(f"  ✅ Created wiki page: {title}")
+    return result
+
+def update_page(title, content):
+    sub = _find_sub_url(title)
+    data = {
+        "title": title,
+        "content_base64": base64.b64encode(content.encode()).decode(),
+    }
+    result = _req("PATCH", f"/page/{sub}", data)
+    if result:
+        print(f"  ✅ Updated wiki page: {title}")
+    return result
+
+if __name__ == "__main__":
+    cmd = sys.argv[1] if len(sys.argv) > 1 else "list"
+
+    if cmd == "list":
+        list_pages()
+    elif cmd == "read" and len(sys.argv) > 2:
+        content = read_page(sys.argv[2])
+        if content:
+            print(content[:5000])
+        else:
+            print(f"  Page '{sys.argv[2]}' not found")
+    elif cmd == "create" and len(sys.argv) > 3:
+        with open(sys.argv[3], "r", encoding="utf-8") as f:
+            create_page(sys.argv[2], f.read())
+    elif cmd == "update" and len(sys.argv) > 3:
+        with open(sys.argv[3], "r", encoding="utf-8") as f:
+            update_page(sys.argv[2], f.read())
+    else:
+        print("Usage: wiki_helper.py list|read <title>|create <title> <file>|update <title> <file>")
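Review note on `wiki_helper.py`: the helper sends and receives page bodies as base64 in a `content_base64` field. The round trip can be sketched without any network access as below. The function names `build_wiki_payload` and `decode_wiki_content` are hypothetical and are not part of the committed helper; the sketch only assumes the request and response shapes visible in `create_page` and `read_page`.

```python
import base64
import json


def build_wiki_payload(title: str, content: str) -> bytes:
    """Build a JSON request body shaped like the one create_page/update_page send."""
    encoded = base64.b64encode(content.encode("utf-8")).decode("ascii")
    return json.dumps({"title": title, "content_base64": encoded}).encode("utf-8")


def decode_wiki_content(page: dict) -> str:
    """Decode the content_base64 field the same way read_page does."""
    return base64.b64decode(page["content_base64"]).decode("utf-8")


# Round trip: encode a page body, parse the JSON again, decode the content.
body = build_wiki_payload("Home", "# Home\nTab → PDF")
page = json.loads(body.decode("utf-8"))
print(page["title"])
print(decode_wiki_content(page))
```

Encoding explicitly as UTF-8 on both sides keeps non-ASCII page text intact regardless of the platform default encoding.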
--- a/.agent/workflows/pre-task.md
+++ b/.agent/workflows/pre-task.md
@@ -0,0 +1,39 @@
+---
+description: Pre-flight checklist to run before every implementation task (pre-task, 준비, 시작 전, 계획, 구현)
+---
+
+# Pre-Task Checklist
+
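+> [!IMPORTANT]
+> Complete this checklist in order before you start coding.
+> Skipping it leads to avoidable trial and error.
+
+## Step 1: Clarify requirements
+
+- [ ] Break the user's request down into concrete work items
+- [ ] Define the scope of change clearly (affected files/modules)
+- [ ] Confirm the acceptance criteria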
+
+## Step 2: Check references (no guessing)
+
+- [ ] `.agent/references/architecture.md`: review the current architecture
+- [ ] `.agent/references/tech-stack.md`: review the tech stack and versions
+- [ ] `.agent/references/conventions.md`: review the coding conventions
+- [ ] `.agent/references/known-issues.md`: review past failure patterns
+- [ ] Read at least 3 related existing code files
+
+> [!CAUTION]
+> Do not guess about topics that have a reference document.
+> If no document exists, ask the user to confirm.
+
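+## Step 3: Plan
+
+- [ ] List the files to change
+- [ ] Work out the dependency order (which file must be modified first?)
+- [ ] Identify risks (where is failure most likely?)
+- [ ] Decide how to test (how will the change be verified?)
+
+## Step 4: User confirmation
+
+- [ ] Report the plan to the user and get approval (when 3 or more files change)
+- [ ] Make small changes right away, but explain clearly what changed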
--- a/.agent/workflows/services.md
+++ b/.agent/workflows/services.md
@@ -0,0 +1,78 @@
+---
+description: Project service integration details and working protocol (서비스, 크레덴셜, API)
+---
+
+# Service Integration
+
+> [!CAUTION]
+> This file contains API tokens. Do not expose it externally.
+
+## Runtime environment
+
+| Item | Value |
+|------|-----|
+| **Python** | `C:\ProgramData\miniforge3\envs\score\python.exe` |
+| **Shell** | PowerShell (`curl` is an alias for `Invoke-WebRequest`, so always use **`curl.exe`**) |
+
+> [!TIP]
+> For tech stack details, see `.agent/references/tech-stack.md`
+> For PowerShell caveats, see PowerShell Notes in `.agent/AGENT.md`
+
+## Gitea (Git Repository)
+
+| Item | Value |
+|------|-----|
+| **Base URL** | `https://git.variet.net` |
+| **API Base** | `https://git.variet.net/api/v1` |
+| **Repo** | `Variet/guitar_score` |
+| **Token** | `3a01b4b15a39921572e64c413353e870d4d2161b` |
+| **Auth Header** | `-H "Authorization: token 3a01b4b15a39921572e64c413353e870d4d2161b"` |
+
+## Vikunja (Task Management)
+
+| Item | Value |
+|------|-----|
+| **Base URL** | `https://plan.variet.net` |
+| **API Base** | `https://plan.variet.net/api/v1` |
+| **Project ID** | `12` |
+| **Token** | `tk_070f8e0b715e818bb7178c3815ed5389040eddca` |
+| **Auth Header** | `-H "Authorization: Bearer tk_070f8e0b715e818bb7178c3815ed5389040eddca"` |
+
+### Querying Vikunja tasks
+
+> [!TIP]
+> Always use the helper script instead of calling the API directly.
+
+```powershell
+C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py list todo
+```
+
+### Vikunja label scheme
+
+**Area labels (required, at least one):**
+
+| ID | Label | Applies to |
+|:--:|-------|-----------|
+| 1 | `Backend` | Server, DB, API |
+| 2 | `Frontend` | UI, interfaces |
+| 3 | `Engine` | Engine logic/computation |
+| 4 | `Infra` | Docker, CI/CD, deployment |
+| 5 | `Test` | Tests, E2E |
+| 17 | `Agent` | Agent-related |
+| 18 | `Tool` | Tool-related |
+| 19 | `AI/LLM` | AI/LLM-related |
+
+**Priority labels (required, exactly one):**
+
+| ID | Label | Criteria |
+|:--:|-------|------|
+| 6 | `Priority:High` | Outages, must-have features |
+| 7 | `Priority:Mid` | Improvements, UX, refactoring |
+| 8 | `Priority:Low` | nice-to-have |
+
+## Monitoring services
+
+| Service | URL | Purpose |
+|--------|-----|------|
+| Uptime Kuma | `https://status.variet.net` | Service monitoring |
+| Authentik | `https://auth.variet.net` | SSO authentication |
--- a/.agent/workflows/start.md
+++ b/.agent/workflows/start.md
@@ -0,0 +1,66 @@
+---
+description: Workflow for quickly restoring project context at the start of a session (시작, continue, 이어서, 작업 시작)
+---
+
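+# Session Start Protocol
+
+Run this workflow at the start of a new conversation or on requests such as "continue", "이어서", or "작업 시작".
+
+// turbo-all
+
+## Procedure
+
+### 0. Load agent rules and context (automatic)
+
+Read `.agent/AGENT.md` to load the agent behavior rules.
+Read `.agent/references/known-issues.md` to catch up on recent issues.
+Read `.agent/references/STATUS.md` to understand the overall design status and feature list.
+
+### 1. Restore devlog context
+
+Read today's devlog index, or yesterday's if today has no entry yet, to pick up the recent flow of work.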
+
+```powershell
+$today = Get-Date -Format "yyyy-MM-dd"
+$yesterday = (Get-Date).AddDays(-1).ToString("yyyy-MM-dd")
+if (Test-Path "docs\devlog\$today.md") {
+    Write-Host "=== Devlog: $today ==="
+    Get-Content "docs\devlog\$today.md"
+} elseif (Test-Path "docs\devlog\$yesterday.md") {
+    Write-Host "=== Devlog: $yesterday (no entry for today yet) ==="
+    Get-Content "docs\devlog\$yesterday.md"
+} else {
+    Write-Host "=== No recent devlog found ==="
+}
+```
+
+If there are unfinished (🔧) items, read the corresponding entry file to get the context needed to resume:
+- Entry path: `docs/devlog/entries/YYYYMMDD-NNN.md`
+
+### 2. Check Git status
+
+```powershell
+git status --short
+```
+```powershell
+git log --oneline -5
+```
+
+### 3. Vikunja TODO tasks
+
+```powershell
+C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py list todo
+```
+
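+### 4. Summary report
+
+Combine the results and report to the user:
+- Context of the last task plus unfinished items (based on devlog 🔧)
+- TODO task list (labels + priority)
+- Suggested next task
+
+**Priority criteria** (do not judge by labels alone):
+- P0: a recent commit changed a schema/model/interface → check for cascading impact
+- P1: server startup or API response failures
+- P2: incomplete features, UX improvements
+- P3: accuracy improvements, new features, CI/CD, documentation cleanup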
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,17 @@
+# Output
+output/
+
+# Environment
+.env
+
+# Python
+__pycache__/
+*.pyc
+*.pyo
+
+# Debug/Temp
+debug_sim.py
+
+# OS
+Thumbs.db
+.DS_Store
--- a/docs/devlog/2026-03-24.md
+++ b/docs/devlog/2026-03-24.md
@@ -0,0 +1,5 @@
+# Devlog — 2026-03-24
+
+| # | Time | Description | Commit | Status |
+|---|------|-----------|------|------|
+| 1 | 22:00~23:20 | Pattern detection refinement (split/overlay distinction, MSE comparison, brightness filter, crop optimization) | `init` | 🔧 |
--- a/docs/devlog/entries/20260324-001.md
+++ b/docs/devlog/entries/20260324-001.md
@@ -0,0 +1,33 @@
+# Pattern Detection Refinement & Switch to MSE Comparison
+
+- **Time**: 2026-03-24 22:00~23:20
+- **Commit**: `init` (first commit)
+
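+## Decisions
+
+### 1. Pattern detection priority: overlay → split → scroll
+- overlay is the most specific pattern (white box + Tab lines), so it is checked first
+- split's brightness criteria can be loose, so it is checked after overlay
+- **Previous problem**: when split was checked first, an overlay video with a bright background (空奏列車) was misclassified as split
+
+### 2. Histogram → MSE comparison
+- Histogram correlation compares only the brightness distribution, so different Tab pages were also judged "identical"
+- MSE pixel comparison accurately detects actual content changes (fret numbers, bar positions)
+- **MSE scale factor**: 8 (identical frames sim>0.999, cursor movement ~0.995, Tab page change 0.52~0.91)
+
+### 3. Stricter split detection conditions
+- top_brightness > 180 (Tab paper is nearly white)
+- bottom_brightness < 100 (the hand cam is dark)
+- difference > 80
+- at least 4 horizontal Tab lines
+
+### 4. Normalized overlay comparison
+- Overlay crops have a different bounding-box size in every frame (69~360px)
+- Normalize onto a 320×120 white canvas before comparing
+- Use a sliding window (the 5 most recent unique frames)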
+
+## Unfinished
+
+- Optimize the number of extracted frames for the overlay type (空奏列車) (currently ~1000, target 20~50)
+- MSE threshold needs further tuning
+- Test additional video types (pure scroll type)
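Review note on decision 2 of the devlog entry: the MSE-to-similarity mapping can be sketched in a few lines. This is a minimal sketch, assuming two grayscale `uint8` images of equal shape and the scale factor 8 quoted in the entry. The name `mse_similarity` is hypothetical; the committed implementation is `compare_frames` in `youtube_tab_to_pdf.py`, which additionally converts to grayscale and resizes.

```python
import numpy as np

SCALE = 8.0  # scale factor quoted in the devlog entry


def mse_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Map the mean squared error of two same-shape uint8 images to a 0..1 similarity."""
    a_f = a.astype(np.float32) / 255.0
    b_f = b.astype(np.float32) / 255.0
    mse = float(np.mean((a_f - b_f) ** 2))
    # MSE 0 means identical; anything at or above 1/SCALE clamps to similarity 0.
    return max(0.0, 1.0 - min(mse * SCALE, 1.0))


white = np.full((120, 320), 255, dtype=np.uint8)
black = np.zeros((120, 320), dtype=np.uint8)

# 5% of the rows flipped to black gives MSE 0.05, i.e. similarity about 0.6.
partly = white.copy()
partly[:6, :] = 0

print(mse_similarity(white, white))   # identical images
print(mse_similarity(white, partly))  # small change
print(mse_similarity(white, black))   # completely different, clamped to 0
```

Because the mapping is linear with a hard clamp, every pair with MSE of 0.125 or more scores 0, so the threshold only discriminates among fairly similar frames.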
--- a/youtube_tab_to_pdf.py
+++ b/youtube_tab_to_pdf.py
@@ -0,0 +1,627 @@
+#!/usr/bin/env python3
+"""
+YouTube Tab → PDF capture pipeline
+Extracts Tab frames from YouTube guitar TAB videos and assembles them into a clean PDF.
+
+Usage:
+    python youtube_tab_to_pdf.py "https://youtu.be/VIDEO_ID"
+    python youtube_tab_to_pdf.py "https://youtu.be/VIDEO_ID" -o output.pdf --debug
+"""
+
+import argparse
+import os
+import sys
+import subprocess
+import shutil
+import re
+import tempfile
+from pathlib import Path
+from typing import List, Tuple, Optional
+
+import cv2
+import numpy as np
+from PIL import Image
+
+# Force UTF-8 console encoding on Windows
+if sys.platform == "win32":
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+    sys.stderr.reconfigure(encoding="utf-8", errors="replace")
+
+
+# ─── Configuration ───────────────────────────────────────────────────────
+
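+DEFAULT_FPS = 2           # frame sampling rate (N frames per second)
+DEFAULT_CROP_RATIO = 0.55 # top crop ratio (scroll type)
+SIMILARITY_THRESHOLD = 0.95  # frame similarity threshold (MSE-based pixel comparison, see compare_frames)
+OVERLAY_MIN_AREA_RATIO = 0.05  # minimum overlay box area ratio
+OVERLAY_MAX_AREA_RATIO = 0.6   # maximum overlay box area ratio
+MIN_TAB_LINES = 4              # minimum horizontal lines for Tab notation (at least 4 of 6 strings)
+SPLIT_TOP_RATIO = 0.42         # top-region ratio for split screen (excludes the hand cam)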
+PDF_DPI = 150
+PDF_PAGE_WIDTH_MM = 210   # A4
+
+
+# ─── Step 1: Download ────────────────────────────────────────────────────
+
+def _find_yt_dlp() -> str:
+    """Locate the yt-dlp executable."""
+    yt_dlp = shutil.which("yt-dlp")
+    if yt_dlp:
+        return yt_dlp
+    # pip user-installed path (Windows)
+    for pyver in ["Python312", "Python311", "Python310"]:
+        user_scripts = Path(os.environ.get("APPDATA", "")) / "Python" / pyver / "Scripts"
+        yt_dlp_path = user_scripts / "yt-dlp.exe"
+        if yt_dlp_path.exists():
+            return str(yt_dlp_path)
+    # conda env Scripts
+    conda_path = Path(sys.executable).parent / "Scripts" / "yt-dlp.exe"
+    if conda_path.exists():
+        return str(conda_path)
+    raise RuntimeError("yt-dlp를 찾을 수 없습니다. pip install yt-dlp를 실행하세요.")
+
+
+def download_video(url: str, output_dir: Path) -> Tuple[Path, str]:
+    """Download a YouTube video with yt-dlp. Returns: (file path, filename-safe title)"""
+    print("[1/5] 영상 다운로드 중...")
+
+    yt_dlp = _find_yt_dlp()
+
+    # Fetch the title (decode output defensively)
+    result = subprocess.run(
+        [yt_dlp, "--get-title", "--encoding", "utf-8", url],
+        capture_output=True, encoding="utf-8", errors="replace"
+    )
+    title = (result.stdout or "").strip() or "untitled"
+    # Replace characters that are unsafe in filenames
+    safe_title = re.sub(r'[\\/:*?"<>|\x00-\x1f]', '_', title)[:80]
+
+    video_path = output_dir / f"{safe_title}.mp4"
+
+    if video_path.exists():
+        print(f"  → 이미 다운로드됨: {video_path.name}")
+        return video_path, safe_title
+
+    subprocess.run(
+        [yt_dlp,
+         "-f", "best[height<=720][ext=mp4]/best[ext=mp4]/best",
+         "-o", str(video_path), url],
+        encoding="utf-8", errors="replace",
+        check=True
+    )
+    print(f"  → 다운로드 완료: {video_path.name}")
+    return video_path, safe_title
+
+
+# ─── Step 2: Frame Extraction ────────────────────────────────────────────
+
+def extract_frames(video_path: Path, fps: float = DEFAULT_FPS) -> List[np.ndarray]:
+    """Extract frames with OpenCV VideoCapture."""
+    print(f"[2/5] 프레임 추출 중 (fps={fps})...")
+    cap = cv2.VideoCapture(str(video_path))
+    if not cap.isOpened():
+        raise RuntimeError(f"영상을 열 수 없습니다: {video_path}")
+
+    video_fps = cap.get(cv2.CAP_PROP_FPS)
+    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+    frame_interval = max(1, int(video_fps / fps))
+
+    frames = []
+    frame_idx = 0
+    while True:
+        ret, frame = cap.read()
+        if not ret:
+            break
+        if frame_idx % frame_interval == 0:
+            frames.append(frame)
+        frame_idx += 1
+
+    cap.release()
+    print(f"  → {len(frames)}개 프레임 추출 (전체 {total_frames}프레임, 원본 {video_fps:.1f}fps)")
+    return frames
+
+
+# ─── Step 3: Pattern Detection ───────────────────────────────────────────
+
+def _has_tab_lines(region: np.ndarray, min_lines: int = MIN_TAB_LINES) -> bool:
+    """Check whether the region contains horizontal Tab lines (the 6 guitar strings)."""
+    if region is None or region.size == 0:
+        return False
+
+    gray = cv2.cvtColor(region, cv2.COLOR_BGR2GRAY) if len(region.shape) == 3 else region
+    h, w = gray.shape
+    if h < 20 or w < 50:
+        return False
+
+    # Binarize (bright background + dark lines)
+    _, binary = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY_INV)
+
+    # Emphasize horizontal lines: morphological opening with a wide kernel
+    horiz_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (max(w // 4, 30), 1))
+    horiz = cv2.morphologyEx(binary, cv2.MORPH_OPEN, horiz_kernel)
+
+    # Detect horizontal lines with HoughLinesP
+    lines = cv2.HoughLinesP(horiz, 1, np.pi / 180, threshold=50,
+                            minLineLength=w // 3, maxLineGap=20)
+    if lines is None:
+        return False
+
+    # Keep only near-horizontal lines (angle < 5 degrees)
+    horizontal_ys = []
+    for line in lines:
+        x1, y1, x2, y2 = line[0]
+        if abs(y2 - y1) < max(5, abs(x2 - x1) * 0.087):  # tan(5°) ≈ 0.087
+            horizontal_ys.append((y1 + y2) / 2)
+
+    if len(horizontal_ys) < min_lines:
+        return False
+
+    # Cluster Y coordinates: merge nearby lines into one (detects the 6-line group)
+    horizontal_ys.sort()
+    clusters = []
+    for y in horizontal_ys:
+        if not clusters or y - clusters[-1] > h * 0.02:  # start a new cluster when more than 2% of the height away
+            clusters.append(y)
+
+    return len(clusters) >= min_lines
+
+
+def _detect_white_region(frame: np.ndarray) -> Optional[Tuple[int, int, int, int]]:
+    """Detect a white rectangular region (whether or not it contains Tab). Returns: (x, y, w, h) or None"""
+    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
+    h, w = gray.shape
+
+    _, thresh = cv2.threshold(gray, 220, 255, cv2.THRESH_BINARY)
+    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 15))
+    closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
+    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+
+    total_area = h * w
+    best = None
+    best_area = 0
+
+    for cnt in contours:
+        x, y, cw, ch = cv2.boundingRect(cnt)
+        area = cw * ch
+        ratio = area / total_area
+
+        if (OVERLAY_MIN_AREA_RATIO < ratio < OVERLAY_MAX_AREA_RATIO
+                and cw > ch * 0.5
+                and area > best_area):
+            best = (x, y, cw, ch)
+            best_area = area
+
+    return best
+
+
+def _detect_tab_overlay(frame: np.ndarray) -> Optional[Tuple[int, int, int, int]]:
+    """Detect a white overlay box that contains Tab notation. Returns: (x, y, w, h) or None"""
+    bbox = _detect_white_region(frame)
+    if bbox is None:
+        return None
+
+    x, y, w, h = bbox
+    region = frame[y:y + h, x:x + w]
+
+    # Return the box only when it contains horizontal Tab lines
+    if _has_tab_lines(region, min_lines=3):
+        return bbox
+    return None
+
+
+def _detect_split_screen(frames: List[np.ndarray], sample_count: int = 10) -> bool:
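+    """Detect split screen: check that the top is bright Tab paper and the bottom is a dark hand cam.
+
+    Strict criteria:
+    - top mean brightness > 180 (Tab paper is nearly white)
+    - bottom mean brightness < 100 (the hand cam is usually dark)
+    - brightness difference > 80
+    - at least 4 horizontal Tab lines in the top half
+    """
+    DETECT_SPLIT = 0.5  # split ratio used for detection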
+
+    if len(frames) < sample_count:
+        sample_count = len(frames)
+
+    indices = np.linspace(0, len(frames) - 1, sample_count, dtype=int)
+    split_count = 0
+
+    for idx in indices:
+        frame = frames[idx]
+        fh, fw = frame.shape[:2]
+        top_half = frame[0:int(fh * DETECT_SPLIT), :]
+        bottom_half = frame[int(fh * DETECT_SPLIT):, :]
+
+        top_brightness = np.mean(cv2.cvtColor(top_half, cv2.COLOR_BGR2GRAY))
+        bottom_brightness = np.mean(cv2.cvtColor(bottom_half, cv2.COLOR_BGR2GRAY))
+
+        # Strict brightness criteria: Tab paper (>180) + dark hand cam (<100) + large difference (>80)
+        if (top_brightness > 180 and bottom_brightness < 100
+                and top_brightness - bottom_brightness > 80
+                and _has_tab_lines(top_half, min_lines=4)):
+            split_count += 1
+
+    ratio = split_count / sample_count
+    return ratio > 0.3
+
+
+def detect_pattern(frames: List[np.ndarray], sample_count: int = 20) -> str:
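+    """Detect the video pattern: 'scroll', 'overlay', or 'split'.
+
+    Detection order:
+    1. overlay: the Tab overlay box is the most specific pattern, so it is checked first
+    2. split: Tab paper on top + hand cam below, with strict brightness criteria
+    3. scroll: default (top crop)
+    """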
+    print("[3/5] 영상 패턴 분석 중...")
+
+    if len(frames) < sample_count:
+        sample_count = len(frames)
+
+    indices = np.linspace(0, len(frames) - 1, sample_count, dtype=int)
+    sample_frames = [frames[i] for i in indices]
+
+    # 1) Check overlay first: a white box containing Tab lines (most specific)
+    overlay_count = 0
+    for frame in sample_frames:
+        if _detect_tab_overlay(frame) is not None:
+            overlay_count += 1
+
+    overlay_ratio = overlay_count / sample_count
+    if overlay_ratio > 0.3:
+        print(f"  → 패턴: overlay (Tab 오버레이 감지율: {overlay_ratio:.0%})")
+        return "overlay"
+
+    # 2) Check split screen: Tab paper on top + hand cam below
+    if _detect_split_screen(frames, sample_count):
+        print("  → 패턴: split (상단 Tab + 하단 핸드캠)")
+        return "split"
+
+    # 3) Default: scroll type
+    print(f"  → 패턴: scroll (오버레이 감지율: {overlay_ratio:.0%})")
+    return "scroll"
+
+
+# ─── Step 4: Extract Unique Tab Frames ────────────────────────────────────
+
+def compare_frames(frame1: np.ndarray, frame2: np.ndarray) -> float:
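+    """Compare the similarity of two frames (0~1, 1 = identical).
+
+    Uses a pixel-level mean squared error (MSE) comparison, which detects
+    changes in Tab content (fret numbers, bar positions, etc.) more accurately
+    than the histogram approach.
+    """
+    # Convert to grayscale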
+    g1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY) if len(frame1.shape) == 3 else frame1
+    g2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY) if len(frame2.shape) == 3 else frame2
+
+    # Match sizes
+    if g1.shape != g2.shape:
+        g2 = cv2.resize(g2, (g1.shape[1], g1.shape[0]))
+
+    # Downscale to a standard width (faster, less noise)
+    target_w = 320
+    if g1.shape[1] > target_w:
+        scale = target_w / g1.shape[1]
+        new_size = (target_w, int(g1.shape[0] * scale))
+        g1 = cv2.resize(g1, new_size)
+        g2 = cv2.resize(g2, new_size)
+
+    # Pixel-level comparison via mean squared error (MSE)
+    # MSE: 0 = identical, higher = more different → converted to a similarity
+    g1_f = g1.astype(np.float32) / 255.0
+    g2_f = g2.astype(np.float32) / 255.0
+    mse = np.mean((g1_f - g2_f) ** 2)
+
+    # Convert MSE to similarity (0~1, 1 = identical)
+    # factor 8: MSE 0.005→sim 0.96, MSE 0.06→sim 0.52, MSE 0.125+→sim 0.0
+    similarity = 1.0 - min(mse * 8.0, 1.0)
+    return max(0.0, similarity)
+
+
+def extract_unique_scroll(frames: List[np.ndarray],
+                          crop_ratio: float = DEFAULT_CROP_RATIO,
+                          threshold: float = SIMILARITY_THRESHOLD) -> List[np.ndarray]:
+    """Scroll type: crop the top region, then remove duplicates."""
+    print("[4/5] 스크롤형 Tab 프레임 추출 중...")
+
+    unique = []
+    prev_crop = None
+
+    for i, frame in enumerate(frames):
+        h, w = frame.shape[:2]
+        crop = frame[0:int(h * crop_ratio), :]
+
+        if prev_crop is None:
+            unique.append(crop)
+            prev_crop = crop
+            continue
+
+        sim = compare_frames(crop, prev_crop)
+        if sim < threshold:
+            unique.append(crop)
+            prev_crop = crop
+
+    print(f"  → {len(unique)}개 고유 프레임 선별 (임계값: {threshold})")
+    return unique
+
+
+def _normalize_overlay(crop: np.ndarray, target_w: int = 320,
+                        target_h: int = 120) -> np.ndarray:
+    """Place an overlay crop on a fixed-size white canvas (for normalized comparison)."""
+    h, w = crop.shape[:2]
+    scale = min(target_w / w, target_h / h)
+    new_w = int(w * scale)
+    new_h = int(h * scale)
+    resized = cv2.resize(crop, (new_w, new_h))
+
+    # Center on a white canvas
+    canvas = np.full((target_h, target_w, 3), 255, dtype=np.uint8)
+    offset_x = (target_w - new_w) // 2
+    offset_y = (target_h - new_h) // 2
+    canvas[offset_y:offset_y + new_h, offset_x:offset_x + new_w] = resized
+    return canvas
+
+
+def extract_unique_overlay(frames: List[np.ndarray],
+                           threshold: float = SIMILARITY_THRESHOLD) -> List[np.ndarray]:
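+    """Overlay type: detect the white box containing Tab lines, then remove duplicates.
+
+    Sliding-window comparison: each frame is compared against the N most recent
+    unique frames, so frames that differ only through gradual accumulated change
+    (drift) are not kept as duplicates.
+    """
+    print("[4/5] 오버레이형 Tab 프레임 추출 중...")
+
+    WINDOW_SIZE = 5  # compare against the 5 most recent unique frames
+    MIN_CROP_H = 40  # minimum crop height (drops detections that are too small)
+    MIN_CROP_W = 100 # minimum crop width
+
+    unique = []
+    recent_normalized = []  # normalized versions of recent unique frames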
+
+    for i, frame in enumerate(frames):
+        bbox = _detect_tab_overlay(frame)
+        if bbox is None:
+            continue
+
+        x, y, w, h = bbox
+        # Minimum size filter
+        if h < MIN_CROP_H or w < MIN_CROP_W:
+            continue
+
+        # Add a little padding
+        pad = 10
+        x = max(0, x - pad)
+        y = max(0, y - pad)
+        w = min(frame.shape[1] - x, w + 2 * pad)
+        h = min(frame.shape[0] - y, h + 2 * pad)
+
+        overlay_crop = frame[y:y + h, x:x + w]
+        normalized = _normalize_overlay(overlay_crop)
+
+        # Compare against the N most recent unique frames; skip if any of them is similar
+        is_duplicate = False
+        for ref_norm in recent_normalized:
+            sim = compare_frames(normalized, ref_norm)
+            if sim >= threshold:
+                is_duplicate = True
+                break
+
+        if not is_duplicate:
+            unique.append(overlay_crop)
+            recent_normalized.append(normalized)
+            # Keep the window at its fixed size
+            if len(recent_normalized) > WINDOW_SIZE:
+                recent_normalized.pop(0)
+
+    print(f"  → {len(unique)}개 고유 오버레이 프레임 선별")
+    return unique
+
+
+def extract_unique_split(frames: List[np.ndarray],
+                         crop_ratio: float = SPLIT_TOP_RATIO,
+                         threshold: float = 0.95) -> List[np.ndarray]:
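+    """Split-screen type: crop the top Tab region, then remove duplicates.
+
+    With MSE-based comparison, identical frames give sim>0.999 and cursor-only movement gives ~0.995.
+    An actual Tab page change gives sim 0.60~0.91, so threshold=0.95 is a reasonable balance.
+    """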
+    print(f"[4/5] 분할 화면형 Tab 프레임 추출 중 (crop={crop_ratio:.0%}, sim={threshold})...")
+
+    unique = []
+    prev_crop = None
+
+    for i, frame in enumerate(frames):
+        h, w = frame.shape[:2]
+        crop = frame[0:int(h * crop_ratio), :]
+
+        # Brightness filter: exclude dark frames (intro/outro)
+        gray_crop = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
+        mean_brightness = np.mean(gray_crop)
+        if mean_brightness < 120:  # skip dark frames
+            continue
+
+        # Keep only frames that contain Tab lines
+        if not _has_tab_lines(crop, min_lines=3):
+            continue
+
+        if prev_crop is None:
+            unique.append(crop)
+            prev_crop = crop
+            continue
+
+        sim = compare_frames(crop, prev_crop)
+        if sim < threshold:
+            unique.append(crop)
+            prev_crop = crop
+
+    print(f"  → {len(unique)}개 고유 분할화면 프레임 선별")
+    return unique
+
+
+# ─── Step 5: Generate PDF ─────────────────────────────────────────────────
+
+def generate_pdf(frames: List[np.ndarray], output_path: Path,
+                 debug_dir: Optional[Path] = None) -> None:
+    """Combine the unique frames into a single PDF."""
+    print("[5/5] PDF 생성 중...")
+
+    if not frames:
+        print("  ⚠ 추출된 프레임이 없습니다!")
+        return
+
+    pil_images = []
+    for i, frame in enumerate(frames):
+        # BGR → RGB
+        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+        img = Image.fromarray(rgb)
+
+        # Debug mode: save each image individually
+        if debug_dir:
+            img.save(debug_dir / f"frame_{i:04d}.png")
+
+        pil_images.append(img)
+
+    # Build the PDF: save the first image and append the rest
+    # Each frame becomes one PDF page (original size preserved)
+    pdf_pages = []
+    for img in pil_images:
+        # Ensure RGB mode (PDF output does not support RGBA)
+        if img.mode != 'RGB':
+            img = img.convert('RGB')
+        pdf_pages.append(img)
+
+    if pdf_pages:
+        first_page = pdf_pages[0]
+        rest_pages = pdf_pages[1:] if len(pdf_pages) > 1 else []
+        first_page.save(
+            str(output_path),
+            save_all=True,
+            append_images=rest_pages,
+            resolution=PDF_DPI,
+        )
+        print(f"  → PDF 생성 완료: {output_path}")
+        print(f"     {len(pdf_pages)} 페이지, 파일 크기: {output_path.stat().st_size / 1024:.0f} KB")
+
+
+# ─── Also generate single long PNG ────────────────────────────────────────
+
+def generate_long_image(frames: List[np.ndarray], output_path: Path) -> None:
+    """Stack all frames vertically into one long image."""
+    if not frames:
+        return
+
+    # Scale every frame to the width of the widest frame
+    max_width = max(f.shape[1] for f in frames)
+    resized = []
+    for f in frames:
+        if f.shape[1] != max_width:
+            scale = max_width / f.shape[1]
+            new_h = int(f.shape[0] * scale)
+            f = cv2.resize(f, (max_width, new_h))
+        resized.append(f)
+
+    concat = np.vstack(resized)
+    rgb = cv2.cvtColor(concat, cv2.COLOR_BGR2RGB)
+    img = Image.fromarray(rgb)
+    img.save(str(output_path))
+    print(f"  → 롱 이미지 생성: {output_path} ({img.width}x{img.height})")
+
+
+# ─── Main Pipeline ────────────────────────────────────────────────────────
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="YouTube 기타 TAB 영상 → PDF 캡처",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+예시:
+  python youtube_tab_to_pdf.py "https://youtu.be/90BWvJY6KbE"
+  python youtube_tab_to_pdf.py "https://youtu.be/Ri9g4lwnrJQ" -o my_tab.pdf --debug
+  python youtube_tab_to_pdf.py "https://youtu.be/VIDEO" --pattern scroll --crop-ratio 0.6
+        """,
+    )
+    parser.add_argument("url", help="YouTube 영상 URL")
+    parser.add_argument("-o", "--output", help="출력 PDF 파일 경로")
+    parser.add_argument("--crop-ratio", type=float, default=DEFAULT_CROP_RATIO,
+                        help=f"Tab 영역 크롭 비율 (기본: {DEFAULT_CROP_RATIO})")
+    parser.add_argument("--fps", type=float, default=DEFAULT_FPS,
+                        help=f"프레임 추출 빈도 (기본: {DEFAULT_FPS})")
+    parser.add_argument("--similarity", type=float, default=SIMILARITY_THRESHOLD,
+                        help=f"프레임 유사도 임계값 (기본: {SIMILARITY_THRESHOLD})")
+    parser.add_argument("--pattern", choices=["auto", "scroll", "overlay", "split"],
+                        default="auto", help="영상 패턴 (기본: auto)")
+    parser.add_argument("--debug", action="store_true", help="중간 이미지 저장")
+
+    args = parser.parse_args()
+
+    # Set up the output directory
+    output_dir = Path("output")
+    output_dir.mkdir(exist_ok=True)
+
+    # Debug directory
+    debug_dir = None
+    if args.debug:
+        debug_dir = output_dir / "debug_frames"
+        debug_dir.mkdir(exist_ok=True)
+
+    # ── Step 1: Download ──
+    video_path, safe_title = download_video(args.url, output_dir)
+
+    # ── Step 2: Extract Frames ──
+    frames = extract_frames(video_path, fps=args.fps)
+    if not frames:
+        print("❌ 프레임을 추출할 수 없습니다.")
+        sys.exit(1)
+
+    # ── Step 3: Detect Pattern ──
+    if args.pattern == "auto":
+        pattern = detect_pattern(frames)
+    else:
+        pattern = args.pattern
+        print(f"[3/5] 패턴 수동 지정: {pattern}")
+
+    # ── Step 4: Extract Unique Frames ──
+    if pattern == "scroll":
+        unique_frames = extract_unique_scroll(
+            frames, crop_ratio=args.crop_ratio, threshold=args.similarity
+        )
+    elif pattern == "split":
+        # split mode: uses its own tuned defaults (crop=42%, sim=0.95)
+        # Overridden only when a CLI value differs from the global default
+        split_kwargs = {}
+        if args.crop_ratio != DEFAULT_CROP_RATIO:  # the user passed a non-default value
+            split_kwargs['crop_ratio'] = args.crop_ratio
+        if args.similarity != SIMILARITY_THRESHOLD:
+            split_kwargs['threshold'] = args.similarity
+        unique_frames = extract_unique_split(frames, **split_kwargs)
+    else:
+        unique_frames = extract_unique_overlay(
+            frames, threshold=args.similarity
+        )
+
+    if not unique_frames:
+        print("❌ 고유 프레임을 찾을 수 없습니다. --similarity 값을 낮추거나 --pattern을 수동 지정해보세요.")
+        sys.exit(1)
+
+    # ── Step 5: Generate Output ──
+    if args.output:
+        pdf_path = Path(args.output)
+    else:
+        pdf_path = output_dir / f"{safe_title}.pdf"
+
+    generate_pdf(unique_frames, pdf_path, debug_dir=debug_dir)
+
+    # Bonus: also generate the long image
+    long_img_path = pdf_path.with_suffix(".png")
+    generate_long_image(unique_frames, long_img_path)
+
+    print(f"\n✅ 완료!")
+    print(f"   PDF: {pdf_path}")
+    print(f"   PNG: {long_img_path}")
+    if debug_dir:
+        debug_count = len(list(debug_dir.glob("*.png")))
+        print(f"   Debug: {debug_dir} ({debug_count}개 이미지)")
+
+
+if __name__ == "__main__":
+    main()
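Review note on `extract_unique_overlay`: the sliding-window de-duplication can be sketched independently of OpenCV. This is a minimal sketch, assuming any similarity function that returns 1.0 for identical items. The names `dedupe_sliding_window` and `toy_similarity` are hypothetical and stand in for the committed loop and `compare_frames`; the defaults mirror `SIMILARITY_THRESHOLD` and `WINDOW_SIZE` from the diff.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, TypeVar

T = TypeVar("T")


def dedupe_sliding_window(items: Iterable[T],
                          similarity: Callable[[T, T], float],
                          threshold: float = 0.95,
                          window_size: int = 5) -> List[T]:
    """Keep an item only if it is below `threshold` against each recently kept item."""
    unique: List[T] = []
    recent: Deque[T] = deque(maxlen=window_size)  # oldest kept item drops off automatically
    for item in items:
        if any(similarity(item, ref) >= threshold for ref in recent):
            continue  # duplicate of something kept recently
        unique.append(item)
        recent.append(item)
    return unique


def toy_similarity(a: float, b: float) -> float:
    """Toy similarity for numbers in [0, 1]: 1.0 when equal, falling linearly with distance."""
    return 1.0 - abs(a - b)


values = [0.10, 0.11, 0.50, 0.10, 0.52, 0.90]
print(dedupe_sliding_window(values, toy_similarity))
```

Comparing against several recent items, rather than only the previous one, is what lets the fourth value (a return to 0.10) be recognized as a duplicate. A `deque` with `maxlen` replaces the manual `pop(0)` used in the diff.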