From 98381d28932b07ab4011b5ac32a19dc01ccc2d1f Mon Sep 17 00:00:00 2001
From: quantlab <quantlab@vega.local>
Date: Wed, 25 Mar 2026 21:58:48 +0900
Subject: [PATCH] feat(pipeline): v3-v4 dedup + panorama stitching + 1080p
 support

- HSV-aware _trim_to_content (white ratio 30-97%)
- pHash cluster dedup: dHash 32x32(1024bit), max_hamming=20
- Panoramic stitching: template matching scroll offset detection
- Dedup pipeline: MSE -> Panorama -> pHash (after HSV strip detection)
- 1080p download priority + MAX_FRAME_WIDTH=1280 cap
- test_pipeline.py with YouTube URLs and --download mode
- 3 new known-issues documented
- devlog + STATUS.md updated
---
 .agent/references/STATUS.md                |  30 +-
 .agent/references/known-issues.md          |  20 +-
 .agent/workflows/end.md                    |  14 +-
 .agent/workflows/helpers/vikunja_helper.py |   2 +-
 .agent/workflows/helpers/wiki_helper.py    |   2 +-
 .agent/workflows/services.md               |   2 +-
 .agent/workflows/start.md                  |   2 +-
 diag_v2.py                                 |  75 ++
 docs/devlog/2026-03-25.md                  |   7 +
 docs/devlog/entries/20260325-001.md        |  34 +
 dump_frames.py                             |  33 +
 test_pipeline.py                           | 110 +++
 youtube_tab_to_pdf.py                      | 926 ++++++++++++---------
 13 files changed, 836 insertions(+), 421 deletions(-)
 create mode 100644 diag_v2.py
 create mode 100644 docs/devlog/2026-03-25.md
 create mode 100644 docs/devlog/entries/20260325-001.md
 create mode 100644 dump_frames.py
 create mode 100644 test_pipeline.py
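
Reviewer note on the "template matching scroll offset detection" bullet in the commit message. The sketch below shows the idea on a 1-D column-brightness profile instead of a 2-D image, so it swaps in a sum-of-absolute-differences search for real template matching. The function name, the `max_mean_error` threshold, and the profile representation are assumptions for illustration and are not taken from `_detect_scroll_offset` in this patch; only the "right 60% as template" ratio comes from the devlog.

```python
from typing import List, Optional


def detect_scroll_offset(prev_cols: List[float], next_cols: List[float],
                         template_ratio: float = 0.6,
                         max_mean_error: float = 5.0) -> Optional[int]:
    """Estimate how many columns the content moved left between two frames.

    prev_cols / next_cols are per-column mean brightness profiles of equal
    length. Returns None when no position matches well enough.
    """
    width = len(prev_cols)
    if width == 0 or len(next_cols) != width:
        return None

    # Template = right 60% of the previous frame (hypothetical 1-D analogue).
    t_start = width - int(width * template_ratio)
    template = prev_cols[t_start:]
    t_len = len(template)
    if t_len == 0:
        return None

    best_pos, best_err = None, float("inf")
    # Content scrolls left, so the template can only reappear at or left of t_start.
    for pos in range(0, t_start + 1):
        window = next_cols[pos:pos + t_len]
        err = sum(abs(a - b) for a, b in zip(template, window)) / t_len
        if err < best_err:
            best_pos, best_err = pos, err

    if best_pos is None or best_err > max_mean_error:
        return None
    return t_start - best_pos
```

Two consecutive frames whose offset is known can then be joined by appending only the last `offset` columns of the newer frame, which is the role the panorama stitching step plays between the MSE and pHash passes.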

diff --git a/.agent/references/STATUS.md b/.agent/references/STATUS.md
index 420afbd..847dd45 100644
--- a/.agent/references/STATUS.md
+++ b/.agent/references/STATUS.md
@@ -4,26 +4,38 @@
 
 | 기능 | 상태 | 비고 |
 |------|------|------|
-| YouTube 다운로드 | ✅ 완료 | yt-dlp + 쿠키 인증 |
-| 프레임 추출 | ✅ 완료 | fps=2 기본값 |
+| YouTube 다운로드 | ✅ 완료 | yt-dlp, 1080p 우선 다운로드 |
+| 프레임 추출 | ✅ 완료 | fps=2, MAX_FRAME_WIDTH=1280 캡 |
 | 패턴 감지 (overlay) | ✅ 완료 | Tab 라인 검증 포함 |
 | 패턴 감지 (split) | ✅ 완료 | 밝기 기준 엄격화 |
 | 패턴 감지 (scroll) | ✅ 완료 | 기본 폴백 |
-| MSE 기반 중복 제거 | ✅ 완료 | 히스토그램 → MSE 전환 |
-| 오버레이 정규화 비교 | ✅ 완료 | 320×120 정규화 + 슬라이딩 윈도우 |
+| HSV 기반 Tab 검출 | ✅ 완료 | 2-tier HSV 마스크, 960px 업스케일 |
+| MSE 기반 중복 제거 | ✅ 완료 | 480px 정규화 비교 |
+| pHash 클러스터 중복제거 | ✅ 완료 | dHash 32×32(1024bit), max_hamming=20 |
+| 파노라마 스티칭 | ✅ 완료 | 템플릿 매칭 수평 스크롤 합성 |
+| 오버레이 정규화 비교 | ✅ 완료 | 480×180 정규화 + 전체 히스토리 MSE 비교 |
 | PDF/PNG 생성 | ✅ 완료 | A4 + 롱 이미지 |
 
+## 처리 파이프라인 (scroll)
+
+```
+Raw Frames → HSV Strip 검출 → Median Crop → MSE 1차 → 파노라마 스티칭 → pHash 2차 → PDF
+```
+
 ## 최근 변경
 
 | 날짜 | 변경 내용 |
 |------|-----------|
+| 2026-03-25 | 1080p 우선 다운로드 + MAX_FRAME_WIDTH=1280 캡 (OOM 방지) |
+| 2026-03-25 | dHash 32×32 + max_hamming=20으로 pHash 정밀도 향상 |
+| 2026-03-25 | 파노라마 스티칭: 템플릿 매칭 스크롤 오프셋 검출 + 연속 프레임 합성 |
+| 2026-03-25 | HSV 트림: 흰색비율 30~97% 기반 정밀 크롭 |
+| 2026-03-25 | overlay 프레임 수 최적화: 858→51프레임 (OVERLAY_SIMILARITY_THRESHOLD=0.55) |
 | 2026-03-24 | 패턴 감지 고도화: overlay→split→scroll 우선순위 |
 | 2026-03-24 | 히스토그램 비교 → MSE 픽셀 비교로 전환 |
-| 2026-03-24 | split 모드: 42% 크롭 + 밝기 필터 + Tab 라인 검증 |
-| 2026-03-24 | overlay 모드: 정규화 + 슬라이딩 윈도우 중복 제거 |
-| 2026-03-24 | split 감지 조건 엄격화 (top>180, bottom<100) |
 
 ## 알려진 제한사항
 
-- 오버레이형 영상(空奏列車)에서 추출 프레임 수가 아직 많을 수 있음 (MSE 임계값 추가 튜닝 필요)
-- 영상 내 Tab이 반복되는 곡은 실제 고유 프레임 수가 적음 (正常 동작)
+- 1080p 처리 시 여전히 중복 프레임 존재 가능 (마디번호 기반 추가 검증 필요)
+- 순차 영상 처리 시 메모리 누적 주의 (gc.collect 필수)
+- test_pipeline.py 아직 메인 코드와 완전 통합 안 됨
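
Reviewer note on the MAX_FRAME_WIDTH=1280 cap recorded in STATUS.md. The arithmetic below is a rough sketch of why the cap matters, assuming uint8 frames with 3 channels as OpenCV returns them. `frame_bytes` and `capped_size` are hypothetical helper names, and the height is computed with integer arithmetic rather than the patch's `int(h * scale)`.

```python
from typing import Tuple


def frame_bytes(width: int, height: int, channels: int = 3) -> int:
    """Bytes held by one uint8 frame of the given size."""
    return width * height * channels


def capped_size(width: int, height: int, max_width: int = 1280) -> Tuple[int, int]:
    """Downscale only when the frame is wider than max_width, keeping aspect ratio."""
    if width <= max_width:
        return width, height
    # Integer arithmetic avoids float rounding in the scaled height.
    return max_width, height * max_width // width


if __name__ == "__main__":
    for size in [(1920, 1080), (3840, 2160), (640, 360)]:
        capped = capped_size(*size)
        per_frame = frame_bytes(*capped)
        # 500 frames is the count quoted in known-issues.md.
        print(size, "->", capped, f"{per_frame * 500 / 1e9:.2f} GB for 500 frames")
```

Under these assumptions a 1920×1080 frame takes 6,220,800 bytes, so 500 frames held in a list need about 3.1 GB before any intermediate copies, while the same 500 frames at 1280×720 need about 1.4 GB.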
diff --git a/.agent/references/known-issues.md b/.agent/references/known-issues.md
index 66538b6..3034db4 100644
--- a/.agent/references/known-issues.md
+++ b/.agent/references/known-issues.md
@@ -54,4 +54,22 @@
 - **증상**: overlay 프레임 비교 시 모든 프레임이 "다르다"로 판정 (1000+개 추출)
 - **원인**: _detect_tab_overlay가 프레임마다 다른 크기의 바운딩박스 반환 (69~360px)
 - **해결**: 320×120 흰색 캔버스에 정규화 후 비교 + 슬라이딩 윈도우(5프레임)
-- **주의**: overlay 프레임 수 최적화는 아직 진행 중 (추가 튜닝 필요)
\ No newline at end of file
+- **주의**: overlay 프레임 수 최적화는 아직 진행 중 (추가 튜닝 필요)
+
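+### [2026-03-25] pHash 16×16: Tab frames over-merged
+- **Symptom**: Distinct Tab pages were merged into the same group during pHash clustering (20 → 9 frames)
+- **Cause**: A 16×16 dHash (256 bits) lacks the resolution to distinguish Tab structure (6 lines + numbers), so every Tab frame produces a similar hash
+- **Fix**: Enlarged to dHash 32×32 (1024 bits) and lowered max_hamming from 50 to 20
+- **Note**: Always tune hash_size and max_hamming as a pair, as a fraction of the bit count (50/256 ≈ 19.5% before, 20/1024 ≈ 2.0% now)
+
+### [2026-03-25] 1080p frames: out of memory / process hang
+- **Symptom**: Loading 500+ frames at 1920×1080 makes the process hang indefinitely (3.5 GB+ RAM)
+- **Cause**: extract_frames keeps every frame in a list, and a 1080p frame takes ~6 MB
+- **Fix**: Added the MAX_FRAME_WIDTH=1280 cap and gc.collect(). Frames wider than 1280 px (1080p, 4K) are downscaled to 1280 px wide
+- **Note**: When 3 videos are processed in sequence without GC, the accumulated memory causes swap thrashing
+
+### [2026-03-25] yt-dlp download: 360p fallback
+- **Symptom**: Requested the `bestvideo[height>=720]` format, but a 640×360 file was downloaded
+- **Cause**: The `/best` fallback in the format string selects 360p when no 720p stream is available, or the mp4-only filter limits the available resolutions
+- **Fix**: Split the chain into an explicit 1080p-first tier with a 720p fallback (`bestvideo[height>=1080]/.../best[height>=720]/best`)
+- **Note**: A cached file is never re-downloaded, so delete the existing file when changing the resolution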
\ No newline at end of file
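
Reviewer note on the yt-dlp fallback chain described in the last known-issues entry. The sketch below rebuilds the format selector from a list of minimum heights, so the tier order is visible in one place. `build_format_selector` is a hypothetical helper, not part of this patch; the default output is meant to equal the string passed to `-f` in `download_video`.

```python
from typing import Sequence


def build_format_selector(min_heights: Sequence[int] = (1080, 720)) -> str:
    """Join yt-dlp format tiers with '/', tried left to right."""
    if not min_heights:
        return "best"
    # One video+audio tier per minimum height, highest preference first.
    tiers = [
        f"bestvideo[height>={h}][ext=mp4]+bestaudio[ext=m4a]"
        for h in min_heights
    ]
    # Pre-merged fallback at the lowest acceptable height, then anything.
    tiers.append(f"best[height>={min_heights[-1]}]")
    tiers.append("best")
    return "/".join(tiers)


if __name__ == "__main__":
    print(build_format_selector())
```

Because `height>=1080` is a lower bound, the first tier can also match streams above 1080p, which is one reason the extraction step still caps the frame width afterwards.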
diff --git a/.agent/workflows/end.md b/.agent/workflows/end.md
index f3c2470..6dec20c 100644
--- a/.agent/workflows/end.md
+++ b/.agent/workflows/end.md
@@ -87,9 +87,9 @@ git log --oneline -20
 
 | 커밋 유형 | Vikunja 액션 |
 |-----------|-------------|
-| 기존 태스크 해당 작업 **완료** | `C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py done {ID}` |
-| 신규 작업 완료 (기존 태스크 없음) | `C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py create "제목" "설명" --done --labels Backend,Priority:High` |
-| 작업 중 발견된 **미완료 TODO** | `C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py create "제목" "설명" --labels Backend,Priority:Mid` |
+| 기존 태스크 해당 작업 **완료** | `C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\vikunja_helper.py done {ID}` |
+| 신규 작업 완료 (기존 태스크 없음) | `C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\vikunja_helper.py create "제목" "설명" --done --labels Backend,Priority:High` |
+| 작업 중 발견된 **미완료 TODO** | `C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\vikunja_helper.py create "제목" "설명" --labels Backend,Priority:Mid` |
 
 > [!IMPORTANT]
 > 모든 커밋이 기존 또는 신규 태스크에 매핑되었는지 확인.
@@ -97,13 +97,13 @@ git log --oneline -20
 ### 2-2. 완료 처리
 
 ```powershell
-C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py done {TASK_ID}
+C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\vikunja_helper.py done {TASK_ID}
 ```
 
 ### 2-3. 신규 태스크 생성
 
 ```powershell
-C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py create "제목" "설명" --labels Backend,Priority:High
+C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\vikunja_helper.py create "제목" "설명" --labels Backend,Priority:High
 ```
 
 ### 라벨 규칙
@@ -138,11 +138,11 @@ git diff --name-only .agent/references/
 
 ```powershell
 # STATUS.md가 변경된 경우
-C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\wiki_helper.py update "Status" .agent\references\STATUS.md
+C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\wiki_helper.py update "Status" .agent\references\STATUS.md
 ```
 ```powershell
 # architecture.md가 변경된 경우
-C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\wiki_helper.py update "Architecture" .agent\references\architecture.md
+C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\wiki_helper.py update "Architecture" .agent\references\architecture.md
 ```
 
 > [!TIP]
diff --git a/.agent/workflows/helpers/vikunja_helper.py b/.agent/workflows/helpers/vikunja_helper.py
index 7ff44f6..44f5b75 100644
--- a/.agent/workflows/helpers/vikunja_helper.py
+++ b/.agent/workflows/helpers/vikunja_helper.py
@@ -32,7 +32,7 @@ if sys.stdout.encoding != "utf-8":
 # ============================================================
 API_BASE = "https://plan.variet.net/api/v1"
 TOKEN = "tk_070f8e0b715e818bb7178c3815ed5389040eddca"
-PROJECT_ID = 7                                  # Variet Agent 프로젝트
+PROJECT_ID = 12                                 # guitar_score 프로젝트
 # ============================================================
 
 HEADERS = {
diff --git a/.agent/workflows/helpers/wiki_helper.py b/.agent/workflows/helpers/wiki_helper.py
index bf63161..3dec24a 100644
--- a/.agent/workflows/helpers/wiki_helper.py
+++ b/.agent/workflows/helpers/wiki_helper.py
@@ -15,7 +15,7 @@ sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
 # ============================================================
 GITEA_BASE_URL = "https://git.variet.net"
 GITEA_OWNER = "Variet"
-GITEA_REPO = "variet-agent"                       # Variet Agent 프로젝트
+GITEA_REPO = "guitar_score"                       # guitar_score 프로젝트
 GITEA_TOKEN = "3a01b4b15a39921572e64c413353e870d4d2161b"
 # ============================================================
 
diff --git a/.agent/workflows/services.md b/.agent/workflows/services.md
index c108f59..0e66dc5 100644
--- a/.agent/workflows/services.md
+++ b/.agent/workflows/services.md
@@ -44,7 +44,7 @@ description: 프로젝트 서비스 연동 정보 + 작업 프로토콜 (서비
 > 직접 API 호출 대신 반드시 helper 스크립트를 사용하세요.
 
 ```powershell
-C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py list todo
+C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\vikunja_helper.py list todo
 ```
 
 ### Vikunja 라벨 체계
diff --git a/.agent/workflows/start.md b/.agent/workflows/start.md
index 0ca4f39..e242b26 100644
--- a/.agent/workflows/start.md
+++ b/.agent/workflows/start.md
@@ -49,7 +49,7 @@ git log --oneline -5
 ### 3. Vikunja TODO 태스크
 
 ```powershell
-C:\ProgramData\miniforge3\envs\variet-agent\python.exe .agent\workflows\helpers\vikunja_helper.py list todo
+C:\ProgramData\miniforge3\envs\score\python.exe .agent\workflows\helpers\vikunja_helper.py list todo
 ```
 
 ### 4. 종합 보고
diff --git a/diag_v2.py b/diag_v2.py
new file mode 100644
index 0000000..ece61ef
--- /dev/null
+++ b/diag_v2.py
@@ -0,0 +1,75 @@
+"""Diagnose video_2: check stage by stage why it yields 0 frames."""
+import sys
+sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+
+import cv2
+import numpy as np
+from pathlib import Path
+import importlib.util
+
+spec = importlib.util.spec_from_file_location("p", "youtube_tab_to_pdf.py")
+p = importlib.util.module_from_spec(spec)
+spec.loader.exec_module(p)
+
+mp4 = Path("output") / "サカナクション／新宝島(エレキギターTAB) 難易度★★★ sakanaction shintakarajima.mp4"
+cap = cv2.VideoCapture(str(mp4))
+total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+fps = cap.get(cv2.CAP_PROP_FPS)
+
+# Sample 10 frames between 10% and 80% of the video
+indices = np.linspace(total * 0.1, total * 0.8, 10, dtype=int)
+for idx in indices:
+    cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
+    ret, frame = cap.read()
+    if not ret:
+        continue
+
+    h, w = frame.shape[:2]
+    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
+
+    # Per-row mean brightness (10% horizontal margins excluded)
+    margin_x = int(w * 0.1)
+    row_br = np.mean(gray[:, margin_x:w-margin_x], axis=1)
+
+    strip = p._find_white_tab_strip(frame)
+    has_tab = False
+    if strip:
+        top, bottom = strip
+        crop = frame[top:bottom, :]
+        has_tab = p._has_tab_content(crop)
+
+    print(f"Frame {idx:5d}: strip={strip}, has_tab={has_tab}, "
+          f"top_br={np.mean(row_br[:h//3]):.0f}, "
+          f"mid_br={np.mean(row_br[h//3:2*h//3]):.0f}, "
+          f"bot_br={np.mean(row_br[2*h//3:]):.0f}")
+
+    # Detailed diagnosis when a strip was found but has_tab is False
+    if strip and not has_tab:
+        top, bottom = strip
+        crop = frame[top:bottom, :]
+        g = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
+        ch, cw = g.shape
+        _, binary = cv2.threshold(g, 180, 255, cv2.THRESH_BINARY_INV)
+        horiz_k = cv2.getStructuringElement(cv2.MORPH_RECT, (max(cw//3, 30), 1))
+        horiz = cv2.morphologyEx(binary, cv2.MORPH_OPEN, horiz_k)
+        lines = cv2.HoughLinesP(horiz, 1, np.pi/180, threshold=40,
+                                minLineLength=cw//4, maxLineGap=20)
+        nlines = 0 if lines is None else len(lines)
+        ys = []
+        if lines is not None:
+            for l in lines:
+                x1,y1,x2,y2 = l[0]
+                if abs(y2-y1) < max(5, abs(x2-x1)*0.05):
+                    ys.append((y1+y2)/2)
+        ys.sort()
+        clusters = []
+        for y in ys:
+            if not clusters or y - clusters[-1] > ch * 0.02:
+                clusters.append(y)
+        print(f"  → 크롭크기: {cw}x{ch}, 라인수: {nlines}, "
+              f"수평ys: {len(ys)}, 클러스터: {len(clusters)}")
+
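+        Path("output/raw_dump").mkdir(parents=True, exist_ok=True)  # debug dump dir; cv2.imwrite returns False if it is missing
+        cv2.imwrite(f"output/raw_dump/v2_diag_{idx}.png", crop)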
+
+cap.release()
diff --git a/docs/devlog/2026-03-25.md b/docs/devlog/2026-03-25.md
new file mode 100644
index 0000000..da69312
--- /dev/null
+++ b/docs/devlog/2026-03-25.md
@@ -0,0 +1,7 @@
+# Devlog — 2026-03-25
+
+| # | Time | Task | Commit | Status |
+|---|------|------|--------|--------|
+| 1 | 00:00~01:00 | HSV trim + pHash cluster dedup (v3 refinement) | `pending` | ✅ |
+| 2 | 01:00~01:30 | Panorama stitching: template-matching scroll offset + consecutive-frame compositing | `pending` | ✅ |
+| 3 | 12:00~21:50 | 1080p download + dHash 32×32 + OOM guard (MAX_FRAME_WIDTH=1280) | `pending` | 🔧 |
diff --git a/docs/devlog/entries/20260325-001.md b/docs/devlog/entries/20260325-001.md
new file mode 100644
index 0000000..b88c0da
--- /dev/null
+++ b/docs/devlog/entries/20260325-001.md
@@ -0,0 +1,34 @@
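+# Pipeline v3→v4: Dedup + panorama + 1080p improvements
+
+- **Time**: 2026-03-25 00:00~21:50
+- **Commit**: `pending`
+- **Vikunja**: new task to be created
+
+## Work done
+
+### v3 refinement (HSV + trim)
+- `_trim_to_content`: trims the Tab area precisely by analyzing the per-row white ratio (30~97%) in HSV color space
+- Two-tier HSV mask (pure white + bright pastel) → accurately excludes yellow highlights and colored backgrounds
+- Upscaling to 960 px width for detection improves Tab line recognition
+
+### v4 dedup (pHash + panorama)
+- `_dhash` + `_dedup_by_hash`: pHash clustering removes repeated practice sections
+- `_detect_scroll_offset`: measures the horizontal scroll offset by template matching (right 60%)
+- `_stitch_scroll_segment` + `_merge_scroll_candidates`: composites consecutive scroll frames into a panorama
+- Dedup pipeline: MSE → panorama → pHash (after HSV strip detection)
+
+### Move to 1080p
+- Changed the yt-dlp format to prefer 1080p (fixes the earlier problem where the 720p request fell back to 360p)
+- The MAX_FRAME_WIDTH=1280 cap prevents OOM on 4K sources
+- dHash 16×16→32×32 (256→1024 bits), max_hamming 50→20
+- test_pipeline.py: added YouTube URLs, a --download mode, and gc.collect()
+
+## Decisions
+- **MSE → panorama → pHash order**: MSE removes exact duplicates, panorama merges overlapping frames, and pHash removes repeated sections. Each stage handles a different kind of duplicate
+- **dHash 32×32**: 16×16 cannot distinguish the 6-line Tab structure (all hashes look alike). 32×32 differentiates down to measure numbers and note positions
+- **MAX_FRAME_WIDTH=1280**: 1920 triggers OOM, and 960 is barely better than 360p. 1280 gives sufficient quality from a 1080p source while saving RAM
+
+## Unfinished
+- **Full 1080p pipeline test not completed**: running 3 videos in sequence hung on memory, so the results are unverified
+- User feedback: "duplicates still remain" → additional verification logic based on measure numbers is needed
+- Full integration of test_pipeline.py ↔ youtube_tab_to_pdf.py is undecided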
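
Reviewer note on the dHash decision in this devlog entry. The sketch below shows a difference hash, a Hamming distance, and a greedy dedup pass in plain Python, mainly to make the "tune hash_size and max_hamming together" point concrete. It assumes the grayscale grid has already been resized to N rows by N+1 columns; the greedy clustering rule and all function names here are assumptions, not the patch's `_dhash` or `_dedup_by_hash`.

```python
from typing import List, Sequence


def dhash_bits(gray: Sequence[Sequence[int]]) -> int:
    """Difference hash: one bit per pixel, set when it is brighter than its right neighbour."""
    value = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            value = (value << 1) | (1 if left > right else 0)
    return value


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def dedup_by_hash(hashes: Sequence[int], max_hamming: int) -> List[int]:
    """Keep index i only if its hash is farther than max_hamming from every kept hash."""
    kept: List[int] = []
    for i, h in enumerate(hashes):
        if all(hamming(h, hashes[k]) > max_hamming for k in kept):
            kept.append(i)
    return kept


def hamming_ratio(max_hamming: int, hash_size: int) -> float:
    """Threshold as a fraction of the total hash bits (hash_size x hash_size)."""
    return max_hamming / (hash_size * hash_size)


if __name__ == "__main__":
    print(f"16x16, max_hamming=50: {hamming_ratio(50, 16):.1%} of bits may differ")
    print(f"32x32, max_hamming=20: {hamming_ratio(20, 32):.1%} of bits may differ")
```

Expressed this way, the old setting tolerated about 19.5% differing bits and the new one about 2.0%, so the change tightened the match criterion roughly tenfold in addition to raising the hash resolution.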
diff --git a/dump_frames.py b/dump_frames.py
new file mode 100644
index 0000000..1e51e06
--- /dev/null
+++ b/dump_frames.py
@@ -0,0 +1,33 @@
+"""Dump raw frames: extract 5 evenly spaced frames from each video."""
+import sys
+if sys.platform == "win32":
+    sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+import cv2
+import numpy as np
+from pathlib import Path
+
+output = Path("output")
+dump_dir = output / "raw_dump"
+dump_dir.mkdir(parents=True, exist_ok=True)
+
+mp4s = sorted(output.glob("*.mp4"))
+for vi, mp4 in enumerate(mp4s):
+    cap = cv2.VideoCapture(str(mp4))
+    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+    fps = cap.get(cv2.CAP_PROP_FPS)
+    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+    print(f"Video {vi+1}: {mp4.name[:30]}... ({w}x{h}, {fps:.0f}fps, {total} frames)")
+
+    # 5 frames at even intervals (10% to 90% of the video)
+    indices = np.linspace(total * 0.1, total * 0.9, 5, dtype=int)
+    for i, idx in enumerate(indices):
+        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
+        ret, frame = cap.read()
+        if ret:
+            path = dump_dir / f"v{vi+1}_raw_{i}.png"
+            cv2.imwrite(str(path), frame)
+            print(f"  frame {idx} → {path.name} ({frame.shape})")
+    cap.release()
+
+print(f"\n덤프 완료: {dump_dir}")
diff --git a/test_pipeline.py b/test_pipeline.py
new file mode 100644
index 0000000..a3df904
--- /dev/null
+++ b/test_pipeline.py
@@ -0,0 +1,110 @@
+#!/usr/bin/env python3
+"""Test the pipeline on locally cached mp4 files (skips the download).
+1080p download mode: python test_pipeline.py --download
+"""
+import sys
+import os
+from pathlib import Path
+import importlib.util
+import argparse
+import gc
+
+# Import the youtube_tab_to_pdf module by file path
+spec = importlib.util.spec_from_file_location(
+    "pipeline", str(Path(__file__).parent / "youtube_tab_to_pdf.py"))
+pipeline = importlib.util.module_from_spec(spec)
+spec.loader.exec_module(pipeline)
+
+# YouTube URLs used for testing
+TEST_URLS = {
+    "video_1": "https://www.youtube.com/watch?v=x76IMSvWR0o",  # 晴る
+    "video_2": "https://www.youtube.com/watch?v=90BWvJY6KbE",  # 新宝島
+    "video_3": "https://www.youtube.com/watch?v=Ri9g4lwnrJQ",  # 空奏列車
+}
+
+
+def test_video(mp4_path: Path, label: str):
+    """Test a single video using the local file directly, without downloading."""
+    print(f"\n{'='*60}")
+    print(f"테스트: {label}")
+    print(f"파일: {mp4_path.name}")
+    print(f"{'='*60}")
+
+    output_dir = Path("output")
+    debug_dir = output_dir / "debug_frames" / label
+    debug_dir.mkdir(parents=True, exist_ok=True)
+
+    # Step 2: extract frames
+    frames = pipeline.extract_frames(mp4_path)
+
+    # Step 3: detect the pattern
+    pattern = pipeline.detect_pattern(frames)
+
+    # Step 4: extract unique frames
+    if pattern == "scroll":
+        unique = pipeline.extract_unique_scroll(frames)
+    elif pattern == "split":
+        unique = pipeline.extract_unique_split(frames)
+    else:
+        unique = pipeline.extract_unique_overlay(frames)
+
+    # Step 5: generate the PDF
+    pdf_path = output_dir / f"test_{label}.pdf"
+    pipeline.generate_pdf(unique, pdf_path, debug_dir=debug_dir)
+
+    print(f"\n결과: {pattern} / {len(unique)}개 고유 프레임")
+    return pattern, len(unique)
+
+
+def download_test_videos():
+    """Download the test videos at 1080p."""
+    output_dir = Path("output")
+    output_dir.mkdir(exist_ok=True)
+
+    for label, url in TEST_URLS.items():
+        print(f"\n--- {label} 다운로드 ---")
+        try:
+            video_path, title = pipeline.download_video(url, output_dir)
+            print(f"  → 완료: {video_path.name}")
+        except Exception as e:
+            print(f"  → 실패: {e}")
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--download", action="store_true",
+                        help="1080p로 테스트 영상 다운로드")
+    args = parser.parse_args()
+
+    if args.download:
+        download_test_videos()
+        return
+
+    output_dir = Path("output")
+    mp4_files = sorted(output_dir.glob("*.mp4"))
+
+    if not mp4_files:
+        print("output/ 폴더에 mp4 파일이 없습니다!")
+        print("  → python test_pipeline.py --download 로 영상 다운로드")
+        sys.exit(1)
+
+    print(f"캐시된 영상 {len(mp4_files)}개 발견:")
+    for f in mp4_files:
+        print(f"  - {f.name} ({f.stat().st_size / 1024 / 1024:.1f} MB)")
+
+    results = {}
+    for i, mp4 in enumerate(mp4_files):
+        label = f"video_{i+1}"
+        pattern, count = test_video(mp4, label)
+        results[label] = (mp4.name, pattern, count)
+        gc.collect()  # release 1080p frame memory
+
+    print(f"\n{'='*60}")
+    print("전체 결과 요약:")
+    print(f"{'='*60}")
+    for label, (name, pattern, count) in results.items():
+        print(f"  {label}: {pattern:8s} → {count:4d}개 프레임 | {name[:40]}")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/youtube_tab_to_pdf.py b/youtube_tab_to_pdf.py
index 049d5ed..1703c34 100644
--- a/youtube_tab_to_pdf.py
+++ b/youtube_tab_to_pdf.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 """
 YouTube Tab → PDF 캡처 파이프라인
-YouTube 기타 TAB 영상에서 Tab 프레임을 추출하여 깔끔한 PDF로 만듭니다.
+YouTube 기타 TAB 영상에서 Tab 프레임을 추출하여 깔끔한 A4 PDF 악보로 만듭니다.
 
 사용법:
     python youtube_tab_to_pdf.py "https://youtu.be/VIDEO_ID"
@@ -14,7 +14,6 @@ import sys
 import subprocess
 import shutil
 import re
-import tempfile
 from pathlib import Path
 from typing import List, Tuple, Optional
 
@@ -22,152 +21,255 @@ import cv2
 import numpy as np
 from PIL import Image
 
-# Windows 콘솔 인코딩 강제 UTF-8
+# Windows 콘솔 인코딩
 if sys.platform == "win32":
     sys.stdout.reconfigure(encoding="utf-8", errors="replace")
     sys.stderr.reconfigure(encoding="utf-8", errors="replace")
 
 
-# ─── Configuration ───────────────────────────────────────────────────────
+# ─── 설정 ─────────────────────────────────────────────────────────────────
+
+DEFAULT_FPS = 2
+SIMILARITY_THRESHOLD = 0.95
+OVERLAY_SIMILARITY_THRESHOLD = 0.55
+
+OVERLAY_MIN_AREA_RATIO = 0.05
+OVERLAY_MAX_AREA_RATIO = 0.6
+MIN_TAB_LINES = 4
+
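+# Max frame width at extraction time (wider frames, e.g. 1920 px wide 1080p, are downscaled to 1280 px to save memory)
+MAX_FRAME_WIDTH = 1280
+# Upscale width used for detection (360p at 640 px → 960 px, 1.5x, so Tab lines get thicker)
+DETECT_WIDTH = 960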
 
-DEFAULT_FPS = 2           # 프레임 추출 빈도 (초당 N프레임)
-DEFAULT_CROP_RATIO = 0.55 # 상단 크롭 비율 (스크롤형)
-SIMILARITY_THRESHOLD = 0.95  # 프레임 유사도 임계값 (SSIM 대신 히스토그램 비교)
-OVERLAY_MIN_AREA_RATIO = 0.05  # 오버레이 박스 최소 면적 비율
-OVERLAY_MAX_AREA_RATIO = 0.6   # 오버레이 박스 최대 면적 비율
-MIN_TAB_LINES = 4              # Tab 악보 최소 수평 라인 수 (6줄 중 4줄 이상)
-SPLIT_TOP_RATIO = 0.42         # 분할 화면 상단 영역 비율 (핸드캠 제외)
 PDF_DPI = 150
-PDF_PAGE_WIDTH_MM = 210   # A4
+PDF_PAGE_WIDTH_MM = 210
+PDF_PAGE_HEIGHT_MM = 297
+PDF_MARGIN_MM = 10
+TAB_GAP_MM = 3
 
 
-# ─── Step 1: Download ────────────────────────────────────────────────────
+# ─── Step 1: 다운로드 ─────────────────────────────────────────────────────
 
 def _find_yt_dlp() -> str:
-    """yt-dlp 실행 파일 경로 찾기"""
     yt_dlp = shutil.which("yt-dlp")
     if yt_dlp:
         return yt_dlp
-    # pip user-installed path (Windows)
     for pyver in ["Python312", "Python311", "Python310"]:
-        user_scripts = Path(os.environ.get("APPDATA", "")) / "Python" / pyver / "Scripts"
-        yt_dlp_path = user_scripts / "yt-dlp.exe"
-        if yt_dlp_path.exists():
-            return str(yt_dlp_path)
-    # conda env Scripts
-    conda_path = Path(sys.executable).parent / "Scripts" / "yt-dlp.exe"
-    if conda_path.exists():
-        return str(conda_path)
-    raise RuntimeError("yt-dlp를 찾을 수 없습니다. pip install yt-dlp를 실행하세요.")
+        p = Path(os.environ.get("APPDATA", "")) / "Python" / pyver / "Scripts" / "yt-dlp.exe"
+        if p.exists():
+            return str(p)
+    p = Path(sys.executable).parent / "Scripts" / "yt-dlp.exe"
+    if p.exists():
+        return str(p)
+    raise RuntimeError("yt-dlp를 찾을 수 없습니다. pip install yt-dlp")
 
 
 def download_video(url: str, output_dir: Path) -> Tuple[Path, str]:
-    """yt-dlp로 YouTube 영상 다운로드. 반환: (파일 경로, 제목)"""
+    """Download the video with yt-dlp, preferring 1080p. Returns (file path, sanitized title)."""
     print("[1/5] 영상 다운로드 중...")
-
     yt_dlp = _find_yt_dlp()
 
-    # 제목 추출 (encoding 안전 처리)
     result = subprocess.run(
         [yt_dlp, "--get-title", "--encoding", "utf-8", url],
         capture_output=True, encoding="utf-8", errors="replace"
     )
     title = (result.stdout or "").strip() or "untitled"
-    # 파일명 안전 문자로 변환
     safe_title = re.sub(r'[\\/:*?"<>|\x00-\x1f]', '_', title)[:80]
-
     video_path = output_dir / f"{safe_title}.mp4"
 
     if video_path.exists():
         print(f"  → 이미 다운로드됨: {video_path.name}")
         return video_path, safe_title
 
+    # Prefer 1080p, fall back to 720p, then to whatever is best
     subprocess.run(
         [yt_dlp,
-         "-f", "best[height<=720][ext=mp4]/best[ext=mp4]/best",
+         "-f", "bestvideo[height>=1080][ext=mp4]+bestaudio[ext=m4a]/"
+               "bestvideo[height>=720][ext=mp4]+bestaudio[ext=m4a]/"
+               "best[height>=720]/best",
+         "--merge-output-format", "mp4",
          "-o", str(video_path), url],
-        encoding="utf-8", errors="replace",
-        check=True
+        encoding="utf-8", errors="replace", check=True
     )
     print(f"  → 다운로드 완료: {video_path.name}")
     return video_path, safe_title
 
 
-# ─── Step 2: Frame Extraction ────────────────────────────────────────────
+# ─── Step 2: 프레임 추출 ──────────────────────────────────────────────────
 
 def extract_frames(video_path: Path, fps: float = DEFAULT_FPS) -> List[np.ndarray]:
-    """OpenCV VideoCapture로 프레임 추출"""
     print(f"[2/5] 프레임 추출 중 (fps={fps})...")
     cap = cv2.VideoCapture(str(video_path))
     if not cap.isOpened():
         raise RuntimeError(f"영상을 열 수 없습니다: {video_path}")
 
     video_fps = cap.get(cv2.CAP_PROP_FPS)
-    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
-    frame_interval = max(1, int(video_fps / fps))
+    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+    interval = max(1, int(video_fps / fps))
+    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+
+    # Downscale frames wider than MAX_FRAME_WIDTH (1080p, 4K) to prevent OOM
+    need_resize = w > MAX_FRAME_WIDTH
+    if need_resize:
+        scale = MAX_FRAME_WIDTH / w
+        target_size = (MAX_FRAME_WIDTH, int(h * scale))
+        print(f"  → {w}x{h} → {target_size[0]}x{target_size[1]} 다운스케일")
 
     frames = []
-    frame_idx = 0
+    idx = 0
     while True:
         ret, frame = cap.read()
         if not ret:
             break
-        if frame_idx % frame_interval == 0:
+        if idx % interval == 0:
+            if need_resize:
+                frame = cv2.resize(frame, target_size, interpolation=cv2.INTER_AREA)
             frames.append(frame)
-        frame_idx += 1
+        idx += 1
 
     cap.release()
-    print(f"  → {len(frames)}개 프레임 추출 (전체 {total_frames}프레임, 원본 {video_fps:.1f}fps)")
+    print(f"  → {len(frames)}개 프레임 추출 ({w}x{h}, 원본 {video_fps:.0f}fps)")
     return frames
 
 
-# ─── Step 3: Pattern Detection ───────────────────────────────────────────
+# ─── 핵심: 흰색 배경 Tab 영역 검출 ───────────────────────────────────────
 
-def _has_tab_lines(region: np.ndarray, min_lines: int = MIN_TAB_LINES) -> bool:
-    """영역 내에 Tab 악보 수평 라인(기타 6줄)이 있는지 확인"""
+def _find_white_tab_strip(frame: np.ndarray, min_strip_ratio: float = 0.10) -> Optional[Tuple[int, int]]:
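+    """Return the Y range (top, bottom) of the white-background Tab strip in a frame.
+
+    Strategy: in HSV space, find rows that are bright and low in saturation (pure white: V>180, S<40),
+    and treat a contiguous white run covering at least min_strip_ratio of the height as the Tab area.
+    More accurate than grayscale alone at excluding yellow highlights and colored backgrounds.
+    """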
+    h, w = frame.shape[:2]
+    margin_x = int(w * 0.1)
+
+    # Convert to HSV so saturation (S) and value (V) can be used together
+    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
+    _, s_ch, v_ch = cv2.split(hsv)
+
+    roi_v = v_ch[:, margin_x:w - margin_x]
+    roi_s = s_ch[:, margin_x:w - margin_x]
+
+    # Two-tier white mask:
+    #   1) pure white: V > 180, S < 40 (Tab background)
+    #   2) bright pastel: V > 200, S < 100 (yellow/green highlight boxes)
+    pure_white = (roi_v > 180) & (roi_s < 40)
+    bright_pastel = (roi_v > 200) & (roi_s < 100)
+    tab_mask = pure_white | bright_pastel
+
+    # Fraction of Tab-like pixels in each row
+    row_tab_ratio = np.mean(tab_mask, axis=1)
+    bright_mask = row_tab_ratio > 0.5  # more than 50% of the row is Tab-like
+
+    # Find runs of consecutive white rows
+    regions = []
+    start = None
+    for i in range(h):
+        if bright_mask[i]:
+            if start is None:
+                start = i
+        else:
+            if start is not None:
+                length = i - start
+                if length >= h * min_strip_ratio:
+                    regions.append((start, i))
+                start = None
+    if start is not None:
+        length = h - start
+        if length >= h * min_strip_ratio:
+            regions.append((start, h))
+
+    if not regions:
+        return None
+
+    # Return the tallest white strip
+    best = max(regions, key=lambda r: r[1] - r[0])
+
+    # Add a little padding on both sides (avoids clipping the bottom edge)
+    pad = int(h * 0.03)
+    top = max(0, best[0] - pad)
+    bottom = min(h, best[1] + pad)
+
+    return (top, bottom)
+
+
+def _trim_to_content(crop: np.ndarray, margin_px: int = 6) -> np.ndarray:
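+    """Precisely trim a loosely cropped Tab image down to the Tab content area.
+
+    Strategy: compute each row's white-background ratio in HSV.
+    - Tab area: 30~97% white (white background + Tab lines/numbers)
+    - Guitar footage: < 20% white (dark background)
+    - Pure margin: > 97% white
+    This removes both the guitar footage and the empty margins at the top and bottom."""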
+    h, w = crop.shape[:2]
+    if h < 15 or w < 50:
+        return crop
+
+    hsv = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
+    _, s_ch, v_ch = cv2.split(hsv)
+
+    # Per-row ratio of white / bright pastel pixels (Tab background detection)
+    white_mask = ((v_ch > 180) & (s_ch < 40)) | ((v_ch > 200) & (s_ch < 100))
+    row_white = np.mean(white_mask, axis=1)
+
+    # Tab row = 30~97% white (lines/numbers on a white background)
+    tab_rows = (row_white > 0.30) & (row_white < 0.97)
+
+    # Check that content exists (more than 2% dark pixels in the row)
+    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY) if len(crop.shape) == 3 else crop
+    row_dark = np.mean(gray < 180, axis=1)
+    content_rows = row_dark > 0.02
+
+    # Tab row OR content row
+    valid_rows = tab_rows | content_rows
+
+    # Top: first valid row
+    top = 0
+    for i in range(h):
+        if valid_rows[i] and row_white[i] > 0.20:
+            top = max(0, i - margin_px)
+            break
+
+    # Bottom: last valid row
+    bottom = h
+    for i in range(h - 1, -1, -1):
+        if valid_rows[i] and row_white[i] > 0.20:
+            bottom = min(h, i + margin_px)
+            break
+
+    if bottom - top < 15:
+        return crop
+
+    return crop[top:bottom, :]
+
+
+def _has_tab_content(region: np.ndarray) -> bool:
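+    """Verify that a white region actually contains Tab content.
+    Method: check the ratio of dark pixels (Tab lines, numbers, chord names) on the white background.
+    A Tab region is accepted when 2~30% of its pixels are dark."""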
     if region is None or region.size == 0:
         return False
 
     gray = cv2.cvtColor(region, cv2.COLOR_BGR2GRAY) if len(region.shape) == 3 else region
     h, w = gray.shape
-    if h < 20 or w < 50:
+    if h < 15 or w < 50:
         return False
 
-    # 이진화 (밝은 배경 + 어두운 라인)
-    _, binary = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY_INV)
+    # Ratio of dark pixels (< 180 = lines, numbers, chord names, etc.)
+    dark_pixels = np.sum(gray < 180)
+    dark_ratio = dark_pixels / gray.size
 
-    # 수평 라인 강조: 가로 커널 모폴로지
-    horiz_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (max(w // 4, 30), 1))
-    horiz = cv2.morphologyEx(binary, cv2.MORPH_OPEN, horiz_kernel)
-
-    # HoughLinesP로 수평 라인 검출
-    lines = cv2.HoughLinesP(horiz, 1, np.pi / 180, threshold=50,
-                            minLineLength=w // 3, maxLineGap=20)
-    if lines is None:
-        return False
-
-    # 거의 수평인 라인만 필터 (각도 < 5도)
-    horizontal_ys = []
-    for line in lines:
-        x1, y1, x2, y2 = line[0]
-        if abs(y2 - y1) < max(5, abs(x2 - x1) * 0.087):  # ~5도
-            horizontal_ys.append((y1 + y2) / 2)
-
-    if len(horizontal_ys) < min_lines:
-        return False
-
-    # Y좌표 클러스터링: 가까운 라인을 하나로 묶기 (6줄 그룹 검출)
-    horizontal_ys.sort()
-    clusters = []
-    for y in horizontal_ys:
-        if not clusters or y - clusters[-1] > h * 0.02:  # 2% 거리 이상이면 새 클러스터
-            clusters.append(y)
-
-    return len(clusters) >= min_lines
+    # Tab region: 2~30% dark content (a pure white background falls below this range, guitar footage above it)
+    return 0.02 < dark_ratio < 0.30
 
 
-def _detect_white_region(frame: np.ndarray) -> Optional[Tuple[int, int, int, int]]:
-    """흰색 사각형 영역 검출 (Tab 여부 무관). 반환: (x, y, w, h) or None"""
+# ─── Step 3: 패턴 감지 ────────────────────────────────────────────────────
+
+def _detect_tab_overlay(frame: np.ndarray) -> Optional[Tuple[int, int, int, int]]:
+    """Detect a white overlay box that contains Tab. Returns (x, y, w, h) or None."""
     gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
     h, w = gray.shape
 
@@ -184,75 +286,22 @@ def _detect_white_region(frame: np.ndarray) -> Optional[Tuple[int, int, int, int
         x, y, cw, ch = cv2.boundingRect(cnt)
         area = cw * ch
         ratio = area / total_area
-
+        # Overlay = standalone box narrower than 85% of the frame width (a full-width strip counts as scroll)
+        width_ratio = cw / w
         if (OVERLAY_MIN_AREA_RATIO < ratio < OVERLAY_MAX_AREA_RATIO
-                and cw > ch * 0.5
-                and area > best_area):
-            best = (x, y, cw, ch)
-            best_area = area
+                and width_ratio < 0.85
+                and cw > ch * 0.5 and area > best_area):
+            # Verify that the box holds Tab content
+            region = frame[y:y + ch, x:x + cw]
+            if _has_tab_content(region):
+                best = (x, y, cw, ch)
+                best_area = area
 
     return best
 
 
-def _detect_tab_overlay(frame: np.ndarray) -> Optional[Tuple[int, int, int, int]]:
-    """Tab 악보가 포함된 흰색 오버레이 박스 검출. 반환: (x, y, w, h) or None"""
-    bbox = _detect_white_region(frame)
-    if bbox is None:
-        return None
-
-    x, y, w, h = bbox
-    region = frame[y:y + h, x:x + w]
-
-    # Tab 수평 라인이 있는 경우에만 반환
-    if _has_tab_lines(region, min_lines=3):
-        return bbox
-    return None
-
-
-def _detect_split_screen(frames: List[np.ndarray], sample_count: int = 10) -> bool:
-    """분할 화면 감지: 상단이 밝은 Tab 용지, 하단이 어두운 핸드캠인지 확인
-
-    엄격한 기준:
-    - 상단 평균 밝기 > 180 (Tab 용지는 거의 흰색)
-    - 하단 평균 밝기 < 100 (핸드캠은 일반적으로 어두움)
-    - 밝기 차이 > 80
-    - 상단에 Tab 수평 라인이 4개 이상 존재
-    """
-    DETECT_SPLIT = 0.5  # 감지용 분할 비율
-
-    if len(frames) < sample_count:
-        sample_count = len(frames)
-
-    indices = np.linspace(0, len(frames) - 1, sample_count, dtype=int)
-    split_count = 0
-
-    for idx in indices:
-        frame = frames[idx]
-        fh, fw = frame.shape[:2]
-        top_half = frame[0:int(fh * DETECT_SPLIT), :]
-        bottom_half = frame[int(fh * DETECT_SPLIT):, :]
-
-        top_brightness = np.mean(cv2.cvtColor(top_half, cv2.COLOR_BGR2GRAY))
-        bottom_brightness = np.mean(cv2.cvtColor(bottom_half, cv2.COLOR_BGR2GRAY))
-
-        # 엄격한 밝기 기준: Tab 용지(>180) + 어두운 핸드캠(<100) + 큰 차이(>80)
-        if (top_brightness > 180 and bottom_brightness < 100
-                and top_brightness - bottom_brightness > 80
-                and _has_tab_lines(top_half, min_lines=4)):
-            split_count += 1
-
-    ratio = split_count / sample_count
-    return ratio > 0.3
-
-
 def detect_pattern(frames: List[np.ndarray], sample_count: int = 20) -> str:
-    """영상 패턴 감지: 'scroll', 'overlay', 또는 'split'
-
-    감지 순서:
-    1. overlay — Tab 오버레이 박스가 가장 구체적이므로 최우선
-    2. split — 상단 Tab 용지 + 하단 핸드캠 = 엄격한 밝기 기준
-    3. scroll — 기본 (상단 크롭)
-    """
+    """Detect the video pattern: scroll (checked first) vs. overlay."""
     print("[3/5] 영상 패턴 분석 중...")
 
     if len(frames) < sample_count:
@@ -261,366 +310,443 @@ def detect_pattern(frames: List[np.ndarray], sample_count: int = 20) -> str:
     indices = np.linspace(0, len(frames) - 1, sample_count, dtype=int)
     sample_frames = [frames[i] for i in indices]
 
-    # 1) 오버레이 검출 먼저 — Tab 라인이 있는 흰 박스 (가장 구체적)
-    overlay_count = 0
-    for frame in sample_frames:
-        if _detect_tab_overlay(frame) is not None:
-            overlay_count += 1
+    # 1) Detect a white Tab strip (scroll); this check runs first
+    tab_top_count = 0
+    tab_bottom_count = 0
+    for f in sample_frames:
+        strip = _find_white_tab_strip(f)
+        if strip is not None:
+            top, bottom = strip
+            h = f.shape[0]
+            mid = (top + bottom) / 2
+            if mid < h * 0.5:
+                tab_top_count += 1
+            else:
+                tab_bottom_count += 1
 
+    tab_count = tab_top_count + tab_bottom_count
+    tab_ratio = tab_count / sample_count
+
+    # White strip in at least 60% of the samples → scroll
+    if tab_ratio >= 0.6:
+        position = "상단" if tab_top_count > tab_bottom_count else "하단"
+        print(f"  → 패턴: scroll (Tab {position}, 감지율: {tab_ratio:.0%})")
+        return "scroll"
+
+    # 2) Strip detection rate is low, so check for an overlay
+    overlay_count = sum(1 for f in sample_frames if _detect_tab_overlay(f) is not None)
     overlay_ratio = overlay_count / sample_count
-    if overlay_ratio > 0.3:
-        print(f"  → 패턴: overlay (Tab 오버레이 감지율: {overlay_ratio:.0%})")
+    if overlay_ratio > 0.2:
+        print(f"  → 패턴: overlay (감지율: {overlay_ratio:.0%})")
         return "overlay"
 
-    # 2) 분할 화면(split) 검출 — 상단 Tab 용지 + 하단 핸드캠
-    if _detect_split_screen(frames, sample_count):
-        print("  → 패턴: split (상단 Tab + 하단 핸드캠)")
-        return "split"
-
-    # 3) 기본: 스크롤형
-    print(f"  → 패턴: scroll (오버레이 감지율: {overlay_ratio:.0%})")
+    # 3) Neither matched: default to scroll
+    position = "상단" if tab_top_count > tab_bottom_count else "하단"
+    print(f"  → 패턴: scroll (fallback, Tab {position}, 감지율: {tab_ratio:.0%})")
     return "scroll"
 
 
-# ─── Step 4: Extract Unique Tab Frames ────────────────────────────────────
+# ─── Step 4: Extract unique Tab frames ───────────────────────────────────
 
 def compare_frames(frame1: np.ndarray, frame2: np.ndarray) -> float:
-    """두 프레임의 유사도 비교 (0~1, 1=동일).
-
-    픽셀 수준 정규화 상호상관(NCC) 사용 — 히스토그램 방식보다
-    Tab 내용 변화(프렛 번호, 마디 위치 등)를 정확히 감지.
-    """
-    # 그레이스케일 변환
+    """MSE-based similarity (0-1, 1 = identical)."""
     g1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY) if len(frame1.shape) == 3 else frame1
     g2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY) if len(frame2.shape) == 3 else frame2
 
-    # 크기 맞추기
     if g1.shape != g2.shape:
         g2 = cv2.resize(g2, (g1.shape[1], g1.shape[0]))
 
-    # 표준화된 크기로 축소 (속도 + 노이즈 감소)
-    target_w = 320
+    target_w = 480
     if g1.shape[1] > target_w:
         scale = target_w / g1.shape[1]
-        new_size = (target_w, int(g1.shape[0] * scale))
-        g1 = cv2.resize(g1, new_size)
-        g2 = cv2.resize(g2, new_size)
+        sz = (target_w, int(g1.shape[0] * scale))
+        g1 = cv2.resize(g1, sz)
+        g2 = cv2.resize(g2, sz)
 
-    # 정규화 상호상관 (NCC): 픽셀 수준 비교
-    # MSE 기반: 0=동일, 높을수록 다름 → 유사도로 변환
-    g1_f = g1.astype(np.float32) / 255.0
-    g2_f = g2.astype(np.float32) / 255.0
-    mse = np.mean((g1_f - g2_f) ** 2)
+    mse = np.mean(((g1.astype(np.float32) - g2.astype(np.float32)) / 255.0) ** 2)
+    return max(0.0, 1.0 - min(mse * 8.0, 1.0))
 
-    # MSE → 유사도 변환 (0~1, 1=동일)
-    # factor 8: MSE 0.005→sim 0.96, MSE 0.06→sim 0.52, MSE 0.13+→sim 0.0
-    similarity = 1.0 - min(mse * 8.0, 1.0)
-    return max(0.0, similarity)
+
+def _dhash(image: np.ndarray, hash_size: int = 32) -> np.ndarray:
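+    """Difference Hash: a structure-based hash (32×32 = 1024 bits).
+    Records the brightness difference between horizontally adjacent pixels, giving a fingerprint robust to small shifts.
+    Growing the hash from 16 to 32 lets it distinguish measure numbers and note positions."""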
+    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) if len(image.shape) == 3 else image
+    resized = cv2.resize(gray, (hash_size + 1, hash_size), interpolation=cv2.INTER_AREA)
+    return (resized[:, 1:] > resized[:, :-1]).flatten()
+
+
+def _dedup_by_hash(frames: List[np.ndarray],
+                   max_hamming: int = 20) -> List[np.ndarray]:
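+    """Hash-based cluster dedup (uses the dHash from _dhash, not a DCT-based pHash).
+    Groups similar frames and keeps only the sharpest one (highest Laplacian variance) per group.
+    → Removes both scroll duplicates and repeated practice sections."""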
+    if not frames:
+        return []
+
+    hashes = [_dhash(f) for f in frames]
+    n = len(frames)
+    used = [False] * n
+    clusters = []
+
+    for i in range(n):
+        if used[i]:
+            continue
+        cluster = [i]
+        used[i] = True
+        for j in range(i + 1, n):
+            if used[j]:
+                continue
+            dist = int(np.sum(hashes[i] != hashes[j]))
+            if dist <= max_hamming:
+                cluster.append(j)
+                used[j] = True
+        clusters.append(cluster)
+
+    # Pick the sharpest frame in each cluster
+    result = []
+    for cluster in clusters:
+        best_idx = max(cluster, key=lambda idx: cv2.Laplacian(
+            cv2.cvtColor(frames[idx], cv2.COLOR_BGR2GRAY)
+            if len(frames[idx].shape) == 3 else frames[idx],
+            cv2.CV_64F).var())
+        result.append(frames[best_idx])
+
+    return result
+
+
+def _detect_scroll_offset(frame_a: np.ndarray, frame_b: np.ndarray,
+                          template_ratio: float = 0.6,
+                          min_confidence: float = 0.75) -> Tuple[int, float]:
+    """Detect the horizontal scroll offset between two frames.
+    Searches frame_b for the right-hand template_ratio portion of frame_a.
+    Returns: (scroll_px, confidence). scroll_px > 0 means the content scrolled left."""
+    ga = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY) if len(frame_a.shape) == 3 else frame_a
+    gb = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY) if len(frame_b.shape) == 3 else frame_b
+
+    # Match heights
+    if ga.shape[0] != gb.shape[0]:
+        target_h = min(ga.shape[0], gb.shape[0])
+        ga = ga[:target_h, :]
+        gb = gb[:target_h, :]
+
+    h, w = ga.shape
+    template_w = int(w * template_ratio)
+    if template_w < 20 or template_w >= w or template_w > gb.shape[1]:  # template must fit inside frame_b
+        return (0, 0.0)
+
+    template = ga[:, w - template_w:]
+    result = cv2.matchTemplate(gb, template, cv2.TM_CCOEFF_NORMED)
+    _, max_val, _, max_loc = cv2.minMaxLoc(result)
+
+    scroll_px = (w - template_w) - max_loc[0]
+    if max_val < min_confidence or scroll_px <= 0:
+        return (0, max_val)
+
+    return (scroll_px, max_val)
+
+
+def _stitch_scroll_segment(segment: List[np.ndarray]) -> np.ndarray:
+    """Stitch consecutive scroll frames into a single panorama.
+    Uses template matching to drop the overlapping region and append only the new region."""
+    if len(segment) == 1:
+        return segment[0]
+
+    # Determine the common height
+    min_h = min(f.shape[0] for f in segment)
+    panorama = segment[0][:min_h, :]
+
+    for i in range(1, len(segment)):
+        curr = segment[i][:min_h, :]
+        scroll_px, conf = _detect_scroll_offset(segment[i-1][:min_h, :], curr)
+
+        if scroll_px > 0 and conf > 0.7:
+            # Append only the new region (the rightmost scroll_px pixels)
+            new_strip = curr[:, curr.shape[1] - scroll_px:]
+            panorama = np.hstack([panorama, new_strip])
+        else:
+            # Offset detection failed → append the whole frame (safe fallback)
+            panorama = np.hstack([panorama, curr])
+
+    return panorama
+
+
+def _merge_scroll_candidates(candidates: List[np.ndarray],
+                             min_scroll: int = 5,
+                             min_segment_len: int = 2) -> List[np.ndarray]:
+    """Group candidate frames by whether they are connected by scrolling.
+    Consecutive scroll runs are stitched into a panorama; the rest are kept as-is."""
+    if len(candidates) <= 1:
+        return candidates
+
+    # Measure the scroll offset between consecutive frames
+    offsets = []
+    for i in range(len(candidates) - 1):
+        scroll_px, conf = _detect_scroll_offset(candidates[i], candidates[i+1])
+        offsets.append((scroll_px, conf))
+
+    # Split into consecutive scroll runs
+    result = []
+    segment_start = 0
+    i = 0
+
+    while i < len(candidates):
+        # Check whether this frame is scroll-connected to the next one
+        if i < len(offsets) and offsets[i][0] >= min_scroll and offsets[i][1] > 0.7:
+            # A scroll run starts here: find where it ends
+            seg_end = i + 1
+            while seg_end < len(offsets) and offsets[seg_end][0] >= min_scroll and offsets[seg_end][1] > 0.7:
+                seg_end += 1
+            seg_end += 1  # include the last frame
+
+            segment = candidates[i:seg_end]
+            if len(segment) >= min_segment_len:
+                # Stitch into a panorama
+                panorama = _stitch_scroll_segment(segment)
+                result.append(panorama)
+            else:
+                result.extend(segment)
+
+            i = seg_end
+        else:
+            result.append(candidates[i])
+            i += 1
+
+    return result
 
 
 def extract_unique_scroll(frames: List[np.ndarray],
-                          crop_ratio: float = DEFAULT_CROP_RATIO,
                           threshold: float = SIMILARITY_THRESHOLD) -> List[np.ndarray]:
-    """스크롤형: 상단 크롭 후 중복 제거"""
-    print("[4/5] 스크롤형 Tab 프레임 추출 중...")
+    """Scroll type: upscale + HSV + median voting + trim + MSE → panorama → hash dedup (dHash)."""
+    print(f"[4/5] 스크롤형 Tab 추출 중 (threshold={threshold})...")
 
-    unique = []
-    prev_crop = None
+    # ── Phase 1: collect strip positions across all frames (median voting) ──
+    strip_tops = []
+    strip_bottoms = []
 
-    for i, frame in enumerate(frames):
-        h, w = frame.shape[:2]
-        crop = frame[0:int(h * crop_ratio), :]
+    for frame in frames:
+        orig_h, orig_w = frame.shape[:2]
+        if orig_w < DETECT_WIDTH:
+            scale = DETECT_WIDTH / orig_w
+            upscaled = cv2.resize(frame, (DETECT_WIDTH, int(orig_h * scale)),
+                                  interpolation=cv2.INTER_LANCZOS4)
+        else:
+            upscaled = frame
+            scale = 1.0
 
-        if prev_crop is None:
-            unique.append(crop)
-            prev_crop = crop
+        strip = _find_white_tab_strip(upscaled)
+        if strip is not None:
+            up_top, up_bottom = strip
+            strip_tops.append(int(up_top / scale))
+            strip_bottoms.append(int(up_bottom / scale))
+
+    if not strip_tops:
+        print("  → 흰색 스트립 미감지")
+        return []
+
+    median_top = int(np.median(strip_tops))
+    median_bottom = int(np.median(strip_bottoms))
+    print(f"  → 크롭 영역: y={median_top}~{median_bottom} "
+          f"(median of {len(strip_tops)} strips)")
+
+    # ── Phase 2: crop + trim + first-pass MSE filter ──
+    candidates = []
+    all_compared = []
+
+    for frame in frames:
+        h = frame.shape[0]
+        top = max(0, median_top)
+        bottom = min(h, median_bottom)
+        tab_crop = frame[top:bottom, :]
+
+        if not _has_tab_content(tab_crop):
             continue
 
-        sim = compare_frames(crop, prev_crop)
-        if sim < threshold:
-            unique.append(crop)
-            prev_crop = crop
+        tab_crop = _trim_to_content(tab_crop)
 
-    print(f"  → {len(unique)}개 고유 프레임 선별 (임계값: {threshold})")
+        compare_img = cv2.resize(tab_crop, (480, 120), interpolation=cv2.INTER_AREA)
+
+        is_dup = False
+        for ref in all_compared:
+            if compare_frames(compare_img, ref) >= threshold:
+                is_dup = True
+                break
+
+        if not is_dup:
+            candidates.append(tab_crop)
+            all_compared.append(compare_img)
+
+    print(f"  → MSE 1차: {len(candidates)}개 후보")
+
+    # ── Phase 2.5: panorama stitching (removes scroll overlap) ──
+    stitched = _merge_scroll_candidates(candidates)
+    if len(stitched) != len(candidates):
+        print(f"  → 파노라마: {len(candidates)}개 → {len(stitched)}개 (스크롤 합성)")
+
+    # ── Phase 3: second-pass hash cluster dedup (dHash) ──
+    unique = _dedup_by_hash(stitched, max_hamming=50)
+    print(f"  → pHash 2차: {len(unique)}개 고유 Tab 프레임")
     return unique
 
 
-def _normalize_overlay(crop: np.ndarray, target_w: int = 320,
-                        target_h: int = 120) -> np.ndarray:
-    """오버레이 크롭을 고정 크기 흰색 캔버스 위에 배치 (비교 정규화용)"""
-    h, w = crop.shape[:2]
-    scale = min(target_w / w, target_h / h)
-    new_w = int(w * scale)
-    new_h = int(h * scale)
-    resized = cv2.resize(crop, (new_w, new_h))
-
-    # 흰색 캔버스에 중앙 배치
-    canvas = np.full((target_h, target_w, 3), 255, dtype=np.uint8)
-    offset_x = (target_w - new_w) // 2
-    offset_y = (target_h - new_h) // 2
-    canvas[offset_y:offset_y + new_h, offset_x:offset_x + new_w] = resized
-    return canvas
-
-
 def extract_unique_overlay(frames: List[np.ndarray],
-                           threshold: float = SIMILARITY_THRESHOLD) -> List[np.ndarray]:
-    """오버레이형: Tab 라인이 있는 흰 박스 영역 검출 후 중복 제거
-
-    슬라이딩 윈도우 비교: 각 프레임을 최근 N개 고유 프레임과 비교하여
-    점진적 변화 누적(drift)에 의한 중복을 방지.
-    """
-    print("[4/5] 오버레이형 Tab 프레임 추출 중...")
-
-    WINDOW_SIZE = 5  # 최근 5개 고유 프레임과 비교
-    MIN_CROP_H = 40  # 최소 크롭 높이 (너무 작은 검출 제외)
-    MIN_CROP_W = 100 # 최소 크롭 폭
+                           threshold: float = OVERLAY_SIMILARITY_THRESHOLD) -> List[np.ndarray]:
+    """Overlay type: extract Tab overlay boxes and dedup against the full history."""
+    print("[4/5] 오버레이형 Tab 추출 중...")
 
     unique = []
-    recent_normalized = []  # 최근 고유 프레임 정규화 결과
+    all_normalized = []
 
-    for i, frame in enumerate(frames):
+    for frame in frames:
         bbox = _detect_tab_overlay(frame)
         if bbox is None:
             continue
 
         x, y, w, h = bbox
-        # 최소 크기 필터
-        if h < MIN_CROP_H or w < MIN_CROP_W:
+        if h < 40 or w < 100:
             continue
 
-        # 약간의 패딩 추가
         pad = 10
         x = max(0, x - pad)
         y = max(0, y - pad)
         w = min(frame.shape[1] - x, w + 2 * pad)
         h = min(frame.shape[0] - y, h + 2 * pad)
 
-        overlay_crop = frame[y:y + h, x:x + w]
-        normalized = _normalize_overlay(overlay_crop)
+        crop = frame[y:y + h, x:x + w]
 
-        # 최근 N개 고유 프레임과 비교 — 하나라도 유사하면 건너뛰기
-        is_duplicate = False
-        for ref_norm in recent_normalized:
-            sim = compare_frames(normalized, ref_norm)
-            if sim >= threshold:
-                is_duplicate = True
+        # Brightness filter
+        if np.mean(cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)) < 120:
+            continue
+
+        # Normalize
+        normalized = cv2.resize(crop, (480, 180), interpolation=cv2.INTER_AREA)
+        canvas = np.full((180, 480, 3), 255, dtype=np.uint8)
+        canvas[:normalized.shape[0], :normalized.shape[1]] = normalized
+
+        # Compare against the full history
+        is_dup = False
+        for ref in all_normalized:
+            if compare_frames(canvas, ref) >= threshold:
+                is_dup = True
                 break
 
-        if not is_duplicate:
-            unique.append(overlay_crop)
-            recent_normalized.append(normalized)
-            # 윈도우 크기 유지
-            if len(recent_normalized) > WINDOW_SIZE:
-                recent_normalized.pop(0)
+        if not is_dup:
+            unique.append(crop)
+            all_normalized.append(canvas)
 
-    print(f"  → {len(unique)}개 고유 오버레이 프레임 선별")
+    print(f"  → {len(unique)}개 고유 Tab 오버레이")
     return unique
 
 
-def extract_unique_split(frames: List[np.ndarray],
-                         crop_ratio: float = SPLIT_TOP_RATIO,
-                         threshold: float = 0.95) -> List[np.ndarray]:
-    """분할 화면형: 상단 Tab 영역 크롭 후 중복 제거
-
-    MSE 기반 비교에서 동일 프레임은 sim>0.999, 커서만 이동 시 ~0.995.
-    실제 Tab 전환 시 sim 0.60~0.91. threshold=0.95가 적절한 균형점.
-    """
-    print(f"[4/5] 분할 화면형 Tab 프레임 추출 중 (crop={crop_ratio:.0%}, sim={threshold})...")
-
-    unique = []
-    prev_crop = None
-
-    for i, frame in enumerate(frames):
-        h, w = frame.shape[:2]
-        crop = frame[0:int(h * crop_ratio), :]
-
-        # 밝기 필터: 어두운 프레임(인트로/아웃트로) 제외
-        gray_crop = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
-        mean_brightness = np.mean(gray_crop)
-        if mean_brightness < 120:  # 어두운 프레임 건너뛰기
-            continue
-
-        # Tab 라인이 있는 프레임만 선별
-        if not _has_tab_lines(crop, min_lines=3):
-            continue
-
-        if prev_crop is None:
-            unique.append(crop)
-            prev_crop = crop
-            continue
-
-        sim = compare_frames(crop, prev_crop)
-        if sim < threshold:
-            unique.append(crop)
-            prev_crop = crop
-
-    print(f"  → {len(unique)}개 고유 분할화면 프레임 선별")
-    return unique
-
-
-# ─── Step 5: Generate PDF ─────────────────────────────────────────────────
+# ─── Step 5: A4 PDF generation ───────────────────────────────────────────
 
 def generate_pdf(frames: List[np.ndarray], output_path: Path,
                  debug_dir: Optional[Path] = None) -> None:
-    """고유 프레임들을 하나의 PDF로 합성"""
-    print("[5/5] PDF 생성 중...")
-
+    """Lay out Tab frames as multiple rows on A4 pages."""
+    print("[5/5] A4 PDF 생성 중...")
     if not frames:
-        print("  ⚠ 추출된 프레임이 없습니다!")
+        print("  ⚠ 프레임 없음!")
         return
 
-    pil_images = []
+    page_w = int(PDF_PAGE_WIDTH_MM / 25.4 * PDF_DPI)
+    page_h = int(PDF_PAGE_HEIGHT_MM / 25.4 * PDF_DPI)
+    margin = int(PDF_MARGIN_MM / 25.4 * PDF_DPI)
+    gap = int(TAB_GAP_MM / 25.4 * PDF_DPI)
+    content_w = page_w - 2 * margin
+
+    resized = []
     for i, frame in enumerate(frames):
-        # BGR → RGB
         rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
         img = Image.fromarray(rgb)
-
-        # 디버그 모드: 개별 이미지 저장
         if debug_dir:
             img.save(debug_dir / f"frame_{i:04d}.png")
+        scale = content_w / img.width
+        img_r = img.resize((content_w, int(img.height * scale)), Image.LANCZOS)
+        resized.append(img_r)
 
-        pil_images.append(img)
+    pages = []
+    cur_y = margin
+    page = Image.new('RGB', (page_w, page_h), (255, 255, 255))
 
-    # PDF 생성: 첫 이미지에 나머지를 append
-    # 각 프레임을 PDF 페이지로 변환 (원본 크기 유지)
-    pdf_pages = []
-    for img in pil_images:
-        # RGB → PDF 호환 (RGBA 미지원이므로 RGB로)
-        if img.mode != 'RGB':
-            img = img.convert('RGB')
-        pdf_pages.append(img)
+    for img in resized:
+        if cur_y > margin and cur_y + img.height > page_h - margin:  # never flush an empty page
+            pages.append(page)
+            page = Image.new('RGB', (page_w, page_h), (255, 255, 255))
+            cur_y = margin
+        page.paste(img, (margin, cur_y))
+        cur_y += img.height + gap
 
-    if pdf_pages:
-        first_page = pdf_pages[0]
-        rest_pages = pdf_pages[1:] if len(pdf_pages) > 1 else []
-        first_page.save(
-            str(output_path),
-            save_all=True,
-            append_images=rest_pages,
-            resolution=PDF_DPI,
-        )
-        print(f"  → PDF 생성 완료: {output_path}")
-        print(f"     {len(pdf_pages)} 페이지, 파일 크기: {output_path.stat().st_size / 1024:.0f} KB")
+    if cur_y > margin + gap:
+        pages.append(page)
 
-
-# ─── Also generate single long PNG ────────────────────────────────────────
-
-def generate_long_image(frames: List[np.ndarray], output_path: Path) -> None:
-    """모든 프레임을 하나의 긴 이미지로 이어붙이기"""
-    if not frames:
+    if not pages:
         return
 
-    # 가장 넓은 프레임에 맞춰 통일
-    max_width = max(f.shape[1] for f in frames)
-    resized = []
+    pages[0].save(str(output_path), save_all=True,
+                  append_images=pages[1:], resolution=PDF_DPI)
+    print(f"  → PDF: {len(resized)} Tab → {len(pages)} 페이지, {output_path.stat().st_size // 1024} KB")
+
+
+def generate_long_image(frames: List[np.ndarray], output_path: Path) -> None:
+    """Concatenate the Tabs into one long image."""
+    if not frames:
+        return
+    max_w = max(f.shape[1] for f in frames)
+    imgs = []
     for f in frames:
-        if f.shape[1] != max_width:
-            scale = max_width / f.shape[1]
-            new_h = int(f.shape[0] * scale)
-            f = cv2.resize(f, (max_width, new_h))
-        resized.append(f)
-
-    concat = np.vstack(resized)
-    rgb = cv2.cvtColor(concat, cv2.COLOR_BGR2RGB)
-    img = Image.fromarray(rgb)
-    img.save(str(output_path))
-    print(f"  → 롱 이미지 생성: {output_path} ({img.width}x{img.height})")
+        if f.shape[1] != max_w:
+            scale = max_w / f.shape[1]
+            f = cv2.resize(f, (max_w, int(f.shape[0] * scale)))
+        imgs.append(f)
+    concat = np.vstack(imgs)
+    Image.fromarray(cv2.cvtColor(concat, cv2.COLOR_BGR2RGB)).save(str(output_path))
+    print(f"  → 롱 이미지: {max_w}x{concat.shape[0]}")
 
 
-# ─── Main Pipeline ────────────────────────────────────────────────────────
+# ─── Main ─────────────────────────────────────────────────────────────────
 
 def main():
-    parser = argparse.ArgumentParser(
-        description="YouTube 기타 TAB 영상 → PDF 캡처",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-        epilog="""
-예시:
-  python youtube_tab_to_pdf.py "https://youtu.be/90BWvJY6KbE"
-  python youtube_tab_to_pdf.py "https://youtu.be/Ri9g4lwnrJQ" -o my_tab.pdf --debug
-  python youtube_tab_to_pdf.py "https://youtu.be/VIDEO" --pattern overlay --crop-ratio 0.6
-        """,
-    )
-    parser.add_argument("url", help="YouTube 영상 URL")
-    parser.add_argument("-o", "--output", help="출력 PDF 파일 경로")
-    parser.add_argument("--crop-ratio", type=float, default=DEFAULT_CROP_RATIO,
-                        help=f"Tab 영역 크롭 비율 (기본: {DEFAULT_CROP_RATIO})")
-    parser.add_argument("--fps", type=float, default=DEFAULT_FPS,
-                        help=f"프레임 추출 빈도 (기본: {DEFAULT_FPS})")
-    parser.add_argument("--similarity", type=float, default=SIMILARITY_THRESHOLD,
-                        help=f"프레임 유사도 임계값 (기본: {SIMILARITY_THRESHOLD})")
-    parser.add_argument("--pattern", choices=["auto", "scroll", "overlay", "split"],
-                        default="auto", help="영상 패턴 (기본: auto)")
-    parser.add_argument("--debug", action="store_true", help="중간 이미지 저장")
-
+    parser = argparse.ArgumentParser(description="YouTube TAB → A4 PDF")
+    parser.add_argument("url", help="YouTube URL")
+    parser.add_argument("-o", "--output", help="출력 PDF 경로")
+    parser.add_argument("--fps", type=float, default=DEFAULT_FPS)
+    parser.add_argument("--similarity", type=float, default=None)
+    parser.add_argument("--pattern", choices=["auto", "scroll", "overlay"],
+                        default="auto")
+    parser.add_argument("--debug", action="store_true")
     args = parser.parse_args()
 
-    # 출력 디렉토리 설정
     output_dir = Path("output")
     output_dir.mkdir(exist_ok=True)
-
-    # Debug 디렉토리
     debug_dir = None
     if args.debug:
         debug_dir = output_dir / "debug_frames"
         debug_dir.mkdir(exist_ok=True)
 
-    # ── Step 1: Download ──
     video_path, safe_title = download_video(args.url, output_dir)
-
-    # ── Step 2: Extract Frames ──
     frames = extract_frames(video_path, fps=args.fps)
     if not frames:
-        print("❌ 프레임을 추출할 수 없습니다.")
+        print("❌ 프레임 추출 실패")
         sys.exit(1)
 
-    # ── Step 3: Detect Pattern ──
-    if args.pattern == "auto":
-        pattern = detect_pattern(frames)
-    else:
-        pattern = args.pattern
-        print(f"[3/5] 패턴 수동 지정: {pattern}")
+    pattern = detect_pattern(frames) if args.pattern == "auto" else args.pattern
 
-    # ── Step 4: Extract Unique Frames ──
     if pattern == "scroll":
-        unique_frames = extract_unique_scroll(
-            frames, crop_ratio=args.crop_ratio, threshold=args.similarity
-        )
-    elif pattern == "split":
-        # split 모드: 자체 최적값 사용 (crop=42%, sim=0.98)
-        # CLI에서 명시 지정 시에만 override
-        split_kwargs = {}
-        if args.crop_ratio != DEFAULT_CROP_RATIO:  # 사용자가 직접 지정한 경우
-            split_kwargs['crop_ratio'] = args.crop_ratio
-        if args.similarity != SIMILARITY_THRESHOLD:
-            split_kwargs['threshold'] = args.similarity
-        unique_frames = extract_unique_split(frames, **split_kwargs)
+        sim = args.similarity if args.similarity is not None else SIMILARITY_THRESHOLD
+        unique = extract_unique_scroll(frames, threshold=sim)
     else:
-        unique_frames = extract_unique_overlay(
-            frames, threshold=args.similarity
-        )
+        sim = args.similarity if args.similarity is not None else OVERLAY_SIMILARITY_THRESHOLD
+        unique = extract_unique_overlay(frames, threshold=sim)
 
-    if not unique_frames:
-        print("❌ 고유 프레임을 찾을 수 없습니다. --similarity 값을 낮추거나 --pattern을 수동 지정해보세요.")
+    if not unique:
+        print("❌ 고유 Tab 프레임 없음. --similarity를 낮추거나 --pattern을 수동 지정하세요.")
         sys.exit(1)
 
-    # ── Step 5: Generate Output ──
-    if args.output:
-        pdf_path = Path(args.output)
-    else:
-        pdf_path = output_dir / f"{safe_title}.pdf"
+    pdf_path = Path(args.output) if args.output else output_dir / f"{safe_title}.pdf"
+    generate_pdf(unique, pdf_path, debug_dir=debug_dir)
+    generate_long_image(unique, pdf_path.with_suffix(".png"))
 
-    generate_pdf(unique_frames, pdf_path, debug_dir=debug_dir)
-
-    # 보너스: 긴 이미지도 생성
-    long_img_path = pdf_path.with_suffix(".png")
-    generate_long_image(unique_frames, long_img_path)
-
-    print(f"\n✅ 완료!")
-    print(f"   PDF: {pdf_path}")
-    print(f"   PNG: {long_img_path}")
-    if debug_dir:
-        debug_count = len(list(debug_dir.glob("*.png")))
-        print(f"   Debug: {debug_dir} ({debug_count}개 이미지)")
+    print(f"\n✅ 완료! PDF: {pdf_path}")
 
 
 if __name__ == "__main__":