Guitar Score Extraction Pipeline (youtube_tab_to_pdf v2)
What This Is
This project aims to automate the extraction of guitar tablature from YouTube videos into clean, readable PDFs. The current objective (v2) is to completely rebuild the OpenCV-based "scroll" and "overlay" extraction pipeline from scratch (zero-based) to solve the chronic issue of missing/discontinuous measure numbers.
Target Users
- Guitarists wanting to practice songs from YouTube covers without manually transcribing or struggling to pause the video.
Core Value
100% reliable measure extraction without overlaps, repetitions, or jumps, resulting in a perfectly sequenced PDF score.
Context
The previous implementation (`merge_panoramas_list` and `cv2.matchTemplate`) relied on matching horizontal scroll offsets. This failed whenever similar-looking choruses or repeated measures appeared: entire sections of the song were overwritten or skipped. In addition, OCR-based detection of duplicate measures was unstable because of video compression noise and differing fonts.
Existing Capabilities (Brownfield)
- ✓ YouTube `yt-dlp` integration and 1080p -> 720p scaling.
- ✓ Memory-efficient frame extraction loop (`DEFAULT_FPS=2`).
- ✓ Target Tab color isolation (`_find_white_tab_strip`).
- ✓ PDF generation via `img2pdf`.
Active Requirements
- Implement Temporal Tracking to measure pixel shift velocity (`v_x`) across frames instead of purely matching past panoramic bounds.
- Implement Time-Median Filter to erase moving playheads and animated cursors cleanly.
- Robustly detect Tab Staff Line horizontal rows.
- Slice continuous stream by strictly calculating elapsed `v_x` distance rather than relying on unreliable OCR text or thin measure bars.
- Create rigorous test suite asserting 0 missing frames across reference videos (`video_1`, `video_2`, `video_3`).
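
The tracking and slicing requirements above can be illustrated with a minimal sketch. It is not the project's implementation: the function names `estimate_shift_x` and `slice_cuts`, the `max_shift` bound, and the choice of column-profile matching (instead of any OpenCV routine) are assumptions made for illustration, and it uses only NumPy. The idea is that the shift between two consecutive frames is small, so the search can be limited to a narrow range, which avoids the ambiguity of matching against a whole panorama.

```python
import numpy as np


def estimate_shift_x(prev, curr, max_shift=64):
    """Estimate how many pixels the content scrolled left between two frames.

    prev and curr are 2-D grayscale arrays of equal shape. Each frame is
    collapsed to a 1-D column profile, and every candidate shift in
    [0, max_shift] is scored by mean absolute difference on the overlap.
    """
    p = prev.astype(np.float64).mean(axis=0)
    c = curr.astype(np.float64).mean(axis=0)
    width = p.shape[0]
    best_shift, best_err = 0, np.inf
    for s in range(min(max_shift, width - 1) + 1):
        # Scrolling left by s pixels means curr[x] corresponds to prev[x + s].
        err = np.abs(p[s:] - c[: width - s]).mean()
        if err < best_err:
            best_shift, best_err = s, err
    return best_shift


def slice_cuts(shifts, slice_width):
    """Return (frame_index, overshoot) pairs, one per slice boundary crossed.

    shifts[i] is the per-frame shift (v_x) measured on arriving at frame
    i + 1. A cut is recorded each time the accumulated distance reaches the
    next multiple of slice_width; overshoot is how far past the boundary the
    stream has travelled at that frame.
    """
    travelled = 0
    next_cut = slice_width
    cuts = []
    for frame_index, shift in enumerate(shifts, start=1):
        travelled += shift
        while travelled >= next_cut:
            cuts.append((frame_index, travelled - next_cut))
            next_cut += slice_width
    return cuts
```

Because cuts depend only on accumulated distance, a repeated chorus cannot be mistaken for an earlier one: identical-looking measures still advance the counter. The sketch assumes the content only scrolls left and ignores sub-pixel motion, so rounding errors would accumulate over a long video.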
Key Decisions
| Decision | Rationale | Outcome |
|---|---|---|
| Zero-Based Rebuild | Legacy horizontal stitching math was fundamentally flawed for repeating melodies. | — Pending |
| Separation of CV tracking | `youtube_tab_to_pdf.py` is too heavy (914 lines), move CV logic to `video_cv_tracker.py`. | — Pending |
| Time-Median Filter | Necessary to remove the playhead cursor which interferes with continuous sequence matching. | — Pending |
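
As a rough illustration of the Time-Median Filter decision, the sketch below takes a per-pixel median over a short stack of frames. The function name `temporal_median` is hypothetical, and the sketch assumes the frames are already aligned to the tab (for example by compensating the measured scroll), so that the notation stays fixed while the playhead moves relative to it.

```python
import numpy as np


def temporal_median(frames):
    """Per-pixel median over a sequence of aligned grayscale frames.

    Any pixel that is covered by a moving cursor in fewer than half of the
    frames takes the background value, so the cursor drops out of the result.
    """
    stack = np.stack(list(frames), axis=0)
    # np.median returns floats; cast back to the input dtype (e.g. uint8).
    return np.median(stack, axis=0).astype(stack.dtype)


# Usage: a white strip with a dark one-pixel cursor in a different column
# in each of five frames.
background = np.full((4, 10), 255, dtype=np.uint8)
noisy_frames = []
for x in range(5):
    frame = background.copy()
    frame[:, x] = 0  # simulated playhead
    noisy_frames.append(frame)
cleaned = temporal_median(noisy_frames)
```

The filter only works if the cursor occupies each pixel for a minority of the frames in the window. A playhead that sits still relative to the aligned tab would survive the median.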
Last updated: 2026-03-28 after initialization