# Guitar Score Extraction Pipeline (youtube_tab_to_pdf v2)

## What This Is

This project automates the extraction of guitar tablature from YouTube videos into clean, readable PDFs. The current objective (v2) is to rebuild the OpenCV-based "scroll" and "overlay" extraction pipeline from scratch (a zero-based rebuild) to fix the chronic problem of missing or discontinuous measure numbers.

## Target Users

- Guitarists who want to practice songs from YouTube covers without transcribing them by hand or repeatedly pausing the video.

## Core Value

Every measure is extracted exactly once, with no overlaps, repetitions, or jumps, producing a correctly sequenced PDF score.

## Context

The previous implementation (`merge_panoramas_list` with `cv2.matchTemplate`) stitched frames by matching horizontal scroll offsets. This failed whenever similar-looking choruses or repeated measures appeared, because visually identical passages give ambiguous matches; as a result, entire sections of the song were overwritten or skipped. In addition, the OCR-based detection of duplicate measures was too unstable because of video compression noise and differing fonts.
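
The ambiguity described above can be reproduced with a toy example. The sketch below is not the legacy code and does not call `cv2.matchTemplate`; it uses a hypothetical 1-D column profile and a plain sum-of-absolute-differences search (`match_offsets` is an illustrative name) to show that a repeated passage matches at more than one offset with an identical score.

```python
def match_offsets(strip, template):
    """Return every offset where `template` matches `strip` exactly (SAD == 0)."""
    n, m = len(strip), len(template)
    hits = []
    for off in range(n - m + 1):
        # Sum of absolute differences between the template and this window.
        sad = sum(abs(strip[off + i] - template[i]) for i in range(m))
        if sad == 0:
            hits.append(off)
    return hits


# Hypothetical column profile of a panorama: verse, chorus, verse, chorus.
verse = [3, 1, 4, 1, 5]
chorus = [9, 2, 6, 5, 3]
panorama = verse + chorus + verse + chorus

# A new frame showing the chorus fits two places equally well.
print(match_offsets(panorama, chorus))  # [5, 15]
```

Because both offsets score the same, a matcher that searches the whole panorama has no principled way to choose between them, which is consistent with sections being overwritten or skipped.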

## Existing Capabilities (Brownfield)

- ✓ YouTube `yt-dlp` integration with 1080p to 720p downscaling.
- ✓ Memory-efficient frame-extraction loop (`DEFAULT_FPS=2`).
- ✓ Target tab color isolation (`_find_white_tab_strip`).
- ✓ PDF generation via `img2pdf`.

## Active Requirements

- [ ] Implement temporal tracking that measures the horizontal pixel-shift velocity ($v_x$) across frames instead of only matching against past panorama bounds.
- [ ] Implement a time-median filter to cleanly erase moving playheads and animated cursors.
- [ ] Robustly detect the horizontal rows of the tab staff lines.
- [ ] Slice the continuous stream strictly by accumulated $v_x$ distance rather than relying on unreliable OCR text or thin measure bars.
- [ ] Create a rigorous test suite asserting 0 missing frames across the reference videos (`video_1`, `video_2`, `video_3`).
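
The tracking and slicing requirements above could be approached roughly as follows. This is a dependency-free sketch under stated assumptions, not the planned implementation: frames are reduced to hypothetical 1-D column profiles, the content scrolls left, and each measure is assumed to span a fixed, known pixel width. The names `estimate_shift`, `measure_boundaries`, `max_shift`, and `measure_width` are illustrative only.

```python
def estimate_shift(prev, curr, max_shift):
    """Estimate how many pixels the content scrolled left between two frames.

    `prev` and `curr` are equal-length 1-D column profiles. Only shifts in
    [0, max_shift] are searched, so a look-alike passage elsewhere in the
    song can never be selected. `max_shift` must be smaller than len(prev).
    """
    n = len(prev)
    best_shift, best_cost = 0, float("inf")
    for s in range(max_shift + 1):
        overlap = n - s
        # Mean absolute difference over the columns both frames share.
        cost = sum(abs(prev[i + s] - curr[i]) for i in range(overlap)) / overlap
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift


def measure_boundaries(shifts, measure_width):
    """Return (frame_index, measure_number) whenever the accumulated scroll
    distance crosses another multiple of `measure_width` pixels."""
    events, travelled, emitted = [], 0, 0
    for frame_idx, s in enumerate(shifts, start=1):
        travelled += s
        while travelled >= (emitted + 1) * measure_width:
            emitted += 1
            events.append((frame_idx, emitted))
    return events


# Synthetic "tape" scrolling at 3 px per frame behind a 40 px wide window.
tape = [(i * i) % 251 for i in range(200)]
frames = [tape[p:p + 40] for p in range(0, 31, 3)]

shifts = [estimate_shift(a, b, max_shift=8) for a, b in zip(frames, frames[1:])]
print(shifts)                          # [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
print(measure_boundaries(shifts, 12))  # [(4, 1), (8, 2)]
```

Bounding the search to a few pixels per frame is what separates this from whole-panorama matching: a repeated chorus cannot pull the estimate toward a distant position, and slice points depend only on distance travelled rather than on OCR text or bar lines.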

## Key Decisions

| Decision | Rationale | Outcome |
|----------|-----------|---------|
| **Zero-Based Rebuild** | Legacy horizontal stitching math was fundamentally flawed for repeating melodies. | Pending |
| **Separation of CV tracking** | `youtube_tab_to_pdf.py` is too heavy (914 lines); move CV logic to `video_cv_tracker.py`. | Pending |
| **Time-Median Filter** | Needed to remove the playhead cursor, which interferes with continuous sequence matching. | Pending |
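
The time-median idea in the table can be illustrated with a small, dependency-free sketch. It assumes the frames are already aligned (for scrolling tabs, that means compensating for the estimated shift first) and that the cursor covers any given pixel in fewer than half of the frames; `time_median` is a hypothetical helper name. With NumPy arrays, `np.median(np.stack(frames), axis=0)` would compute the same per-pixel median.

```python
from statistics import median


def time_median(frames):
    """Per-pixel median over time for a list of equally sized 2-D frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(f[y][x] for f in frames) for x in range(width)]
        for y in range(height)
    ]


# Static background: white page (255) with one dark staff line (0) on row 1.
background = [[255] * 6, [0] * 6, [255] * 6]

# Five frames; a grey cursor (128) sits in column t of frame t.
frames = []
for t in range(5):
    frame = [row[:] for row in background]
    for y in range(3):
        frame[y][t] = 128
    frames.append(frame)

cleaned = time_median(frames)
print(cleaned == background)  # True
```

Each pixel sees the cursor in at most one of the five frames, so the median falls back to the static background value and the staff line survives intact.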

---

*Last updated: 2026-03-28 after initialization*