docs: initialize project
36
.planning/PROJECT.md
Normal file
@@ -0,0 +1,36 @@
# Guitar Score Extraction Pipeline (youtube_tab_to_pdf v2)

## What This Is
This project automates the extraction of guitar tablature from YouTube videos into clean, readable PDFs. The current objective (v2) is to rebuild the OpenCV-based "scroll" and "overlay" extraction pipeline from scratch (zero-based) to fix the chronic problem of missing or discontinuous measure numbers.

## Target Users
- Guitarists who want to practice songs from YouTube covers without transcribing them by hand or repeatedly pausing the video.

## Core Value
100% reliable measure extraction: every measure appears exactly once and in order, with no overlaps, repetitions, or jumps, yielding a correctly sequenced PDF score.

## Context
The previous implementation (`merge_panoramas_list` and `cv2.matchTemplate`) relied on matching horizontal scroll offsets. This approach failed whenever similar-looking choruses or repeated measures appeared, causing entire sections of the song to be overwritten and skipped. In addition, OCR-based detection of duplicate measures was too unstable because of video compression noise and varying fonts.
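The failure mode described above can be reproduced with a toy example. The sketch below is not the project's code and not how `cv2.matchTemplate` is implemented: it is a hypothetical 1-D sum-of-absolute-differences matcher (`best_match_offset` is an illustrative name) run over made-up column intensities. It only shows that a global best-match search cannot tell two identical sections apart.

```python
def best_match_offset(strip, template):
    """Return the offset in `strip` where `template` fits best (lowest SAD)."""
    best_offset, best_cost = 0, float("inf")
    for offset in range(len(strip) - len(template) + 1):
        window = strip[offset:offset + len(template)]
        cost = sum(abs(a - b) for a, b in zip(window, template))
        if cost < best_cost:  # strict '<' keeps the earliest of equal-cost matches
            best_offset, best_cost = offset, cost
    return best_offset


# Toy panorama as 1-D column intensities: verse, chorus, bridge, chorus again.
verse, chorus, bridge = [1, 2, 3, 4], [9, 8, 9, 7], [5, 5, 6, 2]
panorama = verse + chorus + bridge + chorus

# A new frame that shows the second chorus (true position: offset 12).
new_frame = list(chorus)
matched_at = best_match_offset(panorama, new_frame)

# The search lands on the first chorus instead, because both look identical.
print(matched_at)  # 4, not 12
```

If the stitcher then writes the new frame at the matched offset, the bridge and second chorus never extend the panorama, which is consistent with the "overwritten and skipped" symptom.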

## Existing Capabilities (Brownfield)
- ✓ YouTube download via `yt-dlp`, with 1080p -> 720p downscaling.
- ✓ Memory-efficient frame extraction loop (`DEFAULT_FPS=2`).
- ✓ Tab strip isolation by color (`_find_white_tab_strip`).
- ✓ PDF generation via `img2pdf`.

## Active Requirements
- [ ] Implement temporal tracking that measures the horizontal pixel shift velocity ($v_x$) between frames instead of only matching against the bounds of the existing panorama.
- [ ] Implement a time-median filter to cleanly erase moving playheads and animated cursors.
- [ ] Robustly detect the horizontal rows of the tab staff lines.
- [ ] Slice the continuous stream strictly by accumulated $v_x$ distance rather than relying on unreliable OCR text or thin measure bars.
- [ ] Create a rigorous test suite asserting 0 missing frames across the reference videos (`video_1`, `video_2`, `video_3`).
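The tracking and slicing requirements above can be sketched together. This is a minimal illustration under stated assumptions, not the planned implementation: frames are reduced to 1-D column-intensity profiles, the content is assumed to scroll left, the slice width is assumed to be a known constant in pixels, and each per-frame shift is assumed to be smaller than that width. The names `estimate_shift`, `slice_boundaries`, `max_shift`, and `slice_width` are hypothetical.

```python
def estimate_shift(prev_cols, curr_cols, max_shift):
    """Estimate how many pixels the content moved left between two frames.

    Both inputs are 1-D column-intensity profiles of equal length. Only shifts
    in 0..max_shift are tested, so a look-alike section far away cannot win.
    """
    width = len(prev_cols)
    best_shift, best_cost = 0, float("inf")
    for shift in range(max_shift + 1):
        overlap = width - shift
        # Content at column x + shift in the previous frame should now be at x.
        total = sum(abs(prev_cols[x + shift] - curr_cols[x]) for x in range(overlap))
        cost = total / overlap  # mean difference, so short overlaps are not favored
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


def slice_boundaries(shifts, slice_width):
    """Return the frame indices at which a full slice_width has scrolled past."""
    boundaries, travelled = [], 0
    for index, shift in enumerate(shifts, start=1):  # shifts[0] links frame 0 -> 1
        travelled += shift
        if travelled >= slice_width:
            boundaries.append(index)
            travelled -= slice_width  # carry the remainder into the next slice
    return boundaries


# Synthetic, non-repeating strip scrolled at a constant 3 px per frame.
strip = [(x * x) % 97 for x in range(120)]
view_width, speed = 40, 3
frames = [strip[t * speed:t * speed + view_width] for t in range(20)]

shifts = [estimate_shift(a, b, max_shift=8) for a, b in zip(frames, frames[1:])]
print(shifts)                        # nineteen shifts of 3 px each
print(slice_boundaries(shifts, 12))  # [4, 8, 12, 16]
```

Because only small frame-to-frame shifts are considered, a chorus that repeats later in the song cannot pull the estimate back to an earlier position. The cut points depend only on accumulated distance, not on OCR or bar-line detection.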

## Key Decisions
| Decision | Rationale | Outcome |
|----------|-----------|---------|
| **Zero-Based Rebuild** | The legacy horizontal stitching math was fundamentally flawed for repeating melodies. | Pending |
| **Separation of CV tracking** | `youtube_tab_to_pdf.py` is too large (914 lines); CV logic moves to `video_cv_tracker.py`. | Pending |
| **Time-Median Filter** | Needed to remove the playhead cursor, which interferes with continuous sequence matching. | Pending |
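As a rough illustration of the time-median decision, the sketch below takes the per-pixel median over a short stack of frames. It assumes the frames are grayscale, equally sized, and already aligned so that the tab itself is stationary while only the cursor moves; `temporal_median` is a hypothetical name, and the tiny nested-list frames are synthetic. On NumPy arrays the same operation is `numpy.median(stack, axis=0)`. A cursor that covers any given pixel in fewer than half of the frames drops out of the result.

```python
from statistics import median


def temporal_median(frames):
    """Per-pixel median over time for aligned, equally sized grayscale frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Five 2x6 frames: a static white strip (255) with a dark cursor column (0)
# that moves one pixel to the right in each frame.
frames = []
for t in range(5):
    frame = [[255] * 6 for _ in range(2)]
    for row in frame:
        row[t] = 0
    frames.append(frame)

background = temporal_median(frames)
print(background)  # every pixel is 255: the cursor has been removed
```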

---
*Last updated: 2026-03-28 after initialization*