Roadmap
Phase 1: CV Core Refactor (video_cv_tracker.py)
Goal: Isolate and establish the core computer vision algorithms needed for temporally continuous extraction of guitar tabs without OCR.
- Dependencies: None.
- Plans:
  - 01-create-tracker.md: Build video_cv_tracker.py, introducing exactly three core functions: extract_roi_median() (playhead killer), compute_pixel_shift() (1D phase correlation tracking), and stitch_temporally() (append newly shifted columns only).
  - 02-unit-tests.md: Write minimal unit tests that feed in dummy arrays simulating scrolling guitar chord lines and verify that exactly the v_x shift is returned.
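As a rough illustration of the 1D phase correlation idea behind compute_pixel_shift(), the sketch below estimates an integer scroll between two column profiles, for example the per-column mean of a tab ROI. The signature, the sign convention (positive means the content moved left), and the use of NumPy's FFT are assumptions made for this example, not the planned implementation. It also treats the profile as periodic, so real frames would likely need windowing or padding.

```python
import numpy as np


def compute_pixel_shift(prev_profile, curr_profile):
    """Estimate the integer scroll (in pixels) between two 1D column profiles.

    Assumed convention: a positive result means the content moved left,
    i.e. curr_profile[n] == prev_profile[n + shift] (circularly).
    """
    a = np.asarray(prev_profile, dtype=np.float64)
    b = np.asarray(curr_profile, dtype=np.float64)
    # Remove the mean so the DC component does not dominate the correlation.
    a = a - a.mean()
    b = b - b.mean()
    # Normalized cross-power spectrum: keep only the phase difference.
    cross = np.fft.fft(a) * np.conj(np.fft.fft(b))
    cross /= np.abs(cross) + 1e-12
    # The inverse transform peaks at the circular shift between the signals.
    corr = np.fft.ifft(cross).real
    peak = int(np.argmax(corr))
    # Map peaks in the upper half of the range to negative shifts.
    if peak > a.size // 2:
        peak -= a.size
    return peak
```

A unit test in the spirit of 02-unit-tests.md could roll a dummy profile by a known number of columns with np.roll and check that the same number comes back.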
Phase 2: Refactoring youtube_tab_to_pdf.py
Goal: Tear down the old logic and integrate the new temporal tracking mechanism.
- Dependencies: Phase 1.
- Plans:
  - 01-remove-legacy.md: Delete the cv2.matchTemplate-heavy sprawling logic, the fragile _merge_scroll_candidates(), and the unpredictable _detect_measure_bars().
  - 02-integrate-tracker.md: Hook extract_unique_scroll directly to the video_cv_tracker generator and loop frames across time, returning one continuous panoramic image.
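The "append newly shifted columns only" step that this integration relies on can be sketched as follows. This is a minimal example under stated assumptions: frames are equally sized 2D NumPy arrays, shifts[i] is the number of columns frame i+1 scrolled left relative to frame i, and the stitch_temporally() signature shown here is hypothetical.

```python
import numpy as np


def stitch_temporally(frames, shifts):
    """Build one panorama by appending only the columns each frame newly reveals.

    frames: sequence of 2D arrays (height x width), all the same shape.
    shifts: shifts[i] is how many columns frames[i + 1] scrolled left
            relative to frames[i] (hypothetical convention).
    """
    pieces = [frames[0]]
    for frame, shift in zip(frames[1:], shifts):
        if shift <= 0:
            # Paused or backward motion reveals nothing new; skip the frame.
            continue
        # A shift larger than the frame width cannot reveal more than one frame.
        shift = min(shift, frame.shape[1])
        # The newly revealed content is the rightmost `shift` columns.
        pieces.append(frame[:, -shift:])
    return np.concatenate(pieces, axis=1)
```

Collecting the slices in a list and concatenating once avoids repeatedly copying a growing panorama inside the loop.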
Phase 3: Slicing & PDF Integration
Goal: Reliably chop the massive horizontal panoramic tab into A4 width segments.
- Dependencies: Phase 2.
- Plans:
  - 01-robust-measure-chop.md: Given a complete panorama, cut it blindly into fixed max chunk widths (simulating A4 line breaks) OR slice correctly using the tab color projection. Measure jumps are natively prevented by Phase 1.
  - 02-pdf-export.md: Hand off arrays back to the existing img2pdf PDF generation stack.
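The blind fixed-width option might look like the sketch below. The function name chop_panorama() and the max_width parameter are hypothetical, and the panorama is assumed to be a NumPy array with columns on axis 1. The color-projection variant is not shown.

```python
import numpy as np


def chop_panorama(panorama, max_width):
    """Split a wide panorama into consecutive chunks of at most max_width columns."""
    if max_width <= 0:
        raise ValueError("max_width must be positive")
    total_width = panorama.shape[1]
    # Slicing past the end is safe, so the last chunk simply comes out narrower.
    return [
        panorama[:, start:start + max_width]
        for start in range(0, total_width, max_width)
    ]
```

Each chunk is a view into the original array, so no pixel data is copied until the chunks are written out.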
Phase 4: Final Acceptance Testing
Goal: Execute test suite against video_1 (晴る), video_2 (新宝島), video_3 (空奏列車).
- Dependencies: Phase 3.
- Plans:
  - 01-execute-end-to-end.md: Run python test_pipeline.py.
  - 02-verify-output.md: Visually inspect the output/debug_frames/ panoramas to prove zero overlaps and strict chronological transcription.