CONCERNS
Refactoring Needs
- Monolithic Script: youtube_tab_to_pdf.py is over 1000 lines long. While some logic was extracted into video_cv_tracker.py, further decoupling of OCR validation, PDF tiling, and downloading logic into specialized modules would reduce technical debt.
- Heuristic Fragility: The pipeline extensively uses hardcoded CV heuristics (OVERLAY_MIN_AREA_RATIO = 0.05, max_hamming: int = 20, matching against 0.85 or 0.50 similarities). Small changes in target video compression can break these fragile magic numbers.
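One possible way to make the thresholds above easier to tune is to gather them into a single configuration object instead of scattering them through the script. The sketch below is illustrative only: HeuristicConfig, hamming_distance, and is_duplicate_frame are hypothetical names, the roles assigned to the 0.85 and 0.50 cutoffs are assumptions, and the inclusive comparison against max_hamming is a guess at how the real check behaves.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HeuristicConfig:
    """Hypothetical container for the thresholds quoted above."""

    overlay_min_area_ratio: float = 0.05  # mirrors OVERLAY_MIN_AREA_RATIO
    max_hamming: int = 20                 # mirrors max_hamming
    high_similarity: float = 0.85         # assumed role of the 0.85 cutoff
    low_similarity: float = 0.50          # assumed role of the 0.50 cutoff


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the differing bits between two integer-encoded hashes."""
    return bin(hash_a ^ hash_b).count("1")


def is_duplicate_frame(hash_a: int, hash_b: int, cfg: HeuristicConfig) -> bool:
    """Assumed rule: frames match when at most cfg.max_hamming bits differ."""
    return hamming_distance(hash_a, hash_b) <= cfg.max_hamming


default_cfg = HeuristicConfig()
# An alternative profile; the value 8 is purely illustrative.
strict_cfg = replace(default_cfg, max_hamming=8)
```

With the values in one place, a per-video profile can be passed into the pipeline rather than editing constants, which makes it easier to see which threshold a given compression change actually breaks.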
Data & Quality Issues
- Repeating Choruses (D.S. al Coda): Music that jumps back in time (repeats, D.S. al Coda) is extremely difficult to track. When a visual template rematched an earlier chorus segment perfectly, the pipeline frequently overwrote chronological data it had already captured.
- OCR Instability: Using EasyOCR to detect frame overlaps depends heavily on the video's original resolution. Lossy YouTube compression blurs small measure numbers, so they are often misread, and the deduplication engine then fails or mis-triggers unpredictably.
- Performance: EasyOCR and complex morphological CV operations across thousands of frames are computationally intensive; lack of parallel processing limits speed.
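The repeating-chorus overwrite described above could, for example, be guarded against with a write-once timeline: the first sighting of a segment fixes its position, and later rematches are logged as repeats instead of replacing it. This is a minimal sketch under assumptions, not the pipeline's actual logic. SegmentTimeline and the string segment_id are hypothetical, and how the real code identifies a segment is not stated here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SegmentTimeline:
    """Hypothetical store: the first sighting wins, rematches are only logged."""

    first_seen: dict[str, float] = field(default_factory=dict)
    repeats: list[tuple[str, float]] = field(default_factory=list)

    def observe(self, segment_id: str, timestamp: float) -> bool:
        """Record a match; return True for a new segment, False for a repeat."""
        if segment_id in self.first_seen:
            # Keep the original timestamp so chronological order is preserved.
            self.repeats.append((segment_id, timestamp))
            return False
        self.first_seen[segment_id] = timestamp
        return True

    def ordered_segments(self) -> list[str]:
        """Return segment ids sorted by the time they were first seen."""
        return sorted(self.first_seen, key=self.first_seen.get)


timeline = SegmentTimeline()
matches = [("verse", 0.0), ("chorus", 30.0), ("bridge", 60.0), ("chorus", 90.0)]
for seg, t in matches:
    timeline.observe(seg, t)
```

Here the second chorus match at 90.0 seconds is kept in repeats, so the page order derived from ordered_segments still follows the first pass through the score.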