docs(planning): generate codebase map via gsd-map-codebase
17
.planning/codebase/ARCHITECTURE.md
Normal file
@@ -0,0 +1,17 @@
# ARCHITECTURE
## System Design
The project is a sequential **Data Processing Pipeline** that turns a raw video into a formatted A4 PDF. The primary entry point passes the video through five stages, which map onto six internal steps:
1. **Download (`Step 1`)**: Fetches the target YouTube video.
2. **Frame Extraction (`Step 2`)**: Opens the video with OpenCV's `VideoCapture`, skips frames according to `DEFAULT_FPS` to limit memory use, and crops out non-meaningful content.
3. **Pattern Detection (`Step 3`)**: Classifies the scrolling behavior (e.g., `scroll` vs `overlay`) using temporal tracking and template matching.
4. **Frame Dedup & Stitching (`Step 4/5`)**: The core logic engine. Filters out duplicates caused by video pauses or by rewind and D.S. al Coda repeats. Using `TemporalTracker`, it tracks pixel movement and either stitches horizontally scrolling tabs or stacks overlay pages.
5. **PDF Tiling (`Step 6`)**: Breaks stitched panoramas into A4-printable chunks and positions them on the page using layout metrics.
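
The arithmetic behind the frame skipping in Step 2 and the tiling in Step 6 can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the 150 DPI default, and the reading of `DEFAULT_FPS` as a target sampling rate are all assumptions.

```python
from typing import List, Tuple

A4_WIDTH_MM = 210.0
A4_HEIGHT_MM = 297.0
MM_PER_INCH = 25.4


def frame_stride(native_fps: float, target_fps: float) -> int:
    """Keep every N-th frame so roughly `target_fps` frames per second survive.

    Assumption: DEFAULT_FPS in the project plays the role of `target_fps`.
    """
    if native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    return max(1, round(native_fps / target_fps))


def a4_pixel_size(dpi: int = 150) -> Tuple[int, int]:
    """Return (width, height) of an A4 page in pixels at the given DPI."""
    width = round(A4_WIDTH_MM / MM_PER_INCH * dpi)
    height = round(A4_HEIGHT_MM / MM_PER_INCH * dpi)
    return width, height


def split_panorama(pano_width: int, chunk_width: int) -> List[Tuple[int, int]]:
    """Cut a wide panorama into (start, end) column spans of at most `chunk_width`."""
    if pano_width <= 0 or chunk_width <= 0:
        raise ValueError("widths must be positive")
    return [
        (x, min(x + chunk_width, pano_width))
        for x in range(0, pano_width, chunk_width)
    ]


def paginate(chunks: List[Tuple[int, int]], rows_per_page: int) -> List[List[Tuple[int, int]]]:
    """Group chunk spans into pages holding `rows_per_page` strips each."""
    if rows_per_page <= 0:
        raise ValueError("rows_per_page must be positive")
    return [chunks[i:i + rows_per_page] for i in range(0, len(chunks), rows_per_page)]


if __name__ == "__main__":
    page_w, page_h = a4_pixel_size(150)
    spans = split_panorama(3000, page_w)
    print(frame_stride(30.0, 2.0), (page_w, page_h), paginate(spans, 2))
```

A real layout would also account for margins and the height of each strip when choosing `rows_per_page`; those details are left out here.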
## Key Subsystems
- **Temporal Tracker** (`video_cv_tracker.py`): Tracks time-series differences between frames by evaluating column and row variations rather than relying entirely on brute-force image subtraction. This lets it detect page flips cleanly.
- **Duplicate Prevention Engine**: Employs tiered validation:
  - Phase 1: Difference hash (`_dhash`) clustering and Laplacian variance.
  - Phase 2: Template-match history memory (catching identical choruses via distance vs. time).
  - Phase 3: OCR measure verification (ensuring numerical monotonicity).
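
To make Phase 1 and Phase 3 more concrete, here is a small sketch of a difference hash comparison and a measure-number monotonicity check. It is illustrative only: the project's `_dhash` may differ, the input is assumed to be a grayscale grid that has already been downscaled (for example to 9x8 pixels), and the distance threshold and the non-decreasing rule are hypothetical choices.

```python
from typing import Sequence

Gray = Sequence[Sequence[int]]  # rows of grayscale pixel values


def dhash_bits(gray: Gray) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    A bit is 1 when the left pixel is brighter than its right neighbour.
    Assumes `gray` is already downscaled to a small fixed-size grid.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_duplicate(a: Gray, b: Gray, max_distance: int = 4) -> bool:
    """Treat two frames as duplicates when their hashes are close.

    `max_distance` is a hypothetical threshold, not a value from the project.
    """
    return hamming(dhash_bits(a), dhash_bits(b)) <= max_distance


def measures_monotonic(measure_numbers: Sequence[int]) -> bool:
    """Check that OCR'd measure numbers never decrease from frame to frame.

    Assumption: "monotonicity" here means non-decreasing.
    """
    return all(b >= a for a, b in zip(measure_numbers, measure_numbers[1:]))


if __name__ == "__main__":
    frame_a = [[10, 20, 30], [30, 20, 10]]
    frame_b = [[12, 22, 33], [29, 21, 9]]   # same gradients, slight noise
    frame_c = [[30, 20, 10], [10, 20, 30]]  # gradients reversed
    print(is_duplicate(frame_a, frame_b), is_duplicate(frame_a, frame_c, max_distance=2))
    print(measures_monotonic([1, 2, 2, 5]), measures_monotonic([4, 5, 3]))
```

Because the hash encodes only the sign of brightness gradients, small noise or compression artifacts tend to leave it unchanged, which is why it suits detecting paused or repeated frames.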