wip: [01-stabilize] paused at task 1/1 - OCR Hallucination Immune logic via Semantic delta window and fret-isolation

2026-03-29 22:08:40 +09:00
parent aca7bf592a
commit 2507de45d3
4289 changed files with 732689 additions and 28672 deletions


@@ -0,0 +1,39 @@
---
phase: 01-stabilize
task: 1
total_tasks: 1
status: in_progress
last_updated: 2026-03-29T22:05:18+09:00
---
<current_state>
We debugged and addressed the OCR sequence hallucination bug, in which a false measure reading (e.g. `32` instead of `13`) hijacked the `max_measure` tracker and caused long runs of valid measures to be skipped (e.g. measures `14` through `31` were dropped). A `y2` boundary constraint now limits OCR to the strip immediately above the staff, which keeps the high-E string fret numbers out of the OCR region.
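
A minimal sketch of how the `y2` boundary could be computed, for illustration only. The function name `measure_number_roi`, the `band_height` parameter, and the full-width crop are assumptions; only `y2 = staff_top - 5` comes from these notes.

```python
from typing import Tuple


def measure_number_roi(
    staff_top: int,
    frame_width: int,
    band_height: int = 40,  # assumed height of the OCR strip, in pixels
    margin: int = 5,        # from the notes: y2 = staff_top - 5
) -> Tuple[int, int, int, int]:
    """Return (x1, y1, x2, y2) for the strip just above the staff.

    The bottom edge stops `margin` pixels above the top staff line, so
    fret numbers sitting on the high-E string fall outside the crop.
    """
    y2 = max(0, staff_top - margin)
    y1 = max(0, y2 - band_height)  # clamp so the crop never leaves the frame
    return 0, y1, frame_width, y2
```

With a NumPy-style frame, the strip would then be taken as `frame[y1:y2, x1:x2]` before it is passed to OCR.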
</current_state>
<completed_work>
- Task 1: Semantic PDF Stacker with LIS-style Delta Validator (`m_num <= last + 25`) - Done
- Task 2: Fret-number OCR exclusion (`y2 = staff_top - 5`) - Done
- Task 3: CV-Fallback novelty interpolation against false readings - Done
</completed_work>
<remaining_work>
- Final validation against the user's specific 140-measure track. No technical blockers remain on this end.
</remaining_work>
<decisions_made>
- Replaced the strict chronological cutoff (`<= max_measure`) with delta-window tracking (+25 threshold). Because the video is sequential, a large forward leap in the measure number is treated as an invalid OCR reading. When OCR confidence or validation fails, the OpenCV structural-hash check `_is_duplicate_cv` takes over from measure-number sorting: novel frames are inserted into the PDF in chronological order, and `max_measure` is left untouched so that a bad reading cannot corrupt it.
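
A minimal sketch of the delta-window check, for illustration only. `MeasureTracker`, `offer`, and the three outcome labels are hypothetical names, and the handling of readings at or below `max_measure` is an assumption. Only the +25 window and the rule that a rejected reading must not modify `max_measure` come from these notes.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_FORWARD_JUMP = 25  # the +25 threshold from the notes


@dataclass
class MeasureTracker:
    """Hypothetical stand-in for the stacker's measure-number state."""

    max_measure: int = 0
    accepted: List[int] = field(default_factory=list)

    def offer(self, m_num: Optional[int]) -> str:
        """Classify one OCR reading as 'accept', 'stale', or 'fallback'."""
        if m_num is None or m_num > self.max_measure + MAX_FORWARD_JUMP:
            # Failed OCR or an implausible forward leap: hand the frame to
            # the CV fallback and leave max_measure untouched.
            return "fallback"
        if m_num <= self.max_measure:
            # Assumption: a reading that does not advance the sequence is
            # treated as already seen.
            return "stale"
        self.max_measure = m_num
        self.accepted.append(m_num)
        return "accept"


tracker = MeasureTracker()
outcomes = [tracker.offer(m) for m in [1, 5, 9, 13, 58, 14, None, 13]]
```

In this sketch a reading is rejected only when it exceeds `max_measure + 25`, so the misread `58` is routed to the fallback while the following `14` is still accepted. Frames that come back as `fallback` would go to the `_is_duplicate_cv` check instead of being sorted by measure number.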
</decisions_made>
<blockers>
- None. Testing required on the target machine.
</blockers>
<context>
The logic behaves as a monotonic sequencer: repeated choruses and backward jumps are treated as pure duplicates (via `_is_duplicate_cv`) and discarded. The fatal drop bug has been addressed, so work can resume on another machine; final validation against the 140-measure track is still pending.
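
One way a structural-hash duplicate check can work is an average hash compared by Hamming distance. The sketch below is a pure-Python stand-in, not the project's `_is_duplicate_cv`; the function names, the 8x8 grid, and the distance threshold of 5 are all assumptions.

```python
from typing import List, Sequence

Gray = Sequence[Sequence[int]]  # rows of 0-255 grayscale pixel values


def average_hash(img: Gray, grid: int = 8) -> int:
    """Hash an image as grid*grid bits: block mean above the global mean or not.

    Assumes the image is at least `grid` pixels in each dimension.
    """
    h, w = len(img), len(img[0])
    means = []
    for gy in range(grid):
        y0, y1 = gy * h // grid, (gy + 1) * h // grid
        for gx in range(grid):
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(block) / len(block))
    avg = sum(means) / len(means)
    bits = 0
    for m in means:
        bits = (bits << 1) | (1 if m > avg else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def is_duplicate(img: Gray, seen: List[int], max_distance: int = 5) -> bool:
    """Return True if img is close to a stored hash; otherwise record it as novel."""
    h = average_hash(img)
    if any(hamming(h, s) <= max_distance for s in seen):
        return True
    seen.append(h)
    return False
```

Because only novel frames are appended to `seen`, a repeated chorus that hashes close to an earlier frame is reported as a duplicate and never reaches the output.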
</context>
<next_action>
Start with: fetch the repository, then run `youtube_tab_to_pdf.py` against the full song with `--pattern overlay` and verify that all 140 measures render in order with none dropped.
</next_action>