diff --git a/Session-2026-04-11.-.md b/Session-2026-04-11.-.md
new file mode 100644
index 0000000..78ce524
--- /dev/null
+++ b/Session-2026-04-11.-.md
@@ -0,0 +1,67 @@
+# Session 2026-04-11 — Phase 01 Retuning & Hermes Agent v0.8.0
+
+**Type**: Maintenance
+**Focus**: LLM tuning re-verification, Hermes Agent update, primary model switch
+
+---
+
+## Summary
+
+Ran the Phase 01 (LLM Tuning) re-verification session in parallel with a major Hermes Agent update. Refreshed the tuned parameters for both LLM roles, promoted Qwen 3.5 35B to primary model, and completed the Hermes Agent v0.7.x → v0.8.0 update.
+
+## Key Outcomes
+
+### LLM Engine
+- **balanced** (Qwen 3.5 35B-A3B): 61.62 → **64.16 t/s** (+4.1%)
+- **fast** (Gemma 4 26B-A4B): 74.65 → **71.89 t/s** (-3.7%; trade-off for adding Vision on GPU)
+- Default role: `fast` → `balanced`
+- Removed `--mlock/--poll/--prio/-t/-tb` from both roles
+
+### Qwen Primary Promotion
+- Speed difference: 1.25 t/s (negligible)
+- Quality advantage: 35B > 26B
+- Thinking mode, plus strengths in Korean and coding
+- Accepted Vision CPU offload (6.4 s/image)
+
+### Hermes Agent
+- Version: `fff237e1` → `e902e55b` (340 commits, v0.8.0)
+- 8 local patches auto-merged (0 conflicts)
+- Config: `custom/qwen3.5-35b-a3b` and `DISCORD_HOME_CHANNEL` set manually
+
+## Experiments (Not Adopted)
+
+- **Speculative Decoding** (E2B draft): +14% gen vs -31% cold start → not adopted
+- **llama.cpp b8757**: 9% regression on Gemma 4 → staying on b8660
+
+## Files Changed
+
+- `config/engine_models.json`
+- `docs/v3_balanced_retuning_log.md` (new)
+- `docs/v3_fast_retuning_log.md` (new)
+- `.planning/reports/20260411-session-report.md` (new)
+- `.planning/phases/01-llm-tuning/VERIFICATION.md` (updated)
+- `.planning/STATE.md` (updated)
+- `.planning/HANDOFF.json` (updated)
+- `agents/hermes-agent` (submodule bump)
+- `scripts/bench_short.py`, `bench_long.py`, `test_ts_ratios.py` (new utilities)
+
+## Git History
+
+```
+e02626f chore(session): pause work — Qwen promoted to primary + Hermes v0.8.0
+0dee779 refactor(phase-01): v3 retune fast & balanced roles
+```
+
+## Hardware Constraints Documented
+
+- GPU 0: PCIe 3.0 x4 (3.94 GB/s) — bottleneck
+- GPU 1: PCIe 4.0 x16 (31.5 GB/s)
+- Total VRAM: 24 GB (2x RTX 3060 12 GB)
+
+## Next Session
+
+- Run `/gsd-resume-work` to reload state from `HANDOFF.json`
+- Options:
+  - Start Hermes Agent (`run_hermes_agent.bat`)
+  - Resume Phase 05 (VS Code Extension Packaging)
+  - New feature development