Variet/variet_llm

Table of Contents

Session 2026-04-11 — Phase 01 Retuning & Hermes Agent v0.8.0

Summary
Key Outcomes

LLM Engine
Qwen Primary Promotion
Hermes Agent

Experiments (Not Adopted)
Files Changed
Git History
Hardware Constraints Documented
Next Session

Session 2026-04-11 — Phase 01 Retuning & Hermes Agent v0.8.0

Type: Maintenance
Focus: LLM tuning re-verification, Hermes Agent update, primary model switch

Summary

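This session combined a re-verification of Phase 01 (LLM Tuning) with a major Hermes Agent update. The optimized parameters for both LLM roles were refreshed, Qwen 3.5 35B was promoted to the primary model, and Hermes Agent was updated from v0.7.x to v0.8.0.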

Key Outcomes

LLM Engine

balanced (Qwen 3.5 35B-A3B): 61.62 → 64.16 t/s (+4.1%)
fast (Gemma 4 26B-A4B): 74.65 → 71.89 t/s (-3.7%, accepted as the trade-off for moving Vision onto the GPU)
Default role: fast → balanced
Removed --mlock/--poll/--prio/-t/-tb from both roles
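
The percentages quoted for the two roles can be re-derived from the raw throughput figures. The snippet below is a minimal sketch for checking that arithmetic; `pct_change` and the `results` dictionary are illustrative names, not part of the repository's benchmark scripts.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Throughput in tokens/s (before, after), copied from the list above.
results = {
    "balanced": (61.62, 64.16),
    "fast": (74.65, 71.89),
}

for role, (before, after) in results.items():
    change = pct_change(before, after)
    print(f"{role}: {before} -> {after} t/s ({change:+.1f}%)")
```

Run as-is, this prints +4.1% for balanced and -3.7% for fast.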

Qwen Primary Promotion

Speed difference: 1.25 t/s (negligible)
Quality advantage: 35B over 26B
Thinking mode, plus strengths in Korean and coding
Accepted Vision CPU offload (6.4 s/image)

Hermes Agent

Version: fff237e1 → e902e55b (340 commits, v0.8.0)
8 local patches auto-merged (0 conflicts)
Config: custom/qwen3.5-35b-a3b, with DISCORD_HOME_CHANNEL set manually

Experiments (Not Adopted)

Speculative Decoding (E2B draft): +14% generation speed vs -31% cold start → not adopted
llama.cpp b8757: 9% regression on Gemma 4 → staying on b8660

Files Changed

config/engine_models.json
docs/v3_balanced_retuning_log.md (new)
docs/v3_fast_retuning_log.md (new)
.planning/reports/20260411-session-report.md (new)
.planning/phases/01-llm-tuning/VERIFICATION.md (updated)
.planning/STATE.md (updated)
.planning/HANDOFF.json (updated)
agents/hermes-agent (submodule bump)
scripts/bench_short.py, bench_long.py, test_ts_ratios.py (new utilities)

Git History

e02626f chore(session): pause work — Qwen promoted to primary + Hermes v0.8.0
0dee779 refactor(phase-01): v3 retune fast & balanced roles

Hardware Constraints Documented

GPU 0: PCIe 3.0 x4 (3.94 GB/s) — bottleneck
GPU 1: PCIe 4.0 x16 (31.5 GB/s)
Total VRAM: 24 GB (2x RTX 3060 12GB)
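
The two bandwidth figures above match the theoretical per-direction limits of the PCIe links: 8 GT/s per lane for PCIe 3.0 and 16 GT/s for PCIe 4.0, both with 128b/130b encoding. The sketch below reproduces them and shows how much longer a transfer takes over the x4 link. The function names and the 10 GB payload are hypothetical, chosen only for illustration, and real throughput will be lower than these theoretical ceilings.

```python
# Per-lane signalling rate in GT/s. PCIe 3.0 and 4.0 both use 128b/130b encoding.
GT_PER_S = {"3.0": 8.0, "4.0": 16.0}


def pcie_bandwidth_gb_s(gen: str, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s for a PCIe link."""
    return GT_PER_S[gen] * (128 / 130) / 8 * lanes


def transfer_seconds(size_gb: float, gen: str, lanes: int) -> float:
    """Idealised time to move `size_gb` gigabytes over the link."""
    return size_gb / pcie_bandwidth_gb_s(gen, lanes)


gpu0 = pcie_bandwidth_gb_s("3.0", 4)    # GPU 0: PCIe 3.0 x4
gpu1 = pcie_bandwidth_gb_s("4.0", 16)   # GPU 1: PCIe 4.0 x16
print(f"GPU 0: {gpu0:.2f} GB/s, GPU 1: {gpu1:.1f} GB/s, ratio {gpu1 / gpu0:.0f}x")

payload_gb = 10.0  # hypothetical payload size, not a measured value
print(f"GPU 0: {transfer_seconds(payload_gb, '3.0', 4):.2f} s")
print(f"GPU 1: {transfer_seconds(payload_gb, '4.0', 16):.2f} s")
```

The 8x gap between the links is why GPU 0 is listed as the bottleneck: any data that has to cross its x4 link moves at one eighth the rate of the x16 link.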

Next Session

Run /gsd-resume-work to reload state from HANDOFF.json
Options:
- Start Hermes Agent (run_hermes_agent.bat)
- Resume Phase 05 (VS Code Extension Packaging)
- New feature development