Variet/variet_llm


Variet-Worker 96c91cb57a feat(phase-06): complete Hermes Agent windows fixes & deployment

2026-04-08 23:04:20 +09:00


gsd_state_version: 1.0
milestone: v1.1
milestone_name: milestone
status: planning
last_updated: 2026-04-08T01:58:00.000Z
last_activity: 2026-04-08
progress:
  total_phases: 3
  completed_phases: 2
  total_plans: 3
  completed_plans: 2

Project State

Project Reference

A high-performance, locally-hosted AI assistant system built on two RTX 3060 12GB GPUs. It uses a "2+0" architecture where Machine A acts as a dedicated inference server running large language models, while Machine B handles the user interface (VS Code, Discord) and tool execution.

Current Position

Phase: 05
Plan: 05-PLAN.md (1 of 1)
Status: Ready to execute
Last activity: 2026-04-08

Progress

[████████████████████] 100% (Phase 01: LLM Tuning)
[████████████████████] 100% (Phase 02: API Engine)

Completed Phases

Phase 01 (LLM Tuning): Optimal settings finalized for 5 models (74.65 / 61.62 / 16.0 / 16.7 / 8.95 t/s)
Phase 02 (API Engine): Variet Engine v1.0: FastAPI proxy + hot-swap + 503 protection
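
The "hot-swap + 503 protection" behavior from Phase 02 can be illustrated with a small, framework-free sketch. This is not the Variet Engine code: the class name `ModelSlot`, its methods, and the response shape are all hypothetical, and the only idea taken from the notes above is that requests arriving while a model is being swapped get a 503 instead of reaching a half-loaded backend.

```python
import threading


class ModelSlot:
    """Tracks which model is active and whether a swap is in progress.

    Hypothetical helper for illustration; not part of Variet Engine.
    """

    def __init__(self, active_model):
        self._lock = threading.Lock()
        self._active_model = active_model
        self._swapping = False

    def begin_swap(self):
        # Mark the slot as busy so new requests are rejected.
        with self._lock:
            self._swapping = True

    def finish_swap(self, new_model):
        # Publish the new model and accept requests again.
        with self._lock:
            self._active_model = new_model
            self._swapping = False

    def handle(self, prompt):
        """Return (status_code, body) for an incoming request."""
        with self._lock:
            if self._swapping:
                # 503 protection: refuse rather than forward mid-swap.
                return 503, {"error": "model swap in progress"}
            model = self._active_model
        # A real proxy would forward the request to the backend here.
        return 200, {"model": model, "prompt": prompt}
```

In a FastAPI application the same check would presumably sit in the request handler or a middleware, with the 503 returned as an HTTP response; the sketch keeps only the state logic so it runs with the standard library alone.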

Recent Decisions

2+0 GPU Architecture (Machine A API Server, Machine B tools).
5-tier model strategy: fast/balanced/deep-coder/deep-logic/ultra.
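GPU 0 is limited to PCIe x4, so the 122B MoE model runs on GPU 1 alone.
Variet Engine: FastAPI reverse proxy on a single port (8000).
config/engine_models.json is the Single Source of Truth for all settings.
CLI-First validation strategy: validate the agent loop with the OpenClaude CLI first, before the VS Code Extension.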
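
As a rough illustration of how a single config file can drive the five tiers behind one proxy port, the sketch below loads a JSON document and resolves a tier name to a backend URL. The schema is an assumption: the real layout of config/engine_models.json is not shown here, so the keys `proxy_port`, `tiers`, `model`, and `backend_port`, the placeholder model names, and the backend port numbers are all hypothetical. Only the tier names and port 8000 come from the decisions listed above.

```python
import json

# Hypothetical config contents; the real engine_models.json may differ.
EXAMPLE_CONFIG = """
{
  "proxy_port": 8000,
  "tiers": {
    "fast":       {"model": "placeholder-fast",       "backend_port": 8101},
    "balanced":   {"model": "placeholder-balanced",   "backend_port": 8102},
    "deep-coder": {"model": "placeholder-deep-coder", "backend_port": 8103},
    "deep-logic": {"model": "placeholder-deep-logic", "backend_port": 8104},
    "ultra":      {"model": "placeholder-ultra",      "backend_port": 8105}
  }
}
"""

# Tier names from the 5-tier model strategy.
TIERS = ("fast", "balanced", "deep-coder", "deep-logic", "ultra")


def load_config(text):
    """Parse the config and check that every tier is defined."""
    cfg = json.loads(text)
    missing = [tier for tier in TIERS if tier not in cfg["tiers"]]
    if missing:
        raise ValueError(f"config is missing tiers: {missing}")
    return cfg


def backend_url(cfg, tier):
    """Resolve a tier name to the local backend URL the proxy would call."""
    if tier not in cfg["tiers"]:
        raise KeyError(f"unknown tier: {tier}")
    port = cfg["tiers"][tier]["backend_port"]
    return f"http://127.0.0.1:{port}"


if __name__ == "__main__":
    config = load_config(EXAMPLE_CONFIG)
    print(config["proxy_port"], backend_url(config, "deep-coder"))
```

Keeping validation in `load_config` means a missing tier fails at startup rather than on the first request that needs it, which fits the idea of one file being the only place settings live.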

Roadmap Evolution

Phase 6 added: Install and evaluate Hermes Agent

Pending Todos

0 pending.

Blockers/Concerns

None.

Session Continuity

Last session: 2026-04-08T10:58:00+09:00
Stopped at: Phase 05 PLAN created, user will execute manually
Resume file: .planning/phases/05-vscode-extension-packaging/.continue-here.md
