Variet Worker f3e9e9f053 feat(engine): jinja thinking for the balanced role + checkpoint RAM offload
- Add --jinja + --chat-template-kwargs '{"enable_thinking":true}'
- -cram 8192: store context checkpoints in CPU RAM instead of on the GPU
  (prevents GPU CUDA OOM crashes: cuMemSetAccess failure at device:1)
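
A minimal sketch of how a launcher might assemble the flags listed above. The helper name `balanced_role_args` and its `cache_ram_mib` parameter are hypothetical and not part of this commit; the server binary is not named here, so it is left out.

```python
import json
import shlex


def balanced_role_args(cache_ram_mib: int = 8192) -> list[str]:
    """Build the extra CLI arguments for the balanced role (hypothetical helper)."""
    # Compact JSON so the value matches '{"enable_thinking":true}' exactly.
    template_kwargs = json.dumps({"enable_thinking": True}, separators=(",", ":"))
    return [
        "--jinja",
        "--chat-template-kwargs", template_kwargs,
        # Value taken from the commit message; the default here is an assumption.
        "-cram", str(cache_ram_mib),
    ]


if __name__ == "__main__":
    # shlex.join quotes the JSON value so the line can be pasted into a shell.
    print(shlex.join(balanced_role_args()))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids having to quote the JSON value by hand.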

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:44:15 +09:00