feat: Variet Engine v1.0 + 5-model tuning complete
Phase 01 (LLM Tuning):
- Gemma4 26B: 74.65 t/s (fast)
- Qwen 35B: 61.62 t/s (balanced)
- Gemma4 31B: 16.0 t/s (deep-coder)
- Qwen 27B: 16.7 t/s (deep-logic)
- Qwen 122B: 8.95 t/s (ultra, GPU 1 only)

Phase 02 (API Engine):
- FastAPI reverse proxy on port 8000
- /engine/switch hot-swap with 503 protection
- config/engine_models.json as single source of truth
- Replaced 4 individual .bat files with unified engine

File cleanup:
- scripts/ 85 files -> 9 + _archive/
- Root .bat files -> _archive/
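
The "/engine/switch hot-swap with 503 protection" item could work roughly as in the sketch below. This is an illustration under assumptions, not the engine's actual code: it uses only the standard library instead of FastAPI so that it runs as-is, and the class name `EngineState`, its method names, and the model keys are all hypothetical. A real engine would presumably stop and start a model server between `begin_switch` and `finish_switch`; here that step is only a flag.

```python
import threading


class EngineState:
    """Tracks the active model and rejects traffic while a swap is in progress."""

    def __init__(self, models, active):
        if active not in models:
            raise KeyError(active)
        # Hypothetical stand-in for the contents of config/engine_models.json.
        self._models = dict(models)
        self._active = active
        self._target = active
        self._switching = False
        self._lock = threading.Lock()

    def begin_switch(self, name):
        """Start a swap and return an HTTP-style status code."""
        with self._lock:
            if name not in self._models:
                return 404  # unknown model key
            if self._switching:
                return 503  # another swap is already running
            self._switching = True
            self._target = name
            return 202

    def finish_switch(self):
        """Mark the swap as done and make the target model active."""
        with self._lock:
            self._active = self._target
            self._switching = False

    def proxy_status(self):
        """Status a proxied request would receive right now."""
        with self._lock:
            return 503 if self._switching else 200

    @property
    def active(self):
        with self._lock:
            return self._active
```

Answering 503 during the swap lets clients retry instead of hanging on a backend that is being restarted.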
24
scripts/_archive/qwen_latest.txt
Normal file
@@ -0,0 +1,24 @@
UD-IQ4_NL | pure-GPU minbatch | 65.11 | GPU | 19177
UD-IQ4_NL | pure-GPU nommap small | 65.01 | GPU | 19672
UD-IQ4_NL | pure-GPU row-split | 13.65 | GPU | 19427
UD-IQ4_NL | pure-GPU ts=0.5,0.5 | 64.92 | GPU | 19664
UD-IQ4_NL | pure-GPU all-tricks | 64.72 | GPU | 19171
UD-IQ4_NL | tune t=2 | 64.87 | GPU | 19170
UD-IQ4_NL | tune t=6 | 64.88 | GPU | 19168
UD-IQ4_NL | tune t=8 | 64.5 | GPU | 19168
UD-IQ4_NL | tune ub=256 b=1024 | 64.73 | GPU | 20640
UD-IQ4_NL | tune ub=256 b=2048 | 63.69 | GPU | 20614
UD-IQ4_NL | tune kv=q8_0/q8_0 | 64.78 | GPU | 20422
UD-IQ4_NL | tune kv=f16/f16 | 65.53 | GPU | 22812
UD-IQ4_NL | FINAL | 66.31 | GPU | 22811
MXFP4_MOE | pure-GPU minbatch | 63.06 | GPU | 22747
MXFP4_MOE | pure-GPU nommap small | 63.75 | GPU | 22579
MXFP4_MOE | pure-GPU ts=0.5,0.5 | 62.88 | GPU | 22578
MXFP4_MOE | pure-GPU all-tricks | 62.55 | GPU | 22743
MXFP4_MOE | tune t=2 | 63.07 | GPU | 22601
MXFP4_MOE | tune t=6 | 63.58 | GPU | 22583
MXFP4_MOE | tune t=8 | 62.92 | GPU | 22536
MXFP4_MOE | tune ub=256 b=1024 | 62.76 | GPU | 22874
MXFP4_MOE | tune ub=256 b=2048 | 62.74 | GPU | 22912
MXFP4_MOE | FINAL | 63.71 | GPU | 22566
Q4_K_M | pure-GPU nommap small | 62.29 | GPU | 22975
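
To compare runs in a log like the one above, a small parser is enough. The sketch below assumes each row holds five pipe-separated fields (quantization, run label, throughput, device, and a final integer that looks like memory use); the file itself does not label its columns, so that reading is an assumption, and the function names `parse_rows` and `best_per_quant` are hypothetical.

```python
def parse_rows(text):
    """Parse 'quant | label | throughput | device | memory' rows, skipping anything else."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Ignore blank lines and separator debris such as '|' or '||||'.
        if len(parts) != 5 or not parts[0]:
            continue
        quant, label, throughput, device, memory = parts
        rows.append((quant, label, float(throughput), device, int(memory)))
    return rows


def best_per_quant(rows):
    """Return the highest-throughput row for each quantization."""
    best = {}
    for row in rows:
        quant = row[0]
        if quant not in best or row[2] > best[quant][2]:
            best[quant] = row
    return best
```

Calling `best_per_quant(parse_rows(open("scripts/_archive/qwen_latest.txt").read()))` would give one winning row per quantization.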