Phase 01 (LLM Tuning):
- Gemma4 26B: 74.65 t/s (fast)
- Qwen 35B: 61.62 t/s (balanced)
- Gemma4 31B: 16.0 t/s (deep-coder)
- Qwen 27B: 16.7 t/s (deep-logic)
- Qwen 122B: 8.95 t/s (ultra, GPU 1 only)

Phase 02 (API Engine):
- FastAPI reverse proxy on port 8000
- /engine/switch hot-swap with 503 protection
- config/engine_models.json as single source of truth
- Replaced 4 individual .bat files with unified engine

File cleanup:
- scripts/ 85 files -> 9 + _archive/
- Root .bat files -> _archive/
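The commit message mentions a `/engine/switch` hot-swap with 503 protection but does not show how it works. The following is a minimal, framework-free sketch of one way such a guard could behave, not the repository's actual implementation. The class `EngineGate`, its methods, and the use of the profile labels `deep-logic` and `fast` as engine identifiers are assumptions made for illustration.

```python
import threading
from contextlib import contextmanager


class EngineGate:
    """Hypothetical guard: answers 503 while a model switch is in progress."""

    def __init__(self, active_model):
        self._lock = threading.Lock()
        self._active = active_model
        self._switching = False

    @contextmanager
    def switching(self, new_model):
        """Mark the engine as unavailable for the duration of a swap."""
        with self._lock:
            if self._switching:
                raise RuntimeError("a switch is already in progress")
            self._switching = True
        try:
            # The caller would stop the old backend and start the new one here.
            yield
            # Only reached if the swap body did not raise.
            with self._lock:
                self._active = new_model
        finally:
            with self._lock:
                self._switching = False

    def route(self):
        """Return (status_code, model) for an incoming proxied request."""
        with self._lock:
            if self._switching:
                return 503, None  # Service Unavailable during the swap
            return 200, self._active


gate = EngineGate("deep-logic")
before = gate.route()
with gate.switching("fast"):
    during = gate.route()
after = gate.route()
print(before, during, after)
```

In a real proxy the 503 branch would typically be returned to the client as an HTTP response so callers can retry once the new model is loaded. If the swap body raises, this sketch keeps the previous model active.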
124 lines
3.9 KiB
JSON
[
  {
    "id": "code_generate",
    "category": "coding",
    "name": "Python function generation",
    "model": "qwen27b",
    "response": "",
    "tokens": 800,
    "time": 47.56,
    "tps": 16.82,
    "eval_criteria": [
      "correctness",
      "type_hints",
      "docstring",
      "edge_cases"
    ]
  },
  {
    "id": "code_debug",
    "category": "coding",
    "name": "Bug finding & fixing",
    "model": "qwen27b",
    "response": "# Fixed Code\n\n```python\ndef find_duplicates(arr):\n count = {}\n for item in arr:\n count[item] = count.get(item, 0) + 1\n \n # Filter to only include items that appear more than once\n duplicates = {item: count for item, count in count.items() if count > 1}\n return duplicates\n```\n\n## What Was Fixed\n\n1. **Changed return type**: Now returns a dictionary `{value: count}` instead of a list\n2. **Fixed counting logic**: Counts all occurrences of each item, not just marks as seen\n3. **Handles empty lists**: Returns `{}` instead of crashing\n4. **Removed unnecessary set conversion**: No need to convert to set since we're returning a dict\n\n## Test Cases\n\n```python\nprint(find_duplicates([1, 2, 2, 3, 3, 3])) # {2: 2, 3: 3}\nprint(find_duplicates([])) # {}\nprint(find_duplicates([1, 1, 1])) # {1: 3}\nprint(find_duplicates([1, 2, 3])) # {}\nprint(find_duplicates([5])) # {}\n```\n\n## Alternative (More Pythonic)\n\n```python\nfrom collections import Counter\n\ndef find_duplicates(arr):\n count = Counter(arr)\n return {item: count for item, count in count.items() if count > 1}\n```\n\nThis version uses `Counter` from the `collections`",
    "tokens": 800,
    "time": 48.01,
    "tps": 16.66,
    "eval_criteria": [
      "bug_identified",
      "correct_fix",
      "clean_code"
    ]
  },
  {
    "id": "code_refactor",
    "category": "coding",
    "name": "TypeScript refactoring",
    "model": "qwen27b",
    "response": "",
    "tokens": 800,
    "time": 47.99,
    "tps": 16.67,
    "eval_criteria": [
      "types",
      "error_handling",
      "backoff",
      "production_quality"
    ]
  },
  {
    "id": "korean_schedule",
    "category": "assistant_kr",
    "name": "Korean schedule management",
    "model": "qwen27b",
    "response": "",
    "tokens": 800,
    "time": 47.75,
    "tps": 16.75,
    "eval_criteria": [
      "korean_fluency",
      "schedule_analysis",
      "practical_advice"
    ]
  },
  {
    "id": "korean_email",
    "category": "assistant_kr",
    "name": "Korean email summary",
    "model": "qwen27b",
    "response": "",
    "tokens": 800,
    "time": 48.05,
    "tps": 16.65,
    "eval_criteria": [
      "korean_summary",
      "action_items",
      "conciseness"
    ]
  },
  {
    "id": "tool_calling",
    "category": "tool_use",
    "name": "Function Calling (JSON)",
    "model": "qwen27b",
    "response": "[{\"tool\": \"get_calendar\", \"args\": {\"date\": \"tomorrow\"}}, {\"tool\": \"search_web\", \"args\": {\"query\": \"latest quarterly report\"}}, {\"tool\": \"send_email\", \"args\": {\"to\": \"john@example.com\", \"subject\": \"Quarterly Report Summary\", \"body\": \"Summary of the latest quarterly report attached for your review.\"}}]",
    "tokens": 719,
    "time": 43.06,
    "tps": 16.7,
    "eval_criteria": [
      "correct_sequence",
      "valid_json",
      "complete_args"
    ]
  },
  {
    "id": "structured_output",
    "category": "tool_use",
    "name": "Structured output (JSON)",
    "model": "qwen27b",
    "response": "",
    "tokens": 800,
    "time": 48.01,
    "tps": 16.66,
    "eval_criteria": [
      "correct_parsing",
      "valid_json",
      "completeness"
    ]
  },
  {
    "id": "reasoning",
    "category": "reasoning",
    "name": "Logical reasoning",
    "model": "qwen27b",
    "response": "",
    "tokens": 800,
    "time": 47.67,
    "tps": 16.78,
    "eval_criteria": [
      "correct_answer",
      "clear_steps",
      "math_accuracy"
    ]
  }
]
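The `tps` field in each record appears to be `tokens / time` rounded to two decimals. The sketch below checks that relationship and averages the reported throughput. The tuples copy the `id`, `tokens`, `time`, and `tps` values from the records above, while the helper names `recompute_tps` and `check_records` are illustrative and not part of the repository.

```python
from statistics import mean

# (id, tokens, time in seconds, reported tps), copied from the records above.
RECORDS = [
    ("code_generate", 800, 47.56, 16.82),
    ("code_debug", 800, 48.01, 16.66),
    ("code_refactor", 800, 47.99, 16.67),
    ("korean_schedule", 800, 47.75, 16.75),
    ("korean_email", 800, 48.05, 16.65),
    ("tool_calling", 719, 43.06, 16.7),
    ("structured_output", 800, 48.01, 16.66),
    ("reasoning", 800, 47.67, 16.78),
]


def recompute_tps(tokens, seconds):
    """Tokens per second, rounded to two decimals like the stored field."""
    return round(tokens / seconds, 2)


def check_records(records, tolerance=0.01):
    """Return ids whose stored tps differs from tokens / time by more than tolerance."""
    return [
        rid
        for rid, tokens, seconds, tps in records
        if abs(recompute_tps(tokens, seconds) - tps) > tolerance
    ]


mismatches = check_records(RECORDS)
average_tps = mean(tps for _, _, _, tps in RECORDS)
print("mismatches:", mismatches)
print(f"average tps: {average_tps:.2f}")
```

The average over these eight runs comes out at about 16.71 t/s, in line with the 16.7 t/s listed for Qwen 27B in the commit message.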