fix(extension): UTF-8 encoding + noise filter enhancement (v0.5.39)
- http-bridge.ts: add req.setEncoding('utf8') to all POST handlers
to fix Korean text corruption in pending/chat/dump payloads
- observer-script.ts: add inline pre-strip in cleanLines() for
Material icon names concatenated by textContent without newlines
- observer-script.ts: apply cleanLines() to codeText extraction
- known-issues: document UTF-8 encoding and noise filter issues
@@ -4,12 +4,30 @@
# Known Issues & Lessons Learned

> **This file is the SSOT (Single Source of Truth).**
> Always check this file before debugging or implementing.
> At the end of each session, add any newly discovered issues to this file.

> [!TIP]
> Resolved past issues are kept in [`known-issues-archive.md`](file:///c:/Users/Variet-Worker/Desktop/gravity_control/.agents/references/known-issues-archive.md).
> If a similar problem recurs, search the archive.

---

### [2026-04-13] [Extension] HTTP Bridge UTF-8 encoding corruption: Korean description lost
- **Symptom**: Korean text in the description field of pending/ files is saved corrupted, e.g. `[AI ]`. The body of the approval request forwarded to Discord is corrupted as well.
- **Cause**: In the Node.js HTTP server, the `req.on('data', chunk)` callback receives each chunk as a Buffer. Concatenating with `body += chunk` converts every chunk to a string on its own (an implicit `toString()`, which decodes as UTF-8 rather than latin1), so a multi-byte UTF-8 character that is split across a chunk boundary is corrupted.
- **Fix**: Added `req.setEncoding('utf8')` to all POST handlers (`/pending`, `/dump-html`, `/chat`, `/deep-inspect-result`, `/test-rpc`) (v0.5.39).
- **Note**: When collecting a POST body as a string in a Node.js HTTP server, always call `req.setEncoding('utf8')`, or collect the Buffers in an array and convert them with `Buffer.concat().toString('utf8')`.

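The failure mode in this entry can be reproduced without an HTTP server. The sketch below is illustrative and is not code from `http-bridge.ts`: it splits a UTF-8 payload inside a multi-byte character (the offset 5 is an assumption chosen for the demo) and compares naive concatenation with the two approaches named in the note. `StringDecoder` stands in here for the decoding that `req.setEncoding('utf8')` enables on a stream.

```javascript
const { StringDecoder } = require('node:string_decoder');

const text = '[AI 본문 요약]';
const payload = Buffer.from(text, 'utf8');

// Simulate two network chunks; offset 5 falls inside the 3-byte character '본'.
const chunks = [payload.subarray(0, 5), payload.subarray(5)];

// Naive: each Buffer is stringified on its own, so the split character is damaged.
let naive = '';
for (const c of chunks) naive += c;

// Streaming decode: an incomplete byte sequence is held back until the next chunk.
const decoder = new StringDecoder('utf8');
let decoded = '';
for (const c of chunks) decoded += decoder.write(c);
decoded += decoder.end();

// Collect-then-decode: join the raw bytes first, then decode once.
const joined = Buffer.concat(chunks).toString('utf8');

console.log({
  naiveOk: naive === text,
  decodedOk: decoded === text,
  joinedOk: joined === text,
});
```

Only the naive variant loses data, and only when a chunk boundary lands inside a character, which may explain why short ASCII-only payloads never showed the problem.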
### [2026-04-13] [Extension] Observer noise filter not working: textContent joins icon text without newlines
- **Symptom**: Material icon names and UI noise such as `Thought for 1s` and `chevron_right` remain in the pending description.
- **Cause**: DOM `textContent` does not insert newlines between block elements, so the text collapses into a single line such as `[AI 본문 요약]Thought for 1schevron_right[결행 명령]`. The line-based noise filter in `cleanLines()` (`^pattern$`) then fails to match. In addition, `cleanLines()` was not applied to the `codeText` extraction at all.

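The cause described above can be illustrated with plain strings, no DOM required. This is a minimal sketch under stated assumptions: `NOISE_RE` is a shortened, hypothetical stand-in for the real pattern, and `filterNoiseLines` only imitates the line-based part of `cleanLines()`.

```javascript
// Hypothetical, shortened noise pattern; the real NOISE_RE lists many more entries.
const NOISE_RE = /^(chevron_right|Thought for \d+s?)$/i;

// Line-based filter in the spirit of cleanLines(): drop lines that are pure noise.
function filterNoiseLines(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line && !NOISE_RE.test(line))
    .join('\n');
}

// What innerText-style extraction would give: one item per line.
const separated = '[AI 본문 요약]\nThought for 1s\nchevron_right\n[결행 명령]';
// What textContent gives: everything fused into one line.
const concatenated = '[AI 본문 요약]Thought for 1schevron_right[결행 명령]';

console.log(JSON.stringify(filterNoiseLines(separated)));
console.log(JSON.stringify(filterNoiseLines(concatenated)));
```

With newlines present the two noise lines are dropped; the fused string passes through unchanged because no single line consists only of a noise token.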
@@ -2,4 +2,5 @@

| NNN | HH:MM | Task description | `commit hash` | Status |
|-------|-------|----------|-----------|----------|
| 001 | 09:50 | Observer v8 verification: Extension POLL normal, HTML patch confirmed, V8 cache deleted (24MB), BEACON not received (AG restart required) | none | 🔧 |
| 001 | 09:50 | Observer v8 verification: Extension POLL confirmed, HTML patch confirmed, V8 cache deleted (24MB), BEACON not received (AG restart required) | none | 🔧 |
| 002 | 12:34 | DOM Observer data quality verification + UTF-8 encoding fix + noise filter enhancement (v0.5.39) | `pending` | ✅ |

27
docs/devlog/entries/20260413-002.md
Normal file
@@ -0,0 +1,27 @@
# DOM Observer data quality verification + UTF-8/noise fixes

- **Time**: 2026-04-13 12:34~12:52
- **Commit**: `pending`
- **Vikunja**: #619, #620 (in progress)

## Verification results
- DOM Observer v8 **confirmed working**: 45 signals created in `pending/` (`source: "dom_observer"`)
- Button classification OK: `command`(30), `permission`(15)
- Extraction of command, conversation_id, and buttons (Allow/Deny/Cancel) OK
- **Korean encoding corruption** found: in the description field, `[AI 본문 요약]` → `[AI <20> <20>]`

## Changes (v0.5.39)

### http-bridge.ts
- Added `req.setEncoding('utf8')` to all POST handlers
- Fixes the loss of multi-byte UTF-8 characters during Buffer→string conversion in the Node.js HTTP server

### observer-script.ts
- Added an inline pre-strip to `cleanLines()`: 20 Material icon names are replaced with `\n` via regex
- Added inline removal of the `Thought for Xs` pattern
- Applied `cleanLines()` to the `codeText` extraction (previously not applied)

## Not yet done
- Verify v0.5.39 after restarting AG (confirm Korean text renders correctly)
- Verify DOM dump extraction
- Verify the Discord relay end to end (E2E)
@@ -2,7 +2,7 @@
"name": "gravity-bridge",
"displayName": "Gravity Bridge",
"description": "Discord-based unified approval system for Antigravity AI interactions.",
"version": "0.5.38",
"version": "0.5.39",
"publisher": "variet",
"engines": {
    "vscode": "^1.100.0"

@@ -128,6 +128,7 @@ export function startHttpBridge(ctx: HttpBridgeContext, sdk: any): Promise<numbe

if (req.method === 'POST' && url.pathname === '/dump-html') {
    let dumpBody = '';
    req.setEncoding('utf8');
    req.on('data', (c: string) => dumpBody += c);
    req.on('end', () => {
        try {
@@ -145,6 +146,7 @@ export function startHttpBridge(ctx: HttpBridgeContext, sdk: any): Promise<numbe

if (req.method === 'POST' && url.pathname === '/test-rpc') {
    let rpcBody = '';
    req.setEncoding('utf8');
    req.on('data', (c: string) => rpcBody += c);
    req.on('end', async () => {
        try {
@@ -248,6 +250,7 @@ export function startHttpBridge(ctx: HttpBridgeContext, sdk: any): Promise<numbe

function _handlePending(req: any, res: any, ctx: HttpBridgeContext) {
    let body = '';
    req.setEncoding('utf8');
    req.on('data', (c: string) => body += c);
    req.on('end', () => {
        try {
@@ -398,6 +401,7 @@ function _handleDeepInspectTrigger(res: any) {

function _handleDeepInspectResult(req: any, res: any, ctx: HttpBridgeContext) {
    let body = '';
    req.setEncoding('utf8');
    req.on('data', (c: string) => body += c);
    req.on('end', () => {
        try {
@@ -420,6 +424,7 @@ function _handleDeepInspectResult(req: any, res: any, ctx: HttpBridgeContext) {

function _handleChatSnapshot(req: any, res: any, ctx: HttpBridgeContext) {
    let body = '';
    req.setEncoding('utf8');
    req.on('data', (c: string) => body += c);
    req.on('end', () => {
        try {

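The same three lines now appear in five handlers. One possible consolidation, shown only as a sketch, is a shared body reader. `readUtf8Body` is a hypothetical helper that does not exist in `http-bridge.ts`; it uses the collect-then-decode alternative from the known-issues note rather than `setEncoding`, and the `Readable.from` stream is a stand-in for a real request.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper, not part of http-bridge.ts:
// collects the raw chunks and decodes them once at the end.
function readUtf8Body(req) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    req.on('data', (c) => chunks.push(Buffer.isBuffer(c) ? c : Buffer.from(c, 'utf8')));
    req.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
    req.on('error', reject);
  });
}

// Stand-in for an incoming request whose body arrives in two chunks,
// with the boundary placed inside a multi-byte character.
const bytes = Buffer.from('{"description":"[AI 본문 요약]"}', 'utf8');
const fakeReq = Readable.from([bytes.subarray(0, 22), bytes.subarray(22)]);

const bodyPromise = readUtf8Body(fakeReq);
bodyPromise.then((body) => console.log(JSON.parse(body).description));
```

A single helper would also keep a future handler from silently reintroducing the bug, at the cost of converting the callback-style handlers to promises.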
@@ -40,7 +40,7 @@ export function generateApprovalObserverScript(_port: number): string {
'arrow_forward|arrow_back|expand_more|expand_less|close|more_horiz|more_vert|' +
'content_copy|content_paste|check|check_circle|error|warning|info|' +
'keyboard_arrow_up|keyboard_arrow_down|keyboard_arrow_left|keyboard_arrow_right|' +
'Thought for \\\\d+|Show more|Show less|Copy|Copied!|Edit|Cancel|' +
'Thought for \\\\d+s?|Thought for a few seconds|Show more|Show less|Copy|Copied!|Edit|Cancel|' +
'Always run|Always allow|Running command|Running \\\\d+ commands?|' +
'Deny|Allow|Allow Once|Allow This Conversation|' +
'Run|Send|Stop|Review Changes|Accept all|Reject all|Accept|Reject' +
@@ -60,6 +60,10 @@ export function generateApprovalObserverScript(_port: number): string {

function cleanLines(text) {
    if (!text) return '';
    // Pre-strip: inline removal of icon names and UI noise that textContent concatenates without newlines
    text = text.replace(/\\b(chevron_right|chevron_left|arrow_drop_down|arrow_drop_up|arrow_right|arrow_left|arrow_forward|arrow_back|expand_more|expand_less|more_horiz|more_vert|content_copy|content_paste|keyboard_arrow_up|keyboard_arrow_down|keyboard_arrow_left|keyboard_arrow_right|slow_motion_video|open_in_new)\\b/g, '\\n');
    text = text.replace(/Thought for \\d+s?/gi, '');
    text = text.replace(/Thought for a few seconds/gi, '');
    var lines = text.split('\\n');
    var clean = [];
    for (var i = 0; i < lines.length; i++) {
@@ -130,7 +134,7 @@ export function generateApprovalObserverScript(_port: number): string {
var codeEl = stepEl.querySelector('pre, code');
var codeText = '';
if (codeEl) {
    codeText = (codeEl.textContent || '').trim().substring(0, 400);
    codeText = cleanLines((codeEl.textContent || '').trim().substring(0, 400));
}

// Try aria-label on button

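One detail may be worth checking when v0.5.39 is verified. In JavaScript, `\b` only matches between a word character and a non-word character, so in `Thought for 1schevron_right` there is no boundary between `s` and `c`. Assuming the generated script runs the replacements in the order shown in the `cleanLines()` hunk, the icon pre-strip would skip that occurrence, and the later `Thought for` removal would leave `chevron_right` behind. The sketch below uses a shortened, hypothetical icon list; `preStripThoughtFirst` is a suggested ordering, not the committed code.

```javascript
// Shortened stand-ins for the two pre-strip patterns (only two icon names listed).
const ICON_RE = /\b(chevron_right|content_copy)\b/g;
const THOUGHT_RE = /Thought for \d+s?/gi;

// Order as in the committed hunk: icons first, then "Thought for Ns".
function preStripAsCommitted(text) {
  return text.replace(ICON_RE, '\n').replace(THOUGHT_RE, '');
}

// Hypothetical alternative: remove "Thought for Ns" first so the icon name
// is no longer glued to the trailing "s".
function preStripThoughtFirst(text) {
  return text.replace(THOUGHT_RE, '').replace(ICON_RE, '\n');
}

const sample = '[AI summary]Thought for 1schevron_right[command]';

console.log(JSON.stringify(preStripAsCommitted(sample)));
console.log(JSON.stringify(preStripThoughtFirst(sample)));
```

If the real DOM always puts a non-word character between the two tokens, the committed order is fine; the sample string from the known-issues entry suggests it does not.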
94
scratch_patch_verify.js
Normal file
@@ -0,0 +1,94 @@
/**
 * Verification script for the html-patcher fix.
 * Simulates the patch using the real workbench.html and the real observer-script output.
 */
const fs = require('fs');
const path = require('path');

// 1. Read the real, clean workbench.html
const htmlPath = path.join(
    process.env.LOCALAPPDATA,
    'Programs', 'Antigravity', 'resources', 'app', 'out',
    'vs', 'code', 'electron-browser', 'workbench', 'workbench.html'
);
let html = fs.readFileSync(htmlPath, 'utf8');
console.log(`[1] Clean HTML: ${html.length} chars, ${html.split('\n').length} lines`);
console.log(` Has AG SDK: ${html.includes('AG SDK')}`);

// 2. Simulate the output of the real observer-script.ts (generateApprovalObserverScript)
const observerModule = require('./extension/out/observer-script');
const observerJS = observerModule.generateApprovalObserverScript(34332);
console.log(`[2] Observer JS: ${observerJS.length} chars`);
console.log(` Contains $': ${observerJS.includes("$'")}`);
console.log(` Contains ')$': ${observerJS.includes("')$")}`);

// 3. Patch simulation: before the fix (BUG)
const inlineBlock_buggy = `<!-- AG SDK INLINE [variet-gravity-bridge] -->\n<script>\n${observerJS}\n</script>\n<!-- /AG SDK INLINE [variet-gravity-bridge] -->`;
let html_buggy = html.replace('</body>', `\n${inlineBlock_buggy}\n</body>`);

// 4. Patch simulation: after the fix (FIX)
const inlineBlock = `<!-- AG SDK INLINE [variet-gravity-bridge] -->\n<script>\n${observerJS}\n</script>\n<!-- /AG SDK INLINE [variet-gravity-bridge] -->`;
const safeInlineBlock = inlineBlock.replace(/\$/g, '$$$$');
let html_fixed = html.replace('</body>', `\n${safeInlineBlock}\n</body>`);

console.log(`\n[3] BUGGY result: ${html_buggy.length} chars`);
console.log(`[4] FIXED result: ${html_fixed.length} chars`);

// 5. Extract the JS code and check it for a SyntaxError
function extractAndCheckJS(patchedHtml, label) {
    const match = patchedHtml.match(/<script>\n([\s\S]*?)\n<\/script>/);
    if (!match) {
        console.log(`[${label}] ERROR: <script> block not found!`);
        return false;
    }
    const jsCode = match[1];

    // Check if original HTML structure leaked into JS
    const hasStartupComment = jsCode.includes('<!-- Startup');
    const hasWorkbenchJS = jsCode.includes('<script src="./workbench.js"');
    const hasClosingHtml = jsCode.includes('</html>') && !jsCode.includes("'</html>'");

    console.log(`[${label}] JS code: ${jsCode.length} chars`);
    console.log(` Leaked <!-- Startup -->: ${hasStartupComment} ${hasStartupComment ? '❌ CORRUPT' : '✅ OK'}`);
    console.log(` Leaked <script src=workbench.js>: ${hasWorkbenchJS} ${hasWorkbenchJS ? '❌ CORRUPT' : '✅ OK'}`);
    console.log(` Leaked </html>: ${hasClosingHtml} ${hasClosingHtml ? '❌ CORRUPT' : '✅ OK'}`);

    // Check NOISE_RE is intact: should contain ')$', 'i'
    const hasNoiseRE = jsCode.includes("')$', 'i'");
    console.log(` NOISE_RE ')$', 'i' preserved: ${hasNoiseRE} ${hasNoiseRE ? '✅ OK' : '❌ BROKEN'}`);

    // Try to parse JS
    try {
        new Function(jsCode);
        console.log(` JS Syntax: ✅ VALID - no SyntaxError`);
        return true;
    } catch (e) {
        console.log(` JS Syntax: ❌ SyntaxError - ${e.message}`);
        // Find the problematic line
        const lines = jsCode.split('\n');
        const lineMatch = e.message.match(/line (\d+)/);
        if (lineMatch) {
            const lineNum = parseInt(lineMatch[1]);
            console.log(` Around line ${lineNum}:`);
            for (let i = Math.max(0, lineNum - 3); i < Math.min(lines.length, lineNum + 3); i++) {
                console.log(` ${i+1}: ${lines[i].substring(0, 100)}`);
            }
        }
        return false;
    }
}

console.log('\n===== BUGGY VERSION (before fix) =====');
const buggyOK = extractAndCheckJS(html_buggy, 'BUGGY');

console.log('\n===== FIXED VERSION (after fix) =====');
const fixedOK = extractAndCheckJS(html_fixed, 'FIXED');

console.log('\n===== VERDICT =====');
if (!buggyOK && fixedOK) {
    console.log('✅ FIX CONFIRMED: Buggy version has SyntaxError, fixed version is clean.');
} else if (buggyOK && fixedOK) {
    console.log('⚠️ Both versions work - the bug may not reproduce in this environment.');
} else if (!fixedOK) {
    console.log('❌ FIX FAILED: Fixed version still has errors!');
}
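The bug this script checks for comes from how `String.prototype.replace` treats `$` in a replacement string: `$'` expands to the text that follows the match, so a script containing `)$', 'i'` pulls the tail of the HTML into the middle of the JavaScript. The sketch below reproduces that with tiny stand-in strings (`html` and `snippet` are hypothetical, not the real workbench.html or observer output) and compares the `$$$$` escape used in the script with a replacer function, which is a possible alternative rather than what the patcher does.

```javascript
// Stand-ins for workbench.html and the generated observer script (both hypothetical).
const html = '<body><p>hi</p></body><!-- tail -->';
const snippet = "var NOISE_RE = new RegExp('^(a|b)$', 'i');";

// Plain string replacement: the "$'" inside the snippet expands to the text after the match.
const buggy = html.replace('</body>', snippet + '</body>');

// Fix used in the script: double every "$" so it survives as a literal "$".
const escaped = html.replace('</body>', snippet.replace(/\$/g, '$$$$') + '</body>');

// Alternative: the return value of a replacer function is inserted verbatim.
const viaFunction = html.replace('</body>', () => snippet + '</body>');

console.log(buggy);
console.log(escaped.includes(snippet), escaped === viaFunction);
```

The replacer-function form avoids the escaping step entirely, which may be easier to keep correct if other `$` sequences ever appear in the injected script.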