feat(observer): v15 AG Native chat relay — scanChatBodies dual strategy (#632)

- Add AG Native DOM path: #conversation + .leading-relaxed.select-text
- Keep Cascade path: [data-testid=conversation-view] + [data-step-index]
- Register #632 in known-issues.md (SDK+DOM both blocked for AG Native)
- Bump version 0.5.50 → 0.5.51
- Add DOM analysis helper scripts
Variet Worker
2026-04-16 05:28:44 +09:00
parent a00d561e28
commit 729875f3a6
7 changed files with 463 additions and 10 deletions


@@ -21,6 +21,15 @@
> If a similar problem recurs, search the archive.
### [2026-04-16] [Extension] ★ AG Native session AI responses are never delivered to Discord (unresolved, #632)
- **Symptom**: Only **command approval signals** reach Discord; the AI's conversational responses and answer text are never delivered. This has persisted across dozens of sessions.
- **Cause 1 (SDK path)**: `GetCascadeTrajectorySteps(cascadeId=<session ID>)` → `500 trajectory not found`. AG Native sessions are not registered with the Cascade trajectory API, so step-probe's RT-CAPTURE cannot run. `stepCount` is always 1 and `delta` is always 0, so the response capture loop is never entered.
- **Cause 2 (DOM path)**: Observer's `scanChatBodies()` searches for `[data-testid="conversation-view"]`, `[data-step-index]`, `.markdown-body`, `.prose`, and similar selectors, but **none of these selectors exist in the AG Native renderer** (confirmed by DOM dump: hasConversationView=false, hasStepIndex=false, hasMarkdownBody=false, hasProse=false, dataTestIds=[])
- **Result**: Button detection (scan→/pending) searches `button` tags directly and works normally, but the AI response text extraction path is structurally blocked on both the SDK and DOM sides
- **Resolution direction**: Analyze the actual DOM structure of the AG Native chat panel with deep-inspect, find the correct selector (class/attribute) for the AI response container, and update `scanChatBodies()` accordingly. The SDK path is unusable because of a structural limitation in AG.
- **Caution**: The AG Native renderer does not use Cascade-specific attributes such as `data-testid` or `data-step-index`. For DOM analysis, the dump must be taken while the AG panel is active (do not confuse it with the settings page)
- **Vikunja**: #632
### [2026-04-16] [Extension] Terminal output (stdout) text sent to Discord as a command (v0.5.50)
- **Symptom**: Terminal **output** text such as `cmd="No extension.log found"`, `cmd="AG CLI not found..."`, and `cmd="Log found: C:\..."` is sent to Discord as commands
- **Cause**: Observer detects two code blocks: (1) prompt + command → skipped by JUNK_CODE_RE, (2) terminal output → judged to be valid code → included in the description. In http-bridge enrichment, if the description has no prompt marker (`>`), the entire rawDesc is adopted as enrichedCmd
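
The selector check recorded in the #632 entry can be reproduced offline against a dump. The sketch below is a hypothetical helper, not part of this commit. It assumes the dump uses the `tag`/`cls`/`attrs`/`children` node shape that the helper scripts in this change read, and it further assumes that element ids and `data-*` attributes are stored under `attrs`, which the source does not confirm. The names `walk`, `selector_flags`, and `sample` are illustrative.

```python
def walk(node):
    """Yield every dict node in a dump tree, depth-first."""
    if not isinstance(node, dict):
        return
    yield node
    for child in node.get('children', []):
        yield from walk(child)


def selector_flags(body):
    """Report which chat selectors appear in a dump tree.

    Hypothetical helper: assumes ids and data-* attributes live in 'attrs'.
    """
    flags = {
        'cascade_conversation_view': False,  # [data-testid="conversation-view"]
        'cascade_step_index': False,         # [data-step-index]
        'ag_conversation': False,            # #conversation
        'ag_response_block': False,          # .leading-relaxed.select-text
    }
    for node in walk(body):
        attrs = node.get('attrs') or {}
        classes = set((node.get('cls') or '').split())
        if attrs.get('data-testid') == 'conversation-view':
            flags['cascade_conversation_view'] = True
        if 'data-step-index' in attrs:
            flags['cascade_step_index'] = True
        if attrs.get('id') == 'conversation':
            flags['ag_conversation'] = True
        if {'leading-relaxed', 'select-text'} <= classes:
            flags['ag_response_block'] = True
    return flags


# Toy tree shaped like an AG Native panel (illustrative data, not a real dump).
sample = {
    'tag': 'body', 'cls': '', 'attrs': {}, 'children': [
        {'tag': 'div', 'cls': '', 'attrs': {'id': 'conversation'}, 'children': [
            {'tag': 'div', 'cls': 'select-text text-sm leading-relaxed',
             'attrs': {}, 'text': 'Hello', 'children': []},
        ]},
    ],
}
print(selector_flags(sample))
```

A dump taken from the settings page rather than the active AG panel would report every flag as false, which is the confusion the caution in the entry warns about.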


@@ -0,0 +1,160 @@
"""Analyze AG Native DOM structure to find AI response containers."""
import json, os, sys


def load_dump():
    bridge = os.path.join(os.path.expanduser('~'), '.gemini', 'antigravity', 'bridge')
    # Try deep-inspect result first, then dump_html
    for fname in ['deep-inspect-result.json', 'dump_html.json']:
        fpath = os.path.join(bridge, fname)
        if os.path.exists(fpath):
            print(f"Loading: {fname} ({os.path.getsize(fpath)} bytes)")
            with open(fpath, 'r', encoding='utf-8-sig') as f:
                return json.load(f), fname
    return None, None


def find_text_containers(node, path="", depth=0, results=None):
    """Recursively find nodes with substantial text content (potential AI response containers)."""
    if results is None:
        results = []
    if not isinstance(node, dict):
        return results
    tag = node.get('tag', '')
    cls = node.get('cls', '')
    text = node.get('text', '')
    attrs = node.get('attrs', {})
    children = node.get('children', [])
    cur_path = f"{path}/{tag}"
    if cls:
        short_cls = cls[:60]
        cur_path += f".{short_cls}"
    # Look for nodes with long text (potential AI responses)
    if text and len(text) > 50:
        results.append({
            'path': cur_path,
            'depth': depth,
            'tag': tag,
            'cls': cls[:100],
            'text_len': len(text),
            'text_preview': text[:120],
            'attrs': {k:v for k,v in attrs.items() if k not in ('style',)}
        })
    for child in children:
        find_text_containers(child, cur_path, depth+1, results)
    return results


def find_by_class_pattern(node, patterns, path="", depth=0, results=None):
    """Find nodes matching class patterns."""
    if results is None:
        results = []
    if not isinstance(node, dict):
        return results
    tag = node.get('tag', '')
    cls = node.get('cls', '')
    attrs = node.get('attrs', {})
    children = node.get('children', [])
    text = node.get('text', '')
    cur_path = f"{path}/{tag}"
    for pattern in patterns:
        if pattern.lower() in cls.lower() or pattern.lower() in str(attrs).lower():
            child_count = len(children)
            results.append({
                'path': cur_path,
                'depth': depth,
                'tag': tag,
                'cls': cls[:150],
                'pattern': pattern,
                'text_preview': text[:80] if text else '',
                'child_count': child_count,
                'attrs': {k:v[:50] for k,v in attrs.items() if k != 'style'}
            })
    for child in children:
        find_by_class_pattern(child, patterns, cur_path, depth+1, results)
    return results


def analyze_chat_structure(node, path="", depth=0):
    """Find the chat/conversation area by looking at the main layout."""
    if not isinstance(node, dict):
        return
    tag = node.get('tag', '')
    cls = node.get('cls', '')
    children = node.get('children', [])
    text = node.get('text', '')
    attrs = node.get('attrs', {})
    # Print interesting structural nodes at shallow depths
    if depth <= 6:
        child_count = len(children)
        has_text = bool(text and len(text) > 10)
        info = f"{' '*depth}{tag}"
        if cls:
            info += f" .{cls[:80]}"
        if attrs:
            attr_str = ' '.join(f'{k}={v[:30]}' for k,v in attrs.items() if k not in ('style','class'))
            if attr_str:
                info += f" [{attr_str}]"
        info += f" children={child_count}"
        if has_text:
            info += f" text=\"{text[:50]}...\""
        print(info)
    for child in children:
        analyze_chat_structure(child, f"{path}/{tag}", depth+1)


data, fname = load_dump()
if not data:
    print("No dump file found!")
    sys.exit(1)

# Handle both dump formats
body = data.get('body', data)
qi = data.get('quickInfo', {})

print("=" * 60)
print("QUICK INFO")
print("=" * 60)
if qi:
    for k, v in qi.items():
        if k == 'buttons':
            print(f"buttons ({len(v)}):")
            for b in v[:15]:
                print(f" [{b.get('tag')}] \"{b.get('text','')[:50]}\" visible={b.get('visible')} cls={b.get('cls','')[:60]}")
        elif k == 'dataAttrs':
            print(f"dataAttrs: {v[:30]}")
        else:
            print(f"{k}: {v}")

print("\n" + "=" * 60)
print("CHAT-RELATED CLASS PATTERNS")
print("=" * 60)
patterns = ['chat', 'message', 'conversation', 'response', 'answer', 'reply',
            'markdown', 'prose', 'content', 'panel', 'agent', 'assistant',
            'planner', 'step', 'trajectory', 'bot', 'ai-', 'turn']
matches = find_by_class_pattern(body, patterns)
for m in matches:
    print(f" [{m['tag']}] cls=\"{m['cls']}\" pattern={m['pattern']} children={m['child_count']} {m.get('attrs',{})}")

print("\n" + "=" * 60)
print("LONG TEXT NODES (potential AI responses)")
print("=" * 60)
texts = find_text_containers(body)
texts.sort(key=lambda x: x['text_len'], reverse=True)
for t in texts[:20]:
    print(f" [{t['tag']}] depth={t['depth']} len={t['text_len']} cls=\"{t['cls'][:60]}\"")
    print(f" text: \"{t['text_preview']}\"")
    if t['attrs']:
        print(f" attrs: {t['attrs']}")

print("\n" + "=" * 60)
print("DOM TREE (depth<=6)")
print("=" * 60)
analyze_chat_structure(body)


@@ -0,0 +1,19 @@
import json, os, sys

dump_path = os.path.join(os.path.expanduser('~'), '.gemini', 'antigravity', 'bridge', 'dump_html.json')
with open(dump_path, 'r', encoding='utf-8') as f:
    data = json.load(f)

qi = data.get('quickInfo', {})
print('=== Quick Info ===')
print('hasConversationView:', qi.get('hasConversationView'))
print('hasStepIndex:', qi.get('hasStepIndex'))
print('hasBotColor:', qi.get('hasBotColor'))
print('hasMarkdownBody:', qi.get('hasMarkdownBody'))
print('hasProse:', qi.get('hasProse'))
print('totalElements:', qi.get('totalElements'))
print('dataTestIds:', qi.get('dataTestIds'))
print('dataAttrs (first 20):', qi.get('dataAttrs', [])[:20])
print('buttons (first 10):')
for b in qi.get('buttons', [])[:10]:
    print(f" [{b.get('tag')}] {b.get('text', '')[:60]} visible={b.get('visible')}")


@@ -0,0 +1,83 @@
"""Search AG Native DOM dump for chat content and buttons."""
import json, os

fpath = os.path.join(os.path.expanduser('~'), '.gemini', 'antigravity', 'bridge', 'dump_html_5.json')
with open(fpath, 'r', encoding='utf-8-sig') as f:
    data = json.load(f)

body = data.get('body', data.get('bodyTree', {}))
qi = data.get('quickInfo', {})

# Show all buttons
print('=== BUTTONS ===')
for b in qi.get('buttons', []):
    print(f' [{b["tag"]}] "{b["text"][:60]}" visible={b["visible"]} cls={b.get("cls","")[:80]}')

# Data attrs
print('\n=== DATA ATTRS ===')
for attr in qi.get('dataAttrs', []):
    print(f' {attr}')


# Recursive search for nodes by text
def find_nodes_by_text(node, target, path='', results=None, depth=0):
    if results is None: results = []
    if not isinstance(node, dict): return results
    tag = node.get('tag','')
    cls = node.get('cls','')
    text = node.get('text','')
    children = node.get('children', [])
    cur = f'{path}/{tag}'
    if target.lower() in text.lower():
        results.append({'path': cur, 'depth': depth, 'cls': cls[:80], 'text': text[:80], 'children': len(children)})
    for c in children:
        find_nodes_by_text(c, target, cur, results, depth+1)
    return results


print('\n=== NODES containing "Always run" ===')
matches = find_nodes_by_text(body, 'Always run')
for m in matches:
    print(f' depth={m["depth"]} cls="{m["cls"]}" text="{m["text"]}" children={m["children"]}')

print('\n=== NODES containing "Always" ===')
matches = find_nodes_by_text(body, 'Always')
for m in matches:
    print(f' depth={m["depth"]} cls="{m["cls"]}" text="{m["text"]}" children={m["children"]}')


# Find ALL text nodes with > 30 chars
def find_all_text(node, results=None, depth=0, path=''):
    if results is None: results = []
    if not isinstance(node, dict): return results
    tag = node.get('tag','')
    cls = node.get('cls','')
    text = node.get('text','')
    children = node.get('children', [])
    if text and len(text) > 30:
        results.append({'depth': depth, 'tag': tag, 'cls': cls[:80], 'text': text[:100], 'path': f'{path}/{tag}'})
    for c in children:
        find_all_text(c, results, depth+1, f'{path}/{tag}')
    return results


print('\n=== LONG TEXT NODES (>30 chars) ===')
texts = find_all_text(body)
texts.sort(key=lambda x: len(x['text']), reverse=True)
for t in texts[:25]:
    print(f' d={t["depth"]} [{t["tag"]}] cls="{t["cls"][:50]}" len={len(t["text"])} "{t["text"][:80]}"')


# Find nodes with many children (structural containers)
def find_containers(node, results=None, depth=0, path=''):
    if results is None: results = []
    if not isinstance(node, dict): return results
    tag = node.get('tag','')
    cls = node.get('cls','')
    children = node.get('children', [])
    if len(children) > 5:
        results.append({'depth': depth, 'tag': tag, 'cls': cls[:100], 'children': len(children), 'path': f'{path}/{tag}'})
    for c in children:
        find_containers(c, results, depth+1, f'{path}/{tag}')
    return results


print('\n=== CONTAINERS (>5 children) ===')
conts = find_containers(body)
conts.sort(key=lambda x: x['children'], reverse=True)
for c in conts[:20]:
    print(f' d={c["depth"]} [{c["tag"]}] children={c["children"]} cls="{c["cls"][:70]}"')


@@ -0,0 +1,109 @@
"""Trace the DOM path from body to AI response container."""
import json, os

fpath = os.path.join(os.path.expanduser('~'), '.gemini', 'antigravity', 'bridge', 'dump_html_5.json')
with open(fpath, 'r', encoding='utf-8-sig') as f:
    data = json.load(f)

body = data.get('body', data.get('bodyTree', {}))


def find_path_to_class(node, target_cls, path=None, depth=0):
    """Find the DOM path down to a node with a matching class."""
    if path is None: path = []
    if not isinstance(node, dict): return []
    tag = node.get('tag', '')
    cls = node.get('cls', '')
    children = node.get('children', [])
    text = node.get('text', '')
    attrs = node.get('attrs', {})
    entry = {
        'depth': depth,
        'tag': tag,
        'cls': cls[:120],
        'children': len(children),
        'text': text[:60] if text else '',
        'attrs': {k:v[:40] for k,v in attrs.items() if k not in ('style',)}
    }
    if target_cls.lower() in cls.lower():
        return path + [entry]
    for i, child in enumerate(children):
        result = find_path_to_class(child, target_cls, path + [entry], depth+1)
        if result:
            return result
    return []


# Find path to the AI response container
print("=== PATH TO 'leading-relaxed select-text' ===")
path = find_path_to_class(body, 'leading-relaxed select-text')
for p in path:
    indent = ' ' * p['depth']
    print(f'{indent}[{p["tag"]}] cls="{p["cls"]}" children={p["children"]} {p["attrs"]}')
    if p['text']:
        print(f'{indent} text: "{p["text"]}"')


# Now get the full subtree of the AI response container
def get_subtree(node, target_cls, depth=0):
    if not isinstance(node, dict): return None
    cls = node.get('cls', '')
    if target_cls.lower() in cls.lower():
        return node
    for child in node.get('children', []):
        result = get_subtree(child, target_cls, depth+1)
        if result:
            return result
    return None


print("\n=== AI RESPONSE CONTAINER SUBTREE ===")
container = get_subtree(body, 'leading-relaxed select-text')
if container:
    def print_tree(node, depth=0, max_depth=4):
        if not isinstance(node, dict) or depth > max_depth: return
        tag = node.get('tag','')
        cls = node.get('cls','')[:80]
        text = node.get('text','')
        children = node.get('children', [])
        indent = ' ' * depth
        line = f'{indent}[{tag}]'
        if cls: line += f' cls="{cls}"'
        line += f' children={len(children)}'
        if text: line += f' text="{text[:60]}"'
        print(line)
        for c in children:
            print_tree(c, depth+1, max_depth)
    print_tree(container, 0, 3)

# Also search for the chat panel container - what wraps the entire conversation
print("\n=== SEARCH FOR CHAT PANEL WRAPPERS ===")
chat_patterns = ['chat', 'antigravity', 'gemini', 'panel', 'agentview', 'sidebar', 'conversation']
for pat in chat_patterns:
    path = find_path_to_class(body, pat)
    if path:
        last = path[-1]
        print(f' Pattern "{pat}" found at depth={last["depth"]} [{last["tag"]}] cls="{last["cls"]}" children={last["children"]}')

# Find the parent chain from body to the container - look by scanning ALL class names
print("\n=== ALL UNIQUE CLASS NAMES (depth <= 12) ===")
all_classes = set()


def collect_classes(node, depth=0, max_depth=12):
    if not isinstance(node, dict) or depth > max_depth: return
    cls = node.get('cls', '')
    if cls:
        for c in cls.split():
            if len(c) > 3 and not c.startswith('{') and 'mtk' not in c:
                all_classes.add(c)
    for child in node.get('children', []):
        collect_classes(child, depth+1, max_depth)


collect_classes(body)

# Print classes sorted, grouped by potential relevance
relevant = sorted([c for c in all_classes if any(k in c.lower() for k in
    ['chat', 'message', 'response', 'agent', 'gemini', 'turn', 'model', 'user', 'bot', 'conversation', 'markdown', 'prose', 'text-', 'content'])])
print("Relevant classes:")
for c in relevant:
    print(f' {c}')
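
One caveat when comparing this trace script with the observer's CSS selector: the script matches `'leading-relaxed select-text'` as a substring of the class string, which only succeeds when the two classes are adjacent and in that order, whereas `.leading-relaxed.select-text` matches them in any order. The sketch below illustrates the difference with a hypothetical helper (`matches_compound_class` is not part of this commit) and an illustrative class string.

```python
def matches_compound_class(cls, selector):
    """Return True if `cls` contains every class named in a compound selector
    such as '.leading-relaxed.select-text', regardless of order."""
    wanted = {part for part in selector.split('.') if part}
    return wanted <= set(cls.split())


cls = 'select-text text-sm leading-relaxed'  # illustrative class string

# Substring test, as used by the trace script
print('leading-relaxed select-text' in cls)  # False

# Token test, closer to CSS compound-selector semantics
print(matches_compound_class(cls, '.leading-relaxed.select-text'))  # True
```

If a dump ever lists the classes in a different order, the token test would still find the container where the substring test reports nothing.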


@@ -2,7 +2,7 @@
"name": "gravity-bridge",
"displayName": "Gravity Bridge",
"description": "Discord-based unified approval system for Antigravity AI interactions.",
-  "version": "0.5.50",
+  "version": "0.5.51",
"publisher": "variet",
"engines": {
"vscode": "^1.100.0"


@@ -1,7 +1,7 @@
export function generateApprovalObserverScript(_port: number): string {
return `
-// ── Gravity Bridge v14: Strict Scope + Junk Filter ──
-// v14: Strict 5-level DOM scope, CSS/source code/icon-glue filters, no fallback
+// ── Gravity Bridge v15: AG Native Chat Relay ──
+// v15: AG Native #conversation + .leading-relaxed.select-text chat body scanning
(function(){
'use strict';
var BASE='',_obs=false,_sent={},_ready=false;
@@ -10,7 +10,7 @@ export function generateApprovalObserverScript(_port: number): string {
var CLEANUP_MS=300000;
function log(m){console.log('[GB Observer] '+m);}
-log('v14 Script loaded — Strict Scope + Junk Filter');
+log('v15 Script loaded — AG Native Chat Relay');
// DIAGNOSTIC BEACON: immediate POST to confirm script execution in renderer
try {
@@ -460,15 +460,16 @@ export function generateApprovalObserverScript(_port: number): string {
}
// ══════════════════════════════════════════════════════════════════
-// v7: STEP-AWARE CHAT BODY SCANNING
-// Scans [data-step-index] elements inside [data-testid="conversation-view"]
-// Extracts AI response text while filtering UI noise
+// v15: AG-NATIVE + CASCADE DUAL CHAT BODY SCANNING
+// AG Native: #conversation > ... > .leading-relaxed.select-text
+// Cascade: [data-testid="conversation-view"] > [data-step-index]
// ══════════════════════════════════════════════════════════════════
var _lastScrapedStepIndex = -1;
var _lastStepText = '';
var _lastStepTextTime = 0;
var _lastStepTextSent = false;
+var _lastResponseBlockCount = 0; // track number of response blocks for AG Native
function extractCleanStepText(stepEl) {
if (!stepEl) return '';
@@ -495,7 +496,7 @@ export function generateApprovalObserverScript(_port: number): string {
}
// Try to get text from markdown rendering area first
-// Look for known markdown container patterns
+// AG Native uses .leading-relaxed.select-text, Cascade uses .markdown-body/.prose
var mdEl = clone.querySelector('.markdown-body, .prose, [class*="markdown"], [class*="rendered"]');
var rawText = '';
if (mdEl && mdEl.innerText && mdEl.innerText.trim().length > 10) {
@@ -515,8 +516,80 @@ export function generateApprovalObserverScript(_port: number): string {
// One-time DOM dump
dumpDOMStructure();
-// PRIMARY: Find conversation-view container
-var cv = document.querySelector('[data-testid="conversation-view"]');
+// ── STRATEGY 1: AG Native — #conversation or .antigravity-agent-side-panel ──
+var cv = document.querySelector('#conversation');
+if (!cv) {
+  cv = document.querySelector('.antigravity-agent-side-panel');
+}
+if (cv) {
+  // AG Native path: find AI response blocks by class pattern
+  // DOM structure: #conversation > ... > .leading-relaxed.select-text (AI response text)
+  var responseBlocks = cv.querySelectorAll('.leading-relaxed.select-text');
+  if (responseBlocks.length > 0) {
+    // Process the LAST (most recent) response block
+    var lastBlock = responseBlocks[responseBlocks.length - 1];
+    // Skip if already scraped
+    if (lastBlock.dataset.agChatScraped === 'true' || lastBlock.dataset.agChatScraped === 'pending') {
+      // Check for NEW blocks since last scrape
+      if (responseBlocks.length > _lastResponseBlockCount) {
+        // New block appeared — process it
+        for (var rbi = responseBlocks.length - 1; rbi >= 0; rbi--) {
+          if (responseBlocks[rbi].dataset.agChatScraped !== 'true' && responseBlocks[rbi].dataset.agChatScraped !== 'pending') {
+            lastBlock = responseBlocks[rbi];
+            break;
+          }
+        }
+        if (lastBlock.dataset.agChatScraped === 'true' || lastBlock.dataset.agChatScraped === 'pending') return;
+      } else {
+        return; // Already scraped, no new blocks
+      }
+    }
+    var blockText = extractCleanStepText(lastBlock);
+    if (blockText && blockText.length > 30) {
+      // QUALITY CHECK: Skip if the text is mostly short lines (UI artifacts)
+      var lines = blockText.split('\\n').filter(function(l) { return l.trim().length > 0; });
+      var longLines = lines.filter(function(l) { return l.trim().length > 20; });
+      if (longLines.length === 0) {
+        log('AG-Native: skipped (no long lines, likely UI noise)');
+        return;
+      }
+      // Wait for content to stabilize (3s no change)
+      if (_lastStepText !== blockText) {
+        _lastStepText = blockText;
+        _lastStepTextTime = Date.now();
+        _lastStepTextSent = false;
+        return; // Wait for next scan cycle
+      }
+      if (_lastStepTextSent) return;
+      if (Date.now() - _lastStepTextTime < 3000) return; // Still waiting
+      // Content is stable — send it
+      _lastStepTextSent = true;
+      _lastResponseBlockCount = responseBlocks.length;
+      lastBlock.dataset.agChatScraped = 'pending';
+      log('AG-Native chat relay: blocks=' + responseBlocks.length + ' text=' + blockText.length + ' chars');
+      (function(el, txt, count) {
+        fetch(BASE + '/chat', {
+          method: 'POST',
+          headers: { 'Content-Type': 'application/json' },
+          body: JSON.stringify({ text: txt, source: 'ag_native_block_' + count, block_index: count })
+        }).then(function() { el.dataset.agChatScraped = 'true'; log('AG-Native chat sent OK'); })
+          .catch(function(e) { el.dataset.agChatScraped = 'false'; log('AG-Native chat send error: ' + e.message); });
+      })(lastBlock, blockText, responseBlocks.length);
+    }
+    return; // AG Native path handled — don't fall through to Cascade path
+  }
+}
+// ── STRATEGY 2: Cascade — [data-testid="conversation-view"] ──
+cv = document.querySelector('[data-testid="conversation-view"]');
if (!cv) {
// FALLBACK: Try older selectors
cv = document.querySelector('[class*="conversation"], [class*="chat-container"]');
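
The AG Native path above sends a block only after two gates pass: the text must contain at least one line longer than 20 characters, and it must have stayed unchanged for 3 seconds across scan cycles. The following is a minimal Python model of those two rules for reasoning about the timing, not the extension code itself. `StableTextGate`, `has_long_line`, and the injected `clock` parameter are hypothetical names introduced here so the behavior can be exercised without a browser.

```python
import time


class StableTextGate:
    """Model of the 'send once the text stops changing' rule."""

    def __init__(self, quiet_seconds=3.0, clock=time.monotonic):
        self.quiet_seconds = quiet_seconds
        self.clock = clock          # injectable so tests can control time
        self.last_text = None
        self.last_change = 0.0
        self.sent = False

    def should_send(self, text):
        """Call once per scan cycle; True at most once per stable text."""
        now = self.clock()
        if text != self.last_text:
            # Text changed: restart the quiet period and wait for the next cycle.
            self.last_text = text
            self.last_change = now
            self.sent = False
            return False
        if self.sent or now - self.last_change < self.quiet_seconds:
            return False
        self.sent = True
        return True


def has_long_line(text, min_len=20):
    """Quality check: at least one line whose trimmed length exceeds min_len."""
    return any(len(line.strip()) > min_len for line in text.split('\n'))


# Simulated scan cycles with a fake clock (times in seconds).
fake_now = [0.0]
gate = StableTextGate(clock=lambda: fake_now[0])
results = []
for t, text in [(0, 'partial'), (1, 'partial answer'), (2, 'partial answer'),
                (4.5, 'partial answer'), (6, 'partial answer')]:
    fake_now[0] = t
    results.append(gate.should_send(text))
print(results)  # [False, False, False, True, False]
```

In this model a response that is still streaming keeps resetting the quiet period, so nothing is sent until 3 seconds after the last change, and a later scan of the same text does not send it again.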