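Acceptance scenarios

Agent UI work is accepted on behavior, not on whether a component or documentation file exists. The scenarios below can be used for product QA, automated tests, or design review.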

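1. Send and first status

The user sends a prompt.
The UI optimistically creates the user message.
The runtime listener is registered before submit.
Once the runtime accepts the work, a runtime status appears before the first answer text.
Where supported, the composer exposes interrupt/cancel.

Pass condition: the user can tell the agent is still alive before the text stream starts.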
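The ordering in this scenario can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: `FakeRuntime`, `sendPrompt`, and the `SendEvt` shape are hypothetical names invented for the example.

```typescript
// Hypothetical event shape; a real runtime's events may differ.
type SendEvt =
  | { kind: "status"; label: string }
  | { kind: "text"; delta: string };

type SendListener = (evt: SendEvt) => void;

// Stand-in runtime that emits synchronously on submit, which is the
// case where a listener registered too late would miss the first status.
class FakeRuntime {
  private listeners: SendListener[] = [];

  subscribe(listener: SendListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  submit(prompt: string): void {
    this.emit({ kind: "status", label: "accepted" });
    this.emit({ kind: "text", delta: `echo: ${prompt}` });
  }

  private emit(evt: SendEvt): void {
    for (const l of this.listeners) l(evt);
  }
}

function sendPrompt(runtime: FakeRuntime, prompt: string): string[] {
  const log: string[] = [];
  log.push(`user:${prompt}`); // optimistic user message
  const unsubscribe = runtime.subscribe((evt) => {
    log.push(evt.kind === "status" ? `status:${evt.label}` : `text:${evt.delta}`);
  });
  runtime.submit(prompt); // submit only after the listener is in place
  unsubscribe();
  return log;
}
```

In an automated test, the returned log can be checked for the order user message, status, text.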

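2. Text/reasoning separation

The runtime emits reasoning/thinking content and final answer text.
Running reasoning renders as process content and stays visible while live; once complete, it is collapsed or summarized by default.
The final answer renders as clean message text.
After hydration, completed reasoning is not replayed as final answer text.

Pass condition: the final answer is not polluted by <think>, raw reasoning logs, or process status.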
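One way to keep the two streams apart is to route by part type rather than by parsing tags out of text. The sketch below assumes a hypothetical `StreamPart` shape in which the runtime already labels each part; `projectTurn` and `RenderedTurn` are illustrative names only.

```typescript
// Hypothetical part shape; assumes the runtime labels reasoning vs. text.
type StreamPart =
  | { type: "reasoning"; text: string }
  | { type: "text"; text: string };

interface RenderedTurn {
  process: string[]; // shown in the collapsible process area
  answer: string;    // clean final message text
}

function projectTurn(parts: StreamPart[]): RenderedTurn {
  const rendered: RenderedTurn = { process: [], answer: "" };
  for (const part of parts) {
    if (part.type === "reasoning") {
      rendered.process.push(part.text); // never concatenated into the answer
    } else {
      rendered.answer += part.text;
    }
  }
  return rendered;
}
```

Running the same projection over hydrated parts gives the same split, so completed reasoning cannot reappear as answer text after a reload.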

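3. Interleaved active turn

The runtime emits reasoning, tool, text, reasoning summary, and follow-up text in sequence.
The UI renders these parts interleaved, in event/part order.
Running tools/processes are expanded by default or show a live body.
The timeline does not re-expand, on the same screen, a fact that the inline process already shows.
When the turn completes, the process is archived as a timeline summary that is collapsed by default.

Pass condition: the user sees the live execution order, not a thinking area stacked at the top or a doubly nested process block.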
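A rough sketch of order-preserving rendering follows. It assumes each part carries a monotonically increasing `seq` number, which is an assumption of this example and not something the scenario specifies; `TurnPart`, `RenderedPart`, and `renderTurn` are hypothetical names.

```typescript
type TurnPartKind = "reasoning" | "tool" | "text";

interface TurnPart {
  seq: number; // assumed ordering key from the runtime
  kind: TurnPartKind;
  body: string;
}

interface RenderedPart {
  kind: TurnPartKind;
  body: string;
  collapsed: boolean;
}

function renderTurn(parts: TurnPart[], turnCompleted: boolean): RenderedPart[] {
  return parts
    .slice() // do not mutate the caller's array
    .sort((a, b) => a.seq - b.seq)
    .map((p) => ({
      kind: p.kind,
      body: p.body,
      // Process parts stay expanded while live and collapse once the turn is done.
      collapsed: turnCompleted && p.kind !== "text",
    }));
}
```

Parts are never regrouped by kind, so reasoning cannot be hoisted into a block at the top of the turn.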

4. Final reconciliation

The runtime streams text deltas.
The runtime then emits the final answer content.
The UI reconciles the final answer with the streamed content.

Pass condition: the final text is never duplicated or appended a second time.
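A minimal reconciliation sketch is shown below. It assumes the final answer content is the complete, authoritative text rather than a trailing suffix; if a runtime sends something else, the rule would need to change. `AnswerState`, `applyDelta`, and `applyFinal` are hypothetical names.

```typescript
interface AnswerState {
  text: string;
  finalized: boolean;
}

function applyDelta(state: AnswerState, delta: string): AnswerState {
  // Deltas that arrive after finalization are ignored, not appended.
  if (state.finalized) return state;
  return { text: state.text + delta, finalized: false };
}

function applyFinal(state: AnswerState, finalText: string): AnswerState {
  // Receiving the same final content twice is a no-op.
  if (state.finalized && state.text === finalText) return state;
  // The final content replaces the streamed buffer; it is never appended to it.
  return { text: finalText, finalized: true };
}
```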

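5. Tool call

The runtime emits a tool start with a stable tool call id.
The UI shows a compact tool row and a safe input summary.
Tool progress updates that row and does not enter the final answer body.
The tool result links to output details or an offload reference.
Errors render as recoverable tool failure UI.

Pass condition: tool execution is visible and inspectable, and does not leak into the final answer body.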
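The stable tool call id makes it possible to treat tool events as updates to one keyed row. The reducer below is a sketch under assumed event and row shapes (`ToolEvt`, `ToolRow`, and `reduceToolRows` are hypothetical); it only touches the row map, so tool output has no path into the answer text.

```typescript
type ToolEvt =
  | { kind: "start"; toolCallId: string; name: string; inputSummary: string }
  | { kind: "progress"; toolCallId: string; note: string }
  | { kind: "result"; toolCallId: string; outputRef: string }
  | { kind: "error"; toolCallId: string; message: string };

interface ToolRow {
  name: string;
  inputSummary: string;
  status: "running" | "done" | "failed";
  note?: string;
  outputRef?: string;
  error?: string;
}

function reduceToolRows(rows: Record<string, ToolRow>, evt: ToolEvt): Record<string, ToolRow> {
  const next: Record<string, ToolRow> = { ...rows };
  if (evt.kind === "start") {
    next[evt.toolCallId] = { name: evt.name, inputSummary: evt.inputSummary, status: "running" };
    return next;
  }
  const row: ToolRow | undefined = next[evt.toolCallId];
  if (!row) return next; // unknown id: do not invent a row
  if (evt.kind === "progress") {
    next[evt.toolCallId] = { ...row, note: evt.note };
  } else if (evt.kind === "result") {
    next[evt.toolCallId] = { ...row, status: "done", outputRef: evt.outputRef };
  } else {
    next[evt.toolCallId] = { ...row, status: "failed", error: evt.message };
  }
  return next;
}
```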

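6. Human-in-the-loop

The runtime emits an action request with an id, type, scope, and optional schema.
The UI promotes the request to an approval/input surface.
The user approves, rejects, edits, or answers.
The response is sent through the runtime action response API.
The UI marks the request as resolved only after the runtime confirms it.

Pass condition: high-risk or blocking work has explicit, auditable user control.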
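The "resolved only after runtime confirmation" rule can be modeled as a small state machine. The sketch below uses hypothetical names (`ActionRequestView`, `onUserDecision`, `onRuntimeConfirmed`) and an assumed intermediate `responding` state; it does not model the actual runtime action response API.

```typescript
type ActionStatus = "pending" | "responding" | "resolved";
type UserDecision = "approve" | "reject" | "edit" | "answer";

interface ActionRequestView {
  id: string;
  status: ActionStatus;
  decision?: UserDecision;
}

// The user's choice moves the request to "responding", never straight to "resolved".
function onUserDecision(view: ActionRequestView, decision: UserDecision): ActionRequestView {
  if (view.status !== "pending") return view;
  return { ...view, status: "responding", decision };
}

// Only a runtime confirmation carrying the same id resolves the request.
function onRuntimeConfirmed(view: ActionRequestView, confirmedId: string): ActionRequestView {
  if (view.status !== "responding" || view.id !== confirmedId) return view;
  return { ...view, status: "resolved" };
}
```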

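7. Queue and steer

An active run already exists.
The user enters another prompt.
The UI presents queue and steer as distinct modes.
Queue creates or updates a queued turn summary.
Steer targets the active run and shows a pending steer state.

Pass condition: the user can tell "run in the next turn" apart from "change the current run".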
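Keeping queue and steer as separate variants of one input type makes it hard to confuse them in code. The following is a sketch with hypothetical names (`FollowUp`, `ComposerState`, `applyFollowUp`); the rule that a steer against a non-active run is dropped is an assumption of the example.

```typescript
type FollowUp =
  | { mode: "queue"; prompt: string }
  | { mode: "steer"; prompt: string; targetRunId: string };

interface ComposerState {
  activeRunId: string | null;
  queued: string[];            // prompts waiting for the next turn
  pendingSteer: string | null; // steer sent to the active run, not yet acknowledged
}

function applyFollowUp(state: ComposerState, input: FollowUp): ComposerState {
  if (input.mode === "queue") {
    return { ...state, queued: state.queued.concat(input.prompt) };
  }
  // A steer only applies to the run that is currently active.
  if (state.activeRunId === null || input.targetRunId !== state.activeRunId) return state;
  return { ...state, pendingSteer: input.prompt };
}
```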

8. Artifact workspace

The runtime emits artifact created/updated with a stable artifact id.
The conversation shows a compact artifact card or reference.
The artifact workspace uses artifact service data to open the preview/editor/diff/version/export areas.
Edits, exports, forks, and handoffs go through artifact APIs or controlled runtime actions.
When a save fails, the last confirmed version is kept and unsaved local edits remain visible.

Pass condition: deliverables leave the chat body and become editable, versioned, exportable artifacts.
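The save-failure rule can be expressed as a pure state update. This sketch assumes a hypothetical editor state and save outcome shape (`ArtifactEditorState`, `SaveOutcome`, `applySaveOutcome`); it does not represent any real artifact API.

```typescript
interface ArtifactEditorState {
  artifactId: string;
  confirmedVersion: number;
  confirmedContent: string;
  localDraft: string | null; // unsaved local edits, if any
  saveError: string | null;
}

type SaveOutcome =
  | { kind: "saved"; version: number; content: string }
  | { kind: "failed"; message: string };

function applySaveOutcome(state: ArtifactEditorState, outcome: SaveOutcome): ArtifactEditorState {
  if (outcome.kind === "saved") {
    return {
      ...state,
      confirmedVersion: outcome.version,
      confirmedContent: outcome.content,
      localDraft: null,
      saveError: null,
    };
  }
  // On failure, the confirmed version and the draft are left untouched.
  return { ...state, saveError: outcome.message };
}
```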

9. Evidence export

The user or the system triggers an evidence export.
The UI shows background progress or a task capsule.
The evidence service returns durable references.
The timeline/evidence surface links to the summary, trace, artifacts, verification, review, or replay.

Pass condition: evidence is traceable to runtime facts and does not block chat streaming.

10. Restoring an old session

The user opens an old session.
The shell, tab, title, and cached snapshot display immediately when available.
Recent messages render before full timeline details.
Queue, pending action, and runtime summary hydrate next.
Older messages, tool details, artifacts, and evidence load on demand.

Pass condition: an old session does not need the full history or every artifact before first paint.

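11. Missing facts

The runtime lacks an artifact kind, verification status, or provider stage.
The UI shows unknown, unavailable, or stale instead of guessing.
User controls remain safe and recoverable.

Pass condition: the UI does not fabricate a success, approval, artifact type, or evidence verdict.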
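A small sketch of "show unknown instead of guessing" for one fact, verification status, is given below. The `EvidenceFacts` shape, the `observedAtMs` timestamp, and the age-based staleness rule are all assumptions made for the example.

```typescript
type VerificationFact = "passed" | "failed";
type VerificationDisplay = VerificationFact | "unknown" | "stale";

interface EvidenceFacts {
  verification?: VerificationFact; // may be absent from the runtime
  observedAtMs?: number;           // assumed timestamp of the fact
}

function displayVerification(
  facts: EvidenceFacts,
  nowMs: number,
  maxAgeMs: number
): VerificationDisplay {
  // A missing fact is shown as unknown; it never defaults to "passed".
  if (facts.verification === undefined) return "unknown";
  // A fact with no timestamp, or one older than the threshold, is shown as stale.
  if (facts.observedAtMs === undefined || nowMs - facts.observedAtMs > maxAgeMs) return "stale";
  return facts.verification;
}
```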

12. Task and multi-agent state

The runtime emits a queued turn, background task, teammate, subagent, or remote-agent update carrying a stable task/agent id.
The UI updates task capsules, the team roster, the work board, or the task center, and does not create fake assistant prose.
Needs-input, failed, plan-ready, and delegated-approval states take priority over ordinary running.
Completed task details are archived to timeline summaries, worker notifications, or task history.

Pass condition: long-running and multi-agent work is observable and controllable outside the final answer transcript.
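The priority rule can be sketched as a sort over task capsules. The scenario only states that the four attention states outrank running; the relative order among those four in `STATUS_PRIORITY` below is an assumption, as are the names `TaskCapsule` and `sortCapsules`.

```typescript
type TaskStatus =
  | "needs-input"
  | "failed"
  | "delegated-approval"
  | "plan-ready"
  | "running"
  | "completed";

// Assumed ordering: a lower index means higher display priority.
const STATUS_PRIORITY: TaskStatus[] = [
  "needs-input",
  "failed",
  "delegated-approval",
  "plan-ready",
  "running",
  "completed",
];

interface TaskCapsule {
  taskId: string;
  status: TaskStatus;
}

function sortCapsules(capsules: TaskCapsule[]): TaskCapsule[] {
  return capsules.slice().sort((a, b) => {
    const byStatus = STATUS_PRIORITY.indexOf(a.status) - STATUS_PRIORITY.indexOf(b.status);
    if (byStatus !== 0) return byStatus;
    // Tie-break on the stable task id so the order does not jitter between renders.
    return a.taskId < b.taskId ? -1 : a.taskId > b.taskId ? 1 : 0;
  });
}
```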

13. Coordinator team

The coordinator delegates work to one or more teammates.
The UI shows the coordinator, teammates, roles, statuses, and parent/child session or thread ids.
Worker results arrive as worker notifications, not as real user messages.
Coordinator synthesis is kept separate from worker result facts and transcript refs.

Pass condition: the user can see who did what and can trace worker results without mistaking them for user messages.

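14. Parallel workers

The runtime starts multiple workers for independent tasks.
The UI shows fanout/fanin state, wait state, partial completion, failures, and retry/continue controls.
When available, the UI shows queue/parallelism facts such as team phase, active count, queued count, and provider concurrency group.
Running workers stay visible while active.
Completed worker details are archived to the timeline/evidence, without flattening the team into a single assistant.

Pass condition: parallel delegation is visible, recoverable, and auditable.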

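15. Specialist handoff

The runtime switches the active owner from one teammate to another.
The UI shows from, to, reason, resume target, and the memory/context boundary.
Past transcript authorship is not rewritten.
The new owner can continue with its own policy and context constraints.

Pass condition: the user knows who is responsible for the work now, and why the switch happened.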

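16. Review team

The runtime or the user requests a review from a reviewer/verifier teammate.
The UI shows the reviewer, target, status, evidence refs, verdict, and requested fixes.
The review verdict stays in review/evidence facts rather than being written only into final prose.
Requested fixes can be reassigned to a teammate or work item.

Pass condition: review is a first-class lane with traceable evidence and a follow-up owner.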

17. Human/agent work board

Work items are assigned to humans and agents.
The UI shows assignee, status, blocker, comments, dependencies, and progress.
User assignments and status changes are written back through the board/team API.
Completed work links to artifacts or evidence.

Pass condition: mixed human/agent work is managed as tasks, not hidden in the chat transcript.

18. Background teammate

The runtime schedules or wakes a background teammate.
The UI shows the wake reason, schedule, current run, last run record, and pause/resume and termination controls.
Background results are archived as timeline/evidence facts.
The UI does not introduce an extra hierarchy for background work.

Pass condition: background agent work can be understood as teammate-owned work.

19. Remote teammate

The runtime creates or connects to a remote agent task.
The UI shows the remote agent card/capability, task id, status, messages, input/auth needs, and artifact updates.
Input-required and auth-required states are promoted to user controls.
Without a remote task confirmation, a transient idle status must not be treated as terminal completion.

Pass condition: remote agent work reuses the same team surfaces while preserving remote protocol truth.
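The idle-is-not-terminal rule can be sketched as a mapping from remote observations to display states. The `RemoteObservation` shape and the `displayRemote` function are hypothetical; a real remote protocol will have its own status vocabulary.

```typescript
type RemoteObservation =
  | { kind: "status"; state: "working" | "idle" | "input-required" | "auth-required" }
  | { kind: "task-confirmation"; outcome: "completed" | "failed" };

type RemoteDisplay = "working" | "idle" | "needs-user" | "completed" | "failed";

function displayRemote(obs: RemoteObservation): RemoteDisplay {
  // Terminal states come only from an explicit task confirmation.
  if (obs.kind === "task-confirmation") return obs.outcome;
  // Input and auth needs are promoted to a user-facing control state.
  if (obs.state === "input-required" || obs.state === "auth-required") return "needs-user";
  // "idle" stays "idle"; it is never promoted to "completed".
  return obs.state;
}
```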

20. Context and compaction

The runtime emits context selection, missing context, budget, retrieval, or compaction facts.
The UI shows context chips, budget state, a missing-context fallback, or a compaction boundary.
Compaction does not replay old reasoning as final answer text.
Source refs and citations continue to link to evidence or context facts.

Pass condition: context and memory changes are explicit facts, not hidden text mutations.

21. Diagnostics and metrics

The runtime or client emits safe diagnostics or performance metrics.
The UI keeps them in diagnostics surfaces or trace views.
Normal conversation text contains no raw debug logs.
Metrics can explain submit-to-status, first-text, paint, hydration, and detail-load latency.

Pass condition: debugging is traceable but does not pollute the user-visible transcript.
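Two of the latencies named above can be derived from timestamps recorded on the client, as sketched here. The field names in `TurnTimestamps` and `TurnLatencies` are hypothetical, and a missing timestamp is reported as `null` rather than estimated.

```typescript
interface TurnTimestamps {
  submitMs: number;
  firstStatusMs?: number; // absent if no status was observed
  firstTextMs?: number;   // absent if no answer text was observed
}

interface TurnLatencies {
  submitToStatusMs: number | null;
  submitToFirstTextMs: number | null;
}

function computeLatencies(t: TurnTimestamps): TurnLatencies {
  return {
    submitToStatusMs: t.firstStatusMs === undefined ? null : t.firstStatusMs - t.submitMs,
    submitToFirstTextMs: t.firstTextMs === undefined ? null : t.firstTextMs - t.submitMs,
  };
}
```

The result is plain data that a diagnostics surface or trace view can display without writing anything into the conversation text.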

22. Agent Runtime profile projection

The runtime provides RuntimeEvent, ThreadReadModel, TaskSnapshot, or EvidencePack facts.
The UI preserves runtime correlation ids such as sessionId, threadId, turnId, taskId, runId, toolCallId, actionId, and evidenceId.
Status, task capsules, HITL controls, timeline/evidence, replay, and review surfaces are all projected from these facts.
When facts are missing, the UI shows unknown, unavailable, or stale instead of creating local runtime truth.

Pass condition: the Agent UI can project Agent Runtime compatible sources without becoming the owner of execution, approval, routing, task, or evidence facts. See the Runtime Profile test cases for details.
