Acceptance scenarios
Agent UI work is accepted by behavior, not by the existence of a component or document file. Use these scenarios for product QA, automated tests, or design review.
1. Send and first status
- User sends a prompt.
- The UI creates the user message optimistically.
- Runtime listener is registered before submit.
- Runtime status appears before first answer text when the runtime accepts work.
- The composer exposes interrupt/cancel when supported.
Pass condition: the user can tell the agent is alive before text streaming begins.
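The ordering in this scenario can be sketched as follows. This is a minimal illustration, not the product's runtime API: `Runtime`, `RuntimeEvent`, `TurnView`, and `sendPrompt` are hypothetical names, and the event shapes are assumptions.

```typescript
// Hypothetical event and runtime shapes; the source does not define them.
type RuntimeEvent =
  | { kind: "status"; label: string }
  | { kind: "text_delta"; text: string };

type Listener = (event: RuntimeEvent) => void;

interface Runtime {
  subscribe(listener: Listener): () => void; // returns an unsubscribe function
  submit(prompt: string): void;
}

interface TurnView {
  userMessage: string;   // optimistic user message
  status: string | null; // runtime status shown before any answer text
  answer: string;
}

function sendPrompt(
  runtime: Runtime,
  prompt: string
): { view: TurnView; unsubscribe: () => void } {
  // Create the user message optimistically, before the runtime responds.
  const view: TurnView = { userMessage: prompt, status: null, answer: "" };

  // Register the listener first so an early status event is not missed.
  const unsubscribe = runtime.subscribe((event) => {
    if (event.kind === "status") {
      view.status = event.label;
    } else {
      view.answer += event.text;
    }
  });

  // Submit only after the listener is in place.
  runtime.submit(prompt);
  return { view, unsubscribe };
}
```

If `submit` ran before `subscribe`, a runtime that emits its first status synchronously would leave the user with no sign of life until text arrived.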
2. Text/reasoning separation
- Runtime emits reasoning/thinking content and final answer text.
- Reasoning renders as process content, collapsed or summarized by default.
- Final answer renders as clean message text.
- Completed reasoning is not replayed as final answer text after hydration.
Pass condition: no <think> text, raw reasoning log, or process status pollutes the final answer.
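One way to keep the two streams apart is to route each event by channel into its own buffer and to rebuild hydrated sessions with the same routing. The sketch below assumes the runtime tags content with a channel; `ContentEvent`, `AssistantMessageView`, and the function names are hypothetical.

```typescript
// Assumed: the runtime labels each piece of content with its channel.
type ContentEvent =
  | { channel: "reasoning"; text: string }
  | { channel: "answer"; text: string };

interface AssistantMessageView {
  reasoning: string;           // process content
  reasoningCollapsed: boolean; // collapsed by default
  answer: string;              // clean final message text
}

function emptyMessage(): AssistantMessageView {
  return { reasoning: "", reasoningCollapsed: true, answer: "" };
}

function applyContent(
  view: AssistantMessageView,
  event: ContentEvent
): AssistantMessageView {
  if (event.channel === "reasoning") {
    return { ...view, reasoning: view.reasoning + event.text };
  }
  return { ...view, answer: view.answer + event.text };
}

// Hydration replays stored events through the same routing, so completed
// reasoning cannot reappear as answer text.
function hydrate(events: ContentEvent[]): AssistantMessageView {
  return events.reduce(applyContent, emptyMessage());
}
```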
3. Final reconciliation
- Runtime streams text deltas.
- Runtime later emits final answer content.
- The UI reconciles the final answer with streamed content.
Pass condition: final text is not duplicated or appended twice.
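A minimal reconciliation sketch, assuming the final answer event carries the complete text and is treated as authoritative. That policy is an assumption of this example, and `AnswerEvent`, `AnswerState`, and `reduceAnswer` are hypothetical names.

```typescript
type AnswerEvent =
  | { kind: "delta"; text: string }  // streamed fragment
  | { kind: "final"; text: string }; // assumed to carry the full answer

interface AnswerState {
  text: string;
  finalized: boolean;
}

function reduceAnswer(state: AnswerState, event: AnswerEvent): AnswerState {
  // After finalization, ignore late deltas and repeated final events.
  if (state.finalized) return state;

  if (event.kind === "delta") {
    return { text: state.text + event.text, finalized: false };
  }

  // The final text replaces the streamed text; it is never appended to it.
  return { text: event.text, finalized: true };
}
```

Replacing rather than appending is what prevents the duplicated answer: streaming "Hel" and "lo" and then receiving the final "Hello" leaves exactly one "Hello".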
4. Tool call
- Runtime emits tool start with stable tool call id.
- UI shows a compressed tool row with safe input summary.
- Tool progress updates the row without entering final answer text.
- Tool result links to output details or offload reference.
- Errors render as recoverable tool failure UI.
Pass condition: tool execution is visible, inspectable, and not mixed into final answer prose.
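The tool row lifecycle can be sketched as a reducer keyed by tool call id. The event names and fields below (`ToolEvent`, `ToolRow`, `inputSummary`, `outputRef`) are assumptions for illustration, not a documented runtime schema.

```typescript
type ToolEvent =
  | { kind: "start"; id: string; name: string; inputSummary: string }
  | { kind: "progress"; id: string; note: string }
  | { kind: "result"; id: string; outputRef: string }
  | { kind: "error"; id: string; message: string };

interface ToolRow {
  id: string;
  name: string;
  inputSummary: string; // safe, compressed summary of the input
  status: "running" | "done" | "failed";
  progress?: string;
  outputRef?: string;   // link to output details or an offload reference
  error?: string;
}

type ToolRows = Record<string, ToolRow>;

function reduceTools(rows: ToolRows, event: ToolEvent): ToolRows {
  if (event.kind === "start") {
    const row: ToolRow = {
      id: event.id,
      name: event.name,
      inputSummary: event.inputSummary,
      status: "running",
    };
    return { ...rows, [event.id]: row };
  }

  const row = rows[event.id];
  if (!row) return rows; // event for an unknown tool call id: ignore it

  if (event.kind === "progress") {
    return { ...rows, [event.id]: { ...row, progress: event.note } };
  }
  if (event.kind === "result") {
    return { ...rows, [event.id]: { ...row, status: "done", outputRef: event.outputRef } };
  }
  return { ...rows, [event.id]: { ...row, status: "failed", error: event.message } };
}
```

Because this reducer only touches tool rows, tool progress has no path into the answer text. The stable id is what lets progress, result, and error events update the same row.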
5. Human-in-the-loop
- Runtime emits an action request with id, type, scope, and optional schema.
- UI promotes the request to an approval/input surface.
- User approves, rejects, edits, or answers.
- Response is sent through the runtime action response API.
- UI only marks the request resolved after runtime confirmation.
Pass condition: high-risk or blocked work has explicit, auditable user control.
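The "resolved only after runtime confirmation" rule can be modeled as a three-phase state, sketched below. The phase names and the functions `recordDecision` and `applyConfirmation` are hypothetical; the request fields follow the list above.

```typescript
type Decision = "approve" | "reject" | "edit" | "answer";

interface ActionRequest {
  id: string;
  type: string;
  scope: string;
  schema?: unknown; // optional, as in the scenario
}

type RequestState =
  | { phase: "pending"; request: ActionRequest }
  | { phase: "awaitingConfirmation"; request: ActionRequest; decision: Decision }
  | { phase: "resolved"; request: ActionRequest; decision: Decision };

// The user's decision has been sent, but the request is not yet resolved.
function recordDecision(state: RequestState, decision: Decision): RequestState {
  if (state.phase !== "pending") return state;
  return { phase: "awaitingConfirmation", request: state.request, decision };
}

// Only a runtime confirmation for the same request id resolves it.
function applyConfirmation(state: RequestState, confirmedRequestId: string): RequestState {
  if (state.phase !== "awaitingConfirmation") return state;
  if (state.request.id !== confirmedRequestId) return state;
  return { phase: "resolved", request: state.request, decision: state.decision };
}
```

Keeping the intermediate phase explicit means the UI can show "sent, waiting for runtime" instead of optimistically claiming approval.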
6. Queue and steer
- A run is active.
- User enters another prompt.
- UI offers queue and steer as different modes.
- Queue creates or updates a queued turn summary.
- Steer targets the active run and shows pending steer state.
Pass condition: the user can distinguish “run this next” from “change what is happening now.”
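A small sketch of the two modes as distinct state changes. `ComposerState`, `FollowUpMode`, and `submitFollowUp` are hypothetical, and rejecting a steer when no run is active is an assumption of this example.

```typescript
type FollowUpMode = "queue" | "steer";

interface ComposerState {
  activeRunId: string | null;
  queuedPrompts: string[];                              // "run this next"
  pendingSteer: { runId: string; text: string } | null; // "change what is happening now"
}

function submitFollowUp(
  state: ComposerState,
  text: string,
  mode: FollowUpMode
): ComposerState {
  if (mode === "queue") {
    // Queue never touches the active run.
    return { ...state, queuedPrompts: [...state.queuedPrompts, text] };
  }
  if (state.activeRunId === null) {
    throw new Error("steer requires an active run");
  }
  // Steer targets the active run and stays pending until the runtime reacts.
  return { ...state, pendingSteer: { runId: state.activeRunId, text } };
}
```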
7. Artifact workspace
- Runtime emits artifact created/updated with stable artifact id.
- Conversation shows a compact artifact card or reference.
- Artifact Workspace opens preview/editor/diff/version/export areas using artifact service data.
- Edits, exports, forks, or handoffs go through artifact APIs or controlled runtime actions.
- Failed saves preserve the last confirmed version and keep unsaved local edits visible.
Pass condition: deliverables leave the chat body and become editable, versioned, exportable artifacts.
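The failed-save rule can be sketched as follows: a failure records the error but leaves both the last confirmed version and the local draft untouched. `ArtifactEditorState`, `SaveResult`, and `applySaveResult` are hypothetical names, and the save result shape is an assumption.

```typescript
interface ArtifactEditorState {
  artifactId: string;
  confirmedVersion: number;  // last version confirmed by the artifact service
  confirmedContent: string;
  localDraft: string | null; // unsaved local edits, if any
  saveError: string | null;
}

type SaveResult =
  | { ok: true; version: number; content: string }
  | { ok: false; error: string };

function applySaveResult(
  state: ArtifactEditorState,
  result: SaveResult
): ArtifactEditorState {
  if (result.ok) {
    // Success: adopt the confirmed version and clear the draft.
    return {
      ...state,
      confirmedVersion: result.version,
      confirmedContent: result.content,
      localDraft: null,
      saveError: null,
    };
  }
  // Failure: keep the confirmed version and the draft; only surface the error.
  return { ...state, saveError: result.error };
}
```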
8. Evidence export
- User or system triggers evidence export.
- UI shows background progress or task capsule.
- Evidence service returns durable references.
- Timeline/evidence surface links summary, trace, artifacts, verification, review, or replay.
Pass condition: evidence is traceable to runtime facts and does not block chat streaming.
9. Old-session recovery
- User opens an old session.
- Shell, tab, title, and cached snapshot appear immediately when available.
- Recent messages render before full timeline details.
- Queue/pending action/runtime summary hydrate next.
- Older messages, tool details, artifacts, and evidence load on demand.
Pass condition: old sessions do not require full history or all artifacts before first paint.
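The staged order above can be sketched as a small hydration plan. The stage names and the split into eager and on-demand stages mirror the list, but the identifiers (`Stage`, `nextEagerStage`, `canFirstPaint`, `isOnDemand`) are hypothetical.

```typescript
type Stage =
  | "shell"          // shell, tab, title, cached snapshot
  | "recentMessages"
  | "runtimeSummary" // queue, pending action, runtime summary
  | "olderMessages"
  | "toolDetails"
  | "artifacts"
  | "evidence";

// Stages loaded automatically, in order.
const EAGER_STAGES: readonly Stage[] = ["shell", "recentMessages", "runtimeSummary"];

// Stages loaded only when the user asks for them.
const ON_DEMAND_STAGES: readonly Stage[] = ["olderMessages", "toolDetails", "artifacts", "evidence"];

function nextEagerStage(loaded: readonly Stage[]): Stage | null {
  for (const stage of EAGER_STAGES) {
    if (loaded.indexOf(stage) === -1) return stage;
  }
  return null; // everything eager is loaded; the rest waits for user action
}

// First paint depends on the shell only, never on full history or artifacts.
function canFirstPaint(loaded: readonly Stage[]): boolean {
  return loaded.indexOf("shell") !== -1;
}

function isOnDemand(stage: Stage): boolean {
  return ON_DEMAND_STAGES.indexOf(stage) !== -1;
}
```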
10. Missing facts
- Runtime omits artifact kind, verification status, or provider stage.
- UI shows unknown, unavailable, or stale rather than guessing.
- User controls remain safe and recoverable.
Pass condition: UI never fabricates success, approval, artifact type, or evidence verdict.
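A minimal sketch of rendering without guessing. The source names the three labels but does not define when each applies, so the `Fact` states below are assumptions: an omitted fact is unknown, a fact the runtime cannot provide is unavailable, and a previously seen value that may be outdated is stale.

```typescript
// Hypothetical model of a single runtime fact such as artifact kind,
// verification status, or provider stage.
type Fact =
  | { state: "known"; value: string }
  | { state: "stale"; value: string } // last seen value, possibly outdated
  | { state: "unavailable" };

function displayFact(fact: Fact | undefined): string {
  // Omitted by the runtime: say so instead of inventing a default.
  if (fact === undefined) return "unknown";

  switch (fact.state) {
    case "known":
      return fact.value;
    case "stale":
      return "stale (last: " + fact.value + ")";
    case "unavailable":
      return "unavailable";
  }
}
```

There is deliberately no fallback such as "verified" or "document": every label shown is either a value the runtime reported or one of the three explicit placeholders.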