Acceptance scenarios
Agent UI work is accepted by behavior, not by the existence of a component or document file. Use these scenarios for product QA, automated tests, or design review.
1. Send and first status
- User sends a prompt.
- The UI creates the user message optimistically.
- Runtime listener is registered before submit.
- Runtime status appears before first answer text when the runtime accepts work.
- The composer exposes interrupt/cancel when supported.
Pass condition: the user can tell the agent is alive before text streaming begins.
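The ordering in this scenario can be sketched as follows. This is a minimal illustration, assuming a hypothetical runtime client with `subscribe` and `submit` methods; `RuntimeClient`, `sendPrompt`, and `createFakeRuntime` are invented names, not part of any real runtime API.

```typescript
// Hypothetical shapes for illustration; not a real runtime API.
type StatusEvent = { turnId: string; status: string };
type StatusListener = (event: StatusEvent) => void;

interface RuntimeClient {
  subscribe(listener: StatusListener): () => void; // returns unsubscribe
  submit(turnId: string, prompt: string): void;
}

interface TurnView {
  turnId: string;
  userMessage: string;   // created optimistically
  status: string | null; // null until the runtime reports a status
}

function sendPrompt(
  client: RuntimeClient,
  turnId: string,
  prompt: string
): { view: TurnView; unsubscribe: () => void } {
  // 1. Create the user message before the runtime responds.
  const view: TurnView = { turnId, userMessage: prompt, status: null };

  // 2. Register the listener first so an immediate status event is not lost.
  const unsubscribe = client.subscribe((event) => {
    if (event.turnId === turnId) view.status = event.status;
  });

  // 3. Submit only after the listener is in place.
  client.submit(turnId, prompt);
  return { view, unsubscribe };
}

// Fake runtime that reports "accepted" synchronously inside submit().
function createFakeRuntime(): RuntimeClient {
  let listeners: StatusListener[] = [];
  return {
    subscribe(listener) {
      listeners.push(listener);
      return () => {
        listeners = listeners.filter((l) => l !== listener);
      };
    },
    submit(turnId) {
      listeners.forEach((l) => l({ turnId, status: "accepted" }));
    },
  };
}
```

With the fake runtime, the status is already set when `sendPrompt` returns. If the two calls were swapped, the synchronous "accepted" event would be dropped, which is the failure this scenario guards against.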
2. Text/reasoning separation
- Runtime emits reasoning/thinking content and final answer text.
- Running reasoning renders as process content and remains live-visible; completed reasoning is collapsed or summarized by default.
- Final answer renders as clean message text.
- Completed reasoning is not replayed as final answer text after hydration.
Pass condition: no <think> text, raw reasoning log, or process status pollutes the final answer.
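One way to keep the two streams apart is to route parts by kind before rendering. The sketch below assumes a hypothetical `MessagePart` shape with a `kind` discriminator; the field names are illustrative only.

```typescript
// Hypothetical part shape, assumed for illustration.
type MessagePart =
  | { kind: "reasoning"; text: string; running: boolean }
  | { kind: "text"; text: string };

interface ProjectedTurn {
  answer: string; // clean final answer text only
  process: { text: string; collapsed: boolean }[]; // reasoning shown as process content
}

function projectTurn(parts: MessagePart[]): ProjectedTurn {
  const projected: ProjectedTurn = { answer: "", process: [] };
  for (const part of parts) {
    if (part.kind === "reasoning") {
      // Running reasoning stays expanded; completed reasoning collapses by default.
      projected.process.push({ text: part.text, collapsed: !part.running });
    } else {
      projected.answer += part.text;
    }
  }
  return projected;
}
```

Because reasoning never touches `answer`, rehydrating the same parts cannot replay reasoning as final answer text.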
3. Interleaved active turn
- Runtime emits reasoning, tool, text, reasoning summary, then more text in sequence.
- UI renders those parts interleaved in event/part order.
- Running tool/process content is expanded by default or shows its live body.
- The timeline does not expand a duplicate copy of a fact already shown in inline process content.
- After turn completion, process content archives into collapsed timeline summaries by default.
Pass condition: the user sees live execution order, not a top-heavy thinking stack or double-nested process blocks.
4. Final reconciliation
- Runtime streams text deltas.
- Runtime later emits final answer content.
- The UI reconciles the final answer with streamed content.
Pass condition: final text is not duplicated or appended twice.
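A minimal reconciliation sketch, assuming the final payload carries the complete answer text and that deltas arriving after it can be ignored. Both are assumptions; a runtime that sends only a completion marker would need a different rule.

```typescript
interface AnswerState {
  text: string;
  finalized: boolean;
}

const emptyAnswer: AnswerState = { text: "", finalized: false };

function applyDelta(state: AnswerState, delta: string): AnswerState {
  // Assumption: deltas that arrive after the final payload are ignored.
  if (state.finalized) return state;
  return { text: state.text + delta, finalized: false };
}

function applyFinal(finalText: string): AnswerState {
  // Replace instead of append, so streamed text is never duplicated.
  return { text: finalText, finalized: true };
}
```

The key choice is that the final payload replaces the streamed buffer rather than being appended to it.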
5. Tool call
- Runtime emits tool start with stable tool call id.
- UI shows a compressed tool row with safe input summary.
- Tool progress updates the row without entering final answer text.
- Tool result links to output details or offload reference.
- Errors render as recoverable tool failure UI.
Pass condition: tool execution is visible, inspectable, and not mixed into final answer prose.
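The tool lifecycle can be modeled as a reducer keyed by tool call id, kept apart from answer text. The event and row shapes below are hypothetical; only the idea of a stable tool call id comes from the scenario.

```typescript
// Hypothetical event shapes, assumed for illustration.
type ToolEvent =
  | { type: "tool_start"; toolCallId: string; name: string; inputSummary: string }
  | { type: "tool_progress"; toolCallId: string; message: string }
  | { type: "tool_result"; toolCallId: string; outputRef: string }
  | { type: "tool_error"; toolCallId: string; message: string };

interface ToolRow {
  toolCallId: string;
  name: string;
  inputSummary: string;
  status: "running" | "done" | "failed";
  progress?: string;
  outputRef?: string; // link to output details or an offload reference
  error?: string;
}

type ToolRows = Record<string, ToolRow>;

function withRow(rows: ToolRows, row: ToolRow): ToolRows {
  return { ...rows, [row.toolCallId]: row };
}

function reduceToolEvent(rows: ToolRows, event: ToolEvent): ToolRows {
  if (event.type === "tool_start") {
    return withRow(rows, {
      toolCallId: event.toolCallId,
      name: event.name,
      inputSummary: event.inputSummary,
      status: "running",
    });
  }
  const row = rows[event.toolCallId];
  if (!row) return rows; // unknown id: do not invent a row
  if (event.type === "tool_progress") {
    return withRow(rows, { ...row, progress: event.message });
  }
  if (event.type === "tool_result") {
    return withRow(rows, { ...row, status: "done", outputRef: event.outputRef });
  }
  // Remaining case is tool_error: keep the row so the failure stays recoverable.
  return withRow(rows, { ...row, status: "failed", error: event.message });
}
```

Progress and results only ever update a row, so nothing in this path can write into the final answer.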
6. Human-in-the-loop
- Runtime emits an action request with id, type, scope, and optional schema.
- UI promotes the request to an approval/input surface.
- User approves, rejects, edits, or answers.
- Response is sent through the runtime action response API.
- UI only marks the request resolved after runtime confirmation.
Pass condition: high-risk or blocked work has explicit, auditable user control.
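The confirmation rule can be expressed as a small state machine. This sketch assumes a hypothetical intermediate status, `awaiting_confirmation`, between the user's decision and the runtime's acknowledgement; the names are illustrative.

```typescript
type Decision = "approve" | "reject";
type ActionStatus = "pending" | "awaiting_confirmation" | "resolved";

interface ActionRequestView {
  actionId: string;
  status: ActionStatus;
  decision: Decision | null;
}

function recordUserDecision(view: ActionRequestView, decision: Decision): ActionRequestView {
  // The decision is recorded locally, but the request is not resolved yet.
  if (view.status !== "pending") return view;
  return { ...view, status: "awaiting_confirmation", decision };
}

function applyRuntimeConfirmation(
  view: ActionRequestView,
  confirmedActionId: string
): ActionRequestView {
  // Only a matching runtime confirmation may resolve the request.
  if (view.actionId !== confirmedActionId || view.status !== "awaiting_confirmation") {
    return view;
  }
  return { ...view, status: "resolved" };
}
```

A click alone never produces `resolved`, which keeps the UI from claiming an approval the runtime has not accepted.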
7. Queue and steer
- A run is active.
- User enters another prompt.
- UI offers queue and steer as different modes.
- Queue creates or updates a queued turn summary.
- Steer targets the active run and shows pending steer state.
Pass condition: the user can distinguish “run this next” from “change what is happening now.”
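The distinction between the two modes can be sketched as separate state transitions. `ComposerState` and `submitFollowUp` are hypothetical names, and the sketch assumes that prompts sent with no active run go through the normal send path instead.

```typescript
type FollowUpMode = "queue" | "steer";

interface ComposerState {
  activeRunId: string | null;
  queuedPrompts: string[];
  pendingSteer: { runId: string; prompt: string } | null;
}

function submitFollowUp(
  state: ComposerState,
  mode: FollowUpMode,
  prompt: string
): ComposerState {
  // Assumption: with no active run, the normal send path applies instead.
  if (state.activeRunId === null) return state;
  if (mode === "queue") {
    // "Run this next": the active run is untouched.
    return { ...state, queuedPrompts: state.queuedPrompts.concat([prompt]) };
  }
  // "Change what is happening now": targets the active run explicitly.
  return { ...state, pendingSteer: { runId: state.activeRunId, prompt } };
}
```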
8. Artifact workspace
- Runtime emits artifact created/updated with stable artifact id.
- Conversation shows a compact artifact card or reference.
- Artifact Workspace opens preview/editor/diff/version/export areas using artifact service data.
- Edits, exports, forks, or handoffs go through artifact APIs or controlled runtime actions.
- Failed saves preserve the last confirmed version and keep unsaved local edits visible.
Pass condition: deliverables leave the chat body and become editable, versioned, exportable artifacts.
9. Evidence export
- User or system triggers evidence export.
- UI shows background progress or task capsule.
- Evidence service returns durable references.
- Timeline/evidence surface links summary, trace, artifacts, verification, review, or replay.
Pass condition: evidence is traceable to runtime facts and does not block chat streaming.
10. Old-session recovery
- User opens an old session.
- Shell, tab, title, and cached snapshot appear immediately when available.
- Recent messages render before full timeline details.
- Queue/pending action/runtime summary hydrate next.
- Older messages, tool details, artifacts, and evidence load on demand.
Pass condition: old sessions do not require full history or all artifacts before first paint.
11. Missing facts
- Runtime omits artifact kind, verification status, or provider stage.
- UI shows unknown, unavailable, or stale rather than guessing.
- User controls remain safe and recoverable.
Pass condition: UI never fabricates success, approval, artifact type, or evidence verdict.
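A small rendering helper can enforce the no-guessing rule. The `Fact` shape below is an assumption made for illustration; the scenario only specifies the three fallback labels.

```typescript
type FactState = "known" | "unknown" | "unavailable" | "stale";

// Hypothetical wrapper around a runtime-provided fact.
interface Fact {
  state: FactState;
  value?: string;
}

function renderFact(fact: Fact | undefined): string {
  // An omitted fact, or a fact without a value, is never replaced by a guess.
  if (fact === undefined) return "unknown";
  if (fact.state === "known") {
    return fact.value !== undefined ? fact.value : "unknown";
  }
  return fact.state; // "unknown", "unavailable", or "stale"
}
```

For example, `renderFact(undefined)` yields `unknown` for an artifact kind the runtime never sent, instead of a default such as "document".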
12. Task and multi-agent state
- Runtime emits a queued turn, background task, teammate, subagent, or remote-agent update with a stable task/agent id.
- UI updates task capsules, team roster, work board, or task center without creating fake assistant prose.
- Needs-input, failed, plan-ready, and delegated-approval states are promoted above normal running state.
- Completed task details archive into timeline summaries, worker notifications, or task history.
Pass condition: long-running and multi-agent work is observable and controllable outside the final answer transcript.
13. Coordinator team
- A coordinator delegates work to one or more teammates.
- UI shows the coordinator, teammates, roles, statuses, and parent/child session or thread ids.
- Worker results arrive as worker notifications, not as real user messages.
- Coordinator synthesis remains separate from worker result facts and transcript refs.
Pass condition: the user can see who did what and can trace worker results without confusing them with user speech.
14. Parallel workers
- Runtime spawns multiple workers for independent tasks.
- UI shows fanout/fanin state, wait state, partial completion, failures, and retry/continue controls.
- UI shows queue/parallelism facts such as team phase, active count, queued count, and provider concurrency group when available.
- Running workers remain visible while active.
- Completed worker details archive into timeline/evidence without flattening the team to one assistant.
Pass condition: parallel delegation is visible, resumable, and auditable.
15. Specialist handoff
- Runtime changes active owner from one teammate to another.
- UI shows from, to, reason, resume target, and memory/context boundary.
- Past transcript authorship is not rewritten.
- The new owner can continue with its own policy and context constraints.
Pass condition: the user can tell who owns the work now and why.
16. Review team
- Runtime or user requests review from a reviewer/verifier teammate.
- UI shows reviewer, target, status, evidence refs, verdict, and requested fixes.
- Review verdict stays in review/evidence facts rather than final prose only.
- Requested fixes can be assigned back to a teammate or work item.
Pass condition: review is a first-class lane with traceable evidence and follow-up ownership.
17. Human/agent work board
- Work items are assigned to humans and agents.
- UI shows assignee, status, blocker, comments, dependencies, and progress.
- User assignment or status changes write through a board/team API.
- Completed work links to artifacts or evidence.
Pass condition: mixed human/agent work is managed as tasks, not hidden in the chat transcript.
18. Background teammate
- Runtime schedules or wakes a background teammate.
- UI shows wake reason, schedule, current run, last run record, pause/resume, and termination controls.
- Background results archive as timeline/evidence facts.
- The UI does not introduce a separate hierarchy for background work.
Pass condition: background agent work is understandable as teammate-owned work.
19. Remote teammate
- Runtime creates or connects to a remote agent task.
- UI shows remote agent card/capability, task id, status, messages, input/auth needs, and artifact updates.
- Input-required and auth-required states are promoted to user controls.
- A transient idle status is not treated as terminal completion without remote task confirmation.
Pass condition: remote agent work follows the same team surfaces while preserving remote protocol truth.
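The idle-versus-terminal rule might look like the sketch below. The status names and the `terminalConfirmed` flag are hypothetical; a real remote agent protocol may name and signal these states differently.

```typescript
// Hypothetical status names; a real remote protocol may use different ones.
type RemoteStatus =
  | "working"
  | "idle"
  | "input-required"
  | "auth-required"
  | "completed"
  | "failed";

interface RemoteTaskView {
  taskId: string;
  status: RemoteStatus;
  terminalConfirmed: boolean; // set only when the remote task confirms a terminal state
}

type RemoteSurface = "running" | "needs-user" | "done" | "failed";

function surfaceFor(task: RemoteTaskView): RemoteSurface {
  // Input and auth needs are promoted to user controls first.
  if (task.status === "input-required" || task.status === "auth-required") {
    return "needs-user";
  }
  if (task.terminalConfirmed && task.status === "completed") return "done";
  if (task.terminalConfirmed && task.status === "failed") return "failed";
  // Transient idle and unconfirmed terminal states stay visible as running.
  return "running";
}
```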
20. Context and compaction
- Runtime emits context selection, missing context, budget, retrieval, or compaction facts.
- UI shows context chips, budget state, missing-context fallback, or compaction boundary.
- Compaction does not replay old reasoning as final answer text.
- Source refs and citations remain linked to evidence or context facts.
Pass condition: context and memory changes are explicit facts, not hidden text mutations.
21. Diagnostics and metrics
- Runtime or client emits safe diagnostics or performance metrics.
- UI keeps them in diagnostics surfaces or trace views.
- Normal conversation text stays free of raw debug logs.
- Metrics can explain submit-to-status, first-text, paint, hydration, and detail-load latency.
Pass condition: debugging remains traceable without polluting the user-facing transcript.
22. Agent Runtime profile projection
- Runtime provides RuntimeEvent, ThreadReadModel, TaskSnapshot, or EvidencePack facts.
- UI preserves runtime correlation ids such as sessionId, threadId, turnId, taskId, runId, toolCallId, actionId, and evidenceId.
- Status, task capsules, HITL controls, timeline/evidence, replay, and review surfaces are projected from those facts.
- UI renders missing facts as unknown, unavailable, or stale instead of creating local runtime truth.
Pass condition: Agent UI can project an Agent Runtime-compatible source without becoming the owner of execution, approval, routing, task, or evidence facts. See Runtime profile test cases.
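A projection that preserves correlation ids could be sketched as below. `RuntimeEventLike` is a hypothetical, trimmed-down stand-in for a runtime event; the actual RuntimeEvent shape is not defined here and is likely richer.

```typescript
interface CorrelationIds {
  sessionId?: string;
  threadId?: string;
  turnId?: string;
  taskId?: string;
  runId?: string;
  toolCallId?: string;
  actionId?: string;
  evidenceId?: string;
}

// Hypothetical event shape, assumed for illustration.
interface RuntimeEventLike extends CorrelationIds {
  status?: string;
}

interface StatusView {
  ids: CorrelationIds; // copied through unchanged, never generated locally
  status: string;
}

const ID_KEYS: (keyof CorrelationIds)[] = [
  "sessionId", "threadId", "turnId", "taskId",
  "runId", "toolCallId", "actionId", "evidenceId",
];

function projectStatus(event: RuntimeEventLike): StatusView {
  const ids: CorrelationIds = {};
  for (const key of ID_KEYS) {
    const value = event[key];
    if (value !== undefined) ids[key] = value;
  }
  // A missing status is reported as unknown rather than inferred.
  return { ids, status: event.status !== undefined ? event.status : "unknown" };
}
```

The view holds only what the event supplied, so the UI stays a projection and does not become a second source of runtime truth.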