Tool UI surface

The tool UI surface presents external work, such as commands, API calls, and file operations, as understandable process evidence while keeping it out of the final answer.

Tool step anatomy

A tool step SHOULD expose:

Field            Purpose
tool.id          Stable id for linking, retry, evidence, and logs.
tool.kind        Command, browser, file, API, search, database, custom.
status           Pending, running, succeeded, failed, cancelled, timed out.
input.summary    Safe, compact input preview.
output.preview   Human-readable result preview.
output.ref       Pointer to full output when too large.
duration         Timing summary.
risk             Permission or sensitivity category when relevant.
source_refs      Sources or artifacts produced by the tool.
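
As an illustration, the fields above could be modeled as a single record. This is a minimal sketch, not a normative schema: the class names, the snake_case attribute names, the millisecond unit for duration, and the `is_terminal` helper are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ToolStatus(Enum):
    """Status values listed in the field table."""
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    CANCELLED = "cancelled"
    TIMED_OUT = "timed_out"


@dataclass
class ToolStep:
    """Hypothetical record for one tool step; names are illustrative."""
    tool_id: str                          # tool.id: stable id for linking, retry, evidence, logs
    tool_kind: str                        # tool.kind: "command", "browser", "file", ...
    status: ToolStatus = ToolStatus.PENDING
    input_summary: str = ""               # input.summary: safe, compact preview
    output_preview: Optional[str] = None  # output.preview: human-readable preview
    output_ref: Optional[str] = None      # output.ref: pointer to full output
    duration_ms: Optional[int] = None     # duration: unit is an assumption
    risk: Optional[str] = None            # risk: only set when relevant
    source_refs: List[str] = field(default_factory=list)

    def is_terminal(self) -> bool:
        """True once the step can no longer change status."""
        return self.status in {
            ToolStatus.SUCCEEDED,
            ToolStatus.FAILED,
            ToolStatus.CANCELLED,
            ToolStatus.TIMED_OUT,
        }
```

Keeping `output_preview` and `output_ref` as separate optional fields lets a client show a short preview while the full output lives elsewhere.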

Input rendering

Tool input often contains secrets, long JSON, file paths, or irrelevant defaults. Clients SHOULD:

  • show the smallest meaningful input fields
  • redact secrets and credentials
  • collapse long JSON by default
  • keep raw input available only in trusted diagnostic views
  • explain omitted fields with count or size
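
The rules above can be sketched as a small summarizer. This is one possible approach under stated assumptions: the secret-key markers, the field and length limits, and the function name `summarize_input` are hypothetical, and a real client would likely use a stricter redaction policy than key-name matching.

```python
import json

# Assumed heuristics; a real product would define its own policy.
SECRET_MARKERS = ("token", "secret", "password", "api_key", "authorization")
MAX_FIELDS = 3          # smallest meaningful set of fields to show
MAX_VALUE_CHARS = 60    # longer values are collapsed


def summarize_input(raw: dict, important: tuple = ()) -> dict:
    """Build a safe, compact preview of tool input.

    Keys listed in `important` are shown first. Values under
    secret-looking keys are redacted, long values are collapsed,
    and the number of omitted fields is reported.
    """
    keys = [k for k in important if k in raw]
    keys += [k for k in raw if k not in important]

    shown = {}
    for key in keys[:MAX_FIELDS]:
        if any(marker in str(key).lower() for marker in SECRET_MARKERS):
            shown[key] = "[redacted]"
            continue
        value = raw[key]
        text = value if isinstance(value, str) else json.dumps(value, default=str)
        if len(text) > MAX_VALUE_CHARS:
            text = f"{text[:MAX_VALUE_CHARS]}... ({len(text)} chars)"
        shown[key] = text

    return {"fields": shown, "omitted_count": max(0, len(keys) - MAX_FIELDS)}
```

The raw input is never stored in the returned summary, so a trusted diagnostic view would need to fetch it separately.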

Output rendering

Output shape     Recommended rendering
Empty output     Show "No output" with exit/status information.
Short text       Inline preview inside process step.
Large text       Summary, byte/token count, and open-details action.
JSON object      Render important keys first; allow raw view.
Image or media   Thumbnail or placeholder with open action.
File change      Diff or artifact reference.
Source list      Evidence/source surface entry.
Error            Failure summary, stderr preview, retry or diagnostic action.
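
A client might implement the mapping above as a lookup keyed by output shape. The sketch below is illustrative only: the shape and rendering identifiers, the 500-character inline limit, and the `hint` parameter (used for shapes that cannot be inferred from the value alone) are assumptions, not part of the standard.

```python
from typing import Any

INLINE_LIMIT_CHARS = 500  # hypothetical cutoff between short and large text

# Rendering identifiers are made up for this example.
RENDERING_BY_SHAPE = {
    "empty": "empty_state_with_status",
    "short_text": "inline_preview",
    "large_text": "summary_with_details_action",
    "json_object": "key_first_view_with_raw_toggle",
    "media": "thumbnail_with_open_action",
    "file_change": "diff_or_artifact_reference",
    "source_list": "evidence_surface_entry",
    "error": "failure_summary_with_retry",
}


def classify_shape(output: Any, failed: bool = False, hint: str = "") -> str:
    """Pick an output shape; `hint` is supplied by the tool when known."""
    if failed:
        return "error"
    if hint in ("media", "file_change", "source_list"):
        return hint
    if output is None or output == "" or output == b"":
        return "empty"
    if isinstance(output, dict):
        return "json_object"
    text = output if isinstance(output, str) else str(output)
    return "short_text" if len(text) <= INLINE_LIMIT_CHARS else "large_text"


def choose_rendering(output: Any, failed: bool = False, hint: str = "") -> str:
    """Return the rendering identifier for a tool output."""
    return RENDERING_BY_SHAPE[classify_shape(output, failed, hint)]
```

Checking `failed` first means an error is rendered as a failure even when the tool also produced partial output.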

Large output rule

Large tool output SHOULD NOT be inserted into the final answer or message body. Instead, store the full output out of band, point to it through output.ref, and present it in a detail surface.

Recommended thresholds are product-specific, but an implementation SHOULD define when to:

  • truncate input
  • summarize output
  • offload full output
  • warn about context impact
  • require explicit expansion
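
One way to make these decisions explicit is a single thresholds object and a planning function. Every number below is a placeholder chosen for the example, as are the action names and the choice to measure size in characters; the text above only requires that an implementation define such limits.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OutputThresholds:
    """Hypothetical, product-specific limits measured in characters."""
    truncate_input_chars: int = 2_000
    summarize_output_chars: int = 4_000
    offload_output_chars: int = 20_000
    warn_context_chars: int = 50_000


def plan_output_handling(
    input_chars: int,
    output_chars: int,
    thresholds: OutputThresholds = OutputThresholds(),
) -> List[str]:
    """Return the handling actions triggered by the given sizes."""
    actions = []
    if input_chars > thresholds.truncate_input_chars:
        actions.append("truncate_input")
    if output_chars > thresholds.summarize_output_chars:
        actions.append("summarize_output")
    if output_chars > thresholds.offload_output_chars:
        # Offloaded output is only shown after the user asks for it.
        actions.append("offload_full_output")
        actions.append("require_explicit_expansion")
    if output_chars > thresholds.warn_context_chars:
        actions.append("warn_context_impact")
    return actions
```

For example, `plan_output_handling(3000, 5000)` returns `["truncate_input", "summarize_output"]` with the placeholder defaults.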

Retry and replay

If retry is supported:

  • Show whether the tool is safe to retry.
  • Preserve each attempt with its own status and output reference.
  • Do not overwrite failed attempt evidence.
  • Require confirmation for high-risk or non-idempotent tools.
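
The retry rules above can be sketched with an append-only attempt list. The class names, the `idempotent` and `high_risk` flags, and the use of `PermissionError` to signal a missing confirmation are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    number: int
    status: str                       # e.g. "failed", "succeeded"
    output_ref: Optional[str] = None  # each attempt keeps its own reference


@dataclass
class RetryableStep:
    """Hypothetical tool step that keeps every attempt as evidence."""
    tool_id: str
    idempotent: bool
    high_risk: bool = False
    attempts: List[Attempt] = field(default_factory=list)

    def needs_confirmation(self) -> bool:
        """High-risk or non-idempotent tools require confirmation."""
        return self.high_risk or not self.idempotent

    def record_attempt(self, status: str, output_ref: Optional[str] = None) -> Attempt:
        """Append a new attempt; earlier attempts are never overwritten."""
        attempt = Attempt(len(self.attempts) + 1, status, output_ref)
        self.attempts.append(attempt)
        return attempt

    def retry(
        self,
        run: Callable[[], Tuple[str, Optional[str]]],
        confirmed: bool = False,
    ) -> Attempt:
        """Run the tool again and record the result as a new attempt."""
        if self.needs_confirmation() and not confirmed:
            raise PermissionError("retry requires explicit confirmation")
        status, output_ref = run()
        return self.record_attempt(status, output_ref)
```

Because `retry` only appends, the failed attempt and its output reference remain available to timeline and evidence views after a later attempt succeeds.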

Acceptance scenarios

  1. A tool with no output shows an explicit empty state.
  2. A tool with large output shows a summary and detail action, not full output in the answer.
  3. A failed tool preserves stderr or diagnostic preview.
  4. A generated file becomes an artifact reference.
  5. Tool evidence remains linkable from timeline and evidence surfaces.

Draft runtime-first standard for agent interaction surfaces.