Acceptance Scenarios

A compatible implementation should pass these scenarios:

A tool declaration can be discovered without loading every tool into the model context.
A deferred tool can be found both by exact name and by keyword search, then loaded before invocation.
Input arguments can be schema-validated before execution.
Tool-specific validation can reject unsafe values before filesystem or network IO.
Sensitive arguments are redacted from telemetry unless explicitly allowed.
model_input, observable_input, permission_input, and call_input remain distinguishable after hooks and permission prompts.
A risky tool call can pause for approval and then resume with the same invocation id.
A denied or rejected call produces a terminal error result instead of disappearing.
A long-running call emits ordered progress and can report cancellation support accurately.
Parallel read tools can run together while exclusive write tools preserve order.
A sibling failure can cancel dependent tools with synthetic result records.
A large result returns refs or previews rather than embedding the full payload.
Empty successful output is distinguishable from missing output.
A tool result can create or link to an Agent Artifact.
Evidence can reconstruct which tool ran, with what native id, and why the result is trusted or not.
Policy can explain why a tool was allowed, denied, deferred, or waived.
MCP, OpenAPI, provider function calling, CLI, browser, and peer-agent tools preserve their native ids.
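
As an illustration only, a few of the scenarios above (deferred discovery and loading, a denied call yielding a terminal error result, and empty output being distinguishable from missing output) could be exercised against a minimal in-memory registry like the sketch below. Every name in it (ToolDeclaration, ToolResult, ToolRegistry, the status strings, and the two sample tools) is a hypothetical chosen for this example, not something the scenarios define.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional

Handler = Callable[[dict], str]


@dataclass
class ToolDeclaration:
    """Hypothetical lightweight metadata, searchable without loading the tool."""
    name: str
    keywords: tuple[str, ...]
    loader: Callable[[], Handler]  # builds the real handler on demand


@dataclass
class ToolResult:
    """Hypothetical result record; every invocation ends in exactly one."""
    invocation_id: str
    status: str                    # "ok" or "error"
    output: Optional[str] = None   # None = missing output, "" = empty success
    error: Optional[str] = None


class ToolRegistry:
    def __init__(self) -> None:
        self._declared: dict[str, ToolDeclaration] = {}
        self._loaded: dict[str, Handler] = {}

    def declare(self, decl: ToolDeclaration) -> None:
        # Declaring stores metadata only; the loader is not called here.
        self._declared[decl.name] = decl

    def find(self, query: str) -> list[str]:
        """Exact name match first, then keyword match. Nothing is loaded."""
        if query in self._declared:
            return [query]
        q = query.lower()
        return sorted(
            name
            for name, decl in self._declared.items()
            if q in (k.lower() for k in decl.keywords)
        )

    def load(self, name: str) -> None:
        self._loaded[name] = self._declared[name].loader()

    def is_loaded(self, name: str) -> bool:
        return name in self._loaded

    def invoke(self, invocation_id: str, name: str, args: dict,
               approved: bool) -> ToolResult:
        # A denied call still produces a terminal result record.
        if not approved:
            return ToolResult(invocation_id, "error", error="denied")
        if name not in self._loaded:
            return ToolResult(invocation_id, "error", error="not_loaded")
        return ToolResult(invocation_id, "ok", output=self._loaded[name](args))


# Two sample declarations; the handlers are stand-ins that perform no real IO.
registry = ToolRegistry()
registry.declare(
    ToolDeclaration("read_file", ("filesystem", "read"), lambda: lambda args: "")
)
registry.declare(
    ToolDeclaration("fetch_url", ("network", "http"), lambda: lambda args: "body")
)
```

In this sketch, `registry.find("network")` locates `fetch_url` without loading it, `registry.invoke(..., approved=False)` returns an error record rather than nothing, and a successful `read_file` call returns `output == ""`, which differs from the `None` used for missing output.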
