Project classification

Agent QC starts with classification. The same repository can match several profiles. Classification decides which risks the report is allowed to judge.

Classify by owned risk, not by language, framework, company, or UI style.

Profiles

| Profile | Use when the project owns | Common test focus |
| --- | --- | --- |
| agent-runtime-cli | agent loop, CLI, task execution, sandbox, tools, resume | unit, sandbox policy, protocol streams, CLI e2e, subprocess cleanup |
| agent-sdk-api | public SDK, generated client, API wrappers | public signatures, fake server integration, generated contract drift |
| agent-tool-mcp-gateway | tool declarations, MCP/ACP bridge, connector runtime | protocol conformance, stdio/http recovery, resource and permission refs |
| multi-channel-agent-gateway | chat/channel adapters, webhooks, auth, media | channel contracts, auth/secrets, live opt-in, media routing, Docker smoke |
| agent-ui-tui-desktop | GUI, TUI, desktop shell, browser-visible flows | rendering, screenshots, terminal fixtures, Playwright, accessibility |
| agent-skills-plugins | skills, plugins, manifests, loaders, marketplace | schema, discovery, package boundary, fixture install, trust policy |
| background-agent-scheduler | cron, queues, workers, retries, long-running agents | deterministic time, leases, checkpointing, races, stress |
| agent-distribution-release | install, package, Docker, cross-platform release | package contents, install smoke, OS matrix, supply-chain scan |
| agent-evals-quality | task quality, model behavior, rubrics, generated outputs | baseline comparison, semantic judge, grounding, safety/policy evals |

Mixed-profile examples

| Project shape | Profiles |
| --- | --- |
| Codex-like runtime with TUI and app-server protocol | agent-runtime-cli, agent-ui-tui-desktop, agent-tool-mcp-gateway, agent-sdk-api, agent-distribution-release |
| Claude Code-like local snapshot | agent-ui-tui-desktop, agent-runtime-cli, agent-sdk-api, agent-skills-plugins; mark release/CI claims as unknown if metadata is absent |
| OpenClaw-like gateway and QA Lab | multi-channel-agent-gateway, agent-tool-mcp-gateway, agent-ui-tui-desktop, agent-skills-plugins, agent-distribution-release, agent-evals-quality |
| Hermes-like Python agent | agent-runtime-cli, background-agent-scheduler, agent-tool-mcp-gateway, multi-channel-agent-gateway, agent-ui-tui-desktop, agent-distribution-release |
| Desktop GUI with native bridge | agent-ui-tui-desktop, agent-tool-mcp-gateway, agent-runtime-cli, agent-skills-plugins, agent-distribution-release |
| Standards/documentation site with schemas and examples | agent-distribution-release, optionally agent-sdk-api if schemas/CLI are consumed |
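
One way to read a mixed-profile classification is that its test plan is the union of each matched profile's common test focus. The sketch below illustrates that reading with an excerpt of the Profiles table; the `TEST_FOCUS` mapping and the `combined_test_focus` helper are hypothetical names introduced for the example, not part of the standard.

```python
# Excerpt of the Profiles table: profile -> common test focus.
# Only three profiles are included to keep the sketch short.
TEST_FOCUS = {
    "agent-sdk-api": [
        "public signatures", "fake server integration", "generated contract drift",
    ],
    "agent-distribution-release": [
        "package contents", "install smoke", "OS matrix", "supply-chain scan",
    ],
    "agent-ui-tui-desktop": [
        "rendering", "screenshots", "terminal fixtures", "Playwright", "accessibility",
    ],
}


def combined_test_focus(profiles: list[str]) -> list[str]:
    """Union of test focus across profiles, preserving first-seen order."""
    seen: dict[str, None] = {}
    for profile in profiles:
        # Raises KeyError for profiles outside this excerpt.
        for item in TEST_FOCUS[profile]:
            seen.setdefault(item, None)
    return list(seen)


# Documentation site whose schemas/CLI are consumed by others.
docs_site = ["agent-distribution-release", "agent-sdk-api"]
print(combined_test_focus(docs_site))
```

Listing the release profile first keeps the packaging checks at the head of the plan; the order carries no other meaning in this sketch.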

Classification roles

A useful plan identifies owners:

| Role | Question |
| --- | --- |
| Profile owner | Which project shape owns the risk? |
| Fact owner | Which system writes the fact being verified? |
| Surface owner | Where is the fact projected to users/operators? |
| Gate owner | Which command, CI job, script, qcloop item, or review executes the gate? |
| Evidence owner | Where are durable logs, traces, screenshots, transcripts, reports, and waivers stored? |
| Risk owner | Who decides waiver, release, or retry? |
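
The six roles can be recorded per gate so that a plan with an unanswered question is easy to spot. The following is a minimal sketch under that assumption; the `GateOwnership` class, its field names, and the example values are hypothetical and only mirror the table above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class GateOwnership:
    """One record per gate; each field answers one question from the roles table."""
    profile_owner: Optional[str] = None   # which project shape owns the risk
    fact_owner: Optional[str] = None      # which system writes the fact
    surface_owner: Optional[str] = None   # where the fact is shown to users/operators
    gate_owner: Optional[str] = None      # command, CI job, script, or review
    evidence_owner: Optional[str] = None  # where durable evidence is stored
    risk_owner: Optional[str] = None      # who decides waiver, release, or retry

    def missing_roles(self) -> list[str]:
        """Names of roles that are unset or empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical plan entry with no risk owner assigned yet.
plan = GateOwnership(
    profile_owner="agent-runtime-cli",
    fact_owner="session store",
    surface_owner="TUI status bar",
    gate_owner="CI job: cli-e2e",
    evidence_owner="CI artifacts",
)
print(plan.missing_roles())
```

Treating an empty string the same as `None` is a deliberate choice here: a blank owner is no more useful than a missing one.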

Classification rules

  • Classify by owned risk, not by language.
  • A repository can have multiple profiles; do not force it into one label.
  • If a project exposes user-visible work, include a surface classification even if most code is backend/library code.
  • If a test requires credentials or a real provider, mark it live-provider and run it only on explicit opt-in.
  • If a release artifact is shipped, include agent-distribution-release even for docs-heavy projects.
  • If a UI shows runtime state, include both surface and runtime/protocol gates; UI alone is not runtime proof.
  • If repo metadata is missing, state the limitation instead of inventing CI/release guarantees.
  • If cases are repeated and independent, qcloop can execute them, but project gates still need evidence.
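
A few of these rules are mechanical enough to check in code. The sketch below is an illustration only: the `Classification` fields, the `rule_violations` helper, and the choice of which profiles count as carrying runtime/protocol gates are assumptions made for the example, not definitions from the standard.

```python
from dataclasses import dataclass, field

SURFACE = "agent-ui-tui-desktop"
RELEASE = "agent-distribution-release"
# Assumption: these profiles are treated as carrying runtime/protocol gates.
RUNTIME_OR_PROTOCOL = {"agent-runtime-cli", "agent-tool-mcp-gateway", "agent-sdk-api"}


@dataclass
class Classification:
    profiles: set[str]
    ships_release_artifact: bool = False
    ui_shows_runtime_state: bool = False
    has_repo_metadata: bool = True
    ci_release_claims: list[str] = field(default_factory=list)


def rule_violations(c: Classification) -> list[str]:
    """Return a message for each rule the classification breaks."""
    problems: list[str] = []
    if c.ships_release_artifact and RELEASE not in c.profiles:
        problems.append(f"ships an artifact but lacks {RELEASE}")
    if c.ui_shows_runtime_state:
        if SURFACE not in c.profiles:
            problems.append(f"UI shows runtime state but lacks {SURFACE}")
        if not c.profiles & RUNTIME_OR_PROTOCOL:
            problems.append("UI shows runtime state but no runtime/protocol profile")
    if not c.has_repo_metadata and c.ci_release_claims:
        problems.append("CI/release claims made without repo metadata")
    return problems


# Hypothetical draft: a UI-only label on a project that also ships an artifact.
draft = Classification(
    profiles={SURFACE},
    ships_release_artifact=True,
    ui_shows_runtime_state=True,
)
for problem in rule_violations(draft):
    print(problem)
```

An empty result only means these three rules were not broken; it says nothing about whether any gate has passed.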

Decision tree

```text
Does the project execute agent turns, tools, shell, sandbox, or resume?
  -> agent-runtime-cli
Does it expose a public SDK, generated client, schema, or app-server API?
  -> agent-sdk-api
Does it declare, route, or bridge tools/MCP/ACP/connectors?
  -> agent-tool-mcp-gateway
Does it connect to chat channels, webhooks, mobile, QR, or media routing?
  -> multi-channel-agent-gateway
Does a user/operator see GUI, TUI, WebUI, desktop, or browser UI?
  -> agent-ui-tui-desktop
Does it load skills/plugins/manifests or marketplace assets?
  -> agent-skills-plugins
Does it schedule background/long-running/retry work?
  -> background-agent-scheduler
Does it ship packages, Docker images, installers, or docs site artifacts?
  -> agent-distribution-release
Does it judge model/task quality with rubrics, baselines, or reports?
  -> agent-evals-quality
```
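
Because a repository can match several profiles, the tree reads as nine independent questions rather than a single branching path. A minimal sketch of that reading follows; the trait names (`executes_agent_turns`, `has_visible_ui`, and so on) are hypothetical shorthand for the questions above, and `classify` is an illustrative helper, not a tool defined by the standard.

```python
# Hypothetical trait names; each stands for one question in the decision tree.
DECISION_TREE = [
    ("executes_agent_turns", "agent-runtime-cli"),
    ("exposes_public_api", "agent-sdk-api"),
    ("bridges_tools", "agent-tool-mcp-gateway"),
    ("connects_channels", "multi-channel-agent-gateway"),
    ("has_visible_ui", "agent-ui-tui-desktop"),
    ("loads_plugins", "agent-skills-plugins"),
    ("schedules_background_work", "background-agent-scheduler"),
    ("ships_artifacts", "agent-distribution-release"),
    ("judges_quality", "agent-evals-quality"),
]


def classify(traits: set[str]) -> list[str]:
    """Return every profile whose question is answered yes, in tree order."""
    known = {trait for trait, _ in DECISION_TREE}
    unknown = traits - known
    if unknown:
        # Fail loudly rather than silently dropping a misspelled trait.
        raise ValueError(f"unknown traits: {sorted(unknown)}")
    return [profile for trait, profile in DECISION_TREE if trait in traits]


# A desktop GUI with a native bridge, expressed as yes-answers.
desktop_gui = {
    "has_visible_ui", "bridges_tools", "executes_agent_turns",
    "loads_plugins", "ships_artifacts",
}
print(classify(desktop_gui))
```

The result is a list of matched profiles, never a single label, which keeps the output consistent with the rule that a repository should not be forced into one profile.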

What classification is not

Classification is not:

  • a technology stack label;
  • a maturity grade;
  • a promise that all gates have passed;
  • a release checklist by itself;
  • a reason to ignore project-specific AGENTS/CONTRIBUTING rules.

Classification only selects the risks and evidence lanes that must be proven.

Draft standard for evidence-driven quality control of Agent projects.