Agent task

An agent task is the runtime-owned unit of work that gives an agent its objective, scope, lifecycle, progress, relationships, and delivery semantics.

A task is not a checklist item, not a chat message, not a model request, and not only a background job. It is the durable execution object that can span turns, start runs, spawn subagents, wait for input, produce artifacts, and be audited after recovery.

Design pressure

Real runtimes show the same task pressure in different forms:

  • Terminal agents keep foreground work, local shell work, remote agent work, scheduled work, and backgrounded work under one task surface.
  • Gateway and scheduler agents need durable jobs, delivery state, per-run output, checkpoint/resume, missed-run handling, and failure notification.
  • Typed coding runtimes need thread goals, todo lists, plan items, turn status, job items, approval state, and spawn edges to join through stable ids.
  • Desktop runtimes need task profiles, automation jobs, subagent turns, execution summaries, evidence export, and UI projection to read from the same fact chain.
  • Durable workflow systems show why workflow id, run id, task queue, child work, signals, cancellation, retries, and history cannot be left as prose.

The portable contract therefore needs a task model above individual tool calls and below host-product workflows.

Task is not job, run, step, or todo

| Object | Meaning | Runtime rule |
| --- | --- | --- |
| task | Semantic objective with lifecycle, owner, relationships, constraints, and acceptance. | Stable across retries, turns, backgrounding, and recovery. |
| run | One execution attempt for a task. | New run for retry, resume-after-crash, or alternate worker execution. |
| step | Ordered item inside a run or turn. | Model, reasoning, tool, process, approval, artifact, warning, or evidence item. |
| job | Durable batch or scheduled dispatcher. | May create or own tasks, but should not replace task semantics. |
| todo / plan item | Agent-visible checklist. | Useful progress hint, not authoritative lifecycle. |

A compatible runtime SHOULD expose all five concepts when they exist instead of flattening them into one message or one task string.

Task record

A task SHOULD include these fields:

| Field | Purpose |
| --- | --- |
| task_id | Stable task id. |
| parent_task_id / root_task_id | Task graph linkage. |
| session_id / thread_id / turn_id | Conversation or execution context linkage when applicable. |
| title / objective | Human-readable work statement. |
| task_kind / task_family | Portable classification and coarse grouping. |
| visibility | foreground, background, internal, or hidden. |
| status | Normalized lifecycle status. |
| priority | Scheduling hint, not a completion guarantee. |
| requested_by / owner / assignee | User, agent, workflow, channel, or worker attribution. |
| scope | Workspace, project, thread, account, environment, or host boundary refs. |
| constraints | Permission, sandbox, network, model, tool, cost, time, and output constraints. |
| task_profile | Capability, latency, budget, fallback, and continuity profile. |
| acceptance | Completion criteria or refs. |
| progress | Percent, phase, current step, summary, counters, or output refs. |
| current_run_id | Active run, if any. |
| attempts | Prior and active runs. |
| relationships | Dependencies, blocks, child tasks, source tasks, spawned agents, artifacts. |
| artifacts / evidence_refs | Produced or consumed refs. |
| last_error / status_reason | Structured failure, block, or wait explanation. |
| created_at / updated_at / started_at / ended_at | Lifecycle timestamps. |
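
As a rough illustration, a subset of the fields above could be held in a record like the following. This is a minimal sketch, not a normative schema: the class name `TaskRecord`, the id format, the Python types, and the default values are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
import uuid


@dataclass
class TaskRecord:
    """Illustrative subset of the task record fields (hypothetical shape)."""
    objective: str
    task_id: str = field(default_factory=lambda: f"task_{uuid.uuid4().hex}")
    parent_task_id: Optional[str] = None
    root_task_id: Optional[str] = None
    visibility: str = "foreground"  # foreground, background, internal, or hidden
    status: str = "draft"           # normalized lifecycle status
    priority: int = 0               # scheduling hint, not a completion guarantee
    constraints: dict = field(default_factory=dict)
    acceptance: list = field(default_factory=list)
    current_run_id: Optional[str] = None
    attempts: list = field(default_factory=list)
    evidence_refs: list = field(default_factory=list)
    status_reason: Optional[str] = None
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        # Assumption: a task without a parent is the root of its own graph.
        if self.root_task_id is None:
            self.root_task_id = self.task_id


root = TaskRecord(objective="Migrate the billing service to the new API")
child = TaskRecord(
    objective="Update the invoice client",
    parent_task_id=root.task_id,
    root_task_id=root.root_task_id,
    visibility="background",
)
```

The child keeps both `parent_task_id` and `root_task_id`, so a consumer can find the whole graph from any node without walking every edge.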

Status model

A portable runtime SHOULD support these normalized task statuses:

| Status | Meaning |
| --- | --- |
| draft | Defined but not yet accepted by runtime. |
| accepted | Runtime accepted the task and assigned identity. |
| queued | Waiting for scheduler, queue, dependency, or worker capacity. |
| preparing | Resolving context, tools, model, policy, or environment. |
| running | Active execution is making progress. |
| waiting_input | Waiting for user or external structured input. |
| waiting_permission | Waiting for human, policy, or host approval. |
| waiting_resource | Waiting for credential, quota, file, network, worker, or external system. |
| blocked | Cannot proceed until a named blocker is resolved. |
| paused | Intentionally paused and resumable. |
| retrying | A retry or fallback run is being prepared or active. |
| cancelling | Cancellation requested; cleanup is in progress. |
| cancelled | Stopped by user, policy, or runtime. |
| timed_out | Stopped because a time or inactivity limit fired. |
| failed | Terminal failure with error facts. |
| lost | Runtime cannot prove whether the worker is still alive. |
| completed | Acceptance criteria are satisfied and facts are reconciled. |
| archived | No longer active in scheduling, but retained for history. |
| stale | Snapshot may not reflect current execution. |
| unknown | Runtime lacks enough facts to assert a status. |

Implementation-native states MAY be preserved in native_status, but portable consumers SHOULD receive the normalized status.
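
One way to keep both values is to normalize at the boundary and carry the native state alongside. The sketch below assumes a hypothetical implementation whose native states are `PENDING`, `IN_PROGRESS`, `NEEDS_APPROVAL`, and `DONE`; those names and the `normalize` helper are invented for illustration.

```python
from enum import Enum


class TaskStatus(str, Enum):
    DRAFT = "draft"
    ACCEPTED = "accepted"
    QUEUED = "queued"
    PREPARING = "preparing"
    RUNNING = "running"
    WAITING_INPUT = "waiting_input"
    WAITING_PERMISSION = "waiting_permission"
    WAITING_RESOURCE = "waiting_resource"
    BLOCKED = "blocked"
    PAUSED = "paused"
    RETRYING = "retrying"
    CANCELLING = "cancelling"
    CANCELLED = "cancelled"
    TIMED_OUT = "timed_out"
    FAILED = "failed"
    LOST = "lost"
    COMPLETED = "completed"
    ARCHIVED = "archived"
    STALE = "stale"
    UNKNOWN = "unknown"


# Hypothetical native states of one implementation; not part of the contract.
NATIVE_TO_NORMALIZED = {
    "PENDING": TaskStatus.QUEUED,
    "IN_PROGRESS": TaskStatus.RUNNING,
    "NEEDS_APPROVAL": TaskStatus.WAITING_PERMISSION,
    "DONE": TaskStatus.COMPLETED,
}


def normalize(native_status: str) -> dict:
    """Return the normalized status while preserving the native value."""
    # Unmapped native states become "unknown" instead of being guessed.
    status = NATIVE_TO_NORMALIZED.get(native_status, TaskStatus.UNKNOWN)
    return {"status": status.value, "native_status": native_status}


mapped = normalize("IN_PROGRESS")
unmapped = normalize("HIBERNATING")
```

Falling back to `unknown` for unmapped states is a design choice of this sketch; it avoids asserting a status the runtime cannot justify.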

Attempts and runs

A task SHOULD keep attempts rather than replacing history on retry.

A task_attempt or run SHOULD include:

| Field | Purpose |
| --- | --- |
| run_id / attempt_id | Stable execution attempt id. |
| status | Run lifecycle status. |
| worker | Agent, process, hosted worker, scheduler, or external runtime. |
| input_refs | Prompt, files, dataset rows, schedule trigger, or event refs. |
| output_refs | Result, stdout/stderr, artifact, report, or external output refs. |
| checkpoint_refs | Resume, rollback, or reconstruction boundaries. |
| started_at / ended_at | Attempt timing. |
| attempt_count | Retry count or ordinal. |
| retry_policy | Max attempts, backoff, non-retryable errors, timeout policy. |
| last_error | Structured failure fact. |
| completion_summary | Final human-readable but non-authoritative summary. |

A new attempt SHOULD be created for retry after terminal error, worker loss, crash recovery, different routing decision, or user-requested rerun.
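
The append-only behavior can be sketched as follows. The `Task` and `Attempt` classes, the `retry_reason` field, and the run id format are hypothetical; the point is only that a retry adds a new attempt and leaves the failed one intact.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Attempt:
    run_id: str
    attempt_count: int                  # ordinal of this attempt
    status: str = "running"
    retry_reason: Optional[str] = None  # hypothetical field for this sketch
    last_error: Optional[dict] = None


@dataclass
class Task:
    task_id: str
    attempts: list = field(default_factory=list)
    current_run_id: Optional[str] = None

    def start_attempt(self, retry_reason: Optional[str] = None) -> Attempt:
        ordinal = len(self.attempts) + 1
        attempt = Attempt(
            run_id=f"{self.task_id}-run-{ordinal}",
            attempt_count=ordinal,
            retry_reason=retry_reason,
        )
        self.attempts.append(attempt)  # append; prior attempts are never overwritten
        self.current_run_id = attempt.run_id
        return attempt

    def fail_attempt(self, error: dict) -> None:
        current = self.attempts[-1]
        current.status = "failed"
        current.last_error = error
        self.current_run_id = None


task = Task("task-1")
task.start_attempt()
task.fail_attempt({"code": "worker_lost"})
task.start_attempt(retry_reason="worker_lost")
```

After the retry, the first attempt still carries its failure fact, so an audit can see why the second run exists.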

Task graph

A task graph SHOULD represent relationships explicitly:

| Relationship | Meaning |
| --- | --- |
| parent / child | Decomposition or delegation. |
| blocks / blocked_by | Dependency ordering. |
| source_task | Output or context came from another task. |
| source_attempt | Output or context came from a specific run. |
| spawned_subagent | Child agent context was created for the task. |
| assigned_thread | Thread currently executing part of the task. |
| produced_artifact | Artifact ref produced by the task. |
| consumed_artifact | Artifact ref used as input. |
| evidence | Evidence, replay, review, trace, or audit ref. |

Edges SHOULD carry created_at, updated_at, status, and optional reason. Cancellation intent SHOULD be recorded on the graph itself and persist there, so that schedulers stop adding new child work while active children settle.
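
A minimal sketch of graph-level cancellation intent, assuming intent is tracked per root task id. The `TaskGraph` and `Edge` classes and their method names are hypothetical, and a real runtime would persist this state instead of holding it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Edge:
    kind: str    # e.g. "child", "blocks", "produced_artifact"
    source: str
    target: str
    status: str = "active"
    reason: Optional[str] = None
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)


class TaskGraph:
    def __init__(self):
        self.edges = []
        self.cancel_requested = set()  # root task ids with recorded cancel intent

    def request_cancel(self, root_task_id: str) -> None:
        # Record intent only; children that are already active settle on their own.
        self.cancel_requested.add(root_task_id)

    def add_child(self, root_task_id: str, parent: str, child: str) -> Optional[Edge]:
        # Refuse new child work once cancellation intent exists for this graph.
        if root_task_id in self.cancel_requested:
            return None
        edge = Edge(kind="child", source=parent, target=child)
        self.edges.append(edge)
        return edge


graph = TaskGraph()
first = graph.add_child("root-1", "root-1", "child-a")
graph.request_cancel("root-1")
second = graph.add_child("root-1", "root-1", "child-b")
```

Here the first child edge survives the cancel request, while the second is never created.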

A2A peer tasks

When a task is delegated to an Agent2Agent peer, the local runtime SHOULD create a local task wrapper or remote task ref instead of replacing its task model with the peer protocol object.

An A2A mapping SHOULD preserve:

| Field | Purpose |
| --- | --- |
| native_protocol | a2a or another peer protocol name. |
| remote_task_id | A2A taskId or peer-native task id. |
| remote_context_id | A2A contextId when supplied. |
| remote_agent_ref / agent_card_ref | Agent Card, discovery record, or configured peer. |
| delivery_mechanism | polling, streaming, subscription, push notification, or custom. |
| remote_status / native_status | Peer-native state before normalization. |
| remote_artifact_refs | Artifact refs received from the peer. |

A2A messages SHOULD map to task input, clarification, or status events. A2A artifacts SHOULD map to durable artifact refs and task graph edges. Completion SHOULD require both peer terminal state and local reconciliation of artifacts, evidence, and delivery facts.
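
The two-part completion rule can be sketched as below. The peer state names in the mapping are assumptions and should be checked against the version of the A2A specification in use; the `local_status` helper and the choice to report `running` while reconciliation is pending are also assumptions of this example.

```python
# Assumed peer state names; verify against the A2A spec version in use.
A2A_TO_NORMALIZED = {
    "submitted": "queued",
    "working": "running",
    "input-required": "waiting_input",
    "auth-required": "waiting_permission",
    "completed": "completed",
    "canceled": "cancelled",
    "failed": "failed",
    "rejected": "failed",
}


def local_status(remote_status: str, reconciled: bool) -> dict:
    """Normalize a peer state; completion also needs local reconciliation."""
    normalized = A2A_TO_NORMALIZED.get(remote_status, "unknown")
    if normalized == "completed" and not reconciled:
        # The peer is done, but local artifacts, evidence, and delivery facts
        # are not reconciled yet, so the local task is not reported completed.
        normalized = "running"
    return {
        "status": normalized,
        "native_protocol": "a2a",
        "remote_status": remote_status,  # peer-native state is preserved
    }


pending = local_status("completed", reconciled=False)
done = local_status("completed", reconciled=True)
odd = local_status("some-future-state", reconciled=False)
```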

Progress and output

A runtime SHOULD report progress as facts, not only natural language:

  • phase: planning, working, verifying, delivering, waiting, or implementation-specific values.
  • current step or checklist summary.
  • counters for items, child tasks, job items, tool calls, tests, or files.
  • new output refs and output offsets for append-only logs.
  • delivery state: pending, delivered, queued, failed, parent missing, or not applicable.
  • validation state and acceptance coverage.

Large output SHOULD use refs. A progress event MAY carry a small summary and point to stdout, artifact, report, or evidence refs.
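
As an illustration, a progress event can carry a ref plus an offset into an append-only log in place of the output itself. The in-memory `LOGS` store, the `log://` ref scheme, and the event field layout are hypothetical stand-ins chosen for this sketch.

```python
LOGS = {}  # hypothetical in-memory stand-in for a durable append-only log store


def append_output(task_id: str, chunk: str) -> dict:
    """Append a chunk to the task's log and return a ref with offset and length."""
    ref = f"log://{task_id}/stdout"  # hypothetical ref scheme
    existing = LOGS.get(ref, "")
    LOGS[ref] = existing + chunk
    return {"ref": ref, "offset": len(existing), "length": len(chunk)}


def progress_event(task_id: str, phase: str, summary: str, chunk: str, counters: dict) -> dict:
    output = append_output(task_id, chunk)
    return {
        "type": "task.progress",
        "task_id": task_id,
        "phase": phase,
        "summary": summary[:80],   # small summary only; full output stays behind the ref
        "counters": counters,
        "output_refs": [output],
        "delivery_state": "pending",
    }


e1 = progress_event("task-1", "working", "compiled 3 files", "build ok\n", {"files": 3})
e2 = progress_event("task-1", "verifying", "ran 12 tests", "12 passed\n", {"tests": 12})
```

A consumer that has already read up to an offset can fetch only the new bytes named by the next event.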

Events

A compatible runtime SHOULD emit the normalized task event family:

| Event | When emitted |
| --- | --- |
| task.created | Task identity and initial objective exist. |
| task.accepted | Runtime accepted ownership. |
| task.queued | Task entered a queue. |
| task.started | First run or active execution started. |
| task.updated | Metadata, owner, constraints, or profile changed. |
| task.progress | Progress, counters, phase, summary, or output refs changed. |
| task.waiting | Task waits for input, permission, resource, dependency, or worker. |
| task.blocked | Task cannot proceed until a blocker is resolved. |
| task.paused / task.resumed | Task was paused or resumed. |
| task.retrying | Retry or fallback attempt began. |
| task.cancel_requested | Cancellation intent recorded. |
| task.cancelled | Task reached cancelled terminal state. |
| task.timed_out | Runtime timeout or inactivity timeout fired. |
| task.failed | Task reached failed terminal state. |
| task.lost | Runtime lost authoritative worker state. |
| task.completed | Task reached completed terminal state. |
| task.archived | Task was removed from active scheduling. |
| task.delegated | Child task, subagent, job, or worker assignment was created. |
| task.dependency.updated | Relationship or blocker state changed. |
| task.attempt.started / task.attempt.completed / task.attempt.failed | Attempt lifecycle changed. |

Task events SHOULD carry task_id; attempt events SHOULD also carry run_id or attempt_id.
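
The id requirements could be enforced at the point where events are built. The `make_event` helper and the event envelope below are hypothetical, and for brevity the sketch accepts only `run_id` where the text also allows `attempt_id`.

```python
from typing import Optional


def make_event(event_type: str, task_id: str, run_id: Optional[str] = None, **facts) -> dict:
    """Build a task event and reject envelopes that lack the required ids."""
    if not event_type.startswith("task."):
        raise ValueError(f"not a task event: {event_type}")
    if not task_id:
        raise ValueError("task events must carry task_id")
    if event_type.startswith("task.attempt.") and run_id is None:
        raise ValueError("attempt events must also carry run_id")
    event = {"type": event_type, "task_id": task_id, **facts}
    if run_id is not None:
        event["run_id"] = run_id
    return event


progress = make_event("task.progress", "task-1", phase="working")
started = make_event("task.attempt.started", "task-1", run_id="task-1-run-1")
```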

Control plane

A runtime SHOULD expose or map these operations:

| Command | Required semantics |
| --- | --- |
| create_task | Create task identity with objective, scope, profile, constraints, and idempotency key. |
| update_task | Update title, objective, metadata, priority, assignee, acceptance, or constraints. |
| start_task | Start a run, bind worker/thread/environment, and emit attempt facts. |
| append_task_progress | Append progress, output refs, counters, or delivery state. |
| pause_task / resume_task | Pause or resume without losing graph and attempt history. |
| cancel_task | Record cancellation intent and propagate to active workers or child tasks. |
| retry_task | Start a new attempt with explicit retry reason and inherited constraints. |
| complete_task / fail_task | Reconcile terminal state, artifacts, evidence, and acceptance facts. |
| list_tasks / get_task | Return durable task read models. |
| link_tasks / unlink_tasks | Update parent, child, block, source, artifact, or evidence edges. |

Mutating commands MUST write runtime facts. UI-only task cards or local optimistic state are not authoritative.
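
A minimal sketch of two mutating commands that write facts, assuming an in-memory store with an append-only fact list. The `TaskStore` class, its id format, and the fact shapes are hypothetical; a real runtime would write to durable storage.

```python
class TaskStore:
    """Hypothetical in-memory store; each mutation appends a runtime fact."""

    def __init__(self):
        self.facts = []   # append-only fact log
        self.tasks = {}   # task_id -> task read model
        self.by_key = {}  # idempotency key -> task_id

    def create_task(self, objective: str, idempotency_key: str) -> dict:
        # A repeated call with the same key returns the existing task
        # and writes no new facts.
        if idempotency_key in self.by_key:
            return self.tasks[self.by_key[idempotency_key]]
        task_id = f"task-{len(self.tasks) + 1}"
        task = {"task_id": task_id, "objective": objective, "status": "accepted"}
        self.tasks[task_id] = task
        self.by_key[idempotency_key] = task_id
        self.facts.append({"type": "task.created", "task_id": task_id})
        self.facts.append({"type": "task.accepted", "task_id": task_id})
        return task

    def cancel_task(self, task_id: str, reason: str) -> dict:
        task = self.tasks[task_id]
        task["status"] = "cancelling"  # intent recorded; cleanup still pending
        self.facts.append(
            {"type": "task.cancel_requested", "task_id": task_id, "reason": reason}
        )
        return task


store = TaskStore()
a = store.create_task("Summarize the incident report", idempotency_key="req-42")
b = store.create_task("Summarize the incident report", idempotency_key="req-42")
store.cancel_task(a["task_id"], reason="user request")
```

The duplicate `create_task` call is absorbed by the idempotency key, so the fact log holds one creation, one acceptance, and one cancel request.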

Snapshot projection

Read models SHOULD expose:

  • task_summary: active count, terminal count, failed count, lost count, waiting count, and recent terminal tasks.
  • tasks: compact task records with status, title, owner, scope, current run, progress, relationships, and refs.
  • task_graph: edges needed to recover parent/child and dependency views.
  • attempts: active and recent attempts with output and checkpoint refs.
  • blocked_tasks: blockers and required action ids.
  • delivery_state: whether output was delivered back to parent, channel, or UI.

Snapshots SHOULD mark missing data as unknown, stale, or not_applicable rather than inferring success.
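
A rough sketch of a summary projection that counts missing data as `unknown`. The grouping of statuses into `WAITING` and `TERMINAL` sets is an assumption of this example, as are the helper names and the dict-based task shape.

```python
from collections import Counter

# Assumed groupings for this sketch; a runtime may draw these lines differently.
WAITING = {"waiting_input", "waiting_permission", "waiting_resource"}
TERMINAL = {"completed", "failed", "cancelled", "timed_out"}


def task_summary(tasks: list) -> dict:
    # A missing or empty status is counted as "unknown", never as success.
    counts = Counter(t.get("status") or "unknown" for t in tasks)
    return {
        "terminal_count": sum(counts[s] for s in TERMINAL),
        "failed_count": counts["failed"],
        "lost_count": counts["lost"],
        "waiting_count": sum(counts[s] for s in WAITING),
        "unknown_count": counts["unknown"],
    }


def delivery_state(task: dict) -> str:
    # Absent delivery facts are reported as unknown, not inferred as delivered.
    return task.get("delivery_state", "unknown")


tasks = [
    {"task_id": "t1", "status": "completed", "delivery_state": "delivered"},
    {"task_id": "t2", "status": "lost"},
    {"task_id": "t3"},
    {"task_id": "t4", "status": "waiting_input"},
]
summary = task_summary(tasks)
states = [delivery_state(t) for t in tasks]
```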

Anti-patterns

  • Treating the model's todo list as the task lifecycle authority.
  • Replacing a task record on retry and losing prior attempt failures.
  • Keeping background work only in UI memory, so restart loses the task.
  • Reporting completed before artifacts, evidence, or delivery facts are reconciled.
  • Creating child agents or jobs without task graph edges.
  • Emitting only final prose for a long-running task without progress, output refs, or errors.
  • Using scheduler ticks as product-level tasks.
  • Treating a lost worker as success because no error was observed.

Draft standard for portable agent execution runtimes.