Objectives

Objectives define the intended outcomes that each Unitt is expected to achieve and act as the primary benchmark for measuring runtime success. Every Objective should define the expected process output in a way that can be validated on each run, allowing the runtime to continuously evaluate whether execution remains aligned with the intended operational goals of the system. The recommended design pattern is for each Unitt to maintain one primary objective that defines the core mission of the runtime, decomposed into a small set of sub-objectives that the runtime gates and validates independently.

Objectives are informed by the active agentic goal-specification research lineage, including the Anthropic multi-agent research system (which names objective ambiguity as the number-one driver of duplicated work), Plan-and-Solve prompting (Wang et al., ACL 2023), Plan-and-Execute (LangGraph), HTN / ChatHTN (Muñoz-Avila 2025), MetaGPT SOP staging, AgentBoard sub-goal progress scoring, τ-bench pass^k reliability framing, OKR-Agent (Zheng et al.), the OWASP Top 10 for Agentic Applications 2026 "excessive agency" mitigation guidance, and the Graph Harness replan-vs-repair discipline. Selection criteria for objective shape and decomposition strategy are documented in Reference › Research › Assembly Objectives.

Objective Shape

A primary objective is the verifiable end state the Unitt is responsible for producing; it is outcome-shaped rather than task-shaped. Outcome-shaped objectives (the existence of a record, a passing test, a closed ticket, a measurable score) survive replanning and tool drift because they are independent of the procedure used to reach them. Task-shaped objectives ("call API X then API Y") break the moment a tool changes. The Anthropic subagent template enumerates four load-bearing fields that every objective spec should declare: objective, output format, tool and source guidance, and explicit task boundaries.

flowchart TD
    PO[Primary Objective] --> OF[Output Format]
    PO --> TG[Tool / Source Guidance]
    PO --> TB[Task Boundaries]
    PO --> SC[Success Condition]
    PO --> CN[Constraints]

    SC --> OR[Outcome Oracle]
    OR --> DET[Deterministic: schema / hash / unit test]
    OR --> JDG[LLM-as-Judge: rubric for subjective]

    classDef stage fill:#ffd541,stroke:#222021,color:#222021
    class PO,OF,TG,TB,SC,CN,OR,DET,JDG stage
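
The fields in the diagram above can be captured as a small typed record. The following is a minimal sketch, not the platform's actual schema: `ObjectiveSpec`, its field names, and `email_oracle` are hypothetical names introduced for illustration, and the oracle shown is the deterministic (schema-style) kind rather than an LLM-as-Judge rubric.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ObjectiveSpec:
    """Hypothetical objective spec; names are illustrative assumptions."""
    objective: str                    # outcome-shaped end state
    output_format: str                # expected shape of the final artifact
    tool_guidance: tuple[str, ...]    # tools / sources the run may use
    task_boundaries: tuple[str, ...]  # what is explicitly out of scope
    oracle: Callable[[Any], bool]     # success condition as a predicate
    constraints: tuple[str, ...] = ()

    def is_satisfied(self, outcome: Any) -> bool:
        # The oracle inspects the end state only, never the procedure.
        return self.oracle(outcome)


def email_oracle(outcome: Any) -> bool:
    """Deterministic check: required keys present and approval recorded."""
    return (
        isinstance(outcome, dict)
        and bool(outcome.get("subject"))
        and bool(outcome.get("body"))
        and outcome.get("approved_by_human") is True
    )


spec = ObjectiveSpec(
    objective="A qualified outbound prospecting email exists for the target company.",
    output_format="dict with 'subject', 'body', 'approved_by_human'",
    tool_guidance=("company_website", "crm_lookup"),
    task_boundaries=("do not send the email",),
    oracle=email_oracle,
    constraints=("human approval before send",),
)
```

Because the oracle only looks at the resulting artifact, the same spec stays valid if the tools used to produce the email change.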

Sub-Objectives

Sub-objectives define the smaller staged execution goals that collectively work toward completion of the primary objective. These sub-objectives allow the runtime to decompose larger operational tasks into sequential, parallel, or iterative execution stages that can be individually validated, monitored, retried, optimized, or escalated throughout execution. Sub-objectives may define dependencies, execution ordering, validation checkpoints, or success conditions, allowing the runtime to continuously measure progress while coordinating workflows, skills, tools, patterns, and connectors toward completion of the overall objective.

The platform supports five canonical decomposition strategies (Plan-and-Solve, Plan-and-Execute, MetaGPT SOP staging, HTN / ChatHTN, and Graph Harness DAG execution), selected based on the workload profile and documented in Reference › Research › Assembly Objectives. Published practice suggests 3-5 sub-objectives for coarse plans and 8-10 for detailed plans, with each sub-objective sized to roughly one tool-call cluster.

flowchart LR
    PO[Primary Objective] --> D1{Decomposition Strategy}
    D1 --> PS[Plan-and-Solve]
    D1 --> PE[Plan-and-Execute]
    D1 --> SOP[MetaGPT SOP]
    D1 --> HTN[HTN / ChatHTN]
    D1 --> DAG[Graph Harness DAG]

    PS --> SO[Sub-Objectives]
    PE --> SO
    SOP --> SO
    HTN --> SO
    DAG --> SO

    classDef stage fill:#ffd541,stroke:#222021,color:#222021
    class PO,D1,PS,PE,SOP,HTN,DAG,SO stage

Primary Example

Example Objective Prompt

Generate a qualified outbound prospecting email for a target company.

Example Sub-Objectives

Research the company website.
Build a structured company profile.
Summarize the business model.
Identify nearest competitors.
Validate competitor pricing.
Analyze market positioning.
Generate a draft email.

Sub-Objective Dependency Graph

Sub-objectives form a DAG, not a strict tree. Independent sub-objectives execute in parallel waves; dependent sub-objectives wait on upstream artifacts. The Graph Harness discipline mandates three commitments that the platform enforces: the plan is immutable for the lifetime of a plan version; planning, execution, and recovery are separate layers; and recovery escalates in order, from retry to local patch, before any global replan is permitted.

flowchart LR
    SO1[Research Website] --> SO2[Build Profile]
    SO1 --> SO4[Identify Competitors]
    SO2 --> SO3[Summarize Business Model]
    SO4 --> SO5[Validate Pricing]
    SO5 --> SO6[Analyze Positioning]
    SO3 --> SO7[Draft Email]
    SO6 --> SO7
    SO7 --> HR[Human Approval]
    HR --> OUT[Outcome]

    classDef stage fill:#ffd541,stroke:#222021,color:#222021
    class SO1,SO2,SO3,SO4,SO5,SO6,SO7,HR,OUT stage
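
The parallel waves implied by the graph above can be derived with a topological sort. This is a minimal sketch using Python's standard-library `graphlib`; the node identifiers mirror the diagram, while `DEPENDENCIES` and `execution_waves` are hypothetical names and not part of the platform.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on (its upstream artifacts).
DEPENDENCIES = {
    "SO1": set(),            # Research Website
    "SO2": {"SO1"},          # Build Profile
    "SO4": {"SO1"},          # Identify Competitors
    "SO3": {"SO2"},          # Summarize Business Model
    "SO5": {"SO4"},          # Validate Pricing
    "SO6": {"SO5"},          # Analyze Positioning
    "SO7": {"SO3", "SO6"},   # Draft Email
    "HR": {"SO7"},           # Human Approval
    "OUT": {"HR"},           # Outcome
}


def execution_waves(dependencies):
    """Group nodes into waves; nodes in one wave have no mutual dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)                 # mark the wave complete
    return waves


for index, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
    print(f"wave {index}: {', '.join(wave)}")
```

In this example the profile and competitor branches run side by side, and the draft email waits until both the business-model summary and the positioning analysis are available.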

Validation Checkpoints

Every sub-objective edge is wrapped in a typed gate that enforces pre-conditions, executes the sub-objective, validates post-conditions, evaluates confidence, and either commits a checkpoint or routes to retry, local repair, replan, or escalation. The gate is the same shape used by the Fabric › Flow layer, applied at the objective decomposition tier instead of the runtime topology tier.

flowchart LR
    READY[Ready] --> PRE[Pre-Condition Gate]
    PRE -->|fail| ESC[Escalate / Replan]
    PRE -->|pass| EX[Execute Sub-Objective]
    EX --> POST[Post-Condition Gate]
    POST -->|fail| RT{Retries < N?}
    RT -->|yes| EX
    RT -->|no| LR[Local Repair]
    LR -->|success| POST
    LR -->|fail| RP[Replan]
    RP -->|fail| ESC
    POST -->|pass| CONF{Confidence >= τ?}
    CONF -->|yes| CKPT[Checkpoint + Unblock Dependents]
    CONF -->|no| HRQ[Human Review Queue]

    classDef stage fill:#ffd541,stroke:#222021,color:#222021
    class READY,PRE,EX,POST,RT,LR,RP,ESC,CONF,CKPT,HRQ stage
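
The control flow in the diagram above can be sketched as a small state machine. This is an illustrative sketch under stated assumptions: `Gate`, `GateResult`, and `counting_executor` are hypothetical names, the default retry limit and threshold τ are placeholders, and a failed local repair simply returns `REPLAN` to the caller, leaving the replan-then-escalate step to a separate planning layer.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional


class GateResult(Enum):
    CHECKPOINT = "checkpoint"      # commit and unblock dependents
    HUMAN_REVIEW = "human_review"  # post-condition passed, confidence too low
    REPLAN = "replan"              # retries and local repair exhausted
    ESCALATE = "escalate"          # pre-condition failed


@dataclass
class Gate:
    pre: Callable[[], bool]
    execute: Callable[[], Any]
    post: Callable[[Any], bool]
    confidence: Callable[[Any], float]
    repair: Optional[Callable[[Any], Any]] = None
    max_retries: int = 2   # "N" in the diagram (assumed default)
    tau: float = 0.8       # confidence threshold (assumed default)

    def run(self) -> GateResult:
        if not self.pre():
            return GateResult.ESCALATE
        result = self.execute()
        ok = self.post(result)
        retries = 0
        while not ok and retries < self.max_retries:
            retries += 1
            result = self.execute()
            ok = self.post(result)
        if not ok and self.repair is not None:
            result = self.repair(result)   # local repair, then re-validate
            ok = self.post(result)
        if not ok:
            return GateResult.REPLAN
        if self.confidence(result) >= self.tau:
            return GateResult.CHECKPOINT
        return GateResult.HUMAN_REVIEW


def counting_executor() -> Callable[[], int]:
    """Toy executor that returns its own attempt number."""
    attempts = {"n": 0}

    def run() -> int:
        attempts["n"] += 1
        return attempts["n"]

    return run


# The post-condition only passes on the third attempt (two retries).
gate = Gate(
    pre=lambda: True,
    execute=counting_executor(),
    post=lambda attempt: attempt >= 3,
    confidence=lambda attempt: 0.9,
)
outcome = gate.run()
```

Keeping the gate free of planning logic reflects the separation of planning, execution, and recovery described earlier: the gate reports what happened, and a different layer decides how to replan.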

Constraints

Constraints define the maximum operational boundaries of what a Unitt is permitted to attempt while pursuing its Objectives. Constraints are designed to limit unsafe, excessive, misaligned, or unauthorized execution behavior by establishing explicit runtime limits around actions, autonomy, permissions, budgets, communication, or escalation behavior. Constraints are continuously evaluated throughout execution to ensure that runtime activity remains controllable, governable, and aligned with the intended safety and operational boundaries of the system.

Constraints map directly to the OWASP Top 10 for Agentic Applications mitigations for Excessive Agency (ASI category): least-agency posture (minimum autonomy to complete the task), per-tool profiles restricting permissions and data, explicit human confirmation for sensitive actions, isolated execution environments with enforced network policies, identity isolation between tasks, and a central policy engine that checks every sensitive action. The Constraint surface is the platform's primary user-facing knob for enforcing these mitigations declaratively.

Constraint Example

Example Constraint Prompt

Define the runtime boundaries for an outbound prospecting agent that may research approved companies, compare competitors, and draft emails. Specify what the agent must never do, what requires human approval, which systems it may use, how far execution may proceed, what budget and confidence limits apply, and when execution should stop or escalate.

Example Outcome Constraints

Do not send emails without human approval.
Do not contact companies outside the approved target list.
Limit runtime execution to 15 workflow stages.
Do not exceed the configured token or API budget.
Only use approved connectors and authenticated tools.
Escalate for review if confidence falls below 80%.
Do not store sensitive customer data in long-term memory.
Stop execution if competitor pricing cannot be validated.
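
Several of the example constraints above are mechanical enough to check in code before each action. The sketch below is illustrative only: `Constraints`, `ProposedAction`, `Decision`, and `evaluate` are hypothetical names, and it covers only the target-list, connector, stage-limit, confidence, and approval rules, not budget or memory policies.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    ESCALATE = "escalate"
    DENY = "deny"


@dataclass(frozen=True)
class Constraints:
    approved_targets: frozenset[str]
    approved_tools: frozenset[str]
    approval_required: frozenset[str] = frozenset({"send_email"})
    max_stages: int = 15          # "Limit runtime execution to 15 workflow stages."
    min_confidence: float = 0.80  # "Escalate for review if confidence falls below 80%."


@dataclass(frozen=True)
class ProposedAction:
    tool: str
    target: str
    stage: int
    confidence: float


def evaluate(action: ProposedAction, limits: Constraints) -> Decision:
    """Check hard limits first, then escalation, then approval."""
    if action.target not in limits.approved_targets:
        return Decision.DENY
    if action.tool not in limits.approved_tools:
        return Decision.DENY
    if action.stage > limits.max_stages:
        return Decision.DENY
    if action.confidence < limits.min_confidence:
        return Decision.ESCALATE
    if action.tool in limits.approval_required:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW


limits = Constraints(
    approved_targets=frozenset({"acme.example"}),
    approved_tools=frozenset({"web_research", "send_email"}),
)
research = ProposedAction("web_research", "acme.example", stage=2, confidence=0.93)
send = ProposedAction("send_email", "acme.example", stage=7, confidence=0.93)
print(evaluate(research, limits), evaluate(send, limits))
```

Ordering the checks from hard denial to approval means a low-confidence send to an unapproved company is refused outright rather than queued for a human.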

Reliability And Cost Framing

Objectives are scored on pass^k reliability, the probability that all k independent runs of the same task succeed, rather than on single-shot pass^1. τ-bench reports GPT-4-class agents falling below 25% at pass^8 even when pass^1 is around 50%, so a single successful run is not a sufficient outcome signal. Objectives in Emergence are paired with a pass^k floor and a cost-per-success ceiling at definition time; both flow into the Fabric › Test release gate.

| Metric | Use |
| --- | --- |
| `pass^k` | Floor on the probability that all k runs succeed, set per workload tier. |
| Cost-per-success | Ceiling on total dollars per successful run. |
| Sub-goal progress rate | Triage signal localizing failure to a sub-objective. |
| Outcome oracle | Final-state predicate the runtime evaluates against. |
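
The first two metrics can be estimated from recorded trial counts. The sketch below assumes the combinatorial estimator commonly associated with τ-bench, in which a task with c successes out of n trials contributes C(c, k) / C(n, k); the function names are hypothetical and the platform's own scoring may differ.

```python
from math import comb


def pass_hat_k(num_trials: int, num_successes: int, k: int) -> float:
    """Estimate the chance that k trials drawn from the n recorded all succeed."""
    if not 1 <= k <= num_trials:
        raise ValueError("k must be between 1 and num_trials")
    if not 0 <= num_successes <= num_trials:
        raise ValueError("num_successes must be between 0 and num_trials")
    # comb(c, k) is 0 when c < k, so too few successes yields 0.0.
    return comb(num_successes, k) / comb(num_trials, k)


def mean_pass_hat_k(per_task: list[tuple[int, int]], k: int) -> float:
    """Average the per-task estimate over (num_trials, num_successes) pairs."""
    return sum(pass_hat_k(n, c, k) for n, c in per_task) / len(per_task)


def cost_per_success(total_cost: float, num_successes: int) -> float:
    """Total spend divided by successful runs; infinite if nothing succeeded."""
    if num_successes == 0:
        return float("inf")
    return total_cost / num_successes


# One task, 8 recorded runs, 4 of them successful.
print(pass_hat_k(8, 4, 1))  # single-run success rate
print(pass_hat_k(8, 4, 8))  # all eight runs would need to succeed
print(cost_per_success(12.0, 4))
```

The gap between the k = 1 and k = 8 values for the same run log is why a floor on pass^k is a stricter release condition than a floor on average success rate.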

Objective Validation

Before a Unitt can run, its Objective spec must pass an explicit validation pipeline. Validation targets specification failures, the class to which the MAST taxonomy attributes roughly 42% of multi-agent failures.

flowchart LR
    SPEC[Objective Spec] --> DRY[Dry-Run Plan Generation]
    DRY --> COST[Cost Ceiling Estimation]
    COST --> POL[Policy Simulation]
    POL --> ORA[Oracle Reachability Check]
    ORA --> REP[Validation Report]
    REP -->|pass| READY[Unitt Ready]
    REP -->|fail| FIX[Return To Editor]

    classDef stage fill:#ffd541,stroke:#222021,color:#222021
    class SPEC,DRY,COST,POL,ORA,REP,READY,FIX stage

Dry-run plan generation produces the sub-objective DAG without side effects and lints it for unreachable nodes, missing post-conditions, and tool-permission gaps.
Cost ceiling estimation sums expected tokens and tool calls per node and compares to the configured budget.
Policy simulation evaluates a synthetic representative trajectory against policy.md to surface would-be excessive-agency steps.
Oracle reachability confirms the success oracle is reachable under the declared tool set.
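
The dry-run lint and cost-ceiling steps above lend themselves to a static check over the generated plan. The following is a minimal sketch under stated assumptions: `PlanNode` and `lint_plan` are hypothetical names, the plan is assumed to have a single entry node, and cost is approximated by expected tokens only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanNode:
    name: str
    depends_on: tuple[str, ...] = ()
    tools: tuple[str, ...] = ()
    has_post_condition: bool = True
    expected_tokens: int = 0


def lint_plan(nodes, root, allowed_tools, token_budget):
    """Return human-readable findings; an empty list means the plan passed."""
    findings = []
    dependents = {node.name: [] for node in nodes}
    for node in nodes:
        for dep in node.depends_on:
            if dep in dependents:
                dependents[dep].append(node.name)
            else:
                findings.append(f"{node.name}: unknown dependency '{dep}'")
    # Walk forward from the entry node to find everything that can ever run.
    reachable, stack = set(), [root]
    while stack:
        current = stack.pop()
        if current not in reachable:
            reachable.add(current)
            stack.extend(dependents.get(current, []))
    for node in nodes:
        if node.name not in reachable:
            findings.append(f"{node.name}: unreachable from '{root}'")
        if not node.has_post_condition:
            findings.append(f"{node.name}: missing post-condition")
        gap = sorted(set(node.tools) - set(allowed_tools))
        if gap:
            findings.append(f"{node.name}: tool-permission gap {gap}")
    total = sum(node.expected_tokens for node in nodes)
    if total > token_budget:
        findings.append(f"estimated tokens {total} exceed budget {token_budget}")
    return findings


plan = [
    PlanNode("research", tools=("web",), expected_tokens=500),
    PlanNode("profile", depends_on=("research",), has_post_condition=False,
             expected_tokens=700),
    PlanNode("send", depends_on=("draft",), tools=("smtp",), expected_tokens=100),
]
report = lint_plan(plan, root="research", allowed_tools={"web"}, token_budget=1000)
for finding in report:
    print(finding)
```

Nothing here executes a tool, which is the point of a dry run: every finding is produced from the plan's declared structure alone.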

Selection Heuristic

| Profile | Objective Shape | Decomposition | Validation |
| --- | --- | --- | --- |
| Short, single-tool, deterministic | Single outcome-shaped | None | Deterministic oracle (schema / regex / hash). |
| Multi-step reasoning, one agent | Single outcome | Plan-and-Solve prompt | Final-state oracle + self-check. |
| Long-horizon, mixed tools | Outcome + KRs | Plan-and-Execute (LangGraph) | Per-step deterministic gate + replay. |
| Domain SOP | Outcome per role | MetaGPT SOP staging | Structured handover artifact validation. |
| Parallelizable, data-flow heavy | Outcome with KR DAG | Graph Harness wave execution | Pre / post-condition gate per node. |
| Symbolic / verifiable domain | Outcome + formal post-conditions | HTN / ChatHTN | Symbolic proof + LLM-as-Judge fallback. |
| High-risk / non-reversible | Outcome + tight constraints | Decomposed + approval gates | Policy engine + HITL per sensitive node. |

Cross-References

Core supplies the identity, rules, and policies that bound the objective surface.
Patterns converts the objective into a runtime workflow graph.
Skills and Tools are the executable means the objective draws on.
Emergence › WorldSim and Fabric › Test score the objective end-to-end.
Reference › Research › Assembly Objectives documents citations and selection criteria.