Assembly Skills

The skill packaging and procedural-memory patterns referenced throughout the Assembly Skills section draw on the 2023-2026 agentic skill research lineage. This page catalogs the canonical skill formats, activation strategies, validation methods, and the selection criteria the platform uses when configuring skills for a given Unitt.

Anthropic Agent Skills

A Skill is a directory containing a SKILL.md plus optional scripts, templates, and assets. The file opens with YAML frontmatter in which name (≤64 chars, must match the folder) and description (≤1024 chars, governs activation) are required; optional fields include allowed-tools, license, and when_to_use (SKILL.md spec, agentskills.io specification). On December 18, 2025, Anthropic released Agent Skills as an open standard at agentskills.io and shipped org-wide management for Team / Enterprise admins. The same SKILL.md works unmodified across Claude Code, OpenAI Codex CLI, Gemini CLI, GitHub Copilot, Cursor, VS Code, Goose, Amp, and OpenCode, and Microsoft, OpenAI, Atlassian, Figma, Cursor, and GitHub have all adopted the standard.
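The frontmatter rules above can be checked mechanically. The following is a minimal sketch, not the reference validator: the function names are hypothetical, and the parser is a deliberate simplification that handles only flat, single-line `key: value` pairs rather than full YAML.

```python
from pathlib import PurePath

MAX_NAME = 64           # limit quoted in the text above
MAX_DESCRIPTION = 1024  # limit quoted in the text above


def parse_frontmatter(text: str) -> dict:
    """Parse a flat `key: value` block delimited by '---' lines.

    Simplification: real frontmatter is YAML; this handles single-line scalars only.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must open with a '---' frontmatter fence")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing fence reached
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key: value line: {line!r}")
        fields[key.strip()] = value.strip()
    raise ValueError("frontmatter fence is never closed")


def validate_skill(skill_md_path: str, text: str) -> list:
    """Return a list of problems; an empty list means every check passed."""
    fields = parse_frontmatter(text)
    folder = PurePath(skill_md_path).parent.name
    problems = []

    name = fields.get("name", "")
    if not name:
        problems.append("missing required field: name")
    elif len(name) > MAX_NAME:
        problems.append(f"name exceeds {MAX_NAME} chars")
    elif name != folder:
        problems.append(f"name {name!r} does not match folder {folder!r}")

    description = fields.get("description", "")
    if not description:
        problems.append("missing required field: description")
    elif len(description) > MAX_DESCRIPTION:
        problems.append(f"description exceeds {MAX_DESCRIPTION} chars")

    return problems
```

A check like this could run as a pre-commit or CI step so that a renamed folder or an overlong description is caught before the skill is distributed.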

Skill Activation And Retrieval

Activation uses progressive disclosure. At startup the agent pre-loads only each skill's name and description (roughly 5K tokens for 50 skills) into the system prompt; only when the model judges a skill relevant does it read the full SKILL.md body into context, and assets referenced from the body load on demand (Anthropic skill guide). For libraries too large for even the metadata to fit, langgraph-bigtool runs semantic search over tool / skill descriptions in an InMemoryStore and injects only the top-k matches at runtime. In both cases the description string is the load-bearing retrieval key.
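The two-stage loading described above can be sketched as a small index that exposes metadata eagerly and reads bodies lazily. This is an illustrative sketch only: `SkillEntry`, `SkillIndex`, and their methods are hypothetical names, not part of any agent runtime or of langgraph-bigtool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SkillEntry:
    name: str
    description: str
    load_body: Callable[[], str]  # deferred read of the SKILL.md body
    body: Optional[str] = None    # filled in on first activation


class SkillIndex:
    def __init__(self) -> None:
        self._skills: Dict[str, SkillEntry] = {}

    def register(self, name: str, description: str,
                 load_body: Callable[[], str]) -> None:
        self._skills[name] = SkillEntry(name, description, load_body)

    def startup_prompt(self) -> str:
        """Metadata only: one line per skill; no body is read here."""
        return "\n".join(
            f"- {s.name}: {s.description}" for s in self._skills.values()
        )

    def activate(self, name: str) -> str:
        """Read the full body the first time a skill is needed, then cache it."""
        entry = self._skills[name]
        if entry.body is None:
            entry.body = entry.load_body()
        return entry.body
```

In use, `startup_prompt()` would be appended to the system prompt once, and `activate(name)` would be called only after the model selects a skill, so the context cost of an unused skill stays at one metadata line.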

Procedural Memory

CoALA (Sumers et al., 2023) divides long-term memory into semantic, episodic, and procedural tiers; the procedural tier "encodes skills and procedures, often represented as code snippets, tool definitions, or implicitly within LLM parameters", which is exactly the substrate SKILL.md formalizes. Voyager (Wang et al., 2023) is the canonical reference: each Minecraft skill is a JavaScript function indexed by the embedding of its natural-language description; on a new task the top-5 skills are retrieved by cosine similarity and injected into the prompt. In Emergence, the Skills assembly is the procedural-memory tier: write-once-via-promotion, embedding-indexed, code-verified.
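Description-indexed retrieval of the kind Voyager uses can be sketched in a few lines. The `embed` function below is a toy bag-of-words stand-in, assumed purely for illustration; a real system would call an embedding model, and the function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(task: str, skills: dict, k: int = 5) -> list:
    """Return the names of the k skills whose descriptions best match the task.

    `skills` maps skill name -> natural-language description.
    """
    query = embed(task)
    ranked = sorted(
        skills,
        key=lambda name: cosine(query, embed(skills[name])),
        reverse=True,
    )
    return ranked[:k]
```

Only the description is embedded, never the skill body, which is why a vague or mislabeled description makes an otherwise correct skill unreachable.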

MCP Prompts Versus Skills

MCP defines three server primitives (tools, resources, and prompts), where Prompts are reusable templates a server offers over JSON-RPC alongside its capabilities (MCP architecture). Skills are the guidance layer (filesystem-resident, no protocol, no execution), while MCP is the execution layer (servers, auth, JSON-RPC). MCP Prompts live inside a single server and are pulled across a wire; Skills are static folders the host loads into the model's context. MCP gives an agent the ability to act; Skills tell it how (Maxim AI, Armin Ronacher).

Complex Skills

A complex Skill bundles a workflow (numbered steps in the SKILL.md body) plus executable scripts in the folder, declared tool / connector requirements (allowed-tools), and policy text. Anthropic cites examples like "fetch transactions, run anomaly detection, cross-reference against compliance policies, and generate reports" loaded in one activation pass. Runtime activation is deny-by-default for tools: a subagent inherits the skill's tool allowlist only, applying least privilege as immutable config the model cannot widen.
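The deny-by-default rule can be illustrated with an immutable allowlist object. This is a sketch under stated assumptions: `ToolPolicy` and `subagent_policy` are hypothetical names, and a real runtime would enforce the check at the tool-dispatch boundary rather than in user code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the policy cannot be mutated after creation
class ToolPolicy:
    allowed: frozenset

    def check(self, tool: str) -> None:
        """Raise unless the tool is explicitly allowlisted (deny by default)."""
        if tool not in self.allowed:
            raise PermissionError(f"tool {tool!r} is not in the skill allowlist")


def subagent_policy(skill_allowed_tools: list) -> ToolPolicy:
    """Build the policy a subagent inherits: the skill's allowlist and nothing more."""
    return ToolPolicy(frozenset(skill_allowed_tools))
```

Using a frozen dataclass over a `frozenset` means neither the field nor its contents can be reassigned at runtime, which mirrors the "immutable config the model cannot widen" property described above.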

Skill Validation

Anthropic's skill-creator v2 (released March 2026) adds four agents (Executor, Grader, Comparator, Analyzer) supporting Create / Eval / Improve / Benchmark modes. It provides task-based assertions, blind A / B comparison between skill versions, and trigger-tuning that iteratively rewrites the description until activation accuracy improves; 5 of 6 official document-creation skills improved using a 60/40 train/test split with up to 5 iterations (Tessl, skill-creator SKILL.md). The framework distinguishes two outcomes to watch for after a model upgrade: regression (model plus skill performs worse than before) and outgrowth (the base model now passes evals without the skill, so the recommendation is to retire it).
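The regression / outgrowth distinction can be expressed as a small decision function over eval pass rates. This is a hedged sketch, not skill-creator's logic: the function name, the `pass_threshold` default, and the choice to test for outgrowth first are all assumptions made for illustration.

```python
def classify_after_upgrade(
    with_skill_before: float,
    with_skill_after: float,
    without_skill_after: float,
    pass_threshold: float = 0.9,  # hypothetical bar for "passes evals"
) -> str:
    """Classify a skill after a model upgrade from three eval pass rates (0.0-1.0).

    Outgrowth is checked first (an assumption): if the upgraded base model
    already passes without the skill, retirement is recommended regardless
    of how the skill itself performs.
    """
    if without_skill_after >= pass_threshold:
        return "outgrowth"
    if with_skill_after < with_skill_before:
        return "regression"
    return "ok"
```

For example, `classify_after_upgrade(0.92, 0.85, 0.40)` flags a regression, while `classify_after_upgrade(0.92, 0.95, 0.93)` flags outgrowth because the base model alone clears the assumed threshold.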

Skill Versioning

Skill metadata carries a SemVer string; Anthropic's /v1/skills endpoint exposes create / view / upgrade-version operations; eval results are pinned to a published version so v1.2.0 can be compared with v1.1.0 deterministically. In bundle / fabric workflows, skills are pinned alongside model and tool refs in version control, and evals gate as a CI step before promotion.
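Pinning eval results to versions makes a promotion gate straightforward to script. The sketch below assumes plain `MAJOR.MINOR.PATCH` strings (pre-release and build metadata are ignored) and a hypothetical `results` mapping from version to pass rate; it does not call the /v1/skills endpoint.

```python
def parse_semver(version: str) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints for ordered comparison.

    Simplification: pre-release and build suffixes are not supported.
    """
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def gate_promotion(results: dict, candidate: str, baseline: str) -> bool:
    """Allow promotion only if the candidate is newer and scores no worse.

    `results` maps a published version string to its pinned eval pass rate.
    """
    if parse_semver(candidate) <= parse_semver(baseline):
        raise ValueError("candidate must be a newer version than the baseline")
    return results[candidate] >= results[baseline]
```

Comparing parsed tuples rather than raw strings matters once a component reaches two digits: as strings, "1.10.0" sorts before "1.9.0", while the tuples order correctly.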

Skill Marketplaces

Distribution shapes range from curated first-party (anthropics/skills GitHub repo, ~17 official skills) to large vector-indexed public registries (ClawHub, 20K+ skills, semantic search) to wide-net aggregators (SkillsMP indexes 425K-800K SKILL.md files scraped from GitHub with ≥ 2 stars and minimal curation) (KDnuggets marketplaces). Internal / private distribution is typically GitOps: a git repo of skills/*/SKILL.md synced to admin-provisioned org pools, with experimental zip-upload paths gated by an explicit config flag (gend.co deploy guide).

Skill Safety

Snyk's ToxicSkills study (February 2026) scanned ClawHub and found that 36.82% of skills (1,467) had at least one security flaw and 13.4% (534) had critical issues: malware delivery, prompt injection, and hardcoded secrets. OpenClaw's barrier to publish was a SKILL.md plus a one-week-old GitHub account, with no signing and no review (Penligent). Skill poisoning is the analog of tool poisoning but evades CVE / SBOM tooling because malicious instructions are not code dependencies; OWASP has drafted an Agentic Skills Top 10, and tools like skillfortify produce an Agent-SBOM (ASBOM). Mitigations: trusted-publisher signing, rejecting unsigned skills at runtime, static scanning of SKILL.md and assets, and propagating trust scores through skill-to-skill dependency edges.
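Trust propagation through dependency edges can be sketched as a graph walk. The text above does not specify a propagation rule, so this example assumes a weakest-link rule: a skill is only as trusted as its least-trusted transitive dependency. The function name and data shapes are hypothetical, and unknown skills are assumed to score 0.0.

```python
def effective_trust(skill: str, own_trust: dict, deps: dict) -> float:
    """Return the minimum trust score over a skill and all it transitively invokes.

    own_trust: skill name -> score in [0.0, 1.0] (missing names count as 0.0).
    deps:      skill name -> list of skill names it references.
    """
    memo = {}

    def visit(name: str, stack: frozenset) -> float:
        if name in memo:
            return memo[name]
        if name in stack:
            raise ValueError(f"dependency cycle through {name!r}")
        score = own_trust.get(name, 0.0)  # unknown skills are untrusted
        for dep in deps.get(name, []):
            score = min(score, visit(dep, stack | {name}))
        memo[name] = score
        return score

    return visit(skill, frozenset())
```

Under this rule a well-signed top-level skill that references a low-scoring helper inherits the helper's score, so a runtime threshold check on the effective score would block the whole chain.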

Skill Composition

Composition is supported by reference: a SKILL.md may instruct the agent to invoke another skill, producing a skill graph traversed at runtime via the same description-retrieval mechanism. Anthropic's skill-creator is itself a meta-skill: a skill that authors other skills. The best-practice taxonomy has five categories (discovery / selection, context economy, instruction calibration, workflow control, executable code), and composition lives in the workflow-control category.

Selection Criteria

| Need | Packaging Shape | Activation | Validation |
| --- | --- | --- | --- |
| Single deterministic capability | MCP Tool | Always-on, listed in system prompt | Tool unit test, contract test |
| External system access | MCP Server (Tools + Resources + Prompts) | Always-on connector | Integration test against sandbox |
| Reusable methodology / convention | Simple Skill (SKILL.md only) | Description-triggered, progressive disclosure | skill-creator Eval, trigger-tuning |
| Multi-stage workflow + scripts + policy | Complex Skill (SKILL.md + scripts + allowed-tools) | Description-triggered, then load assets | Eval + A/B + outgrowth check |
| 100s-1000s of capabilities | Skill library + bigtool | Semantic top-k retrieval | Retrieval accuracy + per-skill eval |
| Skill that builds skills | Meta-skill | Invoked by author / user | Output-skill passes own evals |

Picking Heuristic

If it teaches how, ship a Skill; if it executes what, ship a Tool / MCP server; if both, ship a complex Skill that declares its required tools.
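The heuristic reduces to a two-flag decision, sketched below. The function name and return strings are hypothetical and exist only to make the branching explicit.

```python
def pick_packaging(teaches_how: bool, executes_what: bool) -> str:
    """Map the two questions in the heuristic to a packaging shape."""
    if teaches_how and executes_what:
        return "complex skill (declares required tools)"
    if teaches_how:
        return "skill"
    if executes_what:
        return "tool / MCP server"
    raise ValueError("nothing to package: neither guidance nor execution")
```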

Cross-References