Assembly Skills¶
The skill packaging and procedural-memory patterns referenced throughout Assembly › Skills draw on the 2024-2026 agentic skill research lineage. This page catalogs the canonical skill formats, activation strategies, validation methods, and the selection criteria the platform uses when configuring skills for a given Unitt.
Anthropic Agent Skills¶
A Skill is a directory containing a SKILL.md file plus optional scripts, templates, and assets. The file opens with YAML frontmatter in which name (≤64 chars, must match the folder name) and description (≤1024 chars, governs activation) are required; optional fields include allowed-tools, license, and when_to_use (SKILL.md spec, agentskills.io specification). On December 18, 2025, Anthropic released Agent Skills as an open standard at agentskills.io and shipped org-wide management for Team / Enterprise admins. The same SKILL.md works unmodified across Claude Code, OpenAI Codex CLI, Gemini CLI, GitHub Copilot, Cursor, VS Code, Goose, Amp, and OpenCode, and Microsoft, OpenAI, Atlassian, Figma, Cursor, and GitHub have all adopted the standard.
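The frontmatter constraints above can be checked mechanically. The following is a minimal sketch, not the official validator: `parse_frontmatter` and `validate_skill` are hypothetical helper names, and the parser only handles flat `key: value` pairs, whereas real SKILL.md frontmatter is YAML and may need a full YAML parser.

```python
MAX_NAME = 64            # name limit stated in the text above
MAX_DESCRIPTION = 1024   # description limit stated in the text above


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading `---` fences.

    Assumption: no nested YAML, no multi-line values.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must open with a '---' frontmatter fence")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("frontmatter fence is never closed")


def validate_skill(folder_name: str, skill_md: str) -> list[str]:
    """Return a list of problems; an empty list means the required fields pass."""
    fields = parse_frontmatter(skill_md)
    problems: list[str] = []

    name = fields.get("name", "")
    if not name:
        problems.append("missing required field: name")
    elif len(name) > MAX_NAME:
        problems.append(f"name exceeds {MAX_NAME} characters")
    elif name != folder_name:
        problems.append(f"name {name!r} does not match folder {folder_name!r}")

    description = fields.get("description", "")
    if not description:
        problems.append("missing required field: description")
    elif len(description) > MAX_DESCRIPTION:
        problems.append(f"description exceeds {MAX_DESCRIPTION} characters")

    return problems
```

Calling `validate_skill("pdf-report", text)` on a file whose frontmatter declares `name: pdf-report` and a short description returns an empty list; a mismatched folder name or an over-long description each adds one entry.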
Skill Activation And Retrieval¶
Activation uses progressive disclosure: at startup the agent pre-loads only each skill's name and description (roughly 5K tokens for 50 skills) into the system prompt; it reads the full SKILL.md body into context only when the model judges the skill relevant; and assets referenced by the body load on demand (Anthropic skill guide). For libraries too large for even the metadata to fit, langgraph-bigtool runs semantic search over tool / skill descriptions in an InMemoryStore and injects only the top-k matches at runtime. In both paths the description string is the load-bearing retrieval key, because it is the only text the model or the retriever sees before deciding to load a skill.
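The two-stage loading described above can be sketched as follows. This is an illustration under stated assumptions, not Anthropic's implementation: `SkillEntry`, `load_body`, and `metadata_prompt` are hypothetical names, and the loader stands in for reading SKILL.md from disk.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SkillEntry:
    name: str
    description: str
    load_body: Callable[[], str]   # hypothetical loader, e.g. reads SKILL.md from disk
    _body: Optional[str] = None    # cache, filled on first activation

    def body(self) -> str:
        """Load the full SKILL.md body only when the skill is activated."""
        if self._body is None:
            self._body = self.load_body()
        return self._body


def metadata_prompt(skills: list[SkillEntry]) -> str:
    """Stage one: only name and description reach the system prompt."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)
```

Building the metadata prompt never calls `load_body`; the body is read once, on the first `body()` call, and cached afterwards.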
Procedural Memory¶
CoALA (Sumers et al., 2023) divides long-term memory into semantic, episodic, and procedural tiers; the procedural tier "encodes skills and procedures, often represented as code snippets, tool definitions, or implicitly within LLM parameters", which is exactly the substrate SKILL.md formalizes. Voyager (Wang et al., 2023) is the canonical reference: each Minecraft skill is a JavaScript function indexed by the embedding of its natural-language description, and on a new task the top-5 skills are retrieved by cosine similarity and injected into the prompt. In Emergence, the Skills assembly is the procedural-memory tier: write-once-via-promotion, embedding-indexed, code-verified.
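Description-indexed retrieval of the kind Voyager uses can be sketched in a few lines. The example below is a toy: `embed` is a bag-of-words stand-in for a real embedding model, and `top_k` and the sample skill names are hypothetical. Only the ranking step (cosine similarity, keep the best k, default 5) mirrors the text.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumption standing in for a learned embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(task: str, skills: dict[str, str], k: int = 5) -> list[str]:
    """Return up to k skill names whose descriptions best match the task."""
    query = embed(task)
    ranked = sorted(
        skills,
        key=lambda name: cosine(query, embed(skills[name])),
        reverse=True,
    )
    return ranked[:k]
```

With a real embedding model the structure stays the same; only `embed` and the vector type change.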
MCP Prompts Versus Skills¶
MCP defines three server primitives (tools, resources, and prompts), where Prompts are reusable templates a server offers over JSON-RPC alongside its capabilities (MCP architecture). Skills are the guidance layer (filesystem-resident, no protocol, no execution), while MCP is the execution layer (servers, auth, JSON-RPC). MCP Prompts live inside a single server and are pulled across a wire; Skills are static folders the host loads into the model's context. MCP gives an agent the ability to act; Skills tell it how (Maxim AI, Armin Ronacher).
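To make "pulled across a wire" concrete, the sketch below builds the JSON-RPC 2.0 message a client would send to fetch one prompt from an MCP server using the `prompts/get` method. The helper name `prompts_get_request` and the example prompt name and arguments are hypothetical; transport and session setup are omitted.

```python
import json


def prompts_get_request(request_id: int, prompt_name: str, arguments: dict[str, str]) -> str:
    """Serialize a JSON-RPC 2.0 request for the MCP `prompts/get` method."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "prompts/get",
            "params": {"name": prompt_name, "arguments": arguments},
        }
    )


# Hypothetical prompt name and argument, for illustration only.
print(prompts_get_request(1, "code_review", {"language": "python"}))
```

A Skill needs no equivalent message: the host reads SKILL.md from the filesystem and places it in context directly.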
Complex Skills¶
A complex Skill bundles a workflow (numbered steps in the SKILL.md body) plus executable scripts in the folder, declared tool / connector requirements (allowed-tools), and policy text. Anthropic cites examples like "fetch transactions, run anomaly detection, cross-reference against compliance policies, and generate reports" loaded in one activation pass. Runtime activation is deny-by-default for tools: a subagent inherits the skill's tool allowlist only, applying least privilege as immutable config the model cannot widen.
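Deny-by-default with an allowlist the model cannot widen can be sketched as an immutable policy object. `ToolPolicy` and `call_tool` are hypothetical names, and the tool names in the usage note are placeholders; this illustrates the principle rather than any runtime's actual enforcement code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolPolicy:
    """Immutable allowlist: an empty policy permits nothing (deny-by-default)."""
    allowed: frozenset[str] = frozenset()

    def permits(self, tool: str) -> bool:
        return tool in self.allowed


def call_tool(policy: ToolPolicy, tool: str, handler: Callable[[], str]) -> str:
    """Run the handler only if the skill's allowlist names the tool."""
    if not policy.permits(tool):
        raise PermissionError(f"tool {tool!r} is not in the skill's allowlist")
    return handler()
```

A policy built as `ToolPolicy(frozenset({"Read", "Grep"}))` permits those two tools and nothing else, and because the dataclass is frozen, assigning to `policy.allowed` afterwards raises an error instead of widening access.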
Skill Validation¶
Anthropic's skill-creator v2 (released March 2026) adds four agents (Executor, Grader, Comparator, Analyzer) supporting Create / Eval / Improve / Benchmark modes, with task-based assertions, blind A / B comparison between skill versions, and trigger-tuning that iteratively rewrites the description until activation accuracy improves (5 of 6 official document-creation skills improved using a 60/40 train/test split with up to 5 iterations) (Tessl, skill-creator SKILL.md). The framework defines two failure classes that surface after a model upgrade: regression (the model plus skill performs worse than before) and outgrowth (the base model now passes the evals without the skill, so the recommendation is to retire it).
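The 60/40 split and the regression / outgrowth distinction can be sketched as two small functions. Everything here is an assumption layered on the description above: the function names, the 0.9 pass threshold, and the choice to check outgrowth before regression are illustrative and not taken from skill-creator.

```python
import random


def split_queries(queries: list[str], train_fraction: float = 0.6, seed: int = 0):
    """Deterministically shuffle trigger queries and split them into train / test."""
    shuffled = list(queries)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def classify(
    baseline_with_skill: float,
    upgraded_with_skill: float,
    upgraded_without_skill: float,
    pass_threshold: float = 0.9,   # hypothetical pass bar
) -> str:
    """Label eval pass rates measured before and after a model upgrade."""
    if upgraded_without_skill >= pass_threshold:
        return "outgrowth"    # base model passes alone: recommend retiring the skill
    if upgraded_with_skill < baseline_with_skill:
        return "regression"   # model plus skill got worse after the upgrade
    return "ok"
```

Fixing the seed keeps the train / test partition stable across tuning iterations, so a description rewrite is always scored against the same held-out queries.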
Skill Versioning¶
Skill metadata carries a SemVer string; Anthropic's /v1/skills endpoint exposes create / view / upgrade-version operations; eval results are pinned to a published version so v1.2.0 can be compared with v1.1.0 deterministically. In bundle / fabric workflows, skills are pinned alongside model and tool refs in version control, and evals gate as a CI step before promotion.
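A CI gate over pinned eval results might look like the sketch below. It does not call the /v1/skills endpoint; `parse_semver`, `gate_promotion`, and the shape of the `results` mapping (version string to eval pass rate) are hypothetical, and pre-release or build suffixes are deliberately not handled.

```python
def parse_semver(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' (optional leading 'v') into a comparable tuple."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def gate_promotion(results: dict[str, float], candidate: str, max_drop: float = 0.0) -> bool:
    """Allow promotion unless the candidate scores below the previous pinned version."""
    earlier = [v for v in results if parse_semver(v) < parse_semver(candidate)]
    if not earlier:
        return True   # first published version: nothing to compare against
    previous = max(earlier, key=parse_semver)
    return results[candidate] >= results[previous] - max_drop
```

Comparing parsed tuples rather than raw strings matters: as strings "1.10.0" sorts before "1.9.0", while as tuples it correctly sorts after.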
Skill Marketplaces¶
Distribution shapes range from curated first-party (anthropics/skills GitHub repo, ~17 official skills) to large vector-indexed public registries (ClawHub, 20K+ skills, semantic search) to wide-net aggregators (SkillsMP indexes 425K-800K SKILL.md files scraped from GitHub with ≥ 2 stars and minimal curation) (KDnuggets marketplaces). Internal / private distribution is typically GitOps: a git repo of skills/*/SKILL.md synced to admin-provisioned org pools, with experimental zip-upload paths gated by an explicit config flag (gend.co deploy guide).
Skill Safety¶
Snyk's ToxicSkills study (February 2026) scanned ClawHub and found that 36.82% of skills (1,467) had at least one security flaw and 13.4% (534) had critical issues such as malware delivery, prompt injection, and hardcoded secrets. OpenClaw's barrier to publish was a SKILL.md plus a one-week-old GitHub account, with no signing and no review (Penligent). Skill poisoning is the analog of tool poisoning but evades CVE / SBOM because malicious instructions are not code dependencies; OWASP has drafted an Agentic Skills Top 10, and tools like skillfortify produce an Agent-SBOM (ASBOM). Mitigations: trusted-publisher signing, rejecting unsigned skills at runtime, static scanning of SKILL.md and assets, and propagating trust scores through skill-to-skill dependency edges.
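Two of the mitigations above, rejecting unsigned skills and propagating trust through dependency edges, can be sketched together. The weakest-link rule (a skill is only as trusted as its least trusted dependency), the 0.5 threshold, and the names `effective_trust` and `admit` are assumptions made for illustration; real signing would verify a cryptographic signature rather than consult a set of names.

```python
def effective_trust(
    skill: str,
    own_score: dict[str, float],
    depends_on: dict[str, list[str]],
    _visiting: frozenset = frozenset(),
) -> float:
    """Weakest-link trust: the minimum score over a skill and all it depends on."""
    score = own_score.get(skill, 0.0)   # unknown skills get zero trust
    if skill in _visiting:
        return score                    # dependency cycle: stop recursing
    for dep in depends_on.get(skill, []):
        score = min(score, effective_trust(dep, own_score, depends_on, _visiting | {skill}))
    return score


def admit(
    skill: str,
    signed: set[str],
    own_score: dict[str, float],
    depends_on: dict[str, list[str]],
    threshold: float = 0.5,             # hypothetical admission bar
) -> bool:
    """Reject unsigned skills outright, then require sufficient propagated trust."""
    if skill not in signed:
        return False
    return effective_trust(skill, own_score, depends_on) >= threshold
```

Under this rule a well-scored, signed skill is still rejected if anything in its dependency chain scores below the threshold.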
Skill Composition¶
Composition is supported by reference: a SKILL.md may instruct the agent to invoke another skill, producing a skill graph traversed at runtime via the same description-retrieval mechanism. Anthropic's skill-creator is itself a meta-skill: a skill that authors other skills. The best-practice taxonomy has five categories (discovery / selection, context economy, instruction calibration, workflow control, and executable code), and composition lives in the workflow-control category.
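Traversing a skill graph built from such references can be sketched as a breadth-first walk that activates each skill at most once, which also guards against reference cycles. The `references` mapping (skill name to the skills its SKILL.md points at) and the function name are hypothetical; in practice each edge would be resolved through description retrieval rather than a ready-made dictionary.

```python
from collections import deque


def activation_order(root: str, references: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk from root; each skill appears once, even in a cycle."""
    order: list[str] = []
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        order.append(current)
        for nxt in references.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order
```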
Selection Criteria¶
| Need | Packaging Shape | Activation | Validation |
|---|---|---|---|
| Single deterministic capability | MCP Tool | Always-on, listed in system prompt | Tool unit test, contract test |
| External system access | MCP Server (Tools + Resources + Prompts) | Always-on connector | Integration test against sandbox |
| Reusable methodology / convention | Simple Skill (SKILL.md only) | Description-triggered, progressive disclosure | skill-creator Eval, trigger-tuning |
| Multi-stage workflow + scripts + policy | Complex Skill (SKILL.md + scripts + allowed-tools) | Description-triggered, then load assets | Eval + A/B + outgrowth check |
| 100s-1000s of capabilities | Skill library + bigtool | Semantic top-k retrieval | Retrieval accuracy + per-skill eval |
| Skill that builds skills | Meta-skill | Invoked by author / user | Output-skill passes own evals |
Picking Heuristic¶
If it teaches how, ship a Skill; if it executes what, ship a Tool / MCP server; if both, ship a complex Skill that declares its required tools.
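The heuristic reduces to a two-flag decision, sketched below. The function name and the returned labels are illustrative only, and the final branch (neither flag set) is an assumption the heuristic itself does not cover.

```python
def pick_packaging(teaches_how: bool, executes_what: bool) -> str:
    """Map the two questions in the heuristic to a packaging shape."""
    if teaches_how and executes_what:
        return "complex Skill (declares required tools)"
    if teaches_how:
        return "Skill"
    if executes_what:
        return "Tool / MCP server"
    return "nothing to package"   # assumption: case not covered by the heuristic
```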
Cross-References¶
- Assembly › Skills: developer-facing platform layer.
- Reference › Research › Assembly Tools: tool execution surface skills declare against.
- Reference › Research › Memory Systems: procedural memory tier that backs Skills.