Skills
The catalog. Each skill has its own page with a summary, its key opinions, install snippets, and a link to the canonical SKILL.md.
Currently shipping
- rest-api-design: design and review HTTP REST APIs. Resource-oriented URLs, PATCH for state transitions, domain-expressive error codes, flat error envelopes, idempotency, content-type negotiation, typed contracts across TS/Python/Go/Rust.
- structured-code-review: a review-only output format for code reviews. Source-of-truth-aware preamble, severity-tagged findings, file:line citations, no-findings-still-formal. Composes with domain-review skills.
- task-handoff-summaries: three structured report formats. Implementation summary (before commit), worker handoff summary (multi-agent to orchestrator), closeout summary (after completion). Hard rules against using the summary to mask incomplete work.
- cross-agent-review: workflow for routine cross-vendor agent peer review (Claude reviews Codex’s work; Codex reviews Claude’s). The handoff package with self-assessment redacted, the cold-review discipline, the disagreement protocol, the bounded iteration loop.
- multi-agent-git-workflow: git discipline for multi-agent work. Branch hierarchy, worktree-per-agent topology, orchestrator/worker roles, merge authority, acceptance/rejection rules, plus universal commit discipline (Conventional Commits, mandatory task ID, co-author line, UAT gate, no silent amends).
- branch-promotion-discipline: the layer above multi-agent-git-workflow. 3-tier develop to uat to main promotion, UAT branch as a long-lived environment, per-tier CI gate matrix, source-ref enforcement, hotfix flow with forward-merge, branch protection ruleset, pre-commit hook setup.
How skills are evaluated
Every skill in this collection passes through an evaluation loop before it ships:
- Draft the SKILL.md based on real policies or design intent
- Write 4 test prompts in evals/evals.json that probe the skill’s distinct opinions
- Spawn parallel runs, one with the skill loaded, one baseline without
- Grade outputs against per-case assertions; aggregate into a benchmark
- Iterate until the with-skill output is materially better than the baseline on the assertions that matter
The eval set ships in the repo at plugins/<plugin-name>/skills/<skill-name>/evals/evals.json so anyone can re-run it.
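The grading and aggregation steps above can be sketched roughly as follows. This is a minimal illustration, not the collection's actual harness: the `CaseResult` type, the `pass_rate` and `benchmark` helpers, and the shape of the graded data are all hypothetical, and the real evals.json schema may differ.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Graded assertions for one test prompt under one configuration (hypothetical shape)."""
    case_id: str
    # Maps assertion text -> whether the run's output satisfied it.
    assertions: dict[str, bool]


def pass_rate(results: list[CaseResult]) -> float:
    """Fraction of all assertions, across all cases, that passed."""
    outcomes = [ok for r in results for ok in r.assertions.values()]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def benchmark(with_skill: list[CaseResult], baseline: list[CaseResult]) -> dict[str, float]:
    """Aggregate both runs into one summary, including the with-skill delta."""
    w = pass_rate(with_skill)
    b = pass_rate(baseline)
    return {"with_skill": w, "baseline": b, "delta": w - b}
```

A positive `delta` only says the with-skill run passed more assertions overall; deciding whether the gain is on the assertions that matter still takes a per-assertion look.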
Contributing a new skill
To add a new skill to the collection:
- Plugin manifest: plugins/<plugin-name>/.claude-plugin/plugin.json with name, description, version.
- Skill file: plugins/<plugin-name>/skills/<skill-name>/SKILL.md in the standard SKILL format (YAML frontmatter with name, description; markdown body).
- Marketplace entry: add to .claude-plugin/marketplace.json under plugins with source: ./plugins/<plugin-name>.
- Evals (optional but the whole point): plugins/<plugin-name>/skills/<skill-name>/evals/evals.json with test cases and per-assertion pass/fail criteria.
- Docs page: skills/<skill-name>.md (this directory) with the human-facing summary, key opinions, install snippets, and a link to the SKILL.md.
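Before opening the PR, a quick local check that the files in the checklist exist might look like the sketch below. The helper names `expected_paths` and `missing_items` are hypothetical, and the sketch assumes the docs directory (skills/) sits directly under the same root as plugins/, which may not match the actual repo layout. It only checks that files are present, not that their contents are valid.

```python
from pathlib import Path


def expected_paths(plugin: str, skill: str) -> dict[str, Path]:
    """Root-relative paths named in the checklist, keyed by checklist item."""
    plugin_dir = Path("plugins") / plugin
    skill_dir = plugin_dir / "skills" / skill
    return {
        "plugin manifest": plugin_dir / ".claude-plugin" / "plugin.json",
        "skill file": skill_dir / "SKILL.md",
        "marketplace entry": Path(".claude-plugin") / "marketplace.json",
        "evals": skill_dir / "evals" / "evals.json",
        # Assumption: the docs directory lives at <root>/skills/.
        "docs page": Path("skills") / f"{skill}.md",
    }


def missing_items(repo_root: Path, plugin: str, skill: str,
                  require_evals: bool = False) -> list[str]:
    """Return the checklist items whose file is absent under repo_root."""
    missing = []
    for item, rel in expected_paths(plugin, skill).items():
        if item == "evals" and not require_evals:
            continue  # evals are optional per the checklist
        if not (repo_root / rel).is_file():
            missing.append(item)
    return missing
```

For example, `missing_items(Path("."), "my-plugin", "my-skill")` run from the repo root would list whichever checklist items still lack a file; the plugin and skill names here are placeholders.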
Open a PR. If you ran evals, drop the iteration-1 benchmark in the PR description.
See the existing rest-api-design and structured-code-review pages for the format.