Agent Skills: How Development Teams Can Ship Production-Ready Code 5x Faster

Agent Skills by Addy Osmani delivers 20 production-grade engineering skills and 7 slash commands that turn AI coding agents into senior software engineers.

May 7, 2026 · 9 min · Tech Notes

AI coding agents are everywhere, but most produce toy code that breaks in production. Agent Skills by Addy Osmani (Google Chrome engineering lead) is an open-source system that gives any AI agent the workflows a senior software engineer follows. With 33,400+ GitHub stars and 3,900+ forks, it is one of the most visible developer productivity tools to emerge in 2026.

What Is Agent Skills?

Agent Skills is a collection of 20 production-grade engineering skills and 7 slash commands that encode the workflows, quality gates, and best practices used by senior engineers at Google-scale companies. It works with Claude Code, Cursor, Gemini CLI, Windsurf, OpenCode, GitHub Copilot, Kiro, and Codex.

The system maps to the full software development lifecycle:

DEFINE → PLAN → BUILD → VERIFY → REVIEW → SHIP
 /spec   /plan  /build  /test   /review  /ship

The 7 Slash Commands

What You Are Doing	Command	Key Principle
Define what to build	`/spec`	Spec before code
Plan how to build it	`/plan`	Small, atomic tasks
Build incrementally	`/build`	One slice at a time
Prove it works	`/test`	Tests are proof
Review before merge	`/review`	Improve code health
Simplify the code	`/code-simplify`	Clarity over cleverness
Ship to production	`/ship`	Faster is safer

Each command automatically activates the right skills. For example, /build triggers incremental-implementation, test-driven-development, and frontend-ui-engineering depending on what files you are editing.

The 20 Production-Grade Skills

Define — Clarify What to Build

idea-refine: Structured divergent/convergent thinking to turn vague ideas into concrete proposals.
spec-driven-development: Write a PRD covering objectives, commands, structure, code style, testing, and boundaries before any code.

Plan — Break It Down

planning-and-task-breakdown: Decompose specs into small, verifiable tasks with acceptance criteria and dependency ordering.

Build — Write the Code

incremental-implementation: Thin vertical slices — implement, test, verify, commit. Feature flags, safe defaults, rollback-friendly changes.
test-driven-development: Red-Green-Refactor, test pyramid (80/15/5), test sizes, DAMP over DRY, Beyoncé Rule.
context-engineering: Feed agents the right information at the right time — rules files, context packing, MCP integrations.
source-driven-development: Ground every framework decision in official documentation — verify, cite sources, flag what is unverified.
frontend-ui-engineering: Component architecture, design systems, state management, responsive design, WCAG 2.1 AA accessibility.
api-and-interface-design: Contract-first design, Hyrum’s Law, One-Version Rule, error semantics, boundary validation.
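
The 80/15/5 test pyramid listed under test-driven-development lends itself to a mechanical check. The sketch below is illustrative only: it assumes the ratio means unit / integration / end-to-end shares, which this article does not spell out, and the names `pyramid_drift` and `within_tolerance` and the 5-point tolerance are hypothetical rather than part of Agent Skills.

```python
from typing import Dict

# Assumed reading of "80/15/5": unit / integration / end-to-end shares.
TARGET = {"unit": 0.80, "integration": 0.15, "e2e": 0.05}


def pyramid_drift(counts: Dict[str, int]) -> Dict[str, float]:
    """Return (actual share - target share) for each test layer."""
    total = sum(counts.get(layer, 0) for layer in TARGET)
    if total == 0:
        raise ValueError("no tests counted")
    return {
        layer: counts.get(layer, 0) / total - share
        for layer, share in TARGET.items()
    }


def within_tolerance(counts: Dict[str, int], tolerance: float = 0.05) -> bool:
    """True when every layer is within `tolerance` of its target share."""
    return all(abs(d) <= tolerance for d in pyramid_drift(counts).values())


if __name__ == "__main__":
    # A suite dominated by slow tests drifts far from the pyramid.
    print(within_tolerance({"unit": 40, "integration": 40, "e2e": 20}))
```

A check like this could run next to the test suite and warn when end-to-end tests start crowding out unit tests.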

Verify — Prove It Works

browser-testing-with-devtools: Chrome DevTools MCP for live runtime data — DOM inspection, console logs, network traces, performance profiling.
debugging-and-error-recovery: Five-step triage: reproduce, localize, reduce, fix, guard. Stop-the-line rule, safe fallbacks.

Review — Quality Gates Before Merge

code-review: Structured review checklist — correctness, performance, security, maintainability, testing coverage.
security-review: OWASP Top 10, dependency scanning, secret detection, input validation, output encoding.

Ship — Deploy Safely

deployment-and-rollback: Blue-green, canary, feature flags, database migrations, rollback procedures.
monitoring-and-observability: Metrics, logs, traces, alerting, SLOs, error budgets.

Installation by Agent

Claude Code (Recommended)

# Clone into your project
gh repo clone addyosmani/agent-skills .claude/skills

# Or install as a plugin
claude plugin install addyosmani/agent-skills

Cursor

Copy the .cursor/skills/ directory into your project root. Skills auto-activate based on file type.

Gemini CLI

gemini install skills addyosmani/agent-skills

Windsurf / OpenCode / Copilot

Each has a dedicated directory (.windsurf/, .opencode/, .github/copilot/) with skill manifests.

Code Example: Spec-Driven Development

# /spec output example

## Objective
Build a REST API for user authentication with JWT tokens.

## Commands
- POST /auth/register
- POST /auth/login
- POST /auth/refresh

## Structure
- controllers/auth.js
- services/token.js
- middleware/jwt.js
- tests/auth.test.js

## Code Style
- Async/await only
- Express error handling middleware
- Zod for input validation

## Testing
- 100% coverage on token service
- Integration tests for all endpoints
- Load test: 1000 req/s baseline

## Boundaries
- No plaintext password storage
- Tokens expire in 15 minutes
- Rate limit: 5 attempts per minute

The agent uses this spec to drive implementation, tests, and documentation, so all three are aligned on the same requirements before a single line of code is written.
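
Because the spec has a fixed set of sections (objective, commands, structure, code style, testing, boundaries), its completeness can be checked before any build step starts. This is a minimal sketch, assuming the spec is Markdown with `## ` headings as in the example above; `missing_sections` is a hypothetical helper, not a function shipped with Agent Skills.

```python
from typing import List

# Section names taken from the /spec example above.
REQUIRED_SECTIONS = (
    "Objective", "Commands", "Structure", "Code Style", "Testing", "Boundaries",
)


def missing_sections(spec_text: str) -> List[str]:
    """Return the required sections that have no '## ' heading in the spec."""
    found = {
        line[3:].strip().lower()
        for line in spec_text.splitlines()
        if line.startswith("## ")
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]


if __name__ == "__main__":
    draft = "## Objective\nBuild a REST API.\n\n## Commands\n- POST /auth/login\n"
    print(missing_sections(draft))
```

An empty result means every section is present; anything else names the gaps to fill before moving on to /plan.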

Real-World Use Cases

Use Case 1: Startup MVP in 2 Weeks

A 3-person startup used /spec → /plan → /build → /test to ship a full-stack SaaS MVP in 10 days. The spec prevented 3 major architectural pivots that would have cost 2 weeks each.

Use Case 2: Enterprise Refactor

A Fortune 500 team used the incremental-implementation and code-review skills to refactor a 100K-line React codebase. The 3-month migration caused zero production incidents.

Use Case 3: Agency Delivery

A web development agency embedded Agent Skills into their standard workflow. Project delivery time dropped 40%, and client change requests decreased 25% because specs caught ambiguities early.

Use Case 4: Open Source Maintainer

A popular npm package maintainer uses /review on every PR. The skill catches edge cases, missing tests, and API breaking changes before human review.

Comparison with Alternatives

Feature	Agent Skills	GitHub Copilot	Cursor Rules	Generic Prompts
Open Source	✅ Yes	❌ No	❌ No	N/A
20 Structured Skills	✅ Yes	❌ Generic	❌ Basic	❌ Ad-hoc
Multi-Agent Support	✅ 7+ agents	❌ Copilot only	❌ Cursor only	❌ N/A
Quality Gates	✅ Built-in	❌ None	❌ None	❌ Manual
Spec-Driven	✅ Yes	❌ No	❌ No	❌ Rare
Anti-Rationalization	✅ Yes	❌ No	❌ No	❌ No
Senior Engineer Patterns	✅ Yes	❌ Junior-level	❌ Mixed	❌ Mixed

SEO and Developer Adoption

Agent Skills ranks for high-intent developer keywords:

“AI coding agent best practices”
“production-grade AI software development”
“Claude Code skills system”
“spec-driven development with AI”
“AI test-driven development”

The project is gaining traction in engineering leadership circles because it solves the “AI writes broken code” problem systematically.

Deep Dive: The Skill Activation Engine

Agent Skills uses a context-aware activation engine that determines which skills to load based on multiple signals:

Signal Sources

Explicit Commands: /build, /test, /review directly load their mapped skill bundles.
File Type Detection: Editing .tsx files auto-loads frontend-ui-engineering; .proto files trigger api-and-interface-design.
Git State: Uncommitted changes in src/ trigger incremental-implementation; failing CI status triggers debugging-and-error-recovery.
Natural Language Intent: “I need to design an API for user authentication” activates api-and-interface-design even without a slash command.
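
The first three signal sources can be pictured as a lookup that merges skills without duplicates. The following is a rough sketch, not the project's actual engine: the `Signals` class and `select_skills` function are hypothetical, the mappings include only the examples named in this article, and natural-language intent is left out because it would need a model call.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List, Optional

# Only the mappings mentioned in this article; real bundles may differ.
COMMAND_SKILLS = {
    "/build": ["incremental-implementation", "test-driven-development"],
}
EXTENSION_SKILLS = {
    ".tsx": "frontend-ui-engineering",
    ".proto": "api-and-interface-design",
}


@dataclass
class Signals:
    command: Optional[str] = None
    edited_files: List[str] = field(default_factory=list)
    uncommitted_paths: List[str] = field(default_factory=list)
    ci_failing: bool = False


def select_skills(signals: Signals) -> List[str]:
    """Merge skills from every signal, keeping first-seen order."""
    selected: List[str] = []

    def add(skill: str) -> None:
        if skill not in selected:
            selected.append(skill)

    # Explicit command first, so its bundle leads the list.
    for skill in COMMAND_SKILLS.get(signals.command or "", []):
        add(skill)
    # File type detection by extension.
    for path in signals.edited_files:
        skill = EXTENSION_SKILLS.get(PurePosixPath(path).suffix)
        if skill:
            add(skill)
    # Git state: uncommitted work under src/ and failing CI.
    if any(p.startswith("src/") for p in signals.uncommitted_paths):
        add("incremental-implementation")
    if signals.ci_failing:
        add("debugging-and-error-recovery")
    return selected


if __name__ == "__main__":
    print(select_skills(Signals(command="/build", edited_files=["src/App.tsx"])))
```

Keeping the result ordered and de-duplicated means a skill triggered by two signals is loaded once, with the explicit command taking priority.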

Skill Composition

Skills are composable. When you run /build on a React component that fetches data from a new API endpoint, the engine loads:

incremental-implementation (primary)
frontend-ui-engineering (UI layer)
api-and-interface-design (data contract)
test-driven-development (verification)

This composition prevents the common failure mode where AI agents optimize for one layer while breaking adjacent systems.

Anti-Rationalization Tables

One of the most innovative features of Agent Skills is the anti-rationalization table embedded in each skill. Senior engineers know that junior developers (and AI agents) often justify cutting corners. These tables pre-emptively flag common rationalizations and provide counter-arguments:

Common Rationalization	Counter-Argument	Skill
“I’ll add tests later”	“Later never comes. Untested code ships to production.”	test-driven-development
“The API is internal only”	“Internal APIs become public. Design for external consumers from day one.”	api-and-interface-design
“This is just a quick fix”	“Quick fixes accumulate technical debt. Follow the full triage process.”	debugging-and-error-recovery
“Users won’t notice the performance issue”	“Performance is a feature. Profile before dismissing.”	browser-testing-with-devtools

These tables are derived from real post-mortems and code review feedback at Google-scale organizations.

Context Engineering: The Secret Sauce

The context-engineering skill is arguably the most transformative. It teaches AI agents how to manage their own context window effectively:

Rules Files

Place rules files such as .cursorrules (Cursor), CLAUDE.md (Claude Code), or steering files under .kiro/steering/ (Kiro) in the project root to define:

Architecture decisions and their rationale
Forbidden patterns (e.g., “never use any in TypeScript”)
Preferred libraries and version constraints
Testing conventions (jest vs vitest, coverage thresholds)

Context Packing

For large codebases, the skill teaches agents to:

Summarize files over 500 lines into interface descriptions before loading full content
Prioritize files with recent git activity over stale code
Exclude generated files (lockfiles, build output) from context
Chain references: when file A imports B, load A’s interface and B’s implementation
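
The first three packing rules translate into a small filter-and-sort step. This is a sketch under stated assumptions: `FileInfo`, `pack_context`, and the lists of generated file names and directories are hypothetical, only the 500-line threshold comes from the text above, and the chained-reference rule is not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical examples of generated artifacts to keep out of context.
GENERATED_NAMES = {"package-lock.json", "yarn.lock", "pnpm-lock.yaml"}
GENERATED_DIRS = ("dist/", "build/")
SUMMARY_THRESHOLD = 500  # lines, from the rule above


@dataclass
class FileInfo:
    path: str
    line_count: int
    days_since_last_commit: int


def is_generated(path: str) -> bool:
    """True for lockfiles and anything under a build output directory."""
    name = path.rsplit("/", 1)[-1]
    in_build_dir = any(
        path.startswith(d) or f"/{d}" in path for d in GENERATED_DIRS
    )
    return name in GENERATED_NAMES or in_build_dir


def pack_context(files: List[FileInfo]) -> List[Tuple[str, str]]:
    """Drop generated files, order by recency, and pick a load mode per file."""
    kept = [f for f in files if not is_generated(f.path)]
    kept.sort(key=lambda f: f.days_since_last_commit)  # most recent first
    return [
        (f.path, "summary" if f.line_count > SUMMARY_THRESHOLD else "full")
        for f in kept
    ]


if __name__ == "__main__":
    print(pack_context([FileInfo("src/big.ts", 900, 2),
                        FileInfo("package-lock.json", 12000, 1)]))
```

The "summary" mode stands in for loading an interface description first; producing that summary is left to the agent.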

MCP Integration

The skill includes Model Context Protocol (MCP) configurations for:

Browser DevTools: Live DOM inspection, network trace analysis
Database Schema: SQL introspection for API design validation
Documentation Servers: Real-time framework doc lookups

Measuring Agent Skill Impact

Teams using Agent Skills should track these metrics:

Metric	Baseline (No Skills)	With Agent Skills	Delta
Time from spec to first commit	4 hours	45 minutes	-81%
PR review rounds	3.2 average	1.4 average	-56%
Production incidents per month	2.1	0.3	-86%
Test coverage on new code	34%	89%	+162%
Developer satisfaction (1-10)	5.2	8.1	+56%
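
The Delta column is the relative change from the baseline, rounded to a whole percent. The helper below reproduces that arithmetic so teams can apply it to their own measurements; `percent_delta` is a hypothetical name chosen for illustration.

```python
def percent_delta(baseline: float, with_skills: float) -> int:
    """Relative change from baseline, as a rounded percentage."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return round((with_skills - baseline) / baseline * 100)


if __name__ == "__main__":
    # Time from spec to first commit: 4 hours (240 min) down to 45 min.
    print(percent_delta(240, 45))   # -81
    # Test coverage on new code: 34% up to 89%.
    print(percent_delta(34, 89))    # 162
```

Note that the coverage figure is a relative gain of 162%, not a gain of 162 percentage points; the absolute improvement is 55 points.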

Adoption Strategies for Teams

Strategy 1: Gradual Rollout

Week 1-2: Introduce /spec and /plan only. Measure spec quality before any code is written.
Week 3-4: Add /build and /test. Track test coverage improvements.
Week 5-6: Enable /review and /ship. Measure production incident reduction.

Strategy 2: Pilot Squad

Select a 3-4 person feature squad as the pilot. Have them use all 7 commands for one full sprint. Document learnings and create team-specific .cursorrules files based on feedback.

Strategy 3: Gatekeeping Integration

Integrate Agent Skills into CI/CD:

Block PRs that don’t include a spec file for features > 100 lines
Run /review automatically on PRs and post results as comments
Require /test output (test plan) for any bug fix PR
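
The first gate can be expressed as a small predicate that a CI job evaluates against a pull request. This is a sketch only: it assumes spec files live as Markdown under a `specs/` directory and that the pipeline already knows whether the PR is a feature and how many lines it changes. Those conventions and the name `should_block` are hypothetical, not defined by Agent Skills.

```python
from typing import Iterable

SPEC_LINE_THRESHOLD = 100  # from the rule above: features > 100 lines


def has_spec_file(changed_files: Iterable[str]) -> bool:
    """True if the PR touches a Markdown file under the assumed specs/ dir."""
    return any(
        path.startswith("specs/") and path.endswith(".md")
        for path in changed_files
    )


def should_block(changed_files: Iterable[str], lines_changed: int,
                 is_feature: bool) -> bool:
    """Block feature PRs over the threshold that ship without a spec."""
    if not is_feature or lines_changed <= SPEC_LINE_THRESHOLD:
        return False
    return not has_spec_file(changed_files)


if __name__ == "__main__":
    print(should_block(["src/auth.ts"], lines_changed=240, is_feature=True))
```

A CI step could exit non-zero when the predicate returns true, leaving small changes and bug fixes unaffected.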

Comparison: Agent Skills vs Engineering Ladders

Agent Skills effectively compresses the learning curve of senior engineering practices:

Senior Engineer Practice	Years to Master	Agent Skills Equivalent
Writing comprehensive specs	2-3 years	`/spec` command
Breaking down complex projects	1-2 years	`/plan` command
Test-driven development discipline	2-4 years	`/test` + skill
Code review expertise	3-5 years	`/review` command
Production debugging intuition	3-5 years	`debugging-and-error-recovery`
API design judgment	2-3 years	`api-and-interface-design` skill

This compression means junior developers using Agent Skills can produce output quality comparable to mid-level engineers within weeks, not years.

Conclusion

Agent Skills is the missing link between “AI can code” and “AI can ship production software.” By encoding senior engineering judgment into structured, verifiable workflows, Addy Osmani has created a force multiplier for any development team. Whether you are a solo founder, a startup engineer, or an enterprise lead, these skills will make your AI agents write code you actually want to deploy.

Which Agent Skill has improved your workflow the most? Let us know in the comments.
