Tool Customization, Agents & Governance
Build your own rules, agents, and skills — the assets that make
Module Overview
Module 6 is where you stop re-explaining your standards every session and start encoding them. You learn the three universal customization concepts — rules (always-on project guidance), agents (named scoped AI roles), and skills (on-demand domain expertise) — all as portable markdown that transfers across tools. The point is leverage: build the asset once, and every future session and every future cohort benefits.
| At a glance | |
|---|---|
| Covers | Rules (AGENTS.md and tool-specific overrides); agents/personas as portable markdown; skills (SKILL.md) loaded on demand; tight scoping; governance and contributing back |
| When it runs | Week 3 Monday (in-person), after Module 5 |
| Builds on | Module 3 (agents from prompts) and your work across Weeks 1–3 |
| Leads into | Module 7 — sharing what you built with the next cohort |
What you'll produce
One rule, one agent, and one skill — built, tested against a real task, and committed to the project’s knowledge directory as version-controlled assets the next cohort inherits.
What You’re Building
By the end of this module, you will have built and contributed three assets to the project’s knowledge directory. These are not exercises — they are real, version-controlled assets that every future session and every future cohort participant benefits from.
| Asset | What it is | Where it goes | Loaded |
|---|---|---|---|
| Rule | A project standard the AI follows automatically every session | AGENTS.md (cross-tool) or tool-specific override file | Always-on, before you type anything |
| Agent | A named, scoped AI role you invoke for a recurring task | knowledge/agents/{your-role}/{name}.md | On invocation (manual or auto-delegated) |
| Skill | Packaged domain expertise loaded when a task matches | knowledge/skills/{name}/SKILL.md | Dynamically, when task matches description |
When to use which
Rule: the standard applies to EVERY session. Error format, coding conventions, security policies.
Agent: you do this task REPEATEDLY and want consistent role-quality output. Code review, BRD drafting, ADR checking.
Skill: you need this expertise on SOME tasks but not all. Quality gate checks, domain-specific standards.
An agent is a prompt promoted
Everything you build here rests on Module 3 (Prompt Engineering Fundamentals). An agent is a great prompt promoted to a saved, named entity; a rule is always-on prompt guidance; a skill is packaged prompt context. If your agent or skill underperforms, the problem is almost always the system prompt inside it — apply the Module 3 structure and refinement loop to fix it.
Part 1 — Building a Rule
Rules are instructions the AI sees at the start of every session. You write them once; every session benefits. AGENTS.md is the cross-tool baseline — read automatically by most AI coding tools. Tool-specific override files layer on top when you need tool-specific behavior. See the Appendix for wiring details per tool.
What Makes a Good Rule
| Principle | Bad example | Good example |
|---|---|---|
| Specific and verifiable | Write clean code | All error responses must use {success: false, error: {code, message}} |
| Scoped correctly | All team members should communicate well | Status updates use format: [date] [status] [blockers] [next steps] |
| Actionable by the AI | Use good judgment on error handling | Never hard-delete records — use soft delete with is_deleted = 1 |
| Not redundant | Use proper TypeScript types | All API response types must extend BaseResponse from types/api.ts |
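One way to see what "specific and verifiable" buys you: the good error-format rule from the table can be checked by a few lines of code, while "write clean code" cannot. The sketch below is illustrative only; the function name `follows_error_format` and the sample payloads are hypothetical and not part of any project codebase.

```python
from typing import Any


def follows_error_format(response: Any) -> bool:
    """Return True if response matches {success: false, error: {code, message}}."""
    if not isinstance(response, dict):
        return False
    if response.get("success") is not False:
        return False
    error = response.get("error")
    if not isinstance(error, dict):
        return False
    # Both fields must be present; extra fields are tolerated in this sketch.
    return "code" in error and "message" in error


# Hypothetical payloads for illustration.
good = {"success": False, "error": {"code": "NOT_FOUND", "message": "No such user"}}
bad = {"status": "error", "detail": "No such user"}

print(follows_error_format(good))  # True
print(follows_error_format(bad))   # False
```

If you cannot imagine a check like this for a rule you are drafting, that is a hint the rule is still too vague to pass or fail.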
Build Your Rule — E→P→A→V
[EVALUATE]
Load: AGENTS.md, docs/arch-docs/
Read through all existing rules in AGENTS.md.
For my role as [Solutions Designer / Product / PM / SA / TL / Dev],
identify:
1. Which rules directly affect my work?
2. What standards am I re-explaining to the AI every session
that should already be a rule?
3. Which existing rules are too vague to be actionable?
List the gaps. Don’t write rules yet.
[PLAN]
Based on the gaps identified, I want to add these rules:
[List 2-3 rule candidates]
For each rule, confirm:
- Is it specific enough to pass/fail?
- Does it belong in AGENTS.md (all tools) or a tool-specific file?
- Does it overlap with any existing rule?
- Will the AI be able to act on it without human judgment?
[APPLY]
Write the rules in the exact format used in AGENTS.md.
Each rule must be:
- One clear statement (not a paragraph)
- Verifiable: someone can check yes/no
- Scoped: specify when it applies
Add them to the appropriate section of AGENTS.md.
[VALIDATE]
Start a fresh session (new conversation).
Load: the updated AGENTS.md
Give the AI a task that should trigger the new rules.
Does it follow them without being told explicitly?
If not: the rule is too vague, too buried, or conflicts
with another rule. Rewrite and test again.
Tool wiring
See the Appendix for how to load AGENTS.md and tool-specific override files in your AI tool.
Part 2 — Building an Agent
An agent is a prompt that’s been promoted to a saved, named, reusable role. One job per agent. Tight scope, restricted tools, explicit constraints.
The Agent Markdown Format
---
name: agent-name-in-kebab-case
description: One sentence — precise enough to drive auto-delegation.
tools: Read, Grep, Glob   # only what the agent needs
model: claude-opus-4-7    # optional override
---
# Agent Name
You are [role description].
[What you know. What project you’re working on. What standards apply.]
## What you do
[Specific responsibilities and behaviors]
## What you never do
[Explicit constraints — safety, scope, tools]
## Output format
[Exact format specification — structure, fields, severity labels]
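Before committing an agent file, it can help to lint the frontmatter. The following is a minimal sketch using only the standard library; it assumes the frontmatter is flat `key: value` lines (not full YAML), and it treats `name`, `description`, and `tools` as required because the template marks only `model` as optional. The helper names `parse_frontmatter` and `check_agent` are hypothetical.

```python
import re

KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` frontmatter between the first two `---` lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file must start with a --- line")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            # Drop trailing `# comment` text such as "# only what the agent needs".
            fields[key.strip()] = value.split(" #", 1)[0].strip()
    raise ValueError("frontmatter is missing its closing --- line")


def check_agent(text: str) -> list[str]:
    """Return a list of problems; an empty list means the frontmatter looks fine."""
    fields = parse_frontmatter(text)
    problems = []
    for required in ("name", "description", "tools"):  # assumed required set
        if not fields.get(required):
            problems.append(f"missing field: {required}")
    if fields.get("name") and not KEBAB_CASE.match(fields["name"]):
        problems.append("name is not kebab-case")
    return problems


SAMPLE = """---
name: adr-reviewer
description: Reviews ADRs for options, rationale, and consequences.
tools: Read, Grep, Glob  # only what the agent needs
---
# ADR Reviewer
"""

print(check_agent(SAMPLE))  # []
```

A real project would more likely parse the frontmatter with a YAML library; the hand-rolled parser here just keeps the sketch dependency-free.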
Design Principles
| Principle | Why | Example |
|---|---|---|
| One job per agent | Broad agents get confused about which mode to operate in | adr-reviewer, not architecture-everything |
| Tight tool scope | Prevents accidents. A reviewer should not have Write access | tools: Read, Grep, Glob (no Write, no Bash) |
| Explicit "never do" | Overrides default helpfulness when it would cause harm | Never suggest architectural changes — flag for human review |
| Precise description | Drives auto-delegation in tools that support it | Reviews backend route handlers for AGENTS.md compliance |
| Output format spec | Don’t leave format to AI discretion | Severity \| Line \| Issue \| Suggested Fix |
Build Your Agent — E→P→A→V
[EVALUATE]
Load: knowledge/agents/ (browse existing agents)
Load: AGENTS.md
For my role, what is the most common recurring task
I prompt the AI for? What do I keep re-explaining?
Look at existing agents (e.g., code-reviewer.md).
What design decisions did they make?
- How precise is the description?
- How are tools scoped?
- What’s in the ‘never do’ section?
- How is output format specified?
[PLAN]
I want to build an agent for: [your recurring task]
Design:
- Name (kebab-case, action-oriented): [e.g., brd-reviewer]
- Description (one sentence, delegation-precise):
- Tools needed: [Read/Write/Grep/Glob/Bash — minimum set]
- What it does: [specific responsibilities]
- What it never does: [explicit constraints]
- Output format: [exact specification]
Review this design before I write the file.
[APPLY]
Plan approved. Write the agent markdown file.
Follow the exact format: YAML frontmatter (name, description,
tools, model) + system prompt body (role, what you do,
what you never do, output format).
Save to: knowledge/agents/{my-role}/{agent-name}.md
[VALIDATE]
Test the agent against a real task from this week.
1. Does it produce role-quality output on the first try?
2. Does the output match the specified format exactly?
3. Does it respect the ‘never do’ constraints?
4. Would this output need manual correction before use?
5. Is the description precise enough for auto-delegation?
If the output needs correction: identify what the system
prompt is missing and iterate. One refinement cycle is normal.
More than two means the scope or instructions are unclear.
Tool wiring
See the Appendix for how to load agents in your AI tool (file copy, paste, or symlink depending on tool).
Part 3 — Building a Skill
Skills are packaged domain expertise loaded dynamically when a task matches. They keep the main context clean while making specialized knowledge available on demand. The SKILL.md format is an open standard — the portable artifact is a markdown file that works across tools.
The SKILL.md Format
---
name: skill-name-in-kebab-case
description: One sentence — precise enough to drive dynamic loading.
---
# Skill Name
## What this skill provides
[Domain expertise — standards, conventions, checklists]
## Inputs
[What the AI receives when this skill is loaded]
## What to do
[Step-by-step or checklist instructions]
## Output format
[Exact output specification]
## Constraints
[What this skill explicitly does not cover]
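A quick structural check can catch a skill draft that skips one of the body sections listed in the template. This is a sketch under the assumption that each section appears as a `## ` heading with exactly the wording used in this module; `missing_sections` and the sample draft are hypothetical.

```python
REQUIRED_SECTIONS = [
    "What this skill provides",
    "Inputs",
    "What to do",
    "Output format",
    "Constraints",
]


def missing_sections(skill_text: str) -> list[str]:
    """Return required `## ` headings that do not appear in the skill body."""
    found = {
        line[3:].strip()
        for line in skill_text.splitlines()
        if line.startswith("## ")
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


# Hypothetical draft that forgot its Constraints section.
draft = """---
name: brd-completeness-check
description: Checks a BRD for required sections and success criteria.
---
# BRD Completeness Check
## What this skill provides
## Inputs
## What to do
## Output format
"""

print(missing_sections(draft))  # ['Constraints']
```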
Build Your Skill — E→P→A→V
[EVALUATE]
Load: knowledge/skills/ (browse existing skills)
For my role, what quality check or domain expertise
do I apply repeatedly — but not on every single task?
Examples: BRD completeness check, AC testability check,
ADR options completeness, dev task sizing, test coverage,
handoff readiness check.
Look at existing skills. How is the description worded?
How specific is the checklist?
[PLAN]
I want to build a skill for: [your quality check / domain area]
Design:
- Name (kebab-case): [e.g., brd-completeness-check]
- Description (one sentence, loading-precise):
- What expertise it packages: [checklist or standards]
- Inputs it expects: [what file/artifact it checks]
- Output format: [PASS/FAIL per item, severity, summary]
- What it does NOT cover: [explicit scope boundaries]
Review this design before I write the file.
[APPLY]
Plan approved. Write the SKILL.md file.
Follow the exact format: YAML frontmatter (name, description)
+ body (what it provides, inputs, what to do, output format,
constraints).
Save to: knowledge/skills/{skill-name}/SKILL.md
[VALIDATE]
Load the skill into your AI tool and run it against
a real artifact from this week.
1. Did it load when the task matched the description?
2. Did the checklist catch everything you’d expect?
3. Were there false positives (flagged things that are fine)?
4. Were there false negatives (missed real issues)?
5. Does the output format match what you specified?
If the checklist has gaps: add the missing items.
If false positives: tighten the criteria.
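Questions 3 and 4 of the validation step come down to comparing two sets: the issues you know are in the artifact and the issues the skill flagged. A minimal sketch, assuming you record both as short labels (the labels below are made up for illustration):

```python
def compare_findings(expected: set[str], flagged: set[str]) -> dict[str, set[str]]:
    """Split a skill's findings into caught issues, false positives, and false negatives."""
    return {
        "caught": expected & flagged,
        "false_positives": flagged - expected,   # flagged, but actually fine
        "false_negatives": expected - flagged,   # real issues the skill missed
    }


# Hypothetical issue labels from one validation run.
expected = {"missing success criteria", "no numeric IDs"}
flagged = {"missing success criteria", "heading style"}

result = compare_findings(expected, flagged)
print(sorted(result["false_positives"]))  # ['heading style']
print(sorted(result["false_negatives"]))  # ['no numeric IDs']
```

False negatives point at checklist items to add; false positives point at criteria to tighten.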
Part 4 — Governance
The more autonomy you give an agent, the more important its constraints are. Governance is about designing for the failure cases, not just the happy path.
Tool Scope as Governance
| Agent type | Tools to grant | Tools to withhold |
|---|---|---|
| Reviewers and checkers | Read, Grep, Glob | Write, Bash (cannot modify files) |
| Implementers and refactorers | Read, Write, Bash | None withheld; scope Bash to specific directories in the system prompt |
| Documentation agents | Read, Write | Bash (limit Write to docs/ or knowledge/ in system prompt) |
When in doubt: give fewer tools
You can always expand scope. Contracting it after an agent has caused a problem is harder.
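The reviewer row of the table can be turned into a simple pre-commit style check: compare the `tools:` value in an agent's frontmatter against the read-only set and report anything extra. This is a sketch; `excess_tools` is a hypothetical helper, and the comma-separated format is assumed from the agent template in Part 2.

```python
READ_ONLY_TOOLS = frozenset({"Read", "Grep", "Glob"})


def excess_tools(tools_field: str, allowed: frozenset = READ_ONLY_TOOLS) -> set[str]:
    """Return tools granted in a comma-separated `tools:` value beyond the allowed set."""
    granted = {tool.strip() for tool in tools_field.split(",") if tool.strip()}
    return granted - allowed


print(sorted(excess_tools("Read, Grep, Glob")))   # []
print(sorted(excess_tools("Read, Write, Bash")))  # ['Bash', 'Write']
```

An empty result means a reviewer agent stays within the read-only scope; anything else is worth a second look before the file is committed.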
Autonomy Levels
| Level | Agent behavior | Use for |
|---|---|---|
| Read and report | Full autonomy. Reads, analyzes, reports. No human approval before output. | Reviews, audits, analysis |
| Draft and propose | High autonomy. Drafts artifacts, suggests changes. Human approves before commit. | Code generation, document drafting |
| Execute with confirmation | Medium autonomy. Performs writes but asks for confirmation at each irreversible step. | File modifications, refactoring |
| Plan only | Low autonomy. Produces a plan. Human implements. | Production deployments, DB migrations, security changes |
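If it helps to reason about the levels as data, the table can be encoded as a small lookup. The sketch below paraphrases the table; the task labels and the fallback to "Plan only" for unknown tasks are assumptions made for illustration, not part of the module's guidance.

```python
from enum import Enum


class Autonomy(Enum):
    READ_AND_REPORT = "Read and report"
    DRAFT_AND_PROPOSE = "Draft and propose"
    EXECUTE_WITH_CONFIRMATION = "Execute with confirmation"
    PLAN_ONLY = "Plan only"


# Where the human checkpoint sits for each level, paraphrased from the table.
HUMAN_CHECKPOINT = {
    Autonomy.READ_AND_REPORT: "none before output",
    Autonomy.DRAFT_AND_PROPOSE: "approval before commit",
    Autonomy.EXECUTE_WITH_CONFIRMATION: "confirmation at each irreversible step",
    Autonomy.PLAN_ONLY: "human implements the plan",
}

# Hypothetical task labels mapped to the table's "Use for" column.
TASK_LEVEL = {
    "review": Autonomy.READ_AND_REPORT,
    "document drafting": Autonomy.DRAFT_AND_PROPOSE,
    "refactoring": Autonomy.EXECUTE_WITH_CONFIRMATION,
    "db migration": Autonomy.PLAN_ONLY,
}


def checkpoint_for(task: str) -> str:
    # Unknown tasks fall back to the most restrictive level (an assumption of this sketch).
    level = TASK_LEVEL.get(task, Autonomy.PLAN_ONLY)
    return f"{level.value}: {HUMAN_CHECKPOINT[level]}"


print(checkpoint_for("refactoring"))
# Execute with confirmation: confirmation at each irreversible step
print(checkpoint_for("unknown task"))
# Plan only: human implements the plan
```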
"Never Do" Rules to Consider
Every well-designed agent has an explicit constraints section. Start with these and add role-specific ones:
## What you never do
- Never hard-delete records. Only soft-delete with is_deleted = 1.
- Never expose password_hash, reset_token, or reset_expires in output.
- Never suggest architectural changes — flag for human review.
- Never commit or push code — leave that to the human.
- Never modify files outside [specified directory].
- Never accept requirement changes — document for stakeholder sign-off.
- Never log or echo PII in output, even in error messages.
- Never include real client data in agent prompts stored in the repo.
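The first two constraints describe behavior concrete enough to show in code, which is also a useful way to explain to the agent what compliant output looks like. The sketch below assumes records are plain dictionaries; `redact`, `soft_delete`, and the sample user record are hypothetical, while the field names come from the constraints themselves.

```python
SENSITIVE_FIELDS = frozenset({"password_hash", "reset_token", "reset_expires"})


def redact(record: dict) -> dict:
    """Return a copy of the record without fields that must never appear in output."""
    return {key: value for key, value in record.items() if key not in SENSITIVE_FIELDS}


def soft_delete(record: dict) -> dict:
    """Mark a record as deleted instead of removing it."""
    return {**record, "is_deleted": 1}


# Hypothetical record for illustration only.
user = {"id": 7, "email": "a@example.com", "password_hash": "x", "is_deleted": 0}

print(redact(user))                     # {'id': 7, 'email': 'a@example.com', 'is_deleted': 0}
print(soft_delete(user)["is_deleted"])  # 1
```

Both helpers return new dictionaries rather than mutating their input, so the original record is left untouched.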
Role-Specific Build Guide
Same framework, different domain. Use the E→P→A→V prompts from Parts 1–3 — the table below tells you what to build for your role.
| Role | Rule extension | Agent to build | Skill to build |
|---|---|---|---|
| Solutions Designer | BRD format, required fields, language standards (no technical solutioning) | brd-reviewer: checks BRD against quality bar | brd-completeness-check: verifies all sections, numeric IDs, success criteria |
| Product Mgr / Designer | User story format, AC testability language, wireframe annotation | prd-reviewer: checks PRD against quality bar, BRD traceability | user-story-ac-check: verifies ACs testable, references BRD |
| Project Manager | Status update format, risk escalation thresholds, handoff checklists | handoff-readiness-reviewer: checks artifacts meet gate conditions | sprint-completion-check: verifies task status, flags incomplete |
| Solutions Architect | ADR required fields, architecture consistency, NFR documentation | adr-reviewer: checks ADR options, rationale, consequences | architecture-brd-consistency-check: cross-references arch vs BRD |
| Tech Lead | PR description requirements, code review criteria, Dev Task sizing | code-reviewer (extend): add 2+ new checks for stack/team conventions | dev-task-quality-check: verifies atomic, traceable, clear ACs |
| Developer | Stack-specific conventions, test naming, dependency management | unit-test-reviewer: checks test quality bar (AC coverage, mocking, naming) | test-coverage-check: verifies happy/boundary/error paths per AC |
Where to commit
Rules: AGENTS.md (or tool-specific override file)
Agent: knowledge/agents/{your-role}/{agent-name}.md
Skill: knowledge/skills/{skill-name}/SKILL.md
Wire into your tool following the tables in the Appendix.
Deliverable Quality Bar
Each asset must pass its own Evaluate → Plan → Apply → Validate cycle. Here is what "pass" means for each:
| Asset | Pass criteria |
|---|---|
| Rule | Start a fresh session with the updated AGENTS.md loaded. Give the AI a task that should trigger the rule. The AI follows it without being told explicitly. |
| Agent | Run the agent against a real task from this week. It produces role-quality output on the first try (or second after one refinement). The output matches the specified format exactly. The 'never do' constraints are respected. |
| Skill | Load the skill and give the AI a task that matches the description. It loads dynamically. The checklist catches what you’d expect. The output matches the specified format. No critical false negatives. |
Due
Week 3 Friday — presented and contributed before Module 7. All three assets committed to knowledge/ under your role’s subfolder. Included in the Sprint 1 Readiness gate review alongside code and tests.
Self-Check
- I can explain the difference between rules, agents, and skills — and when to use each.
- I’ve audited AGENTS.md and identified at least 2 rule additions for my role.
- I’ve built a custom agent with tight scope, tool restrictions, system prompt, never-do section, and output format.
- I’ve built a skill with precise description, step-by-step instructions, and output format.
- I’ve tested both the agent and the skill against real tasks and validated role-quality output.
- I’ve wired my agent and skill into my primary AI tool.
- I understand governance: tool scope, never-do sections, autonomy levels, privacy guardrails.
- All three assets are committed to knowledge/ under my role’s subfolder.
Appendix — Tool-Specific Wiring Reference
The formats (AGENTS.md, agent markdown, SKILL.md) are portable across tools. How you load them varies. This appendix covers wiring for the primary AI tools. The methodology and the assets are the same — only the loading mechanism changes.
Rules
| Tool | Primary rules file | Override / extend |
|---|---|---|
| Claude Code | AGENTS.md (auto-loaded every session) | .claude/CLAUDE.md for Claude-specific behavior |
| OpenCode | AGENTS.md (auto-loaded every session) | opencode.json for tool/provider config; per-agent overrides |
| Cursor | AGENTS.md (auto-loaded) | .cursor/rules/*.mdc for file-pattern-specific rules |
| Gemini (Workspace) | Paste rules into Gem instructions or session start | Gem knowledge files for project context |
| ChatGPT / other | Paste rules into system prompt or custom instructions | Project-level instructions if supported |
Agents
| Tool | Agent location | Invocation |
|---|---|---|
| Claude Code | .claude/agents/{name}.md (copy from knowledge/agents/) | Auto-delegated by description match, or @agent-name |
| OpenCode | .opencode/agents/{name}.md (copy from knowledge/agents/) or opencode.json | Auto-invoked by description match, or @-mention the subagent |
| Cursor | .cursor/commands/ for command agents | Command palette or Agent Requested rule trigger |
| Gemini (Workspace) | Paste system prompt into Gem instructions | Select the Gem before starting session |
| ChatGPT / other | Paste system prompt at session start | Manual — paste the agent prompt to start each session |
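For the tools in the table that load agents by file copy, the wiring step can be scripted. This is a sketch rather than an official installer: `wire_agent` is a hypothetical helper, the directory names are taken from the table above, and the demo runs against a temporary directory so it does not touch a real repo.

```python
import shutil
import tempfile
from pathlib import Path


def wire_agent(repo_root: Path, role: str, name: str,
               tool_dir: str = ".claude/agents") -> Path:
    """Copy knowledge/agents/{role}/{name}.md into the tool's agent directory."""
    source = repo_root / "knowledge" / "agents" / role / f"{name}.md"
    target = repo_root / tool_dir / f"{name}.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # raises FileNotFoundError if the source is missing
    return target


# Demo against a throwaway directory so nothing in a real repo is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source_dir = root / "knowledge" / "agents" / "tech-lead"
    source_dir.mkdir(parents=True)
    (source_dir / "code-reviewer.md").write_text("---\nname: code-reviewer\n---\n")
    copied = wire_agent(root, "tech-lead", "code-reviewer")
    print(copied.relative_to(root).as_posix())  # .claude/agents/code-reviewer.md
```

Passing `tool_dir=".opencode/agents"` would target the OpenCode location from the same table. The version-controlled file under knowledge/agents/ stays the source of truth; the copy is only wiring.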
Skills
| Tool | Skill location | Loading |
|---|---|---|
| Claude Code | knowledge/skills/{name}/SKILL.md or .claude/skills/ | Dynamic — loaded when task matches description |
| OpenCode | .opencode/skills/{name}/SKILL.md (copy from knowledge/skills/) | Dynamic — loaded when task matches description |
| Cursor | knowledge/skills/{name}/SKILL.md | Dynamic if indexed; manual paste otherwise |
| Gemini (Workspace) | Paste SKILL.md content into session | Manual — load at session start when relevant |
| ChatGPT / other | Paste SKILL.md content into session | Manual — include in prompt when relevant task arises |
The portable artifact is the markdown file
The agent markdown file, the SKILL.md, and AGENTS.md are the assets. They live in your repo’s knowledge/ directory. How you load them into a specific tool is a wiring detail that changes as tools evolve. The asset itself transfers across any tool that accepts a system prompt.