On-Ramp · Optional

◈ Environment Setup

⏱ 20 min

Terminal, Git & Node — 20-Minute Setup

Claude Code runs in your terminal and works with Git and Node. This page gets you ready in 20 minutes. If you already use these tools daily, skip ahead — you're ready.

Skip this page if: You can open a terminal, run git --version, run node --version, and know what a working directory is. Jump straight to The Mental Model →

Objective

By the end of this page you will have: a working terminal, Git installed and configured, Node.js 20+ installed, and a test project folder ready. Every step has a verification command — run it before moving on.

Step 1 — Open a Terminal

Mac

Press Cmd+Space, type Terminal, press Enter. Or install iTerm2 for a better experience.

Windows

Install Windows Terminal from the Microsoft Store. Use PowerShell or WSL2 (Ubuntu). Claude Code works best in WSL2 on Windows.

Linux

You already have a terminal. Open it with Ctrl+Alt+T on most distros.

# Verify: your terminal is working
echo "hello"
# Expected output: hello

Step 2 — Install Git

Git tracks changes to your files. Claude Code uses it constantly — checking diffs, creating branches, reading history.

Mac

Run git --version. If not installed, macOS will prompt you to install Xcode Command Line Tools. Accept it. Takes ~5 minutes.

Windows (WSL2)

sudo apt update && sudo apt install git -y

Linux

sudo apt install git -y (Ubuntu/Debian) or sudo dnf install git -y (Fedora)

# Verify
git --version
# Expected: git version 2.x.x or higher

# Configure your identity (one-time)
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

Step 3 — Install Node.js 20+

Node.js runs JavaScript outside the browser. The course starter project uses Node. Claude Code itself is distributed via npm.

macOS, Linux, and WSL2: recommended method

Install nvm (Node Version Manager), then install Node 20. nvm is a POSIX shell script, so on Windows run it inside WSL2 rather than in PowerShell:
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
Close and reopen your terminal, then:
nvm install 20 && nvm use 20

Alternative — direct installer

Download the LTS installer from nodejs.org. Choose "LTS" (20.x or higher).

# Verify
node --version
# Expected: v20.x.x or higher
npm --version
# Expected: 10.x.x or higher

Step 4 — Understand the Terminal Basics

Five commands cover 90% of terminal use in this course:

Command	What it does	Example
`pwd`	Print working directory — where you are	`pwd` → `/Users/you/projects`
`ls`	List files in current folder	`ls` or `ls -la` for hidden files
`cd`	Change directory	`cd my-project` or `cd ..` to go up
`mkdir`	Make a new folder	`mkdir my-project`
`Ctrl+C`	Stop a running command	Use when something hangs
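A short practice run can make the table above concrete. This is a minimal sketch, assuming a POSIX shell (macOS, Linux, or WSL2). It works in a throwaway directory created with mktemp, so nothing in your real folders changes. The names my-project and notes.txt are illustrations only, and touch is used solely so that ls has something to list.

```shell
# Work in a throwaway directory so your real folders are untouched.
scratch="$(mktemp -d)"
cd "$scratch"

mkdir my-project      # make a new folder
cd my-project         # move into it
pwd                   # prints a path ending in /my-project
touch notes.txt       # create an empty file so ls has something to show
ls                    # prints: notes.txt
cd ..                 # go back up one level
```

If a command ever seems to hang during practice, Ctrl+C returns you to the prompt.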

Step 5 — Create Your Project Folder

Claude Code always runs inside a project folder. Create one now — you'll use it throughout the course.

# Create and enter your project folder
mkdir my-claude-project
cd my-claude-project

# Initialise Git (makes it a repo)
git init

# Verify
git status
# Expected in a fresh repo: "On branch main" (or "master" on older Git defaults),
# then "No commits yet" and "nothing to commit"

Step 6 — Verify You're Ready

Check all five boxes before continuing:

Terminal opens and I can type commands

git --version returns 2.x or higher

node --version returns v20.x or higher

I created a project folder and ran git init

I know how to use cd, ls, and Ctrl+C
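The version checks in the list above can be run in one pass. The following is a sketch under stated assumptions, not an official installer check: it assumes bash, the helper names major_of and check_tool are hypothetical, and it only compares the major version number against the minimums this page asks for (Git 2, Node 20).

```shell
#!/usr/bin/env bash

# Extract the first integer in a version string,
# e.g. "v20.11.1" -> 20, "git version 2.43.0" -> 2.
major_of() {
  local v="$1"
  v="${v#"${v%%[0-9]*}"}"          # drop everything before the first digit
  printf '%s\n' "${v%%[!0-9]*}"    # keep only the leading run of digits
}

# Report whether a tool is installed and meets a minimum major version.
check_tool() {
  local name="$1" min="$2" out major
  if ! command -v "$name" >/dev/null 2>&1; then
    echo "MISSING  $name"
    return 1
  fi
  out="$("$name" --version 2>/dev/null)"
  major="$(major_of "$out")"
  if [ "${major:-0}" -ge "$min" ]; then
    echo "OK       $name ($out)"
  else
    echo "TOO OLD  $name ($out), need $min or higher"
    return 1
  fi
}

# "|| true" keeps the script going so both tools are always reported.
check_tool git 2 || true
check_tool node 20 || true
```

If either line reports MISSING or TOO OLD, return to the matching step before continuing.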

Stuck on setup? Post in the Q&A Discussion thread — include your OS, the command you ran, and the error message.

Checkpoint: All five boxes checked. You're ready to install Claude Code. Continue to The Mental Model →

Self-Paced · 24 Modules · 3 Labs · Vetted against live product May 2026

Claude Code + Cowork: From Unaware to Native

Master the agent loop that powers both Claude Code (for developers) and Claude Cowork (for everyone). Same architecture, same mental model, two surfaces.

4 Levels · 24 Modules · 3 Labs · ~90m Reading · ~30h Lab Work

Last verified: 2026-05-11
Claude Code: ≥1.0  |  Node: ≥20  |  VS Code: any
Feature states may change — run claude doctor to confirm your install.

💬

Course Community — GitHub Discussions

Ask questions · Share your capstone · Get unstuck · Help others

○ Unaware (4 modules · ~15m) → ◐ Curious (5 modules · ~25m) → ◉ Fluent (7 modules + 2 labs) → ● Native (6 modules + 1 lab)

⏱ 10 min read

The Mental Model — Externalization

Claude Code and Cowork are both harnesses that externalize memory, skills, and protocols around the same agent loop. Code gives the agent a codebase. Cowork gives it your work life. Learn the loop once, apply it everywhere.

This course uses the externalization framework from cognitive science and LLM agent research. Following Norman's theory of cognitive artifacts, external tools don't just enhance capabilities — they change the nature of the tasks the model addresses. Every Claude Code and Cowork feature maps to one of four externalization dimensions:

🧠 Memory — Externalized State

What the agent knows across time. In Code: CLAUDE.md (semantic — rules you write), .claude/memory/ (episodic — things the agent learned), /resume (session recovery). In Cowork: context files (about-me.md, brand-voice.md, working-preferences.md), global instructions. Both compound over time — every week you refine them, the agent gets better at your specific work.

🛠 Skills — Externalized Procedures

How the agent acts. In Code: skills with context:fork and $ARGUMENTS, subagents, bundled skills (/debug, /simplify, /batch). In Cowork: plugins with /productivity:start, /data:explore, and domain-specific slash commands. Both transform improvised output into structured, repeatable composition.

🔗 Protocols — Externalized Interaction

How the agent talks to the outside world. In Code: MCP servers, claude mcp add. In Cowork: connectors (Slack, Drive, Notion, Snowflake — 60+ integrations). Both use the Model Context Protocol under the hood. plugin.json is itself a protocol — a formal contract about what a plugin exposes.

🎛 Harness — The Coordinator

Six design aspects: control (plan mode, permission modes, opusplan), sandboxing (worktrees, container isolation, Cowork's VM), oversight (diff review, plan approval, HITL gates), observability (/cost, /usage, OTEL tracing), governance (plugin scopes, private marketplaces, admin controls), and context management (/compact, context rot mitigation, subagent delegation).

Two Tracks, One Journey

🎬 Reading Modules (24 modules · ~90 min total)

Short (2–6 min) hands-on walkthroughs of Claude Code and Cowork features. Commands, shortcuts, demos. Each module has one learning objective. Cowork parallels are threaded throughout via ⚡ callout boxes — you learn both surfaces without switching context.

📘 Foundation Labs (3 labs · ~30 hrs total)

Deep-dive labs from the Intelligence-to-Physical Systems self-study guide. Build AI systems from scratch (RAG pipelines, multi-agent orchestration, sandboxed agent runtimes) to understand the theory behind what Claude Code and Cowork automate. Each lab produces a portfolio deliverable and includes a 3-tier assessment rubric.

Prerequisites

For the Reading Modules

A computer (macOS, Linux, or Windows with WSL). A paid Claude account (Pro, Max, Team, or Enterprise). Basic familiarity with a terminal. No coding experience required for Level 1 — the first four modules are conceptual.

For the Foundation Labs

Python 3.11+. Comfort with pip, virtual environments, and basic scripting. Docker installed for Lab C. The labs are independent — you can skip any or do them in any order, though the recommended sequence is A → B → C.

Version & Plan Requirements: This course targets Claude Code v2.1.84+ (April 2026). Some features require specific plan tiers: 1M-token context (Max/Team/Enterprise), private plugin marketplaces (Enterprise), Cowork Routines (Pro/Max, research preview). Features marked with ★ PREVIEW may change. Run claude --version to verify your install. Run claude update to get the latest.

Pacing Guide

Sprint (1 week): Reading modules only. Levels 0–2 on day 1, Level 3 on days 2–3, Level 4 on days 4–5, Capstone on days 6–7. Labs as follow-up homework.

Steady (4 weeks): Week 1: Levels 0–2 + Lab A. Week 2: Level 3 (Modules 10–13) + Lab B. Week 3: Level 3 (Modules 14–16) + Lab C. Week 4: Level 4 + Capstone.

Corporate rollout: One level per week. Labs assigned as homework between sessions. Pair each level with a practical exercise from the assessment rubric. Use the Capstone as a team demonstration.

Course Project: Build a CLI Task Manager

One project, built across modules

The best way to learn is to build something real. This course threads a single project through key modules — a CLI task manager app. You'll scaffold it in Module 05, add CLAUDE.md in Module 10, create a /deploy-check skill in Module 11, add a test hook in Module 15, build a security-reviewer agent in Module 17, and package everything as a plugin in Module 21. By the capstone, you have a working reference implementation demonstrating every level.

You choose the stack. TypeScript + Node, Python + Click, Go + Cobra, Rust + Clap — whatever you're comfortable with. The project is a vehicle for learning Claude Code features, not a software engineering exercise. If you don't have a preference, use TypeScript:

mkdir task-cli && cd task-cli
npm init -y && npm i commander chalk
claude  # start building with Claude Code

Look for 📌 PROJECT callouts in modules — they tell you what to build at each stage. If you get stuck, clone the starter repo and checkout the matching milestone branch to see the reference implementation:

# Clone the reference project
git clone https://github.com/douglasmun/claude-code-course-starter.git
# Enter the cloned folder before switching branches
cd claude-code-course-starter
git checkout milestone-1-scaffold  # start at your level

How to Read This Course

Externalization tags

Look for the colored badges — 🧠 Memory, 🛠 Skills, 🔗 Protocols, 🎛 Harness — on each module. They reinforce the mental model so you always know which dimension a feature belongs to.

⚡ Cowork callouts

Every module that teaches a Claude Code concept includes a sky-blue ⚡ box showing the Cowork equivalent. If you're primarily a Cowork user, these boxes are your fast path through the course.

Checkpoints

Every module ends with a concrete checkpoint — something you can do or explain to prove you've internalized the content. Don't skip to the next module until you can meet the checkpoint.

Command callouts

New commands and shortcuts are tagged with green COMMANDS badges. The full reference is in the Synthesis section.

Companion Resources

Claude Code Cheat Sheet v1.0

The definitive quick-lookup reference for every slash command, keyboard shortcut, CLI flag, hook event, environment variable, and plugin — on one page. Written by the same author as this course. Use the cheat sheet as your daily reference; use this course as your learning path. They complement each other.

→ douglasmun.github.io/claude-code-cheatsheet-v1.0.html

Starter Repo — claude-code-course-starter

The progressive mini-project built across the course. Clone it, checkout the milestone branch matching your current module, and build alongside the lessons. Each branch adds one layer — from scaffold to plugin. Use it as a reference when you get stuck, or as a template for your own projects.

→ github.com/douglasmun/claude-code-course-starter

Level 1 · Unaware · Module 01

● What is Claude Code?

⏱ 8 min

What is Claude Code?

Differentiate Claude Code from chat-based AI and code autocomplete. Understand what "agentic coding" means and why it matters.

Objective

Explain in one sentence what makes Claude Code fundamentally different from autocomplete, chat, and traditional coding tools.

Three Generations of AI Coding Tools

Generation 1: Autocomplete

Predicts the next few tokens based on what you've typed. GitHub Copilot, Tabnine. You write code; the tool fills in. You are the driver; the AI is a fast typist.

Generation 2: Chat

You describe what you want in a conversation window. ChatGPT, Claude.ai chat. You copy-paste the output somewhere useful. You are the operator; the AI is your advisor.

Generation 3: Agentic

You describe an outcome. The agent reads your codebase, makes a plan, shows it to you, and executes every step autonomously — editing files, running tests, fixing errors, and committing. You are the director; the AI is the executor. This is Claude Code.

Where Claude Code Runs

Claude Code isn't locked to a single surface. It runs in six places, all using the same agent loop:

Terminal (CLI) — the primary interface. Full power.

VS Code — extension with Cmd+Esc launch. Sees your editor context.

JetBrains — IntelliJ, PyCharm, WebStorm, CLion, Rider.

Desktop App — standalone application.

Web — claude.ai/code in your browser.

iOS — mobile access for monitoring and dispatch.

What Makes It Agentic

It reads before it writes

Claude Code explores your codebase — files, directory structure, git history, dependencies, error messages — before proposing any change. It builds a mental model of what exists.

It plans before it acts

In plan mode, Claude shows you its reasoning and proposed approach before touching any file. You approve the plan, then it executes.

It verifies after it acts

Claude can run tests, check linting, and verify its own changes — then iterate if something fails. The loop continues until the goal is met.

Nothing lands without your approval

Every file edit produces a diff you review. You accept, reject, or modify. Claude commits nothing without sign-off by default.

Key Specs

1M-token context window (Max/Team/Enterprise)

Up to 10 parallel subagents

Plugin marketplace — 11 role-specific plugins (Productivity, Sales, Finance, Legal, Marketing, Data Analysis, Software Development, Customer Support, Creative Production, Project Management, Enterprise Search)

MCP for external tool integration

OTEL tracing for enterprise observability

Cowork: Same agent architecture applied to non-code work. Instead of reading a codebase, it reads your file system. Instead of editing code, it creates documents, spreadsheets, presentations, and reports. Instead of running tests, it pulls live data from Slack, Drive, and Notion. Now generally available on macOS and Windows with enterprise controls.

Checkpoint: You can explain the three generations of AI coding tools and articulate what "agentic" means: goal-directed, tool-using, autonomous loop with human oversight.

Level 1 · Unaware · Module 02

● What is an AI Agent?

⏱ 12 min

What is an AI Agent?

The core concept behind everything in this course. Once you internalize the agent loop, you stop memorizing features and start understanding why they exist.

Objective

Explain the agent loop, the spectrum from prompt to multi-agent, and map every Claude Code/Cowork feature to a specific step in the loop.

A Chatbot Responds. An Agent Acts.

A chatbot takes your message and produces a reply. An agent takes your goal and works toward it — reading, planning, acting, observing the results, and iterating until the goal is met or it needs your input. The loop is the fundamental difference.

The Agent Loop

Every AI agent follows the same cycle, whether it's Claude Code editing TypeScript, Cowork building a board deck, or a self-driving car navigating traffic:

👁 Perceive (read files, errors, context) → 🧠 Reason (analyze, plan approach) → ⚡ Act (edit code, run commands) → 🔍 Observe (check results, run tests) → 🔄 Iterate (repeat or report back)

The loop continues until the goal is met, the agent gets stuck, or you intervene. That last part is critical — you are the oversight layer. Every approval gate, every diff review, every plan confirmation is your opportunity to redirect the loop.
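To make the cycle tangible, here is a toy version of the loop in bash. It is an illustration only and assumes nothing about how Claude Code is built: the "goal" is a counter reaching a target value, the function name run_agent_loop is hypothetical, and max_steps stands in for the point where a real agent would stop and ask for help.

```shell
#!/usr/bin/env bash

# Toy agent loop: work toward a target value, giving up after max_steps.
run_agent_loop() {
  local target="$1" max_steps="$2"
  local state=0 step=0 seen plan

  while [ "$step" -lt "$max_steps" ]; do
    step=$((step + 1))

    # Perceive: read the current state of the world.
    seen="$state"

    # Reason: decide what to do next.
    if [ "$seen" -lt "$target" ]; then
      plan="increment"
    else
      plan="none"
    fi

    # Act: carry out the chosen action.
    if [ "$plan" = "increment" ]; then
      state=$((state + 1))
    fi

    # Observe: check whether the goal is now met.
    if [ "$state" -eq "$target" ]; then
      echo "done after $step steps"
      return 0
    fi
    # Iterate: otherwise go around again.
  done

  echo "stuck after $max_steps steps"
  return 1
}

run_agent_loop 3 10          # prints: done after 3 steps
run_agent_loop 5 2 || true   # prints: stuck after 2 steps
```

The second call shows the "agent gets stuck" exit: the step budget runs out before the goal is met, which is the point where a human would step in.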

The Spectrum: From Prompt to Multi-Agent

Level	What it does	Loop?	Example
Prompt	One input → one output	No	ChatGPT answering a question
Chain	Fixed sequence of steps	No	LangChain pipeline
Agent	Goal-directed, chooses tools, iterates	Yes	Claude Code / Cowork session
Multi-Agent	Multiple agents with specialized roles	Nested	Claude Code subagents / Cowork sub-agents

How Claude Code Implements Each Step

👁 Perceive

Claude reads your codebase — file contents, directory structure, git history, package dependencies, error messages, diagnostics, even screenshots you paste. It builds a mental model of what exists before doing anything. In Cowork: reads your files, connected tools (Slack, Drive, Notion), and context files.

🧠 Reason

It analyzes the problem, considers codebase conventions (from CLAUDE.md), and formulates an approach. In plan mode, this step is made visible — you see the thinking before any action. In Cowork: the "show me your plan before executing" instruction surfaces this step.

⚡ Act

It edits files, creates new ones, runs shell commands, calls external tools via MCP. Each action produces a diff or output you can review. Skills structure this step — pre-defined action sequences replace improvisation. In Cowork: creates documents, spreadsheets, presentations, pulls data from connectors.

🔍 Observe

It checks results — did tests pass? Did the build succeed? Did the linter flag anything? Hooks make this step deterministic: auto-run prettier after every edit, auto-run tests after every write. In Cowork: safety defaults enforce observation rules ("flag all assumptions").

🔄 Iterate or Stop

If the goal isn't met, it loops back. If it needs clarification, it asks. If done, it presents the result for approval. Subagents run their own nested loops in isolation, returning only a summary — your main context stays clean. Nothing lands without your sign-off.
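The isolation idea behind subagents has a loose analogy in the shell itself: a subshell. This sketch is only an analogy, not a description of how Claude Code implements subagents. The variable context and the function run_subtask are hypothetical names. Work done inside the parentheses cannot change the caller's variables; only the printed summary comes back.

```shell
#!/usr/bin/env bash

# The "main context" that should stay small and clean.
context="main notes"

# Run a task in a subshell: changes made inside do not leak out.
run_subtask() {
  (
    # This assignment only affects the subshell's copy of the variable.
    context="$context plus 500 lines of exploration"
    # Only what is printed is returned to the caller.
    echo "summary: found 3 TODOs"
  )
}

summary="$(run_subtask)"
echo "$summary"   # prints: summary: found 3 TODOs
echo "$context"   # prints: main notes
```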

Why This Matters for the Rest of the Course

Plan mode exists because of Reason

It makes the reasoning step explicit and reviewable. You approve the plan before Act begins. Without understanding the loop, plan mode seems like a UI toggle. With the loop, you see it as the agent making its thinking visible.

Subagents exist because of scope

A single agent loop has finite context. Subagents run their own loops in isolation — perceive, reason, act, observe — and return a summary. Your main loop stays clean. Without the loop, subagents seem like a performance trick. With the loop, you see them as context management.

Skills exist because of Act

Skills are pre-defined action sequences. Instead of the agent improvising a code review every time, a skill gives it a structured procedure. Without the loop, skills seem like macros. With the loop, you see them as reliable Act-step templates.

Hooks exist because of Observe

Hooks add deterministic observations — auto-run prettier, auto-run tests, block dangerous commands. Unlike the LLM's probabilistic behavior, hooks always fire. Without the loop, hooks seem like automation. With the loop, you see them as guaranteed quality gates.
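The difference between a deterministic gate and probabilistic behavior can be sketched with a plain shell function. This is not Claude Code's hook interface, which is configured separately. It is a hypothetical gate_command function with an illustrative blocklist, showing only that the same input always gets the same verdict.

```shell
#!/usr/bin/env bash

# Decide whether a proposed command may run.
# Returns 0 (allow) or 1 (block). The patterns are examples, not a complete list.
gate_command() {
  local cmd="$1"
  case "$cmd" in
    *"rm -rf "*|*"git push --force"*|*"DROP TABLE"*)
      echo "BLOCK: $cmd"
      return 1
      ;;
    *)
      echo "ALLOW: $cmd"
      return 0
      ;;
  esac
}

gate_command "npm test"               # prints: ALLOW: npm test
gate_command "rm -rf build" || true   # prints: BLOCK: rm -rf build
```

Because the rule is ordinary pattern matching, it fires every time, which is the property the paragraph above attributes to hooks.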

You exist because of oversight

The agent loop runs until you approve. Diff review, plan approval, permission modes — these are all mechanisms for human-in-the-loop control at different points in the cycle. You are the most important part of the loop.

Key insight: Every feature in Claude Code and Cowork maps to a specific part of the agent loop. Once you internalize the loop, you stop memorizing commands and start understanding why they exist. The rest of this course is an elaboration of this single diagram.

Checkpoint: You can draw the agent loop (perceive → reason → act → observe → iterate), place Claude Code features on it (plan mode = Reason, skills = Act, hooks = Observe, subagents = nested loops, you = oversight), and explain the spectrum from prompt to multi-agent.

Level 1 · Unaware · Module 03

● Code vs. Cowork

⏱ 10 min

Same Loop, Different Surface

Two products built on one architecture. Learn once, apply everywhere. This module maps every concept so you never need to learn twice.

Objective

Map every Claude Code concept to its Cowork equivalent and explain why the mental model transfers completely.

The Architecture Is Identical

Both run inside a sandboxed environment on your device. Both use the same foundation models. Both follow the same perceive → reason → act → observe loop. Both delegate to sub-agents for parallel work. The only difference is what the agent can see and touch.

Concept	Claude Code	Cowork
Runs in	Terminal / IDE	Claude Desktop (sandboxed VM)
Accesses	Git repo	File system + screen (Computer Use)
Memory	CLAUDE.md + .claude/memory/	Context files + global instructions
Procedures	Skills with context:fork	Plugins (dept-specific + custom)
External tools	MCP servers	Connectors (60+ via MCP)
Parallel work	Subagents + /batch	Sub-agents (auto-parallel)
Plan review	Plan mode (Shift+Tab)	"Show plan before executing"
Safety	Hooks + /hooks UI	Safety defaults + approval gates
Scheduling	/loop (in-session)	Scheduled Tasks + Routines (cloud)
Mobile trigger	(none)	Dispatch (phone → desktop)
Screen control	(none)	Computer Use (open apps, browser)
Context isolation	Worktrees (claude -w)	Projects

Cowork's Key Capabilities

File System Access

Claude reads and writes to a folder on your computer. Not uploads and downloads — direct read/write. It can reference last month's report to match formatting, pull data from old spreadsheets, and apply brand guidelines mid-task without you mentioning them.

Dispatch

Text a task from your phone → Cowork runs the full workflow on your desktop → you return to finished files. Requires the desktop app running and computer awake. Your phone becomes a remote trigger for complex desktop workflows.

Scheduled Tasks & Routines

Scheduled Tasks run on a timer while your desktop is open. If your machine is off, the task is skipped but shows in task history and auto-runs when you reopen. Routines (research preview) are cloud-hosted — they run even when your laptop is closed.

Computer Use ★ PREVIEW

Claude can interact directly with your screen — open apps, navigate browsers, run tools. It reaches for connectors first (structured data), falls back to browser (web actions), and only uses screen control as a last resort (when no other path exists).

Projects

Isolated workstreams without context bleeding. Run 5 different projects simultaneously — each with its own files, connectors, and context. The Cowork equivalent of Claude Code's git worktrees.

The Core Shift: Delegate Outcomes, Not Steps

This principle applies to both Code and Cowork. It's the single most important prompting shift for agentic tools:

❌ Step-by-step (chatbot thinking)

"Help me make a content calendar for March." — Vague goal, no definition of done, Claude will guess at everything.

✅ Outcome-based (agent thinking)

"I need a complete 30-day content pipeline. Read every file in this folder. Extract my brand voice patterns and top-performing formats. Generate 30 content ideas ranked by audience value. Write full drafts for the top 5. Build a publishing calendar in Excel with dates, platforms, and repurposing plans. Save everything as finished files." — Clear definition of done, specific deliverables, verifiable.

Cowork Context Files = CLAUDE.md for Your Work Life

The highest-leverage setup step for Cowork — building context files — is the exact same concept as CLAUDE.md. Both are semantic memory that compounds over time:

about-me.md

Who you are, your role, responsibilities, industry context, what success looks like. Include 1–2 examples of output you're proud of. Equivalent to the project description in CLAUDE.md.

brand-voice.md

How you communicate: tone, phrases you use naturally, phrases that sound wrong, and 2–3 actual writing samples. Samples are more useful than abstract descriptions. Equivalent to code style rules in CLAUDE.md.

working-preferences.md

How you want Claude to behave: questions before execution? Short or long outputs? File formats? Naming conventions? What should Claude never do without asking? Equivalent to .claude/settings.json and hooks.

Cowork's Top 3 Starter Plugins

Productivity

Task management, calendars, daily workflows. /productivity:start reviews your priorities and organizes your day. Connects to Slack, Notion, Asana, Linear, Jira, Monday, ClickUp, Microsoft 365.

Data Analysis

Drop a CSV, run /data:explore. Claude reads the full dataset, summarizes every column, flags anomalies, and suggests analyses before you've said what you're looking for. Connects to Snowflake, BigQuery, Databricks, Amplitude.

Role-specific (Sales / Marketing / Legal / Finance)

Install the one matching your function. Each loads domain methodology, structured workflows, and connector integrations. Plus partner plugins from Slack, LSEG, S&P Global, Apollo, Common Room.

Cowork Limitations to Know

No cross-session memory

Every session starts fresh. Context files and global instructions are the workaround — same pattern as CLAUDE.md and /resume. Document important decisions in files Claude can read next time.

Tasks die if you close the app

Cowork runs as an active session inside Claude Desktop. Quit = task stops mid-execution. Sleep mode is fine. Quitting is not. Routines (cloud) solve this for scheduled work.

Usage burns fast

Multi-step agent workflows consume significantly more allocation than chat. Batch related tasks into single sessions. Max plans ($100–$200/month) exist for a reason.

Desktop only (for now)

No mobile Cowork, no browser version. Dispatch lets you trigger from your phone, but execution happens on desktop. Use a cloud-synced folder for file consistency across machines.

Cowork Quick-Start Guide — 15 Minutes to Your First Workflow

If you're primarily a Cowork user (not a developer), here's the fastest path to productive use:

Step 1: Create your workspace (2 min)

Create a dedicated folder: ~/claude-workspace/. This is where Cowork reads and writes. Put existing documents you want Claude to reference here — last month's report, your team template, your brand guidelines PDF.

Step 2: Write your context files (5 min)

Create three files in your workspace folder. These are your CLAUDE.md equivalent — they tell Claude who you are and how to work:

# about-me.md
I'm [your name], [your role] at [company].
I work on [what you do]. My stakeholders are [who].
Success looks like [definition].

Here's an example of output I'm proud of:
[paste 1-2 paragraphs of your best work]

# brand-voice.md
Tone: [professional/casual/technical/warm]
We say: [phrases you use]
We never say: [phrases that sound wrong]
Example writing sample:
[paste 2-3 paragraphs of real writing]

# working-preferences.md
- Always ask clarifying questions before starting
- Show your plan before executing multi-step tasks
- Never send emails without my review
- Save all outputs as finished files, not drafts
- Use bullet points for internal docs, prose for external
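If you prefer to create the three files from a terminal, the following sketch scaffolds empty starting points that you then fill in using the templates above. It assumes bash, the function name scaffold_context_files is hypothetical, and the workspace path in the usage comment is the ~/claude-workspace folder from Step 1. Existing files are never overwritten.

```shell
#!/usr/bin/env bash

# Create the three context files in a workspace folder, skipping any that exist.
scaffold_context_files() {
  local dir="$1" name
  mkdir -p "$dir" || return 1
  for name in about-me.md brand-voice.md working-preferences.md; do
    if [ -e "$dir/$name" ]; then
      echo "skip   $name (already exists)"
    else
      # Start each file with a heading; paste the matching template below it.
      printf '# %s\n\n' "$name" > "$dir/$name"
      echo "create $name"
    fi
  done
}

# Usage (uncomment to run against your real workspace):
# scaffold_context_files "$HOME/claude-workspace"
```

Running it a second time is safe: every file is reported as skipped and your edits are left alone.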

Step 3: Install your first plugin (3 min)

Open Claude Desktop → Cowork tab → Customize → Plugins. Install the plugin matching your role: Productivity, Marketing, Sales, Finance, Legal, or Data Analysis. Each comes pre-configured with domain skills and connectors.

Step 4: Connect your tools (3 min)

In Customize → Connectors, connect the tools you use daily. Start with 2–3: Slack + Google Drive is the most common starting point. Each connector uses OAuth — click, authorize, done. Connectors persist across sessions.

Step 5: Run your first real task (2 min)

Give Claude an outcome-based task that uses your context files and connectors:

"Read the files in my workspace. Using my brand voice, write a weekly update email for my stakeholders summarizing this week's Slack conversations in #product-updates. Save as a draft in my workspace."

Claude reads your context files (knows your brand voice), pulls Slack data (via connector), writes in your style, and saves a file. Review, edit if needed, send.

Step 6: Build your first custom plugin (optional, 5 min)

Once you have a workflow you repeat, automate it with Plugin Create. In the Cowork tab, say: "I want to create a plugin that generates weekly stakeholder updates from Slack. Use my brand voice. Include a /weekly-update slash command." Claude assembles the skills, connectors, and commands into a plugin package. Customize and save.

Key insight: If you build a great CLAUDE.md, you already know how to build great context files for Cowork. If you learn Cowork's context file pattern, you'll write better CLAUDE.md files. The skill transfers because the concept — semantic memory for an agent — is the same.

The Same Task — Two Surfaces

This is the clearest way to see that the mental model is identical. The task: summarise last week's work and draft a status update. Same agent loop, different access permissions.

Step	Claude Code (developer)	Cowork (anyone)
Perceive	Reads git log, commit messages, changed files, PR descriptions from the last 7 days	Reads Slack #standup channel (via connector), Google Drive project folder, calendar events
Memory	CLAUDE.md describes the project, team conventions, and what "status update" means for this repo	about-me.md, brand-voice.md describe role, stakeholders, and preferred tone
Reason	Groups commits by feature, identifies blockers from PR comments, notes what shipped vs. in-progress	Groups Slack threads by topic, identifies decisions made, flags items needing follow-up
Act	Writes CHANGELOG.md entry + drafts a Slack message. Runs /pr-review skill if reviews are part of the update.	Writes a formatted email draft saved to workspace. Runs /weekly-update plugin skill.
Observe	Runs git diff to verify the CHANGELOG is accurate; hook auto-formats on save	You review the draft before sending; safety default flags any assumptions Claude made
Your role	Review the draft, approve or revise, send (same on both surfaces)

The prompt you'd write is nearly identical in both surfaces: "Summarise what I worked on last week and draft a status update for my stakeholders." The agent loop steps are identical. Only the data sources differ.

Going Deeper: 30 Cowork Best Practices

The quick-start above gets you productive. The guide below makes you dangerous. 30 Best Practices That Make Claude Cowork 100x More Powerful was compiled from Nav Toor's foundational work, Anthropic's official documentation, and community workflows — and it maps directly to the externalization framework you just learned.

Part	What it covers
1 · Context Architecture (1–5)	_MANIFEST.md structure, Global Instructions as your permanent OS, three persistent context files (about-me, brand-voice, working-style), folder instructions, deliberate context scoping
2 · Task Design (6–10)	Define end state not process, plan before execution, uncertainty handling, batch related work, subagents for parallel processing
3 · Automation (11–13)	/schedule for recurring tasks, externalize everything to files, combine scheduling with connectors for real automation
4 · Plugins & Skills (14–16)	Stack plugins for compound capability, build custom skills for repeatable workflows, Plugin Create for conversational plugin building
5 · Safety (17)	Treat Cowork like a powerful employee: backup protocols, folder isolation, prompt injection awareness, connector trust verification
6 · Continuity & QA (18–22)	decisions-log.md for session-to-session continuity, self-review steps, rollback protocols for file ops, incremental refinement, monthly system audits
7 · Workflow Intelligence (23–30)	Dedicated sandbox folder, Chat vs. Cowork decision matrix, pre-answer clarifying questions, mid-task steering, Excel↔PowerPoint cross-app, context rot via subagents, draft prompts in Chat first, always review before you ship

The core principle in one line: Invest in setup. Reduce prompting. Review before shipping. The people thriving with Cowork spent an afternoon on context architecture — and now write ten-word prompts that produce client-ready deliverables. Download the full guide: 30-cowork-best-practices.pdf

Checkpoint: You can map any Claude Code concept (CLAUDE.md, skills, MCP, hooks, subagents, worktrees) to its Cowork equivalent — and explain why they're the same underlying idea applied to different access permissions. You can describe what each agent loop step does for the same task across both surfaces.

Level 1 · Unaware · Module 04

● Why Teams Are Switching

⏱ 8 min

Why Teams Are Switching

The productivity case for agentic AI — with real numbers, real workflows, and the three unlocks that change how teams ship.

Objective

Understand the business case for agentic AI, be motivated to install, and know what changes in your daily workflow.

The productivity case for AI coding tools is real but contested. Anthropic has published internal adoption data. GitHub's Copilot studies show measurable PR velocity gains. Individual reports vary wildly — 2× for some workflows, near-zero for others. What the evidence consistently shows is that the gains are concentrated in specific task types: boilerplate, test generation, cross-file refactors. That's where this course focuses.

What Actually Changes

The shift isn't "AI writes code for you." The shift is in which tasks you attempt. Work that was too tedious, too cross-cutting, or too risky to start becomes feasible because the agent handles the drudgery while you handle the judgment:

Tasks you attempted but dreaded

Multi-file refactors, dependency upgrades, boilerplate migration, test coverage expansion. These go from "weekend project" to "20-minute delegation."

Tasks you wouldn't have attempted

Full security audits of a module, generating ADRs from git history, building internal documentation from code comments, exploratory analysis of unfamiliar codebases. Work you'd never have queued because the effort-to-value ratio was wrong — that's where the compounding value is.

Tasks that change shape

Code review shifts from "read every line" to "review the agent's plan, then spot-check the diff." Onboarding shifts from "read the wiki" to "clone the repo, run claude, ask questions." Debugging shifts from "reproduce manually" to "paste the stack trace, ask for a failing test."

The Three Unlocks

1. Plan Mode — Visible Reasoning

Makes the agent's Reason step explicit. You see the approach before any code is touched. Read-only exploration → approved plan → supervised execution. This is what makes delegation feel safe — you're not blindly trusting the agent, you're reviewing its thinking. In Cowork: the equivalent is instructing Claude to show its plan and ask questions before executing.

2. Subagents — Parallel Nested Loops

Delegate independent pieces of a task to isolated agents that each run their own perceive → reason → act → observe cycle. Your main context stays clean. In Code: Explore (read-only, Haiku), Plan (analytical, Sonnet), General-purpose (full tools). In Cowork: subagents process 10 documents in 5 minutes instead of 50 minutes sequentially, then synthesize the results.
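The arithmetic behind that "5 minutes instead of 50" figure can be sketched in a few lines. This is a back-of-the-envelope model only: it assumes every document takes the same time, that subagents never wait on each other, and it ignores the final synthesis step. The function and parameter names are made up for illustration and are not part of any Claude API.

```typescript
// Rough wall-clock estimate for fanning work out to subagents.
// Assumption: each item takes the same time and workers run fully in parallel.
function estimateMinutes(items: number, minutesPerItem: number, workers: number): number {
  if (items < 0 || minutesPerItem < 0 || workers < 1) {
    throw new Error("items and minutesPerItem must be >= 0, workers must be >= 1");
  }
  // Work proceeds in waves: each wave handles up to `workers` items at once.
  const waves = Math.ceil(items / workers);
  return waves * minutesPerItem;
}

const sequential = estimateMinutes(10, 5, 1); // one agent, one document at a time
const parallel = estimateMinutes(10, 5, 10);  // ten subagents, one document each
console.log(`sequential: ${sequential} min, parallel: ${parallel} min`);
```

With four workers instead of ten, the same model gives three waves, so the gain depends on how many subagents actually run at once.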

3. Skills, Hooks, & Plugins — Externalized Expertise

Turn tribal knowledge into shareable, reproducible workflows. Skills are slash commands. Hooks are guardrails. Plugins bundle everything into one-command installs. In Code: your team's code review process becomes /pr-review. In Cowork: domain plugins turn Claude from a generalist into a specialist for sales, legal, finance, or any function. New hires get the full toolkit on day one.

Cowork impact story: One user ran 45,000 rows of quarterly revenue data through the Data Analysis plugin. Within eight minutes, Claude identified a pricing anomaly costing $14,000/month that their data team had missed for two quarters. Another user automated their weekly 5-client status update workflow: one command (/client-updates) pulls from Notion, writes personalized updates, and stages them in Gmail drafts.

The Workflow Shift

Many developers who adopt Claude Code describe a similar progression:

Week 1

Claude Code is a sidebar while you code in the main editor. You use it for quick questions and small edits.

Week 2

You start delegating multi-file tasks. Plan mode becomes your default. You queue messages and check back.

Week 3

Claude becomes your primary interface. You peek at code only when reviewing diffs. The terminal is where you work; the editor is where you verify.

Week 4+

You build skills, configure hooks, and start sharing workflows with your team. The agent isn't a tool you use — it's a team member you direct.

Checkpoint: You can name two things agentic AI does that a chatbot cannot, explain the three unlocks (plan mode, subagents, skills/plugins), and describe how daily workflow changes over the first month.

Level 2 · Curious · Module 05

● Install & First Run

⏱ 15 min + setup

Install & First Run

Get Claude Code running in your terminal. Verify the setup. Meet /powerup — the built-in tutorial that teaches you while you work.

Objective

Install Claude Code, verify with claude doctor, authenticate, start your first session, and run /powerup.

System Requirements

macOS 10.15+ / Ubuntu 20.04+ / Windows 10+ (WSL or Git for Windows)

Node.js 18+ (for npm fallback only)

4 GB RAM minimum

Paid Claude account (Pro, Max, Team, or Enterprise)

Installation

Updated install method

The native binary is now the recommended install. NPM is officially deprecated. Homebrew is available for macOS users.

# Recommended: native binary (fastest, auto-updates)
curl -fsSL https://claude.ai/install.sh | bash

# macOS alternative via Homebrew
brew install --cask claude-code

# NPM (deprecated — migrate with `claude install`)
npm install -g @anthropic-ai/claude-code

Verify & Authenticate

# Check version
claude --version

# Authenticate (opens browser for OAuth)
claude auth login

# Verify everything works
claude doctor  # look for green checks

# Check auth status anytime
claude auth status

# Log out when needed
claude auth logout

First Session

Navigate to a real project folder (not an empty directory — Claude needs code to perceive), then start:

# Start your first session
cd your-project/
claude

# Claude will scan the directory structure
# and wait for your first prompt

Meet /powerup

/powerup is Claude Code's built-in interactive tutorial. Run it now, before continuing the course. Type /powerup at the Claude Code prompt and it launches a sequence of guided exercises directly in your terminal — you practice on real files, not mock examples.

What it covers: Prompting patterns, file navigation, plan mode, diff review, and basic context management. Each exercise takes 2–5 minutes. The full set runs about 20–30 minutes. You don't need to complete all of it before continuing — but do at least the first three exercises before moving to Module 6 (Context Management), because /powerup covers plan mode and diff review hands-on in a way this course's text cannot replicate.

When to use it: Once on first install, then again if you take a long break from Claude Code and want to re-calibrate. It's not a one-time thing — experienced users sometimes replay specific exercises when learning a new surface (VS Code extension, web, JetBrains).

# Launch the built-in tutorial — do this now
/powerup

Commands Introduced

claude · claude --version · claude doctor · claude auth login · claude auth status · claude auth logout · /powerup

Tooling Standard

Context	Command
Install Claude Code	`curl -fsSL https://claude.ai/install.sh | bash`
Project package manager	`npm` (or `pnpm` / `bun` — pick one, stay consistent)
Run in CI	`npx @anthropic-ai/claude-code -p "..."`
Bun fallback	`bun run src/index.ts <command>`

npm install is deprecated. Use the native binary above.

Cowork setup: Install the Claude Desktop app from claude.com. Create a dedicated workspace folder (not your Documents directory — use something like ~/claude-workspace/). Drop your context files (about-me.md, brand-voice.md, working-preferences.md) into it. Enable Cowork in the app settings. No terminal needed — everything is GUI-driven.

Checkpoint: claude doctor shows all green checks. claude auth status shows you're authenticated. You see the Claude Code prompt in your terminal. You've run /powerup and completed at least one lesson.

Level 2 · Curious · Module 06

● Prompts, Context & Housekeeping

⏱ 20 min

First Prompts, Context Management & Housekeeping

Put the agent loop into practice. Learn the everyday commands. Understand why sessions degrade and how to prevent it.

Objective

Accept your first AI-generated diff, manage session context with the four everyday commands, understand context rot, and choose the right model for each task.

Your First Prompt: Explore → Ask → Edit

This pattern maps directly to the agent loop from Module 02. You're guiding the Perceive → Reason → Act cycle step by step — training yourself to think in agent terms before you start delegating larger outcomes.

# Step 1: Perceive — let Claude read first
"Read the files in src/auth/ and explain the login flow."
# Claude explores the directory, reads files, builds context

# Step 2: Reason — verify understanding
"What's the difference between the JWT and session-based paths?"
# Claude analyzes, synthesizes, presents its understanding

# Step 3: Act — request a verifiable change
"Add input validation to the email field — show me the diff first."
# Claude proposes edits as a unified diff, waits for y/n

Why "show me the diff first"? This is your first habit to build. By asking Claude to show the diff before applying, you practice the oversight part of the loop. Accept with y, reject with n, or give feedback to iterate.

Four Everyday Commands

The first four below (/clear, /compact, /cost, /resume) are the context management primitives you'll use in every session; the remaining commands build on them. All of them map to the 🧠 Memory dimension of the externalization framework:

/clear — Fresh context

Wipe the current conversation context and start a new task in the same session. Use between unrelated tasks to prevent context bleeding. Your CLAUDE.md stays loaded — only the conversation resets.

/compact — Compress context

When a session grows long, /compact summarizes earlier conversation to free context space. Claude re-attaches the most recently invoked skills after compaction (up to 25K tokens combined, starting from the most recent). Use proactively before you hit context limits — don't wait for degradation.
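The "most recent first, up to a combined budget" rule can be pictured as a simple greedy selection. The sketch below is a toy model of that description, not Claude Code's implementation: the `InvokedSkill` shape is hypothetical, and stopping at the first skill that does not fit (rather than skipping it and trying older ones) is an assumption.

```typescript
// Hypothetical shape for an invoked skill; not a real Claude Code type.
interface InvokedSkill {
  name: string;
  tokens: number;      // size of the skill's instructions
  lastInvoked: number; // larger means more recent
}

// Pick skills to re-attach, newest first, until the combined budget is used up.
function selectSkillsToReattach(skills: InvokedSkill[], budget: number = 25000): string[] {
  const newestFirst = [...skills].sort((a, b) => b.lastInvoked - a.lastInvoked);
  const kept: string[] = [];
  let used = 0;
  for (const skill of newestFirst) {
    // Assumption: stop at the first skill that would exceed the budget.
    if (used + skill.tokens > budget) break;
    kept.push(skill.name);
    used += skill.tokens;
  }
  return kept;
}

const reattached = selectSkillsToReattach([
  { name: "release-notes", tokens: 8000, lastInvoked: 1 },
  { name: "db-migrate", tokens: 10000, lastInvoked: 2 },
  { name: "pr-review", tokens: 12000, lastInvoked: 3 },
]);
console.log(reattached); // the two newest skills fit; the oldest does not
```

The practical takeaway is the same as in the text: a skill you invoked long ago may not come back after compaction, so re-invoke it if you still need it.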

/cost — Check spend

Shows token usage and estimated cost for the current session. A typical session consumes 50K–200K input tokens and 10K–50K output tokens. Haiku session: $0.10–$0.45. Sonnet: $0.25–$1.00. Opus: $0.50–$2.25. Check periodically, especially on complex tasks.
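If you want to sanity-check what /cost reports, the underlying arithmetic is just tokens times price. Here is a minimal sketch; the per-million-token rates are placeholders chosen for illustration, not current Anthropic pricing, and the model ignores prompt caching, so the result will not necessarily match the ranges quoted above.

```typescript
// Per-million-token prices in USD. Placeholder values for illustration only;
// substitute the rates that apply to your account.
interface Rates {
  inputPerMTok: number;
  outputPerMTok: number;
}

function estimateSessionCost(inputTokens: number, outputTokens: number, rates: Rates): number {
  const cost =
    (inputTokens / 1000000) * rates.inputPerMTok +
    (outputTokens / 1000000) * rates.outputPerMTok;
  return Math.round(cost * 100) / 100; // round to cents
}

const placeholderRates: Rates = { inputPerMTok: 3, outputPerMTok: 15 };
const lowEstimate = estimateSessionCost(50000, 10000, placeholderRates);   // light session
const highEstimate = estimateSessionCost(200000, 50000, placeholderRates); // heavy session
console.log(`estimated range: $${lowEstimate} to $${highEstimate}`);
```

Notice that output tokens dominate quickly at these placeholder rates, which is one reason verbose responses on trivial tasks are worth avoiding.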

/resume — Session recovery

Reopen a previous session. The picker shows your last 50. Useful when you closed the terminal mid-task, or when you want to continue yesterday's work with the context intact.

/btw <question> — Side questions at zero context cost

Ask a meta-question without polluting the main conversation. /btw what's the convention for error handling in this project? — Claude answers using the parent prompt cache, and the exchange never enters conversation history. This is one of the most token-efficient features and a primary defense against context rot.

/context — Visualize context usage

Displays a colored grid showing exactly where your context budget is going — conversation, CLAUDE.md, tool schemas, memory, skills. Run this when costs spike or quality drops. The fastest cost/quality win is usually finding the single biggest offender (often a stale conversation or an oversized MCP result) rather than micro-optimizing everywhere.

/branch / /rewind — Conversation management

/branch [name] forks the conversation — try an approach without losing the main thread. /rewind reverts to a previous checkpoint. Use when you want to explore two approaches and compare results.

/recap — Session summary

Shows a recap of the current session's context — what was discussed, what decisions were made, what files were changed. Useful when resuming after a break.

Context Rot — Why AI Gets Worse the Longer You Chat

As your conversation grows, older context gets compressed or dropped entirely. Claude's behavior drifts from your original instructions. This is context rot — the agent equivalent of memory loss. It affects both Claude Code and Cowork sessions. Understanding it is essential for reliable results.

How to detect it

Claude starts ignoring rules from CLAUDE.md. It repeats patterns you corrected earlier. It "forgets" architectural decisions from the start of the session. Output quality drops noticeably. These are all signs of context rot.

What Context Rot Looks Like in Practice

This is an annotated transcript from a real 40-turn session. The same kind of request is made twice: at turn 3, with the logging convention stated in the prompt, and at turn 38, without restating it. The outputs are different.

Turn 3 · Session start · context clean

Add error handling to the saveTask function. Use console.error, not console.log — that's the project convention.

Claude · turn 3

I'll add error handling using console.error, per the convention you specified. [Correct diff: uses console.error, matches style guide.]

Turn 38 · 35 turns of unrelated work later · context compressed

Add error handling to the listTasks function.

Claude · turn 38 — context rot in effect

I'll add error handling to listTasks. [Diff uses console.log — the console.error instruction from turn 3 has been compressed out of context and is no longer visible to the model.]

Claude is not malfunctioning — the instruction from turn 3 is genuinely no longer in its context window. This is context rot: information loss through compression, not a bug.
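A toy model makes the mechanism concrete. The sketch below treats the context window as a fixed token budget: pinned items are always present, and conversation turns are kept newest-first until the budget runs out. It is a deliberate simplification (real compaction summarizes rather than simply dropping turns), and every name and number in it is invented for illustration.

```typescript
interface Turn {
  id: number;
  text: string;
  tokens: number;
}

// Returns what the model can "see": pinned items plus as many recent turns as fit.
function visibleContext(pinned: Turn[], conversation: Turn[], budget: number): Turn[] {
  let remaining = budget - pinned.reduce((sum, t) => sum + t.tokens, 0);
  const kept: Turn[] = [];
  for (const turn of [...conversation].reverse()) {
    if (turn.tokens > remaining) break; // older turns fall out of view
    kept.unshift(turn);
    remaining -= turn.tokens;
  }
  return [...pinned, ...kept];
}

function mentions(context: Turn[], phrase: string): boolean {
  return context.some((t) => t.text.includes(phrase));
}

const rule: Turn = { id: 3, text: "Use console.error, not console.log", tokens: 50 };
const filler: Turn[] = Array.from({ length: 35 }, (_, i) => ({
  id: i + 4,
  text: "unrelated work",
  tokens: 1000,
}));
const windowTokens = 20000;

// Rule stated only in conversation: it is the oldest turn, so it drops out.
const inChat = visibleContext([], [rule, ...filler], windowTokens);
// Same rule pinned, which is the role CLAUDE.md plays: it is always present.
const pinnedRule = visibleContext([rule], filler, windowTokens);

console.log(mentions(inChat, "console.error"));     // false
console.log(mentions(pinnedRule, "console.error")); // true
```

In this model the rule survives only when it lives outside the conversation, which is the intuition behind putting critical conventions in CLAUDE.md.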

The same session, with /compact at turn 20

Run /compact at turn 20 before starting a new phase of work. Claude summarizes the conversation history into a dense representation — the console.error rule survives because it was referenced multiple times. Turn 38 now produces the correct diff. Same session, same instruction, completely different outcome.

How to prevent it

Five strategies, in order of effectiveness:

CLAUDE.md — critical rules survive compaction because they're re-loaded from disk, not from conversation history

/compact proactively — run it at natural task boundaries, not after you notice drift

Subagent delegation — subagents start with fresh context; complex sub-tasks never pollute the parent

Session splitting — use /clear between unrelated tasks so early instructions don't crowd later ones

Hooks — deterministic behavior (format on save, block dangerous commands) doesn't depend on context at all

Message Queuing

You can type and send multiple follow-up messages while Claude is still working on the current task. Claude queues them intelligently — if it needs your feedback, it pauses before processing queued messages. If it finishes a task cleanly, it picks up the next one.

The workflow

Queue your tasks: "Add comments to this module." "Also fix the lint errors." "And update the README." Then go about your day. Claude processes them sequentially when it's ready. Check back later — in many cases, you'll return to a batch of completed work. But check periodically because it might need your input.
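One way to picture the queuing behavior is as a loop that works through messages in order and stops at the first one that needs you. This is a hypothetical model of the description above, not how Claude Code is implemented; the `needsInput` flag stands in for Claude deciding it needs your feedback.

```typescript
// Hypothetical task shape for the sketch.
interface QueuedTask {
  prompt: string;
  needsInput: boolean;
}

interface QueueResult {
  completed: string[];
  pausedOn: string | null;
  pending: string[];
}

function drainQueue(queue: QueuedTask[]): QueueResult {
  const result: QueueResult = { completed: [], pausedOn: null, pending: [] };
  for (const task of queue) {
    if (result.pausedOn !== null) {
      result.pending.push(task.prompt);   // still waiting behind the paused task
    } else if (task.needsInput) {
      result.pausedOn = task.prompt;      // stop and wait for the user
    } else {
      result.completed.push(task.prompt); // finished cleanly, move to the next one
    }
  }
  return result;
}

const queueState = drainQueue([
  { prompt: "Add comments to this module.", needsInput: false },
  { prompt: "Also fix the lint errors.", needsInput: true },
  { prompt: "And update the README.", needsInput: false },
]);
console.log(queueState);
```

In this example the README update never starts, because the lint task is waiting on you. That is why checking back periodically matters.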

Model Selection

Claude Code defaults to the best model for your plan tier. You can override per-session:

# Use a specific model
claude --model claude-sonnet-4-6

# Check current model
/config

Decision rule:
Exploration → Haiku — fast, cheap, great for Explore subagents and codebase mapping.
Daily coding → Sonnet — the default. Best balance of speed, quality, and cost.
Hard reasoning → Opus — architecture decisions, security analysis, complex debugging.
Complex refactoring → opusplan — Opus plans the approach, Sonnet executes the changes. Best of both: deep reasoning for the plan, fast execution for the edits.
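If you script your sessions, the decision rule above reduces to a lookup table. The sketch below encodes it directly; the task categories and the lowercase model labels are names chosen for this example, so check what your install accepts before passing them to --model.

```typescript
type TaskKind = "exploration" | "daily-coding" | "hard-reasoning" | "complex-refactoring";

// Labels follow the decision rule in the text; treat them as placeholders.
const MODEL_FOR_TASK: Record<TaskKind, string> = {
  "exploration": "haiku",
  "daily-coding": "sonnet",
  "hard-reasoning": "opus",
  "complex-refactoring": "opusplan",
};

function pickModel(task: TaskKind): string {
  return MODEL_FOR_TASK[task];
}

console.log(pickModel("complex-refactoring"));
```

Using a `Record` keyed by the union type means the compiler complains if you add a task category and forget to assign it a model.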

Effort Levels

Independent of model selection, /effort controls how hard Claude thinks on each turn:

# Set effort for the current session
/effort low      # quick renames, formatting — minimal reasoning
/effort medium   # default — routine development
/effort high     # complex debugging, architecture decisions
/effort xhigh    # deep agentic reasoning on Opus — hardest problems
/effort auto     # Claude picks based on task complexity

/effort low is often enough for "rename this variable everywhere" work. /effort xhigh on Opus is extraordinary for complex agentic coding but burns tokens fast. Match effort to the actual difficulty of the task.

Model × Effort Decision Matrix

Model	Low	Medium	High	XHigh
Haiku	Renames, grep	Explore agent	—	
Sonnet	Formatting	Daily coding ★	Multi-file edits	Cost-heavy
Opus	Overkill	Quick reviews	Architecture ★	Hardest problems ★
opusplan	—	Refactoring ★	Complex migration	

★ = recommended sweet spot for that model tier

Commands Introduced

/clear · /compact · /cost · /resume · /btw · /context · /branch · /rewind · /recap · /effort · /help · /config · --model · opusplan

Troubleshooting & Recovery

Things will go wrong. Here's what to do when they do:

Claude loops on the same error

It's stuck in an Act → Observe → Act cycle that isn't converging. Press Esc to interrupt. Give explicit direction: "Stop trying that approach. Instead, try X." If that fails, /clear and restate the goal with more constraints.

Output quality collapses mid-session

This is context rot in action. Claude stops following CLAUDE.md rules, repeats mistakes, or produces generic output. Run /compact immediately. If quality doesn't recover, start a fresh session — use /resume on the old one to check what was accomplished, then open a new session for the remaining work.

A subagent gets stuck or hangs

Check the /agents Running tab. If the agent is looping, Ctrl+F kills all background agents. Re-launch with a more constrained prompt or a lower maxTurns to prevent infinite loops.

Costs spike unexpectedly

Check /cost immediately. Common causes: running Opus on trivial tasks, not compacting long sessions, re-reading the entire codebase repeatedly. Switch to Sonnet or Haiku, compact, and scope your prompts tighter.

A hook breaks the workflow

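A PreToolUse hook that exits with a blocking status (exit code 2 in Claude Code's hook contract; other non-zero codes are reported but do not block) stops every tool call it matches. If you can't fix it in-session, edit .claude/settings.json directly to remove the broken hook, then restart Claude.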

Cost Management Strategies

Agentic workflows consume significantly more than chat. Here's how to stay efficient:

Use Haiku subagents for exploration

The Explore agent runs on Haiku by default — ~10x cheaper than Opus. Use context: fork + agent: Explore for any read-only investigation before committing to an Opus task.

Use opusplan for refactoring

Opus plans the approach (deep reasoning), Sonnet executes the edits (fast, cheap). You get Opus-quality architecture decisions at Sonnet-level execution costs.

Compact proactively

Don't wait for context rot — a bloated context means you're paying for redundant tokens on every turn. Compact early, especially before starting a new phase of work within the same session.

Use @file references instead of re-reading

Instead of asking Claude to "read all files in src/", point it at specific files: @src/auth/login.ts. Narrower perception = fewer tokens = lower cost.

Split long sessions

One 2-hour session with 500K context costs more per turn than four 30-minute sessions with 100K context each. Split when you cross task boundaries.
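The reason splitting helps is that every turn re-sends the accumulated context, so total input grows roughly with the square of session length. A minimal sketch of that arithmetic follows, assuming a fixed amount of context is added per turn and ignoring prompt caching and compaction. The turn counts and the 5,000-token growth figure are illustrative, not measurements.

```typescript
// Total input tokens across a session, assuming every turn re-sends the whole
// context and the context grows by a fixed amount per turn.
function totalInputTokens(turns: number, baseTokens: number, growthPerTurn: number): number {
  let total = 0;
  for (let turn = 1; turn <= turns; turn++) {
    total += baseTokens + turn * growthPerTurn;
  }
  return total;
}

const oneLong = totalInputTokens(100, 0, 5000);      // context reaches 500K by the last turn
const fourShort = 4 * totalInputTokens(25, 0, 5000); // each session tops out at 125K
console.log(`one long session: ${oneLong} tokens, four short sessions: ${fourShort} tokens`);
```

Under these assumptions the same 100 turns cost 25.25M input tokens in one session versus 6.5M across four, close to a 4x difference before caching is taken into account.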

--max-budget-usd N — Hard cost cap

The most effective cost control. Set a dollar limit per session: claude --max-budget-usd 5. The session stops when the cap is reached. Essential for headless/CI runs where there's no human watching spend.

Prefer Grep/Glob over Read for discovery

Full file reads are expensive. When searching for patterns, use Grep and Glob first to narrow down, then Read only the files that matter. This can reduce Perceive-step tokens by 10x on large codebases.

--bare for scripted one-shots

Skips loading plugins, hooks, and skills on startup. Faster and cheaper for simple headless tasks where you don't need the full harness.

ENABLE_PROMPT_CACHING_1H=1

Enables 1-hour prompt cache TTL for API-key users. Subscription users get this automatically. Dramatically reduces cost on repeated turns — the system prompt and conversation prefix are cached instead of re-processed. Avoid switching models mid-conversation, which invalidates the cache.

Cowork parallel: Cowork sessions also start fresh every time — no cross-session memory. Your context files are the equivalent of CLAUDE.md — they give the agent your history, preferences, and rules on every launch. Usage burns faster than chat; check Settings → Usage regularly. Batch related tasks into single sessions to stay efficient.

Model selection in Cowork: Cowork uses Sonnet by default for most tasks. Opus 4 is available on Max and Enterprise plans; select it in Settings → Model for complex multi-step work (large research tasks, long document synthesis, agent-heavy workflows). Haiku is used internally for subagent steps when Cowork decomposes a task into parallel parts. You don't pick Haiku directly: Cowork chooses it automatically for read-only subtasks to keep costs low.

Checkpoint: You've accepted your first AI-generated diff. You can explain context rot and name three strategies to prevent it. You know when to use /compact vs. subagent delegation vs. /clear. You can choose the right model for a given task.

Level 2 · Curious · Module 07

● Claude Code in VS Code

⏱ 10 min

Claude Code in VS Code

Bring the agent into the editor where you already work. No terminal switching, no copy-paste — Claude sees what you see.

Objective

Install the VS Code extension, fix a real bug using it, and understand when to use the editor vs. the terminal.

Setup

Step 1: Install

Open VS Code → Extensions (Cmd+Shift+X on Mac, Ctrl+Shift+X on Windows/Linux) → search "Claude Code" → Install. The extension shares your terminal authentication, so no separate login is needed.

Step 2: Launch

Cmd+Esc (Mac) or Ctrl+Esc (Windows/Linux) opens the Claude Code panel inside your editor. Claude immediately sees your currently selected code, open tabs, file diagnostics (errors, warnings), and workspace structure.

What Claude Sees in the Editor

Selected code — highlight a function, Claude knows what you're pointing at

Open tabs — Claude uses these as context about what you're working on

File diagnostics — TypeScript errors, ESLint warnings, test failures

Workspace structure — directory tree, package.json, configs

Git state — current branch, staged changes, recent commits

Hands-On Task — Fix a Bug Without Leaving the Editor

Do this now. It takes under 5 minutes and gives you a concrete feel for editor-mode context.

Step 1 — Open the starter repo in VS Code

Open the claude-code-course-starter folder. Navigate to src/commands/done.ts. This file handles the done <id> command.

Step 2 — Select the done function body

Highlight the function from its opening brace to its closing brace. You're telling Claude exactly what to focus on — selected code is weighted more heavily than open tabs.

Step 3 — Launch Claude Code (Cmd+Esc / Ctrl+Esc) and ask:

"What happens if the user passes a non-numeric ID like 'abc'? Is that case handled? If not, propose a fix."

Step 4 — Review the inline diff

Claude proposes a change rendered as a diff in the editor panel. Read the diff. If the fix is correct, accept it. If you disagree with the approach, type a follow-up: "Add the validation but keep the error message consistent with the existing style."

Step 5 — Run the tests without leaving VS Code

Open the VS Code terminal panel (Ctrl+`) and run npm test. If the test for done was already covering this case, it will still pass. If not, ask Claude to add a test for the invalid-ID path.

What you just experienced: Claude had selected code + open tabs + file diagnostics as context — no terminal copy-paste, no switching windows. The editor surface works best for exactly this: a focused question about visible code, answered and applied without breaking your flow.
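For reference after you have tried the exercise, here is one shape the proposed fix might take. It is a sketch under assumptions, not the starter repo's actual code: `parseTaskId` and `ParseResult` are hypothetical names, and it assumes task IDs are positive integers starting at 1.

```typescript
// Hypothetical helper; the real done.ts may be structured differently.
type ParseResult = { ok: true; id: number } | { ok: false; error: string };

function parseTaskId(raw: string): ParseResult {
  const trimmed = raw.trim();
  // Accept only plain positive integers: "3" passes; "abc", "1.5", "-2", "0", and "" do not.
  if (!/^[1-9][0-9]*$/.test(trimmed)) {
    return { ok: false, error: `Invalid task id "${raw}": expected a positive integer.` };
  }
  const id = Number(trimmed);
  if (!Number.isSafeInteger(id)) {
    return { ok: false, error: `Invalid task id "${raw}": number is too large.` };
  }
  return { ok: true, id };
}

console.log(parseTaskId("abc"));
console.log(parseTaskId("3"));
```

Returning a result object instead of throwing keeps the decision about how to report the error (message wording, exit code) in the command handler, which makes it easier to keep the message consistent with the existing style.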

Terminal vs. Editor: When to Use Which

Use the terminal when...

Multi-file refactors, plan mode explorations, subagent delegation, running tests, git operations, large-scope tasks where you want full context control.

Use the editor when...

Quick fixes on visible code, debugging with selected context, reviewing and editing in-place, when you want to stay in your normal editor flow without switching windows.

Other Surfaces

JetBrains IDEs

A Claude Code plugin is also available for JetBrains IDEs, including IntelliJ IDEA, PyCharm, WebStorm, CLion, and Rider. Install from JetBrains Marketplace → search "Claude Code". It uses the same keyboard shortcut (Cmd+Esc / Ctrl+Esc) and the same context model: Claude sees your selected code, open files, and IDE diagnostics. If you already did the hands-on task above in VS Code, the JetBrains experience is nearly identical in practice.

Desktop App & Web (claude.ai/code)

Both give you Claude Code without a terminal or IDE. The Desktop App is a standalone window; the web version runs in your browser. Both surfaces have the same session management and slash commands. Neither has direct file system access to your project by default — you paste code or upload files. Use these for quick questions or when you're away from your dev machine, not for multi-file work.

Mobile (iOS / Android)

There is no Claude Code mobile client. If you see references to "iOS monitoring" or "Dispatch triggering" from mobile in other resources, those refer to Cowork's Dispatch feature (text a task from your phone → Cowork runs it on your desktop), not Claude Code. Claude Code requires a terminal or IDE and runs on your computer.

Shortcuts Introduced

Cmd+Esc / Ctrl+Esc (launch from editor)

Checkpoint: You completed Steps 1–5 above: opened done.ts, selected a function, asked Claude a specific question, reviewed the diff, and ran the tests — all without leaving VS Code. You can describe one scenario where you'd use the editor instead of the terminal.

Level 2 · Curious · Module 08

● Plan Mode & Diff Review

⏱ 15 min

Plan Mode & Reviewing Diffs

Think before you type. Review before you accept. These two skills separate productive Claude Code users from frustrated ones.

Objective

Use plan mode for multi-file tasks, evaluate AI-generated diffs critically, and know when each of the three modes is appropriate.

The Three Modes

Shift+Tab cycles through three modes. Each maps to a different part of the agent loop:

Default Mode

Claude reads, reasons, and acts — the full loop. It proposes diffs and waits for your approval. Best for: focused tasks where you trust the scope. Agent loop: Perceive → Reason → Act → you approve.

Accept Edits Mode

Claude applies changes automatically without waiting for diff approval. Best for: low-risk tasks like formatting, renaming, or adding comments where you've already reviewed the plan. Agent loop: full loop with auto-accept. Use with caution — pair with hooks for guardrails.

Plan Mode

Claude reads and reasons but does not act. It explores the codebase, analyzes the problem, and presents a detailed plan — then stops. You review the plan, give feedback, and only then switch to Default mode for execution. Best for: anything touching 3+ files, unfamiliar codebases, architectural changes, or any task where the wrong approach wastes more time than planning. Agent loop: Perceive → Reason → stop.

Auto Mode — Intelligent Autonomous Operation

For experienced users who want faster flow without giving up all safety:

claude --auto or config set autoMode true

Auto mode uses an AI classifier to approve routine, low-risk actions (file reads, grep, listing directories) automatically while still prompting you for high-risk actions (file writes, shell commands, git operations). It's the sweet spot between full manual approval and no approval at all — you get faster flow on safe operations without losing oversight on dangerous ones.
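The approval policy described above can be pictured as a mapping from action type to decision. The sketch below is an illustrative stand-in only: the real feature is described as a model-based classifier, not a lookup table, and the action names here are invented for the example.

```typescript
// Invented action names; not Claude Code's internal tool identifiers.
type AgentAction =
  | "read_file"
  | "grep"
  | "list_dir"
  | "write_file"
  | "shell_command"
  | "git_operation";

type Decision = "auto-approve" | "ask-user";

// Reads are approved automatically; anything that changes state asks first.
const POLICY: Record<AgentAction, Decision> = {
  read_file: "auto-approve",
  grep: "auto-approve",
  list_dir: "auto-approve",
  write_file: "ask-user",
  shell_command: "ask-user",
  git_operation: "ask-user",
};

function decide(action: AgentAction): Decision {
  return POLICY[action];
}

console.log(decide("grep"), decide("write_file"));
```

A table like this is also a useful way to write down your own expectations before enabling auto mode: if an action you consider risky is being approved without a prompt, that is a signal to add a hook or drop back to Default mode.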

When to use auto mode: Daily development on projects you know well, where you trust the codebase and have hooks configured as guardrails. Pair with a PostToolUse hook that runs tests after edits — auto mode handles the routine approvals, hooks catch the mistakes.

When NOT to use auto mode: Unfamiliar codebases, production-critical changes, anything involving secrets or credentials, or when working without hooks configured. In these cases, Default or Plan mode gives you the oversight you need.

When to Use Plan Mode

Multi-file changes (refactors, migrations, renames)

Unfamiliar codebases (the plan reveals Claude's understanding)

Architectural decisions (you want to review the approach, not just the code)

High-risk changes (auth, payments, data access)

When you'd normally sketch on a whiteboard first

How to Review Diffs

Use /diff for an interactive diff viewer — shows both uncommitted changes and per-turn changes so you can see exactly what Claude modified at each step. This is your most important quality gate:

1. Read the intent, not just the lines

Before looking at code changes, check: does the plan match what you asked for? If the approach is wrong, reject the plan before diffs are generated. Don't waste time reviewing code built on a flawed premise.

2. Check what was removed

Deletions are where bugs hide. AI may remove error handling, logging, edge-case branches, or fallback paths it considers "unnecessary." Always scan deletions carefully — they're easier to miss than additions.

3. Verify the test story

If Claude edited business logic but didn't touch tests, ask: "Did this change break any tests? Are there tests that should be updated?" Better yet, add a hook (Module 15) that auto-runs tests after every edit.

4. Check for hallucinated APIs

Claude may call functions, methods, or library features that don't exist in your project. Cross-reference any unfamiliar API calls. This is especially common with newer libraries or internal utilities.

5. Use partial acceptance

You can accept some files and reject others. Claude will iterate on the rejected portions while keeping the accepted changes. This is much faster than rejecting everything and rephrasing.
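The "check what was removed" step is easy to partially automate. Here is a minimal sketch that scans a unified diff for deleted lines that look like error handling or logging. The keyword list is an arbitrary starting point rather than an exhaustive check, and the sample diff is made up for the example.

```typescript
// Keywords that often mark error handling, logging, or fallback paths.
const RISKY = /\b(try|catch|throw|finally|console\.error|fallback|retry)\b/;

// Returns removed lines from a unified diff that match a risky keyword.
function riskyDeletions(unifiedDiff: string): string[] {
  return unifiedDiff
    .split("\n")
    .filter((line) => line.startsWith("-") && !line.startsWith("---")) // removed lines, minus file headers
    .map((line) => line.slice(1).trim())
    .filter((line) => RISKY.test(line));
}

const sampleDiff = [
  "--- a/src/storage.ts",
  "+++ b/src/storage.ts",
  "@@ -10,7 +10,3 @@",
  " export function saveTask(task: Task) {",
  "-  try {",
  "-    write(task);",
  "-  } catch (err) {",
  "-    console.error(err);",
  "-  }",
  "+  write(task);",
  " }",
].join("\n");

console.log(riskyDeletions(sampleDiff));
```

A script like this does not replace reading the diff. It only points your attention at the deletions most likely to matter, such as the try/catch that disappeared in the sample.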

Shortcuts Introduced

Shift+Tab (cycle modes) · Esc (interrupt Claude mid-action) · y/n (accept/reject diffs)

Cowork plan review: Add to your global instructions: "DO NOT start working yet. First, show me your plan and ask clarifying questions. Only begin once we've aligned." Cowork also has a built-in AskUserQuestion feature — when it needs more information, it generates a structured form with specific options rather than guessing. This is Cowork's equivalent of plan mode: make the Reason step visible before the Act step begins.

Checkpoint: You can name the three modes, explain when each is appropriate, and list five things to check when reviewing an AI-generated diff. You've used plan mode on a multi-file task.

Level 2 · Curious · Module 09

● Prompting Patterns

⏱ 20 min

Prompting Patterns That Work

Universal patterns for any agent — Code or Cowork. These aren't chatbot tricks; they're ways of guiding the agent loop reliably.

Objective

Rewrite a vague prompt into a constrained, verifiable, outcome-based delegation. Use XML tags for complex tasks. Know the common anti-patterns.

Pattern 1: Delegate Outcomes, Not Steps

This is the single most important shift for agentic tools. Stop telling the agent how to work. Describe what done looks like. ChatGPT trained you to write detailed step-by-step prompts. Agentic tools train you to give clear outcomes. One skill depreciates as models improve; the other compounds.

❌ Step-by-step (chatbot thinking)

"First, read the auth module. Then find the login function. Then check if there's validation. Then add email validation using regex. Then add a test."

✅ Outcome-based (agent thinking)

"Refactor the auth module to use Zod for all input validation. All existing tests must pass. Follow the patterns in @src/validation/schemas.ts. Add tests for edge cases (empty string, malformed email, SQL injection attempt). Run tests before committing."

The second version tells Claude what done looks like (Zod validation, passing tests, edge cases covered) without dictating the steps. Claude chooses the best path through the agent loop.

Pattern 2: Context First

Feed the Perceive step before triggering Act. The more Claude understands before acting, the better its first attempt.

# Good: perceive first, then act
"Read src/api/ and understand the routing pattern, then add a new /health endpoint following the same pattern."

# Bad: act without context
"Add a /health endpoint."  # Claude guesses at your routing patterns

Pattern 3: Constrain the Scope

Limit the agent's Act surface. Unbounded prompts produce sprawling, hard-to-review changes that trigger context rot.

"Only modify files in src/auth/ — don't touch anything else."
"Fix the bug but don't refactor surrounding code."
"Add the feature without changing the database schema."

Pattern 4: Use XML Tags for Complex Delegations

Claude is specifically trained to recognize XML tags as structural markers. For multi-component tasks, tags prevent Claude from mixing up what's background context, what's the actual task, and what the guardrails are.

<context>
Series B SaaS startup, $8M ARR.
Board meeting next Tuesday.
Three non-technical board members.
</context>

<instructions>
Build a board deck with financial summary,
product update, and hiring pipeline.
Save as .pptx with our brand template.
</instructions>

<constraints>
- Professional but not stiff tone
- No jargon the non-technical board members won't know
- Every slide needs a "so what" — not just metrics
- Maximum 15 slides
</constraints>

Tag names are flexible — <background>, <rules>, <data>, <output_format>, whatever makes sense for your task. Consistency within a project matters more than naming convention. This works equally well in Claude Code prompts, CLAUDE.md files, skills, and Cowork instructions.
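If you assemble prompts programmatically, the tag structure above can be generated rather than typed. The following is a minimal sketch, not part of any Claude tooling; `build_prompt` and the section names are hypothetical, chosen to mirror the example above.

```python
def build_prompt(sections: dict[str, str]) -> str:
    """Wrap each section body in a matching pair of XML-style tags."""
    blocks = []
    for tag, body in sections.items():
        # Opening and closing tag names must match so the sections stay unambiguous.
        blocks.append(f"<{tag}>\n{body.strip()}\n</{tag}>")
    return "\n\n".join(blocks)


prompt = build_prompt({
    "context": "Series B SaaS startup, $8M ARR.\nBoard meeting next Tuesday.",
    "instructions": "Build a board deck with financial summary and product update.",
    "constraints": "- Maximum 15 slides\n- No jargon",
})
print(prompt)
```

Keeping the tag names in one dictionary also makes it easy to stay consistent across a project, which is the property that matters most.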

Pattern 5: Front-Load Clarification, Not Correction

Spend 30 seconds aligning before the agent works. Don't spend 10 minutes correcting after.

"DO NOT start working yet. First, ask me clarifying questions so we can define the approach together. Only begin once we've aligned."

Add this to the end of any non-trivial prompt. Claude will ask specific questions about scope, approach, and constraints before executing. In Cowork, this triggers the AskUserQuestion feature — structured forms with options instead of confident guessing.

Pattern 6: Request Verification

Strengthen the Observe step — don't leave it to chance.

"After making the change, run the tests and show me the output."
"Verify that the build succeeds before suggesting a commit."
"Check if this change breaks any existing imports."

Pattern 7: Batch Related Tasks

Instead of three separate sessions, batch related work into one: lower usage, better context coherence, and more efficient agent loops.

# Claude Code: one session, three related tasks
"Read all files in src/api/. Fix lint errors. Add missing JSDoc comments. Run tests. Commit with a descriptive message."

# Cowork equivalent: one session, three related tasks
"I have receipts in /receipts. Organize by month into subfolders. Create an expense spreadsheet with category totals and formulas. Write a one-page summary for my accountant. Save everything in /outputs."

Pattern 8: Iterate, Don't Restart

If the first attempt is 80% right, say what's wrong rather than rephrasing from scratch. Restarting wastes all the context the agent built. Iteration preserves it.

"Good, but keep the original error message — don't change that string."
"The logic is right but use early returns instead of nested ifs."
"This is close — also handle the null case on line 47."

Pattern 9: Visual Input — Paste Images Directly

Claude Code can process images pasted directly into the terminal. This is a general Perceive capability, not just for debugging:

Screenshots of UI bugs

"This button is overflowing. Fix the CSS." Paste the screenshot — Claude sees the visual problem and the code simultaneously.

Architecture diagrams

Paste a whiteboard photo or Mermaid diagram screenshot. "Implement this architecture. Start with the data flow layer."

Error screenshots

Paste a browser console screenshot, a terminal error, or a Sentry alert. Faster than copying text — and Claude sees formatting, colors, and context you'd lose in a paste.

Design mockups

Paste a Figma export or hand-drawn wireframe. "Build this component in React. Match the spacing and colors exactly."

Common Anti-Patterns

Too vague

"Make the code better." — Claude doesn't know what "better" means to you. Be specific: "Reduce cyclomatic complexity of processOrder() by extracting validation into a separate function."

Too prescriptive

"On line 47, change x to userCount, then on line 48..." — You're doing Claude's job. Describe the goal: "Rename variables in processOrder() to be descriptive." Let the agent figure out how.

No verification

"Refactor this module." without "Run tests after." — You're trusting the Act step without an Observe step. Always request verification for non-trivial changes.

Restarting instead of iterating

Rephrasing the entire prompt when the first attempt was 80% right. You just threw away all the context Claude built. Say what's wrong with the 20%, not the 100%.

Checkpoint: You can rewrite a vague prompt into a constrained, outcome-based, XML-structured delegation — for either Code or Cowork. You can identify the four anti-patterns in someone else's prompt and fix them.

Level 3 · Fluent · Module 10

● CLAUDE.md, Memory & Setup

⏱ 25 min + hands-on

CLAUDE.md, .claude/memory/, & .claudeignore

Externalize codebase knowledge into three memory layers. Set up the project so every session starts smarter than the last.

Objective

Bootstrap a project with claude init, write an effective CLAUDE.md, configure .claudeignore for sensitive files, and understand the three memory layers.

🧠 Externalization: Semantic Memory

CLAUDE.md is the most important file in your project for Claude Code. It's durable policy — semantic memory loaded into every session and re-loaded after compaction. It's what you'd tell a new team member on day one, encoded once and reused forever.

claude init — Bootstrap from Your Codebase

Run claude init in any repo. Claude reads your codebase, infers stack, conventions, and patterns, then generates a starter CLAUDE.md. Edit it to be authoritative — this is policy, not a suggestion. Keep it under 200 lines. Use @filename references for deep context instead of pasting content.

CLAUDE.md Structure

The best CLAUDE.md files follow this structure — short, authoritative, convention-focused:

# Project: Acme API
## Stack
- TypeScript 5.4, Fastify 4, Zod 3, Prisma 5
- PostgreSQL 16, Redis 7

## Commands
- `pnpm test` — run all tests
- `pnpm test:watch` — watch mode
- `pnpm build` — production build
- `pnpm lint` — ESLint + Prettier

## Conventions
- Strict mode, no `any`. Use `unknown` + type guards.
- Zod for all request validation — never manual parsing.
- Tests in `__tests__/` adjacent to source files.
- Use the logger from `@/lib/logger`, never `console.log`.
- Errors: throw typed errors from `@/lib/errors`.

## Architecture
See @docs/architecture.md for system boundaries.
See @docs/adr/ for architecture decision records.

## Don't
- Modify files in `src/legacy/` without explicit approval.
- Add new dependencies without checking `@package.json` first.
- Commit directly — always create a branch.
- Suggest a commit without running `pnpm test` first.
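The 200-line budget and the @filename references are easy to check mechanically. Below is a small sketch of such a check; `check_claude_md` is a hypothetical helper, and the regular expression is only an approximation of what counts as an @reference (it skips anything inside backticks and anything that looks like an email address).

```python
import re

MAX_LINES = 200  # line budget suggested in this module


def check_claude_md(text: str) -> dict:
    """Report the line count and any @path references found in CLAUDE.md text."""
    lines = text.splitlines()
    # "@" followed by a path-like token, not preceded by a word character or backtick.
    raw_refs = re.findall(r"(?<![\w`])@([\w./-]+)", text)
    refs = [ref.rstrip(".") for ref in raw_refs]  # drop sentence-ending periods
    return {
        "line_count": len(lines),
        "within_budget": len(lines) <= MAX_LINES,
        "references": refs,
    }


sample = """# Project: Acme API
## Architecture
See @docs/architecture.md for system boundaries.
See @docs/adr/ for architecture decision records.
"""
report = check_claude_md(sample)
print(report)
```

In a real repo you would read the file with `pathlib.Path("CLAUDE.md").read_text()` and could also verify that each referenced path exists.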

The Three Memory Layers

CLAUDE.md — Semantic Memory

You write it. Durable project policy. Loaded every session. Survives compaction. Rules, conventions, commands, architecture.

YOU WRITE

.claude/memory/ — Episodic Memory

Auto-managed. Claude writes here when it learns patterns, decisions, context. MEMORY.md index + per-topic files. Gitignored.

AUTO

Agent Memory — Per-Agent

Custom subagents with memory: user. Each maintains its own MEMORY.md in ~/.claude/agent-memory/. Persists across sessions.

PER-AGENT

Layer 1: CLAUDE.md — Semantic (you write it)

Durable project policy. Loaded every session. Survives compaction. Contains rules, conventions, commands, architecture decisions. Project-level: repo root. Personal-level: ~/.claude/CLAUDE.md. Use it for rules, not procedures (procedures belong in skills).

Layer 2: .claude/memory/ — Episodic (auto-managed)

Claude Code auto-manages a .claude/memory/ directory with a MEMORY.md index and per-topic files. Claude writes here when it learns something about your codebase: patterns it discovered, decisions you made, context it wants to remember. Add .claude/memory/ to .gitignore, since it's per-user state. This is separate from CLAUDE.md.

Layer 3: Agent memory — Per-agent (Module 17)

Custom subagents can have memory: user in their frontmatter. Each agent maintains its own MEMORY.md in ~/.claude/agent-memory/<agent>/. A security-reviewer agent remembers past findings; a doc-writer agent remembers your style. Covered in depth in Module 17.

.claudeignore — Keep Secrets Out of Context

Like .gitignore but for Claude's perception. Keep sensitive files, large binaries, and irrelevant directories out of the agent's view:

# .claudeignore
.env
.env.*
secrets/
*.pem
*.key
node_modules/
dist/
coverage/
# per-user, not shared
.claude/memory/
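To sanity-check a pattern list before relying on it, you can test paths against it yourself. The sketch below assumes gitignore-like semantics as described above and implements only a simplified subset (plain names, `*` wildcards, and single-directory patterns ending in `/`); it is not the matcher any tool actually uses, and `is_ignored` is a hypothetical helper.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Subset of the patterns above; nested patterns like ".claude/memory/" are out of scope here.
PATTERNS = [".env", ".env.*", "secrets/", "*.pem", "*.key",
            "node_modules/", "dist/", "coverage/"]


def is_ignored(path: str, patterns: list[str] = PATTERNS) -> bool:
    """Return True if a relative POSIX path matches any simplified ignore pattern."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything beneath a directory with that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif any(fnmatch(part, pattern) for part in parts):
            # File pattern: match against each path component.
            return True
    return False


for candidate in [".env.local", "secrets/api.txt", "certs/server.pem", "src/index.ts"]:
    print(candidate, is_ignored(candidate))
```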

The .claude/ Folder Structure

.claude/
├── settings.json       # team config (commit to git)
├── settings.local.json # personal overrides (gitignore)
├── CLAUDE.md           # or at repo root
├── rules/              # rule fragments loaded by path
│   ├── typescript.md   # loaded for .ts files
│   └── testing.md      # loaded for test files
├── skills/             # custom slash commands (Module 11)
├── agents/             # custom subagents (Module 17)
├── hooks/              # shell scripts for hooks
└── memory/             # auto-managed (gitignore)

.claude/rules/*.md — Modular rule fragments

Instead of putting everything in CLAUDE.md, you can split rules into separate files in .claude/rules/. Each file is a Markdown fragment loaded alongside CLAUDE.md. Use this for large projects where a single CLAUDE.md would exceed 200 lines, or when different rules apply to different parts of the codebase. Rules can be path-scoped — a rule about TypeScript conventions loads only when working in .ts files.

Commands Introduced

claude init · @filename (inline file reference) · .claudeignore

Cowork parallel: Your CLAUDE.md maps to Cowork's three context files: about-me.md (role + context), brand-voice.md (communication style + samples), working-preferences.md (behavior rules + guardrails). Both are semantic memory that compounds. Refine weekly; after a month, the quality difference is unmistakable.

Artifact Taxonomy

Artifact	What it is
Skill (`/skill-name`)	A markdown procedure file in `.claude/skills/`
Slash command	What you type to invoke a skill
Agent	A subagent with its own persona and memory, in `.claude/agents/`
Plugin	A packaged bundle of skills, agents, hooks, and MCP config

Real Session: Bootstrapping CLAUDE.md

Here is what a real setup session looks like. The annotation after each turn explains what is happening at that step.

You (terminal)

claude init

Claude reads your package.json, tsconfig.json, src/ structure, and existing test files. It infers stack, conventions, and patterns — this takes 10–20 seconds on a typical project.

Claude

I've analyzed your codebase. Generated CLAUDE.md with your TypeScript/Commander stack, npm commands, and src/commands/ pattern. Review and edit it — I'll treat it as authoritative policy from this session forward.

The generated file is a starting point, not finished. The next step is to make it authoritative.

You

Update CLAUDE.md: we never use console.log for errors — always console.error. And all file I/O goes through src/store/tasks.ts, never directly from commands.

You are encoding team conventions that a linter can't catch. These are the highest-value lines in CLAUDE.md — non-obvious rules Claude would otherwise violate.

Claude

Updated CLAUDE.md with two rules under ## Don't: "Use console.log for errors — use console.error" and "Read/write files directly from src/commands/ — all I/O goes through src/store/tasks.ts".

You

Add @docs/architecture.md as an @file reference in the Architecture section.

@file references load the target file into context on demand, without pasting its full content into CLAUDE.md. Keeps CLAUDE.md short; Claude fetches depth when it needs it.

Claude

Added See @docs/architecture.md for system boundaries. under ## Architecture. Claude will load that file whenever architecture context is relevant.

You

What are the conventions for this project?

This is the verification step. You're confirming Claude reads CLAUDE.md correctly before writing any code. A wrong answer here means your CLAUDE.md has an issue — fix it now, not after a broken diff.

Claude

From CLAUDE.md: TypeScript strict mode, no any. One file per command in src/commands/. All I/O through src/store/tasks.ts. Errors via console.error. Tests in __tests__/ mirroring src/ structure. New commands and features both require tests before the PR.

✓ Claude recited the rules correctly. CLAUDE.md is working. You can now start writing code with confidence that Claude will follow your project's conventions.

Checkpoint: Your repo has a CLAUDE.md (under 200 lines), a .claudeignore, and you can explain the three memory layers (CLAUDE.md = semantic, .claude/memory/ = episodic, agent memory = per-agent). You know which goes in git and which doesn't.

Foundation Lab A · Self-Study Guide Part 1 Module 2

📘 Advanced RAG & Knowledge Memory

⏱ MVP 3–5 hrs · Full 15–20 hrs

Advanced RAG & Security Knowledge Memory

Move beyond basic vector search into retrieval methods that preserve context, reduce hallucination, and give grounded citations.

Prerequisites: Comfortable with Python (list comprehensions, functions, imports). Familiarity with pip and virtual environments. No prior ML or NLP experience required — the lab explains embeddings from first principles. Libraries needed: sentence-transformers, numpy, rank-bm25 (all installable via pip). Sample documents are in labs/lab-a-samples/ in the starter repo — clone it before starting.

Time estimate — be honest with yourself: The MVP (load docs, chunk, embed, retrieve with cosine similarity) takes 3–5 hours if you know Python and have used ML libraries before. If you're new to embeddings or sentence-transformers, budget 6–8 hours for MVP. Full completion with hybrid BM25 + vector search, contextual compression, and citation grounding is 15–20 hours. The 10 sample documents in lab-a-samples/ are ready to use — your time goes into the pipeline code, not data preparation.

Module 10 teaches you to externalize knowledge into CLAUDE.md. This lab teaches you why knowledge retrieval matters by building a RAG system from scratch. The concepts — hybrid search, contextual compression, citation grounding — explain what happens under the hood when any agent retrieves context from files, MCP servers, or memory.

Learning Objectives

By the end of this lab you can

Explain the limitations of classic (naive) RAG

Implement hybrid search combining BM25 + vector retrieval

Apply parent-document retrieval for context preservation

Build a contextual compression pipeline

Add metadata filtering for domain-specific documents

Return answers with source citations and confidence notes

Topics Covered

Classic RAG vs. advanced RAG architectures

Chunking strategies (fixed, semantic, parent-child)

Embedding models (Gemini, OpenAI-compatible, local)

Vector databases (Qdrant, ChromaDB, or Pinecone)

BM25 keyword search (sparse retrieval)

Ensemble/hybrid retrieval with weighted fusion

Contextual compression (LLM-based extraction)

Citation-grounded answer generation

Metadata filtering (doc type, date, classification)

Evaluation: retrieval precision, answer faithfulness

The Lab: Cybersecurity Knowledge Assistant

Build a RAG pipeline that answers security questions from internal documents with source attribution. The system must handle multiple document types and return grounded, citation-backed answers.

Phase 1: Document Ingestion (~2 hrs)

Start with the 10 sample documents in labs/lab-a-samples/. For the full lab, grow the corpus to 20–30 documents: public APT write-ups, MITRE ATT&CK technique descriptions, sample incident timelines, SOC runbooks. Implement two chunking strategies (fixed-size with overlap, and semantic/parent-child) and compare retrieval quality.
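The two chunking strategies can be sketched in a few lines of plain Python. This is a character-based illustration under simplifying assumptions (real pipelines usually split on tokens or sentences); `chunk_fixed` and `chunk_parent_child` are hypothetical names, and the sizes are arbitrary defaults.

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def chunk_parent_child(text: str, parent_size: int = 400, child_size: int = 100):
    """Return parents plus (child_text, parent_index) pairs.

    Small children are what you embed and search; the parent index lets you
    hand the larger surrounding passage to the LLM.
    """
    parents = chunk_fixed(text, parent_size, overlap=0)
    children = [
        (child, index)
        for index, parent in enumerate(parents)
        for child in chunk_fixed(parent, child_size, overlap=0)
    ]
    return parents, children


print(chunk_fixed("abcdefghij", size=4, overlap=2))
print(chunk_parent_child("abcdefghij", parent_size=6, child_size=3))
```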

Phase 2: Hybrid Retrieval Engine (~3 hrs)

Set up a vector database (Qdrant recommended). Implement both dense retrieval (embeddings) and sparse retrieval (BM25). Combine them using an EnsembleRetriever with tunable weights. Add metadata fields: document type, threat actor, date, classification level.

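# Hybrid retrieval skeleton.
# Assumes `bm25_retriever` (sparse) and `vectorstore` (dense, e.g. Qdrant or
# Chroma) were built during ingestion; neither is defined in this snippet.
from langchain.retrievers import EnsembleRetriever

vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, vector_retriever],
    weights=[0.4, 0.6],  # tune these
)

# Add metadata filtering: pass the filter when creating the retriever.
# The filter syntax is store-specific. A plain dict like this works for
# Chroma; other stores such as Qdrant may expect their own filter objects.
filtered = vectorstore.as_retriever(
    search_kwargs={"filter": {"doc_type": "apt_report"}}
)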
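To see what the weights actually do, it helps to fuse two ranked lists by hand. The sketch below uses weighted reciprocal rank fusion, one common way to combine sparse and dense rankings; it is an illustration written from scratch, not a copy of any library's internals, and the document IDs are made up.

```python
from collections import defaultdict


def weighted_rrf(rankings: list[list[str]], weights: list[float], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each list contributes weight / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


bm25_ranking = ["a", "b", "c"]    # hypothetical keyword-search order
vector_ranking = ["b", "d", "a"]  # hypothetical embedding-search order
fused = weighted_rrf([bm25_ranking, vector_ranking], weights=[0.4, 0.6])
print(fused)  # ['b', 'a', 'd', 'c']
```

Document "b" wins because it ranks high in both lists, while "d" outranks "c" because the dense list carries the larger weight.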

Phase 3: Contextual Compression (~2 hrs)

Raw chunks often contain irrelevant noise. Add a contextual compression layer that extracts only the sentences relevant to the query before passing to the LLM. Compare answer quality with and without compression.
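Before wiring in an LLM-based extractor, you can prototype the compression step with a cheap stand-in. The sketch below keeps only sentences that share a content word with the query. It is a keyword heuristic, not the LLM-based extraction this phase asks for, and the stopword list and sample text are illustrative assumptions.

```python
import re

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"what", "does", "do", "for", "the", "a", "an", "is", "in", "of", "to"}


def content_terms(text: str) -> set[str]:
    """Lowercased word tokens with stopwords removed."""
    return set(re.findall(r"\w+", text.lower())) - STOPWORDS


def compress(chunk: str, query: str, min_overlap: int = 1) -> str:
    """Keep only the sentences that share at least `min_overlap` terms with the query."""
    query_terms = content_terms(query)
    sentences = re.split(r"(?<=[.!?])\s+", chunk.strip())
    kept = [s for s in sentences if len(query_terms & content_terms(s)) >= min_overlap]
    return " ".join(kept)


chunk = ("APT29 uses WMI for lateral movement. "
         "The report was published in 2019. "
         "Lunch is at noon.")
compressed = compress(chunk, "What TTPs does APT29 use for lateral movement?")
print(compressed)  # APT29 uses WMI for lateral movement.
```

Swapping `compress` for an LLM call later leaves the rest of the pipeline unchanged, which makes the with/without comparison straightforward.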

Phase 4: Citation-Grounded Answers (~2 hrs)

Modify the generation prompt to require source attribution. Each claim in the answer must reference the specific document and chunk it came from. Add confidence notes ("high confidence — 3 sources agree" vs. "low confidence — single source, 2019 data").
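One way to make citations checkable is to number the retrieved chunks in the prompt and derive the confidence note from how many distinct documents back the answer. The sketch below assumes each chunk is a dict with `doc`, `chunk_id`, and `text` keys; those field names, the file names, and the thresholds are assumptions for illustration, and counting documents does not verify that the sources actually agree.

```python
def build_grounded_prompt(question: str, chunks: list[dict]) -> str:
    """Number each retrieved chunk so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] ({c['doc']}, chunk {c['chunk_id']}) {c['text']}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the sources below. Cite the source number in "
        "square brackets after every claim. If the sources do not contain "
        "the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


def confidence_note(chunks: list[dict]) -> str:
    """Rough confidence label based on the number of distinct source documents."""
    n_docs = len({c["doc"] for c in chunks})
    if n_docs >= 3:
        return f"high confidence: {n_docs} distinct sources"
    if n_docs == 2:
        return "medium confidence: 2 distinct sources"
    if n_docs == 1:
        return "low confidence: single source"
    return "no confidence: nothing retrieved"


chunks = [
    {"doc": "apt29-report.md", "chunk_id": 4, "text": "APT29 used WMI for lateral movement."},
    {"doc": "attack-t1047.md", "chunk_id": 1, "text": "T1047 covers WMI execution."},
    {"doc": "soc-runbook.md", "chunk_id": 9, "text": "Hunt for remote WMI process creation."},
]
print(build_grounded_prompt("How does APT29 move laterally?", chunks))
print(confidence_note(chunks))
```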

Phase 5: Evaluation (~1 hr)

Build a test set of 10–15 questions with known answers. Measure retrieval precision (are the right chunks returned?) and answer faithfulness (does the answer stay grounded in the sources?). Identify failure modes.
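Retrieval precision is simple enough to compute by hand. The sketch below scores each question as the fraction of returned chunks that are in the known-relevant set, then averages across the test set. The chunk IDs are made up, and dividing by the number of chunks actually returned (rather than by k) is a design choice you may want to change.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved chunk IDs that are in the relevant set."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for chunk_id in top if chunk_id in relevant) / len(top)


def mean_precision_at_k(test_set: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average precision@k over (retrieved, relevant) pairs, one pair per question."""
    return sum(precision_at_k(r, rel, k) for r, rel in test_set) / len(test_set)


test_set = [
    (["c1", "c7", "c3", "c9"], {"c1", "c3"}),  # 2 of 4 relevant
    (["c2", "c4"], {"c5"}),                    # 0 of 2 relevant
]
print(mean_precision_at_k(test_set, k=4))  # 0.25
```

Answer faithfulness needs a judgment about the generated text, so it is usually scored by a human reviewer or a separate LLM pass rather than by a formula like this.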

Recommended Stack

Python 3.11+ with LangChain or LlamaIndex

Qdrant (Docker) or ChromaDB (local)

Embedding: Gemini text-embedding or OpenAI text-embedding-ada-002

LLM: Claude API via Anthropic SDK

BM25: rank-bm25 Python package

Optional: Streamlit or Gradio for UI

Deliverable

A working RAG pipeline that accepts natural language security questions and returns answers with: source document name, page/chunk reference, relevant quote (under 15 words), and confidence level. Test query: "What TTPs does APT29 use for lateral movement?" should return a grounded answer citing specific ATT&CK techniques with source attribution.

Assessment Criteria

Not passing: Only vector search, no hybrid, no citations, no evaluation

Beginner: Basic vector search returns relevant chunks

Intermediate: Hybrid search + metadata filtering + citations

Advanced: Contextual compression + confidence scoring + eval metrics

Bridge to Claude Code + Cowork: CLAUDE.md is semantic memory — rules the agent always knows. .claude/memory/ is auto-managed episodic memory — things the agent learned over time. Obsidian via MCP is an external knowledge base — a document store the agent queries at runtime. Cowork's context files serve the same role. The RAG concepts you built in this lab are the theory behind how every agent's Perceive step retrieves relevant context before reasoning.

Level 3 · Fluent · Module 11

● Skills & $ARGUMENTS

⏱ 20 min + hands-on

Custom Skills, $ARGUMENTS, & context:fork

Externalize procedures into reusable slash commands. Make them dynamic with parameters. Run them in isolated subagents for clean context.

Objective

Build three types of skills: a static skill, a dynamic skill with $ARGUMENTS, and a forked skill that runs in a subagent. Understand frontmatter fields and when to use each pattern.

🛠 Externalization: Procedural Knowledge

Skills transform improvised prompts into structured, repeatable procedures. The YAML frontmatter is the specification (what, when, permissions); the body is the execution (how). This maps to the Act step in the agent loop — structured action instead of improvisation.

What skills are — and are not. A skill is a persistent prompt, not a script. When you invoke /pr-review, Claude reads the SKILL.md body as its instructions and then reasons and acts — just as if you had typed those instructions directly. There is no compiled code, no deterministic execution path, no guaranteed output format. The skill makes Claude more consistent and more repeatable, but it is still Claude reasoning from natural language each time.

This means: skills can't loop, can't branch conditionally, can't call external APIs, and can't guarantee identical output on identical input. What they can do is encode your team's best practices, constraint set, and process into a reusable form that Claude follows reliably. Think of a skill as a standing operating procedure, not a function call.

Skill Anatomy

Skills live in .claude/skills/ (team, committed to repo) or ~/.claude/skills/ (personal). Each is a folder with a SKILL.md file and optional helper scripts. Slash commands and skills are unified — .claude/commands/ still works but .claude/skills/ is recommended.

Pattern 1: Static Skill

# .claude/skills/pr-review/SKILL.md
---
name: pr-review
description: Review the current branch diff for bugs, security, and style
tools: [Read, Grep, Bash]
user-invocable: true
disable-model-invocation: false   # Claude can also auto-invoke when relevant
---
Run `git diff main..HEAD` and review the changes for:
1. Security issues (injection, auth bypasses, hardcoded secrets)
2. Missing or broken tests
3. Style violations per CLAUDE.md
4. Performance concerns in hot paths

Post findings as a Markdown checklist with severity levels.

Pattern 2: Dynamic Skill with $ARGUMENTS

Use $ARGUMENTS to accept parameters from the slash command. /deploy-check staging passes "staging" as the argument:

# .claude/skills/deploy-check/SKILL.md
---
name: deploy-check
description: Verify the application is ready to deploy to a target environment
tools: [Bash]
user-invocable: true
disable-model-invocation: true   # only you can trigger deploys
allowed-tools: Bash(pnpm:*), Bash(git:*)
---
Check $ARGUMENTS is ready for deployment:
1. Run `pnpm test` — abort if any test fails
2. Run `pnpm build` — abort on build errors
3. Verify the $ARGUMENTS deployment target is healthy
4. Report readiness status
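A useful mental model for $ARGUMENTS is plain text substitution: everything after the skill name replaces the placeholder before Claude reads the body. The sketch below illustrates that model only; it is not Claude Code's implementation, and `parse_invocation` and `expand_skill` are hypothetical names.

```python
def parse_invocation(command: str) -> tuple[str, str]:
    """Split '/skill-name some args' into (skill name, argument string)."""
    name, _, args = command.lstrip("/").partition(" ")
    return name, args.strip()


def expand_skill(body: str, arguments: str) -> str:
    """Replace every $ARGUMENTS placeholder in the skill body with the argument string."""
    return body.replace("$ARGUMENTS", arguments)


skill_body = (
    "Check $ARGUMENTS is ready for deployment:\n"
    "3. Verify the $ARGUMENTS deployment target is healthy"
)
name, args = parse_invocation("/deploy-check staging")
print(name)
print(expand_skill(skill_body, args))
```

Because the result is still natural-language instructions, the argument shapes what Claude is asked to do but does not turn the skill into a deterministic script.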

Pattern 3: Forked Skill (auto-subagent)

Add context: fork and agent: to frontmatter. The skill runs in an isolated subagent — your main context stays clean. Results are summarized back. This is the most powerful pattern for exploration and research tasks.

context: fork vs agent: — what each field does. They are independent.

context: fork controls isolation: the skill runs in a fresh subagent context that cannot read or pollute your main session. The subagent reports back a summary when it finishes. Use it whenever the task involves broad exploration or could accumulate a lot of irrelevant context.

agent: controls which subagent runs it: Explore is read-only and runs on Haiku (fast, cheap — use for codebase mapping and search), Plan is analytical and cannot edit files (use for architecture review), general-purpose has full tools (default if you omit the field). You can also name a custom agent defined in .claude/agents/.

You can use agent: Explore without context: fork (the task runs in your current session but uses Haiku), and context: fork without agent: (forks a general-purpose subagent). The combination — fork + Explore — is the sweet spot for read-heavy research tasks: isolated context, fast model, cheap.

# .claude/skills/deep-research/SKILL.md
---
name: deep-research
description: Research a topic thoroughly across the codebase
context: fork
agent: Explore    # built-in read-only agent (fast, Haiku)
---
Research $ARGUMENTS thoroughly:
1. Find all relevant files using Glob and Grep
2. Read and analyze the code in detail
3. Check git history for recent changes
4. Summarize findings with specific file references
5. Identify potential issues or improvements

Skill Type Comparison

When to use each skill type

Type	Trigger	Example
Static	Fixed procedure, no user input	`/pr-review` always does the same review
$ARGUMENTS	User passes a target	`/deploy-check staging`
context:fork	Background research needed	`/deep-research topic` — forks a subagent

Project Skills Reference

Skills built across this course

Skill	Type	Description
`/add-feature`	$ARGUMENTS	Plan, implement, and test a new feature by name
`/pr-review`	Static	Review the current branch diff for bugs, security, and style
`/deploy-check`	$ARGUMENTS	Verify the application is ready to deploy to a target environment
`/deep-research`	context:fork	Research a topic across the codebase via Explore subagent
`/generate-adr`	Static	Draft an architecture decision record for a given decision
`/review-suite`	context:fork (orchestrator)	Dispatch security-scanner + test-reviewer in parallel; synthesise go/no-go

Key Frontmatter Fields

name — the /slash-command name

description — when Claude should use it (keep under 1,536 chars)

tools — allowed tools list

user-invocable — appears in /menu

disable-model-invocation — only you can trigger it

allowed-tools — granular tool permissions

context: fork — runs in subagent

agent: — which subagent (Explore, Plan, general-purpose, or custom)

Skill budget: Skill descriptions are loaded into context. The budget scales at 1% of context window (fallback: 8K chars). If you have many skills, descriptions get truncated. Put the key use case first. Set SLASH_COMMAND_TOOL_CHAR_BUDGET env var to raise the limit if needed.

Cowork parallel: Cowork plugins are pre-built skill bundles — skills, connectors, slash commands, and sub-agents packaged together. Click "Customize" on any installed plugin to adjust. Or use Plugin Create to build from scratch — describe what you want, and Claude assembles the package. Plugins are file-based, same format as Code skills.

Commands Introduced

Custom /skill-name slash commands · $ARGUMENTS for dynamic parameters · context: fork + agent: frontmatter

Checkpoint: You can explain what a skill actually is (a persistent prompt, not a script), name the three patterns (static, $ARGUMENTS, context:fork), and describe when each is appropriate. You've built at least one skill in the starter repo that Claude follows when you invoke it from the slash menu.

Level 3 · Fluent · Module 12

● Subagents & Background

⏱ 15 min

Built-in Subagents

Nested agent loops that protect your main context. Up to 10 in parallel. Background them and keep working.

Objective

Delegate work to each built-in subagent type, background a running agent with Ctrl+B, and use the /agents dashboard to monitor parallel tasks.

🎛 Agent Loop Connection

Subagents are nested agent loops. Each runs its own perceive → reason → act → observe cycle in isolated context, then returns a summary to the parent loop. This is the primary defense against context rot — complex sub-tasks get fresh context instead of polluting your main session.

Three Built-in Agents

Explore — Perceive only

Read-only, runs on Haiku (fast, cheap). Maps your codebase, finds references, traces call chains, answers structural questions. Cannot edit files. Best for: "Where is X used?", "How does Y work?", codebase mapping before a refactor. Use via skills with context: fork + agent: Explore, or ask Claude to "use the Explore agent."

Plan — Perceive + Reason

Gathers context deeply, synthesizes, and returns a recommended approach. Analytical, not executable. Best for: designing a multi-file change, evaluating tradeoffs, creating a migration plan. Claude uses this automatically in plan mode.

General-purpose — Full loop

Has all tools — read, edit, run commands, write files. Runs a complete agent loop in isolation and reports back. Best for: independent sub-tasks ("add tests for module X while I work on module Y"), documentation generation, multi-perspective analysis.

Working with Subagents

/agents — The dashboard

Open with /agents. Two tabs: Library (available agents — built-in + custom) and Running (active agent tasks with status). Monitor parallel work without leaving your main session.

Ctrl+B — Background a running agent

If a subagent is running and you want to keep typing, Ctrl+B sends it to the background. It continues working and reports back when done. Queue your next task immediately — don't wait for the agent to finish.

Parallel execution

Up to 10 subagents can run simultaneously. Ask Claude to "launch three agents in parallel" for tasks like: compare implementations across frameworks, analyze multiple files for the same pattern, or run different test strategies simultaneously.
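The fan-out pattern behind parallel subagents resembles a bounded worker pool: independent tasks run concurrently up to a cap, and the parent collects the summaries. The sketch below is an analogy in plain Python, not how Claude Code schedules subagents; `analyze` is a hypothetical stand-in for one subagent's work.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 10  # mirrors the limit of 10 simultaneous subagents described above


def analyze(doc: str) -> str:
    """Hypothetical stand-in for one isolated subagent returning a summary."""
    return f"summary of {doc}"


def fan_out(tasks: list[str], worker, max_parallel: int = MAX_PARALLEL) -> list[str]:
    """Run worker over tasks with at most max_parallel running at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in the same order as the input tasks.
        return list(pool.map(worker, tasks))


results = fan_out([f"doc-{i}" for i in range(12)], analyze)
print(len(results), results[0])
```

With 12 tasks and a cap of 10, the last two tasks wait for a free slot, which is why splitting work into independent pieces matters more than the raw limit.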

When to Delegate vs. Do It Yourself

Delegate when...

The task is independent (doesn't need your main conversation's context), exploratory (you're not sure what you'll find), or would pollute your main context with irrelevant detail.

Don't delegate when...

The task depends on earlier conversation context, requires your real-time judgment at each step, or is a quick question the main session can answer faster than spinning up a subagent.

Commands Introduced

/agents (dashboard) · Ctrl+B (background) · implicit via context: fork skills

Cowork parallel: Cowork auto-parallelizes when it detects independent parts of a task. 10 documents that take 50 minutes sequentially → 5 minutes in parallel, then synthesized. You can explicitly request it: "Process these 10 vendor proposals in parallel, then create a comparison matrix."

Checkpoint: You've delegated a task to each subagent type (Explore, Plan, General-purpose), backgrounded one with Ctrl+B, and monitored it in the /agents Running tab while continuing your main work.

Foundation Lab B · Self-Study Guide Part 1 Module 1

📘 Multi-Agent Orchestration

⏱ MVP 3–5 hrs · Full 12–16 hrs

Multi-Agent Orchestration

Understand agent roles, task decomposition, communication patterns, and safety boundaries by building a multi-agent pipeline from scratch.

Prerequisites: Python comfort (TypedDict, dataclasses, basic async). No prior multi-agent framework experience required — the lab builds the orchestration logic from scratch using plain Python, so you see what frameworks like CrewAI and AutoGen abstract away. Sample threat intel data is in labs/lab-b-samples/ in the starter repo. If you want to add a real LLM-powered extractor for the stretch goal, you'll need an ANTHROPIC_API_KEY.

Time estimate: Getting the three-stage pipeline (collector → extractor → reporter) running with the provided sample data takes 3–5 hours. The sample data in lab-b-samples/mock-threat-intel.json has 10 IOCs across four types and three confidence levels — enough to verify your pipeline produces correct output. Full completion with the mapper stretch goal and a real LLM-powered extractor is 12–16 hours.

Module 12 introduced Claude Code's built-in subagents (Explore, Plan, General-purpose). This lab teaches the theory of multi-agent systems so you can design better custom agents in Level 4. By building agents from scratch, you'll understand the orchestration principles that Claude Code's harness implements for you.

Learning Objectives

By the end of this lab you can

Decompose a complex task into agent-sized roles

Implement specialized agents with distinct tools and permissions

Design inter-agent communication patterns

Apply human-in-the-loop controls at critical decision points

Set safety boundaries that agents cannot override

Evaluate pipeline output quality and identify failure modes

Topics Covered

Agent roles and task decomposition patterns

Sequential vs. parallel orchestration

CrewAI, AutoGen, or LangGraph frameworks

Tool use and function calling

Inter-agent message passing

Human-in-the-loop (HITL) control patterns

Agent safety boundaries and guardrails

Error handling and retry strategies

Output quality evaluation

Cost management for multi-agent workflows

The Lab: Threat Hunting Agent Squad

Build a pipeline of up to four agents, each with a distinct role, distinct tools, and defined communication boundaries. The MVP needs three stages (Collector → Extractor → Reporter); the Mapper is the stretch goal. The pipeline accepts raw threat intelligence and produces a structured, actionable report.

Phase 1: Agent Design (~2 hrs)

Define roles, responsibilities, and tool permissions for each agent. Specify what each agent can and cannot do. Document the data flow between agents — what format does Agent 1's output need to be in for Agent 2 to consume it?

Agent 1: Collector

Ingests raw CTI notes, threat reports, blog posts, or log snippets. Handles document parsing, encoding normalization, and format detection. Outputs a clean, structured JSON document. Tools: file reader, URL fetcher, encoding detector. Maps to: Claude Code's Explore agent (Perceive-only).

Agent 2: Extractor

Extracts structured data: IOCs (hashes, IPs, domains, URLs), TTPs (techniques and procedures), threat actor names, malware families, and targeted sectors. Uses regex for IOCs and LLM reasoning for TTPs. Tools: regex engine, NER model, LLM extraction chain. Maps to: specialized Reason step.

Agent 3: Mapper

Maps extracted TTPs to the MITRE ATT&CK framework. Assigns technique IDs (e.g., T1059.001), tactics, and confidence scores. Cross-references against the ATT&CK knowledge base. Tools: ATT&CK API or local JSON, similarity search. Maps to: Plan agent (Perceive + Reason).

Agent 4: Reporter

Synthesizes all upstream output into a structured threat-hunting report formatted for Jira, GitLab issues, or Markdown. Includes executive summary, IOC table, TTP mapping, recommended detection rules, and source references. Tools: template engine, markdown writer. Maps to: General-purpose agent (full Act).
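To make the Mapper's "similarity search plus confidence score" concrete, here is a minimal sketch that uses Jaccard word overlap as a stand-in for whatever similarity method you choose. The three-entry TECHNIQUES table is a hand-written slice for illustration (the lab uses the full ATT&CK data), and map_ttp and min_confidence are hypothetical names.

```python
# Tiny hand-written slice of ATT&CK for illustration only
TECHNIQUES = {
    "T1059.001": "command and scripting interpreter powershell",
    "T1566.001": "phishing spearphishing attachment",
    "T1053.005": "scheduled task job scheduled task",
}


def map_ttp(description: str, min_confidence: float = 0.2):
    """Return (technique_id, confidence) for the best match, or None below the threshold."""
    words = set(description.lower().split())
    best_id, best_score = None, 0.0
    for tech_id, name in TECHNIQUES.items():
        name_words = set(name.split())
        # Jaccard similarity: shared words divided by all distinct words
        score = len(words & name_words) / len(words | name_words)
        if score > best_score:
            best_id, best_score = tech_id, score
    if best_score < min_confidence:
        return None  # unmatched TTP: surface it to the human reviewer instead of guessing
    return best_id, round(best_score, 2)
```

Returning None for weak matches gives the pipeline an explicit "could not map" signal to handle during orchestration.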

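# CrewAI skeleton
# Placeholders you supply: the tool objects (file_reader, url_fetcher, ...) and
# collect_task, extract_task, report_task (built the same way as map_task below).
from crewai import Agent, Task, Crew, Process

collector = Agent(
    role="CTI Collector",
    goal="Parse and normalize raw threat intelligence",
    backstory="Ingests raw CTI documents and emits clean JSON.",  # Agent requires a backstory
    tools=[file_reader, url_fetcher],
    allow_delegation=False  # safety boundary
)
extractor = Agent(
    role="IOC Extractor",
    goal="Extract IOCs, TTPs, and threat actor names",
    backstory="Pulls structured indicators out of normalized text.",
    tools=[regex_tool, llm_extractor]
)
mapper = Agent(
    role="ATT&CK Mapper",
    goal="Map TTPs to MITRE framework with confidence scores",
    backstory="Matches extracted TTPs to ATT&CK technique IDs.",
    tools=[attack_api]
)
reporter = Agent(
    role="Report Writer",
    goal="Produce a structured threat-hunting report",
    backstory="Writes reports that a SOC analyst can act on.",
    tools=[template_engine, markdown_writer]
)

# HITL gate: human_input is set on a Task, not on the Crew. Setting it on the
# mapping task pauses for review of the TTP mappings before the report is written.
map_task = Task(
    description="Map each extracted TTP to an ATT&CK technique ID",
    expected_output="JSON list of {ttp, technique_id, tactic, confidence}",
    agent=mapper,
    human_input=True
)

# Sequential pipeline
crew = Crew(
    agents=[collector, extractor, mapper, reporter],
    tasks=[collect_task, extract_task, map_task, report_task],
    process=Process.sequential
)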

Phase 2: Implementation (~4 hrs)

Build each agent. Start with the Collector (simplest I/O) and work forward. Test each agent independently before connecting them. Use a single known threat report as your test document throughout.
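As a starting point for testing the Extractor on its own, the regex half of IOC extraction can be sketched with the standard library. The patterns below are deliberately simple assumptions (no domain matching, no support for defanged text such as "hxxp" or "[.]"); IOC_PATTERNS and extract_iocs are hypothetical names, and a dedicated package such as ioc-finder covers far more cases.

```python
import re

OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"  # 0-255

IOC_PATTERNS = {
    "ipv4": re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "url": re.compile(r"\bhttps?://[^\s\"'<>]+"),
}


def extract_iocs(text: str) -> dict[str, list[str]]:
    """Return de-duplicated matches per IOC type, in first-seen order."""
    found = {}
    for kind, pattern in IOC_PATTERNS.items():
        # dict.fromkeys drops duplicates while preserving order
        found[kind] = list(dict.fromkeys(pattern.findall(text)))
    return found
```

Running this against your single known test document gives you a fixed expected IOC list to compare against as you build the later agents.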

Phase 3: Orchestration & HITL (~2 hrs)

Wire the pipeline together. Add a human-in-the-loop gate between the Mapper and Reporter — the human reviews the TTP mappings before the report is generated. Handle edge cases: what happens when the Extractor finds no IOCs? When the Mapper can't match a TTP?
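The gate and the empty-input edge case can be sketched framework-free, which also makes them easy to unit test. This is a minimal illustration under assumptions: hitl_gate, run_report_stage, and the mapping dict keys (ttp, technique_id, confidence) are hypothetical, and the reviewer prompt is injected as a function so tests can replace input().

```python
from typing import Callable


def hitl_gate(mappings: list[dict], ask: Callable[[str], str] = input) -> list[dict]:
    """Show each proposed TTP mapping to a reviewer and keep only approved ones."""
    approved = []
    for m in mappings:
        answer = ask(
            f"Approve {m['ttp']!r} -> {m['technique_id']} "
            f"(confidence {m['confidence']})? [y/N] "
        )
        if answer.strip().lower() == "y":  # anything other than "y" rejects
            approved.append(m)
    return approved


def run_report_stage(iocs: list[str], mappings: list[dict],
                     ask: Callable[[str], str] = input) -> str:
    """Gate the Reporter and handle empty upstream output explicitly."""
    if not iocs and not mappings:
        return "No IOCs or TTPs extracted; nothing to report."
    approved = hitl_gate(mappings, ask)
    return f"Report: {len(iocs)} IOCs, {len(approved)} of {len(mappings)} mappings approved."
```

Defaulting to rejection means an accidental Enter keypress never approves a mapping.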

Phase 4: Evaluation (~2 hrs)

Run 3–5 different threat reports through the pipeline. Evaluate: completeness (did it catch all IOCs?), accuracy (are the ATT&CK mappings correct?), and usefulness (would a SOC analyst act on this report?). Document failure modes.
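Completeness and accuracy can be put on a numeric footing with precision and recall against a hand-labeled answer key. The function below is a minimal sketch; score_extraction is a hypothetical name, and the answer key is something you build yourself for each test report.

```python
def score_extraction(expected: set[str], found: set[str]) -> dict[str, float]:
    """Compare extracted IOCs with a hand-labeled answer key."""
    true_pos = len(expected & found)
    # Precision: share of extracted items that are correct
    precision = true_pos / len(found) if found else 0.0
    # Recall: share of known items that were extracted (completeness)
    recall = true_pos / len(expected) if expected else 0.0
    return {"precision": precision, "recall": recall}
```

For example, finding 3 items of which 2 are correct, against a key of 4, gives precision 2/3 and recall 0.5. Low recall points at missed IOCs; low precision points at false positives.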

Recommended Stack

Python 3.11+ with CrewAI, AutoGen, or LangGraph

Claude API for extraction and reasoning

MITRE ATT&CK STIX data (local JSON or API)

Regex for IOC extraction (ioc-finder package)

Jinja2 or similar for report templates

Optional: Streamlit dashboard for HITL review

Deliverable

A working multi-agent workflow that accepts a raw threat document and outputs a structured threat-hunting report with: executive summary, IOC table (type, value, context), TTP-to-ATT&CK mapping with confidence scores, recommended detection rules, and source references. The pipeline must include at least one human-in-the-loop gate.

Assessment Criteria

Not passing: Agents exist but don't communicate, no structured output, no HITL gate

Beginner: Agents run sequentially, basic output

Intermediate: HITL gate, error handling, structured output

Advanced: Parallel where possible, confidence scoring, eval metrics

Bridge to Claude Code + Cowork: The Collector maps to Claude Code's Explore agent (Perceive-only, read-only). The Mapper maps to the Plan agent (Perceive + Reason). The Reporter maps to General-purpose (full Act). In Module 17, you'll build custom agents with .claude/agents/ that mirror these roles — same orchestration concepts, but with sandboxing, permission modes, persistent memory, and context:fork built into the harness. Cowork achieves similar parallelism with its sub-agent system for document processing.

Level 3 · Fluent · Module 13

● Plugins & Bundled Skills

⏱ 15 min

Plugins & Bundled Skills

Install before building. Use the built-ins you didn't know existed. The plugin ecosystem is the fastest way to add capability without writing a single skill file.

Objective

Use the bundled skills (/debug, /simplify). Install a marketplace plugin. Understand the plugin architecture (skills + agents + hooks + MCP in one package).

Bundled Skills — Always Available

These ship with Claude Code and are available in every session without installation:

/debug

Structured debugging workflow. Reads the error, traces the execution path, proposes a fix, and verifies. Use this as your default debugging entry point before manual patterns (Module 16).

/simplify

Reduces code complexity. Identifies nested conditionals, long functions, and tangled logic, then refactors step by step.

/batch

Parallel file operations across your codebase. "Migrate all components from class-based to functional." Uses worktrees under the hood for isolation.

/loop

In-session scheduled tasks. "Check PRs every hour." Runs until you stop it or close the session. Covered in depth in Module 18.

/claude-api

Loads Claude API and Agent SDK reference material for your project's language. Use it when you're writing code that calls the Claude API and want Claude working from current reference material.

Marketplace Plugins

The claude-plugins-official marketplace is auto-registered on startup. Browse with /plugin → Discover tab:

LSP Plugins — Real-Time Code Intelligence

The highest-impact plugins most people miss. Language Server Protocol plugins give Claude real-time diagnostics (type errors, missing imports) and code navigation (go-to-definition) as it edits. Install the binary first, then the plugin:

typescript-lsp — TS/JS (binary: npm i -g typescript-language-server typescript)

pyright-lsp — Python (binary: pip install pyright)

rust-analyzer-lsp — Rust (binary: rustup component add rust-analyzer)

gopls-lsp — Go (binary: go install golang.org/x/tools/gopls@latest)

clangd-lsp — C/C++ (binary: brew install llvm)

jdtls-lsp — Java (binary: brew install jdtls)

# Example: install TypeScript LSP
npm i -g typescript-language-server typescript
/plugin install typescript-lsp@claude-plugins-official

Why LSP matters: Without an LSP plugin, Claude edits code, then you discover type errors later. With an LSP plugin, Claude sees type errors in real time as it edits — it catches its own mistakes during the Act step before you ever review the diff. This is the single biggest quality improvement for typed languages.

Workflow Plugins

commit-commands

Staged diff → AI-written commit message → commit. One slash command replaces "git add, think of a message, git commit."

security-guidance

OWASP-aware checks on changed files. Catches injection risks, auth bypasses, and hardcoded secrets before they're committed.

code-review

Structured PR review. Produces a checklist with severity levels, not prose. Install, run on a PR, get actionable findings.

feature-dev

Spec → plan → implementation pipeline. Describe the feature, Claude writes the spec, plans the approach, and implements.

Installing and Managing

# Browse the marketplace
/plugin  # opens Discover / Installed / Marketplaces / Errors tabs

# Install from official marketplace
/plugin install commit-commands@claude-plugins-official

# Add a third-party marketplace
/plugin marketplace add owner/repo

# Install from third-party
/plugin install tool-name@owner-repo

# Hot-reload after changes
/reload-plugins

Curation rule: Each active plugin adds to your context baseline. The skill description budget is 1% of context window. The best practitioners run 2–3 plugins max — not 20. If you can write the instructions in 20 minutes as a CLAUDE.md rule, do that instead of installing a plugin.

Cowork plugins: Department-specific: Productivity, Marketing, Sales, Finance, Legal, Data Analysis, HR, Engineering, Design, Operations, plus financial verticals (IB, Equity Research, PE, Wealth Management). Partner plugins from Slack by Salesforce, LSEG, S&P Global, Apollo, Common Room. Enterprise: private marketplaces with admin controls, per-user provisioning, and auto-install for new team members.

Commands Introduced

/debug · /simplify · /batch · /loop · /claude-api · /plugin · /plugin install · /plugin marketplace add · /reload-plugins

Checkpoint: You've used /debug or /simplify on real code, installed one marketplace plugin, and can explain why plugin curation matters for context budget.

Level 3 · Fluent · Module 14

● MCP & Connectors

⏱ 20 min + setup

MCP Servers & Connectors

How agents talk to external tools. The Model Context Protocol is the open standard behind both Code's MCP servers and Cowork's connectors.

Objective

Connect an MCP server (Obsidian, database, or API), run a cross-context task, and understand when MCP adds value vs. when it adds noise.

🔗 Externalization: Interaction Structure

MCP externalizes how the agent interacts with the outside world. Each MCP server exposes tools and resources that Claude can discover and use. This extends the agent's Perceive step (read data from external sources) and Act step (write to external tools) beyond the codebase.

Adding MCP Servers

# Add by command (simplest)
claude mcp add obsidian -- npx mcp-remote http://localhost:22360/sse

# Add via JSON (more control)
claude mcp add-json obsidian '{"command":"npx","args":["mcp-remote","http://localhost:22360/sse"]}'

# Manage
claude mcp list     # see all connected servers
claude mcp remove obsidian
/mcp               # in-session management

# Inspect a server's configuration and connection status
claude mcp get obsidian

Common MCP Servers

Obsidian — read/write notes, search by tag, create ADRs

PostgreSQL / MySQL — query databases, inspect schemas

GitHub — manage issues, PRs, actions

Playwright — browser automation, E2E testing

Sentry — read error reports, correlate with code

Slack — read messages, post updates

Built-in: WebSearch & WebFetch

Claude Code has built-in web tools — no MCP server needed:

WebSearch

Search the web for current information during tasks. Use cases: looking up error messages, checking library documentation, finding latest API patterns, verifying if a dependency is still maintained. "Search for the correct way to handle CORS preflight in Fastify 4."

WebFetch

Fetch a specific URL and read its contents. Use cases: reading documentation pages, pulling API specs, checking release notes. "Fetch the Prisma 5 migration guide and apply the changes to our schema."

When to use web tools vs. MCP: Web tools are for ad-hoc lookups — searching docs, checking versions, reading error threads. MCP servers are for persistent integrations — your database, your project tracker, your note system. Web tools require no setup; MCP servers require configuration but give structured, authenticated access.

Example: Obsidian Integration

# .mcp.json in the project root (project-level config)
{
  "mcpServers": {
    "obsidian": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:22360/sse"]
    }
  }
}

# Then ask Claude:
"Find all my notes tagged #architecture-decision and summarize the ones from this quarter into a new note called Q2-ADRs.md."

Warning: Every MCP server adds tools to context. Each tool costs context budget. Only enable servers you're actively using. A project that connects Obsidian, PostgreSQL, GitHub, Sentry, and Slack simultaneously burns significant context before you've typed a prompt.

Cowork connectors: 60+ integrations via the same MCP protocol: Slack, Google Workspace (Calendar, Drive, Gmail), Notion, Asana, Linear, Jira, HubSpot, Figma, Snowflake, BigQuery, Databricks, FactSet, MSCI, DocuSign, Apollo, and more. One-click OAuth setup. Connectors stay active across sessions — you authorize once. No limit on connected tools.

Commands Introduced

claude mcp add · claude mcp add-json · claude mcp list · claude mcp remove · claude mcp get · /mcp

Not using Obsidian? Use the local filesystem MCP: add {"mcpServers":{"fs":{"command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","."]}}} to .mcp.json. No external service required.

Checkpoint: You've connected an MCP server, run a task that uses data from the external tool, and can explain the context budget tradeoff.

Level 3 · Fluent · Module 15

● Hooks & Safety Defaults

⏱ 20 min + hands-on

Hooks & Safety Defaults

Deterministic shell commands that fire at lifecycle events. They always run — unlike LLM behavior, which is probabilistic. Hooks are your quality gates.

Objective

Add hooks using the /hooks interactive UI. Understand the event lifecycle. Build at least one formatting hook and one safety hook.

🎛 Agent Loop Connection

Hooks make the Observe step deterministic. Instead of hoping the agent checks formatting or runs tests, hooks guarantee it. They fire at specific points in the agent loop — before a tool runs (PreToolUse), after a tool runs (PostToolUse), when a session starts, when context compacts. This is the harness's enforcement layer.

The /hooks Interactive UI — Step by Step

Run /hooks for a menu interface. It writes .claude/settings.json for you — no manual JSON editing required. Here's exactly what the flow looks like for adding the two hooks you need for the checkpoint:

Hook 1: Auto-format on edit (PostToolUse)

Run /hooks → select PostToolUse
Matcher: type Edit|Write (matches either tool)
Hook type: command
Command: npx prettier --write "$CLAUDE_FILE_PATH" || true
Save → Claude writes the entry to .claude/settings.json

Verify: make any edit → you should see prettier run automatically in the output.

Hook 2: Block destructive Bash (PreToolUse)

Run /hooks → select PreToolUse
Matcher: Bash
Hook type: command
Command: echo "$CLAUDE_TOOL_INPUT" | grep -qE "rm -rf|DROP TABLE|sudo rm" && exit 2 || exit 0
Save

Exit code 2 = block, and the hook's stderr is shown to Claude. Exit code 0 = allow. Any other non-zero exit code is treated as a non-blocking hook error, so the command still runs. Verify: ask Claude to run rm -rf /tmp/test and confirm it is blocked.

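Commit .claude/settings.json so the whole team gets the same hooks. The Hook Schema below shows the same two hooks with the commands moved into script files under .claude/hooks/, which keeps the JSON readable: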

Hook Schema

{
  "hooks": {
    "PostToolUse": [
      { "matcher": "Edit|Write", "hooks": [{ "type": "command", "command": ".claude/hooks/format-file.sh" }] }
    ],
    "PreToolUse": [
      { "matcher": "Bash", "hooks": [{ "type": "command", "command": ".claude/hooks/block-dangerous-bash.sh" }] }
    ]
  }
}
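The schema above points at script files without showing one. Here is a minimal Python sketch of a blocking PreToolUse script, written under the assumption that the hook receives the event as JSON on stdin with tool_name and tool_input.command fields (check the current hooks reference for the exact payload). The function names and the DESTRUCTIVE pattern are hypothetical.

```python
import json
import re
import sys

# Same patterns as the Hook 2 one-liner, tolerant of extra whitespace
DESTRUCTIVE = re.compile(r"rm\s+-rf|DROP\s+TABLE|sudo\s+rm", re.IGNORECASE)


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message): 2 blocks the tool call, 0 allows it."""
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    if DESTRUCTIVE.search(command):
        return 2, f"Blocked destructive command: {command}"
    return 0, ""


def main() -> int:
    """Read the event from stdin and report the decision via stderr and exit code."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)  # stderr carries the reason for the block
    return code

# As a hook entry point, end the script with: sys.exit(main())
```

Keeping the decision in a pure function (decide) means the blocking logic can be unit tested without launching a session.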

Common Hook Recipes

Auto-format on edit

// PostToolUse → matcher: "Edit|Write"
npx prettier --write "$CLAUDE_FILE_PATH" || true

Auto-run tests after write

// PostToolUse → matcher: "Write"
pnpm test --passWithNoTests 2>&1 | tail -5

Block destructive commands

// PreToolUse → matcher: "Bash"
// Exit 2 to block the command; the message on stderr tells Claude why
echo "$CLAUDE_TOOL_INPUT" | grep -qE "rm -rf|sudo|chmod 777" && { echo "Blocked: destructive command" >&2; exit 2; } || exit 0

Load today's date on session start

// SessionStart
date "+Today is %A, %B %d, %Y"

Hook Events in the Agent Loop

Hooks fire at specific points in the agent lifecycle. ● Blocking hooks can prevent the action. ● Observe hooks run after the fact.

SessionStart: session begins (load date, set env vars)

InstructionsLoaded: CLAUDE.md and rules loaded

UserPromptSubmit: you send a prompt (can validate/transform)

PreToolUse: before any tool (★ block destructive commands here)

PostToolUse: after tool succeeds (★ auto-format, run tests here)

FileChanged: watched file modified externally

TaskCompleted: agent task marked complete

Stop: Claude finishes responding

PreCompact: before context compaction

Hook Events — Full Reference (20+)

Events that can block the action (exit with code 2 to prevent):

PreToolUse — before any tool runs

UserPromptSubmit — when you send a prompt

WorktreeCreate — worktree being created

PermissionRequest — permission dialog shown

Elicitation — MCP requests input

ConfigChange — config file changes

Events that are observe-only (cannot block):

PostToolUse — after any tool runs

Stop — Claude finishes a response

SessionStart / InstructionsLoaded

PreCompact / PostCompact

FileChanged / CwdChanged

TaskCreated / TaskCompleted

TeammateIdle / StopFailure

PermissionDenied / WorktreeRemove

Environment variables available in hooks: CLAUDE_FILE_PATH, CLAUDE_TOOL_NAME, CLAUDE_TOOL_INPUT, CLAUDE_SESSION_ID, CLAUDE_PROJECT_DIR.

Performance tip: Keep hooks under 2 seconds. For heavier checks, use PostToolUse (async — doesn't block the agent) or defer to CI. A slow PreToolUse hook pauses the agent on every single tool call.

Five Hook Types

The course so far has shown command hooks. There are four more types:

command

Run a shell command. The most common type. Output is returned to Claude.

http

POST the event JSON to a URL. Use for external logging, Slack notifications, or webhook-based workflows.

mcp_tool

Call a tool on a connected MCP server. Use when you want a hook to trigger an action in an external system (create a Jira ticket, send a Slack message).

prompt

Evaluate the event with the LLM. Use for smart guardrails: "Is this bash command potentially destructive?" — the LLM classifies and the hook blocks based on the answer.

agent

Run an agentic verifier — a full agent loop that checks the proposed action. The most powerful type. Use for complex validation (e.g., "does this code change break the API contract?").

Permission Rule Syntax

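Beyond hooks, you can configure granular allow/deny rules in .claude/settings.json using glob patterns. The # annotations below are for explanation only; remove them in the real file, because JSON does not allow comments: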

# Permission rules in settings.json
{
  "permissions": {
    "allow": [
      "Read",                   # all file reads
      "Edit(src/**/*.ts)",      # edit TypeScript files only
      "Bash(git *)",            # any git subcommand
      "Bash(npm test *)",       # npm test with any args
      "mcp__obsidian"           # all tools on obsidian MCP
    ],
    "deny": [
      "Bash(rm -rf *)",        # block recursive deletes
      "Edit(*.env)",            # protect env files
      "mcp__prod-db__write_*"   # block writes to prod DB
    ]
  }
}

Use /permissions to manage rules interactively, or /less-permission-prompts to have Claude propose an allowlist based on your recent transcript — it analyzes what you've been approving and suggests rules to reduce future prompts.
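To build intuition for how Tool(pattern) rules combine, here is a small illustration using Python's fnmatch. It is a teaching sketch, not Claude Code's actual matcher: real glob semantics (for example how ** treats zero directories) may differ, and matches and decide are hypothetical names.

```python
from fnmatch import fnmatchcase

ALLOW = ["Bash(git *)", "Bash(npm test *)", "Edit(src/**/*.ts)"]
DENY = ["Bash(rm -rf *)", "Edit(*.env)"]


def matches(rule: str, tool: str, arg: str) -> bool:
    """Match a 'Tool(pattern)' rule against a tool name and its argument."""
    name, _, pattern = rule.partition("(")
    return name == tool and fnmatchcase(arg, pattern.rstrip(")"))


def decide(tool: str, arg: str) -> str:
    """Deny rules are checked first, then allow rules, else fall back to asking."""
    if any(matches(rule, tool, arg) for rule in DENY):
        return "deny"
    if any(matches(rule, tool, arg) for rule in ALLOW):
        return "allow"
    return "ask"
```

Checking deny before allow is the design choice that matters: a broad allow rule can never reopen something an explicit deny rule has closed.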

Cowork safety defaults: Cowork doesn't have programmatic hooks, but the same principle applies via global instructions:

"Never delete files without my explicit confirmation. Never modify files outside the designated output folder. Show me your plan before executing any multi-step task. If you're unsure, ask rather than assume. Flag all assumptions explicitly."

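These are your standing guardrails: they apply to every session and every task. Unlike hooks, they depend on the model following instructions, so treat them as strong defaults rather than deterministic enforcement.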

Commands Introduced

/hooks (interactive UI) · JSON config in .claude/settings.json

Security notes

Never pipe remote scripts into hooks (curl ... | bash). Use least-privilege connector scopes in MCP. In Lab C, only scan networks you own or have written permission to test.

Checkpoint: You've added at least two hooks using /hooks — one for formatting (PostToolUse → prettier) and one for safety (PreToolUse → block destructive commands). Both run automatically without you thinking about them.

Level 3 · Fluent · Module 16

● Debugging Patterns

⏱ 15 min

Debugging with Claude Code

The #1 daily use case. Start with the built-in /debug skill, then apply manual patterns for complex cases.

Objective

Use /debug on a real issue. Apply the five manual debugging patterns. Know when to delegate debugging to a subagent.

Start with /debug

The bundled /debug skill provides a structured debugging workflow out of the box. It reads the error, traces the execution path, analyzes root cause, proposes a fix, and optionally verifies with tests. Use this as your default — fall back to manual patterns when /debug misses the issue or the bug is architectural.

# Simplest approach — let /debug handle it
/debug "The API returns 500 on POST /users when email contains a plus sign"

Five Manual Debugging Patterns

1. Stack Trace Handoff

Paste the full error and stack trace. Ask Claude to trace the execution path and identify root cause. "Here's the stack trace from production. Read the relevant source files and explain what went wrong." This maps to Perceive → Reason — give Claude maximum signal before asking for analysis.

2. Reproduction First

Ask Claude to write a minimal reproduction before fixing. "Write a test that reproduces this bug, then fix it so the test passes." This maps to Act → Observe → Iterate — the test is the verification mechanism built into the fix itself.

3. Bisect with Explore

Delegate to the Explore subagent to narrow down where the bug lives before committing main context. "Use the Explore agent to find every place we modify the user session object." This is a nested Perceive loop — the subagent maps the territory, returns a summary, and your main session stays clean.

4. Explain-then-Fix

Ask Claude to explain what the code does before asking what it should do. Mismatches between the two are the bug. "Explain what processOrder() does step by step. Then tell me what it should do based on the JSDoc comment."

5. Screenshot Debugging

For UI bugs, paste a screenshot directly into the terminal. Claude can process images as Perceive input. "This button should be aligned right, but it's overflowing. Here's a screenshot. Fix the CSS."

When to Delegate Debugging to a Subagent

Delegate when...

The bug might be in one of many files (use Explore to narrow down), the investigation is exploratory (you don't know where to look), or the debugging would pollute your main context with irrelevant code.

Don't delegate when...

You already know which file has the bug, the fix is straightforward, or you need to maintain conversational context about the broader task the bug interrupted.

End-to-End Worked Example: Bug Report to Commit

This walkthrough ties together plan mode, subagents, hooks, skills, and diff review — everything from Levels 2–3 — into a single realistic workflow. Follow along in your own project.

The bug report

"POST /api/orders returns 500 when the cart contains a product with a price of $0.00. Stack trace points to calculateTotal() in src/orders/pricing.ts."

Step 1: Enter plan mode — Perceive + Reason

Press Shift+Tab to enter plan mode. Paste the bug report. Claude reads the stack trace, explores src/orders/pricing.ts and its callers, and proposes an approach — without editing anything yet.

"Here's a bug report: POST /api/orders returns 500 when the cart has a $0.00 product. Stack trace points to calculateTotal() in src/orders/pricing.ts. Research the issue and propose a fix plan."

Claude's plan: "The issue is a division-by-zero in the discount calculation when basePrice is 0. I'll add a guard clause, update the existing test, and add a new edge-case test."

Step 2: Delegate exploration — Subagent

Before approving, verify the scope. Use the Explore agent to check if the same pattern exists elsewhere.

"Use the Explore agent to find every place we divide by price or amount in src/orders/. Are there other division-by-zero risks?"

Explore returns: "Found 3 division operations in pricing.ts, 1 in discount.ts. The discount.ts one has the same vulnerability." You now know the fix needs to cover two files, not one. The plan mode approach caught this before any code was written.

Step 3: Approve and execute — Act

Switch back to default mode (Shift+Tab). Approve the plan with the expanded scope.

"Good plan, but also fix the same issue in discount.ts. Add guard clauses to both. Write a failing test first, then fix it so the test passes."

Claude writes the failing test (Reproduction First pattern), adds guard clauses to both files, and runs the test suite.
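The walkthrough's project is TypeScript, but the shape of the fix is language-independent. Below is a Python analogue of the Reproduction First step; the function names and the discount formula are assumptions made for illustration, not the contents of pricing.ts.

```python
def discount_rate(base_price: float, discount: float) -> float:
    """Fraction of the base price taken off; 0.0 for free items."""
    if base_price == 0:  # guard clause: a $0.00 product has no meaningful rate
        return 0.0
    return discount / base_price


def calculate_total(items: list[tuple[float, float]]) -> float:
    """Sum the discounted price of each (base_price, discount) pair."""
    return sum(price * (1 - discount_rate(price, off)) for price, off in items)


def test_zero_price_item_does_not_crash():
    # Written first: raises ZeroDivisionError until the guard clause exists
    total = calculate_total([(0.0, 0.0), (10.0, 2.0)])
    assert abs(total - 8.0) < 1e-9
```

Writing the test before the guard clause proves the test actually exercises the bug, rather than passing for unrelated reasons.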

Step 4: Hooks fire automatically — Observe

Your PostToolUse hook auto-runs prettier on every edited file. Your test hook auto-runs pnpm test after writes. You don't think about formatting or testing — the hooks handle the Observe step deterministically.

Hook output: "✓ prettier: 2 files formatted. ✓ tests: 47 passed, 0 failed."

Step 5: Review the diff — Human oversight

Run /diff. Check: are the guard clauses correct? Did Claude remove any existing logic? Are the new tests meaningful (not just asserting the function exists)? Accept the changes you're happy with.

"Looks good. Commit with message: 'fix: guard against zero-price division in pricing and discount modules'"

Step 6: Commit — Done

Total time: ~4 minutes. Without Claude Code: reproduce manually, find both files, write fix, write tests, run formatter, run tests, write commit message — probably 30–45 minutes. The agent loop compressed 6 steps into a supervised flow where you made decisions and Claude did the work.

What this example demonstrates: Plan mode (visible Reason before Act), Explore subagent (nested Perceive to widen scope), Reproduction First debugging pattern (failing test before fix), hooks (deterministic Observe — formatting + tests), /diff (human oversight before commit), and outcome-based prompting (describe what done looks like). These are Levels 2–3 features working together as a system, not isolated commands.

Checkpoint: You've used /debug on a real issue and applied at least one manual pattern (reproduction first or bisect with Explore) on a case where /debug wasn't sufficient.

Level 4 · Native · Module 17

● Custom Agent Teams

⏱ 25 min + hands-on

Custom Subagents & Agent Teams

Design specialized agents with persistent memory, permission boundaries, and model selection. Chain them into multi-phase pipelines. Up to 10 in parallel.

Objective

Create a custom subagent with YAML frontmatter, build a multi-phase pipeline (Explore → Plan → Execute), and configure persistent agent memory.

Custom Agent Definition

Custom subagents are defined as Markdown files in .claude/agents/ (project scope) or ~/.claude/agents/ (personal scope). YAML frontmatter controls behavior; the body is the agent's system prompt.

# .claude/agents/security-reviewer.md
---
name: security-reviewer
description: OWASP top 10 review. Use PROACTIVELY after auth or data-handling changes.
model: opus            # hard reasoning → use Opus
tools: [Read, Grep, Glob, Bash]
permissionMode: plan   # shows plan, waits for approval
effort: high           # thorough analysis
memory: user           # remembers across sessions
maxTurns: 20           # iteration limit
---
You are a senior security engineer.
Analyze for OWASP Top 10 vulnerabilities, secrets,
hardcoded credentials, and SQL injection.

Before starting:
1. Check your MEMORY.md for past findings on this codebase.
2. Read the changed files and their dependencies.

For each finding:
- Rate severity (low / medium / high / critical)
- Show the exact vulnerable code
- Propose a specific remediation
- Reference the relevant OWASP category

After review:
- Update MEMORY.md with new patterns discovered
- Report findings with severity levels

Frontmatter Reference

model — opus (deep reasoning), sonnet (balanced), haiku (fast exploration)

tools — allowed tools list. Plugin subagents cannot use hooks or mcpServers

permissionMode — plan (review before act), auto (act freely), default (inherit parent)

effort — low (quick scan), medium (standard), high (thorough analysis)

memory — user (persistent in ~/.claude/agent-memory/) or project

maxTurns — cap iterations to prevent runaway agents

isolation — context boundaries for the subagent

The Explore → Plan → Execute Pipeline

The most powerful pattern: chain specialized agents where each handles one phase of the agent loop:

👁

Explore

Haiku · read-only

Maps territory
Finds files, traces calls

→

🧠

Plan

Sonnet/Opus · analytical

Designs approach
You review + approve

→

⚡

Execute

Sonnet · full tools

Implements changes
Runs tests, iterates

→

✓

You

review final diff

Accept, reject,
or iterate

Phase 1: Explore agent (Haiku, read-only)

Maps the territory. Reads files, traces call chains, identifies scope. Returns a codebase summary. Fast and cheap.

Phase 2: Plan agent (Sonnet/Opus, analytical)

Takes the Explore summary. Designs the approach, identifies risks, proposes a step-by-step plan. You review and approve.

Phase 3: Execute agent (Sonnet, full tools)

Takes the approved plan. Implements changes, runs tests, iterates on failures. You review the final diff.

Agent Memory

Agents with memory: user maintain a MEMORY.md in ~/.claude/agent-memory/<agent-name>/. This persists across sessions — the security-reviewer remembers past findings, the doc-writer remembers your style preferences. This is Layer 3 memory (per-agent), distinct from CLAUDE.md (Layer 1, you write) and .claude/memory/ (Layer 2, auto-managed).

Commands

/agents (Library + Running tabs) · agents discoverable via context: fork + agent: <name> in skills

Checkpoint: Your repo has at least one custom agent in .claude/agents/ with persistent memory. You've chained an Explore → Plan → Execute pipeline on a real task.

Foundation Lab C · Self-Study Guide Part 1 Module 3

📘 Local LLM Security & Deployment

⏱ MVP 2–3 hrs · Full 10–14 hrs

Local LLM Security & Deployment

Deploy agents outside the managed harness — with proper sandboxing, audit logging, command enforcement, and container isolation.

Time estimate: The MVP (implement is_allowed() and run_sandboxed(), verify with the three sample scripts) is 2–3 hours — this lab has the most scaffolding and the tightest scope. The three sample files in lab-c-samples/ give you immediate feedback: sandbox_test.py should exit 0, timeout_test.py should hit your timeout, and blocked_command_test.py documents what the allowlist must reject. Full completion with the audit logger stretch goal and Docker isolation is 10–14 hours.

Module 17 gives you agent teams inside Claude Code's managed harness. But what happens when you deploy agents outside that harness — in CI/CD, on local infrastructure, or in production pipelines? This lab teaches the security principles you need: sandboxing, allowed/denied command lists, prompt/response logging, secrets handling, and container isolation. These are the same principles Claude Code's hooks and Cowork's safety defaults implement — but now you understand them deeply enough to build your own.

Learning Objectives

By the end of this lab you can

Deploy a local LLM endpoint (Ollama, vLLM, or LM Studio)

Sandbox tool execution safely with Docker

Implement allowed/denied command lists that agents cannot bypass

Build a full prompt/response audit logging pipeline

Configure timeout controls and resource limits

Handle secrets without exposing them to the agent

Topics Covered

Local LLM deployment (Ollama, vLLM, LM Studio)

Local vs. API model tradeoffs (latency, cost, privacy)

Docker containerization for tool execution

Allowed/denied command enforcement

Prompt injection defense patterns

Prompt/response logging and audit trails

Secrets management (env vars, vaults, never in prompts)

Rate limiting and resource controls

Timeout management for long-running tasks

Monitoring and alerting for agent behavior

The Lab: Sandboxed Local Security Agent

Build a secure local agent runtime where a locally-hosted LLM can execute a controlled set of security reconnaissance commands inside a Docker container. The agent must refuse to run denied commands even when prompted, and all interactions must be logged for audit.

Security boundaries — read before running any commands

Only scan networks you own or have written permission to test. Unauthorized scanning of infrastructure you don't control (for example nmap sweeps) can violate computer-misuse laws and provider terms of service in many jurisdictions. The allowed command list in this lab is for localhost and private test ranges only.

Never pipe remote scripts. curl … | bash and wget -O- | sh patterns are in the deny list for a reason — do not add exceptions, even for setup convenience.

Use least-privilege connector scope. If you extend the lab to call external APIs or MCP servers, request only the minimum permissions needed. Broad scopes widen the blast radius if the agent is prompt-injected.

Run inside Docker. The container isolation in Phase 3 is not optional polish — it is the primary safety boundary. Do not skip it when testing allowed commands against real targets.

Phase 1: Local LLM Setup (~2 hrs)

Deploy a local model using Ollama (easiest) or vLLM (most flexible). Verify it runs, serves an API endpoint, and can handle tool-use-style prompts. Compare response quality and latency against the Claude API for a standard task. Document the tradeoffs — when would you choose local over API?
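One way to verify the endpoint and collect a rough latency number is a small Python client against Ollama's local REST API. This is a sketch under assumptions: Ollama is listening on its default port 11434, the llama3:8b model has already been pulled, and the helper names (build_payload, summarize, ask_local_model) are illustrative, not part of any library.

```python
import json
import time
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(prompt: str, model: str = "llama3:8b") -> bytes:
    """Encode a non-streaming generate request as JSON bytes."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")


def summarize(body: dict, wall_seconds: float) -> dict:
    """Keep the fields worth comparing against an API-hosted model."""
    return {
        "text": body.get("response", ""),
        "wall_seconds": round(wall_seconds, 2),
        "output_tokens": body.get("eval_count"),  # token count, if the server reports it
    }


def ask_local_model(prompt: str, model: str = "llama3:8b", timeout: int = 120) -> dict:
    """Send one prompt to the local endpoint and time the round trip."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    started = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read())
    return summarize(body, time.perf_counter() - started)
```

Calling ask_local_model("List three uses of the dig command.") with the server running returns the text plus timing. Run the same prompt against the Claude API and record both results in your tradeoff notes.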

Phase 2: Command Enforcement (~3 hrs)

Build the core safety layer: an allowed/denied command system that sits between the LLM's tool calls and actual execution. The LLM proposes commands; your enforcement layer validates them before execution.

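# Command enforcement skeleton
import re
import shlex

# Allowlist entries are token prefixes: a command's leading tokens must equal one of them.
ALLOWED_CMDS = [["nmap", "-sn"], ["dig"], ["whois"], ["ping", "-c", "4"], ["traceroute"]]
# Denied entries pair a readable label with a regex, so "curl <url> | bash" still matches.
DENIED_CMDS = [
    ("sudo", r"\bsudo\b"),
    ("rm -rf", r"\brm\s+-rf\b"),
    ("curl | bash", r"\bcurl\b.*\|\s*(ba)?sh\b"),
    ("wget -O-|sh", r"\bwget\b.*\|\s*(ba)?sh\b"),
    ("chmod 777", r"\bchmod\s+777\b"),
]
# Characters that let a shell chain, substitute, or redirect commands.
SHELL_METACHARS = set(";&|`$<>\n")

def validate_command(cmd: str) -> tuple[bool, str]:
    # Check denied patterns first (blocklist)
    for label, pattern in DENIED_CMDS:
        if re.search(pattern, cmd):
            return False, f"[BLOCKED] '{label}' is not permitted"
    # Reject chaining such as "dig example.com; cat /etc/passwd"
    if any(ch in SHELL_METACHARS for ch in cmd):
        return False, "[BLOCKED] Shell metacharacters are not permitted"
    # Check allowed patterns (allowlist) on parsed tokens, so "digger" does not pass as "dig"
    try:
        tokens = shlex.split(cmd)
    except ValueError:
        return False, "[BLOCKED] Command could not be parsed"
    if not any(tokens[:len(allowed)] == allowed for allowed in ALLOWED_CMDS):
        return False, "[BLOCKED] Command not in allowed list"
    return True, "OK"

def execute_safe(cmd: str, timeout: int = 30) -> str:
    # log_event and docker_exec are the helpers you build in Phases 4 and 3
    is_ok, reason = validate_command(cmd)
    if not is_ok:
        log_event("DENIED", cmd, reason)
        return reason
    # Run inside Docker with resource limits
    result = docker_exec(cmd, timeout=timeout, mem_limit="256m")
    log_event("EXECUTED", cmd, result)
    return result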

Phase 3: Docker Sandboxing (~2 hrs)

All commands execute inside a Docker container with: read-only filesystem (except /tmp), no network access by default (enable per-command), CPU and memory limits, non-root user, and automatic cleanup after timeout. Test prompt injection: can the agent trick the system into running a denied command? It must fail.
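The constraints listed above map fairly directly onto docker run flags. The sketch below shows one possible shape for the docker_exec helper that the Phase 2 skeleton calls. The image name recon-sandbox:latest, the UID 1000:1000, and the pids limit of 64 are assumptions for illustration, and the image is assumed to already contain the reconnaissance tools.

```python
import shlex
import subprocess


def build_docker_argv(cmd: str, image: str = "recon-sandbox:latest",
                      mem_limit: str = "256m", cpus: str = "0.5",
                      allow_network: bool = False) -> list[str]:
    """Translate the Phase 3 constraints into `docker run` flags."""
    argv = [
        "docker", "run",
        "--rm",                                 # remove the container when it exits
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/tmp",                      # writable in-memory /tmp only
        "--memory", mem_limit,                  # memory cap
        "--cpus", cpus,                         # CPU cap
        "--pids-limit", "64",                   # bound the number of processes
        "--user", "1000:1000",                  # non-root UID:GID
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation via setuid
    ]
    if not allow_network:
        argv += ["--network", "none"]           # no network unless enabled per command
    # Pass the command as separate tokens so no shell inside the container parses it.
    return argv + [image] + shlex.split(cmd)


def docker_exec(cmd: str, timeout: int = 30, mem_limit: str = "256m",
                allow_network: bool = False) -> str:
    """Run one validated command in a throwaway container."""
    argv = build_docker_argv(cmd, mem_limit=mem_limit, allow_network=allow_network)
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Only the docker client is stopped here; killing the container itself
        # would need a named container plus `docker kill`.
        return f"[TIMEOUT] exceeded {timeout}s"
    return done.stdout if done.returncode == 0 else f"[ERROR] {done.stderr.strip()}"
```

Two caveats are worth testing in your own build: some tools need capabilities that --cap-drop ALL removes (ping typically needs raw sockets), and the timeout path above does not yet clean up the container, so the "automatic cleanup after timeout" requirement still needs work.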

Phase 4: Audit Logging (~2 hrs)

Build a complete audit trail: every prompt sent to the LLM, every response received, every command proposed, every validation decision (allowed/denied), every execution result, and timestamps for all. Store as structured JSON. Implement a simple log viewer that can filter by action type and time range.

# Audit log entry structure
{
  "timestamp": "2026-05-03T14:22:01Z",
  "session_id": "abc-123",
  "event": "COMMAND_DENIED",
  "prompt": "Run sudo rm -rf /var/log to clean up",
  "proposed_command": "sudo rm -rf /var/log",
  "denial_reason": "'sudo' is not permitted",
  "model": "llama3:8b"
}
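A minimal sketch of the logging side, assuming a JSON-lines file named audit.jsonl and a log_event helper with the same three-argument call shape used in the Phase 2 skeleton. The record fields here are a subset of the entry structure above, and filter_events stands in for the "simple log viewer". Both function bodies are illustrative.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def log_event(event: str, command: str, detail: str,
              path: Path = Path("audit.jsonl"), session_id: str = "abc-123") -> dict:
    """Append one audit record as a single JSON line and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "session_id": session_id,
        "event": event,
        "proposed_command": command,
        "detail": detail,
    }
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def filter_events(path: Path, event: Optional[str] = None,
                  since: Optional[datetime] = None,
                  until: Optional[datetime] = None) -> list[dict]:
    """Return records matching an event type and/or a timezone-aware time window."""
    matches = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            stamp = datetime.fromisoformat(record["timestamp"])
            if event is not None and record["event"] != event:
                continue
            if since is not None and stamp < since:
                continue
            if until is not None and stamp > until:
                continue
            matches.append(record)
    return matches
```

Timestamps are written with an explicit +00:00 offset instead of a trailing Z so that datetime.fromisoformat can parse them on older Python versions. Appending one line per event keeps writes cheap and lets tools such as Loki tail the file.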

Phase 5: Integration & Testing (~1 hr)

Wire everything together. Run the full agent loop: user provides a target (e.g., a domain name), the agent proposes a reconnaissance plan, the enforcement layer validates each command, Docker executes the safe ones, and results are logged and returned. Run adversarial tests: prompt injection attempts, command chaining, encoding tricks. Document what your system catches and what it misses.
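For the adversarial tests, a table-driven runner makes it easy to record what a validator catches and what it misses. The sketch below is an assumption-laden illustration: the case list is a starting point, not a complete attack catalogue, and naive_validate is a deliberately weak stand-in included only to show the report format. Pass your real validator in its place.

```python
from typing import Callable

Validator = Callable[[str], tuple[bool, str]]

# Each case: (label, command the model might propose). All of these should be blocked.
ADVERSARIAL_CASES = [
    ("direct denied command", "sudo whoami"),
    ("chaining after allowed prefix", "dig example.com; cat /etc/passwd"),
    ("pipe to shell", "dig example.com | sh"),
    ("command substitution", "ping -c 4 $(cat /etc/hostname)"),
    ("prefix lookalike", "digger --payload"),
]


def naive_validate(cmd: str) -> tuple[bool, str]:
    """A deliberately weak stand-in: substring blocklist plus prefix allowlist."""
    if "sudo" in cmd or "rm -rf" in cmd:
        return False, "denied substring"
    if cmd.startswith(("dig", "whois", "ping -c 4")):
        return True, "OK"
    return False, "not in allowed list"


def run_adversarial_suite(validate: Validator) -> dict[str, list[str]]:
    """Sort case labels into those the validator caught and those it missed."""
    report: dict[str, list[str]] = {"caught": [], "missed": []}
    for label, cmd in ADVERSARIAL_CASES:
        allowed, _reason = validate(cmd)
        report["missed" if allowed else "caught"].append(label)
    return report
```

Run against naive_validate, the suite catches only the direct sudo case and misses the other four, which is the kind of result the deliverable asks you to document. Add encoding tricks and prompt-level injections as further cases.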

Recommended Stack

Ollama with llama3 or mistral (local LLM)

Docker for container isolation

Python 3.11+ with subprocess / docker-py

JSON-lines for audit logging

Optional: LiteLLM as a unified API gateway

Optional: Grafana/Loki for log visualization

Deliverable

A sandboxed agent runtime demonstrating: allowed/denied command enforcement (agent proposes, system validates), Docker container isolation with resource limits, full prompt/response audit logging (JSON-lines), timeout controls, and at least 3 documented adversarial test cases (prompt injection attempts) showing the system correctly blocking them.

Assessment Criteria

Not passing: No command enforcement, the agent can bypass the allowed list, or there is no logging

Beginner: Allowed/denied list works, basic logging

Intermediate: Docker isolation, timeout controls, structured audit

Advanced: Prompt injection testing, secrets handling, monitoring

Bridge to Claude Code + Cowork: Map what you built to the production harness:
• Your allowed/denied list → Claude Code's hooks.PreToolUse matchers + Cowork's safety defaults
• Your audit logging → OTEL tracing (both Code and Cowork GA)
• Your Docker isolation → claude -w git worktrees + Cowork's sandboxed VM
• Your timeout controls → agent maxTurns frontmatter
• Your rate limits → /usage + enterprise governance via private plugin marketplaces

The lab teaches you to build these controls from scratch. Claude Code and Cowork give them to you out of the box. Understanding both means you can extend the harness when the defaults aren't enough.
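To make the first mapping concrete, here is a sketch of a script that a PreToolUse hook could invoke. It rests on assumptions you should check against the hooks reference for your Claude Code version: that the hook receives the pending tool call as JSON on stdin with tool_name and tool_input.command fields, and that exit code 2 blocks the call and reports stderr back to the model. The pattern list and function names are illustrative.

```python
import json
import re
import sys

# Patterns treated as destructive; extend to match your own deny list.
DENIED_PATTERNS = {
    "rm -rf": re.compile(r"\brm\s+-rf\b"),
    "sudo": re.compile(r"\bsudo\b"),
    "pipe to shell": re.compile(r"\|\s*(ba)?sh\b"),
}


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message); exit code 2 is assumed to block the call."""
    if payload.get("tool_name") != "Bash":
        return 0, ""  # only shell commands are inspected
    command = payload.get("tool_input", {}).get("command", "")
    for label, pattern in DENIED_PATTERNS.items():
        if pattern.search(command):
            return 2, f"Blocked by policy: '{label}' is not permitted"
    return 0, ""


def main() -> int:
    """Read the hook payload from stdin and translate the decision to an exit code."""
    raw = sys.stdin.read()
    payload = json.loads(raw) if raw.strip() else {}
    code, message = decide(payload)
    if message:
        print(message, file=sys.stderr)
    return code
```

To use it as a hook entry point, end the script with raise SystemExit(main()) and register it under a PreToolUse matcher for the Bash tool in settings.json. Keeping decide free of I/O means the policy can be unit-tested without a running session.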

Security notes

Never pipe remote scripts into hooks (curl ... | bash). Use least-privilege connector scopes in MCP. In Lab C, only scan networks you own or have written permission to test.

Level 4 · Native · Module 18

● Background, Scheduling & Voice

⏱ 15 min

Background Agents, Scheduling & Voice

Schedule Claude to do work while you sleep. Talk to it hands-free. Both Code and Cowork now have scheduling — in different forms.

Objective

Set up a /loop scheduled task, use /voice for hands-free input, and understand Cowork's Scheduled Tasks, Routines, and Dispatch.

Claude Code: /loop and /schedule

/loop runs a prompt on a schedule inside your current session. /schedule creates persistent routines that survive session restarts:

# In-session recurring task
/loop every 1h "check the latest 5 open PRs, leave a review comment on any with missing tests"

# Persistent routine (survives restarts) ★ PREVIEW
/schedule "Run security scan on main branch every morning at 9am"

# Push-to-talk voice input (hold spacebar)
/voice

# Kill all background agents
Ctrl+F

When to Use Which

/loop — ephemeral, in-session

Lives only as long as your terminal is open. Best for: monitoring during active work sessions (watch CI while you code), periodic checks during a long refactor (re-run tests every 30 min), development-time automation (check for new issues while coding). Stops when you close the session.

/schedule — persistent, cross-session ★ PREVIEW

Survives restarts. Best for: daily standup preparation (generate a summary every morning), periodic security scans, weekly report generation. Runs at the scheduled time if Claude Code is available.

Neither — use CI instead

If the task should run whether or not you're at your computer, on every PR, or on every commit — that's a CI job, not a /loop or /schedule. Use claude -p in a GitHub Action (Module 20).

Real-World Scheduling Patterns

# Monitor CI while coding
/loop every 5m "check CI status for current branch, alert if failed"

# Watch for dependency vulnerabilities
/loop every 2h "run npm audit, report any high/critical findings"

# Auto-generate end-of-day summary
/schedule "At 5pm, summarize today's git commits into a standup note"

# Self-pacing: /loop adjusts intervals based on results
/loop "check for new GitHub issues assigned to me"  # Claude picks the interval

Voice Input

/voice — Push-to-talk

Hold spacebar to dictate. 20 languages supported. Great for complex requirements where typing is slow, pair-programming sessions where your hands are on the keyboard, and accessibility. Particularly powerful for dictating architectural descriptions or debugging narrations.

Managing Background Work

Ctrl+B — Send to background

Background a running task so you can keep typing. The agent continues working and reports back when done.

Ctrl+F — Kill all background agents

Emergency stop for all background and scheduled tasks. Use when a /loop is misbehaving or an agent is stuck.

Cowork scheduling is more advanced:
• Scheduled Tasks — run on a timer while your desktop is open. If your machine is off during a scheduled run, the task is skipped but shows in task history and auto-runs when you reopen the app.
• Routines ★ PREVIEW — cloud-hosted automation that runs even when your laptop is closed. The agent equivalent of cron — your machine doesn't need to be on.
• Dispatch — text a task from your phone → Cowork runs the full workflow on your desktop → you return to finished files. Requires desktop app running and computer awake. Your phone becomes a remote trigger.

Commands Introduced

/loop · /schedule (preview) · /voice · Ctrl+B (background) · Ctrl+F (kill all)

Checkpoint: You have a /loop agent doing periodic work (PR review, monitoring, or security scans). You've used /voice to dictate a complex prompt.

Level 4 · Native · Module 19

● Worktrees & Batch

⏱ 12 min

Git Worktrees & /batch

Stop stashing. Start parallelizing. Each Claude session gets its own isolated directory with its own branch.

Objective

Run three Claude sessions in parallel on different branches. Use /batch for parallel operations across many files.

🎛 Externalization: Sandboxing

Worktrees are the harness's sandboxing layer. Each Claude session operates in an isolated directory with its own branch. No stashing, no conflicts, no context contamination between parallel work streams.

Parallel Worktree Sessions

# Terminal 1: auth feature
claude -w feature-auth
"Implement OAuth2 PKCE flow for the login endpoint"

# Terminal 2: billing feature
claude -w feature-billing
"Add Stripe subscription management to the billing module"

# Terminal 3: bugfix
claude -w bugfix-cors
"Fix the CORS headers for the API gateway"

All three sessions work simultaneously. Skills, agents, and CLAUDE.md are discovered correctly from worktrees. Each branch gets its own git history. Merge when ready.
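For intuition, a worktree-per-session setup amounts to roughly the following in plain Git: one git worktree add per branch, each in its own directory. This sketch only builds the commands. The sibling-directory layout (repo name plus branch name) is an assumption for illustration and is not necessarily where claude -w places its worktrees.

```python
from pathlib import Path


def worktree_add_argv(repo: Path, branch: str) -> list[str]:
    """Build the git command that creates a sibling worktree on a new branch."""
    target = repo.parent / f"{repo.name}-{branch}"  # hypothetical directory layout
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(target)]


def plan_parallel_sessions(repo: Path, branches: list[str]) -> list[list[str]]:
    """One worktree per branch, so each session gets its own directory."""
    return [worktree_add_argv(repo, branch) for branch in branches]
```

Each resulting directory has its own checked-out branch and working files while sharing the same underlying repository, which is why parallel sessions do not need to stash or switch branches.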

When to Use Worktrees vs. Subagents vs. Sessions

Use worktrees when...

Tasks are independent features on separate branches. You need full git isolation. You want to merge each piece separately. The work is substantial (hours, not minutes). Example: two features + a bugfix in parallel.

Use subagents when...

Tasks share the same branch but can run in parallel. You need results synthesized into your main session. The work is exploratory or analytical (minutes, not hours). Example: analyze 5 modules for the same pattern.

Use separate sessions when...

Tasks are in different projects entirely. No shared codebase. Example: fix a bug in project A while starting a feature in project B.

Worktree Tips

# Use --tmux for side-by-side panes
claude -w feature-auth --tmux  # opens in a new tmux pane

# Name sessions for easy /resume later
claude -w feature-auth -n "auth-oauth2"

# List active worktree sessions
/tasks

/batch — Parallel File Operations

# Apply a change across many files in parallel
/batch "migrate all React class components to functional components with hooks"

# Batch uses worktrees under the hood for isolation

/batch is a bundled skill that parallelizes operations across your codebase. It identifies affected files, creates isolated worktrees, applies changes in parallel, and merges the results. Particularly effective in monorepos.

Cowork parallel: Cowork's Projects feature provides the same context isolation for non-code work. Run 5 different client projects simultaneously — each with its own files, connectors, and conversation history. No context bleeding between them.

Commands Introduced

claude -w <branch> · /batch

Checkpoint: You've run at least two Claude sessions in parallel on different branches using claude -w. You've used /batch for a multi-file operation.

Level 4 · Native · Module 20

● Headless Mode & Agent SDK

⏱ 20 min

Headless Mode & the Agent SDK

If Claude Code can fix a bug manually, it can fix a bug in CI. Run the agent loop without a human at the keyboard.

Objective

Run Claude Code in headless mode (claude -p), set up a GitHub Action for automated PR review, and understand the Agent SDK for programmatic access.

Headless Mode

claude -p "prompt" runs a one-shot task, returns output, and exits. No interactive session. Perfect for CI/CD, cron jobs, and scripting:

# One-shot headless tasks
claude -p "Review this PR diff for security issues and output findings as JSON."
claude -p "Generate release notes from the last 10 merged PRs."
claude -p "Run the security-reviewer agent on the changed files."

GitHub Action Example

# .github/workflows/claude-review.yml
name: Claude PR Review
on: [pull_request]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Claude Code
        run: curl -fsSL https://claude.ai/install.sh | bash
      - name: Run review
        run: |
          claude -p "Review the diff in this PR for security issues, \
          missing tests, and style violations. Post findings \
          as a PR comment via gh pr comment."
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Required secrets — add these in repo Settings → Secrets and variables → Actions

Secret	What to set	Used by
`ANTHROPIC_API_KEY`	Your key from console.anthropic.com	claude-review.yml (PR review)
`NOTIFY_EMAIL_USER`	Your Yahoo address	discussion-notify.yml (sender)
`NOTIFY_EMAIL_PASS`	Yahoo App Password (not account password)	discussion-notify.yml (auth)
`NOTIFY_EMAIL_TO`	Where discussion alerts go (can be same as USER)	discussion-notify.yml (recipient)

Missing any secret → that workflow fails silently. See .github/workflows/README.md in the starter repo for setup steps including how to create a Yahoo App Password.

The Agent SDK

The Agent SDK (Python and TypeScript) gives you programmatic access to Claude Code's full harness: hooks, tools, memory, and agent orchestration. Build Claude Code into your own applications, not just your terminal.
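A minimal Python sketch of what programmatic access can look like. It assumes the SDK is installed as the claude-agent-sdk package and exposes query, ClaudeAgentOptions, AssistantMessage, and TextBlock as described in its README. Verify the names and option fields against the version you install. The review_options and review helpers, the system prompt, and the tool list are illustrative choices.

```python
def review_options() -> dict:
    """Keyword arguments for the SDK options object (field names assumed)."""
    return {
        "system_prompt": "You are a code reviewer. Report security issues only.",
        "allowed_tools": ["Read", "Grep", "Glob"],  # read-only tools for a reviewer
        "max_turns": 5,                             # bound the agent loop
    }


async def review(prompt: str) -> str:
    """Run one agent query and collect the assistant's text output."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from claude_agent_sdk import AssistantMessage, ClaudeAgentOptions, TextBlock, query

    chunks: list[str] = []
    options = ClaudeAgentOptions(**review_options())
    async for message in query(prompt=prompt, options=options):
        if isinstance(message, AssistantMessage):
            chunks.extend(
                block.text for block in message.content if isinstance(block, TextBlock)
            )
    return "\n".join(chunks)
```

From a script, asyncio.run(review("Review src/auth.py for injection risks")) would drive one review. Restricting allowed_tools to read-only tools is a design choice that keeps a reviewer agent from editing files or running shell commands.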

/install-github-app — Always-On PR Review Bot

Distinct from headless CI review — this installs a persistent GitHub App that automatically reviews every PR on your repo:

# Install the always-on reviewer
/install-github-app

# This adds a claude-code-review.yml config to your repo
# Customize the review prompt in this file

Why this matters

As AI tool usage increases, your PR volume increases, and human review time does not scale with it. An automated first pass can flag logic errors and security issues so reviewers spend less time on nitpicks. The key is customizing the review prompt in claude-code-review.yml: out of the box it's too verbose and comments on everything. Tune it to your team's priorities, whether that is security-only, architecture-only, or whatever matters most.

/install-slack-app — Claude in Slack

Installs the Claude Slack app via OAuth. Once connected, team members can interact with Claude directly in Slack channels — ask questions about the codebase, trigger reviews, or run skills from chat.

Advanced Headless Skills

/ultrareview [PR#] — Parallel multi-agent PR review ★ PREVIEW

Spawns multiple review agents in parallel — security, performance, architecture, test coverage — each analyzing the PR from a different angle. Results are synthesized into a single comprehensive review. More thorough than /install-github-app but runs on-demand rather than automatically.

/security-review — Vulnerability scan

Scans the current branch diff for security vulnerabilities. More focused than the security-guidance plugin — specifically targets the changed code and maps findings to OWASP categories with severity ratings.

Key Headless Flags

# Cost cap — essential for CI (prevents runaway spend)
claude -p "Review PR" --max-budget-usd 5

# Turn limit — prevent infinite loops
claude -p "Fix this bug" --max-turns 10

# Bare mode — skip plugins/hooks for faster startup
claude -p "Check syntax" --bare

# JSON output — for programmatic consumption
claude -p "List all TODO comments" --output-format json
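When a script drives headless mode, the flags above can be combined in one place. The sketch below wraps claude -p with subprocess. The helper names are illustrative, and the assumption that JSON output is an object with a result key should be checked against what your installed version prints.

```python
import json
import subprocess


def headless_argv(prompt: str, max_budget_usd: float = 5, max_turns: int = 10) -> list[str]:
    """Combine the cost cap, turn limit, and JSON output flags into one argv list."""
    return [
        "claude", "-p", prompt,
        "--max-budget-usd", str(max_budget_usd),
        "--max-turns", str(max_turns),
        "--output-format", "json",
    ]


def parse_result(stdout: str) -> str:
    """Extract the final text from JSON output; the `result` key is an assumption."""
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return stdout.strip()  # fall back to raw text
    if isinstance(payload, dict):
        return str(payload.get("result", ""))
    return stdout.strip()


def run_headless(prompt: str, timeout: int = 600) -> str:
    """Run one headless task and fail loudly on a non-zero exit."""
    done = subprocess.run(headless_argv(prompt), capture_output=True, text=True, timeout=timeout)
    if done.returncode != 0:
        raise RuntimeError(f"claude exited {done.returncode}: {done.stderr.strip()}")
    return parse_result(done.stdout)
```

Raising on a non-zero exit code, instead of returning an empty string, keeps a failed run from looking like a clean review.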

Headless claude -p vs. /install-github-app: Use headless for one-shot CI tasks (run on specific events, custom logic). Use the GitHub App for always-on, every-PR review with a consistent prompt. Both complement each other — many teams use the App for standard review and headless for specialized checks (security scan, dependency audit).

Common SDK Patterns

Automated PR reviewer that runs on every push

Bug triage bot that reads stack traces and assigns severity

Release note generator from merged PRs

Documentation updater triggered by code changes

Custom CI step that verifies architectural constraints

Observability: OTEL Tracing

Both headless mode and the Agent SDK support OpenTelemetry tracing. Interaction spans wrap full agent turns — every tool call, every file read, every command execution is traced. Pipe traces to your SIEM, Grafana, or Datadog for monitoring agent behavior in production.

Commands Introduced

claude -p "<prompt>" · Agent SDK (Python/TypeScript) · OTEL tracing

CI Preflight Checklist

Before wiring Claude into CI, verify these — each is a common silent failure point:

Before you push the workflow

ANTHROPIC_API_KEY secret set?	Repo Settings → Secrets and variables → Actions. Missing secret = workflow fails silently on every run.
GITHUB_TOKEN write permission?	Repo Settings → Actions → General → Workflow permissions → "Read and write". Required for `gh pr comment`. Default is read-only — you must change it.
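Fork PRs blocked?	By design, workflows triggered by fork PRs do not receive repository secrets (this prevents secret exfiltration). The workflow still triggers, but ANTHROPIC_API_KEY is empty, so the Claude step cannot authenticate unless you guard it with a condition. Expected behaviour, not a bug.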
Budget cap set?	Always pass `--max-budget-usd 5` (or lower) in headless mode. Without it, a runaway task or prompt loop can drain your API credits before you notice.
Expected failure modes	`exit 1` = Claude hit an error or the task failed. `exit 2` = budget exceeded. Empty output = ANTHROPIC_API_KEY missing or invalid. Review step logs before debugging the prompt.
First run: watch it live	Open Actions tab in GitHub while the first run executes. Tail the logs in real time. Claude's reasoning and tool calls are visible — diagnose there before anything else.
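Part of this checklist can be automated as a step that runs before the Claude step, so a missing secret produces an explicit error instead of a confusing one. A minimal sketch, assuming the two environment variable names used in the workflow above. The preflight function itself is illustrative.

```python
from typing import Mapping

REQUIRED_ENV = ("ANTHROPIC_API_KEY", "GH_TOKEN")


def preflight(env: Mapping[str, str], argv: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the run can proceed."""
    problems = [
        f"{name} is missing or empty"
        for name in REQUIRED_ENV
        if not env.get(name, "").strip()
    ]
    if "--max-budget-usd" not in argv:
        problems.append("no --max-budget-usd cap on the claude command")
    return problems
```

In CI you would call preflight(os.environ, planned_argv), print each problem, and exit non-zero if the list is not empty. Fork PRs then fail this step with a clear message about the empty key.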

Checkpoint: You've run Claude Code in CI via claude -p without a human at the keyboard. You have a working GitHub Action (or equivalent) that posts automated review comments.

Level 4 · Native · Module 21

● Build Your Plugin

⏱ 30 min + hands-on

Build & Publish Your Plugin

Your team has tribal knowledge locked in people's heads. A plugin turns it into a one-command install for every new hire — on both Code and Cowork.

Objective

Create a plugin with plugin.json, bundle skills + agents + hooks, publish to a marketplace (private or public), and install it on a teammate's machine.

Plugin Anatomy

my-team-plugin/
├── .claude-plugin/
│   └── plugin.json        # metadata — name, version, description (required)
├── commands/
│   └── review.md          # /review slash command
├── skills/
│   └── deploy/
│       └── SKILL.md       # /deploy with $ARGUMENTS
├── agents/
│   └── security-reviewer.md  # custom subagent
├── hooks/
│   └── post-edit-format.sh   # auto-prettier on edit
├── .mcp.json              # bundled MCP server configs
└── README.md              # usage documentation

plugin.json

{
  "name": "acme-team-toolkit",
  "version": "1.0.0",
  "description": "Code review, security, and deploy workflows for Acme API",
  "author": "Acme Engineering"
}
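Before publishing, a quick check of the manifest can catch a missing field early. This sketch treats name, version, and description as required, as noted in the plugin anatomy above. The MAJOR.MINOR.PATCH version check is an added assumption, not a documented rule, and validate_manifest is an illustrative helper.

```python
import json
import re

REQUIRED_FIELDS = ("name", "version", "description")
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # simple MAJOR.MINOR.PATCH check (assumed convention)


def validate_manifest(text: str) -> list[str]:
    """Return problems found in a plugin.json document; empty means it looks fine."""
    try:
        manifest = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(manifest, dict):
        return ["manifest must be a JSON object"]
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if not manifest.get(field)]
    version = manifest.get("version")
    if isinstance(version, str) and not SEMVER.match(version):
        problems.append(f"version is not MAJOR.MINOR.PATCH: {version}")
    return problems
```

Running this in a pre-commit hook or CI step on .claude-plugin/plugin.json gives teammates a clear error before a broken manifest reaches the marketplace.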

Distribution

Marketplaces

Marketplaces are GitHub repos (or GitLab, self-hosted) with a marketplace.json that lists available plugins. The official Anthropic marketplace is auto-registered. Add your team's private marketplace with /plugin marketplace add your-org/plugins.

Install Scopes

user — personal (your machine only). project — committed to repo (everyone who clones gets it). local — this repo, just for you. Enterprise admins can auto-install approved plugins for new team members.

Security Sandbox

Plugin subagents cannot use hooks, mcpServers, or permissionMode in their frontmatter — this is a security boundary. Plugins can bundle hooks separately, but agents inside plugins are sandboxed.

# On a teammate's machine:
/plugin marketplace add your-org/plugins
/plugin install acme-team-toolkit@your-org
/reload-plugins
# All commands, agents, and hooks are now live

Building a Cowork Plugin — Minimal Working Example

Cowork plugins use the same plugin.json format as Code plugins. If you've only used Claude Code, here's the full picture for the Cowork capstone requirement. A minimal Cowork plugin has four files:

my-cowork-plugin/
├── .claude-plugin/
│   └── plugin.json        # required: name, version, description, author
├── skills/
│   ├── weekly-update/
│   │   └── SKILL.md       # /weekly-update slash command
│   └── draft-brief/
│       └── SKILL.md       # /draft-brief slash command
└── context/
    └── brand-voice.md     # loaded into every session (optional but valuable)

plugin.json (Cowork)

{
  "name": "acme-gtm-toolkit",
  "version": "1.0.0",
  "description": "Weekly update and brief drafting workflows for GTM team",
  "author": "Acme GTM"
}

skills/weekly-update/SKILL.md

---
name: weekly-update
description: Draft a weekly stakeholder update from Slack and project files
user-invocable: true
---

Read the files in the current folder. Pull last week's Slack messages from
#product-updates if connected. Draft a weekly stakeholder update using the
brand voice in context/brand-voice.md.

Format: subject line, 3-bullet executive summary, detail section.
Save as: weekly-update-YYYY-MM-DD.docx in the current folder.
Flag any assumptions you made about what counts as "last week".

How a non-developer installs and uses it

Open Claude Desktop → Cowork tab → Customize → Plugins → Browse
Find your plugin (published to a GitHub repo) → Install
Type /weekly-update in any Cowork session — the skill appears in the slash menu
Claude reads the SKILL.md body as its instructions, accesses connected tools, and produces the output

Or use Plugin Create: in the Cowork tab, say "Help me create a plugin for weekly stakeholder updates with a /weekly-update slash command." Claude walks you through building it conversationally — no SKILL.md syntax needed.

Key difference from Code plugins: Cowork skills don't use context: fork, agent:, or tool lists in frontmatter — Cowork manages model selection and context isolation automatically. Your SKILL.md body is the instruction; the platform handles execution. Enterprise admins can create private marketplaces, control installation, and auto-install approved plugins for new team members.

Commands Introduced

plugin.json manifest · --scope project|user|local · /plugin marketplace add · /reload-plugins

Checkpoint: You've created a plugin with plugin.json, at least one skill, one agent, and one hook. You've published it to a marketplace (private GitHub repo) and a teammate installed it with one command.

Level 4 · Native · Module 22

● Team Operating System

⏱ 15 min

Building Your Team's Operating System

Fluency is personal. Native means the whole team moves differently — developers and non-developers alike.

Objective

Synthesize all layers into a repo where any new team member — developer (Code) or non-developer (Cowork) — is productive on day one.

The Five Layers

Distribution — Plugins & Marketplaces

Plugins bundle everything into one-command installs. Private marketplaces for enterprise. Auto-install for new hires. Same format for Code and Cowork.

🛠 SKILLS

Connection — MCP & Connectors

MCP servers for Code. Connectors (60+) for Cowork. Same Model Context Protocol. Jira, Slack, Obsidian, Drive, Snowflake.

🔗 PROTO

Guardrails — Hooks & Safety Defaults

Hooks (20+ events, 5 types) for Code. Safety defaults in global instructions for Cowork. Both enforce deterministic behavior.

🎛 HARNESS

Skills — Procedures & Agents

Custom skills with context:fork and $ARGUMENTS. Custom subagents with persistent memory. Cowork plugins for non-developers.

🛠 SKILLS

Memory — CLAUDE.md & Context Files

Three layers: semantic (CLAUDE.md), episodic (.claude/memory/), per-agent. Cowork: about-me.md, brand-voice.md, working-preferences.md.

🧠 MEM

🧠 Memory Layer

CLAUDE.md (project policy for developers) + Cowork context files (about-me.md, brand-voice.md, working-preferences.md for non-developers) + .claude/memory/ (auto-managed episodic). Everyone on the team starts every session with the same knowledge base.

🛠 Skills Layer

Custom skills with context:fork and $ARGUMENTS for developers. Cowork plugins with domain-specific slash commands for non-developers. Both turn tribal knowledge into executable procedures.

🎛 Guardrails Layer

Hooks (20+ events, configured via /hooks) for Code. Safety defaults in global instructions for Cowork. Both enforce deterministic behavior — formatting, testing, deletion protection, assumption flagging.

🔗 Connection Layer

MCP servers for Code (claude mcp add). Connectors (60+) for Cowork. Same Model Context Protocol under the hood. Connect once, use everywhere — Jira, Slack, Obsidian, Drive, Snowflake.

🛠 Distribution Layer

Plugins bundle everything into one-command installs. Private marketplaces for enterprise governance. Auto-install for new hires. Same plugin format works for both Code and Cowork.

The Onboarding Test

The ultimate test of a Native-level team: A new engineer clones the repo, runs claude, and ships a feature on day one — without reading a wiki. A new marketing hire opens Cowork, installs the team plugin, and produces a campaign brief matching the brand voice — without a style meeting. The team's expertise lives in the harness, not in people's heads.

Case Study: How Team Acme Set This Up

A 12-person SaaS team (8 engineers, 2 marketers, 1 PM, 1 designer) rolled out Claude Code + Cowork over 4 weeks. Here's what they built at each layer and what happened when a new hire joined.

Week 1: Memory Layer

The tech lead wrote a CLAUDE.md covering their TypeScript/Fastify/Prisma stack, test commands, naming conventions, and architectural constraints (no raw SQL, no direct DB access outside the repository layer). The marketing lead created about-me.md (company positioning), brand-voice.md (with 3 real blog post samples), and working-preferences.md ("always ask before sending emails, never make claims about pricing without checking the pricing page"). Both were iterated over the week as Claude's output revealed gaps.

Week 2: Skills + Agents Layer

Engineers built 4 skills: /pr-review (their code review checklist encoded), /deploy $ARGUMENTS (staging/prod with safety checks), /deep-research (context:fork exploration), and /onboard-module $ARGUMENTS (explains any module to a new hire). They added a security-reviewer agent with persistent memory. The PM created a Cowork plugin with /sprint-summary and /stakeholder-update slash commands.

Week 3: Guardrails + Connections

The team added 3 hooks: auto-prettier on edit, auto-test on write, and a PreToolUse block on rm -rf and direct production DB access. They connected GitHub (issues/PRs), Slack (team channel read), and their PostgreSQL staging database via MCP. Marketing connected Google Drive, Slack, and HubSpot via Cowork connectors.

Week 4: Distribution

Everything was bundled into two plugins: acme-eng-toolkit (skills + agents + hooks for engineers) and acme-gtm-toolkit (Cowork plugin for marketing/PM). Both published to a private GitHub marketplace. The admin configured auto-install for new team members.

The result: New engineer, day one

Sarah joined on Monday. She cloned the repo, ran claude, and the plugin auto-installed. CLAUDE.md loaded the conventions. She ran /onboard-module auth and got a narrated walkthrough of the auth module. She asked Claude to add input validation to a form, reviewed the diff, and submitted a PR. The /install-github-app bot reviewed her PR before lunch. By end of day, she'd merged her first feature — without a single "how do we do X here?" Slack message.

The result: New marketer, day one

Jake opened Cowork and installed acme-gtm-toolkit. His context files loaded the brand voice. He ran /sprint-summary and Cowork pulled the latest sprint data from Jira via connector, cross-referenced with Slack discussions, and produced a stakeholder update matching the team's communication style. He edited two sentences and sent it. His manager said it read like someone who'd been on the team for months.

What Your Repo Looks Like

your-project/
├── CLAUDE.md                 # project policy
├── .claudeignore             # secrets out of context
├── .claude/
│   ├── settings.json         # team config + hooks
│   ├── skills/
│   │   ├── pr-review/        # code review workflow
│   │   ├── deploy/           # deploy with $ARGUMENTS
│   │   └── deep-research/    # context:fork exploration
│   ├── agents/
│   │   ├── security-reviewer.md  # OWASP scanner with memory
│   │   └── doc-writer.md     # documentation agent
│   ├── mcp.json              # connected tools
│   └── memory/               # auto-managed (gitignored)
├── cowork/
│   ├── about-me.md           # team context for Cowork users
│   ├── brand-voice.md        # communication style
│   └── working-preferences.md # behavior rules
├── .github/
│   └── workflows/
│       └── claude-review.yml # headless PR review
└── plugin/
    ├── .claude-plugin/plugin.json  # team plugin manifest
    └── ...                   # bundled skills, agents, hooks

Cowork Reference: 30 Best Practices

Once your team has the five-layer harness in place, the next step is operating discipline. These 30 practices — compiled from Nav Toor's foundational guide, Anthropic's official documentation, and community workflows — map directly to the layers above and fill the gaps that setup alone doesn't address.

Implementation Checklist (from the guide)

Today (30 min)	Create a dedicated working folder. Create your three context files (about-me.md, brand-voice.md, working-preferences.md). Set your Global Instructions. This alone puts you ahead of 95% of Cowork users.
This week	Add a _MANIFEST.md to your most-used project folder. Install 2–3 role-matched plugins. Set up one scheduled task. Start a decisions-log.md. Use Chat to draft a complex Cowork prompt before running it.
This month	Build your first custom skill for your most repeated workflow. Experiment with subagents on a complex research task. Refine context files based on output quality. Add self-review steps to high-stakes skills.
Monthly (ongoing)	Run a system audit: update manifests, context files, and skills. Mark superseded decisions. Identify new workflows to automate. Review every output before it ships — always.

The gap most teams don't close: Practices 18–22 (continuity and QA), which cover decisions-log.md for session handoffs, self-review steps in skills, rollback protocols for file operations, incremental refinement instead of full regeneration, and monthly system audits. These are what separate teams that get consistent results from teams that get impressive demos. Full guide: 30-cowork-best-practices.pdf

Checkpoint: Your team repo has all five layers configured. A new contributor (developer or non-developer) has verified they can start their tool and be productive immediately. The team's knowledge is externalized into the harness.

Synthesis

★ Capstone

⏱ 4–8 hrs total

AI-Augmented Team Workflow

Build a complete, portable team setup demonstrating every level of the course — for both Code and Cowork surfaces. This is your portfolio piece.

Objective

Produce a working repo that another team could clone and be productive with immediately. This proves you've internalized externalization — your knowledge lives in the harness, not in your head.

Capstone Tracks

Three complete outcomes — each one is a genuine win. Pick the track that fits your context.

◆ Solo developer → Core Pass is your target. It proves you can externalize a full dev workflow.

◆ Individual contributor with CI access → Professional Pass. Your setup runs while you sleep.

◆ Tech lead or team builder → Native/Team Pass. Your knowledge lives in the harness, not your head.

Core Pass: The complete solo developer setup

A self-contained AI-augmented development workflow. No external accounts, no team required. This is a complete outcome — not a stepping stone.

CLAUDE.md with @file references — under 200 lines, stack/commands/conventions/don'ts
Three custom skills: one static, one with $ARGUMENTS, one with context:fork
One custom subagent with memory: user (persistent memory)
Two hooks wired in settings.json — one formatting (PostToolUse), one safety (PreToolUse)
One MCP connection (local filesystem MCP is fine — no Obsidian account needed)

Professional Pass Your workflow runs in CI, not just locally

Everything in Core Pass, plus automated review that runs on every PR — your standards enforced even when you're not at the keyboard.

Plugin package with plugin.json — bundles all skills, agents, hooks, and MCP config
CI headless task in GitHub Actions — claude -p for automated PR review or security scan

Native/Team Pass Your expertise ships with the repo

Everything in Professional Pass, plus your institutional knowledge encoded for teammates and non-developers. A new hire is productive on day one without reading a wiki.

Cowork context files — about-me.md, brand-voice.md, working-preferences.md with real content
Custom Cowork plugin for a non-developer role (marketing, ops, legal, sales)
Teammate (or simulated) validation — someone else installs and uses your setup on day one

Scoring Rubric

Score each requirement: 0 = not present, 1 = present but basic, 2 = production-quality. Minimum to pass: 14/18 on Code + 2/4 on Cowork.

#	Requirement	0	1	2
1	CLAUDE.md — under 200 lines, @file refs, stack/commands/conventions/don'ts	☐	☐	☐
2	Skill: static — works from slash menu, useful for your project	☐	☐	☐
3	Skill: $ARGUMENTS — dynamic, accepts parameters	☐	☐	☐
4	Skill: context:fork — runs in subagent, returns summary	☐	☐	☐
5	Custom agent — memory: user, permissionMode, has real MEMORY.md content	☐	☐	☐
6	Hook: formatting — PostToolUse auto-format fires reliably	☐	☐	☐
7	Hook: safety — PreToolUse blocks destructive commands	☐	☐	☐
8	MCP connection — connected, used by a skill or agent	☐	☐	☐
9	Plugin package — plugin.json, installable, all components bundled	☐	☐	☐

Code score: ___ / 18 · Minimum pass: 14

#	Cowork Requirement	0	1	2
C1	Context files — about-me.md, brand-voice.md, working-preferences.md with real content	☐	☐	☐
C2	Custom Cowork plugin — for a non-developer role, with skills + connectors	☐	☐	☐

Cowork score: ___ / 4 · Minimum pass: 2
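
The two thresholds above reduce to simple sums. The following is a minimal sketch of that arithmetic, assuming you record your nine Code scores and two Cowork scores by hand. The function names sum_scores and capstone_result are hypothetical and not part of any course tooling.

```shell
# Hypothetical helper: sums rubric scores and applies the pass thresholds
# stated in the rubric (Code >= 14 of 18, Cowork >= 2 of 4).

sum_scores() {
  # Adds up its arguments, rejecting anything that is not 0, 1, or 2.
  local total=0 s
  for s in "$@"; do
    case "$s" in
      0|1|2) total=$((total + s)) ;;
      *) echo "invalid score: $s" >&2; return 1 ;;
    esac
  done
  echo "$total"
}

capstone_result() {
  # $1 = space-separated Code scores, $2 = space-separated Cowork scores.
  # The arguments are left unquoted on purpose so they split into words.
  local code cowork
  code=$(sum_scores $1) || return 1
  cowork=$(sum_scores $2) || return 1
  if [ "$code" -ge 14 ] && [ "$cowork" -ge 2 ]; then
    echo "PASS (Code $code/18, Cowork $cowork/4)"
  else
    echo "NOT YET (Code $code/18, Cowork $cowork/4)"
  fi
}

# Example: nine Code scores, then two Cowork scores.
capstone_result "2 2 1 2 1 2 2 1 1" "1 1"
```

The example call prints PASS (Code 14/18, Cowork 2/4), which sits exactly on both thresholds. Any single Code score dropping by one point would flip the result.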

Scoring Guide

Score 0 — Not Present

The requirement doesn't exist in the repo. Nothing to evaluate.

Score 1 — Present but Basic

The component exists and technically works, but uses generic/placeholder content. A CLAUDE.md that says "TypeScript project" without specific conventions. A skill that works but does something trivial. A hook that fires but doesn't catch real issues.

Score 2 — Production Quality

The component is specific to your project, handles edge cases, and a real team member would find it useful on day one. A CLAUDE.md with your actual stack, commands, and architectural constraints. A skill that encodes your team's actual review process. A hook that catches your project's actual risk patterns.

The Onboarding Test

How to validate your track

Core: Run git clone . /tmp/test && cd /tmp/test && claude — does Claude know your project without explanation?

Professional: Open a PR — does the CI workflow run and produce a useful review?

Native/Team: Hand the repo to a colleague — can they ship something on day one?

The ultimate validation: hand your repo to someone who has never seen it. Give them 10 minutes with Claude Code (or Cowork). Can they produce useful work? If yes, you've built a team operating system. If not, your score will tell you which layer is weakest.

Self-assessment questions

Could you explain every component to a skeptical peer?

Could you defend your design decisions?

Could you extend the setup if requirements changed?

Could you teach a peer on the other surface (Code ↔ Cowork)?

Does the plugin install cleanly on a fresh machine?

🎉

Share your capstone in Show & Tell

Post your repo link · Get feedback · See what others built


Reference

⏱ reference

Full Command Reference

Every command, shortcut, and flag — organized by maturity level. The Cowork column shows the equivalent on the non-code surface.

Basic — Curious (Level 2)

Command	Purpose	Cowork
`curl ... | bash`	Install (recommended)	Install Desktop app
`claude auth login`	Authenticate	App sign-in
`claude doctor`	Verify install	—
`/powerup`	Built-in tutorial	—
`/help`	List commands	—
`/clear`	Fresh context	New session
`/compact`	Compress context	—
`/cost` · `/context`	Spend + context visualization	Settings → Usage
`/resume` · `/recap`	Session recovery + summary	Context files
`/btw <q>`	Side question (zero context cost)	—
`/branch` · `/rewind`	Fork / revert conversation	—
`/effort`	low / med / high / xhigh / auto	—
`Shift+Tab`	Cycle modes	"Ask before starting"
`--model / opusplan`	Model selection	—
`--auto`	AI-classified auto-approve	—
`Cmd+Esc`	VS Code launch	—
`Esc`	Interrupt	—

Intermediate — Fluent (Level 3)

Command	Purpose	Cowork
`claude init`	Bootstrap CLAUDE.md	Manual context files
`@filename`	Inline file ref	"Read this folder"
`/agents` · `Ctrl+B`	Subagents + background	Auto sub-agents
`/hooks` · `/permissions`	Hooks + permission rules	Global instructions
`/diff`	Interactive diff viewer	—
`/mcp` · `claude mcp add`	MCP servers	Connectors (60+)
`/plugin` · `/reload-plugins`	Plugin management	Plugin library (dept-specific)
`typescript-lsp` etc.	LSP plugins (11 languages)	—
`/debug` · `/simplify`	Bundled skills	—
`/claude-api`	Self-documentation	—
`claude -p`	Headless one-shot	—
`/install-github-app`	Always-on PR review bot	—
`WebSearch / WebFetch`	Built-in web tools	Claude in Chrome
`Ctrl+O`	Verbose thinking	—

Advanced — Native (Level 4)

Command	Purpose	Cowork
`/loop` · `/schedule`	In-session + persistent scheduling	Scheduled Tasks / Routines
`/ultrareview`	Parallel multi-agent PR review	—
`/security-review`	Vulnerability scan on diff	—
`/voice`	Push-to-talk	—
`/batch`	Parallel execution	Auto-parallel sub-agents
`claude -w <branch>`	Git worktree	Projects
`Ctrl+F`	Kill bg agents	—
`--max-budget-usd N`	Cost cap per session	—
`--max-turns N`	Turn limit (CI safety)	—
`--bare`	Skip plugins/hooks (fast startup)	—
`/install-slack-app`	Claude in Slack	Slack connector
`context: fork` + `agent:`	Skill → auto-subagent	Plugin Create
`$ARGUMENTS`	Dynamic skill params	—
`plugin.json`	Plugin manifest	Same format
`--scope`	Install scope	Org marketplace
`Agent SDK`	Programmatic access	—
—	—	Dispatch (phone → desktop)
—	—	Computer Use (screen control)

Reference

⏱ reference

Version Matrix

Which course content requires which Claude Code version. Use this to identify what changed when you update.

How to use this page: Run claude --version to check your current version. If a feature row shows a higher version than you have, run claude update or check the install page for the native binary installer.
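
Comparing your installed version against a Min Version cell can be scripted. This is a rough sketch, not an official tool: version_ge is a hypothetical helper, and the commented usage assumes that the output of claude --version contains a dotted version number somewhere in its first line.

```shell
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Accepts table-style values such as "v2.1.84+" for the minimum.
version_ge() {
  local have="${1#v}" need="${2#v}" i h n
  need="${need%+}"
  # Pad both sides so short versions like "2.1" compare as "2.1.0".
  have="$have.0.0"
  need="$need.0.0"
  for i in 1 2 3; do
    h=$(printf '%s\n' "$have" | cut -d. -f"$i")
    n=$(printf '%s\n' "$need" | cut -d. -f"$i")
    if [ "$h" -gt "$n" ]; then return 0; fi
    if [ "$h" -lt "$n" ]; then return 1; fi
  done
  return 0
}

# Usage sketch (assumes the version number appears in the command output):
# installed=$(claude --version | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
# version_ge "$installed" "v2.1.84+" && echo "ok" || echo "update needed"
```

The comparison is numeric per field, so 2.1.9 correctly counts as older than 2.1.84, which a plain string comparison would get wrong.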

Feature Version Requirements

Feature	Module	Min Version	Plan Required	Status
Core CLI (`claude`, `/help`, `/clear`)	M05	Any	Any paid	Stable
CLAUDE.md + `claude init`	M10	Any	Any paid	Stable
Skills + `$ARGUMENTS`	M11	Any	Any paid	Stable
Built-in subagents (Explore, Plan)	M12	Any	Any paid	Stable
MCP servers (`.mcp.json`)	M14	Any	Any paid	Stable
Hooks (PostToolUse, PreToolUse)	M15	Any	Any paid	Stable
Custom subagents + `memory: user`	M17	Any	Any paid	Stable
Git worktrees (`claude -w`)	M19	v2.0+	Any paid	Stable
Headless mode (`claude -p`)	M20	Any	Any paid	Stable
Plugin marketplace (`/plugin install`)	M13	v2.0+	Any paid	Stable
1M token context window	M06	v2.1+	Max / Team / Enterprise	Stable
Private plugin marketplaces	M13	v2.1+	Enterprise	Stable
`opusplan` hybrid mode	M06	v2.1.84+	Max / Enterprise	Stable
Auto mode (`claude --auto`)	M06	v2.1.84+	Any paid	Stable
LSP plugin (type-aware editing)	M13	v2.1+	Any paid	Stable
Background scheduling (`/schedule`)	M18	v2.1.84+	Pro / Max	Stable
Voice control (`/voice`)	M18	v2.1.84+	Max	Preview
Cowork Routines (cloud-hosted)	M18	—	Pro / Max	Preview
Computer Use (screen control)	M03	—	Max / Enterprise	Preview

Course Version History

Course Version	Date	Vetted Against	Key Changes
`v1.0`	May 2026	Claude Code v2.1.84	Initial release. 24 modules, 3 labs.
`v1.1`	May 2026	Claude Code v2.1.84	21 product gaps fixed. Install method updated to native binary.
`v1.2`	May 2026	Claude Code v2.1.84	Cheat sheet cross-reference. 18 items added. Auto mode. WebSearch/WebFetch.
`v1.3`	May 2026	Claude Code v2.1.84	Full content expansion. 7 diagrams. All Levels 0–4 elaborated.
`v1.4`	May 2026	Claude Code v2.1.84	Time estimates. Scored capstone rubric. Glossary. FAQ. Try-it exercises.
`v1.5`	May 2026	Claude Code v2.1.84	CCI backlog (20 items): track split, skill templates, hook schema, offline fallbacks, export progress, print mode, high contrast, command filters, transcript, lab skeletons, reference capstone branch.

How to Stay Current

When you see a new Claude Code release

1. Run claude update to get the latest binary.

2. Check the Preview rows in the Feature Version Requirements table above — previews are the most likely to change.

3. Run claude doctor to verify your install is healthy after the update.

4. If a module stops matching your Claude Code behavior, check the course changelog in the Glossary page — a course update may already be available.

Version symbols used in this course

Stable — shipped and generally available, behavior unlikely to change
Preview — research preview or early access; behavior may change; plan requirements may shift

Synthesis

⏱ 10 min

Assessment Rubric

Calibrate your progress at the end of each level. Use this for self-assessment or peer evaluation.

Your Maturity Journey

○ Unaware: "What is this?" → ◐ Curious: "I can use it" → ◉ Fluent: "I rely on it daily" → ● Native: "My team runs on it"

Area	Curious (L2)	Fluent (L3)	Native (L4)
Agent Loop	Names 5 steps, maps to features	Uses loop to diagnose problems	Designs multi-loop pipelines
Code + Cowork	Uses one tool effectively	Maps concepts across both	Team OS for both surfaces
Prompting	Explore → ask → edit	Outcome-based, XML-structured	Headless + SDK pipelines
Memory	/clear and /compact	CLAUDE.md + .claude/memory/	Three-layer memory + context rot
Skills	Uses /debug, /simplify	Custom skills with $ARGS + fork	Publishes and distributes plugins
Safety	.claudeignore basics	/hooks UI + safety defaults	OTEL tracing + audit in production
Team Impact	Personal productivity	Shares skills with team	New hires productive day 1

Self-Assessment Prompts

After Level 2 (Curious)

"Can I accept a diff, manage context, and use plan mode? Can I explain why the agent loop matters?"

After Level 3 (Fluent)

"Do I have a CLAUDE.md, at least one skill, and a hook running automatically? Can I debug with /debug and delegate to subagents?"

After Level 4 (Native)

"Can I build a custom agent team, publish a plugin, and run Claude in CI? Can a new hire be productive using my setup?"

Cross-surface test

"Could I teach this to a peer who uses the other surface (Code ↔ Cowork)? Could I explain why the concepts transfer?" If yes, you've internalized the mental model, not just the tool.

The course is complete when your expertise is externalized — it lives in the harness, compounds with every session, transfers to every new team member, and works across both surfaces. You are the architect. The harness is the system.

Artifact Taxonomy

These terms are used precisely throughout the course. If they blur together, use this table:

Term	What it is	How invoked	Cowork equivalent
Skill	A SKILL.md file with YAML frontmatter + instruction body. A persistent prompt, not compiled code.	Slash command (/pr-review) or by Claude automatically when relevant	Skill file in a Cowork plugin folder
Slash command	The /name shortcut that invokes a skill. Slash command = the trigger. Skill = the instruction behind it.	Type / in any session to see the menu	Same — Cowork skills appear in the slash menu
Agent	A .claude/agents/name.md file. Defines a specialist Claude instance with its own model, tools, memory, and permissions.	Dispatched from a skill (context:fork), called by name, or auto-invoked when relevant	Sub-agent inside a Cowork plugin
Hook	A shell command (or http/mcp_tool/prompt/agent) wired to a lifecycle event in settings.json. Always runs — deterministic, not probabilistic.	Automatically at the wired event (e.g., every Edit → PostToolUse)	Global instruction safety defaults (approximate equivalent)
Plugin	A folder with .claude-plugin/plugin.json bundling skills + agents + hooks + MCP config. Installable in one command.	/plugin install or marketplace browse	Cowork plugin (same format, simpler frontmatter)
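
The Hook row above describes a shell command that always runs at its event. As an illustration of what a safety check might look like, here is a small sketch. The function names and the blocklist are hypothetical, and the use of exit status 2 to signal a block is an assumption. How the command string reaches the script is not shown, so check the hook schema in the hooks module before wiring it up.

```shell
# Hypothetical PreToolUse safety check. The blocklist is only an example;
# replace it with your project's real risk patterns.
is_destructive() {
  # Returns 0 (true) when the command string matches a blocked pattern.
  case "$1" in
    *"rm -rf"*|*"git push --force"*|*"git reset --hard"*|*"DROP TABLE"*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

guard() {
  # Prints a reason to stderr and returns 2 for blocked commands, 0 otherwise.
  # Assumption: the hook runner treats status 2 as "block this tool call".
  if is_destructive "$1"; then
    echo "Blocked by safety hook: $1" >&2
    return 2
  fi
  return 0
}
```

Because the check is plain pattern matching, it gives the same answer every time for the same command, which is the deterministic behavior the table contrasts with probabilistic model output. The patterns are deliberately broad, so a harmless rm -rf of a build folder is blocked too.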

Capstone Evidence Table

For each capstone requirement, here is what to submit and what passes:

Requirement	Expected file / output	Passes if…
CLAUDE.md	CLAUDE.md at repo root	Has stack, commands, conventions, and at least one @file reference. Under 200 lines.
Static skill	.claude/skills/[name]/SKILL.md	Valid frontmatter. Body contains meaningful instructions Claude follows when invoked. Appears in / menu.
$ARGUMENTS skill	.claude/skills/[name]/SKILL.md with $ARGUMENTS in body	$ARGUMENTS is used meaningfully — not just appended. Invoking with and without args produces different behaviour.
context:fork skill	.claude/skills/[name]/SKILL.md with context: fork in frontmatter	Invocation triggers a subagent (visible in /agents Running tab) and returns a summary to the main session.
Custom agent	.claude/agents/[name].md with memory: user	Agent has a meaningful description, defined tools list, and model. A .claude/memory/ file exists after first invocation (gitignored — screenshot as evidence).
Formatting hook	.claude/settings.json with PostToolUse hook	Making a code edit triggers the formatter automatically (show hook output in terminal screenshot).
Safety hook	.claude/settings.json with PreToolUse hook	Asking Claude to run a blocked command produces a block message (show blocked output).
MCP connection	.mcp.json or .claude/mcp.json	At least one MCP server configured. /mcp shows it as connected. Local filesystem MCP is sufficient.
Plugin	plugin/.claude-plugin/plugin.json with bundled skills + agent + hook	npm test && npm run build pass. Plugin installs locally with /plugin install ./plugin.
CI workflow	.github/workflows/claude-review.yml	Workflow runs on PR. Claude posts a review comment. Screenshot of the comment on a real PR.
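
Part of the evidence table can be checked mechanically before you submit. The sketch below covers only the file-presence criteria and the two measurable CLAUDE.md criteria. check_capstone is a hypothetical helper, its @ check is deliberately crude (any @ followed by a path-like token counts), and it cannot judge whether the content is meaningful.

```shell
# Hypothetical pre-submission check based on the evidence table above.
# Verifies file presence plus the CLAUDE.md size and @file criteria only.
check_capstone() {
  local root="${1:-.}" failed=0 f lines
  for f in CLAUDE.md .claude/settings.json; do
    if [ ! -f "$root/$f" ]; then echo "missing: $f"; failed=1; fi
  done
  if [ -f "$root/CLAUDE.md" ]; then
    # tr strips the padding that some wc implementations add.
    lines=$(wc -l < "$root/CLAUDE.md" | tr -d ' ')
    if [ "$lines" -ge 200 ]; then
      echo "CLAUDE.md has $lines lines (must be under 200)"; failed=1
    fi
    if ! grep -Eq '@[A-Za-z0-9_./-]+' "$root/CLAUDE.md"; then
      echo "CLAUDE.md has no @file reference"; failed=1
    fi
  fi
  # At least one skill and one agent definition must exist.
  if ! ls "$root"/.claude/skills/*/SKILL.md >/dev/null 2>&1; then
    echo "missing: a SKILL.md"; failed=1
  fi
  if ! ls "$root"/.claude/agents/*.md >/dev/null 2>&1; then
    echo "missing: an agent file"; failed=1
  fi
  # The MCP config may live in either location listed in the table.
  if [ ! -f "$root/.mcp.json" ] && [ ! -f "$root/.claude/mcp.json" ]; then
    echo "missing: MCP config"; failed=1
  fi
  if [ "$failed" -eq 0 ]; then echo "all file checks passed"; fi
  return "$failed"
}

# Usage: check_capstone /path/to/repo
```

Running it against a fresh clone rather than your working directory mirrors the Core onboarding test, since it only sees files that are actually committed.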

Reference

⏱ reference

Glossary, FAQ & What's Next

Every term defined. Top questions answered. Where to go after this course.

Glossary

Agent Loop

Perceive → Reason → Act → Observe → Iterate. The fundamental cycle that all AI agents follow. Every Claude Code and Cowork feature maps to a step in this loop.

Externalization

The framework for this course. Externalizing memory, skills, and protocols into a harness so the agent (and your team) can access them reliably. The idea draws on Donald Norman's theory of cognitive artifacts.

Harness

The coordination layer around the agent: plan mode, hooks, permission modes, observability, sandboxing. The harness controls how the agent acts — deterministically, not probabilistically.

Context Rot

Quality degradation as conversations grow. Older context gets compressed or dropped, causing Claude to drift from instructions. Mitigated by /compact, /btw, subagent delegation, and CLAUDE.md.

CLAUDE.md

Project-level semantic memory. Loaded every session, survives compaction. Contains stack, commands, conventions, and don'ts. The most important file for Claude Code.

context:fork

Skill frontmatter that runs the skill in an isolated subagent. Your main context stays clean. Results are summarized back. Used for exploration and research tasks.

$ARGUMENTS

Variable in skill body that captures parameters from the slash command. /deploy staging passes "staging" as $ARGUMENTS.
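
To make the substitution concrete, the sketch below mimics it with sed. This is an illustration only: Claude Code performs the real substitution itself, render_skill is a hypothetical stand-in, and the skill body shown is invented for the example.

```shell
# Illustration only: mimics how a skill body with $ARGUMENTS reads after
# invocation. render_skill is a hypothetical stand-in, not a Claude Code API.
render_skill() {
  # $1 = skill body containing the literal text $ARGUMENTS
  # $2 = the text typed after the slash command
  # Note: arguments containing "|" or "&" would need escaping for sed.
  local body="$1" args="$2"
  printf '%s\n' "$body" | sed "s|\\\$ARGUMENTS|$args|g"
}

# Single quotes keep $ARGUMENTS literal so the shell does not expand it.
render_skill 'Deploy the current branch to $ARGUMENTS and report the result.' 'staging'
```

The example prints "Deploy the current branch to staging and report the result.", which is the instruction text the model would effectively see for /deploy staging.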

MCP

Model Context Protocol. Open standard for connecting AI agents to external tools. Used by both Claude Code (MCP servers) and Cowork (connectors). Authentication via OAuth or API keys.

Hook

A shell command (or HTTP call, MCP tool, LLM prompt, or agent) that fires at a specific lifecycle event. Unlike LLM behavior, hooks are deterministic — they always run.

Subagent

A nested agent loop running in isolated context. Built-in: Explore (read-only, Haiku), Plan (analytical), General-purpose (full tools). Custom: defined in .claude/agents/.

Plugin

A package bundling skills, agents, hooks, and MCP configs into a single installable unit. Distributed via marketplaces (GitHub repos). Same format for Code and Cowork.

LSP Plugin

Language Server Protocol integration giving Claude real-time type errors and go-to-definition as it edits. Available for 11 languages. Highest-impact plugin category.

opusplan

Hybrid mode: Opus plans the approach (deep reasoning), Sonnet executes the edits (fast, cheap). Best cost-quality tradeoff for complex refactoring.

Auto Mode

claude --auto. AI classifier approves routine actions (reads, grep) automatically while prompting for high-risk actions (writes, shell commands). The smart middle ground.

Dispatch

Cowork feature. Text a task from your phone → Cowork runs the workflow on your desktop → you return to finished files. Requires desktop app running.

Routines

Cloud-hosted Cowork automation that runs even when your laptop is closed. Distinct from desktop-bound Scheduled Tasks. Research preview.

Computer Use

Cowork's screen control. Claude can open apps, navigate browsers, run tools. Uses connectors first, browser second, screen control as last resort.

/btw

Side question at zero context cost. Uses the parent prompt cache. Never enters conversation history. Primary defense against context rot for meta-questions.

Worktree

claude -w <branch>. Isolated git worktree with its own directory and branch. Parallel Claude sessions without stashing or conflicts.

Headless Mode

claude -p "prompt". One-shot execution, no interactive session. For CI/CD, cron jobs, scripting. Use with --max-budget-usd for cost safety.
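
For CI use, the headless flag is usually paired with the two limits listed in the command reference. The following wrapper is a sketch under stated assumptions: headless_review is a hypothetical name, the default budget and turn limit are arbitrary example values, and only flags named in this course are used.

```shell
# Hypothetical wrapper for a CI step. Uses only flags named in this course:
# -p (headless), --max-budget-usd (cost cap), --max-turns (turn limit).
headless_review() {
  # $1 = prompt
  # $2 = budget in USD (example default: 1)
  # $3 = turn limit (example default: 10)
  local prompt="$1" budget="${2:-1}" turns="${3:-10}"
  claude -p "$prompt" --max-budget-usd "$budget" --max-turns "$turns"
}

# Example call (requires Claude Code to be installed and authenticated):
# headless_review "Review the changes in this PR for security issues." 2 15
```

Keeping both limits as parameters with defaults means a forgotten argument still results in a capped run instead of an unbounded one.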

Frequently Asked Questions

How much does Claude Code cost?

Claude Code is included with paid Claude plans (Pro $20/mo, Max $100–200/mo, Team $30/user/mo, Enterprise custom). API-key users pay per-token. Typical session: $0.10–$2.25 depending on model and length. Use /cost to check and --max-budget-usd to cap.

Does it work offline?

The default setup requires an internet connection to reach Anthropic's API. Your repository is not uploaded wholesale: only prompts, tool calls, and the file contents referenced in the conversation are sent. Enterprise users running Claude via AWS Bedrock, Google Vertex AI, or Microsoft Azure Foundry keep traffic inside their VPC. Local model inference (fully air-gapped) is not currently supported.

Can I use my company's API key?

Yes. Set ANTHROPIC_API_KEY as an environment variable. Also supports AWS Bedrock (CLAUDE_CODE_USE_BEDROCK), Google Vertex (CLAUDE_CODE_USE_VERTEX), and Microsoft Foundry (CLAUDE_CODE_USE_FOUNDRY).
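
When several of these variables are set on one machine, it helps to see which one a shell session is carrying. The sketch below is a hypothetical diagnostic, not part of Claude Code. The precedence order it uses is an assumption made for illustration and may not match how Claude Code resolves conflicts.

```shell
# Hypothetical diagnostic: reports which backend the variables named above
# point to. The precedence order here is an assumption, not documented behavior.
detect_provider() {
  if [ -n "${CLAUDE_CODE_USE_BEDROCK:-}" ]; then
    echo "bedrock"
  elif [ -n "${CLAUDE_CODE_USE_VERTEX:-}" ]; then
    echo "vertex"
  elif [ -n "${CLAUDE_CODE_USE_FOUNDRY:-}" ]; then
    echo "foundry"
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "api-key"
  else
    # Nothing set: the session would rely on the normal sign-in flow.
    echo "subscription-login"
  fi
}

detect_provider
```

Set the key in your shell profile or CI secret store rather than committing it, for example export ANTHROPIC_API_KEY="..." with the real value supplied by your company.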

Is my code sent to Anthropic?

Only the portions Claude needs to see — files it reads, commands it runs, diffs it reviews. Code is not used for training. Enterprise plans offer additional data governance controls. Use .claudeignore to exclude sensitive files.

Claude Code vs. Cursor vs. GitHub Copilot?

Copilot is autocomplete (Generation 1). Cursor is chat + edit with context (Generation 2.5). Claude Code is fully agentic (Generation 3) — it reads, plans, acts, verifies, and iterates autonomously. Different tools for different workflows; many developers use Claude Code for complex tasks and Cursor/Copilot for quick inline completions.

What's the difference between CLAUDE.md and .claude/rules/?

CLAUDE.md is a single project policy file at your repo root. .claude/rules/*.md are modular rule fragments that can be path-scoped. Use CLAUDE.md for rules under 200 lines; split into rules/ when the project is large or rules vary by directory.

Do I need Cowork if I have Claude Code?

They serve different audiences. Code is for developers working in codebases. Cowork is for knowledge workers (marketing, legal, finance, ops) working with documents, spreadsheets, and connected tools. Same architecture — learn one, understand both.

Can I use Claude Code for non-code tasks?

Yes. Claude Code reads and writes any text file — Markdown, JSON, YAML, CSV. Many people use it for documentation, data analysis, and content generation. However, Cowork is specifically designed for non-code work and provides a better UX for those use cases.

What's Next

You've completed the course. Here's where to go from here:

Community: GitHub Discussions

The course community lives at github.com/douglasmun/claude-code-course-starter/discussions. Ask questions, share what you built, get unstuck, and help other students. Organised into Q&A, Show & Tell, and Ideas categories.

Deepen: Claude Agent SDK

Build Claude Code into your own applications. The Agent SDK (Python/TypeScript) gives programmatic access to the full harness. Start with the official docs.

Expand: Custom MCP Servers

Build MCP servers for your internal tools — databases, APIs, proprietary systems. The MCP spec is open source. Your tools become first-class citizens in the agent loop.

Scale: Enterprise Deployment

Private plugin marketplaces, OTEL tracing, admin controls, SSO. If your team is larger than 10 people, the enterprise tier adds governance without sacrificing developer experience.

Contribute: Open Source Plugins

The plugin ecosystem is community-driven. Build a plugin for your domain. Publish to GitHub. Share with the awesome-claude-code directory.

Reference: Keep the Cheat Sheet Handy

This course is your learning path. The Claude Code Cheat Sheet v1.0 is your daily reference. The Starter Repo is your project template. Bookmark both.

Course Changelog

Version	Date	Changes
`v1.0`	May 2026	Initial release. 24 modules, 3 labs, capstone, full command reference. Vetted against Claude Code v2.1.84+ and Cowork GA (Apr 9, 2026).
`v1.1`	May 2026	Product vet: 21 gaps fixed (install method, model versions, Cowork GA, Dispatch, Routines, Computer Use, /hooks UI, context:fork, opusplan, message queuing, Ctrl+B, /powerup, 20+ hook events, .claude/memory/, claude auth, /debug + /simplify, plugin count, Projects, context rot, $ARGUMENTS, Cowork capstone).
`v1.2`	May 2026	Cheat sheet cross-reference: 18 items added (/btw, /effort, /context, --max-budget-usd, LSP plugins, permission rule syntax, /diff, token-saving tips, /schedule, /ultrareview, /security-review, /install-slack-app, /branch, /rewind, .claude/rules/*.md, hook types, ENABLE_PROMPT_CACHING_1H, /recap). Auto mode (claude --auto). WebSearch/WebFetch tools. /install-github-app. Image input as general capability. Troubleshooting & recovery. Cost management strategies.
`v1.3`	May 2026	Full course content expanded: all Levels 0–4, labs elaborated with phases/code/stacks/rubrics. 7 visual diagrams (journey map, model×effort matrix, memory layers stack, hook timeline, agent pipeline, architecture stack, maturity bars). Companion cheat sheet linked.
`v1.4`	May 2026	22 try-it exercises, prev/next navigation, time estimates, scored capstone rubric (0/1/2 scoring, pass thresholds). Glossary (20 terms), FAQ (8 questions), What's Next (5 paths). Module search. Code copy buttons. Print stylesheet. Version/plan tier notes. Research preview flags. Lab failure conditions. Author attribution. This changelog.