Three problems keep recurring when working with Claude Code daily: reformatting after every edit, losing context in long sessions, and losing work-in-progress after context compaction. The common thread is that all of them require manual intervention. Typing “format this file”, “remind me of this context”, or “where were we?” over and over isn’t automation; it’s a new chore.
Claude Code’s hook system automates these repetitive tasks. But what matters isn’t the hooks themselves, it’s which problems they solve. In this post, I’ll share a 4-layer approach, the problem each layer solves, and the results.
| Quick Reference | |
|---|---|
| Scope | Workflow automation with Claude Code hooks |
| Approach | Problem-driven: each layer solves a problem |
| References | Meta-RL (ICLR 2026), GrapeRoot benchmark, SashiDo boring agent, Axe CLI |
| Related concepts | Context Management, LLM Behavioral Decay Modes |
The Hook System in 30 Seconds
Claude Code’s event lifecycle has five main events: PreToolUse (before a tool call), PostToolUse (after a tool call), PreCompact (before context compaction), SessionStart (session start), and Stop (when Claude finishes responding). Each event can trigger a hook: a shell command, script, or HTTP call.
Hooks are defined in settings.json. The matcher field specifies which tools they apply to:
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [{ "type": "command", "command": "ruff format $FILEPATH" }]
      }
    ]
  }
}
For a comprehensive reference covering all 23 hook events, check shanraisshan’s claude-code-hooks repo. In this post, I’ll focus on which problem each hook solves rather than listing references.
Why Hooks? Isn’t CLAUDE.md Enough?
The instruction attenuation and ceremonialization concepts I covered in LLM Behavioral Decay Modes are critical here.
When you write “run tests after every change” in CLAUDE.md, that’s a probabilistic rule. The model’s compliance varies with context length, session duration, and topic. It runs tests for the first few changes, then by the tenth change writes “ran tests, passed.” Maybe it did, maybe it didn’t. The shell of the rule remains, its substance is gone: ceremonialization.
Hooks are deterministic rules. They run the same way every time, no exceptions, no ceremonialization. Probabilistic rules and deterministic controls need to work together. CLAUDE.md defines intent, hooks guarantee execution.
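To make the contrast concrete, here is a minimal sketch of a deterministic check that could back a “run tests after every change” rule. The helper name run_check is hypothetical, and the sketch assumes that a hook exiting with status 2 reports a blocking failure whose stderr is shown to the model, as Claude Code’s hook documentation describes.

```shell
#!/bin/bash
# run_check (hypothetical helper): run a command and translate any failure
# into exit status 2, with the command's combined output on stderr.
# Assumption: a hook that exits with 2 reports a blocking error to the model.
run_check() {
  if output=$("$@" 2>&1); then
    return 0
  fi
  printf 'check failed: %s\n%s\n' "$*" "$output" >&2
  return 2
}
```

Ending a hook script with something like run_check pytest -q (pytest is only an example) means the suite runs on every matching event, whether or not the model still remembers the rule.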
Layer 1: Auto-Format (PostToolUse)
Problem
Claude Code sometimes produces poorly formatted code. Indent inconsistencies in Python, missing semicolons in JavaScript, jumbled import ordering. The code works but the linter complains. Saying “format this file” is repetitive and unnecessary work.
Solution
The PostToolUse hook triggers automatically after the Edit or Write tool runs:
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "ruff format --quiet $FILEPATH"
          },
          {
            "type": "command",
            "command": "eslint --fix --quiet $FILEPATH 2>/dev/null || true"
          }
        ]
      }
    ]
  }
}
ruff handles the Python files and eslint the JavaScript/TypeScript files. The --quiet flag suppresses routine output. The trailing 2>/dev/null || true discards eslint’s error output and forces a zero exit status, so a non-JS file never makes the hook fail.
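If running both formatters on every file feels wasteful, one option is a small dispatcher that picks a formatter by extension. The following is a sketch, not the setup used in this post: the script name and the formatter_for helper are hypothetical, and it assumes the hook payload arrives as JSON on stdin (as in the checkpoint scripts later in this post) with the edited path under tool_input.file_path.

```shell
#!/bin/bash
# format-dispatch.sh (hypothetical name): pick a formatter by file extension.
# Assumption: the hook payload is JSON on stdin and the edited file's path
# is stored in .tool_input.file_path.

# Print the formatter command for a path, or nothing if none applies.
formatter_for() {
  case "$1" in
    *.py)                  echo "ruff format --quiet" ;;
    *.js|*.jsx|*.ts|*.tsx) echo "eslint --fix --quiet" ;;
    *)                     echo "" ;;
  esac
}

# Hook entry point: read the payload and format the file if a formatter applies.
main() {
  file=$(jq -r '.tool_input.file_path // empty')
  [ -n "$file" ] || return 0
  cmd=$(formatter_for "$file")
  [ -n "$cmd" ] || return 0
  # Word splitting of $cmd is intentional: it holds a command plus its flags.
  $cmd "$file" || true
}
```

Installed as a hook, the script would end with a call to main and could stand in for the two separate command entries.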
Result
Zero format errors. No “fix the formatting” comments in PR reviews. Every file Claude Code produces meets project standards.
Layer 2: Context Enrichment (PreToolUse)
Problem
When Claude Code reads a file, it doesn’t know that file’s context within the project. Which module it relates to, which API it uses, which conventions it should follow. Having to say “this file is part of module X, use convention Y” every time gets old fast.
Solution
The PreToolUse hook triggers before the Read or Grep tool runs. It injects relevant knowledge base chunks into context:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Read|Grep",
        "hooks": [
          {
            "type": "command",
            "command": "path/to/enrich-context.sh"
          }
        ]
      }
    ]
  }
}
This approach aligns with the pre-injection strategy demonstrated by the GrapeRoot benchmark. Compared to MCP tool call loops, pre-injection is 31% cheaper and uses 24% fewer turns [1]. Injecting relevant information into context before a file is read is more efficient than making a separate tool call via MCP.
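The enrichment script itself is project-specific, but its core is a lookup from file path to knowledge base entry. Here is a minimal sketch of that lookup only. The src/<module>/ and docs/kb/<module>.md layout and both function names are assumptions for illustration; how the script hands the notes back to Claude Code depends on the hook output contract and is deliberately left out.

```shell
#!/bin/bash
# Sketch of the lookup step inside an enrich-context script.
# Hypothetical layout: files under src/<module>/ have notes in docs/kb/<module>.md.

# Print the module name for a path, or an empty line if it has none.
# $1 = file path (absolute or relative), $2 = project root (may be empty).
module_of() {
  rel=${1#"$2"/}
  case "$rel" in
    src/*/*) rel=${rel#src/}; printf '%s\n' "${rel%%/*}" ;;
    *)       printf '\n' ;;
  esac
}

# Print the notes file for a path, or nothing if the path maps to no module.
kb_file_for() {
  mod=$(module_of "$1" "$2")
  if [ -n "$mod" ]; then
    printf 'docs/kb/%s.md\n' "$mod"
  fi
}
```

Keeping the mapping in a pure function makes it easy to check from a terminal before wiring it into a hook.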
Result
The “what does this file do?” question disappears. Claude Code reads every file with project context included. Repetitive context explanations are eliminated.
Layer 3: WIP Persistence (PreCompact + SessionStart)
Problem
In long sessions, Claude Code performs context compaction: summarizing old messages to save tokens. During this process, active work state (which branch, which files changed, which tasks are open) can be lost. Starting a new session with “where were we?” means spending the last 30 minutes of the previous session all over again.
Solution
Two hooks work together:
PreCompact (save checkpoint before compaction):
#!/bin/bash
# checkpoint-wip.sh: save the current git state before context compaction.
# The hook payload arrives as JSON on stdin; .cwd is the working directory.
INPUT=$(cat)
CWD=$(echo "$INPUT" | jq -r '.cwd // empty')
[ -z "$CWD" ] && exit 0
PROJECT_NAME=$(basename "$CWD")
STATE_DIR="$HOME/.claude/state"
mkdir -p "$STATE_DIR"   # the redirect below fails if the directory is missing
cd "$CWD" || exit 0
BRANCH=$(git branch --show-current 2>/dev/null)
MODIFIED=$(git diff --name-only 2>/dev/null | head -20)
STATUS=$(git diff --stat 2>/dev/null | tail -1)
# The last three commit subjects serve as the "reflection" text.
LAST_COMMITS=$(git log --oneline -3 2>/dev/null)
jq -n \
  --arg branch "$BRANCH" \
  --arg modified "$MODIFIED" \
  --arg status "$STATUS" \
  --arg reflection "$LAST_COMMITS" \
  --arg time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  '{branch: $branch, modified_files: ($modified | split("\n")),
    git_status: $status, reflection: $reflection, timestamp: $time}' \
  > "$STATE_DIR/wip-${PROJECT_NAME}.json"
SessionStart (restore in new session):
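#!/bin/bash
# restore-wip.sh: re-inject the saved checkpoint when a session starts.
INPUT=$(cat)
CWD=$(echo "$INPUT" | jq -r '.cwd // empty')
[ -z "$CWD" ] && exit 0
PROJECT_NAME=$(basename "$CWD")
CHECKPOINT="$HOME/.claude/state/wip-${PROJECT_NAME}.json"
[ ! -f "$CHECKPOINT" ] && exit 0
# 24-hour check. GNU stat uses -c %Y; BSD/macOS stat uses -f %m.
MTIME=$(stat -c %Y "$CHECKPOINT" 2>/dev/null || stat -f %m "$CHECKPOINT")
FILE_AGE=$(( $(date +%s) - MTIME ))
[ "$FILE_AGE" -gt 86400 ] && rm -f "$CHECKPOINT" && exit 0
BRANCH=$(jq -r '.branch' "$CHECKPOINT")
MODIFIED=$(jq -r '.modified_files | join(", ")' "$CHECKPOINT")
REFLECTION=$(jq -r '.reflection' "$CHECKPOINT")
CONTEXT="[WIP RECOVERED] Branch: $BRANCH | Changed: $MODIFIED | Last activity: $REFLECTION"
# SessionStart hooks return added context under hookSpecificOutput.
jq -n --arg ctx "$CONTEXT" \
  '{hookSpecificOutput: {hookEventName: "SessionStart", additionalContext: $ctx}}'
# The checkpoint is single-use: remove it once it has been restored.
rm -f "$CHECKPOINT"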
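The 24-hour expiry is the part of the restore script most likely to break across platforms, so it can help to pull it into functions that are testable on their own. This is a sketch under assumptions: file_mtime and is_expired are hypothetical names, and the optional third argument exists only so the current time can be injected in a test.

```shell
#!/bin/bash
# Portable modification time in epoch seconds: GNU stat first, then BSD/macOS.
file_mtime() {
  stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"
}

# Succeed if the file is older than max_age seconds (default: 24 hours).
# $1 = file, $2 = max age in seconds, $3 = current time (defaults to now).
is_expired() {
  max_age=${2:-86400}
  now=${3:-$(date +%s)}
  [ $(( now - $(file_mtime "$1") )) -gt "$max_age" ]
}
```

The comparison is strict, so a checkpoint that is exactly 24 hours old is still restored.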
The reflection field in the checkpoint puts a critical finding from Meta-RL research into practice. In 2026, three independent research groups (AI2, EPFL, Tsinghua) reached the same conclusion: giving an agent multiple attempts with reflection after each failure significantly improves performance. LaMer (EPFL, ICLR 2026) revealed a particularly interesting finding: keeping only the reflection text produces better results than full trajectory + reflection (80.5% vs 74.4%) [2].
This finding directly impacts WIP checkpoint design: keeping the last 3 commit messages as “reflection” is sufficient rather than saving the entire conversation history. Fewer tokens, better context.
SashiDo’s “AI-Assisted Programming That Actually Ships” approach follows a similar philosophy: a feature_list.json, a progress file, and an init script, with every session following the same loop [3]. Our WIP persistence hook is a lightweight version of this approach.
Result
Near-zero loss after context compaction. New sessions automatically start with previous state. Checkpoints auto-clean after 24 hours.
Layer 4: Standalone Agent Bridge (Axe + Docker)
Problem
Claude Code’s hook system is powerful but tied to a single ecosystem. Sometimes you want to use a different LLM for a different task: code review, log analysis, commit message generation. These tasks need a separate tool, but one that integrates with hooks.
Solution
Axe, developed by J.R. Swab, is a Go-based CLI tool designed around the Unix philosophy: each agent does one thing, is composable via stdin/stdout, and can be triggered from cron, git hooks, or pipes [4].
Running Axe via Docker and triggering it from Claude Code hooks bridges two ecosystems:
# Axe agent config (~/.config/axe/agents/code-reviewer.toml)
name = "code-reviewer"
description = "Reviews git diffs for bugs and security issues"
model = "anthropic/claude-haiku-4-5-20251001"
system_prompt = "You are a concise code reviewer. Focus on bugs, security issues, and logic errors. Max 5 bullet points."
skill = "skills/code-review/SKILL.md"
[params]
temperature = 0.2
max_tokens = 1024
On-demand usage:
git diff | docker run --rm -i \
-v ~/.config/axe:/home/axe/.config/axe:ro \
axe run code-reviewer
Test Results
First test, on a diff with intentional security vulnerabilities:
import os
import sqlite3

def get_user(user_id):
    conn = sqlite3.connect(os.environ["DB_PATH"])
    return conn.execute(f"SELECT * FROM users WHERE id = {user_id}").fetchone()
Axe’s code-reviewer agent detected 3 issues:
- SQL injection: user_id is directly interpolated into the query
- Resource leak: database connection is never closed
- Missing error handling: crashes if DB_PATH env var is missing
Second test, on a real project diff (content-intelligence pipeline): 4 legitimate issues detected, covering missing JSON parsing error handling, hash lookup validation, an unbounded query, and a file I/O race condition.
Trade-off
This approach is tool-agnostic and composable. But there’s a Docker container + LLM call cost with every invocation. That’s why I keep it as an on-demand script rather than triggering it after every edit. For those who want per-edit triggering, throttling (run only if 5 minutes have passed since last review) can be added.
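The throttling idea can be sketched as a small gate function. The name should_review and the stamp-file mechanism are assumptions for illustration; the 300-second default mirrors the five-minute interval mentioned above.

```shell
#!/bin/bash
# should_review (hypothetical helper): succeed at most once per interval.
# $1 = stamp file holding the epoch time of the last review
# $2 = minimum interval in seconds (default: 300, i.e. five minutes)
should_review() {
  stamp_file=$1
  min_interval=${2:-300}
  now=$(date +%s)
  last=0
  if [ -f "$stamp_file" ]; then
    last=$(cat "$stamp_file")
  fi
  if [ $(( now - ${last:-0} )) -ge "$min_interval" ]; then
    # Record this run so later calls inside the interval are skipped.
    printf '%s\n' "$now" > "$stamp_file"
    return 0
  fi
  return 1
}
```

A per-edit hook could call should_review with a stamp file path and pipe git diff into the Axe container only when the function succeeds, which caps the Docker and LLM cost at one review per interval.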
I discussed this integration with Axe maintainer J.R. Swab on Dev.to. Swab is working on triggering Axe agents in skill steps via OpenClaw. Triggering from editor hooks hadn’t been tried yet; this post is the first documented use case.
Result
Two ecosystems (Claude Code hooks + standalone Axe agents) working together. Hooks handle automatic tasks, Axe handles on-demand reviews.
Overall Results
| Problem | Hook | Result |
|---|---|---|
| Format errors | PostToolUse (ruff, eslint) | Zero manual format intervention |
| Context gaps | PreToolUse (knowledge enrichment) | Automatic context injection |
| Context compaction loss | PreCompact + SessionStart | Automatic WIP save/restore |
| Single ecosystem lock-in | Axe + Docker bridge | Tool-agnostic agent triggering |
Full Setup
settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "ruff format --quiet $FILEPATH" },
          {
            "type": "command",
            "command": "eslint --fix --quiet $FILEPATH 2>/dev/null || true"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Read|Grep",
        "hooks": [
          {
            "type": "command",
            "command": "path/to/enrich-context.sh"
          }
        ]
      }
    ],
    "PreCompact": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "path/to/checkpoint-wip.sh"
          }
        ]
      }
    ],
    "SessionStart": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "path/to/restore-wip.sh"
          }
        ]
      }
    ]
  }
}
Test
To test each hook:
- Auto-format: Edit a Python file with the Edit tool, verify ruff runs automatically
- WIP persistence: Modify a few files, run /compact, start a new session, verify the checkpoint is restored
- Axe bridge: git diff | docker run --rm -i -v ~/.config/axe:/home/axe/.config/axe:ro axe run code-reviewer
What’s Next?
Claude Code’s hook system is the infrastructure that separates AI coding assistants from AI coding partners. Each hook is simple on its own. But four layers working together create a workflow where the agent handles routine operations autonomously.
The important thing isn’t the hooks themselves. It’s identifying which problems are worth solving with hooks and which are better left to CLAUDE.md rules. Not every automation is useful. But the automations in this post share one common trait: each one eliminates a task I was doing manually every day.
Related Posts
- AI-Powered Codebase Audit: The role of hooks and workflow automation in 6-track audit cycles
Footnotes
[1] GrapeRoot benchmark: pre-injection vs MCP tool loop. 31% cost reduction, 24% fewer turns. Source: Reddit post, benchmark repo
[2] LaMer (EPFL, ICLR 2026): “Language Agents Meet Reinforcement Learning.” Reflection-only vs full trajectory + reflection comparison. Source: arXiv
[3] SashiDo “AI-Assisted Programming That Actually Ships”: feature_list.json + progress file + init script loop pattern. Source: Dev.to
[4] Axe CLI: Go-based, Unix philosophy, single-purpose LLM agents. TOML config + SKILL.md, stdin/stdout composability. Source: GitHub, Dev.to discussion
- 01 Hooks are the deterministic version of CLAUDE.md instructions: probabilistic rules ceremonialize over time, hooks run the same way every time
- 02 PostToolUse auto-format brings format errors to zero, ruff and eslint run automatically after every edit
- 03 PreCompact WIP persistence puts Meta-RL research's 'reflection-only > full trajectory' finding into practice
- 04 Axe CLI's stdin/stdout composability enables triggering LLM agents independent of hooks, running in isolation via Docker
- 05 The hook system is the key feature that sets Claude Code apart from other CLI agents (Codex, Gemini CLI)
What is a Claude Code hook?
Shell commands or scripts that Claude Code runs automatically at specific events (file editing, tool calls, context compaction, session start). They are defined in settings.json.
How are hooks different from CLAUDE.md rules?
CLAUDE.md rules are probabilistic: the model's compliance varies with context and session length. Hooks are deterministic: they run the same way every time, no exceptions.
Why is WIP persistence necessary?
In long sessions, context compaction summarizes older messages, and the active work state can be lost in the process. The PreCompact hook saves work state before compaction, and the SessionStart hook restores it in the new session. Checkpoints auto-clean after 24 hours.
What is Axe CLI?
A Go-based CLI tool following Unix philosophy that runs single-purpose LLM agents. Agents defined with TOML config + SKILL.md can be composed via stdin/stdout.