Why Hooks Matter: 4-Layer Workflow Automation

TL;DR

Claude Code's hook system transforms your AI coding agent from a passive executor into an active workflow partner. This post walks through a 4-layer approach: (1) PostToolUse for auto-format, bringing format errors to zero; (2) PreToolUse for context enrichment, automatically injecting relevant knowledge base chunks on every file read; (3) PreCompact/SessionStart for WIP persistence, saving work state before context compaction and restoring it in new sessions; (4) Axe CLI as a standalone agent bridge, triggering LLM agents independent of hooks. Each layer solves a concrete problem, and each solution is backed by academic or community references.

Three problems keep recurring when working with Claude Code daily: format checks after every edit, context loss in long sessions, and losing work-in-progress after context compaction. The common thread: all of them require manual intervention. Typing “format this file”, “remind me of this context”, “where were we?” over and over isn’t automation, it’s a new chore.

Claude Code’s hook system automates these repetitive tasks. But what matters isn’t the hooks themselves, it’s which problems they solve. In this post, I’ll share a 4-layer approach, the problem each layer solves, and the results.

Quick Reference
Scope	Workflow automation with Claude Code hooks
Approach	Problem-driven: each layer solves a problem
References	Meta-RL (ICLR 2026), GrapeRoot benchmark, SashiDo boring agent, Axe CLI
Related concepts	Context Management, LLM Behavioral Decay Modes

The Hook System in 30 Seconds

This post works with five events from Claude Code’s hook lifecycle: PreToolUse (before a tool call), PostToolUse (after a tool call), PreCompact (before context compaction), SessionStart (session start), and Stop (when Claude finishes responding). Each event can trigger a hook: a shell command, script, or HTTP call.

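Hooks are defined in settings.json. The matcher field specifies which tools they apply to. Claude Code passes the event details to the hook command as JSON on stdin, so the edited file’s path is read from tool_input.file_path with jq rather than from an environment variable:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [{ "type": "command", "command": "f=$(jq -r '.tool_input.file_path // empty'); [ -z \"$f\" ] || ruff format \"$f\"" }]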
      }
    ]
  }
}
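If you prefer a script over an inline command, the hook can parse the event payload itself. Here is a minimal Python sketch, assuming the payload carries tool_name and tool_input.file_path for Edit and Write calls (field names as I understand the hooks reference; verify them against your Claude Code version). The function name edited_path is my own.

```python
import json
from typing import Optional


def edited_path(payload: dict) -> Optional[str]:
    """Return the file path an Edit/Write call touched, or None."""
    if payload.get("tool_name") not in {"Edit", "Write"}:
        return None
    tool_input = payload.get("tool_input") or {}
    return tool_input.get("file_path") or None


if __name__ == "__main__":
    # In a real hook the payload would come from json.load(sys.stdin);
    # a hard-coded sample keeps this sketch runnable on its own.
    sample = json.loads(
        '{"tool_name": "Edit", "tool_input": {"file_path": "src/app.py"}}'
    )
    print(edited_path(sample))
```

Returning None for anything other than Edit or Write keeps the script safe to attach to a broader matcher.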

For a comprehensive reference covering all 23 hook events, check shanraisshan’s claude-code-hooks repo. In this post, I’ll focus on which problem each hook solves rather than listing references.

Why Hooks? Isn’t CLAUDE.md Enough?

The instruction attenuation and ceremonialization concepts I covered in LLM Behavioral Decay Modes are critical here.

When you write “run tests after every change” in CLAUDE.md, that’s a probabilistic rule. The model’s compliance varies with context length, session duration, and topic. It runs tests for the first few changes, then by the tenth change writes “ran tests, passed.” Maybe it did, maybe it didn’t. The shell of the rule remains, its substance is gone: ceremonialization.

Hooks are deterministic rules. They run the same way every time, no exceptions, no ceremonialization. Probabilistic rules and deterministic controls need to work together. CLAUDE.md defines intent, hooks guarantee execution.

Layer 1: Auto-Format (PostToolUse)

Problem

Claude Code sometimes produces poorly formatted code. Indent inconsistencies in Python, missing semicolons in JavaScript, jumbled import ordering. The code works but the linter complains. Saying “format this file” is repetitive and unnecessary work.

Solution

The PostToolUse hook triggers automatically after the Edit or Write tool runs:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
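          {
            "type": "command",
            "command": "f=$(jq -r '.tool_input.file_path // empty'); case \"$f\" in *.py) ruff format --quiet \"$f\" ;; esac"
          },
          {
            "type": "command",
            "command": "f=$(jq -r '.tool_input.file_path // empty'); case \"$f\" in *.js|*.jsx|*.ts|*.tsx) eslint --fix --quiet \"$f\" 2>/dev/null || true ;; esac"
          }
        ]
      }
    ]
  }
}

Each command reads the edited path from the JSON that Claude Code sends on stdin (tool_input.file_path). The case guard makes ruff run only on Python files and eslint only on JavaScript/TypeScript files. The --quiet flag suppresses unnecessary output. || true keeps an eslint failure from being reported as a hook error.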
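As the list of formatters grows, routing by file extension reads more clearly in a small script than in inline shell. The following is a sketch rather than my production hook: the FORMATTERS table and the formatter_command name are illustrative, and the commands are the same ruff and eslint invocations used in this post.

```python
from pathlib import Path
from typing import List, Optional

# Extension -> formatter command prefix (the edited file is appended).
FORMATTERS = {
    ".py": ["ruff", "format", "--quiet"],
    ".js": ["eslint", "--fix", "--quiet"],
    ".jsx": ["eslint", "--fix", "--quiet"],
    ".ts": ["eslint", "--fix", "--quiet"],
    ".tsx": ["eslint", "--fix", "--quiet"],
}


def formatter_command(file_path: str) -> Optional[List[str]]:
    """Return the argv for the matching formatter, or None if there is none."""
    prefix = FORMATTERS.get(Path(file_path).suffix.lower())
    return None if prefix is None else [*prefix, file_path]


if __name__ == "__main__":
    for path in ("api/views.py", "web/index.ts", "README.md"):
        print(path, "->", formatter_command(path))
```

In an actual hook, the returned list would be handed to subprocess.run(cmd, check=False) so that a formatter failure never blocks the edit.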

Result

Zero format errors. No “fix the formatting” comments in PR reviews. Every file Claude Code produces meets project standards.

Layer 2: Context Enrichment (PreToolUse)

Problem

When Claude Code reads a file, it doesn’t know that file’s context within the project. Which module it relates to, which API it uses, which conventions it should follow. Having to say “this file is part of module X, use convention Y” every time gets old fast.

Solution

The PreToolUse hook triggers before the Read or Grep tool runs. It injects relevant knowledge base chunks into context:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Read|Grep",
        "hooks": [
          {
            "type": "command",
            "command": "path/to/enrich-context.sh"
          }
        ]
      }
    ]
  }
}

This approach aligns with the pre-injection strategy demonstrated by the GrapeRoot benchmark. Compared to MCP tool call loops, pre-injection is 31% cheaper and uses 24% fewer turns¹. Injecting relevant information into context before a file is read is more efficient than making a separate tool call via MCP.
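To make the idea concrete, here is a hypothetical Python stand-in for an enrichment script. It is not the script I run: the KNOWLEDGE table, its notes, and the function names are invented for illustration, paths are assumed to be project-relative, and the hookSpecificOutput / additionalContext output shape is an assumption about the hook protocol that you should check against the hooks reference for your version.

```python
import json
from typing import Dict, Optional

# Hypothetical knowledge base: path prefix -> short project note.
KNOWLEDGE = {
    "billing/": "Billing module: amounts are integer cents.",
    "api/": "API layer: handlers must not touch the database directly.",
}


def context_for(file_path: str, knowledge: Dict[str, str]) -> Optional[str]:
    """Pick the note whose prefix matches the path (longest prefix wins)."""
    matches = [prefix for prefix in knowledge if file_path.startswith(prefix)]
    if not matches:
        return None
    return knowledge[max(matches, key=len)]


def hook_output(payload: dict, knowledge: Dict[str, str]) -> str:
    """Build the JSON a hook would print, or "" when nothing applies."""
    path = (payload.get("tool_input") or {}).get("file_path") or ""
    note = context_for(path, knowledge)
    if note is None:
        return ""
    # Assumed output shape; not confirmed by this post.
    return json.dumps({
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "additionalContext": note,
        }
    })


if __name__ == "__main__":
    sample = {"tool_name": "Read", "tool_input": {"file_path": "billing/invoice.py"}}
    print(hook_output(sample, KNOWLEDGE))
```

Longest-prefix matching lets a nested module override the note of its parent directory without extra configuration.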

Result

The “what does this file do?” question disappears. Claude Code reads every file with project context included. Repetitive context explanations are eliminated.

Layer 3: WIP Persistence (PreCompact + SessionStart)

Problem

In long sessions, Claude Code performs context compaction: summarizing old messages to save tokens. During this process, active work state (which branch, which files changed, which tasks are open) can be lost. Starting a new session with “where were we?” means redoing the last 30 minutes of the previous session.

Solution

Two hooks work together:

PreCompact (save checkpoint before compaction):

#!/bin/bash
# checkpoint-wip.sh: save branch, changed files, and recent commits before compaction
INPUT=$(cat)
CWD=$(echo "$INPUT" | jq -r '.cwd // empty')
[ -z "$CWD" ] && exit 0   # payload had no cwd: nothing to checkpoint
PROJECT_NAME=$(basename "$CWD")
STATE_DIR="$HOME/.claude/state"
mkdir -p "$STATE_DIR"      # the redirect at the end fails if this directory is missing

BRANCH=$(cd "$CWD" && git branch --show-current 2>/dev/null)
MODIFIED=$(cd "$CWD" && git diff --name-only 2>/dev/null | head -20)
STATUS=$(cd "$CWD" && git diff --stat 2>/dev/null | tail -1)
LAST_COMMITS=$(cd "$CWD" && git log --oneline -3 2>/dev/null)

# The last three commit subjects are stored as the "reflection" field
jq -n \
  --arg branch "$BRANCH" \
  --arg modified "$MODIFIED" \
  --arg status "$STATUS" \
  --arg reflection "$LAST_COMMITS" \
  --arg time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  '{branch: $branch, modified_files: ($modified | split("\n")),
    git_status: $status, reflection: $reflection, timestamp: $time}' \
  > "$STATE_DIR/wip-${PROJECT_NAME}.json"

SessionStart (restore in new session):

#!/bin/bash
# restore-wip.sh
INPUT=$(cat)
CWD=$(echo "$INPUT" | jq -r '.cwd // empty')
PROJECT_NAME=$(basename "$CWD")
CHECKPOINT="$HOME/.claude/state/wip-${PROJECT_NAME}.json"

[ ! -f "$CHECKPOINT" ] && exit 0

# 24-hour check (GNU stat first, then the BSD/macOS form)
MTIME=$(stat -c %Y "$CHECKPOINT" 2>/dev/null || stat -f %m "$CHECKPOINT")
FILE_AGE=$(( $(date +%s) - MTIME ))
[ "$FILE_AGE" -gt 86400 ] && rm -f "$CHECKPOINT" && exit 0

BRANCH=$(jq -r '.branch' "$CHECKPOINT")
MODIFIED=$(jq -r '.modified_files | join(", ")' "$CHECKPOINT")
REFLECTION=$(jq -r '.reflection' "$CHECKPOINT")

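CONTEXT="[WIP RECOVERED] Branch: $BRANCH | Changed: $MODIFIED | Last activity: $REFLECTION"
# SessionStart hooks return injected context under hookSpecificOutput
jq -n --arg ctx "$CONTEXT" \
  '{hookSpecificOutput: {hookEventName: "SessionStart", additionalContext: $ctx}}'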

rm -f "$CHECKPOINT"
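The freshness check is the part of the restore script most likely to break across platforms, because stat flags differ between macOS and Linux. As a portable alternative, here is a Python sketch of the same logic. The function names are mine, and the field names follow the checkpoint JSON written above.

```python
import json
import os
import tempfile
import time
from typing import Optional

MAX_AGE_SECONDS = 24 * 60 * 60  # same 24-hour window as restore-wip.sh


def load_fresh_checkpoint(path: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the checkpoint if it exists and is under 24 hours old.

    Stale files are deleted, mirroring the shell script.
    """
    if not os.path.isfile(path):
        return None
    now = time.time() if now is None else now
    if now - os.path.getmtime(path) > MAX_AGE_SECONDS:
        os.remove(path)
        return None
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def recovery_line(checkpoint: dict) -> str:
    """Format the one-line summary injected into the new session."""
    changed = ", ".join(checkpoint.get("modified_files", []))
    return (
        f"[WIP RECOVERED] Branch: {checkpoint.get('branch', '')} | "
        f"Changed: {changed} | Last activity: {checkpoint.get('reflection', '')}"
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "wip-demo.json")
        with open(demo, "w", encoding="utf-8") as fh:
            json.dump({"branch": "feature/x", "modified_files": ["a.py", "b.py"],
                       "reflection": "abc123 fix parser"}, fh)
        print(recovery_line(load_fresh_checkpoint(demo)))
```

Passing now explicitly makes the 24-hour rule easy to test without waiting or touching file timestamps.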

The reflection field in the checkpoint puts a critical finding from Meta-RL research into practice. In 2026, three independent research groups (AI2, EPFL, Tsinghua) reached the same conclusion: giving an agent multiple attempts with reflection after each failure significantly improves performance. LaMer (EPFL, ICLR 2026) revealed a particularly interesting finding: keeping only the reflection text produces better results than full trajectory + reflection (80.5% vs 74.4%)².

This finding directly impacts WIP checkpoint design: keeping the last 3 commit messages as “reflection” is sufficient rather than saving the entire conversation history. Fewer tokens, better context.
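In code, the "reflection only" choice amounts to truncating what goes into the checkpoint. Here is a rough sketch, assuming the commit log arrives newest-first (as git log --oneline prints it); build_checkpoint and its limits are illustrative names and values, not a fixed format.

```python
import json
from typing import List


def build_checkpoint(branch: str, modified: List[str], commit_log: List[str],
                     timestamp: str, max_files: int = 20,
                     max_commits: int = 3) -> dict:
    """Keep a short reflection (recent commit subjects), not the full history."""
    return {
        "branch": branch,
        "modified_files": modified[:max_files],
        "reflection": "\n".join(commit_log[:max_commits]),
        "timestamp": timestamp,
    }


if __name__ == "__main__":
    log = ["a1 fix parser", "b2 add tests", "c3 refactor io", "d4 initial import"]
    checkpoint = build_checkpoint("feature/x", ["parser.py"], log,
                                  "2026-01-01T00:00:00Z")
    print(json.dumps(checkpoint, indent=2))
```

The fourth commit in the demo never reaches the checkpoint, which is the point: the restore message stays a few lines long no matter how long the session ran.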

SashiDo’s “AI-Assisted Programming That Actually Ships” approach follows a similar philosophy: feature_list.json, progress file, init script, every session follows the same loop³. Our WIP persistence hook is a lightweight version of this approach.

Result

Near-zero loss after context compaction. New sessions automatically start with previous state. Checkpoints auto-clean after 24 hours.

Layer 4: Standalone Agent Bridge (Axe + Docker)

Problem

Claude Code’s hook system is powerful but tied to a single ecosystem. Sometimes you want to use a different LLM for a different task: code review, log analysis, commit message generation. These tasks need a separate tool, but one that integrates with hooks.

Solution

Axe, developed by J.R. Swab, is a Go-based CLI tool designed around the Unix philosophy: each agent does one thing, agents are composable via stdin/stdout, and they can be triggered from cron, git hooks, or pipes⁴.

Running Axe via Docker and triggering it from Claude Code hooks bridges two ecosystems:

# Axe agent config (~/.config/axe/agents/code-reviewer.toml)
name = "code-reviewer"
description = "Reviews git diffs for bugs and security issues"
model = "anthropic/claude-haiku-4-5-20251001"
system_prompt = "You are a concise code reviewer. Focus on bugs, security issues, and logic errors. Max 5 bullet points."
skill = "skills/code-review/SKILL.md"

[params]
temperature = 0.2
max_tokens = 1024

On-demand usage:

git diff | docker run --rm -i \
  -v ~/.config/axe:/home/axe/.config/axe:ro \
  axe run code-reviewer
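If you want to call the reviewer from another script instead of a shell pipe, the same invocation can be wrapped in Python. This is a sketch under assumptions: it reuses exactly the docker arguments shown above, the image name axe and the mount path come from that command, and the function names are mine.

```python
import os
import subprocess
from typing import List


def axe_review_command(agent: str = "code-reviewer",
                       config_dir: str = "~/.config/axe") -> List[str]:
    """Build the docker argv used in this post for a given Axe agent."""
    host_dir = os.path.expanduser(config_dir)
    return [
        "docker", "run", "--rm", "-i",
        "-v", f"{host_dir}:/home/axe/.config/axe:ro",
        "axe", "run", agent,
    ]


def run_review(diff_text: str) -> str:
    """Pipe a diff into the agent and return whatever it prints."""
    result = subprocess.run(
        axe_review_command(), input=diff_text,
        text=True, capture_output=True, check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Only print the command; running it requires Docker and the Axe image.
    print(" ".join(axe_review_command()))
```

Keeping command construction separate from execution makes the wrapper testable on machines without Docker.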

Test Results

First test, on a diff with intentional security vulnerabilities:

def get_user(user_id):
    conn = sqlite3.connect(os.environ["DB_PATH"])
    return conn.execute(f"SELECT * FROM users WHERE id = {user_id}").fetchone()

Axe’s code-reviewer agent detected 3 issues:

SQL injection: user_id is directly interpolated into the query
Resource leak: database connection is never closed
Missing error handling: crashes if DB_PATH env var is missing
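For reference, here is one way the test function could be rewritten to address all three findings. This is my own sketch of a fix, not output from the Axe agent. The optional db_path argument is added only so the example can run without setting DB_PATH.

```python
import os
import sqlite3
import tempfile
from contextlib import closing
from typing import Optional, Tuple


def get_user(user_id: int, db_path: Optional[str] = None) -> Optional[Tuple]:
    """Fetch one user row, addressing the three findings above."""
    db_path = db_path or os.environ.get("DB_PATH")
    if not db_path:
        # Finding 3: fail with a clear message instead of a bare KeyError.
        raise RuntimeError("DB_PATH is not set")
    # Finding 2: closing() guarantees the connection is released.
    with closing(sqlite3.connect(db_path)) as conn:
        # Finding 1: bind user_id as a parameter; never interpolate it.
        return conn.execute(
            "SELECT * FROM users WHERE id = ?", (user_id,)
        ).fetchone()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        with closing(sqlite3.connect(path)) as conn:
            conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
            conn.execute("INSERT INTO users VALUES (1, 'ada')")
            conn.commit()
        print(get_user(1, path))
        # An injection attempt is now just a value that matches no row.
        print(get_user("1 OR 1=1", path))
```

With a bound parameter, the injection string is compared as data and never parsed as SQL.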

Second test, on a real project diff (content-intelligence pipeline): 4 legitimate issues detected. Missing JSON parsing error handling, hash lookup validation, unbounded query, file I/O race condition.

Trade-off

This approach is tool-agnostic and composable. But there’s a Docker container + LLM call cost with every invocation. That’s why I keep it as an on-demand script rather than triggering it after every edit. For those who want per-edit triggering, throttling (run only if 5 minutes have passed since last review) can be added.
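The throttle mentioned above fits in a few lines. A minimal sketch, assuming a stamp file whose modification time records the last review; the should_review name and the stamp location are placeholders.

```python
import os
import tempfile
import time
from typing import Optional


def should_review(stamp_path: str, now: Optional[float] = None,
                  min_interval: float = 300.0) -> bool:
    """Return True (and record the time) if min_interval seconds have passed."""
    now = time.time() if now is None else now
    try:
        last = os.path.getmtime(stamp_path)
    except OSError:
        last = None  # no stamp yet: this is the first review
    if last is not None and now - last < min_interval:
        return False
    # Create the stamp if needed, then set its mtime to "now".
    with open(stamp_path, "a", encoding="utf-8"):
        pass
    os.utime(stamp_path, (now, now))
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        stamp = os.path.join(tmp, "last-review")
        for t in (1000.0, 1100.0, 1400.0):
            print(t, should_review(stamp, now=t))
```

A skipped call leaves the stamp untouched, so frequent edits cannot keep pushing the next review further out.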

I discussed this integration with Axe maintainer J.R. Swab on Dev.to. Swab is working on triggering Axe agents in skill steps via OpenClaw. Triggering from editor hooks hadn’t been tried yet; this post is the first documented use case.

Result

Two ecosystems (Claude Code hooks + standalone Axe agents) working together. Hooks handle automatic tasks, Axe handles on-demand reviews.

Overall Results

Problem	Hook	Result
Format errors	PostToolUse (ruff, eslint)	Zero manual format intervention
Context gaps	PreToolUse (knowledge enrichment)	Automatic context injection
Context compaction loss	PreCompact + SessionStart	Automatic WIP save/restore
Single ecosystem lock-in	Axe + Docker bridge	Tool-agnostic agent triggering

Full Setup

settings.json

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
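          { "type": "command", "command": "f=$(jq -r '.tool_input.file_path // empty'); case \"$f\" in *.py) ruff format --quiet \"$f\" ;; esac" },
          {
            "type": "command",
            "command": "f=$(jq -r '.tool_input.file_path // empty'); case \"$f\" in *.js|*.jsx|*.ts|*.tsx) eslint --fix --quiet \"$f\" 2>/dev/null || true ;; esac"
          }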
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Read|Grep",
        "hooks": [
          {
            "type": "command",
            "command": "path/to/enrich-context.sh"
          }
        ]
      }
    ],
    "PreCompact": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "path/to/checkpoint-wip.sh"
          }
        ]
      }
    ],
    "SessionStart": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "path/to/restore-wip.sh"
          }
        ]
      }
    ]
  }
}

Test

To test each hook:

Auto-format: Edit a Python file with the Edit tool, verify ruff runs automatically
WIP persistence: Modify a few files, run /compact, start a new session, verify the checkpoint is restored
Axe bridge: git diff | docker run --rm -i -v ~/.config/axe:/home/axe/.config/axe:ro axe run code-reviewer
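Before running those manual checks, it helps to confirm that settings.json parses and contains the hooks you expect. Below is a small sketch that lists every command hook. The list_hooks helper is mine, and the embedded sample is a trimmed copy of the config above with placeholder commands.

```python
import json
from typing import List, Tuple

SAMPLE = """
{
  "hooks": {
    "PostToolUse": [
      {"matcher": "Edit|Write",
       "hooks": [{"type": "command", "command": "format-hook"}]}
    ],
    "PreCompact": [
      {"matcher": "",
       "hooks": [{"type": "command", "command": "path/to/checkpoint-wip.sh"}]}
    ]
  }
}
"""


def list_hooks(settings: dict) -> List[Tuple[str, str, str]]:
    """Return (event, matcher, command) for every command hook."""
    rows = []
    for event, groups in settings.get("hooks", {}).items():
        for group in groups:
            for hook in group.get("hooks", []):
                if hook.get("type") == "command":
                    rows.append(
                        (event, group.get("matcher", ""), hook.get("command", ""))
                    )
    return rows


if __name__ == "__main__":
    # json.loads raises on malformed JSON, which is itself a useful check.
    for event, matcher, command in list_hooks(json.loads(SAMPLE)):
        print(f"{event:12} {matcher or '(all)':12} {command}")
```

To check a real setup, load the actual settings file with json.load instead of the embedded sample.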

What’s Next?

Claude Code’s hook system is the infrastructure that separates AI coding assistants from AI coding partners. Each hook is simple on its own. But four layers working together create a workflow where the agent handles routine operations autonomously.

The important thing isn’t the hooks themselves. It’s identifying which problems are worth solving with hooks and which are better left to CLAUDE.md rules. Not every automation is useful. But the automations in this post share one common trait: each one eliminates a task I was doing manually every day.

Related Posts

AI-Powered Codebase Audit: The role of hooks and workflow automation in 6-track audit cycles

Footnotes

¹ GrapeRoot benchmark: pre-injection vs MCP tool loop. 31% cost reduction, 24% fewer turns. Source: Reddit post, benchmark repo
² LaMer (EPFL, ICLR 2026): “Language Agents Meet Reinforcement Learning.” Reflection-only vs full trajectory + reflection comparison. Source: arXiv
³ SashiDo “AI-Assisted Programming That Actually Ships”: feature_list.json + progress file + init script loop pattern. Source: Dev.to
⁴ Axe CLI: Go-based, Unix philosophy, single-purpose LLM agents. TOML config + SKILL.md, stdin/stdout composability. Source: GitHub, Dev.to discussion

Key Takeaways

01 Hooks are the deterministic version of CLAUDE.md instructions: probabilistic rules ceremonialize over time, hooks run the same way every time
02 PostToolUse auto-format brings format errors to zero, ruff and eslint run automatically after every edit
03 PreCompact WIP persistence puts Meta-RL research's 'reflection-only > full trajectory' finding into practice
04 Axe CLI's stdin/stdout composability enables triggering LLM agents independent of hooks, running in isolation via Docker
05 The hook system is the key feature that sets Claude Code apart from other CLI agents (Codex, Gemini CLI)

Frequently Asked Questions (FAQ)

What is a Claude Code hook?

Hooks are shell commands or scripts that Claude Code runs automatically at specific events (tool calls such as file edits, context compaction, session start). They are defined in settings.json.

How are hooks different from CLAUDE.md rules?

CLAUDE.md rules are probabilistic: the model's compliance varies with context and session length. Hooks are deterministic: they run the same way every time, no exceptions.

Why is WIP persistence necessary?

In long sessions, context compaction summarizes older messages, and active work state can be lost in the process. The PreCompact hook saves work state before compaction, and the SessionStart hook restores it in the new session. Checkpoints older than 24 hours are discarded.

What is Axe CLI?

A Go-based CLI tool following Unix philosophy that runs single-purpose LLM agents. Agents defined with TOML config + SKILL.md can be composed via stdin/stdout.
