ceaksan

Chief of Staff: A Local-First AI Assistant That Prepares Your Daily Workflow

A local-first AI assistant system for solo entrepreneurs that automates daily operational overhead. Built with Claude Code, Python and SQLite. Collects Gmail, calendar, RSS feeds and tasks overnight, classifies them, and delivers a ready-made briefing each morning.

Jan 3, 2026 · 10 min read

TL;DR

Instead of spending every morning gathering information from multiple sources, I built Chief of Staff so I can sit down and start making decisions. It uses claude -p to collect Gmail and calendar data via MCP, adds RSS feeds and tasks through Python scripts, writes everything to SQLite, classifies items with Claude Sonnet, and presents the result as an Obsidian Daily Note. Fully local, open source.

The Problem: Gathering Information, Not Making Decisions

Every morning, the same cycle: I sit down at my computer but spend the first half hour gathering information instead of making decisions. Which emails came in on Gmail, what’s on the calendar today, which project has issues, what’s the status of yesterday’s leftover tasks, what’s been published in the feeds I follow.

None of this is hard. But it’s scattered. Five different tools, five different tabs, five different context switches. And cognitive load accumulates before any real work has begun.

I built Chief of Staff to solve this problem: a local AI assistant that runs overnight, collects data, classifies it, and delivers a ready briefing each morning.

The Result: The Briefing That Greets Me Every Morning

Before getting into technical details, I want to show the output the system produces. Every morning when I open Obsidian, a Daily Note like this is waiting for me:

# 2026-03-09

## Calendar

- 10:00-11:00 Client X meeting (Calendly) **prep needed**
- 14:00-14:30 Deploy review
- Free: 07:00-10:00, 11:00-14:00, 14:30-18:00

## Project Status

- OK: leetty, ceaksan
- validough: 3 errors (Neon connection timeout)

## Feed Highlights

- [Interesting Article](https://example.com) (Tech Blog) ~5min
- [Another Post](https://example.com) (Dev Weekly) ~3min

## Classified Tasks

### DISPATCH (AI handles)

- [ ] Client A email reply, meeting confirmation #email
- [ ] Blog research note #content

### PREP (80% ready, you finish)

- [ ] Client Y hosting migration reply, draft ready #email
- [ ] validough Neon timeout, summary + fix direction #dev

### YOURS (your brain needed)

- [ ] Client X meeting prep
- [ ] leetty checkout flow fix #dev

### SKIP (not today)

- [ ] validough onboarding wizard, P3, far deadline

## Carried Over

- [ ] [P2] Blog post publish, pending 2 days

Calendar, project statuses, classified tasks, items carried over from previous days. All in one place, collected and sorted. I just approve or adjust, then start working.

An Overview of the Options

There are multiple ways to build this kind of automation. Each approach has its own strengths and weaknesses.

Fully Autonomous Agent Systems

Systems where you define a task and expect the agent to solve it end-to-end. Appealing in theory, problematic in practice. Agents often enter loops, costs spiral unpredictably, and results are hard to anticipate. In a solo entrepreneur context, loss of control is an unacceptable risk.

No-Code Automation Platforms

Platforms that connect services through drag-and-drop interfaces. They excel at cloud-to-cloud integrations. However, when it comes to local file system access (like my Obsidian vault), custom classification logic, or LLM integration, they either hit their limits or require complex workarounds.

AI Assistant Platforms

Configurable AI assistant services. Useful for tasks like creating email drafts and calendar management. But data passes through their servers. For my consulting clients’ data and my own projects, this is a privacy risk. Monthly subscription costs add up, and customization options remain limited.

Agent Frameworks

Developer-focused frameworks that enable defining agent workflows in code. Powerful and flexible tools. But several issues arise for my situation:

  • Security and privacy: Both my own and my clients’ data are involved
  • Cost: Infrastructure overhead of unused features
  • No generalizable routine: Between consulting and my own projects, there’s no workflow that reduces to a general routine. Assigning tasks via chat doesn’t fit how I work
  • Overengineering risk: A purpose-built flow is more optimized and controllable than a framework carrying a ton of unused features

Build Your Own

Building your own pipeline with cron jobs, Python scripts, and API calls. The most flexible approach, but also the one requiring the most development time. Instead of writing everything from scratch, an orchestration layer over existing tools can be more efficient.

Local / Open Source Model Alternatives

Running models like Llama, Mistral, or Qwen locally through tools like Ollama. No cloud dependency, and data never leaves the machine. However, these models’ tool calling and long-context management capabilities are currently limited. They may suffice for simple classification tasks, but they’re not yet mature enough for complex orchestration. This balance may shift over time.

| Approach | Strength | Weakness (for my constraints) |
| --- | --- | --- |
| Fully autonomous agent | Minimal intervention | Loss of control, cost uncertainty |
| No-code automation | Quick setup | Weak local file access, limited LLM integration |
| AI assistant platform | Ready to use | Privacy risk, limited customization |
| Agent framework | Flexible, powerful | Overengineering, unnecessary complexity |
| Your own scripts | Full control | Development time, maintenance burden |
| Local models | Privacy, zero cost | Tool calling and context management still limited |

My Choice: Lightweight and Purpose-Built

After evaluating these options, I decided to write my own solution. But not entirely from scratch.

I already had mini-services built for different needs: my own knowledge base MCP server for information management [1], my own scraper for pulling data from the web, and my own semantic code search MCP server for searching codebases [2]. Each one works independently and is accessible as an MCP server or API. I have similar mini-services in other areas too: design tokens, geometric pattern generation, e-commerce operations.

I use this ecosystem both to develop my own projects and to spot potential issues between developer and user perspectives. Using my own products daily shortens the feedback loop.

What was missing was an orchestration layer. A flow that brings existing pieces together, runs overnight, and produces a ready briefing by morning. Chief of Staff is that layer.

The principles behind my choice:

  • Lightweight: I don’t want to carry the weight of features I don’t use. For specific situations, progressing case by case is more appropriate
  • Local-first: My data stays on my computer. Client data going to third-party servers is unacceptable
  • Portable: I want a solution I can carry with me, one that can run off USB power. I don’t want to be tied to a single computer
  • Reviewable: I want to continuously review, adjust, and control the flow. Not a set-and-forget system, but a tool I actively manage
  • Cost control: Each step’s cost is predetermined, no surprise bills

Why Claude Code

I evaluated these criteria when choosing a model:

| Criterion | Claude Code | GPT + API | Gemini + API | Local Model (Ollama) |
| --- | --- | --- | --- | --- |
| MCP support | Native, built-in | None (custom integration required) | None | None (experimental) |
| Non-interactive mode | claude -p | API call required | API call required | ollama run |
| Budget control | --max-budget-usd flag | Manual tracking | Manual tracking | No cost |
| Tool calling | Strong | Strong | Strong | Limited |
| Gmail/Calendar access | MCP connector (no OAuth) | OAuth + API key required | OAuth + API key required | Manual integration |
| Local execution | Yes (CLI) | No (API) | No (API) | Yes |

Claude Code’s most decisive advantage is MCP connectors. I access Gmail and Google Calendar data through Claude’s built-in MCP connections. No need to create a project in Google Cloud Console, manage OAuth credentials, or store API keys. I connect once via /mcp in Claude Code, and everything works from there.

With claude -p (non-interactive mode), I pass a prompt file and run it. The --max-budget-usd flag sets the maximum cost for each run. This makes it possible to run automatically overnight via cron job or launchd.

# Overnight pipeline
claude -p prompts/collect.md --max-budget-usd 2.00    # Data collection
claude -p prompts/classifier.md --max-budget-usd 1.50  # Classification
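
A scheduler-friendly wrapper around these two commands could be sketched in Python as follows. This is only an illustration, not the project's actual run.sh: the function names (`build_command`, `run_pipeline`) are hypothetical, the argv simply mirrors the commands as written above, and stopping after a failed step is my assumption about sensible behavior.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import Callable, Sequence

# Steps mirror the two commands above: (prompt file, budget cap in USD).
STEPS = [
    (Path("prompts/collect.md"), 2.00),
    (Path("prompts/classifier.md"), 1.50),
]


def build_command(prompt: Path, budget_usd: float) -> list[str]:
    """Build the argv for one non-interactive run, mirroring the post's commands."""
    return ["claude", "-p", prompt.as_posix(), "--max-budget-usd", f"{budget_usd:.2f}"]


def _run_subprocess(argv: list[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(argv, check=False).returncode


def run_pipeline(
    steps: Sequence[tuple[Path, float]] = STEPS,
    runner: Callable[[list[str]], int] = _run_subprocess,
) -> list[tuple[str, int]]:
    """Run steps in order and stop at the first non-zero exit code.

    `runner` is injectable so the flow can be exercised without the CLI installed.
    """
    results: list[tuple[str, int]] = []
    for prompt, budget in steps:
        code = runner(build_command(prompt, budget))
        results.append((prompt.name, code))
        if code != 0:
            break  # assumption: classifying is pointless if collection failed
    return results
```

Keeping the budget next to each prompt file in one list makes the per-step cap visible at a glance, which fits the cost-control principle above.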

An important note: the system is designed to be model-agnostic. Prompt files are plain markdown, the data layer is SQLite, collectors are Python scripts. The dependency on Claude Code is limited to MCP connectors and claude -p calls. Running the prompt files through Ollama with Llama or Mistral is technically possible. Gmail API and Google Calendar API can be used directly instead of MCP connectors (additional OAuth setup required). As local models’ tool calling capabilities mature, this transition will become easier.

System Architecture

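Chief of Staff consists of three layers, plus an overnight classification step (Layer 1.5) that runs right after collection. Each layer produces value on its own: a later layer can be missing or skipped without breaking the ones before it.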

              Claude Code (claude -p)
              ┌───────────────────────────────────┐
              │  MCP Connectors                   │
              │  ┌──────────┐  ┌───────────────┐  │
              │  │  Gmail   │  │  Google       │  │
              │  │  MCP     │  │  Calendar MCP │  │
              │  └────┬─────┘  └──────┬────────┘  │
              └───────┼───────────────┼───────────┘
                      │               │
    ┌─────────────────┼───────────────┼───────────────────┐
    │                 ▼               ▼                   │
    │  ┌───────────────────────────────────────────────┐  │
    │  │              cos.db (SQLite)                  │  │
    │  │              Single source of truth           │  │
    │  └─────────────────────┬─────────────────────────┘  │
    │                        │                            │
    │  ┌──────────┐  ┌───────▼────────┐  ┌─────────────┐  │
    │  │  Feed    │  │   Renderer     │  │  Task       │  │
    │  │Collector │  │  SQLite → MD   │  │ Collector   │  │
    │  │ (Python) │  │  (Python)      │  │ (Python)    │  │
    │  └──────────┘  └───────┬────────┘  └─────────────┘  │
    └────────────────────────┼────────────────────────────┘
                             │
                  ┌──────────▼──────────┐
                  │ Classifier (Sonnet) │
                  └──────────┬──────────┘
                             │
                  ┌──────────▼──────────┐
                  │  Morning Sweep      │
                  │  (Opus + subagent)  │
                  └──────────┬──────────┘
                             │
                  ┌──────────▼──────────┐
                  │  Day Block (Sonnet) │
                  └─────────────────────┘

Layer 1: Overnight Collection

Runs automatically at 06:00 via launchd. A single claude -p session collects Gmail and Calendar data via MCP and writes to SQLite. Then Python scripts run sequentially:

| Source | Method | What It Collects |
| --- | --- | --- |
| Gmail | MCP connector | Actionable emails from the last 24 hours |
| Google Calendar | MCP connector | Today’s and tomorrow’s events across all calendars |
| RSS feeds | Python (Miniflux REST API) | Unread feed entries |
| Obsidian tasks | Python (file scanning) | Incomplete tasks |
| Project health | Python (custom scripts) | Project status, error count, last deploy |

If a source fails, the others continue running. The Daily Note shows a warning for the failed source.
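
The failure isolation described here can be sketched roughly like this. It is a minimal illustration under my own assumptions: the collector callables, the `run_collectors` name, and the wording of the warning line are hypothetical, not taken from the repository.

```python
from typing import Callable


def run_collectors(
    collectors: dict[str, Callable[[], int]],
) -> tuple[dict[str, int], dict[str, str]]:
    """Run each collector in order; a failure is recorded, never propagated.

    Each collector is assumed to return the number of items it stored.
    """
    collected: dict[str, int] = {}
    warnings: dict[str, str] = {}
    for name, collect in collectors.items():
        try:
            collected[name] = collect()
        except Exception as exc:  # isolate any single source failing
            warnings[name] = f"{type(exc).__name__}: {exc}"
    return collected, warnings


def warning_lines(warnings: dict[str, str]) -> list[str]:
    """Turn recorded failures into lines the Daily Note can display."""
    return [
        f"- WARNING: {name} collection failed ({reason})"
        for name, reason in warnings.items()
    ]
```

Catching the broad `Exception` is deliberate in this sketch: the goal is that one broken source never prevents the others from being collected.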

Layer 1.5: Overnight Classification

After collection completes, Claude Sonnet classifies pending items:

| Class | Meaning | Example |
| --- | --- | --- |
| DISPATCH | AI can handle entirely | Meeting confirmation reply, research note |
| PREP | AI does 80%, I finish | Complex email draft, error analysis |
| YOURS | My brain required | Strategy decisions, pricing, live meetings |
| SKIP | Not today | Low priority, far deadline |

Classification happens overnight. When I sit down in the morning, the sorted plan is already waiting.
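
To make the "sorted plan" concrete, here is a small sketch of how classified items might be grouped into the Daily Note sections shown earlier. The section headings come from the sample note; the `render_classified` function and its `(category, title)` input shape are assumptions for illustration, not the project's renderer.

```python
# Section order and headings follow the sample Daily Note.
SECTIONS = [
    ("dispatch", "DISPATCH (AI handles)"),
    ("prep", "PREP (80% ready, you finish)"),
    ("yours", "YOURS (your brain needed)"),
    ("skip", "SKIP (not today)"),
]


def render_classified(items: list[tuple[str, str]]) -> str:
    """Render (category, title) pairs as markdown checklists, one per class."""
    lines = ["## Classified Tasks", ""]
    for key, heading in SECTIONS:
        titles = [title for category, title in items if category == key]
        if not titles:
            continue  # omit empty classes entirely
        lines.append(f"### {heading}")
        lines.append("")
        lines.extend(f"- [ ] {title}" for title in titles)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Because the renderer only reads already-classified rows, it can be re-run at any time without triggering another model call.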

Layer 2: Morning Sweep

Runs on demand: I trigger it. Claude Opus shows the classified plan, and I approve or adjust it. Subagents then run in parallel for the approved DISPATCH and PREP tasks:

| Agent | Scope | Safety |
| --- | --- | --- |
| Email Agent | Creates Gmail drafts (MCP) | Never sends |
| Dev Prep Agent | Error summary + fix direction | Read-only |
| Content Agent | Blog draft, research note | Writes to specific folders only |
| Calendar Agent | Meeting prep note | Read-only |

Layer 3: Day Block

Triggered after the Morning Sweep. It places the remaining YOURS and PREP tasks into free calendar blocks and writes them to a dedicated “AI Plan” Google Calendar, so they don’t mix with the main calendar.
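
The block-filling idea can be sketched as two small steps: find the gaps between events, then place tasks into them first-fit. This is a hedged illustration only. The 07:00-18:00 working window is inferred from the sample note, and the per-task duration estimates and function names are my assumptions.

```python
def to_minutes(hhmm: str) -> int:
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(total: int) -> str:
    """Convert minutes since midnight back to 'HH:MM'."""
    return f"{total // 60:02d}:{total % 60:02d}"


def free_blocks(events, day_start="07:00", day_end="18:00"):
    """Return (start, end) gaps in minutes between events inside the working day."""
    first, last = to_minutes(day_start), to_minutes(day_end)
    cursor, blocks = first, []
    for start, end in sorted((to_minutes(s), to_minutes(e)) for s, e in events):
        # Clamp each event to the working day before comparing.
        start = min(max(start, first), last)
        end = min(max(end, first), last)
        if start > cursor:
            blocks.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < last:
        blocks.append((cursor, last))
    return blocks


def place_tasks(tasks, blocks):
    """First-fit placement of (title, minutes) tasks into free blocks."""
    remaining = [list(block) for block in blocks]  # mutable [start, end]
    placed, unplaced = [], []
    for title, minutes in tasks:
        for block in remaining:
            if block[1] - block[0] >= minutes:
                placed.append((title, to_hhmm(block[0]), to_hhmm(block[0] + minutes)))
                block[0] += minutes  # shrink the block from the front
                break
        else:
            unplaced.append(title)  # nothing fits today
    return placed, unplaced
```

With the two meetings from the sample note, `free_blocks` yields the same three free windows listed under Calendar there.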

Data Layer: SQLite

All data lives in SQLite. Obsidian is just the view layer, not the database.

-- Domain tables (source-specific)
CREATE TABLE emails (
    id TEXT PRIMARY KEY,
    thread_id TEXT,
    subject TEXT,
    sender TEXT,
    snippet TEXT,
    labels TEXT,           -- JSON array
    received_at TEXT NOT NULL,
    raw_payload JSON
);

-- Work queue (pipeline lifecycle)
CREATE TABLE work_queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain_type TEXT NOT NULL,  -- email, event, task, health, feed
    domain_id TEXT NOT NULL,
    priority TEXT,              -- P1, P2, P3, P4
    status TEXT NOT NULL DEFAULT 'pending',
    content_hash TEXT,
    collected_at TEXT NOT NULL DEFAULT (datetime('now')),
    UNIQUE(domain_type, domain_id)
);

-- Classifications (audit trail)
CREATE TABLE classifications (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    queue_id INTEGER NOT NULL,
    category TEXT NOT NULL,     -- dispatch, prep, yours, skip
    reason TEXT,
    model TEXT,
    FOREIGN KEY (queue_id) REFERENCES work_queue(id)
);

Domain tables (emails, events, tasks, health_checks, feeds) hold source-specific structured data. work_queue tracks each item’s status in the pipeline. classifications records the reasoning behind each classification decision and which model made it. content_hash prevents re-classifying unchanged content.
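
A minimal sketch of how a collector might use `content_hash` together with the `UNIQUE(domain_type, domain_id)` constraint is shown below. The SHA-256 choice, the `enqueue` helper, and the decision to reset a changed item to `pending` are assumptions on my part; only the table definition is taken from the schema above (trimmed to one table).

```python
import hashlib
import sqlite3

# work_queue as defined in the schema above.
SCHEMA = """
CREATE TABLE work_queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain_type TEXT NOT NULL,
    domain_id TEXT NOT NULL,
    priority TEXT,
    status TEXT NOT NULL DEFAULT 'pending',
    content_hash TEXT,
    collected_at TEXT NOT NULL DEFAULT (datetime('now')),
    UNIQUE(domain_type, domain_id)
);
"""


def content_hash(text: str) -> str:
    """Hash the content so unchanged items can be detected cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def enqueue(conn: sqlite3.Connection, domain_type: str, domain_id: str, content: str) -> str:
    """Insert idempotently; return 'new', 'unchanged', or 'changed'."""
    digest = content_hash(content)
    cursor = conn.execute(
        "INSERT OR IGNORE INTO work_queue (domain_type, domain_id, content_hash) "
        "VALUES (?, ?, ?)",
        (domain_type, domain_id, digest),
    )
    if cursor.rowcount == 1:
        return "new"
    (stored,) = conn.execute(
        "SELECT content_hash FROM work_queue WHERE domain_type = ? AND domain_id = ?",
        (domain_type, domain_id),
    ).fetchone()
    if stored == digest:
        return "unchanged"  # nothing to re-classify
    conn.execute(
        "UPDATE work_queue SET content_hash = ?, status = 'pending' "
        "WHERE domain_type = ? AND domain_id = ?",
        (digest, domain_type, domain_id),
    )
    return "changed"  # assumption: changed content goes back to the classifier
```

Running the same collector twice in a row therefore adds no rows and queues nothing new for classification.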

Safety Model

| Rule | Implementation |
| --- | --- |
| Never sends emails | Email Agent only creates drafts (gmail_create_draft) |
| Budget cap | --max-budget-usd flag on every claude -p call |
| Mutex | shlock prevents concurrent runs |
| Idempotent | INSERT OR IGNORE + unique index prevents duplicate inserts |
| Dry run | --dry-run preview on Day Block |
| Failure isolation | One source failing doesn’t affect others |
| Human approval | Morning Sweep shows classifications, waits for approval |
| Scoped writes | Content Agent writes to specific vault folders only |

Is there a risk of an email being misclassified? Yes. But the system never acts autonomously. At Morning Sweep, all classifications come to me with their reasoning. Nothing runs without my approval. Critical decisions (pricing, strategy, contracts) are defined as force_yours in the config, preventing the AI from classifying them as DISPATCH or PREP.
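
One way such a guard could work is as a post-processing step on the model's output. The sketch below assumes `force_yours` is a list of keywords, which is my guess at the config shape; the function name and the matching rule (case-insensitive substring) are likewise hypothetical.

```python
VALID_CATEGORIES = {"dispatch", "prep", "yours", "skip"}


def apply_force_yours(category: str, text: str, force_yours: list[str]) -> str:
    """Override DISPATCH/PREP with YOURS when the item mentions a protected topic."""
    if category not in VALID_CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    lowered = text.lower()
    protected = any(keyword.lower() in lowered for keyword in force_yours)
    if protected and category in {"dispatch", "prep"}:
        return "yours"
    return category  # YOURS and SKIP are left untouched
```

Applying the rule in code rather than only in the prompt means a misbehaving model cannot route a protected topic to an agent.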

Cost

| Component | Cost |
| --- | --- |
| Overnight collection + classification (Sonnet) | ~$1-3/day |
| Morning sweep (Opus) | ~$1-3/day |
| Day block (Sonnet) | ~$0.25-1/day |
| Google APIs | Free (via MCP) |
| Total | ~$2-7/day |

These numbers stay under control thanks to budget caps. If a step exceeds its allocated budget, it stops gracefully rather than crashing.

Current Status

Working layers:

  • SQLite schema and data layer (9 tables, 5 views)
  • Calendar, Gmail, Feed, Task, Health, Radar collectors
  • Cloudflare (Workers + Pages) and Coolify (apps, services, databases) health monitoring
  • Renderer (SQLite to Obsidian Daily Note generation)
  • Classifier prompt and flow
  • Parallel sweep orchestrator with 4 domain agents
  • Pipeline runner (run.sh, cos-brief.sh) with healthchecks.io monitoring
  • Weekly stats digest and scheduling insights
  • Interactive setup wizard (setup_wizard.py)

Not yet complete:

  • Day Block (writing time blocks to calendar)
  • Retry logic (exponential backoff for failed agent runs)
  • Vercel and Neon health collectors

The system runs every night and prepares my briefing every morning. Since the incomplete layers are independent, they don’t break the existing flow.

Update: Parallel Agents and the Ecosystem (March 20, 2026)

Since publishing this post, three things changed.

Others built similar systems

Jim Prosser, a non-technical consultant from Marin County, built his version in 36 hours and wrote about it on Medium and LinkedIn. Anthropic published an official Claude Agent SDK cookbook using a Chief of Staff scenario. Someone on Reddit built a skill that separates planning from building.

I compared all of them with my implementation. The patterns are converging: everyone lands on some variant of dispatch/prep/yours/skip classification. The differences are in the data layer and safety model. Prosser stores everything in Todoist (third-party dependency), the Anthropic cookbook uses CSV files, the Reddit skill has no persistence at all. My system’s SQLite intermediate layer with content hash dedup, idempotent inserts, and audit trail classifications is the most robust of the four. What I was missing was parallel agent execution.

Subagents now run in parallel

I replaced the monolithic sweep prompt with an async Python orchestrator (collectors/orchestrator.py). Four domain-specific agents run concurrently with semaphore-based concurrency control:

| Agent | Model | Budget | What it does |
| --- | --- | --- | --- |
| Calendar | Sonnet | $0.50 | Meeting prep notes |
| Health | Sonnet | $0.50 | Error analysis, fix direction |
| Task | Sonnet | $0.50 | Task completion notes, research outlines |
| Feed | Sonnet | $0.50 | Actionable feed summaries |

Each agent gets its own prompt file, budget cap, timeout, and log file. If one agent fails, others still complete. Results are collected and imported to cos.db in a single transaction. The orchestrator supports --sequential mode for debugging and --dry-run for testing without running agents.
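
The pattern described here (bounded concurrency, per-agent timeout, isolated failures) can be sketched with asyncio as follows. This is not the contents of collectors/orchestrator.py: the agents are plain coroutines standing in for real `claude -p` runs, and the names `run_agent` and `run_sweep` are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

AgentFn = Callable[[], Awaitable[str]]


async def run_agent(
    name: str, agent: AgentFn, semaphore: asyncio.Semaphore, timeout_s: float
) -> tuple[str, str, str]:
    """Run one agent under the shared semaphore; never raise to the caller."""
    async with semaphore:
        try:
            output = await asyncio.wait_for(agent(), timeout=timeout_s)
            return name, "ok", output
        except asyncio.TimeoutError:
            return name, "timeout", ""
        except Exception as exc:  # one agent failing must not stop the others
            return name, "error", str(exc)


async def run_sweep(
    agents: dict[str, AgentFn],
    max_parallel: int = 2,
    timeout_s: float = 60.0,
    sequential: bool = False,
) -> dict[str, tuple[str, str]]:
    """Run all agents concurrently, or one at a time when sequential is set."""
    semaphore = asyncio.Semaphore(1 if sequential else max_parallel)
    results = await asyncio.gather(
        *(run_agent(name, agent, semaphore, timeout_s) for name, agent in agents.items())
    )
    return {name: (status, output) for name, status, output in results}
```

Since every agent returns a status instead of raising, the caller can import all successful results in one transaction and report the failures separately.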

Email agent is classification-only

After testing, I decided not to auto-create Gmail drafts. Emails are still classified and shown in the Daily Note, but no agent touches Gmail. I handle email responses manually after reviewing the briefing. The email agent prompt still exists in the codebase but is excluded from the orchestrator’s dispatch map; re-enabling it is a one-line change.

The overnight pipeline now runs collect, classify, and render only. Sweep is triggered manually after reviewing the Daily Note. This matches how I actually work: I want to see the plan before anything fires.

Platform health monitoring across the stack

The health collector now monitors infrastructure beyond individual projects. Two platform-level scripts check Cloudflare and Coolify resources automatically:

| Platform | What it monitors | Method |
| --- | --- | --- |
| Cloudflare Workers | Error rates, invocation counts | GraphQL Analytics API |
| Cloudflare Pages | Deployment status | REST API |
| Coolify Apps | Container status (running/healthy) | REST API via Cloudflare Tunnel |
| Coolify Services | Service health | REST API via Cloudflare Tunnel |
| Coolify Databases | Database status | REST API via Cloudflare Tunnel |

Each platform script outputs a JSON array. The health collector runs them alongside per-project health scripts and writes everything to cos.db. If a Worker’s error rate exceeds 10%, it shows as P1 in the Daily Note. If a Coolify container exits or becomes unhealthy, same treatment.
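
The P1 rules in this paragraph could be expressed roughly as below. The JSON field names (`kind`, `name`, `errors`, `invocations`, `state`, `health`) are hypothetical, since the post does not show the scripts' actual output format; only the 10% threshold and the exited/unhealthy conditions come from the text.

```python
from __future__ import annotations

import json

ERROR_RATE_P1 = 0.10  # a Worker error rate above 10% is flagged as P1


def priority_for(record: dict) -> str | None:
    """Return 'P1' for records that need attention, otherwise None."""
    if record["kind"] == "worker":
        invocations = record.get("invocations", 0)
        if invocations and record.get("errors", 0) / invocations > ERROR_RATE_P1:
            return "P1"
        return None
    if record["kind"] == "container":
        if record.get("state") == "exited" or record.get("health") == "unhealthy":
            return "P1"
    return None


def flag_p1(platform_output: str) -> list[str]:
    """Parse one platform script's JSON array and list the names flagged P1."""
    return [r["name"] for r in json.loads(platform_output) if priority_for(r) == "P1"]
```

A Worker with zero invocations is treated as healthy here, to avoid dividing by zero; whether that is the right call for an idle service is a judgment the real collector would need to make.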

The architecture is extensible: adding a new platform (Vercel, Neon, Railway) means adding one script to PLATFORM_SCRIPTS and a config section. No changes to the collector or renderer needed.

Setup wizard

The project now includes an interactive setup wizard (setup_wizard.py). It reads config.example.toml as a template, walks through each section with sensible defaults, and generates config.toml. It also initializes the SQLite database and optionally installs the macOS launchd agent for overnight scheduling.

Every step is skippable. Optional integrations (Miniflux, Coolify, Cloudflare, healthchecks.io) are prompted separately. A --validate mode checks an existing config without interactive prompts. No external dependencies, stdlib only.

Closing

Chief of Staff isn’t the right solution for everyone. Agent frameworks or no-code platforms may be more suitable for many people. In my case, there’s no workflow between consulting and my own projects that reduces to a general routine. For routine situations, I already have different approaches that are sufficient. For specific situations, progressing case by case is more appropriate.

What this approach gives me is control. I can continuously review the flow, change it when needed, and avoid carrying the weight of unused features. A lightweight, purpose-built, portable system. When I sit down in the morning, I start making decisions instead of gathering information.

I’ll cover my other work in this area, the mini-services ecosystem, and the portable AI setup in future posts.

The project has been published as open source on GitHub [3].

Footnotes

  1. dnomia-knowledge: Project-based semantic knowledge management MCP server
  2. mcp-code-search: Local semantic code search MCP server
  3. Chief of Staff GitHub repository

Version Info

v0.2.0

Changelog

  • v0.2.0 (2026-03-20): Parallel sweep orchestrator, domain-specific subagents
  • v0.1.0 (2026-01-03): Initial release: Gmail, calendar, RSS and task collection pipeline

Key Takeaways

  1. Automating information gathering accelerates the decision-making process
  2. MCP connectors eliminate the need for OAuth setup or API key management
  3. The system is model-agnostic: prompt files and SQLite make it portable to any LLM
  4. Each layer works independently and delivers value on its own

Frequently Asked Questions (FAQ)

What is Chief of Staff?

A local-first AI assistant built for solo entrepreneurs. It collects Gmail, calendar, RSS feeds and Obsidian tasks overnight, classifies them, and presents a ready-made briefing each morning. Built on Claude Code, Python and SQLite, fully open source.

What data sources does it support?

Gmail and Google Calendar (via MCP connectors), Miniflux RSS feeds (REST API), Obsidian vault tasks (Python grep), and project health checks (customizable scripts).

Why not use OpenClaw or a similar framework?

Security, privacy, cost and portability constraints. Consulting and personal projects don't reduce to a generalizable routine. A lightweight, purpose-built solution is more optimized and controllable than a large framework with unused features.

Does it work with models other than Claude?

The system is designed to be model-agnostic. Prompt files and the SQLite layer can work with any LLM. Local models like Llama, Mistral or Qwen via Ollama are also viable options.

What does it cost per day?

Roughly $2-7 per day. Collection and classification run on Sonnet, the morning sweep runs on Opus. Budget caps keep each step's cost under control.