Build your own agentic OS (with Claude Code or any agent)

What an “agentic OS” actually is

It is not an operating system, and you do not need anyone’s paid framework to build one. The honest definition: a folder on your machine where a capable coding agent (Claude Code and others) can read your notes, run your tools, and execute saved routines that loop until the work is done. Three layers do all the work: a brain (your knowledge), a build (the agent plus the rules and routines that drive it), and a way to ship it (run it on demand, or on a schedule). It is model-agnostic on purpose, any sufficiently capable agent will do, so nothing here depends on a specific model or its context size.

The payoff is real: instead of re-explaining yourself every session, you teach the system once and then trigger it, “write my morning report,” “triage my inbox,” “plan today,” and it runs the whole job against your own data. The catch is also real, and we will get to it: this only pays off when the work repeats and you can check the output automatically.

Step 1: the brain (your knowledge base)

Start with a folder of linked markdown notes. Plain .md files, one idea per note, cross-referenced with links. Many people use Obsidian because it stores everything as local markdown and shows the links as a graph, but the agent never needs Obsidian itself; it just reads the files and follows the links between them as text. No database, no cloud, nothing uploaded. Your data stays yours, which is the whole point of running this locally.

Why notes instead of a database? Because a coding agent is already excellent at reading files, grepping for a term, and following references, the same way it reads a codebase. A well-linked notes folder becomes a knowledge base it can search and cross-reference on demand. This is the same idea as agent memory: the model is stateless, so durable knowledge has to live somewhere it can re-read.

Step 2: the build (rules, routines, and loops)

This is where the proprietary “frameworks” you see online are really just four built-in primitives. Learn these and you can build the same thing yourself:

CLAUDE.md, a markdown file at the root the agent reads every session. Your standing rules: how to write, what to never touch, your stack, your conventions. This is the “memory” that makes runs consistent. There is a ready starter in our free resources.
Custom slash commands, one markdown file per routine in .claude/commands/. A file named plan-today.md becomes the /plan-today command. The file is just the instructions to run, which is all a “named command” ever was. The builder below writes one for you.
Skills, reusable instruction bundles the agent loads when relevant, so you are not re-pasting the same how-to into every routine.
Subagents, a second agent that checks the first one’s work. That separation is what stops a model approving its own homework, the Judge and Verification patterns in practice.

Connect the agent to the outside world with tool calling and MCP servers (your files, a database, GitHub, the web). And every routine worth automating is a loop: discover, plan, execute, verify against a real check, and iterate until it passes or hits a stop limit. That verify gate is non-negotiable; without it you have an expensive way to generate unchecked output.

Security, before you automate anything. A coding agent can run real commands and change real files. By default it asks before risky actions, and you can relax that (auto-accept edits, or a read-only plan mode). There is also a full “bypass permissions” mode, and the hype tends to leave it on. Treat that like sudo: only in a throwaway or sandboxed directory you have looked at, never pointed at your whole machine or your real credentials. Convenience is not worth handing an autonomous loop unrestricted access.

Hands-on: write a real routine

Describe a job and we generate a working Claude Code slash command, a loop with a verify gate, ready to save and run. This is the honest version of those “framework commands.”

Routine name becomes the slash command

What it should do What it should read first your data Done when the verify gate

.claude/commands/morning-report.md

Save that file in your project, restart the agent, and run /morning-report.

Step 3: ship it (run on demand, or 24/7)

Your “command deck” is just the set of routines you built: a morning report, an inbox triage, a weekly review. Two ways to run them:

Local, on demand. You sit down, run /plan-today, watch it work. Simplest, safest, and where everyone should start.
Scheduled, hands-off. To have routines fire while you sleep, you need something that runs them on a timer. The real options, in rough order of effort: a cron job on your own machine, a GitHub Actions scheduled workflow (free, runs in the cloud), or a small always-on host (Render, Fly.io, Railway, a cheap VPS). Pick the lightest one that fits; you do not need a platform to schedule a script.

Whichever you choose, the move is the same as any loop system: get one routine reliable by hand first, then schedule it. Automating something that was never reliable just means it now fails on a timer.

Package it as one repo

Once it works, the whole thing is a single folder you can version, clone, and reuse:

~/operator/
  CLAUDE.md            the standing rules your agent reads every session
  notes/               your linked markdown knowledge base (the "brain")
  .claude/commands/    your routines, one markdown file per slash command
  .claude/skills/      reusable instruction bundles
  reports/             where routines write their output
  .env                 keys, gitignored, never committed

Commit it (with .env gitignored, see the key-hygiene note), and you can clone it onto a new machine, drop in your keys, and run. Want a variant for a different project or a teammate? Fork the folder and adjust the rules and routines. That is all the “one-click OS” products are: a repo of a CLAUDE.md, some commands, and some skills.

The honest part: when not to build one

An agentic OS is genuinely useful, and genuinely overkill for most one-off work. It earns its complexity only when the same conditions that justify any loop hold: the task repeats, there is an automatic way to check the output, and the agent can run it end to end. If you are doing something once, or you would have to eyeball every result anyway, a single good prompt beats a whole system. Build the smallest thing that works, then grow it one reliable routine at a time.