How to build an AI agent: the modern, honest guide

A chatbot answers. An agent acts.

A plain model can only talk: you ask, it replies, it waits. That is a chatbot. It becomes an agent the moment two things are true: it can call tools (search the web, read a file, hit an API, send an email), and it can decide for itself which tool to use, in what order, and when it is done. That decision-making is not a separate “brain module” you build; it is the model’s own tool calling. Give a capable model tools and a loop, and it starts acting.


The five parts of an agent

Every agent, from a 30-line script to a production system, is built from the same five pieces:

The model: the reasoning core. Any capable one works: Claude, GPT, Gemini, or a strong open model. Do not lock your design to a single model or version; they change constantly, and the agent’s design should not care which one is behind it.
Tools: what it can actually do. Each tool is a function the model can call: search, read a document, query a database, send a message. The connection mechanism is tool calling and, increasingly, the Model Context Protocol (MCP), an emerging open standard for plugging tools and data into any agent without custom glue for each one.
The loop: the engine. The model calls a tool, reads the result, decides the next step, and repeats until the goal is met or a limit is hit. This is the part most tutorials skip, and it is the whole point. See AI loops.
Memory: so it does not start from zero. Short-term memory is the current conversation living in the context window; long-term memory is a store it can write to and search later, often a vector database for recall by meaning. See agent memory.
A goal and a stop condition: so it knows what “done” means and cannot run (or spend) forever.
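As a rough illustration, four of the five pieces can be written down as plain data before any real model is involved. Everything in this sketch is hypothetical: the Agent class, its field names, and the keyword-based recall are stand-ins, not any provider's API. In practice recall would more likely be a vector search, and the loop is whatever code repeatedly calls model and tools; it is left out here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A tool is just a function the model is allowed to call (hypothetical signature).
Tool = Callable[[str], str]


@dataclass
class Agent:
    model: Callable[[List[str]], str]   # the model: takes context, returns its next move
    tools: Dict[str, Tool]              # tools: name -> callable
    goal: str                           # the goal: what "done" means
    max_steps: int = 8                  # stop condition: a hard cap on iterations
    short_term: List[str] = field(default_factory=list)  # memory: current context
    long_term: List[str] = field(default_factory=list)   # memory: persistent notes

    def remember(self, note: str) -> None:
        """Write a note to long-term memory."""
        self.long_term.append(note)

    def recall(self, query: str) -> List[str]:
        """Naive keyword recall; an assumption standing in for recall by meaning."""
        q = query.lower()
        return [note for note in self.long_term if q in note.lower()]
```

Keeping the model as a plain callable is one way to honor the advice above: swapping Claude for GPT or an open model changes one field, not the design.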

The loop is the thing

The linear picture (prompt goes in, answer comes out) describes a chatbot. An agent is a cycle: think, act with a tool, observe the result, check against the goal, and go again. Watch one real loop:

GOAL: research a company and draft a short outreach email.

STEP 1  THINK   I need current facts about the company.
STEP 1  ACT     call web_search("Acme Corp recent news")
STEP 1  OBSERVE 3 results: a funding round, a new product, a hire.

STEP 2  THINK   The funding round is the strongest hook.
STEP 2  ACT     call read_page(funding_article_url)
STEP 2  OBSERVE key details: amount, date, stated plans.

STEP 3  THINK   I have enough to draft.
STEP 3  ACT     draft the email, referencing the raise.
STEP 3  CHECK   under 120 words? specific hook? clear ask? not yet, too long.

STEP 4  ACT     tighten the draft.
STEP 4  CHECK   all criteria met. STOP and return the email.

Notice what the model is doing: choosing tools, reading results, judging its own draft, and only stopping when the goal is met. There is no separate “decision layer”; the model decides inside the loop. That verify-and-iterate step is exactly the loop pattern, and a real check on the output is what keeps it honest.
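The trace above can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: scripted_model is a hard-coded stand-in for a real LLM call, web_search and read_page are stubs that return canned text, and meets_goal is a deliberately crude check. None of these names come from a real SDK.

```python
from typing import Callable, Dict, List, Tuple


def web_search(query: str) -> str:
    # Stub tool: a real one would call a search API.
    return "3 results: a funding round, a new product, a hire."


def read_page(url: str) -> str:
    # Stub tool: a real one would fetch and parse the page.
    return "key details: amount, date, stated plans."


TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "read_page": read_page,
}


def scripted_model(history: List[str]) -> Tuple[str, str]:
    """Stand-in for the model: picks the next (action, argument) from history."""
    seen = sum(1 for line in history if line.startswith("OBSERVE"))
    if seen == 0:
        return ("web_search", "Acme Corp recent news")
    if seen == 1:
        return ("read_page", "funding_article_url")
    return ("finish", "Congrats on the raise. Could we talk for 15 minutes next week?")


def meets_goal(draft: str) -> bool:
    # The CHECK step: under 120 words and contains a clear ask (crudely, a question).
    return len(draft.split()) < 120 and "?" in draft


def run_agent(goal: str, model, tools, max_steps: int = 6) -> str:
    history = [f"GOAL {goal}"]
    for _ in range(max_steps):                  # stop condition: step limit
        action, arg = model(history)            # THINK: the model decides
        if action == "finish":
            if meets_goal(arg):                 # CHECK: verify before stopping
                return arg
            history.append("CHECK failed: revise the draft")
            continue
        if action not in tools:
            history.append(f"OBSERVE unknown tool {action}")
            continue
        history.append(f"ACT {action}({arg!r})")
        history.append(f"OBSERVE {tools[action](arg)}")   # OBSERVE: feed result back
    raise RuntimeError("step limit reached before the goal was met")
```

Calling run_agent("draft an outreach email", scripted_model, TOOLS) walks through search, read, and finish. The two exits matter most: the loop returns only when the check passes, and it raises when the step budget runs out.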


You do not need a whole app to start

The hype makes it sound like building an agent means building a Next.js frontend, an Express backend, and a database. It does not. The essence of an agent (model plus tools plus a loop) fits in a short script using a provider SDK or an agent framework, either of which can handle the loop and tool calling for you. Start there: prove the agent works from a terminal against real tools. A UI, a server, and a database are how you later ship it to other people, not what makes it an agent. Build the smallest thing that does the job, then wrap it.

Security is not optional

The tutorials that hand an agent web access, email sending, and code execution in ten slides skip the most important part. The moment an agent can read untrusted content (the web, an inbox), act (send, run, pay), and touch private data, you have assembled the lethal trifecta: a poisoned web page can quietly redirect your agent. You cannot prompt your way out of it. Contain it with architecture: least privilege (the narrowest tool access that does the job), a human in the loop for anything irreversible, and treating every tool result as data, never as instructions. Give an autonomous loop the least power it needs, not the most it could use.
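One way to picture that containment is a gate that sits between the loop and the tools. The sketch below is an assumption-heavy illustration, not a complete defense: make_gate, ToolDenied, and the approve callback are hypothetical names, and the untrusted_data label is only a convention for the surrounding code. The real protection comes from the allowlist and the human approval step, which sit outside the model.

```python
from typing import Callable, Dict, Set


class ToolDenied(Exception):
    """Raised when a tool call is blocked by policy or by the human reviewer."""


def make_gate(
    tools: Dict[str, Callable[[str], str]],
    allowed: Set[str],
    irreversible: Set[str],
    approve: Callable[[str, str], bool],
) -> Callable[[str, str], dict]:
    """Wrap tool access with least privilege and a human in the loop."""

    def call(name: str, arg: str) -> dict:
        # Least privilege: only tools on the allowlist can run at all.
        if name not in allowed or name not in tools:
            raise ToolDenied(f"{name} is not on the allowlist")
        # Human in the loop: irreversible actions need explicit approval.
        if name in irreversible and not approve(name, arg):
            raise ToolDenied(f"{name} was not approved")
        # Tool output is returned as labeled data, never merged into instructions.
        return {"tool": name, "untrusted_data": tools[name](arg)}

    return call
```

In a terminal prototype, approve could be as simple as a function that prints the pending action and waits for a typed "y". A research agent would then get an allowlist containing only its search and read tools, so a poisoned page has no send or pay tool to reach for.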

When not to build one

An agent is the right tool only when the task genuinely requires decisions along the way: picking different tools, adapting to what it finds, and looping. If the steps are always the same, a plain workflow is simpler and more reliable. If it is a one-off, a single good prompt wins. Most things people reach for an agent for are really workflows or prompts in disguise, and the honest move is to use the smallest tool that works. See agents vs workflows vs RPA. When you do need the full autonomous version wired into your own data and routines, that is an agentic OS.
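For contrast, here is what "a plain workflow" might look like for an outreach task whose steps never change. It is a hypothetical sketch: load_facts is a stub with made-up data, and the function names are illustrative. The point is structural: the order of steps is fixed in code, so there is no model choosing tools and no loop to bound.

```python
from typing import Dict


def load_facts(company: str) -> Dict[str, str]:
    # Stub: a real step might read a CRM record or a fixed data source.
    return {"company": company, "event": "recent funding round"}


def draft_email(facts: Dict[str, str]) -> str:
    # Fixed template; a single prompt to a model could replace this step.
    return (
        f"Hi, congratulations on the {facts['event']} at {facts['company']}. "
        "Could we talk for 15 minutes next week?"
    )


def check_length(draft: str, max_words: int = 120) -> str:
    # Same kind of check an agent would run, but executed once, in a known place.
    if len(draft.split()) > max_words:
        raise ValueError("draft is too long")
    return draft


def outreach_workflow(company: str) -> str:
    # Always the same three steps, in the same order.
    return check_length(draft_email(load_facts(company)))
```

If every run of the task looks like outreach_workflow, the agent machinery adds cost and failure modes without adding capability.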