Feeds for Machines: RSS as the Structured Layer for AI

Most of what an AI system reads about the world, it reads by scraping HTML: fighting through nav bars, cookie banners, ad slots, and markup that was written for a browser, not a model. It works, badly. There is a cleaner path that has been sitting in plain sight: the feed.

What a feed gives a model

An RSS or Atom feed is content with the chrome already removed:

  • Structure: title, author, timestamp, and body in labeled fields, not guessed at from <div> soup.
  • Freshness: feeds are ordered and timestamped, so “what changed since I last looked” is a trivial question instead of a re-crawl.
  • Intent: a publisher who offers a feed is explicitly saying “here is my content, meant to be consumed by software.” That’s a permission signal scraping never has.
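To make the first two points concrete, here is a minimal sketch using only Python's standard library. It assumes a plain RSS 2.0 feed; the sample feed, the helper names parse_items and new_since, and the choice of fields are illustrative, not taken from any particular tool. Real feeds vary (Atom uses different element names, and some items omit pubDate), so treat it as a starting point.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical sample feed, inlined so the sketch runs without a network call.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Second post</title>
      <author>jo@example.com (Jo)</author>
      <pubDate>Wed, 03 Jan 2024 09:00:00 +0000</pubDate>
      <description>Newer body text.</description>
    </item>
    <item>
      <title>First post</title>
      <author>jo@example.com (Jo)</author>
      <pubDate>Mon, 01 Jan 2024 09:00:00 +0000</pubDate>
      <description>Older body text.</description>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Read each <item> into a dict of labeled fields."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "author": item.findtext("author", default=""),
            # RSS 2.0 dates use the RFC 822 style, which email.utils can parse.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
            "body": item.findtext("description", default=""),
        })
    return items


def new_since(items, last_seen):
    """Answer "what changed since I last looked" with a timestamp comparison."""
    return [entry for entry in items if entry["published"] > last_seen]


if __name__ == "__main__":
    last_seen = datetime(2024, 1, 2, tzinfo=timezone.utc)
    for entry in new_since(parse_items(SAMPLE_RSS), last_seen):
        print(entry["published"].isoformat(), entry["title"])
```

The field names come straight from the feed, so there is nothing to guess at, and the freshness check is a single comparison rather than a re-crawl.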

The permission problem

The tension between AI crawlers and publishers is, at bottom, about consent and structure. Scraping takes without asking and guesses at meaning. A feed is the opposite: a publisher deciding what to syndicate, in a format that says what each part is. As the relationship between models and the open web gets renegotiated, feeds are a natural unit: granular, opt-in, and machine-clean.

If you were designing “structured, permissioned, time-ordered content for machines” from scratch, you’d land somewhere close to RSS.

What feeds don’t solve

Feeds aren’t a silver bullet, and it’s worth being honest about the gaps. Not every site offers one. Full-text feeds are rarer than they should be; plenty of publishers offer only a summary. And a feed still can’t capture everything a rich, interactive page does. But as the default unit for “give a model clean, recent, opt-in content,” a feed beats scraping on every axis that matters: structure, freshness, cost, and consent. It puts the publisher back in charge of what gets shared.

Where this goes

The practical near-term version is unglamorous and useful: point a model at a set of feeds instead of a set of URLs to scrape, and let RSS-to-JSON do the parsing so the content arrives clean. Retrieval pipelines, monitoring agents, and “summarize what’s new” tools all get simpler when the source already speaks a structured dialect.
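As a rough illustration of that RSS-to-JSON step, the sketch below turns an RSS 2.0 document into a JSON string a pipeline could hand to a model. It is a stand-in written with Python's standard library, not the interface of any specific RSS-to-JSON tool; the function name rss_to_json, the output keys, and the sample feed are all assumptions. It also flags whether each entry carried full text (via the common content:encoded extension) or only a summary.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the widely used content:encoded extension for full-text bodies.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"

# Hypothetical sample feed: one full-text item, one summary-only item.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Full-text post</title>
      <link>https://example.com/full</link>
      <pubDate>Wed, 03 Jan 2024 09:00:00 +0000</pubDate>
      <description>Short summary.</description>
      <content:encoded>The complete article body.</content:encoded>
    </item>
    <item>
      <title>Summary-only post</title>
      <link>https://example.com/summary</link>
      <pubDate>Mon, 01 Jan 2024 09:00:00 +0000</pubDate>
      <description>Only a teaser is published.</description>
    </item>
  </channel>
</rss>"""


def rss_to_json(rss_text):
    """Convert an RSS 2.0 document into a JSON string of labeled entries."""
    channel = ET.fromstring(rss_text).find("channel")
    entries = []
    for item in channel.iter("item"):
        full_text = item.findtext(CONTENT_NS + "encoded")
        summary = item.findtext("description", default="")
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
            # Prefer the full body when the publisher provides one.
            "body": full_text if full_text is not None else summary,
            "full_text": full_text is not None,
        })
    feed = {"feed": channel.findtext("title", default=""), "entries": entries}
    return json.dumps(feed, indent=2)


if __name__ == "__main__":
    print(rss_to_json(SAMPLE_FEED))
```

In a real pipeline the XML would come from fetching each feed URL rather than an inline string, and the full_text flag lets a downstream step decide whether a summary-only entry is enough on its own.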

It’s a nice irony. The format everyone wrote off a decade ago turns out to be well-suited to the most hyped technology of this one. KB Cafe is building its feed tools with that in mind, and an AI-tools corner alongside them.

Related

See what RSS is, why RSS didn’t die, and the Feeds section.