What is XML?
XML is the markup language behind feeds, documents, and config the world over. This page covers the syntax, the difference between well-formed and valid, namespaces, and where XML still beats JSON.
What XML is
XML (Extensible Markup Language) is a text format for representing structured data as a tree of elements. Unlike HTML, it has no fixed tag set; you invent element names that describe your data. For two decades it was the backbone of data interchange, document formats, and config, and it still runs huge swaths of the web: RSS and Atom feeds, office document formats, SVG, SOAP web services, and countless enterprise systems are all XML.
The syntax: elements & attributes
An XML document has exactly one root element containing nested child elements. Each element is written as an opening and closing tag pair, or as a single self-closing tag such as <br/> when it is empty; the opening tag may carry attributes as quoted name/value pairs.
<?xml version="1.0" encoding="UTF-8"?>
<book category="reference">
  <title lang="en">Structure and Interpretation</title>
  <authors>
    <author>Abelson</author>
    <author>Sussman</author>
  </authors>
  <year>1985</year>
</book>
Here book is the root, category is an attribute, and title,
authors, and year are child elements. The convention: data in elements,
metadata in attributes.
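To see the tree from a program's point of view, here is a minimal sketch that reads the same book document with Python's standard-library `xml.etree.ElementTree`. The names `BOOK_XML` and `summarize` are illustrative, not part of any API.

```python
import xml.etree.ElementTree as ET

# The sample document from above, as bytes so the encoding declaration applies.
BOOK_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<book category="reference">
  <title lang="en">Structure and Interpretation</title>
  <authors>
    <author>Abelson</author>
    <author>Sussman</author>
  </authors>
  <year>1985</year>
</book>"""


def summarize(book):
    """Pull the root tag, attributes, and child element text into a dict."""
    title = book.find("title")
    return {
        "root": book.tag,                    # name of the root element
        "category": book.get("category"),    # attribute on the root
        "title": title.text,                 # element content
        "lang": title.get("lang"),           # attribute on a child
        "authors": [a.text for a in book.findall("authors/author")],
        "year": int(book.findtext("year")),  # element text is always a string
    }


summary = summarize(ET.fromstring(BOOK_XML))
print(summary)
```

Attributes come back through `get()`, element content through `text`, and everything is a string until you convert it, which is why `year` is passed through `int()` here.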
Well-formed vs valid
| | Well-formed | Valid |
|---|---|---|
| Means | Follows XML syntax rules | Well-formed and matches a schema |
| Checks | One root, all tags closed, proper nesting, quoted attributes | Allowed elements, order, and types (via XSD/DTD) |
| Needs a schema? | No | Yes (XSD or DTD) |
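Every conforming XML parser rejects input that is not well-formed. Validation is an
extra step you opt into when the exact shape matters. XML predefines five entities for
escaping: &amp; for &, &lt; for <, &gt; for >, &quot; for ", and &apos; for '. In text
content, & and < must always be escaped, and > must be escaped when it follows ]].
The quote characters need escaping only inside an attribute value delimited by the same quote.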
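A well-formedness check is simply "does it parse?". The sketch below uses Python's standard-library `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are made up for illustration. It does not validate against a schema: the standard-library parser has no XSD or DTD validation, so that step would need a separate validating library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return (True, None) if text parses, else (False, (line, column))."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple for the first error.
        return False, err.position
    return True, None


samples = {
    "good": "<note><to>Ada</to></note>",
    "unclosed tag": "<note><to>Ada</note>",
    "two roots": "<a/><b/>",
    "bare ampersand": "<note>Fish & chips</note>",
}

results = {name: is_well_formed(xml) for name, xml in samples.items()}
for name, (ok, position) in results.items():
    print(name, "->", "well-formed" if ok else f"error at {position}")
```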
Namespaces
When one document combines vocabularies, namespaces keep element names from colliding.
Each namespace is identified by a URI and bound with an xmlns declaration, optionally to a
prefix, so a title defined by one standard never clashes with a title from another. This is
what lets formats like Atom, SOAP, and SVG mix cleanly in one document. It is also a
common source of confusion for newcomers, because the same tag name can mean different
things under different namespaces.
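A small sketch of two `title` elements living side by side. The document below is an illustrative fragment, not a complete Atom feed; the two namespace URIs are the real Atom and SVG ones, while the `NS` prefix map is just a local convenience for the queries.

```python
import xml.etree.ElementTree as ET

# Default namespace is Atom; the svg: prefix is bound to the SVG namespace.
DOC = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:svg="http://www.w3.org/2000/svg">
  <title>Feed title</title>
  <svg:svg>
    <svg:title>Drawing title</svg:title>
  </svg:svg>
</feed>"""

# Prefixes used in queries are local to this map; only the URIs must match.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "svg": "http://www.w3.org/2000/svg",
}

root = ET.fromstring(DOC)
atom_title = root.find("atom:title", NS)
svg_title = root.find("svg:svg/svg:title", NS)

# ElementTree stores each name as {namespace-uri}local-name.
print(atom_title.tag)  # {http://www.w3.org/2005/Atom}title
print(svg_title.tag)   # {http://www.w3.org/2000/svg}title
```

Both elements are called `title`, but their fully qualified names differ, so a query for one never returns the other.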
The XML behind RSS
Feeds are the XML most people meet first. An RSS feed is just an XML document with an agreed vocabulary; that shared structure is exactly why any reader can parse any feed:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>KB Cafe</title>
    <item>
      <title>A new reference is live</title>
      <link>https://kbcafe.com/what-is-xml</link>
    </item>
  </channel>
</rss>
Check one against the spec with the RSS validator, or read the full guide to RSS, Atom & OPML.
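Because the vocabulary is fixed, reading a feed takes only a few lines. This is a minimal sketch using Python's standard-library parser on the sample feed above; `read_feed` is a hypothetical helper, and a real reader would also handle missing elements, other feed formats, and fetching over HTTP.

```python
import xml.etree.ElementTree as ET

RSS_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>KB Cafe</title>
    <item>
      <title>A new reference is live</title>
      <link>https://kbcafe.com/what-is-xml</link>
    </item>
  </channel>
</rss>"""


def read_feed(rss_bytes):
    """Return the channel title and a list of {title, link} dicts."""
    root = ET.fromstring(rss_bytes)
    channel = root.find("channel")
    items = [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in channel.findall("item")
    ]
    return channel.findtext("title"), items


feed_title, items = read_feed(RSS_XML)
print(feed_title)
for entry in items:
    print("-", entry["title"], entry["link"])
```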
XML vs JSON
| | XML | JSON |
|---|---|---|
| Shape | Document tree | Objects & arrays |
| Attributes | Yes | No |
| Namespaces | Yes | No |
| Comments | Yes | No |
| Verbosity | Heavier | Lighter |
| Best for | Documents, feeds, mixed content, validation | APIs, config, data interchange |
Same data, different jobs. See what JSON is for the other side. The rule of thumb: documents and feeds lean XML, program-to-program data leans JSON.
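The "Attributes" row is easiest to appreciate by converting one format to the other. JSON has no attribute concept, so any converter has to pick a convention. The sketch below uses `@name` keys for attributes and `#text` for element content, which is one common choice and not a standard; `element_to_dict` and the shortened `BOOK_XML` are illustrative.

```python
import json
import xml.etree.ElementTree as ET

BOOK_XML = (
    '<book category="reference">'
    '<title lang="en">Structure and Interpretation</title>'
    '<year>1985</year>'
    '</book>'
)


def element_to_dict(elem):
    """Convert an element to a dict: '@attr' keys, '#text', child lists."""
    node = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        # Children are collected in lists because a tag may repeat.
        node.setdefault(child.tag, []).append(element_to_dict(child))
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    return node


book_dict = element_to_dict(ET.fromstring(BOOK_XML))
as_json = json.dumps(book_dict, indent=2)
print(as_json)
```

Note what this sketch drops: text that follows a child element (mixed content) and the relative order of differently named children. Those are exactly the cases where XML's document tree carries information a plain object does not.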
FAQ
What does XML stand for?
Extensible Markup Language. 'Extensible' is the point: unlike HTML, XML has no fixed set of tags. You define element names that fit your data, and the structure is a tree of those elements.
What is the difference between an element and an attribute?
An element is an opening and closing tag pair with content between them, like <title>Hello</title>. An attribute is a name="value" pair inside the opening tag, like <title lang="en">. Rule of thumb: data goes in elements, metadata about that data goes in attributes.
What does 'well-formed' vs 'valid' mean?
Well-formed means the XML follows the basic syntax rules: one root element, every tag closed, proper nesting, quoted attributes. Valid is stronger: the document is well-formed AND conforms to a schema (XSD or DTD) that defines which elements and types are allowed. All valid XML is well-formed; not all well-formed XML is valid.
Is RSS just XML?
Yes. RSS and Atom feeds are XML documents with an agreed-upon set of elements (channel, item, title, and link in RSS; feed, entry, title, and link in Atom). That is why a feed reader can parse any feed: it is parsing XML against a known vocabulary.
XML vs JSON, when do I still use XML?
Use XML when you need its specific strengths: mixed content (markup embedded in text), attributes alongside elements, namespaces to combine vocabularies, comments, or mature schema validation via XSD. For lightweight data interchange between programs, JSON is usually the better default.
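Mixed content is the strength that is hardest to picture, so here is a small sketch. In Python's standard-library `xml.etree.ElementTree`, text before the first child lives in `text`, and text after a child lives in that child's `tail`. The sample sentence is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Markup embedded in running text: the <em> sits in the middle of a sentence.
para = ET.fromstring("<p>XML handles <em>mixed</em> content well.</p>")
em = para.find("em")

# Three separate pieces of text surround and fill the inline element.
parts = [para.text, em.text, em.tail]
print(parts)  # ['XML handles ', 'mixed', ' content well.']

# itertext() walks the tree in document order and yields every text piece.
full_text = "".join(para.itertext())
print(full_text)  # XML handles mixed content well.
```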
What is a namespace in XML?
A way to avoid name collisions when documents mix vocabularies. A namespace (declared with xmlns) qualifies element names with a URI, so a <title> from one standard does not clash with a <title> from another. This is essential in formats like Atom, SOAP, and SVG.
Why does my XML fail to parse?
Almost always a well-formedness slip: an unclosed tag, more than one root element, an unescaped & or < in text (use &amp; and &lt;), mismatched nesting, or an unquoted attribute. Most parsers report the line and column of the first error.
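The unescaped-character case is the easiest to prevent: escape on the way in, or let a library serialize for you. A minimal sketch with Python's standard library, where `escape` and `quoteattr` come from `xml.sax.saxutils` and the `raw` string is an invented example:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw = 'Fish & chips < 5 "quid"'

# escape() handles &, <, and > for text content.
escaped = escape(raw)
print(escaped)  # Fish &amp; chips &lt; 5 "quid"

# quoteattr() escapes the value and wraps it in suitable quotes.
doc = "<note title=%s>%s</note>" % (quoteattr(raw), escape(raw))

# Parsing reverses the escaping, so the original string comes back intact.
note = ET.fromstring(doc)
round_trip_ok = note.text == raw and note.get("title") == raw
print(round_trip_ok)  # True

# Building the element with ElementTree escapes automatically on output.
built = ET.Element("note")
built.text = raw
serialized = ET.tostring(built, encoding="unicode")
print(serialized)
```

Building documents through an API rather than string concatenation is usually the safer default, since the serializer applies the escaping rules for you.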
XML was the water KB Cafe swam in: the original site ran on RSS, Atom, and the XML feed tooling of the open web. This is the modern restoration: what XML is, how it stays well-formed, and why the feed layer it powers is still here in 2026.