XPath Tutorial — Query XML Documents with Path Expressions


Mian Ali Khalid · 2026-05-11 · 7 min read


XPath (XML Path Language) is a query language for selecting nodes from XML documents. It uses path-like syntax to navigate the element tree, similar to file system paths or CSS selectors — but with more expressive filtering via predicates and functions.

Use the XML Formatter to inspect and work with XML structures.

XPath fundamentals

Given this XML:

<bookstore>
  <book category="fiction">
    <title lang="en">The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
    <price>12.99</price>
  </book>
  <book category="tech">
    <title lang="en">Clean Code</title>
    <author>Robert C. Martin</author>
    <price>34.99</price>
  </book>
  <book category="fiction">
    <title lang="fr">Le Petit Prince</title>
    <author>Antoine de Saint-Exupéry</author>
    <price>9.99</price>
  </book>
</bookstore>

Basic XPath expressions:

Expression	Selects
`/bookstore`	Root element `bookstore`
`/bookstore/book`	All `book` elements that are direct children of `bookstore`
`//book`	All `book` elements anywhere in the document
`//title`	All `title` elements
`//book/title`	All `title` elements that are children of `book`
`//@lang`	All `lang` attributes
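`//book[1]`	Each `book` that is the first `book` child of its parent (use `(//book)[1]` for the first in the whole document)
`//book[last()]`	Each `book` that is the last `book` child of its parent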
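
The expressions in the table can be tried without installing anything. The sketch below is an illustration under stated assumptions: it uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath, and a condensed copy of the bookstore sample. Paths are written relative to the root element (`./book`, `.//title`) because that is how ElementTree expects them.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the bookstore sample (authors omitted for brevity).
XML = """<bookstore>
  <book category="fiction"><title lang="en">The Great Gatsby</title><price>12.99</price></book>
  <book category="tech"><title lang="en">Clean Code</title><price>34.99</price></book>
  <book category="fiction"><title lang="fr">Le Petit Prince</title><price>9.99</price></book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the <bookstore> element

# "/bookstore/book" becomes "./book" and "//title" becomes ".//title"
# because ElementTree evaluates paths relative to the element it is called on.
books = root.findall("./book")
titles = [t.text for t in root.findall(".//title")]
first = root.find("./book[1]/title").text
last = root.find("./book[last()]/title").text

# ElementTree cannot return attribute nodes, so "//@lang" is emulated by
# reading the attribute from every element that carries it.
langs = [e.get("lang") for e in root.findall(".//*[@lang]")]

print(len(books), titles, first, last, langs)
```

A full XPath 1.0 engine such as lxml accepts the absolute forms from the table directly.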

Predicates (filtering)

Predicates in square brackets filter nodes:

//book[@category='fiction']

→ All book elements where category attribute equals “fiction”

//book[price > 20]

→ All books with price greater than 20

//book[author='Robert C. Martin']

→ Books by Robert C. Martin

//book[@category='tech' and price < 40]

→ Tech books under $40

//title[@lang='en']

→ Titles in English

//book[position() <= 2]

→ First two books
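
As a rough illustration of how these predicates behave on the sample data, here is a sketch using the standard-library `xml.etree.ElementTree` (an assumption of this example, not a recommendation). That module handles equality predicates but not numeric comparisons, `and`, or `position()`, so those parts are applied in Python after the path has narrowed the candidates. The `titles` helper is a hypothetical name introduced only for this sketch.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book category="fiction"><title>The Great Gatsby</title><author>F. Scott Fitzgerald</author><price>12.99</price></book>
  <book category="tech"><title>Clean Code</title><author>Robert C. Martin</author><price>34.99</price></book>
  <book category="fiction"><title>Le Petit Prince</title><author>Antoine de Saint-Exupéry</author><price>9.99</price></book>
</bookstore>"""

root = ET.fromstring(XML)

def titles(books):
    """Hypothetical helper: the title text of each book element."""
    return [b.findtext("title") for b in books]

# Equality predicates are supported directly.
fiction = root.findall("./book[@category='fiction']")
by_martin = root.findall("./book[author='Robert C. Martin']")

# price > 20 and "@category='tech' and price < 40" are outside the
# ElementTree subset, so the numeric test is applied in Python.
over_20 = [b for b in root.findall("./book")
           if float(b.findtext("price")) > 20]
cheap_tech = [b for b in root.findall("./book[@category='tech']")
              if float(b.findtext("price")) < 40]

# position() <= 2 corresponds to slicing the ordered result list.
first_two = root.findall("./book")[:2]

print(titles(fiction), titles(by_martin), titles(over_20))
```

With a full XPath 1.0 engine, each of these can be written as a single expression exactly as shown above.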

Axes

XPath axes select nodes relative to the context node:

Axis	Meaning
`child::`	Direct children (default)
`parent::`	Parent element
`ancestor::`	All ancestors (parent, grandparent, …)
`descendant::`	All descendants
`following-sibling::`	Siblings after context node
`preceding-sibling::`	Siblings before context node
`attribute::`	Attributes of context node
`self::`	Context node itself

//book/following-sibling::book

→ Every book element that has an earlier book sibling (here, the second and third books)

//title/parent::book

→ book elements that are the parent of a title element

//book/ancestor::bookstore

→ The bookstore element (ancestor of any book)

Abbreviated syntax:

. = self::node()
.. = parent::node()
@attr = attribute::attr
// = /descendant-or-self::node()/
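
To make the direction of each axis concrete, the sketch below walks the same relationships by hand. It assumes Python's standard-library `xml.etree.ElementTree`, which understands the abbreviated `.`, `..`, and `//` forms but has no `axis::` syntax, so the sibling and ancestor axes are emulated with a child-to-parent map. The helper names `following_siblings` and `ancestors` are hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book category="fiction"><title>The Great Gatsby</title></book>
  <book category="tech"><title>Clean Code</title></book>
  <book category="fiction"><title>Le Petit Prince</title></book>
</bookstore>"""

root = ET.fromstring(XML)

# ".." is the abbreviated parent axis, so this mirrors //title/parent::book.
parents = root.findall(".//title/..")

# Elements hold no parent pointer; build a child -> parent map once.
parent_of = {child: parent for parent in root.iter() for child in parent}

def following_siblings(elem, tag):
    """Rough equivalent of elem/following-sibling::tag."""
    siblings = list(parent_of[elem])
    return [s for s in siblings[siblings.index(elem) + 1:] if s.tag == tag]

def ancestors(elem):
    """Rough equivalent of elem/ancestor::*, nearest ancestor first."""
    found = []
    while elem in parent_of:
        elem = parent_of[elem]
        found.append(elem)
    return found

first_book = root.find("./book[1]")
after_first = [b.findtext("title") for b in following_siblings(first_book, "book")]
chain = [a.tag for a in ancestors(root.find(".//title"))]

print(len(parents), after_first, chain)
```

An engine with full axis support, such as lxml, evaluates `following-sibling::` and `ancestor::` directly and needs none of this bookkeeping.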

Functions

String functions

//title[contains(text(), 'Clean')]

→ Titles containing “Clean”

//title[starts-with(text(), 'The')]

→ Titles starting with “The”

//book[string-length(title) > 10]

→ Books with titles longer than 10 characters

normalize-space((//title)[1])

→ Text of the first title in the document, with leading/trailing whitespace stripped and internal runs of whitespace collapsed to single spaces

Numeric functions

count(//book)

→ Number of book elements

sum(//price)

→ Sum of all prices

//book[price = min(//price)]

→ Cheapest book (XPath 2.0 only; in XPath 1.0, which browsers and lxml implement, use //book[not(price > //price)])
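
When the engine at hand lacks a function, the aggregation can be done in the host language instead. The following sketch assumes the standard-library `xml.etree.ElementTree` (which has none of these functions) and shows plain-Python counterparts for the string and numeric functions above. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><title>The Great Gatsby</title><price>12.99</price></book>
  <book><title>Clean Code</title><price>34.99</price></book>
  <book><title>Le Petit Prince</title><price>9.99</price></book>
</bookstore>"""

root = ET.fromstring(XML)
books = root.findall("./book")
prices = [float(p.text) for p in root.findall(".//price")]

book_count = len(books)           # count(//book)
total = round(sum(prices), 2)     # sum(//price), rounded to avoid float noise

# //book[price = min(//price)], done in Python rather than XPath 2.0
cheapest = min(books, key=lambda b: float(b.findtext("price")))

# contains(), starts-with() and string-length() map onto str operations.
with_clean = [t.text for t in root.iter("title") if "Clean" in t.text]
starts_the = [t.text for t in root.iter("title") if t.text.startswith("The")]
long_titles = [b.findtext("title") for b in books
               if len(b.findtext("title")) > 10]

print(book_count, total, cheapest.findtext("title"))
```

Doing the arithmetic in Python also makes it easy to control rounding, which XPath's `sum()` leaves as raw floating point.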

Node functions

//book[name()='book']

→ Elements named “book” (redundant here but useful dynamically)

//book[not(@category)]

→ Books without a category attribute

//book[@category != 'fiction']

→ Books whose category attribute exists and is not “fiction” (books without a category are excluded; use //book[not(@category = 'fiction')] to include them)
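
The difference between `@category != 'fiction'` and `not(@category = 'fiction')` is easy to get wrong: in XPath 1.0 a comparison against an empty node-set is always false, so `!=` never matches an element that lacks the attribute. The sketch below does not run an XPath engine. It mirrors the two rules in plain Python over elements parsed with the standard-library `xml.etree.ElementTree`, and the third, uncategorized book is a hypothetical addition to the sample.

```python
import xml.etree.ElementTree as ET

# The last book has no category attribute (hypothetical, for illustration).
XML = """<bookstore>
  <book category="fiction"><title>The Great Gatsby</title></book>
  <book category="tech"><title>Clean Code</title></book>
  <book><title>Uncategorized Notes</title></book>
</bookstore>"""

books = ET.fromstring(XML).findall("./book")

def title(book):
    return book.findtext("title")

# //book[@category != 'fiction']
# True only if a category attribute exists AND its value differs.
differs = [title(b) for b in books
           if b.get("category") is not None and b.get("category") != "fiction"]

# //book[not(@category = 'fiction')]
# True whenever no category attribute equals 'fiction', which also
# covers books that have no category attribute at all.
not_equal = [title(b) for b in books if not b.get("category") == "fiction"]

print(differs)    # ['Clean Code']
print(not_equal)  # ['Clean Code', 'Uncategorized Notes']
```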

XPath in JavaScript (browser)

const xml = `<bookstore>
  <book category="fiction"><title>Gatsby</title><price>12.99</price></book>
  <book category="tech"><title>Clean Code</title><price>34.99</price></book>
</bookstore>`;

const parser = new DOMParser();
const doc = parser.parseFromString(xml, 'application/xml');

// Evaluate XPath:
function xpathQuery(expression, doc) {
  const result = doc.evaluate(
    expression,
    doc,
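    null,  // Namespace resolver (none needed: this document has no namespaces)
    XPathResult.ORDERED_NODE_ITERATOR_TYPE,  // Node-set results, in document order
    null   // No existing XPathResult to reuse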
  );
  
  const nodes = [];
  let node = result.iterateNext();
  while (node) {
    nodes.push(node);
    node = result.iterateNext();
  }
  return nodes;
}

// Get all book titles:
const titles = xpathQuery('//title/text()', doc);
titles.forEach(t => console.log(t.textContent));
// "Gatsby"
// "Clean Code"

// Get fiction books:
const fiction = xpathQuery('//book[@category="fiction"]', doc);
console.log(fiction.length);  // 1

// Get price of tech books:
const techPrices = xpathQuery('//book[@category="tech"]/price/text()', doc);
console.log(techPrices[0].textContent);  // "34.99"

XPath in Python (lxml)

from lxml import etree

xml = b'''<bookstore>
  <book category="fiction"><title>Gatsby</title><price>12.99</price></book>
  <book category="tech"><title>Clean Code</title><price>34.99</price></book>
</bookstore>'''

root = etree.fromstring(xml)

# Simple selection:
titles = root.xpath('//title/text()')
print(titles)  # ['Gatsby', 'Clean Code']

# Filter by attribute:
fiction_books = root.xpath('//book[@category="fiction"]')
for book in fiction_books:
    print(book.find('title').text)  # "Gatsby"

# Numeric result:
total_price = root.xpath('sum(//price)')  # returns a float
print(round(total_price, 2))  # 47.98

count = root.xpath('count(//book)')
print(int(count))  # 2

# With namespaces:
xml_ns = b'''<root xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>My Doc</dc:title>
</root>'''
root_ns = etree.fromstring(xml_ns)
titles = root_ns.xpath('//dc:title/text()', 
                       namespaces={'dc': 'http://purl.org/dc/elements/1.1/'})
print(titles)  # ['My Doc']

XPath vs CSS selectors

Task	XPath	CSS Selector
Any `div` in the document	`//div`	`div`
Direct child	`//div/p`	`div > p`
By attribute	`//a[@href]`	`a[href]`
By exact attribute value	`//a[@class='nav']`	`a[class='nav']` (`a.nav` also matches `class="nav main"`)
First `li` within its parent	`//li[1]`	`li:first-of-type`
Last `li` within its parent	`//li[last()]`	`li:last-of-type`
Contains text	`//p[contains(text(),'hello')]`	No equivalent
Parent selection	`//li/..`	`:has(> li)` where `:has()` is supported, otherwise no equivalent
Immediately following sibling	`//h2/following-sibling::*[1][self::p]`	`h2 + p`
Select by text	`//button[text()='Submit']`	No equivalent

XPath is more powerful for data extraction (can select parent nodes, filter by text content). CSS selectors are more concise for styling and simpler DOM traversal.
