XPath Tester

Test XPath expressions on XML documents with live results. Debug XPath queries for web scraping. Free online XPath evaluator and validator.

XPath has been the standard XML query language since 1999 and the de-facto selector dialect for HTML scraping since the libxml2 era. This tester evaluates your expression against the document you paste, shows the node count, and renders every match with its absolute path so you can confirm "//div[@class='price']" hits the eleven prices you expected and not the twelve that include a sidebar widget.

XPath 1.0 vs 2.0/3.1 — pick the right version

  • XPath 1.0 — what every browser ships (document.evaluate) and what libxml2, lxml, and Selenium use by default. No sequences, no regex, no upper-case() or lower-case(). String handling is limited to a small core: concat(), contains(), starts-with(), substring(), substring-before(), substring-after(), string-length(), normalize-space(), and translate().
  • XPath 2.0/3.1 — full sequence type system, regex via matches() and replace(), if/then/else, for expressions (plus let from 3.0 on). Requires a dedicated engine such as Saxon or BaseX. Not available natively in browsers.
  • XQuery — superset of XPath 3.1 with FLWOR and constructors. Different language; not a drop-in upgrade.

If you write //title[matches(., "^Chapter \d")] expecting it to work in Selenium, the lookup fails with an InvalidSelectorException. Selenium delegates to the browser's XPath 1.0 engine, which has no matches() function. Rewrite as //title[starts-with(., "Chapter ")], which is looser because it drops the digit check, or do the regex check in your test framework.
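The same failure and fallback can be reproduced outside a browser. The sketch below assumes lxml is installed (it wraps libxml2, an XPath 1.0 engine) and uses a made-up three-title document; the variable names are illustrative only.

```python
import re
from lxml import etree

# Hypothetical sample document, for illustration only.
doc = etree.fromstring(
    "<book>"
    "<title>Chapter 1: Axes</title>"
    "<title>Chapter notes</title>"
    "<title>Appendix A</title>"
    "</book>"
)

# matches() is an XPath 2.0 function, so an XPath 1.0 engine rejects it.
try:
    doc.xpath(r'//title[matches(., "^Chapter \d")]')
    rejected = False
except etree.XPathEvalError:
    rejected = True

# Fallback: a looser XPath 1.0 prefix test, then the regex in host code.
candidates = doc.xpath('//title[starts-with(., "Chapter ")]')
chapters = [t.text for t in candidates if re.match(r"Chapter \d", t.text)]
# chapters == ["Chapter 1: Axes"]
```

The prefix test alone also lets "Chapter notes" through, which is why the regex filter still runs afterwards.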

Working example: scraping prices from a product table

Input

HTML:
<table>
  <tr><td class="name">Widget</td><td class="price">$12.99</td></tr>
  <tr><td class="name">Gadget</td><td class="price">$34.50</td></tr>
  <tr><td class="name">Promo</td><td class="price old">$50.00</td></tr>
</table>

Expression: //td[@class="price"]/text()

Output

Matched 2 nodes:
  /table/tr[1]/td[2]/text() → "$12.99"
  /table/tr[2]/td[2]/text() → "$34.50"

The "Promo" row's td has class="price old" so the exact-match @class="price" skips it.

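Use contains(@class, "price") if you want both the regular prices AND the promo. Add and not(contains(@class, "old")) if you want only the regular prices while tolerating extra classes on the cell. Beware: contains(@class, "price") is a substring test, so it also matches @class="price-old", which is usually a bug.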
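To check the example outside the tester, a minimal sketch with lxml (assumed installed) can run the same expressions against the same table. The table is parsed as XML here, so the paths contain no html, body, or tbody steps; an HTML parser may insert those.

```python
from lxml import etree

table = etree.fromstring("""
<table>
  <tr><td class="name">Widget</td><td class="price">$12.99</td></tr>
  <tr><td class="name">Gadget</td><td class="price">$34.50</td></tr>
  <tr><td class="name">Promo</td><td class="price old">$50.00</td></tr>
</table>
""")
tree = table.getroottree()

# Exact attribute match: skips class="price old".
exact = table.xpath('//td[@class="price"]/text()')
# exact == ['$12.99', '$34.50']

# Substring match: picks up the promo cell as well.
loose = table.xpath('//td[contains(@class, "price")]/text()')
# loose == ['$12.99', '$34.50', '$50.00']

# Substring match minus anything mentioning "old": regular prices only.
regular = table.xpath(
    '//td[contains(@class, "price") and not(contains(@class, "old"))]/text()'
)
# regular == ['$12.99', '$34.50']

# Absolute path of each exact match, similar to what the tester renders.
paths = [tree.getpath(td) for td in table.xpath('//td[@class="price"]')]
# paths == ['/table/tr[1]/td[2]', '/table/tr[2]/td[2]']
```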

Axes, predicates, and the parts of XPath people skip

  • Axes — child:: (default), descendant::, ancestor::, following-sibling::, preceding-sibling::, parent::, self::, attribute:: (or @). //x is shorthand for /descendant-or-self::node()/x, which is why //x[1] is not the first match overall but the first child x of each parent.
  • Predicates — square brackets. Predicates apply left to right, so stacked boolean predicates behave like AND; order only matters when one of them is positional. //book[author="Knuth"][year>1980] matches Knuth books from 1981 onward.
  • Numeric predicates — //book[2] is the second book of each parent, not the second book in the document. Use (//book)[2] for "second book overall".
  • Position vs last() — //li[position()=1] = //li[1]. //li[last()] is the last li child of its parent. //li[last()-1] is the second-to-last.
  • Namespaces — XPath 1.0 has no default namespace: an unprefixed name always means "no namespace". //html:div in an XHTML document only works if you bind the html prefix to http://www.w3.org/1999/xhtml. Many "//div returns nothing" bugs in scrapers are namespace bugs.
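The positional rules above are easier to trust once you see them side by side. This is a small sketch assuming lxml is installed; the two-shelf library document and the texts() helper are invented for the illustration.

```python
from lxml import etree

# Hypothetical document: two parents with three and two books.
library = etree.fromstring(
    "<library>"
    "<shelf><book>A1</book><book>A2</book><book>A3</book></shelf>"
    "<shelf><book>B1</book><book>B2</book></shelf>"
    "</library>"
)

def texts(expr):
    """Evaluate expr and return the text of each matched element."""
    return [node.text for node in library.xpath(expr)]

second_per_parent = texts("//book[2]")        # ['A2', 'B2']
second_overall = texts("(//book)[2]")         # ['A2']
last_per_parent = texts("//book[last()]")     # ['A3', 'B2']
second_to_last = texts("//book[last()-1]")    # ['A2', 'B1']
siblings_after = texts("//book[. = 'A1']/following-sibling::book")  # ['A2', 'A3']

# Positional predicates make stacking order matter.
skip_then_first = texts("//book[position() > 1][1]")  # ['A2', 'B2']
first_then_skip = texts("//book[1][position() > 1]")  # []
```

In the last pair, the second predicate sees only the nodes the first one kept, renumbered from 1, which is why the reversed order selects nothing.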

When to reach for this tool

  • You are writing a Selenium or Playwright test and the locator //button[contains(., "Save")] picks the wrong button — paste the page HTML and tighten the predicate.
  • You inherited a Scrapy or lxml spider with thirty XPath rules and need to debug why one rule started returning empty after the source site changed markup.
  • You are migrating SOAP/XSLT code and need to verify each XPath still hits the expected nodes against sample responses.
  • You are building a sitemap parser and want to confirm //url/loc/text() with namespace handling actually returns URLs.

What this tool will not do

  • It will not execute JavaScript, so it cannot see a JavaScript-rendered DOM on its own. Paste the rendered HTML (browser DevTools → Copy outerHTML), not the server response. Many SPAs deliver a nearly empty <body> until JS runs.
  • It will not evaluate XPath 2.0 functions in browser mode (matches, replace, upper-case, distinct-values). If your expression needs those, run it through Saxon-HE on the command line or rewrite for 1.0.
  • It will not handle malformed HTML the way browser engines do. The tester parses with an HTML5-aware DOM parser; pages that rely on quirks-mode tag soup may not match what Chrome sees.

XML and HTML you paste stay in your browser. Useful for testing scrapers against pages behind a login — copy DOM from DevTools and test locally without re-fetching.

Frequently asked questions

Why does //div return nothing on an XHTML document?

XHTML lives in the http://www.w3.org/1999/xhtml namespace. In XPath 1.0 an unprefixed name test such as div matches only elements in no namespace, so it never matches XHTML elements. Use //*[local-name()="div"] for namespace-agnostic matching, or bind a prefix in your evaluator (the API differs per language) and query with that prefix.
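As one example of binding a prefix, here is a minimal sketch with lxml (assumed installed). The prefix h is an arbitrary choice; only the namespace URI has to match the document.

```python
from lxml import etree

xhtml = etree.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><div>one</div><div>two</div></body>"
    "</html>"
)

# Unprefixed name test: looks for div in no namespace and finds nothing.
unprefixed = xhtml.xpath("//div")
# unprefixed == []

# Namespace-agnostic: compare only the local part of the name.
agnostic = [d.text for d in xhtml.xpath('//*[local-name()="div"]')]
# agnostic == ['one', 'two']

# Bound prefix: "h" is arbitrary, the URI is what counts.
ns = {"h": "http://www.w3.org/1999/xhtml"}
prefixed = [d.text for d in xhtml.xpath("//h:div", namespaces=ns)]
# prefixed == ['one', 'two']
```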

What is the difference between // and /?

/ is the document root or strict parent-child step. // means "anywhere in the descendant axis". /html/body/div matches only direct children at each level; //div matches any div anywhere. // is convenient but slow on big trees — be specific when you can.

How do I select by attribute that contains a space-separated value (like CSS class)?

Pad the normalized attribute with spaces and search for the padded token: //*[contains(concat(" ", normalize-space(@class), " "), " active ")]. This is the standard XPath 1.0 idiom for "has class active" without false-matching active-secondary.
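The difference from a plain contains() shows up quickly on a small list. The sketch below assumes lxml is installed and uses an invented navigation list, including one class attribute with untidy whitespace.

```python
from lxml import etree

# Hypothetical markup for illustration.
nav = etree.fromstring(
    "<ul>"
    '<li class="active">Home</li>'
    '<li class="item  active ">Docs</li>'
    '<li class="active-secondary">Blog</li>'
    '<li class="inactive">About</li>'
    "</ul>"
)

# Naive substring test: every class value here contains "active".
naive = [li.text for li in nav.xpath('//li[contains(@class, "active")]')]
# naive == ['Home', 'Docs', 'Blog', 'About']

# Token test: pad with spaces so only the whole word "active" matches.
token_expr = '//li[contains(concat(" ", normalize-space(@class), " "), " active ")]'
token = [li.text for li in nav.xpath(token_expr)]
# token == ['Home', 'Docs']
```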

My XPath works in DevTools but fails in Selenium. Why?

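DevTools $x() and Selenium both evaluate against the live DOM, but at different moments. You run $x() by hand after the page has settled; Selenium runs the locator the instant your code reaches it, often before JavaScript has inserted the element. Use WebDriverWait with an explicit condition until the element exists. Also check whether the element sits inside an iframe, which Selenium must switch into first.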

How do I get the text of an element without its child elements?

text() returns only direct text children, not concatenated descendants. //h1/text() returns the text directly inside h1 but not text inside any nested <span>. Use string(//h1) or .//text() (then join in your code) if you want the flattened content.
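The three options return different shapes, which is worth seeing once. A minimal sketch, assuming lxml is installed and using a made-up heading with a nested span:

```python
from lxml import etree

page = etree.fromstring("<doc><h1>Price: <span>$12.99</span> today</h1></doc>")

# Direct text children only: the span's content is skipped.
direct = page.xpath("//h1/text()")
# direct == ['Price: ', ' today']

# string() flattens the first h1 into a single string.
flattened = page.xpath("string(//h1)")
# flattened == 'Price: $12.99 today'

# All descendant text nodes in document order, joined in host code.
pieces = page.xpath("//h1//text()")
joined = "".join(pieces)
# joined == 'Price: $12.99 today'
```

Note that string(//h1) uses only the first h1 in document order, so prefer the //text() form when several headings may match.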

Is there an XPath equivalent of CSS :nth-child(2n+1)?

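Yes, with modular arithmetic, but watch what is being counted. //li[position() mod 2 = 1] selects the odd-numbered li among its li siblings, which is really li:nth-of-type(2n+1). For a true :nth-child(2n+1), count every element sibling: //li[count(preceding-sibling::*) mod 2 = 0]. For a general an+b step with a > 0 and b >= 0, test position() mod a = b mod a and position() >= b. CSS is simpler when both work.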
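The two counting styles only diverge when siblings of another element type are mixed in. This sketch assumes lxml is installed; the list and note element names are invented so that one non-li sibling sits among the items.

```python
from lxml import etree

# Element positions: a=1, note=2, b=3, c=4, d=5.
# li-only positions:  a=1, b=2, c=3, d=4.
items = etree.fromstring(
    "<list>"
    "<li>a</li><note>x</note><li>b</li><li>c</li><li>d</li>"
    "</list>"
)

# Counts li siblings only, like li:nth-of-type(2n+1).
odd_of_type = [li.text for li in items.xpath("//li[position() mod 2 = 1]")]
# odd_of_type == ['a', 'c']

# Counts every element sibling, like li:nth-child(2n+1).
odd_child = [
    li.text for li in items.xpath("//li[count(preceding-sibling::*) mod 2 = 0]")
]
# odd_child == ['a', 'b', 'd']
```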

Published · Updated · E-Utils editorial team