How to Find DOM Elements by Text

There’s no getElementByText() in the DOM API. If you need to locate an element based on what it says rather than what it is, you have to build that capability yourself. This comes up more often than you’d think — automation scripts, UI testing, dynamic content parsing — and the right approach depends on how much flexibility you need.

Key Takeaways

  • The DOM API has no built-in method for selecting elements by text content, but three native approaches fill the gap: filtering with querySelectorAll, traversing with TreeWalker, and querying with XPath.
  • TreeWalker is the most versatile native option for full DOM text searches across any element type without collecting a large NodeList upfront.
  • Prefer textContent over innerText for text matching — it’s faster, avoids triggering layout recalculation, and behaves consistently regardless of element visibility.
  • Watch for common pitfalls like extra whitespace, nested descendant text, and dynamically injected content that may not be present when your script runs.

Why querySelector Can’t Query DOM by Text Content

querySelector() and querySelectorAll() only accept CSS selectors. While CSS has a :has() pseudo-class and attribute selectors, there is no standard CSS selector for matching an element’s text content. Selectors like div:text("Submit") simply don’t exist in the spec.

That leaves you with three practical approaches: filter a candidate set, traverse the DOM with a native API, or use XPath.

Method 1: Filter a Candidate Set by Text

The simplest approach is to query elements by tag or class, then filter by text. This works well when you know the element type in advance.

function findByText(tag, text) {
  return [...document.querySelectorAll(tag)].filter(el =>
    el.textContent.trim() === text
  )
}

// Usage
const buttons = findByText('button', 'Submit')

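This is readable and fast when the candidate set is small. The weakness: it only searches the elements your selector names. querySelectorAll accepts any selector, including a comma-separated list such as 'button, a', but you still need to know the candidates in advance. Searching every element with '*' works but is slower on large DOMs.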
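As a minimal sketch of keeping the candidate set small, the variant below takes any CSS selector plus an optional root element to search under. The name findBySelectorText and the root parameter are assumptions introduced for illustration; they are not part of the helper above.

```javascript
// Hypothetical variant of findByText: accepts any CSS selector (including a
// comma-separated list) and an optional root that limits the search scope.
function findBySelectorText(selector, text, root = document) {
  return Array.from(root.querySelectorAll(selector)).filter(
    el => el.textContent.trim() === text
  )
}

// Usage (in a browser):
// const form = document.querySelector('form')
// const controls = findBySelectorText('button, a, [role="button"]', 'Submit', form)
```

Scoping the query to a container such as a form or dialog usually shrinks the NodeList far more than tuning the selector does.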

Method 2: Traverse the DOM with TreeWalker

TreeWalker is a built-in DOM API that lets you walk through nodes efficiently. It’s well-supported in all modern browsers and avoids the overhead of collecting a full NodeList upfront.

function findElementsByText(root, text) {
  const walker = document.createTreeWalker(
    root,
    NodeFilter.SHOW_ELEMENT,
    {
      acceptNode(node) {
        return node.textContent.trim() === text
          ? NodeFilter.FILTER_ACCEPT
          : NodeFilter.FILTER_SKIP
      }
    }
  )

  const results = []
  while (walker.nextNode()) results.push(walker.currentNode)
  return results
}

// Usage
const matches = findElementsByText(document.body, 'TV')

This searches any element type across the entire tree under root (the root node itself is not tested), a generic solution the tag-specific approach can’t provide. It also supports early termination: if you only need the first match, call nextNode() once and stop.
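Early termination can be sketched as follows. The helper name findFirstElementByText is hypothetical; it uses the same filter as above but calls nextNode() only once, so the walk stops at the first accepted element.

```javascript
// Return the first element under `root` whose trimmed textContent equals
// `text`, or null if there is none. The walk stops at the first match.
function findFirstElementByText(root, text) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, {
    acceptNode(node) {
      return node.textContent.trim() === text
        ? NodeFilter.FILTER_ACCEPT
        : NodeFilter.FILTER_SKIP
    }
  })

  // nextNode() moves to the next accepted node in document order and
  // returns it, or returns null when nothing is left to visit.
  return walker.nextNode()
}

// Usage (in a browser):
// const first = findFirstElementByText(document.body, 'TV')
```

Because the walk is in document order, the first match is the outermost matching element: for a div that wraps a single matching span, the div is returned.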

Note on FILTER_SKIP vs. FILTER_REJECT: Using FILTER_SKIP here means the walker still descends into the children of a non-matching node. If you used FILTER_REJECT instead, the walker would skip the node and its entire subtree. For text searching, FILTER_SKIP is almost always what you want, since a parent’s textContent might not match even though a deeper descendant’s does.
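FILTER_REJECT is still useful when whole subtrees should be left out of the search. The sketch below is one possible way to combine the two results: it rejects any element that matches an exclusion selector and skips everything else that does not match the text. The function name and the excludeSelector parameter are assumptions for illustration.

```javascript
// Hypothetical helper: like findElementsByText, but prunes every subtree
// whose root matches `excludeSelector`.
function findElementsByTextExcluding(root, text, excludeSelector) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, {
    acceptNode(node) {
      // REJECT: ignore this element and everything beneath it
      if (node.matches(excludeSelector)) return NodeFilter.FILTER_REJECT
      // SKIP: ignore this element but keep descending into its children
      return node.textContent.trim() === text
        ? NodeFilter.FILTER_ACCEPT
        : NodeFilter.FILTER_SKIP
    }
  })

  const results = []
  while (walker.nextNode()) results.push(walker.currentNode)
  return results
}

// Usage (in a browser):
// findElementsByTextExcluding(document.body, 'TV', '[hidden], script, style')
```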

Method 3: XPath Text Search in the DOM

For more expressive matching, document.evaluate() supports XPath expressions, including text-based queries. This is the most powerful option for complex patterns.

function findByXPath(expression, context = document) {
  const result = document.evaluate(
    expression,
    context,
    null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
    null
  )

  return Array.from({ length: result.snapshotLength }, (_, i) =>
    result.snapshotItem(i)
  )
}

// Find elements whose first direct text node contains "Submit"
const els = findByXPath('//*[contains(text(), "Submit")]')

XPath text search in the DOM handles partial matches and complex conditions cleanly. The tradeoff is readability — XPath syntax is unfamiliar to most frontend developers.

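One thing to keep in mind: contains(text(), "Submit") does not look at all of an element’s text. In XPath 1.0, passing the node set text() to contains() converts it to the string value of its first node, so the expression tests only the element’s first direct text node. If “Submit” lives inside a child element, or in a later text node that follows a child element, this expression won’t match. To test every direct text node, use text()[contains(., "Submit")] as the predicate. To search across all descendant text, use contains(., "Submit") instead, where . refers to the string value of the entire element including its descendants. Note that this form also matches every ancestor of the element that holds the text.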
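When the search text comes from a variable, it has to be quoted safely, because XPath 1.0 string literals have no escape sequences. The sketch below shows one common workaround using concat(). It also builds a predicate that tests each direct text node individually, since contains(text(), ...) converts the node set to the string value of its first node only. Both function names are hypothetical.

```javascript
// Turn an arbitrary string into an XPath 1.0 string expression.
function toXPathLiteral(value) {
  if (!value.includes('"')) return `"${value}"`
  if (!value.includes("'")) return `'${value}'`
  // Both quote types present: split on double quotes and glue the pieces
  // back together with concat(), inserting '"' between them.
  const parts = value.split('"').map(part => `"${part}"`)
  return 'concat(' + parts.join(", '\"', ") + ')'
}

// Build an expression matching elements that have at least one direct
// text node containing `text`.
function directTextContains(text) {
  return `//*[text()[contains(., ${toXPathLiteral(text)})]]`
}

// Usage with the findByXPath helper above (in a browser):
// const els = findByXPath(directTextContains('Submit'))
```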

textContent vs. innerText: Which to Use for Matching

Both properties return text, but they behave differently:

Property    | Returns                                  | Triggers layout?
textContent | Raw DOM text, including hidden elements  | No
innerText   | Rendered text only                       | Yes (reflow)

Use textContent for text matching. It’s faster, doesn’t trigger layout recalculation, and works consistently across all elements regardless of visibility.

Common Pitfalls When You Find DOM Elements by Text

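Whitespace: textContent includes whitespace from HTML formatting. Always .trim() before comparing, and remember that trim() only strips leading and trailing whitespace; line breaks and indentation inside the text remain.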
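If the markup wraps text across lines, internal whitespace needs handling as well. A minimal sketch, assuming you want every run of whitespace treated as a single space (the helper names are hypothetical):

```javascript
// Collapse every run of whitespace (spaces, tabs, newlines, non-breaking
// spaces) into one space, then strip the ends.
function normalizeText(value) {
  return value.replace(/\s+/g, ' ').trim()
}

// Compare two strings while ignoring differences in whitespace layout.
function sameText(a, b) {
  return normalizeText(a) === normalizeText(b)
}

// Usage (in a browser):
// [...document.querySelectorAll('button')].filter(el => sameText(el.textContent, 'Add to cart'))
```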

Nested text: An element’s textContent includes all descendant text. In <div><span>TV</span></div>, both the span and the div have a textContent of "TV", so an exact-match search returns both. Be specific about which element level you’re targeting.
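One way to target the deepest level, sketched here under the assumption that you already have an array of matching elements, is to drop every match that contains another match. The name keepInnermost is hypothetical.

```javascript
// Keep only the matches that do not contain another match.
// Node.contains() also returns true for the node itself, hence the
// `other !== el` check.
function keepInnermost(elements) {
  return elements.filter(
    el => !elements.some(other => other !== el && el.contains(other))
  )
}

// Usage (in a browser), with any of the finders above:
// const innermost = keepInnermost(findElementsByText(document.body, 'TV'))
```

The comparison is quadratic in the number of matches, which is fine for the handful of results a text search typically returns.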

Dynamic content: Text injected after your search runs won’t be found by a one-time lookup. Use a MutationObserver or run your search after the content is confirmed to exist.
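A MutationObserver-based wait could look like the sketch below. The helper name waitForText, the 5000 ms default timeout, and the choice to re-run a full querySelectorAll('*') scan on every mutation are all assumptions made to keep the example short; on a busy page you would likely narrow the root or the selector.

```javascript
// Hypothetical helper: resolves with the first element under `root` whose
// trimmed textContent equals `text`, waiting for it to appear if needed.
// Rejects if nothing matches within `timeoutMs`.
function waitForText(root, text, timeoutMs = 5000) {
  const find = () =>
    Array.from(root.querySelectorAll('*')).find(
      el => el.textContent.trim() === text
    )

  return new Promise((resolve, reject) => {
    // The content may already be there
    const existing = find()
    if (existing) {
      resolve(existing)
      return
    }

    const observer = new MutationObserver(() => {
      const match = find()
      if (!match) return
      clearTimeout(timer)
      observer.disconnect()
      resolve(match)
    })

    const timer = setTimeout(() => {
      observer.disconnect()
      reject(new Error(`No element with text "${text}" after ${timeoutMs} ms`))
    }, timeoutMs)

    // childList + subtree reports added nodes anywhere under root;
    // characterData reports edits to existing text nodes.
    observer.observe(root, { childList: true, subtree: true, characterData: true })
  })
}

// Usage (in a browser):
// waitForText(document.body, 'Order confirmed').then(el => el.scrollIntoView())
```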

Choosing the Right Approach

  • Known element type, simple match → filter with querySelectorAll
  • Any element type, full DOM search → TreeWalker
  • Partial match or complex pattern → XPath via document.evaluate()

If you’re working in a testing context, tools like Testing Library provide getByText() built in — worth knowing, though the native techniques above remain essential for non-test scripts.

Conclusion

Text-based DOM lookup is a gap in the standard API, but these three approaches cover most practical cases. Use querySelectorAll with a filter for quick, targeted searches when you know the element type. Reach for TreeWalker when you need a full DOM traversal without committing to a specific tag. Turn to XPath when the matching logic demands partial text or complex conditions. Whichever method you choose, prefer textContent over innerText, trim whitespace before comparing, and account for nested descendant text to avoid false matches.

FAQs

Can querySelector or querySelectorAll select elements by text content?

No. querySelector and querySelectorAll only accept CSS selectors, and CSS has no selector for matching text content. You need to use JavaScript-based approaches such as filtering a querySelectorAll result set, walking the DOM with TreeWalker, or running an XPath expression with document.evaluate to locate elements by what they contain.

What is the difference between FILTER_SKIP and FILTER_REJECT in a TreeWalker?

FILTER_SKIP tells the TreeWalker to skip the current node but still visit its children. FILTER_REJECT skips the node and its entire subtree. For text-based searches, FILTER_SKIP is usually the right choice because a parent might not match while a deeper descendant does.

Does contains(text(), value) in XPath match text inside nested child elements?

No. The expression contains(text(), value) only checks the element's first direct text node. If the target string lives inside a nested child element, use contains(., value) instead. The dot refers to the context node, whose string value includes all descendant text.

Why is textContent preferred over innerText for text matching?

textContent is faster because it does not trigger a layout reflow. It returns all text in the DOM subtree regardless of CSS visibility, making it consistent and predictable. innerText returns only rendered text and forces the browser to recalculate layout, which adds unnecessary overhead for matching purposes.
