Find elements by content with XPath

This is an example of how to find elements by content using XPath. The XPath language provides a simple, concise syntax for selecting nodes from an XML document. XPath also defines rules for converting a node in an XML Document Object Model (DOM) tree to a boolean, number (a double in Java), or string value. To find elements by content using XPath, you should:

  • Obtain a new instance of DocumentBuilderFactory, a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents.
  • Configure the factory so that the parsers it produces do not validate documents as they are parsed, by calling the setValidating(boolean validating) API method of DocumentBuilderFactory with validating set to false.
  • Create a new instance of DocumentBuilder, using the newDocumentBuilder() API method of DocumentBuilderFactory.
  • Parse a FileInputStream that supplies the XML content, using the parse(InputStream is) API method of DocumentBuilder. This method parses the content of the given InputStream as an XML document and returns a new DOM Document object.
  • Create an XPathFactory instance to be used to create XPath objects, with the newInstance() API method of XPathFactory.
  • Create a new XPath object, using the underlying object model determined when the XPathFactory was instantiated, with the newXPath() API method of XPathFactory.
  • Create a String expression and use the evaluate(String expression, Object item, QName returnType) API method of XPath to evaluate it in the context of the Document object. The method returns the result as the specified type.
  • In the example, we first search for all elements whose content is equal to 'car'. Then we search for all elements whose content contains the String 'car'; because the string value of an element includes the text of all its descendants, the root element matches this search too. Finally, we search for all entry1 elements that contain the String 'car'. In all cases the returnType is set to XPathConstants.NODESET, so a NodeList is returned, that is, a collection of the Node objects with the specified content.
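
NODESET is only one of the possible return types. The conversion rules mentioned earlier mean that the same evaluate method can also return a boolean, a number, or a string. The following is a minimal sketch of that idea, not part of the original example: the class name XPathResultTypes and its evaluate helper are hypothetical, and the XML is assumed to have the same content as in.xml, inlined as a string so that no file is needed.

```java
import java.io.StringReader;

import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class XPathResultTypes {

    // Assumption: same content as the article's in.xml, inlined as a string.
    static final String XML =
        "<entries>"
        + "<entry1 id=\"1\">car</entry1>"
        + "<entry2 id=\"2\">boat</entry2>"
        + "<entry3 id=\"3\">motorcycle</entry3>"
        + "<entry3 id=\"4\">car</entry3>"
        + "</entries>";

    // Parses XML and evaluates the expression, converting the result to returnType.
    static Object evaluate(String expression, QName returnType) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc, returnType);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // A node set converts to true when it contains at least one node.
        Boolean anyCar = (Boolean) evaluate("//*[.='car']", XPathConstants.BOOLEAN);
        // count() yields an XPath number, which Java receives as a Double.
        Double carCount = (Double) evaluate("count(//*[.='car'])", XPathConstants.NUMBER);
        // A node set converts to the string value of its first node in document order.
        String firstEntry3 = (String) evaluate("//entry3", XPathConstants.STRING);

        System.out.println("any 'car' element: " + anyCar);
        System.out.println("number of 'car' elements: " + carCount);
        System.out.println("first entry3 text: " + firstEntry3);
    }
}
```

With the input assumed above, this should report true, 2.0 and motorcycle respectively. Asking for BOOLEAN or NUMBER can be convenient when only the existence or the number of matches matters and the nodes themselves are not needed.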

Let’s take a look at the code snippet that follows:

package com.javacodegeeks.snippets.core;

import java.io.File;
import java.io.FileInputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class FindElementsByContentWithXPath {
	
	public static void main(String[] args) throws Exception {
		
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		dbf.setValidating(false);
		DocumentBuilder db = dbf.newDocumentBuilder();
		
		Document doc = db.parse(new FileInputStream(new File("in.xml")));
		
		XPathFactory factory = XPathFactory.newInstance();
		
		XPath xpath = factory.newXPath();
		
		String expression;
		NodeList nodeList;
		
		// 1. all elements whose content is equal to 'car'
		expression = "//*[.='car']";
		nodeList = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
		System.out.print("1. ");
		for (int i = 0; i < nodeList.getLength(); i++) {
			System.out.print(nodeList.item(i).getNodeName() + " ");
		}
		System.out.println();
		
		// 2. all elements that contain the string 'car'
		expression = "//*[contains(.,'car')]";
		nodeList = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
		System.out.print("2. ");
		for (int i = 0; i < nodeList.getLength(); i++) {
			System.out.print(nodeList.item(i).getNodeName() + " ");
		}
		System.out.println();

		// 3. all entry1 elements that contain the string 'car'
		expression = "//entry1[contains(.,'car')]";
		nodeList = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
		System.out.print("3. ");
		for (int i = 0; i < nodeList.getLength(); i++) {
			System.out.print(nodeList.item(i).getNodeName() + " ");
		}
		System.out.println();
		
			
	}

}

Input:

<?xml version="1.0" encoding="UTF-8"?>
<entries>
    <entry1 id="1">car</entry1>
    <entry2 id="2">boat</entry2>
    <entry3 id="3">motorcycle</entry3>
    <entry3 id="4">car</entry3>
</entries>

Output:

1. entry1 entry3 
2. entries entry1 entry3 
3. entry1
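
The output above prints element names only, so it does not show which of the two entry3 elements matched. As a rough sketch, the matches can be labelled with their id attribute as well. The class MatchedElementIds and its describeMatches method are hypothetical names introduced for illustration, and the XML is assumed to have the same content as the input above, inlined as a string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class MatchedElementIds {

    // Assumption: same content as the input shown above, inlined as a string.
    static final String XML =
        "<entries>"
        + "<entry1 id=\"1\">car</entry1>"
        + "<entry2 id=\"2\">boat</entry2>"
        + "<entry3 id=\"3\">motorcycle</entry3>"
        + "<entry3 id=\"4\">car</entry3>"
        + "</entries>";

    // Returns one label per matching element, e.g. "entry3[id=4]".
    // The expression is expected to select element nodes only.
    static List<String> describeMatches(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);

            List<String> labels = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                String name = element.getNodeName();
                // Elements without an id (such as the root) are labelled by name only.
                labels.add(element.hasAttribute("id")
                    ? name + "[id=" + element.getAttribute("id") + "]"
                    : name);
            }
            return labels;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("1. " + describeMatches("//*[.='car']"));
        System.out.println("2. " + describeMatches("//*[contains(.,'car')]"));
        System.out.println("3. " + describeMatches("//entry1[contains(.,'car')]"));
    }
}
```

For the first query this should list entry1[id=1] and entry3[id=4], which confirms that the entry3 element holding 'motorcycle' is not among the matches.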

 
This was an example of how to find elements by content using XPath in Java.

Byron Kiourtzoglou

Byron is a master software engineer working in the IT and Telecom domains. He is an applications developer in a wide variety of applications/services. He is currently acting as the team leader and technical architect for a proprietary service creation and integration platform for both the IT and Telecom industries, in addition to an in-house big data real-time analytics solution. He is always fascinated by SOA, middleware services and mobile development. Byron is co-founder and Executive Editor at Java Code Geeks.