XPath
Find elements by content with XPath
This is an example of how to find elements by content using XPath. The XPath language provides a simple, concise syntax for selecting nodes from an XML document. XPath also provides rules for converting a node in an XML document object model (DOM) tree to a boolean, double, or string value. Finding elements by content using XPath involves the following steps:
- Obtain a new instance of DocumentBuilderFactory, a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents.
- Configure the factory so that the parser it produces does not validate documents as they are parsed, using the setValidating(boolean validating) API method of DocumentBuilderFactory, with validating set to false.
- Create a new instance of DocumentBuilder, using the newDocumentBuilder() API method of DocumentBuilderFactory.
- Parse the FileInputStream with the content to be parsed, using the parse(InputStream is) API method of DocumentBuilder. This method parses the content of the given InputStream as an XML document and returns a new DOM Document object.
- Create an XPathFactory instance to be used to create XPath objects, with the newInstance() API method of XPathFactory.
- Create a new XPath object, using the underlying object model determined when the XPathFactory was instantiated, with the newXPath() API method of XPathFactory.
- Create a String expression and use the evaluate(String expression, Object item, QName returnType) API method of XPath to evaluate it against the specified Document object. The method returns the result as the specified type.
- In the example, we first create an expression that selects all elements whose content is equal to 'car'. Then we select all elements that contain the String 'car'. Finally, we select all elements with a specified name (entry1) that contain the String 'car'. In all cases the returnType is set to XPathConstants.NODESET, and a NodeList is returned, that is, a collection of the Node objects containing the specified content.
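The returnType argument decides how the result is converted, following the XPath rules for boolean, double, and string values mentioned earlier. The sketch below is only an illustration of that idea: the class name XPathReturnTypes, the parse helper, and the inlined two-entry XML string are assumptions made for this example and are not part of the original snippet.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;

public class XPathReturnTypes {

    // Hypothetical sample document, inlined so the sketch needs no file on disk.
    static final String XML =
        "<entries><entry1 id=\"1\">car</entry1><entry2 id=\"2\">boat</entry2></entries>";

    // Parses an XML string into a DOM Document with a non-validating parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(false);
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        XPath xpath = XPathFactory.newInstance().newXPath();

        // STRING: the string value of the first node selected by the expression.
        String first = (String) xpath.evaluate("//entry1", doc, XPathConstants.STRING);

        // NUMBER: the result of count() comes back as a Double.
        Double count = (Double) xpath.evaluate("count(//*[.='car'])", doc, XPathConstants.NUMBER);

        // BOOLEAN: a node set converts to true when it is not empty.
        Boolean anyBoat = (Boolean) xpath.evaluate("//*[.='boat']", doc, XPathConstants.BOOLEAN);

        System.out.println(first);    // car
        System.out.println(count);    // 1.0
        System.out.println(anyBoat);  // true
    }
}
```

XPathConstants.NODESET remains the right choice when the matching nodes themselves are needed, as in the steps above; the other constants are convenient when only a text value, a count, or a yes/no answer is required.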
Let’s take a look at the code snippet that follows:
```java
package com.javacodegeeks.snippets.core;

import java.io.File;
import java.io.FileInputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class FindElementsByContentWithXPath {

    public static void main(String[] args) throws Exception {

        // Build a non-validating DOM parser and parse the input file
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(false);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new FileInputStream(new File("in.xml")));

        XPathFactory factory = XPathFactory.newInstance();
        XPath xpath = factory.newXPath();

        String expression;
        NodeList nodeList;

        // 1. all elements that are equal with 'car'
        expression = "//*[.='car']";
        nodeList = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        System.out.print("1. ");
        for (int i = 0; i < nodeList.getLength(); i++) {
            System.out.print(nodeList.item(i).getNodeName() + " ");
        }
        System.out.println();

        // 2. all elements that contain the string 'car'
        expression = "//*[contains(.,'car')]";
        nodeList = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        System.out.print("2. ");
        for (int i = 0; i < nodeList.getLength(); i++) {
            System.out.print(nodeList.item(i).getNodeName() + " ");
        }
        System.out.println();

        // 3. all entry1 elements that contain the string 'car'
        expression = "//entry1[contains(.,'car')]";
        nodeList = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        System.out.print("3. ");
        for (int i = 0; i < nodeList.getLength(); i++) {
            System.out.print(nodeList.item(i).getNodeName() + " ");
        }
        System.out.println();
    }
}
```
Input:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<entries>
    <entry1 id="1">car</entry1>
    <entry2 id="2">boat</entry2>
    <entry3 id="3">motorcycle</entry3>
    <entry3 id="4">car</entry3>
</entries>
```
Output:
1. entry1 entry3
2. entries entry1 entry3
3. entry1
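The second result includes entries because the string value of an element is the concatenation of all its descendant text, so the root element also "contains" the string 'car'. If only elements whose own text contains the string are wanted, one possible variation is to test the element's text node children instead. The sketch below illustrates this under stated assumptions: the class name OwnTextMatch, the namesMatching helper, and the inlined copy of the input XML are introduced for this example only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class OwnTextMatch {

    // Same content as the input file above, inlined for a self-contained sketch.
    static final String XML = "<entries>"
        + "<entry1 id=\"1\">car</entry1>"
        + "<entry2 id=\"2\">boat</entry2>"
        + "<entry3 id=\"3\">motorcycle</entry3>"
        + "<entry3 id=\"4\">car</entry3>"
        + "</entries>";

    // Hypothetical helper: returns the names of all nodes selected by the expression.
    static List<String> namesMatching(String xml, String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(false);
        Document doc = dbf.newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Matches on the full string value, so the root element is included.
        System.out.println(namesMatching(XML, "//*[contains(.,'car')]"));
        // prints [entries, entry1, entry3]

        // Matches only elements that have a text node child containing 'car'.
        System.out.println(namesMatching(XML, "//*[text()[contains(.,'car')]]"));
        // prints [entry1, entry3]
    }
}
```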
This was an example of how to find elements by content using XPath in Java.
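Note on the loops: the condition in each for loop must be i < nodeList.getLength(). Written as i > nodeList.getLength(), the loop body never executes, so only the numeric prefixes are printed and no element names appear.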