
Java XPath Examples

1. Introduction

The previous article, Java XPath Best Practices Tutorial (https://examples.javacodegeeks.com/core-java/xpath-best-practices-tutorial/), explored how to set up a Java application that uses a DOM parser to read an XML file into a DOM (Document Object Model) document, and an XPath object to evaluate XPath expressions against that DOM.

This article dives into how to construct XPath expressions, starting with the syntax used to build them and ending with some examples that sum up the concepts explored.

The download for this article includes the inventory.xml file used in the previous article and the complete source code for a simple Java console application called XPath Expression Explorer. More details about XPath Expression Explorer are revealed throughout this article.

2. XPath Expression Explorer

This article builds and uses a Java application (XPath Expression Explorer) to reveal facts about XPath expressions and to help shorten the learning curve encountered when learning XPath expressions.

2.1 The Data

Below is the inventory.xml file from the previous article.

inventory.xml

<?xml version="1.0" encoding="UTF-8"?>
<inventory>
    <vendor name="Dell">
        <computer>
            <model>Win 10 Laptop</model>
            <os>Windows 10</os>
            <cpu>Intel i7</cpu>
            <ram>12GB</ram>
            <price>900.00</price>
        </computer>
        <computer>
            <model>Low Cost Windows Laptop</model>
            <os>Windows 10 Home</os>
            <cpu>Intel Pentium</cpu>
            <ram>4GB</ram>
            <price>313.00</price>
        </computer>
        <computer>
            <model>64 Bit Windows Desktop Computer</model>
            <os>Windows 10 Home 64 Bit</os>
            <cpu>AMD A8-Series</cpu>
            <ram>8GB</ram>
            <price>330.00</price>
        </computer>
    </vendor>
    <vendor name="Apple">
        <computer>
            <model>Apple Desktop Computer</model>
            <os>MAC OS X</os>
            <cpu>Intel Core i5</cpu>
            <ram>8GB</ram>
            <price>1300.00</price>
        </computer>
        <computer>
            <model>Apple Low Cost Desktop Computer</model>
            <os>OS X Yosemite</os>
            <cpu>4th Gen Intel Core i5</cpu>
            <ram>8GB</ram>
            <price>700.00</price>
        </computer>
    </vendor>
    <vendor name="HP">
        <computer>
            <model>HP Low Cost Windows 10 Laptop</model>
            <os>Windows 10 Home</os>
            <cpu>AMD A6-Series</cpu>
            <ram>4GB</ram>
            <price>230.00</price>
        </computer>
        <computer>
            <model>Windows 7 Desktop</model>
            <os>Windows 7</os>
            <cpu>6th Gen Intel Core i5</cpu>
            <ram>6GB</ram>
            <price>750.00</price>
        </computer>
        <computer>
            <model>HP High End, Low Cost 64 Bit Desktop</model>
            <os>Windows 10 Home 64 Bit</os>
            <cpu>6th Gen Intel Core i7</cpu>
            <ram>12GB</ram>
            <price>800.00</price>
        </computer>
    </vendor>
</inventory>

A.   There are 3 vendors; each vendor has a unique name
B.   There are 8 computers defined
C.   Each computer node has 5 children: 
     * model – Name of this configuration
     * os – Name of Operating System installed
     * cpu – Type of processor
     * ram – size of installed RAM
     * price – expressed as a decimal number

2.2 The Application

Below is the Java code that comprises the XPath Expression Explorer console application.

JavaXPathExpressionExplorer.java

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import org.xml.sax.SAXException;

public class JavaXPathExpressionExplorer {

    public static final String DEFAULT_XML_FILENAME = "inventory.xml";

    public static void main(String... args) {

        // Setup an InputStreamReader to read from the keyboard
        InputStreamReader reader = new InputStreamReader(System.in);
        BufferedReader in = new BufferedReader(reader);

        // Instantiate the factory that supplies the DOM parser
        DocumentBuilderFactory builderFactory =
                DocumentBuilderFactory.newInstance();

        DocumentBuilder domParser = null;
        try {
            // Instantiate the DOM parser
            domParser = builderFactory.newDocumentBuilder();

            // Load the DOM Document from the XML data using the parser
            Document domDocument =
                    domParser.parse(getFileInputStreamName(in));

            // Instantiate an XPath object which compiles
            // and evaluates XPath expressions.
            XPath xPath = XPathFactory.newInstance().newXPath();

            while (true) {

                System.out.print("Enter expression (blank line to exit): ");
                String expr = in.readLine(); // Holds the XPath expression

                try {
                    if ((expr == null) || (expr.length() == 0)) {
                        System.exit(0); // User is done entering expressions
                    }
                    System.out.println(expr + " evaluates to:");

                    // See if expr evaluates to a String
                    String resString = (String) xPath.compile(expr).
                        evaluate(domDocument, XPathConstants.STRING);
                    if (resString != null) {
                        System.out.println("String: " + resString);
                    }

                    Number resNumber = (Number) xPath.compile(expr).
                        evaluate(domDocument, XPathConstants.NUMBER);
                    if (resNumber != null) {
                        System.out.println("Number: " + resNumber);
                    }

                    Boolean resBoolean = (Boolean) xPath.compile(expr).
                        evaluate(domDocument, XPathConstants.BOOLEAN);
                    if (resBoolean != null) {
                        System.out.println("Boolean: " + resBoolean);
                    }

                    Node resNode = (Node) xPath.compile(expr).
                        evaluate(domDocument, XPathConstants.NODE);
                    if (resNode != null) {
                        System.out.println("Node: " + resNode);
                    }

                    NodeList resNodeList = (NodeList) xPath.compile(expr).
                        evaluate(domDocument, XPathConstants.NODESET);
                    if (resNodeList != null) {
                        int lenList = resNodeList.getLength();
                        System.out.println("Number of nodes in NodeList: " + lenList);
                        for (int i = 1; i <= lenList; i++) {
                            resNode = resNodeList.item(i-1);
                            String resNodeNameStr = resNode.getNodeName();
                            String resNodeTextStr = resNode.getTextContent();
                            System.out.println(i + ": " + resNode + "  (NodeName:'" +
                                resNodeNameStr + "'    NodeTextContent:'" + 
                                resNodeTextStr + "')");
                        }
                    }
                    outputSeparator();

                } catch (XPathExpressionException e) {
                    // Do nothing. This prevents output to console if 
                    // expression result type is not appropriate
                    // for the XPath expression being compiled and evaluated
                }

            } // end  while (true)

        } catch (SAXException e) {
            // Even though we are using a DOM parser a SAXException is thrown
            // if the DocumentBuilder cannot parse the XML file
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
    }

    // Helper method to load the XML file into the DOM Document
    public static String getFileInputStreamName(BufferedReader inputReader) {
        System.out.print("Enter XML file name (abc.xml) " +
            "or leave blank to use " + DEFAULT_XML_FILENAME + ": ");
        String fileName = null;
        try {
            fileName = inputReader.readLine();
        } catch (IOException e) {
            e.printStackTrace();
        }
        if ((fileName == null) || (fileName.length() == 0)) {
            return DEFAULT_XML_FILENAME;
        }
        return fileName;
    }

    // Helper method to pretty up the output
    public static void outputSeparator() {
        System.out.println("=+=+=+=+=+=+=+=+");
    }

}

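The application initially prompts the user for an XML filename. Respond to this prompt with a blank line to use the inventory.xml file. Because the file name is passed directly to the DOM parser, a relative name such as inventory.xml is resolved against the application’s working directory.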

The application then takes an XPath expression entered from the keyboard, compiles and evaluates it once for each return type defined by XPathConstants, and displays the results to the user.

The main loop in this application repeatedly prompts for XPath expressions. Entering a blank line terminates the application.

Admittedly the application is crude, but it is effective for learning about XPath expressions.
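
The listing above calls xPath.compile(expr) once for every return type. As a minimal sketch of an alternative, the expression can be compiled a single time and the resulting XPathExpression reused for each type. The class name CompileOnceDemo, the method evaluateAllTypes, and the embedded two-computer excerpt of inventory.xml are assumptions made for this illustration; they are not part of the article's download.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's download.
class CompileOnceDemo {

    // Assumption: a small excerpt of inventory.xml, embedded so the sketch is self-contained.
    static final String XML =
        "<inventory><vendor name='Dell'>"
      + "<computer><model>Win 10 Laptop</model><cpu>Intel i7</cpu></computer>"
      + "<computer><model>Low Cost Windows Laptop</model><cpu>Intel Pentium</cpu></computer>"
      + "</vendor></inventory>";

    // Compiles expr once, then evaluates it as each of the five result types.
    static Map<String, Object> evaluateAllTypes(String expr) {
        Map<String, QName> types = new LinkedHashMap<>();
        types.put("String", XPathConstants.STRING);
        types.put("Number", XPathConstants.NUMBER);
        types.put("Boolean", XPathConstants.BOOLEAN);
        types.put("Node", XPathConstants.NODE);
        types.put("NodeList", XPathConstants.NODESET);
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPathExpression compiled = XPathFactory.newInstance().newXPath().compile(expr);
            Map<String, Object> results = new LinkedHashMap<>();
            for (Map.Entry<String, QName> type : types.entrySet()) {
                // Same compiled expression, different requested return type
                results.put(type.getKey(), compiled.evaluate(doc, type.getValue()));
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        evaluateAllTypes("/inventory/vendor/computer/cpu[text() = 'Intel Pentium']")
            .forEach((type, value) -> System.out.println(type + ": " + value));
    }
}
```

Unlike the Explorer, this sketch wraps every checked exception in an IllegalStateException, so a single failing return type aborts the whole call.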

3. XPath Expressions

3.1 XPathConstants Effect on XPath Expressions

The evaluate() method of an XPath object accepts an optional return type, one of the constants defined in XPathConstants, which determines the Java data type of the result and therefore the value that is returned.

NOTE: If the optional XPathConstants value is not passed to evaluate(), the data type of the result returned by evaluate() is String.
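
The note above can be checked with a short sketch that calls both forms of evaluate() on the same expression. The class name DefaultReturnTypeDemo, its method names, and the embedded excerpt of inventory.xml are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's download.
class DefaultReturnTypeDemo {

    // Assumption: a small excerpt of inventory.xml, embedded as a string.
    static final String XML =
        "<inventory><vendor name='Dell'>"
      + "<computer><model>Win 10 Laptop</model><cpu>Intel i7</cpu></computer>"
      + "<computer><model>Low Cost Windows Laptop</model><cpu>Intel Pentium</cpu></computer>"
      + "</vendor></inventory>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Two-argument evaluate(): no return type is given, so a String comes back.
    static String evaluateDefault(String expr) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expr, parse());
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    // Three-argument evaluate() with an explicit STRING return type.
    static String evaluateAsString(String expr) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return (String) xPath.evaluate(expr, parse(), XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String expr = "/inventory/vendor/computer/cpu";
        System.out.println("default: " + evaluateDefault(expr));
        System.out.println("STRING:  " + evaluateAsString(expr));
    }
}
```

The expression here matches two cpu nodes. When a node-set is converted to a String, XPath uses the text of the first node in document order, so both calls return the text of the first cpu element.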

The table below shows the effects of the different XPathConstants when the XPath expression /inventory/vendor/computer/cpu[text() = "Intel Pentium"] is evaluated against a DOM built from the inventory.xml file (see section 2.1 The Data).

Table showing effects of different XPathConstants

XPath Constant             Java Data Type    Value Returned

XPathConstants.STRING      String            Intel Pentium
XPathConstants.NUMBER      Double            NaN
XPathConstants.BOOLEAN     Boolean           true
XPathConstants.NODE        Node              [cpu: null]
XPathConstants.NODESET     NodeList          [cpu: null]

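It is worth noting that, for the cpu node held in the NodeList shown in the last row of the table:

  • Executing the getNodeName() method returns the String “cpu”
  • Executing the getTextContent() method returns the String “Intel Pentium”
    (namely, the same value as shown in the String row). For an element node, getNodeValue() returns null, which is why the node prints as [cpu: null]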

This is shown in the code below, which has been excerpted from the XPath Expression Explorer:

Excerpt from JavaXPathExpressionExplorer.java

NodeList resNodeList = (NodeList) xPath.compile(expr).
    evaluate(domDocument, XPathConstants.NODESET);
if (resNodeList != null) {
    int lenList = resNodeList.getLength();
    System.out.println("Number of nodes in NodeList: " + lenList);
    for (int i = 1; i <= lenList; i++) {
        resNode = resNodeList.item(i-1);
        String resNodeNameStr = resNode.getNodeName();
        String resNodeTextStr = resNode.getTextContent();
        System.out.println(i + ": " + resNode +
            "  (NodeName:'" + resNodeNameStr +
            "'    NodeTextContent:'" + resNodeTextStr + "')");
    }
}

Which renders the following output when executed:

Output from code excerpt, above

Number of nodes in NodeList: 1
1: [cpu: null]  (NodeName:'cpu'    NodeTextContent:'Intel Pentium')

3.2 XPath Expression Syntax

DOM documents represent XML data as a tree structure. XPath expressions are a series of steps, or paths through the tree where each step specifies a Node or a set of nodes (NodeList) from the tree.

Each step comes from one of the following categories:

Node specifications

*           matches any element node
/           specifies the root node, which is the first node in the tree
//          specifies nodes in the tree that match the selection, regardless of their location within the tree
.           specifies the current node
..          specifies the parent of the current node
nodename    specifies all child nodes of the current node with the name “nodename”
@           specifies attributes within the node
@*          matches any attribute node
node()      matches any node of any kind
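
To see how the node specifications above behave, the sketch below counts the nodes each one selects. The class name NodeSpecificationDemo, the count helper, and the three-computer excerpt of inventory.xml are assumptions made for this illustration. The XML is embedded without whitespace between elements so that node() does not pick up whitespace-only text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's download.
class NodeSpecificationDemo {

    // Assumption: a three-computer excerpt of inventory.xml, embedded as a string.
    static final String XML =
        "<inventory>"
      + "<vendor name='Dell'>"
      + "<computer><model>Win 10 Laptop</model><price>900.00</price></computer>"
      + "<computer><model>Low Cost Windows Laptop</model><price>313.00</price></computer>"
      + "</vendor>"
      + "<vendor name='Apple'>"
      + "<computer><model>Apple Desktop Computer</model><price>1300.00</price></computer>"
      + "</vendor>"
      + "</inventory>";

    // Returns how many nodes the expression selects.
    static int count(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/inventory/*",       // every element child of the inventory element
            "//model",            // model elements anywhere in the tree
            "//model/..",         // the parent (computer) of each model
            "//vendor/@name",     // the name attribute of each vendor
            "//@*",               // every attribute in the document
            "//computer/node()"   // every child node of each computer
        };
        for (String expr : expressions) {
            System.out.println(expr + " selects " + count(expr) + " node(s)");
        }
    }
}
```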

Predicates

Predicates are used to select specific nodes and are always surrounded by square brackets ‘[]’. A predicate can test a position, call a function such as last() or position(), or compare a value.
Examples of some predicates are:

/inventory/vendor/computer[1]               Selects the first computer node that is a child of each vendor node
/inventory/vendor/computer[last()]          Selects the last computer node that is a child of each vendor node
/inventory/vendor/computer[last()-1]        Selects the computer before the last computer that is a child of each vendor node
/inventory/vendor/computer[price>350.00]    Selects all the computer nodes of any vendor with a price value greater than 350.00
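
The sketch below applies several predicates to a five-computer excerpt of inventory.xml and returns the matching model names. The class name PredicateDemo, the texts helper, the embedded excerpt, and the position() < 3 predicate are assumptions chosen for this illustration. Note that a positional predicate such as [1] applies per parent, so it selects one computer for each vendor.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's download.
class PredicateDemo {

    // Assumption: a five-computer excerpt of inventory.xml, embedded as a string.
    static final String XML =
        "<inventory>"
      + "<vendor name='Dell'>"
      + "<computer><model>Win 10 Laptop</model><price>900.00</price></computer>"
      + "<computer><model>Low Cost Windows Laptop</model><price>313.00</price></computer>"
      + "<computer><model>64 Bit Windows Desktop Computer</model><price>330.00</price></computer>"
      + "</vendor>"
      + "<vendor name='Apple'>"
      + "<computer><model>Apple Desktop Computer</model><price>1300.00</price></computer>"
      + "<computer><model>Apple Low Cost Desktop Computer</model><price>700.00</price></computer>"
      + "</vendor>"
      + "</inventory>";

    // Returns the text content of every node the expression selects.
    static List<String> texts(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String base = "/inventory/vendor/computer";
        String[] predicates = {
            "[1]",               // first computer of each vendor
            "[last()]",          // last computer of each vendor
            "[last()-1]",        // next-to-last computer of each vendor
            "[position() < 3]",  // first two computers of each vendor
            "[price > 350.00]"   // computers priced above 350.00
        };
        for (String predicate : predicates) {
            String expr = base + predicate + "/model";
            System.out.println(expr + " -> " + texts(expr));
        }
    }
}
```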

Axes

XPath axes specify a set of nodes relative to the current node.

ancestor              specifies all ancestors (such as parent, or grandparent) of the current node
ancestor-or-self      specifies all ancestors of the current node and the current node itself
attribute             specifies all attributes of the current node
child                 specifies all children of the current node
descendant            specifies all descendants (such as children, or grandchildren) of the current node
descendant-or-self    specifies all descendants of the current node and the current node itself
following             specifies everything in the document after the closing tag of the current node
following-sibling     specifies all siblings after the current node
namespace             specifies all namespace nodes of the current node
parent                specifies the parent of the current node
preceding             specifies all nodes that appear before the current node in the document except ancestors, attribute nodes and namespace nodes
preceding-sibling     specifies all siblings before the current node
self                  specifies the current node
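
An axis is written in front of a node test and separated from it by two colons, as in ancestor::vendor. The sketch below tries a few axes against a three-computer excerpt of inventory.xml. The class name AxesDemo, the texts helper, and the embedded excerpt are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's download.
class AxesDemo {

    // Assumption: a three-computer excerpt of inventory.xml, embedded as a string.
    static final String XML =
        "<inventory>"
      + "<vendor name='Dell'>"
      + "<computer><model>Win 10 Laptop</model><cpu>Intel i7</cpu></computer>"
      + "<computer><model>64 Bit Windows Desktop Computer</model><cpu>AMD A8-Series</cpu></computer>"
      + "</vendor>"
      + "<vendor name='HP'>"
      + "<computer><model>HP Low Cost Windows 10 Laptop</model><cpu>AMD A6-Series</cpu></computer>"
      + "</vendor>"
      + "</inventory>";

    // Returns the text content (or attribute value) of every selected node.
    static List<String> texts(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            // models that sit before an AMD cpu inside the same computer
            "//cpu[contains(., 'AMD')]/preceding-sibling::model",
            // vendor names reached by walking up from each AMD cpu
            "//cpu[contains(., 'AMD')]/ancestor::vendor/attribute::name",
            // the cpu that follows a given model inside the same computer
            "//model[. = 'Win 10 Laptop']/following-sibling::cpu",
            // unabbreviated form of /inventory/vendor//model
            "/child::inventory/child::vendor/descendant::model",
            // cpu elements that end before the HP vendor element starts
            "//vendor[@name = 'HP']/preceding::cpu"
        };
        for (String expr : expressions) {
            System.out.println(expr + " -> " + texts(expr));
        }
    }
}
```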

Operators

Node Set Operator

|       Union of two node-sets (CAUTION: the union operator combines two node-sets into a single set that contains the nodes of both. In most computer languages ‘|’ is an OR operation.)

Arithmetic Operators

+       Addition
-       Subtraction
*       Multiplication
div     Division (floating-point, not integer division)
mod     Modulus (division remainder)

Logical Operators

and     And
or      Or
=       Equal
!=      Not equal
<       Less than
<=      Less than or equal to
>       Greater than
>=      Greater than or equal to
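
The operators above can be tried by evaluating expressions with a NUMBER return type. In the sketch below, the class name OperatorDemo, the number helper, and the three-computer excerpt of inventory.xml are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's download.
class OperatorDemo {

    // Assumption: a three-computer excerpt of inventory.xml, embedded as a string.
    static final String XML =
        "<inventory>"
      + "<vendor name='Dell'>"
      + "<computer><model>Win 10 Laptop</model><price>900.00</price></computer>"
      + "<computer><model>Low Cost Windows Laptop</model><price>313.00</price></computer>"
      + "</vendor>"
      + "<vendor name='Apple'>"
      + "<computer><model>Apple Desktop Computer</model><price>1300.00</price></computer>"
      + "</vendor>"
      + "</inventory>";

    // Evaluates the expression and returns the result as a number.
    static double number(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "count(//model | //price)",   // union: nodes from both sets
            "count(//model | //model)",   // a node appears only once in a union
            "7 div 2",                    // floating-point division
            "7 mod 2",                    // remainder
            "sum(//price) div count(//price)",                     // average price
            "count(//computer[price >= 313 and price < 1000])",    // both conditions
            "count(//computer[price < 400 or price > 1000])",      // either condition
            "count(//vendor[@name != 'Dell'])"                     // inequality
        };
        for (String expr : expressions) {
            System.out.println(expr + " = " + number(expr));
        }
    }
}
```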

Functions

There is a vast array of XPath functions, far too many to go into in any detail here. If a function requires a text argument, as opposed to a Node or NodeList, use text() to retrieve the text associated with the current Node.

For information concerning XPath functions, consult Section 4 (Core Function Library) of the XPath 1.0 Specification.
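
As a small taste of what is available, the sketch below calls a handful of functions from the XPath 1.0 core function library and returns each result as a String. The class name FunctionDemo, the string helper, and the three-computer excerpt of inventory.xml are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's download.
class FunctionDemo {

    // Assumption: a three-computer excerpt of inventory.xml, embedded as a string.
    static final String XML =
        "<inventory>"
      + "<vendor name='Dell'>"
      + "<computer><model>Win 10 Laptop</model><ram>12GB</ram><price>900.00</price></computer>"
      + "<computer><model>Low Cost Windows Laptop</model><ram>4GB</ram><price>313.00</price></computer>"
      + "</vendor>"
      + "<vendor name='Apple'>"
      + "<computer><model>Apple Desktop Computer</model><ram>8GB</ram><price>1300.00</price></computer>"
      + "</vendor>"
      + "</inventory>";

    // Evaluates the expression and returns its String value.
    static String string(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String first = "/inventory/vendor[1]/computer[1]";
        String[] expressions = {
            "count(//computer)",                          // number of computer nodes
            "sum(//price)",                               // total of all prices
            "name(/inventory/*[1])",                      // element name of the first child
            "string-length(" + first + "/model)",         // length of the model text
            "substring-before(" + first + "/ram, 'GB')",  // ram size without the unit
            "concat(/inventory/vendor[1]/@name, ': ', " + first + "/model)",
            "count(//computer[starts-with(model, 'Apple')])",
            "count(//computer[contains(model, 'Laptop')])"
        };
        for (String expr : expressions) {
            System.out.println(expr + " = " + string(expr));
        }
    }
}
```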

3.3 XPath Expression Examples

Use the sample XPath expressions below with the inventory.xml file and the XPath Expression Explorer. The download for this article includes both the inventory.xml file and the source for the XPath Expression Explorer.

  • Get a list of all “AMD” processors
    /inventory/vendor/computer/cpu[contains(text(),"AMD")]
  • Get a list of the models of all computers with AMD processors
    /inventory/vendor/computer/cpu[contains(text(),"AMD")]/preceding-sibling::model
  • Get the cpu node of every computer with a cpu of “Intel Pentium”
    /inventory/vendor/computer/cpu[text() = "Intel Pentium"]
  • Get the ram node of every computer with 4 GB of RAM
    /inventory/vendor/computer/ram[text()="4GB"]
  • Get the names of all the vendors with computers with AMD processors
    //computer[contains(cpu,'AMD')]/parent::node()/@name
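
For readers who prefer to run the sample expressions outside the Explorer, the sketch below evaluates all five against a trimmed copy of inventory.xml and returns the text of each selected node. The class name SampleExpressionsDemo, the texts helper, and the trimmed data (all eight computers, with only the model, cpu and ram children) are assumptions made for this illustration. String literals inside the expressions use single quotes so that they need no escaping in Java.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's download.
class SampleExpressionsDemo {

    // Assumption: inventory.xml trimmed to the model, cpu and ram of each computer.
    static final String XML =
        "<inventory>"
      + "<vendor name='Dell'>"
      + "<computer><model>Win 10 Laptop</model><cpu>Intel i7</cpu><ram>12GB</ram></computer>"
      + "<computer><model>Low Cost Windows Laptop</model><cpu>Intel Pentium</cpu><ram>4GB</ram></computer>"
      + "<computer><model>64 Bit Windows Desktop Computer</model><cpu>AMD A8-Series</cpu><ram>8GB</ram></computer>"
      + "</vendor>"
      + "<vendor name='Apple'>"
      + "<computer><model>Apple Desktop Computer</model><cpu>Intel Core i5</cpu><ram>8GB</ram></computer>"
      + "<computer><model>Apple Low Cost Desktop Computer</model><cpu>4th Gen Intel Core i5</cpu><ram>8GB</ram></computer>"
      + "</vendor>"
      + "<vendor name='HP'>"
      + "<computer><model>HP Low Cost Windows 10 Laptop</model><cpu>AMD A6-Series</cpu><ram>4GB</ram></computer>"
      + "<computer><model>Windows 7 Desktop</model><cpu>6th Gen Intel Core i5</cpu><ram>6GB</ram></computer>"
      + "<computer><model>HP High End, Low Cost 64 Bit Desktop</model><cpu>6th Gen Intel Core i7</cpu><ram>12GB</ram></computer>"
      + "</vendor>"
      + "</inventory>";

    // Returns the text content (or attribute value) of every selected node.
    static List<String> texts(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/inventory/vendor/computer/cpu[contains(text(),'AMD')]",
            "/inventory/vendor/computer/cpu[contains(text(),'AMD')]/preceding-sibling::model",
            "/inventory/vendor/computer/cpu[text() = 'Intel Pentium']",
            "/inventory/vendor/computer/ram[text()='4GB']",
            "//computer[contains(cpu,'AMD')]/parent::node()/@name"
        };
        for (String expr : expressions) {
            System.out.println(expr);
            System.out.println("  -> " + texts(expr));
        }
    }
}
```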

4. Download The Source Code

This was a Java XPath example.

Download
You can download the full source code for this article here: JavaXPathExamples.zip

David Guinivere

David graduated from University of California, Santa Cruz with a Bachelor’s degree in Information Science. He has been actively working with Java since the days of J2SE 1.2 in 1998. His work has largely involved integrating Java and SQL. He has worked on a wide range of projects including e-commerce, CRM, Molecular Diagnostic, and Video applications. As a freelance developer he is actively studying Big Data, Cloud and Web Development.