About Sotirios-Efstathios Maneas

Sotirios-Efstathios (Stathis) Maneas is a postgraduate student at the Department of Informatics and Telecommunications of The National and Kapodistrian University of Athens. His main interests include distributed systems, web crawling, model checking, operating systems, programming languages and web applications.

Java XML parser tutorial

In this tutorial we discuss XML parsers in Java. XML is a markup language that defines a set of rules for encoding documents. Java offers a number of libraries for parsing and processing XML documents. An XML parser provides the functionality required to read and modify an XML file.

XML provides a general way for different machines to communicate and exchange data. Like Java, XML is platform independent. An XML document consists of elements. Each element has a start tag, its content and an end tag. An XML document must have exactly one root element, and its syntax is strict: for example, every start tag needs a matching end tag and elements must be properly nested.
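The rules above (a single root element, matching start and end tags) can be checked with the parser that ships with the JDK. The following is only a minimal sketch: the class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for illustration, and the input documents are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the tutorial's project.
public class WellFormedCheck {

     // Returns true if the given text parses as a well-formed XML document.
     static boolean isWellFormed(String xml)
          throws ParserConfigurationException, IOException {

          DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
          // DefaultHandler rethrows fatal errors without printing them to stderr.
          builder.setErrorHandler(new DefaultHandler());
          try {
               builder.parse(new InputSource(new StringReader(xml)));
               return true;
          } catch (SAXException e) {
               // The parser rejects any document that breaks the syntax rules.
               return false;
          }
     }

     public static void main(String[] args) throws Exception {
          // One root element, properly nested: prints true.
          System.out.println(isWellFormed("<Employees><Employee ID=\"1\"/></Employees>"));
          // Two root elements: prints false.
          System.out.println(isWellFormed("<Employee/><Employee/>"));
          // Missing end tag for Employee: prints false.
          System.out.println(isWellFormed("<Employees><Employee></Employees>"));
     }
}
```

Note that this only checks well-formedness. Validating a document against a schema or DTD is a separate step that is not covered here.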

Example of an XML File

In the following file, we declare the employees of a company. Each employee has a unique ID, a first and last name, an age and a salary. Employees are distinguished by their IDs. We create a new file called Employees.xml as shown below:

Employees.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Employees>
     <Employee ID="1">
          <Firstname>Lebron</Firstname>
          <Lastname>James</Lastname>
          <Age>30</Age>
          <Salary>2500</Salary>
     </Employee>
     <Employee ID="2">
          <Firstname>Anthony</Firstname>
          <Lastname>Davis</Lastname>
          <Age>22</Age>
          <Salary>1500</Salary>
     </Employee>
     <Employee ID="3">
          <Firstname>Paul</Firstname>
          <Lastname>George</Lastname>
          <Age>24</Age>
          <Salary>2000</Salary>
     </Employee>
     <Employee ID="4">
          <Firstname>Blake</Firstname>
          <Lastname>Griffin</Lastname>
          <Age>25</Age>
          <Salary>2250</Salary>
     </Employee>
</Employees>

To capture the notion of an employee, we create a corresponding Java class, Employee, in the file Employee.java as shown below:

Employee.java:

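class Employee {

     private String ID;
     private String Firstname;
     private String Lastname;
     private int age;
     private double salary;

     // No-argument constructor, used by the SAX and StAX examples, which fill
     // in the fields one by one through the setters below.
     public Employee() { }

     public Employee(String ID, String Firstname, String Lastname, int age, double salary) {
          this.ID = ID;
          this.Firstname = Firstname;
          this.Lastname = Lastname;
          this.age = age;
          this.salary = salary;
     }

     public void setID(String ID) { this.ID = ID; }
     public void setFirstname(String Firstname) { this.Firstname = Firstname; }
     public void setLastname(String Lastname) { this.Lastname = Lastname; }
     public void setAge(int age) { this.age = age; }
     public void setSalary(double salary) { this.salary = salary; }

     @Override
     public String toString() {
          return "<" + ID + ", " + Firstname + ", " + Lastname + ", " + age + ", "
                                   + salary + ">";
     }
}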

Parse an XML File using the DOM Parser

The DOM parser implementation is included in the JDK. The Document Object Model provides APIs that let you create, modify, delete, and rearrange nodes. The DOM parser parses the entire XML document and loads its content into a tree structure in memory. Using the Node and NodeList interfaces, we can retrieve and modify the content of an XML file.

An example that loads an XML file and prints its contents is shown below:

DomParserExample.java:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DomParserExample {

     public static void main(String[] args) throws ParserConfigurationException,
          SAXException, IOException {

          if(args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

          DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
          DocumentBuilder builder = factory.newDocumentBuilder();

          // Load the input XML document, parse it and return an instance of the
          // Document class.
          Document document = builder.parse(new File(args[0]));

          List<Employee> employees = new ArrayList<Employee>();
          NodeList nodeList = document.getDocumentElement().getChildNodes();
          for (int i = 0; i < nodeList.getLength(); i++) {
               Node node = nodeList.item(i);

               if (node.getNodeType() == Node.ELEMENT_NODE) {
                    Element elem = (Element) node;

                    // Get the value of the ID attribute.
                    String ID = node.getAttributes().getNamedItem("ID").getNodeValue();

                    // Get the value of all sub-elements.
                    String firstname = elem.getElementsByTagName("Firstname")
                                        .item(0).getChildNodes().item(0).getNodeValue();

                    String lastname = elem.getElementsByTagName("Lastname").item(0)
                                        .getChildNodes().item(0).getNodeValue();

                    int age = Integer.parseInt(elem.getElementsByTagName("Age")
                                        .item(0).getChildNodes().item(0).getNodeValue());

                    double salary = Double.parseDouble(elem.getElementsByTagName("Salary")
                                        .item(0).getChildNodes().item(0).getNodeValue());

                    employees.add(new Employee(ID, firstname, lastname, age, salary));
               }
          }

          // Print all employees.
          for (Employee empl : employees)
               System.out.println(empl.toString());
     }
}

Inside the main method, we create a DocumentBuilder from the DocumentBuilderFactory, then parse the XML file and store it in an instance of the Document class. We then iterate over the children of the root element. Whenever we find a node of type Node.ELEMENT_NODE (the check skips the whitespace text nodes between elements), we retrieve its information and store it in an instance of the Employee class. Finally, we print the information of all stored employees.

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>
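The DOM section above mentions that the API can also create, modify and delete nodes, while the example only reads them. The following is a minimal sketch of those operations on a shortened, in-memory version of Employees.xml. The class name DomModifyExample, the method updateEmployees and the Bonus element are hypothetical and exist only for this illustration. The result is written back to text with the JDK's identity Transformer.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of the tutorial's project.
public class DomModifyExample {

     static String updateEmployees(String xml) throws Exception {
          DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
          Document document = builder.parse(new InputSource(new StringReader(xml)));
          Element root = document.getDocumentElement();

          // Modify: change the salary of the first employee.
          Element first = (Element) root.getElementsByTagName("Employee").item(0);
          first.getElementsByTagName("Salary").item(0).setTextContent("3000");

          // Create: add a new (hypothetical) Bonus element to the first employee.
          Element bonus = document.createElement("Bonus");
          bonus.setTextContent("500");
          first.appendChild(bonus);

          // Delete: remove the second employee from the root element.
          Node second = root.getElementsByTagName("Employee").item(1);
          root.removeChild(second);

          // Serialize the modified tree back to a string.
          Transformer transformer = TransformerFactory.newInstance().newTransformer();
          transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
          StringWriter out = new StringWriter();
          transformer.transform(new DOMSource(document), new StreamResult(out));
          return out.toString();
     }

     public static void main(String[] args) throws Exception {
          String xml = "<Employees>"
               + "<Employee ID=\"1\"><Salary>2500</Salary></Employee>"
               + "<Employee ID=\"2\"><Salary>1500</Salary></Employee>"
               + "</Employees>";
          System.out.println(updateEmployees(xml));
     }
}
```

Because the whole tree is held in memory, such changes are simple to express, at the cost of loading the complete document first.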

Parse an XML File using the SAX Parser

SAX is an event-based, sequential-access parser API that provides an alternative to DOM for reading data from an XML document. A SAX parser only needs to report each parsing event as it happens, so the minimum memory it requires is proportional to the maximum depth of the XML file rather than to its size.

Our SAX handler extends the DefaultHandler class and overrides the following callbacks:

  • startElement: this event is triggered when a start tag is encountered.
  • endElement: this event is triggered when an end tag is encountered.
  • characters: this event is triggered when some text data is encountered.
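To make the order of these callbacks concrete, here is a minimal sketch that records each event as the parser reports it. The class name SaxEventTrace and the method trace are hypothetical, and the one-employee input is a shortened stand-in for Employees.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the tutorial's project.
public class SaxEventTrace extends DefaultHandler {

     private final List<String> events = new ArrayList<>();

     @Override
     public void startElement(String uri, String localName, String qName, Attributes attributes) {
          events.add("start:" + qName);
     }

     @Override
     public void endElement(String uri, String localName, String qName) {
          events.add("end:" + qName);
     }

     @Override
     public void characters(char[] ch, int start, int length) {
          // Ignore whitespace-only text such as indentation.
          String s = new String(ch, start, length).trim();
          if (!s.isEmpty())
               events.add("text:" + s);
     }

     // Parses the given XML text and returns the events in the order received.
     static List<String> trace(String xml) throws Exception {
          SaxEventTrace handler = new SaxEventTrace();
          SAXParserFactory.newInstance().newSAXParser()
               .parse(new InputSource(new StringReader(xml)), handler);
          return handler.events;
     }

     public static void main(String[] args) throws Exception {
          // Prints: [start:Employee, start:Firstname, text:Lebron, end:Firstname, end:Employee]
          System.out.println(trace(
               "<Employee ID=\"1\"><Firstname>Lebron</Firstname></Employee>"));
     }
}
```

The handler never sees the document as a whole. It only receives one event at a time, which is why a SAX parser can work with so little memory.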

A complete SAX parser for Employees.xml is shown below:

SAXParserExample.java:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserExample extends DefaultHandler {

     private static List<Employee> employees = new ArrayList<Employee>();
     private static Employee empl = null;
     // Accumulates character data, because the parser may deliver the text of
     // a single element in several characters() calls.
     private static StringBuilder text = new StringBuilder();

     @Override
     // A start tag is encountered.
     public void startElement(String uri, String localName, String qName, Attributes attributes)
          throws SAXException {

          // Discard any text (such as indentation) collected before this tag.
          text.setLength(0);

          switch (qName) {
               // Create a new Employee.
               case "Employee": {
                    empl = new Employee();
                    empl.setID(attributes.getValue("ID"));
                    break;
               }
          }
     }

     @Override
     // An end tag is encountered.
     public void endElement(String uri, String localName, String qName) throws SAXException {
          // The complete text content of the element that just ended.
          String value = text.toString().trim();
          text.setLength(0);

          switch (qName) {
               case "Employee": {
                    // The end tag of an employee was encountered, so add the employee to the list.
                    employees.add(empl);
                    break;
               }
               case "Firstname": {
                    empl.setFirstname(value);
                    break;
               }
               case "Lastname": {
                    empl.setLastname(value);
                    break;
               }
               case "Age": {
                    empl.setAge(Integer.parseInt(value));
                    break;
               }
               case "Salary": {
                    empl.setSalary(Double.parseDouble(value));
                    break;
               }
          }
     }

     @Override
     // Some text data is encountered; append it to the current buffer.
     public void characters(char[] ch, int start, int length) throws SAXException {
          text.append(ch, start, length);
     }

     public static void main(String[] args) throws ParserConfigurationException,
          SAXException, IOException {

          if (args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

          SAXParserFactory parserFactory = SAXParserFactory.newInstance();
          SAXParser parser = parserFactory.newSAXParser();
          SAXParserExample handler = new SAXParserExample();

          parser.parse(new File(args[0]), handler);

          // Print all employees.
          for (Employee empl : employees)
               System.out.println(empl.toString());
     }
}

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>

Parse an XML File using the StAX Parser

Streaming API for XML (StAX) is an API for reading and writing XML documents. A StAX parser can process tree-structured data as it is streamed in. StAX was designed as a middle ground between the DOM and SAX parsers. In a StAX parser, the entry point is a cursor that represents a point within the XML document. The application moves the cursor forward in order to pull information from the parser. In contrast, a SAX parser pushes data to the application.
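Because the application drives the cursor, it can stop as soon as it has what it needs. The following is a minimal sketch of that idea, reading only as far as the first Firstname element. The class name StaxPullExample and the method firstFirstname are hypothetical, and the input is a shortened stand-in for Employees.xml.

```java
import java.io.StringReader;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class, not part of the tutorial's project.
public class StaxPullExample {

     // Returns the text of the first <Firstname> element, or null if there is none.
     static String firstFirstname(String xml) throws XMLStreamException {
          XMLStreamReader reader = XMLInputFactory.newInstance()
               .createXMLStreamReader(new StringReader(xml));
          try {
               // The application, not the parser, decides when to advance the cursor.
               while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                              && "Firstname".equals(reader.getLocalName())) {
                         // getElementText() reads the text up to the matching end tag.
                         return reader.getElementText();
                    }
               }
               return null;
          } finally {
               reader.close();
          }
     }

     public static void main(String[] args) throws XMLStreamException {
          String xml = "<Employees>"
               + "<Employee ID=\"1\"><Firstname>Lebron</Firstname></Employee>"
               + "<Employee ID=\"2\"><Firstname>Anthony</Firstname></Employee>"
               + "</Employees>";
          // Prints: Lebron
          System.out.println(firstFirstname(xml));
     }
}
```

With SAX, the handler would keep receiving callbacks for the rest of the document unless it aborted parsing by throwing an exception.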

A complete StAX parser for Employees.xml is shown below:

StaxParserExample.java:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxParserExample {

     public static void main(String[] args) throws FileNotFoundException,
          XMLStreamException {

          if (args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

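          // Start with an empty list so that printing works even if the
          // document has no Employees element.
          List<Employee> employees = new ArrayList<>();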
          Employee empl = null;
          String text = null;

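          XMLInputFactory factory = XMLInputFactory.newInstance();
          // Report the text of an element as a single CHARACTERS event, even if
          // the parser would otherwise deliver it in several pieces.
          factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
          XMLStreamReader reader = factory.createXMLStreamReader(new FileInputStream(
                                                  new File(args[0])));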

          while (reader.hasNext()) {
               int event = reader.next();

               switch (event) {
                    case XMLStreamConstants.START_ELEMENT: {
                         if ("Employee".equals(reader.getLocalName())) {
                              empl = new Employee();
                              // Look the attribute up by name rather than by position.
                              empl.setID(reader.getAttributeValue(null, "ID"));
                         }
                         if ("Employees".equals(reader.getLocalName()))
                              employees = new ArrayList<>();

                         break;
                    }
                    case XMLStreamConstants.CHARACTERS: {
                         text = reader.getText().trim();
                         break;
                    }
                    case XMLStreamConstants.END_ELEMENT: {
                         switch (reader.getLocalName()) {
                              case "Employee": {
                                   employees.add(empl);
                                   break;
                              }
                              case "Firstname": {
                                   empl.setFirstname(text);
                                   break;
                              }
                              case "Lastname": {
                                   empl.setLastname(text);
                                   break;
                              }
                              case "Age": {
                                   empl.setAge(Integer.parseInt(text));
                                   break;
                              }
                              case "Salary": {
                                   empl.setSalary(Double.parseDouble(text));
                                   break;
                              }
                         }
                         break;
                    }
               }
          }

          // Print all employees.
          for (Employee employee : employees)
               System.out.println(employee.toString());
     }
}

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>
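StAX is described above as an API for writing XML as well as reading it. The following is a minimal sketch of the writing side using XMLStreamWriter. The class name StaxWriterExample and the method writeEmployee are hypothetical, and only a subset of the Employees.xml fields is written to keep the example short.

```java
import java.io.StringWriter;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class, not part of the tutorial's project.
public class StaxWriterExample {

     // Writes a document with a single employee and returns it as a string.
     static String writeEmployee(String id, String firstname, String lastname)
          throws XMLStreamException {

          StringWriter out = new StringWriter();
          XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

          writer.writeStartDocument();
          writer.writeStartElement("Employees");
          writer.writeStartElement("Employee");
          writer.writeAttribute("ID", id);

          writer.writeStartElement("Firstname");
          // writeCharacters escapes special characters such as & and <.
          writer.writeCharacters(firstname);
          writer.writeEndElement();

          writer.writeStartElement("Lastname");
          writer.writeCharacters(lastname);
          writer.writeEndElement();

          writer.writeEndElement(); // closes Employee
          writer.writeEndElement(); // closes Employees
          writer.writeEndDocument();
          writer.close();

          return out.toString();
     }

     public static void main(String[] args) throws XMLStreamException {
          System.out.println(writeEmployee("1", "Lebron", "James"));
     }
}
```

As with reading, the application issues each call in document order, so the output can be produced without building the whole document in memory first.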

Download the Eclipse Project

The Eclipse project of this example: XMLParsers.zip.

This was a tutorial about XML parsers in Java.
