Sotirios-Efstathios Maneas

About Sotirios-Efstathios Maneas

Sotirios-Efstathios (Stathis) Maneas is a postgraduate student at the Department of Informatics and Telecommunications of The National and Kapodistrian University of Athens. His main interests include distributed systems, web crawling, model checking, operating systems, programming languages and web applications.

Java XML parser tutorial

In this tutorial we discuss XML parsers in Java. XML is a markup language that defines a set of rules for encoding documents. Java offers a number of libraries for parsing and processing XML documents. An XML parser provides the functionality required to read and modify an XML file.

XML provides a general, text-based way for different machines to communicate and exchange data. Like Java, XML is platform independent. An XML document consists of elements. Each element has a start tag, its content and an end tag. An XML document must also have exactly one root element. Finally, XML has a strict syntax: a document that violates these rules is not well-formed, and a conforming parser rejects it.
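The rules above can be checked with the DOM parser that ships with the JDK. The following is a minimal sketch, not part of the original example project: the class name WellFormedCheck and the method isWellFormed are illustrative names. It treats any SAXException raised during parsing as "not well-formed".

WellFormedCheck.java:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative helper (hypothetical name): reports whether a string is well-formed XML.
class WellFormedCheck {

     static boolean isWellFormed(String xml) {
          try {
               DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
               // DefaultHandler rethrows fatal errors without printing them to stderr.
               builder.setErrorHandler(new DefaultHandler());
               builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
               return true;
          } catch (SAXException e) {
               // The parser rejected the document.
               return false;
          } catch (ParserConfigurationException | IOException e) {
               throw new IllegalStateException(e);
          }
     }

     public static void main(String[] args) {
          // One root element with properly nested tags.
          System.out.println(isWellFormed("<a><b>x</b></a>")); // true
          // Two root elements.
          System.out.println(isWellFormed("<a/><b/>"));        // false
          // The start tag <b> has no matching end tag.
          System.out.println(isWellFormed("<a><b></a>"));      // false
     }
}
```

Note that this only checks well-formedness. Validating a document against a schema or DTD is a separate step that this sketch does not cover.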

Example of an XML File

In the following file, we declare the employees of a company. Each employee has a unique ID, first and last name, age and salary. The employees are distinguished by their IDs. We create a new file called Employees.xml as shown below:

Employees.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Employees>
     <Employee ID="1">
          <Firstname>Lebron</Firstname>
          <Lastname>James</Lastname>
          <Age>30</Age>
          <Salary>2500</Salary>
     </Employee>
     <Employee ID="2">
          <Firstname>Anthony</Firstname>
          <Lastname>Davis</Lastname>
          <Age>22</Age>
          <Salary>1500</Salary>
     </Employee>
     <Employee ID="3">
          <Firstname>Paul</Firstname>
          <Lastname>George</Lastname>
          <Age>24</Age>
          <Salary>2000</Salary>
     </Employee>
     <Employee ID="4">
          <Firstname>Blake</Firstname>
          <Lastname>Griffin</Lastname>
          <Age>25</Age>
          <Salary>2250</Salary>
     </Employee>
</Employees>

Also, in order to capture the notion of an employee, we create its respective Java class, called Employee.java as shown below:

Employee.java:

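class Employee {

     private String ID;
     private String Firstname;
     private String Lastname;
     private int age;
     private double salary;

     // No-argument constructor, used together with the setters by the
     // SAX and StAX examples.
     public Employee() {
     }

     public Employee(String ID, String Firstname, String Lastname, int age, double salary) {
          this.ID = ID;
          this.Firstname = Firstname;
          this.Lastname = Lastname;
          this.age = age;
          this.salary = salary;
     }

     public void setID(String ID) { this.ID = ID; }
     public void setFirstname(String Firstname) { this.Firstname = Firstname; }
     public void setLastname(String Lastname) { this.Lastname = Lastname; }
     public void setAge(int age) { this.age = age; }
     public void setSalary(double salary) { this.salary = salary; }

     @Override
     public String toString() {
          return "<" + ID + ", " + Firstname + ", " + Lastname + ", " + age + ", "
                                   + salary + ">";
     }
}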

Parse an XML File using the DOM Parser

A DOM parser implementation ships with the JDK. The Document Object Model provides APIs that let you create, modify, delete, and rearrange nodes. The DOM parser parses the entire XML document and loads its content into an in-memory tree structure, so memory use grows with the size of the document. Using the Node and NodeList interfaces, we can retrieve and modify the content of an XML file.
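To make the create, modify and delete operations concrete, here is a minimal sketch that edits a small in-memory document and serializes the result with the JDK's Transformer API. The class name DomEditExample, the method edit and the specific changes it applies are assumptions made for illustration. The method expects a root element whose direct children are at least two Employee elements, each with a Salary.

DomEditExample.java:

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative sketch (hypothetical names): modify, delete and create DOM nodes.
class DomEditExample {

     static String edit(String xml) {
          try {
               Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                         .parse(new InputSource(new StringReader(xml)));
               Element root = doc.getDocumentElement();
               NodeList employees = root.getElementsByTagName("Employee");

               // Modify: change the salary of the first employee.
               Element first = (Element) employees.item(0);
               first.getElementsByTagName("Salary").item(0).setTextContent("3000");

               // Delete: remove the second employee from the tree.
               root.removeChild(employees.item(1));

               // Create: append a new employee with a single sub-element.
               Element added = doc.createElement("Employee");
               added.setAttribute("ID", "3");
               Element name = doc.createElement("Firstname");
               name.setTextContent("Paul");
               added.appendChild(name);
               root.appendChild(added);

               // Serialize the modified tree back to text.
               Transformer transformer = TransformerFactory.newInstance().newTransformer();
               transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
               StringWriter out = new StringWriter();
               transformer.transform(new DOMSource(doc), new StreamResult(out));
               return out.toString();
          } catch (Exception e) {
               throw new IllegalStateException(e);
          }
     }

     public static void main(String[] args) {
          String xml = "<Employees>"
                    + "<Employee ID=\"1\"><Salary>2500</Salary></Employee>"
                    + "<Employee ID=\"2\"><Salary>1500</Salary></Employee>"
                    + "</Employees>";
          System.out.println(edit(xml));
     }
}
```

Because the whole tree is held in memory, each of these edits is a plain method call on a node. This kind of random access and modification is not available with the streaming parsers discussed later.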

An example that loads the content of an XML file and prints its contents is shown below:

DomParserExample.java:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DomParserExample {

     public static void main(String[] args) throws ParserConfigurationException,
          SAXException, IOException {

          if(args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

          DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
          DocumentBuilder builder = factory.newDocumentBuilder();

          // Load the input XML document, parse it and return an instance of the
          // Document class.
          Document document = builder.parse(new File(args[0]));

          List<Employee> employees = new ArrayList<Employee>();
          NodeList nodeList = document.getDocumentElement().getChildNodes();
          for (int i = 0; i < nodeList.getLength(); i++) {
               Node node = nodeList.item(i);

               if (node.getNodeType() == Node.ELEMENT_NODE) {
                    Element elem = (Element) node;

                    // Get the value of the ID attribute.
                    String ID = node.getAttributes().getNamedItem("ID").getNodeValue();

                    // Get the value of all sub-elements.
                    String firstname = elem.getElementsByTagName("Firstname")
                                        .item(0).getChildNodes().item(0).getNodeValue();

                    String lastname = elem.getElementsByTagName("Lastname").item(0)
                                        .getChildNodes().item(0).getNodeValue();

                    int age = Integer.parseInt(elem.getElementsByTagName("Age")
                                        .item(0).getChildNodes().item(0).getNodeValue());

                    double salary = Double.parseDouble(elem.getElementsByTagName("Salary")
                                        .item(0).getChildNodes().item(0).getNodeValue());

                    employees.add(new Employee(ID, firstname, lastname, age, salary));
               }
          }

          // Print all employees.
          for (Employee empl : employees)
               System.out.println(empl.toString());
     }
}

Inside the main method, we create a DocumentBuilder from the DocumentBuilderFactory and then parse the XML file into an instance of the Document class. Next, we iterate over the child nodes of the root element. The type check is needed because the whitespace between elements is also reported as text nodes. When we find a node of type Node.ELEMENT_NODE, we retrieve its information and store it in an instance of the Employee class. Finally, we print all stored employees.
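The chained item(0).getChildNodes().item(0).getNodeValue() calls in the listing assume that every sub-element exists and contains text. If an element is missing or empty, they throw a NullPointerException. One possible way to make the lookup more forgiving is a small helper built on getTextContent. The sketch below is an assumption of ours rather than part of the original project, and the names DomTextHelper, childText and parseRoot are hypothetical.

DomTextHelper.java:

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative helper (hypothetical names) for reading sub-element text safely.
class DomTextHelper {

     // Returns the trimmed text of the first descendant element named tag,
     // or the fallback value when no such element exists.
     static String childText(Element parent, String tag, String fallback) {
          NodeList matches = parent.getElementsByTagName(tag);
          if (matches.getLength() == 0)
               return fallback;
          // getTextContent returns an empty string for an empty element.
          return matches.item(0).getTextContent().trim();
     }

     // Parses a string and returns the root element of the resulting document.
     static Element parseRoot(String xml) {
          try {
               return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                         .parse(new InputSource(new StringReader(xml)))
                         .getDocumentElement();
          } catch (Exception e) {
               throw new IllegalStateException(e);
          }
     }

     public static void main(String[] args) {
          Element employee = parseRoot(
                    "<Employee ID=\"1\"><Firstname> Lebron </Firstname><Lastname/></Employee>");
          System.out.println(childText(employee, "Firstname", "?")); // Lebron
          System.out.println(childText(employee, "Age", "?"));       // ?
     }
}
```

Whether a missing element should fall back to a default or be reported as an error depends on the application. The fallback parameter here is only one option.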

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>

Parse an XML File using the SAX Parser

SAX is an event-based sequential access parser API and provides a mechanism for reading data from an XML document that is an alternative to that provided by a DOM parser. A SAX parser only needs to report each parsing event as it happens and the minimum memory required for a SAX parser is proportional to the maximum depth of the XML file.

Our SAX handler extends the DefaultHandler class and overrides the following callbacks:

  • startElement: this event is triggered when a start tag is encountered.
  • endElement: this event is triggered when an end tag is encountered.
  • characters: this event is triggered when some text data is encountered. The parser may report the text of a single element in more than one call.
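To see the order in which these callbacks fire, the following minimal sketch records every event for a small in-memory document. The class name SaxEventTrace and the "start:", "text:" and "end:" labels are illustrative choices, not part of the SAX API.

SaxEventTrace.java:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative handler (hypothetical name) that records SAX events in order.
class SaxEventTrace extends DefaultHandler {

     private final List<String> events = new ArrayList<>();

     @Override
     public void startElement(String uri, String localName, String qName, Attributes attributes) {
          events.add("start:" + qName);
     }

     @Override
     public void endElement(String uri, String localName, String qName) {
          events.add("end:" + qName);
     }

     @Override
     public void characters(char[] ch, int start, int length) {
          // Skip chunks that contain only whitespace.
          String chunk = new String(ch, start, length).trim();
          if (!chunk.isEmpty())
               events.add("text:" + chunk);
     }

     // Parses the given XML text and returns the recorded events.
     static List<String> trace(String xml) {
          try {
               SaxEventTrace handler = new SaxEventTrace();
               SAXParserFactory.newInstance().newSAXParser()
                         .parse(new InputSource(new StringReader(xml)), handler);
               return handler.events;
          } catch (Exception e) {
               throw new IllegalStateException(e);
          }
     }

     public static void main(String[] args) {
          for (String event : trace("<Employee ID=\"1\"><Age>30</Age></Employee>"))
               System.out.println(event);
     }
}
```

The start and end events always nest in document order, so the trace begins with start:Employee and ends with end:Employee. The handler never sees the document as a whole, which is why a SAX parser does not need to keep it in memory.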

An example of a SAX parser is shown below:

SAXParserExample.java:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserExample extends DefaultHandler {

     private static List<Employee> employees = new ArrayList<Employee>();
     private static Employee empl = null;
     // Accumulates character data, because the parser may deliver the text of a
     // single element in several characters() calls.
     private static StringBuilder text = new StringBuilder();

     // A start tag is encountered.
     @Override
     public void startElement(String uri, String localName, String qName, Attributes attributes)
          throws SAXException {

          // Discard any text (such as whitespace) collected before this tag.
          text.setLength(0);

          switch (qName) {
               // Create a new Employee.
               case "Employee": {
                    empl = new Employee();
                    empl.setID(attributes.getValue("ID"));
                    break;
               }
          }
     }

     // An end tag is encountered.
     @Override
     public void endElement(String uri, String localName, String qName) throws SAXException {
          String value = text.toString().trim();

          switch (qName) {
               case "Employee": {
                    // The end tag of an employee was encountered, so add the employee to the list.
                    employees.add(empl);
                    break;
               }
               case "Firstname": {
                    empl.setFirstname(value);
                    break;
               }
               case "Lastname": {
                    empl.setLastname(value);
                    break;
               }
               case "Age": {
                    empl.setAge(Integer.parseInt(value));
                    break;
               }
               case "Salary": {
                    empl.setSalary(Double.parseDouble(value));
                    break;
               }
          }
     }

     // Some text data is encountered.
     @Override
     public void characters(char[] ch, int start, int length) throws SAXException {
          text.append(ch, start, length);
     }

     public static void main(String[] args) throws ParserConfigurationException,
          SAXException, IOException {

          if (args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

          SAXParserFactory parserFactor = SAXParserFactory.newInstance();
          SAXParser parser = parserFactor.newSAXParser();
          SAXParserExample handler = new SAXParserExample();

          parser.parse(new File(args[0]), handler);

          // Print all employees.
          for (Employee empl : employees)
               System.out.println(empl.toString());
     }
}

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>

Parse an XML File using the StAX Parser

Streaming API for XML (StAX) is an application programming interface to read and write XML documents. A StAX parser processes tree-structured data as it is streamed in, without building the tree in memory. StAX was designed as a middle ground between the DOM and SAX parsers. In a StAX parser, the entry point is a cursor that represents a point within the XML document. The application moves the cursor forward in order to pull information from the parser. In contrast, a SAX parser pushes data to the application through callbacks.
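One practical consequence of the pull model is that the application decides when to stop reading. The following minimal sketch looks up a single employee by ID and returns as soon as the answer is known, leaving the rest of the document unread. The class name StaxFindExample and the method findFirstname are hypothetical names introduced for illustration.

StaxFindExample.java:

```java
import java.io.StringReader;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustrative sketch (hypothetical names): pull events only until a match is found.
class StaxFindExample {

     // Returns the first name of the employee with the given ID, or null if absent.
     static String findFirstname(String xml, String id) {
          try {
               XMLStreamReader reader = XMLInputFactory.newInstance()
                         .createXMLStreamReader(new StringReader(xml));
               try {
                    boolean inTarget = false;
                    while (reader.hasNext()) {
                         if (reader.next() != XMLStreamConstants.START_ELEMENT)
                              continue;
                         String name = reader.getLocalName();
                         if ("Employee".equals(name)) {
                              // Remember whether the cursor is inside the wanted employee.
                              inTarget = id.equals(reader.getAttributeValue(null, "ID"));
                         } else if (inTarget && "Firstname".equals(name)) {
                              // Stop pulling: the remaining input is never parsed.
                              return reader.getElementText();
                         }
                    }
                    return null;
               } finally {
                    reader.close();
               }
          } catch (XMLStreamException e) {
               throw new IllegalStateException(e);
          }
     }

     public static void main(String[] args) {
          String xml = "<Employees>"
                    + "<Employee ID=\"1\"><Firstname>Lebron</Firstname></Employee>"
                    + "<Employee ID=\"2\"><Firstname>Anthony</Firstname></Employee>"
                    + "</Employees>";
          System.out.println(findFirstname(xml, "2")); // Anthony
     }
}
```

With SAX, stopping early usually means throwing an exception from a callback to abort the parse. With StAX, the loop simply ends.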

An example of a StAX parser is shown below:

StaxParserExample.java:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxParserExample {

     public static void main(String[] args) throws FileNotFoundException,
          XMLStreamException {

          if (args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

          List<Employee> employees = new ArrayList<>();
          Employee empl = null;
          String text = null;

          XMLInputFactory factory = XMLInputFactory.newInstance();
          // Deliver adjacent character data as a single CHARACTERS event.
          factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
          XMLStreamReader reader = factory.createXMLStreamReader(new FileInputStream(
                                                  new File(args[0])));

          while (reader.hasNext()) {
               int event = reader.next();

               switch (event) {
                    case XMLStreamConstants.START_ELEMENT: {
                         if ("Employee".equals(reader.getLocalName())) {
                              empl = new Employee();
                              // Look the attribute up by name rather than by position.
                              empl.setID(reader.getAttributeValue(null, "ID"));
                         }
                         break;
                    }
                    case XMLStreamConstants.CHARACTERS: {
                         text = reader.getText().trim();
                         break;
                    }
                    case XMLStreamConstants.END_ELEMENT: {
                         switch (reader.getLocalName()) {
                              case "Employee": {
                                   employees.add(empl);
                                   break;
                              }
                              case "Firstname": {
                                   empl.setFirstname(text);
                                   break;
                              }
                              case "Lastname": {
                                   empl.setLastname(text);
                                   break;
                              }
                              case "Age": {
                                   empl.setAge(Integer.parseInt(text));
                                   break;
                              }
                              case "Salary": {
                                   empl.setSalary(Double.parseDouble(text));
                                   break;
                              }
                         }
                         break;
                    }
               }
          }

          // Print all employees.
          for (Employee employee : employees)
               System.out.println(employee.toString());
     }
}

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>

Download the Eclipse Project

The Eclipse project of this example: XMLParsers.zip.

This was a tutorial about XML parsers in Java.

Examples Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Examples Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Examples Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.