Sotirios-Efstathios Maneas

About Sotirios-Efstathios Maneas

Sotirios-Efstathios (Stathis) Maneas is a postgraduate student at the Department of Informatics and Telecommunications of The National and Kapodistrian University of Athens. His main interests include distributed systems, web crawling, model checking, operating systems, programming languages and web applications.

Java XML parser tutorial

In this tutorial we will discuss about XML parsers in Java. XML is a markup language that defines a set of rules for encoding documents. Java offers a number of libraries in order to parse and process XML documents. An XML parser provides the required functionality to read and modify an XML file.

The XML language is used to provide a general way in order for different machines to communicate and exchange data. Like Java, XML is also platform independent. An XML document consists of elements. Each element has a start tag, its content and an end tag. Also, an XML document must have exactly one root element. Finally, an XML file has a strict syntax and form.

Example of an XML File

In the following file, we will declare the employees of a company. Each employee has a unique ID, first and last name, age and salary. The employees are separated by their IDs. We create a new file called Employees.xml as shown below:

Employees.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Employees>
     <Employee ID="1">
          <Firstname>Lebron</Firstname >
          <Lastname>James</Lastname>
          <Age>30</Age>
          <Salary>2500</Salary>
     </Employee>
     <Employee ID="2">
          <Firstname>Anthony</Firstname>
          <Lastname>Davis</Lastname>
          <Age>22</Age>
          <Salary>1500</Salary>
     </Employee>
     <Employee ID="3">
          <Firstname>Paul</Firstname>
          <Lastname>George</Lastname>
          <Age>24</Age>
          <Salary>2000</Salary>
     </Employee>
     <Employee ID="4">
          <Firstname>Blake</Firstname>
          <Lastname>Griffin</Lastname>
          <Age>25</Age>
          <Salary>2250</Salary>
     </Employee>
</Employees>

Also, in order to capture the notion of an employee, we create its respective Java class, called Employee.java as shown below:

Employee.java:

class Employee {

     private String ID;
     private String Firstname;
     private String Lastname;
     private int age;
     private double salary;

     public Employee(String ID, String Firstname, String Lastname, int age, double salary) {
          this.ID = ID;
          this.Firstname = Firstname;
          this.Lastname = Lastname;
          this.age = age;
          this.salary = salary;
     }

     @Override
     public String toString() {
          return "<" + ID + ", " + Firstname + ", " + Lastname + ", " + age + ", "
                                   + salary + ">";
     }
}

Parse an XML File using the DOM Parser

The DOM parser implementation is included in the release of JDK. The Document Object Model provides APIs that let you create, modify, delete, and rearrange nodes. The DOM parser parses the entire XML document and loads the XML content into a Tree structure. Using the Node and NodeList classes, we can retrieve and modify the content of an XML file.

A sample example that loads the content of an XML file and prints its contents is shown below:

DomParserExample.java:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DomParserExample {

     public static void main(String[] args) throws ParserConfigurationException,
          SAXException, IOException {

          if(args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

          DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
          DocumentBuilder builder = factory.newDocumentBuilder();

          // Load the input XML document, parse it and return an instance of the
          // Document class.
          Document document = builder.parse(new File(args[0]));

          List<Employee> employees = new ArrayList<Employee>();
          NodeList nodeList = document.getDocumentElement().getChildNodes();
          for (int i = 0; i < nodeList.getLength(); i++) {
               Node node = nodeList.item(i);

               if (node.getNodeType() == Node.ELEMENT_NODE) {
                    Element elem = (Element) node;

                    // Get the value of the ID attribute.
                    String ID = node.getAttributes().getNamedItem("ID").getNodeValue();

                    // Get the value of all sub-elements.
                    String firstname = elem.getElementsByTagName("Firstname")
                                        .item(0).getChildNodes().item(0).getNodeValue();

                    String lastname = elem.getElementsByTagName("Lastname").item(0)
                                        .getChildNodes().item(0).getNodeValue();

                    Integer age = Integer.parseInt(elem.getElementsByTagName("Age")
                                        .item(0).getChildNodes().item(0).getNodeValue());

                    Double salary = Double.parseDouble(elem.getElementsByTagName("Salary")
                                        .item(0).getChildNodes().item(0).getNodeValue());

                    employees.add(new Employee(ID, firstname, lastname, age, salary));
               }
          }

          // Print all employees.
          for (Employee empl : employees)
               System.out.println(empl.toString());
     }
}

Inside the main method, we create a DocumentBuilder from the DocumentBuilderFactory and then, parse and store the XML file in an instance of the Document class. Then, we parse that document and when we find a node of type Node.ELEMENT_NODE, we retrieve all its information and store them in an instance of the Employee class. Finally, we print the information of all stored employees.

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>

Parse an XML File using the SAX Parser

SAX is an event-based sequential access parser API and provides a mechanism for reading data from an XML document that is an alternative to that provided by a DOM parser. A SAX parser only needs to report each parsing event as it happens and the minimum memory required for a SAX parser is proportional to the maximum depth of the XML file.

Our SAX parser extends the DefaultHandler class, in order to provide the following callbacks:

  • startElement: this event is triggered when a start tag is encountered.
  • endElement: – this event is triggered when an end tag is encountered.
  • characters: – this event is triggered when some text data is encountered.

A sample example of a SAX parser is shown below:

SaxParserExample.java:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserExample extends DefaultHandler {

     private static List<Employee> employees = new ArrayList<Employee>();
     private static Employee empl = null;
     private static String text = null;

     @Override
     // A start tag is encountered.
     public void startElement(String uri, String localName, String qName, Attributes attributes)
          throws SAXException {

          switch (qName) {
               // Create a new Employee.
               case "Employee": {
                    empl = new Employee();
                    empl.setID(attributes.getValue("ID"));
                    break;
               }
          }
     }

     @Override
     public void endElement(String uri, String localName, String qName) throws SAXException {
          switch (qName) {
               case "Employee": {
                    // The end tag of an employee was encountered, so add the employee to the list.
                    employees.add(empl);
                    break;
               }
               case "Firstname": {
                    empl.setFirstname(text);
                    break;
               }
               case "Lastname": {
                    empl.setLastname(text);
                    break;
               }
               case "Age": {
                    empl.setAge(Integer.parseInt(text));
                    break;
               }
               case "Salary": {
                    empl.setSalary(Double.parseDouble(text));
                    break;
               }
          }
     }

     @Override
     public void characters(char[] ch, int start, int length) throws SAXException {
          text = String.copyValueOf(ch, start, length).trim();
     }

     public static void main(String[] args) throws ParserConfigurationException,
          SAXException, IOException {

          if (args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

          SAXParserFactory parserFactor = SAXParserFactory.newInstance();
          SAXParser parser = parserFactor.newSAXParser();
          SAXParserExample handler = new SAXParserExample();

          parser.parse(new File(args[0]), handler);

          // Print all employees.
          for (Employee empl : employees)
               System.out.println(empl.toString());
     }
}

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>

Parse an XML File using the StAX Parser

Streaming API for XML (StAX) is an application programming interface to read and write XML documents. The StAX parser is an XML parser that is able to process tree-like structured data as the data gets streamed-in. StAX was designed as a median between DOM and SAX parsers. In a StAX parser, the entry point is a cursor that represents a point within the XML document. The application moves the cursor forward, in order to pull the information from the parser. In contrast, a SAX parser pushes data to the application, instead of pulling.

A sample example of a StAX parser is shown below:

StaxParserExample.java:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxParserExample {

     public static void main(String[] args) throws FileNotFoundException,
          XMLStreamException {

          if (args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");

          List<Employee> employees = null;
          Employee empl = null;
          String text = null;

          XMLInputFactory factory = XMLInputFactory.newInstance();
          XMLStreamReader reader = factory.createXMLStreamReader(new FileInputStream(
                                                  new File(args[0])));

          while (reader.hasNext()) {
               int Event = reader.next();

               switch (Event) {
                    case XMLStreamConstants.START_ELEMENT: {
                         if ("Employee".equals(reader.getLocalName())) {
                              empl = new Employee();
                              empl.setID(reader.getAttributeValue(0));
                         }
                         if ("Employees".equals(reader.getLocalName()))
                              employees = new ArrayList<>();

                         break;
                    }
                    case XMLStreamConstants.CHARACTERS: {
                         text = reader.getText().trim();
                         break;
                    }
                    case XMLStreamConstants.END_ELEMENT: {
                         switch (reader.getLocalName()) {
                              case "Employee": {
                                   employees.add(empl);
                                   break;
                              }
                              case "Firstname": {
                                   empl.setFirstname(text);
                                   break;
                              }
                              case "Lastname": {
                                   empl.setLastname(text);
                                   break;
                              }
                              case "Age": {
                                   empl.setAge(Integer.parseInt(text));
                                   break;
                              }
                              case "Salary": {
                                   empl.setSalary(Double.parseDouble(text));
                                   break;
                              }
                         }
                         break;
                    }
               }
          }

          // Print all employees.
          for (Employee employee : employees)
               System.out.println(employee.toString());
     }
}

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>

Download the Eclipse Project

The Eclipse project of this example: XMLParsers.zip.

This was a tutorial about XML parsers in Java.

Related Whitepaper:

Java Essential Training

Author David Gassner explores Java SE (Standard Edition), the language used to build mobile apps for Android devices, enterprise server applications, and more!

The course demonstrates how to install both Java and the Eclipse IDE and dives into the particulars of programming. The course also explains the fundamentals of Java, from creating simple variables, assigning values, and declaring methods to working with strings, arrays, and subclasses; reading and writing to text files; and implementing object oriented programming concepts. Exercise files are included with the course.

Get it Now!  

Examples Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use
All trademarks and registered trademarks appearing on Examples Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Examples Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.

Sign up for our Newsletter

15,153 insiders are already enjoying weekly updates and complimentary whitepapers! Join them now to gain exclusive access to the latest news in the Java world, as well as insights about Android, Scala, Groovy and other related technologies.

As an extra bonus, by joining you will get our brand new e-books, published by Java Code Geeks and their JCG partners for your reading pleasure! Enter your info and stay on top of things,

  • Fresh trends
  • Cases and examples
  • Research and insights
  • Two complimentary e-books
Get tutored by the Geeks! JCG Academy is a fact... Join Now
Hello. Add your message here.