
About Sotirios-Efstathios Maneas

Sotirios-Efstathios (Stathis) Maneas is a PhD student at the Department of Computer Science at the University of Toronto. His main interests include distributed systems, storage systems, file systems, and operating systems.

Java XML parser Tutorial

This tutorial discusses XML parsing in Java. XML is a markup language that defines a set of rules for encoding documents. Java offers several libraries for parsing and processing XML documents. An XML parser provides the functionality required to read and modify an XML file.

XML provides a general way for different machines to communicate and exchange data. Like Java, XML is platform independent. An XML document consists of elements. Each element has a start tag, its content, and an end tag. An XML document must also have exactly one root element. Finally, XML has a strict syntax: every start tag needs a matching end tag, and elements must be properly nested.
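The rules above (one root element, matching and properly nested tags) can be checked with any JDK parser. The following is a minimal sketch, assuming the JDK's default DocumentBuilder; the class name WellFormedCheck and the method isWellFormed are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial's project.
class WellFormedCheck {

    // Returns true if the given text parses as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to System.err.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A single root element with properly nested tags.
        System.out.println(isWellFormed("<a><b>text</b></a>"));
        // Two root elements.
        System.out.println(isWellFormed("<a></a><b></b>"));
        // End tags in the wrong order.
        System.out.println(isWellFormed("<a><b>text</a></b>"));
    }
}
```

Only the first document is accepted; the other two violate the single-root and nesting rules, so the parser reports a fatal error for each.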

1. Example of an XML File

The following file declares the employees of a company. Each employee has a unique ID, a first and last name, an age, and a salary. Employees are distinguished by their IDs. We create a new file called Employees.xml, as shown below:

Employees.xml

<?xml version="1.0" encoding="UTF-8"?>
<Employees>
     <Employee ID="1">
          <Firstname>Lebron</Firstname>
          <Lastname>James</Lastname>
          <Age>30</Age>
          <Salary>2500</Salary>
     </Employee>
     <Employee ID="2">
          <Firstname>Anthony</Firstname>
          <Lastname>Davis</Lastname>
          <Age>22</Age>
          <Salary>1500</Salary>
     </Employee>
     <Employee ID="3">
          <Firstname>Paul</Firstname>
          <Lastname>George</Lastname>
          <Age>24</Age>
          <Salary>2000</Salary>
     </Employee>
     <Employee ID="4">
          <Firstname>Blake</Firstname>
          <Lastname>Griffin</Lastname>
          <Age>25</Age>
          <Salary>2250</Salary>
     </Employee>
</Employees>

To capture the notion of an employee, we also create a corresponding Java class in Employee.java, as shown below:

Employee.java:

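class Employee {

     private String ID;
     private String Firstname;
     private String Lastname;
     private int age;
     private double salary;

     // No-argument constructor. The SAX and StAX examples create an empty
     // employee first and fill it in through the setters below.
     public Employee() {}

     public Employee(String ID, String Firstname, String Lastname, int age, double salary) {
          this.ID = ID;
          this.Firstname = Firstname;
          this.Lastname = Lastname;
          this.age = age;
          this.salary = salary;
     }

     public void setID(String ID) { this.ID = ID; }
     public void setFirstname(String Firstname) { this.Firstname = Firstname; }
     public void setLastname(String Lastname) { this.Lastname = Lastname; }
     public void setAge(int age) { this.age = age; }
     public void setSalary(double salary) { this.salary = salary; }

     @Override
     public String toString() {
          return "<" + ID + ", " + Firstname + ", " + Lastname + ", " + age + ", "
                                   + salary + ">";
     }
}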

2. Parse an XML File using the DOM Parser

A DOM parser implementation ships with the JDK. The Document Object Model provides APIs that let you create, modify, delete, and rearrange nodes. The DOM parser parses the entire XML document and loads its content into a tree structure in memory. Using the Node and NodeList interfaces, we can retrieve and modify the content of an XML file.

An example that loads an XML file and prints its contents is shown below:

DomParserExample.java:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
 
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
 
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
 
public class DomParserExample {
 
     public static void main(String[] args) throws ParserConfigurationException,
          SAXException, IOException {
 
          if(args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");
 
          DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
          DocumentBuilder builder = factory.newDocumentBuilder();
 
          // Load the input XML document, parse it and return an instance of the
          // Document class.
          Document document = builder.parse(new File(args[0]));
 
          List<Employee> employees = new ArrayList<Employee>();
          NodeList nodeList = document.getDocumentElement().getChildNodes();
          for (int i = 0; i < nodeList.getLength(); i++) {
               Node node = nodeList.item(i);
 
               if (node.getNodeType() == Node.ELEMENT_NODE) {
                    Element elem = (Element) node;
 
                    // Get the value of the ID attribute.
                    String ID = node.getAttributes().getNamedItem("ID").getNodeValue();
 
                    // Get the value of all sub-elements.
                    String firstname = elem.getElementsByTagName("Firstname")
                                        .item(0).getChildNodes().item(0).getNodeValue();
 
                    String lastname = elem.getElementsByTagName("Lastname").item(0)
                                        .getChildNodes().item(0).getNodeValue();
 
                    Integer age = Integer.parseInt(elem.getElementsByTagName("Age")
                                        .item(0).getChildNodes().item(0).getNodeValue());
 
                    Double salary = Double.parseDouble(elem.getElementsByTagName("Salary")
                                        .item(0).getChildNodes().item(0).getNodeValue());
 
                    employees.add(new Employee(ID, firstname, lastname, age, salary));
               }
          }
 
          // Print all employees.
          for (Employee empl : employees)
               System.out.println(empl.toString());
     }
}

Inside the main method, we create a DocumentBuilder from the DocumentBuilderFactory, then parse the XML file and store it in an instance of the Document class. Next, we iterate over the child nodes of the root element. These children also include the whitespace text nodes between the elements, so we check the type of each node: when we find a node of type Node.ELEMENT_NODE, we retrieve its information and store it in an instance of the Employee class. Finally, we print the information of all stored employees.

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>
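The DOM example above only reads the document, while the section also mentions modifying and deleting nodes. The following is a minimal sketch of that, assuming the JDK's default DocumentBuilder and Transformer implementations; the class DomModifyExample and the method keepOnlyWithSalary are hypothetical names, and the inline XML is a shortened stand-in for Employees.xml.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of the tutorial's project.
class DomModifyExample {

    // Sets the salary of the employee with the given ID, removes every other
    // employee, and returns the resulting document as a string.
    static String keepOnlyWithSalary(String xml, String id, String newSalary) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList employees = document.getElementsByTagName("Employee");
            // Iterate backwards: the NodeList is live and shrinks on removal.
            for (int i = employees.getLength() - 1; i >= 0; i--) {
                Element employee = (Element) employees.item(i);
                if (id.equals(employee.getAttribute("ID"))) {
                    employee.getElementsByTagName("Salary").item(0).setTextContent(newSalary);
                } else {
                    employee.getParentNode().removeChild(employee);
                }
            }

            // Serialize the modified tree back to text.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Employees>"
                + "<Employee ID=\"1\"><Salary>2500</Salary></Employee>"
                + "<Employee ID=\"2\"><Salary>1500</Salary></Employee>"
                + "</Employees>";
        System.out.println(keepOnlyWithSalary(xml, "2", "1750"));
    }
}
```

Because the whole tree is in memory, edits like these are straightforward with DOM; the trade-off is that memory use grows with the size of the document.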

3. Parse an XML File using the SAX Parser

SAX is an event-based, sequential-access parser API that provides an alternative to DOM for reading data from an XML document. A SAX parser only needs to report each parsing event as it happens, so it does not have to hold the whole document in memory: the minimum memory it requires is proportional to the maximum depth of the XML file.

Our SAX parser extends the DefaultHandler class, in order to provide the following callbacks:

  • startElement: this event is triggered when a start tag is encountered.
  • endElement: this event is triggered when an end tag is encountered.
  • characters: this event is triggered when some text data is encountered.

An example of a SAX parser is shown below:

SAXParserExample.java:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
 
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
 
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
 
public class SAXParserExample extends DefaultHandler {
 
     private static List<Employee> employees = new ArrayList<Employee>();
     private static Employee empl = null;
     // Collects the text of the current element. The parser may deliver the text
     // of a single element in several characters() calls, so we append to it.
     private static StringBuilder text = new StringBuilder();

     @Override
     // A start tag is encountered.
     public void startElement(String uri, String localName, String qName, Attributes attributes)
          throws SAXException {

          // Discard any text collected before this start tag.
          text.setLength(0);

          switch (qName) {
               // Create a new Employee.
               case "Employee": {
                    empl = new Employee();
                    empl.setID(attributes.getValue("ID"));
                    break;
               }
          }
     }

     @Override
     public void endElement(String uri, String localName, String qName) throws SAXException {
          // The complete text of the element that just ended.
          String value = text.toString().trim();

          switch (qName) {
               case "Employee": {
                    // The end tag of an employee was encountered, so add the employee to the list.
                    employees.add(empl);
                    break;
               }
               case "Firstname": {
                    empl.setFirstname(value);
                    break;
               }
               case "Lastname": {
                    empl.setLastname(value);
                    break;
               }
               case "Age": {
                    empl.setAge(Integer.parseInt(value));
                    break;
               }
               case "Salary": {
                    empl.setSalary(Double.parseDouble(value));
                    break;
               }
          }
     }

     @Override
     public void characters(char[] ch, int start, int length) throws SAXException {
          text.append(ch, start, length);
     }
 
     public static void main(String[] args) throws ParserConfigurationException,
          SAXException, IOException {
 
          if (args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");
 
          SAXParserFactory parserFactor = SAXParserFactory.newInstance();
          SAXParser parser = parserFactor.newSAXParser();
          SAXParserExample handler = new SAXParserExample();
 
          parser.parse(new File(args[0]), handler);
 
          // Print all employees.
          for (Employee empl : employees)
               System.out.println(empl.toString());
     }
}

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>
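To make the push model more concrete, the following minimal sketch records the callbacks in the order the parser delivers them and tracks the nesting depth, which is the only state such a handler needs to keep. The class SaxEventLog and its methods are hypothetical names used for illustration, and the inline XML is a shortened stand-in for Employees.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the tutorial's project.
class SaxEventLog extends DefaultHandler {

    private final List<String> events = new ArrayList<>();
    private int depth = 0;
    private int maxDepth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip chunks that contain only whitespace, such as indentation.
        String chunk = new String(ch, start, length).trim();
        if (!chunk.isEmpty())
            events.add("text:" + chunk);
    }

    List<String> events() { return events; }

    int maxDepth() { return maxDepth; }

    // Parses the given XML text and returns the handler that saw the events.
    static SaxEventLog parse(String xml) {
        try {
            SaxEventLog handler = new SaxEventLog();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SaxEventLog log = parse("<Employee ID=\"1\"><Age>30</Age></Employee>");
        for (String event : log.events())
            System.out.println(event);
        System.out.println("maximum depth: " + log.maxDepth());
    }
}
```

The application never asks for the next event here; the parser drives the handler from start to finish, which is the main difference from the pull-style StAX API.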

4. Parse an XML File using the StAX Parser

Streaming API for XML (StAX) is an application programming interface for reading and writing XML documents. A StAX parser is able to process tree-structured data as the data is streamed in. StAX was designed as a middle ground between the DOM and SAX parsers. In a StAX parser, the entry point is a cursor that represents a point within the XML document. The application moves the cursor forward in order to pull information from the parser. In contrast, a SAX parser pushes data to the application.

An example of a StAX parser is shown below:

StaxParserExample.java:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
 
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
 
public class StaxParserExample {
 
     public static void main(String[] args) throws FileNotFoundException,
          XMLStreamException {
 
          if (args.length != 1)
               throw new RuntimeException("The name of the XML file is required!");
 
          List<Employee> employees = null;
          Employee empl = null;
          String text = null;
 
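          XMLInputFactory factory = XMLInputFactory.newInstance();
          // Ask the parser to merge adjacent character data, so that the text of
          // an element is reported in a single CHARACTERS event.
          factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
          XMLStreamReader reader = factory.createXMLStreamReader(new FileInputStream(
                                                  new File(args[0])));

          while (reader.hasNext()) {
               int event = reader.next();

               switch (event) {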
                    case XMLStreamConstants.START_ELEMENT: {
                         if ("Employee".equals(reader.getLocalName())) {
                              empl = new Employee();
                              empl.setID(reader.getAttributeValue(0));
                         }
                         if ("Employees".equals(reader.getLocalName()))
                              employees = new ArrayList<>();
 
                         break;
                    }
                    case XMLStreamConstants.CHARACTERS: {
                         text = reader.getText().trim();
                         break;
                    }
                    case XMLStreamConstants.END_ELEMENT: {
                         switch (reader.getLocalName()) {
                              case "Employee": {
                                   employees.add(empl);
                                   break;
                              }
                              case "Firstname": {
                                   empl.setFirstname(text);
                                   break;
                              }
                              case "Lastname": {
                                   empl.setLastname(text);
                                   break;
                              }
                              case "Age": {
                                   empl.setAge(Integer.parseInt(text));
                                   break;
                              }
                              case "Salary": {
                                   empl.setSalary(Double.parseDouble(text));
                                   break;
                              }
                         }
                         break;
                    }
               }
          }
 
          // Print all employees.
          for (Employee employee : employees)
               System.out.println(employee.toString());
     }
}

A sample execution is shown below:

<1, Lebron, James, 30, 2500.0>
<2, Anthony, Davis, 22, 1500.0>
<3, Paul, George, 24, 2000.0>
<4, Blake, Griffin, 25, 2250.0>
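Because the application controls the cursor, it can also stop pulling as soon as it has what it needs. The following is a minimal sketch of that idea; the class StaxFindFirst and the method lastnameOf are hypothetical names used for illustration, and the inline XML is a shortened stand-in for Employees.xml.

```java
import java.io.StringReader;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class, not part of the tutorial's project.
class StaxFindFirst {

    // Returns the last name of the first employee with the given ID,
    // or null if there is no such employee.
    static String lastnameOf(String xml, String id) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            boolean inTarget = false;

            while (reader.hasNext()) {
                // Only start tags are of interest; skip every other event.
                if (reader.next() != XMLStreamConstants.START_ELEMENT)
                    continue;

                if ("Employee".equals(reader.getLocalName())) {
                    inTarget = id.equals(reader.getAttributeValue(null, "ID"));
                } else if (inTarget && "Lastname".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag.
                    // After that we simply stop moving the cursor.
                    String result = reader.getElementText();
                    reader.close();
                    return result;
                }
            }

            reader.close();
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Employees>"
                + "<Employee ID=\"1\"><Lastname>James</Lastname></Employee>"
                + "<Employee ID=\"2\"><Lastname>Davis</Lastname></Employee>"
                + "</Employees>";
        System.out.println(lastnameOf(xml, "2"));
    }
}
```

With SAX the same early exit is usually done by throwing an exception from the handler, since the parser, not the application, decides when the next event arrives.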

5. Parse an XML File using JAXB

Java Architecture for XML Binding (JAXB) provides a fast and convenient way to bind XML schemas and Java representations, making it easy for Java developers to incorporate XML data and processing functions in Java applications. As part of this process, JAXB provides methods for unmarshalling (reading) XML instance documents into Java content trees, and then marshalling (writing) Java content trees back into XML instance documents. JAXB also provides a way to generate XML schema from Java objects.

JAXB annotations defined in the javax.xml.bind.annotation package can be used to customize how Java program elements map to an XML schema. Note that since Java 11 JAXB is no longer bundled with the JDK, so on newer versions it has to be added to the project as a separate dependency. Let us now look at the marshalling and unmarshalling features using an example.

The EmployeeData class contains all the attributes that will be mapped to the XML schema. Notice the annotations @XmlRootElement, @XmlAttribute, and @XmlElement, which indicate the root XML element, an attribute, and the child elements, respectively.

EmployeeData.java
package main.java;

import javax.xml.bind.annotation.*;

/*
* Employee class to map the XML schema
*/
@XmlRootElement(name="employee")
public class EmployeeData {
    @XmlAttribute(name="id")
    private String ID;
    @XmlElement(name="firstName")
    private String Firstname;
    @XmlElement(name="lastName")
    private String Lastname;
    @XmlElement(name="age")
    private Integer age;
    @XmlElement(name="salary")
    private Double salary;

    // JAXB requires a no-argument constructor.
    public EmployeeData() {}

    public EmployeeData(String ID, String Firstname, String Lastname, Integer age, Double salary) {
        this.ID = ID;
        this.Firstname = Firstname;
        this.Lastname = Lastname;
        this.age = age;
        this.salary = salary;
    }

    public void setID(String ID) {
        this.ID = ID;
    }

    public void setFirstname(String firstname) {
        this.Firstname = firstname;
    }

    public void setLastname(String lastname) {
        this.Lastname = lastname;
    }

    public void setAge(Integer age) {
        this.age = age;
    }

    public void setSalary(Double salary) {
        this.salary = salary;
    }

    @Override
    public String toString() {
        // Same format as Employee.toString() in the earlier sections.
        return "<" + ID + ", " + Firstname + ", " + Lastname + ", " + age + ", "
                + salary + ">";
    }
}

The Employees class holds the list of all employees. Note that the class is annotated with @XmlRootElement (named employees) and that the list is mapped with @XmlElement (named employeeData).

Employees.java
package main.java;

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import java.util.ArrayList;
import java.util.List;

/*
* Schema to hold multiple employee objects
*/
@XmlRootElement(name = "employees")
public class Employees {
    List<EmployeeData> employees;

    public List<EmployeeData> getEmployees() {
        return employees;
    }

    @XmlElement(name = "employeeData")
    public void setEmployees(List<EmployeeData> employees) {
        this.employees = employees;
    }

    public void add(EmployeeData employeeData) {
        if (this.employees == null) {
            this.employees = new ArrayList<>();
        }
        this.employees.add(employeeData);
    }

    @Override
    public String toString() {
        System.out.println("Our employee list after unmarshall is : ");
        StringBuilder str = new StringBuilder();
        // The list is null when no employee has been added or read.
        if (employees != null) {
            for (EmployeeData emp : employees) {
                str.append(emp.toString());
            }
        }
        return str.toString();
    }

}

The JAXBExample class performs the marshalling and unmarshalling operations.

JAXBExample.java
package main.java;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import java.io.File;

/*
* Class to check marshalling and unmarshalling
*/
public class JAXBExample {
    public static void main(String[] args) {
        // Create the employee list
        Employees empList = new Employees();
        EmployeeData data1 = new EmployeeData("1", "Charlie", "Chaplin", 35, 2000.00);
        EmployeeData data2 = new EmployeeData("2", "John", "Rambo", 36, 2500.00);
        empList.add(data1);
        empList.add(data2);

        try {
            // Marshal: write the employee list to employee.xml and to the console
            JAXBContext jaxbContext = JAXBContext.newInstance(Employees.class);
            Marshaller marshaller = jaxbContext.createMarshaller();
            marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
            marshaller.marshal(empList, new File("employee.xml"));
            marshaller.marshal(empList, System.out);

            // Unmarshal: read the employee list back from employee.xml
            File file = new File("employee.xml");
            Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
            empList = (Employees) unmarshaller.unmarshal(file);
            System.out.println(empList);

        } catch (JAXBException jaxbe) {
            jaxbe.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Executing the JAXBExample class produces the output shown in Fig 1.

Fig 1. JAXB output

6. Download the Eclipse Project

This was a tutorial about XML parsers in Java.

Download
You can download the full source code of this example here: Java XML parser Tutorial

Last updated on Oct. 08, 2019


2 Comments

Farhath: Where is the Example.xml file called here?

edwin: Here -> (args[0])