Nikos Maravitsas

About Nikos Maravitsas

Nikos graduated from the Department of Informatics and Telecommunications of The National and Kapodistrian University of Athens. Currently, his main interests are systems security, parallel systems, artificial intelligence, operating systems, system programming, telecommunications, web applications, human-machine interaction and mobile development.

Read UTF-8 XML File in Java using SAX parser example

In the previous SAX parser tutorial we saw how to parse and read a simple XML file. If your file is encoded in UTF-8, the parser may throw a MalformedByteSequenceException. To solve this, read the file through an InputStreamReader that decodes UTF-8 and set the InputSource encoding to UTF-8.

You can do this with the following code:

InputStream inputStream= new FileInputStream(xmlFile);
InputStreamReader inputReader = new InputStreamReader(inputStream,"UTF-8");
InputSource inputSource = new InputSource(inputReader);
inputSource.setEncoding("UTF-8");
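
As a self-contained illustration of the snippet above, the following sketch parses a small UTF-8 document from an in-memory byte array instead of a file, so it can be run without any XML file on disk. The class name Utf8ReaderDemo, the helper parseText and the sample XML are assumptions made for this example. Note that, according to the InputSource documentation, setEncoding has no effect once a character stream (a Reader) has been supplied, so in this setup it is the InputStreamReader that performs the UTF-8 decoding; the call is kept only to mirror the snippet.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class Utf8ReaderDemo {

	// Hypothetical helper: decodes the bytes as UTF-8 and returns all character data found
	static String parseText(byte[] xmlBytes) {
		final StringBuilder text = new StringBuilder();

		try (InputStream in = new ByteArrayInputStream(xmlBytes);
				Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {

			InputSource source = new InputSource(reader);
			source.setEncoding("UTF-8"); // informational here: the Reader already decodes

			SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
			parser.parse(source, new DefaultHandler() {
				@Override
				public void characters(char[] ch, int start, int length) {
					text.append(ch, start, length);
				}
			});

		} catch (ParserConfigurationException | SAXException | IOException e) {
			throw new IllegalStateException("Could not parse the sample document", e);
		}
		return text.toString();
	}

	public static void main(String[] args) {
		// U+00A9 is the copyright sign; the escape keeps this source file ASCII-only
		String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
				+ "<address>34 Stanley St.\u00A9</address>";
		System.out.println(parseText(xml.getBytes(StandardCharsets.UTF_8)));
	}
}
```

Whether the © is displayed correctly on the console depends on the console's own encoding, which is independent of how the XML was parsed.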

Here is the XML file we are going to use for our demo. It contains the non-ASCII character ©, which has to be decoded as UTF-8 to be read correctly.

testFile.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<company>

	<employee id="10">
		<firstname>Jeremy</firstname>
		<lastname>Harley</lastname>
		<email>james@example.org</email>
		<department>Human Resources</department>
		<salary>2000000</salary>
		<address>34 Stanley St.©</address>
	</employee>

	<employee id="2">
		<firstname>John</firstname>
		<lastname>May</lastname>
		<email>john@example.org</email>
		<department>Logistics</department>
		<salary>400</salary>
		<address>123 Stanley St.</address>
	</employee>

</company>
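
To see why the decoding step matters for this file, the sketch below encodes © as UTF-8 and then decodes the same bytes with the wrong charset. In UTF-8 the character U+00A9 occupies two bytes (0xC2 0xA9), so a reader that assumes a single-byte encoding such as ISO-8859-1 sees two characters instead of one. The class and method names are illustrative assumptions, not part of the original example.

```java
import java.nio.charset.StandardCharsets;

class CopyrightBytesDemo {

	// Number of bytes the given text occupies when encoded as UTF-8
	static int utf8Length(String text) {
		return text.getBytes(StandardCharsets.UTF_8).length;
	}

	// Encodes the text as UTF-8, then (incorrectly) decodes the bytes as ISO-8859-1
	static String misdecode(String text) {
		byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
		return new String(bytes, StandardCharsets.ISO_8859_1);
	}

	public static void main(String[] args) {
		String copyright = "\u00A9"; // the copyright sign, written as an escape

		System.out.println(utf8Length(copyright));         // prints 2
		System.out.println(misdecode(copyright).length()); // prints 2: one character became two
	}
}
```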

MyHandler.java:

package com.javacodegeeks.java.core;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyHandler extends DefaultHandler {

	boolean tagFname = false;
	boolean tagLname = false;
	boolean tagEmail = false;
	boolean tagDep = false;
	boolean tagSalary = false;
	boolean tagAddress = false;

	public void startElement(String uri, String localName, String qName,
			Attributes attributes) throws SAXException {

		if (attributes.getLength() > 0) {

			String tag = "<" + qName;
			for (int i = 0; i < attributes.getLength(); i++) {

				// getQName is used because the local name may be empty
				// when the parser is not namespace-aware
				tag += " " + attributes.getQName(i) + "="
						+ attributes.getValue(i);
			}

			tag += ">";
			System.out.println(tag);

		} else {

			System.out.println("<" + qName + ">");
		}

		if (qName.equalsIgnoreCase("firstname")) {
			tagFname = true;
		}

		if (qName.equalsIgnoreCase("lastname")) {
			tagLname = true;
		}

		if (qName.equalsIgnoreCase("email")) {
			tagEmail = true;
		}

		if (qName.equalsIgnoreCase("department")) {
			tagDep = true;
		}

		if (qName.equalsIgnoreCase("salary")) {
			tagSalary = true;
		}

		if (qName.equalsIgnoreCase("address")) {
			tagAddress = true;
		}

	}

	public void characters(char ch[], int start, int length)
			throws SAXException {

		if (tagFname) {
			System.out.println(new String(ch, start, length));
			tagFname = false;
		}

		if (tagLname) {
			System.out.println(new String(ch, start, length));
			tagLname = false;
		}

		if (tagEmail) {
			System.out.println(new String(ch, start, length));
			tagEmail = false;
		}

		if (tagDep) {
			System.out.println(new String(ch, start, length));
			tagDep = false;
		}

		if (tagSalary) {
			System.out.println(new String(ch, start, length));
			tagSalary = false;
		}

		if (tagAddress) {
			System.out.println(new String(ch, start, length));
			tagAddress = false;
		}

	}

	public void endElement(String uri, String localName, String qName)
			throws SAXException {

		System.out.println("</" + qName + ">");

	}

}
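
One caveat with the flag-based approach: the SAX API allows the parser to deliver the text of a single element in several characters() calls, and the handler above resets its flag after the first one. A common alternative, sketched here under the assumption that only the text of leaf elements is of interest, is to accumulate the text in a StringBuilder and read it in endElement. BufferingHandler, getValues and collect are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class BufferingHandler extends DefaultHandler {

	private final StringBuilder buffer = new StringBuilder();
	private final List<String> values = new ArrayList<>();

	@Override
	public void startElement(String uri, String localName, String qName,
			Attributes attributes) {
		buffer.setLength(0); // start collecting text for the new element
	}

	@Override
	public void characters(char[] ch, int start, int length) {
		buffer.append(ch, start, length); // may be called several times per element
	}

	@Override
	public void endElement(String uri, String localName, String qName) {
		String text = buffer.toString().trim();
		if (!text.isEmpty()) {
			values.add(qName + "=" + text);
		}
		buffer.setLength(0); // ignore whitespace between closing tags
	}

	List<String> getValues() {
		return values;
	}

	// Hypothetical helper: parses the XML string and returns "element=text" pairs
	static List<String> collect(String xml) {
		BufferingHandler handler = new BufferingHandler();
		try {
			SAXParserFactory.newInstance().newSAXParser()
					.parse(new InputSource(new StringReader(xml)), handler);
		} catch (ParserConfigurationException | SAXException | IOException e) {
			throw new IllegalStateException("Could not parse the sample document", e);
		}
		return handler.getValues();
	}

	public static void main(String[] args) {
		String xml = "<employee><firstname>John</firstname>"
				+ "<lastname>May</lastname></employee>";
		System.out.println(collect(xml)); // prints [firstname=John, lastname=May]
	}
}
```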

ParseUTF8XMLFileWithSAX.java:

package com.javacodegeeks.java.core;

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;

public class ParseUTF8XMLFileWithSAX {

	private static final String xmlFilePath = "C:\\Users\\nikos7\\Desktop\\filesForExamples\\testFile.xml";

	public static void main(String[] args) {

		File xmlFile = new File(xmlFilePath);

		// try-with-resources closes the stream and the reader even if parsing fails
		try (InputStream inputStream = new FileInputStream(xmlFile);
				InputStreamReader inputReader = new InputStreamReader(inputStream, "UTF-8")) {

			SAXParserFactory factory = SAXParserFactory.newInstance();
			SAXParser saxParser = factory.newSAXParser();

			// The reader decodes the bytes as UTF-8 before the parser sees them
			InputSource inputSource = new InputSource(inputReader);
			inputSource.setEncoding("UTF-8");

			saxParser.parse(inputSource, new MyHandler());

		} catch (Exception e) {
			e.printStackTrace();
		}
	}
}
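
A possible variation, not part of the original example, is to hand the parser the raw byte stream instead of a Reader. In that case the encoding set on the InputSource is what tells the parser how to decode the bytes. The sketch below uses an in-memory byte array so that it runs without a file; Utf8ByteStreamDemo and parseText are assumed names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class Utf8ByteStreamDemo {

	// Hypothetical helper: lets the parser decode the bytes itself
	static String parseText(byte[] xmlBytes) {
		final StringBuilder text = new StringBuilder();

		try (InputStream in = new ByteArrayInputStream(xmlBytes)) {

			InputSource source = new InputSource(in); // byte stream, no Reader
			source.setEncoding("UTF-8");              // encoding the parser should use

			SAXParserFactory.newInstance().newSAXParser()
					.parse(source, new DefaultHandler() {
						@Override
						public void characters(char[] ch, int start, int length) {
							text.append(ch, start, length);
						}
					});

		} catch (ParserConfigurationException | SAXException | IOException e) {
			throw new IllegalStateException("Could not parse the sample document", e);
		}
		return text.toString();
	}

	public static void main(String[] args) {
		String xml = "<address>34 Stanley St.\u00A9</address>";
		System.out.println(parseText(xml.getBytes(StandardCharsets.UTF_8)));
	}
}
```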

Output:

<company>
<employee id=10>
<firstname>
Jeremy
</firstname>
<lastname>
Harley
</lastname>
<email>
james@example.org
</email>
<department>
Human Resources
</department>
<salary>
2000000
</salary>
<address>
34 Stanley St.©
</address>
</employee>
<employee id=2>
<firstname>
John
</firstname>
<lastname>
May
</lastname>
<email>
john@example.org
</email>
<department>
Logistics
</department>
<salary>
400
</salary>
<address>
123 Stanley St.
</address>
</employee>
</company>

 
This was an example of how to read a UTF-8 XML file in Java using the SAX parser.

Examples Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd