
Remove nodes from DOM document recursively

In this example we show how to remove nodes from a DOM Document recursively. We have implemented two methods: removeRecursively(Node node, short nodeType, String name), which recursively removes matching nodes from a DOM Document, and prettyPrint(Document xml), which converts a DOM into a formatted XML String. To remove nodes from a DOM Document recursively, perform the following steps:

  • Obtain a new instance of a DocumentBuilderFactory, a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents.
  • Configure the factory so that the parsers it produces do not validate documents as they are parsed, using the setValidating(boolean validating) API method of DocumentBuilderFactory with validating set to false.
  • Create a new instance of a DocumentBuilder, using newDocumentBuilder() API method of DocumentBuilderFactory.
  • Parse the FileInputStream with the content to be parsed, using parse(InputStream is) API method of DocumentBuilder. This method parses the content of the given InputStream as an XML document and returns a new DOM Document object.
  • Call the removeRecursively(Node node, short nodeType, String name) method of the example. This method takes a Node, a short nodeType and a String name as parameters. It checks whether the type of the given node equals the specified nodeType and whether the name of the node equals the given name (a null name matches any node of that type), using the getNodeType() and getNodeName() API methods of Node. If both conditions hold, it removes the node from the DOM Document by getting the parent of the node with the getParentNode() API method of Node and then removing the child with the removeChild(Node oldChild) API method of Node. Otherwise, it gets the children of the node with the getChildNodes() API method of Node and repeats the same steps for each of them.
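  • Use the normalize() API method of Document to normalize the DOM tree. The method merges adjacent Text nodes and removes empty ones in the full depth of the sub-tree underneath this node, so that Text nodes are separated only by structure such as elements, comments or CDATA sections.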
  • Call the prettyPrint(Document xml) method of the example. The method takes the XML Document and converts it into a formatted XML String, after transforming it with specific parameters, such as the encoding. It uses a Transformer, created with the newTransformer() API method of TransformerFactory, to transform a source tree into a result tree. After setting the output properties of the transformer with the setOutputProperty(String name, String value) API method of Transformer, the method performs the transformation with the transform(Source xmlSource, Result outputTarget) API method of Transformer. The parameters are a DOMSource wrapping the DOM node and a StreamResult created from a StringWriter.

The complete example is shown in the code snippet below.

package com.javacodegeeks.snippets.core;

import java.io.File;
import java.io.FileInputStream;
import java.io.StringWriter;
import java.io.Writer;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RemoveNodesFromDOMDocumentRecursively {
	
	public static void main(String[] args) throws Exception {
		
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		dbf.setValidating(false);
		DocumentBuilder db = dbf.newDocumentBuilder();
		
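		// parse the file; try-with-resources closes the stream afterwards
		Document doc;
		try (FileInputStream in = new FileInputStream(new File("in.xml"))) {
			doc = db.parse(in);
		}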
		
		// remove all elements named 'item'
		removeRecursively(doc, Node.ELEMENT_NODE, "item");

		// remove all comment nodes
		removeRecursively(doc, Node.COMMENT_NODE, null);
		
		// Normalize the DOM tree: merges adjacent text nodes and removes
		// empty ones in the full depth of the sub-tree underneath this node
		doc.normalize();
		
		prettyPrint(doc);
		
	}

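	public static void removeRecursively(Node node, short nodeType, String name) {
		if (node.getNodeType() == nodeType && (name == null || node.getNodeName().equals(name))) {
			// detach the matching node (and its whole subtree) from its parent;
			// a node without a parent, such as the Document itself, is left alone
			Node parent = node.getParentNode();
			if (parent != null) {
				parent.removeChild(node);
			}
		}
		else {
			// check the children recursively; getChildNodes() returns a live list,
			// so iterate backwards to avoid skipping the sibling of a removed node
			NodeList list = node.getChildNodes();
			for (int i = list.getLength() - 1; i >= 0; i--) {
				removeRecursively(list.item(i), nodeType, name);
			}
		}
	}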

	public static final void prettyPrint(Document xml) throws Exception {
		Transformer tf = TransformerFactory.newInstance().newTransformer();
		tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
		tf.setOutputProperty(OutputKeys.INDENT, "yes");
		Writer out = new StringWriter();
		tf.transform(new DOMSource(xml), new StreamResult(out));
		System.out.println(out.toString());
	}

}
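
A detail worth keeping in mind when removing nodes during traversal: the NodeList returned by getChildNodes() is live, so removing a child shifts the following siblings down by one index. The following is a minimal, self-contained sketch that compares a forward loop with a backward loop on adjacent sibling elements. The class name LiveNodeListDemo, its helper methods and the inline XML string are made up for illustration and are not part of the example above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class LiveNodeListDemo {

	// Hypothetical input: three adjacent <item/> elements with no whitespace between them
	static final String XML = "<channel><item/><item/><item/></channel>";

	static Document parse(String xml) throws Exception {
		return DocumentBuilderFactory.newInstance().newDocumentBuilder()
				.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
	}

	static boolean matches(Node child, String name) {
		return child.getNodeType() == Node.ELEMENT_NODE && child.getNodeName().equals(name);
	}

	// Forward loop over the live list: each removal moves the next sibling
	// into the current index, and the following i++ steps over it
	static int removeForward(Node parent, String name) {
		NodeList list = parent.getChildNodes();
		for (int i = 0; i < list.getLength(); i++) {
			if (matches(list.item(i), name)) {
				parent.removeChild(list.item(i));
			}
		}
		return parent.getChildNodes().getLength();
	}

	// Backward loop: a removal only shifts nodes that were already visited
	static int removeBackward(Node parent, String name) {
		NodeList list = parent.getChildNodes();
		for (int i = list.getLength() - 1; i >= 0; i--) {
			if (matches(list.item(i), name)) {
				parent.removeChild(list.item(i));
			}
		}
		return parent.getChildNodes().getLength();
	}

	public static void main(String[] args) throws Exception {
		int leftForward = removeForward(parse(XML).getDocumentElement(), "item");
		int leftBackward = removeBackward(parse(XML).getDocumentElement(), "item");
		System.out.println("children left after forward loop: " + leftForward);   // 1
		System.out.println("children left after backward loop: " + leftBackward); // 0
	}
}
```

In the in.xml file used in this example the item elements are separated by whitespace text nodes, so a skipped sibling is always a text node and the effect does not show up in the output. With input that has no whitespace between elements, a forward loop would leave every second match in place.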

Input:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
	<channel>
		<title>Java Tutorials and Examples</title>
		<item>
			<title><![CDATA[Java Tutorials]]></title>
			<link>http://www.javacodegeeks.com/</link>
		</item>
		<item>
			<title><![CDATA[Java Examples]]></title>
			<link>http://examples.javacodegeeks.com/</link>
		</item>
	</channel>
</rss>

Output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<rss version="2.0">
	<channel>
		<title>Java Tutorials and Examples</title>
		
		
	</channel>
</rss>
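
The blank lines inside the channel element in the output above are the whitespace text nodes that surrounded the removed item elements. normalize() merges them into one text node but does not delete them. If whitespace-only text is insignificant in your document (an assumption that does not hold for mixed content), one possible way to get rid of it is to remove such text nodes before printing. The sketch below illustrates this on a small inline document; the class name StripWhitespaceDemo, the stripWhitespace helper and the inline XML are hypothetical and not part of the original example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class StripWhitespaceDemo {

	// Hypothetical input: a title and one item, indented with newlines and tabs
	static final String XML = "<channel>\n\t<title>T</title>\n\t<item/>\n</channel>";

	static Document parse(String xml) throws Exception {
		return DocumentBuilderFactory.newInstance().newDocumentBuilder()
				.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
	}

	// Removes text nodes that contain only whitespace, in the whole sub-tree.
	// Iterates backwards because getChildNodes() returns a live list.
	static void stripWhitespace(Node node) {
		NodeList list = node.getChildNodes();
		for (int i = list.getLength() - 1; i >= 0; i--) {
			Node child = list.item(i);
			if (child.getNodeType() == Node.TEXT_NODE && child.getNodeValue().trim().isEmpty()) {
				node.removeChild(child);
			} else {
				stripWhitespace(child);
			}
		}
	}

	public static void main(String[] args) throws Exception {
		Document doc = parse(XML);
		Node channel = doc.getDocumentElement();

		// Removing <item/> leaves the two text nodes around it next to each other
		channel.removeChild(doc.getElementsByTagName("item").item(0));
		System.out.println("after removal:   " + channel.getChildNodes().getLength()); // 4

		// normalize() merges the adjacent text nodes into one
		doc.normalize();
		System.out.println("after normalize: " + channel.getChildNodes().getLength()); // 3

		// Only the <title> element remains once whitespace-only text is dropped
		stripWhitespace(doc);
		System.out.println("after strip:     " + channel.getChildNodes().getLength()); // 1
	}
}
```

In the example above, such a helper would be called between doc.normalize() and prettyPrint(doc), leaving the indentation entirely to the Transformer.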

  
This was an example of how to remove Nodes from a DOM Document recursively in Java.

Ilias Tsagklis

Ilias is a software developer turned online entrepreneur. He is co-founder and Executive Editor at Java Code Geeks.
