Remove nodes from DOM document recursively

Ilias TsagklisNovember 11th, 2012Last Updated: July 6th, 2013

1 191 3 minutes read

In this example we shall show you how to remove Nodes from a DOM Document recursively. We have implemented two methods, removeRecursively(Node node, short nodeType, String name), in order to remove recursively a Node from a DOM Document and void prettyPrint(Document xml), in order to convert a DOM into a formatted XML String. To remove Nodes from a DOM Document recursively one should perform the following steps:

Obtain a new instance of a DocumentBuilderFactory, that is a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents.
Set the parser produced so as not to validate documents as they are parsed, using setValidating(boolean validating) API method of DocumentBuilderFactory, with validating set to false.
Create a new instance of a DocumentBuilder, using newDocumentBuilder() API method of DocumentBuilderFactory.
Parse the FileInputStream with the content to be parsed, using parse(InputStream is) API method of DocumentBuilder. This method parses the content of the given InputStream as an XML document and returns a new DOM Document object.
Call removeRecursively(Node node, short nodeType, String name) method of the example. This method takes a Node, a short nodeType and a String name as parameters. It checks if the nodeType of the given node is equal to the specified nodetype and if the given name is equal to the name of the node, using getNodeType() and getNodeName() API methods of Node. If this statement is true, then it removes the node from the DOM DOcument, by taking the parent node of the node, with getParentNode() API method of Node, and then by removing the specified child, with removeChild(Node oldChild) API method of Node. Else, if the above statement is false, it gets the children of this Node, using getChildNodes() API method of Node and does the same steps for each one of them.
Use normalize() API method of Document, to normalize the DOM tree. The method puts all text nodes in the full depth of the sub-tree underneath this node.
Call void prettyPrint(Document xml) method of the example. The method gets the xml Document and converts it into a formatted xml String, after transforming it with specific parameters, such as encoding. The method uses a Transformer, that is created using newTransformer() API method of TransformerFactory. The Transformer is used to transform a source tree into a result tree. After setting specific output properties to the transformer, using setOutputProperty(String name, String value) API method of Transformer, the method uses it to make the transformation, with transform(Source xmlSource, Result outputTarget) API method of Transformer. The parameters are the DOMSource with the DOM node and the result that is a StreamResult created from a StringWriter,

as described in the code snippet below.

package com.javacodegeeks.snippets.core;

import java.io.File;
import java.io.FileInputStream;
import java.io.StringWriter;
import java.io.Writer;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RemoveNodesFromDOMDocumentRecursively {
	
	public static void main(String[] args) throws Exception {
		
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		dbf.setValidating(false);
		DocumentBuilder db = dbf.newDocumentBuilder();
		
		Document doc = db.parse(new FileInputStream(new File("in.xml")));
		
		// remove all elements named 'item'
		removeRecursively(doc, Node.ELEMENT_NODE, "item");

		// remove all comment nodes
		removeRecursively(doc, Node.COMMENT_NODE, null);
		
		// Normalize the DOM tree, puts all text nodes in the
		// full depth of the sub-tree underneath this node
		doc.normalize();
		
		prettyPrint(doc);
		
	}

	public static void removeRecursively(Node node, short nodeType, String name) {
		if (node.getNodeType()==nodeType && (name == null || node.getNodeName().equals(name))) {
			node.getParentNode().removeChild(node);
		}
		else {
			// check the children recursively
			NodeList list = node.getChildNodes();
			for (int i = 0; i < list.getLength(); i++) {
				removeRecursively(list.item(i), nodeType, name);
			}
		}
	}

	public static final void prettyPrint(Document xml) throws Exception {
		Transformer tf = TransformerFactory.newInstance().newTransformer();
		tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
		tf.setOutputProperty(OutputKeys.INDENT, "yes");
		Writer out = new StringWriter();
		tf.transform(new DOMSource(xml), new StreamResult(out));
		System.out.println(out.toString());
	}

}

Input:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
	<channel>
		<title>Java Tutorials and Examples</title>
		<item>
			<title><![CDATA[Java Tutorials]]></title>
			<link>http://www.javacodegeeks.com/</link>
		</item>
		<item>
			<title><![CDATA[Java Examples]]></title>
			<link>http://examples.javacodegeeks.com/</link>
		</item>
	</channel>
</rss>

Output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<rss version="2.0">
	<channel>
		<title>Java Tutorials and Examples</title>
		
		
	</channel>
</rss>

This was an example of how to remove Nodes from a DOM Document recursively in Java.

Remove nodes from DOM document recursively

Thank you!

Ilias Tsagklis

Thank you!

Thank you!

Related Articles

Thank you!