The XML DOM views an XML document as a tree structure, known as a node tree. Every node can be accessed through the tree, and elements can also be modified, deleted, or created through it. The node tree shows the set of nodes and the connections between them. Walking or looping through a node tree is called traversing.
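As a rough illustration of modifying, creating, and deleting nodes through the tree, here is a minimal sketch using Python's standard `xml.dom.minidom` module, which exposes W3C-style DOM methods. The sample document and its tag names are assumptions made for illustration; they are not taken from the original text.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document (tag names and values are assumptions).
xml_doc = parseString("<book><title>ABC</title><year>2020</year></book>")
root = xml_doc.documentElement

# Modify: change the value of the text node inside <title>.
title = root.getElementsByTagName("title")[0]
title.firstChild.nodeValue = "XYZ"

# Create: build a new <price> element with a text node and attach it to the root.
price = xml_doc.createElement("price")
price.appendChild(xml_doc.createTextNode("100.00"))
root.appendChild(price)

# Delete: remove the <year> element from its parent.
year = root.getElementsByTagName("year")[0]
root.removeChild(year)

print(root.toxml())
```

Every change goes through a node reached from the tree: the text of an element is held in a child text node, so editing the title means editing `title.firstChild`, not the element itself.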
Traversing the Node Tree:
We often need to loop through an XML document, for example to extract the value of each element. This is known as traversing the node tree.
Example: To loop through all the child nodes of <book> and to display their names and values:
The example first loads the XML string into xmlDoc and gets the child nodes of the root element. For each child node, it then outputs the node name and the node value of that child's text node.
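The traversal described above can be sketched as follows. This is not the original example: it uses Python's `xml.dom.minidom` instead of a browser parser, the variable is named `xml_doc`, and the child tag names of `<book>` are assumed for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical <book> document: the child tag names are assumptions.
xml_text = (
    "<book><title>ABC</title><author>Unknown</author>"
    "<year>2020</year><price>100.00</price></book>"
)

xml_doc = parseString(xml_text)   # load the XML string
root = xml_doc.documentElement    # the root element, <book>


def child_names_and_values(parent):
    """Return (node name, text value) pairs for each element child of parent."""
    pairs = []
    for child in parent.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            # The element's text is stored in its first child, a text node.
            text_node = child.firstChild
            value = text_node.nodeValue if text_node is not None else ""
            pairs.append((child.nodeName, value))
    return pairs


for name, value in child_names_and_values(root):
    print(f"{name}: {value}")
```

The `nodeType` check is there so that the loop also works on documents that contain white-space text nodes between the elements.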
Browser Differences in DOM Parsing:
All modern browsers support the W3C DOM specification, but there are some differences between them. One such difference is the way each browser handles white space and new lines.
DOM – White Spaces and New Lines:
New lines and other white-space characters are often present between nodes, especially when a document is edited in a simple editor such as Notepad.
ABC Unknown 2020 100.00
In the above example, a CR/LF (new line) is present between each line, and two spaces are present in front of each child node. The document was edited in Notepad. Internet Explorer 9 and earlier versions do not treat empty white space or new lines as text nodes, whereas other browsers do.
In the above example, the output is the number of child nodes that the root element of note.xml has. For the same code, IE10 and later versions, as well as other browsers, will output 9 child nodes (the 4 element nodes plus 5 white-space text nodes), but IE9 and earlier versions will output only 4 child nodes.
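The counts above can be reproduced outside a browser. The sketch below uses Python's `xml.dom.minidom`, which keeps white-space text nodes much as the non-IE9 browsers do. The tag names are assumptions, and the string literal stands in for the file on disk.

```python
from xml.dom.minidom import parseString

# Hypothetical document with a new line after each line and two spaces
# in front of each child element (tag names are assumptions).
xml_text = (
    "<book>\n"
    "  <title>ABC</title>\n"
    "  <author>Unknown</author>\n"
    "  <year>2020</year>\n"
    "  <price>100.00</price>\n"
    "</book>"
)

root = parseString(xml_text).documentElement
all_children = root.childNodes

# Split the children into element nodes and white-space-only text nodes.
element_children = [n for n in all_children if n.nodeType == n.ELEMENT_NODE]
whitespace_children = [
    n for n in all_children
    if n.nodeType == n.TEXT_NODE and n.nodeValue.strip() == ""
]

print(len(all_children), len(element_children), len(whitespace_children))
```

Filtering on `nodeType`, as done for `element_children`, is a common way to make the same code give the same result whether or not the parser keeps white-space text nodes.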
PCDATA – Parsed Character Data:
Text that will be parsed by the XML parser is called Parsed Character Data (PCDATA). Normally, all the text in an XML document is parsed by the parser. When an XML element is parsed, the text between its tags is parsed as well.
The reason is simple: XML elements can contain other elements.
Here, the <name> element contains two other elements, first and last. The parser therefore breaks it up into sub-elements.
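A small sketch of this behaviour, assuming a `<name>` element whose `first` and `last` children hold placeholder values (the values are invented for illustration), again using Python's `xml.dom.minidom`:

```python
from xml.dom.minidom import parseString

# Hypothetical content: the names "Jane" and "Doe" are placeholders.
xml_text = "<name><first>Jane</first><last>Doe</last></name>"

name_element = parseString(xml_text).documentElement

# Because the content of <name> is parsed, the markup inside it becomes
# child element nodes and is not kept as one flat string.
sub_elements = [
    (child.nodeName, child.firstChild.nodeValue)
    for child in name_element.childNodes
    if child.nodeType == child.ELEMENT_NODE
]

print(sub_elements)
```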
CDATA – (Unparsed) Character Data:
In the above example, the parser does not interpret anything inside the CDATA section as markup; the content is treated as literal character data.
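The effect of a CDATA section can be sketched as follows with Python's `xml.dom.minidom`. The `<script>` element and its content are assumptions chosen because they contain `<` and `&`, characters that would otherwise be read as markup.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

# Hypothetical content containing characters that are special in XML.
raw_code = "if (a < b && b < c) { return 1; }"

with_cdata = f"<script><![CDATA[{raw_code}]]></script>"
without_cdata = f"<script>{raw_code}</script>"


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        parseString(xml_text)
        return True
    except ExpatError:
        return False


element = parseString(with_cdata).documentElement

# Join the character data of the element's children; nothing inside the
# CDATA section has been turned into element nodes.
literal_text = "".join(
    child.data for child in element.childNodes
    if child.nodeType != child.ELEMENT_NODE
)

print(literal_text)
print(is_well_formed(without_cdata))
```

Without the CDATA wrapper, the same content is rejected by the parser, because the bare `<` and `&` are read as the start of a tag and of an entity reference.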