InfinityQuest - Programming Code Tutorials and Examples with Python, C++, Java, PHP, C#, JavaScript, Swift and more

Parsing Complex XML Documents in PHP

InfinityCoder December 6, 2016

You have a complex XML document, such as one where you need to introspect the document to determine its schema, or you need to use more esoteric XML features, such as processing instructions or comments.

Use the DOM extension. It provides a complete interface to all aspects of the XML specification:

// $node is the DOM parsed node <book cover="soft">PHP Cookbook</book>
$type = $node->nodeType;

switch ($type) {
case XML_ELEMENT_NODE:
    // I'm a tag. I have a tagName property.
    print $node->tagName; // prints the tagName property: "book"
    break;
case XML_ATTRIBUTE_NODE:
    // I'm an attribute. I have a name and a value property.
    print $node->name; // prints the name property: "cover"
    print $node->value; // prints the value property: "soft"
    break;
case XML_TEXT_NODE:
    // I'm a piece of text inside an element.
    // I have nodeName and nodeValue properties.
    print $node->nodeName; // prints the nodeName property: "#text"
    print $node->nodeValue; // prints the text content: "PHP Cookbook"
    break;
default:
    // another type
    break;
}
 
book

The W3C’s DOM provides a platform- and language-neutral method that specifies the structure and content of a document.

Using DOM, you can read an XML document into a tree of nodes and then maneuver through the tree to locate information about a particular element or elements that match your criteria.

This is called tree-based parsing.
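
As a minimal sketch of tree-based parsing, the following reads a small XML string into a tree and steps from the root down to a text node. The sample document is a hypothetical one made up for illustration; it is written without whitespace between tags so that firstChild lands on an element rather than on a whitespace text node.

```php
<?php
// Hypothetical sample document; any well-formed XML string would do.
$xml = '<book cover="soft"><title>PHP Cookbook</title></book>';

$dom = new DOMDocument();
$dom->loadXML($xml);              // parse the string into a tree of nodes

$root  = $dom->documentElement;   // the <book> element
$title = $root->firstChild;       // the <title> element
$text  = $title->firstChild;      // the text node inside <title>

print $root->nodeName . "\n";               // book
print $root->getAttribute('cover') . "\n";  // soft
print $title->nodeName . "\n";              // title
print $text->nodeValue . "\n";              // PHP Cookbook
```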

One of the major advantages of DOM is that by following the W3C’s specification, many languages implement DOM functions in a similar manner.

Therefore, the work of translating logic and instructions from one application to another is considerably simplified.

DOM is large and complex. For more information, read the specification or pick up a copy of XML in a Nutshell.

DOM functions in PHP are object oriented.

To move from one node to another, access properties such as $node->childNodes, which contains a DOMNodeList of node objects that you can iterate with foreach, and $node->parentNode, which contains the parent node object.
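
A short sketch of moving down and back up the tree with these two properties, assuming a hypothetical one-line document with no whitespace between its tags:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<book><title>PHP Cookbook</title>'
    . '<author>Sklar</author><author>Trachtenberg</author></book>'
);

$book = $dom->documentElement;

// childNodes is a DOMNodeList: iterate it with foreach, or use
// its length property and item() method for indexed access.
$names = [];
foreach ($book->childNodes as $child) {
    $names[] = $child->nodeName;
}
print implode(', ', $names) . "\n";   // title, author, author

// Walk back up from a child to its parent.
$first_author = $book->childNodes->item(1);
print $first_author->parentNode->nodeName . "\n";   // book
```

If the source XML is indented, childNodes also includes the whitespace-only text nodes between elements, so real code usually checks nodeType before using a child.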

Because a node can be of any type, check its type before processing it and then read the properties that type provides, as shown:

// $node is the DOM parsed node <book cover="soft">PHP Cookbook</book>
$type = $node->nodeType;

switch ($type) {
case XML_ELEMENT_NODE:
    // I'm a tag. I have a tagName property.
    print $node->tagName; // prints the tagName property: "book"
    break;
case XML_ATTRIBUTE_NODE:
    // I'm an attribute. I have a name and a value property.
    print $node->name; // prints the name property: "cover"
    print $node->value; // prints the value property: "soft"
    break;
case XML_TEXT_NODE:
    // I'm a piece of text inside an element.
    // I have nodeName and nodeValue properties.
    print $node->nodeName; // prints the nodeName property: "#text"
    print $node->nodeValue; // prints the text content: "PHP Cookbook"
    break;
default:
    // another type
    break;
}
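
The same type check extends to the more esoteric node types mentioned at the start, such as comments and processing instructions. The following is only a sketch: describe_node() is a hypothetical helper written for this example, not part of the DOM extension, and the sample document and its "render" processing instruction are made up.

```php
<?php
// Hypothetical helper: returns one indented line per node in the subtree.
function describe_node(DOMNode $node, int $depth = 0): array
{
    switch ($node->nodeType) {
    case XML_ELEMENT_NODE:
        $line = 'element: ' . $node->nodeName;
        break;
    case XML_TEXT_NODE:
        $line = 'text: ' . $node->nodeValue;
        break;
    case XML_COMMENT_NODE:
        $line = 'comment: ' . trim($node->nodeValue);
        break;
    case XML_PI_NODE:
        // processing instructions expose target and data properties
        $line = 'pi: ' . $node->target . ' ' . $node->data;
        break;
    default:
        $line = 'other: ' . $node->nodeName;
        break;
    }

    $lines = [str_repeat('  ', $depth) . $line];

    // Only descend when the node actually has children.
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            $lines = array_merge($lines, describe_node($child, $depth + 1));
        }
    }
    return $lines;
}

$dom = new DOMDocument();
$dom->loadXML(
    '<book><!-- first edition --><?render mode="compact"?>'
    . '<title>PHP Cookbook</title></book>'
);

$lines = describe_node($dom->documentElement);
print implode("\n", $lines) . "\n";
```

Attributes are not part of childNodes, so this walk never reaches the XML_ATTRIBUTE_NODE case; they are available separately through an element's attributes property.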

To automatically search through a DOM tree for specific elements, use getElementsByTagName(). Here’s how to do so with multiple book records:

<books>
     <book>
         <title>PHP Cookbook</title>
         <author>Sklar</author>
         <author>Trachtenberg</author>
         <subject>PHP</subject>
     </book>
     <book>
         <title>Perl Cookbook</title>
         <author>Christiansen</author>
         <author>Torkington</author>
         <subject>Perl</subject>
     </book>
</books>

And to find all authors:

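// load the book records shown above
// (assumption: they are saved in a file named books.xml)
$dom = new DOMDocument;
$dom->load('books.xml');

// find and print all authors
$authors = $dom->getElementsByTagName('author');

// loop through author elements
foreach ($authors as $author) {
    // childNodes holds the text nodes with the author values
    $text_nodes = $author->childNodes;

    foreach ($text_nodes as $text) {
        print $text->nodeValue . "\n";
    }
}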
 
Sklar
Trachtenberg
Christiansen
Torkington

The getElementsByTagName() method returns a DOMNodeList of element node objects, which you can loop over with foreach. By looping through each element’s children, you can get to the text node associated with that element.

From there, you can pull out the node values, which in this case are the names of the book authors, such as Sklar and Trachtenberg.
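
As a variation on the search above, getElementsByTagName() can also be called on an individual element to restrict the search to its descendants, and the textContent property returns an element's text without a second loop over its children. The sketch below groups authors by book title; the $authors_by_title array is a name chosen for this example, and the XML is a trimmed copy of the records above.

```php
<?php
$xml = <<<XML
<books>
    <book>
        <title>PHP Cookbook</title>
        <author>Sklar</author>
        <author>Trachtenberg</author>
    </book>
    <book>
        <title>Perl Cookbook</title>
        <author>Christiansen</author>
        <author>Torkington</author>
    </book>
</books>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$authors_by_title = [];
foreach ($dom->getElementsByTagName('book') as $book) {
    // Searching from $book only looks at that book's descendants.
    $title = $book->getElementsByTagName('title')->item(0)->textContent;
    $authors_by_title[$title] = [];

    foreach ($book->getElementsByTagName('author') as $author) {
        $authors_by_title[$title][] = $author->textContent;
    }
}

foreach ($authors_by_title as $title => $authors) {
    print $title . ': ' . implode(', ', $authors) . "\n";
}
// PHP Cookbook: Sklar, Trachtenberg
// Perl Cookbook: Christiansen, Torkington
```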
