<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Rob Rosenbaum's Development Blog &#187; XML</title>
	<atom:link href="http://robrosenbaum.com/tags/xml/feed/" rel="self" type="application/rss+xml" />
	<link>http://robrosenbaum.com</link>
	<description>PHP, Symfony, and Other Web Things</description>
	<lastBuildDate>Wed, 30 Jan 2008 02:38:55 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>DOMNodeList Gotchas</title>
		<link>http://robrosenbaum.com/php/domnodelist-gotchas/</link>
		<comments>http://robrosenbaum.com/php/domnodelist-gotchas/#comments</comments>
		<pubDate>Wed, 13 Jun 2007 18:05:16 +0000</pubDate>
		<dc:creator>Rob Rosenbaum</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://robrosenbaum.com/xml/domnodelist-gotchas/</guid>
		<description><![CDATA[Be wary: DOMNodeLists behave strangely in loops that alter them.]]></description>
			<content:encoded><![CDATA[<h3>An Undocumented &#034;Feature&#034;</h3>
<p>Suppose we write the following code, whose simple purpose is to go through an XML document and replace every &#034;foo&#034; element with an empty &#034;bar&#034; element: 
<pre>
$dom = new DOMDocument();
$dom-&gt;loadXML('
  &lt;root&gt;
  &lt;foo&gt;This&lt;/foo&gt;
  &lt;foo /&gt;
  &lt;foo /&gt;
  &lt;/root&gt;'
);

$document = $dom-&gt;documentElement;
$foos = $document-&gt;getElementsByTagName('foo');

for ($i = 0; $i &lt; $foos-&gt;length; $i++) {
  $bar = $dom-&gt;createElement('bar');
  $document-&gt;replaceChild($bar, $foos-&gt;item($i));
}
</pre>
We are quite surprised when the script outputs:<pre class="xml">
&lt;root&gt;&lt;bar/&gt;&lt;foo/&gt;&lt;bar/&gt;&lt;/root&gt;
</pre>
Why did it skip the middle element? Because a DOMNodeList is <strong>live</strong>: it is not a snapshot taken when getElementsByTagName() was called, but a view that always reflects the current state of the document. The W3C DOM specification describes node lists this way, though it is easy to overlook. That means that, when we replace the first &#034;foo&#034; node, the second &#034;foo&#034; node <em>becomes</em> the new first node, and the length of the node list is now 2, not 3. But since $i has been incremented to 1, the for loop skips the original second node entirely, operates on the original third, then exits normally because $i (now 2) is no longer less than the length (now 1).</p>
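<p>To watch the list change underneath us, here is a minimal sketch, assuming PHP 5 or later with the DOM extension enabled. The three-element document and the variable names $before and $after are made up for illustration. The length is read twice from the same $foos object, without calling getElementsByTagName() again:
<pre class="php">
$dom = new DOMDocument();
$dom-&gt;loadXML('&lt;root&gt;&lt;foo/&gt;&lt;foo/&gt;&lt;foo/&gt;&lt;/root&gt;');

$root = $dom-&gt;documentElement;
$foos = $root-&gt;getElementsByTagName('foo');

$before = $foos-&gt;length;  // 3: every "foo" is still in the document

// Replace only the first "foo"; $foos itself is never reassigned.
$root-&gt;replaceChild($dom-&gt;createElement('bar'), $foos-&gt;item(0));

$after = $foos-&gt;length;   // 2: the list no longer contains the replaced node

echo $before . ' -&gt; ' . $after;
</pre></p>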
<p>The solution to this problem is to save a reference to each node in an array, then loop over the array:
<pre class="php">
// Start from an empty array so an empty node list leaves nothing to loop over.
$nodes = array();
for ($i = 0; $i &lt; $foos-&gt;length; $i++) {
  $nodes[$i] = $foos-&gt;item($i);
}

for ($i = 0; $i &lt; count($nodes); $i++) {
  $bar = $dom-&gt;createElement('bar');
  $document-&gt;replaceChild($bar, $nodes[$i]);
}
</pre>
This code outputs what we intuitively expected from the original code:
<pre class="xml">
&lt;root&gt;&lt;bar/&gt;&lt;bar/&gt;&lt;bar/&gt;&lt;/root&gt;
</pre></p>
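<p>The same snapshot can likely be taken in a single call. DOMNodeList is traversable, so PHP&#039;s built-in iterator_to_array() should copy its nodes into a plain array. This is a sketch under that assumption rather than a replacement for the loops above. Note that a foreach written directly over $foos would still be reading from the changing list:
<pre class="php">
$dom = new DOMDocument();
$dom-&gt;loadXML('&lt;root&gt;&lt;foo&gt;This&lt;/foo&gt;&lt;foo/&gt;&lt;foo/&gt;&lt;/root&gt;');

$document = $dom-&gt;documentElement;
$foos = $document-&gt;getElementsByTagName('foo');

// Copy the nodes out of the list before changing anything.
$nodes = iterator_to_array($foos);

foreach ($nodes as $foo) {
  // $nodes is an ordinary array, so replacing a node cannot shift it.
  $document-&gt;replaceChild($dom-&gt;createElement('bar'), $foo);
}

echo $dom-&gt;saveXML($document);
</pre></p>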

<h3>Implementation: A DOMNodeIterator Class</h3>
<p>It&#039;s best to encapsulate this technique in a class. Here&#039;s a simple class that does the job:

<pre class="php">
class DOMNodeIterator implements Iterator
{
  // Snapshot of the nodes, taken when the iterator is constructed.
  protected $nodes = array();

  public function __construct(DOMNodeList $nodeList)
  {
    // Copy each node out of the list so that later changes to the
    // document cannot shift or shrink what we iterate over.
    for ($i = 0; $i &lt; $nodeList-&gt;length; $i++) {
      $this-&gt;nodes[$i] = $nodeList-&gt;item($i);
    }
  }

  public function current()
  {
    return current($this-&gt;nodes);
  }

  public function key()
  {
    return key($this-&gt;nodes);
  }

  public function next()
  {
    next($this-&gt;nodes);
  }

  public function rewind()
  {
    reset($this-&gt;nodes);
  }

  public function valid()
  {
    // key() returns null once the internal pointer has moved past the end.
    return key($this-&gt;nodes) !== null;
  }
}
</pre></p>
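<p>For comparison, here is a shorter variant that leans on PHP&#039;s built-in ArrayIterator instead of implementing each Iterator method by hand. The class name SnapshotNodeIterator is hypothetical, and the sketch assumes iterator_to_array() accepts a DOMNodeList:
<pre class="php">
class SnapshotNodeIterator extends ArrayIterator
{
  public function __construct(DOMNodeList $nodeList)
  {
    // Hand ArrayIterator a plain array copy of the nodes.
    parent::__construct(iterator_to_array($nodeList));
  }
}

$dom = new DOMDocument();
$dom-&gt;loadXML('&lt;root&gt;&lt;foo&gt;This&lt;/foo&gt;&lt;foo/&gt;&lt;foo/&gt;&lt;/root&gt;');

$document = $dom-&gt;documentElement;
$foos = $document-&gt;getElementsByTagName('foo');

$replaced = 0;
foreach (new SnapshotNodeIterator($foos) as $foo) {
  $document-&gt;replaceChild($dom-&gt;createElement('bar'), $foo);
  $replaced++;
}

echo $replaced . ' nodes replaced';
</pre>
Either class is used the same way: wrap the node list once, before the loop starts altering the document.</p>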

<h3>On the Other Hand, Orphan Nodes</h3>
<p>Our iterator has one drawback: if we remove a node in the list via removeChild(), it will still exist in the iterator, but it will no longer be attached to our document. Unfortunately, the only general way to check for this is to ascend the DOM tree each time we want to access a node, to make sure it is still a descendant of the root node. Rather than incur that overhead, we&#039;ll leave it to the developer to use the iterator with care. We can safeguard the replacement loop above by putting the call to replaceChild() inside a try block, so that a node that is no longer a child of $document is skipped while any other error is rethrown:
  
<pre class="php">
try {
  $document-&gt;replaceChild($bar, $foo);
} catch (DOMException $e) {
  // Compare the DOM error code rather than the message text.
  if ($e-&gt;getCode() !== DOM_NOT_FOUND_ERR) {
    throw $e;
  }
}
</pre></p>
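<p>For readers who would rather pay for the check, the tree walk described above might look like the following sketch. The function name isAttached() and the sample document are hypothetical. It climbs parentNode links until there are none left and then asks whether it has arrived at the document:
<pre class="php">
function isAttached(DOMNode $node, DOMDocument $doc)
{
  $current = $node;
  // Climb until we reach a node that has no parent.
  while ($current-&gt;parentNode !== null) {
    $current = $current-&gt;parentNode;
  }
  // An attached node climbs all the way up to the document itself.
  return $current-&gt;isSameNode($doc);
}

$dom = new DOMDocument();
$dom-&gt;loadXML('&lt;root&gt;&lt;foo&gt;&lt;baz/&gt;&lt;/foo&gt;&lt;foo/&gt;&lt;/root&gt;');

$root   = $dom-&gt;documentElement;
$first  = $root-&gt;firstChild;   // the first "foo"
$baz    = $first-&gt;firstChild;  // its "baz" child
$second = $root-&gt;lastChild;    // the second "foo"

$root-&gt;removeChild($first);

var_dump(isAttached($first, $dom));   // bool(false): removed directly
var_dump(isAttached($baz, $dom));     // bool(false): removed along with its parent
var_dump(isAttached($second, $dom));  // bool(true): still in the document
</pre>
The cost grows with the depth of each node, which is the overhead mentioned above.</p>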

<h3>An Issue with PHP, or with DOM?</h3>
<p>Stay tuned for my next blog post, entitled &#034;Why the DOM Sucks.&#034; Till next time&#8230;</p> 
]]></content:encoded>
			<wfw:commentRss>http://robrosenbaum.com/php/domnodelist-gotchas/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
