So it seems SimpleXML doesn't support CDATA... I bashed together this little regex function to sort out the CDATA before trying to parse XML with the likes of simplexml_load_file / simplexml_load_string. Hope it might help somebody and would be very interested to hear of better solutions. (Other than *not* using SimpleXML of course! ;)
It looks for any <![CDATA [Text and HTML etc in here]]> elements, htmlspecialchar()'s the encapsulated data and then strips the "<![CDATA [" and "]]>" tags out.
<?php
function simplexml_unCDATAise($xml) {
$new_xml = NULL;
preg_match_all("/\<\!\[CDATA \[(.*)\]\]\>/U", $xml, $args);
if (is_array($args)) {
if (isset($args[0]) && isset($args[1])) {
$new_xml = $xml;
for ($i=0; $i<count($args[0]); $i++) {
$old_text = $args[0][$i];
$new_text = htmlspecialchars($args[1][$i]);
$new_xml = str_replace($old_text, $new_text, $new_xml);
}
}
}
return $new_xml;
}
$xml = 'Your XML with CDATA...';
$xml = simplexml_unCDATAise($xml);
$xml_object = simplexml_load_string($xml);
?>