xml_parse
(PHP 3 >= 3.0.6, PHP 4, PHP 5)
xml_parse -- Start parsing an XML document
Description
int xml_parse ( resource parser, string data [, bool is_final] )
xml_parse() parses an XML document. The handlers for
the configured events are called as many times as necessary.
Parameters
- parser
A reference to the XML parser to use.
- data
Chunk of data to parse. A document may be parsed piece-wise by
calling xml_parse() several times with new data,
as long as the is_final parameter is set and
TRUE when the last data is parsed.
- is_final
If set and TRUE, data is the last piece of
data sent in this parse.
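As a minimal sketch of the parameters above (not taken from the manual itself): the hypothetical helper listElementNames() creates a parser, registers element handlers, and passes the whole document to xml_parse() in one call with is_final set to TRUE. The helper name and the sample document are assumptions made for illustration, and closures as handlers need PHP 5.3 or later.

```php
<?php
// Hypothetical helper (not part of the XML extension): parses a whole
// document in one call and returns the element names in start-tag order,
// or an error message string if the document is not well-formed.
function listElementNames($xml)
{
    $names = array();
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$names) {
            $names[] = $name; // called once per start tag
        },
        function ($parser, $name) {
            // nothing to do on end tags in this sketch
        }
    );
    // Third argument TRUE: this string is the last (and only) piece of data.
    if (!xml_parse($parser, $xml, true)) {
        return sprintf(
            "XML error: %s at line %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        );
    }
    return $names;
}

print_r(listElementNames('<order><label>Label data</label></order>'));
```

Names come back in upper case because case folding is enabled by default; a string return value means the parse failed.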
xml_parse
james @at@ mercstudio dot Com dot nospam
02-Apr-2006 12:10
Hi,
I've modified the code from bbellwfu at gmail dot com as shown below.
Features added:
- toXML (converts the array back to an XML string)
- renamed keys to follow the Macromedia Flash XML concept: children -> childrens, tagdata -> nodevalue, name -> nodename
- added a firstchild pointer to childrens[0] (if it exists)
Some findings that I would like to share:
- <![cdata[my value here]]> does not work in an attribute value
- the XML file must be HTML-entity based (if not using cdata)
- an XML line feed in node data seems to become a double line feed on Windows (still figuring out why)
- an XML line feed in an attribute value seems to be ignored...
Here's my code below :)
class u007xml
{
var $arrOutput = array();
var $resParser;
var $strXmlData;
function u007xml($tfile = "")
{
if(trim($tfile) != "") { $this->loadFile($tfile);}
}
function loadFile($tfile)
{
$this->thefile = $tfile;
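// file() keeps each line's trailing newline, so implode("\n", file(...)) would double every line feed; read the file in one piece instead
$tdata = file_get_contents($tfile);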
return $this->parse($tdata);
}
function parse($strInputXML)
{
$this->resParser = xml_parser_create ();
xml_set_object($this->resParser,$this);
xml_set_element_handler($this->resParser, "tagOpen", "tagClosed");
xml_set_character_data_handler($this->resParser, "tagData");
$this->strXmlData = xml_parse($this->resParser,$strInputXML, true); // whole document in one call, so is_final is TRUE
if(!$this->strXmlData) {
die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($this->resParser)),
xml_get_current_line_number($this->resParser)));
}
xml_parser_free($this->resParser);
return $this->arrOutput;
}
//called on each xml tree
function tagOpen($parser, $name, $attrs) {
$tag=array("nodename"=>$name,"attributes"=>$attrs);
array_push($this->arrOutput,$tag);
}
//called on data for xml
function tagData($parser, $tagData) {
if(trim($tagData) != '') { // a plain if(trim($tagData)) would also drop the string "0"
if(isset($this->arrOutput[count($this->arrOutput)-1]['nodevalue'])) {
$this->arrOutput[count($this->arrOutput)-1]['nodevalue'] .= $this->parseXMLValue($tagData);
}
else {
$this->arrOutput[count($this->arrOutput)-1]['nodevalue'] = $this->parseXMLValue($tagData);
}
}
}
//called when finished parsing
function tagClosed($parser, $name) {
$this->arrOutput[count($this->arrOutput)-2]['childrens'][] = $this->arrOutput[count($this->arrOutput)-1];
if(count ($this->arrOutput[count($this->arrOutput)-2]['childrens'] ) == 1)
{
$this->arrOutput[count($this->arrOutput)-2]['firstchild'] =& $this->arrOutput[count($this->arrOutput)-2]['childrens'][0];
}
array_pop($this->arrOutput);
}
function toArray()
{
//not used, we can call loadString or loadFile instead...
}
function parseXMLValue($tvalue)
{
$tvalue = htmlentities($tvalue);
return $tvalue;
}
function toXML($tob = null)
{
//return back xml
$result = "";
if( $tob == null)
{
$tob = $this->arrOutput;
}
if(!isset($tob))
{
echo "XML Array empty...";
return null;
}
for($c = 0; $c < count($tob); $c++)
{
$result .="<" . $tob[$c]["nodename"];
foreach ($tob[$c]["attributes"] as $key => $value)
{
$result .=" " . $key."=\"" . $this->parseXMLValue($value) . "\"";
}
$result .= ">";
//assign node value
if( isset($tob[$c]["nodevalue"]) )
{
$result .= $tob[$c]["nodevalue"];
}
//leaf nodes have no 'childrens' key, so test with empty() rather than count()
if( !empty($tob[$c]["childrens"]) )
{
//pass the array by value; call-time pass-by-reference (&$var) is not allowed in current PHP
$result .= "\r\n" . $this->toXML($tob[$c]["childrens"]);
}
$result .= "</" . $tob[$c]["nodename"] . ">\r\n";
}//end of each array...
return $result;
}
function displayXML()
{
print_r($this->arrOutput);
}
function getXML($tob = null)
{
return "<?xml version='1.0'?>\r\n" . $this->toXML($tob);
}
}//end of u007xml class
//examples below:
$xx = new u007xml();
$xx->loadFile("xml3.xml");
//$xx->displayXML();
print $xx->getXML();
ben at autonomic dot net
16-Mar-2006 05:12
bbellwfu's code does not handle 'text nodes' properly.
Consider the innards of a tag like <root>xxx<tag2/>yyy</root>
The 'tagData' for root will be "xxxyyy" and you have lost all information about where "tag2" was in that sequence.
Quick and dirty hack.
Replace tagData with this code :
function tagData($parser, $tagData) {
$last_element=count($this->arrOutput)-1;
$this->arrOutput[$last_element]['children'][] = array("textnode",$tagData);
}
What this does is add 'textnodes' as children of their containing parent, *in the right sequence* (rather like web browsers do it). This then lets you do some more sensible secondary work, like recursively looking up internal references within the document...
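A self-contained sketch of the same idea, for anyone who wants to try it outside the class: text is appended to the children of whichever element is currently open, so its position relative to child tags is kept. The function name parseMixed() and the '#document' sentinel are made up for this example, and closures need PHP 5.3 or later.

```php
<?php
// Hypothetical helper: returns the root element as a nested array whose
// 'children' list holds child elements and text nodes in document order.
function parseMixed($xml)
{
    // Sentinel entry at the bottom of the stack so the root has a parent.
    $stack = array(array('name' => '#document', 'children' => array()));
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$stack) {
            $stack[] = array('name' => $name, 'children' => array());
        },
        function ($parser, $name) use (&$stack) {
            // Move the finished element into its parent's children.
            $node = array_pop($stack);
            $stack[count($stack) - 1]['children'][] = $node;
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$stack) {
            // Text becomes a child of the element that is currently open.
            $stack[count($stack) - 1]['children'][] = array('textnode', $data);
        }
    );
    if (!xml_parse($parser, $xml, true)) {
        return null;
    }
    return $stack[0]['children'][0];
}

$root = parseMixed('<root>xxx<tag2/>yyy</root>');
print_r($root['children']);
```

Note that the parser may deliver long runs of text in several pieces, so a real tree builder would probably want to merge adjacent text nodes.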
Kyle Bresin
19-Jan-2006 03:15
Just wanted to note a small bug in bbellwfu's class (which is really great btw).
It fails to capture any data that is equal to numerical zero, because PHP treats the string "0" as FALSE.
The problem lies in the function tagData; the first if statement should be:
if(trim($tagData) != '') {
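To see the difference in isolation, here is a small sketch comparing the two tests. The helper names are invented for this example, and it uses the strict !== comparison, which behaves the same as != '' for these inputs.

```php
<?php
// Hypothetical helpers contrasting the two tests on character data.
function keptByLooseTest($tagData)
{
    // The string "0" converts to FALSE, so a tag containing 0 is dropped.
    return (bool) trim($tagData);
}

function keptByStrictTest($tagData)
{
    // Only empty or whitespace-only data is dropped.
    return trim($tagData) !== '';
}

foreach (array("0", " 0 ", "abc", "   ") as $sample) {
    printf(
        "%-7s loose=%s strict=%s\n",
        var_export($sample, true),
        var_export(keptByLooseTest($sample), true),
        var_export(keptByStrictTest($sample), true)
    );
}
```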
tim at alloutinteraction dot com
01-Oct-2005 07:29
I wanted to create a really simple XML parser, but I found the array management in xml_parse a bit daunting. So I flattened my XML and parsed it using string matching. It wouldn't be difficult to add XML depth (two or more levels) by modifying the parsedXML array.
<?php
$xmlRaw="<order>Order data</order><label>Label data</label><control>123</control>";
$xmlFieldNames=array("order", "label", "control");
$parsedXML=array();
foreach ($xmlFieldNames as $xmlField) {
    $open="<$xmlField>";
    $close="</$xmlField>";
    $start=strpos($xmlRaw,$open);
    $end=strpos($xmlRaw,$close);
    // only extract when both the opening and the closing tag are present
    if($start!==false && $end!==false){
        $start+=strlen($open);
        $parsedXML[$xmlField]=substr($xmlRaw,$start,$end-$start);
    }
}
print_r($parsedXML);
?>
Hope you find this useful (coded it while ill in bed with streaming cold, but felt much better afterwards!)
Tim (a lazy coder)
bbellwfu at gmail dot com
05-May-2005 02:51
Just improving a little bit on the code examples from tgrabietz and randlem below... everything in one pretty class, plus some checks in place so that the element data doesn't get split up (thanks to flobee on the xml_set_character_data_handler page).
<?php
class xml2Array {
var $arrOutput = array();
var $resParser;
var $strXmlData;
function parse($strInputXML) {
$this->resParser = xml_parser_create ();
xml_set_object($this->resParser,$this);
xml_set_element_handler($this->resParser, "tagOpen", "tagClosed");
xml_set_character_data_handler($this->resParser, "tagData");
$this->strXmlData = xml_parse($this->resParser,$strInputXML, true); // whole document in one call, so is_final is TRUE
if(!$this->strXmlData) {
die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($this->resParser)),
xml_get_current_line_number($this->resParser)));
}
xml_parser_free($this->resParser);
return $this->arrOutput;
}
function tagOpen($parser, $name, $attrs) {
$tag=array("name"=>$name,"attrs"=>$attrs);
array_push($this->arrOutput,$tag);
}
function tagData($parser, $tagData) {
if(trim($tagData)) {
if(isset($this->arrOutput[count($this->arrOutput)-1]['tagData'])) {
$this->arrOutput[count($this->arrOutput)-1]['tagData'] .= $tagData;
}
else {
$this->arrOutput[count($this->arrOutput)-1]['tagData'] = $tagData;
}
}
}
function tagClosed($parser, $name) {
$this->arrOutput[count($this->arrOutput)-2]['children'][] = $this->arrOutput[count($this->arrOutput)-1];
array_pop($this->arrOutput);
}
}
?>
Will output something like...
<snippet>
Array
(
[0] => Array
(
[name] => GETMESSAGESRESPONSE
[attrs] => Array
(
)
[children] => Array
(
[0] => Array
(
[name] => STATUS
[attrs] => Array
(
)
)
</snippet>
alex dot garcia at noos dot fr
14-Mar-2005 01:47
Here is the inverse function, which takes a parsed XML array as input and outputs an XML string.
Enjoy!
function getXmlFromArray($root){
    $xml = '';
    if(count($root) > 0){
        $curr_name = $root['name'];
        // 'attrs', 'children' and 'cdata' may be missing on some nodes
        $attribs = isset($root['attrs']) ? $root['attrs'] : array();
        $curr_childs = isset($root['children']) ? $root['children'] : array();
        $curr_data = isset($root['cdata']) ? $root['cdata'] : '';
        $xml .= '<'.$curr_name;
        if(count($attribs) > 0){
            $curr_attribs = array();
            foreach($attribs as $key => $value){
                // escape special characters so the attribute value stays well-formed
                $curr_attribs[] = $key.'="'.htmlspecialchars($value).'"';
            }
            $xml .= ' '.implode(' ', $curr_attribs);
        }
        if($curr_data != ''){
            $xml .= '><![CDATA['.$curr_data.']]></'.$curr_name.'>';
        } else {
            if(count($curr_childs) > 0){
                $xml .= '>';
                foreach($curr_childs as $child){
                    $xml .= getXmlFromArray($child);
                }
                $xml .= '</'.$curr_name.'>';
            } else {
                $xml .= '/>';
            }
        }
    }
    return $xml;
}
tgrabietz at bupnet dot de
22-Sep-2004 08:05
This is like randlem at gmail dot com's great code, but without a "class container" and with cdata parsing. The script returns the tree structure in a single array.
<?php
$file = 'simple.xml';
$stack = array();
function startTag($parser, $name, $attrs)
{
global $stack;
$tag=array("name"=>$name,"attrs"=>$attrs);
array_push($stack,$tag);
}
function cdata($parser, $cdata)
{
global $stack,$i;
if(trim($cdata))
{
$stack[count($stack)-1]['cdata']=$cdata;
}
}
function endTag($parser, $name)
{
global $stack;
$stack[count($stack)-2]['children'][] = $stack[count($stack)-1];
array_pop($stack);
}
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startTag", "endTag");
xml_set_character_data_handler($xml_parser, "cdata");
$data = xml_parse($xml_parser,file_get_contents($file),true); // whole file in one call, so is_final is TRUE
if(!$data) {
die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser)));
}
xml_parser_free($xml_parser);
print("<pre>\n");
print_r($stack);
print("</pre>\n");
?>
ByK
16-Sep-2004 08:12
Modified from your code. I think it works!
class CXml
{
var $xml_data;
var $obj_data;
var $pointer;
function CXml() { }
function Set_xml_data( &$xml_data )
{
$this->index = 0;
$this->pointer[] = &$this->obj_data;
//strip white space between tags
$this->xml_data = preg_replace('/>\s+</', '><', $xml_data); // eregi_replace() is not available in current PHP
$this->xml_parser = xml_parser_create( "UTF-8" );
xml_parser_set_option( $this->xml_parser, XML_OPTION_CASE_FOLDING, false );
xml_set_object( $this->xml_parser, $this ); // call-time pass-by-reference (&$this) is not allowed in current PHP
xml_set_element_handler( $this->xml_parser, "_startElement", "_endElement");
xml_set_character_data_handler( $this->xml_parser, "_cData" );
xml_parse( $this->xml_parser, $this->xml_data, true );
xml_parser_free( $this->xml_parser );
}
function _startElement( $parser, $tag, $attributeList )
{
foreach( $attributeList as $name => $value )
{
$value = $this->_cleanString( $value );
$object->$name = $value;
}
//replaces the special characters with the underscore (_) in tag name
$tag = preg_replace("/[:\-\. ]/", "_", $tag);
eval( "\$this->pointer[\$this->index]->" . $tag . "[] = \$object;" );
eval( "\$size = sizeof( \$this->pointer[\$this->index]->" . $tag . " );" );
eval( "\$this->pointer[] = &\$this->pointer[\$this->index]->" . $tag . "[\$size-1];" );
$this->index++;
}
function _endElement( $parser, $tag )
{
array_pop( $this->pointer );
$this->index--;
}
function _cData( $parser, $data )
{
if (empty($this->pointer[$this->index])) {
if (rtrim($data, "\n"))
$this->pointer[$this->index] = $data;
} else {
$this->pointer[$this->index] .= $data;
}
}
function _cleanString( $string )
{
return utf8_decode( trim( $string ) );
}
}
$m_xml = new CXml();
$xml_data = file_get_contents( $filename );
$m_xml->Set_XML_data( $xml_data );
$newsid = $m_xml->obj_data->root[0]->NewsID[0];
randlem at gmail dot com
15-Sep-2004 08:43
Here's a handy way to generate a tree that can be descended easily.
<?php
$file = 'xmltest.xml';
$tag_tree = array();
$stack = array();
class tag {
var $name;
var $attrs;
var $children;
function tag($name, $attrs, $children) {
$this->name = $name;
$this->attrs = $attrs;
$this->children = $children;
}
}
function startTag($parser, $name, $attrs) {
global $tag_tree, $stack;
$tag = new tag($name,$attrs,array()); // children starts as an empty array so that ->children[] can append to it
array_push($stack,$tag);
}
function endTag($parser, $name) {
global $stack;
$stack[count($stack)-2]->children[] = $stack[count($stack)-1];
array_pop($stack);
}
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startTag", "endTag");
$data = xml_parse($xml_parser,file_get_contents($file),true); // whole file in one call, so is_final is TRUE
if(!$data) {
die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser)));
}
xml_parser_free($xml_parser);
print("\n");
print_r($stack);
print("\n");
?>
talraith at withouthonor dot com
08-Jul-2004 04:40
I have written a module that contains a class for use with XML documents. The module is dual-purpose in that it will parse XML code into a native object tree structure and will generate XML code from the object tree structure.
The output produced from generating XML code is designed to be used by other applications and is not in human-readable form.
I would like to point out that the code does not use any eval() statements to create the tree.
I am posting my code for two purposes:
1) I am looking to refine it and make it more efficient
2) It may benefit someone that is looking for a module like this.
The code is too big to post in here, so I have uploaded it to a web site: http://www.withouthonor.com/obj_xml.html
An example of parsing an XML document:
<?php
$XML = '<root><section>This is my sample XML code</section></root>';
$xml = new xml_doc($XML);
$xml->parse();
$my_tag = $xml->getTag(0,$name,$attributes,$cdata,$children);
?>
It is then possible to loop through the children of the tag and process the data with your program. The last variable above ($children) contains a list of tag reference IDs. The object tree is created by assigning each tag a unique ID, starting with zero, and by using object references to relate parents and children.
If you wanted to create an XML document from scratch, the code would be similar to the following example:
<?php
$xml = new xml_doc();
$root_tag = $xml->createTag('root');
$xml->createTag('section',array(),'This is my sample XML code',$root_tag);
$my_output = $xml->generate();
print $my_output;
?>
The example above creates an XML document that is the same as the one used in my first example. Another option available would be to load the XML code as in the first example, change it through PHP, and then generate the code and output it.
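For readers who only want the general idea of an eval()-free, ID-based tree, here is a rough sketch. It is not the xml_doc module described above: the function buildTagTable() and the array keys are invented for this example, it links parents and children by ID rather than by object reference, and the closures need PHP 5.3 or later.

```php
<?php
// Hypothetical helper: returns a flat table of tags keyed by ID (starting
// at zero); each entry stores its parent ID and the IDs of its children.
function buildTagTable($xml)
{
    $tags = array(); // id => tag record
    $open = array(); // stack of IDs of the currently open tags
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$tags, &$open) {
            $id = count($tags);
            $parent = $open ? $open[count($open) - 1] : null;
            $tags[$id] = array(
                'name'       => $name,
                'attributes' => $attrs,
                'cdata'      => '',
                'parent'     => $parent,
                'children'   => array(),
            );
            if ($parent !== null) {
                $tags[$parent]['children'][] = $id;
            }
            $open[] = $id;
        },
        function ($parser, $name) use (&$open) {
            array_pop($open);
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$tags, &$open) {
            if ($open) {
                // Append, because text can arrive in several pieces.
                $tags[$open[count($open) - 1]]['cdata'] .= $data;
            }
        }
    );
    return xml_parse($parser, $xml, true) ? $tags : null;
}

$tags = buildTagTable('<root><section>This is my sample XML code</section></root>');
print_r($tags);
```

Walking the tree is then a matter of following the 'children' IDs from entry 0.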
michelek
26-Oct-2003 10:42
It's maybe not better, but methinks it's more straightforward.
--INPUT:
<?xml version="1.0" encoding="UTF-8"?>
<world>
  <country name="sweden">
    <city name="stockholm">
      <user>Adam</user>
      <user>Eva</user>
    </city>
    <city name="göteborg">
      <user>God</user>
    </city>
  </country>
  <country name="usa">
    <city name="new york">
      <user>Clinton</user>
      <user>Bush</user>
    </city>
  </country>
</world>
--CODE:
<?
$filename = "m.m.xml";
$xmlC = new XmlC();
$xml_data = file_get_contents( $filename );
$xmlC->Set_XML_data( $xml_data );
echo( "<pre>\n" );
print_r( $xmlC->obj_data );
echo( "</pre>\n" );
class XmlC
{
var $xml_data;
var $obj_data;
var $pointer;
function XmlC()
{
}
function Set_xml_data( &$xml_data )
{
$this->index = 0;
$this->pointer[] = &$this->obj_data;
$this->xml_data = $xml_data;
$this->xml_parser = xml_parser_create( "UTF-8" );
xml_parser_set_option( $this->xml_parser, XML_OPTION_CASE_FOLDING, false );
xml_set_object( $this->xml_parser, $this ); // call-time pass-by-reference (&$this) is not allowed in current PHP
xml_set_element_handler( $this->xml_parser, "_startElement", "_endElement");
xml_set_character_data_handler( $this->xml_parser, "_cData" );
xml_parse( $this->xml_parser, $this->xml_data, true );
xml_parser_free( $this->xml_parser );
}
function _startElement( $parser, $tag, $attributeList )
{
foreach( $attributeList as $name => $value )
{
$value = $this->_cleanString( $value );
$object->$name = $value;
}
eval( "\$this->pointer[\$this->index]->" . $tag . "[] = \$object;" );
eval( "\$size = sizeof( \$this->pointer[\$this->index]->" . $tag . " );" );
eval( "\$this->pointer[] = &\$this->pointer[\$this->index]->" . $tag . "[\$size-1];" );
$this->index++;
}
function _endElement( $parser, $tag )
{
array_pop( $this->pointer );
$this->index--;
}
function _cData( $parser, $data )
{
if( trim( $data ) )
{
$this->pointer[$this->index] = trim( $data );
}
}
function _cleanString( $string )
{
return utf8_decode( trim( $string ) );
}
}
?>
Adam Tylmad
11-Sep-2003 01:11
I've created a parser that returns an
object based on an XML document.
Example:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<country name="sweden">
<city name="stockholm">
<user>Adam</user>
<user>Eve</user>
</city>
<city name="göteborg">
<user>God</user>
</city>
</country>
<country name="usa">
<city name="new york">
<user>Clinton</user>
<user>Bush</user>
</city>
</country>
generates the following object structure:
[country] => Array
(
[0] => stdClass Object
(
[name] => sweden
[city] => Array
(
[0] => stdClass Object
(
[name] => stockholm
[user] => Array
(
[0] => Adam
[1] => Eve
)
)
[1] => stdClass Object
(
[name] => göteborg
[user] => God
)
)
)
[1] => stdClass Object
(
[name] => usa
[city] => stdClass Object
(
[name] => new york
[user] => Array
(
[0] => Clinton
[1] => Bush
)
)
)
)
Here is the code:
class XMLParser {
var $path;
var $result;
function XMLParser($encoding, $data) {
$this->path = "\$this->result";
$this->index = 0;
$xml_parser = xml_parser_create($encoding);
xml_set_object($xml_parser, $this); // call-time pass-by-reference (&$this) is not allowed in current PHP
xml_set_element_handler($xml_parser, 'startElement', 'endElement');
xml_set_character_data_handler($xml_parser, 'characterData');
xml_parse($xml_parser, $data, true);
xml_parser_free($xml_parser);
}
function startElement($parser, $tag, $attributeList) {
eval("\$vars = get_object_vars(".$this->path.");");
$this->path .= "->".$tag;
if ($vars and array_key_exists($tag, $vars)) {
eval("\$data = ".$this->path.";");
if (is_array($data)) {
$index = sizeof($data);
$this->path .= "[".$index."]";
} else if (is_object($data)) {
eval($this->path." = array(".$this->path.");");
$this->path .= "[1]";
}
}
eval($this->path." = null;");
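foreach($attributeList as $name => $value)
eval($this->path."->".$name. " = '".$this->cleanString($value)."';");
}
function endElement($parser, $tag) {
$this->path = substr($this->path, 0, strrpos($this->path, "->"));
}
function characterData($parser, $data) {
eval($this->path." = '".$this->cleanString($data)."';");
}
// cleanString() is called above but was not part of the posted listing.
// Assumed minimal version: trims the value and escapes backslashes and single
// quotes so it can sit inside the single-quoted literal that is built for eval().
function cleanString($string) {
return addcslashes(trim($string), "'\\");
}
}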
enjoy! And please make it better if you can ;-)
jacek <dot> prucia <at> 7bulls <dot> com
27-Aug-2002 09:03
Don't underestimate the is_final argument. If you ignore it (since it is optional), you can get strange results with XML that is not well-formed, like no output from xml_parse at all. Also, if you use feof($fp) as is_final, make sure you don't use fgets, because there's a caveat with how feof is evaluated there.
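A small sketch of piece-wise parsing with is_final set only on the last piece. To stay self-contained it splits a string with str_split() instead of reading a file; the helper name parseInChunks() and the chunk size of 8 bytes are assumptions for illustration, and closures need PHP 5.3 or later.

```php
<?php
// Hypothetical helper: feeds a document to xml_parse() in fixed-size chunks
// and returns the element names seen, or an error message string.
function parseInChunks($xml, $chunkSize = 8)
{
    $names = array();
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$names) {
            $names[] = $name;
        },
        function ($parser, $name) {
            // end tags are ignored in this sketch
        }
    );
    $chunks = str_split($xml, $chunkSize);
    $last = count($chunks) - 1;
    foreach ($chunks as $i => $chunk) {
        // is_final is TRUE only for the last chunk, so an incomplete
        // document is reported as an error instead of passing silently.
        if (!xml_parse($parser, $chunk, $i === $last)) {
            return "XML error: " . xml_error_string(xml_get_error_code($parser));
        }
    }
    return $names;
}

print_r(parseInChunks('<root><item/><item/></root>'));
var_dump(parseInChunks('<root><item>'));
```

When reading from a file, the same shape applies with fread() supplying the chunks; if it is hard to know which chunk is the last one, an extra call with an empty string and is_final set to TRUE after the loop is one way to finish the parse.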
06-Dec-2001 12:14
if you're using magic quotes by default, remember to turn them off for the XML parsing.