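You may want to know the line and column number at which a token begins (or ends). The tokenizer does not report columns (since PHP 5.2.2 array tokens carry their starting line as a third element, but single-character string tokens carry no position at all), so you have to track positions manually, like below: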
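<?php
// Advance $line and $col (both passed by reference) past the text $c.
function update_line_and_column_positions($c, &$line, &$col)
{
    $numNewLines = substr_count($c, "\n");
    if (1 <= $numNewLines) {
        $line += $numNewLines;
        $col = 1;
        // Only the text after the last newline counts toward the column.
        $c = substr($c, strrpos($c, "\n") + 1);
        if ($c === false) {
            // Older PHP versions return false when nothing follows the newline.
            $c = '';
        }
    }
    $col += strlen($c);
}
?>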
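Now use it while walking the token list, something like:
<?php
$tokens = token_get_all($source); // $source holds the PHP code to scan
$line = 1;
$col = 1;
foreach ($tokens as $token) {
    if (is_array($token)) {
        // Use a separate $id so $token is not overwritten while list() reads it.
        list ($id, $text) = $token;
    } else if (is_string($token)) {
        $text = $token;
    }
    // Here ($line, $col) is where the current token begins.
    update_line_and_column_positions($text, $line, $col);
    // Here ($line, $col) is the position just past the token's end.
}
?>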
Note that this assumes your desired coordinate system is 1-based (e.g., (1,1) is the upper left). Zero-based is left as an exercise for the reader.
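For anyone who wants a starting point for that exercise, here is a minimal sketch that handles both conventions and records where each token starts. The function name annotate_token_positions and its $base parameter are made up for illustration; they are not part of the tokenizer extension. Columns are counted in bytes, because strlen() is used.
<?php
// Hypothetical helper: returns one entry per token with the line and
// column at which that token starts.
// $base is the index given to the first line and column (1 or 0).
function annotate_token_positions($source, $base = 1)
{
    $line = $base;
    $col = $base;
    $out = array();
    foreach (token_get_all($source) as $token) {
        $text = is_array($token) ? $token[1] : $token;
        // Record the position before advancing past the token.
        $out[] = array('text' => $text, 'line' => $line, 'col' => $col);
        $newlines = substr_count($text, "\n");
        if ($newlines > 0) {
            $line += $newlines;
            // Restart the column count after the last newline in the token.
            $col = $base + strlen($text) - strrpos($text, "\n") - 1;
        } else {
            $col += strlen($text);
        }
    }
    return $out;
}

// Print zero-based start positions for a two-line snippet.
foreach (annotate_token_positions("<?php\n\$a = 1;\n", 0) as $t) {
    printf("%d:%d %s\n", $t['line'], $t['col'], json_encode($t['text']));
}
?>
With $base = 0 the $a token in this snippet starts at line 1, column 0; with the default $base = 1 it starts at line 2, column 1. Computing the column from strlen() and strrpos() directly also avoids the substr() return-value check.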