iconv
(PHP 4 >= 4.0.5, PHP 5)
iconv -- Converts the characters of a string to another encoding
Description
string iconv ( string in_charset, string out_charset, string str )
Converts the character encoding of the string str from the source encoding in_charset to the target encoding out_charset.
Returns the string in the new encoding, or FALSE on failure.
If you append //TRANSLIT to the out_charset parameter, transliteration is enabled.
This means that when a character does not exist in the target encoding, it is replaced by one or more similar characters.
If you append //IGNORE, characters that do not exist in the target encoding are dropped.
Otherwise, the string str is returned cut off at the first illegal character.
Example 1. iconv() example:
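<?php
// The literal (Russian for "Time to switch to Unicode.") must be saved as KOI8-U bytes in this file.
echo iconv("KOI8-U", "UTF-8", "Пора переходить на юникод.");
?>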
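A minimal sketch of the failure check and the two suffixes, assuming UTF-8 input and ISO-8859-1 as the target. The sample strings are made up for this illustration. What //TRANSLIT substitutes, and how //IGNORE reports dropped characters, depends on the iconv library PHP was built against, so no output is shown for those two calls.

```php
<?php
// "café" in UTF-8; every character exists in ISO-8859-1.
$utf8 = "caf\xC3\xA9";
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);
if ($latin1 === false) {
    echo "conversion failed\n";
} else {
    echo bin2hex($latin1), "\n"; // 636166e9
}

// The euro sign (UTF-8 bytes E2 82 AC) does not exist in ISO-8859-1.
$price = "100 \xE2\x82\xAC";

// @ silences the notice raised for characters that cannot be converted.
$translit = @iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $price); // approximated
$ignored  = @iconv('UTF-8', 'ISO-8859-1//IGNORE', $price);   // dropped

// Always compare with === false: an empty string is a valid result.
var_dump($translit === false, $ignored === false);
```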
iconv
sire at acc dot umu dot se
14-Dec-2005 01:17
If you get the error message "Notice: iconv(): Detected an illegal character in input string in file.php on line x", and your text or database is likely to contain text copied from Microsoft Word documents, the cause is very likely the evil 0x96 "long dash" character (the en dash in Windows-1252). By default, MS Word converts double hyphens into this character, which iconv() then rejects as illegal. The solution is either to convert 0x96 (dash) into the regular 0x2d (hyphen/minus), or to append the //TRANSLIT or //IGNORE parameter (see above).
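One way to apply the first suggestion, sketched under the assumption that the text really is Windows-1252 and is handled as raw bytes. The function name replace_word_dash() and the sample string are made up for this example. Naming the true source charset to iconv() is shown as an alternative.

```php
<?php
// Hypothetical helper: replace the Windows-1252 en dash byte (0x96)
// with a plain ASCII hyphen-minus (0x2D) before any conversion.
function replace_word_dash($text)
{
    return str_replace("\x96", "\x2D", $text);
}

$fromWord = "pages 10\x9620"; // byte string as pasted from Word

echo replace_word_dash($fromWord), "\n"; // pages 10-20

// Alternative: tell iconv() the real source charset, so that 0x96 is a
// legal character and becomes the UTF-8 en dash (bytes E2 80 93).
$utf8 = iconv('CP1252', 'UTF-8', $fromWord);
echo bin2hex($utf8), "\n";
```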
nilcolor at gmail dot coom
24-Nov-2005 03:29
I don't know whether this is a feature or not, but it works for me (PHP 5.0.4):
iconv('', 'UTF-8', $str)
I tested it by converting from windows-1251 (stored in the DB) to UTF-8 (which I use for web pages).
BTW, I convert each array I fetch from the DB with array_walk_recursive...
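A sketch of the array_walk_recursive() approach mentioned above, assuming PHP 5 or later and that the source charset is named explicitly rather than left empty. The callback name to_utf8_in_place() and the sample row are hypothetical.

```php
<?php
// Hypothetical callback: converts string leaves in place and leaves
// everything else (integers, NULL, ...) untouched.
function to_utf8_in_place(&$value, $key, $from_charset)
{
    if (is_string($value)) {
        $converted = iconv($from_charset, 'UTF-8', $value);
        if ($converted !== false) {
            $value = $converted; // keep the original on failure
        }
    }
}

// Sample row as it might come back from the DB, in windows-1251 bytes.
$row = array(
    'name' => "\xCF\xF0\xE8\xE2\xE5\xF2",   // "Привет"
    'tags' => array("\xE4\xE0", 42),        // "да" and a number
);

// The third argument is passed to the callback as $from_charset.
array_walk_recursive($row, 'to_utf8_in_place', 'windows-1251');

echo bin2hex($row['name']), "\n"; // d09fd180d0b8d0b2d0b5d182
```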
anyean at gmail dot com
30-May-2005 03:23
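<?php
// Decodes %uXXXX escapes and numeric HTML entities (&#xXXXX; and &#NNNN;) to UTF-8.
function unescape($str) {
    $str = rawurldecode($str);
    // Split into escapes, entities and single literal characters
    // (U = ungreedy, s = "." also matches newlines so they are not lost).
    preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/Us", $str, $r);
    $ar = $r[0];
    foreach ($ar as $k => $v) {
        // pack("H4") and pack("n") both produce big-endian bytes,
        // so the source charset is named as UCS-2BE explicitly.
        if (substr($v, 0, 2) == "%u")
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, -4)));
        elseif (substr($v, 0, 3) == "&#x")
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, 3, -1)));
        elseif (substr($v, 0, 2) == "&#")
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("n", substr($v, 2, -1)));
    }
    return join("", $ar);
}
?>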
zhawari at hotmail dot com
01-Feb-2005 03:27
Here is how to convert a hex string of UTF-8 bytes to a hex string of UCS-2 code units:
<?php
// Handles only 1- and 2-byte UTF-8 sequences (U+0000 to U+07FF).
function utf8toucs2($str)
{
    $ucs2 = "";
    for ($i = 0; $i < strlen($str); $i += 2)
    {
        $byte1 = hexdec(substr($str, $i, 2));
        if ($byte1 < 128)
            $code = $byte1;               // ASCII: a single byte
        else
        {
            $byte2 = hexdec(substr($str, $i + 2, 2));
            $code = ($byte1 - 192) * 64 + ($byte2 - 128);
            $i += 2;                      // skip the continuation byte
        }
        $ucs2 .= sprintf("%04x", $code);  // always four hex digits
    }
    return $ucs2;
}
echo strtoupper(utf8toucs2("D985D8B1D8AD"))."\n";
echo strtoupper(utf8toucs2("456725"))."\n";
?>
Input:
D985D8B1D8AD
Output:
06450631062D
Input:
456725
Output:
004500670025
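For comparison, a sketch that lets iconv() do the same hex-to-hex conversion, which also makes a handy cross-check of the inputs and outputs listed above. The function name hex_utf8_to_ucs2() is made up for this example, and the charset name UCS-2BE is assumed to be available in the local iconv library.

```php
<?php
// Hypothetical cross-check: hex string of UTF-8 bytes in,
// hex string of big-endian UCS-2 code units out.
function hex_utf8_to_ucs2($hex)
{
    $utf8 = pack('H*', $hex);                  // hex digits -> raw bytes
    $ucs2 = iconv('UTF-8', 'UCS-2BE', $utf8);  // big-endian, no byte-order mark
    if ($ucs2 === false) {
        return false;                          // invalid or unconvertible input
    }
    return strtoupper(bin2hex($ucs2));         // raw bytes -> hex digits
}

echo hex_utf8_to_ucs2("D985D8B1D8AD"), "\n"; // 06450631062D
echo hex_utf8_to_ucs2("456725"), "\n";       // 004500670025
```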
PHANTOm <phantom at nix dot co dot il>
27-Jan-2005 12:49
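Convert windows-1255 to UTF-8 with the following code (only the Hebrew letters 0xE0-0xFA are mapped; all other bytes pass through unchanged):
<?php
// A named callback is used because the /e modifier was removed in PHP 7.
function heb_byte_to_utf8($m) {
    return chr(215).chr(ord($m[1]) - 80);
}
$heb = 'put hebrew text here';
$utf = preg_replace_callback("/([\xE0-\xFA])/", 'heb_byte_to_utf8', $heb);
?>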
zhawari at hotmail dot com
18-Jan-2005 03:02
Here is how to convert a hex string of UCS-2 code units to a hex string of UTF-8 bytes:
<?php
// Handles only code points up to U+07FF (1- and 2-byte UTF-8 sequences).
function ucs2toutf8($str)
{
    $utf8 = "";
    for ($i = 0; $i < strlen($str); $i += 4)
    {
        $code = hexdec(substr($str, $i, 4));
        if ($code < 128)
            $utf8 .= sprintf("%02x", $code);   // ASCII: a single byte
        else
            // Lead byte carries the top 5 bits, continuation byte the low 6 bits.
            $utf8 .= dechex(192 + ($code >> 6)).dechex(128 + ($code & 63));
    }
    return $utf8;
}
echo strtoupper(ucs2toutf8("06450631062D"));
?>
Input:
06450631062D
Output:
D985D8B1D8AD
regards,
Ziyad
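The bit arithmetic behind this conversion can be sketched for a single code point. The helper below is hypothetical and goes one step further than the note by also covering three-byte sequences (up to U+FFFF); it does not treat surrogate code units specially.

```php
<?php
// Hypothetical helper: encode one code point (U+0000 to U+FFFF) as UTF-8
// and return the bytes as upper-case hex.
function codepoint_to_utf8_hex($cp)
{
    if ($cp < 0x80) {
        // 0xxxxxxx
        $bytes = array($cp);
    } elseif ($cp < 0x800) {
        // 110xxxxx 10xxxxxx
        $bytes = array(0xC0 | ($cp >> 6), 0x80 | ($cp & 0x3F));
    } else {
        // 1110xxxx 10xxxxxx 10xxxxxx
        $bytes = array(
            0xE0 | ($cp >> 12),
            0x80 | (($cp >> 6) & 0x3F),
            0x80 | ($cp & 0x3F),
        );
    }
    $hex = '';
    foreach ($bytes as $b) {
        $hex .= sprintf('%02X', $b);
    }
    return $hex;
}

echo codepoint_to_utf8_hex(0x0645), "\n"; // D985
echo codepoint_to_utf8_hex(0x0045), "\n"; // 45
echo codepoint_to_utf8_hex(0x20AC), "\n"; // E282AC
```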
SiMM
10-Dec-2004 11:15
<?php
// Converts a windows-1251 string to UTF-8 byte by byte.
function CP1251toUTF8($string){
    $out = '';
    for ($i = 0; $i < strlen($string); ++$i){
        $ch = ord($string[$i]);
        if ($ch < 0x80) {
            $out .= chr($ch);                        // ASCII is unchanged
        } elseif ($ch >= 0xF0) {
            $out .= "\xD1".chr(0x80 + $ch - 0xF0);   // U+0440 to U+044F
        } elseif ($ch >= 0xC0) {
            $out .= "\xD0".chr(0x90 + $ch - 0xC0);   // U+0410 to U+043F
        } else {
            // Bytes not listed here are dropped. The 0x8C-0x8F and 0x9C-0x9F entries are
            // the author's own mapping; standard windows-1251 assigns other letters there.
            switch($ch){
                case 0xA8: $out .= "\xD0\x81"; break; case 0xB8: $out .= "\xD1\x91"; break; case 0xA1: $out .= "\xD0\x8E"; break;
                case 0xA2: $out .= "\xD1\x9E"; break; case 0xAA: $out .= "\xD0\x84"; break; case 0xAF: $out .= "\xD0\x87"; break;
                case 0xB2: $out .= "\xD0\x86"; break; case 0xB3: $out .= "\xD1\x96"; break; case 0xBA: $out .= "\xD1\x94"; break;
                case 0xBF: $out .= "\xD1\x97"; break; case 0x8C: $out .= "\xD3\x90"; break; case 0x8D: $out .= "\xD3\x96"; break;
                case 0x8E: $out .= "\xD2\xAA"; break; case 0x8F: $out .= "\xD3\xB2"; break; case 0x9C: $out .= "\xD3\x91"; break;
                case 0x9D: $out .= "\xD3\x97"; break; case 0x9E: $out .= "\xD2\xAB"; break; case 0x9F: $out .= "\xD3\xB3"; break;
            }
        }
    }
    return $out;
}
?>
aissam at yahoo dot com
29-Nov-2004 08:20
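For those who have trouble displaying UCS-2 data in a browser, here's a simple function that converts UCS-2 to HTML Unicode entities:
<?php
// Input is assumed to be little-endian UCS-2, as the byte order used below implies.
// trim() is not applied: on binary data it can strip bytes that belong to valid characters.
function ucs2html($str) {
    $len = strlen($str);
    $html = '';
    for ($i = 0; $i + 1 < $len; $i += 2)
        // Low byte comes first, high byte second.
        $html .= '&#'.(ord($str[$i + 1]) * 256 + ord($str[$i])).';';
    return $html;
}
?>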
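If the data starts out as UTF-8 rather than UCS-2, the two steps can be chained. A rough sketch, assuming the charset name UCS-2LE is available in the local iconv library; the function name utf8_to_entities() is made up for this example.

```php
<?php
// Hypothetical helper: UTF-8 text -> decimal HTML entities, going through
// little-endian UCS-2 so that each character is one 16-bit unit.
function utf8_to_entities($utf8)
{
    $ucs2 = iconv('UTF-8', 'UCS-2LE', $utf8);
    if ($ucs2 === false) {
        return false; // invalid UTF-8 or a character outside UCS-2
    }
    $html = '';
    // "v*" reads every unsigned 16-bit little-endian integer in the string.
    foreach (unpack('v*', $ucs2) as $code) {
        $html .= '&#' . $code . ';';
    }
    return $html;
}

// "A" followed by CYRILLIC CAPITAL LETTER ZHE (U+0416 = 1046).
echo utf8_to_entities("A\xD0\x96"), "\n"; // &#65;&#1046;
```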
nikolai-dot-zujev-at-gmail-dot-com
18-Nov-2004 01:14
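Here is an example of how to convert a windows-1251 (Windows) or cp1251 (Linux/Unix) encoded string for display in a UTF-8 page. Note that the function emits HTML numeric character references (&#NNNN;) rather than raw UTF-8 bytes.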
<?php
function cp1251_utf8( $sInput )
{
$sOutput = "";
for ( $i = 0; $i < strlen( $sInput ); $i++ )
{
$iAscii = ord( $sInput[$i] );
if ( $iAscii >= 192 && $iAscii <= 255 )
$sOutput .= "&#".( 1040 + ( $iAscii - 192 ) ).";";
else if ( $iAscii == 168 )
$sOutput .= "&#".( 1025 ).";";
else if ( $iAscii == 184 )
$sOutput .= "&#".( 1105 ).";";
else
$sOutput .= $sInput[$i];
}
return $sOutput;
}
?>
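The offset arithmetic in this function (byte 192 maps to code point 1040, and so on up to 255) can be checked against iconv() itself. A small sketch, assuming the charset names CP1251 and UCS-2BE are available in the local iconv library; the variable names are arbitrary.

```php
<?php
// Expected code points according to the formula in the note,
// plus the two special cases it handles (168 and 184).
$expected = array(168 => 1025, 184 => 1105);
for ($byte = 192; $byte <= 255; $byte++) {
    $expected[$byte] = 1040 + ($byte - 192);
}

// Ask iconv() for the code point of each byte and collect disagreements.
$mismatches = array();
foreach ($expected as $byte => $codepoint) {
    $ucs2 = iconv('CP1251', 'UCS-2BE', chr($byte)); // one 16-bit code unit
    $parts = unpack('n', $ucs2);                    // big-endian 16-bit integer
    if ($parts[1] !== $codepoint) {
        $mismatches[] = $byte;
    }
}

echo count($mismatches), " mismatches\n"; // 0 mismatches
```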
vitek at 4rome dot ru
15-Nov-2004 11:53
On some systems the iconv() function may not exist. The reason is that the build defines a constant named `iconv` with the value `libiconv`, so PHP_FUNCTION(iconv) expands to PHP_FUNCTION(libiconv), and you have to call libiconv() instead of iconv().
I have seen this on FreeBSD, but I am sure that was a rather special build.
If you do not want to depend on this behaviour, add the following to your script:
<?php
if (!function_exists('iconv') && function_exists('libiconv')) {
function iconv($input_encoding, $output_encoding, $string) {
return libiconv($input_encoding, $output_encoding, $string);
}
}
?>
Thanks to tony2001 at phpclub.net for explaining this behaviour.
ng4rrjanbiah at rediffmail dot com
22-Jun-2004 08:10
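Here is code to convert ISO 8859-1 to UTF-8 and vice versa without using iconv.
<?php
// Named callbacks are used because the /e modifier was removed in PHP 7.
function latin1_byte_to_utf8($m) {
    return chr(0xC0 | ord($m[1]) >> 6).chr(0x80 | ord($m[1]) & 0x3F);
}
function utf8_pair_to_latin1($m) {
    return chr(ord($m[1]) << 6 & 0xC0 | ord($m[2]) & 0x3F);
}
$str_iso8859_1 = 'foo in ISO 8859-1';
// ISO 8859-1 to UTF-8: each byte 0x80-0xFF becomes a two-byte sequence.
$str_utf8 = preg_replace_callback("/([\x80-\xFF])/",
    'latin1_byte_to_utf8',
    $str_iso8859_1);
// UTF-8 to ISO 8859-1: only lead bytes 0xC2 and 0xC3 (U+0080 to U+00FF) are handled.
$str_iso8859_1 = preg_replace_callback("/([\xC2\xC3])([\x80-\xBF])/",
    'utf8_pair_to_latin1',
    $str_utf8);
?>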
HTH,
R. Rajesh Jeba Anbiah
Igu4n4 at example dot com
05-Jul-2003 09:03
Maybe I was a fool to write the charset name as ISO8859-1 instead of ISO-8859-1 (note the - after ISO), but it worked in PHP 4.3. When I ported the system back to 4.2.2, iconv() returned an empty string without any error message. So beware: in PHP 4.2.2, always use the ISO-88.... form of the charset name.
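Since accepted spellings differ between builds, it may be worth probing a charset name before relying on it. A minimal sketch; the helper name charset_is_supported() is made up, and the ASCII probe means it only suits ASCII-compatible charsets.

```php
<?php
// Hypothetical helper: true if iconv() accepts $charset as a source charset.
// Comparing against the probe itself also catches builds that return an
// empty string instead of FALSE for an unknown name.
function charset_is_supported($charset)
{
    // @ hides the warning raised for unknown charset names.
    return @iconv($charset, 'UTF-8', 'a') === 'a';
}

foreach (array('ISO-8859-1', 'ISO8859-1') as $name) {
    echo $name, ': ', charset_is_supported($name) ? 'accepted' : 'rejected', "\n";
}
```

Which of the two spellings is accepted depends on the iconv library in use, so the output is not shown here.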