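UTF-8 is the only one of these encodings where the BOM can be left out. UTF-16 and UTF-32 files effectively always need one, because without it a reader cannot tell little-endian from big-endian byte order.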
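To make the byte-order point concrete, here is a minimal sketch that guesses an encoding from the BOM at the start of a byte string. The function name `detect_bom` is my own for illustration and is not part of the original code; it only recognizes the five standard BOMs and returns null otherwise.

```php
<?php
// Hypothetical helper (not from the original post): guess the encoding from a leading BOM.
function detect_bom(string $bytes): ?string {
    // The 4-byte UTF-32 marks are checked first, because the UTF-32LE BOM
    // (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
    $boms = [
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFE\xFF"         => 'UTF-16BE',
        "\xFF\xFE"         => 'UTF-16LE',
    ];
    foreach ($boms as $bom => $name) {
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            return $name;
        }
    }
    return null; // no BOM: the encoding has to be known some other way
}
```

Reading the first four bytes of a file and passing them to this function is enough. Note that a UTF-16LE file whose first character is U+0000 would be reported as UTF-32LE, which is a known ambiguity of BOM sniffing.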
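// Convert a UTF-8 file to UTF-16LE one line at a time, writing a BOM first.
function utf8_to_utf16($utf8_filename, $utf16_filename) {
    $file = fopen($utf8_filename, "r");
    $write = fopen($utf16_filename, "w");
    if ($file && $write) {
        fwrite($write, chr(255) . chr(254)); // add the UTF-16LE BOM (FF FE)
        // Use !== so a line that is just "0" does not end the loop early.
        while (($buffer = fgets($file)) !== false) {
            // Lines end at "\n", so a multi-byte UTF-8 character is never split.
            fwrite($write, iconv('UTF-8', 'UTF-16LE', $buffer));
        }
    }
    if ($file) fclose($file);
    if ($write) fclose($write);
}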
// Convert a UTF-16LE file to UTF-8 by reading the whole file and converting it in one call.
function utf16_to_utf8($utf16_filename, $utf8_filename) {
    $buffer = mb_convert_encoding(file_get_contents($utf16_filename), 'UTF-8', 'UTF-16LE');
    file_put_contents($utf8_filename, $buffer);
}
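Why are the two conversions written differently?
Because converting UTF-8 to UTF-16 takes more memory, and converting the whole file in one call can easily run out of memory, so that direction is done line by line. UTF-16 to UTF-8 does not have this problem. UTF-8 to UTF-16 is also slower than the other direction: when I tested with a 12 MB file, 8-to-16 took 23 seconds while 16-to-8 took only 1 second...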
Also, the UTF-8 file produced this way has a BOM. If you want it gone, you will have to remove it yourself XD
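For anyone who does want to remove it, one possible approach is to drop the three BOM bytes (EF BB BF) from the converted string before writing it out. This is only a sketch; `strip_utf8_bom` is a name I made up and is not part of the functions above.

```php
<?php
// Hypothetical helper (not from the original post): drop a leading UTF-8 BOM if present.
function strip_utf8_bom(string $text): string {
    $bom = "\xEF\xBB\xBF";
    if (strncmp($text, $bom, strlen($bom)) === 0) {
        return substr($text, strlen($bom));
    }
    return $text; // no BOM at the start, return the input unchanged
}
```

Inside utf16_to_utf8 it would wrap the conversion result, for example `$buffer = strip_utf8_bom($buffer);` just before the write. Only a BOM at the very start is removed; the rest of the text is left alone.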