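I recently came across some very old code, written back when GB2312 was everywhere, so nearly every file in it is GBK-encoded. Opened today, it shows up as a pile of garbled characters, which is unpleasant to read, so I wrote a quick makeshift script.
It combines shell and PHP.
The shell part really just uses find: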
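# $1: how deep find may descend (default: 1)
if [ $# -lt 1 ]
then
    depth=1
else
    depth=$1
fi
# $2: file name pattern (default: every name)
if [ $# -lt 2 ]
then
    name="*"
else
    name="$2"
fi
echo "Begin running:"
# Run the PHP converter once for each matching path
find . -maxdepth "$depth" -name "$name" -exec php /Users/qixingyue/sina/bin/cutf_8.php {} \;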
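Before rewriting files in place, it may help to preview which paths the find call above would hand to the converter. The following is a minimal dry-run sketch, not part of the original script: `preview_targets` is a hypothetical helper name, and the `-type f` filter is an addition (it leaves out directories and symlinks, which the PHP side skips anyway).

```shell
# preview_targets [DEPTH] [PATTERN]
# Hypothetical helper: prints the regular files that would be converted,
# using the same depth/pattern defaults as the script, without touching them.
preview_targets() {
    depth=${1:-1}      # default depth: current directory only
    name=${2:-*}       # default pattern: every file name
    find . -maxdepth "$depth" -type f -name "$name" -print
}
```

For example, `preview_targets 2 '*.php'` lists the PHP files up to two levels below the current directory.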
The core PHP code:
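<?php
// Expects exactly one argument: a path as printed by find, e.g. "./a/b.php"
if ($argc != 2) exit();
$file_name = $argv[1];
// Directories are skipped; only files are converted
if (is_dir($file_name)) return;
// Build an absolute path. ltrim() strips only the leading ".";
// trim() would also eat trailing dots from the file name
$file_name = getcwd() . ltrim($file_name, ".");
if (is_link($file_name)) {
    echo "Link file: $file_name \n";
    return;
}
if (!file_exists($file_name)) {
    echo "No such file: $file_name \n";
    return;
}
if (!is_writable($file_name)) {
    echo "Cannot write $file_name \n";
    return;
}
$str = file_get_contents($file_name);
// UTF-8 is listed before GBK because many UTF-8 byte sequences are also valid GBK
$from_types = array("ASCII", "UTF-8", "GBK", "GB2312", "Unicode");
// The third argument turns on strict detection
$this_type = mb_detect_encoding($str, $from_types, true);
if ($this_type == "UTF-8") {
    echo "Already UTF-8 $file_name \n";
    return;
}
// Convert only when an encoding was detected; otherwise leave the file alone
if ($this_type) {
    echo "$file_name $this_type ========> UTF-8 \n";
    $str = mb_convert_encoding($str, "UTF-8", $this_type);
    file_put_contents($file_name, $str);
}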
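After a run, it can be useful to check that nothing was left behind. The sketch below is one possible approach and not part of the original script: it assumes the `iconv` command-line tool is installed, and it treats a file as UTF-8 when `iconv` can decode it as UTF-8 without errors. `is_utf8` and `report_non_utf8` are hypothetical names.

```shell
# is_utf8 FILE
# Succeeds when FILE decodes cleanly as UTF-8 (assumes iconv is installed).
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# report_non_utf8 [DEPTH] [PATTERN]
# Lists regular files under the current directory that fail the UTF-8 check.
report_non_utf8() {
    find . -maxdepth "${1:-1}" -type f -name "${2:-*}" -print |
    while IFS= read -r f; do
        is_utf8 "$f" || printf 'Not UTF-8: %s\n' "$f"
    done
}
```

Pure ASCII files pass the check as well, since ASCII is a subset of UTF-8. Binary files will usually be reported, so a narrow pattern such as `'*.php'` keeps the output readable.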