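Recently, while calling someone else's API, I ran into a response that json_decode could not parse.

The cause turned out to be invisible (non-printable) characters in the payload.

I found a discussion about this on Stack Overflow: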

http://stackoverflow.com/questions/1176904/php-how-to-remove-all-non-printable-characters-in-a-string

 

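Most of the answers only consider English text: they strip every non-ASCII character, so Chinese gets filtered out as well.

One answer further down is written to work with all languages, so I am recording its function here: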

function clean_string($string) {
  $s = trim($string);
  $s = iconv("UTF-8", "UTF-8//IGNORE", $s); // drop all non utf-8 characters

  // this is some bad utf-8 byte sequence that makes mysql complain - control and formatting i think
  $s = preg_replace('/(?>[\x00-\x1F]|\xC2[\x80-\x9F]|\xE2[\x80-\x8F]{2}|\xE2\x80[\xA4-\xA8]|\xE2\x81[\x9F-\xAF])/', ' ', $s);

  $s = preg_replace('/\s+/', ' ', $s); // reduce all multiple whitespace to a single space

  return $s;
}
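
As a rough sketch of how this fits the json_decode problem described at the top: the snippet below decodes a made-up payload that contains a raw 0x01 control byte. The helper name decode_dirty_json and the sample payload are my own assumptions for illustration; they are not from the Stack Overflow answer. The guarded copy of clean_string is there only so the snippet runs on its own.

```php
<?php
// Copy of clean_string() from above, guarded so it is not declared twice.
if (!function_exists('clean_string')) {
    function clean_string($string) {
        $s = trim($string);
        $s = iconv("UTF-8", "UTF-8//IGNORE", $s);
        $s = preg_replace('/(?>[\x00-\x1F]|\xC2[\x80-\x9F]|\xE2[\x80-\x8F]{2}|\xE2\x80[\xA4-\xA8]|\xE2\x81[\x9F-\xAF])/', ' ', $s);
        $s = preg_replace('/\s+/', ' ', $s);
        return $s;
    }
}

// Hypothetical helper: try a plain decode first and clean only if that fails,
// so payloads that are already valid are left untouched.
function decode_dirty_json($raw) {
    $data = json_decode($raw, true);
    if (json_last_error() === JSON_ERROR_NONE) {
        return $data;
    }
    $data = json_decode(clean_string($raw), true);
    return json_last_error() === JSON_ERROR_NONE ? $data : null;
}

// Made-up payload: a raw 0x01 control byte sits inside a string value.
$raw = "{\"name\":\"中文\x01测试\"}";

var_dump(json_decode($raw, true)); // NULL, the control byte is rejected
$decoded = decode_dirty_json($raw);
var_dump($decoded['name']);        // Chinese text survives, 0x01 becomes a space
```

Note that the cleaning step changes string values: control characters become spaces and runs of whitespace collapse into one space. That is why this sketch only falls back to it when the plain decode fails.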

 

Notice: this article is an original work by 舞林cuzn (www.wulinlw.org). Please keep this copyright notice when reposting.
