I have a JSON string containing many Unicode escape sequences, and I am looking for a way to convert them to literal UTF-8 characters using PHP. The JSON string has values like:

{
"capital":"Bras\u00edlia",
"symbol":"\u20a1"
}

and then other values like:

{
"native": "اليَمَن",
"symbol_native": "ر.ي.‏"
}

The JSON string is contained inside a PHP variable that looks like this:

$countries ='{  
   "AR":{  
      "name":"Argentina",
      "native":"Argentina",
      "phone":"54",
      "continent":"SA",
      "capital":"Buenos Aires",
      "currency":{  
         "symbol":"AR$",
         "name":"Argentine Peso",
         "symbol_native":"$",
         "decimal_digits":2,
         "rounding":0,
         "code":"ARS",
         "name_plural":"Argentine pesos",
         "vat":"21",
         "vat_name":"IVA"
      },
      "tin":"CUIT",
      "languages":"es,gn",
      "iso":"ARG"
   }
}';

I have already tried most of the solutions I could find around the web and on SO, but none of them worked. I tried the following without success:

utf8_encode()
mb_convert_encoding()
iconv()
header('charset=utf-8');

The only way I found to successfully transform the Unicode escape sequences into UTF-8 was str_replace() with one array of escape sequences and another array of their equivalent UTF-8 characters. However, the arrays I have won't cover all possible combinations, so I was wondering if there is an easier way to do it.

This one works with the characters in the array:

function unicodeToutf8($str){
    $repl = ['\u00e1','\u00e9','\u00ed','\u00f3','\u00fa','\u00f1','\u00c1','\u00c9','\u00cd','\u00d3','\u00da','\u00d1'];
    $with = ['á','é','í','ó','ú','ñ','Á','É','Í','Ó','Ú','Ñ'];
    return str_replace($repl,$with,$str);
}

Thank you!

  • UTF-8 is also Unicode. What you're looking at is a Unicode escape sequence. When you json_decode something with PHP, this unescaping should already happen... are you using json_decode? Commented Apr 28, 2020 at 19:09
  • What is your actual question? Both of your examples are perfectly valid JSON containing UTF-8. One is escaped and the other isn't. Additionally, the code you've quoted is not something a reasonable person should ever use. Commented Apr 28, 2020 at 19:28
  • @Evert thanks for the tip, but the JSON string is contained inside a variable (I updated my question). I tried json_decode and then json_encode again, but it didn't work. Commented Apr 28, 2020 at 19:29
  • @Sammitch yes, the idea is to unescape those characters, from \u00e1 to á. Commented Apr 28, 2020 at 19:35
  • You don't have to. When you decode the JSON on the other side they will be unescaped as part of the process. This is to protect you against encoding problems in transit, and turning it off has virtually no benefit. Commented Apr 28, 2020 at 20:14

1 Answer

If you're just trying to re-encode the JSON without the Unicode escape sequences, this is how it's done:

json_encode(json_decode($input), JSON_UNESCAPED_UNICODE);
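As a minimal sketch of the one-liner above with an error check added: the sample `$input` here is a hypothetical string built from the values in the question, and the variable names `$data` and `$output` are only for illustration. It also shows that json_decode() on its own already yields UTF-8 strings, so re-encoding is only needed if you want the JSON text itself to contain the literal characters.

```php
<?php
// Hypothetical sample input, based on the values quoted in the question.
// Single quotes keep the \uXXXX sequences as literal text for the JSON parser.
$input = '{"capital":"Bras\u00edlia","symbol":"\u20a1"}';

// json_decode() converts \uXXXX escapes into UTF-8 strings by itself.
$data = json_decode($input);
if (json_last_error() !== JSON_ERROR_NONE) {
    // Malformed JSON (for example an unbalanced brace) makes json_decode() return null.
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

// Re-encode, telling json_encode() not to escape non-ASCII characters.
$output = json_encode($data, JSON_UNESCAPED_UNICODE);

echo $data->capital, PHP_EOL; // Brasília
echo $output, PHP_EOL;        // {"capital":"Brasília","symbol":"₡"}
```

If json_decode() returns null for your real data, checking json_last_error_msg() as above is a quick way to find out whether the JSON itself is malformed rather than the encoding being at fault.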