I have a string like this: "Xin chào tất cả mọi người". It contains some Unicode characters. I want to write a function (in JS) that checks whether the string contains at least one Unicode character.
JavaScript strings do not "contain UTF-8" characters. They contain Unicode code points (one code unit per character for code points in the BMP; the internal UTF-16/UCS-2 encoding is an entirely different can of worms). So what is a "UTF-8 character"? Do you mean a Unicode character outside the ASCII range? (user2864740, Feb 12, 2014 at 4:37)
2 Answers
A string is a series of characters, each of which has a character code. ASCII defines characters from 0 to 127, so if a character in the string has a code greater than that, it is a non-ASCII Unicode character. This function checks for that. See String#charCodeAt.
function hasUnicode (str) {
    for (var i = 0; i < str.length; i++) {
        // charCodeAt returns the UTF-16 code unit at index i; ASCII ends at 127
        if (str.charCodeAt(i) > 127) return true;
    }
    return false;
}
Then use it like this: hasUnicode("Xin chào tất cả mọi người")
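If you also want to know which characters triggered the result, here is a minimal sketch along the same lines. The helper name findNonAscii is made up for illustration and is not part of the answer above. It iterates with for...of, which walks the string by code point rather than by UTF-16 code unit, so a character outside the BMP is reported once instead of as two surrogate halves.

```javascript
// Hypothetical helper (an illustration, not from the original answer):
// collects every non-ASCII character in a string with its code point.
function findNonAscii(str) {
  const found = [];
  // for...of iterates by code point, so surrogate pairs stay together.
  for (const ch of str) {
    const codePoint = ch.codePointAt(0);
    if (codePoint > 127) {
      found.push({ char: ch, codePoint: codePoint });
    }
  }
  return found;
}

// "\u00e0" is the precomposed letter à.
console.log(findNonAscii("Xin ch\u00e0o")); // one entry: à, code point 224
console.log(findNonAscii("hello").length);  // 0
```

An empty result means the string is pure ASCII, so `findNonAscii(s).length > 0` gives the same answer as the boolean check.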
Here's a different approach using a regular expression:
function hasUnicode(s) {
    return /[^\u0000-\u007f]/.test(s);
}
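As a quick sanity check, the same pattern can be run against a few inputs. This is only a sketch: the function is renamed hasNonAscii here for illustration (an assumed name, not the answer's), and the sample strings other than the one from the question are invented.

```javascript
// Same character class as the regex answer: matches anything outside
// U+0000..U+007F. The name hasNonAscii is illustrative only.
function hasNonAscii(s) {
  return /[^\u0000-\u007f]/.test(s);
}

console.log(hasNonAscii("Xin chào tất cả mọi người")); // true
console.log(hasNonAscii("Hello, world!"));             // false
console.log(hasNonAscii(""));                          // false
```

The negated class treats every code unit above 0x7F as a match, so accented Latin letters, combining marks, and surrogate halves of astral characters all make the test return true.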
1 Comment
StefansArya
The performance will be different, and your code doesn't detect the à. Your regex should be /[^\u0000-\u007f]/ :)