10

I have a multilingual website (Chinese and English).

I would like to validate a text field (a name field) in JavaScript. I have the following code so far.

var chkName = /^[characters]{1,20}$/;

if( chkName.test("[name value goes here]") ){
  alert("validated");
}

The problem is that /^[characters]{1,20}$/ only matches English characters. Is it possible to match ANY (including Unicode) characters? I used to use the following regex, but I don't want to allow spaces between characters.

/^(.+){1,20}$/
4
  • 2
    What do you intend to do if a Korean, Japanese, Vietnamese, or Klingon name is provided? Commented Jun 16, 2011 at 19:27
  • What rules do you have? 1-20 characters, no spaces. Anything else? Commented Jun 16, 2011 at 19:29
  • @Russell Borogove // That is my concern as well. I want to validate all Unicode characters as well as English. Commented Jun 16, 2011 at 19:31
  • @roberkules // for now, I want to allow only characters without spaces. Commented Jun 16, 2011 at 19:32

5 Answers

29

You might check out Javascript + Unicode regexes and do some research to find exactly which ranges of characters you want to allow:

See What's the complete range for Chinese characters in Unicode?

After reading those two and a little extra research you should be able to find appropriate values to complete something like: /^[-'a-z\u4e00-\u9eff]{1,20}$/i
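
As a rough sketch of how such a pattern might be used as a validator: the names `NAME_RANGE_RE` and `isValidNameByRange` are hypothetical, and the upper bound is widened to `\u9fff` on the assumption that the whole CJK Unified Ideographs block (U+4E00 to U+9FFF) should be accepted. Rarer ideographs in the extension blocks are not covered.

```javascript
// Sketch only: accepts ASCII letters, hyphen, apostrophe, and characters in the
// CJK Unified Ideographs block (U+4E00-U+9FFF). Extension blocks are not covered.
const NAME_RANGE_RE = /^[-'a-z\u4e00-\u9fff]{1,20}$/i;

// Hypothetical helper: true when the whole string is 1-20 allowed characters.
function isValidNameByRange(name) {
  return NAME_RANGE_RE.test(name);
}

console.log(isValidNameByRange("李小龍"));      // true
console.log(isValidNameByRange("O'Brien"));     // true
console.log(isValidNameByRange("Li Xiaolong")); // false (contains a space)
```

The character 龍 is U+9F8D, so it only passes because the range ends at `\u9fff`; with an upper bound of `\u9eff` it would be rejected.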

1 Comment

If, for example, German (äüöß), French (é...), or Spanish (ñ...) characters should be supported, the regex would need to be extended.
3

Take a look at Regex Unicode blocks.

You can use this to take care of CJK names.

2

As of 2018, there is new syntax in JavaScript to match Chinese or any other non-ASCII scripts:

const REGEX = /(\p{Script=Hani})+/gu; // note the 'u'
'你好'.match(REGEX);
// ["你好"]

The trick is to use \p with the right script name, together with the u flag; Hani stands for the Han script (Chinese). The full list of scripts is here: http://unicode.org/Public/UNIDATA/PropertyValueAliases.txt

To match both Chinese and English you just expand it a bit, for example:

const REGEX = /([A-Za-z]|\p{Script=Hani})+/gu;
// does not match accented letters though
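
For the validation in the question (1 to 20 characters, no spaces), the same property escapes can be anchored with `^` and `$`. The following is a minimal sketch, not a definitive rule set: the names `HAN_OR_ASCII_NAME`, `ANY_LETTER_NAME`, and `isValidName` are made up for illustration, and using `\p{L}` (any Unicode letter) to cover accented letters and other scripts is an assumption about what should count as a name character. The `g` flag is left out because a global regex keeps state in `lastIndex` between `test` calls.

```javascript
// Anchored variants of the pattern above; the 'u' flag is required for \p{...}.
const HAN_OR_ASCII_NAME = /^[A-Za-z\p{Script=Han}]{1,20}$/u;
// Assumption: any Unicode letter is acceptable (accented Latin, Hangul, kana, ...).
const ANY_LETTER_NAME = /^\p{L}{1,20}$/u;

// Hypothetical helper: checks a name against the chosen pattern.
function isValidName(name, pattern = ANY_LETTER_NAME) {
  return pattern.test(name);
}

console.log(isValidName("你好", HAN_OR_ASCII_NAME));      // true
console.log(isValidName("Jos\u00e9", HAN_OR_ASCII_NAME)); // false (é is neither ASCII nor Han)
console.log(isValidName("Jos\u00e9"));                    // true
console.log(isValidName("John Smith"));                   // false (a space is not a letter)
```

With the `u` flag, the `{1,20}` quantifier counts code points, so a rare ideograph outside the Basic Multilingual Plane still counts as one character.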

3 Comments

It is Han, not Hani
Looks like both work, tried in Chrome. Sorry about saying it's wrong without verifying. The only difference is, "Hani" is the "code name", "Han" is the real name of the language. Like "Grek" vs "Greek". I'm Chinese, apparently my brain told me it should be "Han" not "Hani", they forced all the code names into 4 chars. Shrug.
It works! By far, the simplest solution I found, thanks a lot!
0

I have done some work on validating Chinese names using XRegExp. The core code is:

XRegExp("^((?![\\p{InKangxi_Radicals}\\p{InCJK_Radicals_Supplement}\\p{InCJK_Symbols_and_Punctuation}])\\p{Han}){2,4}$","u")

See jsfiddle.net/coas/4djhso1y

-1
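// Matches any whitespace character anywhere in the string.
var chkName = /\s/;

// Writes whether the given name is free of whitespace.
function check(name) {
    document.write("<br />" + name + " is ");

    if (!chkName.test(name)) {
        document.write("okay");
    } else {
        document.write("invalid");
    }
}

check("namevaluegoeshere");    // okay
check("name value goes here"); // invalid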

This way you just check whether the name contains any whitespace.
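
Note that the whitespace test alone does not enforce the 1 to 20 character limit from the question. A minimal sketch that combines both checks, assuming length should be counted in Unicode code points rather than UTF-16 code units (the function name `hasValidNameShape` is hypothetical):

```javascript
// Hypothetical helper: 1-20 code points and no whitespace anywhere.
function hasValidNameShape(name) {
  // Array.from iterates by code point, so an astral character counts as one.
  const length = Array.from(name).length;
  return length >= 1 && length <= 20 && !/\s/.test(name);
}

console.log(hasValidNameShape("namevaluegoeshere"));    // true
console.log(hasValidNameShape("name value goes here")); // false
console.log(hasValidNameShape(""));                     // false
```

This accepts any non-whitespace character, including digits and punctuation, so it is looser than a letter-based pattern.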

demo @ http://jsfiddle.net/roberkules/U3q5W/
