Regular expression containing Unicode words

I'd like to match all strings containing a certain word, like:

String regex = "(?:\\P{L}|\\W|^)(ベスパ)(?:\\b|$)";

However, the Pattern class fails to compile it:

java.util.regex.PatternSyntaxException: Unmatched closing ')' near index 39
(?:\P{L}|\W|^)((?:ベス|ベス|ヘズ)(?:パ)|パ)|ハ)゚)(?:\b|$)

I already pass Pattern.UNICODE_CASE as a compile flag, so I'm not sure what's going wrong here:

final Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.CANON_EQ);
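
The pattern in the exception message is not the one I wrote, so it looks like it gets rewritten before compilation. To narrow down which flag is involved, a minimal sketch like the one below could compile the same regex with and without Pattern.CANON_EQ. The class name FlagCheck and the helper compiles are made up for illustration, and the katakana word is written with Unicode escapes (\u30D9\u30B9\u30D1 is ベスパ) so the file does not depend on the source encoding. Whether the CANON_EQ variant fails may depend on the Java version, so no expected output is shown.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class FlagCheck {
    // Same regex as in the question; the word is spelled with Unicode escapes.
    static final String REGEX = "(?:\\P{L}|\\W|^)(\u30D9\u30B9\u30D1)(?:\\b|$)";

    // Returns true if the regex compiles with the given flags,
    // false if Pattern.compile throws PatternSyntaxException.
    static boolean compiles(int flags) {
        try {
            Pattern.compile(REGEX, flags);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int base = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        System.out.println("without CANON_EQ: " + compiles(base));
        System.out.println("with CANON_EQ:    " + compiles(base | Pattern.CANON_EQ));
    }
}
```

If only the second line reports false, the CANON_EQ flag would be the likely culprit rather than the regex itself.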

Thanks for help! :)

asked by tchrist, 4 April 2015 at 19:04