IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
CJK Width Token Filter
The cjk_width token filter normalizes CJK width differences:
- Folds fullwidth ASCII variants into the equivalent basic Latin
- Folds halfwidth Katakana variants into the equivalent Kana
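The two folds listed above can be approximated outside Elasticsearch with a short Python sketch. This is not the filter's actual implementation: the function name `fold_width` is hypothetical, and the sketch assumes that fullwidth ASCII variants are the code points U+FF01 to U+FF5E and that halfwidth Katakana variants are U+FF65 to U+FF9F. It applies the standard library's NFKC normalization to those ranges only and leaves every other character unchanged.

```python
import re
import unicodedata

# Assumed ranges: fullwidth ASCII variants (U+FF01..U+FF5E) and
# halfwidth Katakana variants, including sound marks (U+FF65..U+FF9F).
_WIDTH_VARIANTS = re.compile(r"[\uFF01-\uFF5E\uFF65-\uFF9F]+")


def fold_width(text: str) -> str:
    """Fold width variants only; leave all other characters untouched."""
    # Normalizing whole runs (not single characters) lets a halfwidth
    # sound mark compose with the kana that precedes it.
    return _WIDTH_VARIANTS.sub(
        lambda m: unicodedata.normalize("NFKC", m.group(0)), text
    )


print(fold_width("Ｅｌａｓｔｉｃ"))  # Elastic
print(fold_width("ｶﾞｲﾄﾞ"))  # ガイド
```

Restricting the normalization to the two ranges is what keeps the sketch a width-only fold: characters such as `㌔` that full NFKC would expand are passed through as-is.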
This token filter can be viewed as a subset of NFKC/NFKD Unicode normalization. See the analysis-icu plugin for full normalization support.
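To illustrate the subset relationship, the following sketch uses Python's standard `unicodedata` module (not Elasticsearch or the analysis-icu plugin) to show that NFKC performs the same width folds and also rewrites characters that have nothing to do with width. The sample strings are chosen for illustration only.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Apply full NFKC normalization using the Python standard library."""
    return unicodedata.normalize("NFKC", text)


# Width folds: the part of NFKC that overlaps with cjk_width.
print(nfkc("ＡＢＣ"))  # ABC
print(nfkc("ｶﾞ"))  # ガ

# Changes beyond width folding, which only full normalization performs.
print(nfkc("㌔"))  # キロ
print(nfkc("ﬁ"))  # fi
print(nfkc("①"))  # 1
```

If only width differences matter, a width-only fold avoids these broader rewrites; if squared Katakana words, ligatures, or enclosed digits should also be normalized, full normalization is the better fit.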