CJK Width Token Filter

The cjk_width token filter normalizes CJK width differences:

  • Folds fullwidth ASCII variants (for example, Ａ) into the equivalent Basic Latin characters (A)
  • Folds halfwidth Katakana variants (for example, ｶ) into the equivalent fullwidth Kana (カ)

This token filter can be viewed as a subset of NFKC/NFKD Unicode normalization. See the analysis-icu plugin for full normalization support.
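
The two folding rules can be approximated outside Elasticsearch to see what kind of output to expect. The following Python sketch is an illustration only, not the filter's actual implementation: the function name fold_cjk_width is hypothetical, and it assumes that applying NFKC to runs of halfwidth Katakana (U+FF65 to U+FF9F) is a close enough stand-in for the filter's Kana folding, including combining halfwidth voiced sound marks with the preceding Kana.

```python
import re
import unicodedata

# Fullwidth ASCII variants U+FF01..U+FF5E sit at a fixed offset (0xFEE0)
# from their Basic Latin equivalents U+0021..U+007E.
_FULLWIDTH_ASCII = {cp: cp - 0xFEE0 for cp in range(0xFF01, 0xFF5F)}

# Runs of halfwidth Katakana, including the halfwidth voiced and
# semi-voiced sound marks (U+FF9E, U+FF9F).
_HALFWIDTH_KANA_RUN = re.compile("[\uFF65-\uFF9F]+")


def fold_cjk_width(text: str) -> str:
    """Approximate the cjk_width folding (hypothetical helper, not the real filter)."""
    # Rule 1: fullwidth ASCII variants -> Basic Latin.
    text = text.translate(_FULLWIDTH_ASCII)
    # Rule 2: halfwidth Katakana -> fullwidth Kana. NFKC is applied only to
    # halfwidth runs, so all other characters are left untouched.
    return _HALFWIDTH_KANA_RUN.sub(
        lambda m: unicodedata.normalize("NFKC", m.group(0)), text
    )


if __name__ == "__main__":
    print(fold_cjk_width("\uff21\uff22\uff23\uff11\uff12\uff13"))  # fullwidth "ABC123"
    print(fold_cjk_width("\uff76\uff80\uff76\uff85"))              # halfwidth Katakana
```

Restricting NFKC to the halfwidth Katakana range reflects the "subset" relationship described above: full NFKC would also rewrite ligatures, superscripts, and other compatibility characters that this filter leaves alone.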