Smart Chinese Analysis Plugin

The Smart Chinese Analysis plugin integrates Lucene’s Smart Chinese analysis module into Elasticsearch.

It provides an analyzer for Chinese or mixed Chinese-English text. This analyzer uses probabilistic knowledge to find the optimal word segmentation for Simplified Chinese text. The text is first broken into sentences, then each sentence is segmented into words.

Installation

Version 7.0.0-alpha1 of the Elastic Stack has not yet been released.

Removal

The plugin can be removed with the following command:

sudo bin/elasticsearch-plugin remove analysis-smartcn

The node must be stopped before removing the plugin.

smartcn tokenizer and token filter

The plugin provides the smartcn analyzer and smartcn_tokenizer tokenizer, which are not configurable.

Note

The smartcn_word token filter and the smartcn_sentence tokenizer have been deprecated.
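
As a rough illustration of how the smartcn analyzer and smartcn_tokenizer might be referenced from a client, the following Python sketch builds two request bodies without sending them: one for the _analyze API using the smartcn analyzer, and one set of index settings defining a custom analyzer on top of smartcn_tokenizer. The node address, the analyzer name my_smartcn, and the choice of the built-in porter_stem filter are assumptions made for this example and do not come from the plugin itself.

```python
import json
import urllib.request

# Assumption: a node with the plugin installed listens at this address.
ES_URL = "http://localhost:9200"


def analyze_request(text, analyzer="smartcn", base_url=ES_URL):
    """Build (but do not send) a POST to the _analyze API for the given text."""
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return urllib.request.Request(
        url=f"{base_url}/_analyze",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def custom_analyzer_settings(name="my_smartcn"):
    """Index settings for a custom analyzer built on smartcn_tokenizer.

    The analyzer name is hypothetical; porter_stem is a built-in token filter
    picked here only to show that other filters can follow the tokenizer.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    name: {
                        "type": "custom",
                        "tokenizer": "smartcn_tokenizer",
                        "filter": ["porter_stem"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    req = analyze_request("我是中国人")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    print(json.dumps(custom_analyzer_settings(), indent=2))
    # To send the request to a running node, uncomment:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Because neither the analyzer nor the tokenizer accepts parameters, any customization in this sketch happens around the tokenizer (by choosing which token filters follow it) rather than on the tokenizer itself.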