Enum KuromojiTokenizationMode
The tokenization mode determines how the tokenizer handles compound and unknown words.
Part of the analysis-kuromoji plugin.
Namespace: OpenSearch.Client
Assembly: OpenSearch.Client.dll
Syntax
public enum KuromojiTokenizationMode
Fields
Name | Description |
---|---|
Extended | Extended mode outputs unigrams for unknown words. |
Normal | Normal segmentation, with no decomposition of compounds. |
Search | Segmentation geared towards search. Long nouns are decompounded, and the full compound token is also emitted as a synonym. |
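
The following is a minimal sketch of how one of these values might be selected when defining a Kuromoji tokenizer. It assumes that OpenSearch.Client provides a `KuromojiTokenizer` class with a nullable `Mode` property, as the NEST client it derives from does; verify this against the client version in use.

```csharp
using OpenSearch.Client;

// Sketch only: assumes KuromojiTokenizer and its Mode property exist in
// OpenSearch.Client, mirroring the NEST client it was forked from.
var tokenizer = new KuromojiTokenizer
{
    // Search mode: decompound long nouns and keep the full compound as a synonym.
    Mode = KuromojiTokenizationMode.Search
};
```

The tokenizer definition only takes effect once it is registered in an index's analysis settings, and the analysis-kuromoji plugin must be installed on the cluster.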