Interface IKuromojiTokenizer
A tokenizer of type kuromoji_tokenizer that splits Japanese text into terms using morphological analysis.
Part of the analysis-kuromoji plugin.
Namespace: OpenSearch.Client
Assembly: OpenSearch.Client.dll
Syntax
public interface IKuromojiTokenizer : ITokenizer
Properties
DiscardCompoundToken
Whether original compound tokens should be discarded from the output in Search mode. Defaults to false.
Declaration
[DataMember(Name = "discard_compound_token")]
bool? DiscardCompoundToken { get; set; }
Property Value
Type | Description |
---|---|
bool? |
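As a rough illustration of how this property might appear once serialized, the fragment below sketches an index settings body that sets discard_compound_token together with the search mode. The tokenizer name my_kuromoji is hypothetical, and the "search" string is assumed to be the serialized form of the Search mode.

```json
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_kuromoji": {
          "type": "kuromoji_tokenizer",
          "mode": "search",
          "discard_compound_token": true
        }
      }
    }
  }
}
```

The setting is described only in relation to Search mode, so it is paired with that mode here.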
DiscardPunctuation
Whether punctuation should be discarded from the output. Defaults to true.
Declaration
[DataMember(Name = "discard_punctuation")]
bool? DiscardPunctuation { get; set; }
Property Value
Type | Description |
---|---|
bool? |
Mode
The tokenization mode determines how the tokenizer handles compound and unknown words.
Declaration
[DataMember(Name = "mode")]
KuromojiTokenizationMode? Mode { get; set; }
Property Value
Type | Description |
---|---|
KuromojiTokenizationMode? |
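A minimal sketch of the serialized form, assuming the KuromojiTokenizationMode values map to the lowercase mode names documented by the analysis-kuromoji plugin (normal, search, extended). The tokenizer name my_kuromoji is a placeholder chosen for this example.

```json
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_kuromoji": {
          "type": "kuromoji_tokenizer",
          "mode": "extended",
          "discard_punctuation": true
        }
      }
    }
  }
}
```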
NBestCost
The nbest_cost parameter specifies an additional Viterbi cost. The KuromojiTokenizer will include all tokens in Viterbi paths that are within the nbest_cost value of the best path.
Declaration
[DataMember(Name = "nbest_cost")]
int? NBestCost { get; set; }
Property Value
Type | Description |
---|---|
int? |
NBestExamples
The nbest_examples parameter can be used to find an nbest_cost value based on examples. For example, a value of /箱根山-箱根/成田空港-成田/ indicates that, for the texts 箱根山 (Mt. Hakone) and 成田空港 (Narita Airport), we'd like a cost that gives us 箱根 (Hakone) and 成田 (Narita).
Declaration
[DataMember(Name = "nbest_examples")]
string NBestExamples { get; set; }
Property Value
Type | Description |
---|---|
string |
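The example string from the description could be supplied as follows. This is only a sketch of the serialized settings; my_kuromoji is a hypothetical tokenizer name.

```json
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_kuromoji": {
          "type": "kuromoji_tokenizer",
          "nbest_examples": "/箱根山-箱根/成田空港-成田/"
        }
      }
    }
  }
}
```

An explicit integer under nbest_cost could be given in the same position if a suitable cost is already known.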
UserDictionary
The Kuromoji tokenizer uses the MeCab-IPADIC dictionary by default. A user_dictionary may be appended to the default dictionary.
Declaration
[DataMember(Name = "user_dictionary")]
string UserDictionary { get; set; }
Property Value
Type | Description |
---|---|
string |
UserDictionaryRules
Inline version of UserDictionary: the dictionary rules are supplied directly in the tokenizer definition.
Declaration
[DataMember(Name = "user_dictionary_rules")]
IEnumerable<string> UserDictionaryRules { get; set; }
Property Value
Type | Description |
---|---|
IEnumerable<string> |
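For illustration, an inline rule might look like the fragment below. It assumes the rule format described in the analysis-kuromoji plugin documentation (text, space-separated tokens, space-separated readings, part-of-speech tag, separated by commas). The tokenizer name my_kuromoji is hypothetical.

```json
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_kuromoji": {
          "type": "kuromoji_tokenizer",
          "user_dictionary_rules": [
            "東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞"
          ]
        }
      }
    }
  }
}
```

Each string in the collection corresponds to one dictionary line, so a short list of custom terms can live with the index settings instead of in a separate file.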