    Interface ICharGroupTokenizer

    A tokenizer that breaks text into terms whenever it encounters a character that belongs to a defined set. It is mostly useful when simple custom tokenization is desired and the overhead of using the PatternTokenizer is not acceptable.

    Inherited Members
    ITokenizer.Type
    ITokenizer.Version
    Namespace: OpenSearch.Client
    Assembly: OpenSearch.Client.dll
    Syntax
    public interface ICharGroupTokenizer : ITokenizer

    Properties

    MaxTokenLength

    The maximum token length. A token that exceeds this length is split at MaxTokenLength intervals. Defaults to 255.

    Declaration
    [DataMember(Name = "max_token_length")]
    int? MaxTokenLength { get; set; }
    Property Value
    Type Description
    int?
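
    The splitting rule described above can be illustrated with a minimal, self-contained sketch. It is not the client's or the server's implementation: the class MaxTokenLengthSketch and the method SplitAtIntervals are hypothetical names introduced here, and the sketch only assumes that "split at MaxTokenLength intervals" means cutting an over-long token into consecutive fixed-size pieces.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only; not part of OpenSearch.Client.
public static class MaxTokenLengthSketch
{
    // Cuts one token into consecutive pieces of at most maxTokenLength characters.
    // The default of 255 mirrors the documented default of MaxTokenLength.
    public static List<string> SplitAtIntervals(string token, int maxTokenLength = 255)
    {
        if (token == null) throw new ArgumentNullException(nameof(token));
        if (maxTokenLength <= 0) throw new ArgumentOutOfRangeException(nameof(maxTokenLength));

        var parts = new List<string>();
        for (int start = 0; start < token.Length; start += maxTokenLength)
        {
            // The last piece may be shorter than maxTokenLength.
            int length = Math.Min(maxTokenLength, token.Length - start);
            parts.Add(token.Substring(start, length));
        }
        return parts;
    }
}
```

    Under that assumption, SplitAtIntervals("abcdefgh", 3) yields the pieces abc, def and gh, while a token no longer than the limit comes back as a single piece.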

    TokenizeOnCharacters

    A list of characters on which to split the string. Whenever a character from this list is encountered, a new token is started. Each entry is either a single character, e.g. -, or the name of a character group: whitespace, letter, digit, punctuation, symbol.

    Declaration
    [DataMember(Name = "tokenize_on_chars")]
    IEnumerable<string> TokenizeOnCharacters { get; set; }
    Property Value
    Type Description
    IEnumerable<string>
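
    To make the effect of TokenizeOnCharacters concrete, the following self-contained sketch imitates the splitting behaviour locally. It is an approximation under stated assumptions, not the actual tokenizer: CharGroupSketch, IsSplitChar and Tokenize are hypothetical names, the character groups are mapped to .NET's char.Is* classification (which may differ from the server's in edge cases), and empty tokens between adjacent split characters are assumed to be dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical local imitation of char-group splitting; not part of OpenSearch.Client.
public static class CharGroupSketch
{
    // Returns true if c matches any entry: a group name or a single literal character.
    private static bool IsSplitChar(char c, IReadOnlyList<string> entries)
    {
        foreach (var entry in entries)
        {
            switch (entry)
            {
                case "whitespace": if (char.IsWhiteSpace(c)) return true; break;
                case "letter": if (char.IsLetter(c)) return true; break;
                case "digit": if (char.IsDigit(c)) return true; break;
                case "punctuation": if (char.IsPunctuation(c)) return true; break;
                case "symbol": if (char.IsSymbol(c)) return true; break;
                default:
                    // Any other entry is treated as a single literal character.
                    if (entry.Length == 1 && entry[0] == c) return true;
                    break;
            }
        }
        return false;
    }

    // Starts a new token whenever a split character is seen; empty tokens are skipped.
    public static List<string> Tokenize(string text, IEnumerable<string> tokenizeOnCharacters)
    {
        var entries = tokenizeOnCharacters.ToList();
        var tokens = new List<string>();
        var current = new StringBuilder();

        foreach (char c in text)
        {
            if (IsSplitChar(c, entries))
            {
                if (current.Length > 0)
                {
                    tokens.Add(current.ToString());
                    current.Clear();
                }
            }
            else
            {
                current.Append(c);
            }
        }

        if (current.Length > 0) tokens.Add(current.ToString());
        return tokens;
    }
}
```

    With the entries whitespace and -, Tokenize("The QUICK brown-fox", new[] { "whitespace", "-" }) produces The, QUICK, brown and fox in this sketch.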

    Extension Methods

    SuffixExtensions.Suffix(object, string)