    Class CharGroupTokenizer

    A tokenizer that breaks text into terms whenever it encounters a character that is in a defined set. It is mostly useful when a simple custom tokenization is desired and the overhead of using PatternTokenizer is not acceptable.

    Inheritance
    object
    TokenizerBase
    CharGroupTokenizer
    Implements
    ICharGroupTokenizer
    ITokenizer
    Inherited Members
    TokenizerBase.Type
    TokenizerBase.Version
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: OpenSearch.Client
    Assembly: OpenSearch.Client.dll
    Syntax
    public class CharGroupTokenizer : TokenizerBase, ICharGroupTokenizer, ITokenizer

    Constructors

    CharGroupTokenizer()

    Declaration
    public CharGroupTokenizer()

    Properties

    MaxTokenLength

    The maximum token length. If a token exceeds this length, it is split at MaxTokenLength intervals. Defaults to 255.

    Declaration
    public int? MaxTokenLength { get; set; }
    Property Value
    Type Description
    int?
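
The splitting rule described for MaxTokenLength can be illustrated with a small standalone sketch. It does not use OpenSearch.Client and is not the library's or the server's implementation; `SplitAtIntervals` is a hypothetical helper that only mimics the documented behavior of cutting an over-long token into chunks of at most the maximum length.

```csharp
using System;
using System.Collections.Generic;

static class MaxTokenLengthSketch
{
    // Hypothetical helper (not part of OpenSearch.Client): cuts a token into
    // consecutive chunks of at most maxTokenLength characters.
    static IEnumerable<string> SplitAtIntervals(string token, int maxTokenLength)
    {
        if (maxTokenLength <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxTokenLength));

        for (int start = 0; start < token.Length; start += maxTokenLength)
        {
            // The last chunk may be shorter than maxTokenLength.
            int length = Math.Min(maxTokenLength, token.Length - start);
            yield return token.Substring(start, length);
        }
    }

    static void Main()
    {
        // A 10-character token with a maximum length of 4
        // is emitted as "abcd", "efgh", "ij".
        foreach (string part in SplitAtIntervals("abcdefghij", 4))
            Console.WriteLine(part);
    }
}
```

With the default of 255, only tokens longer than 255 characters would be affected in this way.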

    TokenizeOnCharacters

    A list of characters to tokenize the string on. Whenever a character from this list is encountered, a new token is started. Each entry is either a single character, e.g. -, or one of the character groups: whitespace, letter, digit, punctuation, symbol.

    Declaration
    public IEnumerable<string> TokenizeOnCharacters { get; set; }
    Property Value
    Type Description
    IEnumerable<string>
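
As a minimal sketch of how the two properties fit together, the tokenizer can be built with an object initializer. Only the members documented on this page are used. The wrapper class `CharGroupTokenizerExample` and the method `Build` are hypothetical names for illustration, and the chosen characters are an assumption, not a recommended configuration.

```csharp
using System.Collections.Generic;
using OpenSearch.Client;

static class CharGroupTokenizerExample
{
    // Hypothetical factory method: returns a tokenizer definition that starts
    // a new token on any whitespace character and on a literal hyphen.
    public static CharGroupTokenizer Build()
    {
        return new CharGroupTokenizer
        {
            // Mixes a character group ("whitespace") with a single character ("-").
            TokenizeOnCharacters = new List<string> { "whitespace", "-" },
            // Explicitly sets the documented default; leave it null to omit it.
            MaxTokenLength = 255
        };
    }
}
```

Given the description above, text such as `quick brown-fox` would be expected to yield the terms `quick`, `brown`, and `fox` under this configuration. The definition still has to be registered in an index's analysis settings before it takes effect, which is outside the scope of this page.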

    Implements

    ICharGroupTokenizer
    ITokenizer

    Extension Methods

    SuffixExtensions.Suffix(object, string)