OpenSearch Client

class opensearchpy.OpenSearch(hosts=None, transport_class=<class 'opensearchpy.transport.Transport'>, **kwargs)

Bases: Client

OpenSearch client. Provides a straightforward mapping from Python to OpenSearch REST endpoints.

The instance has attributes cat, cluster, indices, ingest, nodes, snapshot and tasks that provide access to instances of CatClient, ClusterClient, IndicesClient, IngestClient, NodesClient, SnapshotClient and TasksClient respectively. This is the preferred (and only supported) way to get access to those classes and their methods.

To use a specific connection class, pass it as the connection_class parameter:

# create connection to localhost using the RequestsHttpConnection
client = OpenSearch(connection_class=RequestsHttpConnection)

If you want to turn on sniffing you have several options (described in Transport):

# create connection that will automatically inspect the cluster to get
# the list of active nodes. Start with nodes running on
# 'opensearchnode1' and 'opensearchnode2'
client = OpenSearch(
    ['opensearchnode1', 'opensearchnode2'],
    # sniff before doing anything
    sniff_on_start=True,
    # refresh nodes after a node fails to respond
    sniff_on_connection_fail=True,
    # and also every 60 seconds
    sniffer_timeout=60
)

Different hosts can have different parameters; use a dictionary per node to specify them:

# connect to localhost directly and another node using SSL on port 443
# and a url_prefix. Note that port needs to be an int.
client = OpenSearch([
    {'host': 'localhost'},
    {'host': 'othernode', 'port': 443, 'url_prefix': 'opensearch', 'use_ssl': True},
])

If using SSL, several parameters control how certificates are handled (see AIOHttpConnection for a detailed description of the options):

client = OpenSearch(
    ['localhost:443', 'other_host:443'],
    # turn on SSL
    use_ssl=True,
    # make sure we verify SSL certificates
    verify_certs=True,
    # provide a path to CA certs on disk
    ca_certs='/path/to/CA_certs'
)

If you use SSL but do not verify the certificates, a warning message is shown; you can optionally suppress it (see AIOHttpConnection for a detailed description of the options):

client = OpenSearch(
    ['localhost:443', 'other_host:443'],
    # turn on SSL
    use_ssl=True,
    # do not verify SSL certificates
    verify_certs=False,
    # don't verify the hostname in the certificate
    ssl_assert_hostname=False,
    # don't show warnings about ssl certs verification
    ssl_show_warn=False
)

SSL client authentication is supported (see AIOHttpConnection for a detailed description of the options):

client = OpenSearch(
    ['localhost:443', 'other_host:443'],
    # turn on SSL
    use_ssl=True,
    # make sure we verify SSL certificates
    verify_certs=True,
    # provide a path to CA certs on disk
    ca_certs='/path/to/CA_certs',
    # PEM formatted SSL client certificate
    client_cert='/path/to/clientcert.pem',
    # PEM formatted SSL client key
    client_key='/path/to/clientkey.pem'
)

Alternatively, you can use RFC 1738 formatted URLs, as long as they do not conflict with other options:

client = OpenSearch(
    [
        'http://user:secret@localhost:9200/',
        'https://user:secret@other_host:443/production'
    ],
    verify_certs=True
)
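
Credentials embedded in such a URL need to be percent-encoded when they contain reserved characters such as @, : or /. The sketch below builds a node URL using only the standard library. build_node_url is a hypothetical helper for illustration, not part of opensearch-py, and it assumes the client decodes the credentials again when it parses the URL:

```python
from urllib.parse import quote


def build_node_url(user, password, host, port, scheme="https"):
    """Return an RFC 1738 style URL with percent-encoded credentials.

    Hypothetical helper for illustration; not part of opensearch-py.
    """
    # safe="" makes quote() encode '/' too; ':' and '@' are always encoded
    user_part = quote(user, safe="")
    password_part = quote(password, safe="")
    return f"{scheme}://{user_part}:{password_part}@{host}:{port}/"


url = build_node_url("user", "p@ss:word/1", "localhost", 9200)
```

The resulting string can be placed in the hosts list in the same way as the URLs above.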

By default, JSONSerializer is used to encode all outgoing requests. However, you can implement your own custom serializer:

from opensearchpy.serializer import JSONSerializer

class SetEncoder(JSONSerializer):
    def default(self, obj):
        if isinstance(obj, set):
            return list(obj)
        # Something is a placeholder for an application-specific type
        if isinstance(obj, Something):
            return 'CustomSomethingRepresentation'
        return JSONSerializer.default(self, obj)

client = OpenSearch(serializer=SetEncoder())
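
The default method above follows the same pattern as the default hook of the standard json module: handle the types you know and defer everything else. A minimal standard-library sketch of that pattern, independent of opensearch-py (encode_extra_types is a hypothetical name chosen for this example):

```python
import json


def encode_extra_types(obj):
    """Fallback for objects that json cannot serialize natively."""
    if isinstance(obj, (set, frozenset)):
        # sets have no JSON equivalent; emit a sorted list for stable output
        return sorted(obj)
    # anything else remains an error, as with the stock encoder
    raise TypeError(f"Unable to serialize {obj!r} (type: {type(obj).__name__})")


encoded = json.dumps({"tags": {"b", "a"}}, default=encode_extra_types)
```

Trying the conversion in isolation like this is a quick way to check the output shape before wiring the logic into a custom serializer class.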

Parameters:

hosts (Any) – list of nodes, or a single node, to connect to. Each node is either a dictionary ({"host": "localhost", "port": 9200}), which is passed in full to the Connection class as kwargs, or a string in the format host[:port], which is translated to a dictionary automatically. If no value is given, the Connection class defaults are used.
transport_class (Type[Transport]) – Transport subclass to use.
kwargs (Any) – any additional arguments will be passed on to the Transport class and, subsequently, to the Connection instances.
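
To make the two accepted node formats for hosts concrete, here is a rough sketch of how a host[:port] string could be turned into the equivalent dictionary. normalize_host is a hypothetical helper written for this example; the library's own translation may differ in detail:

```python
from urllib.parse import urlparse


def normalize_host(node):
    """Turn 'host[:port]' into a dict; pass dictionaries through as a copy.

    Hypothetical helper for illustration; the library's own logic may differ.
    """
    if isinstance(node, dict):
        return dict(node)
    # prefix with '//' so urlparse treats the value as a network location
    parsed = urlparse(f"//{node}")
    result = {"host": parsed.hostname}
    if parsed.port is not None:
        # urlparse already returns the port as an int
        result["port"] = parsed.port
    return result


nodes = [
    normalize_host(n)
    for n in ["localhost", "othernode:9201", {"host": "third", "port": 443}]
]
```

Note that the port ends up as an int in every case, which matches the requirement stated in the per-node dictionary example.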

__repr__()

Return repr(self).

Return type: Any

bulk(body, index=None, params=None, headers=None)

Performs multiple index, update, and delete operations in a single request.

Parameters:

body (Any) – The operation definition and data (action-data pairs), separated by newlines
index (Any) – Name of the data stream, index, or index alias to perform bulk actions on.
_source – true or false to return the _source field or not, or a list of fields to return.
_source_excludes – A comma-separated list of source fields to exclude from the response.
_source_includes – A comma-separated list of source fields to include in the response.
error_trace – Whether to include the stack trace of returned errors. Default is false.
filter_path – Used to reduce the response. This parameter takes a comma-separated list of filters. It supports using wildcards to match any field or part of a field’s name. You can also exclude fields with "-".
human – Whether to return human readable values for statistics. Default is True.
pipeline – ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, then setting the value to _none disables the default ingest pipeline for this request. If a final pipeline is configured it will always run, regardless of the value of this parameter.
pretty – Whether to pretty format the returned JSON response. Default is false.
refresh – If true, OpenSearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes. Valid values: true, false, wait_for.
require_alias – If true, the request’s actions must target an index alias. Default is false.
routing – Custom value used to route operations to a specific shard.
source – The URL-encoded request definition. Useful for libraries that do not accept a request body for non-POST requests.
timeout – Period each action waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards.
wait_for_active_shards – The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). Valid choices are all, index-setting.
params (Any) –
headers (Any) –
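
The body parameter expects action-data pairs separated by newlines. As a minimal sketch of that layout, the hypothetical helper below (build_bulk_body, not part of opensearch-py) serializes a list of pairs with the standard json module. The index name and documents are made up for illustration:

```python
import json


def build_bulk_body(operations):
    """Serialize (action, data) pairs into a newline-delimited bulk body.

    `operations` is an iterable of (action, data) tuples; data is None for
    actions such as delete that carry no document line.
    """
    lines = []
    for action, data in operations:
        lines.append(json.dumps(action))
        if data is not None:
            lines.append(json.dumps(data))
    # the bulk body must end with a trailing newline
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ({"index": {"_index": "movies", "_id": "1"}}, {"title": "Alien"}),
    ({"delete": {"_index": "movies", "_id": "2"}}, None),
])
```

The resulting string could then be passed as client.bulk(body=body), optionally with index set so that individual actions do not need to name it.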

Return type: