{ "cells": [ { "cell_type": "markdown", "id": "fba4c02d", "metadata": {}, "source": [ "# Demo Notebook to trace Sentence Transformers model\n", "\n", "#### [Download notebook](https://github.com/opensearch-project/opensearch-py-ml/blob/main/docs/source/examples/demo_tracing_model_torchscript_onnx.ipynb)\n", "\n", "This notebook walks through tracing Sentence Transformers models in TorchScript and ONNX formats. After tracing a model, you can register it in OpenSearch and use it to generate embeddings.\n", "\n", "Remember, TorchScript and ONNX are just two alternative formats. You don't need to trace a model in both; this notebook does so only to demonstrate each option.\n", "\n", "Step 0: Import packages and set up client\n", "\n", "Step 1: Save model in TorchScript format\n", "\n", "Step 2: Register the saved TorchScript model in OpenSearch\n", "\n", "[The following steps are optional. They register the model in ONNX format as well and compare the embedding outputs of the two models.]\n", "\n", "Step 3: Save model in ONNX format\n", "\n", "Step 4: Register the saved ONNX model in OpenSearch\n", "\n", "Step 5: Generate Sentence Embedding with registered models\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "7011727e", "metadata": {}, "source": [ "## Step 0: Import packages and set up client\n", "\n", "Install the packages this notebook needs, `opensearch-py` and `opensearch-py-ml`, from PyPI.\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "17a3e085", "metadata": { "scrolled": true }, "outputs": [], "source": [ "#!pip install opensearch-py opensearch-py-ml\n", "\n", "# import os\n", "# import sys\n", "# sys.path.append(os.path.abspath(os.path.join('../../..')))" ] }, { "cell_type": "code", "execution_count": 2, "id": "d0c711bf", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ 
"/home/linuxbrew/.linuxbrew/opt/python@3.8/lib/python3.8/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] } ], "source": [ "import warnings\n", "warnings.filterwarnings('ignore', category=DeprecationWarning)\n", "warnings.filterwarnings('ignore', category=FutureWarning)\n", "warnings.filterwarnings(\"ignore\", message=\"Unverified HTTPS request\")\n", "warnings.filterwarnings(\"ignore\", message=\"TracerWarning: torch.tensor\")\n", "warnings.filterwarnings(\"ignore\", message=\"using SSL with verify_certs=False is insecure.\")\n", "\n", "import opensearch_py_ml as oml\n", "from opensearchpy import OpenSearch\n", "from opensearch_py_ml.ml_models import SentenceTransformerModel\n", "# import mlcommon to later register the model to OpenSearch Cluster\n", "from opensearch_py_ml.ml_commons import MLCommonClient" ] }, { "cell_type": "code", "execution_count": 3, "id": "5c85ae17", "metadata": {}, "outputs": [], "source": [ "CLUSTER_URL = 'https://localhost:9200'" ] }, { "cell_type": "code", "execution_count": 4, "id": "77442abf", "metadata": {}, "outputs": [], "source": [ "def get_os_client(cluster_url = CLUSTER_URL,\n", " username='admin',\n", " password='< admin password >'):\n", " '''\n", " Get OpenSearch client\n", " :param cluster_url: cluster URL like https://ml-te-netwo-1s12ba42br23v-ff1736fa7db98ff2.elb.us-west-2.amazonaws.com:443\n", " :return: OpenSearch client\n", " '''\n", " client = OpenSearch(\n", " hosts=[cluster_url],\n", " http_auth=(username, password),\n", " verify_certs=False\n", " )\n", " return client " ] }, { "cell_type": "code", "execution_count": 5, "id": "89e1cb2a", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/linuxbrew/.linuxbrew/opt/python@3.8/lib/python3.8/site-packages/opensearchpy/connection/http_urllib3.py:199: 
UserWarning: Connecting to https://localhost:9200 using SSL with verify_certs=False is insecure.\n", " warnings.warn(\n" ] } ], "source": [ "client = get_os_client()\n", "\n", "# Connect to ml_common client with OpenSearch client\n", "ml_client = MLCommonClient(client)" ] }, { "cell_type": "markdown", "id": "4da9e0de", "metadata": {}, "source": [ "## Step 1: Save model in TorchScript format\n", "\n", "The `opensearch-py-ml` library provides the `save_as_pt` method, which traces a model in TorchScript format and saves it as a zip file in your filesystem.\n", "\n", "Detailed documentation: https://opensearch-project.github.io/opensearch-py-ml/reference/api/sentence_transformer.save_as_pt.html#opensearch_py_ml.ml_models.SentenceTransformerModel.save_as_pt\n", "\n", "\n", "You need to provide a Sentence Transformers model id (for example, `sentence-transformers/msmarco-distilbert-base-tas-b`). This is a Hugging Face model id. Example: https://huggingface.co/sentence-transformers/msmarco-distilbert-base-tas-b\n", "\n", "`save_as_pt` downloads the model to the filesystem and then traces it with the given input sentences.\n", "\n", "For more guidance on dummy input strings, see: https://huggingface.co/docs/transformers/torchscript#dummy-inputs-and-standard-lengths\n", "\n", "After tracing the model (which generates a `.pt` file), `save_as_pt` zips `tokenizer.json` together with the TorchScript (`.pt`) file and saves the zip file in the filesystem.\n", "\n", "You can then register that model in OpenSearch to generate embeddings." 
] }, { "cell_type": "code", "execution_count": 6, "id": "b6405232", "metadata": {}, "outputs": [], "source": [ "model_id = \"sentence-transformers/msmarco-distilbert-base-tas-b\"\n", "folder_path = \"sentence-transformer-torchscript/msmarco-distilbert-base-tas-b\"" ] }, { "cell_type": "code", "execution_count": 7, "id": "c7b0ff7e", "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/linuxbrew/.linuxbrew/opt/python@3.8/lib/python3.8/site-packages/transformers/models/distilbert/modeling_distilbert.py:223: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.\n", " mask, torch.tensor(torch.finfo(scores.dtype).min)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "model file is saved to sentence-transformer-torchscript/msmarco-distilbert-base-tas-b/msmarco-distilbert-base-tas-b.pt\n", "zip file is saved to sentence-transformer-torchscript/msmarco-distilbert-base-tas-b/msmarco-distilbert-base-tas-b.zip \n", "\n" ] } ], "source": [ "pre_trained_model = SentenceTransformerModel(model_id=model_id, folder_path=folder_path, overwrite=True)\n", "model_path = pre_trained_model.save_as_pt(model_id=model_id, sentences=[\"for example providing a small sentence\", \"we can add multiple sentences\"])" ] }, { "cell_type": "markdown", "id": "9c3b7cbd", "metadata": {}, "source": [ "## Step 2: Register the saved TorchScript model in OpenSearch\n", "\n", "In the last step, we saved a Sentence Transformers model in TorchScript format. Now we will register that model in the OpenSearch cluster. 
To do that, we can use the `register_model` method of the `opensearch-py-ml` library.\n", "\n", "To register the model, we need the zip file saved in the last step and a model config file. You can use the `make_model_config_json` method to generate the config file automatically and save it as `ml-commons_model_config.json` in the model folder, or you can create the JSON file yourself.\n", "\n", "An example of the model config file content:\n", "\n", "{\n", " \"name\": \"sentence-transformers/msmarco-distilbert-base-tas-b\",\n", " \"version\": \"1.0.0\",\n", " \"description\": \"This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.\",\n", " \"model_format\": \"TORCH_SCRIPT\",\n", " \"model_config\": {\n", " \"model_type\": \"distilbert\",\n", " \"embedding_dimension\": 768,\n", " \"framework_type\": \"sentence_transformers\"\n", " }\n", "}\n", "\n", "In either approach, you have to set `model_format` to `TORCH_SCRIPT` so that the internal system looks for the corresponding `.pt` file in the zip file. 
\n", "\n", "Please refer to this doc: https://github.com/opensearch-project/ml-commons/blob/2.x/docs/model_serving_framework/text_embedding_model_examples.md\n", "\n", "\n", "Documentation for the method: https://opensearch-project.github.io/opensearch-py-ml/reference/api/ml_commons_register_api.html#opensearch_py_ml.ml_commons.MLCommonClient.register_model\n", "\n", "Related demo notebook about ml-commons plugin integration: https://opensearch-project.github.io/opensearch-py-ml/examples/demo_ml_commons_integration.html" ] }, { "cell_type": "code", "execution_count": 8, "id": "31d25f02", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ml-commons_model_config.json file is saved at : sentence-transformer-torchscript/msmarco-distilbert-base-tas-b/ml-commons_model_config.json\n" ] } ], "source": [ "model_config_path_torch = pre_trained_model.make_model_config_json(model_format='TORCH_SCRIPT')" ] }, { "cell_type": "code", "execution_count": 9, "id": "28e9310c", "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of chunks 27\n", "Sha1 value of the model file: b397ae99ef3c27ba2ea080428ba695ba732da90a9367e77383b55ec0b191903e\n", "Model meta data was created successfully. 
Model Id: 4djw4okB2Ly7dmqcT7Xp\n", "uploading chunk 1 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 2 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 3 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 4 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 5 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 6 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 7 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 8 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 9 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 10 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 11 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 12 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 13 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 14 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 15 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 16 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 17 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 18 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 19 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 20 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 21 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 22 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 23 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 24 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 25 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 26 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 27 of 27\n", "Model id: {'status': 'Uploaded'}\n", "Model registered successfully\n", "Task ID: 4tjw4okB2Ly7dmqcerVn\n", "Model deployed successfully\n" ] }, { "data": { "text/plain": [ 
"'4djw4okB2Ly7dmqcT7Xp'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ml_client.register_model(model_path, model_config_path_torch, isVerbose=True)" ] }, { "cell_type": "markdown", "id": "34ee235b", "metadata": {}, "source": [ "## Step 3: Save model in ONNX format\n", "\n", "The `opensearch-py-ml` library provides the `save_as_onnx` method, which traces a model in ONNX format and saves it as a zip file in your filesystem.\n", "\n", "Detailed documentation: https://opensearch-project.github.io/opensearch-py-ml/reference/api/sentence_transformer.save_as_onnx.html#opensearch_py_ml.ml_models.SentenceTransformerModel.save_as_onnx\n", "\n", "\n", "You need to provide a Sentence Transformers model id (for example, `sentence-transformers/msmarco-distilbert-base-tas-b`). `save_as_onnx` downloads the model to the filesystem and then traces it.\n", "\n", "After tracing the model (which generates a `.onnx` file), `save_as_onnx` zips `tokenizer.json` together with the ONNX (`.onnx`) file and saves the zip file in the filesystem. 
\n", "\n", "You can then register that model in OpenSearch to generate embeddings.\n" ] }, { "cell_type": "code", "execution_count": 11, "id": "7fff842a", "metadata": {}, "outputs": [], "source": [ "model_id = \"sentence-transformers/msmarco-distilbert-base-tas-b\"\n", "folder_path = \"sentence-transformer-onnx/msmarco-distilbert-base-tas-b\"" ] }, { "cell_type": "code", "execution_count": 12, "id": "44d6b1d2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ONNX opset version set to: 15\n", "Loading pipeline (model: sentence-transformers/msmarco-distilbert-base-tas-b, tokenizer: sentence-transformers/msmarco-distilbert-base-tas-b)\n", "Creating folder sentence-transformer-onnx/msmarco-distilbert-base-tas-b/onnx\n", "Using framework PyTorch: 1.13.1+cu117\n", "Found input input_ids with shape: {0: 'batch', 1: 'sequence'}\n", "Found input attention_mask with shape: {0: 'batch', 1: 'sequence'}\n", "Found output output_0 with shape: {0: 'batch', 1: 'sequence'}\n", "Ensuring inputs are in correct order\n", "head_mask is not present in the generated input list.\n", "Generated inputs order: ['input_ids', 'attention_mask']\n", "zip file is saved to sentence-transformer-onnx/msmarco-distilbert-base-tas-b/msmarco-distilbert-base-tas-b.zip \n", "\n" ] } ], "source": [ "pre_trained_model = SentenceTransformerModel(model_id=model_id, folder_path=folder_path, overwrite=True)\n", "model_path_onnx = pre_trained_model.save_as_onnx(model_id=model_id)" ] }, { "cell_type": "markdown", "id": "97ed5665", "metadata": {}, "source": [ "## Step 4: Register the saved ONNX model in OpenSearch\n", "\n", "In the last step, we saved a Sentence Transformers model in ONNX format. Now we will register that model in the OpenSearch cluster. To do that, we can use the `register_model` method of the `opensearch-py-ml` library.\n", "\n", "To register the model, we need the zip file saved in the last step and a model config file. 
You can use the `make_model_config_json` method to generate the config file automatically and save it as `ml-commons_model_config.json` in the model folder, or you can create the JSON file yourself.\n", "\n", "An example of the model config file content:\n", "\n", "{\n", " \"name\": \"sentence-transformers/msmarco-distilbert-base-tas-b\",\n", " \"version\": \"1.0.0\",\n", " \"description\": \"This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.\",\n", " \"model_format\": \"ONNX\",\n", " \"model_config\": {\n", " \"model_type\": \"distilbert\",\n", " \"embedding_dimension\": 768,\n", " \"framework_type\": \"sentence_transformers\",\n", " \"pooling_mode\":\"cls\",\n", " \"normalize_result\":\"false\"\n", " }\n", "}\n", "\n", "In either approach, you have to set `model_format` to `ONNX` so that the internal system looks for the corresponding `.onnx` file in the zip file.\n", "\n", "Please refer to this doc: https://github.com/opensearch-project/ml-commons/blob/2.x/docs/model_serving_framework/text_embedding_model_examples.md\n", "\n", "\n", "Documentation for the method: https://opensearch-project.github.io/opensearch-py-ml/reference/api/ml_commons_register_api.html#opensearch_py_ml.ml_commons.MLCommonClient.register_model\n", "\n", "Related demo notebook about ml-commons plugin integration: https://opensearch-project.github.io/opensearch-py-ml/examples/demo_ml_commons_integration.html" ] }, { "cell_type": "code", "execution_count": 15, "id": "e475f04a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ml-commons_model_config.json file is saved at : sentence-transformer-onnx/msmarco-distilbert-base-tas-b/ml-commons_model_config.json\n" ] } ], "source": [ "model_config_path_onnx = pre_trained_model.make_model_config_json(model_format='ONNX')" ] }, { "cell_type": "code", "execution_count": 16, "id": "661c3f46", "metadata": {}, "outputs": [ { 
"name": "stdout", "output_type": "stream", "text": [ "Total number of chunks 27\n", "Sha1 value of the model file: 81c950d07eaa21705dd94cec0f127efec42844cd1995502452764777460517d4\n", "Model meta data was created successfully. Model Id: 49jz4okB2Ly7dmqcNrWD\n", "uploading chunk 1 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 2 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 3 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 4 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 5 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 6 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 7 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 8 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 9 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 10 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 11 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 12 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 13 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 14 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 15 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 16 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 17 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 18 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 19 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 20 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 21 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 22 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 23 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 24 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 25 of 27\n", "Model id: {'status': 'Uploaded'}\n", "uploading chunk 26 of 27\n", "Model id: 
{'status': 'Uploaded'}\n", "uploading chunk 27 of 27\n", "Model id: {'status': 'Uploaded'}\n", "Model registered successfully\n", "Task ID: 5Njz4okB2Ly7dmqcW7XA\n", "Model deployed successfully\n" ] }, { "data": { "text/plain": [ "'49jz4okB2Ly7dmqcNrWD'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ml_client.register_model(model_path_onnx, model_config_path_onnx, isVerbose=True)" ] }, { "cell_type": "markdown", "id": "3b22c708", "metadata": {}, "source": [ "## Step 5: Generate Sentence Embedding\n", "\n", "Now that both models are registered and deployed, we can generate embeddings for sentences. Given a list of sentences, each model returns one embedding per sentence." ] }, { "cell_type": "code", "execution_count": 17, "id": "8cc5a796", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "# Using the registered models, we can now generate sentence embeddings.\n", "\n", "import numpy as np\n", "\n", "input_sentences = [\"first sentence\", \"second sentence\"]\n", "\n", "# Generate embeddings with the TorchScript model (model id returned by register_model in Step 2)\n", "\n", "embedding_output_torch = ml_client.generate_embedding(\"4djw4okB2Ly7dmqcT7Xp\", input_sentences)\n", "\n", "# Just taking the embedding for the first sentence\n", "data_torch = embedding_output_torch[\"inference_results\"][0][\"output\"][0][\"data\"]\n", "\n", "# Generate embeddings with the ONNX model (model id returned by register_model in Step 4)\n", "\n", "embedding_output_onnx = ml_client.generate_embedding(\"49jz4okB2Ly7dmqcNrWD\", input_sentences)\n", "\n", "# Just taking the embedding for the first sentence\n", "data_onnx = embedding_output_onnx[\"inference_results\"][0][\"output\"][0][\"data\"]\n", "\n", "# Now we can check whether there is any significant difference between the two outputs.\n", "# assert_allclose raises an AssertionError on a mismatch and returns None otherwise,\n", "# so the printed None means the two embeddings agree within the tolerances.\n", "print(np.testing.assert_allclose(data_torch, data_onnx, rtol=1e-03, atol=1e-05))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { 
"codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.17" } }, "nbformat": 4, "nbformat_minor": 5 }