This API expects a TFLite model with TFLite Model Metadata that contains the
mandatory (described below) input tensors and output tensors. Metadata should
contain the input process unit for the model's Tokenizer as well as input /
output tensor metadata.
Input tensors (kTfLiteInt32), either:
- 3 input tensors of size [batch_size x bert_max_seq_len] with names
  "ids", "mask", and "segment_ids" representing the input ids, mask ids, and
  segment ids respectively,
- or 1 input tensor of size [batch_size x max_seq_len] representing the
  input ids.
At least one output tensor (kTfLiteFloat32) with:
- N components corresponding to the N dimensions of the returned
  feature vector for this output layer.
- Either 2 or 4 dimensions, i.e. [1 x N] or [1 x 1 x 1 x N].
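The tensor layout described above can be illustrated with a small, library-free sketch. The task itself builds these tensors from the model's tokenizer metadata, so nothing like this is needed in practice; the helper names `build_bert_inputs` and `embedding_size`, the zero padding value, the truncation policy, and the single-segment input are all assumptions made for illustration and are not part of the MediaPipe API.

```python
from typing import Dict, List


def build_bert_inputs(token_ids: List[int], max_seq_len: int) -> Dict[str, List[List[int]]]:
    """Builds the three BERT-style inputs with shape [1 x max_seq_len].

    Hypothetical helper: padding with 0, truncating long inputs, and using
    a single segment are assumptions, not documented behaviour.
    """
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    kept = token_ids[:max_seq_len]  # truncate inputs longer than max_seq_len
    pad = max_seq_len - len(kept)   # number of padding positions to append
    return {
        "ids": [kept + [0] * pad],               # token ids, then padding
        "mask": [[1] * len(kept) + [0] * pad],   # 1 for real tokens, 0 for padding
        "segment_ids": [[0] * max_seq_len],      # single segment: all zeros
    }


def embedding_size(output_shape: List[int]) -> int:
    """Returns N for an output shape of [1, N] or [1, 1, 1, N]."""
    rank_ok = len(output_shape) in (2, 4)
    if not rank_ok or any(dim != 1 for dim in output_shape[:-1]):
        raise ValueError(f"unsupported output shape: {output_shape}")
    return output_shape[-1]
```

For example, `build_bert_inputs([101, 7592, 102], 5)` (with made-up token ids) yields one row of five entries per tensor, and `embedding_size([1, 1, 1, 100])` returns 100.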
Utility function to compute cosine similarity between two embedding entries.
Raises a ValueError if, for example, the feature vectors are
of different types (quantized vs. float), have different sizes, or have
an L2-norm of 0.
Args
u
An embedding entry.
v
An embedding entry.
Returns
The cosine similarity for the two embeddings.
Raises
ValueError
If, for example, the feature vectors are of
different types (quantized vs. float), have different sizes, or have
an L2-norm of 0.
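To make the failure conditions concrete, here is a minimal pure-Python sketch of cosine similarity over plain float sequences. It is not the library's implementation: the name `cosine_similarity_sketch` is hypothetical, it works on raw lists rather than Embedding objects, and it does not model the quantized vs. float type check.

```python
import math
from typing import Sequence


def cosine_similarity_sketch(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equally sized, non-zero vectors.

    Hypothetical stand-in for illustration; mirrors the documented error
    cases for mismatched sizes and zero L2-norm by raising ValueError.
    """
    if len(u) != len(v):
        raise ValueError(f"size mismatch: {len(u)} vs. {len(v)}")
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # The similarity is undefined when either vector has zero length.
        raise ValueError("cannot compare a vector with an L2-norm of 0")
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (norm_u * norm_v)
```

Orthogonal vectors give 0, parallel vectors give a value close to 1, and opposite vectors give a value close to -1.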
Last updated 2024-05-01 UTC.

mp.tasks.text.TextEmbedder
View source on GitHub: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/python/text/text_embedder.py#L75-L188
Class that performs embedding extraction on text.
    mp.tasks.text.TextEmbedder(
        graph_config: CalculatorGraphConfig
    ) -> None
Args
graph_config
The mediapipe text task graph config proto.

Methods

close
View source: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/python/text/core/base_text_task_api.py#L36-L42
    close() -> None
Shuts down the mediapipe text task instance.
Raises
RuntimeError
If the mediapipe text task failed to close.

cosine_similarity
View source: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/python/text/text_embedder.py#L167-L188
    @classmethod
    cosine_similarity(
        u: mp.tasks.components.containers.Embedding,
        v: mp.tasks.components.containers.Embedding
    ) -> float
Utility function to compute cosine similarity between two embedding entries.

create_from_model_path
View source: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/python/text/text_embedder.py#L98-L116
    @classmethod
    create_from_model_path(
        model_path: str
    ) -> 'TextEmbedder'
Creates a TextEmbedder object from a TensorFlow Lite model and the default TextEmbedderOptions.
Args
model_path
Path to the model.
Returns
TextEmbedder object that's created from the model file and the default TextEmbedderOptions.
Raises
ValueError
If the TextEmbedder object could not be created from the provided file, for example because of an invalid file path.
RuntimeError
If other types of error occurred.

create_from_options
View source: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/python/text/text_embedder.py#L118-L140
    @classmethod
    create_from_options(
        options: mp.tasks.text.TextEmbedderOptions
    ) -> 'TextEmbedder'
Creates the TextEmbedder object from text embedder options.
Args
options
Options for the text embedder task.
Returns
TextEmbedder object that's created from options.
Raises
ValueError
If the TextEmbedder object could not be created from TextEmbedderOptions, for example because the model is missing.
RuntimeError
If other types of error occurred.

embed
View source: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/python/text/text_embedder.py#L142-L165
    embed(
        text: str
    ) -> mp.tasks.audio.AudioEmbedderResult
Performs text embedding extraction on the provided text.
Args
text
The input text.
Returns
An embedding result object that contains a list of embeddings.
Raises
ValueError
If any of the input arguments is invalid.
RuntimeError
If text embedder failed to run.

__enter__
View source: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/python/text/core/base_text_task_api.py#L44-L46
    __enter__()
Returns self upon entering the runtime context.

__exit__
View source: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/python/text/core/base_text_task_api.py#L48-L54
    __exit__(
        unused_exc_type, unused_exc_value, unused_traceback
    )
Shuts down the mediapipe text task instance on exit of the context manager.
Raises
RuntimeError
If the mediapipe text task failed to close.
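The methods documented on this page (create_from_model_path, embed, cosine_similarity, and the context-manager protocol) can be combined roughly as follows. This is a sketch under assumptions: the wrapper `compare_texts` is a hypothetical helper, the model path is supplied by the caller and must point to a compatible TFLite text embedding model, and the sketch assumes the first entry of the result's `embeddings` list is the one of interest.

```python
def compare_texts(model_path: str, first: str, second: str) -> float:
    """Embeds two strings and returns their cosine similarity.

    Hypothetical helper for illustration; error handling is left to the
    caller (ValueError / RuntimeError as documented for each method).
    """
    # Imported lazily so that defining this helper does not require
    # mediapipe to be installed.
    from mediapipe.tasks.python import text

    # The context manager closes the task on exit, as __exit__ describes.
    with text.TextEmbedder.create_from_model_path(model_path) as embedder:
        first_result = embedder.embed(first)
        second_result = embedder.embed(second)

    # Assumption: compare the first embedding of each result.
    return text.TextEmbedder.cosine_similarity(
        first_result.embeddings[0], second_result.embeddings[0]
    )
```

A call such as `compare_texts("path/to/model.tflite", "good morning", "hello there")` would return a float, where the path is a placeholder for a real model file.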