Embedding Drivers

Overview

Embeddings in Griptape are multidimensional representations of text data. Embeddings carry semantic information, which makes them useful for extracting relevant chunks from large bodies of text for search and querying.
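To make the "extracting relevant chunks" idea concrete, here is a minimal sketch of similarity-based ranking. It does not call Griptape or any provider: the three-dimensional vectors are made up for illustration, and `cosine_similarity` and `rank_chunks` are hypothetical helper names, not part of the framework.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query_vector, chunk_vectors):
    """Return chunk names sorted from most to least similar to the query."""
    return sorted(
        chunk_vectors,
        key=lambda name: cosine_similarity(query_vector, chunk_vectors[name]),
        reverse=True,
    )


# Toy vectors standing in for real embeddings (assumed values, not model output).
chunks = {
    "pricing": [0.9, 0.1, 0.0],
    "installation": [0.1, 0.9, 0.2],
    "changelog": [0.0, 0.2, 0.9],
}
# Pretend this is the embedding of a question about installing the library.
query = [0.2, 0.8, 0.1]

ranking = rank_chunks(query, chunks)
print(ranking[0])  # the chunk whose vector points in the most similar direction
```

With real embeddings the vectors would come from an Embedding Driver's embed_string() call, and the ranking step is typically delegated to a vector store rather than done by hand.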

Griptape provides a way to build Embedding Drivers that are reused in downstream framework components. Every Embedding Driver has two basic methods that can be used to generate embeddings:

  • embed_text_artifact() for TextArtifacts.
  • embed_string() for any string.

You can optionally provide a Tokenizer via the tokenizer field to have the Driver automatically chunk the input text to fit into the token limit.
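The chunking behavior described above can be pictured with the following sketch. It is an assumption about the general technique, not Griptape's implementation: a whitespace split stands in for a real Tokenizer, `fake_embed` is a hypothetical placeholder for a provider call, and combining chunk vectors with a length-weighted average followed by normalization is one common choice among several.

```python
import math


def chunk_tokens(tokens, max_tokens):
    """Split a token list into consecutive chunks of at most max_tokens."""
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


def fake_embed(tokens):
    """Hypothetical stand-in for a provider call; returns a 2-d vector."""
    return [float(len(tokens)), float(sum(len(t) for t in tokens))]


def embed_long_text(text, max_tokens):
    """Embed text that may exceed the token limit by chunking it first."""
    tokens = text.split()  # whitespace split stands in for a real tokenizer
    if not tokens:
        raise ValueError("cannot embed empty text")

    chunks = chunk_tokens(tokens, max_tokens)
    vectors = [fake_embed(chunk) for chunk in chunks]
    weights = [len(chunk) for chunk in chunks]
    total = sum(weights)

    # Length-weighted average of the chunk vectors, one dimension at a time.
    averaged = [
        sum(vector[d] * weight for vector, weight in zip(vectors, weights)) / total
        for d in range(len(vectors[0]))
    ]

    # Scale the result to unit length so it is comparable with other embeddings.
    norm = math.sqrt(sum(x * x for x in averaged))
    return [x / norm for x in averaged]


vector = embed_long_text("one two three four five", max_tokens=2)
print(len(vector))
```

Weighting by chunk length keeps a short trailing chunk from counting as much as a full one.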

Embedding Drivers

OpenAI Embeddings

The OpenAiEmbeddingDriver uses the OpenAI Embeddings API.

from griptape.drivers import OpenAiEmbeddingDriver

embeddings = OpenAiEmbeddingDriver().embed_string("Hello Griptape!")

# Display the first 3 values of the embedding vector
print(embeddings[:3])
# Example output: [0.0017853748286142945, 0.006118456833064556, -0.005811543669551611]

Azure OpenAI Embeddings

The AzureOpenAiEmbeddingDriver uses the same parameters as OpenAiEmbeddingDriver with updated defaults.

Bedrock Titan Embeddings

Info

This driver requires the drivers-embedding-amazon-bedrock extra.

The AmazonBedrockTitanEmbeddingDriver uses the Amazon Bedrock Embeddings API.

from griptape.drivers import AmazonBedrockTitanEmbeddingDriver

embeddings = AmazonBedrockTitanEmbeddingDriver().embed_string("Hello world!")

# Display the first 3 values of the embedding vector
print(embeddings[:3])
# Example output: [-0.234375, -0.024902344, -0.14941406]

Hugging Face Hub Embeddings

Info

This driver requires the drivers-embedding-huggingface extra.

The HuggingFaceHubEmbeddingDriver connects to the Hugging Face Hub API. It supports models with the following tasks:

  • feature-extraction

import os
from griptape.drivers import HuggingFaceHubEmbeddingDriver
from griptape.tokenizers import HuggingFaceTokenizer
from transformers import AutoTokenizer

driver = HuggingFaceHubEmbeddingDriver(
    api_token=os.environ["HUGGINGFACE_HUB_ACCESS_TOKEN"],
    model="sentence-transformers/all-MiniLM-L6-v2",
    tokenizer=HuggingFaceTokenizer(
        tokenizer=AutoTokenizer.from_pretrained(
            "sentence-transformers/all-MiniLM-L6-v2"
        )
    ),
)

embeddings = driver.embed_string("Hello world!")

# Display the first 3 values of the embedding vector
print(embeddings[:3])

Multi Model Embedding Drivers

Certain embedding providers, such as Amazon SageMaker, support many types of models, each with slight differences in parameters and response formats. To support this variation across models, these Embedding Drivers take an Embedding Model Driver through the embedding_model_driver parameter. Embedding Model Drivers allow for model-specific customization of Embedding Drivers.
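The split between an Embedding Driver and an Embedding Model Driver can be sketched as below. This is an illustration of the pattern only: every class, method name, and payload shape here is hypothetical and chosen for the example, and the two "endpoints" are local functions rather than real SageMaker calls.

```python
from abc import ABC, abstractmethod


class EmbeddingModelDriver(ABC):
    """Hypothetical interface: converts text to and from one model's payload format."""

    @abstractmethod
    def to_request(self, text):
        """Build the request payload this model expects."""

    @abstractmethod
    def from_response(self, response):
        """Pull the embedding vector out of this model's response."""


class ModelADriver(EmbeddingModelDriver):
    def to_request(self, text):
        return {"text_inputs": text}

    def from_response(self, response):
        return response["embedding"]


class ModelBDriver(EmbeddingModelDriver):
    def to_request(self, text):
        return {"inputs": [text]}

    def from_response(self, response):
        return response["vectors"][0]


class EndpointEmbeddingDriver:
    """Provider-level driver: owns the transport, delegates format details."""

    def __init__(self, model_driver, invoke):
        self.model_driver = model_driver
        self.invoke = invoke  # callable that sends a payload to the endpoint

    def embed_string(self, text):
        payload = self.model_driver.to_request(text)
        return self.model_driver.from_response(self.invoke(payload))


# Fake endpoints with different payload shapes (assumed, for demonstration only).
def endpoint_a(payload):
    return {"embedding": [float(len(payload["text_inputs"]))]}


def endpoint_b(payload):
    return {"vectors": [[float(len(t))] for t in payload["inputs"]]}


driver_a = EndpointEmbeddingDriver(ModelADriver(), endpoint_a)
driver_b = EndpointEmbeddingDriver(ModelBDriver(), endpoint_b)
print(driver_a.embed_string("Hello"), driver_b.embed_string("Hello"))
```

The calling code stays the same for both models; only the object passed as the model driver changes.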

SageMaker Embeddings

The AmazonSageMakerEmbeddingDriver uses Amazon SageMaker Endpoints to generate embeddings on AWS.

Info

This driver requires the drivers-embedding-amazon-sagemaker extra.

TensorFlow Hub Models

import os
from griptape.drivers import AmazonSageMakerEmbeddingDriver, SageMakerTensorFlowHubEmbeddingModelDriver

driver = AmazonSageMakerEmbeddingDriver(
    model=os.environ["SAGEMAKER_TENSORFLOW_HUB_MODEL"],
    embedding_model_driver=SageMakerTensorFlowHubEmbeddingModelDriver(),
)

embeddings = driver.embed_string("Hello world!")

# Display the first 3 values of the embedding vector
print(embeddings[:3])

Hugging Face Models

import os
from griptape.drivers import AmazonSageMakerEmbeddingDriver, SageMakerHuggingFaceEmbeddingModelDriver

driver = AmazonSageMakerEmbeddingDriver(
    model=os.environ["SAGEMAKER_HUGGINGFACE_MODEL"],
    embedding_model_driver=SageMakerHuggingFaceEmbeddingModelDriver(),
)

embeddings = driver.embed_string("Hello world!")

# Display the first 3 values of the embedding vector
print(embeddings[:3])

Override Default Structure Embedding Driver

To override the Embedding Driver that an Agent uses by default, pass your own through the embedding_driver parameter:

```python
from griptape.structures import Agent
from griptape.tools import WebScraper, TaskMemoryClient
from griptape.drivers import LocalVectorStoreDriver, OpenAiEmbeddingDriver

agent = Agent(
    tools=[WebScraper(), TaskMemoryClient(off_prompt=False)],
    embedding_driver=OpenAiEmbeddingDriver(),
)

agent.run("based on https://www.griptape.ai/, tell me what Griptape is")
```