Text to Speech Drivers
Overview
Text to Speech Drivers are used by Text to Speech Engines to build and execute API calls to audio generation models.
Provide a Driver when building an Engine, then pass the Engine to a Tool for use by an Agent.
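The layering described above can be sketched without any network access. The classes below are hypothetical stand-ins, not Griptape's own classes, and their method names are assumptions made for illustration. They only show how a Driver is wrapped by an Engine, which is in turn wrapped by a Tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeTextToSpeechDriver:
    """Hypothetical stand-in for a Driver: builds and 'executes' the model call."""

    model: str = "fake-tts"
    voice: str = "narrator"

    def run_text_to_audio(self, prompts: list[str]) -> bytes:
        # A real driver would call a hosted model; this one just encodes the text.
        payload = f"{self.model}:{self.voice}:" + " ".join(prompts)
        return payload.encode("utf-8")


@dataclass
class FakeTextToSpeechEngine:
    """Hypothetical stand-in for an Engine: delegates generation to its driver."""

    text_to_speech_driver: FakeTextToSpeechDriver

    def run(self, prompts: list[str]) -> bytes:
        return self.text_to_speech_driver.run_text_to_audio(prompts)


@dataclass
class FakeTextToSpeechTool:
    """Hypothetical stand-in for a Tool: the entry point an Agent would call."""

    engine: FakeTextToSpeechEngine

    def text_to_speech(self, text: str) -> bytes:
        return self.engine.run([text])


# Compose the three layers: Driver -> Engine -> Tool.
driver = FakeTextToSpeechDriver(voice="Matilda")
tool = FakeTextToSpeechTool(engine=FakeTextToSpeechEngine(text_to_speech_driver=driver))
audio = tool.text_to_speech("Hello, world!")
print(audio)
```

Because the Engine and the Tool only see the driver through a single method, swapping one provider for another should not require changes in the layers above it.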
Text to Speech Drivers
Eleven Labs
The Eleven Labs Text to Speech Driver provides support for text-to-speech models hosted by Eleven Labs. This Driver supports configurations specific to Eleven Labs, like voice selection and output format.
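Info: This Driver requires the drivers-text-to-speech-elevenlabs extra, which can be installed with, for example, pip install "griptape[drivers-text-to-speech-elevenlabs]".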
import os

from griptape.drivers import ElevenLabsTextToSpeechDriver
from griptape.engines import TextToSpeechEngine
from griptape.structures import Agent
from griptape.tools.text_to_speech.tool import TextToSpeechTool

# Configure the driver with an API key from the environment, a model, and a voice.
driver = ElevenLabsTextToSpeechDriver(
    api_key=os.environ["ELEVEN_LABS_API_KEY"],
    model="eleven_multilingual_v2",
    voice="Matilda",
)

# Wrap the driver in an engine, then expose the engine to the agent as a tool.
tool = TextToSpeechTool(
    engine=TextToSpeechEngine(
        text_to_speech_driver=driver,
    ),
)

Agent(tools=[tool]).run("Generate audio from this text: 'Hello, world!'")
OpenAI
The OpenAI Text to Speech Driver provides support for text-to-speech models hosted by OpenAI. This Driver supports configurations specific to OpenAI, like voice selection and output format.
from griptape.drivers import OpenAiTextToSpeechDriver
from griptape.engines import TextToSpeechEngine
from griptape.structures import Agent
from griptape.tools.text_to_speech.tool import TextToSpeechTool

# Use the driver's default configuration.
driver = OpenAiTextToSpeechDriver()

# Wrap the driver in an engine, then expose the engine to the agent as a tool.
tool = TextToSpeechTool(
    engine=TextToSpeechEngine(
        text_to_speech_driver=driver,
    ),
)

Agent(tools=[tool]).run("Generate audio from this text: 'Hello, world!'")
Azure OpenAI
The Azure OpenAI Text to Speech Driver provides support for text-to-speech models hosted in your Azure OpenAI instance. This Driver supports configurations specific to OpenAI, like voice selection and output format.
import os

from griptape.drivers import AzureOpenAiTextToSpeechDriver
from griptape.engines import TextToSpeechEngine
from griptape.structures import Agent
from griptape.tools.text_to_speech.tool import TextToSpeechTool

# Read the API key and endpoint for the Azure OpenAI instance from the environment.
driver = AzureOpenAiTextToSpeechDriver(
    api_key=os.environ["AZURE_OPENAI_API_KEY_4"],
    model="tts",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT_4"],
)

# Wrap the driver in an engine, then expose the engine to the agent as a tool.
tool = TextToSpeechTool(
    engine=TextToSpeechEngine(
        text_to_speech_driver=driver,
    ),
)

Agent(tools=[tool]).run("Generate audio from this text: 'Hello, world!'")
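The Eleven Labs and Azure OpenAI examples read credentials with os.environ[...], which raises a bare KeyError when a variable is unset. A small helper can fail earlier with a clearer message. This is a sketch only: require_env is a hypothetical helper, not part of Griptape, and the placeholder values exist purely so the snippet runs on its own.

```python
from __future__ import annotations

import os


def require_env(*names: str) -> dict[str, str]:
    """Return the named environment variables, or raise listing all that are missing."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: os.environ[name] for name in names}


# Demonstration only: set placeholders if the variables are not already defined.
os.environ.setdefault("AZURE_OPENAI_API_KEY_4", "placeholder-key")
os.environ.setdefault("AZURE_OPENAI_ENDPOINT_4", "https://example.invalid")

settings = require_env("AZURE_OPENAI_API_KEY_4", "AZURE_OPENAI_ENDPOINT_4")
print(sorted(settings))
```

The returned values could then be passed as api_key and azure_endpoint when constructing the driver.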