Audio Engines
Overview
Audio Engines facilitate audio generation and audio transcription. Each Engine provides a run
method that accepts the inputs required for its particular mode and passes the request to the configured Driver.
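The delegation described above can be sketched without any external service. This is a minimal illustration and does not show Griptape's actual implementation: `TextToSpeechDriver`, `FakeDriver`, `SketchTextToSpeechEngine`, and the `synthesize` method are hypothetical names invented for the example. It shows only the shape of the pattern: the engine checks its inputs and hands the request to whichever driver it was configured with.

```python
from dataclasses import dataclass
from typing import Protocol


class TextToSpeechDriver(Protocol):
    """Hypothetical driver interface: turns prompts into raw audio bytes."""

    def synthesize(self, prompts: list[str]) -> bytes: ...


class FakeDriver:
    """Stand-in driver that needs no network access or API key."""

    def synthesize(self, prompts: list[str]) -> bytes:
        # Encode the joined prompts so the result is easy to inspect.
        return " ".join(prompts).encode("utf-8")


@dataclass
class SketchTextToSpeechEngine:
    """Hypothetical engine: validates inputs, then delegates to its driver."""

    driver: TextToSpeechDriver

    def run(self, prompts: list[str]) -> bytes:
        if not prompts:
            raise ValueError("at least one prompt is required")
        # The engine does no synthesis itself; the driver does the work.
        return self.driver.synthesize(prompts)


engine = SketchTextToSpeechEngine(driver=FakeDriver())
audio = engine.run(prompts=["Hello,", "world!"])
```

Because the engine depends only on the driver's interface, swapping the fake driver for one backed by a real provider would not change the calling code.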
Text to Speech
The Text to Speech Engine synthesizes speech from text inputs.
import os

from griptape.drivers import ElevenLabsTextToSpeechDriver
from griptape.engines import TextToSpeechEngine

# Configure the driver; the API key is read from the environment.
driver = ElevenLabsTextToSpeechDriver(
    api_key=os.environ["ELEVEN_LABS_API_KEY"],
    model="eleven_multilingual_v2",
    voice="Laura",
)

engine = TextToSpeechEngine(
    text_to_speech_driver=driver,
)

# Pass the prompts to the configured driver for synthesis.
engine.run(
    prompts=["Hello, world!"],
)
Audio Transcription
The Audio Transcription Engine facilitates transcribing speech from audio inputs.
from griptape.drivers import OpenAiAudioTranscriptionDriver
from griptape.engines import AudioTranscriptionEngine
from griptape.loaders import AudioLoader

driver = OpenAiAudioTranscriptionDriver(model="whisper-1")

engine = AudioTranscriptionEngine(
    audio_transcription_driver=driver,
)

# Load the audio file into an artifact, then pass it to the engine for transcription.
audio_artifact = AudioLoader().load("tests/resources/sentences.wav")
engine.run(audio_artifact)