Variation Image Generation Engine Tool

This Tool allows LLMs to generate variations of an input image from a text prompt. The input image can be provided either by its file path or by its Task Memory reference.

Referencing an Image by File Path

from griptape.drivers import AmazonBedrockImageGenerationDriver, BedrockStableDiffusionImageGenerationModelDriver
from griptape.engines import VariationImageGenerationEngine
from griptape.structures import Agent
from griptape.tools import VariationImageGenerationTool

# Create a driver configured to use Stable Diffusion via Bedrock.
driver = AmazonBedrockImageGenerationDriver(
    image_generation_model_driver=BedrockStableDiffusionImageGenerationModelDriver(
        style_preset="pixel-art",
    ),
    model="stability.stable-diffusion-xl-v0",
)

# Create an engine configured to use the driver.
engine = VariationImageGenerationEngine(
    image_generation_driver=driver,
)

# Create a tool configured to use the engine.
tool = VariationImageGenerationTool(
    engine=engine,
)

# Create an agent and provide the tool to it.
Agent(tools=[tool]).run(
    "Generate a variation of the image located at tests/resources/mountain.png "
    "depicting a mountain on a winter day"
)
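
Because the prompt refers to the input image by a relative path, the run only works if that file exists relative to the working directory. The following is a minimal sketch, not part of Griptape, of a hypothetical helper named `build_variation_prompt` that checks the path first and then builds a prompt with the same wording as the example above. The helper name, its parameters, and the error message are assumptions made for illustration.

```python
from pathlib import Path


def build_variation_prompt(image_path: str, description: str) -> str:
    """Return a prompt that references an existing image file by its path.

    Hypothetical helper for illustration; it is not a Griptape API.
    """
    path = Path(image_path)
    # Fail early with a clear error instead of letting the agent run
    # against a path that does not exist.
    if not path.is_file():
        raise FileNotFoundError(f"Input image not found: {path}")
    # Use forward slashes so the prompt reads the same on every platform.
    return f"Generate a variation of the image located at {path.as_posix()} {description}"
```

The returned string can be passed to `Agent(tools=[tool]).run(...)` in place of the literal prompt.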

Referencing an Image in Task Memory