Image Query Engines

The Image Query Engine lets you run natural language queries against the contents of images. You choose the provider and model used to answer a query by passing the Engine an Image Query Driver.

All Image Query Drivers default to a max_tokens of 256, which limits the length of the generated answer. You can raise or lower this value to suit your use case and the Image Query Driver you are using.

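from griptape.drivers import OpenAiImageQueryDriver
from griptape.engines import ImageQueryEngine
from griptape.loaders import ImageLoader

# The driver selects the provider and model; max_tokens is set explicitly to the default of 256.
driver = OpenAiImageQueryDriver(model="gpt-4o", max_tokens=256)

# The engine forwards each query to the driver it was given.
engine = ImageQueryEngine(image_query_driver=driver)

# Load the image to query as an artifact.
image_artifact = ImageLoader().load("tests/resources/mountain.png")

# Ask a natural language question about the image and print the answer.
result = engine.run("Describe the weather in the image", [image_artifact])
print(result)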
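To make the relationship between the Engine and its Driver easier to see without an API key, the following is a minimal sketch of the same pattern using stand-in classes. Everything here is hypothetical and is not part of Griptape: `FakeImage`, `StubImageQueryDriver`, and `StubImageQueryEngine` only mimic the shape of the example above, and the word-based truncation is a crude placeholder for a real token limit.

```python
from dataclasses import dataclass


@dataclass
class FakeImage:
    """Hypothetical stand-in for an image artifact; it only carries a name."""
    name: str


@dataclass
class StubImageQueryDriver:
    """Hypothetical driver: echoes its settings instead of calling a provider."""
    model: str
    max_tokens: int = 256  # mirrors the default described above

    def query(self, prompt: str, images: list[FakeImage]) -> str:
        names = ", ".join(image.name for image in images)
        answer = f"[{self.model}] {prompt} ({names})"
        # Assumption: approximate a token limit by keeping at most
        # max_tokens whitespace-separated words.
        return " ".join(answer.split()[: self.max_tokens])


@dataclass
class StubImageQueryEngine:
    """Hypothetical engine: holds a driver and forwards every query to it."""
    image_query_driver: StubImageQueryDriver

    def run(self, prompt: str, images: list[FakeImage]) -> str:
        return self.image_query_driver.query(prompt, images)


image = FakeImage("mountain.png")

# Same engine class, two differently configured drivers.
short = StubImageQueryEngine(StubImageQueryDriver(model="model-a", max_tokens=3))
full = StubImageQueryEngine(StubImageQueryDriver(model="model-b"))

short_answer = short.run("Describe the weather in the image", [image])
full_answer = full.run("Describe the weather in the image", [image])

print(short_answer)  # [model-a] Describe the
print(full_answer)   # [model-b] Describe the weather in the image (mountain.png)
```

The engine code does not change between the two runs; only the driver passed in differs, which is the design the Image Query Engine relies on when you swap providers or adjust max_tokens.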