Image Query Engines
The Image Query Engine allows you to perform natural language queries on the contents of images. You can specify the provider and model used to query the image by providing the Engine with a particular Image Query Driver.
All Image Query Drivers default to a max_tokens of 256. You can tune this value based on your use case and the Image Query Driver you are providing.
from griptape.drivers import OpenAiImageQueryDriver
from griptape.engines import ImageQueryEngine
from griptape.loaders import ImageLoader

# The driver selects the provider and model; max_tokens caps the response length.
driver = OpenAiImageQueryDriver(model="gpt-4o", max_tokens=256)
engine = ImageQueryEngine(image_query_driver=driver)

# Load the image to query as an artifact.
image_artifact = ImageLoader().load("tests/resources/mountain.png")

# Ask a natural language question about the image.
engine.run("Describe the weather in the image", [image_artifact])
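
To make the Engine and Driver split described above concrete, here is a minimal sketch that runs offline. It is not Griptape's implementation: QueryDriver, EchoDriver, and SketchQueryEngine are hypothetical names invented for illustration, and the toy driver makes no model call. It also treats each word as one token, which is only a rough stand-in for how real providers count tokens.

```python
from dataclasses import dataclass
from typing import Protocol


class QueryDriver(Protocol):
    """Hypothetical stand-in for an image query driver interface."""

    max_tokens: int

    def query(self, prompt: str, images: list[bytes]) -> str: ...


@dataclass
class EchoDriver:
    """Toy driver (assumption): no model call, it only reports the request."""

    model: str
    max_tokens: int = 256  # mirrors the default mentioned in this section

    def query(self, prompt: str, images: list[bytes]) -> str:
        words = f"[{self.model}] {prompt} ({len(images)} image(s))".split()
        # Crude stand-in for a token limit: one word counts as one token.
        return " ".join(words[: self.max_tokens])


@dataclass
class SketchQueryEngine:
    """Toy engine (assumption): holds a driver and forwards the query to it."""

    driver: QueryDriver

    def run(self, prompt: str, images: list[bytes]) -> str:
        return self.driver.query(prompt, images)


engine = SketchQueryEngine(driver=EchoDriver(model="toy-model"))
answer = engine.run("Describe the weather in the image", [b"fake-image-bytes"])
print(answer)
```

Because the engine in this sketch only depends on the driver's query method, swapping the provider or model means constructing a different driver and leaving the calling code unchanged. Lowering max_tokens on the toy driver shortens the returned text, loosely mirroring how the real setting limits response length.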