Loaders

Overview

Loaders are used to load textual data from different sources and chunk it into TextArtifacts. Each loader can be used to load a single "document" with load() or multiple documents with load_collection().
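The split between load() and load_collection() can be pictured with a small stand-in class. This is only a sketch: `SketchLoader`, `TextChunk`, the blank-line chunking rule, and the dictionary keyed by source are all assumptions made for illustration, not Griptape's implementation or its return types.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TextChunk:
    """Hypothetical stand-in for a text artifact."""
    value: str


class SketchLoader:
    """Hypothetical loader: one document in, a list of chunks out."""

    def load(self, source: str) -> list[TextChunk]:
        # Treat each blank-line-separated paragraph as one chunk.
        paragraphs = (part.strip() for part in source.split("\n\n"))
        return [TextChunk(part) for part in paragraphs if part]

    def load_collection(self, sources: list[str]) -> dict[str, list[TextChunk]]:
        # Apply load() to every document, keeping the results grouped per source.
        return {source: self.load(source) for source in sources}


loader = SketchLoader()
single = loader.load("first paragraph\n\nsecond paragraph")
many = loader.load_collection(["alpha\n\nbeta", "gamma"])
```

In this sketch, load_collection() is just load() applied to each source, so any chunking behavior configured on the loader applies uniformly to every document.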

Pdf Loader

Inherits from the TextLoader and can be used to load PDFs from a path or from an IO stream:

from griptape.loaders import PdfLoader
import urllib.request

urllib.request.urlretrieve("https://arxiv.org/pdf/1706.03762.pdf", "attention.pdf")

PdfLoader().load("attention.pdf")

urllib.request.urlretrieve("https://arxiv.org/pdf/2201.11903.pdf", "CoT.pdf")

PdfLoader().load_collection(["attention.pdf", "CoT.pdf"])

Sql Loader

Can be used to load data from a SQL database into CsvRowArtifacts:

from griptape.loaders import SqlLoader
from griptape.drivers import SqlDriver

SqlLoader(
    sql_driver=SqlDriver(
        engine_url="sqlite:///:memory:"
    )
).load("SELECT 'foo', 'bar'")

SqlLoader(
    sql_driver=SqlDriver(
        engine_url="sqlite:///:memory:"
    )
).load_collection(["SELECT 'foo', 'bar';", "SELECT 'fizz', 'buzz';"])
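As a rough mental model of "one artifact per result row", the same kind of query can be run with the standard library's sqlite3 module and each row turned into a column-to-value mapping. The helper `query_rows` and the column aliases are hypothetical and chosen for illustration; how Griptape builds CsvRowArtifacts internally is not shown here.

```python
import sqlite3


def query_rows(connection: sqlite3.Connection, query: str) -> list:
    """Run a query and return one {column name: value} dict per result row."""
    cursor = connection.execute(query)
    # cursor.description holds one 7-tuple per column; item 0 is the column name.
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


connection = sqlite3.connect(":memory:")
# Aliases give the literal values predictable column names.
rows = query_rows(connection, "SELECT 'foo' AS first, 'bar' AS second")
connection.close()
```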

Csv Loader

Can be used to load CSV files into CsvRowArtifacts:

import urllib.request
from griptape.loaders import CsvLoader

urllib.request.urlretrieve("https://people.sc.fsu.edu/~jburkardt/data/csv/cities.csv", "cities.csv")

CsvLoader().load(
    "cities.csv"
)

urllib.request.urlretrieve("https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv", "addresses.csv")

CsvLoader().load_collection(
    ["cities.csv", "addresses.csv"]
)
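To see what "one row per artifact" means for a CSV file, the standard library's csv.DictReader produces a similar row-oriented view. The sample data below is made up, and this sketch is not the CsvLoader implementation.

```python
import csv
import io

# Made-up sample data standing in for a downloaded CSV file.
csv_text = "city,state\nAustin,TX\nBoise,ID\n"

# DictReader uses the first line as the header and yields one mapping per row.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [dict(row) for row in reader]
```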

Text Loader

Used to load arbitrary text and text files:

from pathlib import Path
import urllib.request
from griptape.loaders import TextLoader

TextLoader().load(
    "my text"
)

urllib.request.urlretrieve("https://example-files.online-convert.com/document/txt/example.txt", "example.txt")

TextLoader().load(
    Path("example.txt")
)

TextLoader().load_collection(
    ["my text", "my other text", Path("example.txt")]
)

To control how loaded text is split into chunks, you can set a custom tokenizer, a max_tokens value, and a chunker on the TextLoader.
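How a tokenizer, a token budget, and a chunker interact can be sketched in plain Python. Everything here is an assumption for illustration: `whitespace_tokenizer` and `chunk_by_tokens` are hypothetical helpers, and the greedy sentence-packing strategy is one possible approach, not the chunker Griptape ships.

```python
from typing import Callable

Tokenizer = Callable[[str], list]


def whitespace_tokenizer(text: str) -> list:
    # Hypothetical tokenizer: one token per whitespace-separated word.
    return text.split()


def chunk_by_tokens(text: str, tokenizer: Tokenizer, max_tokens: int) -> list:
    """Greedily pack sentences into chunks of at most max_tokens tokens.

    A single sentence longer than max_tokens is kept whole in its own chunk.
    """
    chunks = []
    current = []
    current_tokens = 0
    for sentence in text.split(". "):
        count = len(tokenizer(sentence))
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_tokens + count > max_tokens:
            chunks.append(". ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += count
    if current:
        chunks.append(". ".join(current))
    return chunks


chunks = chunk_by_tokens("a b c. d e. f g h i", whitespace_tokenizer, max_tokens=5)
```

Swapping in a different tokenizer changes how many tokens each sentence costs, which in turn moves the chunk boundaries even when max_tokens stays the same.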

Web Loader

Inherits from the TextLoader and can be used to load web pages:

from griptape.loaders import WebLoader

WebLoader().load(
    "https://www.griptape.ai"
)

WebLoader().load_collection(
    ["https://www.griptape.ai", "https://docs.griptape.ai"]
)
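Loading a web page as text implies stripping markup before the text is chunked. How WebLoader does this is not described here; the following stdlib-only sketch shows the general idea, with `TextExtractor` and `html_to_text` as hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return "\n".join(extractor.parts)


page = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
)
text = html_to_text(page)
```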

Image Loader

The Image Loader is used to load an image from the filesystem, returning an ImageArtifact.

from griptape.loaders import ImageLoader


image_artifact = ImageLoader().load("tests/assets/mountain.png")
image_artifacts = ImageLoader().load_collection(paths=["tests/assets/mountain.png", "tests/assets/mountain.jpg"])

By default, the Image Loader ensures that all images are in PNG format. An image loaded in another format (for example, JPEG) is converted to PNG. Other output formats are supported through the format field.

from griptape.loaders import ImageLoader


# Image data in the artifact will be in JPEG format.
image_artifact_jpg = ImageLoader(format="JPEG").load("tests/assets/mountain.png")
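One way to check which format a blob of image bytes is actually in is to inspect its leading signature bytes. This sketch works on raw bytes only and assumes you already have them; `sniff_image_format` is a hypothetical helper, not part of Griptape.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8\xff"      # start-of-image marker plus next marker prefix


def sniff_image_format(data: bytes) -> str:
    """Return "PNG", "JPEG", or "unknown" based on the leading bytes."""
    if data.startswith(PNG_SIGNATURE):
        return "PNG"
    if data.startswith(JPEG_SIGNATURE):
        return "JPEG"
    return "unknown"


# Truncated headers are enough for signature sniffing.
png_header = PNG_SIGNATURE + b"\x00\x00\x00\rIHDR"
jpeg_header = JPEG_SIGNATURE + b"\xe0\x00\x10JFIF"
```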