Loaders
Overview
Loaders load textual data from different sources and chunk it into TextArtifacts. Each loader can load a single "document" with load() or multiple documents with load_collection().
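To make the difference between load() and load_collection() concrete, here is a minimal standard-library sketch of the pattern. SketchTextLoader, its paragraph-based chunking, and the dictionary keyed by source are all assumptions made for illustration; they are not Griptape classes and may not match how Griptape structures its return values.

```python
from pathlib import Path
from typing import Union

Source = Union[str, Path]


class SketchTextLoader:
    """Hypothetical stand-in for a loader; not part of Griptape."""

    def load(self, source: Source) -> list[str]:
        # A Path is read from disk; a plain string is treated as the text itself.
        text = source.read_text() if isinstance(source, Path) else source
        # One "chunk" per non-empty paragraph, standing in for TextArtifacts.
        return [part.strip() for part in text.split("\n\n") if part.strip()]

    def load_collection(self, sources: list[Source]) -> dict[str, list[str]]:
        # Apply load() to every source and key the results by source (an assumption).
        return {str(source): self.load(source) for source in sources}


loader = SketchTextLoader()
single = loader.load("first paragraph\n\nsecond paragraph")
many = loader.load_collection(["alpha", "beta\n\ngamma"])
print(single)
print(many)
```

The point of the sketch is only the shape of the interface: load() handles one source, and load_collection() repeats the same work for each source in a list.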
Pdf Loader
Inherits from the TextLoader and can be used to load PDFs from a path or from an IO stream:
import urllib.request

from griptape.loaders import PdfLoader

# Download a sample PDF and load it as a single document
urllib.request.urlretrieve("https://arxiv.org/pdf/1706.03762.pdf", "attention.pdf")
PdfLoader().load("attention.pdf")

# Save a second file (the same URL is reused here) and load both as a collection
urllib.request.urlretrieve("https://arxiv.org/pdf/1706.03762.pdf", "CoT.pdf")
PdfLoader().load_collection(["attention.pdf", "CoT.pdf"])
Sql Loader
Can be used to load data from a SQL database into CsvRowArtifacts:
from griptape.drivers import SqlDriver
from griptape.loaders import SqlLoader

# Load the rows returned by a single query
SqlLoader(
    sql_driver=SqlDriver(
        engine_url="sqlite:///:memory:"
    )
).load("SELECT 'foo', 'bar'")

# Load the rows returned by several queries
SqlLoader(
    sql_driver=SqlDriver(
        engine_url="sqlite:///:memory:"
    )
).load_collection(["SELECT 'foo', 'bar';", "SELECT 'fizz', 'buzz';"])
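For readers who want to see what "one artifact per row" could look like without installing Griptape, the following sketch runs similar queries against an in-memory SQLite database using only the standard library. The query_rows helper, the col_a/col_b/word aliases, and the dict-per-row representation are assumptions for illustration, not the SqlLoader implementation.

```python
import sqlite3


def query_rows(connection: sqlite3.Connection, query: str) -> list[dict[str, str]]:
    """Hypothetical helper: run a query and return one dict per result row."""
    cursor = connection.execute(query)
    # cursor.description holds one entry per result column; index 0 is its name.
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


connection = sqlite3.connect(":memory:")

# Single query, analogous to load().
rows = query_rows(connection, "SELECT 'foo' AS col_a, 'bar' AS col_b")

# Several queries, analogous to load_collection(); keyed by query text here.
collection = {
    query: query_rows(connection, query)
    for query in ["SELECT 'foo' AS word", "SELECT 'fizz' AS word"]
}
connection.close()

print(rows)
print(collection)
```

Explicit column aliases are used so that each row maps readable column names to values.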
Csv Loader
Can be used to load CSV files into CsvRowArtifacts:
import urllib.request

from griptape.loaders import CsvLoader

# Download a sample CSV file and load it as a single document
urllib.request.urlretrieve("https://people.sc.fsu.edu/~jburkardt/data/csv/cities.csv", "cities.csv")
CsvLoader().load(
    "cities.csv"
)

# Download a second CSV file and load both as a collection
urllib.request.urlretrieve("https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv", "addresses.csv")
CsvLoader().load_collection(
    ["cities.csv", "addresses.csv"]
)
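As a rough illustration of row-wise CSV loading, the sketch below parses a small in-memory CSV with the standard library and produces one mapping per data row. The sample data and the load_rows helper are hypothetical; the sample is not the contents of cities.csv, and this is not how CsvLoader is implemented.

```python
import csv
import io

# Hypothetical sample data; not the contents of cities.csv.
SAMPLE_CSV = "name,state\nAustin,TX\nBoise,ID\n"


def load_rows(csv_text: str) -> list[dict[str, str]]:
    """Hypothetical helper: return one dict per data row."""
    # DictReader uses the header row for keys and yields one dict per data row.
    return list(csv.DictReader(io.StringIO(csv_text)))


rows = load_rows(SAMPLE_CSV)
for row in rows:
    print(row)
```

Each dict here plays the role that a CsvRowArtifact plays in the loader output: a single row with its column names attached.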
Text Loader
Used to load arbitrary text and text files:
import urllib.request
from pathlib import Path

from griptape.loaders import TextLoader

# Load a raw string
TextLoader().load(
    "my text"
)

# Download a sample text file and load it from its path
urllib.request.urlretrieve("https://example-files.online-convert.com/document/txt/example.txt", "example.txt")
TextLoader().load(
    Path("example.txt")
)

# Load a mix of raw strings and file paths as a collection
TextLoader().load_collection(
    ["my text", "my other text", Path("example.txt")]
)
You can also set a custom tokenizer, a max_tokens value, and a chunker to control how the loaded text is split into chunks.
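To show how a tokenizer and a max_tokens limit can interact during chunking, here is a minimal standard-library sketch. The whitespace_tokenizer and chunk_text functions are hypothetical and deliberately simplistic; Griptape's own tokenizers and chunkers are likely to count tokens and choose split points differently.

```python
from typing import Callable


def whitespace_tokenizer(text: str) -> list[str]:
    # Hypothetical tokenizer: one token per whitespace-separated word.
    return text.split()


def chunk_text(
    text: str,
    max_tokens: int,
    tokenizer: Callable[[str], list[str]] = whitespace_tokenizer,
) -> list[str]:
    """Hypothetical chunker: split text into pieces of at most max_tokens tokens."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    tokens = tokenizer(text)
    # Slice the token list into consecutive windows of at most max_tokens tokens.
    return [
        " ".join(tokens[start:start + max_tokens])
        for start in range(0, len(tokens), max_tokens)
    ]


chunks = chunk_text("one two three four five", max_tokens=2)
print(chunks)
```

Lowering max_tokens yields more, smaller chunks, and swapping the tokenizer changes what counts as a token. Those are the two knobs this sketch is meant to illustrate.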
Web Loader
Inherits from the TextLoader and can be used to load web pages: