RAG Engines
Note: This section is a work in progress.
RagEngine is an abstraction for implementing modular retrieval-augmented generation (RAG) pipelines.
RAG Stages
RagEngines consist of three stages: QueryRagStage, RetrievalRagStage, and ResponseRagStage. These stages are always executed sequentially. Each stage comprises multiple modules, which can be executed in a customized manner. Due to this unique structure, RagEngines are not intended to replace Workflows or Pipelines.
- QueryRagStage is used for modifying user queries.
- RetrievalRagStage is used for retrieving and re-ranking text chunks.
- ResponseRagStage is used for generating responses.
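The stage structure described above can be sketched in plain Python. This is a conceptual model only, not Griptape's actual implementation; all class, function, and variable names below (`Stage`, `Engine`, the toy modules) are hypothetical:

```python
# Conceptual sketch of a three-stage RAG pipeline (hypothetical names,
# not Griptape's real classes): each stage runs its modules in order,
# and the stages themselves always execute sequentially.
class Stage:
    def __init__(self, modules):
        self.modules = modules

    def run(self, context):
        for module in self.modules:
            context = module(context)  # each module transforms the shared context
        return context


class Engine:
    def __init__(self, query_stage, retrieval_stage, response_stage):
        # Stages are fixed and always run in this order.
        self.stages = [query_stage, retrieval_stage, response_stage]

    def process(self, context):
        for stage in self.stages:
            context = stage.run(context)
        return context


# Toy modules: normalize the query, "retrieve" matching chunks, build a response.
def normalize_query(ctx):
    ctx["query"] = ctx["query"].strip().lower()
    return ctx


def retrieve_chunks(ctx):
    corpus = ["rag basics", "vector stores"]
    ctx["chunks"] = [c for c in corpus if ctx["query"] in c]
    return ctx


def build_response(ctx):
    ctx["output"] = " | ".join(ctx["chunks"]) or "no results"
    return ctx


engine = Engine(Stage([normalize_query]), Stage([retrieve_chunks]), Stage([build_response]))
result = engine.process({"query": "  RAG Basics  "})
```

Because each stage is just an ordered list of modules operating on a shared context, swapping in a different retrieval or response strategy means swapping a module, not rewriting the pipeline.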
RAG Modules
RAG modules are used to implement concrete actions in the RAG pipeline. RagEngine enables developers to easily add new modules to experiment with novel RAG strategies.
Query Modules
- TranslateQueryRagModule is for translating the query into another language.
Retrieval/Rerank Modules
- TextChunksRerankRagModule is for re-ranking retrieved results.
- TextLoaderRetrievalRagModule is for retrieving data with text loaders in real time.
- VectorStoreRetrievalRagModule is for retrieving text chunks from a vector store.
Response Modules
- PromptResponseRagModule is for generating responses based on retrieved text chunks.
- TextChunksResponseRagModule is for responding with retrieved text chunks.
- FootnotePromptResponseRagModule is for responding with automatic footnotes generated from text chunk references.
RAG Context
RagContext is a container object for passing around queries, text chunks, module configs, and other metadata. RagContext is modified by modules when appropriate. Some modules support runtime config overrides through RagContext.module_configs.
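As a rough mental model of how per-module overrides work, consider the plain-Python sketch below. The `Context` dataclass and `retrieval_module` function are hypothetical stand-ins for illustration, not Griptape's real RagContext API:

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for a RAG context: a mutable container that modules
# read from and write to as the pipeline runs.
@dataclass
class Context:
    query: str
    text_chunks: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    # Per-module runtime overrides, keyed by module name.
    module_configs: dict = field(default_factory=dict)


def retrieval_module(ctx, name="MyRetriever", query_params=None):
    # Merge the module's default params with any runtime overrides that were
    # supplied via the context under this module's name.
    overrides = ctx.module_configs.get(name, {}).get("query_params", {})
    params = {**(query_params or {}), **overrides}
    # A real module would query a vector store here; we just record the params.
    ctx.text_chunks = [f"chunk from namespace {params.get('namespace', 'default')}"]
    return ctx


ctx = Context(
    query="what is rag?",
    module_configs={"MyRetriever": {"query_params": {"namespace": "griptape"}}},
)
ctx = retrieval_module(ctx, query_params={"top_n": 20})
```

The module keeps its construction-time defaults (`top_n`) while the caller injects a `namespace` at runtime through the context, mirroring how the full example below passes a namespace via `module_configs`.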
Example
The following example shows a simple RAG pipeline that translates incoming queries into English, retrieves data from a local vector store, and generates a response:
```python
from griptape.drivers import LocalVectorStoreDriver, OpenAiChatPromptDriver, OpenAiEmbeddingDriver
from griptape.engines.rag import RagContext, RagEngine
from griptape.engines.rag.modules import PromptResponseRagModule, TranslateQueryRagModule, VectorStoreRetrievalRagModule
from griptape.engines.rag.stages import QueryRagStage, ResponseRagStage, RetrievalRagStage
from griptape.loaders import WebLoader
from griptape.rules import Rule, Ruleset

prompt_driver = OpenAiChatPromptDriver(model="gpt-4o", temperature=0)

vector_store = LocalVectorStoreDriver(embedding_driver=OpenAiEmbeddingDriver())

artifacts = WebLoader(max_tokens=500).load("https://www.griptape.ai")

vector_store.upsert_text_artifacts(
    {
        "griptape": artifacts,
    }
)

rag_engine = RagEngine(
    query_stage=QueryRagStage(
        query_modules=[TranslateQueryRagModule(prompt_driver=prompt_driver, language="english")]
    ),
    retrieval_stage=RetrievalRagStage(
        max_chunks=5,
        retrieval_modules=[
            VectorStoreRetrievalRagModule(
                name="MyAwesomeRetriever", vector_store_driver=vector_store, query_params={"top_n": 20}
            )
        ],
    ),
    response_stage=ResponseRagStage(
        response_modules=[
            PromptResponseRagModule(
                prompt_driver=prompt_driver,
                rulesets=[Ruleset(name="persona", rules=[Rule("Talk like a pirate")])],
            )
        ]
    ),
)

rag_context = RagContext(
    query="¿Qué ofrecen los servicios en la nube de Griptape?",
    module_configs={"MyAwesomeRetriever": {"query_params": {"namespace": "griptape"}}},
)

print(rag_engine.process(rag_context).outputs[0].to_text())
```