RAG Engines

Note

This section is a work in progress.

RagEngine is an abstraction for implementing modular retrieval-augmented generation (RAG) pipelines.

RAG Stages

RagEngines consist of three stages: QueryRagStage, RetrievalRagStage, and ResponseRagStage. The stages always execute sequentially; each stage comprises one or more modules and controls how those modules are executed. Because of this structure, RagEngines are not intended to replace Workflows or Pipelines. A minimal engine is sketched after the list below.

  • QueryRagStage is used for modifying user queries.
  • RetrievalRagStage is used for retrieving and re-ranking text chunks.
  • ResponseRagStage is used for generating responses.
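
As a minimal sketch, assuming query_stage and retrieval_stage are optional and may be omitted (the full example at the end of this page wires up all three), an engine can be built from a response stage alone:

from griptape.drivers import OpenAiChatPromptDriver
from griptape.engines.rag import RagContext, RagEngine
from griptape.engines.rag.modules import PromptResponseRagModule
from griptape.engines.rag.stages import ResponseRagStage

# Response-only engine: no query mutation and no retrieval, so the query
# goes straight to the response module.
engine = RagEngine(
    response_stage=ResponseRagStage(
        response_modules=[PromptResponseRagModule(prompt_driver=OpenAiChatPromptDriver(model="gpt-4o"))]
    ),
)

print(engine.process(RagContext(query="What is RAG?")).outputs[0].to_text())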

RAG Modules

RAG modules are used to implement concrete actions in the RAG pipeline. RagEngine enables developers to easily add new modules to experiment with novel RAG strategies.
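
As a sketch of what a new module might look like, the following subclasses BaseQueryRagModule, assuming query modules implement a run(context) method that returns the (possibly modified) RagContext, mirroring TranslateQueryRagModule. LowercaseQueryRagModule is a hypothetical name:

from griptape.engines.rag import RagContext
from griptape.engines.rag.modules import BaseQueryRagModule


class LowercaseQueryRagModule(BaseQueryRagModule):
    # Hypothetical module: normalizes the query to lowercase before retrieval.
    def run(self, context: RagContext) -> RagContext:
        context.query = context.query.lower()
        return context


# The custom module plugs into a stage like any built-in one:
# QueryRagStage(query_modules=[LowercaseQueryRagModule()])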

Query Modules

  • TranslateQueryRagModule is for translating the query into another language.

Retrieval/Rerank Modules

  • TextChunksRerankRagModule is for re-ranking retrieved results (see the sketch after this list).
  • TextLoaderRetrievalRagModule is for retrieving data with text loaders in real time.
  • VectorStoreRetrievalRagModule is for retrieving text chunks from a vector store.
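
One possible wiring of retrieval plus re-ranking, sketched under the assumption that RetrievalRagStage accepts a rerank_module and that Cohere's reranker is used via CohereRerankDriver (the driver choice and the COHERE_API_KEY environment variable are illustrative assumptions):

import os

from griptape.drivers import CohereRerankDriver, LocalVectorStoreDriver, OpenAiEmbeddingDriver
from griptape.engines.rag.modules import TextChunksRerankRagModule, VectorStoreRetrievalRagModule
from griptape.engines.rag.stages import RetrievalRagStage

vector_store = LocalVectorStoreDriver(embedding_driver=OpenAiEmbeddingDriver())

# Retrieve candidate chunks from the vector store, then re-rank them before
# they reach the response stage.
retrieval_stage = RetrievalRagStage(
    retrieval_modules=[VectorStoreRetrievalRagModule(vector_store_driver=vector_store)],
    rerank_module=TextChunksRerankRagModule(
        rerank_driver=CohereRerankDriver(api_key=os.environ["COHERE_API_KEY"])
    ),
)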

Response Modules

  • PromptResponseRagModule is for generating responses based on retrieved text chunks.
  • TextChunksResponseRagModule is for responding with retrieved text chunks (see the sketch after this list).
  • FootnotePromptResponseRagModule is for responding with automatic footnotes from text chunk references.
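
A sketch of a response stage that skips LLM generation and returns the retrieved chunks verbatim, assuming TextChunksResponseRagModule can be constructed with defaults; this can be handy for inspecting what the retrieval stage actually found:

from griptape.engines.rag.modules import TextChunksResponseRagModule
from griptape.engines.rag.stages import ResponseRagStage

# Echo the retrieved chunks as the engine's outputs instead of generating a response.
response_stage = ResponseRagStage(response_modules=[TextChunksResponseRagModule()])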

RAG Context

RagContext is a container object for passing queries, text chunks, module configs, and other metadata between stages. Modules modify the RagContext as it flows through the pipeline. Some modules also support runtime config overrides through RagContext.module_configs, keyed by module name.
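
A sketch of the override mechanism (the module name "MyAwesomeRetriever" is arbitrary but must match the name given to the module; the full example below uses the same pattern):

from griptape.engines.rag import RagContext

# Scope the module named "MyAwesomeRetriever" to one namespace for this run only.
context = RagContext(
    query="What does Griptape offer?",
    module_configs={"MyAwesomeRetriever": {"query_params": {"namespace": "griptape"}}},
)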

Example

The following example shows a simple RAG pipeline that translates incoming queries into English, retrieves data from a local vector store, and generates a response:

from griptape.drivers import LocalVectorStoreDriver, OpenAiChatPromptDriver, OpenAiEmbeddingDriver
from griptape.engines.rag import RagContext, RagEngine
from griptape.engines.rag.modules import PromptResponseRagModule, TranslateQueryRagModule, VectorStoreRetrievalRagModule
from griptape.engines.rag.stages import QueryRagStage, ResponseRagStage, RetrievalRagStage
from griptape.loaders import WebLoader
from griptape.rules import Rule, Ruleset

# Prompt driver shared by the query-translation and response modules.
prompt_driver = OpenAiChatPromptDriver(model="gpt-4o", temperature=0)

# Load and chunk the Griptape homepage, then index the chunks under the "griptape" namespace.
vector_store = LocalVectorStoreDriver(embedding_driver=OpenAiEmbeddingDriver())
artifacts = WebLoader(max_tokens=500).load("https://www.griptape.ai")

vector_store.upsert_text_artifacts(
    {
        "griptape": artifacts,
    }
)

# Wire the three stages together; the retrieval module is named so it can be
# targeted by per-run overrides via RagContext.module_configs.
rag_engine = RagEngine(
    query_stage=QueryRagStage(query_modules=[TranslateQueryRagModule(prompt_driver=prompt_driver, language="english")]),
    retrieval_stage=RetrievalRagStage(
        max_chunks=5,
        retrieval_modules=[
            VectorStoreRetrievalRagModule(
                name="MyAwesomeRetriever", vector_store_driver=vector_store, query_params={"top_n": 20}
            )
        ],
    ),
    response_stage=ResponseRagStage(
        response_modules=[
            PromptResponseRagModule(
                prompt_driver=prompt_driver, rulesets=[Ruleset(name="persona", rules=[Rule("Talk like a pirate")])]
            )
        ]
    ),
)

# The Spanish query is translated to English by the query stage; module_configs
# scopes retrieval to the "griptape" namespace for this run.
rag_context = RagContext(
    query="¿Qué ofrecen los servicios en la nube de Griptape?",
    module_configs={"MyAwesomeRetriever": {"query_params": {"namespace": "griptape"}}},
)

print(rag_engine.process(rag_context).outputs[0].to_text())