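Building Agentic RAG Systems

You can follow along with the code in this notebook and run it using Google Colab.

Retrieval Augmented Generation (RAG) systems combine data retrieval with generative models to produce context-aware responses. For example, a user's query is passed to a search engine, and the retrieved results are given to the model together with the query. The model then generates a response based on both the query and the retrieved information.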
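The single-pass flow described above can be sketched in a few lines. This is only an illustration: the corpus, the word-overlap scoring rule, and the prompt template are assumptions made for the example, and a real system would send the final prompt to a language model.

```python
# Toy single-pass RAG: retrieve once, then build a prompt for the model.
# The corpus, scoring rule, and prompt template are illustrative assumptions.

def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {word.strip(".,!?").lower() for word in text.split()}

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many query words they share and keep the top k."""
    query_words = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Combine the retrieved passages and the query into one model input."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "Gold accents and velvet curtains suit a luxury party.",
    "A professional DJ can play superhero theme music.",
    "Tomatoes grow best in full sun.",
]
query = "What decor suits a luxury party?"
passages = retrieve(query, corpus)
prompt = build_prompt(query, passages)
print(prompt)
```

Note that retrieval happens exactly once and uses the user's wording unchanged, which is the limitation the agentic approach is meant to address.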

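Agentic RAG (Agentic Retrieval-Augmented Generation) extends traditional RAG systems by combining autonomous agents with dynamic knowledge retrieval.

While traditional RAG systems use an LLM to answer queries based on retrieved data, Agentic RAG lets the agent intelligently control both the retrieval and the generation process, improving efficiency and accuracy.

Traditional RAG systems face some key limitations: they rely on a single retrieval step and focus on direct semantic similarity with the user's query, which can cause relevant information to be overlooked.

Agentic RAG addresses these issues by allowing the agent to formulate search queries on its own, critique the retrieved results, and carry out multiple retrieval steps, yielding more tailored and more comprehensive output.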
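As a rough sketch of that loop, the example below formulates a query, searches, checks whether enough has been gathered, and repeats if not. In a real system an LLM would make the decisions inside `reformulate` and `is_sufficient`; here they are hard-coded stand-ins, and `KNOWLEDGE`, `search`, and `agentic_retrieve` are hypothetical names introduced only for illustration.

```python
# Sketch of an agentic retrieval loop with hard-coded stand-ins for the
# decisions an LLM-driven agent would normally make.

KNOWLEDGE = {
    "superhero party decor": ["Project the Gotham skyline onto the walls."],
    "luxury party catering": ["Serve a tasting menu named after superheroes."],
}

def search(query: str) -> list[str]:
    """Hypothetical search backend: exact-match lookup in a tiny dictionary."""
    return KNOWLEDGE.get(query, [])

def reformulate(request: str, step: int) -> str:
    """Stand-in for the agent writing its own search query at each step.

    A real agent would derive the query from `request` and earlier results.
    """
    planned_queries = ["superhero party decor", "luxury party catering"]
    return planned_queries[step % len(planned_queries)]

def is_sufficient(results: list[str], needed: int) -> bool:
    """Stand-in for the agent judging whether it has enough evidence."""
    return len(results) >= needed

def agentic_retrieve(request: str, needed: int = 2, max_steps: int = 5) -> list[str]:
    gathered: list[str] = []
    for step in range(max_steps):
        query = reformulate(request, step)
        for hit in search(query):
            if hit not in gathered:  # skip duplicates across steps
                gathered.append(hit)
        if is_sufficient(gathered, needed):
            break  # stop early once the evidence looks complete
    return gathered

results = agentic_retrieve("Plan a luxury superhero party")
print(results)
```

The `max_steps` cap is a design choice worth keeping in any real loop, since it bounds cost when the agent never judges its evidence sufficient.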

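Basic Retrieval with DuckDuckGo

Let's build a simple agent that can search the web using DuckDuckGo. This agent will retrieve information and synthesize a response to answer the query. With Agentic RAG, Alfred's agent can:

Search for the latest superhero party trends
Refine the results to include luxury elements
Synthesize the information into a complete plan

Here is how Alfred's agent achieves this: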

from smolagents import CodeAgent, DuckDuckGoSearchTool, InferenceClientModel

# Initialize the search tool
search_tool = DuckDuckGoSearchTool()

# Initialize the model
model = InferenceClientModel()

agent = CodeAgent(
    model=model,
    tools=[search_tool],
)

# Example usage
response = agent.run(
    "Search for luxury superhero-themed party ideas, including decorations, entertainment, and catering."
)
print(response)

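The agent follows this process:

Analyzes the request: Alfred's agent identifies the key elements of the query, namely luxury superhero-themed party planning with a focus on decorations, entertainment, and catering.
Performs retrieval: The agent uses DuckDuckGo to search for the most relevant and up-to-date information, making sure it matches Alfred's refined preferences for a luxurious event.
Synthesizes information: After gathering the results, the agent processes them into a cohesive, actionable plan for Alfred that covers every aspect of the party.
Stores for future reference: The agent stores the retrieved information so it is easy to access when planning future events, making subsequent tasks more efficient.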

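Custom Knowledge Base Tool

For specialized tasks, a custom knowledge base can be invaluable. Let's create a tool that queries a vector database of technical documentation or specialized knowledge. Using semantic search, the agent can find the information most relevant to Alfred's needs.

A vector database stores numerical representations (embeddings) of text or other data, created by machine learning models. It enables semantic search by identifying similar meanings in a high-dimensional space.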
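To make "similar meanings in a high-dimensional space" concrete, here is a minimal sketch of nearest-neighbor lookup by cosine similarity. The three-dimensional vectors are hand-written assumptions for illustration; real embeddings come from a model and typically have hundreds of dimensions, and a real vector database uses an index rather than a linear scan.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made stand-ins for embeddings (assumed values, not model output).
index = {
    "velvet curtains and gold accents": [0.9, 0.1, 0.0],
    "themed DJ playlist": [0.1, 0.9, 0.1],
    "superhero tasting menu": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the query "elegant decorations".
query_embedding = [0.8, 0.2, 0.1]

# Pick the stored text whose vector points in the most similar direction.
best = max(index, key=lambda text: cosine_similarity(query_embedding, index[text]))
print(best)
```

The query shares no words with the best match; the match comes entirely from the vectors pointing in a similar direction, which is what distinguishes semantic search from keyword search.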

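This approach combines predefined knowledge with semantic search to provide context-aware solutions for event planning. With access to specialized knowledge, Alfred can perfect every detail of the party.

In this example, we'll create a tool that retrieves party planning ideas from a custom knowledge base. We'll use a BM25 retriever to search the knowledge base and return the top results, and RecursiveCharacterTextSplitter to split the documents into smaller chunks for more efficient search. Note that BM25 ranks documents by keyword overlap rather than by embedding similarity, so here it stands in for the embedding-based semantic search described above.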

from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.retrievers import BM25Retriever
from smolagents import CodeAgent, InferenceClientModel, Tool

class PartyPlanningRetrieverTool(Tool):
    name = "party_planning_retriever"
    description = "Uses semantic search to retrieve relevant party planning ideas for Alfred’s superhero-themed party at Wayne Manor."
    inputs = {
        "query": {
            "type": "string",
            "description": "The query to perform. This should be a query related to party planning or superhero themes.",
        }
    }
    output_type = "string"

    def __init__(self, docs, **kwargs):
        super().__init__(**kwargs)
        self.retriever = BM25Retriever.from_documents(
            docs, k=5  # Retrieve the top 5 documents
        )

    def forward(self, query: str) -> str:
        assert isinstance(query, str), "Your search query must be a string"

        docs = self.retriever.invoke(
            query,
        )
        return "\nRetrieved ideas:\n" + "".join(
            [
                f"\n\n===== Idea {str(i)} =====\n" + doc.page_content
                for i, doc in enumerate(docs)
            ]
        )

# Simulate a knowledge base about party planning
party_ideas = [
    {"text": "A superhero-themed masquerade ball with luxury decor, including gold accents and velvet curtains.", "source": "Party Ideas 1"},
    {"text": "Hire a professional DJ who can play themed music for superheroes like Batman and Wonder Woman.", "source": "Entertainment Ideas"},
    {"text": "For catering, serve dishes named after superheroes, like 'The Hulk's Green Smoothie' and 'Iron Man's Power Steak.'", "source": "Catering Ideas"},
    {"text": "Decorate with iconic superhero logos and projections of Gotham and other superhero cities around the venue.", "source": "Decoration Ideas"},
    {"text": "Interactive experiences with VR where guests can engage in superhero simulations or compete in themed games.", "source": "Entertainment Ideas"}
]

source_docs = [
    Document(page_content=doc["text"], metadata={"source": doc["source"]})
    for doc in party_ideas
]

# Split the documents into smaller chunks for more efficient search
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    add_start_index=True,
    strip_whitespace=True,
    separators=["\n\n", "\n", ".", " ", ""],
)
docs_processed = text_splitter.split_documents(source_docs)

# Create the retriever tool
party_planning_retriever = PartyPlanningRetrieverTool(docs_processed)

# Initialize the agent
agent = CodeAgent(tools=[party_planning_retriever], model=InferenceClientModel())

# Example usage
response = agent.run(
    "Find ideas for a luxury superhero-themed party, including entertainment, catering, and decoration options."
)

print(response)

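This enhanced agent can:

First check the documentation for relevant information
Combine insights from the knowledge base
Maintain conversation context in memory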
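To show what a BM25 retriever is doing when it ranks the knowledge base, here is a minimal pure-Python sketch of one common form of the Okapi BM25 score. The parameter defaults (`k1=1.5`, `b=0.75`), the whitespace tokenizer, and the smoothed IDF formula are assumptions for this example; the library behind `BM25Retriever` may differ in these details.

```python
import math
from collections import Counter

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with a common BM25 variant."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_counts[term]
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            # Smoothed IDF that stays non-negative (an assumed variant).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term-frequency saturation with document-length normalization.
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores

docs = [
    "hire a professional dj for superhero themed music",
    "serve dishes named after superheroes",
    "decorate with superhero logos and gotham projections",
]
scores = bm25_scores("dj music", docs)
best_index = scores.index(max(scores))
print(best_index, scores)

# A paraphrase with no shared words matches nothing: BM25 is lexical.
print(bm25_scores("tunes", docs))
```

Because scoring depends on exact token matches, the agent's ability to rephrase its own queries matters more with a BM25 retriever than with an embedding-based one.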

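Enhanced Retrieval Capabilities

When building Agentic RAG systems, the agent can employ sophisticated strategies such as:

Query Reformulation: Instead of using the raw user query, the agent can craft optimized search terms that better match the target documents.
Query Decomposition: If the user query contains several pieces of information to look up, it can be broken into multiple queries rather than used directly.
Query Expansion: Similar to query reformulation, but done several times, phrasing the query in multiple ways and then searching with all of them.
Reranking: Using cross-encoders to assign more comprehensive, semantically informed relevance scores between the retrieved documents and the search query.
Multi-Step Retrieval: The agent can perform multiple searches, using initial results to inform subsequent queries.
Source Integration: Information can be combined from multiple sources, such as web search and local documentation.
Result Validation: Retrieved content can be analyzed for relevance and accuracy before being included in the response.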
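Two of the strategies above, query decomposition and query expansion, can be sketched together with a simple merge step. Everything here is a stand-in: an agent would normally use an LLM to split and rephrase queries, whereas `decompose` just splits on " and ", `expand` swaps in hand-listed synonyms, and `keyword_search` is a hypothetical toy retriever.

```python
def decompose(query: str) -> list[str]:
    """Stand-in for LLM-driven decomposition: split on ' and '."""
    return [part.strip() for part in query.split(" and ") if part.strip()]

def expand(sub_query: str, synonyms: dict[str, list[str]]) -> list[str]:
    """Stand-in for query expansion: add variants using listed synonyms."""
    variants = [sub_query]
    for word, alternatives in synonyms.items():
        if word in sub_query:
            variants.extend(sub_query.replace(word, alt) for alt in alternatives)
    return variants

def keyword_search(query: str, docs: list[str]) -> list[str]:
    """Toy retriever: return documents sharing at least one word with the query."""
    words = set(query.lower().split())
    return [doc for doc in docs if words & set(doc.lower().split())]

def multi_query_retrieve(query: str, docs: list[str], synonyms: dict[str, list[str]]) -> list[str]:
    """Run every variant of every sub-query and merge results without duplicates."""
    merged: list[str] = []
    for sub_query in decompose(query):
        for variant in expand(sub_query, synonyms):
            for doc in keyword_search(variant, docs):
                if doc not in merged:
                    merged.append(doc)
    return merged

docs = [
    "gourmet food stations",
    "live music from a dj",
    "valet parking at the gate",
]
synonyms = {"catering": ["food"], "entertainment": ["music"]}

baseline = keyword_search("catering and entertainment", docs)
merged = multi_query_retrieve("catering and entertainment", docs, synonyms)
print(baseline)
print(merged)
```

The raw query finds nothing because none of its words appear in the documents, while the decomposed and expanded variants recover both relevant entries and still leave out the unrelated one.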

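Effective Agentic RAG systems require careful attention to several key aspects. The agent should choose among the available tools based on the query type and context. Memory systems help maintain conversation history and avoid repeated retrievals. Fallback strategies ensure the system still provides value when the primary retrieval method fails. In addition, validation steps help ensure that the retrieved information is accurate and relevant.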
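The memory, fallback, and validation ideas above can be combined in one small wrapper. This is a sketch under stated assumptions: `RetrievalManager`, `flaky_web_search`, and `local_notes` are hypothetical names, the "memory" is a plain dictionary keyed by the exact query string, and the validation step only drops blank results.

```python
from typing import Callable

Retriever = Callable[[str], list[str]]

class RetrievalManager:
    """Wraps a primary and a fallback retriever with a simple query cache."""

    def __init__(self, primary: Retriever, fallback: Retriever) -> None:
        self.primary = primary
        self.fallback = fallback
        self.memory: dict[str, list[str]] = {}  # query -> validated results
        self.calls = 0  # number of real (non-cached) retrievals

    def retrieve(self, query: str) -> list[str]:
        # Memory: reuse earlier results instead of retrieving again.
        if query in self.memory:
            return self.memory[query]

        self.calls += 1
        try:
            results = self.primary(query)
        except Exception:
            results = []  # treat any primary failure as "no results"

        # Fallback: try the secondary source when the primary yields nothing.
        if not results:
            results = self.fallback(query)

        # Validation (deliberately minimal): drop empty or blank entries.
        results = [r for r in results if r.strip()]
        self.memory[query] = results
        return results

def flaky_web_search(query: str) -> list[str]:
    """Hypothetical primary retriever that is currently failing."""
    raise ConnectionError("search backend unavailable")

def local_notes(query: str) -> list[str]:
    """Hypothetical fallback retriever backed by local notes."""
    return ["Reserve the ballroom two weeks ahead.", "  "]

manager = RetrievalManager(flaky_web_search, local_notes)
first = manager.retrieve("venue booking")
second = manager.retrieve("venue booking")  # served from memory
print(first, manager.calls)
```

A production system would likely normalize queries before caching and apply a stronger relevance check, but the control flow stays the same.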

Resources

Agentic RAG: turbocharge your RAG with query reformulation and self-query! 🚀 - A guide to developing an Agentic RAG system using smolagents.
