How to Create a Local RAG Agent with Ollama and LangChain

Published on 2024-08-14

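What is RAG?

RAG stands for Retrieval-Augmented Generation, a technique that improves the output of large language models (LLMs) by supplying them with specific, relevant context in the form of documents. Unlike a traditional LLM, which generates responses purely from its pre-trained knowledge, RAG retrieves real-time data or domain-specific information and uses it to align the model's output more closely with the result you want.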
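
The retrieve-then-generate flow can be sketched in a few lines of plain Python. This is a toy illustration only: word-overlap scoring stands in for real embedding-based retrieval, the two documents are made up, and `retrieve` and `build_prompt` are hypothetical helper names, not part of any library.

```python
def retrieve(query, documents, k=1):
    """Return the k documents that share the most words with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, context_docs):
    """Place the retrieved documents ahead of the question so the model can use them."""
    context = "\n".join(context_docs)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"


# Made-up documents, for illustration only.
docs = [
    "Refunds are issued within 30 days of purchase.",
    "Standard shipping takes 3 to 5 business days.",
]
question = "How long does shipping take?"
prompt = build_prompt(question, retrieve(question, docs))
print(prompt)
```

The prompt that reaches the model now carries the shipping document, so the model can answer from it rather than from its training data alone.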

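RAG vs. Fine-Tuning

Both RAG and fine-tuning aim to improve LLM performance, but RAG is often the more efficient and resource-friendly approach. Fine-tuning retrains the model on a specialized dataset, which demands significant compute, time, and expertise. RAG instead retrieves relevant information dynamically and incorporates it into the generation process, so the model can adapt to new tasks more flexibly and cost-effectively without extensive retraining.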

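Building the RAG Agent

Installation Requirements

Install Ollama

Ollama provides the backend infrastructure needed to run LLaMA locally. To get started, visit Ollama's website and download the application, then follow the instructions to set it up on your local machine. The code later in this article requests the llama3.1 model, so pull it before continuing (for example, by running: ollama pull llama3.1).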

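Install LangChain Requirements

LangChain is a Python framework designed to work with a variety of LLMs and vector databases, which makes it well suited to building RAG agents. Install LangChain and its dependencies by running the following command: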

pip install langchain langchain-community langchain-huggingface sentence-transformers faiss-cpu requests

Coding the RAG Agent

Create the API Function

First, you need a function that talks to your local LLaMA instance. Here is how to set it up:

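from requests import post as rpost

def call_llama(prompt):
    """Send a prompt to the local Ollama server and return the generated text."""
    headers = {"Content-Type": "application/json"}
    payload = {
        "model": "llama3.1",
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }

    response = rpost(
        "http://localhost:11434/api/generate",
        headers=headers,
        json=payload,
        timeout=120,  # assumed limit; local generation can be slow, adjust as needed
    )
    response.raise_for_status()  # surface HTTP errors instead of a confusing KeyError
    return response.json()["response"]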
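
The function above sets "stream" to False, so the whole answer arrives in one JSON object. With streaming enabled, Ollama instead sends newline-delimited JSON chunks, each carrying a fragment in "response" and a "done" flag on the last one. A minimal sketch of reassembling such a stream follows; `join_stream_chunks` is a hypothetical helper, and the sample lines are hand-written for illustration, not captured server output.

```python
import json


def join_stream_chunks(lines):
    """Concatenate the "response" fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)  # accepts str or bytes
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk is marked done
    return "".join(parts)


# Hand-written sample chunks, for illustration only.
sample = [
    '{"response": "Hello", "done": false}',
    '{"response": " there", "done": false}',
    '{"response": "", "done": true}',
]
print(join_stream_chunks(sample))
```

In a real call, the lines would likely come from a `requests` response opened with stream=True and read through its iter_lines() method.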

Create the LangChain LLM

Next, wrap this function in a custom LLM class within LangChain:

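from langchain_core.language_models.llms import LLM

class LLaMa(LLM):
    """Custom LangChain LLM that forwards prompts to the local Ollama server."""

    def _call(self, prompt, stop=None, run_manager=None, **kwargs):
        # stop and run_manager are accepted for interface compatibility but unused here.
        return call_llama(prompt)

    @property
    def _llm_type(self):
        # Identifier LangChain uses when logging or describing this model.
        return "llama-3.1-8b"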
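
LangChain may pass a list of stop sequences to `_call`, and the class above does not act on them. One way to honor them, sketched here under the assumption that truncating the finished text is acceptable, is a small post-processing helper. `truncate_at_stop` is a hypothetical name, not a LangChain function.

```python
def truncate_at_stop(text, stop=None):
    """Cut text at the earliest occurrence of any stop sequence."""
    if not stop:
        return text
    cut = len(text)
    for token in stop:
        if not token:
            continue  # ignore empty stop sequences
        index = text.find(token)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


print(truncate_at_stop("Answer: yes\nHuman: next question", ["\nHuman:"]))
```

Inside `_call`, the return value could then become truncate_at_stop(call_llama(prompt), stop), assuming `stop` is received as a named parameter.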

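Integrating the RAG Agent

Set Up the Retriever

The retriever is responsible for fetching relevant documents based on the user's query. Here is how to set one up using FAISS for vector storage and pre-trained embeddings from HuggingFace: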

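# Requires the langchain-community and faiss-cpu packages.
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

documents = [
    {"content": "What is your return policy? ..."},
    {"content": "How long does shipping take? ..."},
    # Add more documents as needed
]

texts = [doc["content"] for doc in documents]

# Embed the texts, index them in FAISS, and expose the index as a retriever.
retriever = FAISS.from_texts(
    texts,
    HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
).as_retriever(search_kwargs={"k": 5})  # return up to 5 documents per query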
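
To make the retrieval step less opaque, here is a rough sketch of what a vector search does: rank stored vectors by their similarity to the query vector and keep the top k. It uses cosine similarity for simplicity, which is an assumption; the FAISS index may rank by a different distance metric. The 3-dimensional vectors are made up, and `top_k` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up vectors; a real embedding model emits hundreds of dimensions.
doc_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
print(top_k([1.0, 0.1, 0.0], doc_vecs))
```

The retriever does the same thing at scale: it embeds the query with the embedding model, then asks the index for the nearest stored vectors.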

Create the Prompt Template

Define the prompt template that the RAG agent will use to generate responses from the retrieved documents:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

faq_template = """
You are a chat agent for my E-Commerce Company. As a chat agent, it is your duty to help the human with their inquiry and make them a happy customer.

Help them, using the following context:

{context}

"""

faq_prompt = ChatPromptTemplate.from_messages([
    ("system", faq_template),
    MessagesPlaceholder("messages")
])
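
Conceptually, the {context} placeholder is filled by joining the text of the retrieved documents and substituting the result into the template. The sketch below mimics that with plain string formatting; the blank-line separator, the shortened template, and the sample texts are assumptions for illustration, and `stuff_documents` is a hypothetical helper rather than a LangChain function.

```python
def stuff_documents(template, page_contents, separator="\n\n"):
    """Join document texts and substitute them for {context} in the template."""
    return template.format(context=separator.join(page_contents))


# Shortened stand-in for the system prompt, with made-up document texts.
template = "Help them, using the following context:\n\n{context}\n"
filled = stuff_documents(
    template,
    ["Returns are accepted within 30 days.", "Shipping takes 3 to 5 days."],
)
print(filled)
```

One practical consequence: any literal curly braces in a template need to be doubled ({{ and }}), or they will be treated as placeholders.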

Create the Document and Retriever Chains

Combine document retrieval and LLaMA generation into a single cohesive chain:

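from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.runnables import RunnablePassthrough

# Fill {context} with the retrieved documents and pass the prompt to the LLM.
document_chain = create_stuff_documents_chain(LLaMa(), faq_prompt)

def parse_retriever_input(params):
    # Use the text of the most recent message as the retrieval query.
    return params["messages"][-1].content

# Add the retrieved documents under "context", then the model's reply under "answer".
retrieval_chain = RunnablePassthrough.assign(
    context=parse_retriever_input | retriever
).assign(answer=document_chain)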
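
The two chained assign calls can be read as successive dictionary updates: each one computes a value from the current state and stores it under a new key. The sketch below imitates that behavior in plain Python with stand-in functions. `assign`, `fake_retriever`, and `fake_document_chain` are hypothetical names used only to show the data flow, not LangChain APIs.

```python
def assign(inputs, **steps):
    """Return a copy of inputs with one new key per step, each computed from inputs."""
    outputs = dict(inputs)
    for key, step in steps.items():
        outputs[key] = step(inputs)
    return outputs


def fake_retriever(query):
    # Stand-in for the real retriever: returns a canned document list.
    return [f"document matching: {query}"]


def fake_document_chain(state):
    # Stand-in for the LLM call: reports how many documents it received.
    return f"answer based on {len(state['context'])} document(s)"


state = {"messages": ["Where is my order?"]}
state = assign(state, context=lambda s: fake_retriever(s["messages"][-1]))
state = assign(state, answer=fake_document_chain)
print(sorted(state))
```

The final state holds the original messages alongside the retrieved context and the generated answer, which is the shape of the dictionary the real chain returns.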

Start Your Ollama Server

Before running the RAG agent, make sure the Ollama server is up and running. Start the server with the following command:

ollama serve
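
Before sending prompts, it can help to confirm that the server is reachable. The sketch below assumes the server answers a plain GET on its base URL with status 200 when it is running; `ollama_is_up` is a hypothetical helper, and the 2-second timeout is an arbitrary choice.

```python
import urllib.error
import urllib.request


def ollama_is_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP GET to base_url succeeds with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as not running.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_up())
```

If the check reports False, start the server with ollama serve and try again.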

Prompt Your RAG Agent

Now you can test your RAG agent by sending it a query:

from langchain_core.messages import HumanMessage

response = retrieval_chain.invoke({
    "messages": [
        HumanMessage(content="I received a damaged item. I want my money back.")
    ]
})

# The result is a dict with "messages", "context", and "answer" keys.
print(response["answer"])

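Response:
"I'm very sorry to hear that you received a damaged item. According to our policy, if you receive a damaged item, please contact our customer service team immediately and attach photos of the damage. We will arrange a replacement or refund for you."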

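By following these steps, you can build a fully functional local RAG agent that improves an LLM's output with real-time context. The setup can be adapted to many domains and tasks, making it a versatile solution for any application where context-aware generation matters.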

Release statement: this article is reproduced from https://dev.to/dmuraco3/how-to-create-a-local-rag-agent-with-ollama-and-langchain-1m9a?1