🤗 LLM suggestions in Argilla with HuggingFace Inference Endpoints

Community article, published September 20, 2023

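We are excited to showcase the suggestions feature in Argilla using Hugging Face Inference Endpoints! Starting from Argilla v1.13.0, anyone can add suggestions to Feedback Dataset records with just a few lines of code. This shortens the time needed to produce high-quality datasets by turning the annotation task into a quick validate-and-correct process.

Hugging Face Inference Endpoints make it easier than ever to deploy any ML model from the Hub. You only need to select the model to deploy, your preferred cloud provider and region, and the instance type to use. Within a few minutes, you can have an inference endpoint up and running.

Thanks to Argilla's integration with the templates available in Hugging Face Spaces (previously announced in 🚀 Launching Argilla on Hugging Face Spaces), you can launch an Argilla instance in just a few clicks. This lets you keep the whole workflow within the Hugging Face ecosystem.

In this post, we demonstrate how to set up an Argilla instance in Hugging Face Spaces, deploy a Hugging Face Inference Endpoint to serve Llama 2 7B Chat, and integrate it with Argilla to add suggestions to an Argilla dataset.

In fewer than 10 lines of code, you can automatically add LLM-powered suggestions to the records in your Argilla dataset using Hugging Face Inference Endpoints!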

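🚀 Deploy Argilla in Spaces

You can self-host Argilla using one of the many deployment options, sign up for Argilla Cloud, or launch an Argilla instance on Hugging Face Spaces with the one-click deployment option.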

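🍱 Push dataset to Argilla

We will use a subset of Alpaca, a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine using the Self-Instruct framework, with some modifications described in Alpaca's dataset card.

The Alpaca subset we use was collected by the Hugging Face H4 team and contains 100 rows per split (train and test), each with a prompt and a completion.

Based on the data we want to annotate, we define the Feedback Dataset to push to Argilla. This means defining the fields of each record, the questions annotators need to answer, and finally the annotation guidelines. For more information, see the Argilla documentation - Create a Feedback Dataset.

The last step is to iterate over the rows in the Alpaca subset and add them to the Feedback Dataset, which is then pushed to Argilla so the annotation process can begin.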

import argilla as rg
from datasets import load_dataset

# Connect to the running Argilla instance (for example, the one deployed in Spaces).
rg.init(api_url="<ARGILLA_API_URL>", api_key="<ARGILLA_API_KEY>")

# Define the fields shown to annotators, the questions they answer, and the guidelines.
dataset = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="prompt"),
        rg.TextField(name="completion"),
    ],
    questions=[
        rg.LabelQuestion(name="prompt-quality", title="Is the prompt clear?", labels=["yes", "no"]),
        rg.LabelQuestion(name="completion-quality", title="Is the completion correct?", labels=["yes", "no"]),
        rg.TextQuestion(
            name="completion-edit",
            title="If you feel like the completion could be improved, provide a new one",
            required=False,
        ),
    ],
    guidelines=(
        "You are asked to evaluate the quality of the following prompt-completion pairs,"
        " and provide a new completion if applicable."
    ),
)

# Load the train split of the Alpaca subset; each row has a "prompt" and a "completion".
alpaca_dataset = load_dataset("HuggingFaceH4/testing_alpaca_small", split="train")
dataset.add_records([rg.FeedbackRecord(fields=row) for row in alpaca_dataset])

# Push the dataset to Argilla so annotation can begin.
dataset.push_to_argilla(name="alpaca-small", workspace="admin")
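
The list comprehension above passes each row to FeedbackRecord unchanged, which works here because the rows only contain the prompt and completion columns. For a source dataset with extra columns, it is probably safer to pass only the declared field names. The following is a minimal sketch of that step in plain Python; the helper name row_to_fields and the sample row are hypothetical and not part of Argilla.

```python
from typing import Any, Dict, Iterable

# Field names declared in the FeedbackDataset above.
FIELD_NAMES = ("prompt", "completion")


def row_to_fields(row: Dict[str, Any], field_names: Iterable[str] = FIELD_NAMES) -> Dict[str, str]:
    """Keep only the declared fields, failing early if one is missing or empty."""
    fields = {}
    for name in field_names:
        value = row.get(name)
        if value is None or not str(value).strip():
            raise ValueError(f"row is missing a value for field {name!r}")
        fields[name] = str(value)
    return fields


# Hypothetical row with an extra column that the dataset does not declare.
sample_row = {
    "prompt": "Name three primary colors.",
    "completion": "Red, blue, and yellow.",
    "source": "demo",
}
sample_fields = row_to_fields(sample_row)
print(sample_fields)
```

The result could then be passed as rg.FeedbackRecord(fields=row_to_fields(row)) when building the list of records.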

If we now navigate to our Argilla instance, the alpaca-small dataset is available in the UI, with its records ready for annotation.

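🚀 Deploy Llama 2 Inference Endpoint

Now we can set up the Hugging Face Inference Endpoint. This lets us easily serve any model on dedicated, fully managed infrastructure, while keeping costs low with a secure, compliant, and flexible production solution.

As mentioned earlier, we will use the 7B-parameter variant of Llama 2 in the Hugging Face format, fine-tuned for chat completion. You can find this model at meta-llama/llama-2-7b-chat-hf. Other variants are also available on the Hugging Face Hub at https://huggingface.co/meta-llama.

Note: At the time of writing, to use Llama 2, users need to visit the Meta website and accept its license terms and acceptable use policy before requesting access to the Llama 2 models in Meta's Llama 2 organization on the Hugging Face Hub.

First, we need to make sure the inference endpoint is up and running. Once we have its URL, we can start sending requests to it.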
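
Before wiring the endpoint into Argilla, a single request is a cheap way to confirm that it responds. The sketch below uses only the Python standard library and assumes the endpoint runs a text-generation container that accepts a JSON body with inputs and parameters keys, protected by a Hugging Face token sent as a Bearer header. The function name build_generation_request and the placeholder values are illustrative; the request is only sent when the script is run with real values.

```python
import json
import urllib.request

ENDPOINT_URL = "<HF_INFERENCE_ENDPOINT_URL>"  # placeholder, replace with the endpoint URL
HF_TOKEN = "<HF_TOKEN>"  # placeholder, replace with a token that can access the endpoint


def build_generation_request(endpoint_url: str, token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a text-generation endpoint."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": 20}}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Only contact the endpoint once the placeholder has been replaced.
if __name__ == "__main__" and not ENDPOINT_URL.startswith("<"):
    request = build_generation_request(ENDPOINT_URL, HF_TOKEN, "Hello, Llama!")
    with urllib.request.urlopen(request, timeout=60) as response:
        # A 200 status with a JSON body suggests the endpoint is ready.
        print(response.status, response.read().decode("utf-8"))
```

The rest of this post uses the InferenceClient from huggingface_hub instead, which handles these details for us.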

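✨ Generate suggestions for Argilla

Before sending requests to the inference endpoint, we should know in advance which system prompt to use and how to format our prompts. In this case, since we are using meta-llama/llama-2-7b-chat-hf, we need to look up the prompt format used to fine-tune it and reproduce the same format when sending inference requests. For more information on Llama 2, see the Hugging Face blog - Llama 2 is here - get it on Hugging Face.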

system_prompt = (
  "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible,"
  " while being safe. Your answers should not include any harmful, unethical, racist, sexist,"
  " toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased"
  " and positive in nature.\nIf a question does not make any sense, or is not factually coherent,"
  " explain why instead of answering something not correct. If you don't know the answer to a"
  " question, please don't share false information."
)
base_prompt = "<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"
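
To see what the model actually receives, the template can be filled in for a single instruction. This sketch uses a shortened, hypothetical system prompt and a made-up instruction so the output stays readable; the template string is the one defined above.

```python
base_prompt = "<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"

# Shortened stand-in for the full system prompt, for illustration only.
short_system_prompt = "You are a helpful assistant."
example_instruction = "Give three tips for staying healthy."

formatted = base_prompt.format(system_prompt=short_system_prompt, prompt=example_instruction)
print(formatted)
```

The printed text starts with the <s>[INST] <<SYS>> marker, wraps the system prompt between the <<SYS>> and <</SYS>> tags, and ends with the instruction followed by [/INST].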

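With the prompts defined, we can instantiate the InferenceClient from huggingface_hub, which we will later use to send requests to the deployed inference endpoint through its text_generation method.

The following code snippet shows how to retrieve an existing Feedback Dataset from our Argilla instance and how to use the InferenceClient from huggingface_hub to send requests to the deployed inference endpoint, adding suggestions to the records in the dataset.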

import argilla as rg
from huggingface_hub import InferenceClient

rg.init(api_url="<ARGILLA_SPACE_URL>", api_key="<ARGILLA_OWNER_API_KEY>")
dataset = rg.FeedbackDataset.from_argilla("<ARGILLA_DATASET>", workspace="<ARGILLA_WORKSPACE>")

client = InferenceClient("<HF_INFERENCE_ENDPOINT_URL>", token="<HF_TOKEN>")

system_prompt = (
  "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible,"
  " while being safe. Your answers should not include any harmful, unethical, racist, sexist,"
  " toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased"
  " and positive in nature.\nIf a question does not make any sense, or is not factually coherent,"
  " explain why instead of answering something not correct. If you don't know the answer to a"
  " question, please don't share false information."
)
base_prompt = "<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"

def generate_response(prompt: str) -> str:
  prompt = base_prompt.format(system_prompt=system_prompt, prompt=prompt)
  response = client.text_generation(
    prompt, details=True, max_new_tokens=512, top_k=30, top_p=0.9,
    temperature=0.2, repetition_penalty=1.02, stop_sequences=["</s>"],
  )
  return response.generated_text

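for record in dataset.records:
  record.update(
    suggestions=[
      {
        # Must match a question defined in the dataset; "completion-edit" is the
        # free-text question from the Feedback Dataset created earlier.
        "question_name": "completion-edit",
        "value": generate_response(prompt=record.fields["prompt"]),
        "type": "model",
        "agent": "llama-2-7b-chat-hf",
      },
    ],
  )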

Note: The predefined system prompt may not suit some use cases, so we can apply prompt engineering techniques to adapt it to our specific use case.
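
One simple way to experiment is to keep the Llama 2 template fixed and swap only the system prompt. The sketch below does that with a hypothetical task-specific system prompt; the wording of that prompt and the helper name build_prompt are assumptions for illustration, not a recommended prompt.

```python
LLAMA2_TEMPLATE = "<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"

# Hypothetical system prompt tailored to producing short completions.
CONCISE_SYSTEM_PROMPT = (
    "You are an assistant that answers instructions accurately and concisely."
    " Reply with the answer only, without greetings or commentary."
)


def build_prompt(prompt: str, system_prompt: str = CONCISE_SYSTEM_PROMPT) -> str:
    """Fill the Llama 2 chat template with the given system prompt and instruction."""
    return LLAMA2_TEMPLATE.format(system_prompt=system_prompt.strip(), prompt=prompt.strip())


custom_prompt = build_prompt("  List two prime numbers.  ")
print(custom_prompt)
```

The returned string could be sent through client.text_generation in place of the prompt built inside generate_response, which makes it easy to compare suggestions produced by different system prompts.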

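If we go back to our Argilla instance after generating the suggestions with the inference endpoint, the records now show the generated suggestions in the UI.

Finally, it is time for annotators to review the records in the Argilla dataset, answer the questions, and submit, edit, or discard the suggestions as needed.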

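➡️ Next steps

Injecting machine-learning-generated suggestions into Argilla with Hugging Face Inference Endpoints is quick and easy. Now feel free to experiment with your favorite machine learning framework and generate suggestions suited to your specific use case!

Suggestions can be generated for any question; you only need to find the model that best fits your use case and the questions defined in your Feedback Dataset in Argilla.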
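
As an illustration of suggesting for a different question, the sketch below builds a suggestion for the prompt-quality label question defined earlier, using the same dictionary keys as the completion example. The word-count heuristic, the helper name suggest_prompt_quality, and the agent name are hypothetical stand-ins for a real classifier.

```python
from typing import Dict


def suggest_prompt_quality(prompt: str, min_words: int = 3) -> Dict[str, str]:
    """Return a suggestion dict for the "prompt-quality" label question.

    The heuristic is a placeholder: very short prompts are flagged as unclear.
    """
    label = "yes" if len(prompt.split()) >= min_words else "no"
    return {
        "question_name": "prompt-quality",
        "value": label,  # must be one of the labels defined for the question
        "type": "model",
        "agent": "word-count-heuristic",  # hypothetical agent name
    }


clear = suggest_prompt_quality("Explain how photosynthesis works.")
unclear = suggest_prompt_quality("Hm?")
print(clear["value"], unclear["value"])
```

Each dictionary could be added to the suggestions list passed to record.update, alongside the completion suggestion.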

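There are many use cases for suggestions, and we are very excited about the role of machine feedback in LLM use cases. We would love to hear your ideas! We highly recommend joining our Slack community to share your thoughts on this post or anything else you would like to discuss.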
