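Prompt engineering

Prompt engineering, or prompting, uses natural language to improve the performance of a large language model (LLM) on a variety of tasks. A prompt can steer the model toward generating the desired output. In many cases, you don't even need to fine-tune a model for a task; a good prompt is enough.

Try prompting an LLM to classify some text. When you create a prompt, it is important to give very specific instructions about the task and what the result should look like.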

from transformers import pipeline
import torch

# Name the instance `pipe` so it does not shadow the imported `pipeline` function.
pipe = pipeline(task="text-generation", model="mistralai/Mistral-7B-Instruct-v0.1", torch_dtype=torch.bfloat16, device_map="auto")
prompt = """Classify the text into neutral, negative or positive.
Text: This movie is definitely one of my favorite movies of its kind. The interaction between respectable and morally strong characters is an ode to chivalry and the honor code amongst thieves and policemen.
Sentiment:
"""

# The generated text contains the prompt followed by the model's continuation.
outputs = pipe(prompt, max_new_tokens=10)
for output in outputs:
    print(f"Result: {output['generated_text']}")
# Result: Classify the text into neutral, negative or positive.
# Text: This movie is definitely one of my favorite movies of its kind. The interaction between respectable and morally strong characters is an ode to chivalry and the honor code amongst thieves and policemen.
# Sentiment:
# Positive
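Because the generated text echoes the prompt, and the prompt itself lists all three labels, it can help to strip the prompt before reading off the label. The helper below is a minimal sketch of that step; `ALLOWED_LABELS` and `extract_label` are hypothetical names introduced here for illustration and are not part of Transformers.

```python
from typing import Optional

# Hypothetical label set, mirroring the labels named in the prompt above.
ALLOWED_LABELS = ("neutral", "negative", "positive")


def extract_label(prompt: str, generated_text: str) -> Optional[str]:
    """Return the first allowed label found in the model's continuation, or None."""
    # Drop the echoed prompt so the label names inside the instruction are ignored.
    # If the text does not start with the prompt, the whole text is searched instead.
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text

    for word in completion.lower().split():
        cleaned = word.strip(".,:;!?\"'")
        if cleaned in ALLOWED_LABELS:
            return cleaned
    return None


example_prompt = "Classify the text into neutral, negative or positive.\nText: Great movie.\nSentiment:\n"
print(extract_label(example_prompt, example_prompt + "Positive"))
```

The text-generation pipeline also accepts `return_full_text=False`, which should return only the continuation and make the prefix stripping unnecessary.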

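The challenge lies in designing prompts that produce the results you expect, because language is so nuanced and expressive.

This guide covers prompt engineering best practices, techniques, and examples of how to solve language and reasoning tasks.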

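Best practices

Try to pick the latest models for the best performance. Keep in mind that LLMs come in two variants: base models and instruction-tuned (or chat) models.

Base models are very good at completing text given an initial prompt, but they are not as good at following instructions. Instruction-tuned models are versions of the base models that were trained specifically on instructional or conversational data, which makes them a better fit for prompting.

Modern LLMs are typically decoder-only models, but there are some encoder-decoder LLMs, such as Flan-T5 or BART, that may be used for prompting. For encoder-decoder models, make sure you set the pipeline task identifier to `text2text-generation` instead of `text-generation`.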
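Start with a short and simple prompt, and iterate on it to get better results.
Place instructions at the beginning or end of the prompt. For longer prompts, models may apply optimizations that keep attention from scaling quadratically, and these tend to place more emphasis on the beginning and end of the prompt.
Clearly separate the instructions from the text they apply to.
Be specific and descriptive about the task and the desired output, including its format, length, style, and language. Avoid vague descriptions and instructions.
Focus instructions on what to do rather than what not to do.
Lead the model toward the correct output by writing the first word or even the first sentence.
Try other techniques, such as few-shot and chain-of-thought prompting, to improve results.
Test your prompts with different models to assess their robustness.
Version and track the performance of your prompts.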
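Several of these practices can be combined in a small prompt builder. The sketch below puts the instruction first, fences the target text with triple quotes, ends on an output cue, and derives a short identifier for tracking prompt versions. `build_prompt` and `prompt_id` are hypothetical helpers written for this guide, and the triple-quote delimiter is just one possible convention.

```python
import hashlib


def build_prompt(instruction: str, text: str, output_label: str, output_prefix: str = "") -> str:
    """Assemble a prompt: instruction first, delimited text, then an output cue."""
    parts = [
        instruction.strip(),
        # Triple quotes act as an explicit delimiter around the target text.
        f'Text: """{text.strip()}"""',
        # Ending on the output label (plus an optional first word) leads the model
        # toward the expected format.
        f"{output_label}: {output_prefix}".rstrip(),
    ]
    return "\n".join(parts)


def prompt_id(prompt: str) -> str:
    """Short, deterministic identifier for logging which prompt version produced a result."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:8]


prompt = build_prompt(
    "Classify the text into neutral, negative or positive.",
    "I loved every minute of it.",
    "Sentiment",
)
print(prompt_id(prompt))
print(prompt)
```

Logging the identifier next to each evaluation result is one lightweight way to see which wording of a prompt a given score belongs to.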

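Techniques

Crafting a good prompt alone, also known as zero-shot prompting, may not be enough to get the results you want. You may need to try a few prompting techniques to get the best performance.

This section covers a few prompting techniques.

Few-shot prompting

Few-shot prompting improves accuracy and performance by including specific examples of what the model should generate for a given input. The explicit examples give the model a better understanding of the task and of the output format you are looking for. Try experimenting with different numbers of examples (2, 4, 8, and so on) to see how it affects performance. The example below gives the model a single example (1-shot) of the expected output format, a date written as MM/DD/YYYY.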

from transformers import pipeline
import torch

# Name the instance `pipe` so it does not shadow the imported `pipeline` function.
pipe = pipeline(task="text-generation", model="mistralai/Mistral-7B-Instruct-v0.1", torch_dtype=torch.bfloat16, device_map="auto")
# One worked example (text and date), followed by the text to label.
prompt = """Text: The first human went into space and orbited the Earth on April 12, 1961.
Date: 04/12/1961
Text: The first-ever televised presidential debate in the United States took place on September 28, 1960, between presidential candidates John F. Kennedy and Richard Nixon.
Date:"""

outputs = pipe(prompt, max_new_tokens=12, do_sample=True, top_k=10)
for output in outputs:
    print(f"Result: {output['generated_text']}")
# Result: Text: The first human went into space and orbited the Earth on April 12, 1961.
# Date: 04/12/1961
# Text: The first-ever televised presidential debate in the United States took place on September 28, 1960, between presidential candidates John F. Kennedy and Richard Nixon.
# Date: 09/28/1960

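The downside of few-shot prompting is that you need to create longer prompts, which increases computation and latency. There is also a limit to prompt length. Finally, a model may learn unintended patterns from your examples, and it may not perform well on complex reasoning tasks.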
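To experiment with the number of examples and see how quickly the prompt grows, the prompt can be assembled programmatically. The following is a minimal sketch; `build_few_shot_prompt` is a hypothetical helper, and the second entry in `EXAMPLES` is an extra illustration that is not part of the example above.

```python
EXAMPLES = [
    ("The first human went into space and orbited the Earth on April 12, 1961.", "04/12/1961"),
    # Extra illustrative example, added here only to show a 2-shot prompt.
    ("Apollo 11 landed on the Moon on July 20, 1969.", "07/20/1969"),
]

QUERY = (
    "The first-ever televised presidential debate in the United States took place on "
    "September 28, 1960, between presidential candidates John F. Kennedy and Richard Nixon."
)


def build_few_shot_prompt(examples, query, k):
    """Build a k-shot prompt from the first k (text, date) pairs, ending on an open 'Date:' cue."""
    lines = []
    for text, date in examples[:k]:
        lines.append(f"Text: {text}")
        lines.append(f"Date: {date}")
    lines.append(f"Text: {query}")
    lines.append("Date:")
    return "\n".join(lines)


# Character count is only a rough proxy for token count, but it shows the growth.
for k in range(len(EXAMPLES) + 1):
    prompt = build_few_shot_prompt(EXAMPLES, QUERY, k)
    print(f"{k}-shot prompt length: {len(prompt)} characters")
```

With `k=1` this reproduces the 1-shot prompt used above, so each variant can be passed to the same pipeline call and compared.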

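To improve few-shot prompting with modern instruction-tuned LLMs, use the model's specific chat template. These models are trained on datasets of turn-based conversations between a "user" and an "assistant". Structuring your prompt to match this format can improve performance.

Structure your prompt as a turn-based conversation and use the `apply_chat_template` method to tokenize and format it.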

from transformers import pipeline
import torch

# Name the instance `pipe` so it does not shadow the imported `pipeline` function.
pipe = pipeline(task="text-generation", model="mistralai/Mistral-7B-Instruct-v0.1", torch_dtype=torch.bfloat16, device_map="auto")

# The worked example becomes a user/assistant exchange; the final user turn is the new input.
messages = [
    {"role": "user", "content": "Text: The first human went into space and orbited the Earth on April 12, 1961."},
    {"role": "assistant", "content": "Date: 04/12/1961"},
    {"role": "user", "content": "Text: The first-ever televised presidential debate in the United States took place on September 28, 1960, between presidential candidates John F. Kennedy and Richard Nixon."}
]

# Format the conversation with the model's chat template and return it as a string.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = pipe(prompt, max_new_tokens=12, do_sample=True, top_k=10)

for output in outputs:
    print(f"Result: {output['generated_text']}")

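While the basic few-shot approach embeds the examples in a single text string, the chat template format offers the following advantages.

The model may understand the task better because it can more easily recognize the pattern and the expected roles of user input and assistant output.
The model may produce the desired output format more consistently because the input is structured the same way as the data it saw during training.

Always consult the documentation of the specific instruction-tuned model to learn the format of its chat template so that you can structure your few-shot prompts accordingly.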
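When there are more than a couple of examples, writing the message list by hand becomes tedious. The sketch below turns (text, date) pairs into alternating user and assistant turns; `build_few_shot_messages` is a hypothetical helper, and the strict user/assistant alternation it produces is an assumption that suits the conversation shown above but should be checked against the model's own template.

```python
def build_few_shot_messages(examples, query):
    """Turn (text, date) pairs into alternating user/assistant turns, ending with the new user input."""
    messages = []
    for text, date in examples:
        messages.append({"role": "user", "content": f"Text: {text}"})
        messages.append({"role": "assistant", "content": f"Date: {date}"})
    # The final user turn carries the input the model should label.
    messages.append({"role": "user", "content": f"Text: {query}"})
    return messages


examples = [
    ("The first human went into space and orbited the Earth on April 12, 1961.", "04/12/1961"),
]
messages = build_few_shot_messages(
    examples,
    "The first-ever televised presidential debate in the United States took place on September 28, 1960.",
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed to `apply_chat_template` in place of a handwritten `messages` list.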

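Chain-of-thought

Chain-of-thought (CoT) prompting helps a model produce more coherent and well-reasoned outputs by supplying a series of prompts that lead it to "think" through a topic more thoroughly.

The example below gives the model several prompts that work through the intermediate reasoning steps.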

from transformers import pipeline
import torch

# Name the instance `pipe` so it does not shadow the imported `pipeline` function.
pipe = pipeline(task="text-generation", model="mistralai/Mistral-7B-Instruct-v0.1", torch_dtype=torch.bfloat16, device_map="auto")
# Each numbered line spells out one intermediate step before the final question.
prompt = """Let's go through this step-by-step:
1. You start with 15 muffins.
2. You eat 2 muffins, leaving you with 13 muffins.
3. You give 5 muffins to your neighbor, leaving you with 8 muffins.
4. Your partner buys 6 more muffins, bringing the total number of muffins to 14.
5. Your partner eats 2 muffins, leaving you with 12 muffins.
If you eat 6 muffins, how many are left?"""

outputs = pipe(prompt, max_new_tokens=20, do_sample=True, top_k=10)
for output in outputs:
    print(f"Result: {output['generated_text']}")
# Result: Let's go through this step-by-step:
# 1. You start with 15 muffins.
# 2. You eat 2 muffins, leaving you with 13 muffins.
# 3. You give 5 muffins to your neighbor, leaving you with 8 muffins.
# 4. Your partner buys 6 more muffins, bringing the total number of muffins to 14.
# 5. Your partner eats 2 muffins, leaving you with 12 muffins.
# If you eat 6 muffins, how many are left?
# Answer: 6

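As with few-shot prompting, the downside of CoT is that it takes more effort to design a series of prompts that help the model reason through a complex task, and the longer prompt increases latency.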
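Since sampling is enabled in the example, it is worth checking the model's final answer against the arithmetic the steps describe. The sketch below recomputes the muffin count and parses an answer out of generated text; `parse_final_answer` is a hypothetical helper, and it assumes the model writes its result as "Answer: <number>", which is not guaranteed.

```python
import re


def parse_final_answer(generated_text):
    """Return the last integer that follows 'Answer:' in the text, or None if there is none."""
    matches = re.findall(r"Answer:\s*(-?\d+)", generated_text)
    return int(matches[-1]) if matches else None


# Recompute the steps from the prompt to get the expected result.
muffins = 15
muffins -= 2  # you eat 2              -> 13
muffins -= 5  # you give away 5        -> 8
muffins += 6  # your partner buys 6    -> 14
muffins -= 2  # your partner eats 2    -> 12
muffins -= 6  # you eat 6              -> 6
expected = muffins

generated = "If you eat 6 muffins, how many are left?\nAnswer: 6"
print(parse_final_answer(generated) == expected)
```

Running the same check over several sampled generations gives a rough sense of how reliably a given chain of prompts leads to the right answer.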

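Fine-tuning

Although prompting is a powerful way to work with LLMs, there are scenarios where using a fine-tuned model, or even fine-tuning a model yourself, works better.

Here are some example scenarios where fine-tuning makes sense.

Your domain is very different from what the LLM was pretrained on, and extensive prompting has not produced the results you want.
Your model needs to work well in a low-resource language.
Your model needs to be trained on sensitive data that is subject to strict regulatory requirements.
You are using a small model because of cost, privacy, infrastructure, or other constraints.

In all of these scenarios, make sure you have a large enough domain-specific dataset to train on, enough time and resources, and that the cost of fine-tuning is worth it. Otherwise, you may be better off optimizing your prompt.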

Examples

The examples below demonstrate how to prompt an LLM for different tasks.
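As a rough sketch of what such prompts could look like, the templates below follow the best practices from earlier in this guide: instruction first, clearly labeled input, and an output cue at the end. The template wording, the `TASK_TEMPLATES` dictionary, and the `render` helper are all hypothetical illustrations rather than prompts taken from this guide.

```python
# Hypothetical prompt templates, one per task; the wording is illustrative only.
TASK_TEMPLATES = {
    "named_entity_recognition": (
        "Return a list of named entities in the text.\n"
        "Text: {text}\n"
        "Named entities:"
    ),
    "translation": (
        "Translate the English text to {target_language}.\n"
        "Text: {text}\n"
        "Translation:"
    ),
    "summarization": (
        "Write a summary of the text in at most {max_sentences} sentences.\n"
        "Text: {text}\n"
        "Summary:"
    ),
    "question_answering": (
        "Answer the question using the context below.\n"
        "Context: {context}\n"
        "Question: {question}\n"
        "Answer:"
    ),
}


def render(task, **fields):
    """Fill in the template for a task; raises KeyError if the task or a field is missing."""
    return TASK_TEMPLATES[task].format(**fields)


print(render("translation", target_language="Italian", text="Sometimes, I've believed as many as six impossible things before breakfast."))
```

Each rendered string can be passed to a text-generation pipeline in the same way as the prompts in the earlier sections.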

Named entity recognition

Translation

Summarization

Question answering
