Lighteval documentation
Contributing to multilingual evaluations
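Contributing a small number of translations
We define 19 literals: basic keywords or punctuation marks used when automatically creating evaluation prompts, such as yes, no, because, etc.
We welcome translations into your language!
To contribute, you need to:
- Open the translation_literals file
- Edit the file to add or expand the literals for your language of interest.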
Language.ENGLISH: TranslationLiterals(
    language=Language.ENGLISH,
    question_word="question",  # Usage: "Question: How are you?"
    answer="answer",  # Usage: "Answer: I am fine"
    confirmation_word="right",  # Usage: "He is smart, right?"
    yes="yes",  # Usage: "Yes, he is"
    no="no",  # Usage: "No, he is not"
    also="also",  # Usage: "Also, she is smart."
    cause_word="because",  # Usage: "She is smart, because she is tall"
    effect_word="therefore",  # Usage: "He is tall therefore he is smart"
    or_word="or",  # Usage: "He is tall or small"
    true="true",  # Usage: "He is smart, true, false or neither?"
    false="false",  # Usage: "He is smart, true, false or neither?"
    neither="neither",  # Usage: "He is smart, true, false or neither?"
    # Punctuation and spacing: only adjust if your language uses something different than in English
    full_stop=".",
    comma=",",
    question_mark="?",
    exclamation_mark="!",
    word_space=" ",
    sentence_space=" ",
    colon=":",
    # The first characters of your alphabet used in enumerations, if different from English
    indices=["A", "B", "C", ...]
)
- Open a PR with your modifications! That's it!
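To see why the punctuation and spacing literals matter, here is a minimal sketch of how such literals could be combined into a prompt. The `Literals` class and `build_prompt` helper are hypothetical stand-ins written for this illustration, not lighteval's implementation, and the non-English values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literals:
    """Hypothetical stand-in for a few TranslationLiterals fields."""
    question_word: str
    answer: str
    colon: str = ":"
    word_space: str = " "
    sentence_space: str = " "


def build_prompt(lit: Literals, question: str) -> str:
    """Join the anchors and the question using the language's own punctuation and spacing."""
    q_anchor = lit.question_word.capitalize() + lit.colon
    a_anchor = lit.answer.capitalize() + lit.colon
    return f"{q_anchor}{lit.word_space}{question}{lit.sentence_space}{a_anchor}"


english = Literals(question_word="question", answer="answer")
# Illustrative values only: a language written without spaces and with a full-width colon.
chinese = Literals(
    question_word="问题", answer="答案", colon=":", word_space="", sentence_space=""
)

print(build_prompt(english, "How are you?"))
print(build_prompt(chinese, "你好吗?"))
```

Because the spacing and colon come from the literals rather than from the formatting code, the same function serves both languages without any language-specific branches.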
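Contributing a new multilingual task
You should first read our guide on adding a custom task, to better understand the different parameters we use.
Then, take a look at the current multilingual tasks file to understand how tasks are defined. For multilingual evaluations, the prompt_function should be implemented by a language-adapted template. The template takes care of correct formatting and of correct, consistent usage of language-adapted prompt anchors (e.g. Question/Answer) and punctuation.
Browse the list of all templates to see which ones best fit your own task.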
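The split between a template and an adapter can be pictured with a simplified sketch. The `make_prompt_function` helper, the template keys `question` and `choices`, and the dataset columns `query` and `options` are all assumptions made up for this illustration; the real templates define their own keys and handle far more formatting detail.

```python
from typing import Any, Callable, Dict

Row = Dict[str, Any]


def make_prompt_function(
    question_word: str,
    answer_word: str,
    adapter: Callable[[Row], Dict[str, Any]],
) -> Callable[[Row], str]:
    """Return a function that renders one dataset row as a multiple-choice prompt."""

    def prompt_function(line: Row) -> str:
        # The adapter translates dataset-specific columns into template keys.
        fields = adapter(line)
        options = "\n".join(
            f"{chr(ord('A') + i)}. {choice}"
            for i, choice in enumerate(fields["choices"])
        )
        return f"{question_word}: {fields['question']}\n{options}\n{answer_word}:"

    return prompt_function


# Hypothetical dataset columns ("query", "options") mapped to the template keys.
prompt_fn = make_prompt_function(
    "Question",
    "Answer",
    adapter=lambda line: {"question": line["query"], "choices": line["options"]},
)

row = {"query": "What is 2 + 2?", "options": ["3", "4"]}
print(prompt_fn(row))
```

The template owns the layout and the anchors, so supporting a new dataset only requires a new adapter, not new formatting code.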
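Then, when you are ready to define your own task, you should:
- Create a Python file as indicated in the above guide
- Import the relevant templates for your task type (XNLI, Copa, multiple choice, question answering, etc.)
- Define one task, or a list of tasks, for each relevant language and evaluation formulation (for multiple choice), using our parametrizable LightevalTaskConfig class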
your_tasks = [
    LightevalTaskConfig(
        # Name of your evaluation
        name=f"evalname_{language.value}_{formulation.name.lower()}",
        # The evaluation is community contributed
        suite=["community"],
        # This will automatically get the correct metrics for your chosen formulation
        metric=get_metrics_for_formulation(
            formulation,
            [
                loglikelihood_acc_metric(normalization=None),
                loglikelihood_acc_metric(normalization=LogProbTokenNorm()),
                loglikelihood_acc_metric(normalization=LogProbCharNorm()),
            ],
        ),
        # In this function, you choose which template to follow and for which language and formulation
        prompt_function=get_template_prompt_function(
            language=language,
            # then use the adapter to define the mapping between the
            # keys of the template (left), and the keys of your dataset
            # (right)
            # To know which template keys are required and available,
            # consult the appropriate adapter type and doc-string.
            adapter=lambda line: {
                "key": line["relevant_key"],
                ...
            },
            formulation=formulation,
        ),
        # You can also add specific filters to remove irrelevant samples
        hf_filter=lambda line: line["label"] in <condition>,
        # You then select your huggingface dataset as well as
        # the splits available for evaluation
        hf_repo=<dataset>,
        hf_subset=<subset>,
        evaluation_splits=["train"],
        hf_avail_splits=["train"],
    )
    for language in [
        Language.YOUR_LANGUAGE, ...
    ]
    for formulation in [MCFFormulation(), CFFormulation(), HybridFormulation()]
]
- Then, you can go back to the guide to test whether your task is correctly implemented!
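The nested comprehension in the task definition yields one configuration per (language, formulation) pair. The sketch below reproduces only the naming logic with stand-in classes; the `Lang` enum values and the formulation names are assumptions chosen for illustration, so check the real `Language` enum and formulation classes for the actual values.

```python
from dataclasses import dataclass
from enum import Enum


class Lang(Enum):
    """Stand-in for lighteval's Language enum; the values here are assumed."""
    ENGLISH = "eng"
    FRENCH = "fra"


@dataclass(frozen=True)
class Formulation:
    """Stand-in for MCFFormulation, CFFormulation and HybridFormulation."""
    name: str


formulations = [Formulation("MCF"), Formulation("CF"), Formulation("Hybrid")]

# Same loop order as in the task definition: languages outside, formulations inside.
task_names = [
    f"evalname_{language.value}_{formulation.name.lower()}"
    for language in [Lang.ENGLISH, Lang.FRENCH]
    for formulation in formulations
]

print(len(task_names))
print(task_names)
```

Two languages and three formulations give six task names under these assumptions, which is a quick way to check that every combination you intended is actually registered.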
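All LightevalTaskConfig parameters are strongly typed, including the inputs to the template function. Make sure to take advantage of your IDE's features to fill in these parameters correctly.
Once everything is ready, open a PR, and we'll be happy to review it!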