Evaluate models on a server or container
An alternative to launching the evaluation locally is to serve the model on a TGI-compatible server/container and then run the evaluation by sending requests to the server. The command is the same as before, except that you also need to specify the path to a yaml configuration file (detailed below):
lighteval endpoint {tgi,inference-endpoint} \
    "/path/to/config/file" \
    <task parameters>
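For example, a concrete invocation against an endpoint config might look like the following; the task string here is a hypothetical example written in lighteval's suite|task|few_shot|truncate format:

lighteval endpoint inference-endpoint \
    "endpoint_model.yaml" \
    "leaderboard|truthfulqa:mc|0|0"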
There are two types of configuration files that can be used when running on a server:
Hugging Face Inference Endpoints
To launch a model using HuggingFace's Inference Endpoints, you need to provide the following file: endpoint_model.yaml. Lighteval will automatically deploy the endpoint, run the evaluation, and finally delete the endpoint (unless you specify an already-launched endpoint, in which case the endpoint will not be deleted afterwards).
Example of a configuration file:
model_parameters:
  reuse_existing: false # if true, ignore all params in instance, and don't delete the endpoint after evaluation
  # endpoint_name: "llama-2-7B-lighteval" # needs to be lower case without special characters
  model_name: "meta-llama/Llama-2-7b-hf"
  revision: "main" # defaults to "main"
  dtype: "float16" # can be any of "awq", "eetq", "gptq", "4bit" or "8bit" (will use bitsandbytes), "bfloat16" or "float16"
  accelerator: "gpu"
  region: "eu-west-1"
  vendor: "aws"
  instance_type: "nvidia-a10g"
  instance_size: "x1"
  framework: "pytorch"
  endpoint_type: "protected"
  namespace: null # the namespace under which to launch the endpoint; defaults to the current user's namespace
  image_url: null # optionally specify the docker image to use when launching the endpoint model, e.g. a later TGI release with support for newer models
  env_vars: null # optional environment variables to include when launching the endpoint, e.g. `MAX_INPUT_LENGTH: 2048`
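If you instead want to evaluate against an endpoint that is already running (and keep it alive afterwards), the comments above suggest setting reuse_existing together with the endpoint's name. A minimal sketch, assuming an endpoint named llama-2-7b-lighteval is already deployed:

model_parameters:
  reuse_existing: true # all instance params above are ignored, and the endpoint is not deleted after evaluation
  endpoint_name: "llama-2-7b-lighteval" # name of the already-running endpoint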
Text Generation Inference (TGI)
To use a model already deployed on a TGI server, for example on HuggingFace's serverless inference.
Example of a configuration file:
model_parameters:
  inference_server_address: ""
  inference_server_auth: null
  model_id: null # Optional, only required if the TGI container was launched with model_id pointing to a local directory
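As a worked example, you could first serve a model in a local TGI container and then point the configuration file at it. This is only a sketch, assuming Docker with GPU support; the image tag, port mapping, and volume path are assumptions to adapt to your setup:

docker run --gpus all -p 8080:80 \
    -v $PWD/data:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id meta-llama/Llama-2-7b-hf

The configuration file would then set inference_server_address: "http://localhost:8080" (leaving inference_server_auth as null if the server requires no token), and the evaluation is launched with lighteval endpoint tgi "/path/to/config/file" <task parameters>.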