Run Inference

This section shows how to run inference-only workloads on Intel Gaudi accelerators.

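A good way to get started quickly is to review the inference examples provided in Optimum for Intel Gaudi [here].

You can also explore the examples in the Optimum for Intel Gaudi repository. The examples folder covers both training and inference, and the inference-specific content offers valuable guidance on optimizing and running workloads on Intel Gaudi accelerators.

For more advanced information about how to speed up inference, check out this guide.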

With GaudiTrainer

Below is a template for running inference with a GaudiTrainer instance, where we want to compute the accuracy over a given dataset:

import evaluate
import numpy as np

from optimum.habana import GaudiTrainer

metric = evaluate.load("accuracy")

# You can define your custom compute_metrics function. It takes an `EvalPrediction` object (a namedtuple with
# `predictions` and `label_ids` fields) and has to return a dictionary mapping strings to floats.
def my_compute_metrics(p):
    return metric.compute(predictions=np.argmax(p.predictions, axis=1), references=p.label_ids)

# Trainer initialization
trainer = GaudiTrainer(
    model=my_model,
    gaudi_config=my_gaudi_config,
    args=my_args,
    train_dataset=None,
    eval_dataset=eval_dataset,
    compute_metrics=my_compute_metrics,
    tokenizer=my_tokenizer,
    data_collator=my_data_collator,
)

# Run inference
metrics = trainer.evaluate()
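To make the metric computation concrete, here is a minimal sketch of what a function like my_compute_metrics calculates, written in pure Python so that it runs without NumPy, evaluate, or a Gaudi device. The names FakeEvalPrediction and accuracy_from_logits are hypothetical and are not part of any library; the namedtuple only mimics the two fields described in the comment above.

```python
from collections import namedtuple

# Hypothetical stand-in for the `EvalPrediction` object the trainer passes in.
FakeEvalPrediction = namedtuple("FakeEvalPrediction", ["predictions", "label_ids"])


def accuracy_from_logits(p):
    """Return {"accuracy": float}: the share of rows whose top-scoring class equals the label."""
    # Pure-Python equivalent of np.argmax(p.predictions, axis=1) for a list of rows.
    predicted = [max(range(len(row)), key=row.__getitem__) for row in p.predictions]
    correct = sum(int(pred == label) for pred, label in zip(predicted, p.label_ids))
    return {"accuracy": correct / len(p.label_ids)}


# Four samples, two classes; the third sample is misclassified.
example = FakeEvalPrediction(
    predictions=[[0.1, 0.9], [2.0, -1.0], [0.3, 0.4], [1.5, 0.2]],
    label_ids=[1, 0, 0, 0],
)
result = accuracy_from_logits(example)
print(result)  # {'accuracy': 0.75}
```

The returned dictionary maps a metric name to a float, which is the shape a compute_metrics function is expected to return.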

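The variable my_args should contain some inference-specific arguments. You can take a look here to see which arguments can be useful to set for inference.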
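As a rough illustration, the inference-related settings could be collected in a plain dictionary before building my_args. This is only a sketch: the key names mirror the command-line flags used in the template later in this section, the values are placeholders, and the commented-out lines assume that optimum-habana is installed and that GaudiTrainingArguments accepts these fields.

```python
# Placeholder values; the key names mirror the command-line flags used in this section.
inference_kwargs = {
    "output_dir": "path_to_my_output_dir",
    "do_train": False,                      # inference only, no training
    "do_eval": True,                        # run evaluation
    "per_device_eval_batch_size": 8,        # hypothetical batch size
    "use_habana": True,
    "use_lazy_mode": True,
    "use_hpu_graphs_for_inference": True,
}

# Assumption: with optimum-habana installed, the dictionary can be unpacked like this:
# from optimum.habana import GaudiTrainingArguments
# my_args = GaudiTrainingArguments(**inference_kwargs)

print(sorted(inference_kwargs))
```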

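In our Examples

All our examples contain instructions for running inference with a given model on a given dataset. The approach is the same for every example: run the example script with --do_eval and --per_device_eval_batch_size, and without --do_train. A simple template is the following: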

PT_HPU_LAZY_MODE=1 python path_to_the_example_script \
  --model_name_or_path my_model_name \
  --gaudi_config_name my_gaudi_config_name \
  --dataset_name my_dataset_name \
  --do_eval \
  --per_device_eval_batch_size my_batch_size \
  --output_dir path_to_my_output_dir \
  --use_habana \
  --use_lazy_mode \
  --use_hpu_graphs_for_inference
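If you launch several evaluation runs from Python, the template above can be assembled programmatically. The sketch below is one possible way to do it: build_eval_command is a hypothetical helper, not part of Optimum for Intel Gaudi, and all argument values are placeholders. It only builds and prints the command; nothing is executed.

```python
import os
import shlex


def build_eval_command(script, model, gaudi_config, dataset, batch_size, output_dir):
    """Build the argument list for an evaluation-only run (note: no --do_train)."""
    return [
        "python", script,
        "--model_name_or_path", model,
        "--gaudi_config_name", gaudi_config,
        "--dataset_name", dataset,
        "--do_eval",
        "--per_device_eval_batch_size", str(batch_size),
        "--output_dir", output_dir,
        "--use_habana",
        "--use_lazy_mode",
        "--use_hpu_graphs_for_inference",
    ]


cmd = build_eval_command(
    script="path_to_the_example_script",
    model="my_model_name",
    gaudi_config="my_gaudi_config_name",
    dataset="my_dataset_name",
    batch_size=8,  # placeholder value
    output_dir="path_to_my_output_dir",
)

# Same environment variable as in the shell template.
env = {**os.environ, "PT_HPU_LAZY_MODE": "1"}

print(shlex.join(cmd))
# To actually launch the run: subprocess.run(cmd, env=env, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting issues when a path or model name contains spaces.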
