Optimum Inference with ONNX Runtime

Optimum is a utility package for building and running inference with accelerated runtimes such as ONNX Runtime. Optimum can be used to load optimized models from the Hugging Face Hub and create pipelines to run accelerated inference without rewriting your APIs.
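
If the ONNX Runtime backend is not installed yet, it ships as an extra of the package: `pip install optimum[onnxruntime]`.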

Loading

Transformers models

Once your model has been exported to the ONNX format, you can load it by replacing `AutoModelForXxx` with the corresponding `ORTModelForXxx` class.

  from transformers import AutoTokenizer, pipeline
- from transformers import AutoModelForCausalLM
+ from optimum.onnxruntime import ORTModelForCausalLM

- model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B") # PyTorch checkpoint
+ model = ORTModelForCausalLM.from_pretrained("onnx-community/Llama-3.2-1B", subfolder="onnx") # ONNX checkpoint
  tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

  pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
  result = pipe("He never went out without a book under his arm")
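
By default the model runs on CPU through ONNX Runtime. A minimal sketch of selecting a different execution provider at load time, assuming a CUDA-capable machine with the `onnxruntime-gpu` package installed:

  from optimum.onnxruntime import ORTModelForCausalLM

  # Assumption: onnxruntime-gpu is installed and a CUDA device is available
  model = ORTModelForCausalLM.from_pretrained(
      "onnx-community/Llama-3.2-1B",
      subfolder="onnx",
      provider="CUDAExecutionProvider",
  )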

For more information on all supported `ORTModelForXxx` classes, please refer to our documentation.

Diffusers models

Once your model has been exported to the ONNX format, you can load it by replacing `DiffusionPipeline` with the corresponding `ORTDiffusionPipeline` class.

- from diffusers import DiffusionPipeline
+ from optimum.onnxruntime import ORTDiffusionPipeline

  model_id = "runwayml/stable-diffusion-v1-5"
- pipeline = DiffusionPipeline.from_pretrained(model_id)
+ pipeline = ORTDiffusionPipeline.from_pretrained(model_id, revision="onnx")
  prompt = "sailing ship in storm by Leonardo da Vinci"
  image = pipeline(prompt).images[0]

Sentence Transformers models

Once your model has been exported to the ONNX format, you can load it by replacing `AutoModel` with the corresponding `ORTModelForFeatureExtraction` class.

  from transformers import AutoTokenizer
- from transformers import AutoModel
+ from optimum.onnxruntime import ORTModelForFeatureExtraction

  tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
- model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
+ model = ORTModelForFeatureExtraction.from_pretrained("optimum/all-MiniLM-L6-v2")
  inputs = tokenizer("This is an example sentence", return_tensors="pt")
  outputs = model(**inputs)
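
`ORTModelForFeatureExtraction` returns token-level hidden states, so a pooling step is still needed to obtain a single sentence embedding. A minimal sketch of the mean pooling commonly used with this model (the `mean_pool` helper is ours, not part of the API):

  import torch

  def mean_pool(last_hidden_state, attention_mask):
      # Average the token embeddings, ignoring padding positions
      mask = attention_mask.unsqueeze(-1).type_as(last_hidden_state)
      return (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

  sentence_embedding = mean_pool(outputs.last_hidden_state, inputs["attention_mask"])
  print(sentence_embedding.shape)  # torch.Size([1, 384]) for all-MiniLM-L6-v2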

You can also load your ONNX model directly with the `sentence_transformers.SentenceTransformer` class; just make sure you have `sentence-transformers>=3.2` installed. If the model has not been converted to ONNX yet, it will be converted automatically on the fly.

  from sentence_transformers import SentenceTransformer

  model_id = "sentence-transformers/all-MiniLM-L6-v2"
- model = SentenceTransformer(model_id)
+ model = SentenceTransformer(model_id, backend="onnx")

  sentences = ["This is an example sentence", "Each sentence is converted"]
  embeddings = model.encode(sentences)
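
From here the usual Sentence Transformers utilities apply unchanged; for example, computing pairwise similarity scores between the encoded sentences (`similarity` is available since sentence-transformers 3.0):

  # Pairwise cosine similarities between the embedded sentences
  similarities = model.similarity(embeddings, embeddings)
  print(similarities)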

Timm models

Once your model has been exported to the ONNX format, you can load it by replacing `create_model` with the corresponding `ORTModelForImageClassification` class.

  import requests
  from PIL import Image
- from timm import create_model
  from timm.data import resolve_data_config, create_transform
+ from optimum.onnxruntime import ORTModelForImageClassification

- model = create_model("timm/mobilenetv3_large_100.ra_in1k", pretrained=True)
+ model = ORTModelForImageClassification.from_pretrained("optimum/mobilenetv3_large_100.ra_in1k")
  transform = create_transform(**resolve_data_config(model.config.pretrained_cfg, model=model))
  url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"
  image = Image.open(requests.get(url, stream=True).raw)
  inputs = transform(image).unsqueeze(0)
  outputs = model(inputs)
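
The output carries standard classification logits, so the usual post-processing applies. A minimal sketch for extracting the five most likely class indices (mapping indices to label names is model-specific and omitted here):

  import torch

  # Convert logits to probabilities and take the top-5 classes
  probabilities = torch.nn.functional.softmax(outputs.logits[0], dim=-1)
  top5_prob, top5_idx = torch.topk(probabilities, k=5)
  print(top5_idx.tolist(), top5_prob.tolist())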

Converting a model to ONNX on the fly

If your model has not been converted to ONNX yet, ORTModel includes a method to convert it on the fly. Simply pass `export=True` to the from_pretrained() method, and your model will be loaded and converted to the ONNX format on the fly.

>>> from optimum.onnxruntime import ORTModelForSequenceClassification

>>> # Load the model from the hub and export it to the ONNX format
>>> model_id = "distilbert-base-uncased-finetuned-sst-2-english"
>>> model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
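
The exported model behaves like any other `ORTModelForXxx`, so it can be used right away, for example in a `transformers` pipeline (the example input is ours):

>>> from transformers import AutoTokenizer, pipeline

>>> tokenizer = AutoTokenizer.from_pretrained(model_id)
>>> classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
>>> classifier("Optimum and ONNX Runtime make a great pair!")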

Pushing a model to the Hub

You can also call `push_to_hub` directly on your model to upload it to the Hub.

>>> from optimum.onnxruntime import ORTModelForSequenceClassification

>>> # Load the model from the hub and export it to the ONNX format
>>> model_id = "distilbert-base-uncased-finetuned-sst-2-english"
>>> model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

>>> # Save the converted model locally
>>> output_dir = "a_local_path_for_convert_onnx_model"
>>> model.save_pretrained(output_dir)

>>> # Push the ONNX model to the HF Hub
>>> model.push_to_hub(output_dir, repository_id="my-onnx-repo")
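
Once pushed, the model can be loaded back from the Hub like any other ONNX checkpoint; the repository name below is the hypothetical one from the snippet above, prefixed with your username:

>>> # Reload the ONNX model from the Hub (replace the namespace with your own)
>>> model = ORTModelForSequenceClassification.from_pretrained("your-username/my-onnx-repo")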