Optimum documentation

Getting started

Brevitas is an AMD library for neural network quantization. 🤗 Optimum-AMD integrates with Brevitas, making it easier to quantize Transformers models with Brevitas.

This integration also allows models quantized with Brevitas to be exported to ONNX.

For a refresher on quantization concepts, refer to this documentation.

For all available options, refer to `BrevitasQuantizer` and `BrevitasQuantizationConfig`.

Supported models

Currently, only the following architectures have been tested and are supported:

  • Llama
  • OPT

Dynamic quantization

from optimum.amd import BrevitasQuantizationConfig, BrevitasQuantizer

# Prepare the quantizer, specifying its configuration and loading the model.
qconfig = BrevitasQuantizationConfig(
    is_static=False,
    apply_gptq=False,
    apply_weight_equalization=False,
    activations_equalization=False,
    weights_symmetric=True,
    activations_symmetric=False,
)

quantizer = BrevitasQuantizer.from_pretrained("facebook/opt-125m")

model = quantizer.quantize(qconfig)
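With `is_static=False`, activation quantization parameters are computed on the fly from each input tensor, while weight scales are fixed ahead of time. The following is a minimal NumPy sketch of that idea, matching the flags above (`weights_symmetric=True`, `activations_symmetric=False`); it illustrates the arithmetic only and is not Brevitas internals:

```python
import numpy as np

def quantize_symmetric(w, n_bits=8):
    # Symmetric int8 quantization: zero-point fixed at 0, scale taken
    # from the max absolute value (cf. weights_symmetric=True).
    qmax = 2 ** (n_bits - 1) - 1          # 127 for int8
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def quantize_asymmetric(x, n_bits=8):
    # Asymmetric uint8 quantization: scale and zero-point come from the
    # tensor's own min/max *at runtime* -- the "dynamic" part
    # (cf. is_static=False, activations_symmetric=False).
    qmax = 2 ** n_bits - 1                # 255 for uint8
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / qmax
    zero_point = np.round(-lo / scale)
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax).astype(np.uint8)
    return q, scale, zero_point

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)  # toy "weight" matrix
x = rng.normal(loc=1.0, size=(4,)).astype(np.float32)  # toy activation

qw, w_scale = quantize_symmetric(w)
qx, x_scale, x_zp = quantize_asymmetric(x)

# Dequantize and compare against the float matmul.
w_hat = qw.astype(np.float32) * w_scale
x_hat = (qx.astype(np.float32) - x_zp) * x_scale
print(np.abs(w_hat @ x_hat - w @ x).max())  # small quantization error
```

Because activation statistics are recomputed per input, dynamic quantization needs no calibration data, at the cost of extra work at inference time.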

Static quantization

from optimum.amd import BrevitasQuantizationConfig, BrevitasQuantizer
from optimum.amd.brevitas.data_utils import get_dataset_for_model
from transformers import AutoTokenizer

# Prepare the quantizer, specifying its configuration and loading the model.
qconfig = BrevitasQuantizationConfig(
    is_static=True,
    apply_gptq=False,
    apply_weight_equalization=True,
    activations_equalization=False,
    weights_symmetric=True,
    activations_symmetric=False,
)

quantizer = BrevitasQuantizer.from_pretrained("facebook/opt-125m")

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

# Load the data for calibration and evaluation.
calibration_dataset = get_dataset_for_model(
    "facebook/opt-125m",
    qconfig=qconfig,
    dataset_name="wikitext2",
    tokenizer=tokenizer,
    nsamples=128,
    seqlen=512,
    split="train",
)

model = quantizer.quantize(qconfig, calibration_dataset)

Exporting Brevitas models to ONNX

Brevitas models can be exported to the ONNX format using Optimum.

import torch
from optimum.amd.brevitas.export import onnx_export_from_quantized_model

# Export to ONNX through optimum.exporters.
onnx_export_from_quantized_model(model, "llm_quantized_onnx")

Full example

For a full example, see https://github.com/huggingface/optimum-amd/tree/main/examples/quantization/brevitas
