Optimum documentation
Quantization
🤗 Optimum provides an optimum.furiosa package that enables you to apply quantization on many models hosted on the Hugging Face Hub using the Furiosa quantization tool.

The quantization process is abstracted via the FuriosaAIConfig and the FuriosaAIQuantizer classes. The former allows you to specify how quantization should be done, while the latter effectively handles quantization.
Static quantization example
The FuriosaAIQuantizer class can be used to statically quantize your ONNX model. Below you will find an easy end-to-end example on how to statically quantize eugenecamus/resnet-50-base-beans-demo.
>>> from functools import partial
>>> from pathlib import Path
>>> from transformers import AutoFeatureExtractor
>>> from optimum.furiosa import FuriosaAIQuantizer, FuriosaAIModelForImageClassification
>>> from optimum.furiosa.configuration import AutoCalibrationConfig, QuantizationConfig
>>> from optimum.furiosa.utils import export_model_to_onnx
>>> model_id = "eugenecamus/resnet-50-base-beans-demo"
# Convert the PyTorch model to ONNX, then create the quantizer and set up its config
>>> feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
>>> batch_size = 1
>>> image_size = feature_extractor.size["shortest_edge"]
>>> num_labels = 3
>>> onnx_model_name = "model.onnx"
>>> output_dir = "output"
>>> onnx_model_path = Path(output_dir) / onnx_model_name
>>> export_model_to_onnx(
... model_id,
... save_dir=output_dir,
... input_shape_dict={"pixel_values": [batch_size, 3, image_size, image_size]},
... output_shape_dict={"logits": [batch_size, num_labels]},
... file_name=onnx_model_name,
... )
>>> quantizer = FuriosaAIQuantizer.from_pretrained(output_dir, file_name=onnx_model_name)
>>> qconfig = QuantizationConfig()
# Create the calibration dataset
>>> def preprocess_fn(ex, feature_extractor):
... return feature_extractor(ex["image"])
>>> calibration_dataset = quantizer.get_calibration_dataset(
... "beans",
... preprocess_function=partial(preprocess_fn, feature_extractor=feature_extractor),
... num_samples=50,
... dataset_split="train",
... )
# Create the calibration configuration containing the parameters related to calibration.
>>> calibration_config = AutoCalibrationConfig.mse_asym(calibration_dataset)
# Perform the calibration step: computes the activation quantization ranges
>>> ranges = quantizer.fit(
... dataset=calibration_dataset,
... calibration_config=calibration_config,
... )
# Apply static quantization on the model
>>> model_quantized_path = quantizer.quantize(
...     save_dir=output_dir,
... calibration_tensors_range=ranges,
... quantization_config=qconfig,
... )
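The ranges computed during calibration are per-tensor min/max bounds for the activations; static quantization turns each range into a scale and zero-point that map floats onto int8. The sketch below is a pure-Python illustration of this asymmetric affine scheme (the values and helper names are our own, not the Furiosa internals, which operate on the ONNX graph directly):

```python
# Conceptual sketch of asymmetric int8 affine quantization, the scheme that
# calibrated activation ranges feed into. Illustration only: the actual
# quantization is performed by FuriosaAIQuantizer on the ONNX model.

def asym_qparams(rmin, rmax, qmin=-128, qmax=127):
    """Derive a scale and zero-point mapping [rmin, rmax] onto [qmin, qmax]."""
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)  # the range must contain 0.0
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Quantize floats to int8, clamping to the representable range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]

def dequantize(q_values, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in q_values]

# A hypothetical calibrated activation range, like those returned by quantizer.fit(...)
scale, zp = asym_qparams(rmin=-0.5, rmax=2.0)
q = quantize([-0.5, 0.0, 1.0, 2.0], scale, zp)       # -> [-128, -77, 25, 127]
approx = dequantize(q, scale, zp)                     # close to the original floats
```

Note that AutoCalibrationConfig.mse_asym, used above, selects the asymmetric range so as to minimize the mean squared quantization error, rather than simply taking the raw observed min/max.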