
FP8

The functions and classes below relate to the underlying FP8 implementation.

FP8RecipeKwargs

class accelerate.utils.FP8RecipeKwargs

( opt_level: typing.Literal['O1', 'O2'] = None use_autocast_during_eval: bool = None margin: int = None interval: int = None fp8_format: typing.Literal['HYBRID', 'E4M3', 'E5M2'] = None amax_history_len: int = None amax_compute_algo: typing.Literal['max', 'most_recent'] = None override_linear_precision: tuple = None backend: typing.Literal['MSAMP', 'TE'] = None )

Deprecated. Use the appropriate FP8 recipe kwargs class instead, such as `TERecipeKwargs` or `MSAMPRecipeKwargs`.

convert_model

accelerate.utils.convert_model

( model to_transformer_engine = True _convert_linear = True _convert_ln = True )

Recursively converts the linear and layer-normalization layers of a model to their Transformer Engine (`transformer_engine`) counterparts.
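The recursive replacement pattern can be sketched without any deep learning framework. This is a minimal illustration, not Accelerate's implementation: `Module`, `Linear`, `FP8Linear`, and `convert_linear_layers` are hypothetical toy names standing in for real framework modules and for `convert_model`.

```python
class Module:
    """Toy stand-in for a framework module that owns named child modules."""

    def __init__(self, **children):
        self.children = dict(children)


class Linear(Module):
    """Toy linear layer that only records its shape."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features


class FP8Linear(Linear):
    """Hypothetical FP8-capable replacement for Linear."""


def convert_linear_layers(model):
    """Walk the module tree and swap every plain Linear for an FP8Linear."""
    # Iterate over a copy because the dict is modified during the walk.
    for name, child in list(model.children.items()):
        if type(child) is Linear:
            # Replace the layer in place, keeping the same shape.
            model.children[name] = FP8Linear(child.in_features, child.out_features)
        else:
            # Not a plain Linear: descend into its own children.
            convert_linear_layers(child)
    return model


model = Module(
    encoder=Module(proj=Linear(16, 32)),
    head=Linear(32, 4),
)
convert_linear_layers(model)
```

After the call, both the top-level `head` and the nested `encoder.proj` are `FP8Linear` instances, which is the effect the recursion is meant to achieve.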

has_transformer_engine_layers

accelerate.utils.has_transformer_engine_layers

( model )

Returns whether the given model contains any `transformer_engine` layers.
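A check of this kind usually amounts to walking every submodule and testing its type. The sketch below shows that idea with toy classes; `Module`, `FP8Linear`, and `has_layers_of_type` are hypothetical names, and the real function may be implemented differently.

```python
class Module:
    """Toy stand-in for a framework module with named children."""

    def __init__(self, **children):
        self.children = dict(children)

    def modules(self):
        """Yield this module and then every descendant, depth first."""
        yield self
        for child in self.children.values():
            yield from child.modules()


class FP8Linear(Module):
    """Hypothetical marker class for an FP8-capable layer."""


def has_layers_of_type(model, layer_types):
    """Return True if any module in the tree is an instance of layer_types."""
    return any(isinstance(m, layer_types) for m in model.modules())


plain = Module(block=Module())
mixed = Module(block=Module(proj=FP8Linear()))

plain_result = has_layers_of_type(plain, (FP8Linear,))
mixed_result = has_layers_of_type(mixed, (FP8Linear,))
```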

contextual_fp8_autocast

accelerate.utils.contextual_fp8_autocast

( model_forward fp8_recipe use_during_eval = False )

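A wrapper for a model's forward method that applies FP8 autocast. It is context-aware: by default it disables FP8 autocast when the model is in evaluation mode, which generally yields more accurate metrics.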
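The context-aware behaviour can be illustrated with a small, framework-free sketch. Everything here is an assumption made for illustration: `fake_fp8_autocast` stands in for a real FP8 autocast context manager, `ToyModel` stands in for a model with a `training` flag, and `contextual_autocast` mirrors only the documented signature, not the library's actual code.

```python
from contextlib import contextmanager
from types import MethodType

events = []  # records, per forward call, whether autocast was enabled


@contextmanager
def fake_fp8_autocast(enabled, fp8_recipe=None):
    """Stand-in context manager; a real one would switch FP8 execution on or off."""
    events.append(enabled)
    yield


def contextual_autocast(model_forward, fp8_recipe, use_during_eval=False):
    """Wrap a bound forward method so autocast follows the model's mode."""

    def forward(self, *args, **kwargs):
        # Enable autocast while training, or always if use_during_eval is set.
        enabled = use_during_eval or self.training
        with fake_fp8_autocast(enabled=enabled, fp8_recipe=fp8_recipe):
            return model_forward(*args, **kwargs)

    return forward


class ToyModel:
    training = True

    def forward(self, x):
        return x * 2


model = ToyModel()
# Rebind the wrapped function as the instance's forward method.
model.forward = MethodType(contextual_autocast(model.forward, fp8_recipe=None), model)

model.forward(1)        # training mode: autocast enabled
model.training = False
model.forward(1)        # evaluation mode: autocast disabled by default
```

Passing `use_during_eval=True` in this sketch would keep autocast enabled in both modes, matching the parameter's described purpose.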

apply_fp8_autowrap

accelerate.utils.apply_fp8_autowrap

( model fp8_recipe_handler )

Applies the FP8 context manager to the model's forward method.

