Transformers

( config: PretrainedConfig *inputs **kwargs )

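Base class for all models.

PreTrainedModel takes care of storing the configuration of the models and handles methods for loading, downloading and saving models, as well as a few methods common to all models to:

- resize the input embeddings,
- prune heads in the self-attention heads.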

類屬性（由派生類覆蓋）

config_class (PretrainedConfig) — 一個 PretrainedConfig 的子類，用作此模型架構的配置類。
load_tf_weights (Callable) — 一個用於在 PyTorch 模型中載入 TensorFlow 檢查點的 Python **方法**，其引數為
- model (PreTrainedModel) — 要載入 TensorFlow 檢查點的模型例項。
- config (PretrainedConfig) — The configuration instance associated with the model.
- path (str) — TensorFlow 檢查點的路徑。
base_model_prefix (str) — 一個字串，指示在新增模組到基礎模型之上的相同架構的派生類中，與基礎模型關聯的屬性。
is_parallelizable (bool) — 指示此模型是否支援模型並行化的標誌。
main_input_name (str) — 模型的主要輸入名稱（NLP 模型通常為 input_ids，視覺模型為 pixel_values，語音模型為 input_values）。

push_to_hub

( repo_id: str use_temp_dir: typing.Optional[bool] = None commit_message: typing.Optional[str] = None private: typing.Optional[bool] = None token: typing.Union[bool, str, NoneType] = None max_shard_size: typing.Union[str, int, NoneType] = '5GB' create_pr: bool = False safe_serialization: bool = True revision: typing.Optional[str] = None commit_description: typing.Optional[str] = None tags: typing.Optional[list[str]] = None **deprecated_kwargs )

引數

repo_id (str) — 您要將模型推送到儲存庫的名稱。在推送到給定組織時，它應包含您的組織名稱。
use_temp_dir (bool, 可選) — 是否使用臨時目錄來儲存在推送到 Hub 之前儲存的檔案。如果沒有名為 repo_id 的目錄，則預設為 True，否則為 False。
commit_message (str, 可選) — 推送時提交的訊息。預設為 "Upload model"。
private (bool, 可選) — 是否將儲存庫設為私有。如果為 None（預設），則儲存庫將是公開的，除非組織的預設設定為私有。如果儲存庫已存在，此值將被忽略。
token (bool 或 str, 可選) — 用於遠端檔案的 HTTP 持有者授權令牌。如果為 True 或未指定，將使用執行 huggingface-cli login 時生成的令牌（儲存在 ~/.huggingface 中）。如果未指定 repo_url，則預設為 True。
max_shard_size (int 或 str, 可選, 預設為 "5GB") — 僅適用於模型。分片前檢查點的最大大小。檢查點分片的大小將小於此大小。如果表示為字串，則需要是數字後跟單位（如 "5MB"）。我們將其預設為 "5GB"，以便使用者可以在免費層 Google Colab 例項上輕鬆載入模型，而不會出現任何 CPU OOM 問題。
create_pr (bool, 可選, 預設為 False) — 是否使用上傳檔案建立拉取請求或直接提交。
safe_serialization (bool, 可選, 預設為 True) — 是否將模型權重轉換為 safetensors 格式以進行更安全的序列化。
revision (str, 可選) — 要將上傳檔案推送到的分支。
commit_description (str, 可選) — 將要建立的提交的描述
tags (list[str], 可選) — 要推送到 Hub 的標籤列表。

將模型檔案上傳到 🤗 模型中心。

示例

from transformers import AutoModel

model = AutoModel.from_pretrained("google-bert/bert-base-cased")

# Push the model to your namespace with the name "my-finetuned-bert".
model.push_to_hub("my-finetuned-bert")

# Push the model to an organization with the name "my-finetuned-bert".
model.push_to_hub("huggingface/my-finetuned-bert")

add_model_tags

( tags: typing.Union[list[str], str] )

引數

tags (Union[list[str], str]) — 要注入到模型中的所需標籤

將自定義標籤新增到推送到 Hugging Face Hub 的模型中。不會覆蓋模型中現有的標籤。

示例

from transformers import AutoModel

model = AutoModel.from_pretrained("google-bert/bert-base-cased")

model.add_model_tags(["custom", "custom-bert"])

# Push the model to your namespace with the name "my-custom-bert".
model.push_to_hub("my-custom-bert")

can_generate

( ) → bool

布林值

此模型是否可以使用 .generate() 生成序列。

返回此模型是否可以使用 GenerationMixin 中的 .generate() 生成序列。

在底層，對於返回 True 的類，會觸發一些特定於生成的更改：例如，模型例項將具有填充的 generation_config 屬性。

dequantize

( )

如果模型已透過支援反量化的量化方法進行量化，則可能對模型進行反量化。

disable_input_require_grads

( )

移除 _require_grads_hook。

enable_input_require_grads

( )

啟用輸入嵌入的梯度。這對於在保持模型權重不變的情況下微調適配器權重很有用。

from_pretrained

( pretrained_model_name_or_path: typing.Union[str, os.PathLike, NoneType] *model_args config: typing.Union[transformers.configuration_utils.PretrainedConfig, str, os.PathLike, NoneType] = None cache_dir: typing.Union[str, os.PathLike, NoneType] = None ignore_mismatched_sizes: bool = False force_download: bool = False local_files_only: bool = False token: typing.Union[str, bool, NoneType] = None revision: str = 'main' use_safetensors: typing.Optional[bool] = None weights_only: bool = True **kwargs )

引數

pretrained_model_name_or_path (str 或 os.PathLike, 可選) — 可以是：
- 一個字串，託管在 huggingface.co 上的模型倉庫中的預訓練模型的*模型 ID*。
- 一個*目錄*的路徑，包含使用 save_pretrained() 儲存的模型權重，例如 ./my_model_directory/。
- 一個 *tensorflow 索引檢查點檔案*的路徑或 URL（例如，./tf_model/model.ckpt.index）。在這種情況下，from_tf 應該設定為 True，並且應該提供一個配置物件作為 config 引數。這種載入路徑比使用提供的轉換指令碼將 TensorFlow 檢查點轉換為 PyTorch 模型，然後載入 PyTorch 模型要慢。
- 一個包含 *flax 檢查點檔案*（.msgpack 格式）的模型資料夾的路徑或 URL（例如，包含 flax_model.msgpack 的 ./flax_model/）。在這種情況下，from_flax 應該設定為 True。
- 如果您同時提供了配置和狀態字典（分別透過關鍵字引數 config 和 state_dict），則為 None。
model_args (位置引數序列，可選) — 所有剩餘的位置引數將傳遞給底層模型的 __init__ 方法。
config (Union[PretrainedConfig, str, os.PathLike], 可選) — 可以是：
- PretrainedConfig 派生類的一個例項，
- 一個字串或有效的路徑，作為 from_pretrained() 的輸入。
用於模型的配置，而不是自動載入的配置。配置可以在以下情況下自動載入：
- 模型是庫提供的模型（使用預訓練模型的*模型 ID* 字串載入）。
- 模型使用 save_pretrained() 儲存，並透過提供儲存目錄重新載入。
- 模型透過提供本地目錄作為 pretrained_model_name_or_path 載入，並且在該目錄中找到名為 *config.json* 的配置 JSON 檔案。
state_dict (dict[str, torch.Tensor], 可選) — 要使用的狀態字典，而不是從儲存的權重檔案中載入的狀態字典。

如果您想從預訓練配置建立模型但載入自己的權重，可以使用此選項。但是，在這種情況下，您應該檢查使用 save_pretrained() 和 from_pretrained() 是否不是更簡單的選項。
cache_dir (Union[str, os.PathLike], 可選) — 如果不使用標準快取，則下載的預訓練模型配置應快取到的目錄路徑。
from_tf (bool, 可選, 預設為 False) — 從 TensorFlow 檢查點儲存檔案載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
from_flax (bool, 可選, 預設為 False) — 從 Flax 檢查點儲存檔案載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
ignore_mismatched_sizes (bool, 可選, 預設為 False) — 如果檢查點中的某些權重與模型權重大小不匹配（例如，您正在例項化一個具有 10 個標籤的模型，而檢查點具有 3 個標籤），是否引發錯誤。
force_download (bool, 可選, 預設為 False) — 是否強制（重新）下載模型權重和配置檔案，如果它們存在於快取中，則覆蓋快取版本。
resume_download — 已棄用並忽略。所有下載現在預設情況下都可恢復。將在 Transformers v5 中移除。
proxies (dict[str, str], 可選) — 要按協議或端點使用的代理伺服器字典，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理用於每個請求。
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only(bool, 可選, 預設為 False) — 是否只檢視本地檔案（即，不嘗試下載模型）。
token (str 或 bool, 可選) — 用於遠端檔案的 HTTP 持有者授權令牌。如果為 True 或未指定，將使用執行 huggingface-cli login 時生成的令牌（儲存在 ~/.huggingface 中）。
revision (str, 可選, 預設為 "main") — 要使用的特定模型版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們使用基於 Git 的系統在 huggingface.co 上儲存模型和其他工件，因此 revision 可以是 Git 允許的任何識別符號。

要在 Hub 上測試您建立的拉取請求，您可以傳遞 revision="refs/pr/<pr_number>"。
attn_implementation (str, 可選) — 模型中要使用的注意力實現（如果相關）。可以是 "eager"（注意力的手動實現）、"sdpa"（使用 F.scaled_dot_product_attention）、"flash_attention_2"（使用 Dao-AILab/flash-attention）或 "flash_attention_3"（使用 Dao-AILab/flash-attention/hopper）。預設情況下，如果可用，對於 torch>=2.1.1 將使用 SDPA。否則預設是手動 "eager" 實現。

大型模型推理引數

torch_dtype (str 或 torch.dtype, 可選) — 覆蓋預設的 torch.dtype，並以特定的 dtype 載入模型。不同的選項是：
1. torch.float16, torch.bfloat16 or torch.float: load the model in the specified dtype, ignoring the model's config.torch_dtype if one exists. If torch_dtype is not specified, the model is loaded in torch.float (fp32).
2. "auto" - 將嘗試使用模型 config.json 檔案中的 torch_dtype 條目。如果未找到此條目，則檢查檢查點中第一個浮點型別權重的 dtype 並將其用作 dtype。這將使用模型訓練結束時儲存的 dtype 載入模型。它不能用作模型訓練方式的指標。因為它可能以半精度 dtype 訓練，但以 fp32 儲存。
3. 一個有效的 torch.dtype 字串。例如，“float32”以 torch.float32 載入模型，“float16”以 torch.float16 載入模型等。
對於某些模型，其訓練所用的 dtype 是未知的 - 您可以嘗試檢視模型的論文或聯絡作者，要求他們在模型的卡片中新增此資訊並在 Hub 上的 config.json 中插入 torch_dtype 條目。
device_map (str 或 dict[str, Union[int, str, torch.device]] 或 int 或 torch.device, 可選) — 指定每個子模組應放置在何處的對映。它不需要細化到每個引數/緩衝區名稱，一旦給定的模組名稱在其中，它的每個子模組都將被髮送到同一裝置。如果只傳遞分配模型的裝置（*例如*，"cpu"、"cuda:1"、"mps"，或像 1 這樣的 GPU 順序），則裝置對映將整個模型對映到此裝置。傳遞 device_map = 0 表示將整個模型放置在 GPU 0 上。

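To have Accelerate compute the most optimized device_map automatically, set device_map="auto". For more information about each option, see the Accelerate documentation on designing a device map.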
max_memory (Dict, 可選) — 如果使用 device_map，則是一個字典，將裝置識別符號對映到最大記憶體。如果未設定，將預設為每個 GPU 的最大可用記憶體和可用 CPU RAM。
tp_plan (str, 可選) — Torch 張量並行計劃，請參閱此處。目前，它僅接受 tp_plan="auto" 以使用基於模型的預定義計劃。請注意，如果您使用它，則應相應地使用 torchrun [args] script.py 啟動指令碼。這將比使用 device_map 快得多，但有侷限性。
tp_size (str, 可選) — Torch 張量並行度。如果未提供，則預設為世界大小。
device_mesh (torch.distributed.DeviceMesh, 可選) — Torch 裝置網格。如果未提供，則預設為世界大小。目前僅用於張量並行。
offload_folder (str 或 os.PathLike, 可選) — 如果 device_map 包含任何值 "disk"，則為我們將解除安裝權重的資料夾。
offload_state_dict (bool, 可選) — 如果為 True，將暫時將 CPU 狀態字典解除安裝到硬碟驅動器，以避免在 CPU 狀態字典的權重 + 檢查點的最大分片不適合時超出 CPU RAM。當有磁碟解除安裝時，預設為 True。
offload_buffers (bool, 可選) — 是否與模型引數一起解除安裝緩衝區。
quantization_config (Union[QuantizationConfigMixin,Dict], 可選) — 用於量化的配置引數字典或 QuantizationConfigMixin 物件（例如 bitsandbytes、gptq）。可能還有其他與量化相關的 kwargs，包括 load_in_4bit 和 load_in_8bit，這些由 QuantizationConfigParser 解析。僅支援 bitsandbytes 量化，不推薦使用。考慮將所有此類引數插入 quantization_config。
subfolder (str, 可選, 預設為 "") — 如果相關檔案位於 huggingface.co 上模型 repo 的子資料夾中，您可以在此處指定資料夾名稱。
variant (str, optional) — If specified, load weights from the variant filename, e.g. pytorch_model.<variant>.bin. variant is ignored when using from_tf or from_flax.
use_safetensors (bool, 可選, 預設為 None) — 是否使用 safetensors 檢查點。預設為 None。如果未指定且未安裝 safetensors，則將其設定為 False。
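weights_only (bool, optional, defaults to True) — Indicates whether the unpickler should be restricted to loading only tensors, primitive types, dictionaries and any types added via torch.serialization.add_safe_globals(). When set to False, wrapper tensor subclass weights can be loaded.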
key_mapping (dict[str, str], 可選) — 如果使用與 Transformers 架構相容但未相應轉換的 Hub 上的模型，則為權重名稱的潛在對映。
kwargs (剩餘的關鍵字引數字典，可選) — 可用於更新配置物件（載入後）並初始化模型（例如，output_attentions=True）。其行為取決於是否提供了 config 或自動載入：
- 如果提供了 config，**kwargs 將直接傳遞給底層模型的 __init__ 方法（我們假設所有相關的配置更新都已完成）
- 如果未提供配置，kwargs 將首先傳遞給配置類初始化函式（from_pretrained()）。kwargs 中與配置屬性對應的每個鍵都將用於使用提供的 kwargs 值覆蓋該屬性。不對應任何配置屬性的剩餘鍵將傳遞給底層模型的 __init__ 函式。
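The second case above (no config passed) can be pictured with a small sketch. It is only an illustration: `ToyConfig` and `split_kwargs` are hypothetical names, not part of Transformers, and the real loading code does considerably more.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class (not PretrainedConfig)."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes; return the rest for the model's __init__."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the value loaded with the config
        else:
            unused[key] = value  # would be forwarded to the model constructor
    return unused


config = ToyConfig()
model_kwargs = split_kwargs(config, {"output_attentions": True, "my_flag": 3})
print(config.output_attentions, model_kwargs)
```

Here `output_attentions` matches a config attribute and is written into the config, while the made-up `my_flag` is left over for the model's `__init__`.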

從預訓練模型配置例項化預訓練 pytorch 模型。

預設情況下，模型使用 model.eval() 設定為評估模式（Dropout 模組被停用）。要訓練模型，您應該首先使用 model.train() 將其設定回訓練模式。

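The warning "Weights from XXX not initialized from pretrained model" means that the weights of XXX do not come pretrained with the rest of the model. It is up to you to train those weights with a downstream fine-tuning task.

The warning "Weights from XXX not used in YYY" means that the layer XXX is not used by YYY, therefore those weights are discarded.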

啟用特殊的“離線模式”以在防火牆環境中使用此方法。

示例

>>> from transformers import BertConfig, BertModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = BertModel.from_pretrained("google-bert/bert-base-uncased")
>>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable).
>>> model = BertModel.from_pretrained("./test/saved_model/")
>>> # Update configuration during loading.
>>> model = BertModel.from_pretrained("google-bert/bert-base-uncased", output_attentions=True)
>>> assert model.config.output_attentions == True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower, for example purposes, not runnable).
>>> config = BertConfig.from_json_file("./tf_model/my_tf_model_config.json")
>>> model = BertModel.from_pretrained("./tf_model/my_tf_checkpoint.ckpt.index", from_tf=True, config=config)
>>> # Loading from a Flax checkpoint file instead of a PyTorch model (slower)
>>> model = BertModel.from_pretrained("google-bert/bert-base-uncased", from_flax=True)
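The device_map description says the map does not have to list every parameter: once a module name is in it, all of its submodules go to the same device. A minimal sketch of that lookup, assuming a plain dict and a hypothetical helper `resolve_device` (this is not the Accelerate implementation), could look like this. The empty-string key stands for "the whole model", which is how a single device can be expressed in dict form.

```python
def resolve_device(device_map, param_name):
    """Return the device of the longest module prefix of `param_name` found in `device_map`."""
    parts = param_name.split(".")
    # Try the most specific prefix first, then shorter ones; the final
    # iteration tries the empty prefix "", i.e. an entry for the whole model.
    for end in range(len(parts), -1, -1):
        prefix = ".".join(parts[:end])
        if prefix in device_map:
            return device_map[prefix]
    raise KeyError(f"no device found for {param_name!r}")


device_map = {"encoder": 0, "encoder.layer.1": 1, "pooler": "cpu"}
print(resolve_device(device_map, "encoder.layer.0.attention.weight"))
print(resolve_device(device_map, "encoder.layer.1.attention.weight"))
print(resolve_device({"": "cuda:1"}, "pooler.dense.bias"))
```

The more specific entry `encoder.layer.1` wins over `encoder` for parameters below it, and everything else under `encoder` inherits device 0.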

get_compiled_call

( compile_config: typing.Optional[transformers.generation.configuration_utils.CompileConfig] )

返回 self.__call__ 的 torch.compile 版本。這對於在推理期間動態選擇非編譯/編譯的 forward 非常有用，尤其是在預填充（我們不想使用編譯版本以避免使用新形狀重新計算圖）和迭代解碼（我們希望透過靜態形狀獲得編譯版本的加速）之間切換時。

get_input_embeddings

( ) → nn.Module

nn.Module

將詞彙對映到隱藏狀態的 torch 模組。

返回模型的輸入嵌入。

get_memory_footprint

( return_buffers = True )

引數

return_buffers (bool, 可選, 預設為 True) — 在計算記憶體佔用時是否返回緩衝區張量的大小。緩衝區是不需要梯度且未註冊為引數的張量。例如，批次歸一化層中的均值和標準差。請參閱：https://discuss.pytorch.org/t/what-pytorch-means-by-buffers/120266/2

獲取模型的記憶體佔用。這將返回當前模型以位元組為單位的記憶體佔用。這對於衡量當前模型的記憶體佔用和設計一些測試很有用。解決方案靈感來自 PyTorch 討論：https://discuss.pytorch.org/t/gpu-memory-that-model-uses/56822/2

get_output_embeddings

( ) → nn.Module

nn.Module

將隱藏狀態對映到詞彙的 torch 模組。

返回模型的輸出嵌入。

get_parameter_or_buffer

( target: str )

如果存在，則返回由 target 給定的引數或緩衝區，否則丟擲錯誤。這在一個便捷的函式中結合了 get_parameter() 和 get_buffer()。如果目標是 _extra_state 屬性，它將返回模組提供的額外狀態。請注意，它僅在 target 是模型的葉節點時才有效。

gradient_checkpointing_disable

( )

為當前模型停用梯度檢查點。

Note that in other frameworks this feature can be referred to as "activation checkpointing" or "checkpoint activations".

gradient_checkpointing_enable

( gradient_checkpointing_kwargs = None )

引數

gradient_checkpointing_kwargs (字典, 可選) — 傳遞給 torch.utils.checkpoint.checkpoint 函式的附加關鍵字引數。

為當前模型啟用梯度檢查點。

Note that in other frameworks this feature can be referred to as "activation checkpointing" or "checkpoint activations".

我們傳遞模組的 __call__ 方法而不是 forward，因為 __call__ 附加了模組的所有鉤子。https://discuss.pytorch.org/t/any-different-between-model-input-and-model-forward-input/3690/2

init_weights

( )

如果需要，修剪並可能初始化權重。如果使用自定義 PreTrainedModel，您需要在 _init_weights 中實現任何初始化邏輯。

initialize_weights

( )

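This is equivalent to calling self.apply(self._initialize_weights), but correctly handles composite models. The function dynamically dispatches the correct init_weights function to the modules as it advances through the module graph along the recursion. It can handle an arbitrary number of sub-models. Without it, every composite model would have to recurse a second time over all sub-models explicitly in the outermost _init_weights, which is extremely error-prone and inefficient.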

另請注意，torch.no_grad() 裝飾器也非常重要，因為我們的大多數 _init_weights 不使用 torch.nn.init 函式（預設情況下都是 no_grad），而只是執行就地操作，例如 `module.weight.data.zero_()`。

post_init

( )

在每個 Transformer 模型初始化結束時執行的方法，用於執行需要模型模組正確初始化的程式碼（例如權重初始化）。

prune_heads

( heads_to_prune: dict )

引數

heads_to_prune (dict[int, list[int]]) — 字典，其中鍵為選定的層索引 (int)，關聯值為在該層中要修剪的注意力頭列表 (list of int)。例如，{1: [0, 2], 2: [2, 3]} 將修剪第 1 層的注意力頭 0 和 2，以及第 2 層的注意力頭 2 和 3。

修剪基本模型的注意力頭。
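To make the shape of heads_to_prune concrete, the following sketch computes which head indices would remain in each listed layer. `remaining_heads` is a hypothetical helper for illustration only; the actual method removes the corresponding weights from the attention modules.

```python
def remaining_heads(num_heads, heads_to_prune):
    """Return, for each layer listed in `heads_to_prune`, the head indices left after pruning."""
    kept = {}
    for layer, heads in heads_to_prune.items():
        pruned = set(heads)
        kept[layer] = [h for h in range(num_heads) if h not in pruned]
    return kept


# Same dictionary as in the parameter description, for a model with 4 heads per layer.
print(remaining_heads(4, {1: [0, 2], 2: [2, 3]}))
```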

register_for_auto_class

( auto_class = 'AutoModel' )

引數

auto_class (str 或 type, 可選, 預設為 "AutoModel") — 要註冊此新模型的自動類。

將此類註冊到給定的自動類。這僅應用於自定義模型，因為庫中的模型已經對映到自動類。

resize_token_embeddings

( new_num_tokens: typing.Optional[int] = None pad_to_multiple_of: typing.Optional[int] = None mean_resizing: bool = True ) → torch.nn.Embedding

引數

new_num_tokens (int, 可選) — 嵌入矩陣中的新 token 數量。增加大小將在末尾新增新初始化的向量。減小大小將從末尾移除向量。如果未提供或為 None，則只返回模型輸入 token torch.nn.Embedding 模組的指標，不進行任何操作。
pad_to_multiple_of (int, 可選) — 如果設定，將把嵌入矩陣填充到提供值的倍數。如果 new_num_tokens 設定為 None，則只會將嵌入填充到 pad_to_multiple_of 的倍數。

這對於在計算能力 >= 7.5 (Volta) 的 NVIDIA 硬體上，或在序列長度為 128 倍數的 TPU 上，啟用 Tensor Cores 的使用特別有用。有關此內容的更多詳細資訊，或有關選擇正確大小調整值的幫助，請參閱此指南：https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
mean_resizing (bool) — 是否從具有舊嵌入均值和協方差的多元正態分佈初始化新增嵌入，或者是否使用均值為零且標準差等於 config.initializer_range 的正態分佈初始化它們。

當增加因果語言模型的嵌入大小時，將 mean_resizing 設定為 True 非常有用，因為新增嵌入不會影響生成的 token 的機率，因為用舊嵌入的均值初始化新嵌入將減少新增嵌入前後下一個 token 機率之間的 kl-散度。有關更多資訊，請參閱此文章：https://nlp.stanford.edu/~johnhew/vocab-expansion.html

torch.nn.Embedding

指向模型輸入 token 嵌入模組的指標。

如果 new_num_tokens != config.vocab_size，則調整模型的輸入 token 嵌入矩陣。

如果模型類具有 tie_weights() 方法，則之後處理繫結權重嵌入。
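The effect of pad_to_multiple_of on the final embedding size is a round-up to the next multiple. A minimal sketch of that arithmetic, using a hypothetical helper name, is:

```python
def padded_vocab_size(new_num_tokens, pad_to_multiple_of=None):
    """Round `new_num_tokens` up to the next multiple of `pad_to_multiple_of`."""
    if pad_to_multiple_of is None:
        return new_num_tokens
    if not isinstance(pad_to_multiple_of, int) or pad_to_multiple_of <= 0:
        raise ValueError("pad_to_multiple_of must be a positive integer")
    # Ceiling division, then scale back up to a multiple.
    return -(-new_num_tokens // pad_to_multiple_of) * pad_to_multiple_of


print(padded_vocab_size(30524, 64))
print(padded_vocab_size(30528, 64))
```

A vocabulary of 30524 tokens padded to a multiple of 64 gets 4 extra rows; a size that is already a multiple is left unchanged.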

reverse_bettertransformer

( ) → PreTrainedModel

PreTrainedModel

轉換回原始建模的模型。

恢復 to_bettertransformer() 的轉換，以便使用原始建模，例如為了儲存模型。

save_pretrained

( save_directory: typing.Union[str, os.PathLike] is_main_process: bool = True state_dict: typing.Optional[dict] = None save_function: typing.Callable = torch.save push_to_hub: bool = False max_shard_size: typing.Union[int, str] = '5GB' safe_serialization: bool = True variant: typing.Optional[str] = None token: typing.Union[str, bool, NoneType] = None save_peft_format: bool = True **kwargs )

引數

save_directory (str 或 os.PathLike) — 儲存到的目錄。如果不存在，將建立。
is_main_process (bool, 可選, 預設為 True) — 呼叫此過程的程序是否為主程序。在分散式訓練（如 TPUs）中非常有用，需要在所有程序上呼叫此函式。在這種情況下，僅在主程序上設定 is_main_process=True 以避免競爭條件。
state_dict (torch.Tensor 的巢狀字典) — 要儲存的模型的狀態字典。預設為 self.state_dict()，但可用於僅儲存模型的一部分，或者在恢復模型的狀態字典時需要特殊預防措施（例如在使用模型並行時）。
save_function (Callable) — 用於儲存狀態字典的函式。在分散式訓練（如 TPUs）中非常有用，當需要用另一種方法替換 torch.save 時。
push_to_hub (bool, 可選, 預設為 False) — 是否在儲存模型後將其推送到 Hugging Face 模型 hub。您可以使用 repo_id 指定要推送到的倉庫（預設為您名稱空間中 save_directory 的名稱）。
max_shard_size (int 或 str, 可選, 預設為 "5GB") — 檢查點在分片前的最大大小。檢查點分片的大小將低於此大小。如果表示為字串，則需要是數字後跟單位（如 "5MB"）。我們預設將其設定為 5GB，以便模型能夠在免費版 Google Colab 例項上輕鬆執行而不會出現 CPU OOM 問題。

如果模型中的單個權重大於 max_shard_size，它將位於其自己的檢查點分片中，該分片將大於 max_shard_size。
safe_serialization (bool, 可選, 預設為 True) — 是否使用 safetensors 或傳統的 PyTorch 方式（使用 pickle）儲存模型。
variant (str, optional) — If specified, weights are saved in the format pytorch_model.<variant>.bin.
token (str 或 bool, 可選) — 用於遠端檔案的 HTTP bearer 授權的 token。如果為 True 或未指定，將使用執行 huggingface-cli login 時生成的 token（儲存在 ~/.huggingface 中）。
save_peft_format (bool, 可選, 預設為 True) — 為了與 PEFT 庫向後相容，如果介面卡權重附加到模型，則介面卡狀態字典的所有鍵都需要以 base_model.model 為字首。高階使用者可以透過將 save_peft_format 設定為 False 來停用此行為。
kwargs (dict[str, Any], optional) — 傳遞給 push_to_hub() 方法的附加關鍵字引數。

將模型及其配置檔案儲存到目錄中，以便可以使用 from_pretrained() 類方法重新載入。
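The max_shard_size behaviour described above (shards stay under the limit, except that a single weight larger than the limit gets a shard of its own) can be sketched as a greedy split. This is an illustration under stated assumptions, not the library's code: `parse_size` and `shard_weights` are hypothetical names, and the sketch assumes decimal units (1 MB = 10^6 bytes).

```python
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9}  # assumption: decimal units


def parse_size(size):
    """Convert an int (bytes) or a string such as "5MB" into a number of bytes."""
    if isinstance(size, int):
        return size
    text = size.strip().upper()
    for unit, factor in UNITS.items():
        if text.endswith(unit):
            return int(text[: -len(unit)]) * factor
    raise ValueError(f"unrecognised size: {size!r}")


def shard_weights(weight_sizes, max_shard_size):
    """Greedily group weight names into shards whose total size stays within the limit."""
    limit = parse_size(max_shard_size)
    shards, current, current_size = [], [], 0
    for name, size in weight_sizes.items():
        # Start a new shard when adding this weight would exceed the limit.
        if current and current_size + size > limit:
            shards.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        shards.append(current)
    return shards


sizes = {"a": 2 * 10**6, "b": 2 * 10**6, "c": 12 * 10**6, "d": 1 * 10**6}
print(shard_weights(sizes, "5MB"))
```

Weight `c` is larger than the 5 MB limit, so it ends up alone in a shard that exceeds the limit, as the note on max_shard_size says.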

set_input_embeddings

( value: Module )

引數

value (nn.Module) — 將詞彙對映到隱藏狀態的模組。

設定模型的輸入嵌入。

tie_weights

( )

繫結輸入嵌入和輸出嵌入之間的權重。

如果在配置中設定了 `torchscript` 標誌，則無法處理引數共享，因此我們將克隆權重而不是共享引數。

to_bettertransformer

( ) → PreTrainedModel

PreTrainedModel

轉換為 BetterTransformer 的模型。

透過 Optimum 庫將模型轉換為使用 PyTorch 的原生注意力實現。僅支援所有 Transformers 模型中的一部分。

PyTorch 的注意力快速路徑透過核心融合和巢狀張量的使用，可以加速推理。詳細的基準測試可以在這篇部落格文章中找到。

warn_if_padding_and_no_attention_mask

( input_ids attention_mask )

如果 `input_ids` 似乎包含填充且未提供注意力掩碼，則顯示一次性警告。

自定義模型還應包含 `_supports_assign_param_buffer`，它決定了超快速初始化是否適用於特定模型。如果 `test_save_and_load_from_pretrained` 失敗，則說明您的模型需要此功能。如果是這樣，請將其設定為 `False`。

ModuleUtilsMixin

class transformers.modeling_utils.ModuleUtilsMixin

( )

A few utilities for `torch.nn.Modules`, to be used as a mixin.

add_memory_hooks

( )

在每個子模組的前向傳播之前和之後新增記憶體鉤子，以記錄記憶體消耗的增加。

記憶體消耗的增加儲存在每個模組的 `mem_rss_diff` 屬性中，可以透過 `model.reset_memory_hooks_state()` 重置為零。

estimate_tokens

( input_dict: dict ) → int

引數

input_dict (dict) — The model inputs.

int

總的 token 數量。

一個輔助函式，用於從模型輸入中估算總的 token 數量。

floating_point_ops

( input_dict: dict exclude_embeddings: bool = True ) → int

引數

batch_size (int) — 前向傳播的批次大小。
sequence_length (int) — 批次中每行的 token 數量。
exclude_embeddings (bool, optional, defaults to True) — 是否計算嵌入和 softmax 操作。

int

浮點運算次數。

獲取此 transformer 模型的正向和反向傳播的（可選地，非嵌入）浮點運算次數。預設近似忽略了對 token 數量的二次依賴（在 `12 * d_model << sequence_length` 有效），如這篇論文 2.1 節所述。對於引數重用的 transformer（例如 Albert 或 Universal Transformers），或者使用非常高序列長度進行長距離建模時，應重寫此方法。
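A commonly used form of this approximation counts roughly 6 floating-point operations per token per parameter for a combined forward and backward pass. The sketch below assumes that form; the helper name is hypothetical and the exact expression used by the method may differ.

```python
def estimate_flops(num_tokens, num_parameters):
    """Approximate forward + backward FLOPs as 6 * tokens * parameters.

    Assumption: about 2 FLOPs per parameter per token in the forward pass and
    about 4 in the backward pass, ignoring the term that grows quadratically
    with sequence length.
    """
    return 6 * num_tokens * num_parameters


# Batch of 8 sequences of 128 tokens, model with 110M (non-embedding) parameters.
tokens = 8 * 128
print(estimate_flops(tokens, 110_000_000))
```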

get_extended_attention_mask

( attention_mask: Tensor input_shape: tuple device: device = None dtype: torch.float32 = None )

引數

attention_mask (torch.Tensor) — 注意力掩碼，1 表示要關注的 token，0 表示要忽略的 token。
input_shape (tuple[int]) — 模型輸入的形狀。

製作可廣播的注意力掩碼和因果掩碼，以便忽略未來和被掩碼的 token。

get_head_mask

( head_mask: typing.Optional[torch.Tensor] num_hidden_layers: int is_attention_chunked: bool = False )

引數

head_mask (torch.Tensor, 形狀為 [num_heads] 或 [num_hidden_layers x num_heads], 可選) — 指示是否保留頭的掩碼（1.0 表示保留，0.0 表示丟棄）。
num_hidden_layers (int) — 模型中的隱藏層數量。
is_attention_chunked (bool, 可選, 預設為 False) — 注意力得分是否按塊計算。

如果需要，準備頭掩碼。
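The two accepted shapes of head_mask can be illustrated with plain lists. This is a simplified sketch with a hypothetical helper: the real method works on tensors and also adds broadcast dimensions, which are left out here.

```python
def expand_head_mask(head_mask, num_hidden_layers):
    """Return one mask per layer from None, a [num_heads] mask, or a per-layer mask."""
    if head_mask is None:
        # No masking requested: one placeholder per layer.
        return [None] * num_hidden_layers
    if head_mask and isinstance(head_mask[0], (int, float)):
        # Shape [num_heads]: reuse the same mask for every layer.
        return [list(head_mask) for _ in range(num_hidden_layers)]
    if len(head_mask) != num_hidden_layers:
        raise ValueError("per-layer head_mask must have num_hidden_layers rows")
    # Shape [num_hidden_layers x num_heads]: already one mask per layer.
    return [list(row) for row in head_mask]


print(expand_head_mask(None, 2))
print(expand_head_mask([1.0, 0.0], 3))
```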

invert_attention_mask

( encoder_attention_mask: Tensor ) → torch.Tensor

引數

encoder_attention_mask (torch.Tensor) — 注意力掩碼。

torch.Tensor

反轉的注意力掩碼。

Invert an attention mask (e.g., switches 0. and 1.).
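Inverted masks are typically used additively: positions to attend to contribute 0 to the attention scores, and positions to ignore contribute a very large negative number so that softmax drives them to zero. A minimal list-based sketch, assuming that convention, follows; `MASK_VALUE` and `invert_mask` are hypothetical names, and the library operates on tensors with the minimum value of the chosen dtype.

```python
import sys

# Stand-in for "the most negative value representable in the dtype".
MASK_VALUE = -sys.float_info.max


def invert_mask(mask):
    """Map 1 (attend) to 0 and 0 (ignore) to a very large negative number."""
    return [[(1.0 - m) * MASK_VALUE for m in row] for row in mask]


inverted = invert_mask([[1, 1, 0]])
print(inverted[0][0] == 0.0, inverted[0][2] == MASK_VALUE)
```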

num_parameters

( only_trainable: bool = False exclude_embeddings: bool = False ) → int

引數

only_trainable (bool, 可選, 預設為 False) — 是否只返回可訓練引數的數量。
exclude_embeddings (bool, 可選, 預設為 False) — 是否只返回非嵌入引數的數量。

int

引數的數量。

獲取模組中（可選地，可訓練或非嵌入）引數的數量。
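How the two flags combine can be shown with a toy parameter table. The sketch uses a hypothetical `ParamInfo` record instead of real tensors, so it only illustrates the filtering logic.

```python
from dataclasses import dataclass


@dataclass
class ParamInfo:
    """Hypothetical description of one parameter tensor."""
    numel: int
    requires_grad: bool = True
    is_embedding: bool = False


def count_parameters(params, only_trainable=False, exclude_embeddings=False):
    """Sum element counts, optionally skipping frozen and/or embedding parameters."""
    total = 0
    for info in params.values():
        if only_trainable and not info.requires_grad:
            continue
        if exclude_embeddings and info.is_embedding:
            continue
        total += info.numel
    return total


params = {
    "embeddings.weight": ParamInfo(1000, requires_grad=False, is_embedding=True),
    "layer.weight": ParamInfo(200),
    "layer.bias": ParamInfo(20, requires_grad=False),
}
print(count_parameters(params))
print(count_parameters(params, only_trainable=True))
print(count_parameters(params, exclude_embeddings=True))
```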

reset_memory_hooks_state

( )

重置每個模組的 `mem_rss_diff` 屬性（參見 add_memory_hooks()）。

TFPreTrainedModel

class transformers.TFPreTrainedModel

( config *inputs **kwargs )

所有 TF 模型的基礎類。

TFPreTrainedModel 負責儲存模型的配置並處理模型的載入、下載和儲存方法，以及一些所有模型通用的方法。

調整輸入嵌入的大小，
剪枝自注意力頭中的頭部。

類屬性（由派生類覆蓋）

config_class (PretrainedConfig) — 一個 PretrainedConfig 的子類，用作此模型架構的配置類。
base_model_prefix (str) — 一個字串，指示在新增模組到基礎模型之上的相同架構的派生類中，與基礎模型關聯的屬性。
main_input_name (str) — 模型的主要輸入名稱（NLP 模型通常為 input_ids，視覺模型為 pixel_values，語音模型為 input_values）。

push_to_hub

( repo_id: str use_temp_dir: Optional[bool] = None commit_message: Optional[str] = None private: Optional[bool] = None max_shard_size: Optional[Union[int, str]] = '10GB' token: Optional[Union[bool, str]] = None use_auth_token: Optional[Union[bool, str]] = None create_pr: bool = False **base_model_card_args )

引數

repo_id (str) — 您要將模型推送到的倉庫名稱。當推送到某個組織時，應包含您的組織名稱。
use_temp_dir (bool, 可選) — 是否使用臨時目錄來儲存在推送到 Hub 之前儲存的檔案。如果不存在名為 `repo_id` 的目錄，則預設為 `True`，否則為 `False`。
commit_message (str, 可選) — 推送時的提交訊息。預設為 `"Upload model"`。
private (bool, 可選) — 是否將倉庫設為私有。如果為 `None`（預設），倉庫將是公共的，除非組織的預設設定為私有。如果倉庫已存在，此值將被忽略。
token (bool 或 str, 可選) — 用作遠端檔案 HTTP 承載授權的令牌。如果為 `True`，將使用執行 `huggingface-cli login` 時生成的令牌（儲存在 `~/.huggingface` 中）。如果未指定 `repo_url`，則預設為 `True`。
max_shard_size (int 或 str, 可選, 預設為 "10GB") — 僅適用於模型。分片前檢查點的最大大小。檢查點分片的大小將小於此大小。如果表示為字串，則需要是數字後跟單位（如 "5MB"）。
create_pr (bool, 可選, 預設為 False) — 是否建立包含上傳檔案的 PR 或直接提交。

將模型檔案上傳到 🤗 模型中心，同時同步本地倉庫克隆到 `repo_path_or_name`。

示例

from transformers import TFAutoModel

model = TFAutoModel.from_pretrained("google-bert/bert-base-cased")

# Push the model to your namespace with the name "my-finetuned-bert".
model.push_to_hub("my-finetuned-bert")

# Push the model to an organization with the name "my-finetuned-bert".
model.push_to_hub("huggingface/my-finetuned-bert")

can_generate

( ) → bool

布林值

此模型是否可以使用 .generate() 生成序列。

Returns whether this model can generate sequences with `.generate()`.

compile

( optimizer = 'rmsprop' loss = 'auto_with_warning' metrics = None loss_weights = None weighted_metrics = None run_eagerly = None steps_per_execution = None **kwargs )

這是一個簡單的包裝器，如果使用者未指定損失函式，則將模型的損失輸出頭設定為損失函式。

create_model_card

( output_dir model_name: str language: Optional[str] = None license: Optional[str] = None tags: Optional[str] = None finetuned_from: Optional[str] = None tasks: Optional[str] = None dataset_tags: Optional[Union[str, list[str]]] = None dataset: Optional[Union[str, list[str]]] = None dataset_args: Optional[Union[str, list[str]]] = None )

引數

output_dir (str 或 os.PathLike) — 建立模型卡片的資料夾。
model_name (str, 可選) — 模型名稱。
language (str, 可選) — 模型的語言（如果適用）。
license (str, 可選) — 模型的許可。如果提供給 `Trainer` 的原始模型來自 Hub 上的倉庫，則預設為該預訓練模型的許可。
tags (str 或 list[str], 可選) — 要包含在模型卡片元資料中的標籤。
finetuned_from (str, 可選) — 用於微調此模型的模型名稱（如果適用）。預設為提供給 `Trainer` 的原始模型的倉庫名稱（如果它來自 Hub）。
tasks (str 或 list[str], 可選) — 一個或多個任務識別符號，包含在模型卡片元資料中。
dataset_tags (str 或 list[str], 可選) — 一個或多個數據集標籤，包含在模型卡片元資料中。
dataset (str 或 list[str], 可選) — 一個或多個數據集識別符號，包含在模型卡片元資料中。
dataset_args (str 或 list[str], 可選) — 一個或多個數據集引數，包含在模型卡片元資料中。

使用 Trainer 可用的資訊建立模型卡片的草稿。

from_pretrained

( pretrained_model_name_or_path: Optional[Union[str, os.PathLike]] *model_args config: Optional[Union[PretrainedConfig, str, os.PathLike]] = None cache_dir: Optional[Union[str, os.PathLike]] = None ignore_mismatched_sizes: bool = False force_download: bool = False local_files_only: bool = False token: Optional[Union[str, bool]] = None revision: str = 'main' use_safetensors: Optional[bool] = None **kwargs )

引數

pretrained_model_name_or_path (str, optional) — 可以是以下任意一種：
- 一個字串，是託管在huggingface.co模型庫中的預訓練模型的模型ID。
- 一個目錄的路徑，該目錄包含使用save_pretrained()儲存的模型權重，例如，./my_model_directory/。
- 一個PyTorch state_dict儲存檔案的路徑或URL（例如，./pt_model/pytorch_model.bin）。在這種情況下，from_pt應該設定為True，並且應該提供一個配置物件作為config引數。這種載入路徑比使用提供的轉換指令碼將PyTorch模型轉換為TensorFlow模型再載入TensorFlow模型要慢。
- 如果您同時提供了配置和狀態字典（分別透過關鍵字引數config和state_dict），則為None。
model_args (位置引數序列，可選) — 所有剩餘的位置引數將傳遞給底層模型的__init__方法。
config (Union[PretrainedConfig, str], 可選) — 可以是以下任意一種：
- PretrainedConfig的派生類例項，
- 作為from_pretrained()輸入有效的字串。
用於模型配置，而不是自動載入的配置。配置可以在以下情況下自動載入：
- 模型是庫提供的模型（使用預訓練模型的模型ID字串載入）。
- 模型使用save_pretrained()儲存，並透過提供儲存目錄重新載入。
- 透過提供本地目錄作為pretrained_model_name_or_path載入模型，並且在目錄中找到名為config.json的配置JSON檔案。
from_pt (bool, 可選, 預設為False) — 從PyTorch state_dict儲存檔案載入模型權重（參見pretrained_model_name_or_path引數的文件字串）。
ignore_mismatched_sizes (bool, 可選, 預設為False) — 如果檢查點中的某些權重與模型權重大小不匹配，是否引發錯誤（例如，如果從具有3個標籤的檢查點例項化具有10個標籤的模型）。
cache_dir (str, 可選) — 快取下載的預訓練模型配置的目錄路徑，如果不需要使用標準快取。
force_download (bool, 可選, 預設為False) — 是否強制（重新）下載模型權重和配置檔案，如果已存在則覆蓋快取版本。
resume_download — 已棄用並忽略。現在預設情況下儘可能恢復所有下載。將在Transformers v5中刪除。
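proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (i.e., do not try to download the model).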
token (str 或 bool, 可選) — 用於遠端檔案的HTTP bearer授權令牌。如果為True或未指定，將使用執行huggingface-cli login時生成的令牌（儲存在~/.huggingface中）。
revision (str, 可選, 預設為"main") — 要使用的特定模型版本。它可以是分支名稱、標籤名稱或提交ID，因為我們在huggingface.co上使用基於git的系統儲存模型和其他工件，所以revision可以是git允許的任何識別符號。

從預訓練的模型配置例項化預訓練的TF 2.0模型。

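The warning "Weights from XXX not initialized from pretrained model" means that the weights of XXX do not come pretrained with the rest of the model. It is up to you to train those weights with a downstream fine-tuning task.

The warning "Weights from XXX not used in YYY" means that the layer XXX is not used by YYY, therefore those weights are discarded.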

示例

>>> from transformers import BertConfig, TFBertModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFBertModel.from_pretrained("google-bert/bert-base-uncased")
>>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable).
>>> model = TFBertModel.from_pretrained("./test/saved_model/")
>>> # Update configuration during loading.
>>> model = TFBertModel.from_pretrained("google-bert/bert-base-uncased", output_attentions=True)
>>> assert model.config.output_attentions == True
>>> # Loading from a Pytorch model file instead of a TensorFlow checkpoint (slower, for example purposes, not runnable).
>>> config = BertConfig.from_json_file("./pt_model/my_pt_model_config.json")
>>> model = TFBertModel.from_pretrained("./pt_model/my_pytorch_model.bin", from_pt=True, config=config)

get_bias

( ) → tf.Variable

tf.Variable

表示偏置的權重，如果不是LM模型則為None。

附加到LM頭的偏置字典。鍵表示偏置屬性的名稱。

get_head_mask

( head_mask: tf.Tensor | None num_hidden_layers: int )

引數

head_mask (tf.Tensor，形狀為[num_heads]或[num_hidden_layers x num_heads]，可選) — 指示我們是否應保留頭部的掩碼（1.0表示保留，0.0表示丟棄）。
num_hidden_layers (int) — 模型中的隱藏層數量。

如果需要，準備頭掩碼。

get_input_embeddings

( ) → tf.Variable

tf.Variable

將詞彙對映到隱藏狀態的嵌入層。

返回模型的輸入嵌入層。

get_lm_head

( ) → keras.layers.Layer

keras.layers.Layer

如果模型有LM頭層，則返回該層；否則返回None。

LM頭層。此方法必須由所有具有LM頭的模型重寫。

get_output_embeddings

( ) → tf.Variable

tf.Variable

新的權重將詞彙對映到隱藏狀態。

返回模型的輸出嵌入。

get_output_layer_with_bias

( ) → keras.layers.Layer

keras.layers.Layer

處理偏置的層，如果不是LM模型則為None。

獲取在模型具有與嵌入繫結的LM頭的情況下處理偏置屬性的層

get_prefix_bias_name

( ) → str

str

The _prefix name of the bias.

Get the concatenated _prefix name of the bias from the model name to the parent layer.

prepare_tf_dataset

( dataset: datasets.Dataset batch_size: int = 8 shuffle: bool = True tokenizer: Optional[PreTrainedTokenizerBase] = None collate_fn: Optional[Callable] = None collate_fn_args: Optional[dict[str, Any]] = None drop_remainder: Optional[bool] = None prefetch: bool = True ) → Dataset

引數

dataset (Any) — 要封裝為tf.data.Dataset的[~datasets.Dataset]。
batch_size (int, 可選, 預設為8) — 要返回的批次大小。
shuffle (bool, 預設為True) — 是否以隨機順序返回資料集中的樣本。訓練資料集通常為True，驗證/測試資料集通常為False。
tokenizer (PreTrainedTokenizerBase, 可選) — 用於填充樣本以建立批次的PreTrainedTokenizer。如果未提供特定collate_fn，則無影響。
collate_fn (Callable, 可選) — 將資料集中的樣本整理成單個批次的功能。如果未提供tokenizer，則預設為DefaultDataCollator；如果提供了tokenizer，則預設為DataCollatorWithPadding。
collate_fn_args (dict[str, Any], 可選) — 一個字典，包含要傳遞給collate_fn的引數以及樣本列表。
drop_remainder (bool, 可選) — 如果批次大小不能均勻地劃分資料集長度，是否丟棄最後一個批次。預設為與shuffle相同的設定。
prefetch (bool, 預設為True) — 是否在tf.data管道的末尾新增預取。這幾乎總是對效能有益的，但在特殊情況下可以停用。

Dataset

一個準備好傳遞給Keras API的tf.data.Dataset。

將HuggingFace Dataset封裝為具有整理和批處理功能的tf.data.Dataset。此方法旨在建立一個“即用型”資料集，可以直接傳遞給Keras方法（如fit()）而無需進一步修改。如果資料集中的列與模型的輸入名稱不匹配，則該方法將丟棄這些列。如果您想指定要返回的列名，而不是使用與此模型匹配的名稱，我們建議改用Dataset.to_tf_dataset()。
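Two of the behaviours described above, dropping columns the model does not accept and the effect of drop_remainder on the number of batches, can be sketched without TensorFlow. Both helpers are hypothetical and only illustrate the logic; the actual method inspects the model's call signature and builds a tf.data pipeline.

```python
def keep_model_columns(dataset_columns, model_input_names):
    """Keep only the dataset columns whose names match a model input."""
    return [col for col in dataset_columns if col in model_input_names]


def num_batches(num_samples, batch_size=8, drop_remainder=False):
    """Number of batches produced; the last partial batch is dropped if requested."""
    full, rest = divmod(num_samples, batch_size)
    return full if drop_remainder or rest == 0 else full + 1


columns = ["input_ids", "attention_mask", "text"]
model_inputs = {"input_ids", "attention_mask", "token_type_ids"}
print(keep_model_columns(columns, model_inputs))
print(num_batches(100, 8, drop_remainder=True), num_batches(100, 8, drop_remainder=False))
```

With 100 samples and a batch size of 8, dropping the remainder yields 12 full batches; keeping it yields a 13th batch of 4 samples.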

prune_heads

( heads_to_prune )

引數

heads_to_prune (dict[int, list[int]]) — 字典，其中鍵為選定的層索引（int），關聯值為在該層中要修剪的頭列表（int列表）。例如，{1: [0, 2], 2: [2, 3]} 將修剪第1層的0和2號頭，以及第2層的2和3號頭。

修剪基本模型的注意力頭。

register_for_auto_class

( auto_class = 'TFAutoModel' )

引數

auto_class (str 或 type, 可選, 預設為"TFAutoModel") — 用於註冊此新模型的自動類。

將此類註冊到給定的自動類。這僅應用於自定義模型，因為庫中的模型已經對映到自動類。

resize_token_embeddings

( new_num_tokens: Optional[int] = None ) → tf.Variable or keras.layers.Embedding

引數

new_num_tokens (int, 可選) — 嵌入矩陣中的新令牌數量。增加大小將在末尾新增新初始化的向量。減小大小將從末尾刪除向量。如果未提供或為None，則只返回指向輸入令牌的指標，而不執行任何操作。

tf.Variable 或 keras.layers.Embedding

指向模型輸入令牌的指標。

如果 new_num_tokens != config.vocab_size，則調整模型的輸入 token 嵌入矩陣。

如果模型類具有 tie_weights() 方法，則之後處理繫結權重嵌入。

save_pretrained

( save_directory saved_model = False version = 1 push_to_hub = False signatures = None max_shard_size: Union[int, str] = '5GB' create_pr: bool = False safe_serialization: bool = False token: Optional[Union[str, bool]] = None **kwargs )

引數

save_directory (str) — 儲存的目錄。如果不存在，將建立該目錄。
saved_model (bool, 可選, 預設為False) — 模型是否也要以儲存的模型格式儲存。
version (int, 可選, 預設為1) — 儲存模型的版本。為了能被TensorFlow Serving正確載入，儲存的模型需要版本化，詳情請參見官方文件https://www.tensorflow.org/tfx/serving/serving_basic
push_to_hub (bool, 可選, 預設為False) — 儲存模型後是否將其推送到Hugging Face模型中心。您可以使用repo_id指定要推送到的儲存庫（將預設為您名稱空間中save_directory的名稱）。
signatures (dict 或 tf.function, 可選) — 用於服務模型的模型簽名。這將傳遞給model.save()的signatures引數。
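max_shard_size (int or str, optional, defaults to "5GB") — The maximum size for a checkpoint before being sharded. Checkpoint shards will then each be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like "5MB").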

如果模型的單個權重大於max_shard_size，它將位於其自己的檢查點分片中，該分片將大於max_shard_size。
create_pr (bool, 可選, 預設為False) — 是否使用上傳的檔案建立PR或直接提交。
safe_serialization (bool, 可選, 預設為False) — 是否使用safetensors或傳統的TensorFlow方式（使用h5）儲存模型。
token (str 或 bool, 可選) — 用於遠端檔案的HTTP bearer授權令牌。如果為True或未指定，將使用執行huggingface-cli login時生成的令牌（儲存在~/.huggingface中）。
kwargs (dict[str, Any], 可選) — 傳遞給push_to_hub()方法的其他關鍵字引數。

將模型及其配置檔案儲存到目錄，以便可以使用from_pretrained()類方法重新載入。

serving

( inputs )

引數

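inputs (dict[str, tf.Tensor]) — The input of the saved model as a dictionary of tensors.

Method used for serving the model. It does not have a specific signature, but will be specialized as concrete functions when saving with save_pretrained.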

serving_output

( output )

準備儲存的模型輸出。如果需要進行特定的服務修改，可以覆蓋此方法。

set_bias

( value )

Parameters

value (dict[tf.Variable]) — All the new biases attached to an LM head.

設定 LM head 中的所有偏置。

set_input_embeddings

( value )

Parameters

value (tf.Variable) — The new weights mapping hidden states to vocabulary.

設定模型的輸入嵌入。

set_output_embeddings

( value )

Parameters

value (tf.Variable) — The new weights mapping hidden states to vocabulary.

設定模型的輸出嵌入。

test_step

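( data )

A modification of Keras's default train_step that correctly handles matching outputs to labels for our models and supports directly training on the loss output head. In addition, it ensures input keys are copied to the labels where appropriate. It will also copy label keys into the input dict when using the dummy loss, to ensure that they are available to the model during the forward pass.

train_step

( data )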

TFModelUtilsMixin

class transformers.modeling_tf_utils.TFModelUtilsMixin

( )

A few utilities for keras.Model, to be used as a mixin.

num_parameters

( only_trainable: bool = False ) → int

Parameters

only_trainable (bool, optional, defaults to False) — Whether or not to return only the number of trainable parameters.

int

引數的數量。

獲取模型中（可選地，可訓練）引數的數量。

FlaxPreTrainedModel

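class transformers.FlaxPreTrainedModel

( config: PretrainedConfig module: Module input_shape: tuple = (1, 1) seed: int = 0 dtype: dtype = <class 'jax.numpy.float32'> _do_init: bool = True )

Base class for all models.

FlaxPreTrainedModel takes care of storing the configuration of the models and handles methods for loading, downloading and saving models.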

類屬性（由派生類覆蓋）

config_class (PretrainedConfig) — 一個 PretrainedConfig 的子類，用作此模型架構的配置類。
base_model_prefix (str) — 一個字串，指示在新增模組到基礎模型之上的相同架構的派生類中，與基礎模型關聯的屬性。
main_input_name (str) — 模型的主要輸入名稱（NLP 模型通常為 input_ids，視覺模型為 pixel_values，語音模型為 input_values）。

push_to_hub

引數

repo_id (str) — 您想將模型推送到的倉庫名稱。當推送到某個組織時，它應該包含您的組織名稱。
use_temp_dir (bool, 可選) — 是否使用臨時目錄儲存儲存後要推送到 Hub 的檔案。如果不存在名為 repo_id 的目錄，則預設為 True，否則為 False。
commit_message (str, 可選) — 推送時的提交訊息。預設為 "Upload model"。
private (bool, 可選) — 是否將倉庫設為私有。如果為 None（預設），倉庫將是公開的，除非組織的預設設定為私有。如果倉庫已存在，此值將被忽略。
token (bool 或 str, 可選) — 用於遠端檔案的 HTTP 承載授權令牌。如果為 True，將使用執行 huggingface-cli login 時生成的令牌（儲存在 ~/.huggingface 中）。如果未指定 repo_url，則預設為 True。
max_shard_size (int 或 str, 可選, 預設為 "5GB") — 僅適用於模型。分片前檢查點的最大大小。檢查點分片的大小將低於此大小。如果表示為字串，則需要後跟單位的數字（如 "5MB"）。我們將其預設設定為 "5GB"，以便使用者可以在免費層 Google Colab 例項上輕鬆載入模型，而不會出現 CPU OOM 問題。
create_pr (bool, 可選, 預設為 False) — 是否建立包含上傳檔案的 PR，或直接提交。
safe_serialization (bool, 可選, 預設為 True) — 是否將模型權重轉換為安全張量格式以實現更安全的序列化。
revision (str, 可選) — 要推送到上傳檔案的分支。
commit_description (str, 可選) — 將建立的提交的描述。
tags (list[str], 可選) — 要推送到 Hub 的標籤列表。

將模型檢查點上傳到 🤗 模型中心。

示例

from transformers import FlaxAutoModel

model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased")

# Push the model to your namespace with the name "my-finetuned-bert".
model.push_to_hub("my-finetuned-bert")

# Push the model to an organization with the name "my-finetuned-bert".
model.push_to_hub("huggingface/my-finetuned-bert")

can_generate

( )

Returns whether this model can generate sequences with .generate(), as a bool.

from_pretrained

( pretrained_model_name_or_path: typing.Union[str, os.PathLike] dtype: dtype = <class 'jax.numpy.float32'> *model_args config: typing.Union[transformers.configuration_utils.PretrainedConfig, str, os.PathLike, NoneType] = None cache_dir: typing.Union[str, os.PathLike, NoneType] = None ignore_mismatched_sizes: bool = False force_download: bool = False local_files_only: bool = False token: typing.Union[bool, str, NoneType] = None revision: str = 'main' **kwargs )

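Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a pt index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_pt should be set to True.
dtype (jax.numpy.dtype, optional, defaults to jax.numpy.float32) — The data type of the computation. Can be one of jax.numpy.float32, jax.numpy.float16 (on GPUs) and jax.numpy.bfloat16 (on TPUs).

This can be used to enable mixed-precision training or half-precision inference on GPUs or TPUs. If specified, all the computation will be performed with the given dtype.

Note that this only specifies the dtype of the computation and does not influence the dtype of the model parameters.

If you wish to change the dtype of the model parameters, see to_fp16() and to_bf16().
model_args (sequence of positional arguments, optional) — All remaining positional arguments will be passed to the underlying model's __init__ method.
config (Union[PretrainedConfig, str, os.PathLike], optional) — Can be either:
- an instance of a class derived from PretrainedConfig,
- a string or path valid as input to from_pretrained().
Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (Union[str, os.PathLike], optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
ignore_mismatched_sizes (bool, optional, defaults to False) — Whether or not to raise an error if some of the weights from the checkpoint do not have the same size as the weights of the model (for instance, if you are instantiating a model with 10 labels from a checkpoint with 3 labels).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a Git-based system, revision can be any identifier allowed by Git.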

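Instantiate a pretrained Flax model from a pretrained model configuration.

The warning Weights from XXX not initialized from pretrained model means that the weights of XXX do not come pretrained with the rest of the model. It is up to you to train those weights with a downstream fine-tuning task.

The warning Weights from XXX not used in YYY means that the layer XXX is not used by YYY, therefore those weights are discarded.

Example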

>>> from transformers import BertConfig, FlaxBertModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxBertModel.from_pretrained("google-bert/bert-base-cased")
>>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable).
>>> model = FlaxBertModel.from_pretrained("./test/saved_model/")
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower, for example purposes, not runnable).
>>> config = BertConfig.from_json_file("./pt_model/config.json")
>>> model = FlaxBertModel.from_pretrained("./pt_model/pytorch_model.bin", from_pt=True, config=config)
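The ignore_mismatched_sizes flag described above decides what happens when a checkpoint weight and the corresponding model weight have different shapes. The following is a minimal sketch of that kind of check, not the library's implementation: shapes are plain tuples, and the helper name find_mismatched and the weight names are hypothetical.

```python
# Illustrative only: find_mismatched and the weight names are hypothetical,
# and shapes are plain tuples rather than real arrays.
def find_mismatched(model_shapes, checkpoint_shapes, ignore_mismatched_sizes=False):
    """Return the keys whose checkpoint shape differs from the model shape.

    Raises ValueError on the first mismatch unless ignore_mismatched_sizes is True.
    """
    mismatched = []
    for key, model_shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(key)
        if ckpt_shape is None or ckpt_shape == model_shape:
            continue
        if not ignore_mismatched_sizes:
            raise ValueError(f"{key}: checkpoint {ckpt_shape} vs model {model_shape}")
        mismatched.append(key)
    return mismatched


# A model with 10 labels loaded from a checkpoint that was trained with 3 labels.
model_shapes = {"encoder/kernel": (768, 768), "classifier/kernel": (768, 10)}
checkpoint_shapes = {"encoder/kernel": (768, 768), "classifier/kernel": (768, 3)}

skipped = find_mismatched(model_shapes, checkpoint_shapes, ignore_mismatched_sizes=True)
print(skipped)  # ['classifier/kernel']
```

In this sketch, the weights listed in skipped are the ones that would keep their freshly initialized values and need to be trained on the downstream task.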

load_flax_sharded_weights

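( shard_files ) → Dict

Parameters

shard_files (list[str]) — The list of shard files to load.

Returns: Dict

A nested dictionary of the model parameters, in the format expected by Flax models: {'model': {'params': {'...'}}}.

This is the same as flax.serialization.from_bytes (https://flax.readthedocs.io/en/latest/_modules/flax/serialization.html#from_bytes) but for a sharded checkpoint.

This load is performed efficiently: each checkpoint shard is loaded one by one in RAM and deleted after being loaded into the model.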
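The shard-by-shard strategy can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: shards are stored as JSON files with "/"-separated keys as a stand-in for the real serialized format, and the helper names unflatten and load_sharded are hypothetical.

```python
import json
import os
import tempfile


def unflatten(flat, sep="/"):
    """Turn {'a/b': 1} into {'a': {'b': 1}}."""
    nested = {}
    for key, value in flat.items():
        node = nested
        *parents, leaf = key.split(sep)
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def load_sharded(shard_files):
    """Read shards one at a time so only one shard is held next to the merged state."""
    flat_state = {}
    for path in shard_files:
        with open(path) as f:
            shard = json.load(f)  # stand-in for deserializing one shard
        flat_state.update(shard)
        del shard  # release the shard before reading the next one
    return unflatten(flat_state)


with tempfile.TemporaryDirectory() as tmp:
    # Two tiny hypothetical shards written to disk for the demonstration.
    shards = [
        {"params/embeddings/kernel": [0.1, 0.2]},
        {"params/encoder/bias": [0.0]},
    ]
    shard_files = []
    for i, shard in enumerate(shards):
        path = os.path.join(tmp, f"shard-{i}.json")
        with open(path, "w") as f:
            json.dump(shard, f)
        shard_files.append(path)
    state = load_sharded(shard_files)

print(state)
```

Peak memory in this sketch stays close to the size of the merged state plus one shard, which is the point of loading the shards sequentially.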

register_for_auto_class

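( auto_class = 'FlaxAutoModel' )

Parameters

auto_class (str or type, optional, defaults to "FlaxAutoModel") — The auto class to register this new model with.

Register this class with a given auto class. This should only be used for custom models, as the ones in the library are already mapped to an auto class.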

save_pretrained

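( save_directory: typing.Union[str, os.PathLike] params = None push_to_hub = False max_shard_size = '10GB' token: typing.Union[bool, str, NoneType] = None safe_serialization: bool = False **kwargs )

Parameters

save_directory (str or os.PathLike) — Directory to save to. Will be created if it doesn't exist.
push_to_hub (bool, optional, defaults to False) — Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with repo_id (will default to the name of save_directory in your namespace).
max_shard_size (int or str, optional, defaults to "10GB") — The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like "5MB").

If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
kwargs (dict[str, Any], optional) — Additional keyword arguments passed along to the push_to_hub() method.
safe_serialization (bool, optional, defaults to False) — Whether to save the model using safetensors or through msgpack.

Save a model and its configuration file to a directory, so that it can be re-loaded using the [from_pretrained()](/docs/transformers/v4.53.3/en/main_classes/model#transformers.FlaxPreTrainedModel.from_pretrained) class method.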
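The max_shard_size rule, including the case of a single weight larger than the limit, can be illustrated with a greedy grouping. This is a sketch under assumptions, not the library's sharding code: the helper name shard_weights is hypothetical, and sizes are small integers standing in for byte counts.

```python
def shard_weights(weight_sizes, max_shard_size):
    """Greedily group weights (name -> size in bytes) into shards of at most max_shard_size."""
    shards, current, current_size = [], {}, 0
    for name, size in weight_sizes.items():
        # Close the current shard when adding this weight would exceed the limit.
        if current and current_size + size > max_shard_size:
            shards.append(current)
            current, current_size = {}, 0
        current[name] = size
        current_size += size
    if current:
        shards.append(current)
    return shards


# Hypothetical weights; "lm_head" alone is larger than the limit of 100.
weights = {"embed": 40, "layer_0": 30, "layer_1": 50, "lm_head": 120}
shards = shard_weights(weights, max_shard_size=100)

for index, shard in enumerate(shards):
    print(index, list(shard), sum(shard.values()))
```

The oversized lm_head weight ends up alone in the last shard, which therefore exceeds the limit, matching the behavior described for max_shard_size.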

to_bf16

( params: typing.Union[dict, flax.core.frozen_dict.FrozenDict] mask: typing.Any = None )

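Parameters

params (Union[Dict, FrozenDict]) — A PyTree of model parameters.
mask (Union[Dict, FrozenDict]) — A PyTree with the same structure as the params tree. The leaves should be booleans: True for params you want to cast, and False for those you want to skip.

Cast the floating-point params to jax.numpy.bfloat16. This returns a new params tree and does not cast the params in place.

This method can be used on TPU to explicitly convert the model parameters to bfloat16 precision to do full half-precision training, or to save weights in bfloat16 for inference in order to save memory and improve speed.

Example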

>>> from transformers import FlaxBertModel

>>> # load model
>>> model = FlaxBertModel.from_pretrained("google-bert/bert-base-cased")
>>> # By default, the model parameters will be in fp32 precision, to cast these to bfloat16 precision
>>> model.params = model.to_bf16(model.params)
>>> # If you don't want to cast certain parameters (for example layer norm bias and scale)
>>> # then pass the mask as follows
>>> from flax import traverse_util

>>> model = FlaxBertModel.from_pretrained("google-bert/bert-base-cased")
>>> flat_params = traverse_util.flatten_dict(model.params)
>>> mask = {
...     path: (path[-2:] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale"))
...     for path in flat_params
... }
>>> mask = traverse_util.unflatten_dict(mask)
>>> model.params = model.to_bf16(model.params, mask)
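To see which leaves such a mask selects, the path logic can be tried on a small nested dict without Flax. This is a minimal sketch: the flatten helper is a hypothetical stand-in for flax.traverse_util.flatten_dict, and the parameter tree is made up. The comparison uses the last two path elements (path[-2:]), since a single element such as path[-2] is a string and never equals a tuple.

```python
def flatten(tree, prefix=()):
    """Flatten a nested dict into {path_tuple: leaf} (hypothetical stand-in helper)."""
    flat = {}
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# A made-up parameter tree with one dense layer and one layer norm.
params = {
    "encoder": {
        "dense": {"kernel": 1.0, "bias": 0.0},
        "LayerNorm": {"scale": 1.0, "bias": 0.0},
    }
}

# Leaves whose last two path elements match these pairs are skipped (mask is False).
SKIP = {("LayerNorm", "bias"), ("LayerNorm", "scale")}
mask = {path: path[-2:] not in SKIP for path in flatten(params)}

for path, cast in mask.items():
    print("/".join(path), cast)
```

Only the dense kernel and bias are marked True here, so only those leaves would be cast.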

to_fp16

( params: typing.Union[dict, flax.core.frozen_dict.FrozenDict] mask: typing.Any = None )

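Parameters

params (Union[Dict, FrozenDict]) — A PyTree of model parameters.
mask (Union[Dict, FrozenDict]) — A PyTree with the same structure as the params tree. The leaves should be booleans: True for params you want to cast, and False for those you want to skip.

Cast the floating-point params to jax.numpy.float16. This returns a new params tree and does not cast the params in place.

This method can be used on GPU to explicitly convert the model parameters to float16 precision to do full half-precision training, or to save weights in float16 for inference in order to save memory and improve speed.

Example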

>>> from transformers import FlaxBertModel

>>> # load model
>>> model = FlaxBertModel.from_pretrained("google-bert/bert-base-cased")
>>> # By default, the model params will be in fp32, to cast these to float16
>>> model.params = model.to_fp16(model.params)
>>> # If you don't want to cast certain parameters (for example layer norm bias and scale)
>>> # then pass the mask as follows
>>> from flax import traverse_util

>>> model = FlaxBertModel.from_pretrained("google-bert/bert-base-cased")
>>> flat_params = traverse_util.flatten_dict(model.params)
>>> mask = {
...     path: (path[-2:] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale"))
...     for path in flat_params
... }
>>> mask = traverse_util.unflatten_dict(mask)
>>> model.params = model.to_fp16(model.params, mask)

to_fp32

( params: typing.Union[dict, flax.core.frozen_dict.FrozenDict] mask: typing.Any = None )

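Parameters

params (Union[Dict, FrozenDict]) — A PyTree of model parameters.
mask (Union[Dict, FrozenDict]) — A PyTree with the same structure as the params tree. The leaves should be booleans: True for params you want to cast, and False for those you want to skip.

Cast the floating-point params to jax.numpy.float32. This method can be used to explicitly convert the model parameters to fp32 precision. This returns a new params tree and does not cast the params in place.

Example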

>>> from transformers import FlaxBertModel

>>> # Download model and configuration from huggingface.co
>>> model = FlaxBertModel.from_pretrained("google-bert/bert-base-cased")
>>> # By default, the model params will be in fp32, to illustrate the use of this method,
>>> # we'll first cast to fp16 and back to fp32
>>> model.params = model.to_fp16(model.params)
>>> # now cast back to fp32
>>> model.params = model.to_fp32(model.params)
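Casting down to half precision and back restores the dtype but not the bits that were rounded away. The following stdlib sketch illustrates this with a single value. It is an approximation of the idea rather than of the JAX code path: struct's "e" format (IEEE half precision) stands in for jax.numpy.float16, a Python float (double precision) stands in for fp32, and the helper name to_half_and_back is hypothetical.

```python
import struct


def to_half_and_back(x):
    """Round a Python float to IEEE half precision, then widen it to a float again."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


original = 0.1
restored = to_half_and_back(original)

print(restored)              # 0.0999755859375
print(restored == original)  # False
print(abs(restored - original) < 1e-4)  # True
```

The restored value is close to the original but not identical, so a model cast to fp16 and back to fp32 will generally not reproduce its original fp32 weights exactly.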

Pushing to the Hub

class transformers.utils.PushToHubMixin

( )

A Mixin containing the functionality to push a model or tokenizer to the Hub.

push_to_hub

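Parameters

repo_id (str) — The name of the repository you want to push your {object} to. It should contain your organization name when pushing to a given organization.
use_temp_dir (bool, optional) — Whether or not to use a temporary directory to store the files saved before they are pushed to the Hub. Will default to True if there is no directory named like repo_id, False otherwise.
commit_message (str, optional) — Message to commit while pushing. Will default to "Upload {object}".
private (bool, optional) — Whether to make the repo private. If None (default), the repo will be public unless the organization's default is private. This value is ignored if the repo already exists.
token (bool or str, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url is not specified.
max_shard_size (int or str, optional, defaults to "5GB") — Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like "5MB"). The default is "5GB" so that users can easily load models on free-tier Google Colab instances without any CPU OOM issues.
create_pr (bool, optional, defaults to False) — Whether or not to create a PR with the uploaded files or directly commit.
safe_serialization (bool, optional, defaults to True) — Whether or not to convert the model weights to the safetensors format for safer serialization.
revision (str, optional) — Branch to push the uploaded files to.
commit_description (str, optional) — The description of the commit that will be created.
tags (list[str], optional) — List of tags to push on the Hub.

Upload {object_files} to the 🤗 Model Hub.

Example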

from transformers import {object_class}

{object} = {object_class}.from_pretrained("google-bert/bert-base-cased")

# Push the {object} to your namespace with the name "my-finetuned-bert".
{object}.push_to_hub("my-finetuned-bert")

# Push the {object} to an organization with the name "my-finetuned-bert".
{object}.push_to_hub("huggingface/my-finetuned-bert")
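The max_shard_size parameter accepts either an integer or a string made of digits followed by a unit. A minimal sketch of how such a value could be turned into a byte count is shown below. The helper name size_to_bytes is hypothetical, and the use of decimal units (1 MB = 10**6 bytes) is an assumption of this sketch rather than something stated in this section.

```python
def size_to_bytes(size):
    """Convert an int, or a string such as "5MB" or "10GB", to a number of bytes.

    Assumes decimal units (KB = 10**3, MB = 10**6, GB = 10**9).
    """
    if isinstance(size, int):
        return size
    units = {"KB": 10**3, "MB": 10**6, "GB": 10**9}
    text = size.strip().upper()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    raise ValueError(f"size must be an int or digits followed by a unit, got {size!r}")


print(size_to_bytes("5MB"))   # 5000000
print(size_to_bytes("5GB"))   # 5000000000
print(size_to_bytes(1024))    # 1024
```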

Sharded checkpoints

transformers.modeling_utils.load_sharded_checkpoint