Transformers

( )

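This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (doing so raises an error).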
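The factory behavior described above can be sketched with plain Python. This is only an illustration, not the library's implementation: `ToyTokenizer` and `ToyAutoTokenizer` are hypothetical names, and the use of `OSError` for the blocked constructor is an assumption made for the sketch.

```python
class ToyTokenizer:
    """Hypothetical concrete class standing in for a real tokenizer."""

    def __init__(self, name):
        self.name = name


class ToyAutoTokenizer:
    """Hypothetical factory: never instantiated, only used through from_pretrained()."""

    def __init__(self):
        # Direct construction is blocked; callers must use the factory method.
        raise OSError(
            "ToyAutoTokenizer is designed to be instantiated using "
            "ToyAutoTokenizer.from_pretrained(name)."
        )

    @classmethod
    def from_pretrained(cls, name):
        # The factory returns an instance of a concrete class, not of the factory itself.
        return ToyTokenizer(name)


tokenizer = ToyAutoTokenizer.from_pretrained("toy-model")
print(type(tokenizer).__name__)

try:
    ToyAutoTokenizer()
except OSError as err:
    print("direct instantiation failed:", err)
```

The object returned by the class method is an instance of the concrete class, which is why code written against the factory never holds an instance of the factory itself.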

from_pretrained

( pretrained_model_name_or_path *inputs **kwargs )

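Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model ID of a predefined tokenizer hosted inside a model repo on huggingface.co.
- A path to a directory containing the vocabulary files required by the tokenizer, for instance a directory saved using the save_pretrained() method, e.g. ./my_model_directory/.
- A path or URL to a single saved vocabulary file, if and only if the tokenizer requires only a single vocabulary file (like Bert or XLNet), e.g. ./my_model_directory/vocab.txt. (Not applicable to all derived classes)
inputs (additional positional arguments, optional) — Will be passed along to the tokenizer `__init__()` method.
config (PretrainedConfig, optional) — The configuration object used to determine the tokenizer class to instantiate.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git.
subfolder (str, optional) — If the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.
use_fast (bool, optional, defaults to True) — Use a fast Rust-based tokenizer if it is supported for the given model. If a fast tokenizer is not available for the given model, a normal Python-based tokenizer is returned instead.
tokenizer_type (str, optional) — The tokenizer type to load.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Will be passed to the tokenizer `__init__()` method. Can be used to set special tokens such as `bos_token`, `eos_token`, `unk_token`, `sep_token`, `pad_token`, `cls_token`, `mask_token`, `additional_special_tokens`. See the parameters of `__init__()` for more details.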

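Instantiates one of the tokenizer classes of the library from a pretrained model vocabulary.

The tokenizer class to instantiate is selected based on the `model_type` property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible). When that property is missing, the selection falls back to pattern matching on `pretrained_model_name_or_path`: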

albert — AlbertTokenizer or AlbertTokenizerFast (ALBERT model)
align — BertTokenizer or BertTokenizerFast (ALIGN model)
arcee — LlamaTokenizer or LlamaTokenizerFast (Arcee model)
aria — LlamaTokenizer or LlamaTokenizerFast (Aria model)
aya_vision — CohereTokenizerFast (AyaVision model)
bark — BertTokenizer or BertTokenizerFast (Bark model)
bart — BartTokenizer or BartTokenizerFast (BART model)
barthez — BarthezTokenizer or BarthezTokenizerFast (BARThez model)
bartpho — BartphoTokenizer (BARTpho model)
bert — BertTokenizer or BertTokenizerFast (BERT model)
bert-generation — BertGenerationTokenizer (Bert Generation model)
bert-japanese — BertJapaneseTokenizer (BertJapanese model)
bertweet — BertweetTokenizer (BERTweet model)
big_bird — BigBirdTokenizer or BigBirdTokenizerFast (BigBird model)
bigbird_pegasus — PegasusTokenizer or PegasusTokenizerFast (BigBird-Pegasus model)
biogpt — BioGptTokenizer (BioGpt model)
bitnet — PreTrainedTokenizerFast (BitNet model)
blenderbot — BlenderbotTokenizer or BlenderbotTokenizerFast (Blenderbot model)
blenderbot-small — BlenderbotSmallTokenizer (BlenderbotSmall model)
blip — BertTokenizer or BertTokenizerFast (BLIP model)
blip-2 — GPT2Tokenizer or GPT2TokenizerFast (BLIP-2 model)
bloom — BloomTokenizerFast (BLOOM model)
bridgetower — RobertaTokenizer or RobertaTokenizerFast (BridgeTower model)
bros — BertTokenizer or BertTokenizerFast (BROS model)
byt5 — ByT5Tokenizer (ByT5 model)
camembert — CamembertTokenizer or CamembertTokenizerFast (CamemBERT model)
canine — CanineTokenizer (CANINE model)
chameleon — LlamaTokenizer or LlamaTokenizerFast (Chameleon model)
chinese_clip — BertTokenizer or BertTokenizerFast (Chinese-CLIP model)
clap — RobertaTokenizer or RobertaTokenizerFast (CLAP model)
clip — CLIPTokenizer or CLIPTokenizerFast (CLIP model)
clipseg — CLIPTokenizer or CLIPTokenizerFast (CLIPSeg model)
clvp — ClvpTokenizer (CLVP model)
code_llama — CodeLlamaTokenizer or CodeLlamaTokenizerFast (CodeLlama model)
codegen — CodeGenTokenizer or CodeGenTokenizerFast (CodeGen model)
cohere — CohereTokenizerFast (Cohere model)
cohere2 — CohereTokenizerFast (Cohere2 model)
colpali — LlamaTokenizer or LlamaTokenizerFast (ColPali model)
colqwen2 — Qwen2Tokenizer or Qwen2TokenizerFast (ColQwen2 model)
convbert — ConvBertTokenizer or ConvBertTokenizerFast (ConvBERT model)
cpm — CpmTokenizer or CpmTokenizerFast (CPM model)
cpmant — CpmAntTokenizer (CPM-Ant model)
ctrl — CTRLTokenizer (CTRL model)
data2vec-audio — Wav2Vec2CTCTokenizer (Data2VecAudio model)
data2vec-text — RobertaTokenizer or RobertaTokenizerFast (Data2VecText model)
dbrx — GPT2Tokenizer or GPT2TokenizerFast (DBRX model)
deberta — DebertaTokenizer or DebertaTokenizerFast (DeBERTa model)
deberta-v2 — DebertaV2Tokenizer or DebertaV2TokenizerFast (DeBERTa-v2 model)
deepseek_v3 — LlamaTokenizer or LlamaTokenizerFast (DeepSeek-V3 model)
dia — DiaTokenizer (Dia model)
diffllama — LlamaTokenizer or LlamaTokenizerFast (DiffLlama model)
distilbert — DistilBertTokenizer or DistilBertTokenizerFast (DistilBERT model)
dpr — DPRQuestionEncoderTokenizer or DPRQuestionEncoderTokenizerFast (DPR model)
electra — ElectraTokenizer or ElectraTokenizerFast (ELECTRA model)
emu3 — GPT2Tokenizer or GPT2TokenizerFast (Emu3 model)
ernie — BertTokenizer or BertTokenizerFast (ERNIE model)
ernie_m — ErnieMTokenizer (ErnieM model)
esm — EsmTokenizer (ESM model)
falcon — PreTrainedTokenizerFast (Falcon model)
falcon_mamba — GPTNeoXTokenizerFast (FalconMamba model)
fastspeech2_conformer — (FastSpeech2Conformer model)
flaubert — FlaubertTokenizer (FlauBERT model)
fnet — FNetTokenizer or FNetTokenizerFast (FNet model)
fsmt — FSMTTokenizer (FairSeq Machine-Translation model)
funnel — FunnelTokenizer or FunnelTokenizerFast (Funnel Transformer model)
gemma — GemmaTokenizer or GemmaTokenizerFast (Gemma model)
gemma2 — GemmaTokenizer or GemmaTokenizerFast (Gemma2 model)
gemma3 — GemmaTokenizer or GemmaTokenizerFast (Gemma3ForConditionalGeneration model)
gemma3_text — GemmaTokenizer or GemmaTokenizerFast (Gemma3ForCausalLM model)
gemma3n — GemmaTokenizer or GemmaTokenizerFast (Gemma3nForConditionalGeneration model)
gemma3n_text — GemmaTokenizer or GemmaTokenizerFast (Gemma3nForCausalLM model)
git — BertTokenizer or BertTokenizerFast (GIT model)
glm — PreTrainedTokenizerFast (GLM model)
glm4 — PreTrainedTokenizerFast (GLM4 model)
glm4v — PreTrainedTokenizerFast (GLM4V model)
gpt-sw3 — GPTSw3Tokenizer (GPT-Sw3 model)
gpt2 — GPT2Tokenizer or GPT2TokenizerFast (OpenAI GPT-2 model)
gpt_bigcode — GPT2Tokenizer or GPT2TokenizerFast (GPTBigCode model)
gpt_neo — GPT2Tokenizer or GPT2TokenizerFast (GPT Neo model)
gpt_neox — GPTNeoXTokenizerFast (GPT NeoX model)
gpt_neox_japanese — GPTNeoXJapaneseTokenizer (GPT NeoX Japanese model)
gptj — GPT2Tokenizer or GPT2TokenizerFast (GPT-J model)
gptsan-japanese — GPTSanJapaneseTokenizer (GPTSAN-japanese model)
granite — GPT2Tokenizer (Granite model)
granitemoe — GPT2Tokenizer (GraniteMoeMoe model)
granitemoehybrid — GPT2Tokenizer (GraniteMoeHybrid model)
granitemoeshared — GPT2Tokenizer (GraniteMoeSharedMoe model)
grounding-dino — BertTokenizer or BertTokenizerFast (Grounding DINO model)
groupvit — CLIPTokenizer or CLIPTokenizerFast (GroupViT model)
helium — PreTrainedTokenizerFast (Helium model)
herbert — HerbertTokenizer or HerbertTokenizerFast (HerBERT model)
hubert — Wav2Vec2CTCTokenizer (Hubert model)
ibert — RobertaTokenizer or RobertaTokenizerFast (I-BERT model)
idefics — LlamaTokenizerFast (IDEFICS model)
idefics2 — LlamaTokenizer or LlamaTokenizerFast (Idefics2 model)
idefics3 — LlamaTokenizer or LlamaTokenizerFast (Idefics3 model)
instructblip — GPT2Tokenizer or GPT2TokenizerFast (InstructBLIP model)
instructblipvideo — GPT2Tokenizer or GPT2TokenizerFast (InstructBlipVideo model)
internvl — Qwen2Tokenizer or Qwen2TokenizerFast (InternVL model)
jamba — LlamaTokenizer or LlamaTokenizerFast (Jamba model)
janus — LlamaTokenizerFast (Janus model)
jetmoe — LlamaTokenizer or LlamaTokenizerFast (JetMoe model)
jukebox — JukeboxTokenizer (Jukebox model)
kosmos-2 — XLMRobertaTokenizer or XLMRobertaTokenizerFast (KOSMOS-2 model)
layoutlm — LayoutLMTokenizer or LayoutLMTokenizerFast (LayoutLM model)
layoutlmv2 — LayoutLMv2Tokenizer or LayoutLMv2TokenizerFast (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3Tokenizer or LayoutLMv3TokenizerFast (LayoutLMv3 model)
layoutxlm — LayoutXLMTokenizer or LayoutXLMTokenizerFast (LayoutXLM model)
led — LEDTokenizer or LEDTokenizerFast (LED model)
lilt — LayoutLMv3Tokenizer or LayoutLMv3TokenizerFast (LiLT model)
llama — LlamaTokenizer or LlamaTokenizerFast (LLaMA model)
llama4 — LlamaTokenizer or LlamaTokenizerFast (Llama4 model)
llama4_text — LlamaTokenizer or LlamaTokenizerFast (Llama4ForCausalLM model)
llava — LlamaTokenizer or LlamaTokenizerFast (LLaVa model)
llava_next — LlamaTokenizer or LlamaTokenizerFast (LLaVA-NeXT model)
llava_next_video — LlamaTokenizer or LlamaTokenizerFast (LLaVa-NeXT-Video model)
llava_onevision — LlamaTokenizer or LlamaTokenizerFast (LLaVA-Onevision model)
longformer — LongformerTokenizer or LongformerTokenizerFast (Longformer model)
longt5 — T5Tokenizer or T5TokenizerFast (LongT5 model)
luke — LukeTokenizer (LUKE model)
lxmert — LxmertTokenizer or LxmertTokenizerFast (LXMERT model)
m2m_100 — M2M100Tokenizer (M2M100 model)
mamba — GPTNeoXTokenizerFast (Mamba model)
mamba2 — GPTNeoXTokenizerFast (mamba2 model)
marian — MarianTokenizer (Marian model)
mbart — MBartTokenizer or MBartTokenizerFast (mBART model)
mbart50 — MBart50Tokenizer or MBart50TokenizerFast (mBART-50 model)
mega — RobertaTokenizer or RobertaTokenizerFast (MEGA model)
megatron-bert — BertTokenizer or BertTokenizerFast (Megatron-BERT model)
mgp-str — MgpstrTokenizer (MGP-STR model)
minimax — GPT2Tokenizer or GPT2TokenizerFast (MiniMax model)
mistral — LlamaTokenizer or LlamaTokenizerFast (Mistral model)
mixtral — LlamaTokenizer or LlamaTokenizerFast (Mixtral model)
mllama — LlamaTokenizer or LlamaTokenizerFast (Mllama model)
mluke — MLukeTokenizer (mLUKE model)
mobilebert — MobileBertTokenizer or MobileBertTokenizerFast (MobileBERT model)
modernbert — PreTrainedTokenizerFast (ModernBERT model)
moonshine — PreTrainedTokenizerFast (Moonshine model)
moshi — PreTrainedTokenizerFast (Moshi model)
mpnet — MPNetTokenizer or MPNetTokenizerFast (MPNet model)
mpt — GPTNeoXTokenizerFast (MPT model)
mra — RobertaTokenizer or RobertaTokenizerFast (MRA model)
mt5 — MT5Tokenizer or MT5TokenizerFast (MT5 model)
musicgen — T5Tokenizer or T5TokenizerFast (MusicGen model)
musicgen_melody — T5Tokenizer or T5TokenizerFast (MusicGen Melody model)
mvp — MvpTokenizer or MvpTokenizerFast (MVP model)
myt5 — MyT5Tokenizer (myt5 model)
nemotron — PreTrainedTokenizerFast (Nemotron model)
nezha — BertTokenizer or BertTokenizerFast (Nezha model)
nllb — NllbTokenizer or NllbTokenizerFast (NLLB model)
nllb-moe — NllbTokenizer or NllbTokenizerFast (NLLB-MOE model)
nystromformer — AlbertTokenizer or AlbertTokenizerFast (Nyströmformer model)
olmo — GPTNeoXTokenizerFast (OLMo model)
olmo2 — GPTNeoXTokenizerFast (OLMo2 model)
olmoe — GPTNeoXTokenizerFast (OLMoE model)
omdet-turbo — CLIPTokenizer or CLIPTokenizerFast (OmDet-Turbo model)
oneformer — CLIPTokenizer or CLIPTokenizerFast (OneFormer model)
openai-gpt — OpenAIGPTTokenizer or OpenAIGPTTokenizerFast (OpenAI GPT model)
opt — GPT2Tokenizer or GPT2TokenizerFast (OPT model)
owlv2 — CLIPTokenizer or CLIPTokenizerFast (OWLv2 model)
owlvit — CLIPTokenizer or CLIPTokenizerFast (OWL-ViT model)
paligemma — LlamaTokenizer or LlamaTokenizerFast (PaliGemma model)
pegasus — PegasusTokenizer or PegasusTokenizerFast (Pegasus model)
pegasus_x — PegasusTokenizer or PegasusTokenizerFast (PEGASUS-X model)
perceiver — PerceiverTokenizer (Perceiver model)
persimmon — LlamaTokenizer or LlamaTokenizerFast (Persimmon model)
phi — CodeGenTokenizer or CodeGenTokenizerFast (Phi model)
phi3 — LlamaTokenizer or LlamaTokenizerFast (Phi3 model)
phimoe — LlamaTokenizer or LlamaTokenizerFast (Phimoe model)
phobert — PhobertTokenizer (PhoBERT model)
pix2struct — T5Tokenizer or T5TokenizerFast (Pix2Struct model)
pixtral — PreTrainedTokenizerFast (Pixtral model)
plbart — PLBartTokenizer (PLBart model)
prophetnet — ProphetNetTokenizer (ProphetNet model)
qdqbert — BertTokenizer or BertTokenizerFast (QDQBert model)
qwen2 — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2 model)
qwen2_5_omni — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2_5Omni model)
qwen2_5_vl — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2_5_VL model)
qwen2_audio — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2Audio model)
qwen2_moe — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2MoE model)
qwen2_vl — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2VL model)
qwen3 — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen3 model)
qwen3_moe — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen3MoE model)
rag — RagTokenizer (RAG model)
realm — RealmTokenizer or RealmTokenizerFast (REALM model)
recurrent_gemma — GemmaTokenizer or GemmaTokenizerFast (RecurrentGemma model)
reformer — ReformerTokenizer or ReformerTokenizerFast (Reformer model)
rembert — RemBertTokenizer or RemBertTokenizerFast (RemBERT model)
retribert — RetriBertTokenizer or RetriBertTokenizerFast (RetriBERT model)
roberta — RobertaTokenizer or RobertaTokenizerFast (RoBERTa model)
roberta-prelayernorm — RobertaTokenizer or RobertaTokenizerFast (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertTokenizer (RoCBert model)
roformer — RoFormerTokenizer or RoFormerTokenizerFast (RoFormer model)
rwkv — GPTNeoXTokenizerFast (RWKV model)
seamless_m4t — SeamlessM4TTokenizer or SeamlessM4TTokenizerFast (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4TTokenizer or SeamlessM4TTokenizerFast (SeamlessM4Tv2 model)
shieldgemma2 — GemmaTokenizer or GemmaTokenizerFast (Shieldgemma2 model)
siglip — SiglipTokenizer (SigLIP model)
siglip2 — GemmaTokenizer or GemmaTokenizerFast (SigLIP2 model)
smollm3 — PreTrainedTokenizerFast (SmolLM3 model)
speech_to_text — Speech2TextTokenizer (Speech2Text model)
speech_to_text_2 — Speech2Text2Tokenizer (Speech2Text2 model)
speecht5 — SpeechT5Tokenizer (SpeechT5 model)
splinter — SplinterTokenizer or SplinterTokenizerFast (Splinter model)
squeezebert — SqueezeBertTokenizer or SqueezeBertTokenizerFast (SqueezeBERT model)
stablelm — GPTNeoXTokenizerFast (StableLm model)
starcoder2 — GPT2Tokenizer or GPT2TokenizerFast (Starcoder2 model)
switch_transformers — T5Tokenizer or T5TokenizerFast (SwitchTransformers model)
t5 — T5Tokenizer or T5TokenizerFast (T5 model)
t5gemma — GemmaTokenizer or GemmaTokenizerFast (T5Gemma model)
tapas — TapasTokenizer (TAPAS model)
tapex — TapexTokenizer (TAPEX model)
transfo-xl — TransfoXLTokenizer (Transformer-XL model)
tvp — BertTokenizer or BertTokenizerFast (TVP model)
udop — UdopTokenizer or UdopTokenizerFast (UDOP model)
umt5 — T5Tokenizer or T5TokenizerFast (UMT5 model)
video_llava — LlamaTokenizer or LlamaTokenizerFast (VideoLlava model)
vilt — BertTokenizer or BertTokenizerFast (ViLT model)
vipllava — LlamaTokenizer or LlamaTokenizerFast (VipLlava model)
visual_bert — BertTokenizer or BertTokenizerFast (VisualBERT model)
vits — VitsTokenizer (VITS model)
wav2vec2 — Wav2Vec2CTCTokenizer (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2CTCTokenizer (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2CTCTokenizer (Wav2Vec2-Conformer model)
wav2vec2_phoneme — Wav2Vec2PhonemeCTCTokenizer (Wav2Vec2Phoneme model)
whisper — WhisperTokenizer or WhisperTokenizerFast (Whisper model)
xclip — CLIPTokenizer or CLIPTokenizerFast (X-CLIP model)
xglm — XGLMTokenizer or XGLMTokenizerFast (XGLM model)
xlm — XLMTokenizer (XLM model)
xlm-prophetnet — XLMProphetNetTokenizer (XLM-ProphetNet model)
xlm-roberta — XLMRobertaTokenizer or XLMRobertaTokenizerFast (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaTokenizer or XLMRobertaTokenizerFast (XLM-RoBERTa-XL model)
xlnet — XLNetTokenizer or XLNetTokenizerFast (XLNet model)
xmod — XLMRobertaTokenizer or XLMRobertaTokenizerFast (X-MOD model)
yoso — AlbertTokenizer or AlbertTokenizerFast (YOSO model)
zamba — LlamaTokenizer or LlamaTokenizerFast (Zamba model)
zamba2 — LlamaTokenizer or LlamaTokenizerFast (Zamba2 model)
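The selection rule (use `model_type` when it is known, otherwise pattern-match on the name or path, then prefer the fast class when `use_fast` allows it) can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: `TOKENIZER_NAMES` is a hypothetical four-entry excerpt of the table above, `resolve_tokenizer_name` is a hypothetical helper, and the longest-key-first matching order is an assumption chosen so that `roberta` is not mistaken for `bert`.

```python
# Hypothetical, heavily trimmed mapping: model_type -> (slow class name, fast class name).
# None means that variant is not available for that model type.
TOKENIZER_NAMES = {
    "bert": ("BertTokenizer", "BertTokenizerFast"),
    "roberta": ("RobertaTokenizer", "RobertaTokenizerFast"),
    "bloom": (None, "BloomTokenizerFast"),
    "byt5": ("ByT5Tokenizer", None),
}


def resolve_tokenizer_name(name_or_path, model_type=None, use_fast=True):
    """Return the tokenizer class name to use, or raise ValueError if none matches."""
    if model_type is None:
        # Fallback: pattern matching on the name or path.
        # Longer keys are tried first so "roberta" wins over "bert".
        for key in sorted(TOKENIZER_NAMES, key=len, reverse=True):
            if key in name_or_path.lower():
                model_type = key
                break
    if model_type not in TOKENIZER_NAMES:
        raise ValueError(f"Cannot determine a tokenizer class for {name_or_path!r}")
    slow, fast = TOKENIZER_NAMES[model_type]
    if use_fast and fast is not None:
        return fast
    # Otherwise return whichever variant exists, preferring the slow one.
    return slow if slow is not None else fast


print(resolve_tokenizer_name("google-bert/bert-base-uncased"))
print(resolve_tokenizer_name("FacebookAI/roberta-base", use_fast=False))
print(resolve_tokenizer_name("some/repo", model_type="bloom", use_fast=False))
```

Entries that list only one class in the table, such as `bloom` or `byt5`, are modeled here with `None` for the missing variant, so the helper returns the only available class regardless of `use_fast`.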

Examples

>>> from transformers import AutoTokenizer

>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")

>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/bert_saved_model/')*)
>>> # tokenizer = AutoTokenizer.from_pretrained("./test/bert_saved_model/")

>>> # Download vocabulary from huggingface.co and define model-specific arguments
>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_prefix_space=True)

register

( config_class slow_tokenizer_class = None fast_tokenizer_class = None exist_ok = False )

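Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
slow_tokenizer_class (PreTrainedTokenizer, optional) — The slow tokenizer to register.
fast_tokenizer_class (PreTrainedTokenizerFast, optional) — The fast tokenizer to register.

Registers a new tokenizer in this mapping.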
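A registry of this shape can be sketched in a few lines. Everything here is hypothetical and only mirrors the signature shown above: `ToyTokenizerRegistry`, `MyConfig`, `MySlowTokenizer`, and `MyFastTokenizer` are illustrative names, and the behavior of `exist_ok` (raise on a duplicate key unless it is `True`) is an assumption, since the text does not describe it.

```python
class ToyTokenizerRegistry:
    """Hypothetical mapping from a config class to a (slow, fast) tokenizer pair."""

    def __init__(self):
        self._mapping = {}

    def register(self, config_class, slow_tokenizer_class=None,
                 fast_tokenizer_class=None, exist_ok=False):
        # At least one tokenizer class is needed for the entry to be useful.
        if slow_tokenizer_class is None and fast_tokenizer_class is None:
            raise ValueError("Pass slow_tokenizer_class or fast_tokenizer_class")
        # Assumed semantics: refuse to overwrite an entry unless exist_ok=True.
        if config_class in self._mapping and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered")
        self._mapping[config_class] = (slow_tokenizer_class, fast_tokenizer_class)

    def lookup(self, config_class):
        return self._mapping[config_class]


class MyConfig:
    """Hypothetical configuration class."""


class MySlowTokenizer:
    """Hypothetical Python tokenizer."""


class MyFastTokenizer:
    """Hypothetical fast tokenizer."""


registry = ToyTokenizerRegistry()
registry.register(MyConfig, slow_tokenizer_class=MySlowTokenizer)
# Re-registering the same config class only succeeds with exist_ok=True.
registry.register(MyConfig, MySlowTokenizer, MyFastTokenizer, exist_ok=True)
print(registry.lookup(MyConfig))
```

Keying the mapping on the configuration class is what lets a lookup go from a loaded configuration to the matching tokenizer pair.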

AutoFeatureExtractor

class transformers.AutoFeatureExtractor

( )

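This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (doing so raises an error).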

from_pretrained

( pretrained_model_name_or_path **kwargs )

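Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model ID of a pretrained feature extractor hosted inside a model repo on huggingface.co.
- A path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g. ./my_model_directory/.
- A path or URL to a saved feature extractor JSON file, e.g. ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the feature extractor files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, the token generated when running huggingface-cli login (stored in ~/.huggingface) is used.
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns only the final feature extractor object. If True, this function returns a Tuple(feature_extractor, unused_kwargs), where *unused_kwargs* is a dictionary of the key/value pairs whose keys are not feature extractor attributes, i.e. the part of kwargs that was not used to update feature_extractor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (dict[str, Any], optional) — The values in `kwargs` of any keys that are feature extractor attributes will be used to override the loaded values. Behavior for key/value pairs whose keys are *not* feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.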
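The interplay between `kwargs` and `return_unused_kwargs` can be sketched with a toy class. This is an illustration of the behavior described in the parameter list, not the library's implementation: `ToyFeatureExtractor`, `build_extractor`, and the attribute names `sampling_rate` and `do_normalize` are hypothetical choices for the sketch.

```python
class ToyFeatureExtractor:
    """Hypothetical feature extractor with two configurable attributes."""

    def __init__(self, sampling_rate=16000, do_normalize=True):
        self.sampling_rate = sampling_rate
        self.do_normalize = do_normalize


def build_extractor(loaded_config, return_unused_kwargs=False, **kwargs):
    """Build an extractor from loaded values, then apply keyword overrides."""
    extractor = ToyFeatureExtractor(**loaded_config)
    unused = {}
    for key, value in kwargs.items():
        if hasattr(extractor, key):
            # The key is an attribute of the extractor: override the loaded value.
            setattr(extractor, key, value)
        else:
            # Not an attribute: set aside instead of being applied.
            unused[key] = value
    if return_unused_kwargs:
        return extractor, unused
    return extractor


extractor, unused = build_extractor(
    {"sampling_rate": 16000, "do_normalize": True},
    return_unused_kwargs=True,
    sampling_rate=8000,  # matches an attribute, so it overrides the loaded value
    foo="bar",           # matches nothing, so it ends up in unused
)
print(extractor.sampling_rate)
print(unused)
```

With `return_unused_kwargs=False` the same call would return only the extractor, and the unmatched `foo` entry would be dropped silently.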

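Instantiates one of the feature extractor classes of the library from a pretrained model.

The feature extractor class to instantiate is selected based on the `model_type` property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible). When that property is missing, the selection falls back to pattern matching on `pretrained_model_name_or_path`: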

audio-spectrogram-transformer — ASTFeatureExtractor (Audio Spectrogram Transformer model)
beit — BeitFeatureExtractor (BEiT model)
chinese_clip — ChineseCLIPFeatureExtractor (Chinese-CLIP model)
clap — ClapFeatureExtractor (CLAP model)
clip — CLIPFeatureExtractor (CLIP model)
clipseg — ViTFeatureExtractor (CLIPSeg model)
clvp — ClvpFeatureExtractor (CLVP model)
conditional_detr — ConditionalDetrFeatureExtractor (Conditional DETR model)
convnext — ConvNextFeatureExtractor (ConvNeXT model)
cvt — ConvNextFeatureExtractor (CvT model)
dac — DacFeatureExtractor (DAC model)
data2vec-audio — Wav2Vec2FeatureExtractor (Data2VecAudio model)
data2vec-vision — BeitFeatureExtractor (Data2VecVision model)
deformable_detr — DeformableDetrFeatureExtractor (Deformable DETR model)
deit — DeiTFeatureExtractor (DeiT model)
detr — DetrFeatureExtractor (DETR model)
dia — DiaFeatureExtractor (Dia model)
dinat — ViTFeatureExtractor (DiNAT model)
donut-swin — DonutFeatureExtractor (DonutSwin model)
dpt — DPTFeatureExtractor (DPT model)
encodec — EncodecFeatureExtractor (EnCodec model)
flava — FlavaFeatureExtractor (FLAVA model)
gemma3n — Gemma3nAudioFeatureExtractor (Gemma3nForConditionalGeneration model)
glpn — GLPNFeatureExtractor (GLPN model)
granite_speech — GraniteSpeechFeatureExtractor (GraniteSpeech model)
groupvit — CLIPFeatureExtractor (GroupViT model)
hubert — Wav2Vec2FeatureExtractor (Hubert model)
imagegpt — ImageGPTFeatureExtractor (ImageGPT model)
kyutai_speech_to_text — KyutaiSpeechToTextFeatureExtractor (KyutaiSpeechToText model)
layoutlmv2 — LayoutLMv2FeatureExtractor (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3FeatureExtractor (LayoutLMv3 model)
levit — LevitFeatureExtractor (LeViT model)
maskformer — MaskFormerFeatureExtractor (MaskFormer model)
mctct — MCTCTFeatureExtractor (M-CTC-T model)
mimi — EncodecFeatureExtractor (Mimi model)
mobilenet_v1 — MobileNetV1FeatureExtractor (MobileNetV1 model)
mobilenet_v2 — MobileNetV2FeatureExtractor (MobileNetV2 model)
mobilevit — MobileViTFeatureExtractor (MobileViT model)
moonshine — Wav2Vec2FeatureExtractor (Moonshine model)
moshi — EncodecFeatureExtractor (Moshi model)
nat — ViTFeatureExtractor (NAT model)
owlvit — OwlViTFeatureExtractor (OWL-ViT model)
perceiver — PerceiverFeatureExtractor (Perceiver model)
phi4_multimodal — Phi4MultimodalFeatureExtractor (Phi4Multimodal model)
poolformer — PoolFormerFeatureExtractor (PoolFormer model)
pop2piano — Pop2PianoFeatureExtractor (Pop2Piano model)
regnet — ConvNextFeatureExtractor (RegNet model)
resnet — ConvNextFeatureExtractor (ResNet model)
seamless_m4t — SeamlessM4TFeatureExtractor (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4TFeatureExtractor (SeamlessM4Tv2 model)
segformer — SegformerFeatureExtractor (SegFormer model)
sew — Wav2Vec2FeatureExtractor (SEW model)
sew-d — Wav2Vec2FeatureExtractor (SEW-D model)
speech_to_text — Speech2TextFeatureExtractor (Speech2Text model)
speecht5 — SpeechT5FeatureExtractor (SpeechT5 model)
swiftformer — ViTFeatureExtractor (SwiftFormer model)
swin — ViTFeatureExtractor (Swin Transformer model)
swinv2 — ViTFeatureExtractor (Swin Transformer V2 model)
table-transformer — DetrFeatureExtractor (Table Transformer model)
timesformer — VideoMAEFeatureExtractor (TimeSformer model)
tvlt — TvltFeatureExtractor (TVLT model)
unispeech — Wav2Vec2FeatureExtractor (UniSpeech model)
unispeech-sat — Wav2Vec2FeatureExtractor (UniSpeechSat model)
univnet — UnivNetFeatureExtractor (UnivNet model)
van — ConvNextFeatureExtractor (VAN model)
videomae — VideoMAEFeatureExtractor (VideoMAE model)
vilt — ViltFeatureExtractor (ViLT model)
vit — ViTFeatureExtractor (ViT model)
vit_mae — ViTFeatureExtractor (ViTMAE model)
vit_msn — ViTFeatureExtractor (ViTMSN model)
wav2vec2 — Wav2Vec2FeatureExtractor (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2FeatureExtractor (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2FeatureExtractor (Wav2Vec2-Conformer model)
wavlm — Wav2Vec2FeatureExtractor (WavLM model)
whisper — WhisperFeatureExtractor (Whisper model)
xclip — CLIPFeatureExtractor (X-CLIP model)
yolos — YolosFeatureExtractor (YOLOS model)

Passing token=True is required when you want to use a private model.

Examples

>>> from transformers import AutoFeatureExtractor

>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")

register

( config_class feature_extractor_class exist_ok = False )

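Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
feature_extractor_class (FeatureExtractionMixin) — The feature extractor to register.

Registers a new feature extractor for this class.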

AutoImageProcessor

class transformers.AutoImageProcessor

( )

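This is a generic image processor class that will be instantiated as one of the image processor classes of the library when created with the AutoImageProcessor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (doing so raises an error).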

from_pretrained

( pretrained_model_name_or_path *inputs **kwargs )

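Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model ID* of a pretrained image_processor hosted inside a model repo on huggingface.co.
- A path to a *directory* containing an image processor file saved using the save_pretrained() method, e.g. ./my_model_directory/.
- A path or URL to a saved image processor JSON *file*, e.g. ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the image processor files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, the token generated when running huggingface-cli login (stored in ~/.huggingface) is used.
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
use_fast (bool, optional, defaults to False) — Use a fast torchvision-based image processor if it is supported for the given model. If a fast image processor is not available for the given model, a normal numpy-based image processor is returned instead.
return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns only the final image processor object. If True, this function returns a Tuple(image_processor, unused_kwargs), where *unused_kwargs* is a dictionary of the key/value pairs whose keys are not image processor attributes, i.e. the part of kwargs that was not used to update image_processor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
image_processor_filename (str, optional, defaults to "config.json") — The name of the file in the model directory to use for the image processor config.
kwargs (dict[str, Any], optional) — The values in `kwargs` of any keys that are image processor attributes will be used to override the loaded values. Behavior for key/value pairs whose keys are *not* image processor attributes is controlled by the return_unused_kwargs keyword parameter.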

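Instantiates one of the image processor classes of the library from a pretrained model.

The image processor class to instantiate is selected based on the `model_type` property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible). When that property is missing, the selection falls back to pattern matching on `pretrained_model_name_or_path`: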

align — EfficientNetImageProcessor or EfficientNetImageProcessorFast (ALIGN model)
aria — AriaImageProcessor (Aria model)
beit — BeitImageProcessor or BeitImageProcessorFast (BEiT model)
bit — BitImageProcessor or BitImageProcessorFast (BiT model)
blip — BlipImageProcessor or BlipImageProcessorFast (BLIP model)
blip-2 — BlipImageProcessor or BlipImageProcessorFast (BLIP-2 model)
bridgetower — BridgeTowerImageProcessor or BridgeTowerImageProcessorFast (BridgeTower model)
chameleon — ChameleonImageProcessor (Chameleon model)
chinese_clip — ChineseCLIPImageProcessor or ChineseCLIPImageProcessorFast (Chinese-CLIP model)
clip — CLIPImageProcessor or CLIPImageProcessorFast (CLIP model)
clipseg — ViTImageProcessor or ViTImageProcessorFast (CLIPSeg model)
conditional_detr — ConditionalDetrImageProcessor or ConditionalDetrImageProcessorFast (Conditional DETR model)
convnext — ConvNextImageProcessor or ConvNextImageProcessorFast (ConvNeXT model)
convnextv2 — ConvNextImageProcessor or ConvNextImageProcessorFast (ConvNeXTV2 model)
cvt — ConvNextImageProcessor or ConvNextImageProcessorFast (CvT model)
data2vec-vision — BeitImageProcessor or BeitImageProcessorFast (Data2VecVision model)
deformable_detr — DeformableDetrImageProcessor or DeformableDetrImageProcessorFast (Deformable DETR model)
deit — DeiTImageProcessor or DeiTImageProcessorFast (DeiT model)
depth_anything — DPTImageProcessor or DPTImageProcessorFast (Depth Anything model)
depth_pro — DepthProImageProcessor or DepthProImageProcessorFast (DepthPro model)
deta — DetaImageProcessor (DETA model)
detr — DetrImageProcessor or DetrImageProcessorFast (DETR model)
dinat — ViTImageProcessor or ViTImageProcessorFast (DiNAT model)
dinov2 — BitImageProcessor or BitImageProcessorFast (DINOv2 model)
donut-swin — DonutImageProcessor or DonutImageProcessorFast (DonutSwin model)
dpt — DPTImageProcessor or DPTImageProcessorFast (DPT model)
efficientformer — EfficientFormerImageProcessor (EfficientFormer model)
efficientnet — EfficientNetImageProcessor or EfficientNetImageProcessorFast (EfficientNet model)
flava — FlavaImageProcessor or FlavaImageProcessorFast (FLAVA model)
focalnet — BitImageProcessor or BitImageProcessorFast (FocalNet model)
fuyu — FuyuImageProcessor (Fuyu model)
gemma3 — Gemma3ImageProcessor or Gemma3ImageProcessorFast (Gemma3ForConditionalGeneration model)
gemma3n — SiglipImageProcessor or SiglipImageProcessorFast (Gemma3nForConditionalGeneration model)
git — CLIPImageProcessor or CLIPImageProcessorFast (GIT model)
glm4v — Glm4vImageProcessor or Glm4vImageProcessorFast (GLM4V model)
glpn — GLPNImageProcessor (GLPN model)
got_ocr2 — GotOcr2ImageProcessor or GotOcr2ImageProcessorFast (GOT-OCR2 model)
grounding-dino — GroundingDinoImageProcessor or GroundingDinoImageProcessorFast (Grounding DINO model)
groupvit — CLIPImageProcessor or CLIPImageProcessorFast (GroupViT model)
hiera — BitImageProcessor or BitImageProcessorFast (Hiera model)
idefics — IdeficsImageProcessor (IDEFICS model)
idefics2 — Idefics2ImageProcessor or Idefics2ImageProcessorFast (Idefics2 model)
idefics3 — Idefics3ImageProcessor or Idefics3ImageProcessorFast (Idefics3 model)
ijepa — ViTImageProcessor or ViTImageProcessorFast (I-JEPA model)
imagegpt — ImageGPTImageProcessor (ImageGPT model)
instructblip — BlipImageProcessor or BlipImageProcessorFast (InstructBLIP model)
instructblipvideo — InstructBlipVideoImageProcessor (InstructBlipVideo model)
janus — JanusImageProcessor (Janus model)
kosmos-2 — CLIPImageProcessor or CLIPImageProcessorFast (KOSMOS-2 model)
layoutlmv2 — LayoutLMv2ImageProcessor or LayoutLMv2ImageProcessorFast (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ImageProcessor or LayoutLMv3ImageProcessorFast (LayoutLMv3 model)
levit — LevitImageProcessor or LevitImageProcessorFast (LeViT model)
lightglue — LightGlueImageProcessor (LightGlue model)
llama4 — Llama4ImageProcessor or Llama4ImageProcessorFast (Llama4 model)
llava — LlavaImageProcessor or LlavaImageProcessorFast (LLaVa model)
llava_next — LlavaNextImageProcessor or LlavaNextImageProcessorFast (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoImageProcessor (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionImageProcessor or LlavaOnevisionImageProcessorFast (LLaVA-Onevision model)
mask2former — Mask2FormerImageProcessor (Mask2Former model)
maskformer — MaskFormerImageProcessor (MaskFormer model)
mgp-str — ViTImageProcessor or ViTImageProcessorFast (MGP-STR model)
mistral3 — PixtralImageProcessor or PixtralImageProcessorFast (Mistral3 model)
mlcd — CLIPImageProcessor or CLIPImageProcessorFast (MLCD model)
mllama — MllamaImageProcessor (Mllama model)
mobilenet_v1 — MobileNetV1ImageProcessor or MobileNetV1ImageProcessorFast (MobileNetV1 model)
mobilenet_v2 — MobileNetV2ImageProcessor or MobileNetV2ImageProcessorFast (MobileNetV2 model)
mobilevit — MobileViTImageProcessor (MobileViT model)
mobilevitv2 — MobileViTImageProcessor (MobileViTV2 model)
nat — ViTImageProcessor or ViTImageProcessorFast (NAT model)
nougat — NougatImageProcessor (Nougat model)
oneformer — OneFormerImageProcessor (OneFormer model)
owlv2 — Owlv2ImageProcessor (OWLv2 model)
owlvit — OwlViTImageProcessor or OwlViTImageProcessorFast (OWL-ViT model)
paligemma — SiglipImageProcessor or SiglipImageProcessorFast (PaliGemma model)
perceiver — PerceiverImageProcessor or PerceiverImageProcessorFast (Perceiver model)
phi4_multimodal — Phi4MultimodalImageProcessorFast (Phi4Multimodal model)
pix2struct — Pix2StructImageProcessor (Pix2Struct model)
pixtral — PixtralImageProcessor or PixtralImageProcessorFast (Pixtral model)
poolformer — PoolFormerImageProcessor or PoolFormerImageProcessorFast (PoolFormer model)
prompt_depth_anything — PromptDepthAnythingImageProcessor (PromptDepthAnything model)
pvt — PvtImageProcessor or PvtImageProcessorFast (PVT model)
pvt_v2 — PvtImageProcessor or PvtImageProcessorFast (PVTv2 model)
qwen2_5_vl — Qwen2VLImageProcessor or Qwen2VLImageProcessorFast (Qwen2_5_VL model)
qwen2_vl — Qwen2VLImageProcessor or Qwen2VLImageProcessorFast (Qwen2VL model)
regnet — ConvNextImageProcessor or ConvNextImageProcessorFast (RegNet model)
resnet — ConvNextImageProcessor or ConvNextImageProcessorFast (ResNet model)
rt_detr — RTDetrImageProcessor or RTDetrImageProcessorFast (RT-DETR model)
sam — SamImageProcessor (SAM model)
sam_hq — SamImageProcessor (SAM-HQ model)
segformer — SegformerImageProcessor (SegFormer model)
seggpt — SegGptImageProcessor (SegGPT model)
shieldgemma2 — Gemma3ImageProcessor or Gemma3ImageProcessorFast (Shieldgemma2 model)
siglip — SiglipImageProcessor or SiglipImageProcessorFast (SigLIP model)
siglip2 — Siglip2ImageProcessor or Siglip2ImageProcessorFast (SigLIP2 model)
smolvlm — SmolVLMImageProcessor or SmolVLMImageProcessorFast (SmolVLM model)
superglue — SuperGlueImageProcessor (SuperGlue model)
swiftformer — ViTImageProcessor or ViTImageProcessorFast (SwiftFormer model)
swin — ViTImageProcessor or ViTImageProcessorFast (Swin Transformer model)
swin2sr — Swin2SRImageProcessor or Swin2SRImageProcessorFast (Swin2SR model)
swinv2 — ViTImageProcessor or ViTImageProcessorFast (Swin Transformer V2 model)
table-transformer — DetrImageProcessor (Table Transformer model)
timesformer — VideoMAEImageProcessor (TimeSformer model)
timm_wrapper — TimmWrapperImageProcessor (TimmWrapperModel model)
tvlt — TvltImageProcessor (TVLT model)
tvp — TvpImageProcessor (TVP model)
udop — LayoutLMv3ImageProcessor or LayoutLMv3ImageProcessorFast (UDOP model)
upernet — SegformerImageProcessor (UPerNet model)
van — ConvNextImageProcessor or ConvNextImageProcessorFast (VAN model)
videomae — VideoMAEImageProcessor (VideoMAE model)
vilt — ViltImageProcessor or ViltImageProcessorFast (ViLT model)
vipllava — CLIPImageProcessor or CLIPImageProcessorFast (VipLlava model)
vit — ViTImageProcessor or ViTImageProcessorFast (ViT model)
vit_hybrid — ViTHybridImageProcessor (ViT Hybrid model)
vit_mae — ViTImageProcessor or ViTImageProcessorFast (ViTMAE model)
vit_msn — ViTImageProcessor or ViTImageProcessorFast (ViTMSN model)
vitmatte — VitMatteImageProcessor or VitMatteImageProcessorFast (ViTMatte model)
xclip — CLIPImageProcessor or CLIPImageProcessorFast (X-CLIP model)
yolos — YolosImageProcessor or YolosImageProcessorFast (YOLOS model)
zoedepth — ZoeDepthImageProcessor or ZoeDepthImageProcessorFast (ZoeDepth model)

Passing token=True is required when you want to use a private model.

Example

>>> from transformers import AutoImageProcessor

>>> # Download image processor from huggingface.co and cache.
>>> image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

>>> # If image processor files are in a directory (e.g. image processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # image_processor = AutoImageProcessor.from_pretrained("./test/saved_model/")

register

( config_class image_processor_class = None slow_image_processor_class = None fast_image_processor_class = None exist_ok = False )

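Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
image_processor_class (ImageProcessingMixin) — The image processor to register.

Register a new image processor for this class.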
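
To make the role of config_class and exist_ok concrete, here is a minimal pure-Python sketch of a registry keyed by configuration class. It is not the library's implementation: ToyAutoRegistry, MyConfig and MyImageProcessor are hypothetical names, and the behavior of exist_ok (refusing to overwrite an existing entry unless it is True) is an assumption based on the parameter's name.

```python
class ToyAutoRegistry:
    """Hypothetical stand-in for an auto class's config-to-processor mapping."""

    def __init__(self):
        self._mapping = {}

    def register(self, config_class, processor_class, exist_ok=False):
        # Assumed semantics: refuse to replace an existing entry unless exist_ok is set.
        if config_class in self._mapping and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered")
        self._mapping[config_class] = processor_class

    def lookup(self, config):
        # Dispatch on the type of the configuration instance.
        return self._mapping[type(config)]


class MyConfig:  # hypothetical configuration class
    model_type = "my-model"


class MyImageProcessor:  # hypothetical image processor class
    pass


registry = ToyAutoRegistry()
registry.register(MyConfig, MyImageProcessor)
selected = registry.lookup(MyConfig())
```

In this sketch a second call to register with the same configuration class raises ValueError unless exist_ok=True is passed.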

AutoVideoProcessor

class transformers.AutoVideoProcessor

( )

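This is a generic video processor class that will be instantiated as one of the video processor classes of the library when created with the AutoVideoProcessor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).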

from_pretrained

( pretrained_model_name_or_path *inputs **kwargs )

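Parameters

pretrained_model_name_or_path (str or os.PathLike) — This can be one of:
- A string, the model id of a pretrained video_processor hosted inside a model repo on huggingface.co.
- A path to a directory containing a video processor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
- A path or URL to a saved video processor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model video processor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the video processor files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, the token generated when running huggingface-cli login (stored in ~/.huggingface) is used.
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns only the final video processor object. If True, this function returns a Tuple(video_processor, unused_kwargs), where unused_kwargs is a dictionary of the key/value pairs whose keys are not video processor attributes: i.e., the part of kwargs that has not been used to update video_processor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (dict[str, Any], optional) — The values in kwargs of any keys that are video processor attributes are used to override the loaded values. Behavior for key/value pairs whose keys are not video processor attributes is controlled by the return_unused_kwargs keyword parameter.

Instantiate one of the video processor classes of the library from a pretrained model vocabulary.

The video processor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path.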

glm4v — Glm4vVideoProcessor (GLM4V model)
instructblip — InstructBlipVideoVideoProcessor (InstructBLIP model)
instructblipvideo — InstructBlipVideoVideoProcessor (InstructBlipVideo model)
internvl — InternVLVideoProcessor (InternVL model)
llava_next_video — LlavaNextVideoVideoProcessor (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionVideoProcessor (LLaVA-Onevision model)
qwen2_5_omni — Qwen2VLVideoProcessor (Qwen2_5Omni model)
qwen2_5_vl — Qwen2VLVideoProcessor (Qwen2_5_VL model)
qwen2_vl — Qwen2VLVideoProcessor (Qwen2VL model)
smolvlm — SmolVLMVideoProcessor (SmolVLM model)
video_llava — VideoLlavaVideoProcessor (VideoLlava model)
vjepa2 — VJEPA2VideoProcessor (VJEPA2Model model)
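
The selection rule described above (model_type first, then a fallback on the name or path) can be sketched in plain Python. This is an illustration only, not the library's code: the dictionary holds just three entries from the table, select_processor_name is a hypothetical helper, and treating "pattern matching" as a case-insensitive substring search is an assumption.

```python
# Hypothetical mapping holding only a few entries from the table above.
VIDEO_PROCESSOR_NAMES = {
    "llava_onevision": "LlavaOnevisionVideoProcessor",
    "qwen2_vl": "Qwen2VLVideoProcessor",
    "smolvlm": "SmolVLMVideoProcessor",
}


def select_processor_name(config, name_or_path):
    """Pick a class name from config['model_type'], else by matching the path."""
    model_type = config.get("model_type")
    if model_type in VIDEO_PROCESSOR_NAMES:
        return VIDEO_PROCESSOR_NAMES[model_type]
    # Fallback (assumed): look for a known key inside the name or path,
    # trying longer keys first so the most specific key wins.
    lowered = str(name_or_path).lower()
    for key in sorted(VIDEO_PROCESSOR_NAMES, key=len, reverse=True):
        if key in lowered:
            return VIDEO_PROCESSOR_NAMES[key]
    raise ValueError(f"Unrecognized video processor for {name_or_path!r}")


by_type = select_processor_name({"model_type": "qwen2_vl"}, "./local_dir")
by_name = select_processor_name({}, "./checkpoints/smolvlm-finetuned")
```

Here by_type is resolved from the configuration alone, while by_name relies on the fallback because the configuration carries no model_type.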

Passing token=True is required when you want to use a private model.

Example

>>> from transformers import AutoVideoProcessor

>>> # Download video processor from huggingface.co and cache.
>>> video_processor = AutoVideoProcessor.from_pretrained("llava-hf/llava-onevision-qwen2-0.5b-ov-hf")

>>> # If video processor files are in a directory (e.g. video processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # video_processor = AutoVideoProcessor.from_pretrained("./test/saved_model/")

register

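( config_class video_processor_class exist_ok = False )

Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
video_processor_class (BaseVideoProcessor) — The video processor to register.

Register a new video processor for this class.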

AutoProcessor

class transformers.AutoProcessor

( )

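This is a generic processor class that will be instantiated as one of the processor classes of the library when created with the AutoProcessor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).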

from_pretrained

( pretrained_model_name_or_path **kwargs )

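Parameters

pretrained_model_name_or_path (str or os.PathLike) — This can be one of:
- A string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co.
- A path to a directory containing processor files saved using the save_pretrained() method, e.g., ./my_model_directory/.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the feature extractor files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, the token generated when running huggingface-cli login (stored in ~/.huggingface) is used.
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns only the final feature extractor object. If True, this function returns a Tuple(feature_extractor, unused_kwargs), where unused_kwargs is a dictionary of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs that has not been used to update feature_extractor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (dict[str, Any], optional) — The values in kwargs of any keys that are feature extractor attributes are used to override the loaded values. Behavior for key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.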
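
The interaction between kwargs and return_unused_kwargs described above can be illustrated with a small pure-Python sketch. ToyProcessor and load_toy_processor are hypothetical stand-ins written for this example, and the attribute names are invented; the sketch only mirrors the documented behavior (matching keys override loaded values, the rest are returned on request).

```python
class ToyProcessor:
    """Hypothetical processor with two attributes that kwargs may override."""

    def __init__(self, sampling_rate=16000, do_normalize=True):
        self.sampling_rate = sampling_rate
        self.do_normalize = do_normalize


def load_toy_processor(saved_values, return_unused_kwargs=False, **kwargs):
    """Build a processor from saved values, then apply keyword overrides."""
    processor = ToyProcessor(**saved_values)
    unused_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(processor, key):
            setattr(processor, key, value)  # key is an attribute: override it
        else:
            unused_kwargs[key] = value  # not an attribute: set aside
    if return_unused_kwargs:
        return processor, unused_kwargs
    return processor


processor, unused = load_toy_processor(
    {"sampling_rate": 16000},
    return_unused_kwargs=True,
    do_normalize=False,
    foo="bar",
)
```

With return_unused_kwargs left at False the same call would return only the processor, and the unmatched foo entry would be dropped.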

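Instantiate one of the processor classes of the library from a pretrained model vocabulary.

The processor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).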

align — AlignProcessor (ALIGN model)
altclip — AltCLIPProcessor (AltCLIP model)
aria — AriaProcessor (Aria model)
aya_vision — AyaVisionProcessor (AyaVision model)
bark — BarkProcessor (Bark model)
blip — BlipProcessor (BLIP model)
blip-2 — Blip2Processor (BLIP-2 model)
bridgetower — BridgeTowerProcessor (BridgeTower model)
chameleon — ChameleonProcessor (Chameleon model)
chinese_clip — ChineseCLIPProcessor (Chinese-CLIP model)
clap — ClapProcessor (CLAP model)
clip — CLIPProcessor (CLIP model)
clipseg — CLIPSegProcessor (CLIPSeg model)
clvp — ClvpProcessor (CLVP model)
colpali — ColPaliProcessor (ColPali model)
colqwen2 — ColQwen2Processor (ColQwen2 model)
dia — DiaProcessor (Dia model)
emu3 — Emu3Processor (Emu3 model)
flava — FlavaProcessor (FLAVA model)
fuyu — FuyuProcessor (Fuyu model)
gemma3 — Gemma3Processor (Gemma3ForConditionalGeneration model)
gemma3n — Gemma3nProcessor (Gemma3nForConditionalGeneration model)
git — GitProcessor (GIT model)
glm4v — Glm4vProcessor (GLM4V model)
got_ocr2 — GotOcr2Processor (GOT-OCR2 model)
granite_speech — GraniteSpeechProcessor (GraniteSpeech model)
grounding-dino — GroundingDinoProcessor (Grounding DINO model)
groupvit — CLIPProcessor (GroupViT model)
hubert — Wav2Vec2Processor (Hubert model)
idefics — IdeficsProcessor (IDEFICS model)
idefics2 — Idefics2Processor (Idefics2 model)
idefics3 — Idefics3Processor (Idefics3 model)
instructblip — InstructBlipProcessor (InstructBLIP model)
instructblipvideo — InstructBlipVideoProcessor (InstructBlipVideo model)
internvl — InternVLProcessor (InternVL model)
janus — JanusProcessor (Janus model)
kosmos-2 — Kosmos2Processor (KOSMOS-2 model)
kyutai_speech_to_text — KyutaiSpeechToTextProcessor (KyutaiSpeechToText model)
layoutlmv2 — LayoutLMv2Processor (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3Processor (LayoutLMv3 model)
llama4 — Llama4Processor (Llama4 model)
llava — LlavaProcessor (LLaVa model)
llava_next — LlavaNextProcessor (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoProcessor (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionProcessor (LLaVA-Onevision model)
markuplm — MarkupLMProcessor (MarkupLM model)
mctct — MCTCTProcessor (M-CTC-T model)
mgp-str — MgpstrProcessor (MGP-STR model)
mistral3 — PixtralProcessor (Mistral3 model)
mllama — MllamaProcessor (Mllama model)
moonshine — Wav2Vec2Processor (Moonshine model)
oneformer — OneFormerProcessor (OneFormer model)
owlv2 — Owlv2Processor (OWLv2 model)
owlvit — OwlViTProcessor (OWL-ViT model)
paligemma — PaliGemmaProcessor (PaliGemma model)
phi4_multimodal — Phi4MultimodalProcessor (Phi4Multimodal model)
pix2struct — Pix2StructProcessor (Pix2Struct model)
pixtral — PixtralProcessor (Pixtral model)
pop2piano — Pop2PianoProcessor (Pop2Piano model)
qwen2_5_omni — Qwen2_5OmniProcessor (Qwen2_5Omni model)
qwen2_5_vl — Qwen2_5_VLProcessor (Qwen2_5_VL model)
qwen2_audio — Qwen2AudioProcessor (Qwen2Audio model)
qwen2_vl — Qwen2VLProcessor (Qwen2VL model)
sam — SamProcessor (SAM model)
sam_hq — SamHQProcessor (SAM-HQ model)
seamless_m4t — SeamlessM4TProcessor (SeamlessM4T model)
sew — Wav2Vec2Processor (SEW model)
sew-d — Wav2Vec2Processor (SEW-D model)
shieldgemma2 — ShieldGemma2Processor (Shieldgemma2 model)
siglip — SiglipProcessor (SigLIP model)
siglip2 — Siglip2Processor (SigLIP2 model)
smolvlm — SmolVLMProcessor (SmolVLM model)
speech_to_text — Speech2TextProcessor (Speech2Text model)
speech_to_text_2 — Speech2Text2Processor (Speech2Text2 model)
speecht5 — SpeechT5Processor (SpeechT5 model)
trocr — TrOCRProcessor (TrOCR model)
tvlt — TvltProcessor (TVLT model)
tvp — TvpProcessor (TVP model)
udop — UdopProcessor (UDOP model)
unispeech — Wav2Vec2Processor (UniSpeech model)
unispeech-sat — Wav2Vec2Processor (UniSpeechSat model)
video_llava — VideoLlavaProcessor (VideoLlava model)
vilt — ViltProcessor (ViLT model)
vipllava — LlavaProcessor (VipLlava model)
vision-text-dual-encoder — VisionTextDualEncoderProcessor (VisionTextDualEncoder model)
wav2vec2 — Wav2Vec2Processor (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2Processor (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2Processor (Wav2Vec2-Conformer model)
wavlm — Wav2Vec2Processor (WavLM model)
whisper — WhisperProcessor (Whisper model)
xclip — XCLIPProcessor (X-CLIP model)

Passing token=True is required when you want to use a private model.

Example

>>> from transformers import AutoProcessor

>>> # Download processor from huggingface.co and cache.
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # processor = AutoProcessor.from_pretrained("./test/saved_model/")

register

( config_class processor_class exist_ok = False )

Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
processor_class (ProcessorMixin) — The processor to register.

Register a new processor for this class.

Generic model classes

The following auto classes are available for instantiating a base model class without a specific head.

AutoModel

class transformers.AutoModel

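( *args **kwargs )

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class: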
- ASTConfig configuration class: ASTModel (Audio Spectrogram Transformer model)
- AlbertConfig configuration class: AlbertModel (ALBERT model)
- AlignConfig configuration class: AlignModel (ALIGN model)
- AltCLIPConfig configuration class: AltCLIPModel (AltCLIP model)
- ArceeConfig configuration class: ArceeModel (Arcee model)
- AriaConfig configuration class: AriaModel (Aria model)
- AriaTextConfig configuration class: AriaTextModel (AriaText model)
- AutoformerConfig configuration class: AutoformerModel (Autoformer model)
- AyaVisionConfig configuration class: AyaVisionModel (AyaVision model)
- BambaConfig configuration class: BambaModel (Bamba model)
- BarkConfig configuration class: BarkModel (Bark model)
- BartConfig configuration class: BartModel (BART model)
- BeitConfig configuration class: BeitModel (BEiT model)
- BertConfig configuration class: BertModel (BERT model)
- BertGenerationConfig configuration class: BertGenerationEncoder (Bert Generation model)
- BigBirdConfig configuration class: BigBirdModel (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusModel (BigBird-Pegasus model)
- BioGptConfig configuration class: BioGptModel (BioGpt model)
- BitConfig configuration class: BitModel (BiT model)
- BitNetConfig configuration class: BitNetModel (BitNet model)
- BlenderbotConfig configuration class: BlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallModel (BlenderbotSmall model)
- Blip2Config configuration class: Blip2Model (BLIP-2 model)
- Blip2QFormerConfig configuration class: Blip2QFormerModel (BLIP-2 QFormer model)
- BlipConfig configuration class: BlipModel (BLIP model)
- BloomConfig configuration class: BloomModel (BLOOM model)
- BridgeTowerConfig configuration class: BridgeTowerModel (BridgeTower model)
- BrosConfig configuration class: BrosModel (BROS model)
- CLIPConfig configuration class: CLIPModel (CLIP model)
- CLIPSegConfig configuration class: CLIPSegModel (CLIPSeg model)
- CLIPTextConfig configuration class: CLIPTextModel (CLIPTextModel model)
- CLIPVisionConfig configuration class: CLIPVisionModel (CLIPVisionModel model)
- CTRLConfig configuration class: CTRLModel (CTRL model)
- CamembertConfig configuration class: CamembertModel (CamemBERT model)
- CanineConfig configuration class: CanineModel (CANINE model)
- ChameleonConfig configuration class: ChameleonModel (Chameleon model)
- ChineseCLIPConfig configuration class: ChineseCLIPModel (Chinese-CLIP model)
- ChineseCLIPVisionConfig configuration class: ChineseCLIPVisionModel (ChineseCLIPVisionModel model)
- ClapConfig configuration class: ClapModel (CLAP model)
- ClvpConfig configuration class: ClvpModelForConditionalGeneration (CLVP model)
- CodeGenConfig configuration class: CodeGenModel (CodeGen model)
- Cohere2Config configuration class: Cohere2Model (Cohere2 model)
- CohereConfig configuration class: CohereModel (Cohere model)
- ConditionalDetrConfig configuration class: ConditionalDetrModel (Conditional DETR model)
- ConvBertConfig configuration class: ConvBertModel (ConvBERT model)
- ConvNextConfig configuration class: ConvNextModel (ConvNeXT model)
- ConvNextV2Config configuration class: ConvNextV2Model (ConvNeXTV2 model)
- CpmAntConfig configuration class: CpmAntModel (CPM-Ant model)
- CsmConfig configuration class: CsmForConditionalGeneration (CSM model)
- CvtConfig configuration class: CvtModel (CvT model)
- DFineConfig configuration class: DFineModel (D-FINE model)
- DPRConfig configuration class: DPRQuestionEncoder (DPR model)
- DPTConfig configuration class: DPTModel (DPT model)
- DabDetrConfig configuration class: DabDetrModel (DAB-DETR model)
- DacConfig configuration class: DacModel (DAC model)
- Data2VecAudioConfig configuration class: Data2VecAudioModel (Data2VecAudio model)
- Data2VecTextConfig configuration class: Data2VecTextModel (Data2VecText model)
- Data2VecVisionConfig configuration class: Data2VecVisionModel (Data2VecVision model)
- DbrxConfig configuration class: DbrxModel (DBRX model)
- DebertaConfig configuration class: DebertaModel (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2Model (DeBERTa-v2 model)
- DecisionTransformerConfig configuration class: DecisionTransformerModel (Decision Transformer model)
- DeepseekV3Config configuration class: DeepseekV3Model (DeepSeek-V3 model)
- DeformableDetrConfig configuration class: DeformableDetrModel (Deformable DETR model)
- DeiTConfig configuration class: DeiTModel (DeiT model)
- DepthProConfig configuration class: DepthProModel (DepthPro model)
- DetaConfig configuration class: DetaModel (DETA model)
- DetrConfig configuration class: DetrModel (DETR model)
- DiaConfig configuration class: DiaModel (Dia model)
- DiffLlamaConfig configuration class: DiffLlamaModel (DiffLlama model)
- DinatConfig configuration class: DinatModel (DiNAT model)
- Dinov2Config configuration class: Dinov2Model (DINOv2 model)
- Dinov2WithRegistersConfig configuration class: Dinov2WithRegistersModel (DINOv2 with Registers model)
- DistilBertConfig configuration class: DistilBertModel (DistilBERT model)
- DonutSwinConfig configuration class: DonutSwinModel (DonutSwin model)
- Dots1Config configuration class: Dots1Model (dots1 model)
- EfficientFormerConfig configuration class: EfficientFormerModel (EfficientFormer model)
- EfficientNetConfig configuration class: EfficientNetModel (EfficientNet model)
- ElectraConfig configuration class: ElectraModel (ELECTRA model)
- Emu3Config configuration class: Emu3Model (Emu3 model)
- EncodecConfig configuration class: EncodecModel (EnCodec model)
- ErnieConfig configuration class: ErnieModel (ERNIE model)
- ErnieMConfig configuration class: ErnieMModel (ErnieM model)
- EsmConfig configuration class: EsmModel (ESM model)
- FNetConfig configuration class: FNetModel (FNet model)
- FSMTConfig configuration class: FSMTModel (FairSeq Machine-Translation model)
- FalconConfig configuration class: FalconModel (Falcon model)
- FalconH1Config configuration class: FalconH1Model (FalconH1 model)
- FalconMambaConfig configuration class: FalconMambaModel (FalconMamba model)
- FastSpeech2ConformerConfig configuration class: FastSpeech2ConformerModel (FastSpeech2Conformer model)
- FlaubertConfig configuration class: FlaubertModel (FlauBERT model)
- FlavaConfig configuration class: FlavaModel (FLAVA model)
- FocalNetConfig configuration class: FocalNetModel (FocalNet model)
- FunnelConfig configuration class: FunnelModel or FunnelBaseModel (Funnel Transformer model)
- FuyuConfig configuration class: FuyuModel (Fuyu model)
- GLPNConfig configuration class: GLPNModel (GLPN model)
- GPT2Config configuration class: GPT2Model (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeModel (GPTBigCode model)
- GPTJConfig configuration class: GPTJModel (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoModel (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXModel (GPT NeoX model)
- GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseModel (GPT NeoX Japanese model)
- GPTSanJapaneseConfig configuration class: GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
- Gemma2Config configuration class: Gemma2Model (Gemma2 model)
- Gemma3Config configuration class: Gemma3Model (Gemma3ForConditionalGeneration model)
- Gemma3TextConfig configuration class: Gemma3TextModel (Gemma3ForCausalLM model)
- Gemma3nAudioConfig configuration class: Gemma3nAudioEncoder (Gemma3nAudioEncoder model)
- Gemma3nConfig configuration class: Gemma3nModel (Gemma3nForConditionalGeneration model)
- Gemma3nTextConfig configuration class: Gemma3nTextModel (Gemma3nForCausalLM model)
- Gemma3nVisionConfig configuration class: TimmWrapperModel (TimmWrapperModel model)
- GemmaConfig configuration class: GemmaModel (Gemma model)
- GitConfig configuration class: GitModel (GIT model)
- Glm4Config configuration class: Glm4Model (GLM4 model)
- Glm4vConfig configuration class: Glm4vModel (GLM4V model)
- Glm4vTextConfig configuration class: Glm4vTextModel (GLM4V model)
- GlmConfig configuration class: GlmModel (GLM model)
- GotOcr2Config configuration class: GotOcr2Model (GOT-OCR2 model)
- GraniteConfig configuration class: GraniteModel (Granite model)
- GraniteMoeConfig configuration class: GraniteMoeModel (GraniteMoeMoe model)
- GraniteMoeHybridConfig configuration class: GraniteMoeHybridModel (GraniteMoeHybrid model)
- GraniteMoeSharedConfig configuration class: GraniteMoeSharedModel (GraniteMoeSharedMoe model)
- GraphormerConfig configuration class: GraphormerModel (Graphormer model)
- GroundingDinoConfig configuration class: GroundingDinoModel (Grounding DINO model)
- GroupViTConfig configuration class: GroupViTModel (GroupViT model)
- HGNetV2Config configuration class: HGNetV2Backbone (HGNet-V2 model)
- HeliumConfig configuration class: HeliumModel (Helium model)
- HieraConfig configuration class: HieraModel (Hiera model)
- HubertConfig configuration class: HubertModel (Hubert model)
- IBertConfig configuration class: IBertModel (I-BERT model)
- IJepaConfig configuration class: IJepaModel (I-JEPA model)
- Idefics2Config configuration class: Idefics2Model (Idefics2 model)
- Idefics3Config configuration class: Idefics3Model (Idefics3 model)
- Idefics3VisionConfig configuration class: Idefics3VisionTransformer (Idefics3VisionTransformer model)
- IdeficsConfig configuration class: IdeficsModel (IDEFICS model)
- ImageGPTConfig configuration class: ImageGPTModel (ImageGPT model)
- InformerConfig configuration class: InformerModel (Informer model)
- InstructBlipConfig configuration class: InstructBlipModel (InstructBLIP model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoModel (InstructBlipVideo model)
- InternVLConfig configuration class: InternVLModel (InternVL model)
- InternVLVisionConfig configuration class: InternVLVisionModel (InternVLVision model)
- JambaConfig configuration class: JambaModel (Jamba model)
- JanusConfig configuration class: JanusModel (Janus model)
- JetMoeConfig configuration class: JetMoeModel (JetMoe model)
- JukeboxConfig configuration class: JukeboxModel (Jukebox model)
- Kosmos2Config configuration class: Kosmos2Model (KOSMOS-2 model)
- KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextModel (KyutaiSpeechToText model)
- LEDConfig configuration class: LEDModel (LED model)
- LayoutLMConfig configuration class: LayoutLMModel (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2Model (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3Model (LayoutLMv3 model)
- LevitConfig configuration class: LevitModel (LeViT model)
- LightGlueConfig configuration class: LightGlueForKeypointMatching (LightGlue model)
- LiltConfig configuration class: LiltModel (LiLT model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4 model)
- Llama4TextConfig configuration class: Llama4TextModel (Llama4ForCausalLM model)
- LlamaConfig configuration class: LlamaModel (LLaMA model)
- LlavaConfig configuration class: LlavaModel (LLaVa model)
- LlavaNextConfig configuration class: LlavaNextModel (LLaVA-NeXT model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoModel (LLaVa-NeXT-Video model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionModel (LLaVA-Onevision model)
- LongT5Config configuration class: LongT5Model (LongT5 model)
- LongformerConfig configuration class: LongformerModel (Longformer model)
- LukeConfig configuration class: LukeModel (LUKE model)
- LxmertConfig configuration class: LxmertModel (LXMERT model)
- M2M100Config configuration class: M2M100Model (M2M100 model)
- MBartConfig configuration class: MBartModel (mBART model)
- MCTCTConfig configuration class: MCTCTModel (M-CTC-T model)
- MLCDVisionConfig configuration class: MLCDVisionModel (MLCD model)
- MPNetConfig configuration class: MPNetModel (MPNet model)
- MT5Config configuration class: MT5Model (MT5 model)
- Mamba2Config configuration class: Mamba2Model (mamba2 model)
- MambaConfig configuration class: MambaModel (Mamba model)
- MarianConfig configuration class: MarianModel (Marian model)
- MarkupLMConfig configuration class: MarkupLMModel (MarkupLM model)
- Mask2FormerConfig configuration class: Mask2FormerModel (Mask2Former model)
- MaskFormerConfig configuration class: MaskFormerModel (MaskFormer model)
- MaskFormerSwinConfig configuration class: MaskFormerSwinModel (MaskFormerSwin model)
- MegaConfig configuration class: MegaModel (MEGA model)
- MegatronBertConfig configuration class: MegatronBertModel (Megatron-BERT model)
- MgpstrConfig configuration class: MgpstrForSceneTextRecognition (MGP-STR model)
- MimiConfig configuration class: MimiModel (Mimi model)
- MiniMaxConfig configuration class: MiniMaxModel (MiniMax model)
- Mistral3Config configuration class: Mistral3Model (Mistral3 model)
- MistralConfig configuration class: MistralModel (Mistral model)
- MixtralConfig configuration class: MixtralModel (Mixtral model)
- MllamaConfig configuration class: MllamaModel (Mllama model)
- MobileBertConfig configuration class: MobileBertModel (MobileBERT model)
- MobileNetV1Config configuration class: MobileNetV1Model (MobileNetV1 model)
- MobileNetV2Config configuration class: MobileNetV2Model (MobileNetV2 model)
- MobileViTConfig configuration class: MobileViTModel (MobileViT model)
- MobileViTV2Config configuration class: MobileViTV2Model (MobileViTV2 model)
- ModernBertConfig configuration class: ModernBertModel (ModernBERT model)
- MoonshineConfig configuration class: MoonshineModel (Moonshine model)
- MoshiConfig configuration class: MoshiModel (Moshi model)
- MptConfig configuration class: MptModel (MPT model)
- MraConfig configuration class: MraModel (MRA model)
- MusicgenConfig configuration class: MusicgenModel (MusicGen model)
- MusicgenMelodyConfig configuration class: MusicgenMelodyModel (MusicGen Melody model)
- MvpConfig configuration class: MvpModel (MVP model)
- NatConfig configuration class: NatModel (NAT model)
- NemotronConfig configuration class: NemotronModel (Nemotron model)
- NezhaConfig configuration class: NezhaModel (Nezha model)
- NllbMoeConfig configuration class: NllbMoeModel (NLLB-MOE model)
- NystromformerConfig configuration class: NystromformerModel (Nyströmformer model)
- OPTConfig configuration class: OPTModel (OPT model)
- Olmo2Config configuration class: Olmo2Model (OLMo2 model)
- OlmoConfig configuration class: OlmoModel (OLMo model)
- OlmoeConfig configuration class: OlmoeModel (OLMoE model)
- OmDetTurboConfig configuration class: OmDetTurboForObjectDetection (OmDet-Turbo model)
- OneFormerConfig configuration class: OneFormerModel (OneFormer model)
- OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAI GPT model)
- OpenLlamaConfig configuration class: OpenLlamaModel (OpenLlama model)
- OwlViTConfig configuration class: OwlViTModel (OWL-ViT model)
- Owlv2Config configuration class: Owlv2Model (OWLv2 model)
- PLBartConfig configuration class: PLBartModel (PLBart model)
- PaliGemmaConfig configuration class: PaliGemmaModel (PaliGemma model)
- PatchTSMixerConfig configuration class: PatchTSMixerModel (PatchTSMixer model)
- PatchTSTConfig configuration class: PatchTSTModel (PatchTST model)
- PegasusConfig configuration class: PegasusModel (Pegasus model)
- PegasusXConfig configuration class: PegasusXModel (PEGASUS-X model)
- PerceiverConfig configuration class: PerceiverModel (Perceiver model)
- PersimmonConfig configuration class: PersimmonModel (Persimmon model)
- Phi3Config configuration class: Phi3Model (Phi3 model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalModel (Phi4Multimodal model)
- PhiConfig configuration class: PhiModel (Phi model)
- PhimoeConfig configuration class: PhimoeModel (Phimoe model)
- PixtralVisionConfig configuration class: PixtralVisionModel (Pixtral model)
- PoolFormerConfig configuration class: PoolFormerModel (PoolFormer model)
- ProphetNetConfig configuration class: ProphetNetModel (ProphetNet model)
- PvtConfig configuration class: PvtModel (PVT model)
- PvtV2Config configuration class: PvtV2Model (PVTv2 model)
- QDQBertConfig configuration class: QDQBertModel (QDQBert model)
- Qwen2AudioEncoderConfig configuration class: Qwen2AudioEncoder (Qwen2AudioEncoder model)
- Qwen2Config configuration class: Qwen2Model (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeModel (Qwen2MoE model)
- Qwen2VLConfig configuration class: Qwen2VLModel (Qwen2VL model)
- Qwen2VLTextConfig configuration class: Qwen2VLTextModel (Qwen2VL model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLModel (Qwen2_5_VL model)
- Qwen2_5_VLTextConfig configuration class: Qwen2_5_VLTextModel (Qwen2_5_VL model)
- Qwen3Config configuration class: Qwen3Model (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeModel (Qwen3MoE model)
- RTDetrConfig configuration class: RTDetrModel (RT-DETR model)
- RTDetrV2Config configuration class: RTDetrV2Model (RT-DETRv2 model)
- RecurrentGemmaConfig configuration class: RecurrentGemmaModel (RecurrentGemma model)
- ReformerConfig configuration class: ReformerModel (Reformer model)
- RegNetConfig configuration class: RegNetModel (RegNet model)
- RemBertConfig configuration class: RemBertModel (RemBERT model)
- ResNetConfig configuration class: ResNetModel (ResNet model)
- RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
- RoCBertConfig configuration class: RoCBertModel (RoCBert model)
- RoFormerConfig configuration class: RoFormerModel (RoFormer model)
- RobertaConfig configuration class: RobertaModel (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
- RwkvConfig configuration class: RwkvModel (RWKV model)
- SEWConfig configuration class: SEWModel (SEW model)
- SEWDConfig configuration class: SEWDModel (SEW-D model)
- SamConfig configuration class: SamModel (SAM model)
- SamHQConfig configuration class: SamHQModel (SAM-HQ model)
- SamHQVisionConfig configuration class: SamHQVisionModel (SamHQVisionModel model)
- SamVisionConfig configuration class: SamVisionModel (SamVisionModel model)
- SeamlessM4TConfig configuration class: SeamlessM4TModel (SeamlessM4T model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2Model (SeamlessM4Tv2 model)
- SegGptConfig configuration class: SegGptModel (SegGPT model)
- SegformerConfig configuration class: SegformerModel (SegFormer model)
- Siglip2Config configuration class: Siglip2Model (SigLIP2 model)
- SiglipConfig configuration class: SiglipModel (SigLIP model)
- SiglipVisionConfig configuration class: SiglipVisionModel (SiglipVisionModel model)
- SmolLM3Config configuration class: SmolLM3Model (SmolLM3 model)
- SmolVLMConfig configuration class: SmolVLMModel (SmolVLM model)
- SmolVLMVisionConfig configuration class: SmolVLMVisionTransformer (SmolVLMVisionTransformer model)
- Speech2TextConfig configuration class: Speech2TextModel (Speech2Text model)
- SpeechT5Config configuration class: SpeechT5Model (SpeechT5 model)
- SplinterConfig configuration class: SplinterModel (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBERT model)
- StableLmConfig configuration class: StableLmModel (StableLm model)
- Starcoder2Config configuration class: Starcoder2Model (Starcoder2 model)
- SuperGlueConfig configuration class: SuperGlueForKeypointMatching (SuperGlue model)
- SwiftFormerConfig configuration class: SwiftFormerModel (SwiftFormer model)
- Swin2SRConfig configuration class: Swin2SRModel (Swin2SR model)
- SwinConfig configuration class: SwinModel (Swin Transformer model)
- Swinv2Config configuration class: Swinv2Model (Swin Transformer V2 model)
- SwitchTransformersConfig configuration class: SwitchTransformersModel (SwitchTransformers model)
- T5Config configuration class: T5Model (T5 model)
- T5GemmaConfig configuration class: T5GemmaModel (T5Gemma model)
- TableTransformerConfig configuration class: TableTransformerModel (Table Transformer model)
- TapasConfig configuration class: TapasModel (TAPAS model)
- TextNetConfig configuration class: TextNetModel (TextNet model)
- TimeSeriesTransformerConfig configuration class: TimeSeriesTransformerModel (Time Series Transformer model)
- TimesFmConfig configuration class: TimesFmModel (TimesFm model)
- TimesformerConfig configuration class: TimesformerModel (TimeSformer model)
- TimmBackboneConfig configuration class: TimmBackbone (TimmBackbone model)
- TimmWrapperConfig configuration class: TimmWrapperModel (TimmWrapperModel model)
- TrajectoryTransformerConfig configuration class: TrajectoryTransformerModel (Trajectory Transformer model)
- TransfoXLConfig configuration class: TransfoXLModel (Transformer-XL model)
- TvltConfig configuration class: TvltModel (TVLT model)
- TvpConfig configuration class: TvpModel (TVP model)
- UMT5Config configuration class: UMT5Model (UMT5 model)
- UdopConfig configuration class: UdopModel (UDOP model)
- UniSpeechConfig configuration class: UniSpeechModel (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatModel (UniSpeechSat model)
- UnivNetConfig configuration class: UnivNetModel (UnivNet model)
- VJEPA2Config configuration class: VJEPA2Model (VJEPA2Model model)
- VanConfig configuration class: VanModel (VAN model)
- ViTConfig configuration class: ViTModel (ViT model)
- ViTHybridConfig configuration class: ViTHybridModel (ViT Hybrid model)
- ViTMAEConfig configuration class: ViTMAEModel (ViTMAE model)
- ViTMSNConfig configuration class: ViTMSNModel (ViTMSN model)
- VideoLlavaConfig configuration class: VideoLlavaModel (VideoLlava model)
- VideoMAEConfig configuration class: VideoMAEModel (VideoMAE model)
- ViltConfig configuration class: ViltModel (ViLT model)
- VipLlavaConfig configuration class: VipLlavaModel (VipLlava model)
- VisionTextDualEncoderConfig configuration class: VisionTextDualEncoderModel (VisionTextDualEncoder model)
- VisualBertConfig configuration class: VisualBertModel (VisualBERT model)
- VitDetConfig configuration class: VitDetModel (VitDet model)
- VitsConfig configuration class: VitsModel (VITS model)
- VivitConfig configuration class: VivitModel (ViViT model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertModel (Wav2Vec2-BERT model)
- Wav2Vec2Config configuration class: Wav2Vec2Model (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerModel (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMModel (WavLM model)
- WhisperConfig configuration class: WhisperModel (Whisper model)
- XCLIPConfig configuration class: XCLIPModel (X-CLIP model)
- XGLMConfig configuration class: XGLMModel (XGLM model)
- XLMConfig configuration class: XLMModel (XLM model)
- XLMProphetNetConfig configuration class: XLMProphetNetModel (XLM-ProphetNet model)
- XLMRobertaConfig configuration class: XLMRobertaModel (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLModel (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetModel (XLNet model)
- XmodConfig configuration class: XmodModel (X-MOD model)
- YolosConfig configuration class: YolosModel (YOLOS model)
- YosoConfig configuration class: YosoModel (YOSO model)
- Zamba2Config configuration class: Zamba2Model (Zamba2 model)
- ZambaConfig configuration class: ZambaModel (Zamba model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA is used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

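Instantiates one of the base model classes of the library from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example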

>>> from transformers import AutoConfig, AutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModel.from_config(config)
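
As a rough illustration of the note above (from_config builds the architecture only, while from_pretrained also restores saved weights), here is a toy pure-Python model. ToyModel, its list-of-floats "weights" and the checkpoint dictionary are all hypothetical and far simpler than the real classes; the sketch only shows the difference in where the parameters come from.

```python
import random


class ToyModel:
    """Hypothetical model whose 'weights' are a flat list of floats."""

    def __init__(self, config):
        self.config = config
        # Fresh parameters: nothing here comes from a checkpoint.
        self.weights = [random.gauss(0.0, 0.02) for _ in range(config["hidden_size"])]

    @classmethod
    def from_config(cls, config):
        # Architecture only; the weights keep their random initialization.
        return cls(config)

    @classmethod
    def from_pretrained(cls, checkpoint):
        # Build the same architecture, then overwrite with the saved weights.
        model = cls(checkpoint["config"])
        model.weights = list(checkpoint["weights"])
        return model


checkpoint = {"config": {"hidden_size": 4}, "weights": [0.1, 0.2, 0.3, 0.4]}
untrained = ToyModel.from_config(checkpoint["config"])
trained = ToyModel.from_pretrained(checkpoint)
```

Both objects share the same configuration, but only trained carries the values stored in the checkpoint.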

from_pretrained

( *model_args **kwargs )

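Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute are passed to the underlying model's __init__ function.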
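
The two kwargs cases above can be sketched as a small routing function. This is a simplified pure-Python model rather than the library's code: ToyConfig and route_kwargs are hypothetical, and "loading" the configuration is reduced to constructing a default ToyConfig.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, hidden_size=8, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def route_kwargs(config=None, **kwargs):
    """Return (config, model_init_kwargs) following the two cases above."""
    if config is not None:
        # Case 1: a config was supplied, so every kwarg goes to the model's __init__.
        return config, dict(kwargs)
    # Case 2: no config was supplied; stand in for auto-loading with a default one,
    # then let keys that match configuration attributes override them.
    config = ToyConfig()
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            model_kwargs[key] = value  # left over for the model's __init__
    return config, model_kwargs


auto_config, auto_rest = route_kwargs(output_attentions=True, extra_flag=1)
given_config, given_rest = route_kwargs(ToyConfig(), output_attentions=True)
```

In the first call output_attentions updates the configuration and only extra_flag is left for the model; in the second call the supplied configuration is untouched and output_attentions is forwarded as a model argument.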

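Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path.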

albert — AlbertModel (ALBERT 模型)
align — AlignModel (ALIGN 模型)
altclip — AltCLIPModel (AltCLIP 模型)
arcee — ArceeModel (Arcee 模型)
aria — AriaModel (Aria 模型)
aria_text — AriaTextModel (AriaText 模型)
audio-spectrogram-transformer — ASTModel (Audio Spectrogram Transformer 模型)
autoformer — AutoformerModel (Autoformer model)
aya_vision — AyaVisionModel (AyaVision model)
bamba — BambaModel (Bamba model)
bark — BarkModel (Bark model)
bart — BartModel (BART model)
beit — BeitModel (BEiT model)
bert — BertModel (BERT model)
bert-generation — BertGenerationEncoder (Bert Generation model)
big_bird — BigBirdModel (BigBird model)
bigbird_pegasus — BigBirdPegasusModel (BigBird-Pegasus model)
biogpt — BioGptModel (BioGpt model)
bit — BitModel (BiT model)
bitnet — BitNetModel (BitNet model)
blenderbot — BlenderbotModel (Blenderbot model)
blenderbot-small — BlenderbotSmallModel (BlenderbotSmall model)
blip — BlipModel (BLIP model)
blip-2 — Blip2Model (BLIP-2 model)
blip_2_qformer — Blip2QFormerModel (BLIP-2 QFormer model)
bloom — BloomModel (BLOOM model)
bridgetower — BridgeTowerModel (BridgeTower model)
bros — BrosModel (BROS model)
camembert — CamembertModel (CamemBERT model)
canine — CanineModel (CANINE model)
chameleon — ChameleonModel (Chameleon model)
chinese_clip — ChineseCLIPModel (Chinese-CLIP model)
chinese_clip_vision_model — ChineseCLIPVisionModel (ChineseCLIPVisionModel model)
clap — ClapModel (CLAP model)
clip — CLIPModel (CLIP model)
clip_text_model — CLIPTextModel (CLIPTextModel model)
clip_vision_model — CLIPVisionModel (CLIPVisionModel model)
clipseg — CLIPSegModel (CLIPSeg model)
clvp — ClvpModelForConditionalGeneration (CLVP model)
code_llama — LlamaModel (CodeLlama model)
codegen — CodeGenModel (CodeGen model)
cohere — CohereModel (Cohere model)
cohere2 — Cohere2Model (Cohere2 model)
conditional_detr — ConditionalDetrModel (Conditional DETR model)
convbert — ConvBertModel (ConvBERT model)
convnext — ConvNextModel (ConvNeXT model)
convnextv2 — ConvNextV2Model (ConvNeXTV2 model)
cpmant — CpmAntModel (CPM-Ant model)
csm — CsmForConditionalGeneration (CSM model)
ctrl — CTRLModel (CTRL model)
cvt — CvtModel (CvT model)
d_fine — DFineModel (D-FINE model)
dab-detr — DabDetrModel (DAB-DETR model)
dac — DacModel (DAC model)
data2vec-audio — Data2VecAudioModel (Data2VecAudio model)
data2vec-text — Data2VecTextModel (Data2VecText model)
data2vec-vision — Data2VecVisionModel (Data2VecVision model)
dbrx — DbrxModel (DBRX model)
deberta — DebertaModel (DeBERTa model)
deberta-v2 — DebertaV2Model (DeBERTa-v2 model)
decision_transformer — DecisionTransformerModel (Decision Transformer model)
deepseek_v3 — DeepseekV3Model (DeepSeek-V3 model)
deformable_detr — DeformableDetrModel (Deformable DETR model)
deit — DeiTModel (DeiT model)
depth_pro — DepthProModel (DepthPro model)
deta — DetaModel (DETA model)
detr — DetrModel (DETR model)
dia — DiaModel (Dia model)
diffllama — DiffLlamaModel (DiffLlama model)
dinat — DinatModel (DiNAT model)
dinov2 — Dinov2Model (DINOv2 model)
dinov2_with_registers — Dinov2WithRegistersModel (DINOv2 with Registers model)
distilbert — DistilBertModel (DistilBERT model)
donut-swin — DonutSwinModel (DonutSwin model)
dots1 — Dots1Model (dots1 model)
dpr — DPRQuestionEncoder (DPR model)
dpt — DPTModel (DPT model)
efficientformer — EfficientFormerModel (EfficientFormer model)
efficientnet — EfficientNetModel (EfficientNet model)
electra — ElectraModel (ELECTRA model)
emu3 — Emu3Model (Emu3 model)
encodec — EncodecModel (EnCodec model)
ernie — ErnieModel (ERNIE model)
ernie_m — ErnieMModel (ErnieM model)
esm — EsmModel (ESM model)
falcon — FalconModel (Falcon model)
falcon_h1 — FalconH1Model (FalconH1 model)
falcon_mamba — FalconMambaModel (FalconMamba model)
fastspeech2_conformer — FastSpeech2ConformerModel (FastSpeech2Conformer model)
flaubert — FlaubertModel (FlauBERT model)
flava — FlavaModel (FLAVA model)
fnet — FNetModel (FNet model)
focalnet — FocalNetModel (FocalNet model)
fsmt — FSMTModel (FairSeq Machine-Translation model)
funnel — FunnelModel or FunnelBaseModel (Funnel Transformer model)
fuyu — FuyuModel (Fuyu model)
gemma — GemmaModel (Gemma model)
gemma2 — Gemma2Model (Gemma2 model)
gemma3 — Gemma3Model (Gemma3ForConditionalGeneration model)
gemma3_text — Gemma3TextModel (Gemma3ForCausalLM model)
gemma3n — Gemma3nModel (Gemma3nForConditionalGeneration model)
gemma3n_audio — Gemma3nAudioEncoder (Gemma3nAudioEncoder model)
gemma3n_text — Gemma3nTextModel (Gemma3nForCausalLM model)
gemma3n_vision — TimmWrapperModel (TimmWrapperModel model)
git — GitModel (GIT model)
glm — GlmModel (GLM model)
glm4 — Glm4Model (GLM4 model)
glm4v — Glm4vModel (GLM4V model)
glm4v_text — Glm4vTextModel (GLM4V model)
glpn — GLPNModel (GLPN model)
got_ocr2 — GotOcr2Model (GOT-OCR2 model)
gpt-sw3 — GPT2Model (GPT-Sw3 model)
gpt2 — GPT2Model (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeModel (GPTBigCode model)
gpt_neo — GPTNeoModel (GPT Neo model)
gpt_neox — GPTNeoXModel (GPT NeoX model)
gpt_neox_japanese — GPTNeoXJapaneseModel (GPT NeoX Japanese model)
gptj — GPTJModel (GPT-J model)
gptsan-japanese — GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
granite — GraniteModel (Granite model)
granitemoe — GraniteMoeModel (GraniteMoeMoe model)
granitemoehybrid — GraniteMoeHybridModel (GraniteMoeHybrid model)
granitemoeshared — GraniteMoeSharedModel (GraniteMoeSharedMoe model)
graphormer — GraphormerModel (Graphormer model)
grounding-dino — GroundingDinoModel (Grounding DINO model)
groupvit — GroupViTModel (GroupViT model)
helium — HeliumModel (Helium model)
hgnet_v2 — HGNetV2Backbone (HGNet-V2 model)
hiera — HieraModel (Hiera model)
hubert — HubertModel (Hubert model)
ibert — IBertModel (I-BERT model)
idefics — IdeficsModel (IDEFICS model)
idefics2 — Idefics2Model (Idefics2 model)
idefics3 — Idefics3Model (Idefics3 model)
idefics3_vision — Idefics3VisionTransformer (Idefics3VisionTransformer model)
ijepa — IJepaModel (I-JEPA model)
imagegpt — ImageGPTModel (ImageGPT model)
informer — InformerModel (Informer model)
instructblip — InstructBlipModel (InstructBLIP model)
instructblipvideo — InstructBlipVideoModel (InstructBlipVideo model)
internvl — InternVLModel (InternVL model)
internvl_vision — InternVLVisionModel (InternVLVision model)
jamba — JambaModel (Jamba model)
janus — JanusModel (Janus model)
jetmoe — JetMoeModel (JetMoe model)
jukebox — JukeboxModel (Jukebox model)
kosmos-2 — Kosmos2Model (KOSMOS-2 model)
kyutai_speech_to_text — KyutaiSpeechToTextModel (KyutaiSpeechToText model)
layoutlm — LayoutLMModel (LayoutLM model)
layoutlmv2 — LayoutLMv2Model (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3Model (LayoutLMv3 model)
led — LEDModel (LED model)
levit — LevitModel (LeViT model)
lightglue — LightGlueForKeypointMatching (LightGlue model)
lilt — LiltModel (LiLT model)
llama — LlamaModel (LLaMA model)
llama4 — Llama4ForConditionalGeneration (Llama4 model)
llama4_text — Llama4TextModel (Llama4ForCausalLM model)
llava — LlavaModel (LLaVa model)
llava_next — LlavaNextModel (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoModel (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionModel (LLaVA-Onevision model)
longformer — LongformerModel (Longformer model)
longt5 — LongT5Model (LongT5 model)
luke — LukeModel (LUKE model)
lxmert — LxmertModel (LXMERT model)
m2m_100 — M2M100Model (M2M100 model)
mamba — MambaModel (Mamba model)
mamba2 — Mamba2Model (mamba2 model)
marian — MarianModel (Marian model)
markuplm — MarkupLMModel (MarkupLM model)
mask2former — Mask2FormerModel (Mask2Former model)
maskformer — MaskFormerModel (MaskFormer model)
maskformer-swin — MaskFormerSwinModel (MaskFormerSwin model)
mbart — MBartModel (mBART model)
mctct — MCTCTModel (M-CTC-T model)
mega — MegaModel (MEGA model)
megatron-bert — MegatronBertModel (Megatron-BERT model)
mgp-str — MgpstrForSceneTextRecognition (MGP-STR model)
mimi — MimiModel (Mimi model)
minimax — MiniMaxModel (MiniMax model)
mistral — MistralModel (Mistral model)
mistral3 — Mistral3Model (Mistral3 model)
mixtral — MixtralModel (Mixtral model)
mlcd — MLCDVisionModel (MLCD model)
mllama — MllamaModel (Mllama model)
mobilebert — MobileBertModel (MobileBERT model)
mobilenet_v1 — MobileNetV1Model (MobileNetV1 model)
mobilenet_v2 — MobileNetV2Model (MobileNetV2 model)
mobilevit — MobileViTModel (MobileViT model)
mobilevitv2 — MobileViTV2Model (MobileViTV2 model)
modernbert — ModernBertModel (ModernBERT model)
moonshine — MoonshineModel (Moonshine model)
moshi — MoshiModel (Moshi model)
mpnet — MPNetModel (MPNet model)
mpt — MptModel (MPT model)
mra — MraModel (MRA model)
mt5 — MT5Model (MT5 model)
musicgen — MusicgenModel (MusicGen model)
musicgen_melody — MusicgenMelodyModel (MusicGen Melody model)
mvp — MvpModel (MVP model)
nat — NatModel (NAT model)
nemotron — NemotronModel (Nemotron model)
nezha — NezhaModel (Nezha model)
nllb-moe — NllbMoeModel (NLLB-MOE model)
nystromformer — NystromformerModel (Nyströmformer model)
olmo — OlmoModel (OLMo model)
olmo2 — Olmo2Model (OLMo2 model)
olmoe — OlmoeModel (OLMoE model)
omdet-turbo — OmDetTurboForObjectDetection (OmDet-Turbo model)
oneformer — OneFormerModel (OneFormer model)
open-llama — OpenLlamaModel (OpenLlama model)
openai-gpt — OpenAIGPTModel (OpenAI GPT model)
opt — OPTModel (OPT model)
owlv2 — Owlv2Model (OWLv2 model)
owlvit — OwlViTModel (OWL-ViT model)
paligemma — PaliGemmaModel (PaliGemma model)
patchtsmixer — PatchTSMixerModel (PatchTSMixer model)
patchtst — PatchTSTModel (PatchTST model)
pegasus — PegasusModel (Pegasus model)
pegasus_x — PegasusXModel (PEGASUS-X model)
perceiver — PerceiverModel (Perceiver model)
persimmon — PersimmonModel (Persimmon model)
phi — PhiModel (Phi model)
phi3 — Phi3Model (Phi3 model)
phi4_multimodal — Phi4MultimodalModel (Phi4Multimodal model)
phimoe — PhimoeModel (Phimoe model)
pixtral — PixtralVisionModel (Pixtral model)
plbart — PLBartModel (PLBart model)
poolformer — PoolFormerModel (PoolFormer model)
prophetnet — ProphetNetModel (ProphetNet model)
pvt — PvtModel (PVT model)
pvt_v2 — PvtV2Model (PVTv2 model)
qdqbert — QDQBertModel (QDQBert model)
qwen2 — Qwen2Model (Qwen2 model)
qwen2_5_vl — Qwen2_5_VLModel (Qwen2_5_VL model)
qwen2_5_vl_text — Qwen2_5_VLTextModel (Qwen2_5_VL model)
qwen2_audio_encoder — Qwen2AudioEncoder (Qwen2AudioEncoder model)
qwen2_moe — Qwen2MoeModel (Qwen2MoE model)
qwen2_vl — Qwen2VLModel (Qwen2VL model)
qwen2_vl_text — Qwen2VLTextModel (Qwen2VL model)
qwen3 — Qwen3Model (Qwen3 model)
qwen3_moe — Qwen3MoeModel (Qwen3MoE model)
recurrent_gemma — RecurrentGemmaModel (RecurrentGemma model)
reformer — ReformerModel (Reformer model)
regnet — RegNetModel (RegNet model)
rembert — RemBertModel (RemBERT model)
resnet — ResNetModel (ResNet model)
retribert — RetriBertModel (RetriBERT model)
roberta — RobertaModel (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertModel (RoCBert model)
roformer — RoFormerModel (RoFormer model)
rt_detr — RTDetrModel (RT-DETR model)
rt_detr_v2 — RTDetrV2Model (RT-DETRv2 model)
rwkv — RwkvModel (RWKV model)
sam — SamModel (SAM model)
sam_hq — SamHQModel (SAM-HQ model)
sam_hq_vision_model — SamHQVisionModel (SamHQVisionModel model)
sam_vision_model — SamVisionModel (SamVisionModel model)
seamless_m4t — SeamlessM4TModel (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4Tv2Model (SeamlessM4Tv2 model)
segformer — SegformerModel (SegFormer model)
seggpt — SegGptModel (SegGPT model)
sew — SEWModel (SEW model)
sew-d — SEWDModel (SEW-D model)
siglip — SiglipModel (SigLIP model)
siglip2 — Siglip2Model (SigLIP2 model)
siglip_vision_model — SiglipVisionModel (SiglipVisionModel model)
smollm3 — SmolLM3Model (SmolLM3 model)
smolvlm — SmolVLMModel (SmolVLM model)
smolvlm_vision — SmolVLMVisionTransformer (SmolVLMVisionTransformer model)
speech_to_text — Speech2TextModel (Speech2Text model)
speecht5 — SpeechT5Model (SpeechT5 model)
splinter — SplinterModel (Splinter model)
squeezebert — SqueezeBertModel (SqueezeBERT model)
stablelm — StableLmModel (StableLm model)
starcoder2 — Starcoder2Model (Starcoder2 model)
superglue — SuperGlueForKeypointMatching (SuperGlue model)
swiftformer — SwiftFormerModel (SwiftFormer model)
swin — SwinModel (Swin Transformer model)
swin2sr — Swin2SRModel (Swin2SR model)
swinv2 — Swinv2Model (Swin Transformer V2 model)
switch_transformers — SwitchTransformersModel (SwitchTransformers model)
t5 — T5Model (T5 model)
t5gemma — T5GemmaModel (T5Gemma model)
table-transformer — TableTransformerModel (Table Transformer model)
tapas — TapasModel (TAPAS model)
textnet — TextNetModel (TextNet model)
time_series_transformer — TimeSeriesTransformerModel (Time Series Transformer model)
timesfm — TimesFmModel (TimesFm model)
timesformer — TimesformerModel (TimeSformer model)
timm_backbone — TimmBackbone (TimmBackbone model)
timm_wrapper — TimmWrapperModel (TimmWrapperModel model)
trajectory_transformer — TrajectoryTransformerModel (Trajectory Transformer model)
transfo-xl — TransfoXLModel (Transformer-XL model)
tvlt — TvltModel (TVLT model)
tvp — TvpModel (TVP model)
udop — UdopModel (UDOP model)
umt5 — UMT5Model (UMT5 model)
unispeech — UniSpeechModel (UniSpeech model)
unispeech-sat — UniSpeechSatModel (UniSpeechSat model)
univnet — UnivNetModel (UnivNet model)
van — VanModel (VAN model)
video_llava — VideoLlavaModel (VideoLlava model)
videomae — VideoMAEModel (VideoMAE model)
vilt — ViltModel (ViLT model)
vipllava — VipLlavaModel (VipLlava model)
vision-text-dual-encoder — VisionTextDualEncoderModel (VisionTextDualEncoder model)
visual_bert — VisualBertModel (VisualBERT model)
vit — ViTModel (ViT model)
vit_hybrid — ViTHybridModel (ViT Hybrid model)
vit_mae — ViTMAEModel (ViTMAE model)
vit_msn — ViTMSNModel (ViTMSN model)
vitdet — VitDetModel (VitDet model)
vits — VitsModel (VITS model)
vivit — VivitModel (ViViT model)
vjepa2 — VJEPA2Model (VJEPA2Model model)
wav2vec2 — Wav2Vec2Model (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2BertModel (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2ConformerModel (Wav2Vec2-Conformer model)
wavlm — WavLMModel (WavLM model)
whisper — WhisperModel (Whisper model)
xclip — XCLIPModel (X-CLIP model)
xglm — XGLMModel (XGLM model)
xlm — XLMModel (XLM model)
xlm-prophetnet — XLMProphetNetModel (XLM-ProphetNet model)
xlm-roberta — XLMRobertaModel (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLModel (XLM-RoBERTa-XL model)
xlnet — XLNetModel (XLNet model)
xmod — XmodModel (X-MOD model)
yolos — YolosModel (YOLOS model)
yoso — YosoModel (YOSO model)
zamba — ZambaModel (Zamba model)
zamba2 — Zamba2Model (Zamba2 model)

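By default, the model is set to evaluation mode using model.eval() (for example, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Examples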

>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModel.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModel

class transformers.TFAutoModel

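( *args **kwargs )

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class: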
- AlbertConfig configuration class: TFAlbertModel (ALBERT model)
- BartConfig configuration class: TFBartModel (BART model)
- BertConfig configuration class: TFBertModel (BERT model)
- BlenderbotConfig configuration class: TFBlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: TFBlenderbotSmallModel (BlenderbotSmall model)
- BlipConfig configuration class: TFBlipModel (BLIP model)
- CLIPConfig configuration class: TFCLIPModel (CLIP model)
- CTRLConfig configuration class: TFCTRLModel (CTRL model)
- CamembertConfig configuration class: TFCamembertModel (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertModel (ConvBERT model)
- ConvNextConfig configuration class: TFConvNextModel (ConvNeXT model)
- ConvNextV2Config configuration class: TFConvNextV2Model (ConvNeXTV2 model)
- CvtConfig configuration class: TFCvtModel (CvT model)
- DPRConfig configuration class: TFDPRQuestionEncoder (DPR model)
- Data2VecVisionConfig configuration class: TFData2VecVisionModel (Data2VecVision model)
- DebertaConfig configuration class: TFDebertaModel (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2Model (DeBERTa-v2 model)
- DeiTConfig configuration class: TFDeiTModel (DeiT model)
- DistilBertConfig configuration class: TFDistilBertModel (DistilBERT model)
- EfficientFormerConfig configuration class: TFEfficientFormerModel (EfficientFormer model)
- ElectraConfig configuration class: TFElectraModel (ELECTRA model)
- EsmConfig configuration class: TFEsmModel (ESM model)
- FlaubertConfig configuration class: TFFlaubertModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelModel or TFFunnelBaseModel (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2Model (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJModel (GPT-J model)
- GroupViTConfig configuration class: TFGroupViTModel (GroupViT model)
- HubertConfig configuration class: TFHubertModel (Hubert model)
- IdeficsConfig configuration class: TFIdeficsModel (IDEFICS model)
- LEDConfig configuration class: TFLEDModel (LED model)
- LayoutLMConfig configuration class: TFLayoutLMModel (LayoutLM model)
- LayoutLMv3Config configuration class: TFLayoutLMv3Model (LayoutLMv3 model)
- LongformerConfig configuration class: TFLongformerModel (Longformer model)
- LxmertConfig configuration class: TFLxmertModel (LXMERT model)
- MBartConfig configuration class: TFMBartModel (mBART model)
- MPNetConfig configuration class: TFMPNetModel (MPNet model)
- MT5Config configuration class: TFMT5Model (MT5 model)
- MarianConfig configuration class: TFMarianModel (Marian model)
- MistralConfig configuration class: TFMistralModel (Mistral model)
- MobileBertConfig configuration class: TFMobileBertModel (MobileBERT model)
- MobileViTConfig configuration class: TFMobileViTModel (MobileViT model)
- OPTConfig configuration class: TFOPTModel (OPT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTModel (OpenAI GPT model)
- PegasusConfig configuration class: TFPegasusModel (Pegasus model)
- RegNetConfig configuration class: TFRegNetModel (RegNet model)
- RemBertConfig configuration class: TFRemBertModel (RemBERT model)
- ResNetConfig configuration class: TFResNetModel (ResNet model)
- RoFormerConfig configuration class: TFRoFormerModel (RoFormer model)
- RobertaConfig configuration class: TFRobertaModel (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
- SamConfig configuration class: TFSamModel (SAM model)
- SamVisionConfig configuration class: TFSamVisionModel (SamVisionModel model)
- SegformerConfig configuration class: TFSegformerModel (SegFormer model)
- Speech2TextConfig configuration class: TFSpeech2TextModel (Speech2Text model)
- SwiftFormerConfig configuration class: TFSwiftFormerModel (SwiftFormer model)
- SwinConfig configuration class: TFSwinModel (Swin Transformer model)
- T5Config configuration class: TFT5Model (T5 model)
- TapasConfig configuration class: TFTapasModel (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLModel (Transformer-XL model)
- ViTConfig configuration class: TFViTModel (ViT model)
- ViTMAEConfig configuration class: TFViTMAEModel (ViTMAE model)
- VisionTextDualEncoderConfig configuration class: TFVisionTextDualEncoderModel (VisionTextDualEncoder model)
- Wav2Vec2Config configuration class: TFWav2Vec2Model (Wav2Vec2 model)
- WhisperConfig configuration class: TFWhisperModel (Whisper model)
- XGLMConfig configuration class: TFXGLMModel (XGLM model)
- XLMConfig configuration class: TFXLMModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaModel (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetModel (XLNet model)
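attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the base model classes of the library from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.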
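
The selection by configuration class listed above can be pictured as a simple lookup table. The following is a minimal, library-independent sketch of that idea, not the actual Transformers implementation. All names in it (`ToyBertConfig`, `ToyGPT2Config`, `ToyBertModel`, `ToyGPT2Model`, `ToyAutoModel`, `_MODEL_MAPPING`) are hypothetical and exist only for illustration, and the choice of `OSError` for direct instantiation is an assumption.

```python
class ToyBertConfig:
    """Hypothetical configuration class."""


class ToyGPT2Config:
    """Hypothetical configuration class."""


class ToyBertModel:
    def __init__(self, config):
        self.config = config


class ToyGPT2Model:
    def __init__(self, config):
        self.config = config


# Hypothetical registry: configuration class -> model class.
_MODEL_MAPPING = {
    ToyBertConfig: ToyBertModel,
    ToyGPT2Config: ToyGPT2Model,
}


class ToyAutoModel:
    def __init__(self, *args, **kwargs):
        # Direct instantiation is refused; a factory method must be used.
        raise OSError("Use ToyAutoModel.from_config(config) instead of __init__().")

    @classmethod
    def from_config(cls, config):
        # Pick the concrete model class from the type of the configuration.
        model_class = _MODEL_MAPPING.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}.")
        return model_class(config)


model = ToyAutoModel.from_config(ToyBertConfig())
print(type(model).__name__)  # ToyBertModel
```

In this sketch the caller never names the concrete model class; it is derived entirely from the type of the configuration object.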

Examples

>>> from transformers import AutoConfig, TFAutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModel.from_config(config)

from_pretrained

( *model_args **kwargs )

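Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.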
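
The two kwargs cases above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's code: `ToyConfig`, `split_kwargs`, and `toy_from_pretrained` are hypothetical names, and real configuration loading involves more steps than shown here.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, hidden_size=768, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def split_kwargs(config, kwargs):
    """Override matching config attributes; return the keys that did not match."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            unused[key] = value
    return unused


def toy_from_pretrained(config=None, **kwargs):
    if config is not None:
        # A config was supplied: kwargs go straight to the model's __init__.
        model_kwargs = dict(kwargs)
    else:
        # No config supplied: load one (here, just defaults), let kwargs
        # override its attributes, and forward the remaining keys.
        config = ToyConfig()
        model_kwargs = split_kwargs(config, kwargs)
    return config, model_kwargs


config, model_kwargs = toy_from_pretrained(output_attentions=True, custom_flag=1)
print(config.output_attentions, model_kwargs)  # True {'custom_flag': 1}
```

When a config object is passed explicitly, the same call leaves the config untouched and forwards `output_attentions=True` to the model instead.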

Instantiates one of the base model classes of the library from a pretrained model.

albert — TFAlbertModel (ALBERT model)
bart — TFBartModel (BART model)
bert — TFBertModel (BERT model)
blenderbot — TFBlenderbotModel (Blenderbot model)
blenderbot-small — TFBlenderbotSmallModel (BlenderbotSmall model)
blip — TFBlipModel (BLIP model)
camembert — TFCamembertModel (CamemBERT model)
clip — TFCLIPModel (CLIP model)
convbert — TFConvBertModel (ConvBERT model)
convnext — TFConvNextModel (ConvNeXT model)
convnextv2 — TFConvNextV2Model (ConvNeXTV2 model)
ctrl — TFCTRLModel (CTRL model)
cvt — TFCvtModel (CvT model)
data2vec-vision — TFData2VecVisionModel (Data2VecVision model)
deberta — TFDebertaModel (DeBERTa model)
deberta-v2 — TFDebertaV2Model (DeBERTa-v2 model)
deit — TFDeiTModel (DeiT model)
distilbert — TFDistilBertModel (DistilBERT model)
dpr — TFDPRQuestionEncoder (DPR model)
efficientformer — TFEfficientFormerModel (EfficientFormer model)
electra — TFElectraModel (ELECTRA model)
esm — TFEsmModel (ESM model)
flaubert — TFFlaubertModel (FlauBERT model)
funnel — TFFunnelModel or TFFunnelBaseModel (Funnel Transformer model)
gpt-sw3 — TFGPT2Model (GPT-Sw3 model)
gpt2 — TFGPT2Model (OpenAI GPT-2 model)
gptj — TFGPTJModel (GPT-J model)
groupvit — TFGroupViTModel (GroupViT model)
hubert — TFHubertModel (Hubert model)
idefics — TFIdeficsModel (IDEFICS model)
layoutlm — TFLayoutLMModel (LayoutLM model)
layoutlmv3 — TFLayoutLMv3Model (LayoutLMv3 model)
led — TFLEDModel (LED model)
longformer — TFLongformerModel (Longformer model)
lxmert — TFLxmertModel (LXMERT model)
marian — TFMarianModel (Marian model)
mbart — TFMBartModel (mBART model)
mistral — TFMistralModel (Mistral model)
mobilebert — TFMobileBertModel (MobileBERT model)
mobilevit — TFMobileViTModel (MobileViT model)
mpnet — TFMPNetModel (MPNet model)
mt5 — TFMT5Model (MT5 model)
openai-gpt — TFOpenAIGPTModel (OpenAI GPT model)
opt — TFOPTModel (OPT model)
pegasus — TFPegasusModel (Pegasus model)
regnet — TFRegNetModel (RegNet model)
rembert — TFRemBertModel (RemBERT model)
resnet — TFResNetModel (ResNet model)
roberta — TFRobertaModel (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerModel (RoFormer model)
sam — TFSamModel (SAM model)
sam_vision_model — TFSamVisionModel (SamVisionModel model)
segformer — TFSegformerModel (SegFormer model)
speech_to_text — TFSpeech2TextModel (Speech2Text model)
swiftformer — TFSwiftFormerModel (SwiftFormer model)
swin — TFSwinModel (Swin Transformer model)
t5 — TFT5Model (T5 model)
tapas — TFTapasModel (TAPAS model)
transfo-xl — TFTransfoXLModel (Transformer-XL model)
vision-text-dual-encoder — TFVisionTextDualEncoderModel (VisionTextDualEncoder model)
vit — TFViTModel (ViT model)
vit_mae — TFViTMAEModel (ViTMAE model)
wav2vec2 — TFWav2Vec2Model (Wav2Vec2 model)
whisper — TFWhisperModel (Whisper model)
xglm — TFXGLMModel (XGLM model)
xlm — TFXLMModel (XLM model)
xlm-roberta — TFXLMRobertaModel (XLM-RoBERTa model)
xlnet — TFXLNetModel (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModel

class transformers.FlaxAutoModel

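( *args **kwargs )

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class: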
- AlbertConfig configuration class: FlaxAlbertModel (ALBERT model)
- BartConfig configuration class: FlaxBartModel (BART model)
- BeitConfig configuration class: FlaxBeitModel (BEiT model)
- BertConfig configuration class: FlaxBertModel (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdModel (BigBird model)
- BlenderbotConfig configuration class: FlaxBlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: FlaxBlenderbotSmallModel (BlenderbotSmall model)
- BloomConfig configuration class: FlaxBloomModel (BLOOM model)
- CLIPConfig configuration class: FlaxCLIPModel (CLIP model)
- Dinov2Config configuration class: FlaxDinov2Model (DINOv2 model)
- DistilBertConfig configuration class: FlaxDistilBertModel (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraModel (ELECTRA model)
- GPT2Config configuration class: FlaxGPT2Model (OpenAI GPT-2 model)
- GPTJConfig configuration class: FlaxGPTJModel (GPT-J model)
- GPTNeoConfig configuration class: FlaxGPTNeoModel (GPT Neo model)
- GemmaConfig configuration class: FlaxGemmaModel (Gemma model)
- LlamaConfig configuration class: FlaxLlamaModel (LLaMA model)
- LongT5Config configuration class: FlaxLongT5Model (LongT5 model)
- MBartConfig configuration class: FlaxMBartModel (mBART model)
- MT5Config configuration class: FlaxMT5Model (MT5 model)
- MarianConfig configuration class: FlaxMarianModel (Marian model)
- MistralConfig configuration class: FlaxMistralModel (Mistral model)
- OPTConfig configuration class: FlaxOPTModel (OPT model)
- PegasusConfig configuration class: FlaxPegasusModel (Pegasus model)
- RegNetConfig configuration class: FlaxRegNetModel (RegNet model)
- ResNetConfig configuration class: FlaxResNetModel (ResNet model)
- RoFormerConfig configuration class: FlaxRoFormerModel (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaModel (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
- T5Config configuration class: FlaxT5Model (T5 model)
- ViTConfig configuration class: FlaxViTModel (ViT model)
- VisionTextDualEncoderConfig configuration class: FlaxVisionTextDualEncoderModel (VisionTextDualEncoder model)
- Wav2Vec2Config configuration class: FlaxWav2Vec2Model (Wav2Vec2 model)
- WhisperConfig configuration class: FlaxWhisperModel (Whisper model)
- XGLMConfig configuration class: FlaxXGLMModel (XGLM model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaModel (XLM-RoBERTa model)
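attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the base model classes of the library from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.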
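
The difference between building a model from a configuration and loading pretrained weights can be illustrated with a toy example. This is only a sketch under stated assumptions and does not use the Transformers or Flax APIs: `ToyConfig`, `ToyModel`, and the `checkpoint` dictionary are hypothetical stand-ins for a configuration, a model, and a saved weights file.

```python
import random


class ToyConfig:
    """Hypothetical configuration: describes the architecture only."""

    def __init__(self, hidden_size=4):
        self.hidden_size = hidden_size


class ToyModel:
    """Hypothetical model whose weights are a plain list of floats."""

    def __init__(self, config):
        self.config = config
        # Weights start out randomly initialized.
        self.weights = [random.gauss(0.0, 0.02) for _ in range(config.hidden_size)]

    @classmethod
    def from_config(cls, config):
        # Builds the architecture; no stored weights are read.
        return cls(config)

    @classmethod
    def from_pretrained(cls, config, checkpoint):
        # Builds the architecture, then overwrites the weights with stored values.
        model = cls(config)
        model.weights = list(checkpoint["weights"])
        return model


config = ToyConfig(hidden_size=4)
checkpoint = {"weights": [0.1, 0.2, 0.3, 0.4]}  # stands in for a saved weights file

fresh = ToyModel.from_config(config)
loaded = ToyModel.from_pretrained(config, checkpoint)

print(len(fresh.weights))  # 4
print(loaded.weights)      # [0.1, 0.2, 0.3, 0.4]
```

Both objects share the same architecture, but only `loaded` carries the stored values; `fresh` holds whatever the random initializer produced.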

Examples

>>> from transformers import AutoConfig, FlaxAutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModel.from_config(config)

from_pretrained

( *model_args **kwargs )

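Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.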
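
To make the output_loading_info parameter above more concrete, the sketch below shows one way "missing" and "unexpected" keys can be understood: as set differences between the parameter names a model expects and the names found in a checkpoint. This is an illustrative assumption, not the library's code; `toy_loading_info`, the dictionary key names, and the parameter names are hypothetical.

```python
def toy_loading_info(model_keys, checkpoint_keys):
    """Compare the parameter names a model expects with those in a checkpoint."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
        # No errors are simulated in this sketch.
        "error_msgs": [],
    }


info = toy_loading_info(
    model_keys=["embeddings.weight", "encoder.weight", "pooler.weight"],
    checkpoint_keys=["embeddings.weight", "encoder.weight", "lm_head.weight"],
)
print(info["missing_keys"])     # ['pooler.weight']
print(info["unexpected_keys"])  # ['lm_head.weight']
```

A non-empty list of missing keys usually means some parameters were left at their initial values and may need training before use.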

Instantiates one of the base model classes of the library from a pretrained model.

albert — FlaxAlbertModel (ALBERT model)
bart — FlaxBartModel (BART model)
beit — FlaxBeitModel (BEiT model)
bert — FlaxBertModel (BERT model)
big_bird — FlaxBigBirdModel (BigBird model)
blenderbot — FlaxBlenderbotModel (Blenderbot model)
blenderbot-small — FlaxBlenderbotSmallModel (BlenderbotSmall model)
bloom — FlaxBloomModel (BLOOM model)
clip — FlaxCLIPModel (CLIP model)
dinov2 — FlaxDinov2Model (DINOv2 model)
distilbert — FlaxDistilBertModel (DistilBERT model)
electra — FlaxElectraModel (ELECTRA model)
gemma — FlaxGemmaModel (Gemma model)
gpt-sw3 — FlaxGPT2Model (GPT-Sw3 model)
gpt2 — FlaxGPT2Model (OpenAI GPT-2 model)
gpt_neo — FlaxGPTNeoModel (GPT Neo model)
gptj — FlaxGPTJModel (GPT-J model)
llama — FlaxLlamaModel (LLaMA model)
longt5 — FlaxLongT5Model (LongT5 model)
marian — FlaxMarianModel (Marian model)
mbart — FlaxMBartModel (mBART model)
mistral — FlaxMistralModel (Mistral model)
mt5 — FlaxMT5Model (MT5 model)
opt — FlaxOPTModel (OPT model)
pegasus — FlaxPegasusModel (Pegasus model)
regnet — FlaxRegNetModel (RegNet model)
resnet — FlaxResNetModel (ResNet model)
roberta — FlaxRobertaModel (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerModel (RoFormer model)
t5 — FlaxT5Model (T5 model)
vision-text-dual-encoder — FlaxVisionTextDualEncoderModel (VisionTextDualEncoder model)
vit — FlaxViTModel (ViT model)
wav2vec2 — FlaxWav2Vec2Model (Wav2Vec2 model)
whisper — FlaxWhisperModel (Whisper model)
xglm — FlaxXGLMModel (XGLM model)
xlm-roberta — FlaxXLMRobertaModel (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

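Generic pretraining classes

The following auto classes are available for instantiating a model with a pretraining head.

AutoModelForPreTraining

class transformers.AutoModelForPreTraining

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class: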
- AlbertConfig configuration class: AlbertForPreTraining (ALBERT model)
- BartConfig configuration class: BartForConditionalGeneration (BART model)
- BertConfig configuration class: BertForPreTraining (BERT model)
- BigBirdConfig configuration class: BigBirdForPreTraining (BigBird model)
- BloomConfig configuration class: BloomForCausalLM (BLOOM model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
- ColPaliConfig configuration class: ColPaliForRetrieval (ColPali model)
- ColQwen2Config configuration class: ColQwen2ForRetrieval (ColQwen2 model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecText model)
- DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: ElectraForPreTraining (ELECTRA model)
- ErnieConfig configuration class: ErnieForPreTraining (ERNIE model)
- FNetConfig configuration class: FNetForPreTraining (FNet model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- FalconMambaConfig configuration class: FalconMambaForCausalLM (FalconMamba model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
- FlavaConfig configuration class: FlavaForPreTraining (FLAVA model)
- FunnelConfig configuration class: FunnelForPreTraining (Funnel Transformer model)
- GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeForCausalLM (GPTBigCode model)
- GPTSanJapaneseConfig configuration class: GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
- HieraConfig configuration class: HieraForPreTraining (Hiera model)
- IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2 model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3 model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IDEFICS model)
- JanusConfig configuration class: JanusForConditionalGeneration (Janus model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LLaVa model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LLaVA-NeXT model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
- LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
- LukeConfig configuration class: LukeForMaskedLM (LUKE model)
- LxmertConfig configuration class: LxmertForPreTraining (LXMERT model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
- Mamba2Config configuration class: Mamba2ForCausalLM (mamba2 model)
- MambaConfig configuration class: MambaForCausalLM (Mamba model)
- MegaConfig configuration class: MegaForMaskedLM (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForPreTraining (Megatron-BERT model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3 model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (Mllama model)
- MobileBertConfig configuration class: MobileBertForPreTraining (MobileBERT model)
- MptConfig configuration class: MptForCausalLM (MPT model)
- MraConfig configuration class: MraForMaskedLM (MRA model)
- MvpConfig configuration class: MvpForConditionalGeneration (MVP model)
- NezhaConfig configuration class: NezhaForPreTraining (Nezha model)
- NllbMoeConfig configuration class: NllbMoeForConditionalGeneration (NLLB-MOE model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemma model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2Audio model)
- RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
- RoCBertConfig configuration class: RoCBertForPreTraining (RoCBert model)
- RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- RwkvConfig configuration class: RwkvForCausalLM (RWKV model)
- SplinterConfig configuration class: SplinterForPreTraining (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
- SwitchTransformersConfig configuration class: SwitchTransformersForConditionalGeneration (SwitchTransformers model)
- T5Config configuration class: T5ForConditionalGeneration (T5 model)
- T5GemmaConfig configuration class: T5GemmaForConditionalGeneration (T5Gemma model)
- TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
- TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
- TvltConfig configuration class: TvltForPreTraining (TVLT model)
- UniSpeechConfig configuration class: UniSpeechForPreTraining (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatForPreTraining (UniSpeechSat model)
- ViTMAEConfig configuration class: ViTMAEForPreTraining (ViTMAE model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlava model)
- VideoMAEConfig configuration class: VideoMAEForPreTraining (VideoMAE model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlava model)
- VisualBertConfig configuration class: VisualBertForPreTraining (VisualBERT model)
- Wav2Vec2Config configuration class: Wav2Vec2ForPreTraining (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForPreTraining (Wav2Vec2-Conformer model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
- XmodConfig configuration class: XmodForMaskedLM (X-MOD model)
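attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples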

>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForPreTraining.from_config(config)
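
The selection rule behind from_config can be pictured as a lookup keyed on the class of the configuration object. The sketch below is a simplified illustration of that idea, not the library's implementation: `ToyBertConfig`, `ToyT5Config`, the two toy model classes, `MODEL_FOR_PRETRAINING` and `model_from_config` are all hypothetical stand-ins, defined here so that the snippet runs without Transformers installed.

```python
class ToyBertConfig:
    """Hypothetical stand-in for a configuration class."""

    def __init__(self, hidden_size=32):
        self.hidden_size = hidden_size


class ToyT5Config:
    """Second hypothetical configuration class."""

    def __init__(self, d_model=64):
        self.d_model = d_model


class ToyBertForPreTraining:
    def __init__(self, config):
        # Only the architecture is fixed here; no weights are loaded.
        self.config = config


class ToyT5ForConditionalGeneration:
    def __init__(self, config):
        self.config = config


# Hypothetical registry: configuration class -> model class.
MODEL_FOR_PRETRAINING = {
    ToyBertConfig: ToyBertForPreTraining,
    ToyT5Config: ToyT5ForConditionalGeneration,
}


def model_from_config(config):
    """Pick the model class registered for the exact type of `config`."""
    try:
        model_class = MODEL_FOR_PRETRAINING[type(config)]
    except KeyError:
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__}"
        ) from None
    return model_class(config)


model = model_from_config(ToyBertConfig(hidden_size=16))
print(type(model).__name__)  # ToyBertForPreTraining
```

An unregistered configuration type raises a ValueError in this sketch, which mirrors the idea that the auto class only knows the configuration classes in its list.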

from_pretrained

( *model_args **kwargs )

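Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.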
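
The second kwargs case (no config supplied) boils down to splitting the keyword arguments into configuration overrides and leftovers for the model constructor. The following is a minimal sketch of that split under simplifying assumptions; `ToyConfig` and `split_kwargs` are hypothetical names introduced for illustration and are not part of the library.

```python
class ToyConfig:
    """Hypothetical configuration object with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 32


def split_kwargs(config, **kwargs):
    """Apply kwargs that name existing config attributes; return the rest."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # The key matches a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Not a configuration attribute: keep it for the model's __init__.
            unused[key] = value
    return config, unused


config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, foo="bar")
print(config.output_attentions)  # True
print(model_kwargs)              # {'foo': 'bar'}
```

When a config object is passed explicitly, this split does not happen in the behavior described above: every keyword argument goes straight to the model constructor.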

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

albert — AlbertForPreTraining (ALBERT model)
bart — BartForConditionalGeneration (BART model)
bert — BertForPreTraining (BERT model)
big_bird — BigBirdForPreTraining (BigBird model)
bloom — BloomForCausalLM (BLOOM model)
camembert — CamembertForMaskedLM (CamemBERT model)
colpali — ColPaliForRetrieval (ColPali model)
colqwen2 — ColQwen2ForRetrieval (ColQwen2 model)
ctrl — CTRLLMHeadModel (CTRL model)
data2vec-text — Data2VecTextForMaskedLM (Data2VecText model)
deberta — DebertaForMaskedLM (DeBERTa model)
deberta-v2 — DebertaV2ForMaskedLM (DeBERTa-v2 model)
distilbert — DistilBertForMaskedLM (DistilBERT model)
electra — ElectraForPreTraining (ELECTRA model)
ernie — ErnieForPreTraining (ERNIE model)
falcon_mamba — FalconMambaForCausalLM (FalconMamba model)
flaubert — FlaubertWithLMHeadModel (FlauBERT model)
flava — FlavaForPreTraining (FLAVA model)
fnet — FNetForPreTraining (FNet model)
fsmt — FSMTForConditionalGeneration (FairSeq Machine-Translation model)
funnel — FunnelForPreTraining (Funnel Transformer model)
gemma3 — Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
gpt-sw3 — GPT2LMHeadModel (GPT-Sw3 model)
gpt2 — GPT2LMHeadModel (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForCausalLM (GPTBigCode model)
gptsan-japanese — GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
hiera — HieraForPreTraining (Hiera model)
ibert — IBertForMaskedLM (I-BERT model)
idefics — IdeficsForVisionText2Text (IDEFICS model)
idefics2 — Idefics2ForConditionalGeneration (Idefics2 model)
idefics3 — Idefics3ForConditionalGeneration (Idefics3 model)
janus — JanusForConditionalGeneration (Janus model)
layoutlm — LayoutLMForMaskedLM (LayoutLM model)
llava — LlavaForConditionalGeneration (LLaVa model)
llava_next — LlavaNextForConditionalGeneration (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
longformer — LongformerForMaskedLM (Longformer model)
luke — LukeForMaskedLM (LUKE model)
lxmert — LxmertForPreTraining (LXMERT model)
mamba — MambaForCausalLM (Mamba model)
mamba2 — Mamba2ForCausalLM (mamba2 model)
mega — MegaForMaskedLM (MEGA model)
megatron-bert — MegatronBertForPreTraining (Megatron-BERT model)
mistral3 — Mistral3ForConditionalGeneration (Mistral3 model)
mllama — MllamaForConditionalGeneration (Mllama model)
mobilebert — MobileBertForPreTraining (MobileBERT model)
mpnet — MPNetForMaskedLM (MPNet model)
mpt — MptForCausalLM (MPT model)
mra — MraForMaskedLM (MRA model)
mvp — MvpForConditionalGeneration (MVP model)
nezha — NezhaForPreTraining (Nezha model)
nllb-moe — NllbMoeForConditionalGeneration (NLLB-MOE model)
openai-gpt — OpenAIGPTLMHeadModel (OpenAI GPT model)
paligemma — PaliGemmaForConditionalGeneration (PaliGemma model)
qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2Audio model)
retribert — RetriBertModel (RetriBERT model)
roberta — RobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForPreTraining (RoCBert model)
rwkv — RwkvForCausalLM (RWKV model)
splinter — SplinterForPreTraining (Splinter model)
squeezebert — SqueezeBertForMaskedLM (SqueezeBERT model)
switch_transformers — SwitchTransformersForConditionalGeneration (SwitchTransformers model)
t5 — T5ForConditionalGeneration (T5 model)
t5gemma — T5GemmaForConditionalGeneration (T5Gemma model)
tapas — TapasForMaskedLM (TAPAS model)
transfo-xl — TransfoXLLMHeadModel (Transformer-XL model)
tvlt — TvltForPreTraining (TVLT model)
unispeech — UniSpeechForPreTraining (UniSpeech model)
unispeech-sat — UniSpeechSatForPreTraining (UniSpeechSat model)
video_llava — VideoLlavaForConditionalGeneration (VideoLlava model)
videomae — VideoMAEForPreTraining (VideoMAE model)
vipllava — VipLlavaForConditionalGeneration (VipLlava model)
visual_bert — VisualBertForPreTraining (VisualBERT model)
vit_mae — ViTMAEForPreTraining (ViTMAE model)
wav2vec2 — Wav2Vec2ForPreTraining (Wav2Vec2 model)
wav2vec2-conformer — Wav2Vec2ConformerForPreTraining (Wav2Vec2-Conformer model)
xlm — XLMWithLMHeadModel (XLM model)
xlm-roberta — XLMRobertaForMaskedLM (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
xlnet — XLNetLMHeadModel (XLNet model)
xmod — XmodForMaskedLM (X-MOD model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForPreTraining.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForPreTraining

class transformers.TFAutoModelForPreTraining

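( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForPreTraining (ALBERT model)
- BartConfig configuration class: TFBartForConditionalGeneration (BART model)
- BertConfig configuration class: TFBertForPreTraining (BERT model)
- CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
- DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: TFElectraForPreTraining (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForPreTraining (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
- IdeficsConfig configuration class: TFIdeficsForVisionText2Text (IDEFICS model)
- LayoutLMConfig configuration class: TFLayoutLMForMaskedLM (LayoutLM model)
- LxmertConfig configuration class: TFLxmertForPreTraining (LXMERT model)
- MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForPreTraining (MobileBERT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
- RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
- TapasConfig configuration class: TFTapasForMaskedLM (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
- ViTMAEConfig configuration class: TFViTMAEForPreTraining (ViTMAE model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples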

>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForPreTraining.from_config(config)

from_pretrained

( *model_args **kwargs )

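Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.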
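
The dictionary returned with output_loading_info reports, among other things, parameter names the model expected but the checkpoint lacked (missing keys) and names the checkpoint held that the model has no use for (unexpected keys). A rough sketch of that comparison follows, assuming both sides are reduced to plain collections of names; `compare_keys` and the parameter names are hypothetical and only illustrate the set arithmetic, not the library's actual loading code.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return the names missing from, and unexpected in, a checkpoint."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but unknown to the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
    }


info = compare_keys(
    model_keys=["encoder.weight", "encoder.bias", "cls.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
)
print(info)
# {'missing_keys': ['cls.weight'], 'unexpected_keys': ['pooler.weight']}
```

Missing keys typically correspond to a head that will be newly initialized, while unexpected keys correspond to weights that are dropped.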

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

albert — TFAlbertForPreTraining (ALBERT model)
bart — TFBartForConditionalGeneration (BART model)
bert — TFBertForPreTraining (BERT model)
camembert — TFCamembertForMaskedLM (CamemBERT model)
ctrl — TFCTRLLMHeadModel (CTRL model)
distilbert — TFDistilBertForMaskedLM (DistilBERT model)
electra — TFElectraForPreTraining (ELECTRA model)
flaubert — TFFlaubertWithLMHeadModel (FlauBERT model)
funnel — TFFunnelForPreTraining (Funnel Transformer model)
gpt-sw3 — TFGPT2LMHeadModel (GPT-Sw3 model)
gpt2 — TFGPT2LMHeadModel (OpenAI GPT-2 model)
idefics — TFIdeficsForVisionText2Text (IDEFICS model)
layoutlm — TFLayoutLMForMaskedLM (LayoutLM model)
lxmert — TFLxmertForPreTraining (LXMERT model)
mobilebert — TFMobileBertForPreTraining (MobileBERT model)
mpnet — TFMPNetForMaskedLM (MPNet model)
openai-gpt — TFOpenAIGPTLMHeadModel (OpenAI GPT model)
roberta — TFRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
t5 — TFT5ForConditionalGeneration (T5 model)
tapas — TFTapasForMaskedLM (TAPAS model)
transfo-xl — TFTransfoXLLMHeadModel (Transformer-XL model)
vit_mae — TFViTMAEForPreTraining (ViTMAE model)
xlm — TFXLMWithLMHeadModel (XLM model)
xlm-roberta — TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
xlnet — TFXLNetLMHeadModel (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForPreTraining

class transformers.FlaxAutoModelForPreTraining

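( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForPreTraining (ALBERT model)
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BertConfig configuration class: FlaxBertForPreTraining (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForPreTraining (BigBird model)
- ElectraConfig configuration class: FlaxElectraForPreTraining (ELECTRA model)
- LongT5Config configuration class: FlaxLongT5ForConditionalGeneration (LongT5 model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: FlaxMT5ForConditionalGeneration (MT5 model)
- RoFormerConfig configuration class: FlaxRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- T5Config configuration class: FlaxT5ForConditionalGeneration (T5 model)
- Wav2Vec2Config configuration class: FlaxWav2Vec2ForPreTraining (Wav2Vec2 model)
- WhisperConfig configuration class: FlaxWhisperForConditionalGeneration (Whisper model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples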

>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForPreTraining.from_config(config)

from_pretrained

( *model_args **kwargs )

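Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.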

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

albert — FlaxAlbertForPreTraining (ALBERT model)
bart — FlaxBartForConditionalGeneration (BART model)
bert — FlaxBertForPreTraining (BERT model)
big_bird — FlaxBigBirdForPreTraining (BigBird model)
electra — FlaxElectraForPreTraining (ELECTRA model)
longt5 — FlaxLongT5ForConditionalGeneration (LongT5 model)
mbart — FlaxMBartForConditionalGeneration (mBART model)
mt5 — FlaxMT5ForConditionalGeneration (MT5 model)
roberta — FlaxRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForMaskedLM (RoFormer model)
t5 — FlaxT5ForConditionalGeneration (T5 model)
wav2vec2 — FlaxWav2Vec2ForPreTraining (Wav2Vec2 model)
whisper — FlaxWhisperForConditionalGeneration (Whisper model)
xlm-roberta — FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

Natural Language Processing

The following auto classes are available for the natural language processing tasks below.

AutoModelForCausalLM

class transformers.AutoModelForCausalLM

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ArceeConfig configuration class: ArceeForCausalLM (Arcee model)
- AriaTextConfig configuration class: AriaTextForCausalLM (AriaText model)
- BambaConfig configuration class: BambaForCausalLM (Bamba model)
- BartConfig configuration class: BartForCausalLM (BART model)
- BertConfig configuration class: BertLMHeadModel (BERT model)
- BertGenerationConfig configuration class: BertGenerationDecoder (Bert Generation model)
- BigBirdConfig configuration class: BigBirdForCausalLM (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForCausalLM (BigBird-Pegasus model)
- BioGptConfig configuration class: BioGptForCausalLM (BioGpt model)
- BitNetConfig configuration class: BitNetForCausalLM (BitNet model)
- BlenderbotConfig configuration class: BlenderbotForCausalLM (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForCausalLM (BlenderbotSmall model)
- BloomConfig configuration class: BloomForCausalLM (BLOOM model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: CamembertForCausalLM (CamemBERT model)
- CodeGenConfig configuration class: CodeGenForCausalLM (CodeGen model)
- Cohere2Config configuration class: Cohere2ForCausalLM (Cohere2 model)
- CohereConfig configuration class: CohereForCausalLM (Cohere model)
- CpmAntConfig configuration class: CpmAntForCausalLM (CPM-Ant model)
- Data2VecTextConfig configuration class: Data2VecTextForCausalLM (Data2VecText model)
- DbrxConfig configuration class: DbrxForCausalLM (DBRX model)
- DeepseekV3Config configuration class: DeepseekV3ForCausalLM (DeepSeek-V3 model)
- DiffLlamaConfig configuration class: DiffLlamaForCausalLM (DiffLlama model)
- Dots1Config configuration class: Dots1ForCausalLM (dots1 model)
- ElectraConfig configuration class: ElectraForCausalLM (ELECTRA model)
- Emu3Config configuration class: Emu3ForCausalLM (Emu3 model)
- ErnieConfig configuration class: ErnieForCausalLM (ERNIE model)
- FalconConfig configuration class: FalconForCausalLM (Falcon model)
- FalconH1Config configuration class: FalconH1ForCausalLM (FalconH1 model)
- FalconMambaConfig configuration class: FalconMambaForCausalLM (FalconMamba model)
- FuyuConfig configuration class: FuyuForCausalLM (Fuyu model)
- GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeForCausalLM (GPTBigCode model)
- GPTJConfig configuration class: GPTJForCausalLM (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoForCausalLM (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForCausalLM (GPT NeoX model)
- GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseForCausalLM (GPT NeoX Japanese model)
- Gemma2Config configuration class: Gemma2ForCausalLM (Gemma2 model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
- Gemma3TextConfig configuration class: Gemma3ForCausalLM (Gemma3ForCausalLM model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nForConditionalGeneration model)
- Gemma3nTextConfig configuration class: Gemma3nForCausalLM (Gemma3nForCausalLM model)
- GemmaConfig configuration class: GemmaForCausalLM (Gemma model)
- GitConfig configuration class: GitForCausalLM (GIT model)
- Glm4Config configuration class: Glm4ForCausalLM (GLM4 model)
- GlmConfig configuration class: GlmForCausalLM (GLM model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GOT-OCR2 model)
- GraniteConfig configuration class: GraniteForCausalLM (Granite model)
- GraniteMoeConfig configuration class: GraniteMoeForCausalLM (GraniteMoeMoe model)
- GraniteMoeHybridConfig configuration class: GraniteMoeHybridForCausalLM (GraniteMoeHybrid model)
- GraniteMoeSharedConfig configuration class: GraniteMoeSharedForCausalLM (GraniteMoeSharedMoe model)
- HeliumConfig configuration class: HeliumForCausalLM (Helium model)
- JambaConfig configuration class: JambaForCausalLM (Jamba model)
- JetMoeConfig configuration class: JetMoeForCausalLM (JetMoe model)
- Llama4Config configuration class: Llama4ForCausalLM (Llama4 model)
- Llama4TextConfig configuration class: Llama4ForCausalLM (Llama4ForCausalLM model)
- LlamaConfig configuration class: LlamaForCausalLM (LLaMA model)
- MBartConfig configuration class: MBartForCausalLM (mBART model)
- Mamba2Config configuration class: Mamba2ForCausalLM (mamba2 model)
- MambaConfig configuration class: MambaForCausalLM (Mamba model)
- MarianConfig configuration class: MarianForCausalLM (Marian model)
- MegaConfig configuration class: MegaForCausalLM (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForCausalLM (Megatron-BERT model)
- MiniMaxConfig configuration class: MiniMaxForCausalLM (MiniMax model)
- MistralConfig configuration class: MistralForCausalLM (Mistral model)
- MixtralConfig configuration class: MixtralForCausalLM (Mixtral model)
- MllamaConfig configuration class: MllamaForCausalLM (Mllama model)
- MoshiConfig configuration class: MoshiForCausalLM (Moshi model)
- MptConfig configuration class: MptForCausalLM (MPT model)
- MusicgenConfig configuration class: MusicgenForCausalLM (MusicGen model)
- MusicgenMelodyConfig configuration class: MusicgenMelodyForCausalLM (MusicGen Melody model)
- MvpConfig configuration class: MvpForCausalLM (MVP model)
- NemotronConfig configuration class: NemotronForCausalLM (Nemotron model)
- OPTConfig configuration class: OPTForCausalLM (OPT model)
- Olmo2Config configuration class: Olmo2ForCausalLM (OLMo2 model)
- OlmoConfig configuration class: OlmoForCausalLM (OLMo model)
- OlmoeConfig configuration class: OlmoeForCausalLM (OLMoE model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
- OpenLlamaConfig configuration class: OpenLlamaForCausalLM (OpenLlama model)
- PLBartConfig configuration class: PLBartForCausalLM (PLBart model)
- PegasusConfig configuration class: PegasusForCausalLM (Pegasus model)
- PersimmonConfig configuration class: PersimmonForCausalLM (Persimmon model)
- Phi3Config configuration class: Phi3ForCausalLM (Phi3 model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalForCausalLM (Phi4Multimodal model)
- PhiConfig configuration class: PhiForCausalLM (Phi model)
- PhimoeConfig configuration class: PhimoeForCausalLM (Phimoe model)
- ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNet model)
- QDQBertConfig configuration class: QDQBertLMHeadModel (QDQBert model)
- Qwen2Config configuration class: Qwen2ForCausalLM (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeForCausalLM (Qwen2MoE model)
- Qwen3Config configuration class: Qwen3ForCausalLM (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeForCausalLM (Qwen3MoE model)
- RecurrentGemmaConfig configuration class: RecurrentGemmaForCausalLM (RecurrentGemma model)
- ReformerConfig configuration class: ReformerModelWithLMHead (Reformer model)
- RemBertConfig configuration class: RemBertForCausalLM (RemBERT model)
- RoCBertConfig configuration class: RoCBertForCausalLM (RoCBert model)
- RoFormerConfig configuration class: RoFormerForCausalLM (RoFormer model)
- RobertaConfig configuration class: RobertaForCausalLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
- RwkvConfig configuration class: RwkvForCausalLM (RWKV model)
- SmolLM3Config configuration class: SmolLM3ForCausalLM (SmolLM3 model)
- Speech2Text2Config configuration class: Speech2Text2ForCausalLM (Speech2Text2 model)
- StableLmConfig configuration class: StableLmForCausalLM (StableLm model)
- Starcoder2Config configuration class: Starcoder2ForCausalLM (Starcoder2 model)
- TrOCRConfig configuration class: TrOCRForCausalLM (TrOCR model)
- TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
- WhisperConfig configuration class: WhisperForCausalLM (Whisper model)
- XGLMConfig configuration class: XGLMForCausalLM (XGLM model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMProphetNetConfig configuration class: XLMProphetNetForCausalLM (XLM-ProphetNet model)
- XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForCausalLM (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
- XmodConfig configuration class: XmodForCausalLM (X-MOD model)
- Zamba2Config configuration class: Zamba2ForCausalLM (Zamba2 model)
- ZambaConfig configuration class: ZambaForCausalLM (Zamba model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.
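
The default described for attn_implementation amounts to a version check. The sketch below only illustrates that rule and rests on several assumptions: `parse_version` and `default_attn_implementation` are hypothetical helpers, the torch version is passed in as a string rather than read from an installed torch, and whether a given model supports SDPA is collapsed into a single `sdpa_available` flag.

```python
import re


def parse_version(version):
    """Turn a string such as '2.1.1+cu121' into the tuple (2, 1, 1)."""
    core = version.split("+")[0]  # drop a local build suffix, if any
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    parts += [0] * (3 - len(parts))  # pad short versions like '2.2'
    return tuple(parts)


def default_attn_implementation(torch_version, sdpa_available=True):
    """Pick 'sdpa' for torch>=2.1.1 when available, else fall back to 'eager'."""
    if sdpa_available and parse_version(torch_version) >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.1.1"))  # sdpa
print(default_attn_implementation("2.0.1"))  # eager
```

Passing attn_implementation explicitly bypasses any such default, and "flash_attention_2" is never chosen implicitly in this sketch.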

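Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples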

>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCausalLM.from_config(config)

from_pretrained

( *model_args **kwargs )

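Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.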

Instantiates one of the model classes of the library (with a causal language modeling head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the configuration object:

arcee — ArceeForCausalLM (Arcee model)
aria_text — AriaTextForCausalLM (AriaText model)
bamba — BambaForCausalLM (Bamba model)
bart — BartForCausalLM (BART model)
bert — BertLMHeadModel (BERT model)
bert-generation — BertGenerationDecoder (Bert Generation model)
big_bird — BigBirdForCausalLM (BigBird model)
bigbird_pegasus — BigBirdPegasusForCausalLM (BigBird-Pegasus model)
biogpt — BioGptForCausalLM (BioGpt model)
bitnet — BitNetForCausalLM (BitNet model)
blenderbot — BlenderbotForCausalLM (Blenderbot model)
blenderbot-small — BlenderbotSmallForCausalLM (BlenderbotSmall model)
bloom — BloomForCausalLM (BLOOM model)
camembert — CamembertForCausalLM (CamemBERT model)
code_llama — LlamaForCausalLM (CodeLlama model)
codegen — CodeGenForCausalLM (CodeGen model)
cohere — CohereForCausalLM (Cohere model)
cohere2 — Cohere2ForCausalLM (Cohere2 model)
cpmant — CpmAntForCausalLM (CPM-Ant model)
ctrl — CTRLLMHeadModel (CTRL model)
data2vec-text — Data2VecTextForCausalLM (Data2VecText model)
dbrx — DbrxForCausalLM (DBRX model)
deepseek_v3 — DeepseekV3ForCausalLM (DeepSeek-V3 model)
diffllama — DiffLlamaForCausalLM (DiffLlama model)
dots1 — Dots1ForCausalLM (dots1 model)
electra — ElectraForCausalLM (ELECTRA model)
emu3 — Emu3ForCausalLM (Emu3 model)
ernie — ErnieForCausalLM (ERNIE model)
falcon — FalconForCausalLM (Falcon model)
falcon_h1 — FalconH1ForCausalLM (FalconH1 model)
falcon_mamba — FalconMambaForCausalLM (FalconMamba model)
fuyu — FuyuForCausalLM (Fuyu model)
gemma — GemmaForCausalLM (Gemma model)
gemma2 — Gemma2ForCausalLM (Gemma2 model)
gemma3 — Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
gemma3_text — Gemma3ForCausalLM (Gemma3ForCausalLM model)
gemma3n — Gemma3nForConditionalGeneration (Gemma3nForConditionalGeneration model)
gemma3n_text — Gemma3nForCausalLM (Gemma3nForCausalLM model)
git — GitForCausalLM (GIT model)
glm — GlmForCausalLM (GLM model)
glm4 — Glm4ForCausalLM (GLM4 model)
got_ocr2 — GotOcr2ForConditionalGeneration (GOT-OCR2 model)
gpt-sw3 — GPT2LMHeadModel (GPT-Sw3 model)
gpt2 — GPT2LMHeadModel (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForCausalLM (GPTBigCode model)
gpt_neo — GPTNeoForCausalLM (GPT Neo model)
gpt_neox — GPTNeoXForCausalLM (GPT NeoX model)
gpt_neox_japanese — GPTNeoXJapaneseForCausalLM (GPT NeoX Japanese model)
gptj — GPTJForCausalLM (GPT-J model)
granite — GraniteForCausalLM (Granite model)
granitemoe — GraniteMoeForCausalLM (GraniteMoe model)
granitemoehybrid — GraniteMoeHybridForCausalLM (GraniteMoeHybrid model)
granitemoeshared — GraniteMoeSharedForCausalLM (GraniteMoeShared model)
helium — HeliumForCausalLM (Helium model)
jamba — JambaForCausalLM (Jamba model)
jetmoe — JetMoeForCausalLM (JetMoe model)
llama — LlamaForCausalLM (LLaMA model)
llama4 — Llama4ForCausalLM (Llama4 model)
llama4_text — Llama4ForCausalLM (Llama4ForCausalLM model)
mamba — MambaForCausalLM (Mamba model)
mamba2 — Mamba2ForCausalLM (mamba2 model)
marian — MarianForCausalLM (Marian model)
mbart — MBartForCausalLM (mBART model)
mega — MegaForCausalLM (MEGA model)
megatron-bert — MegatronBertForCausalLM (Megatron-BERT model)
minimax — MiniMaxForCausalLM (MiniMax model)
mistral — MistralForCausalLM (Mistral model)
mixtral — MixtralForCausalLM (Mixtral model)
mllama — MllamaForCausalLM (Mllama model)
moshi — MoshiForCausalLM (Moshi model)
mpt — MptForCausalLM (MPT model)
musicgen — MusicgenForCausalLM (MusicGen model)
musicgen_melody — MusicgenMelodyForCausalLM (MusicGen Melody model)
mvp — MvpForCausalLM (MVP model)
nemotron — NemotronForCausalLM (Nemotron model)
olmo — OlmoForCausalLM (OLMo model)
olmo2 — Olmo2ForCausalLM (OLMo2 model)
olmoe — OlmoeForCausalLM (OLMoE model)
open-llama — OpenLlamaForCausalLM (OpenLlama model)
openai-gpt — OpenAIGPTLMHeadModel (OpenAI GPT model)
opt — OPTForCausalLM (OPT model)
pegasus — PegasusForCausalLM (Pegasus model)
persimmon — PersimmonForCausalLM (Persimmon model)
phi — PhiForCausalLM (Phi model)
phi3 — Phi3ForCausalLM (Phi3 model)
phi4_multimodal — Phi4MultimodalForCausalLM (Phi4Multimodal model)
phimoe — PhimoeForCausalLM (Phimoe model)
plbart — PLBartForCausalLM (PLBart model)
prophetnet — ProphetNetForCausalLM (ProphetNet model)
qdqbert — QDQBertLMHeadModel (QDQBert model)
qwen2 — Qwen2ForCausalLM (Qwen2 model)
qwen2_moe — Qwen2MoeForCausalLM (Qwen2MoE model)
qwen3 — Qwen3ForCausalLM (Qwen3 model)
qwen3_moe — Qwen3MoeForCausalLM (Qwen3MoE model)
recurrent_gemma — RecurrentGemmaForCausalLM (RecurrentGemma model)
reformer — ReformerModelWithLMHead (Reformer model)
rembert — RemBertForCausalLM (RemBERT model)
roberta — RobertaForCausalLM (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForCausalLM (RoCBert model)
roformer — RoFormerForCausalLM (RoFormer model)
rwkv — RwkvForCausalLM (RWKV model)
smollm3 — SmolLM3ForCausalLM (SmolLM3 model)
speech_to_text_2 — Speech2Text2ForCausalLM (Speech2Text2 model)
stablelm — StableLmForCausalLM (StableLm model)
starcoder2 — Starcoder2ForCausalLM (Starcoder2 model)
transfo-xl — TransfoXLLMHeadModel (Transformer-XL model)
trocr — TrOCRForCausalLM (TrOCR model)
whisper — WhisperForCausalLM (Whisper model)
xglm — XGLMForCausalLM (XGLM model)
xlm — XLMWithLMHeadModel (XLM model)
xlm-prophetnet — XLMProphetNetForCausalLM (XLM-ProphetNet model)
xlm-roberta — XLMRobertaForCausalLM (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForCausalLM (XLM-RoBERTa-XL model)
xlnet — XLNetLMHeadModel (XLNet model)
xmod — XmodForCausalLM (X-MOD model)
zamba — ZambaForCausalLM (Zamba model)
zamba2 — Zamba2ForCausalLM (Zamba2 model)
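
A list like the one above behaves like a lookup table keyed by model_type. The sketch below shows that idea with a three-entry excerpt. The dictionary and the resolve_class_name helper are hypothetical and are not the library's internal registry.

```python
# Hypothetical excerpt of the mapping listed above (model_type -> class name).
CAUSAL_LM_CLASS_NAMES = {
    "bert": "BertLMHeadModel",
    "gpt2": "GPT2LMHeadModel",
    "llama": "LlamaForCausalLM",
}


def resolve_class_name(model_type):
    """Return the class name registered for `model_type`, or raise ValueError."""
    try:
        return CAUSAL_LM_CLASS_NAMES[model_type]
    except KeyError:
        known = ", ".join(sorted(CAUSAL_LM_CLASS_NAMES))
        raise ValueError(
            f"Unrecognized model_type {model_type!r}; known types: {known}"
        ) from None


print(resolve_class_name("gpt2"))  # GPT2LMHeadModel
```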

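By default, the model is set to evaluation mode with model.eval() (for example, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().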
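
To see why the mode matters, here is a toy dropout layer written in plain Python. It is only an illustration of the train/eval switch: ToyDropout is a hypothetical class, not torch.nn.Dropout, and it operates on lists rather than tensors.

```python
import random


class ToyDropout:
    """Toy dropout layer with a `training` flag (not torch.nn.Dropout)."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation mode: identity
        scale = 1.0 / (1.0 - self.p)
        # training mode: zero each value with probability p, rescale the rest
        return [0.0 if random.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()
print(layer([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]
layer.train()
print(layer([1.0, 2.0, 3.0]))  # random zeros, surviving values doubled
```

In evaluation mode the output is deterministic, which is what you want for inference. In training mode the same input yields different outputs from call to call.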

Example

>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCausalLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForCausalLM

class transformers.TFAutoModelForCausalLM

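( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so raises an error).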
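
The factory-only pattern described above can be sketched as follows. ToyAutoModel is a hypothetical class written for illustration; the exception type and message are assumptions, and the real class dispatches to concrete model classes rather than returning a dict.

```python
class ToyAutoModel:
    """Hypothetical factory-only class; not the library's implementation."""

    def __init__(self, *args, **kwargs):
        raise OSError(
            f"{type(self).__name__} is designed to be instantiated using "
            "`from_pretrained(...)` or `from_config(...)`."
        )

    @classmethod
    def from_config(cls, config):
        # A real factory would pick a concrete model class from the config;
        # here we return a plain dict to keep the sketch self-contained.
        return {"model_for": config["model_type"]}


try:
    ToyAutoModel()
except OSError as err:
    print(f"direct construction failed: {err}")

print(ToyAutoModel.from_config({"model_type": "bert"}))
```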

from_config

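( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: TFBertLMHeadModel (BERT model)
- CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: TFCamembertForCausalLM (CamemBERT model)
- GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJForCausalLM (GPT-J model)
- MistralConfig configuration class: TFMistralForCausalLM (Mistral model)
- OPTConfig configuration class: TFOPTForCausalLM (OPT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
- RemBertConfig configuration class: TFRemBertForCausalLM (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForCausalLM (RoFormer model)
- RobertaConfig configuration class: TFRobertaForCausalLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
- TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
- XGLMConfig configuration class: TFXGLMForCausalLM (XGLM model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForCausalLM (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example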

>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForCausalLM.from_config(config)

from_pretrained

( *model_args **kwargs )

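Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs are passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been made).
- If a configuration is not provided, kwargs are first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.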
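
The dictionary returned with output_loading_info=True reports, among other things, which weights were missing and which were unexpected. A minimal sketch of how such a report could be computed from two sets of weight names is shown below. compare_keys is a hypothetical helper, and the dictionary keys are assumptions modeled on the description above.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return a loading-info style dict for two collections of weight names."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        # expected by the model but absent from the checkpoint
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # present in the checkpoint but not used by the model
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
        "error_msgs": [],
    }


info = compare_keys(
    model_keys=["encoder.weight", "lm_head.weight"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info["missing_keys"])     # ['lm_head.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```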

Instantiates one of the model classes of the library (with a causal language modeling head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the configuration object:

bert — TFBertLMHeadModel (BERT model)
camembert — TFCamembertForCausalLM (CamemBERT model)
ctrl — TFCTRLLMHeadModel (CTRL model)
gpt-sw3 — TFGPT2LMHeadModel (GPT-Sw3 model)
gpt2 — TFGPT2LMHeadModel (OpenAI GPT-2 model)
gptj — TFGPTJForCausalLM (GPT-J model)
mistral — TFMistralForCausalLM (Mistral model)
openai-gpt — TFOpenAIGPTLMHeadModel (OpenAI GPT model)
opt — TFOPTForCausalLM (OPT model)
rembert — TFRemBertForCausalLM (RemBERT model)
roberta — TFRobertaForCausalLM (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForCausalLM (RoFormer model)
transfo-xl — TFTransfoXLLMHeadModel (Transformer-XL model)
xglm — TFXGLMForCausalLM (XGLM model)
xlm — TFXLMWithLMHeadModel (XLM model)
xlm-roberta — TFXLMRobertaForCausalLM (XLM-RoBERTa model)
xlnet — TFXLNetLMHeadModel (XLNet model)

Example

>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForCausalLM

class transformers.FlaxAutoModelForCausalLM

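( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so raises an error).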

from_config

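( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: FlaxBartForCausalLM (BART model)
- BertConfig configuration class: FlaxBertForCausalLM (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForCausalLM (BigBird model)
- BloomConfig configuration class: FlaxBloomForCausalLM (BLOOM model)
- ElectraConfig configuration class: FlaxElectraForCausalLM (ELECTRA model)
- GPT2Config configuration class: FlaxGPT2LMHeadModel (OpenAI GPT-2 model)
- GPTJConfig configuration class: FlaxGPTJForCausalLM (GPT-J model)
- GPTNeoConfig configuration class: FlaxGPTNeoForCausalLM (GPT Neo model)
- GemmaConfig configuration class: FlaxGemmaForCausalLM (Gemma model)
- LlamaConfig configuration class: FlaxLlamaForCausalLM (LLaMA model)
- MistralConfig configuration class: FlaxMistralForCausalLM (Mistral model)
- OPTConfig configuration class: FlaxOPTForCausalLM (OPT model)
- RobertaConfig configuration class: FlaxRobertaForCausalLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
- XGLMConfig configuration class: FlaxXGLMForCausalLM (XGLM model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForCausalLM (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example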

>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForCausalLM.from_config(config)

from_pretrained

( *model_args **kwargs )

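Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch checkpoint to a Flax checkpoint once and loading the Flax checkpoint afterwards.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs are passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been made).
- If a configuration is not provided, kwargs are first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.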
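
The three accepted forms of pretrained_model_name_or_path (model id, directory, single checkpoint file) can be told apart with ordinary filesystem checks. The sketch below is one plausible way to do so. classify_source is a hypothetical helper, it ignores the URL case, and it is not the library's resolution logic.

```python
import os
import tempfile


def classify_source(name_or_path):
    """Classify the argument as 'directory', 'file', or 'model_id'."""
    if os.path.isdir(name_or_path):
        return "directory"
    if os.path.isfile(name_or_path):
        return "file"
    # Anything that is not on disk is treated as a Hub model id.
    return "model_id"


with tempfile.TemporaryDirectory() as tmp:
    weights = os.path.join(tmp, "pytorch_model.bin")
    with open(weights, "wb") as f:
        f.write(b"")  # empty placeholder file, no real weights
    kinds = [classify_source(tmp), classify_source(weights)]

kinds.append(classify_source("google-bert/bert-base-cased"))
print(kinds)  # ['directory', 'file', 'model_id']
```

The last result assumes no local directory named google-bert/bert-base-cased exists in the current working directory.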

Instantiates one of the model classes of the library (with a causal language modeling head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the configuration object:

bart — FlaxBartForCausalLM (BART model)
bert — FlaxBertForCausalLM (BERT model)
big_bird — FlaxBigBirdForCausalLM (BigBird model)
bloom — FlaxBloomForCausalLM (BLOOM model)
electra — FlaxElectraForCausalLM (ELECTRA model)
gemma — FlaxGemmaForCausalLM (Gemma model)
gpt-sw3 — FlaxGPT2LMHeadModel (GPT-Sw3 model)
gpt2 — FlaxGPT2LMHeadModel (OpenAI GPT-2 model)
gpt_neo — FlaxGPTNeoForCausalLM (GPT Neo model)
gptj — FlaxGPTJForCausalLM (GPT-J model)
llama — FlaxLlamaForCausalLM (LLaMA model)
mistral — FlaxMistralForCausalLM (Mistral model)
opt — FlaxOPTForCausalLM (OPT model)
roberta — FlaxRobertaForCausalLM (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
xglm — FlaxXGLMForCausalLM (XGLM model)
xlm-roberta — FlaxXLMRobertaForCausalLM (XLM-RoBERTa model)

Example

>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMaskedLM

class transformers.AutoModelForMaskedLM

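( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so raises an error).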

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMaskedLM (ALBERT model)
- BartConfig configuration class: BartForConditionalGeneration (BART model)
- BertConfig configuration class: BertForMaskedLM (BERT model)
- BigBirdConfig configuration class: BigBirdForMaskedLM (BigBird model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
- ConvBertConfig configuration class: ConvBertForMaskedLM (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecText model)
- DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: ElectraForMaskedLM (ELECTRA model)
- ErnieConfig configuration class: ErnieForMaskedLM (ERNIE model)
- EsmConfig configuration class: EsmForMaskedLM (ESM model)
- FNetConfig configuration class: FNetForMaskedLM (FNet model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: FunnelForMaskedLM (Funnel Transformer model)
- IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
- LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
- LukeConfig configuration class: LukeForMaskedLM (LUKE model)
- MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
- MegaConfig configuration class: MegaForMaskedLM (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForMaskedLM (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForMaskedLM (MobileBERT model)
- ModernBertConfig configuration class: ModernBertForMaskedLM (ModernBERT model)
- MraConfig configuration class: MraForMaskedLM (MRA model)
- MvpConfig configuration class: MvpForConditionalGeneration (MVP model)
- NezhaConfig configuration class: NezhaForMaskedLM (Nezha model)
- NystromformerConfig configuration class: NystromformerForMaskedLM (Nyströmformer model)
- PerceiverConfig configuration class: PerceiverForMaskedLM (Perceiver model)
- QDQBertConfig configuration class: QDQBertForMaskedLM (QDQBert model)
- ReformerConfig configuration class: ReformerForMaskedLM (Reformer model)
- RemBertConfig configuration class: RemBertForMaskedLM (RemBERT model)
- RoCBertConfig configuration class: RoCBertForMaskedLM (RoCBert model)
- RoFormerConfig configuration class: RoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
- TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
- Wav2Vec2Config configuration class: Wav2Vec2ForMaskedLM (Wav2Vec2 model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
- XmodConfig configuration class: XmodForMaskedLM (X-MOD model)
- YosoConfig configuration class: YosoForMaskedLM (YOSO model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.
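
The default rule stated for attn_implementation (SDPA for torch>=2.1.1 when available, otherwise "eager") can be written out as a small decision function. This is only a sketch of the rule as described: parse_version and default_attn_implementation are hypothetical helpers, the sdpa_supported flag stands in for whatever availability check the library performs, and pre-release version strings are not handled.

```python
def parse_version(text):
    """Turn '2.1.1' or '2.3.0+cu121' into a tuple of ints, e.g. (2, 1, 1)."""
    core = text.split("+")[0]  # drop a local build suffix such as '+cu121'
    return tuple(int(part) for part in core.split(".")[:3])


def default_attn_implementation(torch_version, sdpa_supported=True):
    """Pick 'sdpa' when torch is recent enough and supported, else 'eager'."""
    if sdpa_supported and parse_version(torch_version) >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.3.0+cu121"))                  # sdpa
print(default_attn_implementation("2.0.1"))                        # eager
print(default_attn_implementation("2.3.0", sdpa_supported=False))  # eager
```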

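Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example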

>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedLM.from_config(config)

from_pretrained

( *model_args **kwargs )

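Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, however, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs are passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been made).
- If a configuration is not provided, kwargs are first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.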

Instantiates one of the model classes of the library (with a masked language modeling head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the configuration object:

albert — AlbertForMaskedLM (ALBERT model)
bart — BartForConditionalGeneration (BART model)
bert — BertForMaskedLM (BERT model)
big_bird — BigBirdForMaskedLM (BigBird model)
camembert — CamembertForMaskedLM (CamemBERT model)
convbert — ConvBertForMaskedLM (ConvBERT model)
data2vec-text — Data2VecTextForMaskedLM (Data2VecText model)
deberta — DebertaForMaskedLM (DeBERTa model)
deberta-v2 — DebertaV2ForMaskedLM (DeBERTa-v2 model)
distilbert — DistilBertForMaskedLM (DistilBERT model)
electra — ElectraForMaskedLM (ELECTRA model)
ernie — ErnieForMaskedLM (ERNIE model)
esm — EsmForMaskedLM (ESM model)
flaubert — FlaubertWithLMHeadModel (FlauBERT model)
fnet — FNetForMaskedLM (FNet model)
funnel — FunnelForMaskedLM (Funnel Transformer model)
ibert — IBertForMaskedLM (I-BERT model)
layoutlm — LayoutLMForMaskedLM (LayoutLM model)
longformer — LongformerForMaskedLM (Longformer model)
luke — LukeForMaskedLM (LUKE model)
mbart — MBartForConditionalGeneration (mBART model)
mega — MegaForMaskedLM (MEGA model)
megatron-bert — MegatronBertForMaskedLM (Megatron-BERT model)
mobilebert — MobileBertForMaskedLM (MobileBERT model)
modernbert — ModernBertForMaskedLM (ModernBERT model)
mpnet — MPNetForMaskedLM (MPNet model)
mra — MraForMaskedLM (MRA model)
mvp — MvpForConditionalGeneration (MVP model)
nezha — NezhaForMaskedLM (Nezha model)
nystromformer — NystromformerForMaskedLM (Nyströmformer model)
perceiver — PerceiverForMaskedLM (Perceiver model)
qdqbert — QDQBertForMaskedLM (QDQBert model)
reformer — ReformerForMaskedLM (Reformer model)
rembert — RemBertForMaskedLM (RemBERT model)
roberta — RobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForMaskedLM (RoCBert model)
roformer — RoFormerForMaskedLM (RoFormer model)
squeezebert — SqueezeBertForMaskedLM (SqueezeBERT model)
tapas — TapasForMaskedLM (TAPAS model)
wav2vec2 — Wav2Vec2ForMaskedLM (Wav2Vec2 model)
xlm — XLMWithLMHeadModel (XLM model)
xlm-roberta — XLMRobertaForMaskedLM (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
xmod — XmodForMaskedLM (X-MOD model)
yoso — YosoForMaskedLM (YOSO model)

By default, the model is set to evaluation mode with model.eval() (for example, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Example

>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMaskedLM

class transformers.TFAutoModelForMaskedLM

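( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so raises an error).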

from_config

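( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForMaskedLM (ALBERT model)
- BertConfig configuration class: TFBertForMaskedLM (BERT model)
- CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForMaskedLM (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForMaskedLM (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForMaskedLM (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: TFElectraForMaskedLM (ELECTRA model)
- EsmConfig configuration class: TFEsmForMaskedLM (ESM model)
- FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForMaskedLM (Funnel Transformer model)
- LayoutLMConfig configuration class: TFLayoutLMForMaskedLM (LayoutLM model)
- LongformerConfig configuration class: TFLongformerForMaskedLM (Longformer model)
- MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForMaskedLM (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForMaskedLM (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- TapasConfig configuration class: TFTapasForMaskedLM (TAPAS model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example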

>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMaskedLM.from_config(config)

from_pretrained

( *model_args **kwargs )

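Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs are passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been made).
- If a configuration is not provided, kwargs are first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.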
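
One of the conditions above for loading a configuration automatically is that a file named config.json exists in the supplied local directory. A rough sketch of that check, using only the standard library, follows. find_model_type is a hypothetical helper and the single-key JSON file is a stand-in for a real saved configuration.

```python
import json
import os
import tempfile


def find_model_type(directory):
    """Return `model_type` from <directory>/config.json, or None if absent."""
    config_path = os.path.join(directory, "config.json")
    if not os.path.isfile(config_path):
        return None
    with open(config_path, encoding="utf-8") as f:
        return json.load(f).get("model_type")


with tempfile.TemporaryDirectory() as tmp:
    before = find_model_type(tmp)  # no config.json yet
    with open(os.path.join(tmp, "config.json"), "w", encoding="utf-8") as f:
        json.dump({"model_type": "bert"}, f)
    found = find_model_type(tmp)

print(before)  # None
print(found)   # bert
```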

Instantiates one of the model classes of the library (with a masked language modeling head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the configuration object:

albert — TFAlbertForMaskedLM (ALBERT model)
bert — TFBertForMaskedLM (BERT model)
camembert — TFCamembertForMaskedLM (CamemBERT model)
convbert — TFConvBertForMaskedLM (ConvBERT model)
deberta — TFDebertaForMaskedLM (DeBERTa model)
deberta-v2 — TFDebertaV2ForMaskedLM (DeBERTa-v2 model)
distilbert — TFDistilBertForMaskedLM (DistilBERT model)
electra — TFElectraForMaskedLM (ELECTRA model)
esm — TFEsmForMaskedLM (ESM model)
flaubert — TFFlaubertWithLMHeadModel (FlauBERT model)
funnel — TFFunnelForMaskedLM (Funnel Transformer model)
layoutlm — TFLayoutLMForMaskedLM (LayoutLM model)
longformer — TFLongformerForMaskedLM (Longformer model)
mobilebert — TFMobileBertForMaskedLM (MobileBERT model)
mpnet — TFMPNetForMaskedLM (MPNet model)
rembert — TFRemBertForMaskedLM (RemBERT model)
roberta — TFRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForMaskedLM (RoFormer model)
tapas — TFTapasForMaskedLM (TAPAS model)
xlm — TFXLMWithLMHeadModel (XLM model)
xlm-roberta — TFXLMRobertaForMaskedLM (XLM-RoBERTa model)

Example

>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForMaskedLM

class transformers.FlaxAutoModelForMaskedLM

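( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so raises an error).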

from_config

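( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForMaskedLM (ALBERT model)
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BertConfig configuration class: FlaxBertForMaskedLM (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForMaskedLM (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForMaskedLM (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example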

>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMaskedLM.from_config(config)

from_pretrained

( *model_args **kwargs )

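Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model ID* of a pretrained model hosted inside a model repository on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model ID* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior differs depending on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been done already).
- If no configuration is provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute are passed to the underlying model's __init__ function.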
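
The second case (no config supplied) can be pictured with a small, library-free sketch. `ToyConfig` and `split_kwargs` are hypothetical names made up for this illustration; this is not the Transformers implementation, only a model of the behavior described above.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration object."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, kwargs):
    """Override matching config attributes; return the leftover kwargs."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # The key names a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Unknown to the config: forwarded to the model's __init__.
            model_kwargs[key] = value
    return model_kwargs


config = ToyConfig()
remaining = split_kwargs(config, {"output_attentions": True, "custom_flag": 1})
print(config.output_attentions)  # True
print(remaining)  # {'custom_flag': 1}
```

Here `output_attentions` updates the configuration, while the unrecognized `custom_flag` is what would be handed to the model constructor.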

Instantiates one of the model classes of the library (with a masked language modeling head) from a pretrained model.

albert — FlaxAlbertForMaskedLM (ALBERT model)
bart — FlaxBartForConditionalGeneration (BART model)
bert — FlaxBertForMaskedLM (BERT model)
big_bird — FlaxBigBirdForMaskedLM (BigBird model)
distilbert — FlaxDistilBertForMaskedLM (DistilBERT model)
electra — FlaxElectraForMaskedLM (ELECTRA model)
mbart — FlaxMBartForConditionalGeneration (mBART model)
roberta — FlaxRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForMaskedLM (RoFormer model)
xlm-roberta — FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)

Example

>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMaskGeneration

class transformers.AutoModelForMaskGeneration

( *args **kwargs )

TFAutoModelForMaskGeneration

class transformers.TFAutoModelForMaskGeneration

( *args **kwargs )

AutoModelForSeq2SeqLM

class transformers.AutoModelForSeq2SeqLM

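( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).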
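
The pattern described above, a class that refuses direct construction and instead dispatches to a concrete class, can be sketched as follows. Every class name here (`ToyAutoModel`, `ToyBartConfig`, `ToyT5Model`, and so on) is hypothetical and exists only for illustration; the library's own mapping and error message may differ.

```python
class ToyBartConfig:
    """Hypothetical configuration class."""


class ToyT5Config:
    """Hypothetical configuration class."""


class ToyBartModel:
    def __init__(self, config):
        self.config = config


class ToyT5Model:
    def __init__(self, config):
        self.config = config


class ToyAutoModel:
    # Maps each configuration class to the concrete model class.
    _model_mapping = {ToyBartConfig: ToyBartModel, ToyT5Config: ToyT5Model}

    def __init__(self, *args, **kwargs):
        # Direct instantiation is deliberately disabled.
        raise OSError("Use ToyAutoModel.from_config(config) instead of __init__().")

    @classmethod
    def from_config(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
        return model_class(config)


model = ToyAutoModel.from_config(ToyT5Config())
print(type(model).__name__)  # ToyT5Model

try:
    ToyAutoModel()
except OSError as err:
    direct_init_error = str(err)
    print(direct_init_error)
```

The returned object is an instance of the concrete class, never of the dispatching class itself.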

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: BartForConditionalGeneration (BART model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForConditionalGeneration (BigBird-Pegasus model)
- BlenderbotConfig configuration class: BlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: EncoderDecoderModel (Encoder decoder model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- GPTSanJapaneseConfig configuration class: GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
- GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeech model)
- LEDConfig configuration class: LEDForConditionalGeneration (LED model)
- LongT5Config configuration class: LongT5ForConditionalGeneration (LongT5 model)
- M2M100Config configuration class: M2M100ForConditionalGeneration (M2M100 model)
- MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
- MT5Config configuration class: MT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: MarianMTModel (Marian model)
- MvpConfig configuration class: MvpForConditionalGeneration (MVP model)
- NllbMoeConfig configuration class: NllbMoeForConditionalGeneration (NLLB-MOE model)
- PLBartConfig configuration class: PLBartForConditionalGeneration (PLBart model)
- PegasusConfig configuration class: PegasusForConditionalGeneration (Pegasus model)
- PegasusXConfig configuration class: PegasusXForConditionalGeneration (PEGASUS-X model)
- ProphetNetConfig configuration class: ProphetNetForConditionalGeneration (ProphetNet model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2Audio model)
- SeamlessM4TConfig configuration class: SeamlessM4TForTextToText (SeamlessM4T model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2ForTextToText (SeamlessM4Tv2 model)
- SwitchTransformersConfig configuration class: SwitchTransformersForConditionalGeneration (SwitchTransformers model)
- T5Config configuration class: T5ForConditionalGeneration (T5 model)
- T5GemmaConfig configuration class: T5GemmaForConditionalGeneration (T5Gemma model)
- UMT5Config configuration class: UMT5ForConditionalGeneration (UMT5 model)
- XLMProphetNetConfig configuration class: XLMProphetNetForConditionalGeneration (XLM-ProphetNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.
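
The default rule stated for attn_implementation can be written out as a small decision function. This is a sketch under stated assumptions: `parse_version` and `default_attn_implementation` are hypothetical helpers, the version string is assumed to be plain numeric (optionally with a `+local` build tag), and whether SDPA is supported by a given model is passed in as a flag rather than detected.

```python
def parse_version(version):
    """Turn '2.3.0+cu121' into (2, 3, 0); the local build tag is ignored."""
    core = version.split("+")[0]
    return tuple(int(part) for part in core.split(".")[:3])


def default_attn_implementation(torch_version, sdpa_supported=True):
    """Pick 'sdpa' when supported and torch>=2.1.1, else fall back to 'eager'."""
    if sdpa_supported and parse_version(torch_version) >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.3.0+cu121"))  # sdpa
print(default_attn_implementation("2.0.1"))  # eager
print(default_attn_implementation("2.3.0", sdpa_supported=False))  # eager
```

"flash_attention_2" never appears as a default in this rule; per the description above it has to be requested explicitly.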

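Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example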

>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = AutoModelForSeq2SeqLM.from_config(config)

from_pretrained

( *model_args **kwargs )

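Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model ID of a pretrained model hosted inside a model repository on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model ID string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() would be a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior differs depending on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been done already).
- If no configuration is provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute are passed to the underlying model's __init__ function.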
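
To make the "missing keys" and "unexpected keys" reported by output_loading_info concrete, here is a toy comparison of parameter names. `compare_keys` is a hypothetical helper, not a library function, and the parameter names are invented; the sketch only shows what the two terms mean.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Compare the parameter names a model expects with those in a checkpoint."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
    }


info = compare_keys(
    model_keys=["encoder.weight", "decoder.weight", "lm_head.weight"],
    checkpoint_keys=["encoder.weight", "decoder.weight", "pooler.weight"],
)
print(info["missing_keys"])  # ['lm_head.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```

Missing keys typically correspond to weights that end up newly initialized, which is why inspecting this information is useful after loading a checkpoint into a different head.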

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

bart — BartForConditionalGeneration (BART model)
bigbird_pegasus — BigBirdPegasusForConditionalGeneration (BigBird-Pegasus model)
blenderbot — BlenderbotForConditionalGeneration (Blenderbot model)
blenderbot-small — BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
encoder-decoder — EncoderDecoderModel (Encoder decoder model)
fsmt — FSMTForConditionalGeneration (FairSeq Machine-Translation model)
gptsan-japanese — GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeech model)
led — LEDForConditionalGeneration (LED model)
longt5 — LongT5ForConditionalGeneration (LongT5 model)
m2m_100 — M2M100ForConditionalGeneration (M2M100 model)
marian — MarianMTModel (Marian model)
mbart — MBartForConditionalGeneration (mBART model)
mt5 — MT5ForConditionalGeneration (MT5 model)
mvp — MvpForConditionalGeneration (MVP model)
nllb-moe — NllbMoeForConditionalGeneration (NLLB-MOE model)
pegasus — PegasusForConditionalGeneration (Pegasus model)
pegasus_x — PegasusXForConditionalGeneration (PEGASUS-X model)
plbart — PLBartForConditionalGeneration (PLBart model)
prophetnet — ProphetNetForConditionalGeneration (ProphetNet model)
qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2Audio model)
seamless_m4t — SeamlessM4TForTextToText (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4Tv2ForTextToText (SeamlessM4Tv2 model)
switch_transformers — SwitchTransformersForConditionalGeneration (SwitchTransformers model)
t5 — T5ForConditionalGeneration (T5 model)
t5gemma — T5GemmaForConditionalGeneration (T5Gemma model)
umt5 — UMT5ForConditionalGeneration (UMT5 model)
xlm-prophetnet — XLMProphetNetForConditionalGeneration (XLM-ProphetNet model)

By default, the model is set to evaluation mode using model.eval() (for example, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Example

>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/t5_tf_model_config.json")
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
...     "./tf_model/t5_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSeq2SeqLM

class transformers.TFAutoModelForSeq2SeqLM

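( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: TFBartForConditionalGeneration (BART model)
- BlenderbotConfig configuration class: TFBlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: TFEncoderDecoderModel (Encoder decoder model)
- LEDConfig configuration class: TFLEDForConditionalGeneration (LED model)
- MBartConfig configuration class: TFMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: TFMT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: TFMarianMTModel (Marian model)
- PegasusConfig configuration class: TFPegasusForConditionalGeneration (Pegasus model)
- T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example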

>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = TFAutoModelForSeq2SeqLM.from_config(config)

from_pretrained

( *model_args **kwargs )

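Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model ID of a pretrained model hosted inside a model repository on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model ID string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior differs depending on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been done already).
- If no configuration is provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute are passed to the underlying model's __init__ function.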
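
The third automatic-loading case listed under config (a local directory that contains config.json) amounts to a simple file lookup. The sketch below is illustrative only: `find_local_config` is a hypothetical helper, and the configuration content written to the temporary directory is invented for the demonstration.

```python
import json
import tempfile
from pathlib import Path


def find_local_config(path):
    """Return the parsed config.json if `path` is a directory containing one, else None."""
    directory = Path(path)
    config_file = directory / "config.json"
    if directory.is_dir() and config_file.is_file():
        return json.loads(config_file.read_text(encoding="utf-8"))
    return None


with tempfile.TemporaryDirectory() as saved_dir, tempfile.TemporaryDirectory() as empty_dir:
    # Simulate a directory produced by a save step (contents are made up).
    (Path(saved_dir) / "config.json").write_text(
        json.dumps({"model_type": "t5"}), encoding="utf-8"
    )
    found = find_local_config(saved_dir)
    missing = find_local_config(empty_dir)

print(found)  # {'model_type': 't5'}
print(missing)  # None
```

When no such file is found and the path is not a known model ID, a configuration object has to be passed explicitly through the config argument.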

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

bart — TFBartForConditionalGeneration (BART model)
blenderbot — TFBlenderbotForConditionalGeneration (Blenderbot model)
blenderbot-small — TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
encoder-decoder — TFEncoderDecoderModel (Encoder decoder model)
led — TFLEDForConditionalGeneration (LED model)
marian — TFMarianMTModel (Marian model)
mbart — TFMBartForConditionalGeneration (mBART model)
mt5 — TFMT5ForConditionalGeneration (MT5 model)
pegasus — TFPegasusForConditionalGeneration (Pegasus model)
t5 — TFT5ForConditionalGeneration (T5 model)

Example

>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForSeq2SeqLM

class transformers.FlaxAutoModelForSeq2SeqLM

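( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BlenderbotConfig configuration class: FlaxBlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: FlaxBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: FlaxEncoderDecoderModel (Encoder decoder model)
- LongT5Config configuration class: FlaxLongT5ForConditionalGeneration (LongT5 model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: FlaxMT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: FlaxMarianMTModel (Marian model)
- PegasusConfig configuration class: FlaxPegasusForConditionalGeneration (Pegasus model)
- T5Config configuration class: FlaxT5ForConditionalGeneration (T5 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example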

>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = FlaxAutoModelForSeq2SeqLM.from_config(config)

from_pretrained

( *model_args **kwargs )

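Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model ID of a pretrained model hosted inside a model repository on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model ID string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior differs depending on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been done already).
- If no configuration is provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute is used to override that attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute are passed to the underlying model's __init__ function.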

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

bart — FlaxBartForConditionalGeneration (BART model)
blenderbot — FlaxBlenderbotForConditionalGeneration (Blenderbot model)
blenderbot-small — FlaxBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
encoder-decoder — FlaxEncoderDecoderModel (Encoder decoder model)
longt5 — FlaxLongT5ForConditionalGeneration (LongT5 model)
marian — FlaxMarianMTModel (Marian model)
mbart — FlaxMBartForConditionalGeneration (mBART model)
mt5 — FlaxMT5ForConditionalGeneration (MT5 model)
pegasus — FlaxPegasusForConditionalGeneration (Pegasus model)
t5 — FlaxT5ForConditionalGeneration (T5 model)

Example

>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForSequenceClassification

class transformers.AutoModelForSequenceClassification

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The configuration used to instantiate the model class. The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForSequenceClassification (ALBERT model)
- ArceeConfig configuration class: ArceeForSequenceClassification (Arcee model)
- BartConfig configuration class: BartForSequenceClassification (BART model)
- BertConfig configuration class: BertForSequenceClassification (BERT model)
- BigBirdConfig configuration class: BigBirdForSequenceClassification (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForSequenceClassification (BigBird-Pegasus model)
- BioGptConfig configuration class: BioGptForSequenceClassification (BioGpt model)
- BloomConfig configuration class: BloomForSequenceClassification (BLOOM model)
- CTRLConfig configuration class: CTRLForSequenceClassification (CTRL model)
- CamembertConfig configuration class: CamembertForSequenceClassification (CamemBERT model)
- CanineConfig configuration class: CanineForSequenceClassification (CANINE model)
- ConvBertConfig configuration class: ConvBertForSequenceClassification (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForSequenceClassification (Data2VecText model)
- DebertaConfig configuration class: DebertaForSequenceClassification (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForSequenceClassification (DeBERTa-v2 model)
- DiffLlamaConfig configuration class: DiffLlamaForSequenceClassification (DiffLlama model)
- DistilBertConfig configuration class: DistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: ElectraForSequenceClassification (ELECTRA model)
- ErnieConfig configuration class: ErnieForSequenceClassification (ERNIE model)
- ErnieMConfig configuration class: ErnieMForSequenceClassification (ErnieM model)
- EsmConfig configuration class: EsmForSequenceClassification (ESM model)
- FNetConfig configuration class: FNetForSequenceClassification (FNet model)
- FalconConfig configuration class: FalconForSequenceClassification (Falcon model)
- FlaubertConfig configuration class: FlaubertForSequenceClassification (FlauBERT model)
- FunnelConfig configuration class: FunnelForSequenceClassification (Funnel Transformer model)
- GPT2Config configuration class: GPT2ForSequenceClassification (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeForSequenceClassification (GPTBigCode model)
- GPTJConfig configuration class: GPTJForSequenceClassification (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoForSequenceClassification (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForSequenceClassification (GPT NeoX model)
- Gemma2Config configuration class: Gemma2ForSequenceClassification (Gemma2 model)
- GemmaConfig configuration class: GemmaForSequenceClassification (Gemma model)
- Glm4Config configuration class: Glm4ForSequenceClassification (GLM4 model)
- GlmConfig configuration class: GlmForSequenceClassification (GLM model)
- HeliumConfig configuration class: HeliumForSequenceClassification (Helium model)
- IBertConfig configuration class: IBertForSequenceClassification (I-BERT model)
- JambaConfig configuration class: JambaForSequenceClassification (Jamba model)
- JetMoeConfig configuration class: JetMoeForSequenceClassification (JetMoe model)
- LEDConfig configuration class: LEDForSequenceClassification (LED model)
- LayoutLMConfig configuration class: LayoutLMForSequenceClassification (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2ForSequenceClassification (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForSequenceClassification (LayoutLMv3 model)
- LiltConfig configuration class: LiltForSequenceClassification (LiLT model)
- LlamaConfig configuration class: LlamaForSequenceClassification (LLaMA model)
- LongformerConfig configuration class: LongformerForSequenceClassification (Longformer model)
- LukeConfig configuration class: LukeForSequenceClassification (LUKE model)
- MBartConfig configuration class: MBartForSequenceClassification (mBART model)
- MPNetConfig configuration class: MPNetForSequenceClassification (MPNet model)
- MT5Config configuration class: MT5ForSequenceClassification (MT5 model)
- MarkupLMConfig configuration class: MarkupLMForSequenceClassification (MarkupLM model)
- MegaConfig configuration class: MegaForSequenceClassification (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForSequenceClassification (Megatron-BERT model)
- MiniMaxConfig configuration class: MiniMaxForSequenceClassification (MiniMax model)
- MistralConfig configuration class: MistralForSequenceClassification (Mistral model)
- MixtralConfig configuration class: MixtralForSequenceClassification (Mixtral model)
- MobileBertConfig configuration class: MobileBertForSequenceClassification (MobileBERT model)
- ModernBertConfig configuration class: ModernBertForSequenceClassification (ModernBERT model)
- MptConfig configuration class: MptForSequenceClassification (MPT model)
- MraConfig configuration class: MraForSequenceClassification (MRA model)
- MvpConfig configuration class: MvpForSequenceClassification (MVP model)
- NemotronConfig configuration class: NemotronForSequenceClassification (Nemotron model)
- NezhaConfig configuration class: NezhaForSequenceClassification (Nezha model)
- NystromformerConfig configuration class: NystromformerForSequenceClassification (Nyströmformer model)
- OPTConfig configuration class: OPTForSequenceClassification (OPT model)
- OpenAIGPTConfig configuration class: OpenAIGPTForSequenceClassification (OpenAI GPT model)
- OpenLlamaConfig configuration class: OpenLlamaForSequenceClassification (OpenLlama model)
- PLBartConfig configuration class: PLBartForSequenceClassification (PLBart model)
- PerceiverConfig configuration class: PerceiverForSequenceClassification (Perceiver model)
- PersimmonConfig configuration class: PersimmonForSequenceClassification (Persimmon model)
- Phi3Config configuration class: Phi3ForSequenceClassification (Phi3 model)
- PhiConfig configuration class: PhiForSequenceClassification (Phi model)
- PhimoeConfig configuration class: PhimoeForSequenceClassification (Phimoe model)
- QDQBertConfig configuration class: QDQBertForSequenceClassification (QDQBert model)
- Qwen2Config configuration class: Qwen2ForSequenceClassification (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeForSequenceClassification (Qwen2MoE model)
- Qwen3Config configuration class: Qwen3ForSequenceClassification (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeForSequenceClassification (Qwen3MoE model)
- ReformerConfig configuration class: ReformerForSequenceClassification (Reformer model)
- RemBertConfig configuration class: RemBertForSequenceClassification (RemBERT model)
- RoCBertConfig configuration class: RoCBertForSequenceClassification (RoCBert model)
- RoFormerConfig configuration class: RoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: RobertaForSequenceClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
- SmolLM3Config configuration class: SmolLM3ForSequenceClassification (SmolLM3 model)
- SqueezeBertConfig configuration class: SqueezeBertForSequenceClassification (SqueezeBERT model)
- StableLmConfig configuration class: StableLmForSequenceClassification (StableLm model)
- Starcoder2Config configuration class: Starcoder2ForSequenceClassification (Starcoder2 model)
- T5Config configuration class: T5ForSequenceClassification (T5 model)
- T5GemmaConfig configuration class: T5GemmaForSequenceClassification (T5Gemma model)
- TapasConfig configuration class: TapasForSequenceClassification (TAPAS model)
- TransfoXLConfig configuration class: TransfoXLForSequenceClassification (Transformer-XL model)
- UMT5Config configuration class: UMT5ForSequenceClassification (UMT5 model)
- XLMConfig configuration class: XLMForSequenceClassification (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForSequenceClassification (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForSequenceClassification (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForSequenceClassification (XLNet model)
- XmodConfig configuration class: XmodForSequenceClassification (X-MOD model)
- YosoConfig configuration class: YosoForSequenceClassification (YOSO model)
- Zamba2Config configuration class: Zamba2ForSequenceClassification (Zamba2 model)
- ZambaConfig configuration class: ZambaForSequenceClassification (Zamba model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example

>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSequenceClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

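Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model ID* of a pretrained model hosted inside a model repository on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model ID* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() would be a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Since a git-based system is used for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., `output_attentions=True`). The behavior differs depending on whether a `config` is provided or loaded automatically:
- If a configuration is provided with `config`, `**kwargs` is passed directly to the underlying model's `__init__` method (all relevant updates to the configuration are assumed to have been done already).
- If no configuration is provided, `kwargs` is first passed to the configuration class initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute is used to override that attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute are passed to the underlying model's `__init__` function.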

Instantiates one of the model classes of the library (with a sequence classification head) from a pretrained model.

albert — `AlbertForSequenceClassification` (ALBERT model)
arcee — ArceeForSequenceClassification (Arcee model)
bart — BartForSequenceClassification (BART model)
bert — BertForSequenceClassification (BERT model)
big_bird — BigBirdForSequenceClassification (BigBird model)
bigbird_pegasus — BigBirdPegasusForSequenceClassification (BigBird-Pegasus model)
biogpt — BioGptForSequenceClassification (BioGpt model)
bloom — BloomForSequenceClassification (BLOOM model)
camembert — CamembertForSequenceClassification (CamemBERT model)
canine — CanineForSequenceClassification (CANINE model)
code_llama — LlamaForSequenceClassification (CodeLlama model)
convbert — ConvBertForSequenceClassification (ConvBERT model)
ctrl — CTRLForSequenceClassification (CTRL model)
data2vec-text — Data2VecTextForSequenceClassification (Data2VecText model)
deberta — DebertaForSequenceClassification (DeBERTa model)
deberta-v2 — DebertaV2ForSequenceClassification (DeBERTa-v2 model)
diffllama — DiffLlamaForSequenceClassification (DiffLlama model)
distilbert — DistilBertForSequenceClassification (DistilBERT model)
electra — ElectraForSequenceClassification (ELECTRA model)
ernie — ErnieForSequenceClassification (ERNIE model)
ernie_m — ErnieMForSequenceClassification (ErnieM model)
esm — EsmForSequenceClassification (ESM model)
falcon — FalconForSequenceClassification (Falcon model)
flaubert — FlaubertForSequenceClassification (FlauBERT model)
fnet — FNetForSequenceClassification (FNet model)
funnel — FunnelForSequenceClassification (Funnel Transformer model)
gemma — GemmaForSequenceClassification (Gemma model)
gemma2 — Gemma2ForSequenceClassification (Gemma2 model)
glm — GlmForSequenceClassification (GLM model)
glm4 — Glm4ForSequenceClassification (GLM4 model)
gpt-sw3 — GPT2ForSequenceClassification (GPT-Sw3 model)
gpt2 — GPT2ForSequenceClassification (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForSequenceClassification (GPTBigCode model)
gpt_neo — GPTNeoForSequenceClassification (GPT Neo model)
gpt_neox — GPTNeoXForSequenceClassification (GPT NeoX model)
gptj — GPTJForSequenceClassification (GPT-J model)
helium — HeliumForSequenceClassification (Helium model)
ibert — IBertForSequenceClassification (I-BERT model)
jamba — JambaForSequenceClassification (Jamba model)
jetmoe — JetMoeForSequenceClassification (JetMoe model)
layoutlm — LayoutLMForSequenceClassification (LayoutLM model)
layoutlmv2 — LayoutLMv2ForSequenceClassification (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ForSequenceClassification (LayoutLMv3 model)
led — LEDForSequenceClassification (LED model)
lilt — LiltForSequenceClassification (LiLT model)
llama — LlamaForSequenceClassification (LLaMA model)
longformer — LongformerForSequenceClassification (Longformer model)
luke — LukeForSequenceClassification (LUKE model)
markuplm — MarkupLMForSequenceClassification (MarkupLM model)
mbart — MBartForSequenceClassification (mBART model)
mega — MegaForSequenceClassification (MEGA model)
megatron-bert — MegatronBertForSequenceClassification (Megatron-BERT model)
minimax — MiniMaxForSequenceClassification (MiniMax model)
mistral — MistralForSequenceClassification (Mistral model)
mixtral — MixtralForSequenceClassification (Mixtral model)
mobilebert — MobileBertForSequenceClassification (MobileBERT model)
modernbert — ModernBertForSequenceClassification (ModernBERT model)
mpnet — MPNetForSequenceClassification (MPNet model)
mpt — MptForSequenceClassification (MPT model)
mra — MraForSequenceClassification (MRA model)
mt5 — MT5ForSequenceClassification (MT5 model)
mvp — MvpForSequenceClassification (MVP model)
nemotron — NemotronForSequenceClassification (Nemotron model)
nezha — NezhaForSequenceClassification (Nezha model)
nystromformer — NystromformerForSequenceClassification (Nyströmformer model)
open-llama — OpenLlamaForSequenceClassification (OpenLlama model)
openai-gpt — OpenAIGPTForSequenceClassification (OpenAI GPT model)
opt — OPTForSequenceClassification (OPT model)
perceiver — PerceiverForSequenceClassification (Perceiver model)
persimmon — PersimmonForSequenceClassification (Persimmon model)
phi — PhiForSequenceClassification (Phi model)
phi3 — Phi3ForSequenceClassification (Phi3 model)
phimoe — PhimoeForSequenceClassification (Phimoe model)
plbart — PLBartForSequenceClassification (PLBart model)
qdqbert — QDQBertForSequenceClassification (QDQBert model)
qwen2 — Qwen2ForSequenceClassification (Qwen2 model)
qwen2_moe — Qwen2MoeForSequenceClassification (Qwen2MoE model)
qwen3 — Qwen3ForSequenceClassification (Qwen3 model)
qwen3_moe — Qwen3MoeForSequenceClassification (Qwen3MoE model)
reformer — ReformerForSequenceClassification (Reformer model)
rembert — RemBertForSequenceClassification (RemBERT model)
roberta — RobertaForSequenceClassification (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForSequenceClassification (RoCBert model)
roformer — RoFormerForSequenceClassification (RoFormer model)
smollm3 — SmolLM3ForSequenceClassification (SmolLM3 model)
squeezebert — SqueezeBertForSequenceClassification (SqueezeBERT model)
stablelm — StableLmForSequenceClassification (StableLm model)
starcoder2 — Starcoder2ForSequenceClassification (Starcoder2 model)
t5 — T5ForSequenceClassification (T5 model)
t5gemma — T5GemmaForSequenceClassification (T5Gemma model)
tapas — TapasForSequenceClassification (TAPAS 模型)
transfo-xl — TransfoXLForSequenceClassification (Transformer-XL 模型)
umt5 — UMT5ForSequenceClassification (UMT5 模型)
xlm — XLMForSequenceClassification (XLM 模型)
xlm-roberta — XLMRobertaForSequenceClassification (XLM-RoBERTa 模型)
xlm-roberta-xl — XLMRobertaXLForSequenceClassification (XLM-RoBERTa-XL 模型)
xlnet — XLNetForSequenceClassification (XLNet 模型)
xmod — XmodForSequenceClassification (X-MOD 模型)
yoso — YosoForSequenceClassification (YOSO 模型)
zamba — ZambaForSequenceClassification (Zamba 模型)
zamba2 — Zamba2ForSequenceClassification (Zamba2 模型)

預設情況下，模型透過 model.eval() 設定為評估模式（例如，dropout 模組被停用）。要訓練模型，您應該首先使用 model.train() 將其設定回訓練模式。

示例

>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSequenceClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSequenceClassification

class transformers.TFAutoModelForSequenceClassification

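( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).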
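The restriction on direct instantiation can be pictured with a small stand-in class. This is a simplified sketch, not the library's source: `AutoLikeFactory` and `TinyModel` are hypothetical names, and raising `OSError` is an assumption about the kind of error involved.

```python
class TinyModel:
    """Hypothetical concrete model class."""

    def __init__(self, config):
        self.config = config


class AutoLikeFactory:
    """Hypothetical stand-in for an auto class: only class methods create objects."""

    def __init__(self, *args, **kwargs):
        # Direct construction is refused; callers are pointed to the class methods.
        name = self.__class__.__name__
        raise OSError(
            f"{name} is designed to be instantiated using `{name}.from_config(config)`."
        )

    @classmethod
    def from_config(cls, config):
        # Returns an instance of a concrete class, never of the factory itself.
        return TinyModel(config)


try:
    AutoLikeFactory()
    direct_call_failed = False
except OSError:
    direct_call_failed = True

model = AutoLikeFactory.from_config({"num_labels": 2})
print(direct_call_failed, type(model).__name__)
```

The factory never returns an instance of itself, which is why the object you get back is always one of the concrete model classes.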

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The configuration used to instantiate the model class. The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForSequenceClassification (ALBERT model)
- BartConfig configuration class: TFBartForSequenceClassification (BART model)
- BertConfig configuration class: TFBertForSequenceClassification (BERT model)
- CTRLConfig configuration class: TFCTRLForSequenceClassification (CTRL model)
- CamembertConfig configuration class: TFCamembertForSequenceClassification (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForSequenceClassification (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForSequenceClassification (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForSequenceClassification (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: TFElectraForSequenceClassification (ELECTRA model)
- EsmConfig configuration class: TFEsmForSequenceClassification (ESM model)
- FlaubertConfig configuration class: TFFlaubertForSequenceClassification (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForSequenceClassification (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJForSequenceClassification (GPT-J model)
- LayoutLMConfig configuration class: TFLayoutLMForSequenceClassification (LayoutLM model)
- LayoutLMv3Config configuration class: TFLayoutLMv3ForSequenceClassification (LayoutLMv3 model)
- LongformerConfig configuration class: TFLongformerForSequenceClassification (Longformer model)
- MPNetConfig configuration class: TFMPNetForSequenceClassification (MPNet model)
- MistralConfig configuration class: TFMistralForSequenceClassification (Mistral model)
- MobileBertConfig configuration class: TFMobileBertForSequenceClassification (MobileBERT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
- RemBertConfig configuration class: TFRemBertForSequenceClassification (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: TFRobertaForSequenceClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
- TapasConfig configuration class: TFTapasForSequenceClassification (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLForSequenceClassification (Transformer-XL model)
- XLMConfig configuration class: TFXLMForSequenceClassification (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForSequenceClassification (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
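The selection rule described above (configuration class in, model class out, no weights loaded) can be sketched with a plain dictionary. Every name here (`BertLikeConfig`, `GptLikeConfig`, `CONFIG_TO_MODEL`, the standalone `from_config` function) is hypothetical and chosen for illustration; the library's actual registry is more elaborate.

```python
class BertLikeConfig:
    """Hypothetical configuration class."""

    def __init__(self, num_labels=2):
        self.num_labels = num_labels


class GptLikeConfig:
    """Second hypothetical configuration class."""

    def __init__(self, num_labels=2):
        self.num_labels = num_labels


class BertLikeForSequenceClassification:
    def __init__(self, config):
        self.config = config
        self.weights_loaded = False  # built from a config only, nothing pretrained


class GptLikeForSequenceClassification:
    def __init__(self, config):
        self.config = config
        self.weights_loaded = False


# Registry keyed by configuration class, in the spirit of the list above.
CONFIG_TO_MODEL = {
    BertLikeConfig: BertLikeForSequenceClassification,
    GptLikeConfig: GptLikeForSequenceClassification,
}


def from_config(config):
    """Pick the model class registered for the exact type of `config`."""
    model_class = CONFIG_TO_MODEL.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}.")
    return model_class(config)


model = from_config(GptLikeConfig(num_labels=3))
print(type(model).__name__, model.config.num_labels, model.weights_loaded)
```

The `weights_loaded` flag stays `False` to echo the note above: building from a configuration decides the architecture and its settings, not the parameter values.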

Examples

>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForSequenceClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

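Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.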
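The two kwargs behaviors described above can be made concrete with a toy loader. This is a sketch of the documented rule only: `ToyConfig`, `split_kwargs` and `load` are hypothetical names, and the real loading code handles many more cases.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, num_labels=2):
        self.output_attentions = output_attentions
        self.num_labels = num_labels


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes; return the leftovers."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the config attribute
        else:
            unused[key] = value  # forwarded to the model's __init__
    return unused


def load(config=None, **kwargs):
    """Return (config, model_kwargs) following the two documented cases."""
    if config is not None:
        # Explicit config: it is left untouched, kwargs go straight to the model.
        return config, kwargs
    # No config given: a default one stands in for the automatically loaded one.
    config = ToyConfig()
    return config, split_kwargs(config, kwargs)


auto_config, auto_model_kwargs = load(output_attentions=True, custom_flag=1)
given_config, given_model_kwargs = load(config=ToyConfig(), output_attentions=True)

print(auto_config.output_attentions, auto_model_kwargs)
print(given_config.output_attentions, given_model_kwargs)
```

In the second call the explicit configuration keeps `output_attentions=False`, which is the practical consequence of the "we assume all relevant updates have already been done" clause.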

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

albert — TFAlbertForSequenceClassification (ALBERT model)
bart — TFBartForSequenceClassification (BART model)
bert — TFBertForSequenceClassification (BERT model)
camembert — TFCamembertForSequenceClassification (CamemBERT model)
convbert — TFConvBertForSequenceClassification (ConvBERT model)
ctrl — TFCTRLForSequenceClassification (CTRL model)
deberta — TFDebertaForSequenceClassification (DeBERTa model)
deberta-v2 — TFDebertaV2ForSequenceClassification (DeBERTa-v2 model)
distilbert — TFDistilBertForSequenceClassification (DistilBERT model)
electra — TFElectraForSequenceClassification (ELECTRA model)
esm — TFEsmForSequenceClassification (ESM model)
flaubert — TFFlaubertForSequenceClassification (FlauBERT model)
funnel — TFFunnelForSequenceClassification (Funnel Transformer model)
gpt-sw3 — TFGPT2ForSequenceClassification (GPT-Sw3 model)
gpt2 — TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
gptj — TFGPTJForSequenceClassification (GPT-J model)
layoutlm — TFLayoutLMForSequenceClassification (LayoutLM model)
layoutlmv3 — TFLayoutLMv3ForSequenceClassification (LayoutLMv3 model)
longformer — TFLongformerForSequenceClassification (Longformer model)
mistral — TFMistralForSequenceClassification (Mistral model)
mobilebert — TFMobileBertForSequenceClassification (MobileBERT model)
mpnet — TFMPNetForSequenceClassification (MPNet model)
openai-gpt — TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
rembert — TFRemBertForSequenceClassification (RemBERT model)
roberta — TFRobertaForSequenceClassification (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForSequenceClassification (RoFormer model)
tapas — TFTapasForSequenceClassification (TAPAS model)
transfo-xl — TFTransfoXLForSequenceClassification (Transformer-XL model)
xlm — TFXLMForSequenceClassification (XLM model)
xlm-roberta — TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
xlnet — TFXLNetForSequenceClassification (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForSequenceClassification

class transformers.FlaxAutoModelForSequenceClassification

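( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForSequenceClassification (ALBERT model)
- BartConfig configuration class: FlaxBartForSequenceClassification (BART model)
- BertConfig configuration class: FlaxBertForSequenceClassification (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForSequenceClassification (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForSequenceClassification (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForSequenceClassification (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForSequenceClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForSequenceClassification (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples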

>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForSequenceClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

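Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.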
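The output_loading_info option above mentions missing keys and unexpected keys. One way to read those two terms is as set differences between the parameter names the model expects and the names found in the checkpoint. The sketch below illustrates that reading only; `loading_info` is a hypothetical helper, and the dictionary key names it returns are assumptions rather than a documented format.

```python
def loading_info(model_keys, checkpoint_keys):
    """Compare expected parameter names with those present in a checkpoint."""
    expected = set(model_keys)
    found = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
        # Placeholder for messages about problems met while loading.
        "error_msgs": [],
    }


info = loading_info(
    model_keys=["encoder.weight", "classifier.weight", "classifier.bias"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info["missing_keys"])
print(info["unexpected_keys"])
```

A freshly added classification head is a typical source of missing keys when the checkpoint was saved from a model without that head.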

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

albert — FlaxAlbertForSequenceClassification (ALBERT model)
bart — FlaxBartForSequenceClassification (BART model)
bert — FlaxBertForSequenceClassification (BERT model)
big_bird — FlaxBigBirdForSequenceClassification (BigBird model)
distilbert — FlaxDistilBertForSequenceClassification (DistilBERT model)
electra — FlaxElectraForSequenceClassification (ELECTRA model)
mbart — FlaxMBartForSequenceClassification (mBART model)
roberta — FlaxRobertaForSequenceClassification (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForSequenceClassification (RoFormer model)
xlm-roberta — FlaxXLMRobertaForSequenceClassification (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMultipleChoice

class transformers.AutoModelForMultipleChoice

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: BertForMultipleChoice (BERT model)
- BigBirdConfig configuration class: BigBirdForMultipleChoice (BigBird model)
- CamembertConfig configuration class: CamembertForMultipleChoice (CamemBERT model)
- CanineConfig configuration class: CanineForMultipleChoice (CANINE model)
- ConvBertConfig configuration class: ConvBertForMultipleChoice (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForMultipleChoice (Data2VecText model)
- DebertaV2Config configuration class: DebertaV2ForMultipleChoice (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: ElectraForMultipleChoice (ELECTRA model)
- ErnieConfig configuration class: ErnieForMultipleChoice (ERNIE model)
- ErnieMConfig configuration class: ErnieMForMultipleChoice (ErnieM model)
- FNetConfig configuration class: FNetForMultipleChoice (FNet model)
- FlaubertConfig configuration class: FlaubertForMultipleChoice (FlauBERT model)
- FunnelConfig configuration class: FunnelForMultipleChoice (Funnel Transformer model)
- IBertConfig configuration class: IBertForMultipleChoice (I-BERT model)
- LongformerConfig configuration class: LongformerForMultipleChoice (Longformer model)
- LukeConfig configuration class: LukeForMultipleChoice (LUKE model)
- MPNetConfig configuration class: MPNetForMultipleChoice (MPNet model)
- MegaConfig configuration class: MegaForMultipleChoice (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForMultipleChoice (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForMultipleChoice (MobileBERT model)
- MraConfig configuration class: MraForMultipleChoice (MRA model)
- NezhaConfig configuration class: NezhaForMultipleChoice (Nezha model)
- NystromformerConfig configuration class: NystromformerForMultipleChoice (Nyströmformer model)
- QDQBertConfig configuration class: QDQBertForMultipleChoice (QDQBert model)
- RemBertConfig configuration class: RemBertForMultipleChoice (RemBERT model)
- RoCBertConfig configuration class: RoCBertForMultipleChoice (RoCBert model)
- RoFormerConfig configuration class: RoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: RobertaForMultipleChoice (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
- SqueezeBertConfig configuration class: SqueezeBertForMultipleChoice (SqueezeBERT model)
- XLMConfig configuration class: XLMForMultipleChoice (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMultipleChoice (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMultipleChoice (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForMultipleChoice (XLNet model)
- XmodConfig configuration class: XmodForMultipleChoice (X-MOD model)
- YosoConfig configuration class: YosoForMultipleChoice (YOSO model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples

>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMultipleChoice.from_config(config)

from_pretrained

( *model_args **kwargs )

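Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.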
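One of the conditions listed above for automatic configuration loading is that a file named config.json exists in the local directory passed as pretrained_model_name_or_path. A minimal sketch of that check follows, using only the standard library; `find_local_config` is a hypothetical helper and the JSON content is made up for the demonstration.

```python
import json
import tempfile
from pathlib import Path


def find_local_config(path):
    """Return the parsed config.json from a local directory, or None if absent."""
    directory = Path(path)
    config_file = directory / "config.json"
    if directory.is_dir() and config_file.is_file():
        return json.loads(config_file.read_text(encoding="utf-8"))
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Empty directory: nothing to load automatically.
    before = find_local_config(tmp)

    # After a config.json is written, the same directory yields a configuration.
    sample = {"model_type": "bert", "num_labels": 3}
    (Path(tmp) / "config.json").write_text(json.dumps(sample), encoding="utf-8")
    after = find_local_config(tmp)

print(before, after)
```

When the lookup comes back empty, passing an explicit config argument is the documented alternative.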

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

albert — AlbertForMultipleChoice (ALBERT model)
bert — BertForMultipleChoice (BERT model)
big_bird — BigBirdForMultipleChoice (BigBird model)
camembert — CamembertForMultipleChoice (CamemBERT model)
canine — CanineForMultipleChoice (CANINE model)
convbert — ConvBertForMultipleChoice (ConvBERT model)
data2vec-text — Data2VecTextForMultipleChoice (Data2VecText model)
deberta-v2 — DebertaV2ForMultipleChoice (DeBERTa-v2 model)
distilbert — DistilBertForMultipleChoice (DistilBERT model)
electra — ElectraForMultipleChoice (ELECTRA model)
ernie — ErnieForMultipleChoice (ERNIE model)
ernie_m — ErnieMForMultipleChoice (ErnieM model)
flaubert — FlaubertForMultipleChoice (FlauBERT model)
fnet — FNetForMultipleChoice (FNet model)
funnel — FunnelForMultipleChoice (Funnel Transformer model)
ibert — IBertForMultipleChoice (I-BERT model)
longformer — LongformerForMultipleChoice (Longformer model)
luke — LukeForMultipleChoice (LUKE model)
mega — MegaForMultipleChoice (MEGA model)
megatron-bert — MegatronBertForMultipleChoice (Megatron-BERT model)
mobilebert — MobileBertForMultipleChoice (MobileBERT model)
mpnet — MPNetForMultipleChoice (MPNet model)
mra — MraForMultipleChoice (MRA model)
nezha — NezhaForMultipleChoice (Nezha model)
nystromformer — NystromformerForMultipleChoice (Nyströmformer model)
qdqbert — QDQBertForMultipleChoice (QDQBert model)
rembert — RemBertForMultipleChoice (RemBERT model)
roberta — RobertaForMultipleChoice (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForMultipleChoice (RoCBert model)
roformer — RoFormerForMultipleChoice (RoFormer model)
squeezebert — SqueezeBertForMultipleChoice (SqueezeBERT model)
xlm — XLMForMultipleChoice (XLM model)
xlm-roberta — XLMRobertaForMultipleChoice (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForMultipleChoice (XLM-RoBERTa-XL model)
xlnet — XLNetForMultipleChoice (XLNet model)
xmod — XmodForMultipleChoice (X-MOD model)
yoso — YosoForMultipleChoice (YOSO model)

By default, the model is put in evaluation mode with model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMultipleChoice.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMultipleChoice

class transformers.TFAutoModelForMultipleChoice

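( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: TFBertForMultipleChoice (BERT model)
- CamembertConfig configuration class: TFCamembertForMultipleChoice (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForMultipleChoice (ConvBERT model)
- DebertaV2Config configuration class: TFDebertaV2ForMultipleChoice (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: TFElectraForMultipleChoice (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertForMultipleChoice (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForMultipleChoice (Funnel Transformer model)
- LongformerConfig configuration class: TFLongformerForMultipleChoice (Longformer model)
- MPNetConfig configuration class: TFMPNetForMultipleChoice (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForMultipleChoice (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForMultipleChoice (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: TFRobertaForMultipleChoice (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
- XLMConfig configuration class: TFXLMForMultipleChoice (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForMultipleChoice (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples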

>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMultipleChoice.from_config(config)

from_pretrained

( *model_args **kwargs )

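Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.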
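The proxies example above mixes a protocol key ('http') with an endpoint key ('http://hostname'). A plausible reading is that the most specific key wins, which is how the Requests library documents per-host proxies. The sketch below illustrates that rule under this assumption; `select_proxy` is a hypothetical helper written for this page, and whether the download backend in use resolves keys the same way is not confirmed here.

```python
from urllib.parse import urlparse


def select_proxy(url, proxies):
    """Return the proxy for `url`, preferring 'scheme://host' over 'scheme'."""
    parts = urlparse(url)
    candidates = [f"{parts.scheme}://{parts.hostname}", parts.scheme]
    for key in candidates:
        if key in proxies:
            return proxies[key]
    return None  # no proxy configured for this URL


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}

endpoint_match = select_proxy("http://hostname/model.bin", proxies)
protocol_match = select_proxy("http://other.example/model.bin", proxies)
no_match = select_proxy("https://hostname/model.bin", proxies)

print(endpoint_match, protocol_match, no_match)
```

Note that the dictionary in the example has no 'https' entry, so under this rule HTTPS requests would not be proxied unless such a key is added.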

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

albert — TFAlbertForMultipleChoice (ALBERT model)
bert — TFBertForMultipleChoice (BERT model)
camembert — TFCamembertForMultipleChoice (CamemBERT model)
convbert — TFConvBertForMultipleChoice (ConvBERT model)
deberta-v2 — TFDebertaV2ForMultipleChoice (DeBERTa-v2 model)
distilbert — TFDistilBertForMultipleChoice (DistilBERT model)
electra — TFElectraForMultipleChoice (ELECTRA model)
flaubert — TFFlaubertForMultipleChoice (FlauBERT model)
funnel — TFFunnelForMultipleChoice (Funnel Transformer model)
longformer — TFLongformerForMultipleChoice (Longformer model)
mobilebert — TFMobileBertForMultipleChoice (MobileBERT model)
mpnet — TFMPNetForMultipleChoice (MPNet model)
rembert — TFRemBertForMultipleChoice (RemBERT model)
roberta — TFRobertaForMultipleChoice (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForMultipleChoice (RoFormer model)
xlm — TFXLMForMultipleChoice (XLM model)
xlm-roberta — TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
xlnet — TFXLNetForMultipleChoice (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForMultipleChoice

class transformers.FlaxAutoModelForMultipleChoice

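( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: FlaxBertForMultipleChoice (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForMultipleChoice (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForMultipleChoice (ELECTRA model)
- RoFormerConfig configuration class: FlaxRoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMultipleChoice (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMultipleChoice (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples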

>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMultipleChoice.from_config(config)
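
The selection rule described for from_config() can be pictured as a lookup keyed by configuration class. The following is a minimal sketch of that idea only; every class in it (ToyBertConfig, ToyRobertaConfig, ToyBertForMultipleChoice, ToyRobertaForMultipleChoice, ToyAutoModelForMultipleChoice) is a hypothetical stand-in and not part of Transformers, and the library's actual dispatch code may differ.

```python
# Hypothetical stand-ins for configuration and model classes (not the real library classes).
class ToyBertConfig:
    model_type = "bert"


class ToyRobertaConfig:
    model_type = "roberta"


class ToyBertForMultipleChoice:
    def __init__(self, config):
        self.config = config


class ToyRobertaForMultipleChoice:
    def __init__(self, config):
        self.config = config


# Registry keyed by configuration class, in the spirit of the list above.
MODEL_MAPPING = {
    ToyBertConfig: ToyBertForMultipleChoice,
    ToyRobertaConfig: ToyRobertaForMultipleChoice,
}


class ToyAutoModelForMultipleChoice:
    def __init__(self):
        # Direct construction is rejected, as the class description states.
        raise EnvironmentError("Use from_config(config) instead of __init__().")

    @classmethod
    def from_config(cls, config):
        model_class = MODEL_MAPPING.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}.")
        # Only the architecture is chosen here; no pretrained weights are involved.
        return model_class(config)


model = ToyAutoModelForMultipleChoice.from_config(ToyBertConfig())
print(type(model).__name__)
```

Keeping the mapping in one dictionary means a new architecture only needs a new entry, which is one plausible reason a registry-style design suits this kind of auto class.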

from_pretrained

( *model_args **kwargs )

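Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model ID of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than first converting the PyTorch model with the provided conversion scripts and then loading the converted model.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model ID string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a multiple choice head) from a pretrained model.

albert — FlaxAlbertForMultipleChoice (ALBERT model)
bert — FlaxBertForMultipleChoice (BERT model)
big_bird — FlaxBigBirdForMultipleChoice (BigBird model)
distilbert — FlaxDistilBertForMultipleChoice (DistilBERT model)
electra — FlaxElectraForMultipleChoice (ELECTRA model)
roberta — FlaxRobertaForMultipleChoice (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForMultipleChoice (RoFormer model)
xlm-roberta — FlaxXLMRobertaForMultipleChoice (XLM-RoBERTa model)

Example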

>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
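
The two kwargs cases described in the parameter list above can be sketched in plain Python. This is an illustration of the documented behaviour only, under the assumption that "corresponds to a configuration attribute" means the configuration object already has an attribute with that name; ToyConfig, split_kwargs, and toy_from_pretrained are hypothetical names and not library APIs.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, hidden_size=8):
        self.output_attentions = output_attentions
        self.hidden_size = hidden_size


def split_kwargs(config, kwargs):
    """Override matching config attributes and return the leftover kwargs."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # key names a config attribute: override it
        else:
            remaining[key] = value  # everything else is meant for the model's __init__
    return remaining


def toy_from_pretrained(config=None, **kwargs):
    if config is not None:
        # Config supplied by the caller: kwargs are forwarded to the model untouched.
        return config, kwargs
    config = ToyConfig()  # stands in for the automatically loaded configuration
    return config, split_kwargs(config, kwargs)


# No config supplied: output_attentions updates the config, custom_flag is left over.
auto_config, auto_model_kwargs = toy_from_pretrained(output_attentions=True, custom_flag=1)
print(auto_config.output_attentions, auto_model_kwargs)

# Config supplied: the config is left as-is and all kwargs go to the model.
given_config, given_model_kwargs = toy_from_pretrained(config=ToyConfig(), output_attentions=True)
print(given_config.output_attentions, given_model_kwargs)
```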

AutoModelForNextSentencePrediction

class transformers.AutoModelForNextSentencePrediction

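( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: BertForNextSentencePrediction (BERT model)
- ErnieConfig configuration class: ErnieForNextSentencePrediction (ERNIE model)
- FNetConfig configuration class: FNetForNextSentencePrediction (FNet model)
- MegatronBertConfig configuration class: MegatronBertForNextSentencePrediction (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForNextSentencePrediction (MobileBERT model)
- NezhaConfig configuration class: NezhaForNextSentencePrediction (Nezha model)
- QDQBertConfig configuration class: QDQBertForNextSentencePrediction (QDQBert model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example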

>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForNextSentencePrediction.from_config(config)

from_pretrained

( *model_args **kwargs )

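Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model ID of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model ID string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

bert — BertForNextSentencePrediction (BERT model)
ernie — ErnieForNextSentencePrediction (ERNIE model)
fnet — FNetForNextSentencePrediction (FNet model)
megatron-bert — MegatronBertForNextSentencePrediction (Megatron-BERT model)
mobilebert — MobileBertForNextSentencePrediction (MobileBERT model)
nezha — NezhaForNextSentencePrediction (Nezha model)
qdqbert — QDQBertForNextSentencePrediction (QDQBert model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Example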

>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForNextSentencePrediction.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
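
The note above says the returned model is in evaluation mode, with dropout deactivated, until model.train() is called. The toy below is a plain-Python sketch of what that flag changes; ToyDropout and ToyModel are hypothetical classes that only imitate the eval()/train() switch and are not PyTorch or Transformers code.

```python
import random


class ToyDropout:
    """Hypothetical dropout layer with a PyTorch-like `training` flag."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation mode: values pass through unchanged
        scale = 1.0 / (1.0 - self.p)
        # Training mode: each value is zeroed with probability p, survivors are rescaled.
        return [0.0 if random.random() < self.p else v * scale for v in values]


class ToyModel:
    def __init__(self):
        self.dropout = ToyDropout(p=0.5)

    def eval(self):
        self.dropout.training = False
        return self

    def train(self):
        self.dropout.training = True
        return self


model = ToyModel().eval()  # the state a freshly loaded model is described as being in
eval_output = model.dropout([1.0, 2.0, 3.0])
print(eval_output)  # deterministic in evaluation mode

model.train()  # switch back before fine-tuning
train_output = model.dropout([1.0, 2.0, 3.0])
print(train_output)  # random: entries are either 0.0 or twice the input
```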

TFAutoModelForNextSentencePrediction

class transformers.TFAutoModelForNextSentencePrediction

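( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: TFBertForNextSentencePrediction (BERT model)
- MobileBertConfig configuration class: TFMobileBertForNextSentencePrediction (MobileBERT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example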

>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForNextSentencePrediction.from_config(config)

from_pretrained

( *model_args **kwargs )

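Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model ID of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model ID string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

bert — TFBertForNextSentencePrediction (BERT model)
mobilebert — TFMobileBertForNextSentencePrediction (MobileBERT model)

Example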

>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
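
The output_loading_info parameter above asks for a dictionary of missing keys, unexpected keys, and error messages. One way to picture such a report is a set comparison between the parameter names the model expects and those found in the checkpoint. The sketch below assumes exactly that; toy_loading_info, the dictionary key names, and the parameter names are illustrative choices, not confirmed details of the library's implementation.

```python
def toy_loading_info(expected_keys, checkpoint_keys):
    """Compare expected parameter names with the names found in a checkpoint."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    return {
        "missing_keys": sorted(expected - found),     # expected by the model, absent from the checkpoint
        "unexpected_keys": sorted(found - expected),  # present in the checkpoint, unused by the model
        "error_msgs": [],                             # left empty in this sketch
    }


# Hypothetical parameter names for a model with a freshly added classification head.
model_keys = ["bert.embeddings.weight", "classifier.weight", "classifier.bias"]
checkpoint_keys = ["bert.embeddings.weight", "cls.predictions.bias"]

info = toy_loading_info(model_keys, checkpoint_keys)
print(info["missing_keys"])
print(info["unexpected_keys"])
```

Missing keys usually point at layers that will start from random initialization, while unexpected keys point at checkpoint weights the chosen architecture does not use.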

FlaxAutoModelForNextSentencePrediction

class transformers.FlaxAutoModelForNextSentencePrediction

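( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: FlaxBertForNextSentencePrediction (BERT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example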

>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForNextSentencePrediction.from_config(config)

from_pretrained

( *model_args **kwargs )

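Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model ID of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than first converting the PyTorch model with the provided conversion scripts and then loading the converted model.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model ID string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

bert — FlaxBertForNextSentencePrediction (BERT model)

Example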

>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
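
The proxies parameter above accepts keys that name either a protocol ('http') or a specific endpoint ('http://hostname'). The sketch below shows one plausible way such a dictionary could be resolved for a given URL, trying the endpoint-specific key before the protocol-wide one. The resolution order and the select_proxy helper are assumptions made for illustration; the source does not describe how the underlying HTTP client picks an entry.

```python
from urllib.parse import urlparse


def select_proxy(url, proxies):
    """Return the proxy for `url`, preferring an endpoint-specific entry (assumed order)."""
    parts = urlparse(url)
    # Most specific key first: "scheme://host", then the bare scheme.
    for key in (f"{parts.scheme}://{parts.hostname}", parts.scheme):
        if key in proxies:
            return proxies[key]
    return None  # no matching entry: connect directly


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}

endpoint_proxy = select_proxy("http://hostname/model.bin", proxies)
protocol_proxy = select_proxy("http://other.example/model.bin", proxies)
no_proxy = select_proxy("https://huggingface.co/some/file", proxies)
print(endpoint_proxy, protocol_proxy, no_proxy)
```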

AutoModelForTokenClassification

class transformers.AutoModelForTokenClassification

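( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class: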
- AlbertConfig configuration class: AlbertForTokenClassification (ALBERT model)
- ArceeConfig configuration class: ArceeForTokenClassification (Arcee model)
- BertConfig configuration class: BertForTokenClassification (BERT model)
- BigBirdConfig configuration class: BigBirdForTokenClassification (BigBird model)
- BioGptConfig configuration class: BioGptForTokenClassification (BioGpt model)
- BloomConfig configuration class: BloomForTokenClassification (BLOOM model)
- BrosConfig configuration class: BrosForTokenClassification (BROS model)
- CamembertConfig configuration class: CamembertForTokenClassification (CamemBERT model)
- CanineConfig configuration class: CanineForTokenClassification (CANINE model)
- ConvBertConfig configuration class: ConvBertForTokenClassification (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForTokenClassification (Data2VecText model)
- DebertaConfig configuration class: DebertaForTokenClassification (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForTokenClassification (DeBERTa-v2 model)
- DiffLlamaConfig configuration class: DiffLlamaForTokenClassification (DiffLlama model)
- DistilBertConfig configuration class: DistilBertForTokenClassification (DistilBERT model)
- ElectraConfig configuration class: ElectraForTokenClassification (ELECTRA model)
- ErnieConfig configuration class: ErnieForTokenClassification (ERNIE model)
- ErnieMConfig configuration class: ErnieMForTokenClassification (ErnieM model)
- EsmConfig configuration class: EsmForTokenClassification (ESM model)
- FNetConfig configuration class: FNetForTokenClassification (FNet model)
- FalconConfig configuration class: FalconForTokenClassification (Falcon model)
- FlaubertConfig configuration class: FlaubertForTokenClassification (FlauBERT model)
- FunnelConfig configuration class: FunnelForTokenClassification (Funnel Transformer model)
- GPT2Config configuration class: GPT2ForTokenClassification (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeForTokenClassification (GPTBigCode model)
- GPTNeoConfig configuration class: GPTNeoForTokenClassification (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForTokenClassification (GPT NeoX model)
- Gemma2Config configuration class: Gemma2ForTokenClassification (Gemma2 model)
- GemmaConfig configuration class: GemmaForTokenClassification (Gemma model)
- Glm4Config configuration class: Glm4ForTokenClassification (GLM4 model)
- GlmConfig configuration class: GlmForTokenClassification (GLM model)
- HeliumConfig configuration class: HeliumForTokenClassification (Helium model)
- IBertConfig configuration class: IBertForTokenClassification (I-BERT model)
- LayoutLMConfig configuration class: LayoutLMForTokenClassification (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2ForTokenClassification (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForTokenClassification (LayoutLMv3 model)
- LiltConfig configuration class: LiltForTokenClassification (LiLT model)
- LlamaConfig configuration class: LlamaForTokenClassification (LLaMA model)
- LongformerConfig configuration class: LongformerForTokenClassification (Longformer model)
- LukeConfig configuration class: LukeForTokenClassification (LUKE model)
- MPNetConfig configuration class: MPNetForTokenClassification (MPNet model)
- MT5Config configuration class: MT5ForTokenClassification (MT5 model)
- MarkupLMConfig configuration class: MarkupLMForTokenClassification (MarkupLM model)
- MegaConfig configuration class: MegaForTokenClassification (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForTokenClassification (Megatron-BERT model)
- MiniMaxConfig configuration class: MiniMaxForTokenClassification (MiniMax model)
- MistralConfig configuration class: MistralForTokenClassification (Mistral model)
- MixtralConfig configuration class: MixtralForTokenClassification (Mixtral model)
- MobileBertConfig configuration class: MobileBertForTokenClassification (MobileBERT model)
- ModernBertConfig configuration class: ModernBertForTokenClassification (ModernBERT model)
- MptConfig configuration class: MptForTokenClassification (MPT model)
- MraConfig configuration class: MraForTokenClassification (MRA model)
- NemotronConfig configuration class: NemotronForTokenClassification (Nemotron model)
- NezhaConfig configuration class: NezhaForTokenClassification (Nezha model)
- NystromformerConfig configuration class: NystromformerForTokenClassification (Nyströmformer model)
- PersimmonConfig configuration class: PersimmonForTokenClassification (Persimmon model)
- Phi3Config configuration class: Phi3ForTokenClassification (Phi3 model)
- PhiConfig configuration class: PhiForTokenClassification (Phi model)
- QDQBertConfig configuration class: QDQBertForTokenClassification (QDQBert model)
- Qwen2Config configuration class: Qwen2ForTokenClassification (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeForTokenClassification (Qwen2MoE model)
- Qwen3Config configuration class: Qwen3ForTokenClassification (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeForTokenClassification (Qwen3MoE model)
- RemBertConfig configuration class: RemBertForTokenClassification (RemBERT model)
- RoCBertConfig configuration class: RoCBertForTokenClassification (RoCBert model)
- RoFormerConfig configuration class: RoFormerForTokenClassification (RoFormer model)
- RobertaConfig configuration class: RobertaForTokenClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
- SmolLM3Config configuration class: SmolLM3ForTokenClassification (SmolLM3 model)
- SqueezeBertConfig configuration class: SqueezeBertForTokenClassification (SqueezeBERT model)
- StableLmConfig configuration class: StableLmForTokenClassification (StableLm model)
- Starcoder2Config configuration class: Starcoder2ForTokenClassification (Starcoder2 model)
- T5Config configuration class: T5ForTokenClassification (T5 model)
- T5GemmaConfig configuration class: T5GemmaForTokenClassification (T5Gemma model)
- UMT5Config configuration class: UMT5ForTokenClassification (UMT5 model)
- XLMConfig configuration class: XLMForTokenClassification (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForTokenClassification (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForTokenClassification (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForTokenClassification (XLNet model)
- XmodConfig configuration class: XmodForTokenClassification (X-MOD model)
- YosoConfig configuration class: YosoForTokenClassification (YOSO model)
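attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example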

>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTokenClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

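Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model ID of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model ID string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit ID. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.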

Instantiates one of the model classes of the library (with a token classification head) from a pretrained model.

albert — AlbertForTokenClassification (ALBERT model)
arcee — ArceeForTokenClassification (Arcee model)
bert — BertForTokenClassification (BERT model)
big_bird — BigBirdForTokenClassification (BigBird model)
biogpt — BioGptForTokenClassification (BioGpt model)
bloom — BloomForTokenClassification (BLOOM model)
bros — BrosForTokenClassification (BROS model)
camembert — CamembertForTokenClassification (CamemBERT model)
canine — CanineForTokenClassification (CANINE model)
convbert — ConvBertForTokenClassification (ConvBERT model)
data2vec-text — Data2VecTextForTokenClassification (Data2VecText model)
deberta — DebertaForTokenClassification (DeBERTa model)
deberta-v2 — DebertaV2ForTokenClassification (DeBERTa-v2 model)
diffllama — DiffLlamaForTokenClassification (DiffLlama model)
distilbert — DistilBertForTokenClassification (DistilBERT model)
electra — ElectraForTokenClassification (ELECTRA model)
ernie — ErnieForTokenClassification (ERNIE model)
ernie_m — ErnieMForTokenClassification (ErnieM model)
esm — EsmForTokenClassification (ESM model)
falcon — FalconForTokenClassification (Falcon model)
flaubert — FlaubertForTokenClassification (FlauBERT model)
fnet — FNetForTokenClassification (FNet model)
funnel — FunnelForTokenClassification (Funnel Transformer model)
gemma — GemmaForTokenClassification (Gemma model)
gemma2 — Gemma2ForTokenClassification (Gemma2 model)
glm — GlmForTokenClassification (GLM model)
glm4 — Glm4ForTokenClassification (GLM4 model)
gpt-sw3 — GPT2ForTokenClassification (GPT-Sw3 model)
gpt2 — GPT2ForTokenClassification (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForTokenClassification (GPTBigCode model)
gpt_neo — GPTNeoForTokenClassification (GPT Neo model)
gpt_neox — GPTNeoXForTokenClassification (GPT NeoX model)
helium — HeliumForTokenClassification (Helium model)
ibert — IBertForTokenClassification (I-BERT model)
layoutlm — LayoutLMForTokenClassification (LayoutLM model)
layoutlmv2 — LayoutLMv2ForTokenClassification (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ForTokenClassification (LayoutLMv3 model)
lilt — LiltForTokenClassification (LiLT model)
llama — LlamaForTokenClassification (LLaMA model)
longformer — LongformerForTokenClassification (Longformer model)
luke — LukeForTokenClassification (LUKE model)
markuplm — MarkupLMForTokenClassification (MarkupLM model)
mega — MegaForTokenClassification (MEGA model)
megatron-bert — MegatronBertForTokenClassification (Megatron-BERT model)
minimax — MiniMaxForTokenClassification (MiniMax model)
mistral — MistralForTokenClassification (Mistral model)
mixtral — MixtralForTokenClassification (Mixtral model)
mobilebert — MobileBertForTokenClassification (MobileBERT model)
modernbert — ModernBertForTokenClassification (ModernBERT model)
mpnet — MPNetForTokenClassification (MPNet model)
mpt — MptForTokenClassification (MPT model)
mra — MraForTokenClassification (MRA model)
mt5 — MT5ForTokenClassification (MT5 model)
nemotron — NemotronForTokenClassification (Nemotron model)
nezha — NezhaForTokenClassification (Nezha model)
nystromformer — NystromformerForTokenClassification (Nyströmformer model)
persimmon — PersimmonForTokenClassification (Persimmon model)
phi — PhiForTokenClassification (Phi model)
phi3 — Phi3ForTokenClassification (Phi3 model)
qdqbert — QDQBertForTokenClassification (QDQBert model)
qwen2 — Qwen2ForTokenClassification (Qwen2 model)
qwen2_moe — Qwen2MoeForTokenClassification (Qwen2MoE model)
qwen3 — Qwen3ForTokenClassification (Qwen3 model)
qwen3_moe — Qwen3MoeForTokenClassification (Qwen3MoE model)
rembert — RemBertForTokenClassification (RemBERT model)
roberta — RobertaForTokenClassification (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForTokenClassification (RoCBert model)
roformer — RoFormerForTokenClassification (RoFormer model)
smollm3 — SmolLM3ForTokenClassification (SmolLM3 model)
squeezebert — SqueezeBertForTokenClassification (SqueezeBERT model)
stablelm — StableLmForTokenClassification (StableLm model)
starcoder2 — Starcoder2ForTokenClassification (Starcoder2 model)
t5 — T5ForTokenClassification (T5 model)
t5gemma — T5GemmaForTokenClassification (T5Gemma model)
umt5 — UMT5ForTokenClassification (UMT5 model)
xlm — XLMForTokenClassification (XLM model)
xlm-roberta — XLMRobertaForTokenClassification (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForTokenClassification (XLM-RoBERTa-XL model)
xlnet — XLNetForTokenClassification (XLNet model)
xmod — XmodForTokenClassification (X-MOD model)
yoso — YosoForTokenClassification (YOSO model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Example

>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForTokenClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForTokenClassification

class transformers.TFAutoModelForTokenClassification

（ *args **kwargs ）

這是一個通用的模型類，當使用 from_pretrained() 類方法或 from_config() 類方法建立時，它將被例項化為庫中的某個模型類（帶有詞元分類頭）。

這個類不能直接使用 __init__() 進行例項化（會丟擲錯誤）。

from_config

（ **kwargs ）

引數

config (PretrainedConfig) — 要例項化的模型類是根據配置類選擇的：
- AlbertConfig 配置類：TFAlbertForTokenClassification (ALBERT 模型)
- BertConfig 配置類：TFBertForTokenClassification (BERT 模型)
- CamembertConfig 配置類：TFCamembertForTokenClassification (CamemBERT 模型)
- ConvBertConfig 配置類：TFConvBertForTokenClassification (ConvBERT 模型)
- DebertaConfig 配置類：TFDebertaForTokenClassification (DeBERTa 模型)
- DebertaV2Config 配置類：TFDebertaV2ForTokenClassification (DeBERTa-v2 模型)
- DistilBertConfig 配置類：TFDistilBertForTokenClassification (DistilBERT 模型)
- ElectraConfig 配置類：TFElectraForTokenClassification (ELECTRA 模型)
- EsmConfig 配置類：TFEsmForTokenClassification (ESM 模型)
- FlaubertConfig 配置類：TFFlaubertForTokenClassification (FlauBERT 模型)
- FunnelConfig 配置類：TFFunnelForTokenClassification (Funnel Transformer 模型)
- LayoutLMConfig 配置類：TFLayoutLMForTokenClassification (LayoutLM 模型)
- LayoutLMv3Config 配置類：TFLayoutLMv3ForTokenClassification (LayoutLMv3 模型)
- LongformerConfig 配置類：TFLongformerForTokenClassification (Longformer 模型)
- MPNetConfig 配置類：TFMPNetForTokenClassification (MPNet 模型)
- MobileBertConfig 配置類：TFMobileBertForTokenClassification (MobileBERT 模型)
- RemBertConfig 配置類：TFRemBertForTokenClassification (RemBERT 模型)
- RoFormerConfig 配置類：TFRoFormerForTokenClassification (RoFormer 模型)
- RobertaConfig 配置類：TFRobertaForTokenClassification (RoBERTa 模型)
- RobertaPreLayerNormConfig 配置類：TFRobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm 模型)
- XLMConfig 配置類：TFXLMForTokenClassification (XLM 模型)
- XLMRobertaConfig 配置類：TFXLMRobertaForTokenClassification (XLM-RoBERTa 模型)
- XLNetConfig 配置類：TFXLNetForTokenClassification (XLNet 模型)
attn_implementation (str, 可選) — 在模型中使用的注意力實現（如果相關）。可以是 "eager"（注意力的手動實現）、"sdpa"（使用 F.scaled_dot_product_attention），或 "flash_attention_2"（使用 Dao-AILab/flash-attention）。預設情況下，如果可用，對於 torch>=2.1.1 將使用 SDPA。否則，預設是手動的 "eager" 實現。

根據配置例項化庫中的某個模型類（帶有詞元分類頭）。

注意：從其配置檔案載入模型並不會載入模型權重。它隻影響模型的配置。請使用 from_pretrained() 來載入模型權重。

Examples

>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForTokenClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

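Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been done already).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.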
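
The two kwargs cases above can be sketched with plain Python. This is an illustration of the described routing only, under the assumption that "corresponds to a configuration attribute" means the configuration object already has an attribute of that name; `ToyConfig` and `prepare_init_args` are hypothetical names, not library APIs.

```python
class ToyConfig:
    """Hypothetical configuration object with two known attributes."""

    def __init__(self):
        self.output_attentions = False
        self.num_labels = 2


def prepare_init_args(config=None, **kwargs):
    """Return (config, model_kwargs) following the two documented cases."""
    if config is not None:
        # Case 1: the caller supplied a config, so it is left untouched and
        # every keyword argument goes straight to the model's __init__.
        return config, dict(kwargs)

    # Case 2: no config supplied. ToyConfig() stands in for the
    # automatically loaded configuration.
    config = ToyConfig()
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the configuration attribute
        else:
            model_kwargs[key] = value    # forwarded to the model's __init__
    return config, model_kwargs


config, model_kwargs = prepare_init_args(output_attentions=True, custom_flag=1)
print(config.output_attentions, model_kwargs)
```

Here `output_attentions` lands on the configuration, while the unknown `custom_flag` is left for the model constructor.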

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

albert — TFAlbertForTokenClassification (ALBERT model)
bert — TFBertForTokenClassification (BERT model)
camembert — TFCamembertForTokenClassification (CamemBERT model)
convbert — TFConvBertForTokenClassification (ConvBERT model)
deberta — TFDebertaForTokenClassification (DeBERTa model)
deberta-v2 — TFDebertaV2ForTokenClassification (DeBERTa-v2 model)
distilbert — TFDistilBertForTokenClassification (DistilBERT model)
electra — TFElectraForTokenClassification (ELECTRA model)
esm — TFEsmForTokenClassification (ESM model)
flaubert — TFFlaubertForTokenClassification (FlauBERT model)
funnel — TFFunnelForTokenClassification (Funnel Transformer model)
layoutlm — TFLayoutLMForTokenClassification (LayoutLM model)
layoutlmv3 — TFLayoutLMv3ForTokenClassification (LayoutLMv3 model)
longformer — TFLongformerForTokenClassification (Longformer model)
mobilebert — TFMobileBertForTokenClassification (MobileBERT model)
mpnet — TFMPNetForTokenClassification (MPNet model)
rembert — TFRemBertForTokenClassification (RemBERT model)
roberta — TFRobertaForTokenClassification (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForTokenClassification (RoFormer model)
xlm — TFXLMForTokenClassification (XLM model)
xlm-roberta — TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
xlnet — TFXLNetForTokenClassification (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForTokenClassification

class transformers.FlaxAutoModelForTokenClassification

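( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForTokenClassification (ALBERT model)
- BertConfig configuration class: FlaxBertForTokenClassification (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForTokenClassification (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForTokenClassification (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForTokenClassification (ELECTRA model)
- RoFormerConfig configuration class: FlaxRoFormerForTokenClassification (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForTokenClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForTokenClassification (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.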

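Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples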

>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForTokenClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

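Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been done already).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.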
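
To make the output_loading_info description above more concrete, the sketch below shows one way "missing" and "unexpected" keys could be derived by comparing parameter names. It is a toy comparison under stated assumptions: the function name `compare_keys`, the parameter names, and the exact dictionary keys are illustrative and not taken from the library.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Compare the parameter names a model expects with those found in a checkpoint."""
    expected = set(model_keys)
    found = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
        # Left empty in this sketch; a real loader could record failures here.
        "error_msgs": [],
    }


info = compare_keys(
    model_keys=["encoder.weight", "classifier.weight", "classifier.bias"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info)
```

A freshly added classification head typically shows up on the "missing" side of such a report, because the base checkpoint never contained it.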

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

albert — FlaxAlbertForTokenClassification (ALBERT model)
bert — FlaxBertForTokenClassification (BERT model)
big_bird — FlaxBigBirdForTokenClassification (BigBird model)
distilbert — FlaxDistilBertForTokenClassification (DistilBERT model)
electra — FlaxElectraForTokenClassification (ELECTRA model)
roberta — FlaxRobertaForTokenClassification (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForTokenClassification (RoFormer model)
xlm-roberta — FlaxXLMRobertaForTokenClassification (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForQuestionAnswering

class transformers.AutoModelForQuestionAnswering

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForQuestionAnswering (ALBERT model)
- ArceeConfig configuration class: ArceeForQuestionAnswering (Arcee model)
- BartConfig configuration class: BartForQuestionAnswering (BART model)
- BertConfig configuration class: BertForQuestionAnswering (BERT model)
- BigBirdConfig configuration class: BigBirdForQuestionAnswering (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForQuestionAnswering (BigBird-Pegasus model)
- BloomConfig configuration class: BloomForQuestionAnswering (BLOOM model)
- CamembertConfig configuration class: CamembertForQuestionAnswering (CamemBERT model)
- CanineConfig configuration class: CanineForQuestionAnswering (CANINE model)
- ConvBertConfig configuration class: ConvBertForQuestionAnswering (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForQuestionAnswering (Data2VecText model)
- DebertaConfig configuration class: DebertaForQuestionAnswering (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
- DiffLlamaConfig configuration class: DiffLlamaForQuestionAnswering (DiffLlama model)
- DistilBertConfig configuration class: DistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: ElectraForQuestionAnswering (ELECTRA model)
- ErnieConfig configuration class: ErnieForQuestionAnswering (ERNIE model)
- ErnieMConfig configuration class: ErnieMForQuestionAnswering (ErnieM model)
- FNetConfig configuration class: FNetForQuestionAnswering (FNet model)
- FalconConfig configuration class: FalconForQuestionAnswering (Falcon model)
- FlaubertConfig configuration class: FlaubertForQuestionAnsweringSimple (FlauBERT model)
- FunnelConfig configuration class: FunnelForQuestionAnswering (Funnel Transformer model)
- GPT2Config configuration class: GPT2ForQuestionAnswering (OpenAI GPT-2 model)
- GPTJConfig configuration class: GPTJForQuestionAnswering (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoForQuestionAnswering (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForQuestionAnswering (GPT NeoX model)
- IBertConfig configuration class: IBertForQuestionAnswering (I-BERT model)
- LEDConfig configuration class: LEDForQuestionAnswering (LED model)
- LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
- LiltConfig configuration class: LiltForQuestionAnswering (LiLT model)
- LlamaConfig configuration class: LlamaForQuestionAnswering (LLaMA model)
- LongformerConfig configuration class: LongformerForQuestionAnswering (Longformer model)
- LukeConfig configuration class: LukeForQuestionAnswering (LUKE model)
- LxmertConfig configuration class: LxmertForQuestionAnswering (LXMERT model)
- MBartConfig configuration class: MBartForQuestionAnswering (mBART model)
- MPNetConfig configuration class: MPNetForQuestionAnswering (MPNet model)
- MT5Config configuration class: MT5ForQuestionAnswering (MT5 model)
- MarkupLMConfig configuration class: MarkupLMForQuestionAnswering (MarkupLM model)
- MegaConfig configuration class: MegaForQuestionAnswering (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForQuestionAnswering (Megatron-BERT model)
- MiniMaxConfig configuration class: MiniMaxForQuestionAnswering (MiniMax model)
- MistralConfig configuration class: MistralForQuestionAnswering (Mistral model)
- MixtralConfig configuration class: MixtralForQuestionAnswering (Mixtral model)
- MobileBertConfig configuration class: MobileBertForQuestionAnswering (MobileBERT model)
- ModernBertConfig configuration class: ModernBertForQuestionAnswering (ModernBERT model)
- MptConfig configuration class: MptForQuestionAnswering (MPT model)
- MraConfig configuration class: MraForQuestionAnswering (MRA model)
- MvpConfig configuration class: MvpForQuestionAnswering (MVP model)
- NemotronConfig configuration class: NemotronForQuestionAnswering (Nemotron model)
- NezhaConfig configuration class: NezhaForQuestionAnswering (Nezha model)
- NystromformerConfig configuration class: NystromformerForQuestionAnswering (Nyströmformer model)
- OPTConfig configuration class: OPTForQuestionAnswering (OPT model)
- QDQBertConfig configuration class: QDQBertForQuestionAnswering (QDQBert model)
- Qwen2Config configuration class: Qwen2ForQuestionAnswering (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeForQuestionAnswering (Qwen2MoE model)
- Qwen3Config configuration class: Qwen3ForQuestionAnswering (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeForQuestionAnswering (Qwen3MoE model)
- ReformerConfig configuration class: ReformerForQuestionAnswering (Reformer model)
- RemBertConfig configuration class: RemBertForQuestionAnswering (RemBERT model)
- RoCBertConfig configuration class: RoCBertForQuestionAnswering (RoCBert model)
- RoFormerConfig configuration class: RoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: RobertaForQuestionAnswering (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
- SmolLM3Config configuration class: SmolLM3ForQuestionAnswering (SmolLM3 model)
- SplinterConfig configuration class: SplinterForQuestionAnswering (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertForQuestionAnswering (SqueezeBERT model)
- T5Config configuration class: T5ForQuestionAnswering (T5 model)
- UMT5Config configuration class: UMT5ForQuestionAnswering (UMT5 model)
- XLMConfig configuration class: XLMForQuestionAnsweringSimple (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForQuestionAnswering (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForQuestionAnsweringSimple (XLNet model)
- XmodConfig configuration class: XmodForQuestionAnswering (X-MOD model)
- YosoConfig configuration class: YosoForQuestionAnswering (YOSO model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

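Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples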

>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

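Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been done already).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.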

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

albert — AlbertForQuestionAnswering (ALBERT model)
arcee — ArceeForQuestionAnswering (Arcee model)
bart — BartForQuestionAnswering (BART model)
bert — BertForQuestionAnswering (BERT model)
big_bird — BigBirdForQuestionAnswering (BigBird model)
bigbird_pegasus — BigBirdPegasusForQuestionAnswering (BigBird-Pegasus model)
bloom — BloomForQuestionAnswering (BLOOM model)
camembert — CamembertForQuestionAnswering (CamemBERT model)
canine — CanineForQuestionAnswering (CANINE model)
convbert — ConvBertForQuestionAnswering (ConvBERT model)
data2vec-text — Data2VecTextForQuestionAnswering (Data2VecText model)
deberta — DebertaForQuestionAnswering (DeBERTa model)
deberta-v2 — DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
diffllama — DiffLlamaForQuestionAnswering (DiffLlama model)
distilbert — DistilBertForQuestionAnswering (DistilBERT model)
electra — ElectraForQuestionAnswering (ELECTRA model)
ernie — ErnieForQuestionAnswering (ERNIE model)
ernie_m — ErnieMForQuestionAnswering (ErnieM model)
falcon — FalconForQuestionAnswering (Falcon model)
flaubert — FlaubertForQuestionAnsweringSimple (FlauBERT model)
fnet — FNetForQuestionAnswering (FNet model)
funnel — FunnelForQuestionAnswering (Funnel Transformer model)
gpt2 — GPT2ForQuestionAnswering (OpenAI GPT-2 model)
gpt_neo — GPTNeoForQuestionAnswering (GPT Neo model)
gpt_neox — GPTNeoXForQuestionAnswering (GPT NeoX model)
gptj — GPTJForQuestionAnswering (GPT-J model)
ibert — IBertForQuestionAnswering (I-BERT model)
layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
led — LEDForQuestionAnswering (LED model)
lilt — LiltForQuestionAnswering (LiLT model)
llama — LlamaForQuestionAnswering (LLaMA model)
longformer — LongformerForQuestionAnswering (Longformer model)
luke — LukeForQuestionAnswering (LUKE model)
lxmert — LxmertForQuestionAnswering (LXMERT model)
markuplm — MarkupLMForQuestionAnswering (MarkupLM model)
mbart — MBartForQuestionAnswering (mBART model)
mega — MegaForQuestionAnswering (MEGA model)
megatron-bert — MegatronBertForQuestionAnswering (Megatron-BERT model)
minimax — MiniMaxForQuestionAnswering (MiniMax model)
mistral — MistralForQuestionAnswering (Mistral model)
mixtral — MixtralForQuestionAnswering (Mixtral model)
mobilebert — MobileBertForQuestionAnswering (MobileBERT model)
modernbert — ModernBertForQuestionAnswering (ModernBERT model)
mpnet — MPNetForQuestionAnswering (MPNet model)
mpt — MptForQuestionAnswering (MPT model)
mra — MraForQuestionAnswering (MRA model)
mt5 — MT5ForQuestionAnswering (MT5 model)
mvp — MvpForQuestionAnswering (MVP model)
nemotron — NemotronForQuestionAnswering (Nemotron model)
nezha — NezhaForQuestionAnswering (Nezha model)
nystromformer — NystromformerForQuestionAnswering (Nyströmformer model)
opt — OPTForQuestionAnswering (OPT model)
qdqbert — QDQBertForQuestionAnswering (QDQBert model)
qwen2 — Qwen2ForQuestionAnswering (Qwen2 model)
qwen2_moe — Qwen2MoeForQuestionAnswering (Qwen2MoE model)
qwen3 — Qwen3ForQuestionAnswering (Qwen3 model)
qwen3_moe — Qwen3MoeForQuestionAnswering (Qwen3MoE model)
reformer — ReformerForQuestionAnswering (Reformer model)
rembert — RemBertForQuestionAnswering (RemBERT model)
roberta — RobertaForQuestionAnswering (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForQuestionAnswering (RoCBert model)
roformer — RoFormerForQuestionAnswering (RoFormer model)
smollm3 — SmolLM3ForQuestionAnswering (SmolLM3 model)
splinter — SplinterForQuestionAnswering (Splinter model)
squeezebert — SqueezeBertForQuestionAnswering (SqueezeBERT model)
t5 — T5ForQuestionAnswering (T5 model)
umt5 — UMT5ForQuestionAnswering (UMT5 model)
xlm — XLMForQuestionAnsweringSimple (XLM model)
xlm-roberta — XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForQuestionAnswering (XLM-RoBERTa-XL model)
xlnet — XLNetForQuestionAnsweringSimple (XLNet model)
xmod — XmodForQuestionAnswering (X-MOD model)
yoso — YosoForQuestionAnswering (YOSO model)
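
The keys in the list above read like the `model_type` value stored in a checkpoint's config.json. Assuming that is how the lookup is keyed, the selection can be sketched as a dictionary lookup on a parsed configuration. The table below copies only four entries from the list, and `qa_class_name` is a hypothetical helper that returns a class name as a string rather than a real class.

```python
import json

# A small excerpt of the mapping listed above (model_type -> class name).
QA_CLASS_BY_MODEL_TYPE = {
    "bert": "BertForQuestionAnswering",
    "roberta": "RobertaForQuestionAnswering",
    "t5": "T5ForQuestionAnswering",
    "xlnet": "XLNetForQuestionAnsweringSimple",
}


def qa_class_name(config_json):
    """Pick the question answering class name for a config.json payload."""
    config = json.loads(config_json)
    model_type = config.get("model_type")
    if model_type not in QA_CLASS_BY_MODEL_TYPE:
        raise ValueError(f"No question answering class registered for {model_type!r}.")
    return QA_CLASS_BY_MODEL_TYPE[model_type]


print(qa_class_name('{"model_type": "xlnet", "d_model": 768}'))
```

Note that some entries, such as xlnet, map to a `...Simple` variant, so the class name cannot be derived from the key by string formatting alone.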

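The model is set in evaluation mode by default using model.eval() (for example, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().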
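
What "dropout modules are deactivated" means in practice can be shown with a toy layer. The class below is a standard-library imitation of the train/eval switch, written as an illustration only; `ToyDropout` is a hypothetical name and does not reproduce any framework's dropout implementation.

```python
import random


class ToyDropout:
    """Toy dropout layer with a training flag, for illustration only."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        return self.train(False)

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: the layer passes values through unchanged.
            return list(values)
        keep = 1.0 - self.p
        # Training mode: drop each value with probability p, rescale the rest.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5).eval()       # state comparable to a freshly loaded model
eval_out = layer([1.0, 2.0, 3.0])      # identical to the input
layer.train()                          # switch back before fine-tuning
train_out = layer([1.0] * 1000)        # some entries are zeroed, others scaled
print(eval_out, train_out.count(0.0) > 0)
```

Forgetting the switch back to training mode would leave such layers inactive during fine-tuning, which is why the note above calls it out.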

Examples

>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForQuestionAnswering.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForQuestionAnswering

class transformers.TFAutoModelForQuestionAnswering

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForQuestionAnswering (ALBERT model)
- BertConfig configuration class: TFBertForQuestionAnswering (BERT model)
- CamembertConfig configuration class: TFCamembertForQuestionAnswering (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForQuestionAnswering (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForQuestionAnswering (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForQuestionAnswering (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: TFElectraForQuestionAnswering (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForQuestionAnswering (Funnel Transformer model)
- GPTJConfig configuration class: TFGPTJForQuestionAnswering (GPT-J model)
- LayoutLMv3Config configuration class: TFLayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
- LongformerConfig configuration class: TFLongformerForQuestionAnswering (Longformer model)
- MPNetConfig configuration class: TFMPNetForQuestionAnswering (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForQuestionAnswering (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForQuestionAnswering (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: TFRobertaForQuestionAnswering (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
- XLMConfig configuration class: TFXLMForQuestionAnsweringSimple (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForQuestionAnsweringSimple (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

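Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples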

>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

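Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been done already).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.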
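
The three accepted forms of pretrained_model_name_or_path listed above (model id, directory, PyTorch state_dict file) can be told apart with simple filesystem checks. The sketch below is one plausible way to do that, not the library's resolution logic: `describe_source`, its return labels, and the choice to raise when from_pt is missing are all assumptions made for illustration.

```python
import os
import tempfile


def describe_source(name_or_path, from_pt=False):
    """Classify the argument as a directory, a PyTorch weights file, or a model id."""
    path = os.fspath(name_or_path)
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path) or path.startswith(("http://", "https://")):
        # The documentation asks for from_pt=True in this case; this sketch
        # enforces it with an error, which is an illustrative choice.
        if not from_pt:
            raise ValueError("Set from_pt=True when pointing at a PyTorch state_dict file.")
        return "pytorch_state_dict_file"
    # Anything else is treated as a model id to look up on huggingface.co.
    return "model_id"


with tempfile.TemporaryDirectory() as tmp:
    weights = os.path.join(tmp, "pytorch_model.bin")
    open(weights, "wb").close()  # empty placeholder file
    kinds = [
        describe_source(tmp),
        describe_source(weights, from_pt=True),
        describe_source("google-bert/bert-base-cased"),
    ]
print(kinds)
```

Checking the filesystem first means a local directory that happens to share its name with a Hub model id would win in this sketch.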

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

albert — TFAlbertForQuestionAnswering (ALBERT model)
bert — TFBertForQuestionAnswering (BERT model)
camembert — TFCamembertForQuestionAnswering (CamemBERT model)
convbert — TFConvBertForQuestionAnswering (ConvBERT model)
deberta — TFDebertaForQuestionAnswering (DeBERTa model)
deberta-v2 — TFDebertaV2ForQuestionAnswering (DeBERTa-v2 model)
distilbert — TFDistilBertForQuestionAnswering (DistilBERT model)
electra — TFElectraForQuestionAnswering (ELECTRA model)
flaubert — TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
funnel — TFFunnelForQuestionAnswering (Funnel Transformer model)
gptj — TFGPTJForQuestionAnswering (GPT-J model)
layoutlmv3 — TFLayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
longformer — TFLongformerForQuestionAnswering (Longformer model)
mobilebert — TFMobileBertForQuestionAnswering (MobileBERT model)
mpnet — TFMPNetForQuestionAnswering (MPNet model)
rembert — TFRemBertForQuestionAnswering (RemBERT model)
roberta — TFRobertaForQuestionAnswering (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForQuestionAnswering (RoFormer model)
xlm — TFXLMForQuestionAnsweringSimple (XLM model)
xlm-roberta — TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
xlnet — TFXLNetForQuestionAnsweringSimple (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForQuestionAnswering

class transformers.FlaxAutoModelForQuestionAnswering

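( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForQuestionAnswering (ALBERT model)
- BartConfig configuration class: FlaxBartForQuestionAnswering (BART model)
- BertConfig configuration class: FlaxBertForQuestionAnswering (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForQuestionAnswering (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForQuestionAnswering (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForQuestionAnswering (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForQuestionAnswering (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.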

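Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples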

>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

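Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted in a model repo on huggingface.co.
- A path to a directory containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and then loading the TensorFlow model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g. not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.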
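
The routing of kwargs described in the last bullet can be pictured with a small, library-free sketch. Everything here is hypothetical: ToyConfig and split_kwargs are illustrative names that are not part of Transformers, and the real implementation may differ in detail.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a configuration class (not a Transformers class)."""

    hidden_size: int = 768
    output_attentions: bool = False


def split_kwargs(config, kwargs):
    """Override matching config attributes and return the leftover kwargs.

    Keys that name a configuration attribute update the config; all other
    keys are kept so they can be forwarded to the model's __init__.
    """
    config_keys = {f.name for f in fields(config)}
    unused = {}
    for key, value in kwargs.items():
        if key in config_keys:
            setattr(config, key, value)
        else:
            unused[key] = value
    return config, unused


config, model_kwargs = split_kwargs(
    ToyConfig(), {"output_attentions": True, "dtype": "float32"}
)
print(config.output_attentions)  # True
print(model_kwargs)              # {'dtype': 'float32'}
```

In this sketch output_attentions lands on the configuration, while the unrecognized dtype key is left over for the model constructor.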

Instantiates one of the model classes of the library (with a question answering head) from a pretrained model.

albert — FlaxAlbertForQuestionAnswering (ALBERT model)
bart — FlaxBartForQuestionAnswering (BART model)
bert — FlaxBertForQuestionAnswering (BERT model)
big_bird — FlaxBigBirdForQuestionAnswering (BigBird model)
distilbert — FlaxDistilBertForQuestionAnswering (DistilBERT model)
electra — FlaxElectraForQuestionAnswering (ELECTRA model)
mbart — FlaxMBartForQuestionAnswering (mBART model)
roberta — FlaxRobertaForQuestionAnswering (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForQuestionAnswering (RoFormer model)
xlm-roberta — FlaxXLMRobertaForQuestionAnswering (XLM-RoBERTa model)

Example

>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForTextEncoding

class transformers.AutoModelForTextEncoding

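( *args **kwargs )

TFAutoModelForTextEncoding

class transformers.TFAutoModelForTextEncoding

( *args **kwargs )

Computer vision

The following auto classes are available for the following computer vision tasks.

AutoModelForDepthEstimation

class transformers.AutoModelForDepthEstimation

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a depth estimation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DPTConfig configuration class: DPTForDepthEstimation (DPT model)
- DepthAnythingConfig configuration class: DepthAnythingForDepthEstimation (Depth Anything model)
- DepthProConfig configuration class: DepthProForDepthEstimation (DepthPro model)
- GLPNConfig configuration class: GLPNForDepthEstimation (GLPN model)
- PromptDepthAnythingConfig configuration class: PromptDepthAnythingForDepthEstimation (PromptDepthAnything model)
- ZoeDepthConfig configuration class: ZoeDepthForDepthEstimation (ZoeDepth model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a depth estimation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example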

>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("Intel/dpt-large")
>>> model = AutoModelForDepthEstimation.from_config(config)
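
The selection by configuration class can be pictured as a lookup table keyed by the type of the config object. The sketch below is library-free and entirely hypothetical: every Toy* name is invented for illustration and the error types are assumptions, so it is not how Transformers is actually implemented. A configuration class with no entry in the table cannot be resolved, and calling the auto class constructor directly raises.

```python
class ToyDPTConfig:
    """Hypothetical configuration class."""


class ToyGLPNConfig:
    """Hypothetical configuration class."""


class ToyTextConfig:
    """Hypothetical configuration class with no depth estimation model."""


class ToyDPTForDepthEstimation:
    def __init__(self, config):
        # Only the configuration is stored; no pretrained weights are loaded.
        self.config = config


class ToyGLPNForDepthEstimation:
    def __init__(self, config):
        self.config = config


class ToyAutoModelForDepthEstimation:
    # Lookup table from configuration class to model class.
    _model_mapping = {
        ToyDPTConfig: ToyDPTForDepthEstimation,
        ToyGLPNConfig: ToyGLPNForDepthEstimation,
    }

    def __init__(self, *args, **kwargs):
        # Direct instantiation is not supported in this sketch.
        raise OSError("Use from_config() instead of calling the class directly.")

    @classmethod
    def from_config(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(
                f"Unrecognized configuration class {type(config).__name__}"
            )
        return model_class(config)


model = ToyAutoModelForDepthEstimation.from_config(ToyDPTConfig())
print(type(model).__name__)  # ToyDPTForDepthEstimation
```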

from_pretrained

( *model_args **kwargs )

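Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted in a model repo on huggingface.co.
- A path to a directory containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and then loading the PyTorch model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g. not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a depth estimation head) from a pretrained model.

depth_anything — DepthAnythingForDepthEstimation (Depth Anything model)
depth_pro — DepthProForDepthEstimation (DepthPro model)
dpt — DPTForDepthEstimation (DPT model)
glpn — GLPNForDepthEstimation (GLPN model)
prompt_depth_anything — PromptDepthAnythingForDepthEstimation (PromptDepthAnything model)
zoedepth — ZoeDepthForDepthEstimation (ZoeDepth model)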
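
As a rough illustration of how the keys in this list might be used, the sketch below reads the model_type field from a config.json in a local directory and looks up the matching class name. It is an assumption-laden sketch: resolve_class_name is a hypothetical helper, the lookup is not the library's implementation, and only the key and class names are taken from the list above.

```python
import json
import tempfile
from pathlib import Path

# Keys and class names copied from the list above. The lookup itself is an
# illustrative assumption, not the library's implementation.
MODEL_TYPE_TO_CLASS_NAME = {
    "depth_anything": "DepthAnythingForDepthEstimation",
    "depth_pro": "DepthProForDepthEstimation",
    "dpt": "DPTForDepthEstimation",
    "glpn": "GLPNForDepthEstimation",
    "prompt_depth_anything": "PromptDepthAnythingForDepthEstimation",
    "zoedepth": "ZoeDepthForDepthEstimation",
}


def resolve_class_name(model_dir):
    """Read model_type from <model_dir>/config.json and look up the class name."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    model_type = config.get("model_type")
    if model_type not in MODEL_TYPE_TO_CLASS_NAME:
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    return MODEL_TYPE_TO_CLASS_NAME[model_type]


# Write a minimal config.json into a temporary directory and resolve it.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text(
        json.dumps({"model_type": "dpt"}), encoding="utf-8"
    )
    resolved = resolve_class_name(tmp)

print(resolved)  # DPTForDepthEstimation
```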

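By default, the model is put in evaluation mode with model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Example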

>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large")

>>> # Update configuration during loading
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForDepthEstimation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageClassification

class transformers.AutoModelForImageClassification

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForImageClassification (BEiT model)
- BitConfig configuration class: BitForImageClassification (BiT model)
- CLIPConfig configuration class: CLIPForImageClassification (CLIP model)
- ConvNextConfig configuration class: ConvNextForImageClassification (ConvNeXT model)
- ConvNextV2Config configuration class: ConvNextV2ForImageClassification (ConvNeXTV2 model)
- CvtConfig configuration class: CvtForImageClassification (CvT model)
- Data2VecVisionConfig configuration class: Data2VecVisionForImageClassification (Data2VecVision model)
- DeiTConfig configuration class: DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiT model)
- DinatConfig configuration class: DinatForImageClassification (DiNAT model)
- Dinov2Config configuration class: Dinov2ForImageClassification (DINOv2 model)
- Dinov2WithRegistersConfig configuration class: Dinov2WithRegistersForImageClassification (DINOv2 with Registers model)
- DonutSwinConfig configuration class: DonutSwinForImageClassification (DonutSwin model)
- EfficientFormerConfig configuration class: EfficientFormerForImageClassification or EfficientFormerForImageClassificationWithTeacher (EfficientFormer model)
- EfficientNetConfig configuration class: EfficientNetForImageClassification (EfficientNet model)
- FocalNetConfig configuration class: FocalNetForImageClassification (FocalNet model)
- HGNetV2Config configuration class: HGNetV2ForImageClassification (HGNet-V2 model)
- HieraConfig configuration class: HieraForImageClassification (Hiera model)
- IJepaConfig configuration class: IJepaForImageClassification (I-JEPA model)
- ImageGPTConfig configuration class: ImageGPTForImageClassification (ImageGPT model)
- LevitConfig configuration class: LevitForImageClassification or LevitForImageClassificationWithTeacher (LeViT model)
- MobileNetV1Config configuration class: MobileNetV1ForImageClassification (MobileNetV1 model)
- MobileNetV2Config configuration class: MobileNetV2ForImageClassification (MobileNetV2 model)
- MobileViTConfig configuration class: MobileViTForImageClassification (MobileViT model)
- MobileViTV2Config configuration class: MobileViTV2ForImageClassification (MobileViTV2 model)
- NatConfig configuration class: NatForImageClassification (NAT model)
- PerceiverConfig configuration class: PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (Perceiver model)
- PoolFormerConfig configuration class: PoolFormerForImageClassification (PoolFormer model)
- PvtConfig configuration class: PvtForImageClassification (PVT model)
- PvtV2Config configuration class: PvtV2ForImageClassification (PVTv2 model)
- RegNetConfig configuration class: RegNetForImageClassification (RegNet model)
- ResNetConfig configuration class: ResNetForImageClassification (ResNet model)
- SegformerConfig configuration class: SegformerForImageClassification (SegFormer model)
- ShieldGemma2Config configuration class: ShieldGemma2ForImageClassification (Shieldgemma2 model)
- Siglip2Config configuration class: Siglip2ForImageClassification (SigLIP2 model)
- SiglipConfig configuration class: SiglipForImageClassification (SigLIP model)
- SwiftFormerConfig configuration class: SwiftFormerForImageClassification (SwiftFormer model)
- SwinConfig configuration class: SwinForImageClassification (Swin Transformer model)
- Swinv2Config configuration class: Swinv2ForImageClassification (Swin Transformer V2 model)
- TextNetConfig configuration class: TextNetForImageClassification (TextNet model)
- TimmWrapperConfig configuration class: TimmWrapperForImageClassification (TimmWrapperModel model)
- VanConfig configuration class: VanForImageClassification (VAN model)
- ViTConfig configuration class: ViTForImageClassification (ViT model)
- ViTHybridConfig configuration class: ViTHybridForImageClassification (ViT Hybrid model)
- ViTMSNConfig configuration class: ViTMSNForImageClassification (ViTMSN model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example

>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = AutoModelForImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

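Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted in a model repo on huggingface.co.
- A path to a directory containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and then loading the PyTorch model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g. not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.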
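
To make the output_loading_info description above more concrete, the sketch below shows one way a report of missing and unexpected keys could be derived by comparing the parameter names a model expects with the names found in a checkpoint. compare_keys and the sample key names are hypothetical, and the dictionary layout is an assumption chosen to mirror the description, not the library's actual return value.

```python
def compare_keys(expected_keys, checkpoint_keys):
    """Report keys the model expects but the checkpoint lacks, and vice versa."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        "missing_keys": sorted(expected - found),
        "unexpected_keys": sorted(found - expected),
    }


info = compare_keys(
    expected_keys=["encoder.weight", "encoder.bias", "classifier.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
)
print(info)
# {'missing_keys': ['classifier.weight'], 'unexpected_keys': ['pooler.weight']}
```

In this sketch, a missing classifier key would correspond to a task head that has to be trained from scratch, while an unexpected key is a checkpoint entry the model has no use for.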

Instantiates one of the model classes of the library (with an image classification head) from a pretrained model.

beit — BeitForImageClassification (BEiT model)
bit — BitForImageClassification (BiT model)
clip — CLIPForImageClassification (CLIP model)
convnext — ConvNextForImageClassification (ConvNeXT model)
convnextv2 — ConvNextV2ForImageClassification (ConvNeXTV2 model)
cvt — CvtForImageClassification (CvT model)
data2vec-vision — Data2VecVisionForImageClassification (Data2VecVision model)
deit — DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiT model)
dinat — DinatForImageClassification (DiNAT model)
dinov2 — Dinov2ForImageClassification (DINOv2 model)
dinov2_with_registers — Dinov2WithRegistersForImageClassification (DINOv2 with Registers model)
donut-swin — DonutSwinForImageClassification (DonutSwin model)
efficientformer — EfficientFormerForImageClassification or EfficientFormerForImageClassificationWithTeacher (EfficientFormer model)
efficientnet — EfficientNetForImageClassification (EfficientNet model)
focalnet — FocalNetForImageClassification (FocalNet model)
hgnet_v2 — HGNetV2ForImageClassification (HGNet-V2 model)
hiera — HieraForImageClassification (Hiera model)
ijepa — IJepaForImageClassification (I-JEPA model)
imagegpt — ImageGPTForImageClassification (ImageGPT model)
levit — LevitForImageClassification or LevitForImageClassificationWithTeacher (LeViT model)
mobilenet_v1 — MobileNetV1ForImageClassification (MobileNetV1 model)
mobilenet_v2 — MobileNetV2ForImageClassification (MobileNetV2 model)
mobilevit — MobileViTForImageClassification (MobileViT model)
mobilevitv2 — MobileViTV2ForImageClassification (MobileViTV2 model)
nat — NatForImageClassification (NAT model)
perceiver — PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (Perceiver model)
poolformer — PoolFormerForImageClassification (PoolFormer model)
pvt — PvtForImageClassification (PVT model)
pvt_v2 — PvtV2ForImageClassification (PVTv2 model)
regnet — RegNetForImageClassification (RegNet model)
resnet — ResNetForImageClassification (ResNet model)
segformer — SegformerForImageClassification (SegFormer model)
shieldgemma2 — ShieldGemma2ForImageClassification (Shieldgemma2 model)
siglip — SiglipForImageClassification (SigLIP model)
siglip2 — Siglip2ForImageClassification (SigLIP2 model)
swiftformer — SwiftFormerForImageClassification (SwiftFormer model)
swin — SwinForImageClassification (Swin Transformer model)
swinv2 — Swinv2ForImageClassification (Swin Transformer V2 model)
textnet — TextNetForImageClassification (TextNet model)
timm_wrapper — TimmWrapperForImageClassification (TimmWrapperModel model)
van — VanForImageClassification (VAN model)
vit — ViTForImageClassification (ViT model)
vit_hybrid — ViTHybridForImageClassification (ViT Hybrid model)
vit_msn — ViTMSNForImageClassification (ViTMSN model)

By default, the model is put in evaluation mode with model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Example

>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForImageClassification

class transformers.TFAutoModelForImageClassification

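( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ConvNextConfig configuration class: TFConvNextForImageClassification (ConvNeXT model)
- ConvNextV2Config configuration class: TFConvNextV2ForImageClassification (ConvNeXTV2 model)
- CvtConfig configuration class: TFCvtForImageClassification (CvT model)
- Data2VecVisionConfig configuration class: TFData2VecVisionForImageClassification (Data2VecVision model)
- DeiTConfig configuration class: TFDeiTForImageClassification or TFDeiTForImageClassificationWithTeacher (DeiT model)
- EfficientFormerConfig configuration class: TFEfficientFormerForImageClassification or TFEfficientFormerForImageClassificationWithTeacher (EfficientFormer model)
- MobileViTConfig configuration class: TFMobileViTForImageClassification (MobileViT model)
- RegNetConfig configuration class: TFRegNetForImageClassification (RegNet model)
- ResNetConfig configuration class: TFResNetForImageClassification (ResNet model)
- SegformerConfig configuration class: TFSegformerForImageClassification (SegFormer model)
- SwiftFormerConfig configuration class: TFSwiftFormerForImageClassification (SwiftFormer model)
- SwinConfig configuration class: TFSwinForImageClassification (Swin Transformer model)
- ViTConfig configuration class: TFViTForImageClassification (ViT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example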

>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = TFAutoModelForImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

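Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted in a model repo on huggingface.co.
- A path to a directory containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and then loading the TensorFlow model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g. not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with an image classification head) from a pretrained model.

convnext — TFConvNextForImageClassification (ConvNeXT model)
convnextv2 — TFConvNextV2ForImageClassification (ConvNeXTV2 model)
cvt — TFCvtForImageClassification (CvT model)
data2vec-vision — TFData2VecVisionForImageClassification (Data2VecVision model)
deit — TFDeiTForImageClassification or TFDeiTForImageClassificationWithTeacher (DeiT model)
efficientformer — TFEfficientFormerForImageClassification or TFEfficientFormerForImageClassificationWithTeacher (EfficientFormer model)
mobilevit — TFMobileViTForImageClassification (MobileViT model)
regnet — TFRegNetForImageClassification (RegNet model)
resnet — TFResNetForImageClassification (ResNet model)
segformer — TFSegformerForImageClassification (SegFormer model)
swiftformer — TFSwiftFormerForImageClassification (SwiftFormer model)
swin — TFSwinForImageClassification (Swin Transformer model)
vit — TFViTForImageClassification (ViT model)

Example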

>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = TFAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForImageClassification

class transformers.FlaxAutoModelForImageClassification

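( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: FlaxBeitForImageClassification (BEiT model)
- Dinov2Config configuration class: FlaxDinov2ForImageClassification (DINOv2 model)
- RegNetConfig configuration class: FlaxRegNetForImageClassification (RegNet model)
- ResNetConfig configuration class: FlaxResNetForImageClassification (ResNet model)
- ViTConfig configuration class: FlaxViTForImageClassification (ViT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example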

>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = FlaxAutoModelForImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

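Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted in a model repo on huggingface.co.
- A path to a directory containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and then loading the TensorFlow model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g. not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with an image classification head) from a pretrained model.

beit — FlaxBeitForImageClassification (BEiT model)
dinov2 — FlaxDinov2ForImageClassification (DINOv2 model)
regnet — FlaxRegNetForImageClassification (RegNet model)
resnet — FlaxResNetForImageClassification (ResNet model)
vit — FlaxViTForImageClassification (ViT model)

Example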

>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForVideoClassification

class transformers.AutoModelForVideoClassification

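( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a video classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TimesformerConfig configuration class: TimesformerForVideoClassification (TimeSformer model)
- VJEPA2Config configuration class: VJEPA2ForVideoClassification (VJEPA2Model model)
- VideoMAEConfig configuration class: VideoMAEForVideoClassification (VideoMAE model)
- VivitConfig configuration class: VivitForVideoClassification (ViViT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a video classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example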

>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("MCG-NJU/videomae-base")
>>> model = AutoModelForVideoClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

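Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted in a model repo on huggingface.co.
- A path to a directory containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and then loading the PyTorch model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g. not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs is first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a video classification head) from a pretrained model.

timesformer — TimesformerForVideoClassification (TimeSformer model)
videomae — VideoMAEForVideoClassification (VideoMAE model)
vivit — VivitForVideoClassification (ViViT model)
vjepa2 — VJEPA2ForVideoClassification (VJEPA2Model model)

By default, the model is put in evaluation mode with model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Example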

>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")

>>> # Update configuration during loading
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForVideoClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForKeypointDetection

class transformers.AutoModelForKeypointDetection

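( *args **kwargs )

AutoModelForMaskedImageModeling

class transformers.AutoModelForMaskedImageModeling

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DeiTConfig configuration class: DeiTForMaskedImageModeling (DeiT model)
- FocalNetConfig configuration class: FocalNetForMaskedImageModeling (FocalNet model)
- SwinConfig configuration class: SwinForMaskedImageModeling (Swin Transformer model)
- Swinv2Config configuration class: Swinv2ForMaskedImageModeling (Swin Transformer V2 model)
- ViTConfig configuration class: ViTForMaskedImageModeling (ViT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example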

>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224-in21k")
>>> model = AutoModelForMaskedImageModeling.from_config(config)

from_pretrained

( *model_args **kwargs )

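Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.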
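
The two kwargs cases above can be sketched in plain Python. This is a toy model of the routing rule only, not the library's code: `ToyConfig`, `split_kwargs`, `load`, and the `use_toy_head` argument are hypothetical names introduced for illustration.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, hidden_size=8):
        self.output_attentions = output_attentions
        self.hidden_size = hidden_size


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes; return the remaining ones."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the loaded attribute
        else:
            remaining[key] = value  # left for the model's __init__
    return remaining


def load(config=None, **kwargs):
    if config is not None:
        # Config supplied by the caller: kwargs go straight to the model.
        return config, kwargs
    config = ToyConfig()  # stands in for the automatically loaded config
    model_kwargs = split_kwargs(config, kwargs)
    return config, model_kwargs


config, model_kwargs = load(output_attentions=True, use_toy_head=True)
print(config.output_attentions, model_kwargs)
```

When a config object is passed in, the toy `load` leaves it untouched, so an override such as output_attentions=True has to be set on the config before the call.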

Instantiates one of the model classes of the library (with a masked image modeling head) from a pretrained model.

deit — DeiTForMaskedImageModeling (DeiT model)
focalnet — FocalNetForMaskedImageModeling (FocalNet model)
swin — SwinForMaskedImageModeling (Swin Transformer model)
swinv2 — Swinv2ForMaskedImageModeling (Swin Transformer V2 model)
vit — ViTForMaskedImageModeling (ViT model)

The model is set in evaluation mode by default using model.eval() (e.g., dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedImageModeling.from_pretrained("google/vit-base-patch16-224-in21k")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedImageModeling.from_pretrained("google/vit-base-patch16-224-in21k", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedImageModeling.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMaskedImageModeling

class transformers.TFAutoModelForMaskedImageModeling

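( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DeiTConfig configuration class: TFDeiTForMaskedImageModeling (DeiT model)
- SwinConfig configuration class: TFSwinForMaskedImageModeling (Swin Transformer model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.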

Examples

>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> # Download a Swin configuration from huggingface.co and cache it.
>>> config = AutoConfig.from_pretrained("microsoft/swin-tiny-patch4-window7-224")
>>> model = TFAutoModelForMaskedImageModeling.from_config(config)

from_pretrained

( *model_args **kwargs )

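Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a masked image modeling head) from a pretrained model.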

deit — TFDeiTForMaskedImageModeling (DeiT model)
swin — TFSwinForMaskedImageModeling (Swin Transformer model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("microsoft/swin-tiny-patch4-window7-224")

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("microsoft/swin-tiny-patch4-window7-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForObjectDetection

class transformers.AutoModelForObjectDetection

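( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ConditionalDetrConfig configuration class: ConditionalDetrForObjectDetection (Conditional DETR model)
- DFineConfig configuration class: DFineForObjectDetection (D-FINE model)
- DabDetrConfig configuration class: DabDetrForObjectDetection (DAB-DETR model)
- DeformableDetrConfig configuration class: DeformableDetrForObjectDetection (Deformable DETR model)
- DetaConfig configuration class: DetaForObjectDetection (DETA model)
- DetrConfig configuration class: DetrForObjectDetection (DETR model)
- RTDetrConfig configuration class: RTDetrForObjectDetection (RT-DETR model)
- RTDetrV2Config configuration class: RTDetrV2ForObjectDetection (RT-DETRv2 model)
- TableTransformerConfig configuration class: TableTransformerForObjectDetection (Table Transformer model)
- YolosConfig configuration class: YolosForObjectDetection (YOLOS model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an object detection head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.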

Examples

>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download a YOLOS configuration from huggingface.co and cache it.
>>> config = AutoConfig.from_pretrained("hustvl/yolos-tiny")
>>> model = AutoModelForObjectDetection.from_config(config)

from_pretrained

( *model_args **kwargs )

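Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.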
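
To make the output_loading_info entry more concrete, the sketch below shows what "missing keys" and "unexpected keys" mean when a checkpoint is matched against a model. It is a simplified illustration in plain Python: `loading_info` and the parameter names are hypothetical, and the error-message part of the returned dictionary is left out.

```python
def loading_info(model_keys, checkpoint_keys):
    """Compare the parameter names a model expects with those in a checkpoint."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
    }


# Hypothetical parameter names: a backbone checkpoint loaded into a model
# that adds a detection head.
info = loading_info(
    model_keys=["backbone.weight", "backbone.bias", "detection_head.weight"],
    checkpoint_keys=["backbone.weight", "backbone.bias", "pooler.weight"],
)
print(info)
```

Here the detection head is reported as missing (it would be newly initialized) and the pooler weight as unexpected (it would be ignored).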

Instantiates one of the model classes of the library (with an object detection head) from a pretrained model.

conditional_detr — ConditionalDetrForObjectDetection (Conditional DETR model)
d_fine — DFineForObjectDetection (D-FINE model)
dab-detr — DabDetrForObjectDetection (DAB-DETR model)
deformable_detr — DeformableDetrForObjectDetection (Deformable DETR model)
deta — DetaForObjectDetection (DETA model)
detr — DetrForObjectDetection (DETR model)
rt_detr — RTDetrForObjectDetection (RT-DETR model)
rt_detr_v2 — RTDetrV2ForObjectDetection (RT-DETRv2 model)
table-transformer — TableTransformerForObjectDetection (Table Transformer model)
yolos — YolosForObjectDetection (YOLOS model)
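
The list above reads as a lookup table from a model-type key to a class name. The sketch below restates part of that table as a plain dictionary; `resolve` is a hypothetical helper, and the assumption that the key comes from the configuration's `model_type` is carried over from the tokenizer description earlier in this document.

```python
# A subset of the table above, copied key for key.
OBJECT_DETECTION_MAPPING = {
    "conditional_detr": "ConditionalDetrForObjectDetection",
    "dab-detr": "DabDetrForObjectDetection",
    "detr": "DetrForObjectDetection",
    "rt_detr": "RTDetrForObjectDetection",
    "table-transformer": "TableTransformerForObjectDetection",
    "yolos": "YolosForObjectDetection",
}


def resolve(model_type):
    """Return the class name registered for an exact model-type key."""
    if model_type not in OBJECT_DETECTION_MAPPING:
        known = ", ".join(sorted(OBJECT_DETECTION_MAPPING))
        raise ValueError(f"Unknown model type {model_type!r}; known types: {known}")
    return OBJECT_DETECTION_MAPPING[model_type]


print(resolve("dab-detr"))
print(resolve("rt_detr"))
```

In this sketch the keys are matched exactly, so the spelling in the table matters: some keys use hyphens (dab-detr, table-transformer) and others use underscores (rt_detr, conditional_detr).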

The model is set in evaluation mode by default using model.eval() (e.g., dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForObjectDetection.from_pretrained("hustvl/yolos-tiny")

>>> # Update configuration during loading
>>> model = AutoModelForObjectDetection.from_pretrained("hustvl/yolos-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageSegmentation

class transformers.AutoModelForImageSegmentation

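( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DetrConfig configuration class: DetrForSegmentation (DETR model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.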

Examples

>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download a DETR configuration from huggingface.co and cache it.
>>> config = AutoConfig.from_pretrained("facebook/detr-resnet-50-panoptic")
>>> model = AutoModelForImageSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

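Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.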
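
The proxies entry says the dictionary is keyed "by protocol or endpoint". The sketch below shows one way such a dictionary can be resolved for a given URL, assuming the more specific endpoint key takes precedence over the bare protocol key (this is how the requests library treats such dictionaries; whether every download path behaves identically is an assumption). `select_proxy` here is a hypothetical helper written for illustration, and the URLs are made up.

```python
from urllib.parse import urlparse


def select_proxy(url, proxies):
    """Return the proxy for url, preferring 'scheme://host' over 'scheme'."""
    parts = urlparse(url)
    endpoint_key = f"{parts.scheme}://{parts.hostname}"
    for key in (endpoint_key, parts.scheme):
        if key in proxies:
            return proxies[key]
    return None  # no matching entry: connect directly


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}

print(select_proxy("http://hostname/model.bin", proxies))      # foo.bar:4012
print(select_proxy("http://example.org/model.bin", proxies))   # foo.bar:3128
print(select_proxy("https://example.org/model.bin", proxies))  # None
```

With the example dictionary from the parameter description, only plain http traffic is proxied; an https entry would have to be added for https downloads.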

Instantiates one of the model classes of the library (with an image segmentation head) from a pretrained model.

detr — DetrForSegmentation (DETR model)

The model is set in evaluation mode by default using model.eval() (e.g., dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")

>>> # Update configuration during loading
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageToImage

class transformers.AutoModelForImageToImage

( *args **kwargs )

AutoModelForSemanticSegmentation

class transformers.AutoModelForSemanticSegmentation

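( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForSemanticSegmentation (BEiT model)
- DPTConfig configuration class: DPTForSemanticSegmentation (DPT model)
- Data2VecVisionConfig configuration class: Data2VecVisionForSemanticSegmentation (Data2VecVision model)
- MobileNetV2Config configuration class: MobileNetV2ForSemanticSegmentation (MobileNetV2 model)
- MobileViTConfig configuration class: MobileViTForSemanticSegmentation (MobileViT model)
- MobileViTV2Config configuration class: MobileViTV2ForSemanticSegmentation (MobileViTV2 model)
- SegformerConfig configuration class: SegformerForSemanticSegmentation (SegFormer model)
- UperNetConfig configuration class: UperNetForSemanticSegmentation (UPerNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.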

Examples

>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download a SegFormer configuration from huggingface.co and cache it.
>>> config = AutoConfig.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> model = AutoModelForSemanticSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

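Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.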

Instantiates one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

beit — BeitForSemanticSegmentation (BEiT model)
data2vec-vision — Data2VecVisionForSemanticSegmentation (Data2VecVision model)
dpt — DPTForSemanticSegmentation (DPT model)
mobilenet_v2 — MobileNetV2ForSemanticSegmentation (MobileNetV2 model)
mobilevit — MobileViTForSemanticSegmentation (MobileViT model)
mobilevitv2 — MobileViTV2ForSemanticSegmentation (MobileViTV2 model)
segformer — SegformerForSemanticSegmentation (SegFormer model)
upernet — UperNetForSemanticSegmentation (UPerNet model)

The model is set in evaluation mode by default using model.eval() (e.g., dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
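
Why the mode matters can be seen with a toy dropout layer. The class below is a hypothetical pure-Python stand-in, not a torch module: it only mimics the idea that a dropout layer passes inputs through unchanged in evaluation mode and randomly zeroes (and rescales) them in training mode.

```python
import random


class ToyDropout:
    """Hypothetical stand-in for a dropout module with a training flag."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        # Start in evaluation mode, like a model returned by from_pretrained().
        self.training = False
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation mode: inputs pass through unchanged
        keep = 1.0 - self.p
        # Training mode: zero each value with probability p, rescale the rest.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5)
x = [1.0, 2.0, 3.0, 4.0]
print(layer(x))          # unchanged in evaluation mode
print(layer.train()(x))  # each entry is either zeroed or doubled
```

Forgetting to switch back to training mode before fine-tuning would leave such layers inactive, which is the situation the paragraph above warns about.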

Examples

>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

>>> # Update configuration during loading
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSemanticSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSemanticSegmentation

class transformers.TFAutoModelForSemanticSegmentation

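( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecVisionConfig configuration class: TFData2VecVisionForSemanticSegmentation (Data2VecVision model)
- MobileViTConfig configuration class: TFMobileViTForSemanticSegmentation (MobileViT model)
- SegformerConfig configuration class: TFSegformerForSemanticSegmentation (SegFormer model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.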

Examples

>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> # Download a SegFormer configuration from huggingface.co and cache it.
>>> config = AutoConfig.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> model = TFAutoModelForSemanticSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

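Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

data2vec-vision — TFData2VecVisionForSemanticSegmentation (Data2VecVision model)
mobilevit — TFMobileViTForSemanticSegmentation (MobileViT model)
segformer — TFSegformerForSemanticSegmentation (SegFormer model)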

Examples

>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

>>> # Update configuration during loading
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForInstanceSegmentation

class transformers.AutoModelForInstanceSegmentation

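( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).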
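
The "factory only" behaviour described above can be sketched in a few lines. `ToyAutoModel` is a hypothetical class written for illustration; the exception type and message are assumptions, since the text only states that an error is thrown.

```python
class ToyAutoModel:
    """Hypothetical factory class: construct through class methods only."""

    def __init__(self, *args, **kwargs):
        name = type(self).__name__
        raise EnvironmentError(
            f"{name} is designed to be instantiated using {name}.from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        # A real factory would pick and build a concrete model class here;
        # the toy version just returns a dict describing what would be built.
        return {"built_from": config}


try:
    ToyAutoModel()
except EnvironmentError as err:
    print("error:", err)

print(ToyAutoModel.from_config({"model_type": "maskformer"}))
```

Because the class methods never call `__init__`, the factory can hand back an object of a different, concrete class than the one the method was called on.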

from_config

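( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormer model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.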

Examples

>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download a MaskFormer configuration from huggingface.co and cache it.
>>> config = AutoConfig.from_pretrained("facebook/maskformer-swin-base-ade")
>>> model = AutoModelForInstanceSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

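Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.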

Instantiates one of the model classes of the library (with an instance segmentation head) from a pretrained model.

maskformer — MaskFormerForInstanceSegmentation (MaskFormer model)

The model is set in evaluation mode by default using model.eval() (e.g., dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade")

>>> # Update configuration during loading
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForInstanceSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForUniversalSegmentation

class transformers.AutoModelForUniversalSegmentation

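( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a universal image segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DetrConfig configuration class: DetrForSegmentation (DETR model)
- Mask2FormerConfig configuration class: Mask2FormerForUniversalSegmentation (Mask2Former model)
- MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormer model)
- OneFormerConfig configuration class: OneFormerForUniversalSegmentation (OneFormer model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a universal image segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.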

Examples

>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download a Mask2Former configuration from huggingface.co and cache it.
>>> config = AutoConfig.from_pretrained("facebook/mask2former-swin-small-coco-instance")
>>> model = AutoModelForUniversalSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

引數

pretrained_model_name_or_path (str or os.PathLike) — 可以是以下之一：
- 一個字串，即託管在 huggingface.co 模型倉庫中的預訓練模型的*模型 ID*。
- 一個指向包含使用 save_pretrained() 儲存的模型權重的*目錄*的路徑，例如：./my_model_directory/。
- 一個指向 *tensorflow 索引檢查點檔案*的路徑或 URL（例如，./tf_model/model.ckpt.index）。在這種情況下，from_tf 應設定為 True，並且應提供一個配置物件作為 config 引數。這種載入路徑比使用提供的轉換指令碼將 TensorFlow 檢查點轉換為 PyTorch 模型後再載入 PyTorch 模型要慢。
model_args (額外的位置引數, 可選) — 將傳遞給底層模型的 __init__() 方法。
config (PretrainedConfig, 可選) — 用於模型的配置，而不是自動載入的配置。在以下情況下可以自動載入配置：
- 模型是庫提供的模型（使用預訓練模型的*模型 ID*字串載入）。
- 模型是使用 save_pretrained() 儲存的，並透過提供儲存目錄重新載入。
- 模型透過提供本地目錄作為 pretrained_model_name_or_path 載入，並且在該目錄中找到了名為 config.json 的配置檔案。
state_dict (dict[str, torch.Tensor], 可選) — 一個狀態字典，用於替代從儲存的權重檔案載入的狀態字典。

如果您想從預訓練配置建立模型但載入自己的權重，可以使用此選項。但在這種情況下，您應該檢查使用 save_pretrained() 和 from_pretrained() 是否是更簡單的選擇。
cache_dir (str or os.PathLike, 可選) — 下載的預訓練模型配置應快取的目錄路徑，如果不應使用標準快取。
from_tf (bool, 可選, 預設為 False) — 從 TensorFlow 檢查點儲存檔案中載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
force_download (bool, 可選, 預設為 False) — 是否強制（重新）下載模型權重和配置檔案，覆蓋已存在的快取版本。
resume_download — 已棄用並忽略。現在所有下載在可能的情況下都預設支援斷點續傳。將在 Transformers v5 版本中移除。
proxies (dict[str, str], 可選) — 按協議或端點使用的代理伺服器字典，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理將在每個請求中使用。
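output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).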
revision (str, 可選, 預設為 "main") — 要使用的特定模型版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們在 huggingface.co 上使用基於 git 的系統來儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
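trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models that are defined on the Hub in their own modeling files. Set this option to True only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.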
code_revision (str, 可選, 預設為 "main") — 如果程式碼位於與模型其餘部分不同的倉庫中，則指定 Hub 上要使用的特定程式碼版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們在 huggingface.co 上使用基於 git 的系統來儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
kwargs (額外的關鍵字引數, 可選) — 可用於更新配置物件（載入後）和初始化模型（例如 output_attentions=True）。其行為取決於是否提供了 config 或自動載入：
- 如果透過 config 提供了配置，**kwargs 將直接傳遞給底層模型的 __init__ 方法（我們假設所有對配置的相關更新已完成）。
- 如果沒有提供配置，kwargs 將首先傳遞給配置類的初始化函式（from_pretrained()）。kwargs 中與配置屬性對應的每個鍵將用於以提供的 kwargs 值覆蓋該屬性。不對應任何配置屬性的其餘鍵將傳遞給底層模型的 __init__ 函式。
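When no config is passed, the kwargs handling described above amounts to a split: keys that name a configuration attribute override it, and the remaining keys are forwarded to the model's `__init__`. The following is a minimal sketch of that rule only; `split_kwargs`, `ToyConfig` and `my_extra_flag` are hypothetical names, not library API.

```python
def split_kwargs(config, kwargs):
    """Apply kwargs that match config attributes; return the rest for the model."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # The key corresponds to a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Unknown to the configuration: keep it for the model's __init__.
            model_kwargs[key] = value
    return model_kwargs


class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 16


config = ToyConfig()
remaining = split_kwargs(config, {"output_attentions": True, "my_extra_flag": 3})
print(config.output_attentions)  # True
print(remaining)  # {'my_extra_flag': 3}
```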

從預訓練模型中例項化庫中的一個模型類（帶有通用影像分割頭）。

detr — DetrForSegmentation (DETR 模型)
mask2former — Mask2FormerForUniversalSegmentation (Mask2Former 模型)
maskformer — MaskFormerForInstanceSegmentation (MaskFormer 模型)
oneformer — OneFormerForUniversalSegmentation (OneFormer 模型)

預設情況下，模型透過 model.eval() 設定為評估模式（例如，dropout 模組被停用）。要訓練模型，您應該首先使用 model.train() 將其設定回訓練模式。

示例

>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-tiny-coco-instance")

>>> # Update configuration during loading
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-tiny-coco-instance", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForUniversalSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForZeroShotImageClassification

class transformers.AutoModelForZeroShotImageClassification

（ *args **kwargs ）

這是一個通用模型類，當使用 from_pretrained() 類方法或 from_config() 類方法建立時，它將被例項化為庫中的一個模型類（帶有零樣本影像分類頭）。

這個類不能直接使用 __init__() 進行例項化（會丟擲錯誤）。

from_config

（ **kwargs ）

引數

config (PretrainedConfig) — 要例項化的模型類是根據配置類選擇的：
- AlignConfig 配置類：AlignModel (ALIGN 模型)
- AltCLIPConfig 配置類：AltCLIPModel (AltCLIP 模型)
- Blip2Config 配置類：Blip2ForImageTextRetrieval (BLIP-2 模型)
- BlipConfig 配置類：BlipModel (BLIP 模型)
- CLIPConfig 配置類：CLIPModel (CLIP 模型)
- CLIPSegConfig 配置類：CLIPSegModel (CLIPSeg 模型)
- ChineseCLIPConfig 配置類：ChineseCLIPModel (Chinese-CLIP 模型)
- Siglip2Config 配置類：Siglip2Model (SigLIP2 模型)
- SiglipConfig 配置類：SiglipModel (SigLIP 模型)
attn_implementation (str, 可選) — 在模型中使用的注意力實現（如果相關）。可以是 "eager"（注意力的手動實現）、"sdpa"（使用 F.scaled_dot_product_attention）或 "flash_attention_2"（使用 Dao-AILab/flash-attention）中的任何一種。預設情況下，如果可用，SDPA 將用於 torch>=2.1.1。否則，預設是手動的 "eager" 實現。

從配置中例項化庫中的一個模型類（帶有零樣本影像分類頭）。

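Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.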

示例

>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/clip-vit-base-patch32")
>>> model = AutoModelForZeroShotImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

引數

pretrained_model_name_or_path (str or os.PathLike) — 可以是以下之一：
- 一個字串，即託管在 huggingface.co 的模型倉庫中的預訓練模型的*模型 ID*。
- 一個包含使用 save_pretrained() 儲存的模型權重的*目錄*路徑，例如 ./my_model_directory/。
- 一個*tensorflow 索引檢查點檔案*的路徑或 URL（例如 ./tf_model/model.ckpt.index）。在這種情況下，from_tf 應設定為 True，並且應提供一個配置物件作為 config 引數。這種載入路徑比使用提供的轉換指令碼將 TensorFlow 檢查點轉換為 PyTorch 模型，然後再載入 PyTorch 模型要慢。
model_args (額外的位置引數, 可選) — 將傳遞給底層模型的 __init__() 方法。
config (PretrainedConfig, 可選) — 模型的配置，用於替代自動載入的配置。在以下情況下可以自動載入配置：
- 模型是庫提供的模型（使用預訓練模型的*模型 ID* 字串載入）。
- 模型是使用 save_pretrained() 儲存的，並透過提供儲存目錄重新載入。
- 模型透過提供本地目錄作為 pretrained_model_name_or_path 載入，並且在該目錄中找到了名為 *config.json* 的配置 JSON 檔案。
state_dict (dict[str, torch.Tensor], 可選) — 一個狀態字典，用於代替從儲存的權重檔案中載入的狀態字典。

如果您想從預訓練配置建立模型但載入自己的權重，可以使用此選項。但在這種情況下，您應該檢查使用 save_pretrained() 和 from_pretrained() 是否不是一個更簡單的選項。
cache_dir (str or os.PathLike, 可選) — 一個目錄的路徑，用於快取下載的預訓練模型配置，如果不想使用標準快取目錄。
from_tf (bool, 可選, 預設為 False) — 從 TensorFlow 檢查點儲存檔案中載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
force_download (bool, 可選, 預設為 False) — 是否強制（重新）下載模型權重和配置檔案，覆蓋已存在的快取版本。
resume_download — 已棄用並忽略。現在所有下載在可能的情況下都預設支援斷點續傳。將在 Transformers v5 版本中移除。
proxies (dict[str, str], 可選) — 按協議或端點使用的代理伺服器字典，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理將在每個請求中使用。
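output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).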
revision (str, 可選, 預設為 "main") — 要使用的特定模型版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們在 huggingface.co 上使用基於 git 的系統來儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
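trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models that are defined on the Hub in their own modeling files. Set this option to True only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.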
code_revision (str, 可選, 預設為 "main") — 如果程式碼位於與模型其餘部分不同的倉庫中，則指定 Hub 上要使用的特定程式碼版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們在 huggingface.co 上使用基於 git 的系統來儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
kwargs (額外的關鍵字引數, 可選) — 可用於更新配置物件（載入後）和初始化模型（例如 output_attentions=True）。其行為取決於是否提供了 config 或自動載入：
- 如果透過 config 提供了配置，**kwargs 將直接傳遞給底層模型的 __init__ 方法（我們假設所有對配置的相關更新已完成）。
- 如果沒有提供配置，kwargs 將首先傳遞給配置類的初始化函式（from_pretrained()）。kwargs 中與配置屬性對應的每個鍵將用於以提供的 kwargs 值覆蓋該屬性。不對應任何配置屬性的其餘鍵將傳遞給底層模型的 __init__ 函式。

從預訓練模型中例項化庫中的一個模型類（帶有零樣本影像分類頭）。

align — AlignModel (ALIGN 模型)
altclip — AltCLIPModel (AltCLIP 模型)
blip — BlipModel (BLIP 模型)
blip-2 — Blip2ForImageTextRetrieval (BLIP-2 模型)
chinese_clip — ChineseCLIPModel (Chinese-CLIP 模型)
clip — CLIPModel (CLIP 模型)
clipseg — CLIPSegModel (CLIPSeg 模型)
siglip — SiglipModel (SigLIP 模型)
siglip2 — Siglip2Model (SigLIP2 模型)
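The models listed above are image-text models, and zero-shot image classification with such models is commonly done by comparing one image embedding against the text embeddings of candidate labels. The sketch below shows only that scoring step in plain Python, under stated assumptions: the embeddings are made-up two-dimensional vectors, and `logit_scale` is a hypothetical temperature. It is not the forward pass of any model in the list.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_probs(image_embedding, text_embeddings, logit_scale=10.0):
    """Softmax over scaled image-text similarities, one probability per label."""
    logits = [logit_scale * cosine(image_embedding, t) for t in text_embeddings]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["a photo of a cat", "a photo of a dog"]
image_embedding = [1.0, 0.0]              # made-up image embedding
text_embeddings = [[0.9, 0.1], [0.0, 1.0]]  # made-up label embeddings
probs = zero_shot_probs(image_embedding, text_embeddings)
best = labels[probs.index(max(probs))]
print(best)  # a photo of a cat
```

No label-specific training is involved: changing the candidate labels only changes the text embeddings that the image is compared against.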

預設情況下，模型透過 model.eval() 設定為評估模式（例如，dropout 模組被停用）。要訓練模型，您應該首先使用 model.train() 將其設定回訓練模式。

示例

>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForZeroShotImageClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForZeroShotImageClassification

class transformers.TFAutoModelForZeroShotImageClassification

（ *args **kwargs ）

這是一個通用模型類，當使用 from_pretrained() 類方法或 from_config() 類方法建立時，它將被例項化為庫中的一個模型類（帶有零樣本影像分類頭）。

這個類不能直接使用 __init__() 進行例項化（會丟擲錯誤）。

from_config

（ **kwargs ）

引數

config (PretrainedConfig) — 要例項化的模型類是根據配置類選擇的：
- BlipConfig 配置類：TFBlipModel (BLIP 模型)
- CLIPConfig 配置類：TFCLIPModel (CLIP 模型)
attn_implementation (str, 可選) — 在模型中使用的注意力實現（如果相關）。可以是 "eager"（注意力的手動實現）、"sdpa"（使用 F.scaled_dot_product_attention）或 "flash_attention_2"（使用 Dao-AILab/flash-attention）中的任何一種。預設情況下，如果可用，SDPA 將用於 torch>=2.1.1。否則，預設是手動的 "eager" 實現。

從配置中例項化庫中的一個模型類（帶有零樣本影像分類頭）。

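Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.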

示例

>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/clip-vit-base-patch32")
>>> model = TFAutoModelForZeroShotImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

引數

pretrained_model_name_or_path (str or os.PathLike) — 可以是以下之一：
- 一個字串，即託管在 huggingface.co 的模型倉庫中的預訓練模型的*模型 ID*。
- 一個包含使用 save_pretrained() 儲存的模型權重的*目錄*路徑，例如 ./my_model_directory/。
- 一個*PyTorch state_dict 儲存檔案*的路徑或 URL（例如 ./pt_model/pytorch_model.bin）。在這種情況下，from_pt 應設定為 True，並且應提供一個配置物件作為 config 引數。這種載入路徑比使用提供的轉換指令碼將 PyTorch 模型轉換為 TensorFlow 模型，然後再載入 TensorFlow 模型要慢。
model_args (額外的位置引數, 可選) — 將傳遞給底層模型的 __init__() 方法。
config (PretrainedConfig, 可選) — 模型的配置，用於替代自動載入的配置。在以下情況下可以自動載入配置：
- 模型是庫提供的模型（使用預訓練模型的*模型 ID* 字串載入）。
- 模型是使用 save_pretrained() 儲存的，並透過提供儲存目錄重新載入。
- 模型透過提供本地目錄作為 pretrained_model_name_or_path 載入，並且在該目錄中找到了名為 *config.json* 的配置 JSON 檔案。
cache_dir (str or os.PathLike, 可選) — 一個目錄的路徑，用於快取下載的預訓練模型配置，如果不想使用標準快取目錄。
from_pt (bool, 可選, 預設為 False) — 從 PyTorch 檢查點儲存檔案中載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
force_download (bool, 可選, 預設為 False) — 是否強制（重新）下載模型權重和配置檔案，覆蓋已存在的快取版本。
resume_download — 已棄用並忽略。現在所有下載在可能的情況下都預設支援斷點續傳。將在 Transformers v5 版本中移除。
proxies (dict[str, str], 可選) — 按協議或端點使用的代理伺服器字典，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理將在每個請求中使用。
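output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).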
revision (str, 可選, 預設為 "main") — 要使用的特定模型版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們在 huggingface.co 上使用基於 git 的系統來儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
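trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models that are defined on the Hub in their own modeling files. Set this option to True only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.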
code_revision (str, 可選, 預設為 "main") — 如果程式碼位於與模型其餘部分不同的倉庫中，則指定 Hub 上要使用的特定程式碼版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們在 huggingface.co 上使用基於 git 的系統來儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
kwargs (額外的關鍵字引數, 可選) — 可用於更新配置物件（載入後）和初始化模型（例如 output_attentions=True）。其行為取決於是否提供了 config 或自動載入：
- 如果透過 config 提供了配置，**kwargs 將直接傳遞給底層模型的 __init__ 方法（我們假設所有對配置的相關更新已完成）。
- 如果沒有提供配置，kwargs 將首先傳遞給配置類的初始化函式（from_pretrained()）。kwargs 中與配置屬性對應的每個鍵將用於以提供的 kwargs 值覆蓋該屬性。不對應任何配置屬性的其餘鍵將傳遞給底層模型的 __init__ 函式。

從預訓練模型中例項化庫中的一個模型類（帶有零樣本影像分類頭）。

blip — TFBlipModel (BLIP 模型)
clip — TFCLIPModel (CLIP 模型)

示例

>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

>>> # Update configuration during loading
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForZeroShotObjectDetection

class transformers.AutoModelForZeroShotObjectDetection

（ *args **kwargs ）

這是一個通用模型類，當使用 from_pretrained() 類方法或 from_config() 類方法建立時，它將被例項化為庫中的一個模型類（帶有零樣本目標檢測頭）。

這個類不能直接使用 __init__() 進行例項化（會丟擲錯誤）。

from_config

（ **kwargs ）

引數

config (PretrainedConfig) — 要例項化的模型類是根據配置類選擇的：
- GroundingDinoConfig 配置類： GroundingDinoForObjectDetection (Grounding DINO 模型)
- OmDetTurboConfig 配置類： OmDetTurboForObjectDetection (OmDet-Turbo 模型)
- OwlViTConfig 配置類： OwlViTForObjectDetection (OWL-ViT 模型)
- Owlv2Config 配置類： Owlv2ForObjectDetection (OWLv2 模型)
attn_implementation (str, 可選) — 模型中使用的注意力實現（如果相關）。可以是 "eager"（手動實現注意力）、"sdpa"（使用 F.scaled_dot_product_attention）或 "flash_attention_2"（使用 Dao-AILab/flash-attention）。預設情況下，如果可用，SDPA 將用於 torch>=2.1.1。否則，預設為手動的 "eager" 實現。

Instantiates one of the model classes of the library (with a zero-shot object detection head) from a configuration.

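Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.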

示例

>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/owlvit-base-patch32")
>>> model = AutoModelForZeroShotObjectDetection.from_config(config)

from_pretrained

( *model_args **kwargs )

引數

pretrained_model_name_or_path (str 或 os.PathLike) — 可以是以下之一：
- 一個字串，即託管在 huggingface.co 上的模型倉庫中的預訓練模型的*模型 ID*。
- 一個包含使用 save_pretrained() 儲存的模型權重的*目錄*的路徑，例如 ./my_model_directory/。
- 一個指向 *tensorflow 索引檢查點檔案* 的路徑或 URL（例如，./tf_model/model.ckpt.index）。在這種情況下，應將 from_tf 設定為 True，並應提供一個配置物件作為 config 引數。這種載入路徑比使用提供的轉換指令碼將 TensorFlow 檢查點轉換為 PyTorch 模型，然後再載入 PyTorch 模型要慢。
model_args (額外的位置引數, 可選) — 將傳遞給底層模型的 __init__() 方法。
config (PretrainedConfig, 可選) — 模型的配置，用於替代自動載入的配置。在以下情況下可以自動載入配置：
- 模型是庫提供的模型（使用預訓練模型的*模型 ID* 字串載入）。
- 模型是使用 save_pretrained() 儲存的，並透過提供儲存目錄重新載入。
- 模型透過提供本地目錄作為 pretrained_model_name_or_path 載入，並且在該目錄中找到了名為 *config.json* 的配置檔案。
state_dict (dict[str, torch.Tensor], 可選) — 用於代替從已儲存的權重檔案中載入的狀態字典。

如果你想從預訓練的配置建立模型但載入自己的權重，可以使用此選項。但在這種情況下，你應該檢查使用 save_pretrained() 和 from_pretrained() 是否不是一個更簡單的選項。
cache_dir (str 或 os.PathLike, 可選) — 如果不應使用標準快取，則為下載的預訓練模型配置應快取的目錄路徑。
from_tf (bool, 可選, 預設為 False) — 從 TensorFlow 檢查點儲存檔案中載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
force_download (bool, 可選, 預設為 False) — 是否強制（重新）下載模型權重和配置檔案，覆蓋已存在的快取版本。
resume_download — 已棄用並被忽略。現在所有下載在可能的情況下都會預設恢復。將在 Transformers 的 v5 版本中移除。
proxies (dict[str, str], 可選) — 按協議或端點使用的代理伺服器字典，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理在每個請求上使用。
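output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).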
revision (str, 可選, 預設為 "main") — 要使用的特定模型版本。可以是一個分支名、一個標籤名或一個提交 ID，因為我們使用基於 git 的系統在 huggingface.co 上儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
trust_remote_code (bool, 可選, 預設為 False) — 是否允許在 Hub 上定義的自定義模型使用其自己的建模檔案。此選項只應為你信任的且已閱讀其程式碼的倉庫設定為 True，因為它將在你的本地機器上執行 Hub 上的程式碼。
code_revision (str, 可選, 預設為 "main") — 如果程式碼位於與模型其餘部分不同的倉庫中，則用於 Hub 上程式碼的特定修訂版。可以是一個分支名、一個標籤名或一個提交 ID，因為我們使用基於 git 的系統在 huggingface.co 上儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
kwargs (額外的關鍵字引數, 可選) — 可用於更新配置物件（在載入後）和初始化模型（例如，output_attentions=True）。其行為因是否提供 config 或自動載入而異：
- 如果透過 config 提供了配置，**kwargs 將直接傳遞給底層模型的 __init__ 方法（我們假設所有相關的配置更新都已完成）。
- 如果沒有提供配置，kwargs 將首先傳遞給配置類的初始化函式 (from_pretrained())。kwargs 中每個與配置屬性對應的鍵將用於使用提供的 kwargs 值覆蓋該屬性。不對應任何配置屬性的其餘鍵將傳遞給底層模型的 __init__ 函式。
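The `output_loading_info` parameter above mentions missing and unexpected keys. Conceptually these are set differences between the parameter names the model expects and the names found in the checkpoint. The sketch below illustrates only that idea: `compare_keys` is a hypothetical helper, the key names are invented, and the real dictionary also carries error messages that are not modelled here.

```python
def compare_keys(expected_keys, checkpoint_keys):
    """Report which parameter names are absent from, or extra in, a checkpoint."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    return {
        # Expected by the model but not present in the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
    }


model_keys = ["encoder.weight", "encoder.bias", "head.weight"]
checkpoint_keys = ["encoder.weight", "encoder.bias", "old_head.weight"]
info = compare_keys(model_keys, checkpoint_keys)
print(info["missing_keys"])     # ['head.weight']
print(info["unexpected_keys"])  # ['old_head.weight']
```

A missing key usually means that part of the model, often a task head, starts from a fresh initialisation and still needs training.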

Instantiate one of the model classes of the library (with a zero-shot object detection head) from a pretrained model.

grounding-dino — GroundingDinoForObjectDetection (Grounding DINO 模型)
omdet-turbo — OmDetTurboForObjectDetection (OmDet-Turbo 模型)
owlv2 — Owlv2ForObjectDetection (OWLv2 模型)
owlvit — OwlViTForObjectDetection (OWL-ViT 模型)

預設情況下，模型透過 model.eval() 設定為評估模式（例如，dropout 模組被停用）。要訓練模型，您應該首先使用 model.train() 將其設定回訓練模式。

示例

>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

音訊

以下自動類可用於以下音訊任務。

AutoModelForAudioClassification

class transformers.AutoModelForAudioClassification

（ *args **kwargs ）

這是一個通用的模型類，當使用 from_pretrained() 類方法或 from_config() 類方法建立時，它將被例項化為庫中帶有音訊分類頭的模型類之一。

這個類不能直接使用 __init__() 進行例項化（會丟擲錯誤）。

from_config

（ **kwargs ）

引數

config (PretrainedConfig) — 要例項化的模型類是根據配置類選擇的：
- ASTConfig 配置類：ASTForAudioClassification (Audio Spectrogram Transformer 模型)
- Data2VecAudioConfig 配置類：Data2VecAudioForSequenceClassification (Data2VecAudio 模型)
- HubertConfig 配置類：HubertForSequenceClassification (Hubert 模型)
- SEWConfig 配置類：SEWForSequenceClassification (SEW 模型)
- SEWDConfig 配置類：SEWDForSequenceClassification (SEW-D 模型)
- UniSpeechConfig 配置類：UniSpeechForSequenceClassification (UniSpeech 模型)
- UniSpeechSatConfig 配置類：UniSpeechSatForSequenceClassification (UniSpeechSat 模型)
- Wav2Vec2BertConfig 配置類：Wav2Vec2BertForSequenceClassification (Wav2Vec2-BERT 模型)
- Wav2Vec2Config 配置類：Wav2Vec2ForSequenceClassification (Wav2Vec2 模型)
- Wav2Vec2ConformerConfig 配置類：Wav2Vec2ConformerForSequenceClassification (Wav2Vec2-Conformer 模型)
- WavLMConfig 配置類：WavLMForSequenceClassification (WavLM 模型)
- WhisperConfig 配置類：WhisperForAudioClassification (Whisper 模型)
attn_implementation (str, 可選) — 模型中使用的注意力實現（如果相關）。可以是 "eager"（手動實現注意力）、"sdpa"（使用 F.scaled_dot_product_attention）或 "flash_attention_2"（使用 Dao-AILab/flash-attention）。預設情況下，如果可用，SDPA 將用於 torch>=2.1.1。否則，預設為手動的 "eager" 實現。
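The default rule stated for `attn_implementation` (SDPA when available on torch>=2.1.1, otherwise "eager") can be written as a small decision function. This is a sketch of the stated rule only, not the library's selection code. `default_attn_implementation` and its `sdpa_supported` flag are hypothetical, and the version parser assumes plain numeric versions with an optional local suffix such as `+cu121`.

```python
def parse_version(version):
    """Turn '2.3.0+cu121' into (2, 3, 0); assumes numeric major.minor.patch."""
    numeric = version.split("+")[0]
    return tuple(int(part) for part in numeric.split(".")[:3])


def default_attn_implementation(torch_version, sdpa_supported=True):
    """Pick 'sdpa' when the model supports it and torch is recent enough."""
    if sdpa_supported and parse_version(torch_version) >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.3.0+cu121"))  # sdpa
print(default_attn_implementation("2.0.1"))        # eager
print(default_attn_implementation("2.3.0", sdpa_supported=False))  # eager
```

Passing `attn_implementation` explicitly bypasses this default, which is useful when a specific implementation is needed for reproducibility.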

從一個配置中例項化一個庫中的模型類（帶有音訊分類頭）。

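Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.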

示例

>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("superb/wav2vec2-base-superb-ks")
>>> model = AutoModelForAudioClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

引數

pretrained_model_name_or_path (str 或 os.PathLike) — 可以是以下之一：
- 一個字串，即託管在 huggingface.co 上的模型倉庫中的預訓練模型的*模型 ID*。
- 一個包含使用 save_pretrained() 儲存的模型權重的*目錄*的路徑，例如 ./my_model_directory/。
- 一個指向 *tensorflow 索引檢查點檔案* 的路徑或 URL（例如，./tf_model/model.ckpt.index）。在這種情況下，應將 from_tf 設定為 True，並應提供一個配置物件作為 config 引數。這種載入路徑比使用提供的轉換指令碼將 TensorFlow 檢查點轉換為 PyTorch 模型，然後再載入 PyTorch 模型要慢。
model_args (額外的位置引數, 可選) — 將傳遞給底層模型的 __init__() 方法。
config (PretrainedConfig, 可選) — 模型的配置，用於替代自動載入的配置。在以下情況下可以自動載入配置：
- 模型是庫提供的模型（使用預訓練模型的*模型 ID* 字串載入）。
- 模型是使用 save_pretrained() 儲存的，並透過提供儲存目錄重新載入。
- 模型透過提供本地目錄作為 pretrained_model_name_or_path 載入，並且在該目錄中找到了名為 *config.json* 的配置檔案。
state_dict (dict[str, torch.Tensor], 可選) — 用於代替從已儲存的權重檔案中載入的狀態字典。

如果你想從預訓練的配置建立模型但載入自己的權重，可以使用此選項。但在這種情況下，你應該檢查使用 save_pretrained() 和 from_pretrained() 是否不是一個更簡單的選項。
cache_dir (str 或 os.PathLike, 可選) — 如果不應使用標準快取，則為下載的預訓練模型配置應快取的目錄路徑。
from_tf (bool, 可選, 預設為 False) — 從 TensorFlow 檢查點儲存檔案中載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
force_download (bool, 可選, 預設為 False) — 是否強制（重新）下載模型權重和配置檔案，覆蓋已存在的快取版本。
resume_download — 已棄用並被忽略。現在所有下載在可能的情況下都會預設恢復。將在 Transformers 的 v5 版本中移除。
proxies (dict[str, str], 可選) — 按協議或端點使用的代理伺服器字典，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理在每個請求上使用。
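output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).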
revision (str, 可選, 預設為 "main") — 要使用的特定模型版本。可以是一個分支名、一個標籤名或一個提交 ID，因為我們使用基於 git 的系統在 huggingface.co 上儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
trust_remote_code (bool, 可選, 預設為 False) — 是否允許在 Hub 上定義的自定義模型使用其自己的建模檔案。此選項只應為你信任的且已閱讀其程式碼的倉庫設定為 True，因為它將在你的本地機器上執行 Hub 上的程式碼。
code_revision (str, 可選, 預設為 "main") — 如果程式碼位於與模型其餘部分不同的倉庫中，則用於 Hub 上程式碼的特定修訂版。可以是一個分支名、一個標籤名或一個提交 ID，因為我們使用基於 git 的系統在 huggingface.co 上儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
kwargs (額外的關鍵字引數, 可選) — 可用於更新配置物件（在載入後）和初始化模型（例如，output_attentions=True）。其行為因是否提供 config 或自動載入而異：
- 如果透過 config 提供了配置，**kwargs 將直接傳遞給底層模型的 __init__ 方法（我們假設所有相關的配置更新都已完成）。
- 如果沒有提供配置，kwargs 將首先傳遞給配置類的初始化函式 (from_pretrained())。kwargs 中每個與配置屬性對應的鍵將用於使用提供的 kwargs 值覆蓋該屬性。不對應任何配置屬性的其餘鍵將傳遞給底層模型的 __init__ 函式。

從預訓練模型中例項化一個庫中的模型類（帶有音訊分類頭）。

audio-spectrogram-transformer — ASTForAudioClassification (Audio Spectrogram Transformer 模型)
data2vec-audio — Data2VecAudioForSequenceClassification (Data2VecAudio 模型)
hubert — HubertForSequenceClassification (Hubert 模型)
sew — SEWForSequenceClassification (SEW 模型)
sew-d — SEWDForSequenceClassification (SEW-D 模型)
unispeech — UniSpeechForSequenceClassification (UniSpeech 模型)
unispeech-sat — UniSpeechSatForSequenceClassification (UniSpeechSat 模型)
wav2vec2 — Wav2Vec2ForSequenceClassification (Wav2Vec2 模型)
wav2vec2-bert — Wav2Vec2BertForSequenceClassification (Wav2Vec2-BERT 模型)
wav2vec2-conformer — Wav2Vec2ConformerForSequenceClassification (Wav2Vec2-Conformer 模型)
wavlm — WavLMForSequenceClassification (WavLM 模型)
whisper — WhisperForAudioClassification (Whisper 模型)

預設情況下，模型透過 model.eval() 設定為評估模式（例如，dropout 模組被停用）。要訓練模型，您應該首先使用 model.train() 將其設定回訓練模式。
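The difference between evaluation and training mode matters because layers such as dropout behave differently in each. The toy layer below imitates that behaviour in plain Python so the effect is visible without any framework. `ToyDropout` is a hypothetical class written for illustration, not a PyTorch module.

```python
import random


class ToyDropout:
    """Hypothetical dropout layer: random zeroing in training, identity in eval."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: the layer is disabled and inputs pass through.
            return list(values)
        # Training mode: drop each value with probability p, rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if self._rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5)
inputs = [1.0, 2.0, 3.0, 4.0]
eval_output = layer.eval()(inputs)
train_output = layer.train()(inputs)
print(eval_output)  # [1.0, 2.0, 3.0, 4.0]
```

In training mode each output is either zero or the rescaled input, so repeated calls on the same input generally differ. This is why a model returned by from_pretrained() gives deterministic outputs until `model.train()` is called.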

示例

>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioClassification.from_pretrained("superb/wav2vec2-base-superb-ks")

>>> # Update configuration during loading
>>> model = AutoModelForAudioClassification.from_pretrained("superb/wav2vec2-base-superb-ks", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForAudioClassification

class transformers.TFAutoModelForAudioClassification

（ *args **kwargs ）

這是一個通用的模型類，當使用 from_pretrained() 類方法或 from_config() 類方法建立時，它將被例項化為庫中帶有音訊分類頭的模型類之一。

這個類不能直接使用 __init__() 進行例項化（會丟擲錯誤）。

from_config

（ **kwargs ）

引數

config (PretrainedConfig) — 要例項化的模型類是根據配置類選擇的：
- Wav2Vec2Config 配置類：TFWav2Vec2ForSequenceClassification (Wav2Vec2 模型)
attn_implementation (str, 可選) — 模型中使用的注意力實現（如果相關）。可以是 "eager"（手動實現注意力）、"sdpa"（使用 F.scaled_dot_product_attention）或 "flash_attention_2"（使用 Dao-AILab/flash-attention）。預設情況下，如果可用，SDPA 將用於 torch>=2.1.1。否則，預設為手動的 "eager" 實現。

從一個配置中例項化一個庫中的模型類（帶有音訊分類頭）。

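Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.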

示例

>>> from transformers import AutoConfig, TFAutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base-960h")
>>> model = TFAutoModelForAudioClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

引數

pretrained_model_name_or_path (str 或 os.PathLike) — 可以是以下之一：
- 一個字串，即託管在 huggingface.co 上的模型倉庫中的預訓練模型的*模型 ID*。
- 一個包含使用 save_pretrained() 儲存的模型權重的*目錄*的路徑，例如 ./my_model_directory/。
- 一個指向 *PyTorch state_dict 儲存檔案* 的路徑或 URL（例如 ./pt_model/pytorch_model.bin）。在這種情況下，應將 from_pt 設定為 True，並應提供一個配置物件作為 config 引數。這種載入路徑比使用提供的轉換指令碼將 PyTorch 模型轉換為 TensorFlow 模型，然後再載入 TensorFlow 模型要慢。
model_args (額外的位置引數, 可選) — 將傳遞給底層模型的 __init__() 方法。
config (PretrainedConfig, 可選) — 模型的配置，用於替代自動載入的配置。在以下情況下可以自動載入配置：
- 模型是庫提供的模型（使用預訓練模型的*模型 ID* 字串載入）。
- 模型是使用 save_pretrained() 儲存的，並透過提供儲存目錄重新載入。
- 模型透過提供本地目錄作為 pretrained_model_name_or_path 載入，並且在該目錄中找到了名為 *config.json* 的配置檔案。
cache_dir (str or os.PathLike, optional) — 目錄路徑，如果不想使用標準快取，下載的預訓練模型配置將快取到該目錄中。
from_pt (bool, optional, defaults to False) — 從 PyTorch 檢查點儲存檔案載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
force_download (bool, optional, defaults to False) — 是否強制（重新）下載模型權重和配置檔案，若存在快取版本則覆蓋它們。
resume_download — 已棄用並被忽略。現在所有下載在可能的情況下都預設支援斷點續傳。將在 Transformers v5 中移除。
proxies (dict[str, str], optional) — 一個字典，用於指定按協議或端點使用的代理伺服器，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理將用於每個請求。
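output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).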
revision (str, optional, defaults to "main") — 要使用的特定模型版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們使用基於 git 的系統在 huggingface.co 上儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
trust_remote_code (bool, optional, defaults to False) — 是否允許在 Hub 上的自定義模型檔案中定義模型。此選項只應在你信任的且已閱讀過程式碼的倉庫中設定為 True，因為它將在你的本地計算機上執行 Hub 上的程式碼。
code_revision (str, optional, defaults to "main") — 如果 Hub 上的程式碼與模型的其餘部分不在同一個倉庫中，要使用的特定程式碼版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們使用基於 git 的系統在 huggingface.co 上儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
kwargs (additional keyword arguments, optional) — 可用於更新配置物件（載入後）並初始化模型（例如，output_attentions=True）。其行為因是否提供 config 或自動載入而異：
- 如果透過 config 提供了配置，**kwargs 將直接傳遞給底層模型的 __init__ 方法（我們假設所有對配置的相關更新已經完成）。
- 如果沒有提供配置，kwargs 將首先傳遞給配置類的初始化函式（from_pretrained()）。kwargs 中與配置屬性對應的每個鍵將用於使用提供的 kwargs 值覆蓋該屬性。不對應任何配置屬性的剩餘鍵將傳遞給底層模型的 __init__ 函式。

從預訓練模型中例項化一個庫中的模型類（帶有音訊分類頭）。

wav2vec2 — TFWav2Vec2ForSequenceClassification (Wav2Vec2 模型)

示例

>>> from transformers import AutoConfig, TFAutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base-960h")

>>> # Update configuration during loading
>>> model = TFAutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForAudioClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForAudioFrameClassification

class transformers.AutoModelForAudioFrameClassification

（ *args **kwargs ）

這是一個通用的模型類，當使用 from_pretrained() 類方法或 from_config() 類方法建立時，它將被例項化為庫中的某個模型類（帶有音訊幀（詞元）分類頭）。

這個類不能直接使用 __init__() 進行例項化（會丟擲錯誤）。

from_config

（ **kwargs ）

引數

config (PretrainedConfig) — 要例項化的模型類是根據配置類選擇的：
- Data2VecAudioConfig 配置類：Data2VecAudioForAudioFrameClassification (Data2VecAudio 模型)
- UniSpeechSatConfig 配置類：UniSpeechSatForAudioFrameClassification (UniSpeechSat 模型)
- Wav2Vec2BertConfig 配置類：Wav2Vec2BertForAudioFrameClassification (Wav2Vec2-BERT 模型)
- Wav2Vec2Config 配置類：Wav2Vec2ForAudioFrameClassification (Wav2Vec2 模型)
- Wav2Vec2ConformerConfig 配置類：Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2-Conformer 模型)
- WavLMConfig 配置類：WavLMForAudioFrameClassification (WavLM 模型)
attn_implementation (str, optional) — 模型中要使用的注意力實現（如果相關）。可以是 "eager"（注意力的手動實現）、"sdpa"（使用 F.scaled_dot_product_attention）或 "flash_attention_2"（使用 Dao-AILab/flash-attention）。預設情況下，如果可用，SDPA 將用於 torch>=2.1.1。否則，預設是手動實現的 "eager"。

根據配置例項化庫中的一個模型類（帶有音訊幀（詞元）分類頭）。

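Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.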

示例

>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("anton-l/wav2vec2-base-superb-sd")
>>> model = AutoModelForAudioFrameClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

引數

pretrained_model_name_or_path (str or os.PathLike) — 可以是以下之一：
- 字串，即 huggingface.co 上模型倉庫中託管的預訓練模型的 *模型 ID*。
- 包含使用 save_pretrained() 儲存的模型權重的*目錄*路徑，例如 ./my_model_directory/。
- *tensorflow 索引檢查點檔案*的路徑或 URL（例如，./tf_model/model.ckpt.index）。在這種情況下，應將 from_tf 設定為 True，並應提供一個配置物件作為 config 引數。這種載入路徑比使用提供的轉換指令碼將 TensorFlow 檢查點轉換為 PyTorch 模型然後載入 PyTorch 模型要慢。
model_args (additional positional arguments, optional) — 將傳遞給底層模型 __init__() 方法。
config (PretrainedConfig, optional) — 用於模型的配置，而不是自動載入的配置。配置可以在以下情況下自動載入：
- 模型是庫提供的模型（使用預訓練模型的 *模型 ID* 字串載入）。
- 模型使用 save_pretrained() 儲存，並透過提供儲存目錄重新載入。
- 透過提供本地目錄作為 pretrained_model_name_or_path 載入模型，並且在該目錄中找到名為 *config.json* 的配置 JSON 檔案。
state_dict (dict[str, torch.Tensor], optional) — 一個狀態字典，用於代替從已儲存的權重檔案中載入的狀態字典。

如果您想從預訓練的配置建立模型但載入自己的權重，可以使用此選項。但在這種情況下，您應該檢查使用 save_pretrained() 和 from_pretrained() 是否不是更簡單的選擇。
cache_dir (str or os.PathLike, optional) — 目錄路徑，如果不想使用標準快取，下載的預訓練模型配置將快取到該目錄中。
from_tf (bool, optional, defaults to False) — 從 TensorFlow 檢查點儲存檔案載入模型權重（請參閱 pretrained_model_name_or_path 引數的文件字串）。
force_download (bool, optional, defaults to False) — 是否強制（重新）下載模型權重和配置檔案，若存在快取版本則覆蓋它們。
resume_download — 已棄用並被忽略。現在所有下載在可能的情況下都預設支援斷點續傳。將在 Transformers v5 中移除。
proxies (dict[str, str], optional) — 一個字典，用於指定按協議或端點使用的代理伺服器，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理將用於每個請求。
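output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).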
revision (str, optional, defaults to "main") — 要使用的特定模型版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們使用基於 git 的系統在 huggingface.co 上儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
trust_remote_code (bool, optional, defaults to False) — 是否允許在 Hub 上的自定義模型檔案中定義模型。此選項只應在你信任的且已閱讀過程式碼的倉庫中設定為 True，因為它將在你的本地計算機上執行 Hub 上的程式碼。
code_revision (str, optional, defaults to "main") — 如果 Hub 上的程式碼與模型的其餘部分不在同一個倉庫中，要使用的特定程式碼版本。它可以是分支名稱、標籤名稱或提交 ID，因為我們使用基於 git 的系統在 huggingface.co 上儲存模型和其他工件，所以 revision 可以是 git 允許的任何識別符號。
kwargs (additional keyword arguments, optional) — 可用於更新配置物件（載入後）並初始化模型（例如，output_attentions=True）。其行為因是否提供 config 或自動載入而異：
- 如果透過 config 提供了配置，**kwargs 將直接傳遞給底層模型的 __init__ 方法（我們假設所有對配置的相關更新已經完成）。
- 如果沒有提供配置，kwargs 將首先傳遞給配置類的初始化函式（from_pretrained()）。kwargs 中與配置屬性對應的每個鍵將用於使用提供的 kwargs 值覆蓋該屬性。不對應任何配置屬性的剩餘鍵將傳遞給底層模型的 __init__ 函式。

從預訓練模型例項化庫中的一個模型類（帶有音訊幀（詞元）分類頭）。

data2vec-audio — Data2VecAudioForAudioFrameClassification (Data2VecAudio 模型)
unispeech-sat — UniSpeechSatForAudioFrameClassification (UniSpeechSat 模型)
wav2vec2 — Wav2Vec2ForAudioFrameClassification (Wav2Vec2 模型)
wav2vec2-bert — Wav2Vec2BertForAudioFrameClassification (Wav2Vec2-BERT 模型)
wav2vec2-conformer — Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2-Conformer 模型)
wavlm — WavLMForAudioFrameClassification (WavLM 模型)
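A frame classification head assigns a label to every audio frame, and downstream code often merges runs of identical labels into time segments. The sketch below shows that post-processing step in plain Python. `frames_to_segments` is a hypothetical helper, and the half-second frame duration is an arbitrary value chosen to keep the arithmetic readable, not a property of any model listed above.

```python
def frames_to_segments(frame_labels, frame_duration):
    """Merge consecutive identical frame labels into (label, start, end) tuples."""
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # Close the current run at the end of the input or when the label changes.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append(
                (frame_labels[start], start * frame_duration, i * frame_duration)
            )
            start = i
    return segments


frame_labels = ["speech", "speech", "silence", "speech"]
segments = frames_to_segments(frame_labels, frame_duration=0.5)
print(segments)
# [('speech', 0.0, 1.0), ('silence', 1.0, 1.5), ('speech', 1.5, 2.0)]
```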

預設情況下，模型透過 model.eval() 設定為評估模式（例如，dropout 模組被停用）。要訓練模型，您應該首先使用 model.train() 將其設定回訓練模式。

示例

>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioFrameClassification.from_pretrained("anton-l/wav2vec2-base-superb-sd")

>>> # Update configuration during loading
>>> model = AutoModelForAudioFrameClassification.from_pretrained("anton-l/wav2vec2-base-superb-sd", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioFrameClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForCTC

class transformers.AutoModelForCTC

（ *args **kwargs ）

這是一個通用的模型類，當使用 from_pretrained() 類方法或 from_config() 類方法建立時，它將被例項化為庫中的某個模型類（帶有連線時序分類頭）。

這個類不能直接使用 __init__() 進行例項化（會丟擲錯誤）。

from_config

（ **kwargs ）

引數

config (PretrainedConfig) — 要例項化的模型類是根據配置類選擇的：
- Data2VecAudioConfig 配置類：Data2VecAudioForCTC (Data2VecAudio 模型)
- HubertConfig 配置類：HubertForCTC (Hubert 模型)
- MCTCTConfig 配置類：MCTCTForCTC (M-CTC-T 模型)
- SEWConfig 配置類：SEWForCTC (SEW 模型)
- SEWDConfig 配置類：SEWDForCTC (SEW-D 模型)
- UniSpeechConfig 配置類：UniSpeechForCTC (UniSpeech 模型)
- UniSpeechSatConfig 配置類：UniSpeechSatForCTC (UniSpeechSat 模型)
- Wav2Vec2BertConfig 配置類：Wav2Vec2BertForCTC (Wav2Vec2-BERT 模型)
- Wav2Vec2Config 配置類：Wav2Vec2ForCTC (Wav2Vec2 模型)
- Wav2Vec2ConformerConfig 配置類：Wav2Vec2ConformerForCTC (Wav2Vec2-Conformer 模型)
- WavLMConfig 配置類：WavLMForCTC (WavLM 模型)
attn_implementation (str, optional) — 模型中要使用的注意力實現（如果相關）。可以是 "eager"（注意力的手動實現）、"sdpa"（使用 F.scaled_dot_product_attention）或 "flash_attention_2"（使用 Dao-AILab/flash-attention）。預設情況下，如果可用，SDPA 將用於 torch>=2.1.1。否則，預設是手動實現的 "eager"。

根據配置例項化庫中的一個模型類（帶有連線時序分類頭）。

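Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.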

示例

>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base-960h")
>>> model = AutoModelForCTC.from_config(config)

from_pretrained

( *model_args **kwargs )

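Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.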
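
The two kwargs behaviours just described can be sketched without the library. The function `route_kwargs` and the plain dictionaries standing in for configuration objects are hypothetical; this is only a simplified illustration of the routing rule, not the library's implementation.

```python
def route_kwargs(kwargs, provided_config=None, loaded_config=None):
    """Return (config, model_kwargs) following the two cases described above.

    Configurations are modelled as plain dicts of attribute names to values.
    """
    if provided_config is not None:
        # Case 1: a config was passed explicitly, so it is left untouched and
        # every keyword argument goes to the model's __init__.
        return provided_config, dict(kwargs)

    # Case 2: the config was loaded automatically. Keys that match a config
    # attribute override it; the remaining keys go to the model's __init__.
    config = dict(loaded_config or {})
    model_kwargs = {}
    for key, value in kwargs.items():
        if key in config:
            config[key] = value
        else:
            model_kwargs[key] = value
    return config, model_kwargs


loaded = {"output_attentions": False, "hidden_size": 768}
config, model_kwargs = route_kwargs(
    {"output_attentions": True, "custom_flag": 1}, loaded_config=loaded
)
print(config)        # output_attentions is overridden
print(model_kwargs)  # custom_flag is forwarded to the model
```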

Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.

data2vec-audio — Data2VecAudioForCTC (Data2VecAudio model)
hubert — HubertForCTC (Hubert model)
mctct — MCTCTForCTC (M-CTC-T model)
sew — SEWForCTC (SEW model)
sew-d — SEWDForCTC (SEW-D model)
unispeech — UniSpeechForCTC (UniSpeech model)
unispeech-sat — UniSpeechSatForCTC (UniSpeechSat model)
wav2vec2 — Wav2Vec2ForCTC (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2BertForCTC (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2ConformerForCTC (Wav2Vec2-Conformer model)
wavlm — WavLMForCTC (WavLM model)
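
The list above reads naturally as a lookup table. The sketch below assumes that the key on the left of each entry is the `model_type` value stored in a checkpoint's config.json and resolves it to a class name; the helper `resolve_ctc_class_name` is hypothetical, and the real selection logic may include fallbacks that are not shown here.

```python
import json

# Lookup table copied from the list above: model_type -> model class name.
CTC_MODEL_NAMES = {
    "data2vec-audio": "Data2VecAudioForCTC",
    "hubert": "HubertForCTC",
    "mctct": "MCTCTForCTC",
    "sew": "SEWForCTC",
    "sew-d": "SEWDForCTC",
    "unispeech": "UniSpeechForCTC",
    "unispeech-sat": "UniSpeechSatForCTC",
    "wav2vec2": "Wav2Vec2ForCTC",
    "wav2vec2-bert": "Wav2Vec2BertForCTC",
    "wav2vec2-conformer": "Wav2Vec2ConformerForCTC",
    "wavlm": "WavLMForCTC",
}


def resolve_ctc_class_name(config_json: str) -> str:
    """Return the class name registered for the model_type found in a config.json string."""
    model_type = json.loads(config_json).get("model_type")
    if model_type not in CTC_MODEL_NAMES:
        raise ValueError(f"No CTC model is registered for model_type={model_type!r}.")
    return CTC_MODEL_NAMES[model_type]


print(resolve_ctc_class_name('{"model_type": "wav2vec2"}'))
```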

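The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples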

>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h")

>>> # Update configuration during loading
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCTC.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForSpeechSeq2Seq

class transformers.AutoModelForSpeechSeq2Seq

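( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DiaConfig configuration class: DiaForConditionalGeneration (Dia model)
- GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeech model)
- KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToText model)
- MoonshineConfig configuration class: MoonshineForConditionalGeneration (Moonshine model)
- Pop2PianoConfig configuration class: Pop2PianoForConditionalGeneration (Pop2Piano model)
- SeamlessM4TConfig configuration class: SeamlessM4TForSpeechToText (SeamlessM4T model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2ForSpeechToText (SeamlessM4Tv2 model)
- Speech2TextConfig configuration class: Speech2TextForConditionalGeneration (Speech2Text model)
- SpeechEncoderDecoderConfig configuration class: SpeechEncoderDecoderModel (Speech Encoder decoder model)
- SpeechT5Config configuration class: SpeechT5ForSpeechToText (SpeechT5 model)
- WhisperConfig configuration class: WhisperForConditionalGeneration (Whisper model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples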

>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = AutoModelForSpeechSeq2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

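Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.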
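
To make the output_loading_info description more concrete, the sketch below compares the parameter names a model expects with the names found in a checkpoint and reports the difference. The helper `compare_keys` is hypothetical, and the dictionary keys it uses ("missing_keys", "unexpected_keys", "error_msgs") are assumptions chosen to mirror the wording of the parameter description.

```python
def compare_keys(expected_keys, checkpoint_keys):
    """Report which expected weights are absent and which checkpoint entries are unused."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Expected by the model but not present in the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
        # This toy comparison never produces error messages.
        "error_msgs": [],
    }


info = compare_keys(
    expected_keys=["encoder.weight", "encoder.bias", "lm_head.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
)
print(info)
```

Missing keys usually mean that part of the model was freshly initialised, while unexpected keys mean that part of the checkpoint was ignored, so both lists are worth checking after loading.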

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

dia — DiaForConditionalGeneration (Dia model)
granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeech model)
kyutai_speech_to_text — KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToText model)
moonshine — MoonshineForConditionalGeneration (Moonshine model)
pop2piano — Pop2PianoForConditionalGeneration (Pop2Piano model)
seamless_m4t — SeamlessM4TForSpeechToText (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4Tv2ForSpeechToText (SeamlessM4Tv2 model)
speech-encoder-decoder — SpeechEncoderDecoderModel (Speech Encoder decoder model)
speech_to_text — Speech2TextForConditionalGeneration (Speech2Text model)
speecht5 — SpeechT5ForSpeechToText (SpeechT5 model)
whisper — WhisperForConditionalGeneration (Whisper model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSpeechSeq2Seq

class transformers.TFAutoModelForSpeechSeq2Seq

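( *args **kwargs )

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Speech2TextConfig configuration class: TFSpeech2TextForConditionalGeneration (Speech2Text model)
- WhisperConfig configuration class: TFWhisperForConditionalGeneration (Whisper model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples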

>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = TFAutoModelForSpeechSeq2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

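Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.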
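
The proxies dictionary accepts keys of two kinds: a bare protocol ("http") and a protocol plus host ("http://hostname"). The sketch below shows one plausible way such a dictionary could be consulted for a given URL. The function `pick_proxy` is hypothetical, and the precedence rule (the more specific endpoint key wins over the protocol key) is an assumption that is not stated in the parameter description.

```python
from urllib.parse import urlsplit


def pick_proxy(url, proxies):
    """Return the proxy configured for this URL, or None if no key matches."""
    parts = urlsplit(url)
    # Assumed precedence: try the endpoint-specific key first, then the protocol key.
    for key in (f"{parts.scheme}://{parts.hostname}", parts.scheme):
        if key in proxies:
            return proxies[key]
    return None


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}
print(pick_proxy("http://hostname/some/file", proxies))     # endpoint-specific entry
print(pick_proxy("http://example.org/some/file", proxies))  # protocol-wide entry
print(pick_proxy("https://example.org/some/file", proxies)) # no entry for https
```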

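Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

speech_to_text — TFSpeech2TextForConditionalGeneration (Speech2Text model)
whisper — TFWhisperForConditionalGeneration (Whisper model)

Examples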

>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForSpeechSeq2Seq

class transformers.FlaxAutoModelForSpeechSeq2Seq

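( *args **kwargs )

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- SpeechEncoderDecoderConfig configuration class: FlaxSpeechEncoderDecoderModel (Speech Encoder decoder model)
- WhisperConfig configuration class: FlaxWhisperForConditionalGeneration (Whisper model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples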

>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

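Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

speech-encoder-decoder — FlaxSpeechEncoderDecoderModel (Speech Encoder decoder model)
whisper — FlaxWhisperForConditionalGeneration (Whisper model)

Examples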

>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForAudioXVector

class transformers.AutoModelForAudioXVector

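( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio retrieval via x-vector head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForXVector (Data2VecAudio model)
- UniSpeechSatConfig configuration class: UniSpeechSatForXVector (UniSpeechSat model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForXVector (Wav2Vec2-BERT model)
- Wav2Vec2Config configuration class: Wav2Vec2ForXVector (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForXVector (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForXVector (WavLM model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.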
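
The default rule stated for attn_implementation (SDPA when it is available and torch is at least 2.1.1, otherwise "eager") can be written as a small decision function. The helper `default_attn_implementation` is hypothetical and only handles plain version strings such as "2.1.1" or "2.3.0+cu121"; it is a sketch of the stated rule, not the library's own check.

```python
def default_attn_implementation(torch_version: str, sdpa_available: bool) -> str:
    """Pick the default attention implementation according to the rule described above."""
    # Drop a local build suffix such as "+cu121", then compare numerically.
    numeric = torch_version.split("+")[0]
    version = tuple(int(part) for part in numeric.split(".")[:3])
    if sdpa_available and version >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.3.0+cu121", sdpa_available=True))
print(default_attn_implementation("2.0.1", sdpa_available=True))
print(default_attn_implementation("2.3.0", sdpa_available=False))
```

Comparing versions as integer tuples avoids the pitfall of string comparison, where "2.10.0" would sort before "2.9.0".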

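Instantiates one of the model classes of the library (with an audio retrieval via x-vector head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples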

>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("microsoft/wavlm-base-plus-sv")
>>> model = AutoModelForAudioXVector.from_config(config)

from_pretrained

( *model_args **kwargs )

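Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with an audio retrieval via x-vector head) from a pretrained model.

data2vec-audio — Data2VecAudioForXVector (Data2VecAudio model)
unispeech-sat — UniSpeechSatForXVector (UniSpeechSat model)
wav2vec2 — Wav2Vec2ForXVector (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2BertForXVector (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2ConformerForXVector (Wav2Vec2-Conformer model)
wavlm — WavLMForXVector (WavLM model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().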
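
Why the evaluation/training switch matters can be seen with a toy dropout layer written in plain Python. `ToyDropout` is a hypothetical stand-in that imitates the usual behaviour of a dropout module (inactive in evaluation mode, randomly zeroing and rescaling values in training mode); it is not the PyTorch implementation.

```python
import random


class ToyDropout:
    """Minimal dropout layer with a train/eval switch (illustration only)."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: dropout is deactivated, inputs pass through unchanged.
            return list(values)
        # Training mode: drop each value with probability p and rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if self._rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5)
inputs = [1.0, 2.0, 3.0, 4.0]
print(layer.eval()(inputs))   # identical to the inputs
print(layer.train()(inputs))  # some values zeroed, the others doubled
```

A model that is fine-tuned without being switched back to training mode would therefore train with dropout disabled, which is usually not what is intended.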

Examples

>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv")

>>> # Update configuration during loading
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioXVector.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForTextToSpectrogram

class transformers.AutoModelForTextToSpectrogram

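( *args **kwargs )

AutoModelForTextToWaveform

class transformers.AutoModelForTextToWaveform

( *args **kwargs )

AutoModelForAudioTokenization

class transformers.AutoModelForAudioTokenization

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio tokenization through codebooks head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DacConfig configuration class: DacModel (DAC model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an audio tokenization through codebooks head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples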

>>> from transformers import AutoConfig, AutoModelForAudioTokenization

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("descript/dac_16khz")
>>> model = AutoModelForAudioTokenization.from_config(config)

from_pretrained

( *model_args **kwargs )

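Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with an audio tokenization through codebooks head) from a pretrained model.

dac — DacModel (DAC model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples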

>>> from transformers import AutoConfig, AutoModelForAudioTokenization

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioTokenization.from_pretrained("descript/dac_16khz")

>>> # Update configuration during loading
>>> model = AutoModelForAudioTokenization.from_pretrained("descript/dac_16khz", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioTokenization.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

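Multimodal

The following auto classes are available for the following multimodal tasks.

AutoModelForTableQuestionAnswering

class transformers.AutoModelForTableQuestionAnswering

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TapasConfig configuration class: TapasForQuestionAnswering (TAPAS model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples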

>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = AutoModelForTableQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

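Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a table question answering head) from a pretrained model.

tapas — TapasForQuestionAnswering (TAPAS model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples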

>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/tapas_tf_model_config.json")
>>> model = AutoModelForTableQuestionAnswering.from_pretrained(
...     "./tf_model/tapas_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForTableQuestionAnswering

class transformers.TFAutoModelForTableQuestionAnswering

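( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TapasConfig configuration class: TFTapasForQuestionAnswering (TAPAS model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples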

>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = TFAutoModelForTableQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

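Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a table question answering head) from a pretrained model.

tapas — TFTapasForQuestionAnswering (TAPAS model)

Examples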

>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/tapas_pt_model_config.json")
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained(
...     "./pt_model/tapas_pytorch_model.bin", from_pt=True, config=config
... )
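
The kwargs rule described for from_pretrained() above can be pictured with a small, library-free sketch. ToyConfig and split_kwargs are hypothetical names used only for illustration; this is not the Transformers implementation, only the stated rule for the case where no config is passed: keys that match a configuration attribute override it, and the remaining keys are forwarded to the model's __init__.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration object."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, **kwargs):
    """Apply kwargs that name config attributes and return the leftovers."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the config attribute
        else:
            model_kwargs[key] = value  # left over for the model's __init__
    return config, model_kwargs


config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, custom_flag=1)
print(config.output_attentions)  # True
print(model_kwargs)  # {'custom_flag': 1}
```

When a config object is passed explicitly, the sketch would not apply: as stated above, all kwargs then go straight to the model's __init__.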

AutoModelForDocumentQuestionAnswering

class transformers.AutoModelForDocumentQuestionAnswering

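( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).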
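
A minimal sketch of this factory-only pattern, assuming nothing about the library internals: ToyAutoModel is a hypothetical class, and the exception type and message used here are illustrative choices that may differ from what Transformers actually raises.

```python
class ToyAutoModel:
    """Hypothetical factory-only class; not the real Auto class."""

    def __init__(self, *args, **kwargs):
        raise OSError(
            f"{type(self).__name__} is meant to be created with "
            "from_pretrained() or from_config(), not __init__()."
        )

    @classmethod
    def from_config(cls, config):
        # A real Auto class would return a concrete model instance here;
        # the sketch simply echoes the configuration it was given.
        return {"built_from": config}


try:
    ToyAutoModel()
except OSError as err:
    print(f"Direct instantiation failed: {err}")

print(ToyAutoModel.from_config({"model_type": "layoutlm"}))
```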

from_config

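( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- LayoutLMConfig configuration class: LayoutLMForQuestionAnswering (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example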

>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_config(config)
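
The selection by configuration class listed above can be sketched as a plain dictionary lookup. Every class below is a hypothetical toy stand-in (the Toy prefix marks them as such), and MODEL_MAPPING and this from_config function are illustrative names rather than the library's own code. As in the note above, nothing here loads weights.

```python
class ToyLayoutLMConfig:
    pass


class ToyLayoutLMv3Config:
    pass


class ToyLayoutLMForQuestionAnswering:
    def __init__(self, config):
        self.config = config


class ToyLayoutLMv3ForQuestionAnswering:
    def __init__(self, config):
        self.config = config


# Maps a configuration class to the model class built from it.
MODEL_MAPPING = {
    ToyLayoutLMConfig: ToyLayoutLMForQuestionAnswering,
    ToyLayoutLMv3Config: ToyLayoutLMv3ForQuestionAnswering,
}


def from_config(config):
    """Pick the model class registered for this configuration class."""
    model_class = MODEL_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}.")
    return model_class(config)


model = from_config(ToyLayoutLMv3Config())
print(type(model).__name__)  # ToyLayoutLMv3ForQuestionAnswering
```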

from_pretrained

( *model_args **kwargs )

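Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiates one of the model classes of the library (with a document question answering head) from a pretrained model.

layoutlm — LayoutLMForQuestionAnswering (LayoutLM model)
layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Example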

>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/layoutlm_tf_model_config.json")
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./tf_model/layoutlm_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
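
To make the output_loading_info description more concrete, here is a hedged, library-free sketch of what "missing" and "unexpected" keys mean when a checkpoint is matched against a model. The loading_info function and the parameter names are hypothetical, and the dictionary keys are an assumption about how such a report could be laid out, not a statement of the library's exact return value.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare the parameter names a model expects with those in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        "missing_keys": sorted(expected - found),  # expected but absent from the checkpoint
        "unexpected_keys": sorted(found - expected),  # in the checkpoint but not used
        "error_msgs": [],  # left empty in this sketch
    }


info = loading_info(
    expected_keys=["encoder.weight", "qa_outputs.weight", "qa_outputs.bias"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info["missing_keys"])  # ['qa_outputs.bias', 'qa_outputs.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```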

TFAutoModelForDocumentQuestionAnswering

class transformers.TFAutoModelForDocumentQuestionAnswering

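( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- LayoutLMConfig configuration class: TFLayoutLMForQuestionAnswering (LayoutLM model)
- LayoutLMv3Config configuration class: TFLayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example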

>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

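Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a `config` is provided or automatically loaded:
- If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute will be used to override that attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

Instantiates one of the model classes of the library (with a document question answering head) from a pretrained model.

layoutlm — TFLayoutLMForQuestionAnswering (LayoutLM model)
layoutlmv3 — TFLayoutLMv3ForQuestionAnswering (LayoutLMv3 model)

Example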

>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/layoutlm_pt_model_config.json")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./pt_model/layoutlm_pytorch_model.bin", from_pt=True, config=config
... )
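
The two-entry list above (layoutlm, layoutlmv3) can be read as a lookup table. The sketch below assumes, as an illustration only, that the key is the model_type value stored in a checkpoint's config.json; MODEL_TYPE_TO_CLASS and resolve_class_name are hypothetical names, and the function returns class names as strings rather than real classes.

```python
import json

# Hypothetical lookup table mirroring the list above.
MODEL_TYPE_TO_CLASS = {
    "layoutlm": "TFLayoutLMForQuestionAnswering",
    "layoutlmv3": "TFLayoutLMv3ForQuestionAnswering",
}


def resolve_class_name(config_json):
    """Return the class name registered for the model_type in a config.json string."""
    model_type = json.loads(config_json).get("model_type")
    if model_type not in MODEL_TYPE_TO_CLASS:
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    return MODEL_TYPE_TO_CLASS[model_type]


print(resolve_class_name('{"model_type": "layoutlm", "hidden_size": 768}'))
# TFLayoutLMForQuestionAnswering
```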

AutoModelForVisualQuestionAnswering

class transformers.AutoModelForVisualQuestionAnswering

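( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Blip2Config configuration class: Blip2ForConditionalGeneration (BLIP-2 model)
- BlipConfig configuration class: BlipForQuestionAnswering (BLIP model)
- ViltConfig configuration class: ViltForQuestionAnswering (ViLT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example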

>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> model = AutoModelForVisualQuestionAnswering.from_config(config)
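
The default described for attn_implementation (SDPA for torch>=2.1.1 when available, otherwise "eager") can be expressed as a tiny decision function. This is a simplified sketch: default_attn_implementation is a hypothetical helper, the version parsing only handles plain numeric versions with an optional local suffix such as "+cu121", and the library's real selection logic may take more factors into account.

```python
def default_attn_implementation(torch_version, sdpa_available=True):
    """Return "sdpa" for torch >= 2.1.1 when available, else "eager"."""
    # Drop a local build suffix, then compare the numeric parts as a tuple.
    numeric = torch_version.split("+")[0]
    parts = tuple(int(p) for p in numeric.split(".")[:3])
    if sdpa_available and parts >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.1.1"))  # sdpa
print(default_attn_implementation("2.0.1"))  # eager
print(default_attn_implementation("2.3.0+cu121"))  # sdpa
print(default_attn_implementation("2.3.0", sdpa_available=False))  # eager
```

With the library itself, the choice can be made explicit by passing attn_implementation (for example "eager") as described in the parameter list above.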

from_pretrained

( *model_args **kwargs )

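Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's `__init__()` method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, `code_revision` can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a `config` is provided or automatically loaded:
- If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute will be used to override that attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

Instantiates one of the model classes of the library (with a visual question answering head) from a pretrained model.

blip — BlipForQuestionAnswering (BLIP model)
blip-2 — Blip2ForConditionalGeneration (BLIP-2 model)
vilt — ViltForQuestionAnswering (ViLT model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Example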

>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

>>> # Update configuration during loading
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/vilt_tf_model_config.json")
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained(
...     "./tf_model/vilt_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForVision2Seq

class transformers.AutoModelForVision2Seq

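( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Blip2Config configuration class: Blip2ForConditionalGeneration (BLIP-2 model)
- BlipConfig configuration class: BlipForConditionalGeneration (BLIP model)
- ChameleonConfig configuration class: ChameleonForConditionalGeneration (Chameleon model)
- GitConfig configuration class: GitForCausalLM (GIT model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2 model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3 model)
- InstructBlipConfig configuration class: InstructBlipForConditionalGeneration (InstructBLIP model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoForConditionalGeneration (InstructBlipVideo model)
- Kosmos2Config configuration class: Kosmos2ForConditionalGeneration (KOSMOS-2 model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LLaVa model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LLaVA-NeXT model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3 model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (Mllama model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemma model)
- Pix2StructConfig configuration class: Pix2StructForConditionalGeneration (Pix2Struct model)
- Qwen2VLConfig configuration class: Qwen2VLForConditionalGeneration (Qwen2VL model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLForConditionalGeneration (Qwen2_5_VL model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlava model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlava model)
- VisionEncoderDecoderConfig configuration class: VisionEncoderDecoderModel (Vision Encoder decoder model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example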

>>> from transformers import AutoConfig, AutoModelForVision2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("Salesforce/blip-image-captioning-base")
>>> model = AutoModelForVision2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

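Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's `__init__()` method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, `code_revision` can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a `config` is provided or automatically loaded:
- If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute will be used to override that attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

blip — BlipForConditionalGeneration (BLIP model)
blip-2 — Blip2ForConditionalGeneration (BLIP-2 model)
chameleon — ChameleonForConditionalGeneration (Chameleon model)
git — GitForCausalLM (GIT model)
idefics2 — Idefics2ForConditionalGeneration (Idefics2 model)
idefics3 — Idefics3ForConditionalGeneration (Idefics3 model)
instructblip — InstructBlipForConditionalGeneration (InstructBLIP model)
instructblipvideo — InstructBlipVideoForConditionalGeneration (InstructBlipVideo model)
kosmos-2 — Kosmos2ForConditionalGeneration (KOSMOS-2 model)
llava — LlavaForConditionalGeneration (LLaVa model)
llava_next — LlavaNextForConditionalGeneration (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
mistral3 — Mistral3ForConditionalGeneration (Mistral3 model)
mllama — MllamaForConditionalGeneration (Mllama model)
paligemma — PaliGemmaForConditionalGeneration (PaliGemma model)
pix2struct — Pix2StructForConditionalGeneration (Pix2Struct model)
qwen2_5_vl — Qwen2_5_VLForConditionalGeneration (Qwen2_5_VL model)
qwen2_vl — Qwen2VLForConditionalGeneration (Qwen2VL model)
video_llava — VideoLlavaForConditionalGeneration (VideoLlava model)
vipllava — VipLlavaForConditionalGeneration (VipLlava model)
vision-encoder-decoder — VisionEncoderDecoderModel (Vision Encoder decoder model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Example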

>>> from transformers import AutoConfig, AutoModelForVision2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVision2Seq.from_pretrained("Salesforce/blip-image-captioning-base")

>>> # Update configuration during loading
>>> model = AutoModelForVision2Seq.from_pretrained("Salesforce/blip-image-captioning-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/blip_tf_model_config.json")
>>> model = AutoModelForVision2Seq.from_pretrained(
...     "./tf_model/blip_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForVision2Seq

class transformers.TFAutoModelForVision2Seq

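( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BlipConfig configuration class: TFBlipForConditionalGeneration (BLIP model)
- VisionEncoderDecoderConfig configuration class: TFVisionEncoderDecoderModel (Vision Encoder decoder model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example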

>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("Salesforce/blip-image-captioning-base")
>>> model = TFAutoModelForVision2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

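Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's `__init__()` method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:
- If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute will be used to override that attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

blip — TFBlipForConditionalGeneration (BLIP model)
vision-encoder-decoder — TFVisionEncoderDecoderModel (Vision Encoder decoder model)

Example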

>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForVision2Seq.from_pretrained("Salesforce/blip-image-captioning-base")

>>> # Update configuration during loading
>>> model = TFAutoModelForVision2Seq.from_pretrained("Salesforce/blip-image-captioning-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/blip_pt_model_config.json")
>>> model = TFAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/blip_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForVision2Seq

class transformers.FlaxAutoModelForVision2Seq

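( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- VisionEncoderDecoderConfig configuration class: FlaxVisionEncoderDecoderModel (Vision Encoder decoder model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example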

>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("nlpconnect/vit-gpt2-image-captioning")
>>> model = FlaxAutoModelForVision2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

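Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than first converting the PyTorch model to a Flax model and then loading the Flax model.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's `__init__()` method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:
- If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute will be used to override that attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

vision-encoder-decoder — FlaxVisionEncoderDecoderModel (Vision Encoder decoder model)

Example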

>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForImageTextToText

class transformers.AutoModelForImageTextToText

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image-text-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AriaConfig configuration class: AriaForConditionalGeneration (Aria model)
- AyaVisionConfig configuration class: AyaVisionForConditionalGeneration (AyaVision model)
- Blip2Config configuration class: Blip2ForConditionalGeneration (BLIP-2 model)
- BlipConfig configuration class: BlipForConditionalGeneration (BLIP model)
- ChameleonConfig configuration class: ChameleonForConditionalGeneration (Chameleon model)
- Emu3Config configuration class: Emu3ForConditionalGeneration (Emu3 model)
- FuyuConfig configuration class: FuyuForCausalLM (Fuyu model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nForConditionalGeneration model)
- GitConfig configuration class: GitForCausalLM (GIT model)
- Glm4vConfig configuration class: Glm4vForConditionalGeneration (GLM4V model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GOT-OCR2 model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2 model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3 model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IDEFICS model)
- InstructBlipConfig configuration class: InstructBlipForConditionalGeneration (InstructBLIP model)
- InternVLConfig configuration class: InternVLForConditionalGeneration (InternVL model)
- JanusConfig configuration class: JanusForConditionalGeneration (Janus model)
- Kosmos2Config configuration class: Kosmos2ForConditionalGeneration (KOSMOS-2 model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4 model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LLaVa model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LLaVA-NeXT model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3 model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (Mllama model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemma model)
- Pix2StructConfig configuration class: Pix2StructForConditionalGeneration (Pix2Struct model)
- PixtralVisionConfig configuration class: LlavaForConditionalGeneration (Pixtral model)
- Qwen2VLConfig configuration class: Qwen2VLForConditionalGeneration (Qwen2VL model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLForConditionalGeneration (Qwen2_5_VL model)
- ShieldGemma2Config configuration class: Gemma3ForConditionalGeneration (Shieldgemma2 model)
- SmolVLMConfig configuration class: SmolVLMForConditionalGeneration (SmolVLM model)
- UdopConfig configuration class: UdopForConditionalGeneration (UDOP model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlava model)
- VisionEncoderDecoderConfig configuration class: VisionEncoderDecoderModel (Vision Encoder decoder model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example

>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("Salesforce/blip-image-captioning-base")
>>> model = AutoModelForImageTextToText.from_config(config)

from_pretrained

( *model_args **kwargs )

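Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and then loading the PyTorch model.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.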
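cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.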
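
The two kwargs cases above amount to a partition of the keyword arguments. The sketch below illustrates that partition with plain dictionaries; `split_kwargs`, `config_attrs` and `my_flag` are hypothetical names chosen for the example, and a dict stands in for a real configuration object. It is not how the library is implemented internally.

```python
def split_kwargs(config_attrs, kwargs, config_provided):
    """Split kwargs into (config_overrides, model_init_kwargs).

    Hypothetical helper for illustration only. config_attrs is a plain dict
    standing in for the attributes of a configuration object.
    """
    if config_provided:
        # Case 1: the caller supplied a config, so nothing is used to
        # update it and every kwarg goes to the model's __init__.
        return {}, dict(kwargs)
    # Case 2: keys matching a configuration attribute override it;
    # all other keys are forwarded to the model's __init__.
    overrides = {k: v for k, v in kwargs.items() if k in config_attrs}
    remaining = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return overrides, remaining


# Stand-in configuration with two attributes (values are illustrative).
config_attrs = {"output_attentions": False, "hidden_size": 768}
overrides, remaining = split_kwargs(
    config_attrs, {"output_attentions": True, "my_flag": 1}, config_provided=False
)
config_attrs.update(overrides)  # apply the overrides to the stand-in config
```

Here output_attentions is recognized as a configuration attribute and overrides it, while the unknown key my_flag is left over for the model constructor.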

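Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path: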

aria — AriaForConditionalGeneration (Aria model)
aya_vision — AyaVisionForConditionalGeneration (AyaVision model)
blip — BlipForConditionalGeneration (BLIP model)
blip-2 — Blip2ForConditionalGeneration (BLIP-2 model)
chameleon — ChameleonForConditionalGeneration (Chameleon model)
emu3 — Emu3ForConditionalGeneration (Emu3 model)
fuyu — FuyuForCausalLM (Fuyu model)
gemma3 — Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
gemma3n — Gemma3nForConditionalGeneration (Gemma3nForConditionalGeneration model)
git — GitForCausalLM (GIT model)
glm4v — Glm4vForConditionalGeneration (GLM4V model)
got_ocr2 — GotOcr2ForConditionalGeneration (GOT-OCR2 model)
idefics — IdeficsForVisionText2Text (IDEFICS model)
idefics2 — Idefics2ForConditionalGeneration (Idefics2 model)
idefics3 — Idefics3ForConditionalGeneration (Idefics3 model)
instructblip — InstructBlipForConditionalGeneration (InstructBLIP model)
internvl — InternVLForConditionalGeneration (InternVL model)
janus — JanusForConditionalGeneration (Janus model)
kosmos-2 — Kosmos2ForConditionalGeneration (KOSMOS-2 model)
llama4 — Llama4ForConditionalGeneration (Llama4 model)
llava — LlavaForConditionalGeneration (LLaVa model)
llava_next — LlavaNextForConditionalGeneration (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
mistral3 — Mistral3ForConditionalGeneration (Mistral3 model)
mllama — MllamaForConditionalGeneration (Mllama model)
paligemma — PaliGemmaForConditionalGeneration (PaliGemma model)
pix2struct — Pix2StructForConditionalGeneration (Pix2Struct model)
pixtral — LlavaForConditionalGeneration (Pixtral model)
qwen2_5_vl — Qwen2_5_VLForConditionalGeneration (Qwen2_5_VL model)
qwen2_vl — Qwen2VLForConditionalGeneration (Qwen2VL model)
shieldgemma2 — Gemma3ForConditionalGeneration (Shieldgemma2 model)
smolvlm — SmolVLMForConditionalGeneration (SmolVLM model)
udop — UdopForConditionalGeneration (UDOP model)
vipllava — VipLlavaForConditionalGeneration (VipLlava model)
vision-encoder-decoder — VisionEncoderDecoderModel (Vision Encoder decoder model)
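
The list above is in effect a lookup table from a model_type string to a model class. As a rough sketch of that dispatch, assuming a plain dictionary holding a few of the entries listed above (`MODEL_TYPE_TO_CLASS_NAME` and `resolve_class_name` are hypothetical names, and class names are kept as strings so the snippet runs without Transformers installed):

```python
# A small excerpt of the mapping listed above, stored as strings.
MODEL_TYPE_TO_CLASS_NAME = {
    "llava": "LlavaForConditionalGeneration",
    "pixtral": "LlavaForConditionalGeneration",
    "qwen2_vl": "Qwen2VLForConditionalGeneration",
    "shieldgemma2": "Gemma3ForConditionalGeneration",
}


def resolve_class_name(model_type):
    """Return the class name registered for model_type.

    Hypothetical helper for illustration only; raises ValueError for a
    model_type that has no entry in the table.
    """
    try:
        return MODEL_TYPE_TO_CLASS_NAME[model_type]
    except KeyError:
        supported = ", ".join(sorted(MODEL_TYPE_TO_CLASS_NAME))
        raise ValueError(
            f"Unrecognized model_type {model_type!r}; supported: {supported}"
        ) from None
```

Note that several model types can share one class (pixtral and llava both resolve to LlavaForConditionalGeneration), and that a model_type absent from the table, such as bert, is rejected rather than silently mapped.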

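By default, the model is set to evaluation mode with model.eval() (e.g., dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().

Example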

>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageTextToText.from_pretrained("llava-hf/llava-1.5-7b-hf")

>>> # Update configuration during loading
>>> model = AutoModelForImageTextToText.from_pretrained("llava-hf/llava-1.5-7b-hf", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageTextToText.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

Time Series

AutoModelForTimeSeriesPrediction

class transformers.AutoModelForTimeSeriesPrediction

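( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a time-series prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).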

from_config

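( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TimesFmConfig configuration class: TimesFmModelForPrediction (TimesFm model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a time-series prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example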

>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/timesfm-2.0-500m-pytorch")
>>> model = AutoModelForTimeSeriesPrediction.from_config(config)

from_pretrained