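Mixins & serialization methods

Mixins

The huggingface_hub library offers a range of mixins that can be used as a parent class for your objects, in order to provide simple uploading and downloading functions. Check out our integration guide to learn how to integrate any ML framework with the Hub.

Generic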

class huggingface_hub.ModelHubMixin

( *args **kwargs )

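Parameters

repo_url (str, optional) — URL of the library repository. Used to generate the model card.
paper_url (str, optional) — URL of the library paper. Used to generate the model card.
docs_url (str, optional) — URL of the library documentation. Used to generate the model card.
model_card_template (str, optional) — Template of the model card. Used to generate the model card. Defaults to a generic template.
language (str or List[str], optional) — Language(s) supported by the library. Used to generate the model card.
library_name (str, optional) — Name of the library integrating ModelHubMixin. Used to generate the model card.
license (str, optional) — License of the library integrating ModelHubMixin. Used to generate the model card. E.g. "apache-2.0".
license_name (str, optional) — Name of the license of the library integrating ModelHubMixin. Used to generate the model card. Only used if license is set to other. E.g. "coqui-public-model-license".
license_link (str, optional) — URL of the license of the library integrating ModelHubMixin. Used to generate the model card. Only used if license is set to other and license_name is set. E.g. "https://coqui.ai/cpml".
pipeline_tag (str, optional) — Tag of the pipeline. Used to generate the model card. E.g. "text-classification".
tags (List[str], optional) — Tags to be added to the model card. Used to generate the model card. E.g. ["computer-vision"].
coders (Dict[Type, Tuple[Callable, Callable]], optional) — Dictionary of custom types and their encoders/decoders. Used to encode/decode arguments that are not JSON-serializable by default. E.g. dataclasses, argparse.Namespace, OmegaConf, etc.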
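To make the coders parameter more concrete, the sketch below shows how an encoder/decoder pair for argparse.Namespace could round-trip a value through JSON. It uses only the standard library; the helper names encode_arg and decode_arg are hypothetical and are not part of huggingface_hub, whose internal handling may differ.

```python
import argparse
import json

# Hypothetical coders mapping: type -> (encoder, decoder).
# The encoder turns an instance into JSON-friendly data; the decoder rebuilds it.
coders = {
    argparse.Namespace: (
        lambda ns: vars(ns),
        lambda data: argparse.Namespace(**data),
    ),
}


def encode_arg(value, coders):
    """Encode value with the first matching encoder, else return it unchanged."""
    for type_, (encoder, _decoder) in coders.items():
        if isinstance(value, type_):
            return encoder(value)
    return value


def decode_arg(data, expected_type, coders):
    """Rebuild an instance of expected_type if a decoder is registered for it."""
    if expected_type in coders:
        _encoder, decoder = coders[expected_type]
        return decoder(data)
    return data


args = argparse.Namespace(hidden_size=256, dropout=0.1)
serialized = json.dumps(encode_arg(args, coders))  # a plain JSON string
restored = decode_arg(json.loads(serialized), argparse.Namespace, coders)
```

With a mapping of this shape passed as coders=... in the class definition, an __init__ argument of a custom type can be stored in the saved configuration and rebuilt on load.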

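A generic mixin to integrate ANY machine learning framework with the Hub.

To integrate your framework, your model class must inherit from this class. Custom logic for saving/loading models has to be overridden in _from_pretrained and _save_pretrained. PyTorchModelHubMixin is a good example of mixin integration with the Hub. Check out our integration guide for more instructions.

When inheriting from ModelHubMixin, you can define class-level attributes. These attributes are not passed to __init__ but to the class definition itself. This is useful to define metadata about the library integrating ModelHubMixin.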
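The distinction between arguments of the class definition and arguments of __init__ relies on a standard Python mechanism, __init_subclass__. The toy MetadataMixin below is a stand-in written for illustration only; it is not the huggingface_hub implementation and its attribute names are assumptions.

```python
class MetadataMixin:
    """Toy stand-in for a hub mixin; not the huggingface_hub implementation."""

    _metadata: dict = {}

    def __init_subclass__(cls, library_name=None, tags=None, **kwargs):
        # Keyword arguments written in the class statement arrive here once,
        # when the subclass is created (not when it is instantiated).
        super().__init_subclass__(**kwargs)
        cls._metadata = {"library_name": library_name, "tags": list(tags or [])}


class MyModel(MetadataMixin, library_name="my-library", tags=["computer-vision"]):
    def __init__(self, size: int = 512):
        # __init__ only sees instance arguments such as `size`.
        self.size = size


model = MyModel(size=256)
```

Here MyModel._metadata holds the library-level metadata shared by every instance, while model.size is specific to one instance.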

For more details on how to integrate the mixin with your library, check out the integration guide.

Example

>>> from pathlib import Path
>>> from typing import Dict, Optional, Union
>>> from huggingface_hub import ModelHubMixin

# Inherit from ModelHubMixin
>>> class MyCustomModel(
...         ModelHubMixin,
...         library_name="my-library",
...         tags=["computer-vision"],
...         repo_url="https://github.com/huggingface/my-cool-library",
...         paper_url="https://arxiv.org/abs/2304.12244",
...         docs_url="https://huggingface.co/docs/my-cool-library",
...         # ^ optional metadata to generate model card
...     ):
...     def __init__(self, size: int = 512, device: str = "cpu"):
...         # define how to initialize your model
...         super().__init__()
...         ...
...
...     def _save_pretrained(self, save_directory: Path) -> None:
...         # define how to serialize your model
...         ...
...
...     @classmethod
...     def _from_pretrained(
...         cls,
...         *,
...         model_id: str,
...         revision: Optional[str],
...         cache_dir: Optional[Union[str, Path]],
...         force_download: bool,
...         proxies: Optional[Dict],
...         resume_download: Optional[bool],
...         local_files_only: bool,
...         token: Optional[Union[str, bool]],
...         **model_kwargs,
...     ) -> "MyCustomModel":
...         # define how to deserialize your model
...         ...

>>> model = MyCustomModel(size=256, device="gpu")

# Save model weights to local directory
>>> model.save_pretrained("my-awesome-model")

# Push model weights to the Hub
>>> model.push_to_hub("my-awesome-model")

# Download and initialize weights from the Hub
>>> reloaded_model = MyCustomModel.from_pretrained("username/my-awesome-model")
>>> reloaded_model.size
256

# Model card has been correctly populated
>>> from huggingface_hub import ModelCard
>>> card = ModelCard.load("username/my-awesome-model")
>>> card.data.tags
["computer-vision", "model_hub_mixin"]
>>> card.data.library_name
"my-library"

_save_pretrained

( save_directory: Path )

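Parameters

save_directory (str or Path) — Path to the directory in which the model weights and configuration will be saved.

Override this method in a subclass to define how to save your model. Check out our integration guide for instructions.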

_from_pretrained

( model_id: str revision: typing.Optional[str] cache_dir: typing.Union[str, pathlib.Path, NoneType] force_download: bool proxies: typing.Optional[typing.Dict] resume_download: typing.Optional[bool] local_files_only: bool token: typing.Union[bool, str, NoneType] **model_kwargs )

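Parameters

model_id (str) — ID of the model to load from the Hugging Face Hub (e.g. bigscience/bloom).
revision (str, optional) — Revision of the model on the Hub. Can be a branch name, a git tag or any commit id. Defaults to the latest commit on the main branch.
force_download (bool, optional, defaults to False) — Whether to force (re-)downloading the model weights and configuration files from the Hub, overriding the existing cache.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint (e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}).
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. By default, it will use the token cached when running hf auth login.
cache_dir (str, Path, optional) — Path to the folder where cached files are stored.
local_files_only (bool, optional, defaults to False) — If True, avoid downloading the file and return the path to the local cached file if it exists.
model_kwargs — Additional keyword arguments passed along to the _from_pretrained() method.

Override this method in a subclass to define how to load your model from a pretrained checkpoint.

Use hf_hub_download() or snapshot_download() to download files from the Hub before loading them. Most of the arguments taken as input can be passed directly to those two methods. If needed, you can add more arguments to this method using "model_kwargs". For example, PyTorchModelHubMixin._from_pretrained() takes a map_location parameter as input to set the device on which the model should be loaded.

Check out our integration guide for more instructions.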
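As a rough illustration of the download step described above, the helper below resolves a weights file either from a local directory or from the Hub by forwarding the shared arguments to hf_hub_download(). The function name resolve_weights_file and the file name "weights.bin" are hypothetical; a real _from_pretrained override would call something like this and then load the file with its own framework.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_weights_file(
    model_id: str,
    filename: str = "weights.bin",  # hypothetical file name
    *,
    revision: Optional[str] = None,
    cache_dir: Optional[Union[str, Path]] = None,
    force_download: bool = False,
    local_files_only: bool = False,
    token: Optional[Union[str, bool]] = None,
) -> str:
    """Return a local path to `filename` for a local directory or a Hub repo."""
    if Path(model_id).is_dir():
        # Local directory, e.g. one previously written by save_pretrained().
        return str(Path(model_id) / filename)

    # Imported lazily so the local-directory branch needs no Hub access.
    from huggingface_hub import hf_hub_download

    # The arguments received by _from_pretrained map directly onto hf_hub_download.
    return hf_hub_download(
        repo_id=model_id,
        filename=filename,
        revision=revision,
        cache_dir=cache_dir,
        force_download=force_download,
        local_files_only=local_files_only,
        token=token,
    )
```

Checking for a local directory first keeps the same entry point usable for both "username/my-awesome-model" and a folder on disk.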

from_pretrained

( pretrained_model_name_or_path: typing.Union[str, pathlib.Path] force_download: bool = False resume_download: typing.Optional[bool] = None proxies: typing.Optional[typing.Dict] = None token: typing.Union[bool, str, NoneType] = None cache_dir: typing.Union[str, pathlib.Path, NoneType] = None local_files_only: bool = False revision: typing.Optional[str] = None **model_kwargs )

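Parameters

pretrained_model_name_or_path (str, Path) — Can be either:
- The model_id (string) of a model hosted on the Hub, e.g. bigscience/bloom.
- Or a path to a directory containing model weights saved using save_pretrained, e.g. ../path/to/my_model_directory/.
revision (str, optional) — Revision of the model on the Hub. Can be a branch name, a git tag or any commit id. Defaults to the latest commit on the main branch.
force_download (bool, optional, defaults to False) — Whether to force (re-)downloading the model weights and configuration files, overriding the existing cache.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on every request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. By default, it will use the token cached when running hf auth login.
cache_dir (str, Path, optional) — Path to the folder where cached files are stored.
local_files_only (bool, optional, defaults to False) — If True, avoid downloading the file and return the path to the local cached file if it exists.
model_kwargs (Dict, optional) — Additional keyword arguments passed to the model during initialization.

Download a model from the Hugging Face Hub and instantiate it.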

push_to_hub

( repo_id: str config: typing.Union[dict, huggingface_hub.hub_mixin.DataclassInstance, NoneType] = None commit_message: str = 'Push model using huggingface_hub.' private: typing.Optional[bool] = None token: typing.Optional[str] = None branch: typing.Optional[str] = None create_pr: typing.Optional[bool] = None allow_patterns: typing.Union[typing.List[str], str, NoneType] = None ignore_patterns: typing.Union[typing.List[str], str, NoneType] = None delete_patterns: typing.Union[typing.List[str], str, NoneType] = None model_card_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None )

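Parameters

repo_id (str) — ID of the repository to push to (example: "username/my-model").
config (dict or DataclassInstance, optional) — Model configuration specified as a key/value dictionary or a dataclass instance.
commit_message (str, optional) — Message to commit while pushing.
private (bool, optional) — Whether the repository created should be private. If None (default), the repo will be public unless the organization's default is private.
token (str, optional) — The token to use as HTTP bearer authorization for remote files. By default, it will use the token cached when running hf auth login.
branch (str, optional) — The git branch on which to push the model. Defaults to "main".
create_pr (boolean, optional) — Whether or not to create a pull request from branch with that commit. Defaults to False.
allow_patterns (List[str] or str, optional) — If provided, only files matching at least one pattern are pushed.
ignore_patterns (List[str] or str, optional) — If provided, files matching any of the patterns are not pushed.
delete_patterns (List[str] or str, optional) — If provided, remote files matching any of the patterns will be deleted from the repo.
model_card_kwargs (Dict[str, Any], optional) — Additional arguments passed to the model card template to customize the model card.

Upload the model checkpoint to the Hub.

Use allow_patterns and ignore_patterns to precisely filter which files should be pushed to the Hub. Use delete_patterns to delete existing remote files in the same commit. See the upload_folder() reference for more details.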
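The filtering rule described for allow_patterns and ignore_patterns can be approximated with the standard fnmatch module. The sketch below only illustrates the stated semantics (keep a file if it matches at least one allow pattern and no ignore pattern); select_files is a hypothetical helper, and the library's own matching may differ in details.

```python
from fnmatch import fnmatch
from typing import Iterable, List, Optional, Union

Patterns = Optional[Union[List[str], str]]


def _as_list(patterns: Patterns) -> Optional[List[str]]:
    """Accept a single pattern string or a list of patterns."""
    return [patterns] if isinstance(patterns, str) else patterns


def select_files(
    paths: Iterable[str],
    allow_patterns: Patterns = None,
    ignore_patterns: Patterns = None,
) -> List[str]:
    """Keep paths matching at least one allow pattern and no ignore pattern."""
    allow = _as_list(allow_patterns)
    ignore = _as_list(ignore_patterns)
    selected = []
    for path in paths:
        if allow is not None and not any(fnmatch(path, p) for p in allow):
            continue  # not explicitly allowed
        if ignore is not None and any(fnmatch(path, p) for p in ignore):
            continue  # explicitly ignored
        selected.append(path)
    return selected


files = ["config.json", "model.safetensors", "logs/run1.txt", "README.md"]
to_push = select_files(
    files,
    allow_patterns=["*.json", "*.safetensors", "*.md"],
    ignore_patterns="README.md",
)
```

In this example only config.json and model.safetensors remain: the log file matches no allow pattern, and README.md is allowed but then ignored.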

save_pretrained

( save_directory: typing.Union[str, pathlib.Path] config: typing.Union[dict, huggingface_hub.hub_mixin.DataclassInstance, NoneType] = None repo_id: typing.Optional[str] = None push_to_hub: bool = False model_card_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None **push_to_hub_kwargs ) → str or None

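Parameters

save_directory (str or Path) — Path to the directory in which the model weights and configuration will be saved.
config (dict or DataclassInstance, optional) — Model configuration specified as a key/value dictionary or a dataclass instance.
push_to_hub (bool, optional, defaults to False) — Whether or not to push your model to the Hugging Face Hub after saving it.
repo_id (str, optional) — ID of your repository on the Hub. Used only if push_to_hub=True. Defaults to the folder name if not provided.
model_card_kwargs (Dict[str, Any], optional) — Additional arguments passed to the model card template to customize the model card.
push_to_hub_kwargs — Additional keyword arguments passed along to the push_to_hub() method.

Returns

str or None

The URL of the commit on the Hub if push_to_hub=True, None otherwise.

Save the weights in a local directory.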

PyTorch

class huggingface_hub.PyTorchModelHubMixin

( *args **kwargs )

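Implementation of ModelHubMixin to provide model Hub upload/download capabilities to PyTorch models. The model is set in evaluation mode by default using model.eval() (dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

See ModelHubMixin for more details on how to use the mixin.

Example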

>>> import torch
>>> import torch.nn as nn
>>> from huggingface_hub import PyTorchModelHubMixin

>>> class MyModel(
...         nn.Module,
...         PyTorchModelHubMixin,
...         library_name="keras-nlp",
...         repo_url="https://github.com/keras-team/keras-nlp",
...         paper_url="https://arxiv.org/abs/2304.12244",
...         docs_url="https://keras.io/keras_nlp/",
...         # ^ optional metadata to generate model card
...     ):
...     def __init__(self, hidden_size: int = 512, vocab_size: int = 30000, output_size: int = 4):
...         super().__init__()
...         # Keep hidden_size as an attribute so it can be inspected after reloading
...         self.hidden_size = hidden_size
...         self.param = nn.Parameter(torch.rand(hidden_size, vocab_size))
...         # x + self.param has last dimension vocab_size, so that is the layer's input size
...         self.linear = nn.Linear(vocab_size, output_size)
...
...     def forward(self, x):
...         return self.linear(x + self.param)

>>> model = MyModel(hidden_size=256)

# Save model weights to local directory
>>> model.save_pretrained("my-awesome-model")

# Push model weights to the Hub
>>> model.push_to_hub("my-awesome-model")

# Download and initialize weights from the Hub
>>> model = MyModel.from_pretrained("username/my-awesome-model")
>>> model.hidden_size
256

Keras

class huggingface_hub.KerasModelHubMixin

( *args **kwargs )

Implementation of ModelHubMixin to provide model Hub upload/download capabilities to Keras models.

>>> import tensorflow as tf
>>> from huggingface_hub import KerasModelHubMixin


>>> class MyModel(tf.keras.Model, KerasModelHubMixin):
...     def __init__(self, **kwargs):
...         super().__init__()
...         self.config = kwargs.pop("config", None)
...         self.dummy_inputs = ...
...         self.layer = ...
...
...     def call(self, *args):
...         return ...


>>> # Initialize and compile the model as you normally would
>>> model = MyModel()
>>> model.compile(...)
>>> # Build the graph by training it or passing dummy inputs
>>> _ = model(model.dummy_inputs)
>>> # Save model weights to local directory
>>> model.save_pretrained("my-awesome-model")
>>> # Push model weights to the Hub
>>> model.push_to_hub("my-awesome-model")
>>> # Download and initialize weights from the Hub
>>> model = MyModel.from_pretrained("username/my-awesome-model")

huggingface_hub.from_pretrained_keras

( *args **kwargs )

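Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- You can add a revision by appending @ at the end of model_id, like this: dbmdz/bert-base-german-cased@main. Revision is the specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- A path to a directory containing model weights saved using save_pretrained, e.g. ./my_model_directory/.
- None if you are providing both the configuration and the state dictionary (with keyword arguments config and state_dict, respectively).
force_download (bool, optional, defaults to False) — Whether to force (re-)downloading the model weights and configuration files, overriding the cached versions if they exist.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on every request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
cache_dir (Union[str, os.PathLike], optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (i.e. do not try to download the model).
model_kwargs (Dict, optional) — model_kwargs will be passed to the model during initialization.

Instantiate a pretrained Keras model from a pretrained model on the Hub. The model is expected to be in SavedModel format.

Passing token=True is required when you want to use a private model.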
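The model_id@revision convention mentioned in the parameter list can be pictured with a small parser. This is only an illustration of the naming convention; split_revision is a hypothetical helper and not a huggingface_hub function.

```python
from typing import Optional, Tuple


def split_revision(model_id: str) -> Tuple[str, Optional[str]]:
    """Split 'namespace/name@revision' into (repo_id, revision)."""
    repo_id, sep, revision = model_id.partition("@")
    # No "@" (or nothing after it) means no explicit revision was requested.
    return repo_id, (revision if sep and revision else None)


repo_id, revision = split_revision("dbmdz/bert-base-german-cased@main")
```

An identifier without "@", such as bert-base-uncased, yields a revision of None, leaving the choice of revision to the caller's default.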

huggingface_hub.push_to_hub_keras

( model repo_id: str config: typing.Optional[dict] = None commit_message: str = 'Push Keras model using huggingface_hub.' private: typing.Optional[bool] = None api_endpoint: typing.Optional[str] = None token: typing.Optional[str] = None branch: typing.Optional[str] = None create_pr: typing.Optional[bool] = None allow_patterns: typing.Union[typing.List[str], str, NoneType] = None ignore_patterns: typing.Union[typing.List[str], str, NoneType] = None delete_patterns: typing.Union[typing.List[str], str, NoneType] = None log_dir: typing.Optional[str] = None include_optimizer: bool = False tags: typing.Union[list, str, NoneType] = None plot_model: bool = True **model_save_kwargs )

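Parameters

model (Keras.Model) — The Keras model you'd like to push to the Hub. The model must be compiled and built.
repo_id (str) — ID of the repository to push to (example: "username/my-model").
commit_message (str, optional, defaults to "Push Keras model using huggingface_hub.") — Message to commit while pushing.
private (bool, optional) — Whether the repository created should be private. If None (default), the repo will be public unless the organization's default is private.
api_endpoint (str, optional) — The API endpoint to use when pushing the model to the Hub.
token (str, optional) — The token to use as HTTP bearer authorization for remote files. If not set, will use the token set when logging in with hf auth login (stored in ~/.huggingface).
branch (str, optional) — The git branch on which to push the model. Defaults to the default branch specified in your repository, which is "main" by default.
create_pr (boolean, optional) — Whether or not to create a pull request from branch with that commit. Defaults to False.
config (dict, optional) — Configuration object to be saved alongside the model weights.
allow_patterns (List[str] or str, optional) — If provided, only files matching at least one pattern are pushed.
ignore_patterns (List[str] or str, optional) — If provided, files matching any of the patterns are not pushed.
delete_patterns (List[str] or str, optional) — If provided, remote files matching any of the patterns will be deleted from the repo.
log_dir (str, optional) — TensorBoard logging directory to be pushed. The Hub automatically hosts and displays a TensorBoard instance if log files are included in the repository.
include_optimizer (bool, optional, defaults to False) — Whether or not to include the optimizer during serialization.
tags (Union[list, str], optional) — List of tags that are related to the model, or a string with a single tag. See the model card documentation for example tags.
plot_model (bool, optional, defaults to True) — Setting this to True will plot the model and put it in the model card. Requires graphviz and pydot to be installed.
model_save_kwargs (dict, optional) — model_save_kwargs will be passed to tf.keras.models.save_model().

Upload the model checkpoint to the Hub.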
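Because plot_model depends on graphviz and pydot, it can be convenient to decide its value at runtime. The check below is a best-effort sketch under two assumptions: that pydot is importable as the Python package pydot, and that a Graphviz installation exposes the dot executable on the PATH. can_plot_model is a hypothetical helper.

```python
import importlib.util
import shutil


def can_plot_model() -> bool:
    """Best-effort check for the pydot package and the Graphviz `dot` binary."""
    has_pydot = importlib.util.find_spec("pydot") is not None
    has_graphviz = shutil.which("dot") is not None  # assumes Graphviz provides `dot`
    return has_pydot and has_graphviz


plot_model = can_plot_model()
```

The resulting boolean can then be passed as plot_model=plot_model so that the push does not depend on optional plotting tools being installed.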

huggingface_hub.save_pretrained_keras

( model save_directory: typing.Union[str, pathlib.Path] config: typing.Optional[typing.Dict[str, typing.Any]] = None include_optimizer: bool = False plot_model: bool = True tags: typing.Union[list, str, NoneType] = None **model_save_kwargs )

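Parameters

model (Keras.Model) — The Keras model you'd like to save. The model must be compiled and built.
save_directory (str or Path) — The directory in which the Keras model will be saved.
config (dict, optional) — Configuration object to be saved alongside the model weights.
include_optimizer (bool, optional, defaults to False) — Whether or not to include the optimizer during serialization.
plot_model (bool, optional, defaults to True) — Setting this to True will plot the model and put it in the model card. Requires graphviz and pydot to be installed.
tags (Union[str, list], optional) — List of tags that are related to the model, or a string with a single tag. See the model card documentation for example tags.
model_save_kwargs (dict, optional) — model_save_kwargs will be passed to tf.keras.models.save_model().

Saves a Keras model to save_directory in SavedModel format. Use this if you are using the Functional or Sequential APIs.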

Fastai

huggingface_hub.from_pretrained_fastai

( repo_id: str revision: typing.Optional[str] = None )

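Parameters

repo_id (str) — The location of the pickled fastai.Learner. It can be either of the following:
- Hosted on the Hugging Face Hub. E.g.: 'espejelomar/fatai-pet-breeds-classification' or 'distilgpt2'. You can add a revision by appending @ at the end of repo_id. E.g.: dbmdz/bert-base-german-cased@main. Revision is the specific model version to use. Since we use a git-based system for storing models and other artifacts on the Hugging Face Hub, it can be a branch name, a tag name, or a commit id.
- Hosted locally. repo_id is then a directory containing the pickle file and a pyproject.toml indicating the fastai and fastcore versions used to build the fastai.Learner. E.g.: ./my_model_directory/.
revision (str, optional) — Revision at which the repository's files are downloaded. See the documentation of snapshot_download.

Load a pretrained fastai model from the Hub or from a local directory.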

huggingface_hub.push_to_hub_fastai

( learner repo_id: str commit_message: str = 'Push FastAI model using huggingface_hub.' private: typing.Optional[bool] = None token: typing.Optional[str] = None config: typing.Optional[dict] = None branch: typing.Optional[str] = None create_pr: typing.Optional[bool] = None allow_patterns: typing.Union[typing.List[str], str, NoneType] = None ignore_patterns: typing.Union[typing.List[str], str, NoneType] = None delete_patterns: typing.Union[typing.List[str], str, NoneType] = None api_endpoint: typing.Optional[str] = None )

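Parameters

learner (Learner) — The fastai.Learner you'd like to push to the Hub.
repo_id (str) — The repository ID of your model on the Hub, in the format "namespace/repo_name". The namespace can be your individual account or an organization to which you have write access (for example, "stanfordnlp/stanza-de").
commit_message (str, optional) — Message to commit while pushing. Defaults to "Push FastAI model using huggingface_hub.".
private (bool, optional) — Whether the repository created should be private. If None (default), the repo will be public unless the organization's default is private.
token (str, optional) — The Hugging Face account token to use as HTTP bearer authorization for remote files. If None, the token will be asked for by a prompt.
config (dict, optional) — Configuration object to be saved alongside the model weights.
branch (str, optional) — The git branch on which to push the model. Defaults to the default branch specified in your repository, which is "main".
create_pr (boolean, optional) — Whether or not to create a pull request from branch with that commit. Defaults to False.
api_endpoint (str, optional) — The API endpoint to use when pushing the model to the Hub.
allow_patterns (List[str] or str, optional) — If provided, only files matching at least one pattern are pushed.
ignore_patterns (List[str] or str, optional) — If provided, files matching any of the patterns are not pushed.
delete_patterns (List[str] or str, optional) — If provided, remote files matching any of the patterns will be deleted from the repo.

Upload the learner checkpoint files to the Hub.

Use allow_patterns and ignore_patterns to precisely filter which files should be pushed to the Hub. Use delete_patterns to delete existing remote files in the same commit. See the upload_folder() reference for more details.

Raises

ValueError — if the user is not logged in to the Hugging Face Hub.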
