Main Classes

SetFitModel

class setfit.SetFitModel

( model_body: typing.Optional[sentence_transformers.SentenceTransformer.SentenceTransformer] = None model_head: typing.Union[setfit.modeling.SetFitHead, sklearn.linear_model._logistic.LogisticRegression, NoneType] = None multi_target_strategy: typing.Optional[str] = None normalize_embeddings: bool = False labels: typing.Optional[typing.List[str]] = None model_card_data: typing.Optional[setfit.model_card.SetFitModelCardData] = None sentence_transformers_kwargs: typing.Optional[typing.Dict] = None **kwargs )

A SetFit model with integration to the Hugging Face Hub.

Example

>>> from setfit import SetFitModel
>>> model = SetFitModel.from_pretrained("tomaarsen/setfit-bge-small-v1.5-sst2-8-shot")
>>> model.predict([
...     "It's a charming and often affecting journey.",
...     "It's slow -- very, very slow.",
...     "A sometimes tedious film.",
... ])
['positive', 'negative', 'negative']

from_pretrained

( force_download: bool = False resume_download: typing.Optional[bool] = None proxies: typing.Optional[typing.Dict] = None token: typing.Union[bool, str, NoneType] = None cache_dir: typing.Union[str, pathlib.Path, NoneType] = None local_files_only: bool = False revision: typing.Optional[str] = None **model_kwargs )

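Parameters

pretrained_model_name_or_path (str, Path) —
- Either the model_id (string) of a model hosted on the Hub, e.g. bigscience/bloom.
- Or a path to a directory containing model weights saved using transformers.PreTrainedModel.save_pretrained, e.g. ../path/to/my_model_directory/.
revision (str, optional) — Revision of the model on the Hub. Can be a branch name, a git tag, or any commit ID. Defaults to the latest commit on the main branch.
force_download (bool, optional, defaults to False) — Whether to force (re-)downloading the model weights and configuration files from the Hub, overriding the existing cache.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on every request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. By default, the token cached when running hf auth login is used.
cache_dir (str, Path, optional) — Path to the folder where cached files are stored.
local_files_only (bool, optional, defaults to False) — If True, avoid downloading the file and return the path to the local cached file if it exists.
labels (List[str], optional) — String labels for the classes. If the class labels are integers ranging from 0 to num_classes-1, the entry at each index is the string label for that integer class.
model_card_data (SetFitModelCardData, optional) — A SetFitModelCardData instance storing data such as model language, license, and dataset name, to be used in the automatically generated model card.
multi_target_strategy (str, optional) — The strategy to use with multi-label classification. One of "one-vs-rest", "multi-output", or "classifier-chain".
use_differentiable_head (bool, optional) — Whether to load SetFit using a differentiable (i.e., Torch) head instead of Logistic Regression.
normalize_embeddings (bool, optional) — Whether to apply normalization to the embeddings produced by the Sentence Transformer body.
device (Union[torch.device, str], optional) — The device on which to load the SetFit model, e.g. "cuda:0", "mps", or torch.device("cuda").
trust_remote_code (bool, defaults to False) — Whether to allow custom Sentence Transformers models defined on the Hub in their own modeling files. Set this to True only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.

Download a model from the Hugging Face Hub and instantiate it.

Example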

>>> from setfit import SetFitModel
>>> model = SetFitModel.from_pretrained(
...     "sentence-transformers/paraphrase-mpnet-base-v2",
...     labels=["positive", "negative"],
... )

save_pretrained

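( save_directory: typing.Union[str, pathlib.Path] config: typing.Union[dict, huggingface_hub.hub_mixin.DataclassInstance, NoneType] = None repo_id: typing.Optional[str] = None push_to_hub: bool = False model_card_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None **push_to_hub_kwargs ) → str or None

Parameters

save_directory (str or Path) — Path to the directory in which the model weights and configuration will be saved.
config (dict or DataclassInstance, optional) — Model configuration specified as a key/value dictionary or a dataclass instance.
push_to_hub (bool, optional, defaults to False) — Whether to push the model to the Hugging Face Hub after saving it.
repo_id (str, optional) — ID of your repository on the Hub. Used only if push_to_hub=True. Defaults to the folder name if not provided.
model_card_kwargs (Dict[str, Any], optional) — Additional arguments passed to the model card template to customize the model card.
push_to_hub_kwargs — Additional keyword arguments passed along to the ModelHubMixin.push_to_hub method.

str or None

The URL of the commit on the Hub if push_to_hub=True, None otherwise.

Save the weights to a local directory.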
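The repo_id fallback described above can be pictured with a small sketch. Note that resolve_repo_id is a hypothetical helper written for illustration, not a function from setfit or huggingface_hub, and it assumes that "folder name" means the final path component of save_directory.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_repo_id(save_directory: Union[str, Path], repo_id: Optional[str] = None) -> str:
    """Hypothetical helper: pick the repo ID used when push_to_hub=True.

    Assumption: when repo_id is omitted, the last component of
    save_directory is used as the repository name.
    """
    if repo_id is not None:
        return repo_id
    return Path(save_directory).name


# An explicit repo_id wins; otherwise the folder name is used.
explicit = resolve_repo_id("outputs/my-setfit-model", repo_id="username/my-model")
fallback = resolve_repo_id("outputs/my-setfit-model")
```

In practice, passing repo_id explicitly avoids pushing to an unintended repository when the output folder has a generic name.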

push_to_hub

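( repo_id: str config: typing.Union[dict, huggingface_hub.hub_mixin.DataclassInstance, NoneType] = None commit_message: str = 'Push model using huggingface_hub.' private: typing.Optional[bool] = None token: typing.Optional[str] = None branch: typing.Optional[str] = None create_pr: typing.Optional[bool] = None allow_patterns: typing.Union[str, typing.List[str], NoneType] = None ignore_patterns: typing.Union[str, typing.List[str], NoneType] = None delete_patterns: typing.Union[str, typing.List[str], NoneType] = None model_card_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None )

Parameters

repo_id (str) — ID of the repository to push to (example: "username/my-model").
config (dict or DataclassInstance, optional) — Model configuration specified as a key/value dictionary or a dataclass instance.
commit_message (str, optional) — Message to commit while pushing.
private (bool, optional) — Whether the created repository should be private. If None (default), the repository is public unless the organization's default is private.
token (str, optional) — The token to use as HTTP bearer authorization for remote files. By default, the token cached when running hf auth login is used.
branch (str, optional) — The git branch on which to push the model. Defaults to "main".
create_pr (bool, optional) — Whether to create a Pull Request from branch with that commit. Defaults to False.
allow_patterns (List[str] or str, optional) — If provided, only files matching at least one pattern are pushed.
ignore_patterns (List[str] or str, optional) — If provided, files matching any of the patterns are not pushed.
delete_patterns (List[str] or str, optional) — If provided, remote files matching any of the patterns are deleted from the repository.
model_card_kwargs (Dict[str, Any], optional) — Additional arguments passed to the model card template to customize the model card.

Upload the model checkpoint to the Hub.

Use allow_patterns and ignore_patterns to precisely filter which files should be pushed to the Hub. Use delete_patterns to delete existing remote files in the same commit. See the upload_folder reference for more details.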
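To make the filtering rules concrete, here is a minimal sketch that approximates them with the standard-library fnmatch module. It is not the Hub client's implementation: select_files is a hypothetical helper, the file names are illustrative, and the exact glob semantics used by the Hub may differ from fnmatch in edge cases.

```python
from fnmatch import fnmatch
from typing import List, Optional, Union

Patterns = Optional[Union[str, List[str]]]


def select_files(paths: List[str], allow_patterns: Patterns = None,
                 ignore_patterns: Patterns = None) -> List[str]:
    """Hypothetical helper: keep paths that pass the allow/ignore rules."""
    # A single string is treated as a one-element pattern list.
    if isinstance(allow_patterns, str):
        allow_patterns = [allow_patterns]
    if isinstance(ignore_patterns, str):
        ignore_patterns = [ignore_patterns]

    selected = []
    for path in paths:
        # allow_patterns: the file must match at least one pattern.
        if allow_patterns is not None and not any(fnmatch(path, p) for p in allow_patterns):
            continue
        # ignore_patterns: the file must match none of the patterns.
        if ignore_patterns is not None and any(fnmatch(path, p) for p in ignore_patterns):
            continue
        selected.append(path)
    return selected


# Illustrative file names only.
files = ["config.json", "model.safetensors", "README.md", "logs/run1.txt"]
to_push = select_files(
    files,
    allow_patterns=["*.json", "*.safetensors", "*.md"],
    ignore_patterns="README.md",
)
```

Here to_push keeps only config.json and model.safetensors: logs/run1.txt matches no allow pattern, and README.md is allowed but then ignored.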

__call__

( inputs: typing.Union[str, typing.List[str]] batch_size: int = 32 as_numpy: bool = False use_labels: bool = True show_progress_bar: typing.Optional[bool] = None ) → Union[torch.Tensor, np.ndarray, List[str], int, str]

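Parameters

inputs (Union[str, List[str]]) — The input sentence or sentences to predict classes for.
batch_size (int, defaults to 32) — The batch size to use when encoding the sentences to embeddings. Higher values often mean faster processing but higher memory usage.
as_numpy (bool, defaults to False) — Whether to output a numpy array instead.
use_labels (bool, defaults to True) — Whether to try to return elements of SetFitModel.labels.
show_progress_bar (Optional[bool], defaults to None) — Whether to show a progress bar while encoding.

Union[torch.Tensor, np.ndarray, List[str], int, str]

If use_labels is True and SetFitModel.labels has been defined, a list of string labels with the same length as the inputs is returned. Otherwise, a vector with the same length as the inputs is returned, denoting the class predicted for each input. If the input is a single string, the output is a single label as well.

Predict the various classes.

Example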

>>> model = SetFitModel.from_pretrained(...)
>>> model(["What a boring display", "Exhilarating through and through", "I'm wowed!"])
['negative', 'positive', 'positive']
>>> model("That was cool!")
'positive'
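The return-value rules above can be summarized in a short sketch. format_predictions is a hypothetical helper, not part of setfit, and plain Python lists stand in for the torch.Tensor or np.ndarray that the real method returns when no string labels are used.

```python
from typing import List, Optional, Sequence, Union


def format_predictions(
    class_ids: Sequence[int],
    labels: Optional[List[str]],
    single_input: bool,
    use_labels: bool = True,
) -> Union[List[str], List[int], str, int]:
    """Hypothetical helper mirroring the documented return shapes."""
    if use_labels and labels is not None:
        # Map each predicted integer class to its string label.
        outputs = [labels[i] for i in class_ids]
    else:
        # No labels available (or not requested): keep the integer classes.
        outputs = list(class_ids)
    # A single input string yields a single output rather than a list.
    return outputs[0] if single_input else outputs


labels = ["negative", "positive"]
batch_result = format_predictions([0, 1, 1], labels, single_input=False)
single_result = format_predictions([1], labels, single_input=True)
raw_result = format_predictions([0, 1], None, single_input=False)
```

With these inputs, batch_result is a list of three string labels, single_result is one string, and raw_result stays as integer class IDs because no labels are defined.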

label2id

( )

Return a mapping from string labels to integer IDs.

id2label

( )

Return a mapping from integer IDs to string labels.
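As a rough illustration of how label2id and id2label relate to each other, the sketch below builds both mappings from a list of labels. It assumes, without confirmation from this page, that IDs are assigned by position in SetFitModel.labels; the label values are illustrative.

```python
# Assumption: integer IDs follow the order of the labels list.
labels = ["negative", "positive"]

# String label -> integer ID.
label2id = {label: idx for idx, label in enumerate(labels)}

# Integer ID -> string label (the inverse mapping).
id2label = {idx: label for idx, label in enumerate(labels)}

# Round trip: converting a label to its ID and back gives the same label.
round_trip = [id2label[label2id[label]] for label in labels]
```

The two mappings are inverses of each other, which is what allows integer predictions to be reported as string labels and string targets to be converted to integers.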

create_model_card

( path: str model_name: typing.Optional[str] = 'SetFit Model' )

Parameters

path (str) — The path to save the model card to.
model_name (str, optional) — The name of the model. Defaults to SetFit Model.

Create and save a model card for a SetFit model.
