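Python Quickstart

AutoTrain is a library that lets you train state-of-the-art models on Hugging Face Spaces or locally. It provides a simple, easy-to-use interface for training models on a range of tasks, such as LLM fine-tuning, text classification, image classification, and object detection.

In this quickstart guide, we show you how to train a model with AutoTrain in Python.

Getting Started

AutoTrain can be installed with pip: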

$ pip install autotrain-advanced

The example code below shows how to fine-tune an LLM with AutoTrain in Python.

import os

from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject


params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="HuggingFaceH4/no_robots",
    chat_template="tokenizer",
    text_column="messages",
    train_split="train",
    trainer="sft",
    epochs=3,
    batch_size=1,
    lr=1e-5,
    peft=True,
    quantization="int4",
    target_modules="all-linear",
    padding="right",
    optimizer="paged_adamw_8bit",
    scheduler="cosine",
    gradient_accumulation=8,
    mixed_precision="bf16",
    merge_adapter=True,
    project_name="autotrain-llama32-1b-finetune",
    log="tensorboard",
    push_to_hub=True,
    username=os.environ.get("HF_USERNAME"),
    token=os.environ.get("HF_TOKEN"),
)


backend = "local"
project = AutoTrainProject(params=params, backend=backend, process=True)
project.create()

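In this example, we fine-tune the `meta-llama/Llama-3.2-1B-Instruct` model on the `HuggingFaceH4/no_robots` dataset. We train the model for 3 epochs with a batch size of 1 and a learning rate of `1e-5`, using the `paged_adamw_8bit` optimizer and the `cosine` scheduler. We also use mixed-precision training with a gradient accumulation of 8. Once training finishes, the final model is pushed to the Hugging Face Hub.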
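To make the interaction between `batch_size` and `gradient_accumulation` concrete, here is a minimal sketch of the arithmetic on a single device. The helper names and the dataset size of 1,000 rows are illustrative assumptions and are not part of AutoTrain; the step count is an approximation, since the trainer may round partial batches differently.

```python
import math


def effective_batch_size(batch_size: int, gradient_accumulation: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer update."""
    return batch_size * gradient_accumulation * num_devices


def optimizer_steps(num_examples: int, epochs: int, batch_size: int, gradient_accumulation: int) -> int:
    """Approximate number of optimizer updates for a single-device run."""
    per_epoch = math.ceil(num_examples / effective_batch_size(batch_size, gradient_accumulation))
    return per_epoch * epochs


# Values from the example: batch_size=1, gradient_accumulation=8, epochs=3.
# The dataset size (1,000 rows) is a made-up number for illustration.
print(effective_batch_size(1, 8))      # 8
print(optimizer_steps(1000, 3, 1, 8))  # 375
```

With a batch size of 1, gradient accumulation is what keeps each update from being driven by a single example, at the cost of fewer updates per epoch.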

To train the model, save the script above as `train.py` and run the following commands:

$ export HF_USERNAME=<your-hf-username>
$ export HF_TOKEN=<your-hf-write-token>
$ python train.py

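This creates a new project directory named `autotrain-llama32-1b-finetune` and starts the training process. Once training finishes, the model is pushed to the Hugging Face Hub.

Your HF_TOKEN and HF_USERNAME are only needed if you want to push the model or access gated models or datasets.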
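Since the credentials are optional, a script can enable pushing only when both variables are present. The following is a small sketch of that idea; `hub_settings` is a hypothetical helper, not an AutoTrain function, and the token is passed through whenever it is set because gated models may still need it.

```python
import os
from typing import Mapping, Optional


def hub_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build Hub-related keyword arguments from environment variables.

    Pushing is enabled only when both HF_USERNAME and HF_TOKEN are set.
    """
    env = os.environ if env is None else env
    username = env.get("HF_USERNAME") or None
    token = env.get("HF_TOKEN") or None
    return {
        "push_to_hub": bool(username and token),
        "username": username,
        "token": token,
    }


# Hypothetical usage: LLMTrainingParams(..., **hub_settings())
print(hub_settings({"HF_USERNAME": "alice", "HF_TOKEN": "hf_xxx"})["push_to_hub"])  # True
print(hub_settings({})["push_to_hub"])  # False
```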

The AutoTrainProject class

class autotrain.project.AutoTrainProject

( params: typing.Union[autotrain.trainers.clm.params.LLMTrainingParams, autotrain.trainers.text_classification.params.TextClassificationParams, autotrain.trainers.tabular.params.TabularParams, autotrain.trainers.seq2seq.params.Seq2SeqParams, autotrain.trainers.image_classification.params.ImageClassificationParams, autotrain.trainers.text_regression.params.TextRegressionParams, autotrain.trainers.object_detection.params.ObjectDetectionParams, autotrain.trainers.token_classification.params.TokenClassificationParams, autotrain.trainers.sent_transformers.params.SentenceTransformersParams, autotrain.trainers.image_regression.params.ImageRegressionParams, autotrain.trainers.extractive_question_answering.params.ExtractiveQuestionAnsweringParams, autotrain.trainers.vlm.params.VLMTrainingParams] backend: str process: bool = False )

A class for training an AutoTrain project.

Attributes

params : Union[LLMTrainingParams, TextClassificationParams, TabularParams, Seq2SeqParams, ImageClassificationParams, TextRegressionParams, ObjectDetectionParams, TokenClassificationParams, SentenceTransformersParams, ImageRegressionParams, ExtractiveQuestionAnsweringParams, VLMTrainingParams]
The parameters for the AutoTrain project.

backend : str
The backend to use for the AutoTrain project. It should be one of the following:

local
spaces-a10g-large
spaces-a10g-small
spaces-a100-large
spaces-t4-medium
spaces-t4-small
spaces-cpu-upgrade
spaces-cpu-basic
spaces-l4x1
spaces-l4x4
spaces-l40sx1
spaces-l40sx4
spaces-l40sx8
spaces-a10g-largex2
spaces-a10g-largex4

process : bool
Flag indicating whether the parameters and dataset should be processed. Set it to True if your data is not in a format AutoTrain can read. When in doubt, set it to True. Defaults to False.

Methods

__post_init__(): Validates the backend attribute.
create(): Creates a runner based on the backend and initializes the AutoTrain project.
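Because an invalid backend is rejected when the project is constructed, it can be convenient to check the value earlier, for example when parsing command-line arguments. The sketch below shows one way to do that with the backend names listed in this section; `check_backend` is a hypothetical helper and is not the library's own validation code.

```python
# Backend names copied from the list in this section.
KNOWN_BACKENDS = frozenset({
    "local",
    "spaces-a10g-large", "spaces-a10g-small", "spaces-a100-large",
    "spaces-t4-medium", "spaces-t4-small",
    "spaces-cpu-upgrade", "spaces-cpu-basic",
    "spaces-l4x1", "spaces-l4x4",
    "spaces-l40sx1", "spaces-l40sx4", "spaces-l40sx8",
    "spaces-a10g-largex2", "spaces-a10g-largex4",
})


def check_backend(backend: str) -> str:
    """Return `backend` unchanged, or raise ValueError if it is not a known name."""
    if backend not in KNOWN_BACKENDS:
        options = ", ".join(sorted(KNOWN_BACKENDS))
        raise ValueError(f"Unknown backend {backend!r}; expected one of: {options}")
    return backend


def runs_on_spaces(backend: str) -> bool:
    """True for every known backend except "local"."""
    return check_backend(backend) != "local"


print(check_backend("local"))           # local
print(runs_on_spaces("spaces-l4x1"))    # True
```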

Parameters

Text Tasks

class autotrain.trainers.clm.params.LLMTrainingParams

( model: str = 'gpt2' project_name: str = 'project-name' data_path: str = 'data' train_split: str = 'train' valid_split: typing.Optional[str] = None add_eos_token: bool = True block_size: typing.Union[int, typing.List[int]] = -1 model_max_length: int = 2048 padding: typing.Optional[str] = 'right' trainer: str = 'default' use_flash_attention_2: bool = False log: str = 'none' disable_gradient_checkpointing: bool = False logging_steps: int = -1 eval_strategy: str = 'epoch' save_total_limit: int = 1 auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None lr: float = 3e-05 epochs: int = 1 batch_size: int = 2 warmup_ratio: float = 0.1 gradient_accumulation: int = 4 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 seed: int = 42 chat_template: typing.Optional[str] = None quantization: typing.Optional[str] = 'int4' target_modules: typing.Optional[str] = 'all-linear' merge_adapter: bool = False peft: bool = False lora_r: int = 16 lora_alpha: int = 32 lora_dropout: float = 0.05 model_ref: typing.Optional[str] = None dpo_beta: float = 0.1 max_prompt_length: int = 128 max_completion_length: typing.Optional[int] = None prompt_text_column: typing.Optional[str] = None text_column: str = 'text' rejected_text_column: typing.Optional[str] = None push_to_hub: bool = False username: typing.Optional[str] = None token: typing.Optional[str] = None unsloth: bool = False distributed_backend: typing.Optional[str] = None )

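Parameters

model (str) — Name of the model to use for training. Defaults to "gpt2".
project_name (str) — Name of the project and output directory. Defaults to "project-name".
data_path (str) — Path to the dataset. Defaults to "data".
train_split (str) — Configuration for the training data split. Defaults to "train".
valid_split (Optional[str]) — Configuration for the validation data split. Defaults to None.
add_eos_token (bool) — Whether to add an EOS token at the end of sequences. Defaults to True.
block_size (Union[int, List[int]]) — Size of the training blocks; can be a single integer or a list of integers. Defaults to -1.
model_max_length (int) — Maximum length of the model input. Defaults to 2048.
padding (Optional[str]) — Side on which to pad sequences ("left" or "right"). Defaults to "right".
trainer (str) — Type of trainer to use. Defaults to "default".
use_flash_attention_2 (bool) — Whether to use Flash Attention 2. Defaults to False.
log (str) — Logging method for experiment tracking. Defaults to "none".
disable_gradient_checkpointing (bool) — Whether to disable gradient checkpointing. Defaults to False.
logging_steps (int) — Number of steps between logging events. Defaults to -1.
eval_strategy (str) — Evaluation strategy (e.g., "epoch"). Defaults to "epoch".
save_total_limit (int) — Maximum number of checkpoints to keep. Defaults to 1.
auto_find_batch_size (bool) — Whether to automatically find the optimal batch size. Defaults to False.
mixed_precision (Optional[str]) — Type of mixed precision to use (e.g., "fp16", "bf16", or None). Defaults to None.
lr (float) — Learning rate for training. Defaults to 3e-5.
epochs (int) — Number of training epochs. Defaults to 1.
batch_size (int) — Batch size for training. Defaults to 2.
warmup_ratio (float) — Proportion of training used for learning rate warmup. Defaults to 0.1.
gradient_accumulation (int) — Number of steps to accumulate gradients before updating. Defaults to 4.
optimizer (str) — Optimizer to use for training. Defaults to "adamw_torch".
scheduler (str) — Learning rate scheduler to use. Defaults to "linear".
weight_decay (float) — Weight decay to apply to the optimizer. Defaults to 0.0.
max_grad_norm (float) — Maximum norm for gradient clipping. Defaults to 1.0.
seed (int) — Random seed for reproducibility. Defaults to 42.
chat_template (Optional[str]) — Template for chat-based models; options are None, zephyr, chatml, or tokenizer. Defaults to None.
quantization (Optional[str]) — Quantization method to use (e.g., "int4", "int8", or None). Defaults to "int4".
target_modules (Optional[str]) — Target modules for quantization or fine-tuning. Defaults to "all-linear".
merge_adapter (bool) — Whether to merge the adapter layers. Defaults to False.
peft (bool) — Whether to use Parameter-Efficient Fine-Tuning (PEFT). Defaults to False.
lora_r (int) — Rank of the LoRA matrices. Defaults to 16.
lora_alpha (int) — Alpha parameter for LoRA. Defaults to 32.
lora_dropout (float) — Dropout rate for LoRA. Defaults to 0.05.
model_ref (Optional[str]) — Reference model for the DPO trainer. Defaults to None.
dpo_beta (float) — Beta parameter for the DPO trainer. Defaults to 0.1.
max_prompt_length (int) — Maximum length of the prompt. Defaults to 128.
max_completion_length (Optional[int]) — Maximum length of the completion. Defaults to None.
prompt_text_column (Optional[str]) — Column name for the prompt text. Defaults to None.
text_column (str) — Column name for the text data. Defaults to "text".
rejected_text_column (Optional[str]) — Column name for the rejected text data. Defaults to None.
push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Defaults to False.
username (Optional[str]) — Hugging Face username for authentication. Defaults to None.
token (Optional[str]) — Hugging Face token for authentication. Defaults to None.
unsloth (bool) — Whether to use the unsloth library. Defaults to False.
distributed_backend (Optional[str]) — Backend to use for distributed training. Defaults to None.

LLMTrainingParams: Parameters for training a language model using the autotrain library.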
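Several of these parameters only matter for preference training (`model_ref`, `dpo_beta`, `prompt_text_column`, `rejected_text_column`). As a rough sketch of how they fit together, the snippet below collects them in a plain dictionary and checks that the three column settings are filled in. The trainer name `"dpo"`, the dataset column names, and the helper `missing_dpo_columns` are assumptions for illustration; consult the library for the values it actually accepts.

```python
def missing_dpo_columns(kwargs: dict) -> list:
    """Return the names of preference-related column settings that are unset."""
    required = ("prompt_text_column", "text_column", "rejected_text_column")
    return [name for name in required if not kwargs.get(name)]


# Hypothetical settings for a preference dataset with
# "prompt", "chosen", and "rejected" columns (column names are assumptions).
dpo_kwargs = {
    "trainer": "dpo",                    # assumed trainer name
    "prompt_text_column": "prompt",
    "text_column": "chosen",             # the preferred response
    "rejected_text_column": "rejected",  # the dispreferred response
    "dpo_beta": 0.1,
    "max_prompt_length": 128,
    "max_completion_length": None,
    "model_ref": None,
}

print(missing_dpo_columns(dpo_kwargs))           # []
print(missing_dpo_columns({"trainer": "dpo"}))   # ['prompt_text_column', 'text_column', 'rejected_text_column']
```

The dictionary could then be unpacked into the parameter class, for example `LLMTrainingParams(model=..., data_path=..., **dpo_kwargs)`.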

class autotrain.trainers.sent_transformers.params.SentenceTransformersParams

( data_path: str = None model: str = 'microsoft/mpnet-base' lr: float = 3e-05 epochs: int = 3 max_seq_length: int = 128 batch_size: int = 8 warmup_ratio: float = 0.1 gradient_accumulation: int = 1 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None logging_steps: int = -1 project_name: str = 'project-name' auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None save_total_limit: int = 1 token: typing.Optional[str] = None push_to_hub: bool = False eval_strategy: str = 'epoch' username: typing.Optional[str] = None log: str = 'none' early_stopping_patience: int = 5 early_stopping_threshold: float = 0.01 trainer: str = 'pair_score' sentence1_column: str = 'sentence1' sentence2_column: str = 'sentence2' sentence3_column: typing.Optional[str] = None target_column: typing.Optional[str] = None )

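Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the pretrained model to use. Defaults to "microsoft/mpnet-base".
lr (float) — Learning rate for training. Defaults to 3e-5.
epochs (int) — Number of training epochs. Defaults to 3.
max_seq_length (int) — Maximum sequence length of the input. Defaults to 128.
batch_size (int) — Batch size for training. Defaults to 8.
warmup_ratio (float) — Proportion of training used for learning rate warmup. Defaults to 0.1.
gradient_accumulation (int) — Number of steps to accumulate gradients before updating. Defaults to 1.
optimizer (str) — Optimizer to use. Defaults to "adamw_torch".
scheduler (str) — Learning rate scheduler to use. Defaults to "linear".
weight_decay (float) — Weight decay to apply. Defaults to 0.0.
max_grad_norm (float) — Maximum gradient norm for clipping. Defaults to 1.0.
seed (int) — Random seed for reproducibility. Defaults to 42.
train_split (str) — Name of the training data split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation data split. Defaults to None.
logging_steps (int) — Number of steps between logging events. Defaults to -1.
project_name (str) — Name of the project for the output directory. Defaults to "project-name".
auto_find_batch_size (bool) — Whether to automatically find the optimal batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None). Defaults to None.
save_total_limit (int) — Maximum number of checkpoints to save. Defaults to 1.
token (Optional[str]) — Token for accessing the Hugging Face Hub. Defaults to None.
push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Defaults to False.
eval_strategy (str) — Evaluation strategy to use. Defaults to "epoch".
username (Optional[str]) — Hugging Face username. Defaults to None.
log (str) — Logging method for experiment tracking. Defaults to "none".
early_stopping_patience (int) — Number of epochs with no improvement after which training is stopped. Defaults to 5.
early_stopping_threshold (float) — Threshold a new best value must exceed to count as an improvement. Defaults to 0.01.
trainer (str) — Name of the trainer to use. Defaults to "pair_score".
sentence1_column (str) — Name of the column containing the first sentence. Defaults to "sentence1".
sentence2_column (str) — Name of the column containing the second sentence. Defaults to "sentence2".
sentence3_column (Optional[str]) — Name of the column containing the third sentence (if applicable). Defaults to None.
target_column (Optional[str]) — Name of the column containing the target variable. Defaults to None.

SentenceTransformersParams is a configuration class for setting up the parameters for training sentence transformers.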
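The `early_stopping_patience` and `early_stopping_threshold` parameters appear in most of the parameter classes on this page. The following is a simplified illustration of how a patience/threshold rule typically behaves, assuming a metric where lower is better (such as validation loss). It is a sketch of the general technique, not the code AutoTrain runs, and `should_stop` is a hypothetical name.

```python
def should_stop(metric_history, patience=5, threshold=0.01, greater_is_better=False):
    """Return True once `patience` consecutive evaluations fail to beat
    the best value seen so far by more than `threshold`."""
    best = None
    evals_without_improvement = 0
    for value in metric_history:
        if best is None:
            best = value  # the first evaluation only sets the baseline
            continue
        improvement = (value - best) if greater_is_better else (best - value)
        if improvement > threshold:
            best = value
            evals_without_improvement = 0
        else:
            evals_without_improvement += 1
            if evals_without_improvement >= patience:
                return True
    return False


# Validation loss plateaus after the second evaluation: five evaluations
# in a row improve by less than 0.01, so training would stop.
print(should_stop([1.0, 0.8, 0.799, 0.798, 0.797, 0.796, 0.795]))  # True
# Loss keeps dropping by 0.2 per evaluation, so training continues.
print(should_stop([1.0, 0.8, 0.6, 0.4]))  # False
```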

class autotrain.trainers.seq2seq.params.Seq2SeqParams

( data_path: str = None model: str = 'google/flan-t5-base' username: typing.Optional[str] = None seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None project_name: str = 'project-name' token: typing.Optional[str] = None push_to_hub: bool = False text_column: str = 'text' target_column: str = 'target' lr: float = 5e-05 epochs: int = 3 max_seq_length: int = 128 max_target_length: int = 128 batch_size: int = 2 warmup_ratio: float = 0.1 gradient_accumulation: int = 1 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 logging_steps: int = -1 eval_strategy: str = 'epoch' auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None save_total_limit: int = 1 peft: bool = False quantization: typing.Optional[str] = 'int8' lora_r: int = 16 lora_alpha: int = 32 lora_dropout: float = 0.05 target_modules: str = 'all-linear' log: str = 'none' early_stopping_patience: int = 5 early_stopping_threshold: float = 0.01 )

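Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the model to use. Defaults to "google/flan-t5-base".
username (Optional[str]) — Hugging Face username.
seed (int) — Random seed for reproducibility. Defaults to 42.
train_split (str) — Name of the training data split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation data split.
project_name (str) — Name of the project or output directory. Defaults to "project-name".
token (Optional[str]) — Hub token for authentication.
push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Defaults to False.
text_column (str) — Name of the text column in the dataset. Defaults to "text".
target_column (str) — Name of the target text column in the dataset. Defaults to "target".
lr (float) — Learning rate for training. Defaults to 5e-5.
epochs (int) — Number of training epochs. Defaults to 3.
max_seq_length (int) — Maximum sequence length for the input text. Defaults to 128.
max_target_length (int) — Maximum sequence length for the target text. Defaults to 128.
batch_size (int) — Training batch size. Defaults to 2.
warmup_ratio (float) — Proportion of warmup steps. Defaults to 0.1.
gradient_accumulation (int) — Number of gradient accumulation steps. Defaults to 1.
optimizer (str) — Optimizer to use. Defaults to "adamw_torch".
scheduler (str) — Learning rate scheduler to use. Defaults to "linear".
weight_decay (float) — Weight decay for the optimizer. Defaults to 0.0.
max_grad_norm (float) — Maximum gradient norm for clipping. Defaults to 1.0.
logging_steps (int) — Number of steps between logging events. Defaults to -1 (disabled).
eval_strategy (str) — Evaluation strategy. Defaults to "epoch".
auto_find_batch_size (bool) — Whether to automatically find the batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None).
save_total_limit (int) — Maximum number of checkpoints to save. Defaults to 1.
peft (bool) — Whether to use Parameter-Efficient Fine-Tuning (PEFT). Defaults to False.
quantization (Optional[str]) — Quantization mode (int4, int8, or None). Defaults to "int8".
lora_r (int) — LoRA-R parameter for PEFT. Defaults to 16.
lora_alpha (int) — LoRA-Alpha parameter for PEFT. Defaults to 32.
lora_dropout (float) — LoRA-Dropout parameter for PEFT. Defaults to 0.05.
target_modules (str) — Target modules for PEFT. Defaults to "all-linear".
log (str) — Logging method for experiment tracking. Defaults to "none".
early_stopping_patience (int) — Patience for early stopping. Defaults to 5.
early_stopping_threshold (float) — Threshold for early stopping. Defaults to 0.01.

Seq2SeqParams is a configuration class for sequence-to-sequence training parameters.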

class autotrain.trainers.token_classification.params.TokenClassificationParams

( data_path: str = None model: str = 'bert-base-uncased' lr: float = 5e-05 epochs: int = 3 max_seq_length: int = 128 batch_size: int = 8 warmup_ratio: float = 0.1 gradient_accumulation: int = 1 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None tokens_column: str = 'tokens' tags_column: str = 'tags' logging_steps: int = -1 project_name: str = 'project-name' auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None save_total_limit: int = 1 token: typing.Optional[str] = None push_to_hub: bool = False eval_strategy: str = 'epoch' username: typing.Optional[str] = None log: str = 'none' early_stopping_patience: int = 5 early_stopping_threshold: float = 0.01 )

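Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the model to use. Defaults to "bert-base-uncased".
lr (float) — Learning rate. Defaults to 5e-5.
epochs (int) — Number of training epochs. Defaults to 3.
max_seq_length (int) — Maximum sequence length. Defaults to 128.
batch_size (int) — Training batch size. Defaults to 8.
warmup_ratio (float) — Warmup proportion. Defaults to 0.1.
gradient_accumulation (int) — Number of gradient accumulation steps. Defaults to 1.
optimizer (str) — Optimizer to use. Defaults to "adamw_torch".
scheduler (str) — Scheduler to use. Defaults to "linear".
weight_decay (float) — Weight decay. Defaults to 0.0.
max_grad_norm (float) — Maximum gradient norm. Defaults to 1.0.
seed (int) — Random seed. Defaults to 42.
train_split (str) — Name of the training split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation split. Defaults to None.
tokens_column (str) — Name of the tokens column. Defaults to "tokens".
tags_column (str) — Name of the tags column. Defaults to "tags".
logging_steps (int) — Number of steps between logging events. Defaults to -1.
project_name (str) — Name of the project. Defaults to "project-name".
auto_find_batch_size (bool) — Whether to automatically find the batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision setting (fp16, bf16, or None). Defaults to None.
save_total_limit (int) — Total number of checkpoints to save. Defaults to 1.
token (Optional[str]) — Hub token for authentication. Defaults to None.
push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Defaults to False.
eval_strategy (str) — Evaluation strategy. Defaults to "epoch".
username (Optional[str]) — Hugging Face username. Defaults to None.
log (str) — Logging method for experiment tracking. Defaults to "none".
early_stopping_patience (int) — Patience for early stopping. Defaults to 5.
early_stopping_threshold (float) — Threshold for early stopping. Defaults to 0.01.

TokenClassificationParams is a configuration class for token classification training parameters.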
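The `tokens_column` and `tags_column` settings imply that each row holds two parallel sequences, one tag per token. The sketch below illustrates that shape and a basic alignment check; the BIO-style tag strings and the helper `pair_tokens_with_tags` are assumptions for illustration, since this page does not specify the tag format.

```python
def pair_tokens_with_tags(row, tokens_column="tokens", tags_column="tags"):
    """Return (token, tag) pairs, raising ValueError if the lengths differ."""
    tokens, tags = row[tokens_column], row[tags_column]
    if len(tokens) != len(tags):
        raise ValueError(f"{len(tokens)} tokens but {len(tags)} tags")
    return list(zip(tokens, tags))


# A made-up row using the default column names.
row = {
    "tokens": ["Hugging", "Face", "is", "based", "in", "Paris"],
    "tags": ["B-ORG", "I-ORG", "O", "O", "O", "B-LOC"],
}

print(pair_tokens_with_tags(row)[:2])  # [('Hugging', 'B-ORG'), ('Face', 'I-ORG')]
```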

class autotrain.trainers.extractive_question_answering.params.ExtractiveQuestionAnsweringParams

( data_path: str = None model: str = 'bert-base-uncased' lr: float = 5e-05 epochs: int = 3 max_seq_length: int = 128 max_doc_stride: int = 128 batch_size: int = 8 warmup_ratio: float = 0.1 gradient_accumulation: int = 1 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None text_column: str = 'context' question_column: str = 'question' answer_column: str = 'answers' logging_steps: int = -1 project_name: str = 'project-name' auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None save_total_limit: int = 1 token: typing.Optional[str] = None push_to_hub: bool = False eval_strategy: str = 'epoch' username: typing.Optional[str] = None log: str = 'none' early_stopping_patience: int = 5 early_stopping_threshold: float = 0.01 )

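Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the pretrained model. Defaults to "bert-base-uncased".
lr (float) — Learning rate for the optimizer. Defaults to 5e-5.
epochs (int) — Number of training epochs. Defaults to 3.
max_seq_length (int) — Maximum sequence length of the input. Defaults to 128.
max_doc_stride (int) — Maximum document stride used when splitting the context. Defaults to 128.
batch_size (int) — Batch size for training. Defaults to 8.
warmup_ratio (float) — Warmup proportion for the learning rate scheduler. Defaults to 0.1.
gradient_accumulation (int) — Number of gradient accumulation steps. Defaults to 1.
optimizer (str) — Type of optimizer. Defaults to "adamw_torch".
scheduler (str) — Type of learning rate scheduler. Defaults to "linear".
weight_decay (float) — Weight decay for the optimizer. Defaults to 0.0.
max_grad_norm (float) — Maximum gradient norm for gradient clipping. Defaults to 1.0.
seed (int) — Random seed for reproducibility. Defaults to 42.
train_split (str) — Name of the training data split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation data split. Defaults to None.
text_column (str) — Column name for the context/text. Defaults to "context".
question_column (str) — Column name for the question. Defaults to "question".
answer_column (str) — Column name for the answers. Defaults to "answers".
logging_steps (int) — Number of steps between logging events. Defaults to -1.
project_name (str) — Name of the project for the output directory. Defaults to "project-name".
auto_find_batch_size (bool) — Whether to automatically find the optimal batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None). Defaults to None.
save_total_limit (int) — Maximum number of checkpoints to save. Defaults to 1.
token (Optional[str]) — Authentication token for the Hugging Face Hub. Defaults to None.
push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Defaults to False.
eval_strategy (str) — Evaluation strategy during training. Defaults to "epoch".
username (Optional[str]) — Hugging Face username for authentication. Defaults to None.
log (str) — Logging method for experiment tracking. Defaults to "none".
early_stopping_patience (int) — Number of epochs with no improvement before early stopping. Defaults to 5.
early_stopping_threshold (float) — Improvement threshold for early stopping. Defaults to 0.01.

Parameters for extractive question answering.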
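Contexts longer than `max_seq_length` have to be split into several windows, which is where `max_doc_stride` comes in. The sketch below shows the usual sliding-window idea on a plain list, assuming the stride denotes how many items consecutive windows share (the convention used by the `stride` argument of Hugging Face tokenizers). It ignores the question and special tokens, requires the overlap to be smaller than the window, and says nothing about how AutoTrain itself reconciles the two default values; `sliding_windows` is a hypothetical helper.

```python
def sliding_windows(tokens, max_seq_length, doc_stride):
    """Split `tokens` into windows of at most `max_seq_length` items,
    where consecutive windows overlap by `doc_stride` items."""
    if not 0 <= doc_stride < max_seq_length:
        raise ValueError("doc_stride must be non-negative and smaller than max_seq_length")
    step = max_seq_length - doc_stride
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_seq_length])
        if start + max_seq_length >= len(tokens):
            break  # this window already reaches the end of the context
        start += step
    return windows


# Ten "tokens", windows of four, overlapping by two.
for window in sliding_windows(list(range(10)), max_seq_length=4, doc_stride=2):
    print(window)
# [0, 1, 2, 3]
# [2, 3, 4, 5]
# [4, 5, 6, 7]
# [6, 7, 8, 9]
```

The overlap ensures that an answer span cut off at the edge of one window appears whole in a neighboring one, as long as the span is no longer than the overlap.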

class autotrain.trainers.text_classification.params.TextClassificationParams

( data_path: str = None model: str = 'bert-base-uncased' lr: float = 5e-05 epochs: int = 3 max_seq_length: int = 128 batch_size: int = 8 warmup_ratio: float = 0.1 gradient_accumulation: int = 1 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None text_column: str = 'text' target_column: str = 'target' logging_steps: int = -1 project_name: str = 'project-name' auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None save_total_limit: int = 1 token: typing.Optional[str] = None push_to_hub: bool = False eval_strategy: str = 'epoch' username: typing.Optional[str] = None log: str = 'none' early_stopping_patience: int = 5 early_stopping_threshold: float = 0.01 )

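Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the model to use. Defaults to "bert-base-uncased".
lr (float) — Learning rate. Defaults to 5e-5.
epochs (int) — Number of training epochs. Defaults to 3.
max_seq_length (int) — Maximum sequence length. Defaults to 128.
batch_size (int) — Training batch size. Defaults to 8.
warmup_ratio (float) — Warmup proportion. Defaults to 0.1.
gradient_accumulation (int) — Number of gradient accumulation steps. Defaults to 1.
optimizer (str) — Optimizer to use. Defaults to "adamw_torch".
scheduler (str) — Scheduler to use. Defaults to "linear".
weight_decay (float) — Weight decay. Defaults to 0.0.
max_grad_norm (float) — Maximum gradient norm. Defaults to 1.0.
seed (int) — Random seed. Defaults to 42.
train_split (str) — Name of the training split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation split. Defaults to None.
text_column (str) — Name of the text column in the dataset. Defaults to "text".
target_column (str) — Name of the target column in the dataset. Defaults to "target".
logging_steps (int) — Number of steps between logging events. Defaults to -1.
project_name (str) — Name of the project. Defaults to "project-name".
auto_find_batch_size (bool) — Whether to automatically find the batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision setting (fp16, bf16, or None). Defaults to None.
save_total_limit (int) — Total number of checkpoints to save. Defaults to 1.
token (Optional[str]) — Hub token for authentication. Defaults to None.
push_to_hub (bool) — Whether to push the model to the Hub. Defaults to False.
eval_strategy (str) — Evaluation strategy. Defaults to "epoch".
username (Optional[str]) — Hugging Face username. Defaults to None.
log (str) — Logging method for experiment tracking. Defaults to "none".
early_stopping_patience (int) — Number of epochs with no improvement after which training is stopped. Defaults to 5.
early_stopping_threshold (float) — Threshold a new best value must exceed for training to continue. Defaults to 0.01.

TextClassificationParams is a configuration class for text classification training parameters.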
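To show how `data_path`, `train_split`, `text_column`, and `target_column` relate to a dataset on disk, the sketch below writes a tiny CSV file and collects matching settings in a dictionary. It assumes that, for local data, `data_path` points to a folder and `train_split` names a CSV file inside it (`train.csv` here); that layout, the sample rows, and the project name are illustrative assumptions rather than something stated on this page.

```python
import csv
import tempfile
from pathlib import Path

# Two made-up rows using the default column names "text" and "target".
rows = [
    {"text": "I loved this film", "target": "positive"},
    {"text": "Terrible service", "target": "negative"},
]

# Write the rows to <data_dir>/train.csv.
data_dir = Path(tempfile.mkdtemp())
with open(data_dir / "train.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["text", "target"])
    writer.writeheader()
    writer.writerows(rows)

# Settings that could be unpacked into TextClassificationParams(**params_kwargs).
params_kwargs = {
    "data_path": str(data_dir),   # folder containing train.csv (assumed layout)
    "train_split": "train",       # assumed to match the CSV file name
    "text_column": "text",
    "target_column": "target",
    "project_name": "my-text-classifier",  # hypothetical project name
}

print(sorted(p.name for p in data_dir.iterdir()))  # ['train.csv']
```

When the data is in a custom layout like this one, setting `process=True` on `AutoTrainProject` follows the "when in doubt" guidance given for that flag.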

class autotrain.trainers.text_regression.params.TextRegressionParams

( data_path: str = None model: str = 'bert-base-uncased' lr: float = 5e-05 epochs: int = 3 max_seq_length: int = 128 batch_size: int = 8 warmup_ratio: float = 0.1 gradient_accumulation: int = 1 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None text_column: str = 'text' target_column: str = 'target' logging_steps: int = -1 project_name: str = 'project-name' auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None save_total_limit: int = 1 token: typing.Optional[str] = None push_to_hub: bool = False eval_strategy: str = 'epoch' username: typing.Optional[str] = None log: str = 'none' early_stopping_patience: int = 5 early_stopping_threshold: float = 0.01 )

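Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the pretrained model to use. Defaults to "bert-base-uncased".
lr (float) — Learning rate for the optimizer. Defaults to 5e-5.
epochs (int) — Number of training epochs. Defaults to 3.
max_seq_length (int) — Maximum sequence length of the input. Defaults to 128.
batch_size (int) — Batch size for training. Defaults to 8.
warmup_ratio (float) — Proportion of training used for learning rate warmup. Defaults to 0.1.
gradient_accumulation (int) — Number of steps to accumulate gradients before updating. Defaults to 1.
optimizer (str) — Optimizer to use. Defaults to "adamw_torch".
scheduler (str) — Learning rate scheduler to use. Defaults to "linear".
weight_decay (float) — Weight decay to apply. Defaults to 0.0.
max_grad_norm (float) — Maximum norm for the gradients. Defaults to 1.0.
seed (int) — Random seed for reproducibility. Defaults to 42.
train_split (str) — Name of the training data split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation data split. Defaults to None.
text_column (str) — Name of the column containing the text data. Defaults to "text".
target_column (str) — Name of the column containing the target data. Defaults to "target".
logging_steps (int) — Number of steps between logging events. Defaults to -1 (no logging).
project_name (str) — Name of the project for the output directory. Defaults to "project-name".
auto_find_batch_size (bool) — Whether to automatically find the batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None). Defaults to None.
save_total_limit (int) — Maximum number of checkpoints to save. Defaults to 1.
token (Optional[str]) — Token for accessing the Hugging Face Hub. Defaults to None.
push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Defaults to False.
eval_strategy (str) — Evaluation strategy to use. Defaults to "epoch".
username (Optional[str]) — Hugging Face username. Defaults to None.
log (str) — Logging method for experiment tracking. Defaults to "none".
early_stopping_patience (int) — Number of epochs with no improvement after which training is stopped. Defaults to 5.
early_stopping_threshold (float) — Threshold a new best value must exceed to count as an improvement. Defaults to 0.01.

TextRegressionParams is a configuration class for setting up text regression training parameters.

Image Tasks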

class autotrain.trainers.image_classification.params.ImageClassificationParams

( data_path: str = None model: str = 'google/vit-base-patch16-224' username: typing.Optional[str] = None lr: float = 5e-05 epochs: int = 3 batch_size: int = 8 warmup_ratio: float = 0.1 gradient_accumulation: int = 1 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None logging_steps: int = -1 project_name: str = 'project-name' auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None save_total_limit: int = 1 token: typing.Optional[str] = None push_to_hub: bool = False eval_strategy: str = 'epoch' image_column: str = 'image' target_column: str = 'target' log: str = 'none' early_stopping_patience: int = 5 early_stopping_threshold: float = 0.01 )

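Parameters

data_path (str) — Path to the dataset.
model (str) — Name or path of the pretrained model. Defaults to "google/vit-base-patch16-224".
username (Optional[str]) — Hugging Face account username.
lr (float) — Learning rate for the optimizer. Defaults to 5e-5.
epochs (int) — Number of training epochs. Defaults to 3.
batch_size (int) — Batch size for training. Defaults to 8.
warmup_ratio (float) — Warmup proportion for the learning rate scheduler. Defaults to 0.1.
gradient_accumulation (int) — Number of gradient accumulation steps. Defaults to 1.
optimizer (str) — Type of optimizer. Defaults to "adamw_torch".
scheduler (str) — Type of learning rate scheduler. Defaults to "linear".
weight_decay (float) — Weight decay for the optimizer. Defaults to 0.0.
max_grad_norm (float) — Maximum gradient norm for clipping. Defaults to 1.0.
seed (int) — Random seed for reproducibility. Defaults to 42.
train_split (str) — Name of the training data split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation data split.
logging_steps (int) — Number of steps between logging events. Defaults to -1.
project_name (str) — Name of the project for the output directory. Defaults to "project-name".
auto_find_batch_size (bool) — Whether to automatically find the optimal batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None).
save_total_limit (int) — Maximum number of checkpoints to keep. Defaults to 1.
token (Optional[str]) — Hugging Face Hub token for authentication.
push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Defaults to False.
eval_strategy (str) — Evaluation strategy during training. Defaults to "epoch".
image_column (str) — Column name for the images in the dataset. Defaults to "image".
target_column (str) — Column name for the target labels in the dataset. Defaults to "target".
log (str) — Logging method for experiment tracking. Defaults to "none".
early_stopping_patience (int) — Number of epochs with no improvement before early stopping. Defaults to 5.
early_stopping_threshold (float) — Threshold for early stopping. Defaults to 0.01.

ImageClassificationParams is a configuration class for image classification training parameters.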

class autotrain.trainers.image_regression.params.ImageRegressionParams

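Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the model to use. Defaults to "google/vit-base-patch16-224".
username (Optional[str]) — Hugging Face username.
lr (float) — Learning rate. Defaults to 5e-5.
epochs (int) — Number of training epochs. Defaults to 3.
batch_size (int) — Training batch size. Defaults to 8.
warmup_ratio (float) — Warmup proportion. Defaults to 0.1.
gradient_accumulation (int) — Number of gradient accumulation steps. Defaults to 1.
optimizer (str) — Optimizer to use. Defaults to "adamw_torch".
scheduler (str) — Scheduler to use. Defaults to "linear".
weight_decay (float) — Weight decay. Defaults to 0.0.
max_grad_norm (float) — Maximum gradient norm. Defaults to 1.0.
seed (int) — Random seed. Defaults to 42.
train_split (str) — Name of the training split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation split.
logging_steps (int) — Number of steps between logging events. Defaults to -1.
project_name (str) — Name of the output directory. Defaults to "project-name".
auto_find_batch_size (bool) — Whether to automatically find the batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision type (fp16, bf16, or None).
save_total_limit (int) — Limit on the total number of saved checkpoints. Defaults to 1.
token (Optional[str]) — Hub token.
push_to_hub (bool) — Whether to push to the Hub. Defaults to False.
eval_strategy (str) — Evaluation strategy. Defaults to "epoch".
image_column (str) — Name of the image column. Defaults to "image".
target_column (str) — Name of the target column. Defaults to "target".
log (str) — Logging method for experiment tracking. Defaults to "none".
early_stopping_patience (int) — Patience for early stopping. Defaults to 5.
early_stopping_threshold (float) — Threshold for early stopping. Defaults to 0.01.

ImageRegressionParams is a configuration class for image regression training parameters.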

class autotrain.trainers.object_detection.params.ObjectDetectionParams

( data_path: str = None model: str = 'google/vit-base-patch16-224' username: typing.Optional[str] = None lr: float = 5e-05 epochs: int = 3 batch_size: int = 8 warmup_ratio: float = 0.1 gradient_accumulation: int = 1 optimizer: str = 'adamw_torch' scheduler: str = 'linear' weight_decay: float = 0.0 max_grad_norm: float = 1.0 seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None logging_steps: int = -1 project_name: str = 'project-name' auto_find_batch_size: bool = False mixed_precision: typing.Optional[str] = None save_total_limit: int = 1 token: typing.Optional[str] = None push_to_hub: bool = False eval_strategy: str = 'epoch' image_column: str = 'image' objects_column: str = 'objects' log: str = 'none' image_square_size: typing.Optional[int] = 600 early_stopping_patience: int = 5 early_stopping_threshold: float = 0.01 )

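Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the model to use. Defaults to "google/vit-base-patch16-224".
username (Optional[str]) — Hugging Face username.
lr (float) — Learning rate. Defaults to 5e-5.
epochs (int) — Number of training epochs. Defaults to 3.
batch_size (int) — Training batch size. Defaults to 8.
warmup_ratio (float) — Warmup proportion. Defaults to 0.1.
gradient_accumulation (int) — Number of gradient accumulation steps. Defaults to 1.
optimizer (str) — Optimizer to use. Defaults to "adamw_torch".
scheduler (str) — Scheduler to use. Defaults to "linear".
weight_decay (float) — Weight decay. Defaults to 0.0.
max_grad_norm (float) — Maximum gradient norm. Defaults to 1.0.
seed (int) — Random seed. Defaults to 42.
train_split (str) — Name of the training data split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation data split.
logging_steps (int) — Number of steps between logging events. Defaults to -1.
project_name (str) — Name of the project for the output directory. Defaults to "project-name".
auto_find_batch_size (bool) — Whether to automatically find the batch size. Defaults to False.
mixed_precision (Optional[str]) — Mixed precision type (fp16, bf16, or None).
save_total_limit (int) — Total number of checkpoints to save. Defaults to 1.
token (Optional[str]) — Hub token for authentication.
push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Defaults to False.
eval_strategy (str) — Evaluation strategy. Defaults to "epoch".
image_column (str) — Name of the image column in the dataset. Defaults to "image".
objects_column (str) — Name of the objects column in the dataset. Defaults to "objects".
log (str) — Logging method for experiment tracking. Defaults to "none".
image_square_size (Optional[int]) — Size the longest edge of the image is resized to, after which the image is padded to a square. Defaults to 600.
early_stopping_patience (int) — Number of epochs with no improvement after which training is stopped. Defaults to 5.
early_stopping_threshold (float) — Minimum change that qualifies as an improvement. Defaults to 0.01.

ObjectDetectionParams is a configuration class for object detection training parameters.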
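The description of `image_square_size` (resize the longest edge, then pad to a square) can be made concrete with a little arithmetic. The sketch below computes the resized dimensions and the padding needed on each axis; `resize_then_pad` is a hypothetical helper, and where the padding is placed and how rounding is done are assumptions, since this page does not say.

```python
def resize_then_pad(width: int, height: int, square_size: int = 600):
    """Scale so the longest edge equals `square_size`, keeping the aspect
    ratio, and report how many pixels of padding make the result square.

    Returns (new_width, new_height, pad_width, pad_height).
    """
    scale = square_size / max(width, height)
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height, square_size - new_width, square_size - new_height


# A 1200x800 landscape image is halved to 600x400 and needs 200 px of vertical padding.
print(resize_then_pad(1200, 800))  # (600, 400, 0, 200)
# A 300x600 portrait image already has a 600 px long edge and needs 300 px of horizontal padding.
print(resize_then_pad(300, 600))   # (300, 600, 300, 0)
```

Bounding boxes in the `objects` column would need the same scale factor applied to stay aligned with the resized image.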

Tabular Tasks

class autotrain.trainers.tabular.params.TabularParams

( data_path: str = None model: str = 'xgboost' username: typing.Optional[str] = None seed: int = 42 train_split: str = 'train' valid_split: typing.Optional[str] = None project_name: str = 'project-name' token: typing.Optional[str] = None push_to_hub: bool = False id_column: str = 'id' target_columns: typing.Union[typing.List[str], str] = ['target'] categorical_columns: typing.Optional[typing.List[str]] = None numerical_columns: typing.Optional[typing.List[str]] = None task: str = 'classification' num_trials: int = 10 time_limit: int = 600 categorical_imputer: typing.Optional[str] = None numerical_imputer: typing.Optional[str] = None numeric_scaler: typing.Optional[str] = None )

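Parameters

data_path (str) — Path to the dataset.
model (str) — Name of the model to use. Defaults to "xgboost".
username (Optional[str]) — Hugging Face username.
seed (int) — Random seed for reproducibility. Defaults to 42.
train_split (str) — Name of the training data split. Defaults to "train".
valid_split (Optional[str]) — Name of the validation data split.
project_name (str) — Name of the output directory. Defaults to "project-name".
token (Optional[str]) — Hub token for authentication.
push_to_hub (bool) — Whether to push the model to the Hub. Defaults to False.
id_column (str) — Name of the ID column. Defaults to "id".
target_columns (Union[List[str], str]) — Target column(s) in the dataset. Defaults to ["target"].
categorical_columns (Optional[List[str]]) — List of categorical columns.
numerical_columns (Optional[List[str]]) — List of numerical columns.
task (str) — Type of task (e.g., "classification"). Defaults to "classification".
num_trials (int) — Number of trials for hyperparameter optimization. Defaults to 10.
time_limit (int) — Time limit for training, in seconds. Defaults to 600.
categorical_imputer (Optional[str]) — Imputation strategy for categorical columns.
numerical_imputer (Optional[str]) — Imputation strategy for numerical columns.
numeric_scaler (Optional[str]) — Scaling strategy for numerical columns.

TabularParams is a configuration class for tabular data training parameters.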
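Because `target_columns` accepts either a single string or a list, code that builds these settings programmatically may want to normalize the value first. The sketch below shows one way to do that; `as_column_list` is a hypothetical helper, and whether the library performs a similar normalization internally is not covered on this page.

```python
from typing import List, Union


def as_column_list(target_columns: Union[List[str], str]) -> List[str]:
    """Wrap a single column name in a list; copy an existing list."""
    if isinstance(target_columns, str):
        return [target_columns]
    return list(target_columns)


# Settings that could be unpacked into TabularParams(**tabular_kwargs).
# The values shown are the documented defaults.
tabular_kwargs = {
    "model": "xgboost",
    "task": "classification",
    "id_column": "id",
    "target_columns": as_column_list("target"),
    "num_trials": 10,   # hyperparameter optimization trials
    "time_limit": 600,  # seconds
}

print(tabular_kwargs["target_columns"])           # ['target']
print(as_column_list(["price", "days_on_market"]))  # ['price', 'days_on_market']
```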
