LoRA for Neuron

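A LoRA (Low-Rank Adaptation) implementation optimized for distributed training on AWS Trainium devices. The module provides parameter-efficient fine-tuning with support for tensor parallelism and sequence parallelism.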
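
To make the idea of low-rank adaptation concrete, the following is a minimal pure-Python sketch of a LoRA forward pass, y = W x + (lora_alpha / r) * B (A x), with the base weight W frozen and only A and B trainable. The helper names (`matvec`, `lora_forward`) and the dimensions are hypothetical and chosen for illustration; this is not the optimized Neuron implementation.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, lora_alpha, r):
    """Return W x + (lora_alpha / r) * B (A x)."""
    scaling = lora_alpha / r
    base = matvec(W, x)               # frozen base projection
    update = matvec(B, matvec(A, x))  # low-rank path: down-project, then up-project
    return [b + scaling * u for b, u in zip(base, update)]


random.seed(0)
d_in, d_out, r = 16, 16, 2
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen, d_out x d_in
A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(r)]      # trainable, r x d_in
B = [[0.0] * r for _ in range(d_out)]                                   # trainable, d_out x r, zero-initialized
x = [random.gauss(0, 1) for _ in range(d_in)]

y = lora_forward(W, A, B, x, lora_alpha=4, r=r)

trainable = r * d_in + d_out * r  # parameters in A and B
full = d_in * d_out               # parameters in W
print(f"trainable adapter parameters: {trainable} of {full}")
```

Because B starts at zero, the adapted layer initially reproduces the base layer exactly, and training only has to update the two small matrices.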

PEFT model classes

NeuronPeftModel

class optimum.neuron.peft.NeuronPeftModel

( model: PreTrainedModel, peft_config: PeftConfig, adapter_name: str = 'default', autocast_adapter_dtype: bool = True, **kwargs: Any )

NeuronPeftModelForCausalLM

class optimum.neuron.peft.NeuronPeftModelForCausalLM

( model: PreTrainedModel, peft_config: PeftConfig, adapter_name: str = 'default', autocast_adapter_dtype: bool = True, **kwargs: Any )

LoRA layer implementations

Base LoRA layer

class optimum.neuron.peft.tuners.lora.layer.NeuronLoraLayer

( base_layer: Module, ephemeral_gpu_offload: bool = False, **kwargs )

Parallel linear LoRA

class optimum.neuron.peft.tuners.lora.layer.ParallelLinear

( base_layer, adapter_name: str, r: int = 0, lora_alpha: int = 1, lora_dropout: float = 0.0, fan_in_fan_out: bool = False, is_target_conv_1d_layer: bool = False, init_lora_weights: bool | str = True, use_rslora: bool = False, use_dora: bool = False, lora_bias: bool = False, **kwargs )
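
The `r`, `lora_alpha`, and `use_rslora` parameters together determine how strongly the low-rank update is scaled. The sketch below follows the convention used by PEFT (`lora_alpha / r` by default, `lora_alpha / sqrt(r)` with rank-stabilized LoRA) and assumes the Neuron layers follow the same rule. The function name `lora_scaling` is hypothetical.

```python
import math


def lora_scaling(lora_alpha, r, use_rslora=False):
    """Scaling factor applied to the low-rank update B (A x)."""
    if r <= 0:
        # The signature defaults to r=0, which is not a usable rank.
        raise ValueError("r must be a positive integer")
    if use_rslora:
        return lora_alpha / math.sqrt(r)  # rank-stabilized LoRA
    return lora_alpha / r                 # standard LoRA


for rank in (4, 16, 64):
    print(rank, lora_scaling(16, rank), lora_scaling(16, rank, use_rslora=True))
```

With standard scaling the factor shrinks linearly as the rank grows, while the rank-stabilized variant shrinks with the square root of the rank, so it decays more slowly at higher ranks.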

GQA QKV column-parallel LoRA

class optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear
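
One reason grouped-query attention can be awkward under tensor parallelism is that the number of key/value heads may be smaller than the tensor-parallel degree, so the KV heads cannot simply be divided among ranks. A common workaround is to replicate each KV head until every rank owns at least one. The sketch below computes that replication factor; the function names are hypothetical, and it is an assumption for illustration rather than a description of how this class handles the case.

```python
def kv_replication_factor(num_kv_heads, tp_size):
    """Copies of each KV head needed so that every TP rank owns at least one."""
    if tp_size <= num_kv_heads:
        if num_kv_heads % tp_size != 0:
            raise ValueError("num_kv_heads must be divisible by tp_size")
        return 1  # KV heads can be split evenly, no replication needed
    if tp_size % num_kv_heads != 0:
        raise ValueError("tp_size must be a multiple of num_kv_heads")
    return tp_size // num_kv_heads


def heads_per_rank(num_q_heads, num_kv_heads, tp_size):
    """Return (query heads, KV heads) held by each TP rank after replication."""
    factor = kv_replication_factor(num_kv_heads, tp_size)
    return num_q_heads // tp_size, (num_kv_heads * factor) // tp_size


# 32 query heads and 8 KV heads, sharded over 32 ranks:
# each KV head is copied 4 times so that every rank holds one.
print(kv_replication_factor(8, 32), heads_per_rank(32, 8, 32))
```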

Parallel embedding LoRA

class optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding

( base_layer: Module, adapter_name: str, r: int = 0, lora_alpha: int = 1, lora_dropout: float = 0.0, fan_in_fan_out: bool = False, init_lora_weights: bool | str = True, use_rslora: bool = False, use_dora: bool = False, lora_bias: bool = False, **kwargs )

LoRA model

NeuronLoraModel

class optimum.neuron.peft.tuners.NeuronLoraModel

( model, config, adapter_name, low_cpu_mem_usage: bool = False )

Utility functions

get_peft_model

optimum.neuron.peft.get_peft_model

( model: PreTrainedModel, peft_config: PeftConfig, adapter_name: str = 'default', mixed: bool = False, autocast_adapter_dtype: bool = True, revision: str | None = None, low_cpu_mem_usage: bool = False )
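
As a rough usage sketch based on the signature above, the adapter settings are passed as a PEFT config object and the function returns the wrapped model. The `LoraConfig` values and the `target_modules` names below are illustrative assumptions rather than recommendations from this page, `wrap_with_lora` is a hypothetical helper, and `model` is assumed to be one of the Optimum Neuron custom modeling classes. The imports are deferred so the helper can be defined on a machine without the Neuron stack installed.

```python
def wrap_with_lora(model):
    """Wrap `model` with a LoRA adapter (hypothetical helper, illustrative values)."""
    # Deferred imports: both packages must be installed before this is called.
    from peft import LoraConfig
    from optimum.neuron.peft import get_peft_model

    peft_config = LoraConfig(
        r=16,                                 # adapter rank (assumed value)
        lora_alpha=32,                        # scaling numerator (assumed value)
        lora_dropout=0.05,                    # dropout on the adapter path (assumed value)
        target_modules=["q_proj", "v_proj"],  # assumed names; depends on the model
        task_type="CAUSAL_LM",
    )
    # adapter_name is left at its documented default, 'default'.
    return get_peft_model(model, peft_config)
```

Calling `wrap_with_lora(model)` on a supported model would then return the PEFT-wrapped model; which modules are valid targets depends on the architecture.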

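Architecture support

The Neuron LoRA implementation supports the following parallel layer types:

ColumnParallelLinear: layers whose weights are split along the output dimension
RowParallelLinear: layers whose weights are split along the input dimension
ParallelEmbedding: embedding layers distributed across ranks
GQAQKVColumnParallelLinear: grouped-query attention projections with challenging tensor-parallel configurations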

Each layer type has a corresponding LoRA implementation that preserves the parallelization strategy while adding low-rank adaptation.
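
The difference between column-parallel and row-parallel layers, and how a low-rank adapter can follow the same split, can be sketched in pure Python. This is one common sharding scheme, shown under stated assumptions and not taken from the library's source: for a column-parallel layer, A is replicated and B is split along the output dimension; for a row-parallel layer, A is split along the input dimension and B is replicated. Weights are stored as (d_in, d_out) here for readability, and all helper names are hypothetical.

```python
from functools import reduce


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split_cols(m, n):
    w = len(m[0]) // n
    return [[row[i * w:(i + 1) * w] for row in m] for i in range(n)]


def split_rows(m, n):
    h = len(m) // n
    return [m[i * h:(i + 1) * h] for i in range(n)]


def concat_cols(parts):
    return [sum((p[i] for p in parts), []) for i in range(len(parts[0]))]


tp = 2
x = [[1, 2, 3, 4]]                                             # (batch=1, d_in=4)
W = [[1, 0, 2, 1], [0, 1, 1, 3], [2, 1, 0, 1], [1, 1, 1, 0]]   # (d_in=4, d_out=4)
A = [[1, 0], [0, 1], [1, 1], [2, 0]]                           # (d_in=4, r=2)
B = [[1, 2, 0, 1], [0, 1, 3, 1]]                               # (r=2, d_out=4)

# Unsharded reference: x W + (x A) B
reference = add(matmul(x, W), matmul(matmul(x, A), B))

# Column parallel: each rank holds a slice of W and B along the output
# dimension, A is replicated, and rank outputs are concatenated (all-gather).
col_out = concat_cols([
    add(matmul(x, W_i), matmul(matmul(x, A), B_i))
    for W_i, B_i in zip(split_cols(W, tp), split_cols(B, tp))
])

# Row parallel: each rank holds a slice of the input, W, and A along the input
# dimension; partial results are summed (all-reduce) before B is applied.
x_shards, W_shards, A_shards = split_cols(x, tp), split_rows(W, tp), split_rows(A, tp)
base = reduce(add, [matmul(x_i, W_i) for x_i, W_i in zip(x_shards, W_shards)])
down = reduce(add, [matmul(x_i, A_i) for x_i, A_i in zip(x_shards, A_shards)])
row_out = add(base, matmul(down, B))

print(reference, col_out, row_out)
```

Both sharded computations reproduce the unsharded result, which is the property a parallel LoRA layer needs to maintain.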

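Key features

Distributed training: full support for tensor parallelism and sequence parallelism
Checkpoint consolidation: automatic conversion between sharded and consolidated checkpoints
Weight transformation: seamless integration with the model weight transformation specs
Compatibility: works with all supported custom modeling architectures in Optimum Neuron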
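
Converting between sharded and consolidated checkpoints can be pictured as splitting each parameter along its partition dimension and concatenating the pieces back. The sketch below is a conceptual round trip in pure Python; the parameter names, the `PARTITION_DIM` map, and the helper functions are hypothetical and are not the library's checkpoint format or API.

```python
# Partition dimension per parameter (None means replicated on every rank).
# Hypothetical layout for a LoRA adapter on a column-parallel layer.
PARTITION_DIM = {"lora_A.weight": None, "lora_B.weight": 0}


def shard(weight, tp_size, dim):
    """Split a 2-D weight (list of rows) into tp_size equal pieces along dim."""
    if dim == 0:
        step = len(weight) // tp_size
        return [weight[i * step:(i + 1) * step] for i in range(tp_size)]
    step = len(weight[0]) // tp_size
    return [[row[i * step:(i + 1) * step] for row in weight] for i in range(tp_size)]


def consolidate(shards, dim):
    """Concatenate shards back along dim."""
    if dim == 0:
        return [row for s in shards for row in s]
    return [sum((s[i] for s in shards), []) for i in range(len(shards[0]))]


def shard_state_dict(state, tp_size):
    per_rank = [{} for _ in range(tp_size)]
    for name, weight in state.items():
        dim = PARTITION_DIM[name]
        pieces = [weight] * tp_size if dim is None else shard(weight, tp_size, dim)
        for rank, piece in enumerate(pieces):
            per_rank[rank][name] = piece
    return per_rank


def consolidate_state_dict(per_rank):
    merged = {}
    for name in per_rank[0]:
        dim = PARTITION_DIM[name]
        shards = [sd[name] for sd in per_rank]
        # Replicated parameters are identical on every rank: keep rank 0's copy.
        merged[name] = shards[0] if dim is None else consolidate(shards, dim)
    return merged


full = {
    "lora_A.weight": [[1, 2, 3, 4], [5, 6, 7, 8]],    # (r=2, d_in=4)
    "lora_B.weight": [[1, 0], [0, 1], [2, 2], [3, 1]],  # (d_out=4, r=2)
}
ranks = shard_state_dict(full, tp_size=2)
restored = consolidate_state_dict(ranks)
print(restored == full)
```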
