Transformers documentation

RoBERTa-PreLayerNorm

Overview

The RoBERTa-PreLayerNorm model was proposed in fairseq: A Fast, Extensible Toolkit for Sequence Modeling by Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, and Michael Auli. It is identical to using the --encoder-normalize-before flag in fairseq.

The abstract from the paper is the following:

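fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines. We also support fast mixed-precision training and inference on modern GPUs.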

This model was contributed by andreasmadsen. The original code can be found here.

Usage tips

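The implementation is the same as RoBERTa, except that it uses Norm and Add instead of Add and Norm: layer normalization is applied to the input of each sublayer, before the residual addition, rather than to the sum afterwards. Add and Norm refer to the Addition and LayerNormalization described in Attention Is All You Need.
This is identical to using the --encoder-normalize-before flag in fairseq.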
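
To make the difference in ordering concrete, here is a minimal NumPy sketch of the two residual patterns. It is an illustration only, not the library's code: `layer_norm` omits the learned scale and shift, and `toy_sublayer` is a hypothetical stand-in for self-attention or the feed-forward network.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance (no learned scale/shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def post_ln_block(x, sublayer):
    """Add and Norm (RoBERTa): LayerNorm(x + sublayer(x))."""
    return layer_norm(x + sublayer(x))


def pre_ln_block(x, sublayer):
    """Norm and Add (pre-LayerNorm): x + sublayer(LayerNorm(x))."""
    return x + sublayer(layer_norm(x))


rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 4, 8))      # (batch, sequence, hidden)
weight = rng.normal(size=(8, 8)) * 0.1   # toy projection weights


def toy_sublayer(h):
    # Hypothetical stand-in for self-attention or the feed-forward network.
    return h @ weight


post_out = post_ln_block(hidden, toy_sublayer)
pre_out = pre_ln_block(hidden, toy_sublayer)

print(post_out.shape, pre_out.shape)
```

In the pre-LayerNorm form the residual path carries the input through unnormalized, which is why pre-LayerNorm stacks typically apply one more LayerNorm after the last layer.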

Resources

RobertaPreLayerNormConfig

class transformers.RobertaPreLayerNormConfig
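
As a rough illustration of how the configuration class is typically used, the sketch below builds a deliberately small configuration. It assumes an installed transformers version that ships this model; the sizes are hypothetical values chosen to keep the example cheap and do not correspond to any released checkpoint.

```python
from transformers import RobertaPreLayerNormConfig

# Hypothetical, deliberately tiny sizes; not the settings of a released checkpoint.
config = RobertaPreLayerNormConfig(
    vocab_size=1000,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
    max_position_embeddings=130,
)

print(config.model_type)
print(config.num_hidden_layers, config.hidden_size)
```

Any argument left out falls back to the class default, so a bare `RobertaPreLayerNormConfig()` is also valid.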

RobertaPreLayerNormModel

class transformers.RobertaPreLayerNormModel

forward

RobertaPreLayerNormForCausalLM

class transformers.RobertaPreLayerNormForCausalLM

forward

RobertaPreLayerNormForMaskedLM

class transformers.RobertaPreLayerNormForMaskedLM

forward
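
The following sketch shows what a forward pass through the masked language modeling head might look like. It assumes PyTorch and a transformers version that ships this model, and it uses a small, randomly initialized model with hypothetical sizes, so the logits are meaningless; only the call pattern and output shape are the point.

```python
import torch
from transformers import RobertaPreLayerNormConfig, RobertaPreLayerNormForMaskedLM

# Hypothetical tiny configuration; weights are random, not pretrained.
config = RobertaPreLayerNormConfig(
    vocab_size=1000,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
    max_position_embeddings=130,
)
model = RobertaPreLayerNormForMaskedLM(config)
model.eval()

# Random token ids standing in for tokenizer output; ids start at 3 to avoid
# the default special-token ids (including the padding id).
input_ids = torch.randint(low=3, high=config.vocab_size, size=(2, 10))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)

logits = outputs.logits  # shape: (batch, sequence, vocab_size)
print(logits.shape)
```

Passing `labels` of the same shape as `input_ids` would additionally return a masked language modeling loss in `outputs.loss`.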

RobertaPreLayerNormForSequenceClassification

class transformers.RobertaPreLayerNormForSequenceClassification

forward

RobertaPreLayerNormForMultipleChoice

class transformers.RobertaPreLayerNormForMultipleChoice

forward

RobertaPreLayerNormForTokenClassification

class transformers.RobertaPreLayerNormForTokenClassification

forward

RobertaPreLayerNormForQuestionAnswering

class transformers.RobertaPreLayerNormForQuestionAnswering

forward

TFRobertaPreLayerNormModel

class transformers.TFRobertaPreLayerNormModel

call

TFRobertaPreLayerNormForCausalLM

class transformers.TFRobertaPreLayerNormForCausalLM

call

TFRobertaPreLayerNormForMaskedLM

class transformers.TFRobertaPreLayerNormForMaskedLM

call

TFRobertaPreLayerNormForSequenceClassification

class transformers.TFRobertaPreLayerNormForSequenceClassification

call

TFRobertaPreLayerNormForMultipleChoice

class transformers.TFRobertaPreLayerNormForMultipleChoice

call

TFRobertaPreLayerNormForTokenClassification

class transformers.TFRobertaPreLayerNormForTokenClassification

call

TFRobertaPreLayerNormForQuestionAnswering

class transformers.TFRobertaPreLayerNormForQuestionAnswering

call

FlaxRobertaPreLayerNormModel

class transformers.FlaxRobertaPreLayerNormModel

__call__

FlaxRobertaPreLayerNormForCausalLM

class transformers.FlaxRobertaPreLayerNormForCausalLM

__call__

FlaxRobertaPreLayerNormForMaskedLM

class transformers.FlaxRobertaPreLayerNormForMaskedLM

__call__

FlaxRobertaPreLayerNormForSequenceClassification

class transformers.FlaxRobertaPreLayerNormForSequenceClassification

__call__

FlaxRobertaPreLayerNormForMultipleChoice

class transformers.FlaxRobertaPreLayerNormForMultipleChoice

__call__

FlaxRobertaPreLayerNormForTokenClassification

class transformers.FlaxRobertaPreLayerNormForTokenClassification

__call__

FlaxRobertaPreLayerNormForQuestionAnswering

class transformers.FlaxRobertaPreLayerNormForQuestionAnswering

__call__
