LiteRT

LiteRT (previously known as TensorFlow Lite) is a high-performance runtime designed for on-device machine learning.

The Optimum library can export models to LiteRT and supports many architectures.

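The benefits of exporting to LiteRT include:

Low latency, a focus on privacy, no need for network connectivity, and reduced model size and power consumption for on-device machine learning.
Broad platform, model framework, and language support.
Hardware acceleration for GPUs and Apple Silicon.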

Export a Transformers model to LiteRT with the Optimum CLI.

Run the command below to install Optimum and the exporters module for LiteRT.

pip install optimum[exporters-tf]

Refer to the Export a model to TFLite with optimum.exporters.tflite guide for all available arguments, or list them with the command below.

optimum-cli export tflite --help

Set the --model argument to export a model from the Hub.

optimum-cli export tflite --model google-bert/bert-base-uncased --sequence_length 128 bert_tflite/

You should see logs that report progress and show where the resulting model.tflite is saved.

Validating TFLite model...
	-[✓] TFLite model output names match reference model (logits)
	- Validating TFLite Model output "logits":
		-[✓] (1, 128, 30522) matches (1, 128, 30522)
		-[x] values not close enough, max diff: 5.817413330078125e-05 (atol: 1e-05)
The TensorFlow Lite export succeeded with the warning: The maximum absolute difference between the output of the reference model and the TFLite exported model is not within the set tolerance 1e-05:
- logits: max diff = 5.817413330078125e-05.
 The exported model was saved at: bert_tflite
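
The warning in the log above reports that the largest absolute difference between the reference output and the exported output (about 5.8e-05) exceeds the tolerance (atol: 1e-05). The sketch below illustrates that kind of check in plain Python. It is not Optimum's validation code: the function names and the sample values are hypothetical, and the simple "max difference <= atol" rule is an assumption about how the comparison works.

```python
def max_abs_diff(reference, exported):
    """Return the largest element-wise absolute difference between two sequences."""
    if len(reference) != len(exported):
        raise ValueError("outputs must have the same number of elements")
    return max(abs(r - e) for r, e in zip(reference, exported))


def within_tolerance(reference, exported, atol=1e-5):
    """Assumed rule: outputs match when the max absolute difference is at most atol."""
    return max_abs_diff(reference, exported) <= atol


# Hypothetical logits, made up for illustration (not real model outputs).
reference_logits = [0.25, -1.5, 3.0]
exported_logits = [0.25, -1.5, 3.00005]

diff = max_abs_diff(reference_logits, exported_logits)
print(f"max diff: {diff:.2e}")
print("within atol=1e-5:", within_tolerance(reference_logits, exported_logits))
print("within atol=1e-4:", within_tolerance(reference_logits, exported_logits, atol=1e-4))
```

A difference of this size is small relative to typical logit magnitudes, which is consistent with the log reporting the export as succeeded with a warning rather than as failed.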

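For a local model, make sure the model weights and tokenizer files are saved in the same directory, for example local_path. Pass the directory to the --model argument and use --task to indicate the task the model can perform. If --task isn't provided, the model architecture without a task-specific head is used.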

optimum-cli export tflite --model local_path --task question-answering --sequence_length 128 bert_tflite/
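
If the export needs to run from a script rather than a shell, the same commands can be assembled in Python and passed to subprocess. The following is a minimal sketch: build_export_command and export_to_tflite are hypothetical helper names, and only the flags shown on this page (--model, --task, --sequence_length, and the output directory) are used. It assumes optimum-cli is installed and on the PATH.

```python
import subprocess


def build_export_command(model, output_dir, sequence_length=None, task=None):
    """Assemble the optimum-cli argument list for a TFLite export.

    `model` is either a Hub model ID or a local directory.
    """
    cmd = ["optimum-cli", "export", "tflite", "--model", model]
    if task is not None:
        cmd += ["--task", task]
    if sequence_length is not None:
        cmd += ["--sequence_length", str(sequence_length)]
    cmd.append(output_dir)  # the output directory is the final positional argument
    return cmd


def export_to_tflite(model, output_dir, **kwargs):
    """Run the export; raises CalledProcessError if the CLI exits with an error."""
    subprocess.run(build_export_command(model, output_dir, **kwargs), check=True)


# Print the command for the local-model case without running it.
print(" ".join(build_export_command(
    "local_path", "bert_tflite/", sequence_length=128, task="question-answering"
)))
```

Calling export_to_tflite("google-bert/bert-base-uncased", "bert_tflite/", sequence_length=128) would run the Hub export shown earlier. Passing the arguments as a list avoids shell quoting issues.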
