Safetensors documentation


Speed Comparison


Safetensors is really fast. Let's compare it against PyTorch by loading the gpt2 weights. To run the GPU benchmark, make sure your machine has a GPU, or that you have selected a GPU runtime if you are using Google Colab.

Before you begin, make sure you have all the necessary libraries installed:

pip install safetensors huggingface_hub torch

Let's start by importing all the packages that will be used:

>>> import os
>>> import datetime
>>> from huggingface_hub import hf_hub_download
>>> from safetensors.torch import load_file
>>> import torch

Download the safetensors and torch weights for gpt2:

>>> sf_filename = hf_hub_download("gpt2", filename="model.safetensors")
>>> pt_filename = hf_hub_download("gpt2", filename="pytorch_model.bin")

CPU benchmark

>>> start_st = datetime.datetime.now()
>>> weights = load_file(sf_filename, device="cpu")
>>> load_time_st = datetime.datetime.now() - start_st
>>> print(f"Loaded safetensors {load_time_st}")

>>> start_pt = datetime.datetime.now()
>>> weights = torch.load(pt_filename, map_location="cpu")
>>> load_time_pt = datetime.datetime.now() - start_pt
>>> print(f"Loaded pytorch {load_time_pt}")

>>> print(f"on CPU, safetensors is faster than pytorch by: {load_time_pt/load_time_st:.1f} X")
Loaded safetensors 0:00:00.004015
Loaded pytorch 0:00:00.307460
on CPU, safetensors is faster than pytorch by: 76.6 X

This speedup is due to the fact that this library avoids unnecessary copies by memory-mapping the file directly. It is actually possible to do the same on pure PyTorch. The currently shown speedup was obtained on:

  • OS: Ubuntu 18.04.6 LTS
  • CPU: Intel(R) Xeon(R) CPU @ 2.00GHz
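
The memory-mapping idea can be sketched in pure PyTorch/NumPy. Note this is a minimal illustration under assumptions of our own: the raw float32 file below is a stand-in payload, not the safetensors format.

```python
# Sketch: load a tensor from disk without an upfront copy, via a memory map.
import numpy as np
import torch

np.arange(16, dtype=np.float32).tofile("weights.bin")

# mode="c" maps the file copy-on-write: bytes are paged in lazily from disk,
# with no eager copy of the whole file into RAM.
mapped = np.memmap("weights.bin", dtype=np.float32, mode="c")
tensor = torch.from_numpy(mapped)  # shares memory with the mapping

print(tensor.sum())  # tensor(120.)
```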

GPU benchmark

>>> # This is required because this feature hasn't been fully verified yet, but 
>>> # it's been tested on many different environments
>>> os.environ["SAFETENSORS_FAST_GPU"] = "1"

>>> # CUDA startup out of the measurement
>>> torch.zeros((2, 2)).cuda()

>>> start_st = datetime.datetime.now()
>>> weights = load_file(sf_filename, device="cuda:0")
>>> load_time_st = datetime.datetime.now() - start_st
>>> print(f"Loaded safetensors {load_time_st}")

>>> start_pt = datetime.datetime.now()
>>> weights = torch.load(pt_filename, map_location="cuda:0")
>>> load_time_pt = datetime.datetime.now() - start_pt
>>> print(f"Loaded pytorch {load_time_pt}")

>>> print(f"on GPU, safetensors is faster than pytorch by: {load_time_pt/load_time_st:.1f} X")
Loaded safetensors 0:00:00.165206
Loaded pytorch 0:00:00.353889
on GPU, safetensors is faster than pytorch by: 2.1 X

The speedup works because this library is able to skip unnecessary CPU allocations. It is, as far as we know, not replicable in pure PyTorch. The library works by memory-mapping the file, creating empty tensors with PyTorch, and calling cudaMemcpy directly to move the tensor straight onto the GPU. The currently shown speedup was obtained on:

  • OS: Ubuntu 18.04.6 LTS
  • GPU: Tesla T4
  • Driver Version: 460.32.03
  • CUDA Version: 11.2
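
The mechanism described above can be approximated at the Python level. This is only a rough sketch: `load_direct` is a hypothetical helper of our own, and the real library issues the cudaMemcpy from native code rather than through `copy_`.

```python
# Sketch: map the file, allocate an empty tensor on the target device, and
# move the bytes over in a single copy (no intermediate CPU staging tensor
# beyond the mapping itself).
import numpy as np
import torch

def load_direct(path: str, device: str = "cuda:0") -> torch.Tensor:
    mapped = np.memmap(path, dtype=np.float32, mode="c")
    out = torch.empty(mapped.shape, dtype=torch.float32, device=device)
    out.copy_(torch.from_numpy(mapped))  # single host-to-device transfer
    return out

np.arange(8, dtype=np.float32).tofile("w.bin")
device = "cuda:0" if torch.cuda.is_available() else "cpu"
t = load_direct("w.bin", device=device)
print(t.sum().item())  # sum of 0..7 is 28.0
```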
