Using OpenCLIP at Hugging Face

OpenCLIP is an open-source implementation of OpenAI's CLIP.

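Exploring OpenCLIP on the Hub

You can find OpenCLIP models by using the filters on the left side of the models page.

Each OpenCLIP model hosted on the Hub has a model card with useful information about the model. Thanks to the OpenCLIP Hugging Face Hub integration, you can load OpenCLIP models with a few lines of code. You can also deploy these models using Inference Endpoints.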

Installation

To get started, you can follow the OpenCLIP installation guide. You can also install it with pip using the following one-line command:

$ pip install open_clip_torch

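Using existing models

All OpenCLIP models can be loaded from the Hub. Pass the Hub repository ID, prefixed with hf-hub:, as the model name: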

import open_clip

model, preprocess = open_clip.create_model_from_pretrained('hf-hub:laion/CLIP-ViT-g-14-laion2B-s12B-b42K')
tokenizer = open_clip.get_tokenizer('hf-hub:laion/CLIP-ViT-g-14-laion2B-s12B-b42K')

Once loaded, you can encode the image and the text to perform zero-shot image classification:

import torch
from PIL import Image
import requests

# Download a sample COCO image and preprocess it into a batch of one
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)
image = preprocess(image).unsqueeze(0)
# Tokenize the candidate labels
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # L2-normalize so the dot product below is a cosine similarity
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    # Scale the similarities and apply softmax to get per-label probabilities
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)

It outputs the probability of each candidate label, in the order the labels were passed to the tokenizer:

Label probs: tensor([[0.0020, 0.0034, 0.9946]])
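
To make the scoring step less opaque, here is a minimal sketch of the same computation in plain Python, without OpenCLIP or PyTorch. It L2-normalizes the vectors, takes cosine similarities, scales them by 100, applies a softmax, and picks the most likely label. The three-dimensional "embeddings" are made up for illustration, and the helper names (l2_normalize, softmax, zero_shot_probs) are hypothetical rather than part of the OpenCLIP API.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length, like x / x.norm(dim=-1, keepdim=True)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_probs(image_vec, text_vecs, scale=100.0):
    """Cosine similarity of one image against each text, scaled and softmaxed."""
    image_unit = l2_normalize(image_vec)
    logits = []
    for text_vec in text_vecs:
        text_unit = l2_normalize(text_vec)
        cosine = sum(a * b for a, b in zip(image_unit, text_unit))
        logits.append(scale * cosine)
    return softmax(logits)


# Made-up embeddings, for illustration only (real ones come from the model).
labels = ["a diagram", "a dog", "a cat"]
image_vec = [0.9, 0.1, 0.4]
text_vecs = [
    [0.1, 0.9, 0.2],  # stands in for "a diagram"
    [0.5, 0.5, 0.5],  # stands in for "a dog"
    [0.8, 0.2, 0.5],  # stands in for "a cat"
]

probs = zero_shot_probs(image_vec, text_vecs)
best = max(range(len(labels)), key=lambda i: probs[i])
print(labels[best])  # prints "a cat"
```

The same argmax applies to the real output: with text_probs from the snippet above, text_probs[0].argmax().item() gives the index of the winning label in the list passed to the tokenizer.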

If you want to load a specific OpenCLIP model, you can click Use in OpenCLIP in the model card and you will be given a working snippet!

Additional resources

OpenCLIP repository
OpenCLIP docs
OpenCLIP models in the Hub
