Diffusers documentation

Pipeline callbacks

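You can modify a pipeline's denoising loop with custom functions through the callback_on_step_end parameter. The callback function runs at the end of each step and modifies pipeline attributes and variables for the next step. This is useful for *dynamically* adjusting certain pipeline attributes or modifying tensor variables. This versatility allows for interesting use cases such as changing the prompt embeddings at each timestep, assigning different weights to the prompt embeddings, and editing the guidance scale. With callbacks, you can implement new features without modifying the underlying code!

🤗 Diffusers currently only supports callback_on_step_end, but feel free to open a feature request if you have a cool use case that requires a callback function with a different execution point!

This guide demonstrates how callbacks work through a few features you can implement with them.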
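To make the mechanics concrete, here is a minimal sketch of a denoising loop that calls a callback at the end of every step. ToyPipeline and halve_guidance_midway are hypothetical stand-ins written for this illustration, and plain numbers take the place of tensors. It is not the Diffusers implementation, only the calling convention described above.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline; not a Diffusers class."""

    def __init__(self, guidance_scale=7.5):
        self._guidance_scale = guidance_scale
        self.num_timesteps = 0

    def __call__(
        self,
        num_inference_steps,
        callback_on_step_end=None,
        callback_on_step_end_tensor_inputs=("latents",),
    ):
        self.num_timesteps = num_inference_steps
        # Plain floats stand in for the tensors a real pipeline would hold.
        state = {"latents": 0.0, "prompt_embeds": 1.0}
        scales_used = []

        for step_index in range(num_inference_steps):
            timestep = num_inference_steps - 1 - step_index  # placeholder value

            # "Denoise": record the guidance scale used for this step.
            scales_used.append(self._guidance_scale)
            state["latents"] += state["prompt_embeds"]

            if callback_on_step_end is not None:
                # Only the requested variables are handed to the callback.
                callback_kwargs = {k: state[k] for k in callback_on_step_end_tensor_inputs}
                callback_outputs = callback_on_step_end(self, step_index, timestep, callback_kwargs)
                # Whatever the callback returns replaces the loop's variables.
                state.update(callback_outputs)

        return state["latents"], scales_used


def halve_guidance_midway(pipeline, step_index, timestep, callback_kwargs):
    # Change a pipeline attribute once half of the steps have run.
    if step_index == pipeline.num_timesteps // 2:
        pipeline._guidance_scale = pipeline._guidance_scale / 2
    return callback_kwargs


toy = ToyPipeline(guidance_scale=8.0)
latents, scales_used = toy(4, callback_on_step_end=halve_guidance_midway)
print(scales_used)
```

Because the callback runs at the end of a step, the changed guidance scale is first used on the following step.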

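Official callbacks

We provide a list of callbacks that you can plug into an existing pipeline to modify the denoising loop. This is the current list of official callbacks:

SDCFGCutoffCallback: disables CFG after a certain number of steps for all SD 1.5 pipelines, including text-to-image, image-to-image, inpainting, and ControlNet.
SDXLCFGCutoffCallback: disables CFG after a certain number of steps for all SDXL pipelines, including text-to-image, image-to-image, inpainting, and ControlNet.
IPAdapterScaleCutoffCallback: disables the IP Adapter after a certain number of steps for all pipelines that support IP-Adapter.

If you want to add a new official callback, feel free to open a feature request or submit a PR.

To set up a callback, you need to specify the denoising step after which the callback takes effect. You can do so with either one of these two arguments:

cutoff_step_ratio: a float giving the cutoff as a ratio of the total number of steps.
cutoff_step_index: an integer giving the exact step index.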

import torch

from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLPipeline
from diffusers.callbacks import SDXLCFGCutoffCallback


callback = SDXLCFGCutoffCallback(cutoff_step_ratio=0.4)
# can also be used with cutoff_step_index
# callback = SDXLCFGCutoffCallback(cutoff_step_ratio=None, cutoff_step_index=10)

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=True)

prompt = "a sports car at the road, best quality, high quality, high detail, 8k resolution"

generator = torch.Generator(device="cpu").manual_seed(2628670641)

out = pipeline(
    prompt=prompt,
    negative_prompt="",
    guidance_scale=6.5,
    num_inference_steps=25,
    generator=generator,
    callback_on_step_end=callback,
)

out.images[0].save("official_callback.png")

Image: generated image of a sports car at the road, without SDXLCFGCutoffCallback

Image: generated image of a sports car at the road, with SDXLCFGCutoffCallback
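The sketch below shows one plausible way the two arguments map to a concrete step. resolve_cutoff_step is a hypothetical helper written for this guide, and both the truncation with int() and the rule that exactly one argument must be set are assumptions, not a copy of the library code.

```python
def resolve_cutoff_step(num_timesteps, cutoff_step_ratio=None, cutoff_step_index=None):
    """Hypothetical helper: turn a ratio or an explicit index into a step index."""
    if (cutoff_step_ratio is None) == (cutoff_step_index is None):
        raise ValueError("Provide exactly one of cutoff_step_ratio or cutoff_step_index.")
    if cutoff_step_index is not None:
        return cutoff_step_index
    # Assumption: the ratio is scaled by the number of steps and truncated.
    return int(num_timesteps * cutoff_step_ratio)


# With 25 inference steps, a ratio of 0.4 and an index of 10 name the same step.
print(resolve_cutoff_step(25, cutoff_step_ratio=0.4))
print(resolve_cutoff_step(25, cutoff_step_index=10))
```

Under these assumptions, the two SDXLCFGCutoffCallback variants in the example above would cut off CFG at the same step for num_inference_steps=25.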

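Dynamic classifier-free guidance

Dynamic classifier-free guidance (CFG) is a feature that lets you disable CFG after a certain number of inference steps, which can save compute with minimal cost to output quality. The callback function for this should have the following arguments:

pipeline (or the pipeline instance) provides access to important properties such as num_timesteps and guidance_scale. You can modify these properties by updating the underlying attributes. For this example, you'll disable CFG by setting pipeline._guidance_scale=0.0.
step_index and timestep tell you where you are in the denoising loop. Use step_index to turn off CFG after reaching 40% of num_timesteps.
callback_kwargs is a dict that contains the tensor variables you can modify during the denoising loop. It only includes the variables specified in the callback_on_step_end_tensor_inputs argument, which is passed to the pipeline's __call__ method. Different pipelines may use different sets of variables, so check a pipeline's _callback_tensor_inputs attribute for the list of variables you can modify. Some common variables include latents and prompt_embeds. For this function, change the batch size of prompt_embeds after setting guidance_scale=0.0 so that it works properly. With CFG enabled, prompt_embeds holds the negative and positive prompt embeddings concatenated along the batch dimension, so only the second half is kept.

Your callback function should look something like this: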

def callback_dynamic_cfg(pipeline, step_index, timestep, callback_kwargs):
    # turn off CFG once 40% of the denoising steps have run
    if step_index == int(pipeline.num_timesteps * 0.4):
        # keep only the second half of the batch (the positive prompt embeddings)
        prompt_embeds = callback_kwargs["prompt_embeds"]
        prompt_embeds = prompt_embeds.chunk(2)[-1]

        # update guidance_scale and prompt_embeds
        pipeline._guidance_scale = 0.0
        callback_kwargs["prompt_embeds"] = prompt_embeds
    return callback_kwargs

Now, you can pass the callback function to the callback_on_step_end parameter and prompt_embeds to callback_on_step_end_tensor_inputs.

import torch
from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline = pipeline.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"

generator = torch.Generator(device="cuda").manual_seed(1)
out = pipeline(
    prompt,
    generator=generator,
    callback_on_step_end=callback_dynamic_cfg,
    callback_on_step_end_tensor_inputs=['prompt_embeds']
)

out.images[0].save("out_custom_cfg.png")
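Since the variables a callback may request differ between pipelines, it can help to check the names against _callback_tensor_inputs before starting a long generation. The following is a minimal sketch: check_tensor_inputs is a hypothetical helper, and the list of names on the fake pipeline is assumed for illustration, so read the real attribute from your own pipeline.

```python
from types import SimpleNamespace


def check_tensor_inputs(pipeline, requested):
    """Hypothetical helper: fail early if a requested variable is not exposed."""
    allowed = getattr(pipeline, "_callback_tensor_inputs", [])
    unknown = [name for name in requested if name not in allowed]
    if unknown:
        raise ValueError(f"{unknown} not found in _callback_tensor_inputs: {allowed}")
    return list(requested)


# Stand-in object; the names below are illustrative, not read from a real pipeline.
fake_pipeline = SimpleNamespace(
    _callback_tensor_inputs=["latents", "prompt_embeds", "negative_prompt_embeds"]
)

tensor_inputs = check_tensor_inputs(fake_pipeline, ["prompt_embeds"])
print(tensor_inputs)
```

With a real pipeline, you would pass the returned list as callback_on_step_end_tensor_inputs.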

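Interrupt the diffusion process

The interruption callback is supported for text-to-image, image-to-image, and inpainting with StableDiffusionPipeline and StableDiffusionXLPipeline.

Stopping the diffusion process early is useful when building UIs that work with Diffusers because it lets users stop the generation process if they're unhappy with the intermediate results. You can incorporate this into your pipeline with a callback.

This callback function should take the following arguments: pipeline, i, t, and callback_kwargs (which must be returned). Set the pipeline's _interrupt attribute to True to stop the diffusion process after a certain number of steps. You are also free to implement your own custom stopping logic inside the callback.

In this example, the diffusion process is stopped after 10 steps even though num_inference_steps is set to 50.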

from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")
pipeline.enable_model_cpu_offload()
num_inference_steps = 50

def interrupt_callback(pipeline, i, t, callback_kwargs):
    stop_idx = 10
    if i == stop_idx:
        pipeline._interrupt = True

    return callback_kwargs

pipeline(
    "A photo of a cat",
    num_inference_steps=num_inference_steps,
    callback_on_step_end=interrupt_callback,
)
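For a UI, the stopping condition usually comes from outside the denoising loop, for example from a "Stop" button. One possible sketch of such custom stopping logic uses a threading.Event that another thread sets. make_interrupt_callback is a hypothetical helper, and a SimpleNamespace stands in for the pipeline so the snippet runs without loading a model.

```python
import threading
from types import SimpleNamespace


def make_interrupt_callback(stop_event):
    """Hypothetical factory: build a callback that stops once stop_event is set."""

    def interrupt_when_requested(pipeline, i, t, callback_kwargs):
        if stop_event.is_set():
            pipeline._interrupt = True
        # callback_kwargs must always be returned.
        return callback_kwargs

    return interrupt_when_requested


stop_event = threading.Event()
callback = make_interrupt_callback(stop_event)

# Stand-in for a pipeline, used only to show the attribute being flipped.
fake_pipeline = SimpleNamespace(_interrupt=False)

callback(fake_pipeline, 0, 999, {})
print(fake_pipeline._interrupt)  # nothing requested yet

stop_event.set()  # e.g. triggered from a UI thread
callback(fake_pipeline, 1, 980, {})
print(fake_pipeline._interrupt)  # the next step sees the request
```

With a real pipeline, you would pass callback as callback_on_step_end and call stop_event.set() from the UI handler.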

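IP Adapter Cutoff

IP Adapter is an image prompt adapter that can be used with diffusion models without any changes to the underlying model. You can use the IP Adapter Cutoff Callback to disable the IP Adapter after a certain number of steps. To set up the callback, you need to specify the denoising step after which the callback takes effect. You can do so with either one of these two arguments:

cutoff_step_ratio: a float giving the cutoff as a ratio of the total number of steps.
cutoff_step_index: an integer giving the exact step index.

First, download the diffusion model and load the IP Adapter for it as follows: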

from diffusers import AutoPipelineForText2Image
from diffusers.utils import load_image
import torch

pipeline = AutoPipelineForText2Image.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16).to("cuda")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin")
pipeline.set_ip_adapter_scale(0.6)

The setup for the callback should look something like this:


from diffusers import AutoPipelineForText2Image
from diffusers.callbacks import IPAdapterScaleCutoffCallback
from diffusers.utils import load_image
import torch
 

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", 
    torch_dtype=torch.float16
).to("cuda")


pipeline.load_ip_adapter(
    "h94/IP-Adapter", 
    subfolder="sdxl_models", 
    weight_name="ip-adapter_sdxl.bin"
)

pipeline.set_ip_adapter_scale(0.6)


callback = IPAdapterScaleCutoffCallback(
    cutoff_step_ratio=None, 
    cutoff_step_index=5
)

image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/ip_adapter_diner.png"
)

generator = torch.Generator(device="cuda").manual_seed(2628670641)

images = pipeline(
    prompt="a tiger sitting in a chair drinking orange juice",
    ip_adapter_image=image,
    negative_prompt="deformed, ugly, wrong proportion, low res, bad anatomy, worst quality, low quality",
    generator=generator,
    num_inference_steps=50,
    callback_on_step_end=callback,
).images

images[0].save("custom_callback_img.png")

Image: generated image of a tiger sitting in a chair drinking orange juice, without IPAdapterScaleCutoffCallback

Image: generated image of a tiger sitting in a chair drinking orange juice, with IPAdapterScaleCutoffCallback
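To see what such a cutoff amounts to, here is a handwritten callback with a similar effect, sketched under the assumption that disabling the adapter means setting its scale to 0.0 with set_ip_adapter_scale. make_ip_adapter_cutoff and FakePipeline are hypothetical names for this illustration, not the library's implementation.

```python
def make_ip_adapter_cutoff(cutoff_step_index):
    """Hypothetical factory: zero the IP Adapter scale at a given step."""

    def ip_adapter_cutoff(pipeline, step_index, timestep, callback_kwargs):
        if step_index == cutoff_step_index:
            pipeline.set_ip_adapter_scale(0.0)
        return callback_kwargs

    return ip_adapter_cutoff


class FakePipeline:
    """Stand-in that only records the scale; not a Diffusers class."""

    def __init__(self):
        self.scale = 0.6

    def set_ip_adapter_scale(self, scale):
        self.scale = scale


fake = FakePipeline()
cutoff = make_ip_adapter_cutoff(cutoff_step_index=5)

history = []
for step in range(8):
    cutoff(fake, step, None, {})
    history.append(fake.scale)  # scale in effect after this step's callback

print(history)
```

Because the callback runs at the end of a step, the zero scale first applies to the step after the cutoff index.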

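Display image after each generation step

This tip was contributed by asomoza.

Display an image after each generation step by accessing the latents after each step and converting them into an image. The latent space is compressed to 128x128, so the images are also 128x128, which is useful for a quick preview.

Use the function below to convert the SDXL latents (4 channels) to RGB tensors (3 channels), as explained in the "Explaining the SDXL latent space" blog post.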

import torch
from PIL import Image

def latents_to_rgb(latents):
    weights = (
        (60, -60, 25, -70),
        (60,  -5, 15, -50),
        (60,  10, -5, -35),
    )

    weights_tensor = torch.t(torch.tensor(weights, dtype=latents.dtype).to(latents.device))
    biases_tensor = torch.tensor((150, 140, 130), dtype=latents.dtype).to(latents.device)
    rgb_tensor = torch.einsum("...lxy,lr -> ...rxy", latents, weights_tensor) + biases_tensor.unsqueeze(-1).unsqueeze(-1)
    image_array = rgb_tensor.clamp(0, 255).byte().cpu().numpy().transpose(1, 2, 0)

    return Image.fromarray(image_array)

Create a function to decode the latents and save them as an image.

def decode_tensors(pipe, step, timestep, callback_kwargs):
    latents = callback_kwargs["latents"]

    image = latents_to_rgb(latents[0])
    image.save(f"{step}.png")

    return callback_kwargs

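Pass the decode_tensors function to the callback_on_step_end parameter to decode the tensors after each step. You also need to specify what the callback should receive in the callback_on_step_end_tensor_inputs parameter, which in this case is the latents.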

from diffusers import AutoPipelineForText2Image
import torch
from PIL import Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True
).to("cuda")

image = pipeline(
    prompt="A croissant shaped like a cute bear.",
    negative_prompt="Deformed, ugly, bad anatomy",
    callback_on_step_end=decode_tensors,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]

Step 0

Step 19

Step 29

Step 39

Step 49
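Saving a preview at every step writes one file per step. If a few previews are enough, one option is to wrap the callback so it only runs every n steps. every_n_steps is a hypothetical wrapper written for this guide, and record_step is a stand-in for decode_tensors so the snippet runs without a model.

```python
def every_n_steps(callback, n):
    """Hypothetical wrapper: run `callback` only when the step index is a multiple of n."""

    def wrapped(pipeline, step, timestep, callback_kwargs):
        if step % n == 0:
            return callback(pipeline, step, timestep, callback_kwargs)
        # Skipped steps pass the variables through unchanged.
        return callback_kwargs

    return wrapped


seen = []


def record_step(pipeline, step, timestep, callback_kwargs):
    # Stand-in for decode_tensors: just remember which steps were handled.
    seen.append(step)
    return callback_kwargs


preview = every_n_steps(record_step, 10)
for step in range(50):
    preview(None, step, None, {"latents": None})

print(seen)
```

With a real pipeline, you would pass every_n_steps(decode_tensors, 10) as callback_on_step_end.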
