Diffusers

DDIM

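Denoising Diffusion Implicit Models (DDIM) by Jiaming Song, Chenlin Meng and Stefano Ermon.

The abstract from the paper is:

Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process. We construct a class of non-Markovian diffusion processes that lead to the same training objective, but whose reverse process can be much faster to sample from. We empirically demonstrate that DDIMs can produce high quality samples 10× to 50× faster in terms of wall-clock time compared to DDPMs, allow us to trade off computation for sample quality, and can perform semantically meaningful image interpolation directly in the latent space.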
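For orientation, the sampling rule behind the speed-up can be written compactly. The block below is a sketch of that update in the paper's notation, assuming that α_t denotes the cumulative noise-schedule product and that ε_θ^{(t)} is the trained noise predictor; it is a summary, not a substitute for the derivation in the paper.

```latex
\begin{align}
x_{t-1} &= \sqrt{\alpha_{t-1}}
  \underbrace{\left(\frac{x_t - \sqrt{1-\alpha_t}\,\epsilon_\theta^{(t)}(x_t)}{\sqrt{\alpha_t}}\right)}_{\text{predicted } x_0}
  + \underbrace{\sqrt{1-\alpha_{t-1}-\sigma_t^2}\;\epsilon_\theta^{(t)}(x_t)}_{\text{direction pointing to } x_t}
  + \underbrace{\sigma_t\,\epsilon_t}_{\text{random noise}},
  \qquad \epsilon_t \sim \mathcal{N}(0, I) \\
\sigma_t(\eta) &= \eta\,\sqrt{\frac{1-\alpha_{t-1}}{1-\alpha_t}}\,\sqrt{1-\frac{\alpha_t}{\alpha_{t-1}}}
\end{align}
```

Setting η = 0 removes the random term, so the sample is fully determined by the initial noise; η = 1 gives the DDPM noise level.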

The original codebase can be found at ermongroup/ddim.

DDIMPipeline

class diffusers.DDIMPipeline

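( unet: UNet2DModel scheduler: DDIMScheduler )

Parameters

unet (UNet2DModel) — A UNet2DModel to denoise the encoded image latents.
scheduler (SchedulerMixin) — A scheduler to be used in combination with unet to denoise the encoded image. Can be one of DDPMScheduler or DDIMScheduler.

Pipeline for image generation.

This model inherits from DiffusionPipeline. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, running on a particular device, etc.).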

__call__

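( batch_size: int = 1 generator: typing.Union[torch._C.Generator, typing.List[torch._C.Generator], NoneType] = None eta: float = 0.0 num_inference_steps: int = 50 use_clipped_model_output: typing.Optional[bool] = None output_type: typing.Optional[str] = 'pil' return_dict: bool = True ) → ImagePipelineOutput or tuple

Parameters

batch_size (int, optional, defaults to 1) — The number of images to generate.
generator (torch.Generator, optional) — A torch.Generator to make generation deterministic.
eta (float, optional, defaults to 0.0) — Corresponds to parameter eta (η) from the DDIM paper. Only applies to the DDIMScheduler, and is ignored in other schedulers. A value of 0 corresponds to DDIM and 1 corresponds to DDPM.
num_inference_steps (int, optional, defaults to 50) — The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
use_clipped_model_output (bool, optional, defaults to None) — If True or False, see the documentation for DDIMScheduler.step(). If None, nothing is passed downstream to the scheduler (use None for schedulers which don't support this argument).
output_type (str, optional, defaults to "pil") — The output format of the generated image. Choose between PIL.Image or np.array.
return_dict (bool, optional, defaults to True) — Whether or not to return an ImagePipelineOutput instead of a plain tuple.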
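To make the roles of eta and num_inference_steps more concrete, here is a minimal standard-library sketch. The helper names (make_alphas_cumprod, select_timesteps, ddim_sigma), the linear beta schedule and its values, and the evenly spaced timestep selection are all assumptions made for illustration; they are not taken from the pipeline's or scheduler's source.

```python
import math


def make_alphas_cumprod(num_train_timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed values)."""
    alphas_cumprod = []
    running = 1.0
    for i in range(num_train_timesteps):
        beta = beta_start + (beta_end - beta_start) * i / (num_train_timesteps - 1)
        running *= 1.0 - beta
        alphas_cumprod.append(running)
    return alphas_cumprod


def select_timesteps(num_train_timesteps, num_inference_steps):
    """Evenly spaced subset of training timesteps, highest first (one possible spacing)."""
    stride = num_train_timesteps // num_inference_steps
    return [i * stride for i in range(num_inference_steps)][::-1]


def ddim_sigma(alpha_t, alpha_prev, eta):
    """Standard deviation of the noise injected when stepping from t to the previous timestep."""
    return (
        eta
        * math.sqrt((1.0 - alpha_prev) / (1.0 - alpha_t))
        * math.sqrt(1.0 - alpha_t / alpha_prev)
    )


alphas = make_alphas_cumprod()
timesteps = select_timesteps(1000, 50)  # 50 denoising steps instead of 1000
t, t_prev = timesteps[0], timesteps[1]

sigma_ddim = ddim_sigma(alphas[t], alphas[t_prev], eta=0.0)  # no noise injected
sigma_ddpm = ddim_sigma(alphas[t], alphas[t_prev], eta=1.0)  # DDPM-like noise level
```

In this sketch, fewer inference steps mean a larger stride between visited timesteps, which is where the speed-up comes from, while eta only scales how much fresh noise each step adds.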

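Returns

ImagePipelineOutput or tuple

If return_dict is True, ImagePipelineOutput is returned, otherwise a tuple is returned where the first element is a list with the generated images.

The call function to the pipeline for generation.

Example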

>>> from diffusers import DDIMPipeline

>>> # load model and scheduler
>>> pipe = DDIMPipeline.from_pretrained("fusing/ddim-lsun-bedroom")

>>> # run pipeline in inference (sample random noise and denoise)
>>> output = pipe(eta=0.0, num_inference_steps=50)

>>> # with the default output_type="pil", output.images is a list of PIL images
>>> image_pil = output.images[0]

>>> # save image
>>> image_pil.save("test.png")
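The example above runs the real pipeline. To illustrate roughly what such a call does internally, the following toy version runs the same kind of loop on a single scalar "pixel" using only the standard library. It is a sketch under stated assumptions, not the library's implementation: sample_toy_ddim and predict_noise are hypothetical names, the schedule values are assumed, and the UNet is replaced by an exact noise formula for a dataset in which every sample equals target.

```python
import math
import random


def sample_toy_ddim(num_inference_steps=50, eta=0.0, seed=0, target=0.5,
                    num_train_timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Toy scalar DDIM sampling loop; every name and default here is illustrative."""
    rng = random.Random(seed)  # plays the role of the `generator` argument

    # Cumulative products of (1 - beta_t) for a linear beta schedule (assumed).
    alphas, running = [], 1.0
    for i in range(num_train_timesteps):
        beta = beta_start + (beta_end - beta_start) * i / (num_train_timesteps - 1)
        running *= 1.0 - beta
        alphas.append(running)

    # Evenly spaced subset of training timesteps, highest first.
    stride = num_train_timesteps // num_inference_steps
    timesteps = [i * stride for i in range(num_inference_steps)][::-1]

    def predict_noise(x_t, a_t):
        # Hypothetical stand-in for the UNet: the exact noise when all data equals `target`.
        return (x_t - math.sqrt(a_t) * target) / math.sqrt(1.0 - a_t)

    x = rng.gauss(0.0, 1.0)  # start from pure Gaussian noise
    for t in timesteps:
        a_t = alphas[t]
        # Treat the step before timestep 0 as noise-free (cumulative product of 1).
        a_prev = alphas[t - stride] if t - stride >= 0 else 1.0
        eps = predict_noise(x, a_t)
        pred_x0 = (x - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
        sigma = (
            eta
            * math.sqrt((1.0 - a_prev) / (1.0 - a_t))
            * math.sqrt(1.0 - a_t / a_prev)
        )
        direction = math.sqrt(max(1.0 - a_prev - sigma ** 2, 0.0)) * eps
        x = math.sqrt(a_prev) * pred_x0 + direction + sigma * rng.gauss(0.0, 1.0)

    # Map from [-1, 1] to [0, 1], clamping out-of-range values.
    return min(max(x / 2.0 + 0.5, 0.0), 1.0)


pixel = sample_toy_ddim(num_inference_steps=50, eta=0.0)
```

Because the stand-in predictor is exact, the loop recovers target (0.5) regardless of the starting noise, which the final rescaling maps to 0.75. With a learned model the result depends on the initial noise, which is why fixing the generator makes generation reproducible.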

ImagePipelineOutput

class diffusers.ImagePipelineOutput

( images: typing.Union[typing.List[PIL.Image.Image], numpy.ndarray] )

Parameters

images (List[PIL.Image.Image] or np.ndarray) — List of denoised PIL images of length batch_size or NumPy array of shape (batch_size, height, width, num_channels).

Output class for image pipelines.
