Optimum documentation
Generate images with Diffusion models
Stable Diffusion
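Stable Diffusion models can also be used when running inference with OpenVINO. When Stable Diffusion models are exported to the OpenVINO format, they are decomposed into different components that are later combined during inference:
- The text encoder
- The U-NET
- The VAE encoder
- The VAE decoder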
| Task | Auto Class |
|---|---|
| text-to-image | OVStableDiffusionPipeline |
| image-to-image | OVStableDiffusionImg2ImgPipeline |
| inpaint | OVStableDiffusionInpaintPipeline |
Text-to-Image
Here is an example of how you can load an OpenVINO Stable Diffusion model and run inference using OpenVINO Runtime:
from optimum.intel import OVStableDiffusionPipeline
model_id = "echarlaix/stable-diffusion-v1-5-openvino"
pipeline = OVStableDiffusionPipeline.from_pretrained(model_id)
prompt = "sailing ship in storm by Rembrandt"
images = pipeline(prompt).images

To load a PyTorch model and convert it to OpenVINO on-the-fly, you can set `export=True`:
model_id = "runwayml/stable-diffusion-v1-5"
pipeline = OVStableDiffusionPipeline.from_pretrained(model_id, export=True)
# Don't forget to save the exported model
pipeline.save_pretrained("openvino-sd-v1-5")

To further speed up inference, the model can be statically reshaped:
# Define the shapes related to the inputs and desired outputs
batch_size, num_images, height, width = 1, 1, 512, 512
# Statically reshape the model
pipeline.reshape(batch_size=batch_size, height=height, width=width, num_images_per_prompt=num_images)
# Compile the model before the first inference
pipeline.compile()
# Run inference
images = pipeline(prompt, height=height, width=width, num_images_per_prompt=num_images).images

If you want to change any parameter, such as the output height or width, you need to statically reshape the model again.
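To make the static shapes concrete, the sketch below estimates the tensor shapes that a fixed batch size, image count, height and width imply. It assumes the usual Stable Diffusion v1.5 layout (4 latent channels, a VAE downsampling factor of 8, and a doubled U-Net batch when classifier-free guidance is active). `static_shapes` is a hypothetical helper written for illustration and is not part of optimum-intel.

```python
def static_shapes(batch_size, num_images_per_prompt, height, width,
                  latent_channels=4, vae_scale_factor=8,
                  classifier_free_guidance=True):
    """Estimate the tensor shapes implied by a static reshape (illustrative only)."""
    if height % vae_scale_factor or width % vae_scale_factor:
        raise ValueError("height and width must be multiples of the VAE scale factor")
    # Total number of images produced by one pipeline call.
    num_outputs = batch_size * num_images_per_prompt
    # Assumption: classifier-free guidance feeds the conditional and
    # unconditional inputs through the U-Net in a single doubled batch.
    unet_batch = 2 * num_outputs if classifier_free_guidance else num_outputs
    latent_hw = (height // vae_scale_factor, width // vae_scale_factor)
    return {
        "unet_sample": (unet_batch, latent_channels, *latent_hw),
        "vae_decoder_input": (num_outputs, latent_channels, *latent_hw),
        "decoded_image": (num_outputs, 3, height, width),
    }


shapes = static_shapes(batch_size=1, num_images_per_prompt=1, height=512, width=512)
print(shapes)
```

Every entry in this dictionary depends on the height, width and batch settings, which is why changing any of them requires reshaping the model again.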

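Text-to-Image with Textual Inversion
Here is an example of how you can load an OpenVINO Stable Diffusion model with pre-trained textual inversion embeddings and run inference using OpenVINO Runtime.
First, you can run the original pipeline without textual inversion: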
from optimum.intel import OVStableDiffusionPipeline
import numpy as np
model_id = "echarlaix/stable-diffusion-v1-5-openvino"
prompt = "A <cat-toy> back-pack"
# Set a random seed for better comparison
np.random.seed(42)
pipeline = OVStableDiffusionPipeline.from_pretrained(model_id, export=False, compile=False)
pipeline.compile()
image1 = pipeline(prompt, num_inference_steps=50).images[0]
image1.save("stable_diffusion_v1_5_without_textual_inversion.png")

Then, you can load the sd-concepts-library/cat-toy textual inversion embedding and run the pipeline with the same prompt again:
# Reset stable diffusion pipeline
pipeline.clear_requests()
# Load textual inversion into stable diffusion pipeline
pipeline.load_textual_inversion("sd-concepts-library/cat-toy", "<cat-toy>")
# Compile the model before the first inference
pipeline.compile()
image2 = pipeline(prompt, num_inference_steps=50).images[0]
image2.save("stable_diffusion_v1_5_with_textual_inversion.png")

The left image shows the generation result of the original Stable Diffusion v1.5; the right image shows the generation result of Stable Diffusion v1.5 with textual inversion.
![]() | ![]() |
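Conceptually, textual inversion registers a new placeholder token (here `<cat-toy>`) and attaches a learned embedding vector to it, so the prompt can refer to the concept by name. The toy sketch below illustrates that idea with plain Python lists. It is not how optimum-intel implements `load_textual_inversion` internally, and the helper name, the vocabulary and the vectors are all made up for illustration.

```python
def add_concept_token(vocab, embeddings, token, vector):
    """Register a new placeholder token and its embedding in a toy embedding table."""
    if token in vocab:
        raise ValueError(f"token {token!r} already exists")
    if embeddings and len(vector) != len(embeddings[0]):
        raise ValueError("embedding size mismatch")
    token_id = len(embeddings)  # next free row in the embedding table
    vocab[token] = token_id
    embeddings.append(list(vector))
    return token_id


# Toy vocabulary with 3-dimensional embeddings (real text encoders are far larger).
vocab = {"a": 0, "back-pack": 1}
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Hypothetical vector standing in for the downloaded cat-toy embedding.
cat_toy_id = add_concept_token(vocab, embeddings, "<cat-toy>", [0.7, 0.8, 0.9])

# Only after registration can the placeholder be mapped to an embedding row.
prompt_ids = [vocab[word] for word in ["a", "<cat-toy>", "back-pack"]]
print(prompt_ids)
```

Before the embedding is loaded, the placeholder has no dedicated row of its own, which is consistent with the two runs of the same prompt giving different images.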
Image-to-Image
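The image-to-image example in this section passes `strength=0.75`. As a rough sketch, assuming the step arithmetic used by the diffusers image-to-image pipelines and a default of 50 inference steps (both are assumptions about the underlying implementation, not statements from this page), the number of denoising steps actually run can be estimated as follows. `denoising_steps` is a hypothetical helper.

```python
def denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Estimate how many denoising steps run for a given img2img strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # Only the last `strength` fraction of the schedule is applied to the input image.
    return min(int(num_inference_steps * strength), num_inference_steps)


steps = denoising_steps(num_inference_steps=50, strength=0.75)
print(steps)
```

Under these assumptions, a higher strength adds more noise to the input image and runs more steps, so the result follows the prompt more and the input image less.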
import requests
import torch
from PIL import Image
from io import BytesIO
from optimum.intel import OVStableDiffusionImg2ImgPipeline
model_id = "runwayml/stable-diffusion-v1-5"
pipeline = OVStableDiffusionImg2ImgPipeline.from_pretrained(model_id, export=True)
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 512))
prompt = "A fantasy landscape, trending on artstation"
image = pipeline(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images[0]
image.save("fantasy_landscape.png")

Stable Diffusion XL
| Task | Auto Class |
|---|---|
| text-to-image | OVStableDiffusionXLPipeline |
| image-to-image | OVStableDiffusionXLImg2ImgPipeline |

Text-to-Image
Here is an example of how you can load an SDXL OpenVINO model from stabilityai/stable-diffusion-xl-base-1.0 and run inference using OpenVINO Runtime:
from optimum.intel import OVStableDiffusionXLPipeline
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
base = OVStableDiffusionXLPipeline.from_pretrained(model_id)
prompt = "train station by Caspar David Friedrich"
image = base(prompt).images[0]
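image.save("train_station.png")

Text-to-Image with Textual Inversion
Here is an example of how you can load an SDXL OpenVINO model from stabilityai/stable-diffusion-xl-base-1.0 with pre-trained textual inversion embeddings and run inference using OpenVINO Runtime.
First, you can run the original pipeline without textual inversion: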
from optimum.intel import OVStableDiffusionXLPipeline
import numpy as np
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
prompt = "charturnerv2, multiple views of the same character in the same outfit, a character turnaround wearing a red jacket and black shirt, best quality, intricate details."
# Set a random seed for better comparison
np.random.seed(112)
base = OVStableDiffusionXLPipeline.from_pretrained(model_id, export=False, compile=False)
base.compile()
image1 = base(prompt, num_inference_steps=50).images[0]
image1.save("sdxl_without_textual_inversion.png")

Then, you can load the charturnerv2 textual inversion embedding and run the pipeline with the same prompt again:
# Reset stable diffusion pipeline
base.clear_requests()
# Load textual inversion into stable diffusion pipeline
base.load_textual_inversion("./charturnerv2.pt", "charturnerv2")
# Compile the model before the first inference
base.compile()
image2 = base(prompt, num_inference_steps=50).images[0]
image2.save("sdxl_with_textual_inversion.png")

Image-to-Image
Here is an example of how you can load a PyTorch SDXL model, convert it to OpenVINO on-the-fly and run image-to-image inference using OpenVINO Runtime:
from optimum.intel import OVStableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image
model_id = "stabilityai/stable-diffusion-xl-refiner-1.0"
pipeline = OVStableDiffusionXLImg2ImgPipeline.from_pretrained(model_id, export=True)
url = "https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/sd_xl/castle_friedrich.png"
image = load_image(url).convert("RGB")
prompt = "medieval castle by Caspar David Friedrich"
image = pipeline(prompt, image=image).images[0]
# Don't forget to save your OpenVINO model so that you can load it without exporting it with `export=True`
pipeline.save_pretrained("openvino-sd-xl-refiner-1.0")

Refining the image output
The image can be refined with a model such as stabilityai/stable-diffusion-xl-refiner-1.0. In this case, you only need to output the latents from the base model.
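In the refiner example for this section, `.images[0]` drops the batch axis of the latents, and `image[None, :]` restores it before the latents are passed to the refiner. A minimal NumPy sketch with a zero-filled dummy array, assuming the 4 x 128 x 128 latent layout that SDXL uses for 1024 x 1024 outputs (the array is placeholder data, not real pipeline output):

```python
import numpy as np

# Dummy stand-in for a batch of latents: (batch, channels, height / 8, width / 8).
latents_batch = np.zeros((1, 4, 128, 128), dtype=np.float32)

# Taking the first element removes the batch axis.
single_latent = latents_batch[0]

# Indexing with None inserts a new leading axis, restoring the batched layout.
restored = single_latent[None, :]

print(single_latent.shape, restored.shape)
```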
from optimum.intel import OVStableDiffusionXLImg2ImgPipeline
model_id = "stabilityai/stable-diffusion-xl-refiner-1.0"
refiner = OVStableDiffusionXLImg2ImgPipeline.from_pretrained(model_id, export=True)
image = base(prompt=prompt, output_type="latent").images[0]
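image = refiner(prompt=prompt, image=image[None, :]).images[0]

Latent Consistency Models
| Task | Auto Class |
|---|---|
| text-to-image | OVLatentConsistencyModelPipeline |

Text-to-Image
Here is an example of how you can load a Latent Consistency Model (LCM) from SimianLuo/LCM_Dreamshaper_v7 and run inference using OpenVINO: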
from optimum.intel import OVLatentConsistencyModelPipeline
model_id = "SimianLuo/LCM_Dreamshaper_v7"
pipeline = OVLatentConsistencyModelPipeline.from_pretrained(model_id, export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipeline(prompt, num_inference_steps=4, guidance_scale=8.0).images
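Because the example above uses `export=True`, the model is converted on every run. One way to avoid that is to save the exported pipeline once and load it from disk afterwards. The sketch below assumes `OVLatentConsistencyModelPipeline` supports `save_pretrained` in the same way as the other pipelines on this page. `needs_export` and `load_lcm_pipeline` are hypothetical helpers, and the check for a non-empty directory is a simplification.

```python
from pathlib import Path


def needs_export(save_dir):
    """Return True when save_dir is missing or empty, i.e. nothing was saved yet."""
    path = Path(save_dir)
    return not (path.is_dir() and any(path.iterdir()))


def load_lcm_pipeline(model_id, save_dir):
    """Load a saved OpenVINO pipeline if present, otherwise export and save it."""
    # Imported lazily so needs_export can be used without optimum-intel installed.
    from optimum.intel import OVLatentConsistencyModelPipeline

    if needs_export(save_dir):
        # First run: convert the PyTorch model and keep the result on disk.
        pipeline = OVLatentConsistencyModelPipeline.from_pretrained(model_id, export=True)
        pipeline.save_pretrained(save_dir)
        return pipeline
    # Later runs: load the already exported model directly.
    return OVLatentConsistencyModelPipeline.from_pretrained(save_dir)
```

A call such as `load_lcm_pipeline("SimianLuo/LCM_Dreamshaper_v7", "openvino-lcm-dreamshaper-v7")` would then export only once; the directory name here is only an example.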


