AWS Trainium & Inferentia Documentation

Stable Diffusion

Overview

Stable Diffusion is a text-to-image *latent diffusion* model built upon the work of the original Stable Diffusion, and it was led by Robin Rombach and Katherine Crowson from Stability AI and LAION.

🤗 Optimum extends Diffusers to support inference on the second generation of Neuron devices (those powering Trainium and Inferentia 2). It aims to inherit the ease of use of Diffusers on Neuron.

Export to Neuron

To deploy the models, you will first need to compile them to TorchScript optimized for AWS Neuron. In the case of Stable Diffusion, there are four components that need to be exported to the .neuron format to boost performance:

  • Text encoder
  • U-Net
  • VAE encoder
  • VAE decoder

You can compile and export a Stable Diffusion checkpoint either via the CLI or via the NeuronStableDiffusionPipeline class.

Option 1: CLI

Here is an example of exporting the Stable Diffusion components with the Optimum CLI:

optimum-cli export neuron --model stabilityai/stable-diffusion-2-1-base \
  --batch_size 1 \
  --height 512 `# height in pixels of generated image, eg. 512, 768` \
  --width 512 `# width in pixels of generated image, eg. 512, 768` \
  --num_images_per_prompt 1 `# number of images to generate per prompt, defaults to 1` \
  --auto_cast matmul `# cast only matrix multiplication operations` \
  --auto_cast_type bf16 `# cast operations from FP32 to BF16` \
  sd_neuron/

We recommend an inf2.8xlarge or larger instance for model compilation. You can also compile the model on a CPU-only instance with the Optimum CLI (this needs roughly 35 GB of memory) and then run the pre-compiled model on inf2.xlarge to reduce costs. In that case, don't forget to disable inference validation by adding the --disable-validation argument.
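For instance, a compilation run on a CPU-only instance could reuse the command above with validation disabled (all arguments other than the added flag are unchanged):

optimum-cli export neuron --model stabilityai/stable-diffusion-2-1-base \
  --batch_size 1 \
  --height 512 \
  --width 512 \
  --num_images_per_prompt 1 \
  --auto_cast matmul \
  --auto_cast_type bf16 \
  --disable-validation \
  sd_neuron/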

Option 2: Python API

Here is an example of exporting the Stable Diffusion components with NeuronStableDiffusionPipeline:

To apply optimized computation of the U-Net attention scores, configure your environment with export NEURON_FUSE_SOFTMAX=1.
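For example, set it in your shell before running the export:

export NEURON_FUSE_SOFTMAX=1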

Besides, don't hesitate to tweak the compilation configuration to find the best tradeoff between performance and accuracy for your use case. By default, we suggest casting FP32 matrix multiplication operations to BF16, which offers good performance with a moderate sacrifice in accuracy. Check out the guide in the AWS Neuron documentation to better understand the available compilation options.
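As a minimal sketch, here are a few configurations you might benchmark, assuming the Python compiler_args mirror the CLI choices shown above (auto_cast of none, matmul, or all; auto_cast_type such as bf16). The full export example below uses the suggested default:

# Assumption: these values mirror the CLI's --auto_cast / --auto_cast_type choices
compiler_args_accurate = {"auto_cast": "none"}                              # keep all operations in FP32
compiler_args_balanced = {"auto_cast": "matmul", "auto_cast_type": "bf16"}  # suggested default
compiler_args_aggressive = {"auto_cast": "all", "auto_cast_type": "bf16"}   # cast all supported operations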

>>> from optimum.neuron import NeuronStableDiffusionPipeline

>>> model_id = "stable-diffusion-v1-5/stable-diffusion-v1-5"
>>> compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}
>>> input_shapes = {"batch_size": 1, "height": 512, "width": 512}

>>> stable_diffusion = NeuronStableDiffusionPipeline.from_pretrained(model_id, export=True, **compiler_args, **input_shapes)

# Save locally or upload to the HuggingFace Hub
>>> save_directory = "sd_neuron/"
>>> stable_diffusion.save_pretrained(save_directory)
>>> stable_diffusion.push_to_hub(
...     save_directory, repository_id="my-neuron-repo"
... )
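Once saved or pushed, the compiled pipeline can be reloaded without re-exporting; "my-neuron-repo" below is the hypothetical repo id used in push_to_hub above:

>>> from optimum.neuron import NeuronStableDiffusionPipeline
>>> stable_diffusion = NeuronStableDiffusionPipeline.from_pretrained("sd_neuron/")  # from the local directory
>>> stable_diffusion = NeuronStableDiffusionPipeline.from_pretrained("my-neuron-repo")  # or from the Hub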

Text-to-Image

With the NeuronStableDiffusionPipeline class, you can generate images from a text prompt on Neuron devices, with an experience similar to Diffusers.

With a pre-compiled Stable Diffusion model, you can now generate images from a prompt on Neuron:

>>> from optimum.neuron import NeuronStableDiffusionPipeline

>>> stable_diffusion = NeuronStableDiffusionPipeline.from_pretrained("sd_neuron/")
>>> prompt = "a photo of an astronaut riding a horse on mars"
>>> image = stable_diffusion(prompt).images[0]
[Generated image: a photo of an astronaut riding a horse on Mars]
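As in Diffusers, the returned images are PIL.Image objects by default, so you can persist the result directly (the filename here is arbitrary):

>>> image.save("astronaut_on_mars.png")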

Image-to-Image

With the NeuronStableDiffusionImg2ImgPipeline class, you can generate a new image conditioned on a text prompt and an initial image.

import requests
from PIL import Image
from io import BytesIO
from optimum.neuron import NeuronStableDiffusionImg2ImgPipeline

# compile & save
model_id = "nitrosocke/Ghibli-Diffusion"
input_shapes = {"batch_size": 1, "height": 512, "width": 512}
pipeline = NeuronStableDiffusionImg2ImgPipeline.from_pretrained(model_id, export=True, **input_shapes)
pipeline.save_pretrained("sd_img2img/")

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"

response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((512, 512))

prompt = "ghibli style, a fantasy landscape with snowcapped mountains, trees, lake with detailed reflection. sunlight and cloud in the sky, warm colors, 8K"

image = pipeline(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images[0]
image.save("fantasy_landscape.png")
image | prompt | output
[landscape photo] | ghibli style, a fantasy landscape with snowcapped mountains, trees, lake with detailed reflection. sunlight and cloud in the sky, warm colors, 8K | [generated drawing]

Inpaint

With the NeuronStableDiffusionInpaintPipeline class, you can edit specific parts of an image by providing a mask and a text prompt.

import requests
from PIL import Image
from io import BytesIO
from optimum.neuron import NeuronStableDiffusionInpaintPipeline

# compile & save
model_id = "stable-diffusion-v1-5/stable-diffusion-inpainting"
input_shapes = {"batch_size": 1, "height": 512, "width": 512}
pipeline = NeuronStableDiffusionInpaintPipeline.from_pretrained(model_id, export=True, **input_shapes)
pipeline.save_pretrained("sd_inpaint/")

def download_image(url):
    response = requests.get(url)
    return Image.open(BytesIO(response.content)).convert("RGB")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipeline(prompt=prompt, image=init_image, mask_image=mask_image).images[0]
image.save("cat_on_bench.png")
image | mask_image | prompt | output
[input image] | [mask image] | Face of a yellow cat, high resolution, sitting on a park bench | [generated image]

NeuronStableDiffusionPipeline

class optimum.neuron.NeuronStableDiffusionPipeline

(
    config: dict[str, typing.Any]
    configs: dict[str, 'PretrainedConfig']
    neuron_configs: dict[str, 'NeuronDefaultConfig']
    data_parallel_mode: typing.Literal['none', 'unet', 'transformer', 'all']
    scheduler: diffusers.schedulers.scheduling_utils.SchedulerMixin | None
    vae_decoder: torch.jit._script.ScriptModule | NeuronModelVaeDecoder
    text_encoder: torch.jit._script.ScriptModule | NeuronModelTextEncoder | None = None
    text_encoder_2: torch.jit._script.ScriptModule | NeuronModelTextEncoder | None = None
    unet: torch.jit._script.ScriptModule | NeuronModelUnet | None = None
    transformer: torch.jit._script.ScriptModule | NeuronModelTransformer | None = None
    vae_encoder: torch.jit._script.ScriptModule | NeuronModelVaeEncoder | None = None
    image_encoder: torch.jit._script.ScriptModule | None = None
    safety_checker: torch.jit._script.ScriptModule | None = None
    tokenizer: transformers.models.clip.tokenization_clip.CLIPTokenizer | transformers.models.t5.tokenization_t5.T5Tokenizer | None = None
    tokenizer_2: transformers.models.clip.tokenization_clip.CLIPTokenizer | None = None
    feature_extractor: transformers.models.clip.feature_extraction_clip.CLIPFeatureExtractor | None = None
    controlnet: torch.jit._script.ScriptModule | list[torch.jit._script.ScriptModule] | NeuronControlNetModel | NeuronMultiControlNetModel | None = None
    requires_aesthetics_score: bool = False
    force_zeros_for_empty_prompt: bool = True
    add_watermarker: bool | None = None
    model_save_dir: str | pathlib.Path | tempfile.TemporaryDirectory | None = None
    model_and_config_save_paths: dict[str, tuple[str, pathlib.Path]] | None = None
)

__call__

( *args **kwargs )

NeuronStableDiffusionImg2ImgPipeline

class optimum.neuron.NeuronStableDiffusionImg2ImgPipeline

(
    config: dict[str, typing.Any]
    configs: dict[str, 'PretrainedConfig']
    neuron_configs: dict[str, 'NeuronDefaultConfig']
    data_parallel_mode: typing.Literal['none', 'unet', 'transformer', 'all']
    scheduler: diffusers.schedulers.scheduling_utils.SchedulerMixin | None
    vae_decoder: torch.jit._script.ScriptModule | NeuronModelVaeDecoder
    text_encoder: torch.jit._script.ScriptModule | NeuronModelTextEncoder | None = None
    text_encoder_2: torch.jit._script.ScriptModule | NeuronModelTextEncoder | None = None
    unet: torch.jit._script.ScriptModule | NeuronModelUnet | None = None
    transformer: torch.jit._script.ScriptModule | NeuronModelTransformer | None = None
    vae_encoder: torch.jit._script.ScriptModule | NeuronModelVaeEncoder | None = None
    image_encoder: torch.jit._script.ScriptModule | None = None
    safety_checker: torch.jit._script.ScriptModule | None = None
    tokenizer: transformers.models.clip.tokenization_clip.CLIPTokenizer | transformers.models.t5.tokenization_t5.T5Tokenizer | None = None
    tokenizer_2: transformers.models.clip.tokenization_clip.CLIPTokenizer | None = None
    feature_extractor: transformers.models.clip.feature_extraction_clip.CLIPFeatureExtractor | None = None
    controlnet: torch.jit._script.ScriptModule | list[torch.jit._script.ScriptModule] | NeuronControlNetModel | NeuronMultiControlNetModel | None = None
    requires_aesthetics_score: bool = False
    force_zeros_for_empty_prompt: bool = True
    add_watermarker: bool | None = None
    model_save_dir: str | pathlib.Path | tempfile.TemporaryDirectory | None = None
    model_and_config_save_paths: dict[str, tuple[str, pathlib.Path]] | None = None
)

__call__

( *args **kwargs )

NeuronStableDiffusionInpaintPipeline

class optimum.neuron.NeuronStableDiffusionInpaintPipeline

(
    config: dict[str, typing.Any]
    configs: dict[str, 'PretrainedConfig']
    neuron_configs: dict[str, 'NeuronDefaultConfig']
    data_parallel_mode: typing.Literal['none', 'unet', 'transformer', 'all']
    scheduler: diffusers.schedulers.scheduling_utils.SchedulerMixin | None
    vae_decoder: torch.jit._script.ScriptModule | NeuronModelVaeDecoder
    text_encoder: torch.jit._script.ScriptModule | NeuronModelTextEncoder | None = None
    text_encoder_2: torch.jit._script.ScriptModule | NeuronModelTextEncoder | None = None
    unet: torch.jit._script.ScriptModule | NeuronModelUnet | None = None
    transformer: torch.jit._script.ScriptModule | NeuronModelTransformer | None = None
    vae_encoder: torch.jit._script.ScriptModule | NeuronModelVaeEncoder | None = None
    image_encoder: torch.jit._script.ScriptModule | None = None
    safety_checker: torch.jit._script.ScriptModule | None = None
    tokenizer: transformers.models.clip.tokenization_clip.CLIPTokenizer | transformers.models.t5.tokenization_t5.T5Tokenizer | None = None
    tokenizer_2: transformers.models.clip.tokenization_clip.CLIPTokenizer | None = None
    feature_extractor: transformers.models.clip.feature_extraction_clip.CLIPFeatureExtractor | None = None
    controlnet: torch.jit._script.ScriptModule | list[torch.jit._script.ScriptModule] | NeuronControlNetModel | NeuronMultiControlNetModel | None = None
    requires_aesthetics_score: bool = False
    force_zeros_for_empty_prompt: bool = True
    add_watermarker: bool | None = None
    model_save_dir: str | pathlib.Path | tempfile.TemporaryDirectory | None = None
    model_and_config_save_paths: dict[str, tuple[str, pathlib.Path]] | None = None
)

__call__

( *args **kwargs )
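The __call__ arguments are forwarded to the underlying pipeline and mirror those of the corresponding Diffusers pipelines. As a minimal sketch, assuming that correspondence holds for these common parameters:

>>> image = stable_diffusion(
...     prompt="a photo of an astronaut riding a horse on mars",
...     negative_prompt="low resolution, blurry",  # assumption: supported as in Diffusers
...     num_inference_steps=25,
...     guidance_scale=7.5,
... ).images[0]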

Are there any other diffusion features you would like us to support in 🤗 Optimum-neuron? Please file an issue in the Optimum-neuron Github repo or discuss with us on HuggingFace's community forum, cheers 🤗!
