
v0.29.1

huggingface/diffusers

Release date: 2024-06-21 09:50:52

Latest huggingface/diffusers release: v0.31.0 (2024-10-22 22:15:27)

SD3 ControlNet

import torch
from diffusers import StableDiffusion3ControlNetPipeline
from diffusers.models import SD3ControlNetModel
from diffusers.utils import load_image

# Load the Canny ControlNet weights in half precision.
controlnet = SD3ControlNetModel.from_pretrained("InstantX/SD3-Controlnet-Canny", torch_dtype=torch.float16)

# Attach the ControlNet to the SD3 base pipeline and move it to the GPU.
pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.to("cuda")

# Control image hosted in the ControlNet repository.
control_image = load_image("https://huggingface.co/InstantX/SD3-Controlnet-Canny/resolve/main/canny.jpg")
prompt = "A girl holding a sign that says InstantX"
# controlnet_conditioning_scale sets how strongly the control image steers generation.
image = pipe(prompt, control_image=control_image, controlnet_conditioning_scale=0.7).images[0]
image.save("sd3.png")

📜 Refer to the official docs here to learn more about it.

Thanks to @haofanwang @wangqixun from the @ResearcherXman team for contributing this pipeline!

Expanded single file support

We now support all available single-file checkpoints for SD3 in diffusers! To load the single-file checkpoint that includes the T5 text encoder:

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_single_file(
    "https://huggingface.co/stabilityai/stable-diffusion-3-medium/blob/main/sd3_medium_incl_clips_t5xxlfp8.safetensors",
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()

image = pipe("a picture of a cat holding a sign that says hello world").images[0]
image.save('sd3-single-file-t5-fp8.png')

Using Long Prompts with the T5 Text Encoder

We increased the default sequence length for the T5 Text Encoder from 77 to 256 tokens! Set max_sequence_length to accept fewer or more tokens, up to a maximum of 512. Keep in mind that longer sequences require more memory and lengthen generation time. This effect is particularly noticeable during batch inference.

prompt = "A whimsical and creative image depicting a hybrid creature that is a mix of a waffle and a hippopotamus. This imaginative creature features the distinctive, bulky body of a hippo, but with a texture and appearance resembling a golden-brown, crispy waffle. The creature might have elements like waffle squares across its skin and a syrup-like sheen. It’s set in a surreal environment that playfully combines a natural water habitat of a hippo with elements of a breakfast table setting, possibly including oversized utensils or plates in the background. The image should evoke a sense of playful absurdity and culinary fantasy."

image = pipe(
    prompt=prompt,
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=4.5,
    max_sequence_length=512,
).images[0]

Comparison images (not reproduced here), left to right: Before | max_sequence_length=256 | max_sequence_length=512
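
Since longer sequences cost more, it can help to pick max_sequence_length from the actual prompt length rather than always passing 512. The following is a minimal sketch, not part of diffusers: pick_max_sequence_length is a hypothetical helper, and the 256 default and 512 limit are taken from the description above. The token count is assumed to come from the pipeline's T5 tokenizer (assumed here to be exposed as pipe.tokenizer_3).

```python
def pick_max_sequence_length(num_tokens: int, default: int = 256, limit: int = 512) -> int:
    """Hypothetical helper: choose a T5 sequence length for a prompt.

    Keeps the default (256) when the prompt fits, grows to the prompt's
    token count when it is longer, and never exceeds the limit (512).
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    if num_tokens <= default:
        return default
    # Tokens beyond the limit will be truncated by the pipeline anyway.
    return min(num_tokens, limit)


# Usage sketch (assumes `pipe` and `prompt` are defined as in the example above
# and that the T5 tokenizer is available as `pipe.tokenizer_3`):
#
#   num_tokens = len(pipe.tokenizer_3(prompt).input_ids)
#   image = pipe(
#       prompt=prompt,
#       max_sequence_length=pick_max_sequence_length(num_tokens),
#   ).images[0]
```

Short prompts keep the default of 256, so only prompts that actually need the extra room pay for a longer sequence.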

All commits

Significant community contributions

The following contributors have made significant changes to the library over the last release:
