
nateraw/stable-diffusion-videos

Forks: 422, Stars: 4423 (updated 2024-10-18 03:15:30)

License: Apache-2.0

Language: Python

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

Latest release: v0.9.0 (2024-05-07 11:19:11)

stable-diffusion-videos

Try it yourself in Colab.

Example - morphing between "blueberry spaghetti" and "strawberry spaghetti"

https://user-images.githubusercontent.com/32437151/188721341-6f28abf9-699b-46b0-a72e-fa2a624ba0bb.mp4

Installation

pip install stable_diffusion_videos

Usage

Check out the examples folder for example scripts 👀

Making Videos

Note: For Apple M1 architecture, use torch.float32 instead, as torch.float16 is not available on MPS.
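The note above can be turned into a small selection helper. This is only a sketch: `pick_device_and_dtype` is a hypothetical name, not part of `stable_diffusion_videos`, and the CPU fallback to float32 is an assumption rather than something this README states.

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool, mps_available: bool) -> Tuple[str, str]:
    """Return (device, dtype_name) for loading the pipeline (hypothetical helper)."""
    if cuda_available:
        # CUDA GPUs: half precision, as in the examples in this README.
        return "cuda", "float16"
    if mps_available:
        # Apple Silicon (MPS): float32, per the note above.
        return "mps", "float32"
    # Assumption: fall back to float32 on CPU.
    return "cpu", "float32"


device, dtype_name = pick_device_and_dtype(cuda_available=False, mps_available=True)
print(device, dtype_name)
```

With PyTorch installed, the two flags could come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and `getattr(torch, dtype_name)` gives the value to pass as `torch_dtype`.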

from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=3,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,   # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',        # Where images/videos will be saved
    name='animals_test',        # Subdirectory of output_dir where images/videos will be saved
    guidance_scale=8.5,         # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,     # Number of diffusion steps per image generated. 50 is good default
)
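The `height` and `width` comments above ask for multiples of 64 above 512 and multiples of 8 below it. As a rough illustration, a hypothetical helper (`snap_dimension` is not part of the library) could round an arbitrary size down to the nearest allowed value before calling `walk`. Rounding down rather than up is a choice made here, not something the README prescribes.

```python
def snap_dimension(value: int) -> int:
    """Round `value` down to a multiple of 64 if it exceeds 512, otherwise to a multiple of 8."""
    step = 64 if value > 512 else 8
    # Never return 0: the smallest result is one full step.
    return max(step, (value // step) * step)


for requested in (500, 512, 600, 768):
    print(requested, "->", snap_dimension(requested))
```

The results can be passed straight through, for example `height=snap_dimension(600)`.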

Making Music Videos

New! Music can be added to the video by providing a path to an audio file. The audio will inform the rate of interpolation so the videos move to the beat 🎶

from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

# Seconds in the song.
audio_offsets = [146, 148]  # [Start, end]
fps = 30  # Use lower values for testing (5 or 10), higher values for better quality (30 or 60)

# Convert seconds to frames
num_interpolation_steps = [(b-a) * fps for a, b in zip(audio_offsets, audio_offsets[1:])]

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=num_interpolation_steps,
    audio_filepath='audio.mp3',
    audio_start_sec=audio_offsets[0],
    fps=fps,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,   # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',        # Where images/videos will be saved
    guidance_scale=7.5,         # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,     # Number of diffusion steps per image generated. 50 is good default
)
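The seconds-to-frames conversion above generalizes to more than two timestamps. The sketch below uses made-up offsets and a third prompt purely for illustration, and assumes `walk` expects one step count per transition between consecutive prompts, as the two-prompt example suggests. `frames_per_segment` is a hypothetical helper name.

```python
from typing import List, Sequence


def frames_per_segment(offsets: Sequence[float], fps: int) -> List[int]:
    """Convert consecutive audio timestamps (seconds) into frame counts per segment."""
    return [round((end - start) * fps) for start, end in zip(offsets, offsets[1:])]


# Hypothetical values: three timestamps give two segments.
audio_offsets = [146, 148, 152]
fps = 30
prompts = ["a cat", "a dog", "a horse"]

num_interpolation_steps = frames_per_segment(audio_offsets, fps)

# One transition between each pair of neighbouring prompts.
assert len(num_interpolation_steps) == len(prompts) - 1
print(num_interpolation_steps)  # [60, 120]
```

The resulting list would be passed as `num_interpolation_steps`, with `audio_start_sec=audio_offsets[0]` as in the example above.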

Using the UI

from stable_diffusion_videos import StableDiffusionWalkPipeline, Interface
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

interface = Interface(pipeline)
interface.launch()

Credits

This work builds on a script shared by @karpathy. That script was adapted into a gist, which was then updated and extended into this repo.

Contributing

You can file issues and feature requests on the nateraw/stable-diffusion-videos GitHub issue tracker.

Enjoy 🤗

Recent releases (data updated 2024-09-21 09:05:17):

2024-05-07 11:19:11 v0.9.0

2023-01-21 06:00:59 v0.8.1

2023-01-07 04:41:56 v0.8.0

2022-12-06 02:14:05 v0.7.1

2022-12-06 01:56:06 v0.7.0

2022-10-26 04:55:56 v0.6.2

2022-10-22 11:49:14 v0.6.1

2022-10-20 11:26:54 v0.6.0

2022-10-11 09:57:17 v0.5.3

2022-10-11 02:06:20 v0.5.2

Topics:

ai-art, huggingface, huggingface-diffusers, machine-learning, stable-diffusion
