nateraw/stable-diffusion-videos
Forks: 423 Stars: 4454 (updated 2024-11-17 20:58:37)
license: Apache-2.0
Language: Python
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
Latest release: v0.9.0 (2024-05-07 11:19:11)
stable-diffusion-videos
Example - morphing between "blueberry spaghetti" and "strawberry spaghetti"
Installation
pip install stable_diffusion_videos
Usage
Check out the examples folder for example scripts 👀
Making Videos
Note: For Apple M1 architecture, use torch.float32 instead, as torch.float16 is not available on MPS.
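The note above can be expressed as a small device check. The sketch below is an illustration only, not part of the library: `pick_dtype_name` is a hypothetical helper that returns the name of a torch dtype for a device string, assuming float16 is requested only on CUDA and float32 everywhere else (including "mps", per the note).

```python
def pick_dtype_name(device: str) -> str:
    """Return the torch dtype name to request for a device string.

    Hypothetical helper: float16 only on CUDA, float32 everywhere else
    (including Apple's "mps" backend, per the note above).
    """
    return "float16" if device.startswith("cuda") else "float32"


# Usage with torch (not imported here so the helper stays dependency-free):
#   import torch
#   torch_dtype = getattr(torch, pick_dtype_name("mps"))  # torch.float32
```

The resulting dtype would then be passed as `torch_dtype` to `from_pretrained`, with the matching device string passed to `.to(...)`.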
from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=3,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',  # Where images/videos will be saved
    name='animals_test',  # Subdirectory of output_dir where images/videos will be saved
    guidance_scale=8.5,  # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,  # Number of diffusion steps per image generated. 50 is good default
)
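The `height` and `width` comments above give a rule of thumb for valid sizes. As a minimal sketch, a hypothetical helper (`is_valid_dimension`, not part of the library) could check a value against that rule before a long render is started. Treating exactly 512 as the "multiple of 8" case is an assumption; 512 satisfies both rules anyway.

```python
def is_valid_dimension(value: int) -> bool:
    """Check a height/width against the rule of thumb in the comments above.

    Above 512 the value should be a multiple of 64; at or below 512 it
    should be a multiple of 8.
    """
    if value <= 0:
        return False
    step = 64 if value > 512 else 8
    return value % step == 0


# Print which candidate sizes follow the rule of thumb.
for size in (384, 512, 576, 600):
    print(size, is_valid_dimension(size))
```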
Making Music Videos
New! Music can be added to the video by providing a path to an audio file. The audio will inform the rate of interpolation so the videos move to the beat 🎶
from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

# Seconds in the song.
audio_offsets = [146, 148]  # [Start, end]
fps = 30  # Use lower values for testing (5 or 10), higher values for better quality (30 or 60)

# Convert seconds to frames
num_interpolation_steps = [(b-a) * fps for a, b in zip(audio_offsets, audio_offsets[1:])]

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=num_interpolation_steps,
    audio_filepath='audio.mp3',
    audio_start_sec=audio_offsets[0],
    fps=fps,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',  # Where images/videos will be saved
    guidance_scale=7.5,  # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,  # Number of diffusion steps per image generated. 50 is good default
)
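The seconds-to-frames conversion above extends naturally to more than two offsets. The sketch below wraps it in a hypothetical helper (`steps_from_offsets`, not part of the library) that validates the offsets and rounds to whole frames. It assumes the pipeline expects one step count per transition between consecutive prompts, as the two-prompt example suggests.

```python
def steps_from_offsets(audio_offsets, fps):
    """Convert consecutive audio offsets (in seconds) to frame counts.

    Returns one entry per pair of neighbouring offsets, so N offsets
    yield N - 1 step counts.
    """
    if len(audio_offsets) < 2:
        raise ValueError("need at least a start and an end offset")
    steps = []
    for start, end in zip(audio_offsets, audio_offsets[1:]):
        if end <= start:
            raise ValueError("offsets must be strictly increasing")
        # Round so fractional-second offsets still give whole frame counts.
        steps.append(round((end - start) * fps))
    return steps


# Same values as the example above: a 2-second segment at 30 fps.
num_interpolation_steps = steps_from_offsets([146, 148], fps=30)
```

With three offsets, for example `[7, 9, 12]`, the helper returns two step counts, which would pair with three prompts and three seeds.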
Using the UI
from stable_diffusion_videos import StableDiffusionWalkPipeline, Interface
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

interface = Interface(pipeline)
interface.launch()
Credits
This work builds on a script shared by @karpathy. The script was first adapted into a gist, which was then updated and modified into this repo.
Contributing
You can file issues and feature requests on the repository's issue tracker.
Enjoy 🤗
Recent releases (data updated 2024-09-21 09:05:17):
2024-05-07 11:19:11 v0.9.0
2023-01-21 06:00:59 v0.8.1
2023-01-07 04:41:56 v0.8.0
2022-12-06 02:14:05 v0.7.1
2022-12-06 01:56:06 v0.7.0
2022-10-26 04:55:56 v0.6.2
2022-10-22 11:49:14 v0.6.1
2022-10-20 11:26:54 v0.6.0
2022-10-11 09:57:17 v0.5.3
2022-10-11 02:06:20 v0.5.2
Topics:
ai-art, huggingface, huggingface-diffusers, machine-learning, stable-diffusion