v1.2.0
Released: 2024-07-25 14:28:12
Latest release of PKU-YuanGroup/Open-Sora-Plan: v1.3.1 (2024-10-22 19:20:59)
v1.2.0 is here! This version uses a 3D full attention architecture instead of 2+1D. We released a true 3D video diffusion model trained on 4s 720p video.
- The architecture shifts from a 2+1D model to 3D full attention; 2+1D is no longer supported.
- Instead of joint image-video training, the image weights are trained first and used as the initialization for the video model.
- All data annotations are released; the data are filtered by aesthetics and motion.
- CausalVideoVAE performance is improved, and results are reported on the validation sets of WebVid and Panda70M.
Although the 3D attention architecture excels in spatio-temporal consistency, it is so expensive to train that it is difficult to scale up. We hope to collaborate with the open-source community to optimize the 3D DiT architecture. For further details, please refer to our report.
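As a rough illustration of the cost gap mentioned above, the sketch below counts query-key pairs for factorized (2+1D) attention versus full 3D attention over a grid of latent tokens. The function names and the grid size (8 x 16 x 16) are hypothetical and chosen only for illustration; they are not taken from the Open-Sora-Plan code or its actual latent shapes, and the count ignores heads, channels, and any efficient-attention kernels.

```python
def attention_pairs_2plus1d(frames: int, height: int, width: int) -> int:
    """Query-key pairs for factorized (2+1D) attention on a T x H x W token grid."""
    spatial_tokens = height * width
    # Spatial attention: tokens attend only within their own frame.
    spatial = frames * spatial_tokens ** 2
    # Temporal attention: each spatial location attends across frames.
    temporal = spatial_tokens * frames ** 2
    return spatial + temporal


def attention_pairs_3d(frames: int, height: int, width: int) -> int:
    """Query-key pairs for full 3D attention: every token attends to every token."""
    tokens = frames * height * width
    return tokens ** 2


# Hypothetical latent grid, not the model's real dimensions.
T, H, W = 8, 16, 16
factorized = attention_pairs_2plus1d(T, H, W)
full = attention_pairs_3d(T, H, W)
print(f"2+1D pairs: {factorized}")
print(f"3D pairs:   {full}")
print(f"ratio:      {full / factorized:.2f}x")
```

Under these assumptions the ratio simplifies to T·HW / (HW + T), so the gap widens as clips get longer or resolution increases, which is consistent with the scaling difficulty described above.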