Release date: 2024-10-16 02:03:38
Latest release of PKU-YuanGroup/Open-Sora-Plan: v1.3.1 (2024-10-22 19:20:59)
In version 1.3.0, Open-Sora-Plan introduced the following five key features:
- A more powerful and cost-efficient WFVAE. We decompose video into several sub-bands using wavelet transforms, naturally capturing information across different frequency domains, leading to more efficient and robust VAE learning.
- Prompt Refiner. A large language model designed to refine short text inputs.
- High-quality data cleaning strategy. After cleaning, the Panda-70M dataset retains only 27% of the original data.
- DiT with new sparse attention. A more cost-effective and efficient learning approach.
- Dynamic resolution and dynamic duration. This enables more efficient utilization of videos with varying lengths (treating a single frame as an image).
For further details, please refer to our report.
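
To make the sub-band idea behind WFVAE concrete, here is a minimal sketch of a one-level 3D wavelet decomposition of a video tensor. It assumes an orthonormal Haar wavelet and a single decomposition level; the release notes do not state which wavelet or how many levels WFVAE actually uses, and the helper names `haar_split` and `haar_3d` are hypothetical, not part of the Open-Sora-Plan code base.

```python
import numpy as np

def haar_split(x, axis):
    """One-level orthonormal Haar split along `axis` (length must be even)."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    low = (even + odd) / np.sqrt(2.0)    # local averages: low-frequency content
    high = (even - odd) / np.sqrt(2.0)   # local differences: high-frequency content
    return low, high

def haar_3d(video):
    """Decompose a (T, H, W) array into 8 sub-bands keyed 'LLL' ... 'HHH'.

    Each letter gives the band (L = low, H = high) along time, height
    and width, in that order. Assumption: Haar, one level only.
    """
    bands = {"": video}
    for axis in range(3):
        split = {}
        for name, arr in bands.items():
            low, high = haar_split(arr, axis)
            split[name + "L"] = low
            split[name + "H"] = high
        bands = split
    return bands

# Toy input: 8 frames of 16x16 noise standing in for a grayscale clip.
rng = np.random.default_rng(0)
video = rng.standard_normal((8, 16, 16))
bands = haar_3d(video)
for name, arr in sorted(bands.items()):
    print(name, arr.shape)
```

Each sub-band has half the length of the input along every axis, and because the transform is orthonormal, the total energy across the eight sub-bands equals that of the input. In this sketch the `LLL` band is a downscaled, smoothed version of the clip, while the other seven bands hold temporal and spatial detail.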
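⚡️⚡️⚡️ For large-scale model-parallel training, tensor parallelism (TP), sequence parallelism (SP), and more strategies are coming... A Huawei Ascend multimodal MindSpeed-MM branch will be added soon. It will use the Huawei MindSpeed-MM suite to support scaling up the parameter count of Open-Sora Plan, providing TP, SP, and other distributed training capabilities for training larger models.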