"--- title: StreamingT2V emoji: 🔥 colorFrom: purple colorTo: blue sdk: gradio sdk_version: 4.25.0 app_file: app.py pinned: false short_description: Consistent, Dynamic, and Extendable Long Video Generation --- This repository is the official implementation of StreamingT2V. **StreamingT2V: Consistent..."
## Usage

```shell
pip install gradio
git clone https://huggingface.co/spaces/PAIR/streamingt2v
```
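To run the demo locally after cloning, a minimal sketch (`app.py` is the entry point declared in the frontmatter; the presence of a `requirements.txt` in the Space is an assumption):

```shell
cd streamingt2v
pip install -r requirements.txt   # dependency file assumed to exist in the Space
python app.py                     # app_file declared in the frontmatter above
```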
# StreamingT2V
This repository is the official implementation of StreamingT2V.
**StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text**

Roberto Henschel, Levon Khachatryan, Daniil Hayrapetyan, Hayk Poghosyan, Vahram Tadevosyan, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
arXiv preprint | Video | Project page
StreamingT2V is an advanced autoregressive technique that enables the creation of long videos with rich motion dynamics and no stagnation. It ensures temporal consistency throughout the video, aligns closely with the descriptive text, and maintains high frame-level image quality. Our demonstrations include videos of up to 1200 frames, spanning 2 minutes, and the method can be extended to even longer durations. Importantly, StreamingT2V is not tied to a specific Text2Video base model, so improvements in base models can yield even higher-quality videos.
## News

* [03/21/2024] Paper StreamingT2V released!
* [04/03/2024] Code and model released!
## Setup

1. Clone this repository and enter it:
    ```shell
    git clone https://github.com/Picsart-AI-Research/StreamingT2V.git
    cd StreamingT2V/
    ```
2. Install the requirements using Python 3.10 and CUDA >= 11.6 (a command sketch follows this list).
3. (Optional) Install FFmpeg if it's missing on your system.
4. Download the weights from HF and put them into the `t2v_enhanced/checkpoints` directory.
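A command-level sketch of steps 2-4, assuming a conda environment; the requirements file location and the exact HF weights repository are assumptions, so adjust them to match the repository:

```shell
# Step 2: environment (Python 3.10; CUDA >= 11.6 assumed present on the host)
conda create -n st2v python=3.10 -y
conda activate st2v
pip install -r requirements.txt   # path to the requirements file is an assumption

# Step 3 (optional): install FFmpeg if missing (Debian/Ubuntu example)
sudo apt-get install -y ffmpeg

# Step 4: fetch the weights into t2v_enhanced/checkpoints
# (the repo id PAIR/StreamingT2V is an assumption -- use the repo linked from HF)
huggingface-cli download PAIR/StreamingT2V --local-dir t2v_enhanced/checkpoints
```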
<h2 class="text-xl font-bold mt-8 mb-4 text-gray-900 dark:text-white">Inference</h2>
<h3 class="text-lg font-semibold mt-6 mb-3 text-gray-900 dark:text-white">For Text-to-Video</h3>
To use other base models, add the `--base_model=AnimateDiff` argument. Use `python inference.py --help` for more options.

### For Image-to-Video
<h2 class="text-xl font-bold mt-8 mb-4 text-gray-900 dark:text-white">Results</h2>
Detailed results can be found in the Project page.<h2 class="text-xl font-bold mt-8 mb-4 text-gray-900 dark:text-white">License</h2>
Our code is published under the CreativeML Open RAIL-M license.
We include ModelscopeT2V, AnimateDiff, and DynamiCrafter in the demo for research purposes and to demonstrate the flexibility of the StreamingT2V framework to incorporate different T2V/I2V models. For commercial usage of these components, please refer to their original licenses.
<h2 class="text-xl font-bold mt-8 mb-4 text-gray-900 dark:text-white">BibTeX</h2>
If you use our work in your research, please cite our publication:
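A reconstruction of the citation from the title and author list above; the arXiv identifier is an assumption and should be verified against the arXiv page:

```bibtex
@article{henschel2024streamingt2v,
  title={StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text},
  author={Henschel, Roberto and Khachatryan, Levon and Hayrapetyan, Daniil and Poghosyan, Hayk and Tadevosyan, Vahram and Wang, Zhangyang and Navasardyan, Shant and Shi, Humphrey},
  journal={arXiv preprint arXiv:2403.14773},
  year={2024}
}
```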