XiangpengYang committed on
Commit cb56f2f · 1 Parent(s): 7756f3d
Files changed (2):
  1. README.md +89 -6
  2. config.json +5 -0
README.md CHANGED
@@ -82,15 +82,98 @@ To use these weights, please refer to the official [GitHub Repository](https://g
  ### Installation

  ```bash
- git clone [https://github.com/knightyxp/VideoCoF](https://github.com/knightyxp/VideoCoF)
  cd VideoCoF

- # Create environment
  conda create -n videocof python=3.10
  conda activate videocof

- # Install PyTorch (adjust for your CUDA version)
- pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url [https://download.pytorch.org/whl/cu121](https://download.pytorch.org/whl/cu121)

- # Install dependencies
- pip install -r requirements.txt
  ### Installation

  ```bash
+ git clone https://github.com/knightyxp/VideoCoF
  cd VideoCoF

+ # 1. Create and activate a conda environment
  conda create -n videocof python=3.10
  conda activate videocof

+ # 2. Install PyTorch (choose the build that matches your CUDA version)
+ # For standard GPUs (CUDA 12.1):
+ pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121

+ # For Hopper GPUs (e.g., H100/H800), which need a newer build for fast inference:
+ # pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
+
+ # 3. Install the remaining dependencies
+ pip install -r requirements.txt
+ ```
+
+ **Note on Flash Attention:**
+ We recommend **FlashAttention-3** (currently in beta) for optimal performance, especially on NVIDIA H100/H800 GPUs.
+ If you are using one of these GPUs, please follow the [official FlashAttention-3 installation guide](https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#flashattention-3-beta-release) after installing the compatible PyTorch build (e.g., PyTorch 2.8 + CUDA 12.8).
+
+ ### Download Models
+
+ * **Wan2.1-T2V-14B Pretrained Weights:**
+
+ ```bash
+ git lfs install
+ git clone https://huggingface.co/Wan-AI/Wan2.1-T2V-14B
+
+ # Or using the hf CLI:
+ # hf download Wan-AI/Wan2.1-T2V-14B --local-dir Wan2.1-T2V-14B
+ ```
+
+ * **VideoCoF Checkpoint:**
+
+ ```bash
+ git lfs install
+ git clone https://huggingface.co/XiangpengYang/VideoCoF videocof_weight
+
+ # Or using the hf CLI:
+ # hf download XiangpengYang/VideoCoF --local-dir videocof_weight
+ ```
+
+ ### Inference
+
+ ```bash
+ export CUDA_VISIBLE_DEVICES=0
+ torchrun --nproc_per_node=1 inference.py \
+     --video_path assets/two_man.mp4 \
+     --prompt "Remove the young man with short black hair wearing a black shirt on the left." \
+     --output_dir results/obj_rem \
+     --model_name Wan2.1-T2V-14B \
+     --seed 0 \
+     --num_frames 33 \
+     --source_frames 33 \
+     --reasoning_frames 4 \
+     --repeat_rope \
+     --videocof_path videocof_weight/videocof.safetensors
+ ```
+
+ For parallel inference:
+
+ ```bash
+ sh scripts/parallel_infer.sh
+ ```
+
+ ## 🙏 Acknowledgments
+
+ We thank the authors of related works and the open-source projects [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun) and [Wan](https://github.com/Wan-Video/Wan2.1) for their contributions.
+
+ ## 📜 License
+
+ This project is licensed under the [Apache License 2.0](LICENSE).
+
+ ## 📮 Contact
+
+ For any questions, please feel free to reach out to the author Xiangpeng Yang [@knightyxp](https://github.com/knightyxp) by email: knightyxp@gmail.com or Xiangpeng.Yang@student.uts.edu.au.
+
+ ## 📄 Citation
+
+ If you find this work useful for your research, please consider citing:
+
+ ```bibtex
+ @article{yang2025videocof,
+   title={Unified Video Editing with Temporal Reasoner},
+   author={Yang, Xiangpeng and Xie, Ji and Yang, Yiyuan and Huang, Yan and Xu, Min and Wu, Qiang},
+   journal={arXiv preprint arXiv:2400.00000},
+   year={2025}
+ }
+ ```
+
+ <div align="center">
+ ❤️ **If you find this project helpful, please consider giving it a like!** ❤️
+ </div>
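One detail worth noting about the inference command above: Wan2.1-based models generally expect frame counts of the form 4k + 1, because the video VAE compresses time by a factor of 4, which is consistent with the `--num_frames 33` (4·8 + 1) used here. A minimal sketch of our own (not repo code) that checks this constraint:

```python
def is_valid_frame_count(n: int) -> bool:
    """Wan2.1-style video models expect 4k + 1 frames (e.g., 33, 49, 81)."""
    return n >= 1 and (n - 1) % 4 == 0

# The value passed via --num_frames in the inference command above:
assert is_valid_frame_count(33)
# An even frame count like 32 would not satisfy the constraint:
assert not is_valid_frame_count(32)
```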
config.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "name": [
+     "VideoCoF"
+   ]
+ }
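The config.json added above is a minimal metadata stub. As a trivial illustration (our own, not repo code), it parses with Python's standard json module:

```python
import json

# The exact content of the config.json added in this commit.
raw = '{"name": ["VideoCoF"]}'
config = json.loads(raw)
print(config["name"][0])  # → VideoCoF
```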