AnimateAnything: Consistent and Controllable Animation for Video Generation

Abstract

We present AnimateAnything, a unified controllable video generation approach that enables precise and consistent video manipulation under various conditions, including camera trajectories, text prompts, and user motion annotations. Specifically, we design a multi-scale control feature fusion network that constructs a common motion representation for the different conditions by explicitly converting all control signals into frame-by-frame optical flows. These optical flows then serve as motion priors that guide the final video generation. In addition, to reduce the flickering caused by large-scale motion, we propose a frequency-based stabilization module that enhances temporal coherence by enforcing consistency in the video's frequency domain. Experiments demonstrate that our method outperforms state-of-the-art approaches. For more details and videos, please refer to the project webpage.
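The abstract describes a frequency-based stabilization module that suppresses flicker by enforcing consistency in the video's frequency domain. The paper's actual formulation is not given here, so the following is only an illustrative sketch of the general idea: treating each pixel's intensity over time as a signal, taking its temporal FFT, and penalizing high-frequency energy (which corresponds to rapid frame-to-frame flicker). The function name, the cutoff parameter, and the loss form are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def temporal_frequency_loss(video: np.ndarray, cutoff: float = 0.25) -> float:
    """Illustrative flicker penalty (hypothetical, not the paper's loss).

    video: array of shape (T, H, W) holding T grayscale frames.
    cutoff: normalized temporal frequency above which energy is penalized.
    """
    num_frames = video.shape[0]
    # Per-pixel temporal spectrum: FFT along the time axis.
    spectrum = np.fft.rfft(video, axis=0)          # shape (T//2 + 1, H, W)
    freqs = np.fft.rfftfreq(num_frames)            # normalized frequencies in [0, 0.5]
    high_band = freqs > cutoff
    # Mean energy in the high-frequency temporal band; large values
    # indicate rapid frame-to-frame changes (flicker).
    return float(np.mean(np.abs(spectrum[high_band]) ** 2))
```

Under this toy definition, a static clip incurs zero penalty while an alternating (flickering) clip incurs a positive one, which is the qualitative behavior a frequency-domain consistency term would target.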

Type
Publication
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Chi Wang 王驰
PhD

My research interests include AI-generated content, semantic segmentation, image matting, and novel view synthesis.