Academic Homepage
Qi Mao
毛琪
Professor, School of Information and Communication Engineering
State Key Laboratory of Media Convergence and Communication
Communication University of China, Beijing, China
Qi Mao works on controllable image and video generation, image editing, multimedia intelligence, and image-video compression based on generative models.
Biography
Qi Mao is currently a Professor at the School of Information and Communication Engineering and the State Key Laboratory of Media Convergence and Communication, Communication University of China.
She received her Ph.D. degree from Peking University in July 2021, advised by Prof. Siwei Ma. Before that, she received a B.E. degree in Digital Media Technology and a B.A. degree in Journalism from Communication University of China in 2016.
She was also a visiting Ph.D. student at the Vision and Learning Lab, University of California, Merced, under the supervision of Prof. Ming-Hsuan Yang, and a visiting scholar at the National University of Singapore, where she collaborated with Prof. Mike Zheng Shou.
News
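- Notice [Admissions] CUC-MIPG has one remaining opening for master's students enrolling in 2026 and is now recruiting master's students for 2027 enrollment. Please feel free to get in touch.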
- April 2026 One paper was accepted to TCSVT 2026.
- March 2026 One paper was accepted to TAFFC 2026.
- March 2026 One paper was accepted to TMM 2026.
- Feb 2026 One paper was accepted to CVPR 2026.
- Nov 2025 One paper was accepted to WACV 2026.
- Oct 2025 Undergraduate recruitment remains open.
- Sep 2025 Work on ultra-low bitrate image compression enabled by multimodal large foundation models appeared in IEEE Transactions on Image Processing.
Research Interests
- Controllable image and video generation and editing
- Agentic workflows for AIGC
- Generative image and video compression
Selected Publications
Full publication list: Google Scholar
2026
- UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models. Lan Chen, Yuchao Gu, Qi Mao. WACV 2026. Paper, Code
- Generative Neural Video Compression via Video Diffusion Prior. Qi Mao, Hao Cheng, Tinghan Yang, Libiao Jin, Siwei Ma. CVPR 2026. Paper
- StarVid: Enhancing Semantic Alignment in Video Diffusion Models via Spatial and Syntactic Guided Attention Refocusing. Yuanhang Li, Qi Mao, Lan Chen, Zhen Fang, Lei Tian, Xinyan Xiao, Libiao Jin, Hua Wu. IEEE Transactions on Multimedia, 2026. Paper
- EmoAgent: A Multi-Agent Framework for Diverse Affective Image Manipulation. Qi Mao, Haobo Hu, Yujie He, Difei Gao, Haokun Chen, Libiao Jin. IEEE Transactions on Affective Computing, 2026. Paper
2025
- Exploring Multimodal Knowledge for Image Compression via Large Foundation Models. Junlong Gao, Zhimeng Huang, Qi Mao(*), Siwei Ma, Chuanmin Jia(*). IEEE Transactions on Image Processing, 2025. DOI
- Edit Transfer: Learning Image Editing via Vision In-Context Relations. Lan Chen, Qi Mao, Yuchao Gu, Mike Zheng Shou. arXiv preprint. Paper, Project, Code
- Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model. Qi Mao, Lan Chen, Yuchao Gu, Mike Zheng Shou, Ming-Hsuan Yang. arXiv preprint. Paper, Code
2024
- MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance. Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou. ACM MM 2024. Paper, Project, Code
- Extreme Image Compression using Fine-tuned VQGANs. Qi Mao, Tinghan Yang, Yinuo Zhang, Zijian Wang, Meng Wang, Shiqi Wang, Siwei Ma. DCC 2024. Paper, Code
- Unifying Generation and Compression: Ultra-low Bitrate Image Coding Via Multi-stage Transformer. Naifu Xue, Qi Mao, Zijian Wang, Yuan Zhang, Siwei Ma. ICME 2024. Paper
- Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement. Liyuan Zhu, Wenhan Yang, Bing Chen, Haoyu Zhu, Zhihui Ni, Qi Mao, Shiqi Wang. ECCV 2024.
Selected Earlier Work
- ZGaming: Zero-Latency 3D Cloud Gaming by Image Prediction. Jiaming Wu, Yibo Guan, Qi Mao, Yuchen Cui, Zhisheng Guo, Xiaodong Zhang. ACM SIGCOMM 2023.
- Scalable Face Image Coding via StyleGAN Prior: Toward Compression for Human-Machine Collaborative Vision. Qi Mao, Chongyu Wang, Meng Wang, Shiqi Wang, Ruijie Chen, Libiao Jin, Siwei Ma. IEEE Transactions on Image Processing, 2023. Paper
- Enhancing Style-Guided Image-to-Image Translation via Self-Supervised Metric Learning. Qi Mao, Siwei Ma. IEEE Transactions on Multimedia, 2023. Paper
- Semantic-Aware Visual Decomposition for Image Coding. Jianhui Chang, Jian Zhang, Jing Li, Shiqi Wang, Qi Mao, Chuanmin Jia, Siwei Ma, Wen Gao. International Journal of Computer Vision, 2023.
- Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors. Qi Mao, Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang, Siwei Ma, Ming-Hsuan Yang. International Journal of Computer Vision, 2022. Paper, Code
- Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis. Qi Mao, Hsin-Ying Lee, Hung-Yu Tseng, Siwei Ma, Ming-Hsuan Yang. CVPR 2019. Paper, Project
Honors & Awards
Awards
- 2023 年度北京图象图形学学会优秀博士论文奖 (2023 Beijing Society of Image and Graphics Outstanding Doctoral Dissertation Award), 2023.
Talent Programs
- 全国广播电视和网络视听行业青年创新人才工程 (National Radio and Television and Online Audiovisual Industry Youth Innovation Talent Project), 2024.
- 微软铸星学者计划 (Microsoft Rising Star Program), 2024.
Academic Service
- ICCV 2025 Area Chair.
- CVPR 2025 Area Chair.
- ICML 2026 Area Chair.
- ECCV 2026 Area Chair.
Selected Projects
- 国家自然科学基金面上项目 (National Natural Science Foundation of China General Program), 62471445, 基于离散特征表示与生成式模型的极限编码理论与方法研究 (Theory and methods of extreme coding based on discrete feature representations and generative models), ongoing, principal investigator.
- 国家自然科学基金青年基金项目 (National Natural Science Foundation of China Young Scientists Fund), 62201526, 基于分层特征表示的人机协同视频编码研究 (Human-machine collaborative video coding based on layered feature representations), ongoing, principal investigator.
- 国家重点研发计划 (National Key R&D Program), 2022YFF0902402, 沉浸式文旅体验技术集成与场景创新 (Immersive cultural tourism experience technology integration and scene innovation), ongoing, key member.
- 多媒体信息处理全国重点实验室开放课题 (Open Research Project of the State Key Laboratory of Multimedia Information Processing), SKLMIP-KF-2025-04, 融合时空语义控制的文本驱动扩散视频编辑方法 (Text-driven diffusion video editing with integrated spatio-temporal semantic control), ongoing, principal investigator.
- 中国传媒大学“三国”专项项目 (Communication University of China “San Guo” Special Project), CUC25SG008, 高流行度短视频特征解析及生成技术研究 (Feature analysis and generation technology for highly popular short videos), ongoing.
- 百度 NLP 学术合作 (Baidu NLP Academic Collaboration), HG23056, completed, principal investigator.
- 中国传媒大学“三国”专项项目 (Communication University of China “San Guo” Special Project), CUC24SG015, 基于情感引导的智能媒体内容可控生成 (Emotion-guided controllable generation of intelligent media content), completed.
- 媒体融合与传播国家重点实验室专项科研项目 (Special Research Project of the State Key Laboratory of Media Convergence and Communication), CUC22GZ035, 深度学习人脸生成与鉴伪方法研究、音视频鉴伪系统 (Deep-learning-based face generation and forgery detection methods, and an audio-video forgery detection system), completed, principal investigator.
- 媒体融合与传播国家重点实验室专项科研项目 (Special Research Project of the State Key Laboratory of Media Convergence and Communication), CUC23GZ007, 基于 AIGC 的对话多媒体内容生成 (AIGC-based conversational multimedia content generation), completed, principal investigator.
Openings
Prospective Ph.D., M.S., and undergraduate students are welcome to apply. Students with interests in controllable generation, image and video editing, generative compression, multimedia intelligence, computer vision, graphics, and machine learning are especially encouraged to contact us.
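[Admissions] CUC-MIPG has one remaining opening for master's students enrolling in 2026 and is now recruiting master's students for 2027 enrollment. The group welcomes master's applicants as well as undergraduates interested in research. If you are interested in artificial intelligence, generative models, or video coding and would like to join our team, please email qimao@cuc.edu.cn and attach your CV.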
- Please send your CV, a transcript where applicable, and a brief statement of research interests to cuc_mipg@163.com.
- Strong coding ability, sound mathematical preparation, and clear research motivation are highly valued.
- For additional group updates, please follow the WeChat official account: cuc-mipg.