Yibing Song

宋奕兵

Deputy Chief Engineer (AI)
BYD Group

Email: yibingsong.cv at gmail dot com

Biography


I oversee the AI systems in BYD electric vehicles. Previously, I held positions in academia (as a faculty member at Fudan University) and in industry (as a research scientist at Alibaba DAMO Academy and Tencent AI Lab). I received my PhD and MPhil degrees from City University of Hong Kong, during which I visited Adobe Research and UC Merced, and my bachelor's degree from the University of Science and Technology of China. My expertise is in computer vision and machine learning, with 60+ papers published in premier venues (e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, PAMI, IJCV) and 10k+ citations. Specifically, I am experienced in multi-modal AI from model-centric, data-centric, and human-centric perspectives, with applications centered on computer vision. I am an IEEE Senior Member and have been listed among the Top 2% Scientists worldwide by Stanford University.


Professional Activities


Area Chair / Meta Reviewer: CVPR (2025, 2024, 2023), ICCV (2025, 2023), NeurIPS (2025, 2024, 2023, 2022), ICML (2025, 2024, 2023), ICLR (2025, 2024, 2023, 2022)

Outstanding / Top Reviewer: CVPR (2020, 2019, 2018), ECCV 2022, NeurIPS 2019


Selected Publications


LLaVA-CoT: Let Vision Language Models Reason Step-by-Step
Guowei Xu, Peng Jin, Li Hao, Yibing Song, Lichao Sun, and Li Yuan
arXiv 2024
Paper / Project

Re-Aligning Language to Visual Objects with an Agentic Workflow
Yuming Chen, Jiangyan Feng, Haodong Zhang, Lijun Gong, Feng Zhu, Rui Zhao, Qibin Hou, Ming-Ming Cheng, and Yibing Song
International Conference on Learning Representations (ICLR) 2025
Paper / Project

Aligning Audio-Visual Joint Representations with an Agentic Workflow
Shentong Mo and Yibing Song
Advances in Neural Information Processing Systems (NeurIPS) 2024
Paper / Project

DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen, Peize Sun, Yibing Song, and Ping Luo
IEEE/CVF International Conference on Computer Vision (ICCV) 2023 (Best Paper Nominee)
Paper / Project

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong, Yibing Song, Jue Wang, and Limin Wang
Advances in Neural Information Processing Systems (NeurIPS) 2022 (Spotlight)
Paper / Project / Hugging Face Repo
Ranked 8th among the most influential NeurIPS 2022 papers / Ranked 39th among the most cited 2022 AI papers