Haoye Zhang

About Me

I am an AI Research & Development Engineer at Tianyi Cloud, China Telecom, specializing in multimodal large language models. My research focuses on trustworthy multimodal learning, particularly hallucination mitigation and alignment techniques such as RLHF for large vision-language models. I am currently working on agent memory systems and AI agent products.

During my academic studies, I contributed to the development of the MiniCPM-V/o series and streaming interactive multimodal models, advancing efficient and interactive multimodal systems. My research has been published at conferences including CVPR and ICLR, covering multimodal alignment, efficient model design, and open-source multimodal learning.

My long-term research goal is to build reliable, efficient and practical multimodal systems, addressing real-world challenges in industrial cloud scenarios and promoting the application of responsible multimodal artificial intelligence.

Experience

[2025 - Present] Algorithm Engineer, China Telecom
[2022 - 2025] Master of Computer Science, Tsinghua University - RLHF in Multimodal LLMs
[2017 - 2022] Bachelor of Automation (primary), Industrial Design (secondary), Tsinghua University

Contact

Email: hy_zhang1028@163.com

Location: Beijing, China

GitHub LinkedIn

Selected Publications

RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-Grained Correctional Human Feedback

Open-source Projects

MiniCPM-V/o

RLAIF-V dataset
