Shanghaoran Quan

I am a senior student in the School of Computer Science and Engineering, Beihang University. My research focuses on LLM alignment, and I am currently interning with the Qwen Team at Alibaba, advised by Bowen Yu and An Yang. I also enjoyed my previous experiences participating in informatics and mathematics competitions. I will pursue a master's degree in NLP at the Wangxuan Institute of Computer Technology, Peking University, starting in Fall 2025, advised by Dongyan Zhao and Huishuai Zhang.

Feel free to email me at quanshanghaoran@alibaba-inc.com for any form of academic cooperation!

Selected Papers
Language Models can Self-Lengthen to Generate Long Texts
Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, Junyang Lin
Preprint
[Paper]     [Code]
Self-Lengthen is a novel and effective data-driven technique for extrapolating long output, designed to stimulate long-generation ability from scratch using only the LLM's intrinsic knowledge and skills.
Qwen2.5-Coder Technical Report
Qwen Team
Technical Report
[Collection]     [Paper]     [Code]
The Qwen2.5-Coder series includes six model sizes (0.5B, 1.5B, 3B, 7B, 14B, and 32B) and achieves state-of-the-art performance across more than 10 benchmarks, demonstrating impressive code generation capabilities.
Automatically Generating Numerous Context-Driven SFT Data for LLMs across Diverse Granularity
Shanghaoran Quan
AAAI, 2025 || AFM@NeurIPS, 2024
[Paper]     [Code]
We propose a novel and effective method to iteratively split context and derive high-quality query-response pairs for domain-specific SFT data.
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
Shanghaoran Quan
ACL Findings, 2024
[Paper]     [Code]
We propose a novel and effective reward modeling method based on MoE, which decomposes the inputs into different capability points under different tasks.
Other Papers
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Bofei Gao, Feifan Song, Yibo Miao, ..., Shanghaoran Quan, ...
Preprint
[Paper]     [Repo]
Aligning CodeLLMs with Direct Preference Optimization
Yibo Miao, Bofei Gao, Shanghaoran Quan, Junyang Lin, Daoguang Zan, Jiaheng Liu, Jian Yang, Tianyu Liu, Zhijie Deng
Preprint
[Paper]
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models
Bofei Gao, Feifan Song, Zhe Yang, ..., Shanghaoran Quan, ...
Preprint
[WebPage]     [Paper]
Experience
Alibaba DAMO Academy, Qwen Team
Jul 2024 - Present

Research Intern
Advisors: Bowen Yu and An Yang
Baidu NLP Group, ERNIE-Bot Team
Nov 2023 - May 2024

Research Intern
Advisor: Jun Xu
Beihang University, School of Computer Science and Engineering
Sep 2021 - Present

B.Eng. Student
Honors
  • Three gold medals in ICPC Asia regional contests (highest rank: 5th) and two gold medals in CCPC regional contests
  • Undergraduate National Scholarship (rank: 2/1003)
  • First prize in the China National Olympiad in Informatics in Provinces
  • First prize in the China National Olympiad in Mathematics in Provinces
Academic Service
  • Conference Reviewer: KDD 2024, ICLR 2025

Updated on December 10, 2024.