Xiaopeng Li
Research
Ph.D. Advisor: Professor Cédric Josz
My research interests lie in nonconvex optimization, differential equations, and semi-algebraic geometry, with applications to low-rank matrix recovery, deep neural networks, and large language model reasoning.
Openings
I am actively recruiting PhD students who are interested in optimization for machine learning, nonconvex/nonsmooth optimization, or LLM reasoning. Strong mathematical background and solid programming skills are important; prior research experience is a plus.
Publications
Journals
Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO, with Peter L. Chen, Ziniu Li, Xi Chen and Tianyi Lin
Transactions on Machine Learning Research (TMLR), 2026. Preprint on ArXiv: 2505.11595.
Singular perturbation in heavy ball dynamics, with Cédric Josz
Journal of Dynamics and Differential Equations, 2024. Preprint on ArXiv: 2407.15044.
Convergence of the momentum method for semi-algebraic functions with locally Lipschitz gradients, with Cédric Josz and Lexiao Lai
SIAM Journal on Optimization, 2023. Preprint on ArXiv: 2307.03331.
Certifying the absence of spurious local minima at infinity, with Cédric Josz
SIAM Journal on Optimization, 2023. Preprint on ArXiv: 2303.03536.
Conferences
Reward-free Alignment for Conflicting Objectives, with Peter L. Chen, Xi Chen and Tianyi Lin
Proceedings of the International Conference on Machine Learning (ICML), 2026 (Oral, top 0.7%). Preprint on ArXiv: 2602.02495.
Exploration vs exploitation: Rethinking RLVR through clipping, entropy, and spurious reward, with Peter L. Chen, Ziniu Li, Wotao Yin, Xi Chen and Tianyi Lin
Proceedings of the International Conference on Learning Representations (ICLR), 2026. Preprint on ArXiv: 2512.16912.