About me

Welcome to my homepage! I am an assistant professor at UIUC, studying optimization and machine learning (especially deep learning). Before joining UIUC, I was a visiting scientist at FAIR (Facebook AI Research). I was a postdoc at Stanford, obtained my PhD from the University of Minnesota, and received my BS in mathematics from Peking University.

Recently, I have been studying optimization in deep learning, including the landscape of neural networks, GANs, and Adam. I have written a survey, “Optimization for deep learning: theory and algorithms”. The study of neural networks extends my research on non-convex optimization for machine learning, which dates back to my PhD. My thesis is on non-convex matrix completion, for which I provided one of the first geometric analyses. Another direction I have been studying is the computational/iteration complexity of optimization algorithms, especially Adam, ADMM, and coordinate descent.

Prospective and current students interested in optimization/ML/AI are welcome to contact me. Undergraduate interns and visiting students/scholars are also welcome. Master's students may check “advised master projects” under “Publications”.

To contact me, click the “email” icon on the left panel.

Assistant Professor

Department of Industrial and Enterprise Systems Engineering
Coordinated Science Lab (affiliated)
Department of Electrical and Computer Engineering (affiliated)
University of Illinois at Urbana-Champaign

Professional Experience

Visiting Researcher, Facebook Artificial Intelligence Research, 2016.06-2016.12.
Post-doctoral Scholar, Dept. of Management Science and Engineering, Stanford University (host: Yinyu Ye), 2015-2016.


Education

Ph.D. in Electrical Engineering, University of Minnesota, 2009-2015.
B.Sc. in Mathematics, Peking University, Beijing, China, 2005-2009.

Research Interests

  • Optimization for deep learning: landscape analysis of neural networks, GANs, Adam, adversarial robustness, etc.
  • Non-convex optimization for machine learning: neural networks, matrix factorization, etc.
  • Large-scale optimization: ADMM, coordinate descent, adaptive gradient methods, etc.
  • Other research interests: Information theory and wireless communications, such as interference alignment and base station association.

Selected Works


  • Our paper “RMSprop can converge with proper hyper-parameter” (Naichen Shi, Dawei Li, Mingyi Hong, Ruoyu Sun) has been accepted to ICLR 2021 as a Spotlight.
  • Sep 2020: Our paper Towards a better global loss landscape of GANs (joint with Tiantian Fang, Alex Schwing) is accepted to NeurIPS 2020 as oral paper (1.1% of 9454 submissions).
  • Sep 2020: Our paper https://arxiv.org/abs/2010.15768 (joint with Jiawei Zhang, Peijun Xiao, Zhi-Quan Luo) is accepted to NeurIPS 2020.
  • Jun 2020: Our survey “On the global landscape of neural networks: an overview” (joint with Srikant, Shiyu Liang, Dawei Li, Tian Ding) has appeared in IEEE Signal Processing Magazine (SPM).
  • Dec 2019: My survey “Optimization for deep learning: theory and algorithms” is available on arXiv: https://arxiv.org/abs/1912.08957. Comments are welcome.
  • Oct 2019: Our paper “Spurious local minima exist for almost all over-parameterized neural networks” is available at Optimization Online.
  • Oct 2019: Our paper “Understanding two symmetrized orders by worst-case complexity” is available on arXiv.
  • Sep 2019: Our paper “Worst-case complexity of cyclic coordinate descent: O(n^2) gap with randomized versions” is accepted by Mathematical Programming (MP).
  • Aug 2019: I organized a session on “Non-convex optimization for neural networks” at ICCOPT 2019, the triennial conference on continuous optimization.
  • Mar 2019: Our paper “Max-Sliced Wasserstein Distance and its use for GANs” is accepted by CVPR 2019 as an Oral.
  • Jan 2019: Our paper on Adam-type methods (joint with Xiangyi Chen, Sijia Liu, and Mingyi Hong) is accepted by ICLR 2019.
  • Dec 2018: Our paper “On the efficiency of random permutation for ADMM and coordinate descent” is accepted by Mathematics of Operations Research (MOR).
  • Nov 2018: I gave a talk at the Stanford ISL Colloquium on Nov 8, and at Google Brain on Nov 9.
  • Nov 2018: I organized a session on non-convex optimization for machine learning at the INFORMS annual meeting.
  • Oct 2018: Our paper showing that adding a neuron can eliminate all bad local minima will appear at NIPS 2018.

Professional Services

Area chair for ICLR, NeurIPS, AISTATS, ICML.

Reviewer for

  • Machine learning and computer science: ICLR, NeurIPS, ICML, COLT, FOCS, AISTATS, JMLR, Neural Computation.
  • Optimization area: Mathematical Programming, SIAM Journal on Optimization, SIAM Journal on Computing, Pacific Journal of Optimization.
  • Signal processing and information theory: IEEE Transactions on Information Theory, IEEE Transactions on Signal Processing, SPAWC, ICASSP.