Zeyuan Allen-Zhu (朱澤園)
Other names: Zeyuan Allen Zhu
Meta AI / FAIR Labs
Verified email at csail.mit.edu
Title
Cited by
Year
LoRA: Low-rank adaptation of large language models
EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li, S Wang, L Wang, W Chen
ICLR 2022: International Conference on Learning Representations, 2022
Cited by: 3500; Year: 2022
A convergence theory for deep learning via over-parameterization
Z Allen-Zhu, Y Li, Z Song
ICML 2019: International Conference on Machine Learning, 2019
Cited by: 1460; Year: 2019
Is Q-learning Provably Efficient?
C Jin, Z Allen-Zhu, S Bubeck, MI Jordan
NIPS 2018: Neural Information Processing Systems, 2018
Cited by: 873; Year: 2018
Learning and generalization in overparameterized neural networks, going beyond two layers
Z Allen-Zhu, Y Li, Y Liang
NeurIPS 2019: Neural Information Processing Systems, 2019
Cited by: 799; Year: 2019
Katyusha: the first direct acceleration of stochastic gradient methods
Z Allen-Zhu
STOC 2017: Symposium on Theory of Computing, 19-23, 2017
Cited by: 669; Year: 2017
Variance reduction for faster non-convex optimization
Z Allen-Zhu, E Hazan
ICML 2016: International Conference on Machine Learning, 699-707, 2016
Cited by: 420; Year: 2016
Linear coupling: An ultimate unification of gradient and mirror descent
Z Allen-Zhu, L Orecchia
ITCS 2017: Innovations in Theoretical Computer Science, 2017
Cited by: 373; Year: 2017
Finding approximate local minima faster than gradient descent
N Agarwal, Z Allen-Zhu, B Bullins, E Hazan, T Ma
STOC 2017: Symposium on Theory of Computing, 1195-1199, 2017
Cited by: 334*; Year: 2017
Towards understanding ensemble, knowledge distillation and self-distillation in deep learning
Z Allen-Zhu, Y Li
ICLR 2023: International Conference on Learning Representations, 2023
Cited by: 316; Year: 2023
Byzantine Stochastic Gradient Descent
D Alistarh, Z Allen-Zhu, J Li
NIPS 2018: Neural Information Processing Systems, 2018
Cited by: 296; Year: 2018
A simple, combinatorial algorithm for solving SDD systems in nearly-linear time
JA Kelner, L Orecchia, A Sidford, ZA Zhu
STOC 2013: Symposium on Theory of Computing, 911-920, 2013
Cited by: 288; Year: 2013
Natasha 2: Faster Non-Convex Optimization Than SGD
Z Allen-Zhu
NIPS 2018: Neural Information Processing Systems, 2018
Cited by: 253; Year: 2018
Improved SVRG for non-strongly-convex or sum-of-non-convex objectives
Z Allen-Zhu, Y Yuan
ICML 2016: International Conference on Machine Learning, 1080-1089, 2016
Cited by: 225; Year: 2016
Even faster accelerated coordinate descent using non-uniform sampling
Z Allen-Zhu, Z Qu, P Richtárik, Y Yuan
ICML 2016: International Conference on Machine Learning, 1110-1119, 2016
Cited by: 205; Year: 2016
What Can ResNet Learn Efficiently, Going Beyond Kernels?
Z Allen-Zhu, Y Li
NeurIPS 2019: Neural Information Processing Systems, 2019
Cited by: 203; Year: 2019
Asymptotically optimal strategy-proof mechanisms for two-facility games
P Lu, X Sun, Y Wang, ZA Zhu
ACM-EC 2010: Conference on Economics and Computation, 315-324, 2010
Cited by: 186; Year: 2010
On the convergence rate of training recurrent neural networks
Z Allen-Zhu, Y Li, Z Song
NeurIPS 2019: Neural Information Processing Systems, 2019
Cited by: 185; Year: 2019
Neon2: Finding Local Minima via First-Order Oracles
Z Allen-Zhu, Y Li
NIPS 2018: Neural Information Processing Systems, 2018
Cited by: 149; Year: 2018
Feature purification: How adversarial training performs robust deep learning
Z Allen-Zhu, Y Li
FOCS 2021: Symposium on Foundations of Computer Science, 977-988, 2022
Cited by: 143; Year: 2022
LazySVD: Even faster SVD decomposition yet without agonizing pain
Z Allen-Zhu, Y Li
NIPS 2016: Neural Information Processing Systems, 974-982, 2016
Cited by: 134; Year: 2016