Exploring the limits of transfer learning with a unified text-to-text transformer C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... Journal of Machine Learning Research 21 (140), 1-67, 2020 | 15214 | 2020 |
Merging models with Fisher-weighted averaging MS Matena, CA Raffel Advances in Neural Information Processing Systems 35, 17703-17716, 2022 | 124 | 2022 |
Do transformer modifications transfer across implementations and applications? S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ... arXiv preprint arXiv:2102.11972, 2021 | 83 | 2021 |
Exploring the limits of transfer learning with a unified text-to-text transformer C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... arXiv preprint arXiv:1910.10683, 2019 | 70 | 2019 |
Exploring the limits of transfer learning with a unified text-to-text transformer A Roberts, C Raffel, K Lee, M Matena, N Shazeer, PJ Liu, S Narang, W Li, ... Google, Tech. Rep., 2019 | 42 | 2019 |
A Combinatorial Perspective on the Optimization of Shallow ReLU Networks MS Matena, CA Raffel Advances in Neural Information Processing Systems 35, 22187-22198, 2022 | 1 | 2022 |
NPEFF: Non-Negative Per-Example Fisher Factorization M Matena, C Raffel arXiv preprint arXiv:2310.04649, 2023 | | 2023 |