---
title: When MAML Can Adapt Fast and How to Assist When It Cannot
abstract: Model-Agnostic Meta-Learning (MAML) and its variants have achieved success
  in meta-learning tasks on many datasets and settings. Nonetheless, we have just
  started to understand and analyze how they are able to adapt fast to new tasks.
  In this work, we contribute by conducting a series of empirical and theoretical
  studies, and discover several interesting, previously unknown properties of the
  algorithm. First, we find MAML adapts better with a deep architecture even if the
  tasks need only a shallow one. Second, linear layers can be added to the output
  layers of a shallower model to increase the depth without altering the modelling
  capacity, leading to improved performance in adaptation. Alternatively, an external,
  separate neural network meta-optimizer can be used to transform the gradient updates
  of a smaller model, likewise improving performance in adaptation. Drawing on this
  evidence, we theorize that for a deep neural network to meta-learn well, the upper
  layers must transform the gradients of the bottom layers as if the upper layers
  were an external meta-optimizer, operating on a smaller network composed of the
  bottom layers.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: arnold21a
month: 0
tex_title: When MAML Can Adapt Fast and How to Assist When It Cannot
firstpage: 244
lastpage: 252
page: 244-252
order: 244
cycles: false
bibtex_author: Arnold, S{\'e}bastien M. R. and Iqbal, Shariq and Sha, Fei
author:
- given: Sébastien
  family: Arnold
- given: Shariq
  family: Iqbal
- given: Fei
  family: Sha
date: 2021-03-18
address:
container-title: Proceedings of The 24th International Conference on Artificial Intelligence
  and Statistics
volume: '130'
genre: inproceedings
issued:
  date-parts:
  - 2021
  - 3
  - 18
pdf:
extras:
---
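The abstract notes that extra linear layers can be appended to the output of a shallower model to increase its depth without altering its modelling capacity. The PyTorch sketch below is a minimal, hypothetical illustration of that idea; the network sizes and the helper `add_linear_output_layers` are illustrative assumptions, not code from the paper.

```python
import torch
import torch.nn as nn

class ShallowNet(nn.Module):
    """A small regressor: one hidden layer plus a linear head."""
    def __init__(self, in_dim=1, hidden=40, out_dim=1):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x):
        return self.head(self.body(x))

def add_linear_output_layers(model: ShallowNet, n_extra: int = 2) -> nn.Module:
    # Append bias-free linear layers with no nonlinearity after the head.
    # A composition of linear maps is still a linear map, so the set of
    # functions the model can represent is unchanged; only its depth grows.
    out_dim = model.head.out_features
    extra = [nn.Linear(out_dim, out_dim, bias=False) for _ in range(n_extra)]
    return nn.Sequential(model, *extra)

shallow = ShallowNet()
deeper = add_linear_output_layers(shallow, n_extra=2)
x = torch.randn(8, 1)
print(shallow(x).shape, deeper(x).shape)  # both (8, 1): same capacity, more depth
```

Because the appended layers are purely linear, the deeper model represents exactly the same function class as the original, yet gradient-based inner-loop updates (as in MAML) now pass through additional parameterized layers.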