728x90
'Paper Review' 카테고리의 다른 글
Paper Review: Why Transformers need Adam : A Hessian Perspective (0) | 2025.04.16 |
---|---|
Adam can converge without any modification on Update rules (0) | 2023.07.04 |
Sharpness-Aware Minimization (3) | 2023.06.26 |
댓글