Layer-sequential unit-variance (LSUV) initialization - Tech It Yourself


Wednesday, 14 August 2019

Layer-sequential unit-variance (LSUV) initialization

This is a simple weight-initialization method for training deep networks. It consists of two steps:
- First, pre-initialize the weights of each convolutional or inner-product layer with orthonormal matrices.
- Second, proceed from the first layer to the last, normalizing the variance of each layer's output to one.
Experiments with different activation functions (maxout, ReLU family, tanh) show that this initialization enables the learning of very deep nets.
Pseudo-code of LSUV
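The two steps above can be sketched in plain NumPy for a toy feed-forward stack. The layer shapes, tolerance, iteration cap, and tanh non-linearity here are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def orthonormal(shape):
    """Orthonormal matrix via QR decomposition of a Gaussian draw."""
    a = rng.standard_normal(shape)
    q, r = np.linalg.qr(a)
    # Sign correction makes the decomposition unique
    q *= np.sign(np.diag(r))
    return q

def lsuv_init(layer_shapes, x, tol=0.1, max_iters=10):
    """LSUV sketch for a stack of square linear layers.

    Step 1: pre-initialize each weight matrix orthonormally.
    Step 2: on a sample batch x, rescale the weights until the
    layer's pre-activation output variance is ~1, then move on
    to the next layer using the normalized activations.
    """
    weights = []
    h = x
    for shape in layer_shapes:
        w = orthonormal(shape)
        for _ in range(max_iters):
            out = h @ w
            var = out.var()
            if abs(var - 1.0) < tol:
                break
            # Scaling w by 1/sqrt(var) scales the output variance by 1/var
            w /= np.sqrt(var)
        weights.append(w)
        h = np.tanh(h @ w)  # illustrative non-linearity between layers
    return weights
```

For a linear layer the rescaling converges in a single correction, since dividing the weights by the standard deviation of the output makes the new output variance exactly one; the loop only matters for layers where scaling is not exactly multiplicative.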

- In most cases, batch normalization performs better when placed after the non-linearity.
- An LSUV-initialized network performs as well as a batch-normalized one.
- The paper does not claim that batch normalization can always be replaced by proper initialization, especially on large datasets such as ImageNet.
