General

What is the effect of initialization of high weight value in neural network learning?

👉 Zero initialization causes every neuron to learn the same function in each iteration. 👉 Random initialization is a better choice because it breaks this symmetry. However, initializing the weights with very high or very low values results in slower optimization.
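As a rough, hands-on sketch (plain NumPy, with layer sizes and scales chosen only for illustration), the following compares zero, small random, and very large random weights on a single tanh layer:

```python
import numpy as np

# Rough illustration with made-up layer sizes: compare zero, small random,
# and very large random initial weights on one tanh layer.
rng = np.random.default_rng(0)
n_in, n_out = 64, 32

w_zero = np.zeros((n_in, n_out))                # all units identical -> symmetry never broken
w_small = rng.normal(0.0, 0.01, (n_in, n_out))  # small random values break the symmetry
w_large = rng.normal(0.0, 10.0, (n_in, n_out))  # very large values saturate the activations

x = rng.normal(size=(8, n_in))                  # a dummy batch of 8 examples
for name, w in [("zero", w_zero), ("small", w_small), ("large", w_large)]:
    a = np.tanh(x @ w)
    print(name, "mean |activation|:", np.abs(a).mean())
# With the large init, activations sit near +/-1 where tanh gradients are
# almost zero, which is one way a very high initial weight slows optimization.
```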

Why do large weights cause Overfitting?

The longer we train the network, the more specialized the weights become to the training data, overfitting it. The weights grow in size in order to handle the specifics of the examples seen in the training data, and large weights make the network unstable: small changes in the input can then produce large changes in the output.


Why is zero initialization not recommended for weight initialization?

Zero initialization: if all the weights are initialized to zero, the derivatives remain the same for every w in W[l]. As a result, the neurons learn the same features in each iteration. And not only zero: any constant initialization will produce a poor result.
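A small sketch can make this concrete (plain NumPy, with a toy one-hidden-layer net and an arbitrary constant of 0.5; all sizes are made up): under a constant initialization, the gradient columns for the hidden units come out identical, so the units never differentiate.

```python
import numpy as np

# Sketch of the symmetry problem: with a constant initialization, every hidden
# unit computes the same value and receives the same gradient.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4))           # 16 examples, 4 features
y = rng.normal(size=(16, 1))           # dummy regression targets

W1 = np.full((4, 3), 0.5)              # constant init: the 3 hidden units are identical
W2 = np.full((3, 1), 0.5)

h = np.tanh(x @ W1)                    # forward pass
pred = h @ W2
err = pred - y                         # dLoss/dpred for a squared-error loss (up to a constant)

dW1 = x.T @ ((err @ W2.T) * (1 - h ** 2))   # backprop to the first layer
print(np.allclose(dW1[:, 0], dW1[:, 1]))    # True: the columns (one per hidden unit) match
print(np.allclose(dW1[:, 1], dW1[:, 2]))    # True: so the units stay identical after the update
```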

What’s the effect of initialization on a neural network?

The aim of weight initialization is to prevent layer activation outputs from exploding or vanishing during the course of a forward pass through a deep neural network.
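For instance, here is a minimal sketch (plain NumPy, widths and depth made up) of how the scale of the initialization controls whether activations vanish or stay well behaved as they pass through many layers; the sqrt(2 / fan_in) scale is the He initialization commonly used with ReLU layers:

```python
import numpy as np

# A fixed small std makes the activation scale shrink layer by layer, while
# He-style scaling std = sqrt(2 / fan_in) keeps it roughly constant in depth.
rng = np.random.default_rng(0)

def activation_std(std_fn, width=128, depth=20, batch=256):
    a = rng.normal(size=(batch, width))
    for _ in range(depth):
        W = rng.normal(0.0, std_fn(width), (width, width))
        a = np.maximum(0.0, a @ W)     # ReLU
    return a.std()

print("fixed std=0.01:", activation_std(lambda fan_in: 0.01))                   # ~0 (vanishing)
print("He sqrt(2/n)  :", activation_std(lambda fan_in: np.sqrt(2.0 / fan_in)))  # stays around 1
```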

What will happen to the convergence of neural network if weights are initialized to zero?

If you initialize all the weights to zero, then all the neurons in all the layers perform the same calculation and give the same output, thereby making the whole deep net useless.

What is weight decay in deep learning?

Weight decay is a regularization technique that adds a small penalty, usually the L2 norm (in practice its square) of the weights (all the weights of the model), to the loss function: loss = loss + weight decay parameter * L2 norm of the weights. Some people prefer to apply weight decay only to the weights and not to the biases.
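A minimal sketch of that formula in plain NumPy (the data, the linear model, and the decay value are only illustrative):

```python
import numpy as np

# Weight decay as an explicit penalty on the loss. The penalty here uses only
# the weight matrix, not the bias, matching the preference mentioned above.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 10))
y = rng.normal(size=(32, 1))

W = rng.normal(0.0, 0.1, (10, 1))
b = np.zeros(1)
weight_decay = 1e-4

pred = x @ W + b
data_loss = np.mean((pred - y) ** 2)            # ordinary task loss (MSE here)
l2_penalty = np.sum(W ** 2)                     # squared L2 norm of the weights only
loss = data_loss + weight_decay * l2_penalty    # loss = loss + weight_decay * ||W||^2

# In the gradient, the penalty simply adds 2 * weight_decay * W to the weight
# gradient, which shrinks ("decays") the weights a little at every update.
grad_W = 2 * x.T @ (pred - y) / len(x) + 2 * weight_decay * W
```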


Why is initialization of weight important?

Weight initialization is an important design choice when developing deep learning neural network models. It defines the initial values of the network's parameters before the model is trained on a dataset.

Why is a good weight initialization required?

The weights of artificial neural networks must be initialized to small random numbers, because that is what the stochastic optimization algorithm used to train the model, stochastic gradient descent, expects.