# The Dying ReLU

January 11, 2019 / 783 reads

The ReLU activation function, like earlier activation functions, was inspired by biological neuroscience. It brings two main benefits to neural networks:

(1) It effectively mitigates the vanishing-gradient problem when training DNNs (or MLPs). During backpropagation, the error signal is repeatedly multiplied by the derivative of the activation function at each layer; for sigmoid or tanh, those derivatives are small (at most 0.25 for sigmoid), so the error shrinks layer by layer, and by the time it reaches the earlier hidden layers it is too small for them to learn effectively.
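The shrinking described above can be made concrete with a small sketch. The sigmoid derivative is $\sigma'(x) = \sigma(x)(1 - \sigma(x))$, which peaks at 0.25; even in this best case, the backpropagated gradient loses a factor of 4 per layer:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # sigma'(x) = sigma(x) * (1 - sigma(x)); its maximum is 0.25 at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)

# Best-case scenario: every pre-activation is exactly 0, so each layer
# multiplies the backpropagated error by the maximal derivative, 0.25.
n_layers = 10
grad = 1.0
for _ in range(n_layers):
    grad *= sigmoid_grad(0.0)

print(grad)  # 0.25 ** 10, roughly 9.5e-7
```

In a real network the pre-activations are not all zero, so the per-layer factor is even smaller than 0.25 and the gradient decays faster still. ReLU's derivative is exactly 1 over its active region, which avoids this multiplicative decay.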

(2) ReLU is cheaper to compute, which speeds up both the forward and backward passes of the network.
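The cheapness is easy to see from the definition: ReLU and its derivative need only a comparison, with no exponentials. A minimal numpy sketch:

```python
import numpy as np

def relu(x):
    # max(0, x), element-wise: just a comparison, no exp() required.
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative is 1 where x > 0 and 0 elsewhere.
    # (At x = 0 the function is not differentiable; 0 is used by convention.)
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))       # zeros for non-positive inputs, identity for positive
print(relu_grad(x))  # 0 for non-positive inputs, 1 for positive
```

Compare this with sigmoid, whose forward pass needs `exp()`; over millions of neurons and iterations, the difference adds up.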

The Stanford course notes on Convolutional Neural Networks for Visual Recognition include the following paragraph:

"Unfortunately, ReLU units can be fragile during training and can "die". For example, a large gradient flowing through a ReLU neuron could cause the weights to update in such a way that the neuron will never activate on any data point again. If this happens, then the gradient flowing through the unit will forever be zero from that point on. That is, the ReLU units can irreversibly die during training since they can get knocked off the data manifold. For example, you may find that as much as 40% of your network can be "dead" (i.e. neurons that never activate across the entire training dataset) if the learning rate is set too high. With a proper setting of the learning rate this is less frequently an issue."

An overly large learning rate is indeed a real problem. It can kill ReLU units outright, and it can also cause overshooting, where the cost keeps growing instead of decreasing.
