# Sigmoid Neurons

April 24, 2018 / 197 reads

In fact, a small change in the weights or bias of any single perceptron in the network can sometimes cause the output of that perceptron to completely flip, say from 0 to 1. That flip may then cause the behaviour of the rest of the network to completely change in some very complicated way. So while your "9" might now be classified correctly, the behaviour of the network on all the other images is likely to have completely changed in some hard-to-control way. That makes it difficult to see how to gradually modify the weights and biases so that the network gets closer to the desired behaviour.

We can overcome this problem by introducing a new type of artificial neuron called a sigmoid neuron. Sigmoid neurons are similar to perceptrons, but modified so that small changes in their weights and bias cause only a small change in their output. That's the crucial fact which will allow a network of sigmoid neurons to learn.

Sigmoid neurons have exactly this mathematical property: a small change in the weights and bias causes only a small change in the output.

$$\sigma(w \cdot x+b) \equiv \frac{1}{1+e^{-(w \cdot x+b)}}$$

$$\sigma(z) \equiv \frac{1}{1+e^{-z}}$$
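The two formulas above translate directly into code. A minimal sketch (the function names here are mine, not from the text):

```python
import math

def sigmoid(z):
    """The sigmoid (logistic) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_neuron(weights, inputs, bias):
    """Output of a sigmoid neuron: sigma(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

print(sigmoid(0.0))  # 0.5: the sigmoid is centered at z = 0
```

Unlike a perceptron's step function, `sigmoid` varies smoothly with `z`, which is what makes the small-change-in, small-change-out behaviour possible.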

[Figure: graph of the sigmoid function]

$$\begin{eqnarray} \Delta \mbox{output} \approx \sum_j \frac{\partial \, \mbox{output}}{\partial w_j} \Delta w_j + \frac{\partial \, \mbox{output}}{\partial b} \Delta b, \end{eqnarray}$$

While the expression above looks complicated, with all the partial derivatives, it's actually saying something very simple (and which is very good news): $$\Delta output$$ is a linear function of the changes $$\Delta w_j$$ and $$\Delta b$$ in the weights and bias. This linearity makes it easy to choose small changes in the weights and biases to achieve any desired small change in the output. So while sigmoid neurons have much of the same qualitative behaviour as perceptrons, they make it much easier to figure out how changing the weights and biases will change the output.

$$\Delta output$$ depends linearly on $$\Delta w_j$$ and $$\Delta b$$ (think of the partial derivatives as fixed coefficients), so a small change in the weights $$w$$ and bias $$b$$ produces only a correspondingly small change in the output.
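This linearity is easy to check numerically. The sketch below uses a single-input neuron with made-up values for $$w$$, $$x$$, and $$b$$ (my choices, not from the text), and compares the actual change in output against the prediction from the equation above, using the fact that $$\sigma'(z) = \sigma(z)(1-\sigma(z))$$:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, x, b = 0.6, 1.0, -0.3
z = w * x + b
out = sigmoid(z)

# sigma'(z) = sigma(z) * (1 - sigma(z)), so d(output)/dw = sigma'(z) * x
d_out_dw = out * (1 - out) * x

dw = 1e-3  # a small change in the weight
actual = sigmoid((w + dw) * x + b) - out
predicted = d_out_dw * dw  # the linear approximation from the equation above

print(actual, predicted)  # the two agree to well within 1e-7
```

The agreement confirms that for small perturbations, the partial-derivative formula tells us essentially exactly how the output will move.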

The sigmoid function used by sigmoid neurons is one example of an activation function; of course, there are others.

The main thing that changes when we use a different activation function is that the particular values for the partial derivatives in Equation above change. It turns out that when we compute those partial derivatives later, using $$\sigma$$ will simplify the algebra, simply because exponentials have lovely properties when differentiated. In any case, $$\sigma$$ is commonly-used in work on neural nets.

Sigmoid neurons differ from perceptrons in their output: a perceptron outputs only 0 or 1, while a sigmoid neuron can output any real number in the interval $$(0,1)$$. To interpret the output we simply fix a convention, for example treating values above 0.5 as one class and values below 0.5 as the other.
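Such a convention is a one-line helper (the name and the 0.5 cutoff here are illustrative choices, not prescribed by the text):

```python
def interpret(output, threshold=0.5):
    """Map a sigmoid neuron's output in (0, 1) to a binary decision.

    The 0.5 threshold is just a convention; any cutoff can be chosen
    to suit the task at hand.
    """
    return 1 if output >= threshold else 0

print(interpret(0.73))  # 1
print(interpret(0.12))  # 0
```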

### Comments

1 comment on "Sigmoid Neurons"

• 麦新杰

In essence this just smooths the perceptron: instead of a step function we use the sigmoid. The sigmoid happens to use $$e$$, but $$e$$ is not essential; any smooth, continuous function would have the same effect.
