11. In many applications the units of these networks apply a " sigmoid function " as an activation function . 12. Where is the matrix of input-to-hidden-layer weights, is some activation function , and is the matrix of hidden-to-output-layer weights. 13. This result holds for a wide range of activation functions , e . g . for the sigmoidal functions. 14. Each convolutional layer is followed by a rectified linear unit, a popular activation function for deep neural networks. 15. Backpropagation requires that the activation function used by the artificial neurons ( or " nodes " ) be differentiable. 16. A generalisation of the logistic function to multiple inputs is the softmax activation function , used in multinomial logistic regression. 17. One of the first versions of the theorem was proved by George Cybenko in 1989 for sigmoid activation functions . 18. Here, \ sigma _ 1 is an element-wise activation function such as a sigmoid function or a rectified linear unit. 19. Where \ phi ^ \ prime is the derivative of the activation function described above, which itself does not vary. 20. Neurons with this kind of activation function are also called " artificial neurons " or " linear threshold units ".