Moving below the surface: Artificial Neural Networks — William

As mentioned before, this post covers artificial neural networks. A major topic in supervised machine learning, neural networks excel at classification problems and, when combined with convolutional layers, are among the most popular models for image classification tasks. To understand how neural networks function, however, we will first examine something seemingly unrelated: a biological neuron, or a single nerve cell.

[Image: a biological neuron]

A nerve cell consists of numerous dendrites around the cell body, as well as a long axon connecting to another neuron’s dendrite, as shown above. Electrical signals are transmitted into the cell body via the dendrites, and, when the voltage across the cell membrane rises above a certain threshold (driven by the movement of sodium ions), the neuron is activated, firing an electric pulse along the axon towards the other neuron. Artificial neurons work in a similar way: they accept numerical inputs from other neurons and activate when the weighted sum of the inputs exceeds a threshold. The weights of the inputs are the essential part of the model. Together with a bias weight, which is attached to a constant input and shifts the threshold, they determine the output of each neuron and the result of classification, and they are the values adjusted during training.

[Image: an artificial neuron]
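The thresholded neuron described above can be sketched in a few lines of Python. This is only an illustration: the function name `neuron_fires` and the hand-picked weights are assumptions made for the example, not values from any trained model.

```python
def neuron_fires(inputs, weights, bias, threshold=0.0):
    """Return 1 if the weighted sum of the inputs plus the bias exceeds the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > threshold else 0


# Hypothetical weights, chosen by hand so that the neuron behaves like a logical AND.
and_weights = [1.0, 1.0]
and_bias = -1.5

for a in (0, 1):
    for b in (0, 1):
        print(a, b, neuron_fires([a, b], and_weights, and_bias))
```

With these weights the neuron only fires when both inputs are 1. In a real network the weights and bias would be found by training rather than set by hand.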

The activation procedure of an artificial neuron is represented by the Activity Rule, a function whose output rises smoothly from a lower bound to an upper bound as the input increases. The Activity Rule could be one of the many sigmoid functions (those with “S”-shaped curves), including the logistic function, which outputs values between 0 and 1, and the hyperbolic tangent function, which outputs values between -1 and 1. The logistic function is often preferred for classification, because its output can be read as a probability and its derivative has a simple form that is convenient during training.
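To make the two sigmoid functions concrete, here is a small sketch using only Python's standard `math` module. The helper names `logistic` and `logistic_derivative` are chosen for this example; `math.tanh` is the standard library's hyperbolic tangent.

```python
import math


def logistic(x):
    """Logistic sigmoid: squashes any real number into the range 0 to 1."""
    # Two branches so that math.exp is never called with a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logistic_derivative(x):
    """Slope of the logistic function, written in terms of its own output."""
    s = logistic(x)
    return s * (1.0 - s)


for x in (-4.0, 0.0, 4.0):
    print(x, round(logistic(x), 4), round(math.tanh(x), 4))
```

At an input of 0 the logistic function returns 0.5 and the hyperbolic tangent returns 0, which shows the different output ranges of the two curves.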

To build an artificial neural network, we simply arrange the neurons into layers. Each neuron in a layer receives input from every neuron in the preceding layer and feeds its result into every neuron in the following layer.

[Image: an artificial neural network]
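One way to picture a fully connected layer is as a list of neurons that all read the same inputs. The sketch below assumes a logistic activation and uses made-up weights; the name `dense_layer` and the layout `weights[j][i]` (the weight from input i to neuron j) are conventions chosen for this example.

```python
import math


def logistic(x):
    """Logistic sigmoid, written to avoid overflow for large negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dense_layer(inputs, weights, biases):
    """Compute one layer's outputs; every neuron sees every input."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(logistic(total))
    return outputs


# Hypothetical layer: 3 inputs feeding 2 neurons.
layer_weights = [[0.5, -0.5, 0.0],
                 [1.0, 1.0, 1.0]]
layer_biases = [0.0, -3.0]

print(dense_layer([1.0, 1.0, 1.0], layer_weights, layer_biases))
```

The output list has one value per neuron in the layer, and that list becomes the input to the next layer.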

The layers in a neural network can therefore be divided into three parts. First is an input layer receiving data from the environment. The number of neurons in this layer matches the dimension of the input data, with each neuron taking one component of the input. Then there are one or more hidden layers, processing the data layer by layer. Finally, an output layer produces the results, which are normalized before being presented as classification outcomes. Though there is no fixed limit on the number of layers or the number of neurons in each layer, underfitting and overfitting need to be considered to achieve an effective model: too few neurons may fail to capture the pattern in the data, while too many may memorize the training examples.
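Putting the three parts together, a forward pass through a tiny network might look like the sketch below. It assumes logistic hidden layers and a softmax function for the normalization step; softmax is one common choice, not necessarily the one intended above, and all names and weights here are invented for illustration.

```python
import math


def logistic(x):
    """Logistic sigmoid, written to avoid overflow for large negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def weighted_sums(inputs, weights, biases):
    """Weighted sum plus bias for every neuron in a layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Normalize raw scores into positive values that sum to 1."""
    largest = max(scores)  # subtracting the maximum keeps math.exp from overflowing
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, layers):
    """layers is a list of (weights, biases) pairs; the last pair is the output layer."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = [logistic(s) for s in weighted_sums(activations, weights, biases)]
    out_weights, out_biases = layers[-1]
    return softmax(weighted_sums(activations, out_weights, out_biases))


# Hypothetical network: 2 input features, 3 hidden neurons, 2 output classes.
network = [
    ([[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]], [0.0, 0.1, -0.1]),
    ([[0.6, -0.2, 0.3], [-0.5, 0.4, 0.1]], [0.0, 0.0]),
]

print(forward([1.0, 2.0], network))
```

The printed list holds one value per class, and the largest value marks the predicted class. Adding more `(weights, biases)` pairs to `network` adds more hidden layers.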

Modeled on the biological nervous system, artificial neural networks borrow its basic organization while offering the flexibility to adapt to different machine learning tasks. They have become one of the most popular models for supervised learning, and they are likely to remain so.

