In this blog, we are going to talk about another optimization algorithm, the simulated annealing. As we mentioned last time, the goal of machine learning algorithms is to minimize the difference between the predicted values of the trained model and the actual values from either surveying or measuring or to find the minimum of the *error function*. Compared to gradient descent method introduced in the previous blog, simulated annealing algorithm offers a more efficient way to find the global maximum instead of a local one in a certain dataset. Though this method is not complex in nature, it requires some understanding of a field of physics that is not widely known and is a bit abstract.

As its name suggests, simulated annealing algorithm is derived from the annealing process in metallurgy. This process is a controlled heating and then cooling of metal to achieve desired properties, specifically increase the strength of the metal. First, the material is heated up to its melting point and is cast and formed. Heat, at the atomic level, is represented as the kinetic energy of particles. During this stage, all particles have a tremendous amount of kinetic energy and move rather quickly, since the hotter the material is, the greater kinetic energy the particles possess. As the particles roam through space, it is almost impossible to form chemical bonds and therefore the metal loses its physical form and turns into the liquid.

Then the metal starts to be cooled. As the temperature decreases, the kinetic energy also falls. More particles slow down, and permanent bonds start to form between the atoms. Therefore, small freezing “seeds” came into existence and particles around them form crystals upon the seeds. As seeds slowly grow into larger and larger lattices, particles have enough time to fit into the state of minimum energy, giving the whole piece of metal a more steady structure and minimizing inner tension inside the metal. The cooling process is carefully adjusted so that every atom could end up with the least possible energy. If the process is run too quickly, the result would not be desired.

In simulated annealing, the same method applies. Instead of working with real metal, we treat the problem like an atomic thermodynamic system, “crystallizing” the coefficients of our error function into their lowest “energy” state. A typical simulated annealing includes a number of consecutive jumps across the plot of our error function. The amplitude of each jump is determined by the current temperature of the system. The following is an example of a simulated annealing:

The horizontal axis is the possibility of different coefficients in our error function and the vertical axis is the fitness of our model. In other words, the bigger the value in the graph, the better the coefficients fits the data, and the lower the error.

We start at point 1, which is completely chosen at will. We made a random jump towards point 2. Note that with a high system temperature, such large-scaled jumps are allowed, though it is really likely to end up with a worse landing point than the starting. Now we continue the jump to point 3, which is actually worse than point 2. No worries, “it’s gonna get worse before it gets better”, as the old saying goes. When we accumulate more jumps, the temperature of the system decreases, limiting the amplitude of the jumps. After numerous jumps, we could finally reach point 9, which is the global maximum and is where we end the algorithm as the temperature reaches 0.

Simulated Annealing offers a unique interpretation of a physical model and brings it into the optimization process. The randomness included in the algorithm actually gives it a shorter solving time compared to other optimization processes, marking it with distinctive qualities.

We will continue to explore Tensorflow, the programming package that allows us to build our own artificial intelligence model, in my next post. See you very soon!

Works Cited

“Simulated Annealing, a brief introduction.” *I Eat Bugs For Breakfast*, 14 Mar. 2012, ieatbugsforbreakfast.wordpress.com/2011/10/14/simulated-annealing-a-brief-introduction/.

KC MillerThis is really interesting and neat work. The math is WAY over my head. I’m one in Calc 1, and struggling with that. We just learned about sigma summation. Great work.