There are two widely accepted definitions of machine learning. The phrase was first coined in 1959 by computer scientist Arthur Lee Samuel, who trained a computer program to play checkers against humans. He later described his work as “the field of study that gives computers the ability to learn without being explicitly programmed.” Decades later, Professor Tom Mitchell offered a more modern and formal definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

To illustrate Mitchell’s definition of machine learning, let’s use training a computer to identify people in images as an example. The “T”, in this case, is the task of identifying people in images. The “E” is the experience gained from viewing many images in which people have already been identified. The “P”, or performance measure, is the probability that the program correctly identifies people in images.

Depending on the task and problem, there are three main types of machine learning methods: supervised learning, unsupervised learning, and reinforcement learning (which I will not discuss in this blog post). In supervised learning, the relationship between inputs and outputs is already known. The computer program is given sets of input parameters along with their desired outputs and infers a general function that describes the relationship between the inputs and the output. In other words, the program looks at “correct” data sets to model the relationship between inputs and outputs.

Specifically, there are two types of tasks that are best trained through supervised learning: regression and classification. The main difference between the two is that regression is used to predict “continuous values,” whereas classification is used to predict “discrete values.” Let’s say I want to write a program that predicts the temperature of my ramen noodles based on the time elapsed since they were taken out of the microwave. This is a regression problem because the temperature of the noodles as a function of elapsed time is a continuous output, meaning that the output can take *any* value in the given range. On the other hand, the output of a classification problem can only be discrete values, meaning that the output can only take *certain* values. For example, consider predicting the result of a soccer game: there are only three potential outcomes (win, lose, or tie), which can be represented as 0, 1, and 2. This is therefore a classification problem because we are classifying the outcome of a soccer game into three discrete categories.
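To make the regression example concrete, here is a minimal sketch in Python that fits a straight line to some ramen cooling readings using ordinary least squares. The numbers, and the assumption that the cooling is roughly linear over this short range, are made up purely for illustration.

```python
# Hypothetical readings: minutes since the ramen left the microwave -> temperature in °C.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
temps = [90.0, 78.0, 68.0, 59.0, 51.0]

def fit_line(xs, ys):
    """Ordinary least squares for a line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the intercept pins the line at the means.
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

a, b = fit_line(times, temps)

def predict(minutes):
    """Predicted temperature at any elapsed time, not just the measured ones."""
    return a * minutes + b
```

With these readings, the fitted slope comes out to about -9.7 °C per minute, and `predict(2.5)` returns roughly 64 °C. The key point is that `predict` accepts *any* time in the range, which is exactly what makes this a regression rather than a classification problem.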

In unsupervised learning, we have no prior knowledge of the relationship between the data points or of the desired output. No feedback is given to the program during training. The program is therefore designed to *derive* a structure from the data on its own. The most common approach to unsupervised learning is called clustering: data points are grouped into clusters to help us understand the relationships between them. To better explain unsupervised learning and clustering, we can use the example of the friend discovery feature used by many social networks.

In the graph above, we take a collection of friends from a specific user and plot them based on information such as age, location, gender identity, education, and number of friends. Through unsupervised learning and clustering, the data points (or friends) are grouped into three distinct clusters. Clustering helps social networks such as Facebook and Instagram understand the relationships among a user’s friends and offer friend suggestions. There are also non-clustering approaches to unsupervised learning, which I will discuss in future blog posts. If you are interested, please check out the renowned “cocktail party algorithm.”
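As a concrete sketch of clustering, here is a tiny k-means implementation in Python (k-means is one common clustering algorithm, not necessarily what any particular social network uses). The two-dimensional “friend” points and the starting centroids below are made-up values chosen so the three groups are easy to see.

```python
def kmeans(points, centroids, iters=10):
    """Cluster 2-D points around the given starting centroids."""
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (keep the old centroid if its cluster ended up empty).
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up (age-like, friend-count-like) points forming three visible groups.
friends = [(1, 1), (1, 2), (2, 1),   # group near the origin
           (8, 1), (9, 1), (8, 2),   # group to the right
           (5, 9), (5, 8), (6, 9)]   # group at the top

centroids, clusters = kmeans(friends, centroids=[(0, 0), (10, 0), (5, 10)])
```

After a few iterations, the nine points settle into three groups of three, mirroring the three distinct clusters in the graph. In a real friend-discovery system, each point would of course have many more dimensions than two.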

In next week’s blog, I will introduce an iOS app I am working on that utilizes machine learning and the CoreML framework. Thanks for reading, and see you next week!

Works Cited

*How USA Swimming Trains with BMW’s Motion Tracking System for the 2016 Olympics*. 18 July 2016. *TechCrunch*, techcrunch.com/2016/07/18/how-usa-swimming-trains-with-bmws-motion-tracking-system-for-the-2016-olympics/. Accessed 24 Sept. 2017.

Ng, Andrew. “Supervised Learning.” *Coursera*, http://www.coursera.org/learn/machine-learning/lecture/1VkCb/supervised-learning. Accessed 24 Sept. 2017. Lecture.

—. “Unsupervised Learning.” *Coursera*, http://www.coursera.org/learn/machine-learning/lecture/olRZo/unsupervised-learning. Accessed 24 Sept. 2017. Lecture.