Category Archives: Communications

Moving below the surface (2): Cross-entropy — William

In my previous post, cross-entropy was presented as a cost function that measures the difference between a model's outputs and the given outcomes in supervised learning. It is also an important concept in information theory.

Before we talk about what cross-entropy is, let me introduce John, an imaginary freshman entering Westtown. He loves desserts and talks about them a lot. In fact, he says nothing but four words: ice-cream, chocolate, cookie, and yogurt. When he is back home after school, however, John only uses binary code to text his classmates. So, the texts his classmates receive look like this:

binary_bits

To understand what the messages mean, John and his classmates need an established code system, a way to map sequences of binary bits into words. Here is a simple example:

Code_mapping

With the code system in hand, John and his classmates can encode text messages by simply substituting codewords for words, and decode them by doing the reverse.

encoding
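The substitution described above can be sketched in a few lines of Python. The 2-bit mapping below is a hypothetical stand-in for the table in the figure, and the names `CODE`, `encode`, and `decode` are chosen only for this illustration.

```python
# Hypothetical fixed-length code: every word gets a 2-bit codeword.
# The real table lives in the figure; this particular mapping is an assumption.
CODE = {"ice-cream": "00", "chocolate": "01", "cookie": "10", "yogurt": "11"}
DECODE = {bits: word for word, bits in CODE.items()}


def encode(words):
    """Replace each word with its codeword and join everything into one bit string."""
    return "".join(CODE[w] for w in words)


def decode(bits, width=2):
    """Cut the bit string into fixed-width chunks and look each chunk up."""
    return [DECODE[bits[i:i + width]] for i in range(0, len(bits), width)]


message = ["ice-cream", "chocolate", "ice-cream", "yogurt"]
bits = encode(message)
print(bits)          # 00010011
print(decode(bits))  # the original four words, in order
```

Because every codeword has the same width, the receiver knows exactly where one word ends and the next begins.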

It turns out that John does not use his four words equally often. As an ice-cream fan, John shares his ice-cream-loving moments all the time. He sometimes mentions eating chocolate ice-cream with his family, and he rarely mentions anything else. So, his word frequency chart looks like the following:

milton_word.png

Now we can plot John's word frequency against the number of binary bits in each word's codeword. The area formed represents the average number of bits we use to send each of John's words. When the code is the best possible one for his word frequencies, this area is formally called the entropy.

milton_code.png

Since typing and sending text messages takes time, John and his classmates want to reduce the average codeword length so that they spend as little time as possible texting the same number of words. This is where variable-length code comes into play: it maps commonly used words (like “ice-cream”) to shorter codewords and less frequently used ones (like “cookie” or “yogurt”) to longer codewords. So, we have a new mapping between words and codewords:

new_code.png

Side note: the mapping is not randomly picked. It is designed as a function of the word frequency, and in such a way that it is uniquely decodable and does not cause confusion when splitting a message into codewords.
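One standard way to guarantee unique decodability is to make sure no codeword is the beginning of another (a prefix-free code). The sketch below checks that property and decodes a message bit by bit. The specific codewords are an assumption for illustration, since the actual table is only shown in the figure.

```python
# Hypothetical variable-length code; the real table is in the figure above.
CODE = {"ice-cream": "0", "chocolate": "10", "cookie": "110", "yogurt": "111"}


def is_prefix_free(code):
    """True if no codeword is the beginning of a different codeword."""
    words = list(code.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)


def decode(bits, code):
    """Read bits left to right, emitting a word whenever the buffer matches a codeword."""
    lookup = {cw: word for word, cw in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in lookup:
            out.append(lookup[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a complete codeword")
    return out


print(is_prefix_free(CODE))        # True
print(decode("0100110111", CODE))  # ice-cream, chocolate, ice-cream, cookie, yogurt
```

With a prefix-free code the receiver never has to look ahead: as soon as the buffer matches a codeword, that word is the only possible reading.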

Again, we can plot the word frequency against the number of binary bits. Here is what it looks like:

new_code.png

The result is amazing: we have reduced the average length to 1.75 bits per word! This is the optimal code in this case. Encoding John's words with it takes the minimum number of bits on average, so his text messages will be as concise as possible.
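The 1.75-bit figure can be checked with a short calculation. The word frequencies and codeword lengths below are assumptions: they are not stated in the text, but they are chosen to be consistent with the 1.75 bits quoted above. The average length is the sum of frequency times codeword length, and the entropy is the sum of frequency times log2(1 / frequency).

```python
from math import log2

# Assumed word frequencies for John (the chart itself is only in the figure).
john = {"ice-cream": 1 / 2, "chocolate": 1 / 4, "cookie": 1 / 8, "yogurt": 1 / 8}
# Assumed codeword lengths, in bits, for the variable-length code.
variable_lengths = {"ice-cream": 1, "chocolate": 2, "cookie": 3, "yogurt": 3}
# A fixed-length code spends 2 bits on every word.
fixed_lengths = {word: 2 for word in john}


def average_length(freq, lengths):
    """Expected number of bits per word: sum of frequency * codeword length."""
    return sum(p * lengths[word] for word, p in freq.items())


def entropy(freq):
    """Shannon entropy in bits: the smallest achievable average length."""
    return -sum(p * log2(p) for p in freq.values() if p > 0)


print(average_length(john, fixed_lengths))     # 2.0
print(average_length(john, variable_lengths))  # 1.75
print(entropy(john))                           # 1.75
```

Under these assumed frequencies the variable-length code reaches the entropy exactly, which is why no other code can do better on average.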

Soon after school starts, John meets Juliet, another freshman at Westtown. Juliet is a chocolate-lover. She talks about chocolate all day, and her favourite is chocolate chip cookies. Juliet despises ice-cream, though, and mentions it only when necessary. Despite this, they share an obsession with dessert and, interestingly, the same limited vocabulary.

Juliet_word.png

When Juliet starts to use John’s code, unfortunately, the text messages she sends are much longer than John’s, since the two have different word frequencies. If we plot Juliet’s word frequency against the codeword lengths of John’s code, we can see that her average message length is as long as 2.25 bits per word! We call Juliet’s average message length using John’s code system the cross-entropy.
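Here is a minimal sketch of that calculation. Both frequency tables are assumptions (the charts are only shown as figures), picked so that they reproduce the 2.25 bits quoted above. An optimal code for John gives a word with frequency q a codeword of about log2(1 / q) bits, so Juliet's average length under John's code is the sum of her frequency times log2(1 / John's frequency).

```python
from math import log2

# Assumed word frequencies; neither table is stated in the text.
john = {"ice-cream": 1 / 2, "chocolate": 1 / 4, "cookie": 1 / 8, "yogurt": 1 / 8}
juliet = {"ice-cream": 1 / 8, "chocolate": 1 / 2, "cookie": 1 / 4, "yogurt": 1 / 8}


def cross_entropy(p, q):
    """Average bits per word when words follow p but the code is optimal for q."""
    return -sum(p[word] * log2(q[word]) for word in p if p[word] > 0)


print(cross_entropy(juliet, john))    # 2.25  (Juliet using John's code)
print(cross_entropy(juliet, juliet))  # 1.75  (Juliet using a code built for her)
```

The gap between the two numbers is the price Juliet pays for using a code that was designed for someone else's word frequencies.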

So, why do we care about cross-entropy in supervised machine learning? Well, cross-entropy gives us a way to measure the difference between the result our model produces and the provided outcome. Since both the result and the provided outcome can be expressed as frequency charts or probability charts, cross-entropy fits its role as a cost function well. With the cost calculated, we can adjust the parameters to achieve a best-fit model for the given outcomes.
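As a rough illustration of cross-entropy as a cost, suppose a model must guess which of the four words comes next. The provided outcome is written as a probability chart with all the weight on the correct word, and the model's result is its own probability chart. The function name, the example numbers, and the use of the natural logarithm (common in machine learning libraries) are all assumptions made for this sketch.

```python
from math import log


def cross_entropy_loss(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one (in nats).

    eps guards against log(0) when the model assigns zero probability.
    """
    return -sum(t * log(max(p, eps)) for t, p in zip(target, predicted))


# Order of entries: ice-cream, chocolate, cookie, yogurt (hypothetical example).
target = [0.0, 1.0, 0.0, 0.0]             # the provided outcome: "chocolate"
good_guess = [0.05, 0.85, 0.05, 0.05]     # model is fairly sure it is chocolate
bad_guess = [0.70, 0.10, 0.10, 0.10]      # model mostly expects ice-cream

print(cross_entropy_loss(target, good_guess))  # small cost
print(cross_entropy_loss(target, bad_guess))   # much larger cost
```

The cost shrinks as the model's chart moves toward the provided outcome, which is what makes it useful for adjusting parameters.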

See you next week!

The Importance of Press – KC

On October 2nd, 2017, a Q&A about the work I’ve been doing was posted on Vice News. It was later added to their national Snapchat story. It’s hard to say how many people saw the article, but this was national coverage, which means a TON of people saw it all across the country and perhaps the world.

Let’s look at the numbers we do have:

We can use Facebook’s article tracking feature to see how many times it was simply shared on the popular social media site. In the past two weeks it has been shared by nearly 2,000 people and popular pages.

Screenshot at Oct 15 19-54-20.png

I don’t have any way to quantify shares on other post-based social media websites like Twitter, but this gives an audience range on one.

We can, however, extract a few numbers from Snapchat’s stories. Vice News is one of the most popular Snapchat stories, and while the app does not release official viewership counts, NBC released their own count earlier this year.

According to Variety, the multi-media giant garnered a whopping 29 million views in the first month of starting their new Snapchat story. While this number is probably inflated because of first-month promotion, it lets us see the number of people who are tuning into a specific story, and a new one at that.

It is safe to say that over a hundred thousand people saw the story on Vice. We don’t have any way of quantifying the number of people who then chose to read the article, but they were all able to see this video:

 

So why do these numbers matter? It’s simple: good press is one of the most crucial parts of any organization or movement. In the two weeks since my article dropped, my mailbox has been flooded with new people wanting to get involved. Leading activists in my field have begun reaching out to partner.

I’m really excited about working with these people and continuing to build my organization. To those trying to build something new, I suggest you start working on news coverage. Reach out to local reporters or people who frequently write about related topics. Start sending press releases when new things happen inside your organization.

These kinds of articles will help propel your message and build a wider audience.

Quakerism’s Influence on my Activism – KC

My interview about Keystone Coalition for Advancing Sex Education was just published today in Vice News’ Broadly section, titled “This Teen is Paving the Way for LGBTQ Inclusive Sex Ed in Schools”. The following post is a follow-up on the interview, which you can read here.

One of the questions I was asked for my Q&A was “Westtown is a Quaker school. Were you raised Quaker? If so, how has that influenced your path regarding Keystone CASE (if at all)? If not, how has your Quaker schooling influenced your path in any way (if at all)?” Continue reading

Moving below the surface: Artificial Neural Networks — William

As mentioned before, we will be discussing artificial neural networks in this blog. Being a significant subject in the field of supervised machine learning, neural networks excel at solving classification problems, and, when combined with convolution integrals, are the most popular model for image classification tasks. Continue reading

The Game Theory of Community Weekend Event–Summer

Today, I’m going to examine with you, through the lens of game theory, some of the most memorable times at Westtown: Community Weekend Events!

As we all know, as boarders at Westtown, we are required to attend four community weekend events each year. In these events, we have the common goal of having a great time and building a tighter community. But what makes these events fun? Continue reading

Simple Linear Regression in Machine Learning – Kevin

You might remember linear regression from statistics as a method to produce a linear equation that models the relationship between two variables. Not surprisingly, linear regression is quite similar in machine learning, except that the focus is on the prediction rather than the interpretation of data. Regression is a supervised learning algorithm (if you remember from my previous blog) that predicts real-valued output when given an input. In this blog post, I will discuss the model representation of simple linear regression and introduce its cost function.

Continue reading

Wetting the Feet: An Introduction to Machine Learning — William

As we discussed in the previous post, machine learning is one of the main branches of artificial intelligence, in which we aim to build a rational agent. Machine learning is essential to the implementation of artificial intelligence, for it allows agents to adapt to different scenarios, as well as predict changes in the evolving environment around them. Continue reading

Diving Deep Into Deep Learning: Dipping a Toe in the Water — William

Since its birth in the mid-20th century, artificial intelligence has been present in various aspects of popular culture, from Isaac Asimov’s Three Laws of Robotics in Handbook of Robotics and HAL in 2001: A Space Odyssey, to The Terminator and The Matrix, and in recent years it has even influenced our way of life through AlphaGo and Siri. But what is artificial intelligence? Or in other words, what is the ultimate goal for artificial intelligence? Continue reading

A Brief Introduction to Machine Learning – Kevin

There are two widely accepted definitions of machine learning. The phrase was first coined in 1959 by computer scientist Arthur Lee Samuel, who trained a computer program to play checkers with humans. He later described his work as “the field of study that gives computers the ability to learn without being explicitly programmed.” Decades later, Professor Tom Mitchell gave a more modern and formal definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
Continue reading

How Hannibal defeated Game Theory—Summer

hannibal-300x212.jpg

Today, I am going to examine with you, through the lens of game theory, the most famous war in my favorite era (Classical Antiquity!), the Second Punic War between Carthage and Rome, and in particular Hannibal’s invasion of Roman territory through the Alps. Continue reading