WEST CHESTER, Pennsylvania — January 20, 2018 — Today, Kevin Wang, designer of Westtown Resort and Polaris, announced Argus, an innovative iOS application that uses machine learning to perform scene and object recognition and enunciates what it detects to the user.
Tensorflow is one of the most widely used programming frameworks for algorithms with a large number of mathematical operations and computations. Specifically, Tensorflow is designed for the algorithms of Machine Learning. Tensorflow was first developed by Google and its source code soon became available on Github, the largest open-source code sharing website. Google uses this library in almost its all Machine Learning applications. From Google photos to Google Voice, we have all been using Tensorflow directly or indirectly, while a fast-growing group of independent developers incorporates Tensorflow into their own software. Tensorflow is able to run on large clusters of computing hardware and its excellence in perceptual tasks gives it an edge to Tensorflow in competitions against other Machine Learning libraries.
In this blog, we will explore the conceptual structure of Tensorflow. Although Tensorflow is mostly used along with the programming language Python, only fundamental knowledge of computer science is needed for you to proceed further in this blog. As its name suggests, Tensorflow comprises two core components: the Tensors and the computational graph (or “the flow”). Let me briefly introduce each of them.
Mathematically speaking, a Tensor is an N-Dimensional vector representing a set of data in the N-Dimensional space. In other words, a Tensor includes a group of points in a coordinate with N axes. It is difficult to visualize points in high dimensions, but the following examples in two or three dimensions give a good idea of how Tensors look like.
As the dimension increases, the volume of data represented grows exponentially. For example, a Tensor with form (3,3) is a matrix with 3 rows and 3 columns, while a Tensor with form (6,7,8) is a set of 6 matrices with 7 rows and 8 columns. In these cases, the form (3,3) and the form (6,7,8) are called the shape or the Dimension of the Tensor. In Tensorflow, the Tensors could be either a constant with fixed values, or a variable allowing alternations during computations.
After we understand what Tensor means, it’s time to go with the Flow. The Flow refers to a computational graph or a graph in short. Such graphs are always acyclic, have a distinct input and output, and never feed back into itself. Each node in the graph represents a single mathematical operation. It could be an addition, a multiplication, etc. Data and numbers flow from one node to the next in the form of Tensors, and the result is a new Tensor. The following is a simple computational graph.
The expression of this graph is not complicated: e = (a+b)*(b+1). Let’s start from the bottom of the graph. The nodes at the lowest level of the graph are called leaves. The leaves of the graph do not accept inputs and only provides a Tensor as output. Actually, a Tensor would not be in a non-leaf node for this reason. The three leaves are variables a and b, and a constant 1.
One level up is two operation nodes. Each one of them represents an addition. Both take two inputs from the nodes below. These middle and higher levels depend on their predecessors, for they could not be computed without the outputs from a, b, or 1. Note that both addition operations are parallel to each other at the same level: Tensorflow does not need to wait on all of them to complete before moving on to the next node.
The final node is a multiplication node. It take c and d as input, forming the expression e = (c)*(d), while c = a+b and d = b+1. Therefore, combining the two expressions, we have the final result of e = (a+b)*(b+1).
That is all for our introduction to basic Tensorflow concepts. We will discuss further advanced features of Tensorflow in later posts. Stay tuned and see you next time!
In this blog, we are going to talk about another optimization algorithm, the simulated annealing. As we mentioned last time, the goal of machine learning algorithms is to minimize the difference between the predicted values of the trained model and the actual values from either surveying or measuring or to find the minimum of the error function. Compared to gradient descent method introduced in the previous blog, simulated annealing algorithm offers a more efficient way to find the global maximum instead of a local one in a certain dataset. Though this method is not complex in nature, it requires some understanding of a field of physics that is not widely known and is a bit abstract.
As its name suggests, simulated annealing algorithm is derived from the annealing process in metallurgy. This process is a controlled heating and then cooling of metal to achieve desired properties, specifically increase the strength of the metal. First, the material is heated up to its melting point and is cast and formed. Heat, at the atomic level, is represented as the kinetic energy of particles. During this stage, all particles have a tremendous amount of kinetic energy and move rather quickly, since the hotter the material is, the greater kinetic energy the particles possess. As the particles roam through space, it is almost impossible to form chemical bonds and therefore the metal loses its physical form and turns into the liquid.
Then the metal starts to be cooled. As the temperature decreases, the kinetic energy also falls. More particles slow down, and permanent bonds start to form between the atoms. Therefore, small freezing “seeds” came into existence and particles around them form crystals upon the seeds. As seeds slowly grow into larger and larger lattices, particles have enough time to fit into the state of minimum energy, giving the whole piece of metal a more steady structure and minimizing inner tension inside the metal. The cooling process is carefully adjusted so that every atom could end up with the least possible energy. If the process is run too quickly, the result would not be desired.
In simulated annealing, the same method applies. Instead of working with real metal, we treat the problem like an atomic thermodynamic system, “crystallizing” the coefficients of our error function into their lowest “energy” state. A typical simulated annealing includes a number of consecutive jumps across the plot of our error function. The amplitude of each jump is determined by the current temperature of the system. The following is an example of a simulated annealing:
The horizontal axis is the possibility of different coefficients in our error function and the vertical axis is the fitness of our model. In other words, the bigger the value in the graph, the better the coefficients fits the data, and the lower the error.
We start at point 1, which is completely chosen at will. We made a random jump towards point 2. Note that with a high system temperature, such large-scaled jumps are allowed, though it is really likely to end up with a worse landing point than the starting. Now we continue the jump to point 3, which is actually worse than point 2. No worries, “it’s gonna get worse before it gets better”, as the old saying goes. When we accumulate more jumps, the temperature of the system decreases, limiting the amplitude of the jumps. After numerous jumps, we could finally reach point 9, which is the global maximum and is where we end the algorithm as the temperature reaches 0.
Simulated Annealing offers a unique interpretation of a physical model and brings it into the optimization process. The randomness included in the algorithm actually gives it a shorter solving time compared to other optimization processes, marking it with distinctive qualities.
We will continue to explore Tensorflow, the programming package that allows us to build our own artificial intelligence model, in my next post. See you very soon!
“Simulated Annealing, a brief introduction.” I Eat Bugs For Breakfast, 14 Mar. 2012, ieatbugsforbreakfast.wordpress.com/2011/10/14/simulated-annealing-a-brief-introduction/.
This week, we are going to talk about gradient decrease. The beauty of artificial neural network is that it utilizes a very simple algorithm to optimize itself which brings down the error of the system, the gradient decrease.
Gradient Decrease could quickly find the local minimum for a function. In the field of machine learning, this method is applied to the error function E(x) (or sometimes also called the cost function) so we could find the point of minimum error. That is exactly what we are looking for training the system.
To visualize this, let us assume a set of example data shown below. Our task is to find the line of best fit, performing a simple linear regression.
Let us start with the function for a straight line:
With the data given above, we want to determine the best m and b that best represents the points the graph. To do this, we define the measure of the “representativeness” with the error function E(m,b) that takes the two coefficients as independent variables. In this case, we will utilize the average sum of squared differences as our error function. Essentially, we take the square of all differences between the predicted values of our best fit line and the actual values and average the sum over the all N data points (xi,yi) in the dataset. If we express our error function in a mathematical way, we will see something like this:
Now with the help of MatLab, we could plot the error function in a 3-dimensional graph. We could see straight away that the global minimum point lies at the point where m=5 and b=3.
Gradient Descent offers a systematic way to find the local minimum (global if you are lucky) without it being in a specific place. Of course, gradient descent has is own limitation, but we will put back this discussion to later posts. The goal of the algorithm is to find the minimum point (m*,b*) starting at a random point (m0,b0). Recall from the multivariable calculus class, the gradient of a function at a specific point is the vector formed by the partial derivatives of the function along both axes, i.e.
The gradient vector represents the direction where the function increases the most greatly. Therefore, to find the point of the minimum we need to move along the opposite or the negative direction of the gradient vector. A more formative way to express this shown as the following:
In this function group, mj and bj are the points we start with, while mj+1 and bj+1 are that of the next step. The γ parameter represents the learning rate which controls the effect of the variable movements. Again, we will leave how to find suitable learning rate for later blogs.
The challenge of this method comes when we find the partial derivatives of our error functions with respect to the coefficients. In this case, both derivatives of our error function are the following:
Note that the average sum of squared differences is not difficult to take derivatives comparing to other ones.
The following is a single run of the gradient descent method of 20 steps with a 0.01 learning rate and we could see the algorithm approaches the real minimum quite well.
We will explore more topics on machine learning next week. Stay put!
For my last blog, I would like to share the paintings I have done so far. First is a pair of paintings of a bookshelf with a flower in a vase located next to it. Inside the bookshelf, I drew Korean traditional elements including a brush and a roll of paper. As you can see, I need some final work in painting the Westtown lamp and Westtown clock and I am going to add some decorations inside the circles of the bookshelf in ‘W’ shapes to symbolize Westtown. As I mentioned in the first blog, I intended to harmonize western elements influenced by Westtown and eastern elements from my background in Korea. Furthermore, the two paintings are both traditional and modern by keeping Korean flowers, bookshelves, and brushes while inserting the modern clock, lamp, and bench. Continue reading
Hello everyone! You may or may not be able to guess from the title, but this will be my last blog post this semester. Therefore, this post will be somewhat of a look ahead. Continue reading
In my last blog post, I detailed the implementation of machine learning models in iOS applications using the Core ML and Vision frameworks. As you probably remember from the tutorial, I implemented the Inception v3 model to give the app the ability to classify 1,000 common objects in the world. While it is true that you can easily download the model from a Github repository, have you ever wonder where it came from? In this blog post, I will introduce the “brain” behind the Inception v3 model––an artificial neural network (ANN).
Thanksgiving break was a welcome and necessary reprieve for me. Although I did not get as much actual painting done as I had wished, I got an incredible amount of planning and idea generating done. I now have a clear vision of my path in preparing for my gallery show, and an idea of what my finished body of work will look like. This past week, I finally dove in to creating this body of work. Yesterday evening I had Gwyneth sit for me, and the Alex painting is almost done. Finally, the fantastical elements are beginning to incorporate themselves into my serious portraiture. I will be beginning my self-portrait soon, and have an even better idea now of where it is going (especially seeing as I got quite a few spider-slug studies done over break).
Below is the sketch and imprimatura for the Gwyneth portrait. I’ve included the imprimatura in both warm and cool light, because I think each holds the space a little differently, which is interesting to me. I haven’t yet decided which direction I will go with my color palate, but it is good to be able to see both possibilities. In addition, there is a close up of the face detail. As can be seen in the sketch, I am planning on giving her a dragon.
Gwyneth’s hair is certainly going to be the most difficult (and most fun) aspect of this one. I am not incredibly familiar with painting hair, so I have compiled some references for myself. First, is the work of Alphonse Mucha, one of the most prominent artists in the Art Nouveau movement in the 1890’s and early 1900’s He has an incredible knack for the silhouette and shape of hair (you can find some examples here). Second is a portrait of Alice Guérin done by Paul César Helleu. This one I had not previously known of, but found in my investigating. The hair is nearly identical to that of Gwyneth’s, although not quite so curly (here is the painting). Third, we have the famous Lady of Shalott, by John William Waterhouse, another example of beautiful long red hair (find it here). Fourth, comes the work of Sandro Bottecelli, who very much enjoyed his depictions of goddesses to have long curly hair. The one painting of his which I have singled out is Pallas and the Centaur, which can be seen here. And finally, there is the work of Gustav Klimt, another famous Art Nouveau painter. I have chosen the painting Danae, which features another woman with long curly red hair (here).
To finish up this blog post, here is the progress on the Alex painting, which is getting very close to done.
Some unique little dudes have wandered their way onto her painting, and while not originally intended, I think I like them, so they will be staying.
In the light of these new paintings and plans, I think I am certainly going to revise the background of my Maggie portrait to tie in these fantastical elements and creatures.
I am excited to see what I will get done over the coming month.
…and I’ll tell you why. This semester a key part in every piece that I have done is improvisation. The guitar line and drum beats in the song were improvised. Much of my dance was choreographed in a trial and error improvisation method. I went in with a clear head, and allowed art to happen. I’ve never thought of visual art that way. It’s always been something carefully planned or meticulously copied from reality. But when I was gathering sources for my project’s Works Cited, I gathered lots of sources focused on creating visual art using musical improvisation techniques. As someone who primarily identifies herself as a musician, this speaks to me. And since reading these articles I’ve become infatuated with the idea. And so I’ve decided to try something entirely new to me. I want to improvise my final drawing. Continue reading
For this last blog post, I want to trace back to the beginning of my project and think about the future for China’s reconstruction plan for impoverished areas.
One of the initial reasons why I conducted this research project is that I felt a loss of identity after continuing my education in America for several years. I remember feeling a loss of connection to the city I grew up in, and I realized that I was forgetting the culture I was born into. Trying to remember and reconnect with my culture, I decided to conduct an independent research project on Shimenkan. Growing up in Kunming in an upper-middle-class family, my life was confined to a small circle. My research project would allow me to learn more about contemporary Chinese politics and the socioeconomic diversity of my province. Continue reading