Operational Research – Maddie Smith

Optimising Visual Art

Tue, 30 Mar 2021

I am one of those people who enjoys everything. And I mean everything. You probably know one of those people, or maybe you are one. The people who try a hobby once, become instantly hooked, and then want to buy all the equipment and become a master of this new skill – well, that’s me. Bread making, calligraphy, skiing, candle making, language learning, tennis, writing, figure skating, piano playing… you get the idea.

This also extends into my academic career; you may have read on my home page that my undergraduate degree was in Theoretical Physics, but now I study Statistics and Operational Research (check out this post about why I made the switch). Basically, I love to learn, and I love to discover as much as I can about the things I find interesting; hence my love of research.

One thing that has remained a constant love throughout my life, though, is art. I love to paint (in watercolours, acrylics, dyes and oils – I told you, I enjoy everything), and also draw. I recently read a blog post by my friend Lídia on the role of Operational Research in music – a truly interesting post – and it got me thinking: is there a link between Operational Research and art too?

Optimisation

Optimisation is a branch of Operational Research which considers the task of achieving optimal performance subject to certain constraints. Mathematically, optimisation problems deal with maximising or minimising some function with respect to some set; for example, given a vector of decision variables x = (x1, x2, x3, …, xn), we want to find the values of those variables that maximise or minimise the objective function.

The objective function could be the waiting time for customers in a system, or the profit from the sale of a product.

The subset which represents the allowable choices of decision variables is called the feasible set. Constraints in an optimisation problem specify what the feasible set is. For example, if we consider the waiting time for customers in a system, one constraint is given by the fact that the number of customers in a system can never be negative. This is a simple example of a constraint that one would encounter in an optimisation problem, but in reality there are often many constraints that can be complex and high-dimensional.
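As a minimal sketch in Python (the quadratic objective and the feasible set here are purely illustrative, not from any real system), an optimisation problem just needs an objective, a feasible set, and a search for the best feasible choice:

```python
# Minimise an illustrative objective f(x) over a feasible set
# defined by a simple constraint: x >= 0 (think "the number of
# customers in a system can never be negative").

def objective(x):
    """Illustrative objective: a simple quadratic in x."""
    return (x - 3) ** 2 + 1

# The feasible set: candidate decisions that satisfy the constraint.
feasible_set = [x for x in range(-5, 11) if x >= 0]

# The optimal decision is the feasible x minimising the objective.
best_x = min(feasible_set, key=objective)
print(best_x, objective(best_x))  # x = 3 gives the minimum value 1
```

Real problems have many decision variables and constraints, but the shape is the same: an objective, a feasible set, and a search for the best feasible point.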

So how does this relate to art?

Well, clearly when an artist works, they are subject to some real-world constraints. They are required to work within budgets, meet deadlines, and follow the instructions of the customer in the case of commissions.

But, perhaps more interestingly, some artists subject themselves to constraints voluntarily. A Sunday on La Grande Jatte (1884) is a work by the French artist Georges-Pierre Seurat. From a distance, the painting depicts Parisians having what looks to be a lovely day at the park. When viewed up close, however, it is possible to see that the painting is made up of many tiny dots of multicoloured paint.

If we think about the creation of this painting as an optimisation problem, we could say that Seurat’s objective was to create the best possible depiction of what he saw, subject to two key constraints: only applying the paint in tiny dots, and keeping his colours separate. 

I came across a paper by Craig S. Kaplan and Robert Bosch (see the further reading), which discusses applying optimisation algorithms in order to create computer-generated artwork. This general idea is considered for three different branches of art: line art, tile-based methods, and freeform arrangement of elements. For this post, we’re going to be considering tile-based methods, or mosaics.

When photographs or other images are represented digitally, we can think of this as a function which maps each pixel location (x, y) to some colour space. This is easy to handle in a mathematical context, since colours can be described using an RGB colour model or similar. Once a photograph is represented in this form, an artist can construct a new artistic version of it by replacing each pixel or block of pixels with some object, known as a tile.

The need for optimisation becomes apparent when an artist is choosing from a finite selection of tiles (where tiles could be anything from dominoes and Lego pieces to other photographs). If the artist only has access to a fixed selection of tiles, then the goal is to find the ‘best’ overall arrangement of this set of tiles, in order to produce the most aesthetically pleasing artwork. Mathematically, the most ‘aesthetically pleasing’ artwork may be taken to be the one which is most similar to the original image.

Let I be a W x H image, and suppose that we have an inventory of tiles T = {T1, T2, …, Tn}. How do we quantify how similar an approximation is to the original reference image? In the paper, a distance function d(I(x, y), Tj) is introduced, where I(x, y) is the colour of the pixel at location (x, y) in the original image, and Tj is the colour of a particular tile. This distance function provides a quantitative measure of how effectively a given tile approximates a pixel in the source image.
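As a sketch, assuming a plain Euclidean distance in RGB space (one common choice; the paper may define d differently):

```python
import math

def d(pixel_colour, tile_colour):
    """Euclidean distance between two RGB colours.

    Returns 0 when the tile and pixel colours match exactly;
    larger values mean a worse approximation.
    """
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pixel_colour, tile_colour)))

# A white pixel is perfectly approximated by a white tile...
print(d((255, 255, 255), (255, 255, 255)))  # 0.0
# ...and poorly approximated by a black one.
print(d((255, 255, 255), (0, 0, 0)))        # about 441.7
```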

We assume that d(I(x, y), Tj) is never negative, and that smaller values denote better approximations (a distance of zero would indicate that the tile and the corresponding source pixel are exactly the same colour). If we consider all possible ways in which our inventory of tiles may be arranged on our image (where each tile is used no more than once), that is, all possible mappings, we then seek the mapping which minimises the sum of the distance functions over every pixel.

In the case that our tiles match up with the dimensions of the pixels exactly, this proves to be a relatively simple task. It is possible to solve this by constructing a complete, weighted bipartite graph, in which each pixel location (x, y) is connected to every tile Tj by an edge of weight d(I(x, y),Tj). The minimum weight matching which uses all pixels can then be computed to give the optimal solution.
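For a toy instance, this minimum-weight matching can even be found by brute force over all assignments (a real implementation would use a proper matching algorithm such as the Hungarian method; the tiny four-pixel “image” and greyscale tiles below are entirely made up for illustration):

```python
import itertools

# Greyscale stand-ins for colours, to keep the example tiny:
# a 2x2 "image" flattened into four pixel intensities...
pixels = [10, 200, 90, 160]
# ...and an inventory of four tiles, one per pixel.
tiles = [100, 0, 150, 210]

def d(p, t):
    """Distance between a pixel intensity and a tile intensity."""
    return abs(p - t)

# Try every one-to-one assignment of tiles to pixels and keep the
# one minimising the total distance -- this is exactly the minimum
# weight matching in the bipartite pixel/tile graph.
best = min(
    itertools.permutations(range(len(tiles))),
    key=lambda perm: sum(d(p, tiles[j]) for p, j in zip(pixels, perm)),
)
print(best)  # the tile index assigned to each pixel
```

Brute force checks n! assignments, which is hopeless beyond a handful of tiles; the bipartite-matching formulation is what makes the full-size problem tractable.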

This task becomes far more complex when the tiles do not match up one-to-one with the image pixels, for example, if the artist were to use something like dominoes. However, the paper shows that the construction of domino art can be naturally reduced to an integer programming problem.

Isn’t it fascinating how optimisation techniques can be used to create impressive artwork? I certainly think so. Some people might think that applying mathematical techniques such as integer programming to create art takes away from the spontaneity and skill that goes into creating a masterpiece, but I actually think it only adds to the awe! What do you think? Let me know in the comments.

Make sure to check out the further reading if you are interested in finding out more!

Further Reading

– Craig S. Kaplan, Robert Bosch

Learn From Your Mistakes – Multi-armed Bandits

Tue, 02 Feb 2021

In a recent talk given to the MRes students, I was asked for my opinion on a multi-armed bandit problem. In these working-from-home times, I’m sure most of us know of the combined dread and panic that comes with taking your microphone off mute to speak on a call. I contemplated the question, and then gave my answer. As you might have guessed from the title of this post, I was wrong. But I certainly wouldn’t get this problem wrong again, because I had learned from my mistake.

Ironically enough, learning from your mistakes, or past behaviour, is an idea that is strongly rooted in the multi-armed bandit problem. And thus, a blog post was inspired! 

Multi-armed bandits

So, let’s get started. What exactly is a multi-armed bandit?

When most people hear ‘multi-armed bandit’, they may think of gambling. That is because a one-armed bandit is a machine where you can pull a lever, with the hopes of winning a prize. But of course, you may not win anything at all. It is this idea which constitutes a multi-armed bandit problem.

Multi-armed bandit problems are a class of problems where we can associate a particular score with each of our decisions at each point in time. This score includes the immediate benefit or cost of making that decision, plus some future benefit.

Imagine that we have K independent one-armed bandits, and we get to play one of these bandits at each time t = 0, 1, 2, 3, …. These are very simple bandits, where we either win or lose upon pulling the arm. We’ll define losing as simply winning nothing.

Now, if your win occurs at some time t, then you gain some discounted reward a^t, where 0 < a < 1. Clearly, rewards are discounted over time; this means that a reward in the future is worth less to you than a reward now. The mathematically minded among you may realise that this means that the total reward we could possibly earn is bounded.
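The bound follows from the geometric series: even a win at every single time step earns at most 1 + a + a^2 + … = 1/(1 − a). A quick numerical check, with an illustrative discount factor a = 0.9:

```python
a = 0.9  # illustrative discount factor, with 0 < a < 1

# Total reward from the best possible run: a win at every
# time t = 0, 1, ..., T - 1.
T = 1000
total = sum(a ** t for t in range(T))

# The geometric series bound 1 / (1 - a).
bound = 1 / (1 - a)
print(total, bound)  # the partial sum approaches the bound, here 1/(1 - 0.9) = 10
```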

The probability of a success upon pulling bandit i is unknown, and denoted by πi. Since these success probabilities are unknown, we have to learn what each πi might be, and profit from the bandits as we go. This means that in the early stages we have to pull some of the arms just to see what the πi values might be like.

At each time t, our choice of which bandit to play is informed by each bandit’s record of successes and failures to date. For example, if I know that bandit 2 has given me a success every time I pulled it, I might be inclined to pull bandit 2 again. On the other hand, if bandit 4 has given me a failure most of the time, I might want to avoid this bandit. Thus, we are using previous data which we have obtained about each of the bandits in order to update our beliefs about the bandits’ probability distributions.
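Keeping each bandit’s record is just a matter of counting successes and failures and comparing the observed success rates (a hypothetical sketch; the bandit names and counts below are made up):

```python
# Record of (successes, failures) observed so far for each bandit.
records = {
    "bandit 2": (3, 0),  # a success every time it was pulled
    "bandit 4": (1, 4),  # mostly failures
}

def empirical_rate(successes, failures):
    """Observed proportion of successes for one bandit."""
    return successes / (successes + failures)

# A purely greedy player pulls the arm with the best record so far.
greedy_choice = max(records, key=lambda b: empirical_rate(*records[b]))
print(greedy_choice)  # "bandit 2"
```

As the post goes on to explain, playing purely greedily like this ignores arms we know nothing about, which is exactly where it goes wrong.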

The Maths

Updating our beliefs about the probability distributions of the bandits in this way uses the Bayesian interpretation of statistics. Let’s imagine that we have a parameter that we want to determine (in our case, the probability of success for each of the K bandits). Maybe we have some prior ideas about what the probability of success for each bandit will be. This could be due to previous experiments we know about, or maybe just personal beliefs. These prior beliefs are described by the prior distributions for the parameters.

Then, let’s imagine we begin our bandit experiment. After time t, we have pulled an arm t times, and we now have information detailing the number of successes and failures for each of the bandits pulled. At this point, we take this observed data into account in order to update what we think the probability distributions for the parameters look like. These updated distributions are called the posterior distributions.
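For success/failure data like ours, the standard conjugate choice is a Beta prior: starting from Beta(α, β) and observing s successes and f failures, the posterior is simply Beta(α + s, β + f). A sketch, with an illustrative uniform prior and made-up counts:

```python
# Start from a uniform Beta(1, 1) prior on a bandit's success
# probability -- i.e. no opinion at all before playing.
alpha, beta = 1, 1

# Observe some pulls: say 3 successes and 1 failure.
successes, failures = 3, 1

# Conjugate update: the posterior is again a Beta distribution.
alpha_post = alpha + successes
beta_post = beta + failures

# Posterior mean estimate of the success probability.
posterior_mean = alpha_post / (alpha_post + beta_post)
print(alpha_post, beta_post, posterior_mean)  # 4 2 0.666...
```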

The Question…

How do we play the one-armed bandits in order to maximise the total gain? 

Well, this is where the importance of learning from your mistakes comes in. Imagine the case where K = 5, and at time t = 7 our observed data are as follows:

Bandit | Times pulled | Successes | Failures
1      | 4            | 3         | 1
2      | 0            | 0         | 0
3      | 1            | 1         | 0
4      | 2            | 1         | 1
5      | 0            | 0         | 0

Looking at our observed data, bandit 1 has the highest proportion of successes out of the number of times it was played. So maybe we would want to pull Bandit 1 again at our next time?

Well, maybe not.

Notice that Bandits 2 and 5 haven’t been played yet at all; therefore we can’t really infer anything about them. Pulling Bandit 1 might give us a success on our next attempt, but it could also give us a fail. We know nothing about Bandits 2 and 5; perhaps these bandits have a probability of success of 1?

The idea of pulling the best bandits to maximise your expected number of successes relies on a balance between exploration and exploitation. Exploration refers to playing bandits that we don’t know much about, in this case, pulling arms 2 and 5. Exploitation, on the other hand, means pulling the arms of bandits that we already know might give us a good result; in our case, pulling bandit 1 again. 

Every time we pull an arm, we are actually receiving both a present benefit and a future benefit. The present benefit is what we win, and the future benefit is what we learn. Recall that if we win at time t, we receive a discounted reward of a^t. The value of a therefore determines how important the future is: if a is close to 1, many future pulls will still make a big difference, whereas if a is close to 0, the present benefit dominates. Again, this relates to the importance of balancing exploration and exploitation.

So how do we quantify both the immediate benefit and the future benefit from these bandits, in such a way so that we can play the bandits to maximise our total gain? It is possible to take the posterior distribution for each bandit at each point in time, and associate a score with it that represents both future and present benefit of pulling that bandit. 

It turns out that there is something called the Gittins index, a value that can be assigned to every bandit. By looking at the Gittins indices of all K bandits at time t, it is Bayes optimal to play any bandit which has the largest index, which is pretty cool!
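Computing Gittins indices is quite involved, but a closely related Bayesian heuristic, Thompson sampling, is easy to sketch: draw a success probability from each bandit’s posterior and play the arm with the largest draw. To be clear, this is not the Gittins index policy, just a simpler cousin that balances exploration and exploitation in the same spirit; the posterior parameters below are illustrative.

```python
import random

random.seed(42)  # reproducible illustration

# Posterior Beta(alpha, beta) parameters for K = 3 bandits,
# reflecting their success/failure records so far.
posteriors = [(4, 2), (1, 1), (2, 5)]

def thompson_choice(posteriors):
    """Sample from each posterior; play the arm with the biggest draw."""
    samples = [random.betavariate(a, b) for a, b in posteriors]
    return max(range(len(samples)), key=lambda i: samples[i])

# Unplayed bandits (like the Beta(1, 1) arm here) still get sampled,
# so the policy keeps exploring arms it knows little about.
choice = thompson_choice(posteriors)
print(choice)
```

Because the choice is random, arms with uncertain posteriors occasionally win the draw, which is exactly the exploration the greedy policy lacks.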

So there we have it: learning from past experiences is critical in the world of multi-armed bandits, and also in many other areas of mathematics, including reinforcement learning and Bayesian optimisation. The trade-off between exploration and exploitation is central, and perhaps even serves as a reminder of the importance of not being afraid to make mistakes in our everyday lives… just a thought.

If you enjoyed this post, be sure to check out the following further reading resources!

Further Reading

This book gives a good introduction to multi-armed bandits for those interested, and goes into detail on some of their many uses.

These resources are really great for those of you looking for a gentle introduction to the Gittins index and multi-armed bandits, and make a good starting point for those new to the subject!

There are also further resources for those of you interested in exploration vs exploitation in reinforcement learning, and in Bayesian optimisation.
