Matt Randall /stor-i-student-sites/matthew-randall PhD Student, STOR-i Fri, 06 Oct 2023 14:16:33 +0000 en-US hourly 1 https://wordpress.org/?v=6.9.4 Insertion Heuristics for a Class of Dynamic Vehicle Routing Problems – Poster /stor-i-student-sites/matthew-randall/2023/10/06/insertion-heuristics-for-a-class-of-dynamic-vehicle-routing-problems-poster/?utm_source=rss&utm_medium=rss&utm_campaign=insertion-heuristics-for-a-class-of-dynamic-vehicle-routing-problems-poster /stor-i-student-sites/matthew-randall/2023/10/06/insertion-heuristics-for-a-class-of-dynamic-vehicle-routing-problems-poster/#respond Fri, 06 Oct 2023 14:15:27 +0000 /stor-i-student-sites/matthew-randall/?p=315

In April I was fortunate enough to get to present a poster at the in Birmingham. The poster deals with insertion heuristics for dynamic vehicle routing problems and is work that has gone towards my PhD on attended home delivery. In particular we see the potential benefits of utilising a Sum of Squared distances to construct routes quickly in parallel. Please see this below, and if you find it interesting look out for my upcoming paper on the topic.

]]>
/stor-i-student-sites/matthew-randall/2023/10/06/insertion-heuristics-for-a-class-of-dynamic-vehicle-routing-problems-poster/feed/ 0
Data on the Lake Poster – Slot Offering for Online Shopping /stor-i-student-sites/matthew-randall/2022/08/01/data-on-the-lake-poster-slot-offering-for-attended-home-delivery/?utm_source=rss&utm_medium=rss&utm_campaign=data-on-the-lake-poster-slot-offering-for-attended-home-delivery /stor-i-student-sites/matthew-randall/2022/08/01/data-on-the-lake-poster-slot-offering-for-attended-home-delivery/#respond Mon, 01 Aug 2022 11:12:44 +0000 /stor-i-student-sites/matthew-randall/?p=285

In April I was fortunate enough to go to the Lake District for an event hosted by STOR-i and the from the Republic of Ireland. At this event I got the opportunity to present a poster, which can be found below. The poster is about my PhD project on slot offering for the , and the work I had done on it up to that point.

The results here are interesting, and since the presentation of this poster the reason for this phenomenon has been identified. The heuristic used is biased towards using vehicles which are already in use rather than calling new ones into use. This results in inefficient routes, which make it more difficult to insert customers who arrive later in the ordering period. For more information, stay tuned for the paper I recently submitted.

]]>
/stor-i-student-sites/matthew-randall/2022/08/01/data-on-the-lake-poster-slot-offering-for-attended-home-delivery/feed/ 0
Heuristic Search Part 3: Selective Search Methods /stor-i-student-sites/matthew-randall/2020/04/06/heuristic-search-part-3-selective-search-methods/?utm_source=rss&utm_medium=rss&utm_campaign=heuristic-search-part-3-selective-search-methods /stor-i-student-sites/matthew-randall/2020/04/06/heuristic-search-part-3-selective-search-methods/#respond Mon, 06 Apr 2020 10:00:00 +0000 http://www.lancaster.ac.uk/stor-i-student-sites/matthew-randall/?p=253

Recently, I published a blog post introducing the concept of heuristic search, which can be found here. This was followed by a discussion of linear space search methods, in particular branch and bound, which can be found here. This post is a continuation of these, in which some selective search methods will be discussed.

In order to solve very complicated combinatorial optimisation problems, in particular ones in which it is not possible to guarantee the quality of solutions found,  can be used. All search methods considered up to this point have been based around a systematic search of the solution space, and the methods involving heuristics increased the speed of the search process in order to reach the goal more quickly. As they all take the complete search space into account, they should all find the optimal solution. In addition to this, the previously addressed approaches have been deterministic: they will always give the same result every time they are performed on the same problem. The search methods discussed here have elements of randomness involved in them. Randomised heuristic search methods can be categorised as either Las Vegas algorithms, which always yield the optimal solution but take a variable amount of time, or Monte Carlo methods, which only sometimes yield the optimal solution.

Selective search is a generic term which includes attributes from both local and randomised search. Selective search methods are said to be satisficing, meaning that they may not always output the optimal solution (although they may by chance), yet should still yield a very good solution when implemented correctly.

Hill Climbing

Hill climbing is an example of a local search method. Local search is applied widely in combinatorial optimisation. The aim of local search methods is the optimisation of an objective function in a local neighbourhood of states in a solution space, with the aim of finding a solution which is approximately globally optimal. Algorithms of this type do not search the entire solution space; instead they aim to find a solution which is better than the other states in its neighbourhood. Selective search methods can be used to modify the size of steps in the solution space, in order to tune the process and alter the size of the neighbourhood being explored at the cost of optimality.

Hill climbing first uses an evaluation function to select a neighbourhood in which the optimal solution is likely to be located, and searches the solution space there. Hill climbing (for maximisation) and gradient descent (the equivalent for minimisation) aim to make changes which improve the objective function until it is no longer possible to improve it. This is often done in one of two ways. In the first, all neighbours of the current solution are checked and the one which offers the greatest improvement is chosen; this is known as simple hill climbing. In the second, a change to the solution is proposed at random, and the change is accepted if it improves the objective function and rejected if it does not; this is known as stochastic hill climbing.
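The two variants described above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function names, the toy objective `score` and the `neighbours` rule (integers from 0 to 20, one step left or right) are all assumptions made for the example and do not come from the original post.

```python
import random

def best_neighbour_climb(start, neighbours, score):
    """Check every neighbour and move to the best one until none improves."""
    current = start
    while True:
        best = max(neighbours(current), key=score, default=current)
        if score(best) <= score(current):
            return current  # no neighbour improves: a local maximum
        current = best

def stochastic_climb(start, neighbours, score, steps=1000, seed=0):
    """Propose one random neighbour per step and accept it only if it improves."""
    rng = random.Random(seed)
    current = start
    for _ in range(steps):
        options = neighbours(current)
        if not options:
            break
        candidate = rng.choice(options)
        if score(candidate) > score(current):
            current = candidate
    return current

# Hypothetical toy objective with a local maximum at 4 and the global maximum at 15.
def score(x):
    return -(x - 4) ** 2 + 10 if x < 10 else -(x - 15) ** 2 + 30

# Hypothetical neighbourhood: one step left or right, staying inside 0..20.
def neighbours(x):
    return [n for n in (x - 1, x + 1) if 0 <= n <= 20]
```

Starting both climbs from 0 ends at the local maximum 4, while starting the first from 12 reaches the global maximum 15, which illustrates how much the result depends on the start point.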

There are issues associated with hill climbing methods. The first is that some proposed changes may result in an infeasible solution that violates the constraints of the problem. This is prevalent when the solution space has both feasible and infeasible regions, as seen in figure (a) below (where F represents the disconnected feasible region and I represents the infeasible region). The second is the obvious issue that the local optimum may not be globally optimal and may in fact be very far away from the global optimum, such as for the solution space shown with the objective function in figure (b) below (where A is the global maximum but B is a local maximum). This places importance on choosing where the hill climb starts. However, there is often not much information available about the locations of local maxima, so a random start point is often chosen.

Simulated Annealing

Simulated annealing is a search method based on an analogy with the process of annealing in metallurgy, and it can be viewed as a development of stochastic hill climbing. The process of annealing involves heating a material above a certain threshold temperature, maintaining this temperature for some time, then cooling it in order to alter the properties of the material.

This algorithm is effectively a form of . The algorithm generates a new state by perturbing the current state using a small, random change to move the state of the solution in the solution space. In metallurgy this algorithm is used to simulate the cooling part of the annealing process; here the objective function represents the total energy of the material. If the energy of the material in the proposed state is less than that of the current state, then the move is accepted. If it is not, then the move is accepted with a probability given by a Boltzmann distribution. The reason for this is based on the physics of the annealing process: in statistical thermodynamics, energy changes at a certain temperature are distributed according to a Boltzmann distribution. Over time, the temperature is decreased to simulate the material cooling. A high energy material will make bigger jumps in its state space, whereas a cooler one will make smaller jumps. The gradual cooling therefore allows a coarse energy minimisation to start with (as a lot of unfavourable moves will be accepted), in which the algorithm effectively selects a neighbourhood of states, and the minimisation gradually becomes more and more local. This allows the material to cool to its minimum energy state whilst attempting to avoid energy traps (local minima in its ). It should be noted, however, that slower cooling will result in an increased computational cost.

This translates to a general combinatorial optimisation problem by considering a general objective function which is to be minimised, and a tuning parameter (in place of temperature) which is gradually reduced. Application of this method allows for a coarse optimisation at the start of the problem, effectively choosing the neighbourhood to search in, which tends towards being a more and more local search as time goes on. In theory this should avoid local minima, such as the one shown in figure (b) above. Due to the randomness in the method it may still become stuck in a local minimum, but this is considerably less likely than with a stochastic hill climb.
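As a rough sketch of the procedure just described, the Python below accepts any move that lowers the objective and accepts a worse move with probability exp(-delta / t), where delta is the increase in the objective and t is the tuning parameter standing in for temperature. Several details are assumptions for illustration and are not taken from the post: the geometric cooling schedule, the default parameter values, the toy `energy` function, and the decision to remember the best state seen so far.

```python
import math
import random

def simulated_annealing(start, neighbour, energy,
                        t_start=10.0, t_end=1e-3, cooling=0.995, seed=0):
    """Minimise `energy`, starting from `start` and cooling geometrically."""
    rng = random.Random(seed)
    current, current_e = start, energy(start)
    best, best_e = current, current_e  # assumption: also track the best state seen
    t = t_start
    while t > t_end:
        candidate = neighbour(current, rng)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Improvements are always accepted; worse moves are accepted with
        # Boltzmann-style probability exp(-delta / t), which shrinks as t falls.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        t *= cooling  # assumed cooling schedule: multiply by a constant factor
    return best, best_e

# Hypothetical toy problem on the integers 0..20 with a local minimum at 4
# and the global minimum at 15.
def energy(x):
    return (x - 4) ** 2 - 10 if x < 10 else (x - 15) ** 2 - 30

def neighbour(x, rng):
    return min(20, max(0, x + rng.choice((-1, 1))))
```

A cooling factor closer to 1 gives a slower, more thorough search at a higher computational cost, which mirrors the trade-off noted above.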

]]>
/stor-i-student-sites/matthew-randall/2020/04/06/heuristic-search-part-3-selective-search-methods/feed/ 0
Heuristic Search Part 2: Linear Search Methods /stor-i-student-sites/matthew-randall/2020/04/02/heuristic-search-part-2-linear-search-methods/?utm_source=rss&utm_medium=rss&utm_campaign=heuristic-search-part-2-linear-search-methods /stor-i-student-sites/matthew-randall/2020/04/02/heuristic-search-part-2-linear-search-methods/#comments Thu, 02 Apr 2020 12:46:22 +0000 http://www.lancaster.ac.uk/stor-i-student-sites/matthew-randall/?p=247

Recently, I published a blog post introducing the concept of heuristic search, which can be found here. This is a continuation of that post, in which I will discuss linear space search methods, in particular branch and bound.

Linear space search methods involve the exploration of a search tree in a systematic exploration of the solution space for a problem. Often the search tree analysed is considerably larger than the problem graph for the problem itself. These algorithms often consider the search tree nodes as members of a search space. Search trees are designed to be simple to analyse in comparison to their underlying problem graphs, because each node has a unique path between it and the root. This means that members of the solution space in the search space are paths.

Branch and bound

Branch and bound is a method often used in operational research in order to find solutions to complicated . Here, the set of all possible solutions is represented as a tree, with the complete set of all possible solutions at the root. Branching refers to the creation of sub-problems, and bounding refers to the dismissal of partial solutions which are worse than the best solution found so far. In order to achieve this, the upper and lower bounds for optimal solutions, U and L, must be calculated at each branch on the tree. To apply the branch and bound method to problems with a general solution space, depth first search is used with U and L applied at each stage, thus creating a branch and bound search tree.

Consider an example where a problem has a solution space representing all possible configurations of a system, and the aim is to find the configuration which minimises some objective function. The branch and bound algorithm progresses by iterating the following steps at each of a set of predetermined branching points.

  • First, the problem is branched using a branching rule specific to the problem in order to produce two or more (usually disjoint) subsets of the solution space.
  • Second, a heuristic is used to estimate a lower bound on the value of the objective function for any candidate solution contained in the first branch from the branching point, which is at that point the lowest value for L found. The algorithm then does the same for the set of solutions in each remaining branch in turn, and determines whether or not its lower bound is greater than the lowest lower bound found among the branches which have already been checked. If this is the case, then that branch is pruned and that set of possible solutions is discarded.

This results in only a single branch remaining, on which the process can now be repeated so as to repeatedly branch the set of possible solutions until the optimal solution has been found. An advantage of this method is the ability to keep track of the upper bound as well as the lower bound, and to stop branching when the gap between the lower and upper bounds falls below a certain threshold, at which point the solution can be considered good enough.
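To make branching and bounding concrete, here is a small depth first branch and bound sketch in Python for a minimisation problem. The problem chosen (assigning each worker to a distinct task at minimum total cost), the lower-bound heuristic and all names are assumptions made for this illustration; the post does not specify a particular problem. Note that this variant prunes against the best complete solution found so far (an upper bound U), which is a common way to implement the pruning step.

```python
def branch_and_bound_assignment(cost):
    """Assign each row (worker) to a distinct column (task) at minimum total cost."""
    n = len(cost)
    best = {"value": float("inf"), "assignment": None}  # incumbent (upper bound U)

    def lower_bound(row, used, so_far):
        # Optimistic estimate L: every remaining row takes its cheapest unused
        # column, ignoring clashes between rows.
        return so_far + sum(
            min(cost[r][c] for c in range(n) if c not in used)
            for r in range(row, n)
        )

    def search(row, used, so_far, chosen):
        if row == n:
            if so_far < best["value"]:
                best["value"], best["assignment"] = so_far, chosen
            return
        # Branch: one sub-problem per column still free for this row,
        # trying the cheapest first so a good incumbent is found early.
        free = sorted((c for c in range(n) if c not in used),
                      key=lambda c: cost[row][c])
        for c in free:
            new_cost = so_far + cost[row][c]
            # Bound: prune if even the optimistic estimate cannot beat U.
            if lower_bound(row + 1, used | {c}, new_cost) >= best["value"]:
                continue
            search(row + 1, used | {c}, new_cost, chosen + [c])

    search(0, frozenset(), 0, [])
    return best["value"], best["assignment"]

# Hypothetical cost matrix: cost[i][j] is the cost of giving task j to worker i.
COST = [
    [9, 2, 7, 8],
    [6, 4, 3, 7],
    [5, 8, 1, 8],
    [7, 6, 9, 4],
]
```

Because the lower bound never overestimates the true cost of completing a partial assignment, pruning with it cannot discard the optimal solution; a tighter bound would simply prune more branches.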

]]>
/stor-i-student-sites/matthew-randall/2020/04/02/heuristic-search-part-2-linear-search-methods/feed/ 1
Heuristic Search Part 1: Introduction and Basic Search Methods /stor-i-student-sites/matthew-randall/2020/04/01/heuristic-search-part-1-introduction-and-basic-search-methods/?utm_source=rss&utm_medium=rss&utm_campaign=heuristic-search-part-1-introduction-and-basic-search-methods /stor-i-student-sites/matthew-randall/2020/04/01/heuristic-search-part-1-introduction-and-basic-search-methods/#respond Wed, 01 Apr 2020 14:14:37 +0000 http://www.lancaster.ac.uk/stor-i-student-sites/matthew-randall/?p=236

Heuristic search

A heuristic, in the most common sense, is a method involving adapting the approach to a problem based on previous solutions to similar problems. These approaches aim to be easily and quickly applicable to a range of problems, so as to find approximate solutions quickly without using the time and resources to develop and execute a precise approach. The principle of a heuristic can be applied to various problems in mathematics, science and optimisation by applying heuristics computationally.

Heuristic search is a class of methods used to search a solution space for an optimal solution to a problem. The heuristic here uses some method to search the solution space while assessing where in the space the solution is most likely to be and focusing the search on that area. It is often possible to model the process of solving a problem as a search through a solution space starting from an initial possible solution, with a method or set of rules dictating how to move from one possible solution to another. This method or rule set must be applied repeatedly in order to eventually satisfy some goal condition which indicates that a solution has been found. Often, this means assessing which path through the solution space improves the solution the most at each iteration of the applied method and choosing this. For more information, see .

Searching solution spaces has been integral for the development of . Since then, the use of heuristic search algorithms has expanded considerably to include a variety of applications, such as ,  and .

A selection of heuristic search methods will be outlined in future posts, while a brief overview of some basic search methods can be seen below.

Basic search methods

Here, some basic introductory search methods which do not use heuristics will be discussed, based on content from book. In general, the exploration of a discrete solution space can be visualised as searching a graph with each vertex representing a possible solution, and an edge representing that possible solutions are adjacent to each other. For an illustration, see the figure below. Here, a simple problem is presented, in which a person is trying to search for treasure, which is at node H, starting from node A, in as few moves as possible, as shown in figure (a). Here, the problem is considered to have been solved once node H has been searched.

At times there is no graph representation available for the problem at the start of the process of finding a solution. In this case, whilst the search is performed, a partial picture of the graph is gradually built up from nodes which have been explored. Here, for every iteration, all nodes adjacent to the one being explored according to a set of transition rules are added to the graph (along with edges connecting them to the node being explored); this is called expanding the node. Practically, this is the application of all possible actions to that state. When doing this, it is important to keep track of nodes which have already been generated in the search. Here, every node must be explored at least once in order for it to be present. As a result of this, it is possible to sort the set of all nodes which have been reached into one set of nodes which have been expanded, and another set which have been added to the graph and not yet expanded. The first of these is known as the closed list, and the second is known as the open list or search frontier. The set of all paths from the node at which the search began up to the set of open nodes gives the search tree of the problem, which illustrates the part of the solution space which has been explored by the search algorithm at a given time. Possible search routes for up to three moves on the graph in figure (a) above are shown on the search tree for the solution of the problem in figure (b). There are multiple different methods for performing searches on graphs. One distinction between methods is the order in which the solution space is searched; this is commonly either breadth first or depth first, both of which are discussed further here.

When performing breadth first search, the open set is considered to be a first in first out queue. So when a node is added to the open set, it is added to the bottom of the list. When a node is explored, the node at the top of the list is expanded and moved from the open set to the closed set. The result of this is that nodes are explored layer by layer and the graph is expanded from the start node. An illustration of this can be seen in the table to the left, which shows how the open and closed sets of nodes change as breadth first search is performed on the problem with the graph shown in figure (a).

When performing depth first search, the open set is considered to be a last in first out queue. This means that when a node is added to the open set, it is added to the top of the list. It also means that when a node is explored, the node at the top of the list is expanded and moved from the open set to the closed set. The result of this is that every iteration generates a node connected to the last node expanded (unless there are none, in which case the algorithm backtracks to the last node with an unexplored node adjacent to it). This means the search effectively explores a path until that path has been completely explored, then goes back to the point at which that path branches off and explores other paths from there, repeating this process until all paths have been explored. An illustration of this can be seen in the table to the right, which shows how the open and closed sets of nodes change as depth first search is performed on the problem with the graph shown in figure (a).
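The only difference between the two searches is which end of the open list the next node is taken from, which the following Python sketch tries to make explicit. The graph `GRAPH` is a made-up stand-in with nodes A to H (the graph from figure (a) is not reproduced here), and the function name and return values are likewise assumptions for illustration.

```python
from collections import deque

# Hypothetical undirected graph; the treasure is at H and the search starts at A.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D", "E"],
    "C": ["A", "F"],
    "D": ["B"],
    "E": ["B", "H"],
    "F": ["C", "G"],
    "G": ["F"],
    "H": ["E"],
}

def graph_search(graph, start, goal, depth_first=False):
    """Return (path to goal, order in which nodes were expanded)."""
    open_list = deque([start])  # generated but not yet expanded
    closed = []                 # expanded nodes, in order
    parent = {start: None}      # also records every node generated so far
    while open_list:
        # Breadth first: take the oldest node (FIFO).
        # Depth first: take the newest node (LIFO).
        node = open_list.pop() if depth_first else open_list.popleft()
        closed.append(node)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], closed
        for nxt in graph[node]:  # expand the node
            if nxt not in parent:
                parent[nxt] = node
                open_list.append(nxt)
    return None, closed
```

On this graph, breadth first search expands A, B, C, D, E, F, H, working outwards layer by layer, whereas depth first search expands A, C, F, G, B, E, H, following one path to its end before backtracking. Both return the path A, B, E, H.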

]]>
/stor-i-student-sites/matthew-randall/2020/04/01/heuristic-search-part-1-introduction-and-basic-search-methods/feed/ 0
Methods for Missing Data /stor-i-student-sites/matthew-randall/2020/02/21/methods-for-missing-data/?utm_source=rss&utm_medium=rss&utm_campaign=methods-for-missing-data /stor-i-student-sites/matthew-randall/2020/02/21/methods-for-missing-data/#respond Fri, 21 Feb 2020 22:37:21 +0000 http://www.lancaster.ac.uk/stor-i-student-sites/matthew-randall/?p=225

It is common when collecting data for some entries to be absent. This can have a significant impact on any attempt to gain useful information from these data, hence methods have been developed in order to make it possible to gain useful insights into data of this kind. A simple method for this is to simply discard any record which contains a missing entry, however this can lead to such a small sample that it is not useful for obtaining reliable information. In addition to this, there may be reasons why certain groups of people do not want to supply certain information, hence this approach can result in those certain groups of people being ignored.

In order to use data with absent entries, these absent entries are often filled in; this is called imputing them. There are a variety of different methods which can be used for this, some of which are explained below.

  • Unconditional mean imputation: The simplest method is to take the average value of a variable which has missing entries, and use this as the value for all those which are missing. Whilst convenient, this can lead to distortions in the data.

  • Conditional mean imputation: Unconditional mean imputation can be improved upon by identifying a variable which seems to have a connection with the one with missing values and grouping the records according to this variable. The average value within each group for the variable with missing values is then calculated and used to fill in the missing values in their respective group. Distortions in the data are still present here.

  • Regression imputation: This method involves identifying a variable which has a connection to the one with missing values, and effectively plotting them and calculating a line of best fit for their relationship. This line is then used to predict missing values. Distortions in the data are still present here.

  • Stochastic regression imputation: This method involves performing regression imputation as mentioned above, but moving every imputed value by a random amount. This is intended to reflect the randomness in the data and prevent the previously mentioned distortion.
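The four methods listed above can be compared side by side with a short Python sketch. Everything specific here is an assumption for illustration: the record layout `(group, x, y)`, the toy data, the method names, and in particular the choice to draw the random shift in stochastic regression imputation from a normal distribution with the residual standard deviation of the fitted line.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def impute(records, method, seed=0):
    """records: list of (group, x, y) where y may be None. Returns completed y values."""
    rng = random.Random(seed)
    seen = [(g, x, y) for g, x, y in records if y is not None]
    a, b = fit_line([x for _, x, _ in seen], [y for _, _, y in seen])
    # Residual standard deviation of the fitted line (assumed noise scale).
    resid_sd = (sum((y - (a + b * x)) ** 2 for _, x, y in seen)
                / max(len(seen) - 2, 1)) ** 0.5
    out = []
    for g, x, y in records:
        if y is not None:
            out.append(y)                                   # observed: keep as is
        elif method == "mean":
            out.append(mean(v for _, _, v in seen))         # unconditional mean
        elif method == "group_mean":
            out.append(mean(v for h, _, v in seen if h == g))  # conditional mean
        elif method == "regression":
            out.append(a + b * x)                           # point on the fitted line
        elif method == "stochastic":
            out.append(a + b * x + rng.gauss(0, resid_sd))  # fitted line plus noise
        else:
            raise ValueError(f"unknown method: {method}")
    return out

# Hypothetical toy data: two groups, two records with y missing.
RECORDS = [
    ("a", 1, 2.0), ("a", 2, 5.0), ("a", 3, None),
    ("b", 4, 7.0), ("b", 5, 10.0), ("b", 6, None),
]
```

With this data the unconditional mean fills both gaps with the same value, which flattens the relationship between x and y, whereas the regression-based methods keep the imputed values on or near the fitted line.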

In order to reflect that there is some uncertainty in imputation, when using a method with some randomness to it (such as stochastic regression imputation) it can be useful to perform the imputation multiple times to obtain multiple data sets. These data sets are then studied separately, and the averages of the results are found along with an estimate of how uncertain these averages are. This is called multiple imputation, and it can be useful because it gives an idea of how accurate the method used is.
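One common way of combining the results from the separate data sets is Rubin's rules; the post does not say which rule was intended, so treat the choice as an assumption. In the sketch below, `estimates` holds the estimate from each imputed data set and `variances` holds the squared standard error of each; both names are hypothetical.

```python
from statistics import mean, variance

def pool_estimates(estimates, variances):
    """Combine per-data-set estimates using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)         # overall estimate: average across data sets
    within = mean(variances)         # average uncertainty inside each data set
    between = variance(estimates)    # spread between data sets (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

The between-data-set term is what captures the extra uncertainty due to the missing values: if every imputed data set gave the same estimate, it would be zero and the total variance would reduce to the within-data-set average.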

When studying data, the selection of which variables to study is important. There are well established methods for this; however, with missing data things are not quite as straightforward. Methods for dealing with this range from simply performing the standard method on the imputed data to altering the chances of variables being selected based on how much of each is missing.

Overall this is a wide area with a range of methods associated with it, only a few of which have been mentioned here. It is important to keep researching in this area in order to make collected data as useful as possible.

]]>
/stor-i-student-sites/matthew-randall/2020/02/21/methods-for-missing-data/feed/ 0
Statistics and Music: Studying Beethoven /stor-i-student-sites/matthew-randall/2020/02/04/statistics-and-music-studying-beethoven/?utm_source=rss&utm_medium=rss&utm_campaign=statistics-and-music-studying-beethoven /stor-i-student-sites/matthew-randall/2020/02/04/statistics-and-music-studying-beethoven/#respond Tue, 04 Feb 2020 13:10:28 +0000 http://www.lancaster.ac.uk/stor-i-student-sites/matthew-randall/?p=207

While I was on the internet earlier today, I stumbled across a short but interesting video involving a novel use of statistics. Here, statistical methods have been used to determine what makes Beethoven’s string quartets sound distinctively like his compositions. I thought I would share it on here; you may view the video below.

]]>
/stor-i-student-sites/matthew-randall/2020/02/04/statistics-and-music-studying-beethoven/feed/ 0
Applied Data Science: Machine Learning and Robot Tanks /stor-i-student-sites/matthew-randall/2020/01/28/applied-data-science-machine-learning-and-robot-tanks/?utm_source=rss&utm_medium=rss&utm_campaign=applied-data-science-machine-learning-and-robot-tanks /stor-i-student-sites/matthew-randall/2020/01/28/applied-data-science-machine-learning-and-robot-tanks/#respond Tue, 28 Jan 2020 16:27:59 +0000 http://www.lancaster.ac.uk/stor-i-student-sites/matthew-randall/?p=181

Recently on the BBC news website I stumbled across an article about the development of robot tanks controlled by a machine learning artificial intelligence. Machine learning is an area of data science which uses statistical models to infer a set of instructions from a set of training data on some past actions. Machine learning algorithms have a wide range of uses, such as retailers recommending products to a certain customer, email filtering and financial trading. Here the use of machine learning in the AI (artificial intelligence) of unmanned war vehicles is explored.

In 1985 the US army attempted to develop an AI controlled anti-aircraft tank with the development of the . The project however was scrapped after an incident where it attempted to target a group of high ranking US army officers. Yet in the time since this debacle, major advances in the field of machine learning have been made, such as the development of , , and . With these advances among many others, multiple defense contractors have taken up renewed interest in AI controlled war vehicles.

The UK based defense technology group has been in talks with the British army about supplying AI controlled vehicles to carry heavy supplies and equipment as well as move into position to provide cover for human soldiers. Yet these are not programmed to fire weapons. According to the , a spokesperson from Qinetiq, they never intend to make a vehicle capable of firing a weapon due to difficulty in distinguishing between civilians and soldiers.

Meanwhile in the USA, the army is investing in an unmanned vehicle supplied by . The technology being developed here uses machine learning in conjunction with visual and thermal sensors, which have been used to train the vehicle to identify a human and what they are holding (i.e. whether or not they are holding a firearm). The aim is that it should be able to successfully identify and take out enemies.

All of this, however, still raises an interesting question about whether or not we should limit the application of machine learning. Ethical questions are raised going forward involving the value of a soldier's life: if the AI is 99.99% accurate, is it worth risking the lives of your own soldiers with a chance the AI will mistake one for an enemy? Is this worth it for the sake of the reduced casualties due to deploying fewer soldiers? These are questions which must be asked in the future if the development of AI controlled weapons continues.

]]>
/stor-i-student-sites/matthew-randall/2020/01/28/applied-data-science-machine-learning-and-robot-tanks/feed/ 0
SlipKnot Setlists: Predictions vs Results /stor-i-student-sites/matthew-randall/2020/01/20/slipknot-setlists-predictions-vs-results/?utm_source=rss&utm_medium=rss&utm_campaign=slipknot-setlists-predictions-vs-results /stor-i-student-sites/matthew-randall/2020/01/20/slipknot-setlists-predictions-vs-results/#comments Mon, 20 Jan 2020 14:05:12 +0000 http://www.lancaster.ac.uk/stor-i-student-sites/matthew-randall/?p=168

Recently I took a brief look at a set of data containing information about the songs played by SlipKnot at their past performances and used this to predict the set they would play when I went to see them on the 16th of January; that post can be found here. The actual set played can be seen to the left below, while the figure to the right below shows the likelihood of each of the twenty most likely songs being performed, with the predicted set list being the first seventeen songs in this figure (“All Out Life” to “The Blister Exists”).

Set Played:

  • Unsainted
  • Disasterpiece
  • Eeyore
  • Nero Forte
  • Before I Forget
  • New Abortion
  • Psychosocial
  • Solway Firth
  • Vermilion
  • Birth of the Cruel
  • Wait and Bleed
  • Eyeless
  • All Out Life
  • Duality
  • (sic)
  • People=Shit
  • Surfacing

Of the seventeen songs predicted, twelve were performed at the concert. The songs performed which were not predicted include “Birth of the Cruel” and “Nero Forte”, which as mentioned in the previous post were likely to be played despite having only been performed once before, due to being new additions to the band’s live shows. In addition to this “Vermilion” was performed, which is a fairly frequently played song predicted to be the 22nd most likely from my analysis and is the overall 16th most played song. Other songs performed which did not appear likely to be played based on the data were “Eeyore”, a hidden track from the self-titled debut, and “New Abortion” from “Iowa”, both of which are songs noted for their heaviness and intensity.

Of the songs predicted which were not performed, two were from the fifth album “.5: The Gray Chapter” (“Custer” and “The Devil In I”), which is notable as there was no representation from this album at all in the performance. Other absent songs which seemed likely were “The Blister Exists”, one of my personal favourites from “Vol. 3: (The Subliminal Verses)” and “The Heretic Anthem”, a fan favourite from the second album. The biggest surprise absence however was “Spit It Out”, a song from the debut album which is considered by many to be an essential part of the band’s live show, see (warning, explicit language).


Looking for the most frequently played songs seems to be a relatively accurate way of predicting which songs will be played, and may be appropriate at the beginning of a tour which is not in support of a new album. It is worth noting however that the setlist performed was identical to the one performed at the previous show in Dublin, so predicting an identical set to the most recent gig may be the most accurate method for predicting songs for a concert in a tour which has already commenced.


]]>
/stor-i-student-sites/matthew-randall/2020/01/20/slipknot-setlists-predictions-vs-results/feed/ 1
SlipKnot Setlists /stor-i-student-sites/matthew-randall/2020/01/16/slipknot-setlists/?utm_source=rss&utm_medium=rss&utm_campaign=slipknot-setlists /stor-i-student-sites/matthew-randall/2020/01/16/slipknot-setlists/#comments Thu, 16 Jan 2020 11:53:45 +0000 http://www.lancaster.ac.uk/stor-i-student-sites/matthew-randall/?p=148

Today, I get the opportunity to go to Manchester in order to attend a concert. This is something I am very excited about, and as it is all I can think about I decided to write a blog post about it. The website keeps up to date information on setlists played by bands at gigs; this was used as a to find information on how many times each song has been performed by SlipKnot, with data having been retrieved on the 15th of January. For the purpose of this study, instrumental solo spots have been removed, and non-album singles have been attributed to the album with the closest release date. Songs from demos released before the band were signed to a record label were also removed, as data from this era was not well recorded and has been considered incomplete.

From looking at the data, it can easily be seen which are the most popular songs to perform, displayed in the plot to the left. It is not surprising that the four most played songs are all from the band’s self-titled debut album, as these songs have been around for a longer period of time and many of them are considered live staples. Meanwhile, some of the more well known songs from later albums, such as “The Devil in I”, “Psychosocial” and “Dead Memories”, can be seen further down in the top twenty, mixed in with songs from the first three albums.

When looking at the total number of times songs have been performed from each album, a bias towards earlier material in this data becomes even more obvious. Simply taking the most played songs and trying to use these to predict a set list may therefore not be representative, especially considering that the current tour is in support of the most recent album “We Are Not Your Kind” from 2019, so it would be expected that a lot of this newer material will be performed. In particular, two songs from the new album, “Birth of the Cruel” and “Nero Forte”, had only been performed once at the time of retrieving this data, and that was the evening before the data was retrieved, so these songs are considerably more likely to be played than the data suggests.

In order to gain more representative data for what a contemporary SlipKnot setlist would look like, the number of plays for each song was divided by the number of years since its release to give the average number of performances per year, the results of which can be seen below. The most performed songs per year are “Unsainted” and “All Out Life”, which are a song from the most recent album and a standalone single released in the lead up to it, neither of which appeared in the top twenty most played songs. Following this are three songs from the first album, showing that there is certainly some bias in the setlist choices towards old fan favourites. The sixth to twentieth spots are now occupied by a relatively even mix of songs from all albums. When looking at the total number of times a song has been played from each album using the modified data, there still seems to be a clear preference towards songs from the self-titled debut album, and also a clear dislike towards performing songs from the fourth album “All Hope Is Gone”. Given that past SlipKnot setlists tend to consist of approximately seventeen songs, taking the top seventeen songs (from “All Out Life” to “The Blister Exists”) from the lower left hand figure may give some idea of a likely setlist. This predicted setlist will be compared to the set actually performed in my next blog post, to be posted after the concert.
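The adjustment described above is simple enough to sketch in Python. The song titles and play counts below are placeholders invented for the example, not the real data, and treating a song released in the current year as one year old (to avoid dividing by zero) is an assumption of this sketch.

```python
def predict_setlist(songs, current_year=2020, length=17):
    """songs maps title -> (total_plays, release_year). Returns the top titles."""
    rates = {
        title: plays / max(current_year - year, 1)  # average plays per year
        for title, (plays, year) in songs.items()
    }
    ranked = sorted(rates, key=rates.get, reverse=True)
    return ranked[:length]

# Placeholder data for illustration only.
SONGS = {
    "Old staple": (600, 1999),
    "Mid-career track": (200, 2008),
    "Recent single": (80, 2019),
}
```

Here the recent single has far fewer total plays than the old staple but ranks first once the counts are divided by age, which is the same effect that moved “Unsainted” and “All Out Life” to the top of the adjusted list.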

]]>
/stor-i-student-sites/matthew-randall/2020/01/16/slipknot-setlists/feed/ 2