In early 2013, EE approached Brendan Dawes to create something to signify the roll-out of 4G services across the UK. UCL – who would be collecting all the data for EE over a three-day period – gave Brendan some test data to play with so he could check the format of the CSV file and begin coding a system to parse it. This early test data was only a few thousand rows, so it was easy to parse and load into an array in Processing. At this point Brendan was taking inspiration from the idea of the EE particles and exploring how he could represent time in a way that looked beautiful.
Brendan had always loved the Vogel spiral algorithm, and after a while it became the core of his explorations: dots and circles representing moments in time, with keywords such as “X-Factor” parsed out to find the subjects they were looking to pull from the data. Things were gradually coming together until UCL sent Brendan the first set of real data – for London, over six million rows of information! He tried running it through the system he had been building and it simply couldn’t cope with that amount of data, crashing with Out of Memory errors as it tried to create an array entry for every row. This just wasn’t going to work, so he needed a different approach.
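The Vogel spiral places point n at angle n times the golden angle and radius proportional to the square root of n, which packs points evenly like sunflower seeds. A minimal sketch of that layout, written as plain Java rather than Brendan's actual Processing code (the scale factor is an assumption):

```java
public class VogelSpiral {
    // The golden angle in radians, ~137.5 degrees
    static final double GOLDEN_ANGLE = Math.PI * (3 - Math.sqrt(5));

    // Returns the (x, y) position of point n on the spiral.
    static double[] point(int n, double scale) {
        double r = scale * Math.sqrt(n);      // radius grows with sqrt(n)
        double theta = n * GOLDEN_ANGLE;      // each point advances by the golden angle
        return new double[] { r * Math.cos(theta), r * Math.sin(theta) };
    }

    public static void main(String[] args) {
        // Print the first few points; each circle in the image would sit at one of these.
        for (int n = 0; n < 5; n++) {
            double[] p = point(n, 10);
            System.out.printf("point %d -> (%.1f, %.1f)%n", n, p[0], p[1]);
        }
    }
}
```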
Ben Fry’s Visualizing Data was a real gift from this point of view, Brendan explains. In it, Fry details a way to parse CSV data one row at a time – and not only that, you can feed it zipped-up CSV files too. After some refactoring the system could cope with as big a file as would fit on his hard drive. Another thing he needed to fix was concatenating three CSV files into one: UCL provided each day’s worth of data in a separate CSV file, so after some Googling he found a handy little Unix script that merged CSV files, keeping the header row of the first file and ignoring the header rows of subsequent files. It worked perfectly. Another approach, Brendan explains, might have been to load everything into a MySQL database, but the CSV approach was just right for his needs. – “I like to keep things simple and the MySQL route just seemed too heavy handed for this project.”
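The essential trick is streaming: read one line, process it, discard it, rather than holding six million rows in an array. A hedged sketch of that pattern in Java (the file name and column layout are assumptions, and the naive comma split stands in for a proper CSV parser):

```java
import java.io.*;
import java.util.zip.GZIPInputStream;

public class StreamCsv {
    // Reads CSV rows one at a time from any stream, so memory use stays
    // constant no matter how large the file is.
    public static long countRows(InputStream in) throws IOException {
        long rows = 0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            String line = reader.readLine();            // skip the header row
            while ((line = reader.readLine()) != null) {
                String[] cols = line.split(",");        // naive split; real tweet text needs a CSV parser
                rows++;                                  // process cols here, one row at a time
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // A zipped file can be streamed the same way by wrapping it:
        // countRows(new GZIPInputStream(new FileInputStream("london.csv.gz")));
        String csv = "minute,text\n0,hello\n1,world\n";
        System.out.println(countRows(new ByteArrayInputStream(csv.getBytes())));
        // prints 2
    }
}
```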
Very early on Brendan realised that he needed a way to keep things organised – he was making eleven of these images and each one had its own set of keywords used for finding relevant Tweets. So he took the city name prepended to each CSV filename and used it both to load the relevant XML file containing all the keywords and to set the filename for saving the PDF output. This prevented mistakes such as loading the Glasgow data but outputting it as London.
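The idea is that one string drives everything, so the inputs and outputs can never disagree. A hypothetical sketch of that convention – the actual filename scheme ("glasgow_data.csv" and so on) is an assumption:

```java
public class CityConfig {
    // Pulls the city prefix off an assumed "<city>_data.csv" filename.
    static String cityFrom(String csvFilename) {
        return csvFilename.substring(0, csvFilename.indexOf('_'));
    }

    public static void main(String[] args) {
        String city = cityFrom("glasgow_data.csv");
        String keywordsFile = city + ".xml"; // keywords loaded for this city
        String outputFile = city + ".pdf";   // PDF saved under the same name
        // Both derive from the same prefix, so Glasgow can never be saved as London.
        System.out.println(keywordsFile + " -> " + outputFile);
    }
}
```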
The images themselves map 4320 minutes – the number of minutes in three days – across a Vogel spiral pattern. “The map() method in Processing is a really powerful tool – in fact if I had to choose my favourite method it would be that one!” Subjects are then colour coded and mapped across this period of time, with the size of each circle representing the number of mentions for that subject in that minute. Eventually Brendan added lines connecting the actual subject text to these circles, but only if that subject had been retweeted. This, however, gave rise to another problem: the image was being drawn to the canvas in real time as each row was parsed, adding a circle and, if necessary, a connecting line – but he needed all the lines to sit underneath all the circles. He could have stored the circles and lines in arrays and drawn them in two passes, but he was worried about memory usage, so instead he output the lines to a separate PDF which was then composited behind the circles in Photoshop. It was an extra step but it worked well.
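Processing's map() is just a linear rescale of a value from one range to another. Re-implemented as standalone Java and applied, by way of illustration, to placing a minute along the three-day span (the 0–500 output range is an assumption):

```java
public class MinuteMap {
    // Same semantics as Processing's map(value, start1, stop1, start2, stop2):
    // rescale value from the range [start1, stop1] into [start2, stop2].
    static float map(float value, float start1, float stop1, float start2, float stop2) {
        return start2 + (stop2 - start2) * ((value - start1) / (stop1 - start1));
    }

    public static void main(String[] args) {
        int totalMinutes = 3 * 24 * 60; // 4320 minutes over the three days
        // e.g. rescale the halfway minute onto a 0..500 pixel radius:
        System.out.println(map(2160, 0, totalMinutes, 0, 500)); // prints 250.0
    }
}
```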
Each image was 12000 pixels wide by 23000 pixels tall. At 300 dpi that’s an image over 76 inches tall. The final images were printed as Lambda prints and then Diasec mounted.
Posted on: 03/07/2013
Posted in: Processing