Guest blog post by Brian Rutherford, Co-founding Director of Eyecademy
The first thing that struck me, sitting in the conference room three storeys up in the New York Marriott, was how mature Data Science and Big Data were in the US. I’d been following Big Data and Data Science for the past two years, but this was on another level entirely.
It’s clear the US is ahead of the rest of the world. In Europe we are still waking up to the possibilities of Big Data and Data Science, but in North America it is an everyday reality. I’m sure, as I write this, there are fresh-faced, motivated Computer Science graduates coding furiously, determined to be the next big UK start-up, but elsewhere the pace of change is slower. Which is why my interest was piqued when our Marketing Manager started talking about ‘The Data Lab’ a couple of years ago. Eyecademy has always looked to be ahead of the curve and we’re lucky enough to have had a few successes. Back when our core service was training, we offered virtual classrooms before anyone else. We were using Amazon Web Services before anyone else, certainly before the rest of the UK woke up to the cloud and the possibilities of Software as a Service.
Before my trip to New York we’d been looking at Big Data and Data Science for two years, but every time we came up against the same issues: we just couldn’t recruit the right people and, even if we could, none of our customers were using any of the technologies involved beyond small proofs of concept (POCs). The Data Lab, though, were talking about Data Science in the same terms as the Americans: that it was here, and that we should all be looking at it seriously for the benefits it brings in a wide range of analytical fields. So when they offered me a place in the Scottish Enterprise Delegation to New York’s Big Data & Analytics for Banking Summit, I jumped at the opportunity.
My first insight at the Summit was that Big Data is not really a new thing. It’s a re-branding exercise for something that’s been about for a long time. Anyone who’s seen the Michael Caine movie ‘Billion Dollar Brain’ will remember the giant mainframe which featured in the film. It was typical of the computing power of the time: large corporations, draped in mustard and green branding, with secretaries feeding sheaves of punch cards into the mouth of a clacking monster. Maths and code first intersected in these early days, but of course it’s a human trait to always be searching for order. Edgar Codd gave us that. The relational database was a spear through the heart of the hierarchical databases of the 1970s behemoths. How could they go on when the rows and columns were laid out in geometric lines marching off into the future? The corporations survived, but now they weren’t called Bull or Texas Instruments; now they were Oracle and IBM.
Then something changed again. The rise of the internet brought competing systems: small relational databases such as MySQL began to power code-driven websites built with PHP and Python. We had competition and innovation. With this sea-change in power, people began to view data in a new way. Data would always be ordered, except when it wasn’t. Data would always be neat and controllable, except for the times when it was messy, and even when it was messy, that didn’t mean it was unusable. Messy data is still full of meaning. Unstructured data was the human mind revealed. I was reminded of the time Alan Turing looked at the endless, chaotic variety in the patterns of animal skins and the infinitely varied development of the spiral shell of a snail. Why were animal markings never the same? Centuries of Maths had defined the square, the circle and the triangle, but when he looked around him Turing didn’t see any squares except the ones we built to dwell in. He didn’t see any triangles except for the temples we’d built. Instead, all around in the natural world was a lack of structure. Turing came up with his theory of morphogenesis by realising that life was messy and unstructured. Humans are messy and unstructured, and it should follow, then, that our data is the same. There is no trigonometry of the mind, and this is why the ability to store our data in an unstructured way is so important.
My second insight was that Data Science has actually been around for a long time. In 1974, Peter Naur published Concise Survey of Computer Methods, in which he freely used the term Data Science when surveying the then-contemporary data processing methods. There have been many uses for the term since, but the general consensus is that the latest use of ‘Data Science’ is a re-branding exercise for what used to be called Business Analytics. I would argue, though, that the new definition has a higher concentration of Mathematics and Statistical Analysis than the older term would traditionally be associated with. I don’t think there is anything wrong with Data Science becoming a buzzword. Buzzwords stir up commerce. Buzzwords power the further development of new technologies such as Hadoop and MapReduce. Big Data and Data Science can exist apart quite happily, but when brought together they can do great things. This is where The Data Lab comes in. This Scottish Innovation Centre aims to make Scotland an international centre of excellence in Data Science. Set up by the Scottish Government, the organisation is encouraging collaboration between industry, the public sector and our universities, and has been making waves across the whole of the UK, challenging the IT concentration of London and the South. The Data Lab is actively creating a Data Community in Scotland, a community I am delighted that Eyecademy is part of.
It’s been a long, insightful trip from New York to Glasgow. As I look outside our office windows, the streets are teeming with people from all walks of life, everyone generating data at a furious rate. If Scottish business can harness the technology to make sense of this chaos within the chaos, Scotland will surely become the international centre of data excellence we so aspire to be.
—
Brian Rutherford
Brian is co-founding director of Eyecademy. He has over 25 years of technical, commercial and managerial experience gained in both the private and public sectors.