By Hans Constandt, CEO & Co-Founder of ONTOFORCE & speaker at the SAS Forum Belux on 15 October 2015
This blog is part of a tailor-made content series around the SAS Forum Belux 2015. It is linked with the event track called “Data Science”. Join the event to learn more about the other 3 tracks (Internet of Things, Digital Society and Data Management).
This is a perfect moment to be a data scientist. Data scientists are, almost literally, the only ones who know how to handle Big Data and the incredible insights that it affords. And everybody is looking in the direction of Big Data. So it’s only logical that everybody is searching for a data scientist. Yet they are scarce, these elusive unicorns that combine 3 seemingly mismatched capabilities: IT savviness, business insight and communication skills. Although data scientists are a fascinating and extremely valuable breed, I’m much more inspired by the gap that their existence, and that of Big Data, has created.
Let me explain. Put simply, there are two basic kinds of analytics. BI solutions, on the one hand, offer structured and repeatable reports. They deliver interesting insights, but they operate within a constrained framework, and ad hoc self-service often presents a challenge. In short, most BI tools are easy to use, but offer limited reach. Big Data solutions, on the other hand, can offer radical new insights and are the basis for potentially disruptive business model innovations. But they are complex and can only be handled by data scientists, who, because of their scarcity, tend to become bottlenecks in organisations.
When a gap becomes an opportunity: smart data discovery
As I pointed out, there is a gaping canyon of opportunity between BI and Big Data. And in that gap – whenever employees or researchers are looking for fast intelligence that does not lie in predefined BI reports and when they do not have the time (or budget) to involve a data scientist – people ‘google’ for information. Still too few of us realize that there is an alternative. It’s what Gartner calls ‘smart data discovery’, which “provides insights from advanced analytics to business users or citizen data scientists without requiring them to have traditional data scientist expertise”. The rise of these smart data discovery tools and their enablers, the citizen data scientists, is sure to revolutionize the way we think, work and invent.
“On the one hand, information wants to be expensive, because it's so valuable. The right information in the right place just changes your life. On the other hand, information wants to be free, because the cost of getting it out is getting lower and lower all the time. So you have these two fighting against each other.” (Stewart Brand)
I’m a big believer in Stewart Brand’s premise that “information wants to be free”. What most people fail to understand is that, in our current environment, it is not. The information on Google is not ‘free’. On the contrary, Google is keeping it expensive on purpose. We ‘pay’ for the hits we receive with unsolicited advertising and a lack of objectivity. What’s more, you cannot freely extract, connect and manipulate the web data you acquire in order to transform it into actual insights. That is why I’m such a big fan of Tim Berners-Lee and his semantic web: an environment where you can freely use, reuse and enrich information.
Data discovery tools are the true enablers of the semantic web. They, too, have the ambition to link and compare all the data that is available in the world, both online and offline. They want to facilitate the ‘web’ in the true sense of the word: going beyond a network of connected devices into the realm of intelligently linked data. Data discovery tools will make information ‘free’, in the sense that ‘laymen’ will be able to work with it themselves, without the help of expensive data science wizards (except during the phase when the tools are introduced, obviously).
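To make the idea of intelligently linked data a little more concrete, here is a minimal Python sketch. It is only an illustration, not how any particular data discovery tool works: every identifier and fact in it (the trials, compounds and papers, and the helper names match and papers_for_disease) is invented for the example. Each statement is stored as a simple (subject, predicate, object) triple, and a question is answered by following links across two separately maintained sources.

```python
# Hypothetical (subject, predicate, object) triples from two separate sources.
# All identifiers and facts below are invented for illustration only.
trial_data = [
    ("trial:T1", "studies", "disease:D1"),
    ("trial:T1", "tests", "compound:C1"),
    ("trial:T2", "studies", "disease:D2"),
    ("trial:T2", "tests", "compound:C2"),
]
literature_data = [
    ("paper:P1", "mentions", "compound:C1"),
    ("paper:P2", "mentions", "compound:C2"),
    ("paper:P3", "mentions", "compound:C1"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def papers_for_disease(disease, triples):
    """Follow disease -> trial -> compound -> paper links across sources."""
    papers = set()
    for trial, _, _ in match(triples, predicate="studies", obj=disease):
        for _, _, compound in match(triples, subject=trial, predicate="tests"):
            for paper, _, _ in match(triples, predicate="mentions", obj=compound):
                papers.add(paper)
    return sorted(papers)


# Linking is simply pooling the statements: shared identifiers do the rest.
linked = trial_data + literature_data

if __name__ == "__main__":
    print(papers_for_disease("disease:D1", linked))
```

Neither source on its own can say which papers are relevant to a disease. The answer only appears once the two are combined and the shared compound identifiers connect them, which is the kind of question a non-specialist could ask without writing the joins by hand.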
Democratizing information for a knowledge revolution
The rise of data discovery tools will democratize information in the true meaning of the word: make it available to everyone in a convenient format, so that they can make it their own. These tools will trigger a significant evolution in Big Data analytics. Will we still need data scientists, then? Yes, but their role will evolve dramatically. They will become the orchestrators of the new breed of analytics tools: creating them, installing them and facilitating self-service, while staying on the sideline and monitoring the quality of the system. The citizen data scientists, the ‘regular’ users, will be the ones actually extracting the intelligence.
I became involved in data discovery for personal reasons. There are people in my close circle suffering from a rare disease and I felt frustrated about the lack of support. The unfortunate truth of uncommon health problems is that the pharmaceutical industry struggles to make the huge R&D investments needed to find a new medication, because sales of the finished product are unlikely to sustain a viable business model. So I started to collect as many pieces of information as I could and to link all of that data together in such a way that I could uncover tell-tale patterns and perhaps even some answers. And I found a way to achieve this using linked data methodology. As a result, people and pharmaceutical companies alike started to ask for help with their projects. That’s when I truly realized what data discovery and linked data could signify for the future of medicine. Not just for rare diseases, but also for common ones for which we have yet to find a conclusive cure. Just imagine what the ability to easily and quickly connect the data of all medical trials, patient records, academic research or communities like PatientsLikeMe.com could mean for cancer or Alzheimer’s.
To my mind, data discovery goes beyond the ability to link data and find valuable answers. It is about empowering citizens the world over to find the answers that will make our lives and those of others better. It is about finally making information free. I trust this is going to make a lot of people nervous, especially in this era of extreme data value and monetization. But the ultimate point is that the digital age is pushing us towards a very different kind of economy anyway. Some call it the zero marginal cost society. Or the sharing, peer-to-peer or collaborative economy. Others call it the reverse economy. I believe the semantic web, smart discovery tools and citizen data scientists will play a huge part in accelerating this evolution. The real question is: is your organisation prepared for this new paradigm? And are you investigating the tools that will enable it?