Top quotations by famous data scientists that are sometimes enlightening, often humorous and always insightful
Over the last few years as I have read and researched the data science landscape, I have developed the habit of collecting, recording and interpreting data science-related quotations wherever I have come across them.
I have found these quotations to be sometimes enlightening, often humorous and always insightful, hence I have compiled a list of my very favourites below along with my interpretation of their meaning and what we can learn from them.
1. “Things get done only if the data we gather can inform and inspire those in a position to make a difference.”
(Dr. Mike Schmoker, Author)
This is one of my favourite data science quotes ever. It refers to the “last mile” of data science and implies that we can have access to clever data scientists and good data, but if we fail to influence key decision makers to act on the findings then it has all been for nought and our beloved models are destined to gather dust on the data-science shelf.
2. “Torture the data, and it will confess to anything.”
(Ronald Coase, Economist and Nobel Prize Laureate)
This was the first data science-related quotation I ever collected. It refers to the temptation, if we obsess over a data set for long enough, to make the data fit the hypothesis, which is a form of “confirmation bias”.
The key learning is to develop a sense of when to accept that no further value can be realised by continued analysis of a particular data set and even that sometimes we may need to walk away.
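To see how easily a data set can be made to “confess”, here is a minimal sketch in Python. It assumes a purely random target and 200 purely random candidate features (the row count, feature count and function names are illustrative assumptions, not taken from any real study) and reports the strongest correlation found by searching through all of them.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_rows=30, n_features=200, seed=0):
    """Correlate pure-noise features with a pure-noise target.

    Returns the largest and the median absolute correlation.
    All values are random, so any "relationship" is spurious.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    correlations = []
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        correlations.append(abs(pearson(feature, target)))
    correlations.sort()
    return correlations[-1], correlations[len(correlations) // 2]


best, typical = strongest_spurious_correlation()
print(f"Strongest |correlation| found: {best:.2f}")
print(f"Median |correlation|:          {typical:.2f}")
```

Even though nothing in the data is related to anything else, the best of 200 attempts looks far stronger than a typical one, which is exactly what happens when we keep interrogating the same data set until something turns up.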
3. “Data is the new oil.”
(Clive Humby, Mathematician and Marketeer)
In 2006 Clive Humby created one of the lasting data science quotes by declaring that “Data is the new oil”.
This comparison has been debated and challenged many times since but it still highlights that data has a significant and often untapped monetary value to firms and that the data has to be mined, extracted, refined and delivered to realise that value.
4. “All models are wrong but some are useful.”
(George E. P. Box, British Statistician)
Data science models are simplified abstractions of the real world and as such they are all wrong in some measure because the real world is the one and only true representation, so why bother at all?
Well, we build a data model to develop descriptive and predictive insight to drive improvements in decision making and any model that is just accurate enough can produce that insight and the associated competitive advantage even if it is bound to fall short of the real world that it represents.
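As a small illustration of the point, the sketch below fits a straight line to data that actually follows a curve. The data (y equal to x squared for x from 0 to 10) and the helper names are assumptions chosen purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def sum_squared_error(predictions, actuals):
    """Total squared difference between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals))


# The "real world" here is a curve; the model is deliberately too simple.
xs = list(range(11))
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
line_error = sum_squared_error([slope * x + intercept for x in xs], ys)

# Baseline: ignore x entirely and always predict the average of y.
mean_y = sum(ys) / len(ys)
baseline_error = sum_squared_error([mean_y] * len(ys), ys)

print(f"Straight-line model error: {line_error:.0f}")
print(f"Always-guess-the-mean error: {baseline_error:.0f}")
```

The straight line never matches the curve exactly, yet its error is far below that of simply guessing the average, which is the sense in which a wrong model can still be useful.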
5. “If we have data, let’s look at data. If all we have are opinions, let’s go with mine.”
(Jim L. Barksdale, American Executive)
Jim Barksdale is encouraging his teams to come to his door with data that can be analysed in the decision making process with the inherent threat that if no data is available Jim will implement his own view, so you had better bring data to the next meeting!
6. “Intuition is thinking that you know without knowing why you do.”
(Daniel Kahneman, Israeli-American Psychologist and Nobel Prize Winner)
Pure intuition, according to the writings of Daniel Kahneman, is a fallacy, and by implication decision makers who rely solely on intuition without data and evidence will make decisions that are either wrong or just lucky.
7. “Big data is like teenage sex; everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.”
(Dan Ariely, American Professor)
The source of this flippant, humorous but oh-so insightful quote is a famous tweet from Dan Ariely.
It refers to the tendency of many data science professionals, managers and leaders to boast about their firms' use of big data without really understanding what it is, let alone having developed a mature big data capability that delivers tangible impact and outcomes.
8. “In God we trust. All others bring data.”
(Barry Beracha, CEO of Sara Lee Bakery Group)
This is another great data science-related quotation from a successful senior executive.
My interpretation is that Barry would only trust God to make decisions without data to back them up, and anyone else coming to his office door had better bring some data to support their views and opinions!
9. “Errors using inadequate data are much less than those using no data at all.”
(Charles Babbage, English Mathematician)
Have you ever been in one of those meetings where the data team has worked hard to produce the required information only to be shot down by other attendees opining that the data cannot be trusted to inform the decision making?
Well, I keep this famous quote in my back pocket for just such an eventuality.
Self-evidently we want our data to be as accurate, timely and consistent as possible to enable and support effective decision making but even where there are some inadequacies it is still much better to use data than to resort to uninformed intuition, anecdote and personal opinion.
10. “Gentlemen, you need to put the armour plate where the bullet holes aren’t because that’s where the holes were on the planes that didn’t return.”
(Abraham Wald, Hungarian Mathematician)
Abraham Wald famously reduced the losses of WWII Allied aircraft by recommending to senior officers that they add armour plating to the places on the planes that returned to base where there were no bullet holes rather than the places that were riddled.
His reasoning was that the planes that made it back were poor indicators of where to add the armour, because they had survived the damage they took. The planes that did not get home were the ones that had been hit elsewhere, such as in the engines and nose cone.
In data science this is a classic example of “survivorship bias”, a form of selection bias. It is closely related to the tendency to develop projects using only data that is readily and easily available rather than data that is critical to the analysis but much more difficult to acquire.
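Wald's reasoning can be illustrated with a small simulation. The sketch below assumes that each plane takes a single hit in a randomly chosen section and that the chance of being lost depends on where it is hit. The sections and loss probabilities are hypothetical numbers invented for illustration and are not historical figures.

```python
import random
from collections import Counter

# Hypothetical probability that a plane is lost after a hit in each section.
LOSS_PROBABILITY = {"engine": 0.8, "nose": 0.6, "fuselage": 0.1, "wings": 0.1}


def simulate_hits(n_planes=10_000, seed=42):
    """Return hit counts per section for all planes and for returning planes."""
    rng = random.Random(seed)
    sections = list(LOSS_PROBABILITY)
    all_hits, survivor_hits = Counter(), Counter()
    for _ in range(n_planes):
        section = rng.choice(sections)  # hits are spread evenly across sections
        all_hits[section] += 1
        if rng.random() >= LOSS_PROBABILITY[section]:
            survivor_hits[section] += 1  # only these planes can be inspected
    return all_hits, survivor_hits


def share(counter, key):
    """Fraction of all recorded hits that landed on the given section."""
    return counter[key] / sum(counter.values())


all_hits, survivor_hits = simulate_hits()
for section in LOSS_PROBABILITY:
    print(f"{section:9s} all planes: {share(all_hits, section):.0%}   "
          f"returned planes: {share(survivor_hits, section):.0%}")
```

In this toy setup the hits are spread evenly across the fleet, but the planes available for inspection show very few engine hits, because most planes hit there never came back. Anyone looking only at the returning planes would conclude that engines are rarely hit, which is the opposite of where the armour is needed.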
11. “It is a capital mistake to theorize before one has data.”
(Sherlock Holmes, Fictional Detective created by Sir Arthur Conan Doyle)
We may think of Sherlock Holmes primarily as a detective, but in this quote the great detective is supporting the view of some of our more recent corporate contributors that data is a critical precursor to forming and evaluating our hypotheses.
12. “No great marketing decisions have ever been made on qualitative data.”
(John Sculley, American Businessman)
We have established that data detractors often dispute data accuracy, but another group that often plays down the importance of data in decision making is creatives.
Here John Sculley is challenging that with a counter view stating that even though we associate marketing with qualitative data, no great marketing decisions have ever been made unless there is quantitative data.
And in the modern, machine learning world we can even use models to develop qualitative data into quantitative information using sentiment analysis and other natural language processing techniques so there really is no excuse!
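As a minimal sketch of turning qualitative text into a number, the example below scores customer comments against a tiny word list. The word lists, the scoring rule and the sample comments are all illustrative assumptions; real sentiment analysis typically relies on trained models or much larger lexicons.

```python
import re

# Tiny hypothetical lexicons, for illustration only.
POSITIVE = {"love", "great", "excellent", "happy", "recommend"}
NEGATIVE = {"hate", "terrible", "awful", "slow", "disappointed"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words.

    The result lies between -1 and 1; 0.0 means neutral or empty text.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return (positives - negatives) / len(tokens)


comments = [
    "I love the new app, great design",
    "Terrible support and awful delivery",
    "The parcel arrived on Tuesday",
]
for comment in comments:
    print(f"{sentiment_score(comment):+.2f}  {comment}")
```

Once each comment has a score, the free-text feedback can be averaged, tracked over time and compared across campaigns like any other quantitative measure.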
13. “You can have data without information, but you cannot have information without data.”
(Daniel Keys Moran, American Fiction Writer)
Data can be analysed, synthesised and developed into information, then into intelligence and finally into knowledge as firms attain higher data maturity. However, this journey is unidirectional.
Daniel Keys Moran’s quote highlights two key lessons. Firstly, if we fail to analyse, clean and refine raw data we can never get structured, meaningful, insightful information. Secondly, we can never reach the higher levels of data maturity without having the raw data in the first place.
14. “Data beats emotions.”
(Sean Rad, Co-founder of Tinder)
The dating app Tinder is all about the emotional business of forming friendships and romantic relationships, but here Sean Rad is indicating in just three words how critical data is as a foundational component of building that capability in the digital and real worlds.
15. “In the labour market of the future, any decision made without data is just a shot in the dark.”
(Duncan Brown, Economist and Andy Durman, Managing Director)
I will finish my famous quotations with this one, which alludes to the historical tendency to recruit based on feeling giving way to an envisaged future where recruitment without data will be little better than guessing.
Of course, it has an important overlap with the moral and ethical uses of data.
We should not and cannot make people-centric decisions based on data in the public space (on social media etc.) unless that data has been rigorously and robustly checked and assured, but data will certainly play an increasing role in augmenting recruitment, staffing and people decisions in future.