In Friday’s Financial Times, economist and writer Tim Harford offers a much-needed, unusually thoughtful response to big data hype, using the reported failure of Google Flu Trends to predict a recent flu season as a prismatic example that highlights important reasons big data merits more caution and less exuberance.
Harford’s critique is typically insightful and spot-on – his concerns about the perils of multiple comparisons are especially important (my 2008 take in the Washington Post here), and will become even more relevant in healthcare as more data become available to more people (obviously a good trend). Without statistically appropriate analysis, however, there’s a real concern that a spate of false-positive associations will emerge and distract us, much like in the early days of genomics.
The one area of Harford’s analysis that gave me pause involves the age-old debate between theory and empiricism (nicely summarized in Jim Manzi’s Uncontrolled, by the way). A strong version of the canonical big data thesis is that when you have enough data, you can make unbiased predictions that don’t require an underlying understanding of the process or context – the data are sufficient to speak for themselves. This is the so-called “end of theory.”
Not so fast, Harford responds. The failure of Google Flu Trends, in his view, emphasizes the perils of unmoored empiricism.
“A theory-free analysis of mere correlations is inevitably fragile,” Harford writes. “If you have no idea what is behind a correlation, you have no idea what might cause that correlation to break down.”
He’s right, of course – but I suspect that outside of a few areas such as physics, our understanding of causation is far more fragile than we appreciate (a point emphasized at length in Manzi’s book). We overestimate our understanding of causation, and our ability to generalize.
I’d argue this is especially true in medicine, where despite our aspirations to approach health and disease from first principles, our actual understanding is far more limited, and based far more on rationalized empiricism than is generally appreciated – there’s much more scientism than science. The primacy of empiricism in medicine also emerges from Morton Meyers’ Happy Accidents, and is a central theme of Nassim Taleb’s Antifragile.
Fundamentally, my concern is that more often than we appreciate, and especially in healthcare, our faith in theory is misplaced – we turn to various theories as crutches, explanatory models, or, in the case of med students and harried residents, memory devices.
Ideally, theories can be evaluated scientifically and replaced by better ones – and this happens, over time. But I suspect that many of our existing theories are at least as fragile as the visibly imperfect, data-driven associations Harford cites.
The difference is that we recognize (or should recognize) empirical predictions for what they are, limitations and all. Yet I suspect we are more likely to let our guard down in situations where predictions are theory-driven, where we instinctively feel we really understand what is going on. In doing so, we are likely to discount data that don’t fit, and unconsciously constrain our thinking according to theory’s dictates.
For most phenomena in medicine and health, we really do not have a clear understanding of cause and effect. “Que sais-je,” Montaigne inscribed over the door of his study five hundred years ago. “What do I know?” Empirical, big-data-driven analyses, at least, have the humility to acknowledge this.