By Oded Netzer
A lot of people are using data like a drunk man uses a lamppost, for support rather than illumination.
Because there is so much data, managers tend to say “let me see what the data says” or “let me see what’s in there, in the data.” In my mind, that’s a big no-no in the world of data-driven decision-making.
You shouldn’t expect the data to provide both the question and the answer. You should think about what it is that you want to get from the data. What type of business problems can the data help with? Then, if you ask the question appropriately and use the right tools, there is a chance that the data can actually provide the answer. But we tend to use big data as an exploratory source of information and hope that something will emerge, rather than asking precise questions and then, hopefully, getting good answers from big data.
In my mind, the value of big data is actually not so much in the length of the data (it’s not in the millions of customer observations that we collect) but rather in the depth of the data, or in the breadth of the data.
Only in very rare cases do you really need millions of observations to derive insight. In most cases, if you analyze 1,000 customers versus 1 million, you’ll derive about the same insights.
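One way to see why a small sample often suffices is to look at how estimation error shrinks with sample size. The sketch below is only an illustration under stated assumptions: the population is simulated, and the 20% repeat-purchase rate and the `margin_of_error` helper are hypothetical, not taken from any real customer data. It compares a rate estimated from 1,000 randomly sampled customers with the rate computed over all 1 million.

```python
import math
import random


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n observations."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical population: 1 million customers, about 20% of whom
# made a repeat purchase (1 = repeat purchase, 0 = none).
random.seed(0)
TRUE_RATE = 0.20
population = [1 if random.random() < TRUE_RATE else 0 for _ in range(1_000_000)]

# Rate over the full data set versus a simple random sample of 1,000.
full_rate = sum(population) / len(population)
sample = random.sample(population, 1_000)
sample_rate = sum(sample) / len(sample)

print(f"all 1,000,000 customers: {full_rate:.3f} "
      f"(+/- {margin_of_error(full_rate, 1_000_000):.4f})")
print(f"sample of 1,000:         {sample_rate:.3f} "
      f"(+/- {margin_of_error(sample_rate, 1_000):.4f})")
```

For a rate near 20%, the margin of error at 1,000 observations is roughly 2.5 percentage points, and at 1 million it is under 0.1 point. For most aggregate business questions, that extra precision rarely changes the decision. The argument assumes the 1,000 customers are a random sample; it does not carry over to questions about each individual customer.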
One of the exceptions would be cases where we actually want to personalize or target an offering; in that case, I do want the information about each customer. A good example of that would be electronic health records, because now I can observe every patient’s full history of interactions with physicians in this hospital, and maybe in other hospitals as well. We can, for example, help people with chronic diseases better manage their disease. We can even offload some of the work from physicians to nurse practitioners, because they all see the same rich data. So it’s really more about the breadth of the data than about the length of the data.