Complacency in Data Analysis

Data analysis is learning, really. You only spend time doing it to learn new things or to reconfirm what you already know. And learning from data is a lot like learning anything. For example, I’m going to compare it to learning a language.

Ahem! Everyone, everyone? Hello. Analyzing data is like learning a language.

Dutch is my second language. Well, technically, it’s my fourth, but I speak zen Spanish and functionally illiterate vacation Japanese. By zen, I mean that I’m incapable of speaking about the past or the future in Spanish, so I’m forced to “live in the now.” It’s amazing, I recommend it.

I’ve started and stopped learning German repeatedly over the years, and in truth I can usually understand much of what I hear, depending on the accent. It varies a great deal – I’m at a total loss with my wife’s relatives in Austria, unfortunately, but folks from the parts of Germany closest to the Dutch border, I can mostly follow.

And that’s often my problem with learning the language. I hear the words, I think I understand the words, I forget the words. Because everything feels comfortable, I can’t hold onto the words myself. My personal vocabulary remains non-existent, so even if I can understand the German speaker, I can’t form sentences to respond. I might be able to answer in Dutch in a funny accent and hope for the best, but this doesn’t always work, even as comedy. I often find myself uncomfortably sputtering gibberish and getting headaches.

Data is similar: there is a real risk that people will see data, superficially understand it, and then just move on. This risk is especially high with people who use the old-school data dump method: “show me all the data in a file, and I’ll look for issues once a week.” But as that methodology passes into the dustbin of history, similar issues have popped up with bad dashboard design.

The more data you put into a dashboard, the greater the risk of missing things because everything just looks right. It blends into a comfortable whole, a huge mass of data where you are supposed to look for exceptions. You see the data, you think you understand the data, you forget the data. In effect, while you mean to process it, you’re ignoring it.

This is why I say that a properly designed dashboard never answers questions; it points you to the questions you should be asking.

Data presentation must always be about jolting us out of our complacency (das ist gut), and never about making us comfortable with problematic data (das ist schlecht).