Navigating the Data Driven Quicksand – 3

This is the third part of a series of posts covering anti-patterns in Data Science. Read Part 1 and Part 2. In this article, I cover a few points to be careful about during experiment design and data collection. As with the previous two articles, the block diagram describing the “process flow” for solving a problem is reproduced below.

The trouble with hypothesis testing and statistical significance

I often encounter the following question while marketing a solution: “What are the hypotheses you are testing, and how will you prove or disprove them?”, an allusion to standard hypothesis testing techniques. In this context, I had a short conversation with a friend a couple of days ago about the prevalent culture in many analytics organisations and academic