The trouble with hypothesis testing and statistical significance

I often encounter the following question while marketing a solution: “What are the hypotheses you are testing, and how will you prove or disprove them?”, an allusion to standard hypothesis-testing techniques. In this context, I had a short conversation with a friend a couple of days ago about the prevalent culture in many analytics organisations and academic

[NLP Pipelines] Basic Components – 3

This is Part 3 of the NLP Pipelines series. View Part 1 and Part 2. In this post, I cover a few more basic components required for building an NLP system. Parsing: Chunking. Chunking is often regarded as a preprocessing step for parsing. It involves recognizing non-overlapping phrases that belong to different syntactic categories. Originally, chunking referred
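
To make the idea of non-overlapping phrase recognition concrete, here is a minimal sketch of a noun-phrase chunker. It is an illustration only, not the method used in the series: the function name `chunk_noun_phrases`, the hand-tagged example sentence, and the pattern (an optional determiner, any adjectives, then one or more nouns, using Penn Treebank-style tags) are all assumptions made for this example.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


def chunk_noun_phrases(tagged: List[Token]) -> List[List[Token]]:
    """Group tagged tokens into non-overlapping noun-phrase chunks.

    Assumed pattern (for illustration): optional determiner (DT),
    zero or more adjectives (JJ*), then one or more nouns (NN*).
    """
    chunks: List[List[Token]] = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        # Optional determiner.
        if tagged[j][1] == "DT":
            j += 1
        # Zero or more adjectives.
        while j < n and tagged[j][1].startswith("JJ"):
            j += 1
        # One or more nouns are required to close the chunk.
        noun_start = j
        while j < n and tagged[j][1].startswith("NN"):
            j += 1
        if j > noun_start:
            chunks.append(tagged[i:j])
            i = j  # resume after the chunk, so chunks never overlap
        else:
            i += 1  # no noun phrase starts here; move on one token
    return chunks


# Hand-tagged example sentence (tags supplied manually, not by a tagger).
sentence = [
    ("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
    ("jumped", "VBD"), ("over", "IN"),
    ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]
chunks = chunk_noun_phrases(sentence)
```

Each token ends up in at most one chunk because the scan resumes at the end of the last match. A real system would typically take its tags from a trained part-of-speech tagger and use a richer set of phrase patterns.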

[NLP Pipelines] Natural Language Processing for Analytics – 1

Abstract: The availability of large-scale textual data from disparate sources such as digitized libraries, social media, and the internet has made automated text-processing techniques imperative. “Natural language processing” (NLP) refers to the functioning of software and hardware components in a computer system that analyze or synthesize spoken or written language. Several reports in the