k Nearest Neighbors and Related Data Structures (KD Tree and LSH)

k Nearest Neighbors, better known as kNN, is one of the most common ways to find similar items. In this post I’ll touch on the intuition behind the algorithm and some related data structures that speed it up. Related implementations in Python can be found here. Let’s start with... [Read More]

A short summary of PCA (Principal Component Analysis)

Application: PCA is the most common dimensionality reduction technique. It is primarily used to compress data to lower dimensions for various reasons, including speeding up algorithms and visualization. Caveat: Superficially it may seem that we are eliminating features by applying PCA. But... [Read More]

The ML Box

Machine Learning is working like magic all around us. Suddenly, Facebook is recognizing our faces, YouTube is finding what feels like exactly what we always wanted to watch but could not find, and Gmail is filtering out all our spam. ML is at play in all these cases. Machine Learning... [Read More]