K-Nearest Algorithm

Mon, 08/07/2017 - 16:35

Recently I started following tutorials on YouTube provided by Google on using and understanding deep learning, a subject I have been interested in lately. These tutorials not only gave me a foothold in deep learning, but also gave me an opportunity to start using Python.

That said, this project doesn't use Python. While following the tutorials I noticed that the math behind the most basic classification algorithm, K-Nearest, was pretty easy to understand, and I was confident that I could build it in JavaScript. NOTE: In the tutorial the instructor writes out the code for K-Nearest, BUT I have not gotten that far in the tutorial (I wanted to figure it out for myself).

Here is a running example with source code: http://wesmantooth.jeremyheminger.com/examples/knearest.php

The concept is pretty simple:

  1. Create two or more arrays of [x,y] coordinates. These are the sample data, the known groups. Plot them onto a grid.
  2. Create an array of random [x,y] coordinates. These are the test points. The algorithm is going to attempt to classify them based on their proximity to the sample data.
  3. The algorithm loops through the test points and measures the distance from each one to every sample. It assumes that points that are grouped together are related.
  4. To get a higher level of accuracy I compare the distances to the n nearest samples rather than just the single closest one. This also helps to avoid clashes between samples that are equidistant from the test point.
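
Here is a minimal sketch of those four steps in JavaScript. It is not the source from the running example linked above. The names `samples`, `distance`, and `classify`, and the hard-coded coordinates, are made up for illustration, and I am assuming a plain majority vote among the k nearest samples, with ties going to the label of the closest sample.

```javascript
// Hypothetical sample data: two known groups of [x, y] points.
const samples = [
  { label: "blue", point: [1, 1] },
  { label: "blue", point: [2, 1] },
  { label: "blue", point: [1, 2] },
  { label: "green", point: [8, 8] },
  { label: "green", point: [9, 8] },
  { label: "green", point: [8, 9] },
];

// Straight-line (Euclidean) distance between two [x, y] points.
function distance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1]);
}

// Classify one test point by majority vote among its k nearest samples.
// Assumes samples is non-empty and k is at least 1.
function classify(point, samples, k) {
  // Measure the distance to every sample, then keep the k closest.
  const nearest = samples
    .map((s) => ({ label: s.label, dist: distance(point, s.point) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);

  // Count how many of the k nearest samples carry each label.
  const votes = {};
  for (const n of nearest) {
    votes[n.label] = (votes[n.label] || 0) + 1;
  }

  // Walk the neighbours from closest to farthest; a label only replaces
  // the current best if it has strictly more votes, so ties keep the
  // label of the closer sample.
  let best = nearest[0].label;
  for (const n of nearest) {
    if (votes[n.label] > votes[best]) best = n.label;
  }
  return best;
}

// Step 2: a handful of random test points on a 10 x 10 grid.
const testPoints = Array.from({ length: 5 }, () => [
  Math.random() * 10,
  Math.random() * 10,
]);

for (const p of testPoints) {
  console.log(p, "->", classify(p, samples, 3));
}
```

With k set to 1 this falls back to picking the single closest sample. Using an odd k with two groups means the vote can never end in a tie.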

This test version uses random coordinates, but those coordinates could be correlated with real data points. A real-world application for this algorithm might be to categorize unknown data based on several similar data points (car parts, insects, cats, etc.).

The screenshot below (left side) shows that the program was able to categorize the blue dots and the green dots.

The screenshot on the right shows the sample data. This would be the known data that the program would learn from.

I have only just begun studying this subject, but to be fair I do have some experience using weights to optimize returns based on previous results. My previous tools were, well, entirely based on my own research and assumptions. Not to say they weren't effective. But I am very excited to begin REALLY getting my hands dirty on this subject.

Loads of FUN!