A simple but useful kNN library for Node.js. It compares JSON objects by Euclidean distance and returns the k closest objects.
Supports normalized, weighted Euclidean distances: attributes can be normalized by their standard deviation.
Features key and filter options to do the data assembly for you, Lisp style!
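To make the distance described above concrete, here is a minimal sketch of a weighted, standardized Euclidean distance. It is not taken from Alike's source: the function name `weightedDistance`, the default weight of 1, and the choice to apply each weight to the squared difference are all assumptions made for illustration.

```javascript
// Sketch only: not Alike's implementation.
// Weighted Euclidean distance over the attributes present in `subject`.
// `weights` and `stdvs` are optional maps keyed by attribute name.
// Assumptions: a missing weight counts as 1, and a missing or zero
// standard deviation counts as 1 (no scaling).
function weightedDistance(subject, candidate, weights = {}, stdvs = {}) {
  let sum = 0;
  for (const attr of Object.keys(subject)) {
    const scale = stdvs[attr] || 1;
    const weight = attr in weights ? weights[attr] : 1;
    const delta = (subject[attr] - candidate[attr]) / scale;
    sum += weight * delta * delta;
  }
  return Math.sqrt(sum);
}
```

With no weights and no scaling this reduces to the plain Euclidean distance, so `weightedDistance({a: 0, b: 0}, {a: 3, b: 4})` gives 5.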
subject: vantage point object - will consider each attribute present in this object as a feature
objects: array of objects that should all have at least the attributes of subject
options:
- k: (default = unlimited) specifies how many objects to return
- standardize: (default = false) if true, standardizes every attribute by its standard deviation; set this to true if your attributes are not on the same scale
- weights: (default = {}) a hash describing the weights of each attribute
- key: (default = none) a key function to map over objects, to be used if the subject attributes are nested within key
e.g. if subject is {a:0} and objects are [{x: {a: 0}}, {x: {a: 2}}], then provide key: function(o) {return o.x}
- filter: (default = none) a filter function that returns true for items to be considered
e.g. to only consider objects with non-negative a: function(o) {return o.a >= 0}
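The way k, key and filter fit together can be sketched with a small stand-in function. This is a hypothetical `nearest` helper, not Alike's code: it assumes the filter sees the original (un-keyed) objects, that the original objects are what gets returned, and it uses a plain unweighted squared distance to stay short.

```javascript
// Sketch only: a hypothetical stand-in, not Alike's implementation.
function nearest(subject, objects, options = {}) {
  const { k, key, filter } = options;

  // Assumption: the filter is applied to the original objects.
  const candidates = filter ? objects.filter(filter) : objects.slice();

  // Unweighted squared distance over the attributes of `subject`.
  // `key`, when given, extracts the nested feature object.
  const distance = (obj) => {
    const features = key ? key(obj) : obj;
    return Object.keys(subject).reduce((sum, attr) => {
      const delta = subject[attr] - features[attr];
      return sum + delta * delta;
    }, 0);
  };

  // Rank by distance, closest first, and return the original objects.
  const ranked = candidates
    .map((obj) => ({ obj, d: distance(obj) }))
    .sort((a, b) => a.d - b.d)
    .map((entry) => entry.obj);

  // k defaults to "unlimited": return everything that passed the filter.
  return k === undefined ? ranked : ranked.slice(0, k);
}

// Nested attributes, as in the key example above.
const objects = [{ x: { a: 2 } }, { x: { a: 0 } }, { x: { a: -1 } }];
const closest = nearest({ a: 0 }, objects, {
  k: 1,
  key: (o) => o.x,
  filter: (o) => o.x.a >= 0,
});
// closest holds the single nearest object: { x: { a: 0 } }
```

Note that in this sketch the filter has to reach into `o.x` itself, because it runs before the key function; whether Alike orders the two the same way is not stated here.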
Given John Foo's taste for movies:
| Attributes | Value | Weight |
|---|---|---|
| explosions | 8 | 10% |
| romance | 3 | 30% |
| length | 5 | 5% |
| humour | 6 | 5% |
| pigeons | 10 | 50% |
John Foo would like to rent a movie tonight that most closely matches his movie tastes. He collected a DB of movies with numerical values ranging from 1 to 10 for each of the 5 attributes listed above (don't ask how).
John Foo loves his pigeons. It is the most important attribute to him, hence carries 50% of the weight. He does not like romance and wants to make sure that he avoids sappy movies. Even though he likes mid-length movies with explosions and semi-funny movies, he doesn't care as much, as long as the movie features peaceful pigeons.
Perfect case for Alike!
To install Alike and add it to your package.json:

    $ npm install alike --save

Now you can load up the module and use it like so:

    const knn = require('alike');

    // Weights mirror the table above.
    const options = {
      k: 10,
      weights: {
        explosions: 0.1,
        romance: 0.3,
        length: 0.05,
        humour: 0.05,
        pigeons: 0.5
      }
    };

    // John Foo's taste is the subject (vantage point).
    const movieTaste = {
      explosions: 8,
      romance: 3,
      length: 5,
      humour: 6,
      pigeons: 10
    };

    knn(movieTaste, movies, options);

Here movies is an array of objects that each have at least those 5 attributes. The call returns the 10 movies from the array that are closest to John Foo's taste. Enjoy! :)
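Since every movie needs at least the subject's attributes, it can help to check the data before ranking it. The following is a minimal sketch of such a check; `findIncomplete` is a hypothetical helper that is not part of Alike, and the two sample movies are made up purely for illustration.

```javascript
// Hypothetical helper, not part of Alike: returns the objects that lack
// a numeric value for at least one attribute of `subject`.
function findIncomplete(subject, objects) {
  const required = Object.keys(subject);
  return objects.filter((obj) =>
    required.some(
      (attr) => typeof obj[attr] !== 'number' || Number.isNaN(obj[attr])
    )
  );
}

// Illustrative data only.
const taste = { explosions: 8, romance: 3, length: 5, humour: 6, pigeons: 10 };
const sampleMovies = [
  { title: 'A', explosions: 9, romance: 2, length: 5, humour: 4, pigeons: 10 },
  // Misspelled key: 'humor' instead of 'humour'.
  { title: 'B', explosions: 1, romance: 9, length: 7, humor: 6, pigeons: 0 },
];

const incomplete = findIncomplete(taste, sampleMovies).map((m) => m.title);
// incomplete is ['B'], because movie B has no 'humour' attribute
```

A check like this catches spelling mismatches between the subject's attribute names and the data before they can skew a distance calculation.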
Alike is written in CoffeeScript in the coffee/ folder. You may use make coffee to compile and watch for changes. Unit tests are in the coffee/test/ folder. You can run the tests with npm test or if you are developing, you may use make watch-test to watch while you TDD. :)
Alike is licensed under the terms of the GNU Lesser General Public License, known as the LGPL.
