Machine Learning Tom Maiaroto @shift8creative
What is Machine Learning?
Algorithms & Approaches
Decision trees
Random forests
Artificial neural networks
k-NN (nearest neighbour)
Naive Bayesian classifier
So could machines one day rule the earth?
Maybe (OK, probably not)
What can Machine Learning do for Apps?
Spam filtering
Auto-tagging
All sorts of categorization
Sentiment analysis
Languages Commonly Used
Java: Java-ML, WEKA, Apache Mahout, many more...
Python: NLTK, scikit-learn, PyML, a good deal more...
C++: libDAI, Armadillo, Orange, tons more...
and then some others...
http://www.mloss.org
MongoDB Too!
Map/Reduce
Stored JavaScript
Geo-spatial Indexing
Replication
Geo-spatial Indexing
Did someone say nearest neighbour?
Design geeks, imagine the visualizations...
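To make the nearest-neighbour idea concrete, here is a minimal sketch in plain Python rather than MongoDB itself. It brute-forces the distance to every labelled point, which is the work a geo-spatial index would normally do for you. The function names, the sample coordinates, and the "east"/"west" labels are all made up for illustration.

```python
import math
from collections import Counter

def haversine_km(a, b):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def knn_label(points, query, k=3):
    """Return the majority label among the k labelled points closest to query."""
    nearest = sorted(points, key=lambda p: haversine_km(p["loc"], query))[:k]
    return Counter(p["label"] for p in nearest).most_common(1)[0][0]

# Hypothetical labelled points, stored as (longitude, latitude).
points = [
    {"loc": (-73.99, 40.73), "label": "east"},
    {"loc": (-74.01, 40.71), "label": "east"},
    {"loc": (-71.06, 42.36), "label": "east"},
    {"loc": (-122.42, 37.77), "label": "west"},
    {"loc": (-118.24, 34.05), "label": "west"},
]

print(knn_label(points, (-75.16, 39.95)))
```

With an index in place, the sort over every point would be replaced by a "give me the k closest" query, and only the majority vote would remain in application code.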
Replication
Store massive amounts of data
Distributed performance benefits
Dedicated databases for calculations
All the obvious benefits.
Map/Reduce
It's the brain.
It's not just for aggregation.
It's faster than you might think.
It runs in the database.
In the computer...
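For anyone new to the pattern, the flow can be sketched in a few lines of plain Python. This is only an illustration of the map, group, reduce, finalize steps, not MongoDB's implementation, and `map_reduce`, `map_words`, and `reduce_sum` are hypothetical names. In MongoDB the equivalent functions are written in JavaScript and run inside the database.

```python
from collections import defaultdict

def map_reduce(docs, map_fn, reduce_fn, finalize_fn=None):
    """Run map over every document, group emitted values by key, then reduce."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    results = {}
    for key, values in groups.items():
        reduced = reduce_fn(key, values)
        # finalize is optional and sees each key's reduced value exactly once
        results[key] = finalize_fn(key, reduced) if finalize_fn else reduced
    return results

def map_words(doc):
    """Emit (word, 1) for every word in the document's text."""
    for word in doc["text"].lower().split():
        yield word, 1

def reduce_sum(key, values):
    """Add up all the values emitted for one key."""
    return sum(values)

docs = [{"text": "the brain"}, {"text": "The database"}]
counts = map_reduce(docs, map_words, reduce_sum)
print(counts)
```

Swapping in a different `map_fn` and `reduce_fn` is what takes this beyond plain aggregation.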
Example Time!
It's simple... Just take this...
Just kidding... Let's Break Down a Naive Bayes Classifier
Classification / Naive Bayes
Training the System
Simple... $inc
Just Keep Count of Words per Category
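The counting step might look something like the sketch below. It is plain Python standing in for the database: each `+= 1` plays the role of one `$inc` on a counter field. The document shape (`words` and `totals`), the function names, and the training sentences are assumptions for illustration; only the category codes `cae` and `cs` come from the talk.

```python
from collections import defaultdict

def new_model():
    """Empty counters: word -> category -> count, plus total words per category."""
    return {"words": defaultdict(lambda: defaultdict(int)),
            "totals": defaultdict(int)}

def train(model, category, text):
    """Bump the counters for every word in text, like one $inc per word."""
    for word in text.lower().split():
        model["words"][word][category] += 1
        model["totals"][category] += 1
    return model

model = new_model()
train(model, "cae", "the film won an award")           # arts and entertainment
train(model, "cs", "the lab ran an experiment")        # science
train(model, "cae", "the band released a film score")

print(dict(model["totals"]))
```

Because training is nothing but increments, new examples can be added at any time without recomputing anything that was counted before.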
Reduce:
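A reduce step for this kind of data could be as small as the following sketch. It assumes each emitted value is a small dictionary of per-category counts for one word, which is a guess at the shape rather than the talk's actual code. `reduce_counts` is a hypothetical name. The output has the same shape as the input values, so a partial result can safely be fed back into another reduce call.

```python
from collections import Counter

def reduce_counts(key, values):
    """Merge per-category count dicts for one word by summing them."""
    total = Counter()
    for value in values:
        total.update(value)  # Counter.update adds counts rather than replacing
    return dict(total)

# Reducing in two passes gives the same answer as reducing once.
partial = reduce_counts("film", [{"cae": 1}, {"cae": 1}])
merged = reduce_counts("film", [partial, {"cs": 1}])
print(merged)
```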
Finalize:
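Turning the counts into a score per category is where the Naive Bayes part happens. The sketch below is one common way to do it, not necessarily the formula used in the talk: it assumes equal priors for every category, add-one (Laplace) smoothing, and log probabilities to avoid underflow. The `score` function and the sample counts are made up for illustration.

```python
import math

def score(word_counts, totals, text, vocab_size):
    """Return a log-score per category for text; the highest score wins."""
    words = text.lower().split()
    scores = {}
    for category, total in totals.items():
        log_p = 0.0  # equal priors assumed, so the prior term is dropped
        for word in words:
            count = word_counts.get(word, {}).get(category, 0)
            # add-one smoothing keeps unseen words from zeroing the score
            log_p += math.log((count + 1) / (total + vocab_size))
        scores[category] = log_p
    return scores

# Hypothetical counts: word -> category -> count, and total words per category.
word_counts = {
    "the": {"cae": 2, "cs": 1},
    "film": {"cae": 2},
    "award": {"cae": 1},
    "lab": {"cs": 1},
    "experiment": {"cs": 1},
}
totals = {"cae": 5, "cs": 3}

scores = score(word_counts, totals, "the film award", len(word_counts))
print(max(scores, key=scores.get))
```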
Call the Command:
Results:
Can see total words.
Can also see word counts per category.
...and of course the scores per category...
cae = arts and entertainment
cs = science
...
Accurate even with little training
MongoDB on a small VM: took 1.7 seconds
Compared to, say, PHP: 33 seconds, and it timed out
More training data == an even bigger speed advantage over PHP
This wasn't even a full map/reduce

MongoDB & Machine Learning