Robotics, Search and AI with Solr, MyRobotLab, and Deep Learning Kevin Watters Founder, KMW Technology @kwatters76 #Activate18 #ActivateSearch
What are we going to show today? • A life-sized, humanoid, 3D printed, open source robot that can learn from its surroundings and interact with humans in a meaningful and natural manner. • Teach a robot to recognize people by making an introduction. • Just as humans meet and remember each other, so should robots. • Lower the barriers to human-robot interaction.
Agenda • Introduction • InMoov • MyRobotLab • Anatomy of a Cognitive Robot • Demo • Q & A
Introduction • KMW Technology • Boston Based Search Professional Services • Founded in 2010 • Search consulting and contracting • Solr • Elastic • Deeplearning4j • NLP/NLU • ETL & Custom Connectors • www.kmwllc.com
How did I get here? • Open source supporter, committer, contributor • Enjoy teaching and sharing • Maker Faire / Maker movement • NYC / Bay Area / Denver / Boston / Paris • EE from Northeastern University • Passion for building and integrating. • Search and AI passion.
RedHat Summit & Robots for Good
Introducing Lloyd • Started Construction 2014 ? • MakerBot Replicator 2 • Powered By MyRobotLab • 2 Arduinos • 1 Raspberry Pi • 2 cameras • 25 servo motors (more to come.) • Speech Recognition • Speech Synthesis • Memory • And Telepresence and remote operation with Oculus Rift Support!
InMoov • Open Source life-sized 3D printed humanoid robot • Designed by Gael Langevin, Paris, France • Started in 2012 • Inspired 3D printed prosthetics projects • Bionico / e-NABLE • Approx. 500 exist around the world • Gael believes in Open source and that it takes a world to raise a robot. • More Info at: www.inmoov.fr
MyRobotLab I for one welcome our new robotic overlords! • Started by Greg Perry, Portland, Oregon • Java based Open Source framework • Hosted on GitHub • Borg in technologies • Over 100 open source projects integrated.. And counting! • Pub / Sub service based architecture • Scripting via Python / Jython • Multi-platform (Windows/Linux/Mac/RasPi) • www.myrobotlab.org
How do we make this robot “smart”? • How can it recognize people and interact with them? • What were the challenges? Where do we want to take the robot? • How do we get this robot to learn and understand? • What tools exist out there to solve this problem? • How can we wire it all together? • We should be able to interact with the robot without a keyboard. • We should be able to teach the robot new things. • Let’s make it cognitive!.. And open source!
Anatomy Of A Cognitive Robot Modeled anthropomorphically • Hearing ( WebkitSpeech / CMU Sphinx ) • Speaking ( MaryTTS / Polly / NaturalReader, etc. ) • Reasoning ( ProgramAB / AIML ) • Vision ( OpenCV ) • Remembering ( Solr ) • Learning ( Deeplearning4j )
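The pub / sub, service-based architecture can be pictured with a small sketch. This is not MyRobotLab's API: the `Service` class, its methods, and the topic name `publishText` are hypothetical names chosen for illustration only.

```python
from collections import defaultdict


class Service:
    """A named service that publishes data to whoever subscribed to a topic."""

    def __init__(self, name):
        self.name = name
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        # Register a callback to be invoked whenever `topic` is published.
        self.subscribers[topic].append(callback)

    def publish(self, topic, data):
        # Deliver the data to every subscriber, in subscription order.
        for callback in self.subscribers[topic]:
            callback(data)


heard = []
ear = Service("ear")  # hypothetical speech recognition service
ear.subscribe("publishText", heard.append)
ear.subscribe("publishText", lambda text: heard.append(text.upper()))
ear.publish("publishText", "hello robot")
```

The publisher never needs to know who is listening, which is what lets new services (a chatbot, a logger, a search index) be attached without changing the existing ones.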
System Architecture Review
Hearing All your utterance are belong to us.
Speech Recognition • Initially using CMU Sphinx • Need to specify a fixed grammar • Not very active project • English only • Offline • Webkit Speech Recognition • Built into Google Chrome • Accessed via WebGui & JavaScript • Supports ~ 60 languages & dialects • Requires Internet Connection
Speaking Hello, is there anybody out there?
Speech Synthesis • MaryTTS • Open source, supports a few different voices and languages • No internet connection required • Mycroft – AI Speech Synthesis • Open source, limited number of voices • No internet connection required • LocalSpeech – invoke existing command line utilities • FreeTTS • Festival • Natural Reader / Acapela Speech • Good quality • Requires internet connection • Lots of voices for various languages • Amazon Polly • Requires account, small cost • Good quality, requires internet connection
Reasoning What did you intend?
Natural Language Understanding • Based on ProgramAB • AIML 2.0 (XML based) • Created by Dr. Richard S. Wallace in 1995. (Yeah it’s old, but it works.) • Case based Reasoning • Uses recursion to break down user utterances • Special fork of the project on GitHub that supports MyRobotLab-specific “OOB” (Out-Of-Band) calls to MyRobotLab services or external web services • Pandorabots: create your own bot online • Mitsuku is the current winner of the Loebner prize and is AIML based.
ProgramAB extensions • Maven Based • Proper Logging • 40x faster loading large AIML sets • SRAIX handler for extensibility to call out to external services. • Localization / Locale support • CJK support with Lucene Tokenization • CJK Tokenizer (Chinese / Korean) • Kuromoji Tokenizer (Japanese)
AIML Tags & Simple Example • “Category” is the basic unit that defines a response. • “Pattern” is the string that defines the matching for the utterance. Patterns can have wildcards or reference a “set” of items for brevity. • “Template” defines how to handle the response. • “That” is an optional tag that specifies additional matching criteria based on the previous response from the robot. It’s used to create “multi-pass” conversations. • “Topic” is an optional tag that specifies the precedence of matching. Categories in the current topic are matched before the default topic. Useful for talking about particular subjects in more detail without giving generic responses. • Add additional mappings for an utterance of “Greetings” or “Hey” to recursively return the response for “Hi”. • Map any utterance that starts with “HELLO” to return the response for “Hi”. • User: Hello Robot! • Bot: Hello user!
    <category>
      <pattern>HI</pattern>
      <template>Hello user!</template>
    </category>
    <category>
      <pattern>GREETINGS</pattern>
      <template><srai>HI</srai></template>
    </category>
    <category>
      <pattern>HEY</pattern>
      <template><srai>HI</srai></template>
    </category>
    <category>
      <pattern>HELLO *</pattern>
      <template><srai>HI</srai></template>
    </category>
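The recursion behind `<srai>` can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not how ProgramAB works internally: it scans a dictionary in order and uses regular expressions, and the helper names (`normalize`, `pattern_to_regex`, `respond`) are made up for this sketch.

```python
import re

# The four categories from the slide: pattern -> template.
CATEGORIES = {
    "HI": "Hello user!",
    "GREETINGS": "<srai>HI</srai>",
    "HEY": "<srai>HI</srai>",
    "HELLO *": "<srai>HI</srai>",
}


def normalize(utterance):
    # Uppercase, drop punctuation, collapse whitespace.
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", utterance).upper().split())


def pattern_to_regex(pattern):
    # "*" stands for one or more words; every other token must match literally.
    parts = [r"(.+)" if tok == "*" else re.escape(tok) for tok in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")


def respond(utterance, depth=0):
    if depth > 10:
        raise RecursionError("srai chain too deep")
    text = normalize(utterance)
    for pattern, template in CATEGORIES.items():
        if pattern_to_regex(pattern).match(text):
            srai = re.fullmatch(r"<srai>(.*)</srai>", template)
            if srai:
                # <srai> re-submits its content as if it were a new utterance.
                return respond(srai.group(1), depth + 1)
            return template
    return None  # no category matched


print(respond("Hello Robot!"))
```

Because `*` needs at least one word, a bare "Hello" falls through these four categories; a real bot would add a plain `HELLO` category or a catch-all.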
Learn Tag Example • AIML has built into it the ability to add new categories and responses dynamically. • Use the wildcards to generate a template to teach the robot. • Use the “think” tag so the robot only thinks it and doesn’t say it! • Use the “learn” tag to add a new category that is filled out from the pattern. • Use the “eval” tag to return the value that matched the first and second * in the pattern. • A helper category matches any utterance starting with “What is” and returns the response for whatever comes after the words “what is”. • User: Learn that Pi is yummy • Bot: OK Pi is yummy. • User: What is Pi? • Bot: Yummy.
    <category>
      <pattern>LEARN THAT * IS *</pattern>
      <template>OK <star/> IS <star index="2"/>
        <think>
          <learn>
            <category>
              <pattern><eval><star/></eval></pattern>
              <template><eval><star index="2"/></eval></template>
            </category>
          </learn>
        </think>
      </template>
    </category>
    <category>
      <pattern>WHAT IS *</pattern>
      <template><srai><star/></srai></template>
    </category>
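The effect of `<learn>` can be mimicked with a dictionary that grows at runtime. The sketch below is an approximation under stated assumptions: it hard-codes the teaching pattern as `LEARN THAT * IS *` (which is what the sample dialogue needs for "What is Pi?" to resolve), and `learned`, `normalize` and `respond` are hypothetical names, not ProgramAB functions.

```python
import re

learned = {}  # pattern -> template, filled in at runtime by the "learn" step


def normalize(utterance):
    # Uppercase, drop punctuation, collapse whitespace.
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", utterance).upper().split())


def respond(utterance):
    text = normalize(utterance)

    teach = re.fullmatch(r"LEARN THAT (.+?) IS (.+)", text)
    if teach:
        subject, value = teach.groups()
        # The <think><learn> part: add a new category silently.
        learned[subject] = value
        return f"OK {subject} IS {value}"

    ask = re.fullmatch(r"WHAT IS (.+)", text)
    if ask:
        # <srai><star/></srai>: re-submit whatever followed "what is".
        return respond(ask.group(1))

    return learned.get(text)  # None when nothing has been learned yet


first = respond("Learn that Pi is yummy")
second = respond("What is Pi?")
```

Everything is uppercased by the normalization step, so the replies come back as "OK PI IS YUMMY" and "YUMMY"; restoring the original casing is left out of the sketch.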
Wikipedia based Q/A API Integration • Using ProgramAB to convert the utterance to a Solr query. • MyRobotLab support for indexing XML, JDBC, RSS, etc. • Support for Document Processing Pipeline with pluggable stages • Indexed Wikipedia using the Sweble Java parser • Extracting Infoboxes and indexing them as triples • Constructing high precision queries to answer free form questions. • KMW based NLU web service integrated to simplify.
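As a rough illustration of turning an utterance into a high precision query over infobox triples, the sketch below maps one question shape onto Solr request parameters. The field names `subject_s`, `predicate_s` and `object_s`, and the single regular expression, are assumptions for this example; the talk does not show the actual schema or the NLU service.

```python
import re


def question_to_solr_params(utterance):
    """Map 'What is the <predicate> of <subject>?' onto Solr query parameters."""
    match = re.fullmatch(
        r"what is the (.+) of (.+?)\??", utterance.strip(), re.IGNORECASE
    )
    if not match:
        return None  # question shape not covered by this sketch
    predicate, subject = match.groups()
    # Phrase queries on both fields keep precision high: one triple, one answer.
    query = f'subject_s:"{subject}" AND predicate_s:"{predicate}"'
    return {"q": query, "fl": "object_s", "rows": 1}


params = question_to_solr_params("What is the capital of France?")
```

The returned dictionary could be sent to a Solr `/select` handler by any HTTP client; escaping of quotes inside the captured text is omitted here.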
Seeing Peek-a-boo!
OpenCV / JavaCV • Open source… • Java bindings via the JavaCPP project. Thanks Samuel Audet! • 2000+ different algorithms to manipulate and extract information from image and video data • Support for a wide range of different hardware, from webcams to remote MJPEG video streams. • Modular, dynamic video processing pipeline of filters that enhance the image with metadata and classifications.
Memory Well, how did I get here?
Memory (Embedded and Cloud Solr) • Solr is integrated into MyRobotLab as either an embedded Solr server or an external SolrCloud instance. • Can attach to all information flows between services in MyRobotLab to capture data in flight between services • Records what the robot hears, sees, says, recognizes, and how the robot moves • Stores image data in a non-searchable binary field. • Can be queried to produce training datasets for deep learning and custom AI models.
More Solr Stuff! • Dynamically attach to the Inbox or Outbox of an MRL Service • Serialize the Message object that is passed between services into a SolrInputDocument. • Ability to specify stateful information to be tagged on data being indexed to label incoming data • Facet based metrics on the amount of data flowing through the robot’s nervous system.
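The idea of serializing a message into a flat document and tagging it with a label can be sketched as follows. The message layout and field names (`msg_id`, `sender`, `method`, `data`, `label`) are assumptions made for this example; the real code builds a `SolrInputDocument` in Java.

```python
import json


def message_to_doc(message, labels=None):
    """Flatten a message dict into a flat, Solr-style document dict."""
    doc = {
        "id": f"{message['sender']}-{message['msg_id']}",
        "sender": message["sender"],
        "method": message["method"],
        # Keep the payload as a JSON string so any data shape fits one field.
        "data": json.dumps(message["data"]),
    }
    # Stateful labels (for example, who the robot is currently looking at)
    # are stamped onto every document indexed while they are active.
    doc.update(labels or {})
    return doc


msg = {"msg_id": 1, "sender": "ear", "method": "publishText", "data": ["hello robot"]}
doc = message_to_doc(msg, labels={"label": "kevin"})
```

Faceting on `sender` or `method` over such documents would give the per-service traffic counts the last bullet mentions.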
Memory Exploration w/ Solr
Learning What good are memories if you don’t learn from them?
Deeplearning4j • Java based deep learning framework supported by SkyMind.io • GPU acceleration and native support across many platforms with the JavaCPP project. • Relies on ND4J with JNI to do heavy matrix math, modeled after Python’s NumPy • Can load models from TensorFlow, Keras, Caffe • Has a pre-trained model zoo • Supports custom network topologies • Feed forward, CNN, RNN, LSTM support
Solr for Deeplearning in 1,2,3 Solr’s native support for faceting, random sort ordering, and pagination is ideal for generating a random sampling to produce both the testing and training datasets. 1. Query to get total training dataset count with a facet on the label field to get all labels for a dataset. 2. Query with a random sort order ascending for the training dataset (max pagination offset based on percentage of dataset.) 3. Query with the same random sort seed descending for the remaining examples for the testing dataset.
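The three queries can be sketched as plain Solr request parameters. The sketch assumes a `random_*` dynamic field of type `solr.RandomSortField` (as found in Solr's example schemas), a hypothetical label field named `label`, and an index that does not change between the queries; the function names are made up for illustration.

```python
from urllib.parse import urlencode


def count_query(label_field="label"):
    # Step 1: no rows, just the total count and a facet over the labels.
    return {"q": "*:*", "rows": 0, "facet": "true", "facet.field": label_field}


def split_queries(total, train_fraction=0.8, seed=42):
    # Steps 2 and 3: the same seeded random order, read from opposite ends.
    train_rows = int(total * train_fraction)
    train = {"q": "*:*", "sort": f"random_{seed} asc", "start": 0, "rows": train_rows}
    test = {"q": "*:*", "sort": f"random_{seed} desc", "start": 0, "rows": total - train_rows}
    return train, test


train, test = split_queries(total=100)
print("/select?" + urlencode(count_query()))
print("/select?" + urlencode(train))
print("/select?" + urlencode(test))
```

Reading the first 80% in ascending order and the remaining 20% in descending order keeps the two sets disjoint as long as the seed and the index stay the same.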
VGG-16 Image Classification • Visual Geometry Group (U. Oxford) • ImageNet ILSVRC-2014 (1st runner up) • Can classify 1,000 classes of objects (ImageNet) • 16-layer deep Convolutional Neural Network • Input layer 224x224 pixels with 3 color channels (150,528 values) • Open source pre-trained (Creative Commons Attribution) • Available in Model-Zoo for DL4J
Yolo - Darknet • Yolo (You Only Look Once) • Classification and Localization (Bounding Box) • Trained on COCO dataset (Common Objects in Context) • Pretrained model, native support in OpenCV to load pre-trained model • Currently using YoloV2 (Input 416x416x3 pixels)
Transfer learning • Training neural networks takes a long time and a lot of compute resources. • Training VGG16 took multiple weeks. • 138,357,544 parameters to train • Use pre-trained model, chop off last (output) layer • Add new output layer with the number of classes that you want to classify. • Hold all layers frozen except for the output layer (~ 4,097 parameters per class to train.) • Training time ~ 5 minutes to get to 80+ % accuracy • Small training dataset ( ~ 50 examples per class.)
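The parameter counts above follow from simple arithmetic, assuming the standard VGG16 layout in which the last hidden layer has 4,096 units feeding the output layer (4,096 weights plus 1 bias per class). The function and constant names are illustrative only.

```python
VGG16_TOTAL_PARAMS = 138_357_544   # full network with the 1,000-class output layer
FC_FEATURES = 4096                 # width of the layer feeding the output layer


def output_layer_params(num_classes):
    # Each class needs one weight per input feature plus one bias term.
    return num_classes * (FC_FEATURES + 1)


# Parameters that stay frozen once the original output layer is chopped off.
frozen = VGG16_TOTAL_PARAMS - output_layer_params(1000)

# Parameters actually trained when recognizing, say, 3 people.
trainable = output_layer_params(3)

print(frozen, trainable)
```

With three classes, only about 12 thousand of roughly 134 million parameters are updated, which is why training finishes in minutes on a small dataset.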
Combine Yolo & VGG • Use Yolo detection to produce the bounding box. • Crop the image based on the bounding box and pass it to a transfer learned VGG16 model to sub-classify • Yolo detects Person • Pass the cropped image of the person to VGG16 to sub-classify and identify which person it is! • Store training data in Solr from Yolo • Custom Dataset Iterator that queries embedded Solr • Inspired by SOLR-11838
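The detect, crop, then sub-classify flow can be sketched with stand-in functions. Nothing here calls Yolo or VGG16: `detect` and `classify` are hypothetical callables passed in by the caller, the image is a plain list of pixel rows, and boxes are assumed to be `(x, y, width, height)` tuples.

```python
def crop(image, box):
    """Cut the bounding box (x, y, width, height) out of a row-major image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def identify_people(image, detect, classify):
    """Run the detector, then sub-classify only the 'person' detections."""
    names = []
    for label, box in detect(image):
        if label == "person":
            names.append(classify(crop(image, box)))
    return names


# Stand-ins for the two models, just to exercise the pipeline.
image = [[r * 10 + c for c in range(6)] for r in range(4)]


def fake_detect(img):
    return [("person", (1, 1, 2, 2)), ("chair", (0, 0, 1, 1))]


def fake_classify(patch):
    return "kevin" if patch[0][0] == 11 else "unknown"


names = identify_people(image, fake_detect, fake_classify)
```

Keeping the two models behind plain callables means either one can be swapped (a newer detector, a retrained classifier) without touching the pipeline.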
System Architecture Review
Demo • How ya doin’ Lloyd? I always liked you!
The Singularity is near! • We have presented an integration of many open source technologies to demonstrate emergent behavior and to help provide a bit of our vision of how robots and their human caretakers will evolve in the decades to come. • This is an open source project in its inception. Feel free to contribute or take from this as much or as little as you like. • Try it out! Download it.. Don’t like it? Make it better! • Want to help out? Pull Requests welcome!
What’s next? • New technologies? • DeepWave – Speech? • LSTM based Speech Recognition? • Performance and stability • Better distributed and swarm capabilities • SLAM – Simultaneous Localization and Mapping • Current Release is “Manticore” • Upcoming release is “Nixie” (Soon!) • Next release is “Ogre”
Thank you! Kevin Watters Founder, KMW Technology @kwatters76 #Activate18 #ActivateSearch

The Intersection of Robotics, Search and AI with Solr, MyRobotLab, and Deep Learning - Kevin Watters, KMW Technology
