Detection and Recognition of Objects, Faces and Emotions on Embedded Devices
What do robots see when they look at us? Can they recognize what we feel? Do they see us in a room next to other objects? Smartlife robots surely do!
Ever since deep networks like AlexNet (built by Alex Krizhevsky and colleagues at the University of Toronto) showed what CNNs can do for image recognition, we have seen many object detection models capable of running in real time: R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD and several others.
At Smartlife, we analyze the efficiency of all these models and optimize the best one to put in our smart social robots. Also, we want our robots to understand you and your emotions. To help with this we have used the latest models for Face Recognition and Face Emotion Recognition.
So let’s see what we have got!
As mentioned earlier, there are several models available for object detection, varying in accuracy and speed. For real-time applications on the edge, SSD MobileNet models are usually used. SSD stands for Single Shot (Multibox) Detector, and MobileNet is the base net. MobileNet is a computationally efficient CNN architecture designed specifically for mobile devices with very limited computing power.
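Where does MobileNet's efficiency come from? Its key trick is replacing standard convolutions with depthwise separable convolutions (a depthwise filter per channel followed by a 1x1 pointwise mix). A quick back-of-the-envelope comparison, using an illustrative layer size we picked for the example:

```python
# Cost comparison: standard vs. depthwise separable convolution (MobileNet).
# Standard conv:   K*K * C_in * C_out * H * W multiply-accumulates (MACs)
# Separable conv:  K*K * C_in * H * W  (depthwise)
#                  + C_in * C_out * H * W  (1x1 pointwise)

def conv_cost(k, c_in, c_out, h, w):
    """MAC count of a standard KxK convolution."""
    return k * k * c_in * c_out * h * w

def separable_cost(k, c_in, c_out, h, w):
    """MAC count of a depthwise KxK conv plus a 1x1 pointwise conv."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise

# Illustrative layer: 3x3 kernel, 128 -> 128 channels, 28x28 feature map.
std = conv_cost(3, 128, 128, 28, 28)
sep = separable_cost(3, 128, 128, 28, 28)
print(f"standard: {std:,} MACs, separable: {sep:,} MACs, "
      f"~{std / sep:.1f}x fewer")
```

For a 3x3 kernel the separable form needs roughly 8–9x fewer operations, which is what makes the architecture practical on small devices.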
Put simply, SSD passes an image through a stack of convolutional layers, each of which contributes detections. The early, high-resolution layers pick up small features and objects, and as we go deeper, larger objects are detected. So using just one shot (a single pass over the image), the different levels of the network detect objects of different sizes.
The below image is an example of how SSD works with a VGG base net.
A point in the input image is covered by several default bounding boxes of different aspect ratios, and the model tries to classify the contents of these boxes as known objects.
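The sizes of those default boxes follow a simple rule from the SSD paper: each of the m feature maps gets a scale s_k = s_min + (s_max - s_min)(k - 1)/(m - 1), and each aspect ratio a stretches that scale into a box of width s_k * sqrt(a) and height s_k / sqrt(a). A minimal sketch, using the paper's default s_min, s_max and aspect-ratio set (a deployed model may tune these):

```python
import math

# Default (anchor) box generation per SSD feature map.
S_MIN, S_MAX = 0.2, 0.9                      # paper defaults
ASPECT_RATIOS = [1.0, 2.0, 3.0, 1 / 2, 1 / 3]

def layer_scale(k, m):
    """Scale of the k-th of m feature maps (k starts at 1)."""
    return S_MIN + (S_MAX - S_MIN) * (k - 1) / (m - 1)

def default_boxes(k, m):
    """(width, height) of each default box, relative to image size."""
    s = layer_scale(k, m)
    return [(s * math.sqrt(a), s / math.sqrt(a)) for a in ASPECT_RATIOS]

# Early layers get small boxes (small objects), late layers large ones.
for k in (1, 3, 6):
    w, h = default_boxes(k, 6)[0]            # the square (ratio 1) box
    print(f"layer {k}: scale {layer_scale(k, 6):.2f}, "
          f"square box {w:.2f} x {h:.2f}")
```

With six feature maps, the first layer's boxes cover about 20% of the image and the last layer's about 90%, which is exactly how one network handles both small and large objects.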
With NVIDIA's TensorRT™ version of SSD MobileNet v2, which is optimized for the GPU, we were able to get an accurate and fast model. NVIDIA TensorRT™ is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications.
We laid out several objects for the model to detect, and also tested it on hand-drawn doodles, which it identified accurately. It even knows when it sees only part of some objects.
Once we got object detection running, we tried our hand at face detection and recognition. The open-source community always has something great to offer. We found a great C++ toolkit with a Python wrapper called dlib that gives us fast face recognition from even just one sample image of a person's face.
The face detected in the image is saved as a small encoding, which can then be compared against faces in the frames received from a live feed. Add a simple pan-and-tilt camera setup and some object-tracking code, and we get a system that can detect, recognize and track faces in real time.
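The comparison step itself is simple: dlib maps a face to a 128-number encoding such that two images of the same person land close together, and a Euclidean distance under roughly 0.6 is its usual "same person" rule of thumb. A sketch of that matching logic (the names and the short 4-dimensional vectors below are made up for illustration):

```python
import math

MATCH_THRESHOLD = 0.6  # dlib's usual cutoff for "same person"

def distance(enc_a, enc_b):
    """Euclidean distance between two face encodings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(enc_a, enc_b)))

def identify(known, live_encoding):
    """Return the closest known name if it is within the threshold."""
    name, best = min(
        ((n, distance(e, live_encoding)) for n, e in known.items()),
        key=lambda pair: pair[1],
    )
    return name if best < MATCH_THRESHOLD else "unknown"

# One stored encoding per person is enough for this scheme to work.
known = {"alice": [0.1, 0.2, 0.3, 0.4], "bob": [0.9, 0.1, 0.5, 0.2]}
print(identify(known, [0.12, 0.21, 0.28, 0.41]))  # near alice's encoding
print(identify(known, [0.9, 0.9, 0.9, 0.9]))      # near nobody
```

Because each known face is just a small vector, the gallery can hold many people and still be searched every frame.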
Face Emotion Recognition
Now that our robot can identify who it is looking at, the next challenge was to make it understand how you feel. That is where emotion recognition comes into play.
This is also done using another convolutional network, which analyzes the features in the face image and classifies it into one of seven emotion classes: Angry, Sad, Happy, Surprised, Disgusted, Fearful or Neutral.
Since this model works on very low-resolution input, it is easy to run it in the same pipeline as face recognition. It works with multiple faces too.
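The final step of such a classifier is generic: the network's last layer outputs one score (logit) per emotion class, a softmax turns those into probabilities, and the highest-probability class is the predicted emotion. A minimal sketch of that head (the logit values below are invented for illustration):

```python
import math

EMOTIONS = ["Angry", "Sad", "Happy", "Surprised",
            "Disgusted", "Fearful", "Neutral"]

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Map the network's 7 output scores to (label, confidence)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs[best]

label, p = classify([0.1, -1.2, 3.4, 0.8, -0.5, -2.0, 1.1])
print(f"{label} ({p:.0%})")
```

Keeping the confidence alongside the label lets the robot ignore low-confidence predictions instead of reacting to every flicker of expression.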
With this kind of information, our robots can know how you feel without you even having to say anything. A robot can join you in the happy moments, cheer you up when you are feeling down, and know to give you space when you are angry. An active feedback loop ensures the robot can always behave accordingly.
Watch the video below to see more results of our tests of these models on an NVIDIA Jetson Nano!
Smartlife Robotics is looking for partners to work together on future projects. Feel free to share your ideas on how we can introduce new AI solutions -> Contact Us
Smartlife Robotics is a startup that builds AI-powered, socially intelligent interactive robots for use in a wide spectrum of fields: https://smartlife.global