Object detection in the browser.

Maxwell Clarke
25th October 2019

AI models can now run in real time on almost any platform, from inexpensive Raspberry Pis to (in this case) web browsers. This demo shows real-time object classification and bounding-box annotation running securely in the browser, on your own device. If your browser supports WebGL (most desktop browsers and some mobile devices do), the demo will also use GPU hardware acceleration.

Specifically, this example uses TensorFlow.js and the COCO-SSD model. Both are open source and easy to download and install via npm.
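
As a rough sketch of what using these two packages looks like (not the exact code behind this demo), the snippet below loads the model and describes what it finds in an image or video element. It assumes the packages were installed with `npm install @tensorflow/tfjs @tensorflow-models/coco-ssd` and that a bundler resolves the imports. The helper names `loadDetector`, `describePrediction` and `detectAndDescribe` are made up for illustration; `load()` and `detect()` are the calls the COCO-SSD package provides.

```javascript
// Sketch only: helper names are hypothetical, package names are the real npm packages.

async function loadDetector() {
  // Importing tfjs registers the available backends (WebGL where supported).
  await import('@tensorflow/tfjs');
  const cocoSsd = await import('@tensorflow-models/coco-ssd');
  // load() downloads the model weights and resolves to a detector object.
  return cocoSsd.load();
}

// Each prediction has the shape { bbox: [x, y, width, height], class, score }.
function describePrediction(prediction) {
  const [x, y, width, height] = prediction.bbox;
  const percent = Math.round(prediction.score * 100);
  return `${prediction.class} (${percent}%) at x=${Math.round(x)}, y=${Math.round(y)}, ` +
    `${Math.round(width)}x${Math.round(height)}`;
}

// `source` can be an <img>, <video> or <canvas> element.
async function detectAndDescribe(model, source) {
  const predictions = await model.detect(source);
  return predictions.map(describePrediction);
}
```

In a page, this could be used as `const model = await loadDetector();` followed by `await detectAndDescribe(model, document.querySelector('video'))`.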

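The COCO-SSD model (which you can see in action below) recognises 80 classes of objects (the COCO category IDs run up to 90, but only 80 of them are used). It is a deep neural network trained on the freely available COCO dataset. SSD stands for Single Shot MultiBox Detection: "single shot" means that the network locates and classifies every object in one forward pass over the image, rather than first proposing regions and then classifying them. The model also looks at only a single frame at a time, and does not keep information between frames (which some more advanced models do).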
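
Because each frame is handled on its own, annotating the video amounts to drawing that frame's predictions on top of it. The sketch below shows one way this could be done with a 2D canvas context, assuming predictions of the form `{ bbox: [x, y, width, height], class, score }`. The function name `drawPredictions`, the colours and the `minScore` cut-off are illustrative choices, not values taken from this demo.

```javascript
// Draw one frame's predictions onto a 2D canvas context.
// `minScore` is a hypothetical cut-off used to hide low-confidence detections.
function drawPredictions(ctx, predictions, minScore = 0.5) {
  let drawn = 0;
  ctx.lineWidth = 2;
  ctx.strokeStyle = 'lime';
  ctx.fillStyle = 'lime';
  ctx.font = '16px sans-serif';

  for (const { bbox, class: label, score } of predictions) {
    if (score < minScore) continue; // skip detections the model is unsure about
    const [x, y, width, height] = bbox;
    ctx.strokeRect(x, y, width, height);
    // Put the label above the box, or just inside it when the box touches the top edge.
    const labelY = y > 16 ? y - 4 : y + 16;
    ctx.fillText(`${label} ${(score * 100).toFixed(0)}%`, x, labelY);
    drawn += 1;
  }
  return drawn; // number of boxes actually drawn
}
```

The context would typically come from a canvas laid over the video, for example `canvas.getContext('2d')`, cleared with `clearRect` before each new frame is drawn.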

You will notice the main limitations of this particular model relatively quickly:

  • The bounding boxes can change hugely from frame to frame. This is because the model has no knowledge of previous frames, so small variations in the image can change both the predicted box and the predicted class.
  • Object classifications might just be wrong. For example, this model often thinks that just about any object held in a hand is a "cell phone".

The second limitation can be addressed by training on a more comprehensive dataset with more object categories. However, addressing the first limitation requires a different architecture: instead of processing single images (frames), the model would need to take in multiple frames of the video stream.
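
To give a feel for what information from previous frames can buy, here is a small post-processing sketch. It is not the architectural change described above: the model stays the same, and each new box is simply blended with the best-overlapping box of the same class from the previous frame (an exponential moving average). The function names and the `alpha` and `minIou` defaults are assumptions chosen for illustration.

```javascript
// Intersection-over-union of two [x, y, width, height] boxes.
function iou(a, b) {
  const x1 = Math.max(a[0], b[0]);
  const y1 = Math.max(a[1], b[1]);
  const x2 = Math.min(a[0] + a[2], b[0] + b[2]);
  const y2 = Math.min(a[1] + a[3], b[1] + b[3]);
  const intersection = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
  const union = a[2] * a[3] + b[2] * b[3] - intersection;
  return union > 0 ? intersection / union : 0;
}

// Blend each current box with the matching box from the previous frame.
// alpha = 1 ignores history; smaller values give steadier but laggier boxes.
function smoothPredictions(previous, current, alpha = 0.5, minIou = 0.3) {
  return current.map((pred) => {
    let best = null;
    let bestIou = minIou;
    for (const old of previous) {
      if (old.class !== pred.class) continue; // only match boxes of the same class
      const overlap = iou(old.bbox, pred.bbox);
      if (overlap >= bestIou) {
        best = old;
        bestIou = overlap;
      }
    }
    if (!best) return pred; // newly appeared object: nothing to smooth against
    const bbox = pred.bbox.map((v, i) => alpha * v + (1 - alpha) * best.bbox[i]);
    return { ...pred, bbox };
  });
}
```

Feeding each frame's smoothed output back in as `previous` for the next frame steadies the boxes, but it cannot fix a wrong class and it lags behind fast-moving objects, which is why a model that actually sees several frames is the more thorough answer.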

Please click to start the interactive demo.