Output based on VOXL tflite live footage/bounding boxes
-
Hello,
I hope you are well. I am trying to write a Python script that uses the live output from the MobileNet model running on VOXL. Where do I start? How can I do this?
For example, I would like to write a Python script that prints "keyboard" each time a keyboard is detected. I would also like to take the live outputs from a depth estimation model and have the Python script print the estimated depth.
Thank you!
-
Hey Denver, happy to try and help you out with this! Where you should start is our documentation for voxl-tflite-server, which is how we often run ML models on VOXL. What you'll see from those docs is that we have a bunch of pretrained models already onboard, as well as some support for custom models. voxl-tflite-server then writes AI predictions out to a pipe using libmodal-pipe. However, libmodal-pipe currently only has a C/C++ API, so you'd have to write a simple piece of C code to consume data from the pipe and invoke your Python script accordingly. There's a rough sketch of that C side below.

I'm more than happy to help with this, but it would be good to do some reading first. I'd check out that documentation link above as well as the one on the Modal Pipe Architecture to learn how processes communicate with each other. From there I can start to answer more specific questions and help you build this out.
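To give you a rough idea of the shape of that C code, here's an untested sketch. It just reads raw bytes from the pipe with plain POSIX calls rather than the libmodal-pipe helper API, and the pipe path and detection struct here are placeholders I made up for illustration; the real definitions live in the voxl-tflite-server / libmodal-pipe interface headers, so check those before building anything on this:

```c
// Sketch: read detection data from a modal pipe and react to it.
// NOTE: PIPE_PATH and detection_t below are placeholders -- look up the
// real pipe location and struct layout in the libmodal-pipe headers.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical stand-in for the detection struct voxl-tflite-server publishes
typedef struct {
    float confidence;
    char  class_name[64];
} detection_t;

#define PIPE_PATH "/run/mpa/tflite_data"  // placeholder, verify on your VOXL

int main(void)
{
    int fd = open(PIPE_PATH, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    detection_t det;
    while (1) {
        // Block until a full detection struct arrives on the pipe
        ssize_t n = read(fd, &det, sizeof(det));
        if (n <= 0) break;                // writer closed the pipe
        if (n != sizeof(det)) continue;   // partial read, skip it

        if (strcmp(det.class_name, "keyboard") == 0) {
            printf("keyboard\n");
            // or hand off to your Python script, e.g.:
            // system("python3 /home/root/on_detection.py keyboard");
        }
    }

    close(fd);
    return 0;
}
```

In practice you'd swap the placeholder struct for the real one from the headers, and you'd probably want to use the libmodal-pipe client API instead of raw reads since it handles connecting and reconnecting to pipes for you. But hopefully this shows the overall flow: open the pipe, read predictions in a loop, and trigger your Python logic when the class you care about shows up.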
Hope this helps!
Thomas Patton