Use Hardware Acceleration with TFLite outside of voxl-tflite-server
-
For my application, it is not practical to run voxl-tflite-server all the time: I have a Python application that needs to perform a lot of inferences, but only some of the time. My object detectors are in TFLite format and currently run on the CPU. I want to use the NPU on the VOXL, but I don't know where its delegates live or how to load them. I thought they might be in /usr/lib, but I cannot get anything there to load in Python (I get undefined symbol errors).
Can someone point me to the delegate I should load in Python to use the NPU (on the VOXL2, does using the GPU delegate actually reach the NPU?), or tell me what I need to do to compile a delegate that targets the NPU?
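For reference, here is roughly what I am attempting. The delegate library path and model name below are placeholders, since the correct library is exactly what I am trying to find:

```python
# Roughly what I am trying (this is where the undefined symbol
# errors appear). NOTE: '/usr/lib/libsome_delegate.so' and
# 'detector.tflite' are placeholders.
import tflite_runtime.interpreter as tflite

delegate = tflite.load_delegate('/usr/lib/libsome_delegate.so')
interpreter = tflite.Interpreter(
    model_path='detector.tflite',
    experimental_delegates=[delegate],
)
interpreter.allocate_tensors()
```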
-
I believe the TFLite delegate logic is available only in C++, per the TFLite sources. You would likely want to implement your inference code in C++ anyway, since on VOXL your inference times will likely be much faster than in Python.
We have some examples of assigning delegates inside our voxl-tflite-server source code; check out these lines: https://gitlab.com/voxl-public/voxl-sdk/services/voxl-tflite-server/-/blob/master/src/inference_helper.cpp#L216
Thomas
thomas.patton@modalai.com