YOLOv8 with NPU (VOXL2 Mini)
-
Dear Dev Team,
My goal is to run the yolov8s-oiv model on the NPU. I've quantized the model to INT8 following this method:
```python
from ultralytics import YOLO

model = YOLO('yolov8s-oiv7.pt')
model.export(format='tflite', int8=True, data='dataset.yaml',
             imgsz=640, nms=False, single_cls=False)
```

where `dataset.yaml` points to the calibration dataset.
When I run the quantized model on the VOXL2 Mini (SDK 1.6.3), inference silently falls back to the CPU, even though I have edited `voxl-tflite-server.conf` to set the delegate to `nnapi`.
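For reference, the relevant part of my config is below (other fields omitted; exact key names may differ slightly between SDK versions):

```json
{
    ...
    "delegate": "nnapi",
    ...
}
```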
I'd really appreciate any guidance on best practices for quantizing YOLOv8 for the NPU, along with pointers to the relevant documentation. As a sanity check, could you also share a known-good quantized model that I can use to verify the NPU itself is functional?
Thanks,
Yash