No detections when running custom YOLOv8 model on voxl-tflite-server
-
Hello,
I am trying to run a custom YOLOv8 model on voxl-tflite-server. The model detects ships, and the yolov8_labels.txt file contains only one class, ship. However, when I run the tflite server and view it on voxl-portal, I can see the video feed but no bounding-box detections, even when the target is in the camera frame.
I tried another variation of the labels file with the class index before the label name, like this: 0 ship, but that doesn't work either.
I also ran voxl-inspect-detections, but it doesn't show any detections there. When I tested the default YOLOv5 and YOLOv8 models on voxl-tflite-server, they display the bounding boxes and show the list of detections in voxl-inspect-detections just fine.
If it helps, I used this command to convert the YOLOv8 model to the tflite format:
```
yolo export model=best.pt format=tflite
```
I use the quantized 16-bit tflite model named yolov8_best_float16.tflite. This is how I set up the config file /etc/modalai/voxl-tflite-server.conf:
```json
{
    "skip_n_frames": 0,
    "model": "/usr/bin/dnn/yolov8_best_float16.tflite",
    "input_pipe": "/run/mpa/front_small_color/",
    "delegate": "gpu",
    "requires_labels": true,
    "labels": "/usr/bin/dnn/yolov8_labels.txt",
    "allow_multiple": false,
    "output_pipe_prefix": "yolov8"
}
```
Is there anything I missed that is leading to no detections on voxl-tflite-server? I would appreciate any help!
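In case it helps with debugging, this is the kind of off-board sanity check I can run on the exported model before deploying it (a minimal sketch using the standard tf.lite.Interpreter Python API; the zero-filled dummy frame is just a stand-in for a real test image):
```python
# Sketch: run one dummy frame through the exported .tflite model off-board
# and print the raw output shape, to confirm the file loads and infers at all.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="yolov8_best_float16.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

dummy = np.zeros(inp["shape"], dtype=inp["dtype"])  # stand-in for a real image
interpreter.set_tensor(inp["index"], dummy)
interpreter.invoke()

print("input :", inp["shape"], inp["dtype"])
print("output:", interpreter.get_tensor(out["index"]).shape)
```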
-
Hello @svempati, can you paste your yolov8_labels and tflite file, and I can test it out on my end?
To confirm - you followed the instructions in this gitlab repository: https://gitlab.com/voxl-public/support/voxl-train-yolov8
Zach
-
Hi @Zachary-Lowell-0, yes, I confirm that I followed the instructions in that gitlab repository.
Here are the tflite file and labels file: https://drive.google.com/drive/folders/1kyjanabVSP_pH_jsQyjQG9z6hFYZ_iij?usp=drive_link
-
@svempati said in No detections when running custom YOLOv8 model on voxl-tflite-server:
"labels": "/usr/bin/dnn/yolov8_labels.txt",
So, running your model, we get the following errors via voxl-tflite-server:
```
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
```
This means there is an issue in the model itself, most likely introduced during the build process. Specifically, it is a model issue, not a labels-file issue: your .tflite model has an output tensor with a different data type than what voxl-tflite-server expects.
The code itself shows the error when you hit this case statement:
```cpp
// Gets the uint8_t tensor data pointer
template <>
inline uint8_t *TensorData(TfLiteTensor *tensor, int batch_index)
{
    int nelems = 1;
    for (int i = 1; i < tensor->dims->size; i++) {
        nelems *= tensor->dims->data[i];
    }
    switch (tensor->type) {
    case kTfLiteUInt8:
        return tensor->data.uint8 + nelems * batch_index;
    default:
        fprintf(stderr, "Error in %s: should not reach here\n", __FUNCTION__);
    }
    return nullptr;
}
```
This means the output tensor doesn't match the expected types in this header file. Please look into your model.
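If you want to confirm on your end what type your output tensor actually is, a quick check along these lines should show it (a sketch using the standard TF Lite Python API rather than the server's C++ code):
```python
# Sketch: print the output tensor dtype(s) that voxl-tflite-server will see.
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="yolov8_best_float16.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_output_details():
    # The server's TensorData<float> path expects kTfLiteFloat32 here.
    print(detail["name"], detail["shape"], detail["dtype"])
```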
Zach
-
These are all the potential data types that voxl-tflite-server is expecting (the TensorData<> specializations in its tensor_data.h header).
-
I see, so my model is not supported by voxl-tflite-server since it is float16, and the tflite server only supports 32-bit precision if we want to use floating-point values. Am I understanding that correctly, or am I missing something? Because the default YOLOv5 model included on the VOXL 2 (yolov5_float16_quant.tflite) is also float16 precision, so I wonder how the functions in tensor_data.h handle that.
One question: what command did you use to view these error logs from voxl-tflite-server?
```
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
Error in TensorData<float>: should not reach here
```
-
@svempati said in No detections when running custom YOLOv8 model on voxl-tflite-server:
One question: what command did you use to view these error logs from voxl-tflite-server?
I just ran voxl-tflite-server in the foreground, directly on the command line, instead of in the background via systemd - that way the errors print straight to the terminal. I would recommend NOT quantizing your model, as the directions in the train-yolov8 repository do not recommend that.
Zach
-
@Zachary-Lowell-0 Got it, I will try that out and will let you know if I have any more questions. Thanks for your help!
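For reference, this is my understanding of the unquantized export Zach is describing (a sketch using the Ultralytics Python API; half and int8 are simply left at their default False, i.e. no quantization):
```python
# Sketch: export without float16/int8 quantization (Ultralytics Python API).
from ultralytics import YOLO

model = YOLO("best.pt")
# half=False and int8=False are the defaults; written out here for clarity.
model.export(format="tflite", half=False, int8=False)
# Then point voxl-tflite-server at the resulting *_float32.tflite artifact.
```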
-
@Zachary-Lowell-0 I wanted to follow up with you again on this; the issue seems to be the model conversion process from PyTorch to tflite. To confirm this, I tried it with the default yolov8n.pt downloaded from Ultralytics by entering this command from the gitlab repository:
```
python export.py yolov8n.pt
```
so that I create a new yolov8n_float16.tflite file. However, running this file on voxl-tflite-server shows this output before displaying Error in TensorData<float>: should not reach here:
```
WARNING: Unknown model type provided! Defaulting post-process to object detection.
INFO: Created TensorFlow Lite delegate for GPU.
INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
Successfully built interpreter
------VOXL TFLite Server------
4 5 6
4 5 6
Connected to camera server
```
I even tried running export.py in the voxl emulator to account for any differences in CPU architecture between my computer and the VOXL (x86 vs. ARM), but I still get the same error. Do you think there is anything I could be missing? Thank you!
-
@svempati said in No detections when running custom YOLOv8 model on voxl-tflite-server:
WARNING: Unknown model type provided! Defaulting post-process to object detection.
@svempati I will try recreating this issue today and get back to you!
-
@Zachary-Lowell-0 Just wanted to follow up to see if you were able to replicate this issue?
-
@svempati I was able to use an open-source training set and then leverage the docs to make my own custom YOLOv8 model capable of running on the VOXL 2 - do you want to send me the actual dataset, and I can try creating the model and training it on a TPU?
-
@Zachary-Lowell-0 I would first like to diagnose what is causing the YOLOv8 model to not work on the VOXL 2 for me. Will it only work when you train the model and export it to tflite on a TPU? I am getting the issue even if I train the YOLOv8 model on an open-source dataset or use the pretrained yolov8n.pt model downloaded from Ultralytics. I want to make sure a YOLOv8 model trained from scratch on an open-source dataset works on the VOXL, so that I can move on to using my custom dataset. In case there is no other workaround, I can send you the dataset I am using.
Thanks!
-
Let me try to train a custom model and record a Loom of it in the next few days, and I'll get that over to you showing how I do it!
-
https://www.loom.com/share/bf52e252ab09444bb366f265a3f36dc5
Alright, please take a look at this Loom - it might help point you in the right direction in terms of training your model.
Zach
-
@Zachary-Lowell-0 Thanks for sharing the video! I looked at it, and I did pretty much the same steps you did. It might be worth mentioning that I had to modify the Dockerfile, because the one in the documentation was throwing a version-mismatch error when installing the onnx part.
This was the original Docker command:
```
RUN pip3 install ultralytics tensorflow onnx "onnx2tf>1.17.5,<=1.22.3" tflite_support onnxruntime onnxslim "onnx_graphsurgeon>=0.3.26" "sng4onnx>=1.0.1" tf_keras
```
I modified it to this:
```
RUN pip3 install ultralytics tensorflow "onnx2tf>1.17.5,<=1.22.3" tflite_support onnxruntime onnxslim "onnx_graphsurgeon>=0.3.26" "sng4onnx>=1.0.1" tf_keras
RUN pip3 install onnx==1.20.1
```
I don't think this should cause any issues, but could you confirm?
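In case it helps the comparison, this is a quick way to confirm which versions actually ended up in the container (a small sketch using Python's standard importlib.metadata; the package list mirrors the Dockerfile above):
```python
# Sketch: print the resolved versions of the packages pinned in the Dockerfile.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("ultralytics", "tensorflow", "onnx", "onnx2tf", "onnxruntime"):
    try:
        print(f"{pkg:12s} {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg:12s} (not installed)")
```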