Can anyone recommend a Tflite Colab Notebook for VOXL2 Training
-
@tom hey Tom, i'm having absolutely no luck.
i've tried 3 times and it still just hangs at
Would you like to continue with the VOXL2 (m0054) system image flash?
- Yes
- No
#? 1
I then followed the unbrick instructions and reinstalled everything per
https://docs.modalai.com/voxl2-unbricking/#ubuntu-host
Got the system back up and running and tried to install the latest SDK again with no luck.
It just hangs.
-
UPDATE: Got it working by running the install with "sudo". Normally one would get a permission error, so I thought maybe that was the issue, and sure enough it was. Recommend updating your docs to say sudo ./install.sh
-
@tom so i trained on a new batch of AR15 images and got really good numbers in terms of losses and mAPs. Ran an unquantized and quantized version in voxl-tflite-server and again nothing is being recognized.
Here's a link to the tflites, and saved_models with inference results on never before seen images.
Any insight on how to make these models work in your environment would be awesomely appreciated.
https://drive.google.com/drive/folders/1N1pU0jMRTb3rODSfIuETPrBf66m4ody7?usp=drive_link
-
@sansoy Interesting, sudo isn't normally required. I'm curious, what linux distro are you running?
-
@sansoy @Thomas-Patton is the ML expert here and I'll let him comment on that front
-
@tom Ubuntu 22.04.3 LTS
-
@sansoy Huh, okay, that's what I run as well.
What groups are your default user in? For example, here is mine:
~ groups
tom adm dialout cdrom sudo dip plugdev lpadmin lxd sambashare docker
-
@tom
eve@eve:~$ groups
eve adm cdrom sudo dip plugdev lpadmin lxd sambashare
-
@sansoy Can you try adding your user to the dialout group and seeing if that fixes the issue?
sudo usermod -a -G dialout $USER
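As a quick way to see where the two `groups` outputs in this thread differ, here is a minimal Python sketch. The helper names `parse_groups` and `missing_groups` are made up for illustration, and the group lists are copied from the posts above with the leading username dropped.

```python
def parse_groups(line):
    """Split the output of `groups` (username already removed) into names."""
    return line.split()


def missing_groups(mine, reference):
    """Return the groups in `reference` that are absent from `mine`, in order."""
    have = set(mine)
    return [g for g in reference if g not in have]


# Group lists copied from the two posts above.
tom = parse_groups("adm dialout cdrom sudo dip plugdev lpadmin lxd sambashare docker")
eve = parse_groups("adm cdrom sudo dip plugdev lpadmin lxd sambashare")

print(missing_groups(eve, tom))  # ['dialout', 'docker']
```

Keep in mind that a change made with usermod only applies to new login sessions, so log out and back in before rerunning `groups`.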
-
@tom did that and still no inference.
-
@sansoy That was for fixing the fastboot issue, unrelated
-
When I get some time today I'll try to download your .tflite models and see what's going on. The good news is that the server is at least running! It very well may just be an issue with how the tensor is being parsed.
Thomas
thomas.patton@modalai.com
-
Hey, they just gave me access to the Google Drive folder. Can you confirm that the edgetpu.tflite in the root directory is the file you want me to try and get working?
Thomas
-
@thomas i just checked permissions and you have access to all the files and yes to
edgetpu.tflite
ssd-mobilenet-v2-fpnlite-640_quant.tflite
ssd-mobilenet-v2-fpnlite-640.tflite
saved_models/saved_model.pb
-
Of these 4 models, which is the custom trained YOLO model you mentioned above? I can't even get the edgetpu.tflite model to load, so I need a little bit more information on how each of these files was generated.
Thomas
-
@thomas here's a yolov5 trained model.
https://drive.google.com/file/d/1wRbIXdylgx-EOGDLnuWnsytGd2DcTNeY/view?usp=drive_link
I used this instruction set to train a yolov5 model which works well on my mac, rPI4, linux box and nvidia jetson.
https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/#23-organize-directories
-
@thomas here's the colab notebook i used to create the tflites and quantized tflites
https://colab.research.google.com/drive/1QdgpSl63OSQdLTnFwOyP8dxLQ7W0HtmW?usp=sharing
-
I found the bug in our code that's causing the issue. You can see on this line that for YOLO models a former employee hardcoded the number of classes to be 80, which leads to a segfault when the model has only a single class, as in your case. Here's what your model looks like on our server with the fix:
What I'm going to do is write a patch for voxl-tflite-server that more intelligently reads the number from the labels file. This will be available in our SDK Nightlies and so you'll be able to use the fix tomorrow morning. If you'd like the fix sooner, what I can do is package the new voxl-tflite-server into a .deb so you can deploy it manually to your VOXL. I can help you out with this process if you're not familiar.
Thanks,
Thomas Patton
thomas.patton@modalai.com
-
@thomas
https://gitlab.com/voxl-public/voxl-sdk/services/voxl-tflite-server/-/merge_requests/22
Here's the merge request with the fix. It will be published both in tonight's nightly and in new SDK releases.
Thomas
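
For anyone who lands here with the same symptom, the idea behind the fix can be sketched roughly as below. This is a Python illustration only, not the actual voxl-tflite-server code. It assumes a labels file with one class per line and the usual YOLOv5 output layout of 4 box values, 1 objectness score, then one score per class in each row. The names `count_classes` and `split_rows` and the sample numbers are hypothetical.

```python
def count_classes(labels_text):
    """Count classes, assuming one non-empty line per class in the labels file."""
    return sum(1 for line in labels_text.splitlines() if line.strip())


def split_rows(flat_output, num_classes):
    """Split a flat YOLOv5-style output into rows of 4 box + 1 objectness + class scores."""
    row_len = 5 + num_classes
    if len(flat_output) % row_len != 0:
        # A wrong class count is caught here instead of reading past the buffer.
        raise ValueError(
            f"output length {len(flat_output)} is not a multiple of row length {row_len}"
        )
    return [flat_output[i:i + row_len] for i in range(0, len(flat_output), row_len)]


labels_text = "AR15\n"  # hypothetical contents of a single-class labels file
num_classes = count_classes(labels_text)

# Two made-up detections: x, y, w, h, objectness, class score.
flat = [0.5, 0.5, 0.2, 0.2, 0.9, 0.8,
        0.1, 0.1, 0.3, 0.3, 0.2, 0.1]
rows = split_rows(flat, num_classes)
print(num_classes, len(rows))  # 1 2
```

With the class count hardcoded to 80, each row would be read as 85 values instead of 6, which is the kind of overrun that shows up as a segfault in native code.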
-
@thomas you rock! awesome! thanks for troubleshooting.