ModalAI Forum

    Can anyone recommend a Tflite Colab Notebook for VOXL2 Training

    • ?
      A Former User @sansoy
      last edited by

      @sansoy

      Hey, happy to try and help resolve this!

      The first thing I notice is that the voxl-tflite-server is defaulting your model to object detection which seems incorrect as you have a YOLO model. This is because the tflite server does a string compare call to determine which model is being used as seen here. What this means is you'll need to rename your model to yolov5_float16_quant.tflite for the time being to get proper YOLO processing. Obviously this isn't ideal and we're working on making this functionality better in a future software release.
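To illustrate why the rename matters, here is a minimal sketch of filename-based model selection. Only the filename `yolov5_float16_quant.tflite` comes from this thread; the function name and the return labels are hypothetical and are not the actual voxl-tflite-server code.

```python
from pathlib import Path

# Only this filename comes from the thread. The labels "yolo" and
# "object_detection" are hypothetical stand-ins for the real code paths.
EXPECTED_YOLO_NAME = "yolov5_float16_quant.tflite"


def select_postprocessing(model_path: str) -> str:
    """Pick a post-processing path from the model's filename alone."""
    name = Path(model_path).name
    if name == EXPECTED_YOLO_NAME:
        return "yolo"
    # Any other name falls through to the default, even if the file
    # really contains a YOLO network.
    return "object_detection"
```

Under this assumption, a YOLO model saved under any other filename would be handled as a generic object detector, which matches the behavior described above.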

      However, I'm more curious about that "should not reach here" message. I've traced that back to inference_helper.cpp which is likely hitting this line or one of the other ones which are similar to it. So the TFLite Server is attempting to read in your Tensors in some expected format but it's differing from the type it's getting.
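One way to see which tensor types a model actually exposes is to load it on a development machine and print its input and output details. The sketch below assumes TensorFlow is installed there and uses its `tf.lite.Interpreter` Python API; the helper names are made up for illustration, and which types the server accepts is not stated in this thread.

```python
def describe_tensors(details):
    """Format the dicts returned by get_input_details()/get_output_details()."""
    lines = []
    for d in details:
        # "dtype" is normally a NumPy type such as numpy.float32.
        dtype = getattr(d["dtype"], "__name__", str(d["dtype"]))
        shape = [int(n) for n in d["shape"]]
        lines.append(f'{d["name"]}: shape={shape} dtype={dtype}')
    return lines


def inspect_model(model_path):
    """Return (input descriptions, output descriptions) for a .tflite file."""
    # Imported lazily so describe_tensors() works without TensorFlow installed.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return (
        describe_tensors(interpreter.get_input_details()),
        describe_tensors(interpreter.get_output_details()),
    )
```

Posting the printed input and output lines alongside the log would make it easier to compare the model against what the server tries to read.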

      What you should try first is the renaming suggestion I mentioned in the first paragraph. It's possible that because it's defaulting to an object detection model and not a YOLO model, that's causing the server to read in your Tensors as the wrong datatype. If that doesn't work, if you could provide me with the output of cat /etc/modalai/voxl-tflite-server.conf that might help me in better diagnosing your issue. If I can't help, I may need you to pass along your actual model file so that I can load in your exact configuration and do some debugging to find the issue.

      Sorry about this!

      Thomas Patton
      thomas.patton@modalai.com

      • S
        sansoy @Guest
        last edited by

        @Thomas-Patton Thanks for your response. I didn't realize you were checking for exact names.

        I renamed my YOLO model to yolov5_float16_quant.tflite and updated the yolov5_labels.txt file,
        but I'm still getting an error.
        Jan 9 15:06:30 m0054 systemd[1]: Started voxl-tflite-server.
        Jan 9 15:06:30 m0054 bash[5690]: INFO: Created TensorFlow Lite delegate for GPU.
        Jan 9 15:06:47 m0054 bash[5690]: INFO: Initialized OpenCL-based API.
        Jan 9 15:06:47 m0054 bash[5690]: INFO: Created 1 GPU delegate kernels.
        Jan 9 15:06:47 m0054 bash[5690]: ------VOXL TFLite Server------
        Jan 9 15:06:47 m0054 bash[5690]: Segmentation fault:
        Jan 9 15:06:47 m0054 bash[5690]: Fault thread: voxl-tflite-ser(tid: 5770)
        Jan 9 15:06:47 m0054 bash[5690]: Fault address: 0x656972623d3d206e
        Jan 9 15:06:47 m0054 bash[5690]: Unknown reason.
        Jan 9 15:06:47 m0054 bash[1410]: ERROR in pipe_client_init_channel opening request pipe: No such device or address
        Jan 9 15:06:47 m0054 bash[1410]: Most likely the server stopped without cleaning up
        Jan 9 15:06:47 m0054 bash[1410]: Client is cleaning up pipes for the server
        Jan 9 15:06:47 m0054 systemd[1]: voxl-tflite-server.service: Main process exited, code=killed, status=11/SEGV
        Jan 9 15:06:47 m0054 systemd[1]: voxl-tflite-server.service: Failed with result 'signal'.
        Jan 9 15:06:48 m0054 systemd[1]: voxl-tflite-server.service: Service hold-off time over, scheduling restart.
        Jan 9 15:06:48 m0054 systemd[1]: voxl-tflite-server.service: Scheduled restart job, restart counter is at 16.
        Jan 9 15:06:48 m0054 systemd[1]: Stopped voxl-tflite-server.
        Jan 9 15:06:48 m0054 systemd[1]: Started voxl-tflite-server.
        Jan 9 15:06:48 m0054 bash[5774]: INFO: Created TensorFlow Lite delegate for GPU.
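One detail in the log above that may be worth checking: the fault address 0x656972623d3d206e consists entirely of printable ASCII bytes. The sketch below decodes it, assuming the VOXL 2 CPU stores pointers in little-endian order (the usual case for 64-bit ARM Linux). The function name is made up for illustration.

```python
def address_as_ascii(fault_address: str):
    """Return the address bytes as text if all eight are printable ASCII, else None."""
    # Little-endian: the least significant byte comes first in memory.
    raw = int(fault_address, 16).to_bytes(8, "little")
    if all(0x20 <= b <= 0x7E for b in raw):
        return raw.decode("ascii")
    return None


print(repr(address_as_ascii("0x656972623d3d206e")))  # 'n ==brie'
```

A pointer that reads as text is often a sign that string data overwrote it, so this may be a hint to look at the text inputs the server reads. It is only an observation and does not identify the cause.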

        here's my voxl-tflite-server.conf
        /**
         * This file contains configuration that's specific to voxl-tflite-server.
         *
         * skip_n_frames      - how many frames to skip between processed frames. For 30Hz
         *                      input frame rate, we recommend skipping 5 frame resulting
         *                      in 5hz model output. For 30Hz/maximum output, set to 0.
         * model              - which model to use. Currently support mobilenet, fastdepth,
         *                      posenet, deeplab, and yolov5.
         * input_pipe         - which camera to use (tracking, hires, or stereo).
         * delegate           - optional hardware acceleration: gpu, cpu, or nnapi. If
         *                      the selection is invalid for the current model/hardware,
         *                      will silently fall back to base cpu delegate.
         * allow_multiple     - remove process handling and allow multiple instances
         *                      of voxl-tflite-server to run. Enables the ability
         *                      to run multiples models simultaneously.
         * output_pipe_prefix - if allow_multiple is set, create output pipes using default
         *                      names (tflite, tflite_data) with added prefix.
         *                      ONLY USED IF allow_multiple is set to true.
         */
        {
            "skip_n_frames": 0,
            "model": "/usr/bin/dnn/yolov5_float16_quant.tflite",
            "input_pipe": "/run/mpa/hires_color",
            "delegate": "gpu",
            "allow_multiple": false,
            "output_pipe_prefix": "mobilenet"
        }
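For anyone who wants to sanity-check this file offline, here is a minimal sketch that strips the leading comment block and parses the rest as JSON. The helper names and the two checks are illustrative assumptions. The only rule taken from the file itself is that output_pipe_prefix is used only when allow_multiple is true.

```python
import json
import re


def load_voxl_conf(text):
    """Drop the leading /** ... */ comment block, then parse the JSON body."""
    body = re.sub(r"\A\s*/\*.*?\*/", "", text, count=1, flags=re.DOTALL)
    return json.loads(body)


def check_conf(conf):
    """Return human-readable notes about settings that look inconsistent."""
    notes = []
    if not conf["model"].endswith(".tflite"):
        notes.append("model does not point to a .tflite file")
    if conf.get("output_pipe_prefix") and not conf.get("allow_multiple"):
        notes.append("output_pipe_prefix is ignored unless allow_multiple is true")
    return notes
```

Run against the file above, the second check fires: "mobilenet" is set as the prefix while allow_multiple is false, so according to the file's own comment the prefix has no effect.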

        • ?
          A Former User @sansoy
          last edited by

          @sansoy

          Thanks for an informative response. One thing that's confusing me is the message "ERROR in pipe_client_init_channel" as the pipe_client_init_channel method is deprecated. Do you mind letting me know what version of the SDK you're on? If it isn't the most recent SDK, it's probably worth upgrading to see if it fixes anything. I know we've put out a lot of changes in libmodal-pipe. You can read how to flash the latest SDK here.

          Unfortunately just from these debug messages I can't pin down the issue and so I might need you to provide me with a model file to help out more. I can understand if you don't want to leak your trained model file, though. One thing you could do in this case would be to just train for a single epoch just as a means of creating a model through the same process. If I have a model file I can do some more rigorous debugging to determine the issue.

          Thanks and sorry about all of this!

          Thomas Patton
          thomas.patton@modalai.com

          • S
            sansoy @Guest
            last edited by

            @Thomas-Patton

            voxl2:/$ voxl-version

            system-image: 1.6.2-M0054-14.1a-perf
            kernel: #1 SMP PREEMPT Fri May 19 22:19:33 UTC 2023 4.19.125

            hw version: M0054

            voxl-suite: 1.0.0

            I will update to 1.0.1.

            Can I email you my tflite and saved model for review? I'm doing a run right now that should be completed in a couple hours.
            Sabri

            • tomT
              tom admin @sansoy
              last edited by

              @sansoy You should upgrade to the latest SDK (1.1.2)
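A small sketch for checking this from saved `voxl-version` output, assuming the `voxl-suite: X.Y.Z` line format shown earlier in the thread. The helper names are made up for illustration.

```python
def parse_version(text):
    """Turn a dotted version such as '1.1.2' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def voxl_suite_version(voxl_version_output):
    """Find the 'voxl-suite: X.Y.Z' line and return its version tuple, or None."""
    for line in voxl_version_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "voxl-suite":
            return parse_version(value)
    return None


output = "system-image: 1.6.2-M0054-14.1a-perf\nhw version: M0054\nvoxl-suite: 1.0.0\n"
needs_upgrade = voxl_suite_version(output) < parse_version("1.1.2")
print(needs_upgrade)  # True
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.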

              • S
                sansoy @tom
                last edited by

                @tom I downloaded the upgrade and started it, but it's been stuck for about an hour.
                How long does it take to flash the upgrade?
                Sabri

                Flashing the following System Image:
                Build Name: 1.7.1-M0054-14.1a-perf-nightly-20231025
                Build Date: 2023-10-25
                Platform: M0054
                System Image Version: 1.7.1

                Installing the following version of voxl-suite:
                voxl-suite Version: 1.1.2

                Would you like to continue with SDK install?

                1. Yes
                2. No
                  #? yes
                  [ERROR] invalid option
                  #? 1
                  [INFO] adb installed
                  [INFO] fastboot installed

                ---- Starting System Image Flash ----
                ----./flash-system-image.sh ----
                Detected OS: Linux

                Installer Version: 0.8
                Image Version: 1.7.1

                Please power off your VOXL, connect via USB,
                then power on VOXL. We will keep searching for
                an ADB or Fastboot device over USB
                [INFO] Found ADB device
                [INFO] Rebooting to fastboot
                .
                [INFO] Found fastboot device
                [WARNING] This system image flash is intended only for the following
                platform: VOXL2 (m0054)

                      Make sure that the device that will be flashed is correct.
                      Flashing a device with an incorrect system image will lead
                      the device to be stuck in fastboot.
                

                Would you like to continue with the VOXL2 (m0054) system image flash?

                1. Yes
                2. No
                  #? 1
                • tomT
                  tom admin @sansoy
                  last edited by

                  @sansoy It should start right away. I would power cycle your VOXL 2 and try again.

                  • S
                    sansoy @tom
                    last edited by

                    @tom I did all that and it's still stuck. Could it be what the warning says about being stuck in fastboot?
                    It is the VOXL 2 and not the VOXL 2 Mini.

                    [WARNING] This system image flash is intended only for the following
                    platform: VOXL2 (m0054)

                          Make sure that the device that will be flashed is correct.
                          Flashing a device with an incorrect system image will lead
                          the device to be stuck in fastboot.
                    
                    • tomT
                      tom admin @sansoy
                      last edited by

                      @sansoy As long as you are using the voxl2 SDK and are indeed flashing voxl2 hardware then that warning can be ignored.

                      • S
                        sansoy @tom
                        last edited by

                        @tom Hey Tom, I'm having absolutely no luck.
                        I've tried 3 times and it still just hangs at

                        Would you like to continue with the VOXL2 (m0054) system image flash?

                        1. Yes
                        2. No
                          #? 1

                        I then followed the unbrick instructions and reinstalled everything per
                        https://docs.modalai.com/voxl2-unbricking/#ubuntu-host

                        Got the system back up and running and tried to install the latest SDK again with no luck.
                        It just hangs.

                        • S
                          sansoy @sansoy
                          last edited by sansoy

                          UPDATE: Got it working by running the install with "sudo". Normally one would get a permission error, so I wasn't sure, but I thought that might be the issue and sure enough it was. I recommend updating your docs to
                          say sudo ./install.sh

                          • S
                            sansoy @sansoy
                            last edited by

                            @tom I trained on a new batch of AR15 images and got really good numbers in terms of losses and mAPs. I ran an unquantized and a quantized version in voxl-tflite-server and again nothing is being recognized.

                            Here's a link to the tflite files and saved_models, with inference results on never-before-seen images.
                            Any insight on how to make these models work in your environment would be greatly appreciated.

                            https://drive.google.com/drive/folders/1N1pU0jMRTb3rODSfIuETPrBf66m4ody7?usp=drive_link

                            • tomT
                              tom admin @sansoy
                              last edited by

                              @sansoy Interesting, sudo isn't normally required. I'm curious, what Linux distro are you running?

                              • tomT
                                tom admin @sansoy
                                last edited by

                                @sansoy @Thomas-Patton is the ML expert here, so I'll let him comment on that front.

                                • S
                                  sansoy @tom
                                  last edited by

                                  @tom Ubuntu 22.04.3 LTS

                                  • tomT
                                    tom admin @sansoy
                                    last edited by

                                    @sansoy Huh, okay, that's what I run as well.

                                    What groups is your default user in? For example, here are mine:

                                    $ groups
                                    tom adm dialout cdrom sudo dip plugdev lpadmin lxd sambashare docker
                                    
                                    • S
                                      sansoy @tom
                                      last edited by

                                      @tom eve@eve:~$ groups
                                      eve adm cdrom sudo dip plugdev lpadmin lxd sambashare

                                      • tomT
                                        tom admin @sansoy
                                        last edited by

                                        @sansoy Can you try adding your user to the dialout group and seeing if that fixes the issue?

                                        sudo usermod -a -G dialout $USER
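To confirm the change took effect, the output of `groups` can be compared against the groups you expect. This is a minimal sketch; the function name is hypothetical, and `dialout` is the only required group named in this thread.

```python
def missing_groups(groups_output, required=("dialout",)):
    """Return the required groups that do not appear in `groups` output."""
    # `groups` may print "user : g1 g2" or just "g1 g2"; keep what follows any colon.
    names = groups_output.split(":")[-1].split()
    return [g for g in required if g not in names]


print(missing_groups("eve adm cdrom sudo dip plugdev lpadmin lxd sambashare"))  # ['dialout']
```

Note that a new group membership only applies to sessions started after the change, so log out and back in before re-running `groups`.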

                                        • S
                                          sansoy @tom
                                          last edited by

                                          @tom I did that and still no inference.
                                          tflite_1705069939.png

                                          • tomT
                                            tom admin @sansoy
                                            last edited by

                                            @sansoy That was for fixing the fastboot issue; it's unrelated to inference.
