ModalAI Forum

    VOXL2 RTSP decoding fails

    VOXL 2
    • A
      Aaky
      last edited by Aaky

      Hello community,
      I am using SDK 1.1.3 with VOXL2. I have a use case where I need to decode an RTSP stream from my gimbal camera and pass it to voxl-tflite-server for inference. I referred to this thread and was able to run the example code rtsp_rx_mpa_pub.py. I had to modify the Python code a bit so that its output can be used as input to voxl-tflite-server. Everything works now, except that VOXL sometimes reboots at random when I open the tflite stream in voxl-portal. I have also installed voxl-mpa-tools and voxl-opencv with the python3 bindings provided in the thread mentioned above, but the failure still happens sometimes.

      Here is a link to my modified rtsp_rx_mpa_pub.py and the dmesg log file. I have tried both the hardware and the software decoder, but the problem stays the same. I suspect something in the mpa publishing is incompatible or incorrect, since the visualization services are failing. I am also getting a failure over the RTSP stream after generating it via voxl-streamer.

      Link : https://drive.google.com/drive/folders/1hfWALmpOEE3ouGyjCPUfFD0jJuGnKjsH?usp=drive_link

      @Alex-Kushleyev Please help over this problem. This is bit urgent for me.

      • Alex KushleyevA
        Alex Kushleyev ModalAI Team @Aaky
        last edited by

        @Aaky, if the crash happens when you are trying to view the tflite output in voxl-portal, it may actually be the tflite server that is causing the board to crash. I am not saying that the rtsp to mpa publisher is not the problem, but it is less likely. What happens if you just view the output of rtsp to mpa (via voxl-portal)?

        What model are you running in tflite server? I am not an expert on this particular topic, actually, but others can help.

        Alex

        • A
          Aaky @Alex Kushleyev
          last edited by Aaky

          @Alex-Kushleyev Thanks for the quick response. I tried viewing only the rtsp to mpa output in voxl-portal and it failed once; after the next reboot it did not fail there, but it failed when I opened the tflite mpa. So my suspicion is on the mpa publishing. The failure even happens if I close the rtsp mpa by clicking on the "VOXL-PORTAL" icon and then open the mpa again. Also, on tflite I am using YoloV5 and passing an IMAGE_FORMAT_RAW8 1920x1080p stream after RTSP decoding.

          • Alex KushleyevA
            Alex Kushleyev ModalAI Team @Aaky
            last edited by

            @Aaky, are you able to try different resolution and / or different rtsp source just to see if this behavior is consistent?

            • A
              Aaky @Alex Kushleyev
              last edited by

              @Alex-Kushleyev I tried 640x480 resolution and it is working consistently across multiple reboots. I am still stress testing this, so I am not sure whether it will fail eventually. I have now switched back to hardware decoding at 640x480 resolution.

              • A
                Aaky @Aaky
                last edited by

                @Alex-Kushleyev Update: I was testing 640x480 with the hardware encoded stream. This time I opened the RTSP stream in QGC and, in parallel, opened the tflite mpa before the rtsp one in voxl-portal, and VOXL hit a hard fault and rebooted. There seems to be some memory leak or corruption happening in the mpa publish pipeline (just an assumption). Please advise.

                • Alex KushleyevA
                  Alex Kushleyev ModalAI Team @Aaky
                  last edited by

                  @Aaky, OK, i will double check if the python mpa publisher has a memory leak, this should be easy to verify.

                  Alex

                  • Alex KushleyevA
                    Alex Kushleyev ModalAI Team @Alex Kushleyev
                    last edited by Alex Kushleyev

                    @Aaky ,

                    I ran the example from here for 4 minutes and did not observe any memory leak. I tried using the SW decoder and the HW decoder with resize (as provided in the example).

                    This is for SW decoder (hence high cpu usage)
                    [image: 1d9ef9d5-a4fe-45d6-88fd-c274e249c9e0-image.png]

                    You can use top to check memory usage while you are running the test.
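As a rough complement to watching top by hand, the sketch below samples the resident memory (VmRSS) of a process from /proc/<pid>/status on Linux and reports how much it grew over the sampling window. The helper names (parse_vm_rss_kb, sample_rss, growth_kb) are hypothetical and not part of any ModalAI tool; steady growth across a long run would only hint at a leak, not prove one.

```python
import re
import time
from typing import List, Optional


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Return VmRSS in kB from the text of /proc/<pid>/status, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid: int, samples: int = 10, interval_s: float = 1.0) -> List[int]:
    """Sample VmRSS (kB) of a process; stops early if the process exits."""
    readings: List[int] = []
    for _ in range(samples):
        try:
            with open(f"/proc/{pid}/status") as f:
                rss = parse_vm_rss_kb(f.read())
        except (FileNotFoundError, ProcessLookupError):
            break  # the process went away
        if rss is not None:
            readings.append(rss)
        time.sleep(interval_s)
    return readings


def growth_kb(readings: List[int]) -> int:
    """Difference between the last and first reading (0 if fewer than two)."""
    return readings[-1] - readings[0] if len(readings) >= 2 else 0
```

For example, growth_kb(sample_rss(pid, samples=240)) would cover roughly four minutes for the PID of the python3 process running rtsp_rx_mpa_pub.py (the PID itself has to be looked up separately, for instance in top).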

                    In my test, i had the following setup:

                    • voxl-camera-server creating h264 stream 1920x1080 from imx214 camera
                    • using voxl-streamer -i hires_large_encoded to create rtsp stream
                    • using python3 rtsp_rx_mpa_pub.py to receive rtsp stream and publish a new image
                    • using voxl-portal to live stream the rtsp-debug mpa image stream to the browser

                    You could try a similar setup to see if the issue occurs with a local stream.

                    • A
                      Aaky @Alex Kushleyev
                      last edited by Aaky

                      @Alex-Kushleyev Alex, I can confirm that when I tried the local stream with the tracking camera (since I don't have hires on my VOXL), I still get a hard fault, even if I only run "voxl-inspect-cam -a". Can you please urgently send me the deb files that you have installed on top of SDK 1.1.3 on your setup and also guide me further? Could this be a hardware issue?

                      • Alex KushleyevA
                        Alex Kushleyev ModalAI Team @Aaky
                        last edited by

                        @Aaky , this is strange. Can you please disable the tflite server and try again? Just running camera server and inspect cam should not reboot voxl2. Also, how are you powering your voxl2?

                        • A
                          Aaky @Alex Kushleyev
                          last edited by Aaky

                          @Alex-Kushleyev I tried disabling tflite. I ran rtsp_rx_mpa_pub.py with the tracking camera rtsp url, went to voxl-portal and viewed the rtsp-debug mpa pipe, then went back and forth to the voxl-portal home page, and VOXL rebooted again. We are powering VOXL2 in the standard configuration: the standard power supply has one end connected to a 4S battery and the other to the ESC, and it provides regulated power to VOXL2. The culprit seems to be opening the mpa pipe / rtsp url multiple times. Sometimes the fault occurs the first time, and sometimes randomly on the n'th time.

                          • A
                            Aaky @Aaky
                            last edited by

                            @Alex-Kushleyev Can you try to play the same python file which I have attached above?

                            • Alex KushleyevA
                              Alex Kushleyev ModalAI Team @Aaky
                              last edited by

                              @Aaky, sure i will try it later today

                              • A
                                Aaky @Alex Kushleyev
                                last edited by

                                @Alex-Kushleyev Thanks. One more observation: I tried setting the FPS to 30 in the rtsp_rx_mpa_pub Python script and the faults have stopped for the moment. Is the FPS or any other parameter important in the RTSP MPA -> tflite -> streamer pipeline?

                                • A
                                  Aaky @Aaky
                                  last edited by Aaky

                                  @Alex-Kushleyev Update: changing the FPS doesn't help; I am still facing the failure. On my Google Drive folder above I have uploaded my latest failing RTSP decoding Python script, the voxl-opencv deb, my startup service that launches the RTSP decoding script, and the voxl-mpa-tools deb. Please install them and see if there is any problem on SDK 1.1.3. I cloned voxl-mpa-tools from here with the branch pympa-experimental, and I downloaded voxl-opencv from this thread. There seems to be some incompatibility causing this failure. Please let me know your analysis.

                                  I even tried skipping RTSP decoding entirely, feeding the tracking camera directly to the tflite model and then streaming over RTSP, and that also has problems. I am clueless about these failures. Also, when tflite is active, VOXL sometimes gets stuck in a reboot cycle and never comes out of it. Please help.

                                  • A
                                    Aaky @Aaky
                                    last edited by

                                    @Alex-Kushleyev This issue is extremely urgent for me because of a demonstration. If you can provide a solution for this problem as soon as possible, it would be really helpful.

                                    • A
                                      Aaky @Aaky
                                      last edited by Aaky

                                      @Alex-Kushleyev One more update: I am using libmodal-pipe version 2.10.0 on my SDK 1.1.3. I believe this version of libmodal-pipe is meant for the next SDK, 1.2.0. Could this be a problem? I think it came in as a dependency while installing voxl-mpa-tools.

                                      Say I update to SDK 1.2.0, what exact versions of voxl-opencv and voxl-mpa-tools should be installed? I think they are conflicting somewhere, leading to hard faults.

                                      • A
                                        Aaky @Aaky
                                        last edited by

                                        @Alex-Kushleyev Apologies for trailing messages. Any update over this problem?

                                        • tomT
                                          tom admin @Aaky
                                          last edited by

                                          @Aaky The source of truth for SDK 1.2 packages can be found here: http://voxl-packages.modalai.com/dists/qrb5165/sdk-1.2/binary-arm64/

                                          • Alex KushleyevA
                                            Alex Kushleyev ModalAI Team @tom
                                            last edited by Alex Kushleyev

                                            @Aaky , sorry for the delay.

                                            You can see what packages are shipped with each SDK here : https://docs.modalai.com/sdk-1.1-release-notes/#sdk-113-package-list (link is pointing to SDK 1.1.3). Tom also provided the address above where all the packages are available for download for each major release. Also, you can see the tags in the actual git repo for each package, for example here:

                                            [image: b51761cf-8133-4b76-bace-e407627a42b6-image.png]
                                            You can see that SDK-1.1.0 released version v2.9.2 and the only other release was for SDK 1.2.0.
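One way to spot a package that drifted away from what an SDK shipped is to compare installed versions against an expected list. The sketch below is only an illustration: find_mismatches and installed_version are hypothetical helpers, the expected versions have to be filled in by hand from the SDK package list, and the example entry assumes (it is not confirmed here) that the v2.9.2 tag mentioned above belongs to libmodal-pipe. installed_version relies on the standard dpkg-query tool.

```python
import subprocess
from typing import Dict, Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Ask dpkg-query for the installed version; None if unknown or not installed."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${Version}", package],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return None  # dpkg-query is not available on this system
    version = result.stdout.strip()
    return version if result.returncode == 0 and version else None


def find_mismatches(
    installed: Dict[str, Optional[str]], expected: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], str]]:
    """Map each package whose installed version differs to (installed, expected)."""
    return {
        pkg: (installed.get(pkg), want)
        for pkg, want in expected.items()
        if installed.get(pkg) != want
    }


# Assumption for illustration only: expected version taken from this thread.
EXPECTED = {"libmodal-pipe": "2.9.2"}
```

On the board, something like find_mismatches({p: installed_version(p) for p in EXPECTED}, EXPECTED) would list any package that differs from the hand-entered expectations.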

                                            When I originally made the post with pympa tools (including the rtsp example), i was using SDK 1.1.3 and everything was working fine. I used the libmodal-pipe that shipped with SDK.

                                            I just installed SDK 1.2.0 and then on top of that i installed the opencv with python and voxl-mpa-tools i posted before, i am re-posting the links for clarity:
                                            https://storage.googleapis.com/modalai_public/temp/voxl2-misc-packages/voxl-opencv_4.5.5-3_arm64.deb
                                            https://storage.googleapis.com/modalai_public/temp/voxl2-misc-packages/voxl-mpa-tools_1.1.5_arm64.deb

                                            Then i connected OV7251 tracking camera to my VOXL2 with the basic voxl-camera-server.conf which just publishes raw8 640x480 image.

                                            Next, i ran voxl-streamer -i tracking to encode the raw8 images and create an rtsp stream from them.

                                            Finally, i just updated the rtsp address (just stream_url = 'rtsp://127.0.0.1:8900/live') in the modified rtsp_rx_mpa_pub.py script that you shared.

                                            And the last part is i started voxl-portal to view the rtsp stream that is re-published back to mpa as rtsp-debug mpa channel.

                                            So.. everything is working fine, no issues, no reboots.

                                            Doing further investigation, i looked at dmesg -w output while the script is running and i saw messages like the following:

                                            [ 2306.392927] msm_vidc:   err : 00000002: h264d: qbuf cache ops failed: CAPTURE: idx 15 fd 74 off 0 daddr dc900000 size 786432 filled 0 flags 0x0 ts 0 refcnt 2 mflags 0x1, extradata: fd 80 off 245760 daddr de7bc000 size 16384 filled 0 refcnt 2
                                            [ 2306.424439] msm_vidc:   err : 00000002: h264d: dqbuf cache ops failed: CAPTURE: idx 16 fd 76 off 0 daddr dc800000 size 786432 filled 786432 flags 0x10 ts 11920339000 refcnt 2 mflags 0x0, extradata: fd 80 off 262144 daddr de7c0000 size 16384 filled 16384 refcnt 2
                                            [ 2306.426040] msm_vidc:   err : 00000002: h264d: qbuf cache ops failed: CAPTURE: idx 16 fd 76 off 0 daddr dc800000 size 786432 filled 0 flags 0x0 ts 0 refcnt 2 mflags 0x1, extradata: fd 80 off 262144 daddr de7c0000 size 16384 filled 0 refcnt 2
                                            [ 2306.457763] msm_vidc:   err : 00000002: h264d: dqbuf cache ops failed: CAPTURE: idx 17 fd 78 off 0 daddr dc700000 size 786432 filled 786432 flags 0x10 ts 11953488000 refcnt 2 mflags 0x0, extradata: fd 80 off 278528 daddr de7c4000 size 16384 filled 16384 refcnt 2
                                            [ 2306.461468] msm_vidc:   err : 00000002: h264d: qbuf cache ops failed: CAPTURE: idx 17 fd 78 off 0 daddr dc700000 size 786432 filled 0 flags 0x0 ts 0 refcnt 2 mflags 0x1, extradata: fd 80 off 278528 daddr de7c4000 size 16384 filled 0 refcnt 2
                                            [ 2306.491300] msm_vidc:   err : 00000002: h264d: dqbuf cache ops failed: CAPTURE: idx  0 fd 44 off 0 daddr dd800000 size 786432 filled 786432 flags 0x10 ts 11987146000 refcnt 2 mflags 0x0, extradata: fd 80 off 0 daddr de780000 size 16384 filled 16384 refcnt 2
                                            [ 2306.492874] ion_sgl_sync_range: 291 callbacks suppressed
                                            [ 2306.492878] Partial cmo only supported with 1 segment
                                                           is dma_set_max_seg_size being set on dev:kgsl-3d0
                                            [ 2306.492892] msm_vidc:   err : 00000002: h264d: qbuf cache ops failed: CAPTURE: idx  0 fd 44 off 0 daddr dd800000 size 786432 filled 0 flags 0x0 ts 0 refcnt 2 mflags 0x1, extradata: fd 80 off 0 daddr de780000 size 16384 filled 0 refcnt 2
                                            [ 2306.524454] Partial cmo only supported with 1 segment
                                                           is dma_set_max_seg_size being set on dev:kgsl-3d0
                                            [ 2306.524479] msm_vidc:   err : 00000002: h264d: dqbuf cache ops failed: CAPTURE: idx  1 fd 46 off 0 daddr dd700000 size 786432 filled 786432 flags 0x10 ts 12020491000 refcnt 2 mflags 0x0, extradata: fd 80 off 16384 daddr de784000 size 16384 filled 16384 refcnt 2
                                            [ 2306.526015] kgsl_iommu_fault_handler: 141 callbacks suppressed
                                            [ 2306.526025] kgsl kgsl-3d0: GPU PAGE FAULT: addr = 500211000 pid= 10785 name=python3
                                            [ 2306.526052] kgsl kgsl-3d0: context=gfx3d_user TTBR0=0x30001c873f000 CIDR=0x2a21 (read translation fault)
                                            [ 2306.526096] kgsl kgsl-3d0: FAULTING BLOCK: UCHE: TP
                                            [ 2306.526111] kgsl kgsl-3d0: ---- nearby memory ----
                                            [ 2306.526134] kgsl kgsl-3d0: [0000000500130000 - 0000000500211000]   (pid = 10785) (2d)
                                            ..
                                            

                                            so there are actually two issues going on here (it seems)

                                            • error messages from the decoder (h264d)
                                            • some sort of GPU page fault (GPU is used for doing image format conversion / resize). Note that a page fault is not necessarily an issue, but I am not sure if this is a normal page fault or something that should not be occurring (read translation fault).
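To get a quick count of how often each of these message types shows up, a small classifier over dmesg lines can help. This is a minimal sketch: the regular expressions are derived only from the log excerpt above, the category names are made up for illustration, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter
from typing import Iterable

# Patterns based solely on the dmesg excerpt in this post (category names are hypothetical).
PATTERNS = {
    "decoder_cache_ops": re.compile(r"msm_vidc:.*h264d: d?qbuf cache ops failed"),
    "gpu_page_fault": re.compile(r"kgsl .*GPU PAGE FAULT"),
    "partial_cmo": re.compile(r"Partial cmo only supported with 1 segment"),
}


def classify_dmesg(lines: Iterable[str]) -> Counter:
    """Count how many lines match each known error pattern."""
    counts: Counter = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing it sys.stdin while piping dmesg output into the script would summarize a whole log; comparing the counts between the HW and SW decoder runs shows whether a change actually silenced the messages.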

                                            With these errors, my VOXL2 is not crashing, but still these are probably not good messages to see..

                                            I changed the stream string in the test script to use software decoder and the errors are no longer printed in dmesg:

                                            stream = 'gst-launch-1.0 rtspsrc location=' + stream_url + ' latency=0 ! queue ! rtph264depay ! h264parse config-interval=-1 ! avdec_h264 ! autovideoconvert ! appsink'
                                            

                                            Additionally, using the HW decoder with the sw-based videoconvert also works without any errors in dmesg (note the use of videoconvert instead of autovideoconvert; I believe autovideoconvert uses the GPU to do the format conversion).

                                            stream = 'gst-launch-1.0 rtspsrc location=' + stream_url + ' latency=0  ! queue ! rtph264depay ! h264parse config-interval=-1 ! qtivdec turbo=true ! videoconvert ! appsink'
                                            

                                            You can try these basic tests to see if you also see the errors in dmesg, and whether the errors and the crash go away after changing to the SW decoder or the SW-based videoconvert.
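To switch between these variants without hand-editing the string each time, the pipeline can be assembled from its parts. This is a sketch only: build_pipeline and the "sw"/"hw"/"auto" keys are hypothetical names, and the element strings are copied from the two pipelines in this post. Only the SW decoder with autovideoconvert and the HW decoder with videoconvert were reported here as running without dmesg errors; other combinations are untested.

```python
# Decoder and converter elements copied from the pipelines in this post.
DECODERS = {
    "sw": "avdec_h264",          # software H.264 decoder
    "hw": "qtivdec turbo=true",  # hardware decoder
}
CONVERTERS = {
    "auto": "autovideoconvert",  # believed (per this post) to use the GPU
    "sw": "videoconvert",        # software format conversion
}


def build_pipeline(stream_url: str, decoder: str = "sw", converter: str = "sw") -> str:
    """Build the GStreamer pipeline string for the chosen decoder and converter."""
    return (
        "gst-launch-1.0 rtspsrc location=" + stream_url
        + " latency=0 ! queue ! rtph264depay ! h264parse config-interval=-1 ! "
        + DECODERS[decoder] + " ! " + CONVERTERS[converter] + " ! appsink"
    )
```

In the test script, stream = build_pipeline(stream_url, "hw", "sw") would stand in for the hand-written HW decoder string, and an unknown key raises KeyError rather than silently building a broken pipeline.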

                                            Regarding the error printed in dmesg, i am not sure what is actually causing it. It is not coming from ModalAI software, so we should try to work around the issue.

                                            With all that being said, none of the tests that i ran resulted in a crash or reboot of VOXL2. I suggest that you run dmesg -w in a separate window before running your test and see what is printed right before the system reboots. This can help. If you cannot see anything in the dmesg -w output via adb, you can also check /var/log/kern.log to see the messages from the previous boot (near the end of the log; note that the log can be large, as it keeps previous kernel logs).
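Because kern.log can be large, it may be easier to look only at the last lines and keep the ones that mention the drivers seen in the excerpt above. A minimal sketch, with assumptions stated: tail_matching is a hypothetical helper, and the keyword list is a guess ("panic" and "Oops" are generic kernel terms that do not appear in this thread).

```python
from collections import deque
from typing import Iterable, List, Sequence

# Assumed keywords: the first two come from the dmesg excerpt, the rest are generic guesses.
KEYWORDS = ("msm_vidc", "kgsl", "panic", "Oops")


def tail_matching(
    lines: Iterable[str], max_lines: int = 200, keywords: Sequence[str] = KEYWORDS
) -> List[str]:
    """Keep only the last max_lines lines, then return those containing a keyword."""
    tail = deque(lines, maxlen=max_lines)  # bounded memory even for a huge log
    return [line.rstrip("\n") for line in tail if any(k in line for k in keywords)]
```

On the board this could be called as tail_matching(open("/var/log/kern.log", errors="replace")), since an open file is itself an iterable of lines.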

                                            Alex
