ModalAI Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Groups
    • Register
    • Login

    voxl-px4 service hangs at startup sometimes with SDK 1.1.2

    VOXL 2
    3
    18
    725
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • A
      Aaky
      last edited by Aaky

      Hello community,
      I have been using VOXL2 with SDK 1.1.2 with latest PX4 1.14 firmware on top. While this has been significant upgrade since last SDK and PX4 firmware version, I am experiencing strange problem with my voxl-px4 startup service. This service starts on boot but GPS LED (Holybro M8N) dosent' blink sometimes and even GCS dosent get connected to firmware. This then needs manual intervention like ssh into voxl and then perform "systemctl restart voxl-px4" or in worst case hard reboot since service restart also goes into same loop of not starting PX4 correctly.
      If you are lucky, sometimes service starts correctly and everything is fine.

      Please find logs below and help me ahead to debug this problem.

      voxl2:~$ /usr/bin/voxl-px4
      [INFO] Reading from /etc/modalai/voxl-px4.conf
      Found DSP signature file
      [INFO] Daemon mode enabled
      
      *************************
      GPS=AUTODETECT
      RC=M0065_SBUS
      ESC=VOXL2_IO_PWM_ESC
      POWER MANAGER=VOXLPM
      DISTANCE SENSOR=NONE
      OSD=DISABLE
      DAEMON_MODE=ENABLE
      SENSOR_CAL=ACTUAL
      EXTRA STEPS:
      *************************
      
      INFO  [px4] mlockall() enabled. PX4's virtual address space is locked into RAM.
      INFO  [px4] assuming working directory is rootfs, no symlinks needed.
      INFO  [muorb] Got muorb init command
      Sending initialization request
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      INFO  [muorb] SLPI: muorb aggregator thread running
      INFO  [muorb] muorb protobuf initalize method succeeded
      INFO  [muorb] succesfully did ADVERTISE_TEST_TYPE
      INFO  [muorb] SLPI: Creating pthread test_MUORB
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_test_MUORB with tid 2097656
      INFO  [muorb] succesfully did SUBSCRIBE_TEST_TYPE
      INFO  [muorb] SLPI: Creating pthread test_MUORB
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_test_MUORB with tid 2097655
      INFO  [muorb] succesfully did TOPIC_TEST_TYPE
      INFO  [muorb] SLPI: Creating pthread test_MUORB
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_test_MUORB with tid 2097654
      INFO  [muorb] succesfully did UNSUBSCRIBE_TEST_TYPE
      INFO  [muorb] SLPI: Creating pthread test_MUORB
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_test_MUORB with tid 2097653
      INFO  [muorb] muorb test passed
      
      ______  __   __    ___ 
      | ___ \ \ \ / /   /   |
      | |_/ /  \ V /   / /| |
      |  __/   /   \  / /_| |
      INFO  [muorb] SLPI: Advertising remote topic log_message
      | |     / /^\ \ \___  |
      \_|     \/   \/     |_/
      
      px4 starting.
      
      INFO  [parameters] Starting param sync THREAD
      INFO  [px4] startup script: /bin/sh /usr/bin/voxl-px4-start 0
      
      *************************
      GPS: AUTODETECT
      RC: M0065_SBUS
      ESC: VOXL2_IO_PWM_ESC
      POWER MANAGER: VOXLPM
      DISTANCE SENSOR: NONE
      OSD: DISABLE
      EXTRA STEPS:
      	
      *************************
      
      Running on M0054
      INFO  [muorb] SLPI: Starting param sync THREAD
      INFO  [muorb] SLPI: before starting the qshell_entry task
      INFO  [muorb] SLPI: Creating pthread qshell
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_qshell with tid 2097652
      INFO  [muorb] SLPI: qshell entry.....
      INFO  [muorb] SLPI: after starting the qshell_entry task
      INFO  [muorb] SLPI: Init app map initialized
      INFO  [param] selected parameter default file /data/px4/param/parameters
      INFO  [muorb] SLPI: Marking DeviceNode(parameter_client_reset_request) as advertised in process_remote_topic
      INFO  [uORB] Marking DeviceNode(parameter_client_reset_response) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Advertising remote topic parameter_update
      INFO  [muorb] SLPI: Marking DeviceNode(parameter_client_set_value_request) as advertised in process_remote_to
      INFO  [uORB] Marking DeviceNode(parameter_server_set_used_request) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Marking DeviceNode(parameter_server_set_used_response) as advertised in process_remote_to
      INFO  [uORB] Marking DeviceNode(parameter_client_set_value_response) as advertised in process_remote_topic
      INFO  [parameters] BSON document size 2650 bytes, decoded 2650 bytes (INT32:47, FLOAT:75)
      INFO  [logger] logger started (mode=all)
      Starting IMU driver with no rotation
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: icm42688p start -s
      INFO  [muorb] SLPI:   arg0 = 'icm42688p'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-s'
      
      INFO  [qshell] Send cmd: 'icm42688p start -s'
      INFO  [muorb] SLPI: *** SPI Device ID 0x26000a 2490378
      INFO  [uORB] Advertising remote topic sensor_accel
      INFO  [uORB] Advertising remote topic sensor_gyro
      INFO  [muorb] SLPI: ICM42688P::probe successful!
      INFO  [muorb] SLPI: on SPI bus 1
      INFO  [muorb] SLPI: icm42688p #0 on SPI bus 1
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: >>> ICM42688P this: 3176ec20
      INFO  [muorb] SLPI: Ok executing command: icm42688p start -s
      INFO  [uORB] Advertising remote topic qshell_retval
      INFO  [muorb] SLPI: >>> ICM42688P this: 3176ec20
      INFO  [qshell] qshell return value timestamp: 174626437, local time: 174631744
      INFO  [muorb] SLPI: >>> ICM42688P this: 3176ec20
      INFO  [muorb] SLPI: Register interrupt b21d31a4 e61fedec 3176ec20
      INFO  [uORB] Advertising remote topic sensor_gyro_fifo
      INFO  [uORB] Advertising remote topic sensor_accel_fifo
      INFO  [uORB] Advertising remote topic imu_server
      INFO  [qshell] Send cmd: 'icp101xx start -I -b 5'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: icp101xx start -I -b 5
      INFO  [muorb] SLPI:   arg0 = 'icp101xx'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-I'
      
      INFO  [muorb] SLPI:   arg3 = '-b'
      
      INFO  [muorb] SLPI:   arg4 = '5'
      
      INFO  [muorb] SLPI: *** I2C Device ID 0xb76329 12018473
      INFO  [muorb] SLPI: icp101xx #0 on I2C bus 5
      INFO  [muorb] SLPI:  address 0x63
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: Ok executing command: icp101xx start -I -b 5
      INFO  [qshell] qshell return value timestamp: 174670478, local time: 174674102
      Looking for qmc5883l magnetometer
      INFO  [qshell] Send cmd: 'qmc5883l start -R 10 -X -b 1'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: qmc5883l start -R 10 -X -b 1
      INFO  [muorb] SLPI:   arg0 = 'qmc5883l'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-R'
      
      INFO  [muorb] SLPI:   arg3 = '10'
      
      INFO  [muorb] SLPI:   arg4 = '-X'
      
      INFO  [muorb] SLPI:   arg5 = '-b'
      
      INFO  [muorb] SLPI:   arg6 = '1'
      
      INFO  [muorb] SLPI: *** I2C Device ID 0x80d09 527625
      ERROR [muorb] SLPI: i2c probe failed
      INFO  [muorb] SLPI: PX4_qshell: no instance started (no device on bus?)
      ERROR [muorb] SLPI: Failed to execute command: qmc5883l start -R 10 -X -b 1
      INFO  [qshell] cmd returned with: -1
      INFO  [qshell] qshell return value timestamp: 174713285, local time: 174716440
      ERROR [qshell] Command failed
      Looking for ist8310 magnetometer
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: ist8310 start -R 10 -X -b 1
      INFO  [muorb] SLPI:   arg0 = 'ist8310'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-R'
      
      INFO  [muorb] SLPI:   arg3 = '10'
      
      INFO  [muorb] SLPI:   arg4 = '-X'
      
      INFO  [muorb] SLPI:   arg5 = '-b'
      
      INFO  [muorb] SLPI:   arg6 = '1'
      
      INFO  [qshell] Send cmd: 'ist8310 start -R 10 -X -b 1'
      INFO  [muorb] SLPI: *** I2C Device ID 0x60e09 396809
      INFO  [muorb] SLPI: ist8310 #0 on I2C bus 1
      INFO  [muorb] SLPI:  (external)
      INFO  [muorb] SLPI:  address 0xE
      INFO  [muorb] SLPI:  rotation 10
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: Ok executing command: ist8310 start -R 10 -X -b 1
      INFO  [muorb] SLPI: >>> ICM42688P this: 3176ec20
      INFO  [qshell] qshell return value timestamp: 174743444, local time: 174747090
      INFO  [qshell] Send cmd: 'gps start'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: gps start
      INFO  [muorb] SLPI:   arg0 = 'gps'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI: Creating pthread gps
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_gps with tid 2097647
      INFO  [muorb] SLPI: Ok executing command: gps start
      INFO  [qshell] qshell return value timestamp: 174786779, local time: 174788641
      Looking for ncp5623c RGB LED
      INFO  [muorb] SLPI: GPS UART baudrate set to 115200
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: rgbled_ncp5623c start -X -b 1 -f 400 -a 56
      INFO  [muorb] SLPI:   arg0 = 'rgbled_ncp5623c'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [qshell] Send cmd: 'rgbled_ncp5623c start -X -b 1 -f 400 -a 56'
      INFO  [muorb] SLPI:   arg2 = '-X'
      
      INFO  [muorb] SLPI:   arg3 = '-b'
      
      INFO  [muorb] SLPI:   arg4 = '1'
      
      INFO  [muorb] SLPI:   arg5 = '-f'
      
      INFO  [muorb] SLPI:   arg6 = '400'
      
      INFO  [muorb] SLPI:   arg7 = '-a'
      
      INFO  [muorb] SLPI:   arg8 = '56'
      
      INFO  [muorb] SLPI: *** I2C Device ID 0x7b3809 8075273
      INFO  [muorb] SLPI: rgbled_ncp5623c #0 on I2C bus 1
      INFO  [muorb] SLPI:  (external)
      INFO  [muorb] SLPI:  address 0x38
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: Ok executing command: rgbled_ncp5623c start -X -b 1 -f 400 -a 56
      INFO  [qshell] qshell return value timestamp: 174824162, local time: 174825291
      INFO  [uORB] Advertising remote topic sensor_mag
      INFO  [qshell] Send cmd: 'pca9685_pwm_out start -a 0x40 -b 4'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: pca9685_pwm_out start -a 0x40 -b 4
      INFO  [muorb] SLPI:   arg0 = 'pca9685_pwm_out'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-a'
      
      INFO  [muorb] SLPI:   arg3 = '0x40'
      
      INFO  [muorb] SLPI:   arg4 = '-b'
      
      INFO  [muorb] SLPI:   arg5 = '4'
      
      INFO  [uORB] Advertising remote topic sensor_baro
      INFO  [uORB] Advertising remote topic actuator_outputs
      INFO  [muorb] SLPI: *** I2C Device ID 0x694021 6897697
      INFO  [muorb] SLPI: running on I2C bus 4 address 0x40
      INFO  [muorb] SLPI: Ok executing command: pca9685_pwm_out start -a 0x40 -b 4
      INFO  [qshell] qshell return value timestamp: 175125498, local time: 175128045
      INFO  [muorb] SLPI: PCA9685 PWM frequency: target=50.00 real=50.03
      Starting VOXL IO for PWM ESC with SBUS RC
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'voxl2_io start'
      INFO  [muorb] SLPI: qshell gotten: voxl2_io start
      INFO  [muorb] SLPI:   arg0 = 'voxl2_io'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Marking DeviceNode(actuator_outputs) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Opened UART connection to M0065 device on port 7
      INFO  [muorb] SLPI: u-blox firmware version: SPG 3.01
      INFO  [muorb] SLPI: u-blox protocol version: 18.00
      INFO  [muorb] SLPI: u-blox module: NEO-M8N-0
      INFO  [uORB] Advertising remote topic sensor_gps
      ERROR [qshell] command timed out
      ERROR [qshell] Command failed
      M0065 SBUS RC driver already started with PWM ESC start
      INFO  [qshell] Send cmd: 'voxlpm start -X -b 2'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      
      
      Eric KatzfeyE 3 Replies Last reply Reply Quote 0
      • Eric KatzfeyE
        Eric Katzfey ModalAI Team @Aaky
        last edited by

        @Aaky Can you provide the output of dmesg after a failure? This will help us to debug the issue.

        1 Reply Last reply Reply Quote 0
        • Eric KatzfeyE
          Eric Katzfey ModalAI Team @Aaky
          last edited by

          @Aaky How often do you see it fail to start properly?

          1 Reply Last reply Reply Quote 0
          • Eric KatzfeyE
            Eric Katzfey ModalAI Team @Aaky
            last edited by

            @Aaky If you comment out the qshell gps start command in /usr/bin/voxl-px4-start does it prevent the failure from happening?

            A 1 Reply Last reply Reply Quote 0
            • A
              Aaky @Eric Katzfey
              last edited by

              @Eric-Katzfey

              Here's output of dmesg when this happens:

              Link to download dmesg logs : https://drive.google.com/file/d/143QVl4bhNdHPD0i8ACgGWLt7COsrljcy/view?usp=sharing

              I see this happening more often, say 5 out of 10 times.

              if i comment qshell gps start from /usr/bin/voxl-px4-start stillresult is similar, sometimes it starts sometimes it dosent.

              Let me know further steps.

              Eric KatzfeyE 1 Reply Last reply Reply Quote 0
              • Eric KatzfeyE
                Eric Katzfey ModalAI Team @Aaky
                last edited by

                @Aaky What drone are you using? Is it a custom drone? In your first post you included the output from px4. But it doesn't look complete. Are you sure that is the complete log? If so, does it look like that every time?

                A 1 Reply Last reply Reply Quote 0
                • A
                  Aaky @Eric Katzfey
                  last edited by

                  @Eric-Katzfey Yes Eric its custom drone and the logs are actually complete posted in first post. Still if possible I will see if I can provide you with more logs. Any idea from these logs about the problem?

                  Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                  • Eric KatzfeyE
                    Eric Katzfey ModalAI Team @Aaky
                    last edited by

                    @Aaky There's only one log in the first message and it doesn't look complete. The command timed out and Command failed messages seem to be related to the attempt to start the voxl2_io driver. Can you comment that part out so that it skips trying to start the voxl2_io driver and see if it comes up then?

                    A 1 Reply Last reply Reply Quote 0
                    • A
                      Aaky @Eric Katzfey
                      last edited by Aaky

                      @Eric-Katzfey Actually this driver failing happens very randomly, not always and also not because of any one particular driver in general. For example check one more dmesg and reset voxl-px4 log where voxl2_io driver spawned up correcly but still GPS LED wasn't blinking red and neither px4 got connected to GCS.

                      voxl-px4-logs : https://drive.google.com/file/d/1PHqQLldXs5XFKhHvenWqy-0nDLUOfbDB/view?usp=sharing

                      dmesg logs: https://drive.google.com/file/d/1ZOK7-ypm_9BfgbE4khD2NLKFdTecTHXD/view?usp=sharing

                      Let me know any hints to further debug this issue.

                      Alex KushleyevA Eric KatzfeyE 2 Replies Last reply Reply Quote 0
                      • Alex KushleyevA
                        Alex Kushleyev ModalAI Team @Aaky
                        last edited by

                        @Aaky , can you please confirm that you can reproduce the issue even with GPS driver completely disabled?

                        A 1 Reply Last reply Reply Quote 0
                        • A
                          Aaky @Alex Kushleyev
                          last edited by

                          @Alex-Kushleyev Yes I confirm even if I disable GPS driver can reproduce this issue.

                          1 Reply Last reply Reply Quote 0
                          • Eric KatzfeyE
                            Eric Katzfey ModalAI Team @Aaky
                            last edited by

                            @Aaky Those 2 logs look normal except for the GPS taking a while to finally connect to the unit and advertise sensor_gps topic. But nothing to indicate there was any crash. So the GPS LED wasn't illuminated at all? Or was a different color? And PX4 not connecting to GCS doesn't necessarily mean that PX4 is having any problems.

                            A 1 Reply Last reply Reply Quote 0
                            • A
                              Aaky @Eric Katzfey
                              last edited by Aaky

                              @Eric-Katzfey Yes PX4 dosent crash but dosent get connected to QGC inspite of having good connectivity (I have been performing ping tests). GPS LED dosent blink at all when this particular scenario occurs. As I have told before this is very random occuring problem.

                              Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                              • Eric KatzfeyE
                                Eric Katzfey ModalAI Team @Aaky
                                last edited by

                                @Aaky I'm having trouble following your comments. You say that px4 does not crash. But in the first log you posted it stops only partially through the startup script without any error indications. In that case is PX4 still running? If so, it seems like it is stuck somehow. If not, perhaps something is killing the px4 process. When this happens, adb into the voxl2 and look at voxl-inspect-services. Does it show voxl-px4 enabled and running? You say you are running this on a custom drone. Do you also have a reference drone from ModalAI? If so, does everything work fine on that drone? That is the main purpose that we sell those reference drones. It allows you to see everything working so that when you build a custom drone and something doesn't work you can compare against the reference drone.

                                A 1 Reply Last reply Reply Quote 0
                                • A
                                  Aaky @Eric Katzfey
                                  last edited by Aaky

                                  @Eric-Katzfey Hi Eric, My apologies for the confusion.
                                  I will correct my statement, PX4 never crashes, it just gets stuck which means the service is always running but never connects to QGC. On our custom drone post power boot up whenever we see GPS LED isn't blinking at all (neither red nor green) we come to know PX4 service is stuck somewhere. At this point if I ssh into VOXL2 and check voxl-inspect-services, I can see vozl-px4 service is definately running but GPS LED isn't blinking and also PX4 isn't connecting to QGC. These are the observations. Then if I do systemctl restart vozl-px4 , then sometimes problem gets solved and PX4 starts without getting stuck anywhere. I clarify once again, PX4 never crashes but just gets stuck somewhere. All all the logs starting from my first post involves the issue which I have described in this comment.
                                  I am very old customer of ModalAI and have been working since voxl1 on multiple custom drones so I definitely understand the underlying architecture Eric. 🙂

                                  I hope this comment clarifies the confusion.

                                  Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                                  • Eric KatzfeyE
                                    Eric Katzfey ModalAI Team @Aaky
                                    last edited by

                                    @Aaky Okay, this isn't anything we have encountered before. I'm not sure what could make PX4 get stuck like that. When it is in the stuck condition what does CPU utilization look like?

                                    A 1 Reply Last reply Reply Quote 0
                                    • A
                                      Aaky @Eric Katzfey
                                      last edited by

                                      @Eric-Katzfey Here's screenshot of cpu utilization when this happpens. Everything looks normal.

                                      Screenshot from 2024-03-24 20-07-53.png

                                      Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                                      • Eric KatzfeyE
                                        Eric Katzfey ModalAI Team @Aaky
                                        last edited by

                                        @Aaky Well, unfortunately it isn't obvious what is going wrong. You will have to start by disabling everything and when px4 comes up reliably start enabling things to find out the culprit.

                                        1 Reply Last reply Reply Quote 0
                                        • First post
                                          Last post
                                        Powered by NodeBB | Contributors