ModalAI Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Groups
    • Register
    • Login

    DSP Tasks Failing Unless mini-dm is Run

    Support Request Format for Best Results
    3
    21
    1918
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • E
      ejohnson01
      last edited by

      We are experiencing issues with voxl-px4. We notice on our GCS that we are constantly gaining and loosing rc connection and the mavlink telemetry stream is extremely laggy.

      We seem to be observing strange behavior with tasks running on o observing strange behavior with tasks running on our DSP. If we start PX4 from a cold boot, it never works correctly, and we see sensor timeouts printing on the output of voxl-px4 from the SLPI shown in the console output below.

      Even more strange is that when we plug in a usbc cable to our voxl2 and run mini-dm the problem seems to go away. After mini-dm is run once we are able to stop and start px4 with no issue. Once we power cycle our voxl then the problem returns until we plug in and run mini-dm again.

      We are running a slightly modified version of this commit of px4.
      https://github.com/modalai/px4-firmware/tree/cf055af349bcd7a141640936184e03054863022b

      it is the commit that is tagged in the voxl-px4 sdkv1.1.1 tag.

      this is the voxl2 sku info

      root@microUAS1:~/mav_ws# voxl-inspect-sku
      
      family code:   MCCA-M0054 (voxl2-board-only)
      compute board: 4 (voxl2)
      hw version:    1
      cam config:    8
      SKU:           MCCA-M0054-C8
      
      

      we are also using the voxl2io board which we believe has the legacy firmware (red light with blinking blue light) to communicate with our sbus reciever.

      Do you all have any thoughts on what might be happening here?

      here is what our GCS shows:
      7aee57af-ceae-4265-beb9-6bbbe7312564-image.png

      here is the console output from running

      root@microUAS1:~/mav_ws# voxl-px4 -g -n -s
      [INFO] Reading from /etc/modalai/voxl-px4.conf
      Found DSP signature file
      [INFO] GR-12L SBUS Reciever Selected
      [INFO] MRO M9N with MAG IST8308 selected
      [INFO] SDP3X airspeed sensor selected
      [INFO] Daemon mode enabled
      
      *************************
      GPS=MRO_M9N
      RC=M0065_SBUS
      ESC=VOXL_ESC
      POWER MANAGER=VOXLPM
      DISTANCE SENSOR=NONE
      OSD=DISABLE
      DAEMON_MODE=ENABLE
      SENSOR_CAL=ACTUAL
      EXTRA STEPS:
      *************************
      
      INFO  [px4] mlockall() enabled. PX4's virtual address space is locked into RAM.
      INFO  [px4] assuming working directory is rootfs, no symlinks needed.
      INFO  [muorb] Got muorb init command
      Sending initialization request
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      Got topic data before configuration complete
      INFO  [muorb] SLPI: muorb aggregator thread running
      INFO  [muorb] muorb protobuf initalize method succeeded
      INFO  [muorb] succesfully did ADVERTISE_TEST_TYPE
      INFO  [muorb] SLPI: Creating pthread test_MUORB
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_test_MUORB with tid 2097656
      INFO  [muorb] succesfully did SUBSCRIBE_TEST_TYPE
      INFO  [muorb] SLPI: Creating pthread test_MUORB
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_test_MUORB with tid 2097655
      INFO  [muorb] succesfully did TOPIC_TEST_TYPE
      INFO  [muorb] SLPI: Creating pthread test_MUORB
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_test_MUORB with tid 2097654
      INFO  [muorb] succesfully did UNSUBSCRIBE_TEST_TYPE
      INFO  [muorb] SLPI: Creating pthread test_MUORB
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_test_MUORB with tid 2097653
      INFO  [muorb] muorb test passed
      
      ______  __   __    ___ 
      | ___ \ \ \ / /   /   |
      | |_/ /  \ V /   / /| |
      |  __/   /   \  / /_| |
      | |     / /^\ \ \___  |
      \_|     \/   \/     |_/
      
      px4 starting.
      
      INFO  [px4] startup script: /bin/sh /usr/bin/voxl-px4-start 0
      INFO  [muorb] SLPI: Advertising remote topic log_message
      INFO  [parameters] Starting param sync THREAD
      INFO  [muorb] SLPI: Starting param sync THREAD
      
      *************************
      GPS: MRO_M9N
      RC: M0065_SBUS
      ESC: VOXL_ESC
      POWER MANAGER: VOXLPM
      DISTANCE SENSOR: NONE
      OSD: DISABLE
      AISPEED SENSOR: SDP3X
      EXTRA STEPS:
      	
      *************************
      
      Running on M0054
      INFO  [muorb] SLPI: before starting the qshell_entry task
      INFO  [muorb] SLPI: Creating pthread qshell
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_qshell with tid 2097652
      INFO  [muorb] SLPI: qshell entry.....
      INFO  [muorb] SLPI: after starting the qshell_entry task
      INFO  [muorb] SLPI: Init app map initialized
      INFO  [param] selected parameter default file /data/px4/param/parameters
      INFO  [muorb] SLPI: Marking DeviceNode(parameter_client_reset_request) as advertised in process_remote_topic
      INFO  [uORB] Marking DeviceNode(parameter_client_reset_response) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Advertising remote topic parameter_update
      INFO  [muorb] SLPI: Marking DeviceNode(parameter_client_set_value_request) as advertised in process_remote_to
      INFO  [uORB] Marking DeviceNode(parameter_server_set_used_request) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Marking DeviceNode(parameter_server_set_used_response) as advertised in process_remote_to
      INFO  [uORB] Marking DeviceNode(parameter_client_set_value_response) as advertised in process_remote_topic
      INFO  [parameters] BSON document size 3367 bytes, decoded 3367 bytes (INT32:55, FLOAT:104)
      INFO  [logger] logger started (mode=all)
      Starting IMU driver with no rotation
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: icm42688p start -s
      INFO  [muorb] SLPI:   arg0 = 'icm42688p'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-s'
      
      INFO  [qshell] Send cmd: 'icm42688p start -s'
      INFO  [muorb] SLPI: *** SPI Device ID 0x26000a 2490378
      INFO  [uORB] Advertising remote topic sensor_accel
      INFO  [uORB] Advertising remote topic sensor_gyro
      INFO  [muorb] SLPI: ICM42688P::probe successful!
      INFO  [muorb] SLPI: on SPI bus 1
      INFO  [muorb] SLPI: icm42688p #0 on SPI bus 1
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: >>> ICM42688P this: 3176d110
      INFO  [muorb] SLPI: Ok executing command: icm42688p start -s
      INFO  [uORB] Advertising remote topic qshell_retval
      INFO  [muorb] SLPI: >>> ICM42688P this: 3176d110
      INFO  [qshell] qshell return value timestamp: 102630757, local time: 102636903
      INFO  [muorb] SLPI: >>> ICM42688P this: 3176d110
      INFO  [muorb] SLPI: Register interrupt b21d3164 e61f8ddc 3176d110
      INFO  [uORB] Advertising remote topic sensor_gyro_fifo
      INFO  [uORB] Advertising remote topic sensor_accel_fifo
      INFO  [uORB] Advertising remote topic imu_server
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'icp101xx start -I -b 5'
      INFO  [muorb] SLPI: qshell gotten: icp101xx start -I -b 5
      INFO  [muorb] SLPI:   arg0 = 'icp101xx'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-I'
      
      INFO  [muorb] SLPI:   arg3 = '-b'
      
      INFO  [muorb] SLPI:   arg4 = '5'
      
      INFO  [muorb] SLPI: *** I2C Device ID 0xb76329 12018473
      INFO  [muorb] SLPI: icp101xx #0 on I2C bus 5
      INFO  [muorb] SLPI:  address 0x63
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: Ok executing command: icp101xx start -I -b 5
      INFO  [qshell] qshell return value timestamp: 102673513, local time: 102677483
      Looking for qmc5883l magnetometer
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'qmc5883l start -R 0 -X -b 1'
      INFO  [muorb] SLPI: qshell gotten: qmc5883l start -R 0 -X -b 1
      INFO  [muorb] SLPI:   arg0 = 'qmc5883l'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-R'
      
      INFO  [muorb] SLPI:   arg3 = '0'
      
      INFO  [muorb] SLPI:   arg4 = '-X'
      
      INFO  [muorb] SLPI:   arg5 = '-b'
      
      INFO  [muorb] SLPI:   arg6 = '1'
      
      INFO  [muorb] SLPI: *** I2C Device ID 0x80d09 527625
      INFO  [muorb] SLPI: qmc5883l #0 on I2C bus 1
      INFO  [muorb] SLPI:  (external)
      INFO  [muorb] SLPI:  address 0xD
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: Ok executing command: qmc5883l start -R 0 -X -b 1
      INFO  [qshell] qshell return value timestamp: 102734319, local time: 102738429
      INFO  [muorb] SLPI: >>> ICM42688P this: 3176d110
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'gps start'
      INFO  [muorb] SLPI: qshell gotten: gps start
      INFO  [muorb] SLPI:   arg0 = 'gps'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI: Creating pthread gps
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_gps with tid 2097648
      INFO  [muorb] SLPI: Ok executing command: gps start
      INFO  [qshell] qshell return value timestamp: 102790110, local time: 102791873
      Looking for ncp5623c RGB LED
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'rgbled_ncp5623c start -X -b 1 -f 400 -a 56'
      INFO  [muorb] SLPI: qshell gotten: rgbled_ncp5623c start -X -b 1 -f 400 -a 56
      INFO  [muorb] SLPI:   arg0 = 'rgbled_ncp5623c'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-X'
      
      INFO  [muorb] SLPI:   arg3 = '-b'
      
      INFO  [muorb] SLPI:   arg4 = '1'
      
      INFO  [muorb] SLPI:   arg5 = '-f'
      
      INFO  [muorb] SLPI:   arg6 = '400'
      
      INFO  [muorb] SLPI:   arg7 = '-a'
      
      INFO  [muorb] SLPI:   arg8 = '56'
      
      INFO  [uORB] Advertising remote topic sensor_mag
      INFO  [muorb] SLPI: *** I2C Device ID 0x7b3809 8075273
      INFO  [muorb] SLPI: rgbled_ncp5623c #0 on I2C bus 1
      INFO  [muorb] SLPI:  (external)
      INFO  [muorb] SLPI:  address 0x38
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: Ok executing command: rgbled_ncp5623c start -X -b 1 -f 400 -a 56
      INFO  [qshell] qshell return value timestamp: 102875392, local time: 102878649
      INFO  [uORB] Advertising remote topic sensor_baro
      Starting VOXL ESC driver
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'voxl_esc start'
      INFO  [muorb] SLPI: qshell gotten: voxl_esc start
      INFO  [muorb] SLPI:   arg0 = 'voxl_esc'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Advertising remote topic actuator_outputs
      INFO  [muorb] SLPI: Ok executing command: voxl_esc start
      INFO  [muorb] SLPI: Opened UART ESC device
      INFO  [qshell] qshell return value timestamp: 103031754, local time: 103033825
      Attempting to start M0065 SBUS RC driver for original M0065 FW
      INFO  [uORB] Advertising remote topic esc_status
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: dsp_sbus start
      INFO  [muorb] SLPI:   arg0 = 'dsp_sbus'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [qshell] Send cmd: 'dsp_sbus start'
      INFO  [muorb] SLPI: Creating pthread dsp_sbus_main
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_dsp_sbus_main with tid 2097645
      INFO  [muorb] SLPI: Ok executing command: dsp_sbus start
      INFO  [qshell] qshell return value timestamp: 103092449, local time: 103095538
      INFO  [uORB] Advertising remote topic input_rc
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'voxlpm start -X -b 2'
      INFO  [muorb] SLPI: qshell gotten: voxlpm start -X -b 2
      INFO  [muorb] SLPI:   arg0 = 'voxlpm'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-X'
      
      INFO  [muorb] SLPI:   arg3 = '-b'
      
      INFO  [muorb] SLPI:   arg4 = '2'
      
      INFO  [muorb] SLPI: *** I2C Device ID 0xd24411 13779985
      INFO  [muorb] SLPI: Set i2c address 0x6a, fd 3
      INFO  [muorb] SLPI: Set i2c address 0x44, fd 3
      INFO  [uORB] Advertising remote topic battery_status
      INFO  [muorb] SLPI: voxlpm #0 on I2C bus 2
      INFO  [muorb] SLPI:  (external)
      INFO  [muorb] SLPI:  address 0x44
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: Ok executing command: voxlpm start -X -b 2
      INFO  [uORB] Advertising remote topic power_monitor
      INFO  [qshell] qshell return value timestamp: 103159043, local time: 103161281
      starting sdp3x
      INFO  [muorb] SLPI: u-blox firmware version: SPG 4.04
      INFO  [muorb] SLPI: u-blox protocol version: 32.01
      INFO  [muorb] SLPI: u-blox module: NEO-M9N
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'sdp3x start -X -a 0x21 -b 4'
      INFO  [muorb] SLPI: qshell gotten: sdp3x start -X -a 0x21 -b 4
      INFO  [muorb] SLPI:   arg0 = 'sdp3x'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = '-X'
      
      INFO  [muorb] SLPI:   arg3 = '-a'
      
      INFO  [muorb] SLPI:   arg4 = '0x21'
      
      INFO  [muorb] SLPI:   arg5 = '-b'
      
      INFO  [muorb] SLPI:   arg6 = '4'
      
      INFO  [muorb] SLPI: *** I2C Device ID 0x4a2121 4858145
      INFO  [muorb] SLPI: sdp3x #0 on I2C bus 4
      INFO  [muorb] SLPI:  (external)
      INFO  [muorb] SLPI:  address 0x21
      INFO  [muorb] SLPI: 
      
      INFO  [muorb] SLPI: Ok executing command: sdp3x start -X -a 0x21 -b 4
      INFO  [qshell] qshell return value timestamp: 103212466, local time: 103214889
      INFO  [uORB] Advertising remote topic differential_pressure
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'sensors start'
      INFO  [muorb] SLPI: qshell gotten: sensors start
      INFO  [muorb] SLPI:   arg0 = 'sensors'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Advertising remote topic sensor_selection
      INFO  [uORB] Advertising remote topic sensors_status_imu
      INFO  [uORB] Advertising remote topic vehicle_acceleration
      INFO  [uORB] Advertising remote topic vehicle_angular_velocity
      INFO  [uORB] Advertising remote topic sensor_combined
      INFO  [uORB] Advertising remote topic vehicle_air_data
      ERROR [muorb] SLPI: Timeout waiting for parameter_server_set_used_response for SENS_GPS_MASK
      INFO  [uORB] Advertising remote topic vehicle_gps_position
      INFO  [uORB] Advertising remote topic sensors_status_baro
      INFO  [uORB] Advertising remote topic vehicle_magnetometer
      INFO  [uORB] Advertising remote topic sensor_preflight_mag
      INFO  [uORB] Advertising remote topic vehicle_imu
      INFO  [uORB] Advertising remote topic vehicle_imu_status
      INFO  [muorb] SLPI: Ok executing command: sensors start
      INFO  [uORB] Advertising remote topic sensors_status_mag
      INFO  [qshell] qshell return value timestamp: 103496668, local time: 103505109
      INFO  [uORB] Advertising remote topic airspeed
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: ekf2 start
      INFO  [qshell] Send cmd: 'ekf2 start'
      INFO  [muorb] SLPI:   arg0 = 'ekf2'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Advertising remote topic sensor_gps
      INFO  [uORB] Advertising remote topic vehicle_attitude
      INFO  [uORB] Advertising remote topic vehicle_local_position
      INFO  [uORB] Advertising remote topic estimator_event_flags
      INFO  [uORB] Advertising remote topic estimator_innovation_test_ratios
      INFO  [uORB] Advertising remote topic estimator_innovation_variances
      INFO  [uORB] Advertising remote topic estimator_innovations
      INFO  [uORB] Advertising remote topic estimator_sensor_bias
      INFO  [uORB] Advertising remote topic estimator_states
      INFO  [uORB] Advertising remote topic estimator_status_flags
      INFO  [uORB] Advertising remote topic estimator_status
      INFO  [muorb] SLPI: Ok executing command: ekf2 start
      INFO  [qshell] qshell return value timestamp: 103856924, local time: 103862916
      INFO  [uORB] Advertising remote topic ekf2_timestamps
      INFO  [uORB] Advertising remote topic vehicle_odometry
      INFO  [uORB] Advertising remote topic estimator_aid_src_baro_hgt
      INFO  [uORB] Advertising remote topic estimator_aid_src_fake_pos
      INFO  [uORB] Advertising remote topic estimator_aid_src_fake_hgt
      INFO  [uORB] Advertising remote topic estimator_aid_src_mag_heading
      INFO  [uORB] Advertising remote topic estimator_aid_src_mag
      INFO  [uORB] Advertising remote topic estimator_aid_src_gravity
      INFO  [uORB] Advertising remote topic estimator_aid_src_airspeed
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'mc_pos_control start vtol'
      INFO  [muorb] SLPI: qshell gotten: mc_pos_control start vtol
      INFO  [muorb] SLPI:   arg0 = 'mc_pos_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = 'vtol'
      
      INFO  [uORB] Advertising remote topic takeoff_status
      INFO  [muorb] SLPI: Ok executing command: mc_pos_control start vtol
      INFO  [qshell] qshell return value timestamp: 104032898, local time: 104035892
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'mc_att_control start vtol'
      INFO  [muorb] SLPI: qshell gotten: mc_att_control start vtol
      INFO  [muorb] SLPI:   arg0 = 'mc_att_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = 'vtol'
      
      INFO  [muorb] SLPI: Ok executing command: mc_att_control start vtol
      INFO  [qshell] qshell return value timestamp: 104111040, local time: 104112332
      INFO  [qshell] Send cmd: 'mc_rate_control start vtol'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: mc_rate_control start vtol
      INFO  [muorb] SLPI:   arg0 = 'mc_rate_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = 'vtol'
      
      INFO  [uORB] Advertising remote topic rate_ctrl_status
      INFO  [muorb] SLPI: Ok executing command: mc_rate_control start vtol
      INFO  [qshell] qshell return value timestamp: 104259148, local time: 104261585
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'mc_hover_thrust_estimator start'
      INFO  [muorb] SLPI: qshell gotten: mc_hover_thrust_estimator start
      INFO  [muorb] SLPI:   arg0 = 'mc_hover_thrust_estimator'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI: Ok executing command: mc_hover_thrust_estimator start
      INFO  [qshell] qshell return value timestamp: 104344496, local time: 104346060
      INFO  [qshell] Send cmd: 'mc_autotune_attitude_control start'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: mc_autotune_attitude_control start
      INFO  [muorb] SLPI:   arg0 = 'mc_autotune_attitude_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Advertising remote topic autotune_attitude_control_status
      INFO  [muorb] SLPI: Ok executing command: mc_autotune_attitude_control start
      INFO  [qshell] qshell return value timestamp: 104414886, local time: 104418065
      INFO  [qshell] Send cmd: 'land_detector start vtol'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: land_detector start vtol
      INFO  [muorb] SLPI:   arg0 = 'land_detector'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = 'vtol'
      
      INFO  [muorb] SLPI: Ok executing command: land_detector start vtol
      INFO  [qshell] qshell return value timestamp: 104496547, local time: 104497727
      INFO  [uORB] Advertising remote topic vehicle_land_detected
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'manual_control start'
      INFO  [muorb] SLPI: qshell gotten: manual_control start
      INFO  [muorb] SLPI:   arg0 = 'manual_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI: Ok executing command: manual_control start
      INFO  [uORB] Advertising remote topic manual_control_setpoint
      INFO  [qshell] qshell return value timestamp: 104569964, local time: 104571796
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'control_allocator start'
      INFO  [muorb] SLPI: qshell gotten: control_allocator start
      INFO  [muorb] SLPI:   arg0 = 'control_allocator'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Advertising remote topic control_allocator_status
      INFO  [uORB] Marking DeviceNode(control_allocator_status) as advertised in process_remote_topic
      INFO  [uORB] Advertising remote topic actuator_motors
      INFO  [uORB] Advertising remote topic actuator_servos
      INFO  [uORB] Advertising remote topic actuator_servos_trim
      INFO  [muorb] SLPI: Ok executing command: control_allocator start
      INFO  [qshell] qshell return value timestamp: 104924910, local time: 104928529
      INFO  [qshell] Send cmd: 'load_mon start'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: load_mon start
      INFO  [muorb] SLPI:   arg0 = 'load_mon'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI: Ok executing command: load_mon start
      INFO  [uORB] Advertising remote topic cpuload
      INFO  [qshell] qshell return value timestamp: 104994040, local time: 104996985
      INFO  [qshell] Send cmd: 'vtol_att_control start'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: vtol_att_control start
      INFO  [muorb] SLPI:   arg0 = 'vtol_att_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Advertising remote topic flaps_setpoint
      INFO  [uORB] Advertising remote topic spoilers_setpoint
      INFO  [uORB] Advertising remote topic vtol_vehicle_status
      INFO  [uORB] Advertising remote topic vehicle_thrust_setpoint
      INFO  [uORB] Advertising remote topic vehicle_torque_setpoint
      INFO  [uORB] Marking DeviceNode(vehicle_thrust_setpoint) as advertised in process_remote_topic
      INFO  [uORB] Marking DeviceNode(vehicle_torque_setpoint) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Ok executing command: vtol_att_control start
      INFO  [qshell] qshell return value timestamp: 105128451, local time: 105136476
      INFO  [qshell] Send cmd: 'fw_rate_control start vtol'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: fw_rate_control start vtol
      INFO  [muorb] SLPI:   arg0 = 'fw_rate_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = 'vtol'
      
      INFO  [uORB] Marking DeviceNode(rate_ctrl_status) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Ok executing command: fw_rate_control start vtol
      INFO  [qshell] qshell return value timestamp: 105278018, local time: 105281688
      INFO  [qshell] Send cmd: 'fw_att_control start vtol'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: fw_att_control start vtol
      INFO  [muorb] SLPI:   arg0 = 'fw_att_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = 'vtol'
      
      INFO  [muorb] SLPI: Ok executing command: fw_att_control start vtol
      INFO  [uORB] Advertising remote topic landing_gear_wheel
      INFO  [qshell] qshell return value timestamp: 105379888, local time: 105387644
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'fw_pos_control start vtol'
      INFO  [muorb] SLPI: qshell gotten: fw_pos_control start vtol
      INFO  [muorb] SLPI:   arg0 = 'fw_pos_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = 'vtol'
      
      INFO  [uORB] Advertising remote topic actuator_controls_status_1
      INFO  [uORB] Advertising remote topic position_controller_status
      INFO  [uORB] Advertising remote topic position_controller_landing_status
      INFO  [uORB] Advertising remote topic tecs_status
      INFO  [uORB] Advertising remote topic launch_detection_status
      INFO  [uORB] Advertising remote topic landing_gear
      INFO  [uORB] Marking DeviceNode(flaps_setpoint) as advertised in process_remote_topic
      INFO  [uORB] Marking DeviceNode(spoilers_setpoint) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Ok executing command: fw_pos_control start vtol
      INFO  [qshell] qshell return value timestamp: 105719884, local time: 105723062
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'fw_autotune_attitude_control start vtol'
      INFO  [muorb] SLPI: qshell gotten: fw_autotune_attitude_control start vtol
      INFO  [muorb] SLPI:   arg0 = 'fw_autotune_attitude_control'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI:   arg2 = 'vtol'
      
      INFO  [uORB] Marking DeviceNode(autotune_attitude_control_status) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Ok executing command: fw_autotune_attitude_control start vtol
      INFO  [qshell] qshell return value timestamp: 105788032, local time: 105790087
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'airspeed_selector start'
      INFO  [muorb] SLPI: qshell gotten: airspeed_selector start
      INFO  [muorb] SLPI:   arg0 = 'airspeed_selector'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Advertising remote topic airspeed_validated
      INFO  [uORB] Advertising remote topic airspeed_wind
      INFO  [uORB] Marking DeviceNode(airspeed_wind) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Ok executing command: airspeed_selector start
      INFO  [qshell] qshell return value timestamp: 105909388, local time: 105911596
      INFO  [qshell] Send cmd: 'rc_update start'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: rc_update start
      INFO  [muorb] SLPI:   arg0 = 'rc_update'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI: Ok executing command: rc_update start
      INFO  [qshell] qshell return value timestamp: 106355536, local time: 106357417
      INFO  [uORB] Advertising remote topic rc_channels
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: commander start
      INFO  [qshell] Send cmd: 'commander start'
      INFO  [muorb] SLPI:   arg0 = 'commander'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [muorb] SLPI: Creating pthread commander
      
      INFO  [muorb] SLPI: Successfully created px4 task PX4_commander with tid 2097639
      INFO  [uORB] Advertising remote topic manual_control_input
      INFO  [uORB] Advertising remote topic manual_control_switches
      INFO  [uORB] Advertising remote topic action_request
      INFO  [muorb] SLPI: Ok executing command: commander start
      INFO  [uORB] Advertising remote topic led_control
      INFO  [muorb] SLPI: LED: open /dev/led0 failed (22)
      INFO  [uORB] Advertising remote topic tune_control
      INFO  [qshell] qshell return value timestamp: 106650894, local time: 106653867
      INFO  [uORB] Advertising remote topic event
      INFO  [uORB] Advertising remote topic mavlink_log
      INFO  [muorb] SLPI: Preflight Fail: No connection to the ground control station	
      INFO  [uORB] Advertising remote topic health_report
      INFO  [uORB] Advertising remote topic failsafe_flags
      INFO  [uORB] Advertising remote topic actuator_armed
      INFO  [uORB] Advertising remote topic vehicle_control_mode
      INFO  [uORB] Advertising remote topic vehicle_thrust_setpoint_virtual_mc
      INFO  [uORB] Advertising remote topic vehicle_thrust_setpoint_virtual_fw
      INFO  [uORB] Advertising remote topic vehicle_torque_setpoint_virtual_fw
      INFO  [uORB] Advertising remote topic vehicle_torque_setpoint_virtual_mc
      INFO  [uORB] Marking DeviceNode(flaps_setpoint) as advertised in process_remote_topic
      INFO  [uORB] Marking DeviceNode(spoilers_setpoint) as advertised in process_remote_topic
      INFO  [uORB] Advertising remote topic vehicle_status
      INFO  [uORB] Advertising remote topic failure_detector_status
      INFO  [uORB] Advertising remote topic mc_virtual_attitude_setpoint
      INFO  [uORB] Advertising remote topic vehicle_rates_setpoint
      INFO  [uORB] Advertising remote topic vehicle_attitude_setpoint
      INFO  [uORB] Advertising remote topic vehicle_local_position_setpoint
      INFO  [uORB] Marking DeviceNode(mc_virtual_attitude_setpoint) as advertised in process_remote_topic
      INFO  [qshell] Send cmd: 'flight_mode_manager start'
      INFO  [muorb] SLPI: Marking DeviceNode(qshell_req) as advertised in process_remote_topic
      INFO  [muorb] SLPI: qshell gotten: flight_mode_manager start
      INFO  [muorb] SLPI:   arg0 = 'flight_mode_manager'
      
      INFO  [muorb] SLPI:   arg1 = 'start'
      
      INFO  [uORB] Advertising remote topic vehicle_command
      INFO  [muorb] SLPI: Ok executing command: flight_mode_manager start
      INFO  [qshell] qshell return value timestamp: 106931567, local time: 106935102
      INFO  [uORB] Advertising remote topic trajectory_setpoint
      INFO  [uORB] Advertising remote topic vehicle_constraints
      INFO  [dataman] data manager file '/data/px4/dataman' size is 7866640 bytes
      /usr/bin/voxl-px4-start: line 234: px4-modal_io_bridge: command not found
      /usr/bin/voxl-px4-start: line 237: microdds_client: command not found
      no dds namespace set
      using DOM ID: 1
      INFO  [muorb] SLPI: Advertising remote topic transponder_report
      INFO  [muorb] SLPI: Advertising remote topic rtl_time_estimate
      INFO  [muorb] SLPI: Advertising remote topic position_setpoint_triplet
      INFO  [mavlink] mode: Onboard, data rate: 100000 B/s on udp port 14556 remote port 14557
      INFO  [uORB] Advertising remote topic actuator_controls_status_0
      INFO  [mavlink] partner IP: 127.0.0.1
      INFO  [muorb] SLPI: Advertising remote topic telemetry_status
      INFO  [mavlink] mode: Normal, data rate: 100000 B/s on udp port 14558 remote port 14559
      INFO  [muorb] SLPI: Advertising remote topic timesync_status
      INFO  [mavlink] MAVLink only on localhost (set param MAV_{i}_BROADCAST = 1 to enable network)
      INFO  [px4] Startup script returned successfully
      INFO  [muorb] SLPI: Advertising remote topic obstacle_distance
      INFO  [muorb] SLPI: Advertising remote topic offboard_control_mode
      INFO  [mavlink] partner IP: 127.0.0.1
      INFO  [muorb] SLPI: Marking DeviceNode(telemetry_status) as advertised in process_remote_topic
      INFO  [uORB] Marking DeviceNode(mavlink_log) as advertised in process_remote_topic
      ERROR [muorb] SLPI: Accel #0 fail:  TIMEOUT!	
      INFO  [uORB] Marking DeviceNode(mavlink_log) as advertised in process_remote_topic
      ERROR [muorb] SLPI: MAG #0 failed:  TIMEOUT!	
      INFO  [uORB] Marking DeviceNode(landing_gear) as advertised in process_remote_topic
      INFO  [muorb] SLPI: Marking DeviceNode(vehicle_command_ack) as advertised in process_remote_topic
      ERROR [muorb] SLPI: Accel #0 fail:  TIMEOUT!	
      ERROR [muorb] SLPI: MAG #0 failed:  TIMEOUT!	
      ERROR [muorb] SLPI: Accel #0 fail:  TIMEOUT!	
      ERROR [muorb] SLPI: MAG #0 failed:  TIMEOUT!	
      INFO  [muorb] SLPI: Advertising remote topic ping
      ERROR [muorb] SLPI: Accel #0 fail:  TIMEOUT!	
      ERROR [muorb] SLPI: MAG #0 failed:  TIMEOUT!	
      ERROR [muorb] SLPI: Accel #0 fail:  TIMEOUT!	
      ERROR [muorb] SLPI: MAG #0 failed:  TIMEOUT!	
      ^C
      PX4 Exiting...
      Exiting NOW.
      
      Eric KatzfeyE 1 Reply Last reply Reply Quote 0
      • Eric KatzfeyE
        Eric Katzfey ModalAI Team @ejohnson01
        last edited by

        @ejohnson01 You said you are running a modified version of voxl-px4. Does this problem still happen with the unmodified version of voxl-px4 that came with the SDK release?

        E 1 Reply Last reply Reply Quote 0
        • E
          ejohnson01 @Eric Katzfey
          last edited by

          @Eric-Katzfey that was a good question. today I setup my bench to run the px4 version in sdkv1.1.1 and at first I was unable to recreate the problem.

          After further testing I noticed the behavior happening when we would upload our qgc param file from the previous version. My co-worker and I spent some time and isolated the parameters that were the problem children. For what ever reason the parameters below when uploaded via a QGC file is causing the problem

          1	1	COM_FLTMODE1	-1	6
          1	1	COM_FLTMODE2	1	6
          1	1	COM_FLTMODE3	-1	6
          1	1	COM_FLTMODE4	2	6
          1	1	COM_FLTMODE5	-1	6
          1	1	COM_FLTMODE6	2	6
          

          I then set the bench back to the stable sdkv1.1.1 version and created a parameter file with only the parameters above and I was able to see the same problem.

          Steps to recreate
          make sure you run all commands over ssh and do not create an adb session.

          • load stable version of px4. (voxl-px4_1.14.0-2.0.59_arm64.deb)
          • remove old parameters by running rm -rf /data/px4/param/*
          • run voxl-px4 once to set the parameters back to default
          • power cycle
          • make sure sbus receiver and controller are powered on and connected.
          • run voxl-px4
          • open up QGC and the problem should be visible

          I honestly have no Idea why uploading these params in particular break px4. we were able to upload all the other params in the file just fine.

          Work around
          We did discover a work around. If we upload all our old params but the ones listed above and then configure them via QGC the problem does not present itself. The super weird thing about this is that I can then download the parameters and diff them vs the old ones and the above parameters will show as identical.

          Also I can not explain why connecting to the voxl with mini-dm once per boot resolves this issue. I did do further testing on though. It seems like if you open up an adb shell it also resolves this issue similar to using mini-dm, which i guess mini-dm works via adb so its probably the debug link that fixes this problem.

          We are able to continue testing using our work around, but I would be very interested to see what the root of this issue is. Let me know if you all are able to recreate and or if you need more information.

          Eric KatzfeyE E 2 Replies Last reply Reply Quote 0
          • Eric KatzfeyE
            Eric Katzfey ModalAI Team @ejohnson01
            last edited by

            @ejohnson01 Wow, that's an odd one! Thanks for the update. Not sure when I can get to it but now I am also curious as to why that would cause an issue with px4.

            1 Reply Last reply Reply Quote 0
            • E
              ejohnson01 @ejohnson01
              last edited by

              @ejohnson01 I realized I messed up the steps to recreate

              Steps to recreate
              make sure you run all commands over ssh and do not create an adb session.

              load stable version of px4. (voxl-px4_1.14.0-2.0.59_arm64.deb)
              remove old parameters by running rm -rf /data/px4/param/*
              run voxl-px4 once to set the parameters back to default
              power cycle
              make sure sbus receiver and controller are powered on and connected.
              run voxl-px4.
              open up QGC
              upload a QGC file with the above listed parameters
              restart voxl-px4 and the problem should be present

              1 Reply Last reply Reply Quote 0
              • B
                bhanner-bell
                last edited by

                I just wanted to add some more context to hopefully work towards a solution as @ejohnson01 and I been trying to overcome this issue for the better part of a week now.

                I think that the parameter issue may have been a red herring because we have now been able to reproduce the issue in other ways.

                We were able to find that using px4-uorb top when things are working, yields around ~350kB/s of traffic.

                Whenever we plug in a joystick from QGC or power on a transmitter via SBUS things start going haywire (coincidentally right when the uorb message "manual_control_setpoint" starts getting published). The data rate on uorb top tanks by over half, things start becoming unresponsive, and they don't recover until a PX4 restart.

                Are there ways to profile the PX4 tasks that are running on the SLPI? The Linux side shows ~70% of a single core and mini-dm is showing ~70% utilization of the SLPI when things are going bad (contrasted by, I believe, ~50% on Linux and ~45% on SLPI when things are working).

                I just cant help but think there is some task running either on Linux or on the SLPI that is occupying too many resources or filling a buffer that cant be cleared or something along those lines but we can't find the right tool to profile what's going on.

                Eric KatzfeyE 1 Reply Last reply Reply Quote 1
                • Eric KatzfeyE
                  Eric Katzfey ModalAI Team @bhanner-bell
                  last edited by

                  @bhanner-bell Hmmm, yes, 70% is a bit high. I generally like to see things 60% or lower. There isn't really a good tool to do that kind of profiling. There is a system level profiler included with the Hexagon SDK from Qualcomm that allows you to see how much CPU time each thread is taking. I have used that in the past to help me determine thread priorities. But it would be nice to have a statistical profiler that can tell you which lines of code are running most often. I think it would be possible to create a statistical profiler but it would be a fair amount of work. Without that you will have to use trial and error. Selectively start different drivers and modules to try and determine which ones are consuming the most resources. In normal use the CPU load goes way down after arming the drone and flying. There seems to be a bunch of preflight checking that the commander module does that consumes a lot of time and once the flight starts it no longer runs those checks. A weakness of the DSP is that it doesn't have HW support for floating point numbers so it has to use a software library for floating point operations. Also, the current version of PX4 (our voxl-dev branch) logs CPU load for the DSP. But without a new base DSP system image it will always report 0.1%. If you want to start using that I can provide the debian package that updates the DSP system image with this capability. That way you can see what happens after you arm and fly.

                  1 Reply Last reply Reply Quote 0
                  • B
                    bhanner-bell
                    last edited by

                    Hey @Eric-Katzfey we'd definitely appreciate a new dsp system image to test with.

                    I'm just going to dump some more info here in hopes that it may spark a thought:

                    We've been poking around all day and noticed there have been some recent software updates so I updated another one of our units to the SDK 1.1.1 image with our modified PX4-firmware version (it has added in vtol apps and some drivers for dsp_sbus and other sensors.. we are going to try again with 100% vanilla sdk 1.1.1 to make sure its present there too but we tried once before and experienced the same issues)

                    What I was able to observe when the system starts to become unresponsive is that the RAM consumption starts ballooning. voxl-px4 uses about 92364kb on htop when running normally but once the issue crops up the memory starts to increase and goes on forever until stopped... up into the hundreds of MB range.

                    Based on observed system behavior we believe the ballooning ram is due to a queue of uorb messages piling up. We can move the vehicle and observe many seconds later the output on the QGC AHRS indicator. The longer PX4 runs the longer the delay in the input->response loop.

                    We haven't been able to reliably trigger the fault but it seems significantly harder to trigger the fault if the usb c cable is plugged into the voxl and a host linux machine. Interestingly enough we arent actually running and adb commands or anything like that.. its just connected. With the cable completely disconnected the fault is pretty easy to reproduce: just power up the board and start voxl-px4.. if the radio (or a joystick to QGC) px4 seems to have problems.

                    Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                    • Eric KatzfeyE
                      Eric Katzfey ModalAI Team @bhanner-bell
                      last edited by

                      @bhanner-bell I sent the DSP system image debian package to the both of you in email.

                      Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                      • Eric KatzfeyE
                        Eric Katzfey ModalAI Team @Eric Katzfey
                        last edited by

                        @Eric-Katzfey Seems like your email server rejected the email with the attachment.

                        Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                        • Eric KatzfeyE
                          Eric Katzfey ModalAI Team @Eric Katzfey
                          last edited by

                          @Eric-Katzfey Try wget https://storage.googleapis.com/modalai_public/forum/modalai-slpi_1.1-12_arm64.deb

                          1 Reply Last reply Reply Quote 0
                          • B
                            bhanner-bell
                            last edited by bhanner-bell

                            We were able to update the dsp image and also build voxl-px4 from the sdk-1.1.2 tag.

                            We are able to reproduce this issue with just a joystick connected to QGC (meaning RC in mode set to joystick only and voxl-io board disconnected).

                            When the issue occurs, this is what the cpuload topic shows:

                            Every 0.1s: px4-listener cpuload                                                                                                                                                    m0054: Thu Mar  2 13:11:09 2023
                            
                            
                            TOPIC: cpuload 2 instances
                            
                            Instance 0:
                             cpuload
                                timestamp: 730351047 (59.056290 seconds ago)
                                process_load: 0.86000
                                system_load: 0.86000
                                ram_usage: 0.00000
                                platform: "QURT"
                            
                            
                            
                            Instance 1:
                             cpuload
                                timestamp: 788949259 (0.459716 seconds ago)
                                process_load: 0.52000
                                system_load: 0.12578
                                ram_usage: 0.11849
                                platform: "POSIX"
                            
                            

                            Interesting to note: the timestamp starts to fall behind for most topics published by the QURT side of things...
                            Memory ballooning is still here.

                            Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                            • Eric KatzfeyE
                              Eric Katzfey ModalAI Team @bhanner-bell
                              last edited by

                              @bhanner-bell Wow, 86%, that's really high.

                              E 1 Reply Last reply Reply Quote 0
                              • E
                                ejohnson01 @Eric Katzfey
                                last edited by

                                @Eric-Katzfey

                                I did a lot of trial and error today. I am 95% sure I just found the problem.

                                First I got my bench in the messed up state. Then I rolled voxl-px4 back to sdk version 1.0.0 and noticed I was not able to recreate the issue. I immediately after ran dpkg -i on the sdk1.1.2 version and it was broken.

                                I then proceeded to binary search test all of the commits between 1.0.0 and 1.1.2. After some time I came to the following commits.

                                b746ab9434b7e4e71f67c9047ea3d3d49de81c00 - working
                                41a57bc30a40c42990995d5b4e8fe72389f66902 - broken

                                I saved these debs so i could switch between them multiple times and sure enough every time I switched to 41a57... I would see link loss issue. as soon as I rolled back to b746a... fixed.

                                here is a link to the changes in 41a...
                                https://github.com/modalai/px4-firmware/commit/41a57bc30a40c42990995d5b4e8fe72389f66902

                                the problem is caused by the removal of this line

                                qshell commander mode manual
                                

                                I then rolled back to the sdk 1.1.2 commit and added qshell commander mode manual back to the start script and sure enough I can no longer recreate the link loss issue. I just figured this out and its almost 6pm on a friday so I have not had time to look into why this might break the rc control, but I suspect something is not getting initialized correctly without that call.

                                I love to help you all recreate this on your end and get to the true root of the issue. Let me know if there is any other information you would like me to provide.

                                Eric KatzfeyE 1 Reply Last reply Reply Quote 0
                                • Eric KatzfeyE
                                  Eric Katzfey ModalAI Team @ejohnson01
                                  last edited by

                                  @ejohnson01 Okay, thanks for tracking that down. The commander used to come up in AUTO_LOITER mode by default which runs the CPU way too high (as you have witnessed) so I had the explicit command in the startup script to put it into manual mode. The commit changes the default mode to MANUAL so that the explicit statement in the startup script is no longer necessary. That seemed to be working for me but doesn't seem to be working in your case for some reason. If you query the commander status when you remove that what mode does it report?

                                  E 2 Replies Last reply Reply Quote 0
                                  • E
                                    ejohnson01 @Eric Katzfey
                                    last edited by

                                    @Eric-Katzfey

                                    As we have done more testing of the system, we have noticed that one of the symptoms of the issue is that the control inputs become laggy. All inputs from the controller are seen on uorb with more and more time delay over time. For instance sometimes we will let the voxl run for a minute and then look at watch -n0 px4-listener manual_control_setpoint and we will not see the corresponding input show up for 30 seconds. The delay gets worse over time. I was under the impression that uorb had a queue size limit, which if the case, it does not seem to be enforced. If the queue lmit was enforce, I would expect to see a maximum delay of the time to process the size of the queue. I see in a lot of places of the code that there are 16 message length queues so a task running a 50 hz should at most see lag of 800ms (16 * 50ms). I think this might imply a more systemic issue as a few minutes ago we saw the cpuload message get over a minute behind in processing.

                                    Does this sound correct to you?

                                    Also, is there any possibility of setting up a call to dig into this issue in more depth?

                                    1 Reply Last reply Reply Quote 0
                                    • E ejohnson01 referenced this topic on
                                    • E
                                      ejohnson01 @Eric Katzfey
                                      last edited by

                                      @Eric-Katzfey

                                      When I read this a second time I noticed you told me to check the commander mode. I noticed that when I used the game pad the commander mode was put into POS_HOLD. When this happens it seems to cause the lag to build. If I command the vehicle to switch to STABILIZED or ACRO eventually the system seems to catch up and we see both uorb messages be shown realtime and we can see the HUD update mostly real time as well.

                                      I am guessing the issue has more to do with being in a mode that requires position lock. Interestingly enough, when we have brought the aircraft to the field and booted it with clear view of the sky so that the GPS can get lock, we have seen the aircraft either not have the lag and link loss or have lag and link loss for a few seconds then go away.

                                      You said that you thought mode AUTO_LOITER used too much CPU, do you happen to have any thoughts on which of the modules are using the CPU or where the problem might be originating?

                                      Eric KatzfeyE 4 Replies Last reply Reply Quote 0
                                      • Eric KatzfeyE
                                        Eric Katzfey ModalAI Team @ejohnson01
                                        last edited by

                                        @ejohnson01 We noticed that AUTO_LOITER mode was taking a lot of CPU sort of by accident and switched it to default to MANUAL. I have not gone back to determine what was taking so much time in AUTO_LOITER.

                                        1 Reply Last reply Reply Quote 0
                                        • Eric KatzfeyE
                                          Eric Katzfey ModalAI Team @ejohnson01
                                          last edited by

                                          @ejohnson01 I plan to spend some time looking into why POSCTL is taking so much CPU time. AUTO_LOITER hasn't been a very important mode for us but POSCTL certainly is.

                                          1 Reply Last reply Reply Quote 0
                                          • Eric KatzfeyE
                                            Eric Katzfey ModalAI Team @ejohnson01
                                            last edited by

                                            @ejohnson01 I have started to look into this and I can recreate the issue. So I'll dig in and see if I can figure out a fix.

                                            1 Reply Last reply Reply Quote 0
                                            • First post
                                              Last post
                                            Powered by NodeBB | Contributors