Hi @Moderator, is it possible to make a bootable, installable OS image of the VOXL2 from one drone and then deploy it to another drone? On a Raspberry Pi, for example, I can take a backup image of the entire system and flash it to another SD card. I have 3 more Starling drones that need the same voxl services modifications as my base Starling drone. Doing them one by one and then testing the functionality, only to find out that it doesn't work, is a bit tedious.
Latest posts made by Darshit Desai
-
VOXL2 starling bootable images
-
RE: Starling fan attachment and optimization
@Alex-Kushleyev said in Starling fan attachment and optimization:
As far as I understand from the article here and some of my own searching online, https://libeigen.gitlab.io/docs/group__TopicStorageOrders.html
It seems that Eigen uses column-major order by default if the storage option isn't specified. For the point cloud data we have, column-major would be better, right?
@Alex-Kushleyev said in Starling fan attachment and optimization:
For best results, x, y and z components of each vector have to be stored in consecutive memory locations and vector N+1 should be right after vector N (in memory)
I am also not sure about the memory layout here. If the matrix is column-major and the incoming point cloud has shape 3 x 38528, then each column holds one point, so a column-major matrix should be optimal: the x, y and z of a point land in consecutive memory locations, and point N+1 follows point N.
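To make the layout question concrete, here is a minimal sketch of the index arithmetic using only the standard library (no Eigen). The helper names `colMajorIndex` and `rowMajorIndex` are made up for illustration. It shows that in a column-major 3 x N matrix the x, y and z of point c sit at flat offsets 3c, 3c+1 and 3c+2, whereas in a column-major N x 3 matrix the same three components are N elements apart.

```cpp
#include <cassert>
#include <cstddef>

// Flat offset of element (row, col) in a column-major matrix with `rows` rows.
// Column-major means each column is stored contiguously, one after another.
inline std::size_t colMajorIndex(std::size_t rows, std::size_t row, std::size_t col) {
    assert(row < rows);
    return col * rows + row;
}

// Flat offset of element (row, col) in a row-major matrix with `cols` columns.
// Row-major means each row is stored contiguously, one after another.
inline std::size_t rowMajorIndex(std::size_t cols, std::size_t row, std::size_t col) {
    assert(col < cols);
    return row * cols + col;
}
```

So the shape matters as much as the storage order: a column-major 3 x 38528 matrix and a row-major 38528 x 3 matrix both keep each point's components adjacent, while a column-major 38528 x 3 matrix does not.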
-
RE: Starling fan attachment and optimization
@Alex-Kushleyev Yes, I use Eigen3 for the rigid body transformation.
Method 0: Use the tf2_sensor_msgs::do_transform_cloud function directly on the large 38528x3 point cloud. That didn't work, so I used Eigen3 from the next method onwards.
Method 1:
I tried statically allocating Eigen MatrixXd variables. I also tried splitting the large 38528x3 point cloud matrix into 5 parts, multiplying each part on a different pinned CPU in parallel, and then combining the results. Neither approach worked, because both end up heating the CPU when run in combination with the voxl-tflite-server.
Method 2:
The other method I tried was filtering the points by depth, which reduced the input to the rigid body transformation to about 15000 points, and then doing the same parallelized multiplication of the rotation matrix and the point cloud. That also heats the CPU too much when run in combination with the voxl-tflite-server. All of the code was written with Eigen3. This is the code that I used for Method 2: https://github.com/darshit-desai/Project_LegionAir/blob/master/your_pointcloud_package/src/pc_transform.cpp
Note that all of the methods above run fine on my desktop CPU, which is understandable because my desktop CPU has far more compute power and better cooling than what's onboard the VOXL.
Edit: Also in case of voxl I use the CPUs in perf mode
-
RE: Starling fan attachment and optimization
@thomas @Alex-Kushleyev I have bottlenecks in my sensor fusion pipeline that drive the CPU temperatures very high, even with the fan cooling and the propeller airflow in flight. I am doing a rigid body transformation of the incoming points from the ToF frame to the RGB camera frame, and transforming that 38528x3 array takes a lot of CPU capacity and overheats the CPU. I have tried every trick in my toolbox, from multithreading to removing unnecessary pipes from voxl_mpa_to_ros, but none of them work. One option I see is to filter out the points that are irrelevant to me (I am only interested in a depth range of 10 cm to 1.5 m) before they are published by voxl_mpa_to_ros, and then do the rigid body transformation. Another option is to use the raw data by tapping into one of the camera server pipes and filtering the points there. Any thoughts on how to optimize the below pipeline for performance would be helpful.
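For reference, the filter-then-transform idea can be sketched as a single pass over a packed float buffer. This is only an illustration under stated assumptions, not the code from my package: the `RigidTransform` struct and `filterAndTransform` function are hypothetical names, the cloud is assumed to be packed as x0,y0,z0,x1,y1,z1,..., depth is assumed to be the z component in the ToF frame, and the rotation is assumed to be stored row by row. It uses single precision and the standard library only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rigid transform: p_out = R * p_in + t.
// R is a 3x3 rotation stored row by row (R[0..2] is the first row).
struct RigidTransform {
    std::array<float, 9> R;
    std::array<float, 3> t;
};

// Drops every point whose z (assumed to be depth in the source frame) lies
// outside [minDepth, maxDepth] and transforms the surviving points in the
// same loop, so rejected points never reach the multiply.
// `xyz` is packed as x0,y0,z0,x1,y1,z1,...; the result uses the same packing.
inline std::vector<float> filterAndTransform(const std::vector<float>& xyz,
                                             const RigidTransform& T,
                                             float minDepth, float maxDepth) {
    assert(xyz.size() % 3 == 0);
    std::vector<float> out;
    out.reserve(xyz.size());  // upper bound, avoids reallocation in the loop
    for (std::size_t i = 0; i < xyz.size(); i += 3) {
        const float x = xyz[i];
        const float y = xyz[i + 1];
        const float z = xyz[i + 2];
        if (z < minDepth || z > maxDepth) {
            continue;  // outside the depth band of interest
        }
        out.push_back(T.R[0] * x + T.R[1] * y + T.R[2] * z + T.t[0]);
        out.push_back(T.R[3] * x + T.R[4] * y + T.R[5] * z + T.t[1]);
        out.push_back(T.R[6] * x + T.R[7] * y + T.R[8] * z + T.t[2]);
    }
    return out;
}
```

With the range from this post, the call would be `filterAndTransform(cloud, T, 0.10f, 1.5f)`. Whether this is light enough to run alongside voxl-tflite-server is something I would still have to measure on the VOXL2.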
-
RE: Starling fan attachment and optimization
@thomas Thank you. The purpose of the question was to identify every component of the tflite server that uses the CPU. If it's just publishing the image and drawing bboxes or segmentation maps on it, then I can comment out the relevant function calls: the one that publishes the image with the bbox, and the part that actually draws the bbox and writes details such as the fps onto the image.
Is any other part being done on the CPU other than the ones highlighted?
-
RE: Starling fan attachment and optimization
Hi @Alex-Kushleyev, I wanted to ask one more question regarding CPU utilization while running the tflite server. It shows that it uses cores 4, 5 and 6 for processing and connects itself to the camera server. What is the tflite server using the CPU for? Publishing images to libmodal-pipe, such as images with the bbox drawn on them? What if I want to disable that and bring the tflite server's CPU utilization down to zero?
By that I mean this line here: https://gitlab.com/voxl-public/voxl-sdk/services/voxl-tflite-server/-/blob/master/src/main.cpp?ref_type=heads#L247
What else is the tflite server doing on the CPUs that can be removed? In my system I am only concerned with the bbox detection message.
-
RE: Starling fan attachment and optimization
@Alex-Kushleyev said in Starling fan attachment and optimization:
Regarding CPU frequencies, when the cpu governor is in auto mode, it will try to scale down cpu frequencies to save power. But if you want maximum performance, you can set it to performance mode:
voxl-set-cpu-mode perf
Note that this does not persist after reboot. If you want a permanent change, you can edit /etc/modalai/voxl-cpu-monitor.conf and set the normal cpu mode to perf
This is definitely useful. But is it right that cpu0-3 are pinned for MPA services? If that is the case, can I explicitly assign my ROS nodes to run on cpu 4-7?
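What I have in mind for the explicit assignment is the standard Linux affinity call, sketched below. The helper names `makeCpuMask` and `pinCallingThread` are made up for illustration, and the 4-7 range is just the one from my question. The sketch does not settle whether cpu0-3 are actually reserved for MPA services.

```cpp
#include <cassert>
#include <sched.h>

// Builds an affinity mask covering CPUs firstCpu..lastCpu (inclusive).
inline cpu_set_t makeCpuMask(int firstCpu, int lastCpu) {
    assert(firstCpu >= 0 && firstCpu <= lastCpu && lastCpu < CPU_SETSIZE);
    cpu_set_t mask;
    CPU_ZERO(&mask);
    for (int cpu = firstCpu; cpu <= lastCpu; ++cpu) {
        CPU_SET(cpu, &mask);
    }
    return mask;
}

// Restricts the calling thread to the CPUs in `mask` (pid 0 means "caller").
// Returns true on success; it fails if none of the CPUs in the mask exist.
inline bool pinCallingThread(const cpu_set_t& mask) {
    return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}
```

A node would call `pinCallingThread(makeCpuMask(4, 7))` early in `main`, before spawning worker threads, since threads created afterwards inherit the affinity mask.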
-
RE: Starling fan attachment and optimization
@Alex-Kushleyev I tried this recommendation: I removed the fan, held it up close to the board, and ran my code. It didn't make any difference. As soon as the temperatures go above 75, the accelerometer bias flag becomes active. I also don't think fan placement is the issue, because the fan sits right above the CPU heat sink and nowhere near the IMUs, and there is enough space between the WiFi dongle and the board to move it around a little.
@Alex-Kushleyev said in Starling fan attachment and optimization:
Then you can remove the wedged fan (and hold it close to the board) and test again and see if the unusual accel bias is gone (when warmed up).
The bias issues only appear when I run the object detection and my own sensor fusion module. Without that code running, and with the fan installed, the drone is able to fly in position mode. This is more of a CPU heating and load distribution issue. I suspect cpu0-3 are pinned for some MPA services and pipes, and the remaining 4 CPUs are not being utilized equally. I am looking into multithreading to distribute the load in my code. Let me know if there are any more recommendations.
-
RE: Starling fan attachment and optimization
@Alex-Kushleyev I have also consistently observed that cpu0-cpu3 run at 1.8-2.0 GHz with 45-65% average utilization even when the ROS nodes are not running. When the ROS nodes are running, cpu7 averages 1.9-2.8 GHz with 70-85% utilization, while cpu4-6 stay relatively light at only 0.6-0.7 GHz and at most ~20% utilization, even when I run my complete code stack. Is there a specific reason for this?
-
RE: Starling fan attachment and optimization
@Alex-Kushleyev said in Starling fan attachment and optimization:
To confirm the IMU bias issue, you can inspect the IMU data using QGC (mavlink inspector) and see if the XYZ accelerometer (while sitting still)
Which parameter would it be? Position NED?