Hi @Alex-Kushleyev,
Resurrecting this old thread: we now have the IMX412 on a drone and are ready to give VIO on the IMX412 our full attention and lots of testing effort. Where I'm at now is that I have a prototype working, and QVIO does run on the IMX412 camera and outputs estimates that seem reasonable. But I'm 100% sure it's not configured as well as it could be, because I made a lot of assumptions that I would like your input on:
Which camera data / pipe to use
Ideally, we would like the IMX412 VIO to perform close to (or of course better than!) an ar0144 tracking camera in terms of image quality for feature tracking, low CPU usage, low-latency frames, etc. In this spirit, I've been looking into how to get the MISP normalized pipes coming from the IMX412, and also how to get the camera server producing ION data so we get the same CPU usage gains we saw in https://forum.modalai.com/topic/4893/minimizing-voxl-camera-server-cpu-usage-in-sdk1-6.
I saw that in the pipe setup, the normalized code for IMX412 was commented out:
4f2e7cff-3212-46c7-9929-988f0ce413e1-image.png
After uncommenting it, I could see a decent-looking normalized stream in the portal. I also see the ION pipe pop up for that norm stream, but I haven't tried that ION pipe on the QVIO server yet (I'm confident it would work though, just waiting for https://gitlab.com/voxl-public/voxl-sdk/core-libs/libmodal-pipe/-/commit/d18521776e3e88f396d85aa657769c47f29e9c9f to get tagged!).
Do you see any issue with using the MISP norm pipe for IMX412 VIO, or is that actually what you would recommend?
Which resolution to use
I know you gave some resolution advice above, but I'm a little confused about exactly where to put those numbers. You had suggested 1996x1520 for a 5.5ms readout time. Do these numbers go into the Preview Width / Preview Height config fields? Here is the entire diff of the config settings I have been using for my testing:
b75275d6-9c35-48ac-a95c-a59b439e7f54-image.png
The other values I have a question about in that diff are the MISP width / height fields. I chose 998x760, which is half of the Preview Width / Height resolution you suggested. I did this because I wasn't sure what compute bottlenecks would pop up if I fed a 1996x1520 image into QVIO. Do you think 998x760 is good, or should I pick something like a 0.75 downsample, i.e. 1497x1140 for the MISP width/height?
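To make the options concrete, here is a minimal sketch of the downsample arithmetic. It assumes a plain uniform scale of the 1996x1520 preview size and uses pixel count only as a rough proxy for processing load; `scaled_size` is a helper name I made up for illustration, not anything from voxl-camera-server.

```python
def scaled_size(width, height, scale):
    """Return (width, height) after a uniform scale, rounded to whole pixels."""
    return round(width * scale), round(height * scale)


PREVIEW_W, PREVIEW_H = 1996, 1520  # preview resolution suggested earlier in the thread

for scale in (0.5, 0.75, 1.0):
    w, h = scaled_size(PREVIEW_W, PREVIEW_H, scale)
    # Pixel count relative to the full preview frame (rough proxy for load only).
    relative_pixels = (w * h) / (PREVIEW_W * PREVIEW_H)
    print(f"scale {scale}: {w}x{h}, {relative_pixels:.2f}x the pixels of full preview")
```

By this measure the 0.75 option carries a bit more than twice the pixels of the 0.5 option, which is the tradeoff I'm unsure about.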
Camera Driver files to use and how to version control and deploy those
Could you confirm that the binary files in https://storage.googleapis.com/modalai_public/temp/imx412_test_bins/20250919/imx412_fpv_eis_20250919_drivers.zip are still the latest and the recommended binaries to use? Could you also advise on how to version control these files and deploy them to the voxl2 when the camera server .deb is deployed? I want to keep all files related to bringing up the camera in the voxl-camera-server debian if possible. I see some binary files being stored in this path:
03191a03-455e-4e18-bd50-dd19b9d5c028-image.png
So if I understand the process correctly, those files will end up in /usr/share/modalai/chi-cdk/imx412-fps-misp. Is that where the voxl-configure-cameras C looks for them? Also, do I have to do anything with the com.qti.sensor.imx412_fpv.so file in the zip link that you sent, or do I just ignore that file?
My end desired behavior is that when I install the voxl camera server .deb, I don't have to worry about also copying binary files over to the voxl, or moving any files around on the voxl, or having to remember to run voxl-configure-cameras C. So maybe the path forward there is to have all files deployed into the right places by the .deb install and then in the postinst script auto run voxl-configure-cameras C? What do you think?
Aspect ratio concerns and their effect on field of view and camera calibration
Is the aspect ratio you suggested (1996x1520) the actual aspect ratio of the sensor? Or does the sensor support multiple aspect ratios, or is there something more complex going on that I don't understand here? I just want to make sure that we're using as much FoV as the sensor supports. That should be the goal for both VIO feature detection and streaming, right: maximize field of view?
Also, how does changing the aspect ratio / resolution affect the camera calibration? We're using kalibr, which asks for a focal length bootstrap; we've been giving it 470 for the ar0144 camera. Do you know of a way to figure out an accurate focal length bootstrap for the images that we end up using for the IMX412?
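For context, this is the kind of back-of-the-envelope estimate I had in mind, as a rough sketch only. It assumes an ideal pinhole model, which is a poor fit for very wide lenses, so it would only serve as a starting guess. The 120 degree horizontal FoV below is a made-up placeholder, not a spec for our lens, and both function names are hypothetical.

```python
import math


def focal_px_from_fov(width_px, hfov_deg):
    """Pinhole estimate: focal length in pixels from image width and horizontal FoV."""
    return (width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)


def rescale_focal(focal_px, old_width_px, new_width_px):
    """Focal length in pixels scales linearly with image width for the same FoV."""
    return focal_px * new_width_px / old_width_px


HFOV_DEG = 120.0  # placeholder value, NOT the real lens spec

f_full = focal_px_from_fov(1996, HFOV_DEG)   # estimate at 1996x1520
f_half = rescale_focal(f_full, 1996, 998)    # same lens, 998x760 output
print(f"bootstrap at 1996 wide: {f_full:.1f} px, at 998 wide: {f_half:.1f} px")
```

The second function reflects my understanding that a pure downsample (no crop) scales the focal length in pixels by the same factor as the width, whereas a crop would leave it unchanged and narrow the FoV instead.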
How to not lose other IMX412 features like 4k recording and EIS streaming etc
This is kinda my biggest concern about the feasibility of this whole thing: Will we still be able to get the other awesome IMX412 features like high quality streaming to the GCS with EIS, as well as high quality 4K recordings even in difficult low light environments, AND at the same time optimize the IMX412 for VIO which demands stuff like fast readout for less skew?
Any advice you can give here on mapping out the tradeoffs? Are there any non-starters like not being able to get 4K recording if we opt to use the MISP norm pipe for VIO? Or are you confident that we can get the best of all 3 worlds 🙂
Exposure time concerns
I agree with your initial point that a short exposure is important for keeping motion blur low. As I mentioned in the intro to this message, I have a prototype working and am now at a place where I can tune the gain vs. exposure / auto exposure params. Could you help me with that? I assume this tradeoff also applies to the ar0144 camera; are there any lessons I can take from there?
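To frame the exposure question, here is a small-angle sketch of how I'm thinking about blur: blur in pixels is roughly focal length in pixels times angular rate times exposure time. The focal length, rotation rate, and blur budget below are illustrative placeholders rather than measured values, and the function names are my own.

```python
import math


def motion_blur_px(focal_px, rate_deg_s, exposure_s):
    """Approximate blur (pixels) from camera rotation during one exposure."""
    return focal_px * math.radians(rate_deg_s) * exposure_s


def max_exposure_s(focal_px, rate_deg_s, blur_budget_px):
    """Longest exposure that keeps rotational blur within the given pixel budget."""
    return blur_budget_px / (focal_px * math.radians(rate_deg_s))


FOCAL_PX = 500.0     # placeholder focal length in pixels
RATE_DEG_S = 100.0   # placeholder rotation rate during aggressive flight

for exposure_ms in (1.0, 2.0, 5.0):
    blur = motion_blur_px(FOCAL_PX, RATE_DEG_S, exposure_ms / 1000.0)
    print(f"{exposure_ms} ms exposure -> ~{blur:.1f} px blur")

limit_ms = max_exposure_s(FOCAL_PX, RATE_DEG_S, 1.0) * 1000.0
print(f"exposure limit for a 1 px budget: ~{limit_ms:.2f} ms")
```

If this framing is right, a higher MISP output resolution (larger focal length in pixels) would tighten the exposure limit for the same blur budget.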
QVIO readoutTime param
Great find and thanks for pointing that out!! To confirm the specific numbers here: if I use the 1996x1520 preview width/height, which has a documented readout time of 5.5ms (I should confirm this using the -d mode), should I then put 0.0055 for that parameter?
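Spelling out the arithmetic behind that number as a quick sketch: it assumes the readoutTime parameter is expressed in seconds, which is exactly the thing I'd like confirmed, and that the 5.5ms is spread evenly over the 1520 rows. The helper names are mine.

```python
def readout_ms_to_s(readout_ms):
    """Convert a readout time from milliseconds to seconds."""
    return readout_ms / 1000.0


def row_time_us(readout_ms, rows):
    """Per-row readout time in microseconds, assuming rows are read at a uniform rate."""
    return readout_ms * 1000.0 / rows


READOUT_MS = 5.5   # documented readout time for 1996x1520 (still to be verified with -d)
ROWS = 1520

print(f"readoutTime candidate (assuming seconds): {readout_ms_to_s(READOUT_MS)}")
print(f"per-row readout: {row_time_us(READOUT_MS, ROWS):.2f} us")
```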
As always, thank you for your help with camera-related matters. We would be nowhere close to where we are now with robotic perception without your guidance!