How to use gstreamer with a MIPI camera (imx412 hi-res cam) on VOXL2?
-
Hi developers.
Please tell me, how can I use gstreamer with the MIPI camera (imx412 hi-res cam) on VOXL2?
Or is gstreamer not supported?
I checked 80-PV086-200_REV_F_QRB5165_Linux_Ubuntu_Software_Programming_Guide.pdf, but I could not figure out how. I want to use gstreamer for the following:
- Video recording and streaming, with resizing, encoding, etc.
- Developing OpenCV applications with hardware preprocessing (e.g. format conversion, resizing).
Best Regards.
-
We aren't familiar with that document.
You should use voxl-camera-server for programmatic access to camera frames, both unencoded and encoded for video transmission.
The documentation to do this is here: https://docs.modalai.com/mpa-camera-interface/#example-to-programmatically-use-data
You should use voxl-streamer to leverage gstreamer to stream data via RTSP.
Please review the VOXL SDK documentation here: https://docs.modalai.com/voxl-sdk/
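If it helps to see where gstreamer fits on the receiving end, here is a minimal sketch of a pipeline for viewing the stream that voxl-streamer serves. The URL (port 8900, mount point /live) and the h264 payload are assumptions based on a default voxl-streamer setup, so adjust them to match your configuration:

```
# Sketch: view the voxl-streamer RTSP feed from a host PC with gstreamer.
# The URL and h264 payload are assumed defaults -- check your voxl-streamer config.
gst-launch-1.0 rtspsrc location=rtsp://<voxl-ip>:8900/live latency=0 ! \
    rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! autovideosink
```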
-
Thanks for the reply.
I tried hi-res cam recording/streaming with voxl-camera-server and MPA to ROS/ROS2, but these applications require high CPU usage. I think the large video data copies are to blame.
I want to use hardware-encoded video data for recording and streaming.
Streaming works fine with voxl-streamer, but I cannot find out how to record video. I tried voxl-record-video, which is able to make a file, but the file cannot be opened by a media player; I tried the Windows default player and VLC media player.
-
@iksk , the output of voxl-record-video is a dump of all the video frames and does not contain a proper video container (such as mp4 headers, etc.). I believe it should be possible to use gstreamer after the video is generated to convert the dump of encoded video frames (h264 / h265) into a proper movie file. Have you tried something like this?
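As a rough sketch of that conversion (assuming the dump is a raw h264 elementary stream and that gstreamer is available where you run this; file names are placeholders), something like the following should wrap the frames in an mp4 container. Note that the raw dump carries no frame-rate information, so the timing of the result may need to be fixed up separately:

```
# Sketch: wrap a raw h264 dump from voxl-record-video in an mp4 container.
# File names are placeholders; for an h265 dump, swap h264parse for h265parse.
gst-launch-1.0 filesrc location=voxl-record-video.h264 ! \
    h264parse ! mp4mux ! filesink location=videofile.mp4
```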
However, you also mentioned that you would like to perform image processing on these frames - would you want to use the raw YUV frames for that, or the encoded video frames (which you would need to decode)?
Alex
-
@iksk ,
If you have not already solved this, here is a quick way of creating a playable file using ffmpeg. I just tried it:

```
ffmpeg -r 30 -i voxl-record-video.h264 -codec copy videofile.mp4
```

The `-r 30` argument specifies the actual fps, since the raw video does not have that information. The resulting file is playable in the Ubuntu movie player and VLC.

(Please note that `voxl-record-video` may generate a file name with a `.h264` extension while the actual encoding is h265. However, you can provide the exact file name with the `-o <output file name>` argument.)

If you just want to play the original video, you can use `ffplay`:

```
ffplay voxl-record-video.h264 -framerate 30
```
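If the file turns out to contain h265 despite the .h264 extension, one option (untested on my side, so treat it as an assumption) is to force ffmpeg's raw HEVC demuxer instead of renaming the file:

```
# Sketch: force ffmpeg to treat the dump as raw HEVC when the extension is misleading.
ffmpeg -f hevc -r 30 -i voxl-record-video.h264 -codec copy videofile.mp4
```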
-
@iksk ,
Finally, to answer another question about concurrently processing the RAW YUV frames and the encoded video. Assuming the encoded video stream / recording is no longer an issue, and you want to avoid shipping very large frames via MPA and ROS, let me first describe why you experienced high CPU usage in your test.

Let's assume that you have a 4K YUV frame, so 3840*2160 = 8294400 pixels, and each pixel in a YUV image is 1.5 bytes, so a single 4K YUV frame has a size of about 12.4MB. 30FPS means we have 373MB/s of data coming in.

If you publish all that data via MPA to another process, this creates a lot of work for voxl-camera-server and your process to send and access the image (a lot more work than a memcpy within a single process). So, your MPA client is a ROS converter, which receives the 373MB/s of data, converts it to ROS (RGB??) frames, and sends those out.

YUV to RGB conversion for a large frame is a bit expensive, and an RGB frame has 2x the size of the YUV frame (3 bytes per pixel vs 1.5 bytes per pixel). So now we are talking about sending out 740MB/s worth of images within ROS, which the sender has to send and the receiver has to receive before even starting to process the data!

So... what can you do?
- reduce the number of times the frames are shipped between processes. You could potentially add your processing code directly to the voxl-mpa-to-ros node, or you can form a `nodelet` from the voxl-mpa-to-ros node and your processing nodes; then the ROS image transport will be shared memory (a `nodelet` is more of a ROS1 concept, I think in ROS2 this is more transparent)
- reduce the resolution of your preview frame. You can record the full 4K frame and process a preview frame at a lower resolution (configurable via `voxl-camera-server.conf`)
- reduce the frame rate of the frames that you are processing (skip / drop frames either in the camera server (may need a small change to drop frames) or downstream); see the sketch after this list for a quick way to check what is actually being published
- send / process only the Y (intensity) component of the YUV frame, which would look just like a monochrome image
- a more complicated thing to do would be for you to inject a processing callback into `voxl-camera-server` itself. The `publishISPFrame` function in the `voxl-camera-server` source (https://gitlab.com/voxl-public/voxl-sdk/services/voxl-camera-server/-/blob/master/src/hal3_camera_mgr.cpp) shows how the frame is published to MPA. So instead of publishing to MPA, you can make a call to your processing function. This would achieve true zero-copy performance, since you can process the frame in place. This means you would need to build your processing application as part of `voxl-camera-server`, which is a deviation from our current software model. This is the most complicated approach, and it could break the camera server if you do something incorrectly (or could interfere with other cameras, since the camera server is responsible for handling all cameras). Also please note that this method would not be supported by us; I just wanted to make you aware that you have the option to do it at your own risk.
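If you try the resolution or frame-rate suggestions above, a quick way to sanity-check what voxl-camera-server is actually publishing is the voxl-inspect-cam tool from voxl-mpa-tools. The hires pipe name below is an assumption; list /run/mpa/ on your VOXL to see which camera pipes exist on your system:

```
# Sketch: print live metadata (frame rate, format, etc.) for a camera pipe.
# 'hires' is an assumed pipe name -- check /run/mpa/ for what is available.
voxl-inspect-cam hires
```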
Hopefully some of this advice can help you move forward!
Alex
-
Thank you for the comments. I was able to get streaming, recording, and video playback working well!
I will keep looking into image preprocessing.