ModalAI Forum

    Problem with bitrate and omx encoder

    VOXL-CAM
    • A
      Amither
      last edited by 22 Jan 2023, 10:39

      Hi,

      I have recently started exploring the video interface of the VOXL2, using the hires camera setup. Unfortunately, some issues arose when working with the provided OMX encoder.
      When using the preinstalled voxl-streamer application with a bitrate of 1 Mbps, a resolution of 640x480, and format NV12, the output image is grainy.
      I tried writing my own GStreamer plugin (voxlsrc), which encodes frames using omxh264enc and sends them over RTP.

      gst-launch-1.0 voxlsrc device=hires ! video/x-raw,width=640,height=480,format=NV12,framerate=30/1 ! videoconvert ! omxh264enc control-rate=2 target-bitrate=1000000  ! video/x-h264,width=640,height=480,profile=main ! h264parse ! rtph264pay
      

      On my own computer, I am able to produce a clear image by simply replacing the encoder with x264enc (a CPU encoder) at the same 1 Mbps bitrate.

      gst-launch-1.0 v4l2src ! videoscale ! videoconvert !  video/x-raw,width=640,height=480,framerate=30/1,format=NV12 ! x264enc tune=zerolatency bitrate=1000 ! h264parse ! avdec_h264 ! xvimagesink sync=false
      

      When using voxl-portal or voxl-streamer with a bitrate of 10 Mbps or higher, we see a clear picture.

      Bitrate is a key constraint in our system, and we cannot support a bitrate of 10 Mbps or higher. Is there a way to reduce the bitrate and still maintain a clear picture?
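
To put the two bitrates in this post side by side, here is a minimal sketch that computes the average bit budget per pixel. The helper name `bits_per_pixel` is hypothetical and is not part of any VOXL or GStreamer API. It is only a rough budget figure and does not by itself explain why two encoders differ at the same bitrate.

```cpp
// Average number of encoded bits available per pixel per frame.
// bitrate_bps is in bits per second (the unit omxh264enc's target-bitrate
// uses; x264enc's bitrate property is in kbit/s, so 1000 there is also 1 Mbps).
constexpr double bits_per_pixel(double bitrate_bps, int width, int height, double fps)
{
  return bitrate_bps / (static_cast<double>(width) * height * fps);
}

// Figures quoted in the post: 640x480 at 30 fps.
constexpr double bpp_1mbps  = bits_per_pixel(1.0e6, 640, 480, 30.0);   // about 0.109
constexpr double bpp_10mbps = bits_per_pixel(10.0e6, 640, 480, 30.0);  // about 1.085
```

At 640x480 and 30 fps, 1 Mbps works out to roughly 0.11 bits per pixel and 10 Mbps to roughly 1.09 bits per pixel.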

      Links to the output pictures: https://postimg.cc/gallery/3RR7j5V

      Thanks,
      Amit Hershkovich.

      • C
        Chad Sweet ModalAI Team
        last edited by 23 Jan 2023, 16:43

        You need to configure the resolution in voxl-camera-server to get a resolution higher than 640 x 480

        • E
          Eric Katzfey ModalAI Team @Amither
          last edited by 23 Jan 2023, 18:42

          @Amither There are multiple parts of the processing chain that can affect video quality. What type of network link are you using (e.g. WiFi, Ethernet, etc.)? I often start with Ethernet to make sure that I can get everything working well without having to deal with the uncertainties of a wireless link. Then, when everything is as good as I can get it over Ethernet I will switch to wireless and see how that affects the quality. Just out of curiosity, what does your custom GStreamer plugin voxlsrc do?

          • A
            Amither
            last edited by 16 Feb 2023, 15:49

            Hi @Chad-Sweet,
            I tried configuring a higher resolution (720p) by editing /etc/modalai/voxl-camera-server.conf as follows. However, the output stays the same.

            voxl2:/$ cat /etc/modalai/voxl-camera-server.conf 
            {
            	"version":	0.1,
            	"cameras":	[{
            			"name":	"tracking",
            			"enabled":	true,
            			"frame_rate":	30,
            			"type":	"ov7251",
            			"camera_id":	0,
            			"ae_desired_msv":	60,
            			"ae_filter_alpha":	0.600000023841858,
            			"ae_ignore_fraction":	0.20000000298023224,
            			"ae_slope":	0.05000000074505806,
            			"ae_exposure_period":	1,
            			"ae_gain_period":	1
            		}, {
            			"name":	"hires",
            			"enabled":	true,
            			"frame_rate":	30,
            			"type":	"imx214",
            			"camera_id":	1,
            			"preview_width":	1280,
            			"preview_height":	720,
            			"snapshot_width":	3840,
            			"snapshot_height":	2160
            		}]
            }
            

            Hi @Eric-Katzfey,
            We are working over Ethernet, which should provide the best output. I tried implementing a simple GStreamer plugin that lets me pull a video stream from the camera server. I am sharing the repo in case you find it useful:
            https://github.com/amit-hers/Voxl-Plugin

            Thanks,
            Amit

            • A
              Amither
              posted 16 Feb 2023, 16:11, last edited by Amither 16 Feb 2023, 17:01

              Hi,
              In addition, when trying to work with the OMX h264enc plugin provided by Qualcomm, I am facing the following issues -

              • The picture is completely distorted as can be seen here -
                https://www.youtube.com/watch?v=niex_WXB6iI
                The CLI command is
              sudo gst-rtsp-launch "( qtiqmmfsrc camera=1 device=hires latency=0 ! video/x-raw,width=640,height=480,format=NV12,framerate=30/1 ! omxh264enc qos=1 ! video/x-h264, profile=(string)high, width=(int)640, height=(int)480, framerate=(fraction)30/1, format=(string)NV12, interlace-mode=(string)progressive, chroma-format=(string)4:2:0,profile-level-id=428014 !  h264parse ! rtph264pay name=pay0 pt=96 )"
              
              • I did manage to receive a clear stream by using Qualcomm's SDK and Qmmf server - https://www.youtube.com/watch?v=e0B045-gBUA
                with the following CLI command
              sudo gst-rtsp-launch "( qtiqmmfsrc camera=1 device=hires video_0::bitrate=1200000 ! video/x-h264,width=640,height=480,format=NV12,framerate=30/1 ! h264parse ! rtph264pay name=pay0 pt=96  )"
              

              When I use voxlsrc with omxh264enc on top of voxl-camera-server, I see a grainy picture. omxh264enc does not seem to handle a low bitrate in that situation.

              https://www.youtube.com/watch?v=UXO9PGNwfE8

              voxlsrc device=hires ! video/x-raw,width=640,height=480,format=NV12,framerate=30/1 ! videoconvert ! omxh264enc control-rate=2 target-bitrate=1200000  ! video/x-h264,width=640,height=480,profile=main ! h264parse ! rtph264pay name=pay0 pt=96
              

              Our scenario requires two different streams: one encoded and one raw.
              Our current setup works with GStreamer and v4l2src, v4l2h264enc. Since VOXL uses a Qualcomm-based board, I tried converting the current implementation to use GStreamer with qtiqmmfsrc, voxlsrc (my own plugin), and omxh264enc.

              My question is: how can I make omxh264enc work with qtiqmmfsrc or voxlsrc?

              I hope this clarifies the situation.
              Thanks,
              Amit

              • C
                Connor Fuhrman
                last edited by 8 Jan 2024, 15:27

                I'm experiencing similar issues with omxh264enc producing grainy video. Using identical GStreamer pipelines except for swapping the x264enc and omxh264enc elements, I see clear video with software-only encoding but grainy video with the OpenMAX element. Are there any updates on the omxh264enc GStreamer element?

                • A
                  Alex Kushleyev ModalAI Team @Connor Fuhrman
                  last edited by 9 Jan 2024, 23:08

                  @Connor-Fuhrman and @Amither ,

                  Our best suggestion is to use voxl-camera-server, which allows you to concurrently output encoded frames along with raw YUV frames (at different resolutions!).

                  Perhaps more details about voxl-camera-server.conf will help. Here is a snippet:

                  {
                  	"version":	0.1,
                  	"cameras":	[{
                  			"type":	"imx214",
                  			"name":	"hires",
                  			"enabled":	true,
                  			"camera_id":	1,
                  			"fps":	30,
                  			"en_preview":	true,
                  			"preview_width":	1280,
                  			"preview_height":	720,
                  			"en_small_video":	false,
                  			"small_video_width":	1280,
                  			"small_video_height":	720,
                  			"small_venc_mode":	"h265",
                  			"small_venc_br_ctrl":	"cqp",
                  			"small_venc_Qfixed":	30,
                  			"small_venc_Qmin":	15,
                  			"small_venc_Qmax":	40,
                  			"small_venc_nPframes":	9,
                  			"small_venc_mbps":	5,
                  			"en_large_video":	true,
                  			"large_video_width":	3840,
                  			"large_video_height":	2160,
                  			"large_venc_mode":	"h265",
                  			"large_venc_br_ctrl":	"cbr",
                  			"large_venc_Qfixed":	38,
                  			"large_venc_Qmin":	15,
                  			"large_venc_Qmax":	50,
                  			"large_venc_nPframes":	29,
                  			"large_venc_mbps":	1,
                  			"en_snapshot":	false,
                  			"en_snapshot_width":	1920,
                  			"en_snapshot_height":	1080,
                  			"ae_mode":	"off",
                  			"en_rotate":	false,
                  			"en_raw_preview":	false
                  		}]
                  }
                  

                  In this configuration, the large encoded video stream will be 4K30. A smaller encoded stream can also be made available at 720p (note that en_small_video is set to false in this snippet), along with a "preview" frame, which is YUV and is available at 720p (1280x720).

                  So with this configuration, you can use the voxl-streamer application to stream the large or small encoded video and use the standard MPA interface to subscribe to raw YUV frames and process them. Please note that if you change the YUV frame size to 4K, there will be a lot of data moving around between processes. See the related post: https://forum.modalai.com/topic/2980/how-to-use-gstreamer-with-mipi-camera-imx412-hi-res-cam-in-voxl2
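
To give a sense of scale for the raw-frame traffic mentioned above, here is a rough sketch of the arithmetic. It assumes the YUV frames use a 4:2:0 layout such as NV12 with no row padding. The helper names `nv12_frame_bytes` and `raw_bytes_per_second` are hypothetical, and the real stride on the device may differ.

```cpp
#include <cstdint>

// Bytes in one NV12 (YUV 4:2:0) frame: a full-resolution Y plane plus a
// half-size interleaved UV plane, i.e. 1.5 bytes per pixel.
// ASSUMPTION: even dimensions and no row padding (stride == width).
constexpr std::uint64_t nv12_frame_bytes(std::uint32_t width, std::uint32_t height)
{
  return static_cast<std::uint64_t>(width) * height * 3 / 2;
}

// Raw data rate in bytes per second for a stream of such frames.
constexpr std::uint64_t raw_bytes_per_second(std::uint32_t width, std::uint32_t height,
                                             std::uint32_t fps)
{
  return nv12_frame_bytes(width, height) * fps;
}
```

Under these assumptions, a 1280x720 preview at 30 fps is about 41.5 MB/s, while a 3840x2160 preview at 30 fps is about 373 MB/s, which is roughly nine times as much data per subscriber.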

                  Also, please note that in this example I set the small encoded video bit rate control to cqp (variable) but the large encoded video to cbr (constant bit rate). You can change those as you need and experiment. It is also a known issue that venc_mbps is not correctly reflected in the actual bitrate of the video, but it does have an effect, and you can adjust the setting until you reach the quality / video size that you want.
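
Since the configured venc_mbps may not match what the encoder actually produces, one way to tune it is to measure the real bitrate while adjusting the setting. The following is a minimal sketch of that measurement. The helper `measured_mbps` is hypothetical, and it assumes your subscriber sums the size in bytes of every encoded frame it receives over a timed window.

```cpp
#include <cstdint>

// Observed bitrate in megabits per second, given the total number of encoded
// bytes received during a measurement window of window_seconds.
// Returns 0 for a non-positive window to avoid dividing by zero.
constexpr double measured_mbps(std::uint64_t encoded_bytes, double window_seconds)
{
  return window_seconds > 0.0
    ? static_cast<double>(encoded_bytes) * 8.0 / window_seconds / 1.0e6
    : 0.0;
}
```

For example, 1,250,000 encoded bytes received over 10 seconds corresponds to 1 Mbps. A window of several seconds is usually needed so that keyframes do not skew the result.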

                  If you have any more questions about this approach, please let us know.

                  • C
                    Connor Fuhrman @Alex Kushleyev
                    last edited by Connor Fuhrman 11 Jan 2024, 01:24

                    @Alex-Kushleyev thank you for the quick reply! My application requires the ability to encode an arbitrary image frame not necessarily directly from a camera stream. Can you please comment on the feasibility of using the modal-pipes library to provide frame data to the video-server (or maybe the proper module is the voxl-streamer?)? I've used this solution in the past to provide data to the Tensorflow Lite server so I'm hoping the same approach can be taken here.

                    From this file on ModalAI's gitlab it looks like I can configure the voxl-streamer to accept a raw image frame in YUV colorspace? I can then make the pipe contain arbitrary image data, no?

                    • A
                      Alex Kushleyev ModalAI Team @Connor Fuhrman
                      last edited by 11 Jan 2024, 14:27

                      @Connor-Fuhrman , yes you can send raw frames (YUV) to voxl-streamer and it will use a hardware encoder to encode the frames into a video. You just need to publish the frames via MPA to a new "handle" and tell voxl-streamer which handle to use.

                      So your processing application can subscribe to YUV frames, do some work on the frames, annotate them, and send them out under a different handle via MPA. Just keep in mind that sending very large frames over MPA (such as 4K) will incur noticeable CPU usage.

                      Do you need an example of how to publish a YUV frame via MPA? It sounded like you already know how to do it.

                      • C
                        Connor Fuhrman @Alex Kushleyev
                        last edited by 17 Jan 2024, 01:45

                        @Alex-Kushleyev I've got this working using a custom ModalAI pipe sending an RGB image (I tried YUV because I thought that was the expected input, but looking through the source code I saw that voxl-streamer could accept RGB as well).

                        I am unfortunately only successful with a 480p image and nothing at a higher resolution. I do not get any error messages writing to the pipe with a larger image size and I don't see the voxl-streamer receive an initial frame. Do you have thoughts as to why I'm not seeing any output from voxl-streamer and not getting an error when writing to the pipe?

                        I've attached the logs from my application and from running /usr/bin/voxl-streamer -v 0.

                        My configuration file is

                        {
                            "input-pipe": "ros_to_voxl_streamer",
                            "bitrate":	10000000,
                            "decimator": 1,
                            "port": 8900,
                            "rotation":	0
                        }
                        

                        The logs from the voxl-streamer can be found here for 480p images and here for 720p images. The logs from my application can be found here for the 480p image output and here for the 720p image output.

                        When I write to the pipe my function is

                          bool image_pipe::write_frame(cv::Mat frame_bgr)
                          {
                            namespace time = std::chrono;
                        
                            log_verbose(fmt::format("Input frame has size [width={}, height={}]",
                        			    frame_bgr.size().width,
                        			    frame_bgr.size().height));
                            
                            static auto first_frame = true;
                            static std::size_t num_frame_write_errors = 0;
                            static constexpr std::size_t num_acceptable_frame_write_errors = 10;
                        
                            // Resize the image to the output size specified
                            cv::resize(frame_bgr, frame_bgr, img_size);
                        
                            log_verbose(fmt::format("Resized input frame to size [width={}, height={}]",
                        			    frame_bgr.size().width,
                        			    frame_bgr.size().height));
                        
                            cv::Mat frame_rgb;
                            cv::cvtColor(frame_bgr, frame_rgb, cv::COLOR_BGR2RGB);
                        
                            // If this is the first frame we've sent then configure the
                            // camera metadata
                            if (first_frame)
                            {
                              camera_mdata.height = static_cast<std::int16_t>(img_size.height);
                              camera_mdata.width = static_cast<std::int16_t>(img_size.width);
                              camera_mdata.size_bytes = static_cast<std::int32_t>(frame_rgb.total()) *
                        	static_cast<std::int32_t>(frame_rgb.elemSize());
                              camera_mdata.stride = static_cast<std::int32_t>(frame_rgb.step[0]);
                        
                              log_info(fmt::format("First frame written to image_pipe object. Metadata:\n{}",
                        			   describe_camera_pipe_metadata()));
                        
                              first_frame = false;
                            }
                        
                            camera_mdata.timestamp_ns = time::duration_cast<time::nanoseconds>(
                              time::steady_clock::now().time_since_epoch()).count();
                        
                            log_verbose(fmt::format("Sending image through pipe with metadata\n{}",
                        			    describe_camera_pipe_metadata()));
                        
                            const auto* frame_data = reinterpret_cast<const void*>(frame_rgb.data);
                            const auto ret = pipe_server_write_camera_frame(0, camera_mdata, frame_data);
                        
                            if (ret != 0)
                            {
                              if (num_frame_write_errors++ == num_acceptable_frame_write_errors)
                              {
                                throw std::runtime_error("Maximum number of consecutive frame write errors reached");
                              }
                              log_warning(fmt::format("Was not able to write a frame to the pipe. There are {} more error(s) left before failure",
                        			      num_acceptable_frame_write_errors - num_frame_write_errors));
                              return false;
                            }
                        
                            num_frame_write_errors = 0;
                        
                            if (++camera_mdata.frame_id % 100 == 0)
                            {
                              log_verbose(fmt::format("Camera metadata:\n{}",
                        			      describe_camera_pipe_metadata()));
                            }
                        
                            return true;
                          }
                        

                        Note that the calls to the log_ functions are what generate the above-linked log files.

                        and the construction of the pipe occurs here:

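                          static inline pipe_info_t make_pipe_info(const std::string_view pipe_name)
                          {
                            // Value-initialize so every field not set below (and every unused
                            // byte of the char arrays) starts zeroed rather than indeterminate.
                            pipe_info_t info{};

                            // string_view::data() is not guaranteed to be null-terminated, so
                            // copy at most size() bytes; the zeroed buffer supplies the
                            // terminator. std::min needs <algorithm>.
                            pipe_name.copy(info.name, std::min(pipe_name.size(), sizeof(info.name) - 1));
                            std::strcpy(info.type, "camera_image_metadata_t");
                            std::strcpy(info.server_name, "ros-voxl-streamer");

                            info.size_bytes = 2 * MODAL_PIPE_DEFAULT_PIPE_SIZE;

                            return info;
                          }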
                          
                          image_pipe::image_pipe(std::string pipe_name, cv::Size _img_size)
                            : voxl_pipe_info(make_pipe_info(pipe_name)),
                              img_size(_img_size)
                          { 
                            // Configure the camera_mdata member
                            camera_mdata.magic_number = CAMERA_MAGIC_NUMBER;
                            camera_mdata.exposure_ns = -1;
                            camera_mdata.gain = -1;
                            camera_mdata.format = IMAGE_FORMAT_RGB; // IMAGE_FORMAT_YUV420;
                            camera_mdata.frame_id = 0;
                            camera_mdata.framerate = 5;
                        
                            const auto ret = pipe_server_create(0, voxl_pipe_info, 0);
                            if (ret != 0)
                            {
                              throw std::runtime_error("Cannot create pipe");
                            }
                          }
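
One thing that may be worth checking in the code above is whether a single frame fits in the requested pipe size. The sketch below is only arithmetic and is not a confirmed diagnosis. It assumes MODAL_PIPE_DEFAULT_PIPE_SIZE is 1 MiB (verify this against the modal_pipe headers), that "480p" means 640x480, and that the RGB frame is tightly packed. The names `assumed_default_pipe_size`, `rgb_frame_bytes`, and `frame_fits` are hypothetical.

```cpp
#include <cstddef>

// ASSUMPTION: stand-in for MODAL_PIPE_DEFAULT_PIPE_SIZE. 1 MiB is assumed
// here for illustration only; check the modal_pipe headers for the real value.
constexpr std::size_t assumed_default_pipe_size = 1024u * 1024u;

// Bytes in one tightly packed 8-bit RGB frame (3 bytes per pixel).
constexpr std::size_t rgb_frame_bytes(std::size_t width, std::size_t height)
{
  return width * height * 3u;
}

// True if one frame plus its metadata header fits in a pipe of pipe_bytes.
// In real code, metadata_bytes would be sizeof(camera_image_metadata_t).
constexpr bool frame_fits(std::size_t frame_bytes, std::size_t metadata_bytes,
                          std::size_t pipe_bytes)
{
  return frame_bytes + metadata_bytes <= pipe_bytes;
}
```

Under these assumptions, a 640x480 RGB frame is 921,600 bytes and fits in a 2 MiB (2,097,152 byte) pipe, while a 1280x720 RGB frame is 2,764,800 bytes and does not. If the real default differs, the conclusion may differ too.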
                        
                        • A
                          Alex Kushleyev ModalAI Team @Connor Fuhrman
                          last edited by 17 Jan 2024, 17:53

                          @Connor-Fuhrman , I am not sure.

                          Please try enabling debug-level logging so that you may see some errors, both in your application and in voxl-streamer.

                          In your application you can add the following :

                          M_JournalSetLevel((M_JournalLevel) debugLevel);
                          

                          example: https://gitlab.com/voxl-public/voxl-sdk/services/voxl-camera-server/-/blob/master/src/main.cpp?ref_type=heads#L301

                          voxl-streamer already has this option; you just need to pass a command line option: https://gitlab.com/voxl-public/voxl-sdk/services/voxl-streamer/-/blob/master/src/main.c?ref_type=heads#L383
