ModalAI Forum

    Image Stabilization calibration and pipe size clarification

    • Alex KushleyevA
      Alex Kushleyev ModalAI Team @jameskuesel
      last edited by Alex Kushleyev

      Hi @jameskuesel ,

      When you enable en_raw_preview, which is what we do when MISP is enabled, the preview_width and preview_height basically force the selection of a specific camera mode with those dimensions. This means that the camera will send a bayer image of that size.

      After the image is received by the ISP (which does nothing else in RAW-only mode), then MISP consumes the raw bayer image and performs debayering on the GPU. Then misp_width and misp_height will be used to specify the dimensions of the output image, sampled from the original bayer image.

      If the output dimensions have exactly the same ratio as the input dimensions, then MISP will perform (arbitrary) down-scaling only (no crop = same horizontal and vertical FOV). However, if the output dims have different aspect ratio, the output image will be a cropped (and downscaled) version of the original image, such that the width and height ratio is maintained (features are not stretched). The crop is selected to maximize the fit of the output image within the input image.
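A minimal sketch of the crop-then-scale rule described above. The thread only says the crop is chosen to maximize the fit of the output within the input; the centered placement of the crop and the function name `fit_crop` are assumptions made for illustration, not the MISP implementation.

```python
def fit_crop(in_w, in_h, out_w, out_h):
    """Largest region of the input that has the output's aspect ratio.

    Returns (x0, y0, crop_w, crop_h, scale), where scale maps crop pixels
    to output pixels. The crop is centered, which is an assumption.
    """
    # Compare aspect ratios by cross-multiplying to avoid rounding issues.
    if in_w * out_h >= out_w * in_h:
        # Input is wider than (or the same shape as) the output:
        # keep the full height and trim the width.
        crop_h = in_h
        crop_w = in_h * out_w / out_h
    else:
        # Input is taller than the output: keep the full width, trim the height.
        crop_w = in_w
        crop_h = in_w * out_h / out_w
    x0 = (in_w - crop_w) / 2
    y0 = (in_h - crop_h) / 2
    scale = out_w / crop_w
    return x0, y0, crop_w, crop_h, scale


# Same aspect ratio: no crop, pure down-scaling by 0.25.
same_ratio = fit_crop(4040, 3040, 1010, 760)

# 16:10 output from a roughly 4:3 sensor: full width kept, height cropped.
wide_ratio = fit_crop(4040, 3040, 1280, 800)
```

With these inputs, the 1010x760 output uses the whole 4040x3040 frame, while the 1280x800 output uses a 4040x2525 band of it, so the horizontal FOV is preserved and the vertical FOV is reduced.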

      With that in mind, if you select 4040x3040 as preview resolution and select misp resolution such that both width and height are scaled by the same factor (does not have to be integer), you can calibrate the intrinsics using the small image and after the calibration you can upscale the intrinsics:

      • multiply the principal point offsets by the same factor
      • multiply the focal length by the same factor
      • keep the same fisheye distortion coefficients (since they are a function of angle, not pixels)
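The three rules above can be sketched as follows. The `FisheyeIntrinsics` container and the numeric values are hypothetical placeholders, not output from any ModalAI calibration tool; only the scaling rule itself comes from the post.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FisheyeIntrinsics:
    fx: float    # focal length in pixels (x)
    fy: float    # focal length in pixels (y)
    cx: float    # principal point x, in pixels
    cy: float    # principal point y, in pixels
    dist: tuple  # fisheye distortion coefficients (angle-based)


def scale_intrinsics(k, factor):
    """Rescale intrinsics calibrated at one resolution to another.

    Valid only when width and height were scaled by the same factor
    with no crop, as described in the post.
    """
    return replace(
        k,
        fx=k.fx * factor,
        fy=k.fy * factor,
        cx=k.cx * factor,
        cy=k.cy * factor,
        # dist is left untouched: it depends on angle, not on pixels.
    )


# Hypothetical calibration at 1010x760, upscaled to 4040x3040 (factor 4).
small = FisheyeIntrinsics(500.0, 500.0, 505.0, 380.0, (0.1, -0.02, 0.003, 0.0))
full = scale_intrinsics(small, 4040 / 1010)
```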

      See another version of this explanation in a different context : https://forum.modalai.com/topic/4900/running-qvio-on-a-hires-camera/12 (look for "intrinsics")

      • Input to the EIS algorithm is the full bayer image that is defined by the preview_width and preview_height.
      • the output image size is defined by misp_width and misp_height, just like when EIS is off. The misp resolution can be arbitrary
      • an important parameter is misp_zoom, which defines the relationship between the original image and the output image, in terms of zoom level. if you set zoom = 1.0 with EIS enabled, then the misp output will produce the largest undistorted image that fits within the original bayer image
      • there will be no deformation (stretching). the EIS ROI size will fit within the original bayer image and the zoom level is really controlling how much the FOV is reduced and how much stabilization margin you have.
      • you can enable the ROI display (PIP or side-by-side) to see how the EIS ROI fits within the original full image
      • the actual resolution of the misp output has nothing to do with how the EIS ROI fits within the original bayer image : the zoom value and fisheye undistortion dictate that. The output misp resolution just defines how many pixels the output image will contain, not the field of view (FOV).
      • for output resolution, you can pick whatever you need for your application.

      By the way, you can have multiple EIS outputs on the same camera concurrently (MISP supports up to 4 channels). So you can have one stream that is small resolution (for streaming over wifi) and another one higher resolution (and can be different aspect ratio) for high quality recording. https://docs.modalai.com/camera-video/electronic-image-stabilization/#using-eis-with-misp-channels

      Please note that currently, when you use multiple misp channels from the same camera, the image is processed separately, meaning if you have a large resolution and small resolution misp stream, the small resolution is NOT generated by downsampling the output of the large stream (which could be a nice optimization, in some cases).

      Please let me know if you have any other questions.

      Alex

      • J
        jameskuesel @Alex Kushleyev
        last edited by

        @Alex-Kushleyev

        Thanks for the detailed response Alex.

        I think this addresses my most important concern, which was stretching of the misp pipes. I didn't realize that it was just cropping, which is good news.

        One more thing: it seems enabling snapshots isn't compatible with MISP. Is that a fundamental limitation, or more a consequence of how the stack is built? Is there currently any recommended way to save a still image with MISP enabled? If not, is there a future implementation plan?

        What I’m aiming for long term is high-res recording, lower-res streaming, and the option to grab a still in one configuration. With MISP ofc.

        Best,
        James

        • Alex KushleyevA
          Alex Kushleyev ModalAI Team @jameskuesel
          last edited by

          Hi @jameskuesel ,

          Yes there is an issue when both MISP and snapshot are enabled. I will look into it again.

          By the way, what snapshot are you looking to grab? There are a few options:

          • a frame from high-res recording (with eis?)
          • a frame from low-res streaming (with eis?)
          • full frame, full resolution (no stabilization)
            • what format? jpg or raw bayer (for offline processing)

          We can implement a snapshot feature in MISP (without going through the ISP).

          Alex

          • J
            jameskuesel @jameskuesel
            last edited by

            @jameskuesel

            Hi Alex,

            To be greedy, ideally as many of those as possible. The ones that appeal most to us would be:

            1. A frame from high-res recording (with eis)
            2. A full frame, full res (no eis)

            Then for format,
            JPG probably makes the most sense but ofc, the more options the better haha.

            • Alex KushleyevA
              Alex Kushleyev ModalAI Team @jameskuesel
              last edited by Alex Kushleyev

              @jameskuesel ,

              For item 1 (a frame from hires recording with eis), you can do this already by capturing the YUV from the hires_misp_color stream. You could do this using the existing tool voxl-record-raw-image.

              For item 2, the full frame can also be captured (using the same tool) but only in raw bayer format. This would be good for offline processing, if you wanted to get the maximum image quality, however this image would not have any processing applied from MISP.

              Additionally, since MISP supports multiple outputs (misp channels), as you already know, you could set up one channel to be a full frame image (which you normally don't stream), with EIS off.

              So with the correct voxl-camera-server.conf, you should be able to get all the streams you want:

              • full frame
              • hires recording
              • low-res streaming

              You can then grab YUVs from any of those streams. The last part would be just encoding them to JPG (which could be done separately or as part of voxl-record-raw-image, which we could add).

              If you send me your voxl-camera-server.conf (or the part specific to the hires camera), i can update it to show you how you can get the three streams.

              Alex

              • J
                jameskuesel @Alex Kushleyev
                last edited by

                @Alex-Kushleyev

                Hi Alex,

                Sorry, it has been a while. I keep having to step away and revisit this.

                First, it seems snapshot was fixed in one of your latest dev releases. I tried it out and it seems to work, so that’s awesome: I'm able to capture a full frame non-MISP image from the 412. That said, although I tested voxl-record-raw-image on the hires_misp_color stream and was eventually able to convert the output to a jpg, it's not as usable if it isn't in a standard image format by default.

                In addition I had a few more clarifying questions.

                1.

                Seems like there’s no way to have the large hires pipe on at the same time as the MISP pipe? Meaning it’s not possible to get a recording of the full frame, no eis, while the MISP pipe is being used? This might be useful because, although I could have a high-resolution MISP pipe for recording, it would have the same zoom level as the other misp pipes, since zoom seems to be a shared parameter. I assume this is a system limitation and the voxl2 might not be able to handle encoding multiple 4k pipes. Just want to verify this.

                2.

                I noticed there is an imx412 and an imx412-misp driver. What is the difference between these two? MISP seems to work fine using the regular imx412 driver (and that’s what I’ve been using, since https://docs.modalai.com/camera-video/electronic-image-stabilization/ didn’t mention the other one).

                Is it just the same thing, different name or are there gains to be made switching to the dedicated MISP driver? What are those?

                Let me know, Thanks!

                P.S love to see the continued work being done by you all here on the camera server very cool!

                • Alex KushleyevA
                  Alex Kushleyev ModalAI Team @jameskuesel
                  last edited by

                  hi @jameskuesel , nice to hear from you

                  Please note that we recently added new functionality to voxl-record-raw-image to save to jpeg using the software jpeg encoder (turbojpeg, similarly to how it's done in voxl-portal for sending the images to the browser):

                  voxl-record-raw-image -h
                  
                  
                  Record a raw image from an MPA pipe to disk
                  This is typically used for inspecting raw or YUV
                  image data formatting or for loading into something
                  like MATLAB or OpenCV for post-processing.
                  
                  Optional arguments are:
                  -d, --dir {dir}       directory to save output files in (default: /data/raw-images/)
                  -h, --help            print this help message
                  -j, --jpeg            convert to JPEG before saving (supports NV12, NV21, RGB, RAW8 formats)
                  -q, --jpeg-quality    JPEG quality for compression (1-100, default: 90)
                  -n, --num-images {n}  number of images to save from the pipe (default 1)
                  -m, --meta            save metadata in file name (timestamp (us), exposure (us), gain)
                  -s, --skip {n}        skip n frames before saving a new frame
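As a rough illustration of how the new flags combine, here is a small helper that only builds the argument list; it does not run anything. The option names are taken from the help text above. Passing the pipe name as a positional argument, and `--jpeg-quality` taking a value, are assumptions not shown in the help output, and `build_record_cmd` is a hypothetical name.

```python
def build_record_cmd(pipe, jpeg=True, quality=90, num_images=1, out_dir=None):
    """Build an argv list for voxl-record-raw-image (not executed here)."""
    # Assumption: the MPA pipe name is given as a positional argument.
    cmd = ["voxl-record-raw-image", pipe]
    if jpeg:
        # Assumption: --jpeg-quality takes the quality as its value.
        cmd += ["--jpeg", "--jpeg-quality", str(quality)]
    cmd += ["--num-images", str(num_images)]
    if out_dir is not None:
        cmd += ["--dir", out_dir]
    return cmd


# One JPEG still from the stabilized hires stream mentioned earlier.
still_cmd = build_record_cmd("hires_misp_color")
```

On the VOXL itself, the resulting list could be handed to `subprocess.run`, after checking it against `voxl-record-raw-image -h` for the installed version.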
                  
                  1. VOXL2 should be able to encode up to 4x 4K30 streams (the video encoder can do 8K30), so there is no limitation on encoding two 4K30 streams (even if the stream is larger than 4K, such as 4040x3040, two of them should still be fine). Since MISP supports up to 4 output channels, that should be ok, but it seems that your issue is that you want different zoom on different channels. This is a feature we have also been discussing internally, and it should be easy to add a param that behaves like this: either shared zoom for all misp channels or individual zoom (set via config and controlled via the control pipe). Is that something that would work for you? We could add this pretty quickly.
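A back-of-the-envelope check of the encoder figures quoted above, using raw pixel throughput as a crude proxy for encoder load. The 4K = 3840x2160 and 8K = 7680x4320 definitions are assumptions (the post does not spell them out), and real encoder limits also depend on codec, bitrate and other settings.

```python
def pixel_rate(width, height, fps):
    """Pixels per second for one stream."""
    return width * height * fps


# Assumed definitions: 4K = 3840x2160, 8K = 7680x4320.
budget_4x_4k30 = 4 * pixel_rate(3840, 2160, 30)
budget_8k30 = pixel_rate(7680, 4320, 30)

# Two full-frame IMX412 streams (4040x3040) at 30 fps.
two_full_frame = 2 * pixel_rate(4040, 3040, 30)

utilization = two_full_frame / budget_4x_4k30

print(budget_4x_4k30, budget_8k30, two_full_frame, round(utilization, 2))
# prints: 995328000 995328000 736896000 0.74
```

Under these assumptions, 4x 4K30 and 8K30 are the same pixel rate, and two 4040x3040 streams at 30 fps use about 74% of it.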

                  2. The latest IMX412 driver ( which is here : https://storage.googleapis.com/modalai_public/temp/imx412_test_bins/20250919/imx412_fpv_eis_20250919_drivers.zip) has been optimized to get the following:

                  • various custom resolutions, as mentioned here : https://docs.modalai.com/camera-video/low-latency-video-streaming/#imx412-operating-modes
                  • maximize throughput in order to reduce rolling shutter skew (also enable 4040x3040 @60 fps, allowing 4K60 EIS output!)
                  • BTW, there has also been quite a bit of headache over IMX412 camera interference with GPS; we are going to release updated IMX412 drivers very soon. See:
                    • https://forum.modalai.com/topic/5116/gnss-emi-mitigation-guidelines
                    • https://docs.modalai.com/emi-mitigation-for-gnss/

                  Once we release the new IMX412 drivers that do not affect GPS, we will provide better documentation of the existing drivers and the differences between them. For now you should just use the "eis" driver for imx412, as i linked above.

                  Alex

                  • Alex KushleyevA
                    Alex Kushleyev ModalAI Team @Alex Kushleyev
                    last edited by

                    @jameskuesel , i have been testing a version of camera server that supports independent zoom / drag on each of the 4 misp channels and also independent EIS modes. This means that you can have one full size unstabilized image and one or more stabilized with EIS and arbitrary zoom and look-at positions for each stream.

                    There is also an updated voxl-portal version with a multi view that supports showing and interacting with multiple streams at the same time.

                    If you would like to test it, let me know; i will need to document it a bit. Please provide some more information, such as what camera you use and what use cases you need, so i can provide an exact camera server config to test (basically specs for each misp channel: fps, dims, eis on/off and eis mode).

                    Alex

                    • J
                      jameskuesel @Alex Kushleyev
                      last edited by

                      @Alex-Kushleyev

                      Hi Alex, apologies again!

                      Yes, independent zoom is what we would be looking for. Awesome to hear you’ve been working on it. The three main use cases for us would be recording, streaming, and calibration (it makes the assembly process easier if we don't have to switch camera configs). So here is the minimum channel use case we were looking for.

                      We currently use J6_UPPER_SENSOR: imx412-fpv

                      Channel 1 (Potential Recording Pipe 1 (Full Frame)):
                      Dimensions: 4040x3040
                      FPS: 30fps
                      Zoom: OFF/1x
                      EIS: OFF
                      EIS MODE: N/A

                      Channel 2 (Potential Recording Pipe 2 (Variable Zoom)):
                      Dimensions: 4040x3040
                      FPS: 30fps
                      Zoom: Variable (1x-whatever, would be sync with streaming pipe. We would implement this logic)
                      EIS: ON
                      EIS MODE: Horizon (or full-follow)

                      Notes: Would likely leave it to our users whether to record full frame or be synced with streaming pipe. So would only ever need one of the recording pipes likely never both at once.

                      Channel 3 (Streaming Pipe):
                      Dimensions: 1280x800 (16:10 matches our tablet aspect ratio, not sure if there is something better to use here.)
                      FPS: 30fps
                      Zoom: Variable (1x-whatever)
                      EIS: On
                      EIS MODE: Horizon (or full-follow)

                      Channel 4 (Calibration Pipe):
                      Dimensions: 1010x760 (we’ve been calibrating at quarter resolution with our own internal calibration routine)
                      FPS: 30fps
                      Zoom: Off/1x (zero zoom full picture)
                      EIS: OFF
                      EIS MODE: N/A
                      Notes: Really only need the grayscale/normalized from this.

                      Anyways, happy to give it a try if you have available. Zero rush, currently juggling a few projects.

                      Best,
                      James

                      • Alex KushleyevA
                        Alex Kushleyev ModalAI Team @jameskuesel
                        last edited by

                        Hi @jameskuesel,

                        I think we should be merging these new features (independent zoom, independent EIS for each misp channel) to dev this week. I will keep you posted.

                        Regarding your use case:

                        • Channel 2: if EIS is enabled, you would not have full frame resolution (4040x3040), probably something smaller. If you want to upsample a smaller ROI, this is supported, but it would be kind of a waste of processing.

                        Otherwise it looks like the new changes will support what you need. I will follow up once it's merged and we have some documentation.

                        Alex

                        • J
                          jameskuesel @Alex Kushleyev
                          last edited by jameskuesel

                          @Alex-Kushleyev

                          Awesome, sounds good Alex. That does indeed sound like it would meet our needs.

                          If EIS is enabled, how much smaller would the image be than 4040x3040? What is the max output resolution with EIS enabled?

                          • Alex KushleyevA
                            Alex Kushleyev ModalAI Team @jameskuesel
                            last edited by Alex Kushleyev

                            @jameskuesel , actually EIS can output any arbitrary resolution, just like non-eis MISP output. What i meant to point out is that in order for EIS to work, you will need to have the stabilized ROI smaller than the full image (which typically means zoom > 1.0 and optional crop / aspect ratio change). So you could still have 4040x3040 eis output but the FOV would be smaller than original. If you take this to extreme and set zoom to 10x you will have a large 4040x3040 output image that was generated from a very small patch of the original image (ROI), so the detail would be very low. Does this make sense?
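To put rough numbers on this, here is a deliberately simplified model in which the stabilized ROI is just the full frame shrunk by the zoom factor. This is an assumption for intuition only: as noted earlier in the thread, fisheye undistortion also affects how the ROI fits, and zoom 1.0 corresponds to the largest undistorted image rather than the full bayer frame. The function name `approx_eis_roi` is hypothetical.

```python
def approx_eis_roi(in_w, in_h, out_w, zoom):
    """Approximate ROI size, stabilization margin and upsampling factor.

    Simplified model: ROI = input / zoom, same aspect ratio as the input,
    fisheye undistortion ignored.
    """
    roi_w = in_w / zoom
    roi_h = in_h / zoom
    # Pixels available on each side for the ROI to move while stabilizing.
    margin_x = (in_w - roi_w) / 2
    margin_y = (in_h - roi_h) / 2
    # How many output pixels are generated per source pixel along one axis.
    upsample = out_w / roi_w
    return roi_w, roi_h, margin_x, margin_y, upsample


# Extreme case from the post: 4040x3040 output at 10x zoom.
extreme = approx_eis_roi(4040, 3040, 4040, 10)

# A more typical zoom level.
typical = approx_eis_roi(4040, 3040, 4040, 1.25)
```

In this model, 10x zoom means the 4040x3040 output is interpolated from roughly a 404x304 patch (10x upsampling per axis), while 1.25x zoom uses about 3232x2432 source pixels and leaves about 404 px of horizontal and 304 px of vertical margin on each side.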

                            • J
                              jameskuesel @Alex Kushleyev
                              last edited by jameskuesel

                              @Alex-Kushleyev

                              Aha, yes, that was my original understanding. I knew you'd need some zoom for it to actually look good and work effectively. From what I experimented with, we would probably have a lower bound of 1.2-1.3. TBD on dimensions.

                              Sorry, was confused, sounds good!
