[libcamera-devel] Cropping (aka. digital zoom)

Naushir Patuck naush at raspberrypi.com
Fri Jun 19 14:26:58 CEST 2020


Hi David,

On Fri, 19 Jun 2020 at 12:36, David Plowman
<david.plowman at raspberrypi.com> wrote:
>
> Hi everyone
>
> Another thing I'd like to discuss is the ability to crop a region of
> interest from a camera image. This is another feature that will be
> available in many camera systems but which is not currently possible
> within libcamera.
>
> Just to recap, cropping specifies a window in the input image, and
> everything outside this window is discarded. However, the used portion
> of the image is still scaled to the same final output size, making
> this a natural mechanism for implementing digital zoom.
>
> I would therefore like to propose a new control consisting of four
> numbers:
>
> * The first two are the x and y offsets of the crop window within the
>   input image. Having these parameters allows an application to pan
>   around within the input image.
>
> * The next two would be the width and height of the crop window.
>
> I believe it's sensible for these numbers to be ratios, rather than
> (for example) pixels. This makes it much easier for an application to
> use. To specify that you want the middle quarter of an image ("2x
> digital zoom"), you'd pass the numbers 0.25, 0.25, 0.5 and 0.5. Note
> how setting width == height guarantees that you don't mess up your
> aspect ratio.
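
To make the ratio convention concrete, here is a minimal sketch of how an application might build a centered zoom window and turn it into pixels. The helper names and the 1920x1080 frame size are assumptions for illustration only; nothing here is an existing libcamera API.

```python
def centered_zoom(factor):
    """Fractional (x, y, width, height) crop for a centered zoom factor >= 1."""
    size = 1.0 / factor
    offset = (1.0 - size) / 2.0
    return (offset, offset, size, size)


def crop_to_pixels(crop, frame_width, frame_height):
    """Scale a fractional crop to whole pixels for a given frame size."""
    x, y, w, h = crop
    return (round(x * frame_width), round(y * frame_height),
            round(w * frame_width), round(h * frame_height))


# 2x digital zoom on a hypothetical 1920x1080 frame.
crop = centered_zoom(2.0)                  # (0.25, 0.25, 0.5, 0.5)
pixels = crop_to_pixels(crop, 1920, 1080)  # (480, 270, 960, 540)
print(crop, pixels)
```

Because the fractional width and height are equal, the 960x540 window keeps the 16:9 shape of the frame it was taken from.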

One question I have: what would these ratios be relative to?
Typically you would choose the dimensions of the input frame to the
ISP. However, we currently do a "hidden" crop to fix the aspect ratio
when, for example, we want to output 16:9 from a 4:3 input frame. So
would the ratios be relative to this hidden crop? If not, then the
application must know the input frame size (it currently does not) and
must crop correctly to adjust the aspect ratio itself. If so, then we
will be hiding a portion of the input frame that the application will
never be able to pan into, but maybe that's not a problem?
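
The two options can be compared with a rough sketch. The 4096x3072 input frame, the 1920x1080 output and both helper functions are assumptions chosen for round numbers; they do not describe how any pipeline handler actually computes its crop.

```python
def aspect_crop(in_w, in_h, out_w, out_h):
    """Largest centered window (pixels) in the input with the output's shape."""
    if in_w * out_h > in_h * out_w:
        # Input is wider than the output: trim left and right.
        w, h = in_h * out_w // out_h, in_h
    else:
        # Input is taller than (or the same shape as) the output: trim top and bottom.
        w, h = in_w, in_w * out_h // out_w
    return ((in_w - w) // 2, (in_h - h) // 2, w, h)


def resolve(crop, reference):
    """Map a fractional crop onto a reference rectangle given in pixels."""
    rx, ry, rw, rh = reference
    x, y, w, h = crop
    return (rx + round(x * rw), ry + round(y * rh),
            round(w * rw), round(h * rh))


full = (0, 0, 4096, 3072)                      # hypothetical 4:3 input frame
hidden = aspect_crop(4096, 3072, 1920, 1080)   # (0, 384, 4096, 2304)
zoom2 = (0.25, 0.25, 0.5, 0.5)

print(resolve(zoom2, full))    # (1024, 768, 2048, 1536): still 4:3
print(resolve(zoom2, hidden))  # (1024, 960, 2048, 1152): already 16:9
```

With the full frame as the reference, the application gets a 4:3 window back and has to correct the shape itself. With the hidden crop as the reference, the shape is right, but the top and bottom 384 rows of this example frame can never be reached.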

>
> Questions
>
> 1. How to represent the numbers?
>
> I see an existing Rectangle class that gets used within controls, so
> maybe that makes sense. I'd perhaps go with fixed point values such
> that 65536 is equivalent to 1.0? (Though that doesn't leave huge scope
> for sub-pixel cropping, where platforms support this...)

Would it be better to have a floating point equivalent of the
Rectangle class so we can use floats here?
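
For reference, the fixed-point option would look roughly like this. It is only a sketch of the arithmetic, with 65536 standing for 1.0 as suggested above; the function names and the 4096-pixel frame width are assumptions.

```python
ONE = 1 << 16  # 65536 represents 1.0


def to_fixed(ratio):
    """Convert a ratio in [0, 1] to the nearest 16-bit fixed-point value."""
    return round(ratio * ONE)


def to_ratio(value):
    """Convert a fixed-point value back to a floating-point ratio."""
    return value / ONE


def steps_per_pixel(frame_width):
    """Number of fixed-point steps that fall within one pixel of the frame."""
    return ONE / frame_width


print(to_fixed(0.25))         # 16384
print(to_ratio(32768))        # 0.5
print(steps_per_pixel(4096))  # 16.0
```

On a frame 4096 pixels wide, this leaves 16 steps per pixel, which is the limited sub-pixel headroom mentioned in the question. A float-based rectangle would avoid having to pick a scale at all.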

Regards,
Naush


> 2. Are there devices that can't pan within the input image?
>
> How would they signal this? I guess reporting min and max
> rectangles with identical x and y values would cover this.
>
> 3. Minimum and maximum values?
>
> A platform could also report "reasonable" minimum width and height
> values that indicate roughly what is supported. Values of 1.0 here
> (65536) would mean that there is no zoom capability.
>
> Valid maximum widths and heights of course depend on x and y (or
> vice versa). Perhaps we just report 1.0 here.
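
One way an application might read such limits, sketched under the assumption that a platform reports a minimum and a maximum rectangle as fractional (x, y, width, height) tuples. The function names and example values are hypothetical.

```python
def can_pan(min_rect, max_rect):
    """Panning is possible only if the x or y offset is allowed to vary."""
    return (min_rect[0], min_rect[1]) != (max_rect[0], max_rect[1])


def max_zoom(min_rect):
    """Largest zoom factor implied by the smallest allowed width and height."""
    return 1.0 / max(min_rect[2], min_rect[3])


def max_size_at(x, y):
    """Largest width and height that still fit in the frame at offset (x, y)."""
    return (1.0 - x, 1.0 - y)


# Hypothetical device that zooms up to 4x but cannot pan.
fixed_min, fixed_max = (0.0, 0.0, 0.25, 0.25), (0.0, 0.0, 1.0, 1.0)
print(can_pan(fixed_min, fixed_max))   # False
print(max_zoom(fixed_min))             # 4.0
print(max_zoom((0.0, 0.0, 1.0, 1.0)))  # 1.0, i.e. no zoom capability
print(max_size_at(0.25, 0.5))          # (0.75, 0.5)
```

The last call shows why a single maximum width and height cannot be exact: the usable size shrinks as the offsets grow, so reporting 1.0 is only an upper bound.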
>
> 4. How to handle awkward cases or requests?
>
> I think we have to allow implementations silently to coerce the
> requested values into the "closest" thing that works. This might
> involve changing offsets (for example) to accommodate the requested
> width/height, or even adjusting the crop slightly to satisfy
> things like pixel alignment requirements.
>
> Perhaps the metadata returned in a request can indicate a value closer
> to what was actually used.
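
A minimal sketch of what such coercion could look like. The policy (keep the requested size, move the offsets, then round down to an alignment), the minimum size of 0.125 and the 2-pixel alignment are all assumptions for illustration, not the behaviour of any real platform.

```python
def snap(value, size, align):
    """Scale a ratio to pixels and round down to a multiple of align."""
    return int(value * size) // align * align


def coerce_crop(x, y, w, h, frame_w, frame_h, min_size=0.125, align=2):
    """Return the closest usable crop as a pixel (x, y, width, height)."""
    # Clamp the requested size to the supported range first.
    w = min(max(w, min_size), 1.0)
    h = min(max(h, min_size), 1.0)
    # Then shift the offsets so the window stays inside the frame.
    x = min(max(x, 0.0), 1.0 - w)
    y = min(max(y, 0.0), 1.0 - h)
    return (snap(x, frame_w, align), snap(y, frame_h, align),
            snap(w, frame_w, align), snap(h, frame_h, align))


# Window runs off the bottom-right corner: the offsets are pulled back.
print(coerce_crop(0.75, 0.75, 0.5, 0.5, 1920, 1080))    # (960, 540, 960, 540)
# Window is too small: it is enlarged, then the height is aligned down.
print(coerce_crop(0.25, 0.25, 0.05, 0.05, 1920, 1080))  # (480, 270, 240, 134)
```

The returned rectangle is the kind of value that could be reported back in the request metadata, so the application can see what was actually applied.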
>
> Finally, I'm sure there are other questions and things I've
> overlooked. What do people think?
>
> Best regards
> David

