[libcamera-devel] Cropping (aka. digital zoom)

David Plowman david.plowman at raspberrypi.com
Mon Jun 22 16:18:19 CEST 2020


Hi Naush

On Fri, 19 Jun 2020 at 13:27, Naushir Patuck <naush at raspberrypi.com> wrote:
>
> Hi David,
>
> On Fri, 19 Jun 2020 at 12:36, David Plowman
> <david.plowman at raspberrypi.com> wrote:
> >
> > Hi everyone
> >
> > Another thing I'd like to discuss is the ability to crop a region of
> > interest from a camera image. This is another feature that will be
> > available in many camera systems but which is not currently possible
> > within libcamera.
> >
> > Just to recap, cropping specifies a window in the input image, and
> > everything outside this window is discarded. However, the used portion
> > of the image is still scaled to the same final output size, making
> > this a natural mechanism for implementing digital zoom.
> >
> > I would therefore like to propose a new control consisting of four
> > numbers:
> >
> > * The first two are the x and y offsets of the crop window within the
> >   input image. Having these parameters allows an application to pan
> >   around within the input image.
> >
> > * The next two would be the width and height of the crop window.
> >
> > I believe it's sensible for these numbers to be ratios, rather than
> > (for example) pixels. This makes it much easier for an application to
> > use. To specify that you want the middle quarter of an image ("2x
> > digital zoom"), you'd pass the numbers 0.25, 0.25, 0.5 and 0.5. Note
> > how setting width == height guarantees that you don't mess up your
> > aspect ratio.
>
> One question I have, what reference would these ratios be based on?
> Typically you would choose the dimensions of the input frame to the
> ISP.  However, we currently do a "hidden" crop to fix aspect ratio
> when, for example, we want to output 16:9 given a 4:3 input frame.  So
> would the ratio reference this hidden crop?  If no, then the
> application must have knowledge of the input frame size (it currently
> does not), and must crop correctly to adjust aspect ratios.  If yes,
> then we will be hiding a portion of the input frame that the
> application will never be able to pan into, but maybe that's not a
> problem?

I think the ratios would be relative to the image after we've done that
"hidden" crop for the aspect ratio. Anything else would seem bizarre and
devious. You'd wind up being able to pan up and down just because you're
cropping 16:9 from a 4:3 sensor mode. It could be amusing, but I don't
think the average user would be expecting it...!
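
To make the ordering concrete, here is a rough sketch of how I picture it
(Python purely for illustration; the function names and the 4096x3072 mode
are made up for the example and are not libcamera API). The aspect ratio
crop is worked out first, and the four ratios are then resolved against
that window rather than against the full input frame:

```python
def aspect_crop(in_w, in_h, out_w, out_h):
    """Largest centred window in the input that has the output's aspect ratio."""
    if in_w * out_h > in_h * out_w:
        # Input is wider than the output: trim the sides.
        w = in_h * out_w // out_h
        h = in_h
    else:
        # Input is taller than (or the same shape as) the output: trim top/bottom.
        w = in_w
        h = in_w * out_h // out_w
    return ((in_w - w) // 2, (in_h - h) // 2, w, h)


def resolve_crop(hidden, x, y, w, h):
    """Map ratio values (0.0 to 1.0) onto the hidden crop window, in pixels."""
    hx, hy, hw, hh = hidden
    return (hx + round(x * hw), hy + round(y * hh), round(w * hw), round(h * hh))


# Hypothetical 4:3 sensor mode feeding a 16:9 output.
hidden = aspect_crop(4096, 3072, 1920, 1080)
print(hidden)                                      # (0, 384, 4096, 2304)
print(resolve_crop(hidden, 0.25, 0.25, 0.5, 0.5))  # (1024, 960, 2048, 1152)
```

With this ordering the 384 rows above and below the 16:9 window are never
reachable by panning, which is the behaviour described above.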

>
> >
> > Questions
> >
> > 1. How to represent the numbers?
> >
> > I see an existing Rectangle class that gets used within controls, so
> > maybe that makes sense. I'd perhaps go with fixed point values such
> > that 65536 is equivalent to 1.0? (Though that doesn't leave huge scope
> > for sub-pixel cropping, where platforms support this...)
>
> Would it be better to have a floating point equivalent of the
> Rectangle class so we can use floats here?

Well, true. On the other hand, I sense that floats are not universally
loved, and the Rectangle is "just there"...!
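
For reference, the fixed point version I have in mind would look roughly
like this (again just an illustrative Python sketch; the helper names are
invented and the 4096 pixel width is an assumption for the example). Each
of the four values is an integer where 65536 means 1.0:

```python
ONE = 1 << 16  # 65536 represents 1.0


def to_fixed(ratio):
    """Convert a ratio such as 0.25 to its fixed point representation."""
    return round(ratio * ONE)


def from_fixed(value):
    """Convert a fixed point value back to a float ratio."""
    return value / ONE


def fixed_to_pixels(value, extent):
    """Scale a fixed point ratio by a pixel extent, rounding to the nearest pixel."""
    return (value * extent + ONE // 2) >> 16


# The "2x digital zoom" example as four fixed point numbers.
crop = tuple(to_fixed(r) for r in (0.25, 0.25, 0.5, 0.5))
print(crop)                            # (16384, 16384, 32768, 32768)
print(fixed_to_pixels(crop[2], 4096))  # 2048

# Smallest representable step on a 4096 pixel wide image, in pixels.
print(from_fixed(1) * 4096)            # 0.0625
```

The last line shows the sub-pixel limitation: on a 4096 pixel wide image
the finest step is a sixteenth of a pixel, and it gets coarser as the
image gets larger.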

Best regards
David

>
> Regards,
> Naush
>
>
> > 2. Are there devices that can't pan within the input image?
> >
> > How would they signal this? I guess reporting min and max
> > rectangles with identical x and y values would cover this.
> >
> > 3. Minimum and maximum values?
> >
> > A platform could also report "reasonable" minimum width and height
> > values that indicate roughly what is supported. Values of 1.0 here
> > (65536) would mean that there is no zoom capability.
> >
> > Valid maximum widths and heights of course depend on x and y (or
> > vice versa). Perhaps we just report 1.0 here.
> >
> > 4. How to handle awkward cases or requests?
> >
> > I think we have to allow implementations silently to coerce the
> > requested values into the "closest" thing that works. This might
> > involve changing offsets (for example) to accommodate the requested
> > width/height, or even adjusting the crop slightly to satisfy
> > things like pixel alignment requirements.
> >
> > Perhaps the metadata returned in a request can indicate a value closer
> > to what was actually used.
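
As an illustration of the kind of coercion meant here, a minimal sketch
follows (Python for illustration only; the function name, the 64 pixel
minimum and the alignment of 2 are all assumptions for the example, not
anything a real platform is known to require). It clamps the size first,
moves the offsets to fit, and returns the window that would actually be
used, which is what could be reported back in the metadata:

```python
def coerce_crop(x, y, w, h, full_w, full_h, min_w=64, min_h=64, align=2):
    """Coerce a requested pixel crop into a nearby window that fits the image.

    Assumes min_w and min_h are multiples of align and no larger than the image.
    """
    # Clamp the size to the supported range, then round down to the alignment.
    w = max(min_w, min(w, full_w)) // align * align
    h = max(min_h, min(h, full_h)) // align * align
    # Shift the offsets so the window stays inside the image, then align them.
    x = max(0, min(x, full_w - w)) // align * align
    y = max(0, min(y, full_h - h)) // align * align
    return (x, y, w, h)


# Request runs off the right-hand edge: the offset is pulled back to fit.
print(coerce_crop(3000, 100, 2048, 1152, 4096, 2304))  # (2048, 100, 2048, 1152)
# Oversized width, undersized height, negative and odd offsets.
print(coerce_crop(-5, 7, 5000, 33, 4096, 2304))        # (0, 6, 4096, 64)
```
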
> >
> > Finally, I'm sure there are other questions and things I've
> > overlooked. What do people think?
> >
> > Best regards
> > David
> > _______________________________________________
> > libcamera-devel mailing list
> > libcamera-devel at lists.libcamera.org
> > https://lists.libcamera.org/listinfo/libcamera-devel

