[libcamera-devel] Cropping (aka. digital zoom)

Fri Jun 19 13:36:45 CEST 2020

Hi everyone

Another thing I'd like to discuss is the ability to crop a region of
interest from a camera image. This is another feature that will be
available in many camera systems but which is not currently possible
within libcamera.

Just to recap, cropping specifies a window in the input image, and
everything outside this window is discarded. The retained portion of
the image is still scaled to the same final output size, which makes
cropping a natural mechanism for implementing digital zoom.

I would therefore like to propose a new control consisting of four
numbers:

* The first two are the x and y offsets of the crop window within the
  input image. Having these parameters allows an application to pan
  around within the input image.

* The next two would be the width and height of the crop window.

I believe it's sensible for these numbers to be ratios of the input
image dimensions, rather than (for example) pixels. This makes the
control much easier for an application to use. To specify that you
want the middle quarter of an image ("2x digital zoom"), you'd pass
the numbers 0.25, 0.25, 0.5 and 0.5. Note how setting the width and
height ratios to the same value guarantees that the crop keeps the
aspect ratio of the input image.
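
To make the ratio idea concrete, here is a rough sketch of how such a
crop might be mapped onto a particular input image. The CropRatio and
PixelRect types and the toPixels() helper are made up for this example
and are not existing libcamera API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types for illustration only; not libcamera API.
struct CropRatio { double x, y, width, height; };
struct PixelRect { int x, y, width, height; };

// Map a ratio-based crop onto an input image of the given pixel size,
// rounding each value to the nearest pixel.
PixelRect toPixels(const CropRatio &c, int imageWidth, int imageHeight)
{
    // The window must be non-empty and lie inside the input image.
    assert(c.x >= 0.0 && c.y >= 0.0 && c.width > 0.0 && c.height > 0.0);
    assert(c.x + c.width <= 1.0 && c.y + c.height <= 1.0);

    return {
        static_cast<int>(std::lround(c.x * imageWidth)),
        static_cast<int>(std::lround(c.y * imageHeight)),
        static_cast<int>(std::lround(c.width * imageWidth)),
        static_cast<int>(std::lround(c.height * imageHeight)),
    };
}
```

With a 1920x1080 input, the "2x digital zoom" values 0.25, 0.25, 0.5
and 0.5 map to a 960x540 window at offset (480, 270), which is still
16:9.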

Questions

1. How to represent the numbers?

I see an existing Rectangle class that gets used within controls, so
maybe that makes sense. I'd perhaps go with fixed point values such
that 65536 is equivalent to 1.0? (Though that doesn't leave huge scope
for sub-pixel cropping, where platforms support this...)
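
As a sketch of what that fixed point representation could look like
(the helper names below are hypothetical, and the choice of 16
fractional bits simply follows the 65536 suggestion above):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// 1.0 in the proposed fixed point format (16 fractional bits).
constexpr uint32_t kOne = 65536;

// Convert a ratio in [0, 1] to fixed point, rounding to nearest.
uint32_t toFixed(double ratio)
{
    assert(ratio >= 0.0 && ratio <= 1.0);
    return static_cast<uint32_t>(std::lround(ratio * kOne));
}

// Convert a fixed point value back to a floating point ratio.
double fromFixed(uint32_t value)
{
    return static_cast<double>(value) / kOne;
}

// Scale a pixel dimension by a fixed point ratio, rounding to nearest.
// The 64-bit intermediate avoids overflow for large dimensions.
uint32_t scalePixels(uint32_t pixels, uint32_t ratio)
{
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(pixels) * ratio + kOne / 2) >> 16);
}
```

To put a number on the sub-pixel concern: for an input 4096 pixels
wide, one step in this format corresponds to 4096 / 65536 = 1/16 of a
pixel.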

2. Are there devices that can't pan within the input image?

How would they signal this? I guess reporting min and max
rectangles with identical x and y values would cover this.

3. Minimum and maximum values?

A platform could also report "reasonable" minimum width and height
values that indicate roughly what is supported. Values of 1.0 here
(65536) would mean that there is no zoom capability.

Valid maximum widths and heights of course depend on x and y (or
vice versa), since the window has to stay inside the input image.
Perhaps we just report 1.0 here.
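
Under those conventions, an application could work out what a platform
supports from the reported limits roughly as follows. This is only a
sketch: FixedRect and CropLimits are invented for the example, and
nothing here reflects how libcamera actually reports control limits.

```cpp
#include <cassert>
#include <cstdint>

// 1.0 in the proposed fixed point format.
constexpr uint32_t kOne = 65536;

// Hypothetical types for illustration only; not libcamera API.
struct FixedRect { uint32_t x, y, width, height; };
struct CropLimits { FixedRect min, max; };

// Identical minimum and maximum offsets mean the window cannot move.
bool canPan(const CropLimits &limits)
{
    return limits.min.x != limits.max.x || limits.min.y != limits.max.y;
}

// A minimum size of 1.0 in both directions means there is no zoom.
bool canZoom(const CropLimits &limits)
{
    return limits.min.width < kOne || limits.min.height < kOne;
}
```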

4. How to handle awkward cases or requests?

I think we have to allow implementations to silently coerce the
requested values into the "closest" thing that works. This might
involve changing offsets (for example) to accommodate the requested
width/height, or even adjusting the crop slightly to satisfy
things like pixel alignment requirements.

Perhaps the metadata returned with a completed request can report the
crop that was actually used, or at least a value closer to it.
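
To illustrate the kind of coercion I have in mind, here is a minimal
sketch working in pixels. The coerceCrop() function, its parameters
and the specific policy (keep the requested size where possible, move
the offsets to fit, then round down to the alignment) are assumptions
made for the example; a real pipeline would have its own constraints.
It uses std::clamp, so it needs C++17.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical type for illustration only; not libcamera API.
struct PixelRect { int x, y, width, height; };

// Round a non-negative value down to a multiple of align.
static int alignDown(int value, int align)
{
    return value - value % align;
}

// Coerce a requested crop into one that fits inside the image, is no
// smaller than the minimum size, and satisfies the pixel alignment.
// The size takes priority: offsets are moved to accommodate it.
PixelRect coerceCrop(const PixelRect &req, int imageWidth, int imageHeight,
                     int minWidth, int minHeight, int align)
{
    // Assumed preconditions: the minimum size is positive, aligned,
    // and no larger than the image.
    assert(align >= 1 && minWidth > 0 && minHeight > 0);
    assert(minWidth % align == 0 && minHeight % align == 0);
    assert(minWidth <= imageWidth && minHeight <= imageHeight);

    PixelRect out;
    out.width = alignDown(std::clamp(req.width, minWidth, imageWidth), align);
    out.height = alignDown(std::clamp(req.height, minHeight, imageHeight), align);

    // Shift the window back inside the image if it would overhang.
    out.x = alignDown(std::clamp(req.x, 0, imageWidth - out.width), align);
    out.y = alignDown(std::clamp(req.y, 0, imageHeight - out.height), align);
    return out;
}
```

The rectangle returned here is the sort of value that could be
reported back to the application in the metadata.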

Finally, I'm sure there are other questions and things I've
overlooked. What do people think?

Best regards
David