[libcamera-devel] [PATCH v3 1/2] libcamera: Infrastructure for digital zoom

David Plowman david.plowman at raspberrypi.com
Fri Jul 31 11:02:44 CEST 2020


Hi Jacopo, everyone

On Thu, 30 Jul 2020 at 17:34, Jacopo Mondi <jacopo at jmondi.org> wrote:
>
> Hi David,
>   sorry for jumping late on the topic, I managed to read the
> discussions in v1 and v2 only today.

Please don't apologise, I'm very glad that you've picked this up. This
is actually one of the "big ticket" items that libcamera is currently
lacking compared to our existing stack, so I was going to give the
discussion a "nudge" quite soon!

>
> On Thu, Jul 23, 2020 at 09:43:37AM +0100, David Plowman wrote:
> > These changes add a Digital Zoom control, taking a rectangle as its
> > argument, indicating the region of the sensor output that the
> > pipeline will "zoom up" to the final output size.
> >
> > Additionally, we need to have a method returning the "pipelineCrop"
> > which gives the dimensions of the sensor output, taken by the
> > pipeline, and within which we can subsequently pan and zoom.
>
> This is the part I don't get. What's the intended user of this
> information ? Is this meant to be provided to applications so that
> they know which is the area inside which they can later pan/zoom, right ?

Correct. I'm imagining a user might say "I want to see the image at 2x
zoom". If we know the unzoomed image is being created by (as per one
of my examples somewhere) 2028x1140 pixels from the sensor, then the
application would send a "digital zoom" with offset (507,285) and size
1014x570.

I think the feeling was that we liked having control right down to the
pixel level, which in turn means that the numbers 2028x1140 need to be
supplied to the application from somewhere.

>
> I'm kind of against ad-hoc functions to retrieve information about
> the image processing pipeline configuration. I can easily see them
> multiplying and making our API awful, and each platform would need to
> add its own very-specific bits, which certainly can't go through the
> main Camera class API.
>
> If I got your use case right:
>
> 1) You need to report the pipeline crop area, which is a rectangle
>    defined on the frame that is produced by the sensor.
>    Using your example:
>
>         sensorSize = 1920x1440
>         pipelineCrop = 1920x1080
>
>    It seems to me the 'pipelineCrop' is a good candidate to be a
>    CameraConfiguration property, which at the moment is not yet
>    implemented. Roughly speaking, we have been discussing it in the
>    context of frame durations, and it's about augmenting the
>    CameraConfiguration class with a set of properties that depend on
>    the currently applied configuration of the enabled streams.
>
>    Ideally we should be able to receive CameraConfiguration, validate
>    it, and populate it with properties, like the minimum frame
>    duration (which is a combination of durations of all streams), and
>    now with this new 'pipelineCrop' which is the result of a decision
>    the pipeline handler takes based on the requested output stream
>    sizes.
>
>    I think the ideal plan forward for this would be to define a new
>    control/property (I would say property as it's immutable from
>    applications) that is set by pipeline handlers during validate()
>    when the sensor mode is selected and eventually cropped.

Absolutely. The problem with this has never been how to calculate the
"pipelineCrop", or who calculates it (that's the pipeline handler),
but how to signal this value to the application. When can I have this
feature, please? :)

>
>    Application can access it from the CameraConfiguration and decide
>    where to set their pan/zoom rectangle.
>
>    How the pipeline handler applies the cropping to the sensor
>    produced frame is less relevant imo. Systems which have direct access
>    to the v4l2 subdev will probably use the crop/compose rectangle
>    applied on the sensor. In your case I see you use the crop target
>    on the ISP video device node. Am I wrong or you could calculate
>    that at validate() time, once you know all the desired output sizes and
>    based on them identify the 'best' sensor mode to use ?

We update the selection on the ISP device node. I would expect this
approach to be more common than doing it on the sensor itself (you
don't lose statistics for the rest of the frame), but it's of course
up to the pipeline handlers.

>
> 2) Applications, once they know the 'pipelineCrop' (I'm ok with that
>    term btw, as we have analogCrop and we will have a digitalCrop for
>    the sensor's processing stages, so pipelineCrop feels right) can
>    set their pan/zoom rectangle with a control, as the one you have
>    defined below. Now, what do you think the reference of that rectangle
>    should be ? I mean, should the 'pipelineCrop' always be in position
>    (0,0) and the 'digitalZoom' be applied on top of it ? I would say
>    so, but then we lose the information on which part of the image
>    the pipeline decided to crop, and I'm not sure it is relevant or
>    not.

So I'm in favour of giving applications only the size of the pipeline
crop, nothing else. It's all you need to implement digital zoom, and
panning within the original image, and it makes life very
straightforward for everyone.

I agree there are more complex scenarios you could imagine. Maybe you
let applications pan/zoom within the whole of the sensor area, so
you'd be able to pan to parts of the image that you can't see in the
"default unzoomed" version. I mean, there's nothing "wrong" in this,
it just feels more complicated and a bit unexpected to me. I also
think this would delegate most of the aspect ratio calculations to the
applications, and I suspect they'd get themselves confused and
actually implement the simpler scheme anyway... But yes, there is a
decision to be made here.

>
>    The other possibility is to report the 'pipelineCrop' with respect
>    to the full active pixel array size, and ask applications to
>    provide a 'digitalZoom' rectangle with the same reference. Is it
>    worth the additional complication in your opinion?

Sorry, answered that one above!

>
>    A third option is to report the sensor frame size AND the pipeline
>    crop defined with respect to it, and let applications provide a
>    digitalZoom rectangle defined with respect to the pipeline crop.
>    This would look like (numbers are random):
>         sensorFrameSize = {0, 0, 1920, 1440}
>         pipelineCrop = { 0, 150, 1920, 1080}
>         digitalZoom = {300, 300, 1280, 720}
>
>    This would allow applications to know that the sensor frame is
>    1920x1440, that the pipeline selected a region of 1920x1080 by
>    applying an offset of 150 pixels in the vertical direction, and
>    that the desired digital zoom is then applied on this last
>    rectangle (resulting in an effective (300,450) offset from the
>    full sensor frame).
>
> I won't bikeshed on names too much, just leave as a reference that
> Android reports the same property we are defining as pipelineCrop as
> 'scaler.cropRegion', as for them 'scaler' is used to report properties
> of the ISP processing pipeline.

I could go with "scalerCrop" too!

Best regards
David

>
> I'm sorry for the wall of text, I wish we had a blackboard :)
>
> Thanks
>   j
>
> >
> > Signed-off-by: David Plowman <david.plowman at raspberrypi.com>
> > ---
> >  include/libcamera/camera.h                    |  2 ++
> >  include/libcamera/internal/pipeline_handler.h |  4 +++
> >  src/libcamera/camera.cpp                      | 27 +++++++++++++++++++
> >  src/libcamera/control_ids.yaml                | 10 +++++++
> >  4 files changed, 43 insertions(+)
> >
> > diff --git a/include/libcamera/camera.h b/include/libcamera/camera.h
> > index 4d1a4a9..6819b8e 100644
> > --- a/include/libcamera/camera.h
> > +++ b/include/libcamera/camera.h
> > @@ -92,6 +92,8 @@ public:
> >       std::unique_ptr<CameraConfiguration> generateConfiguration(const StreamRoles &roles = {});
> >       int configure(CameraConfiguration *config);
> >
> > +     Size const &getPipelineCrop() const;
> > +
> >       Request *createRequest(uint64_t cookie = 0);
> >       int queueRequest(Request *request);
> >
> > diff --git a/include/libcamera/internal/pipeline_handler.h b/include/libcamera/internal/pipeline_handler.h
> > index 22e629a..5bfe890 100644
> > --- a/include/libcamera/internal/pipeline_handler.h
> > +++ b/include/libcamera/internal/pipeline_handler.h
> > @@ -89,6 +89,8 @@ public:
> >
> >       const char *name() const { return name_; }
> >
> > +     Size const &getPipelineCrop() const { return pipelineCrop_; }
> > +
> >  protected:
> >       void registerCamera(std::shared_ptr<Camera> camera,
> >                           std::unique_ptr<CameraData> data);
> > @@ -100,6 +102,8 @@ protected:
> >
> >       CameraManager *manager_;
> >
> > +     Size pipelineCrop_;
> > +
> >  private:
> >       void mediaDeviceDisconnected(MediaDevice *media);
> >       virtual void disconnect();
> > diff --git a/src/libcamera/camera.cpp b/src/libcamera/camera.cpp
> > index 69a1b44..f8b8ec6 100644
> > --- a/src/libcamera/camera.cpp
> > +++ b/src/libcamera/camera.cpp
> > @@ -793,6 +793,33 @@ int Camera::configure(CameraConfiguration *config)
> >       return 0;
> >  }
> >
> > +/**
> > + * \brief Return the size of the sensor image being used by the pipeline
> > + * to create the output.
> > + *
> > + * This method returns the size, in pixels, of the raw image read from the
> > + * sensor and which is used by the pipeline to form the output image(s)
> > + * (rescaling if necessary). Note that these values take account of any
> > + * cropping performed on the sensor output so as to produce the correct
> > + * aspect ratio. It would normally be necessary to retrieve these values
> > + * in order to calculate correct parameters for digital zoom.
> > + *
> > + * Example: a sensor mode may produce a 1920x1440 output image. But if an
> > + * application has requested a 16:9 image, the values returned here might
> > + * be 1920x1080 - the largest portion of the sensor output that provides
> > + * the correct aspect ratio.
> > + *
> > + * \context This function is \threadsafe. It will only return valid
> > + * (non-zero) values when the camera has been configured.
> > + *
> > + * \return The dimensions of the sensor image used by the pipeline.
> > + */
> > +
> > +Size const &Camera::getPipelineCrop() const
> > +{
> > +     return p_->pipe_->getPipelineCrop();
> > +}
> > +
> >  /**
> >   * \brief Create a request object for the camera
> >   * \param[in] cookie Opaque cookie for application use
> > diff --git a/src/libcamera/control_ids.yaml b/src/libcamera/control_ids.yaml
> > index 988b501..5a099d5 100644
> > --- a/src/libcamera/control_ids.yaml
> > +++ b/src/libcamera/control_ids.yaml
> > @@ -262,4 +262,14 @@ controls:
> >          In this respect, it is not necessarily aimed at providing a way to
> >          implement a focus algorithm by the application, rather an indication of
> >          how in-focus a frame is.
> > +
> > +  - DigitalZoom:
> > +      type: Rectangle
> > +      description: |
> > +        Sets the portion of the full sensor image, in pixels, that will be
> > +        used for digital zoom. That is, this part of the sensor output will
> > +        be scaled up to make the full size output image (and everything else
> > +        discarded). To obtain the "full sensor image" that is available, the
> >         method Camera::getPipelineCrop() should be called once the camera is
> > +        configured. An application may pan and zoom within this rectangle.
> >  ...
> > --
> > 2.20.1
> >
> > _______________________________________________
> > libcamera-devel mailing list
> > libcamera-devel at lists.libcamera.org
> > https://lists.libcamera.org/listinfo/libcamera-devel
