[libcamera-devel] [PATCH v3 1/2] libcamera: Infrastructure for digital zoom
Jacopo Mondi
jacopo at jmondi.org
Thu Jul 30 18:38:04 CEST 2020
Hi David,
sorry for jumping in late on this topic; I only managed to read the
v1 and v2 discussions today.
On Thu, Jul 23, 2020 at 09:43:37AM +0100, David Plowman wrote:
> These changes add a Digital Zoom control, taking a rectangle as its
> argument, indicating the region of the sensor output that the
> pipeline will "zoom up" to the final output size.
>
> Additionally, we need to have a method returning the "pipelineCrop"
> which gives the dimensions of the sensor output, taken by the
> pipeline, and within which we can subsequently pan and zoom.
This is the part I don't get. Who is the intended user of this
information? Is it meant to be provided to applications so that
they know the area inside which they can later pan/zoom?
I'm rather against ad-hoc functions to retrieve information about
the image processing pipeline configuration. I can easily see them
multiplying and making our API awful; each platform would need to add
its own very specific bits, and those certainly can't go through the
main Camera class API.
If I got your use case right:
1) You need to report the pipeline crop area, which is a rectangle
defined on the frame that is produced by the sensor.
Using your example:
sensorSize = 1920x1440
pipelineCrop = 1920x1080
It seems to me the 'pipelineCrop' is a good candidate to be a
CameraConfiguration property, which at the moment is not yet
implemented. Roughly speaking, we have been discussing it in the
context of frame durations, and it's about augmenting the
CameraConfiguration class with a set of properties that depend on
the currently applied configuration of the enabled streams.
Ideally we should be able to receive a CameraConfiguration, validate
it, and populate it with properties, such as the minimum frame
duration (which is a combination of the durations of all streams),
and now with this new 'pipelineCrop', which is the result of a
decision the pipeline handler takes based on the requested output
stream sizes.
I think the ideal plan forward for this would be to define a new
control/property (I would say property as it's immutable from
applications) that is set by pipeline handlers during validate()
when the sensor mode is selected and eventually cropped.
Applications can access it from the CameraConfiguration and decide
where to set their pan/zoom rectangle.
How the pipeline handler applies the cropping to the sensor-produced
frame is less relevant imo. Systems which have direct access
to the V4L2 subdev will probably use the crop/compose rectangles
applied on the sensor. In your case I see you use the crop target
on the ISP video device node. Am I wrong, or could you calculate
that at validate() time, once you know all the desired output sizes
and, based on them, identify the 'best' sensor mode to use?
2) Applications, once they know the 'pipelineCrop' (I'm ok with that
term btw, as we have analogCrop and we will have a digitalCrop for
the sensor's processing stages, so pipelineCrop feels right), can
set their pan/zoom rectangle with a control, like the one you have
defined below. Now, what do you think the reference of that rectangle
should be? I mean, should the 'pipelineCrop' always be in position
(0,0) and the 'digitalZoom' be applied on top of it? I would say
so, but then we lose the information on which part of the image
the pipeline decided to crop, and I'm not sure whether that is
relevant.
The other possibility is to report the 'pipelineCrop' with respect to
the full active pixel array size, and ask applications to provide a
'digitalZoom' rectangle with the same reference. Is it worth the
additional complication in your opinion?
The third option is to report the sensor frame size AND the pipeline
crop defined with respect to it, and let applications provide a
digitalZoom rectangle defined with respect to the pipeline crop.
This would look like (numbers are random):
sensorFrameSize = {0, 0, 1920, 1440}
pipelineCrop = {0, 150, 1920, 1080}
digitalZoom = {300, 300, 1280, 720}
This would allow applications to know that the sensor frame is
1920x1440, that the pipeline selected a 1920x1080 region by applying
an offset of 150 pixels in the vertical direction, and that the
desired digital zoom is then applied on this last rectangle
(resulting in an effective (300,450) offset from the full sensor
frame).
I won't bikeshed on names too much, just leaving as a reference that
Android reports the same property we are defining as pipelineCrop as
'android.scaler.cropRegion', as for them 'scaler' is used to report
properties of the ISP processing pipeline.
I'm sorry for the wall of text, I wish we had a blackboard :)
Thanks
j
>
> Signed-off-by: David Plowman <david.plowman at raspberrypi.com>
> ---
> include/libcamera/camera.h | 2 ++
> include/libcamera/internal/pipeline_handler.h | 4 +++
> src/libcamera/camera.cpp | 27 +++++++++++++++++++
> src/libcamera/control_ids.yaml | 10 +++++++
> 4 files changed, 43 insertions(+)
>
> diff --git a/include/libcamera/camera.h b/include/libcamera/camera.h
> index 4d1a4a9..6819b8e 100644
> --- a/include/libcamera/camera.h
> +++ b/include/libcamera/camera.h
> @@ -92,6 +92,8 @@ public:
> std::unique_ptr<CameraConfiguration> generateConfiguration(const StreamRoles &roles = {});
> int configure(CameraConfiguration *config);
>
> + Size const &getPipelineCrop() const;
> +
> Request *createRequest(uint64_t cookie = 0);
> int queueRequest(Request *request);
>
> diff --git a/include/libcamera/internal/pipeline_handler.h b/include/libcamera/internal/pipeline_handler.h
> index 22e629a..5bfe890 100644
> --- a/include/libcamera/internal/pipeline_handler.h
> +++ b/include/libcamera/internal/pipeline_handler.h
> @@ -89,6 +89,8 @@ public:
>
> const char *name() const { return name_; }
>
> + Size const &getPipelineCrop() const { return pipelineCrop_; }
> +
> protected:
> void registerCamera(std::shared_ptr<Camera> camera,
> std::unique_ptr<CameraData> data);
> @@ -100,6 +102,8 @@ protected:
>
> CameraManager *manager_;
>
> + Size pipelineCrop_;
> +
> private:
> void mediaDeviceDisconnected(MediaDevice *media);
> virtual void disconnect();
> diff --git a/src/libcamera/camera.cpp b/src/libcamera/camera.cpp
> index 69a1b44..f8b8ec6 100644
> --- a/src/libcamera/camera.cpp
> +++ b/src/libcamera/camera.cpp
> @@ -793,6 +793,33 @@ int Camera::configure(CameraConfiguration *config)
> return 0;
> }
>
> +/**
> + * \brief Return the size of the sensor image being used by the pipeline
> + * to create the output.
> + *
> + * This method returns the size, in pixels, of the raw image read from the
> + * sensor and which is used by the pipeline to form the output image(s)
> + * (rescaling if necessary). Note that these values take account of any
> + * cropping performed on the sensor output so as to produce the correct
> + * aspect ratio. It would normally be necessary to retrieve these values
> + * in order to calculate correct parameters for digital zoom.
> + *
> + * Example: a sensor mode may produce a 1920x1440 output image. But if an
> + * application has requested a 16:9 image, the values returned here might
> + * be 1920x1080 - the largest portion of the sensor output that provides
> + * the correct aspect ratio.
> + *
> + * \context This function is \threadsafe. It will only return valid
> + * (non-zero) values when the camera has been configured.
> + *
> + * \return The dimensions of the sensor image used by the pipeline.
> + */
> +
> +Size const &Camera::getPipelineCrop() const
> +{
> + return p_->pipe_->getPipelineCrop();
> +}
> +
> /**
> * \brief Create a request object for the camera
> * \param[in] cookie Opaque cookie for application use
> diff --git a/src/libcamera/control_ids.yaml b/src/libcamera/control_ids.yaml
> index 988b501..5a099d5 100644
> --- a/src/libcamera/control_ids.yaml
> +++ b/src/libcamera/control_ids.yaml
> @@ -262,4 +262,14 @@ controls:
> In this respect, it is not necessarily aimed at providing a way to
> implement a focus algorithm by the application, rather an indication of
> how in-focus a frame is.
> +
> + - DigitalZoom:
> + type: Rectangle
> + description: |
> + Sets the portion of the full sensor image, in pixels, that will be
> + used for digital zoom. That is, this part of the sensor output will
> + be scaled up to make the full size output image (and everything else
> + discarded). To obtain the "full sensor image" that is available, the
> method Camera::getPipelineCrop() should be called once the camera is
> + configured. An application may pan and zoom within this rectangle.
> ...
> --
> 2.20.1
>
> _______________________________________________
> libcamera-devel mailing list
> libcamera-devel at lists.libcamera.org
> https://lists.libcamera.org/listinfo/libcamera-devel