[libcamera-devel] [PATCH 0/1] Proposal of mapping between camera configurations and requested configurations

Jacopo Mondi jacopo at jmondi.org
Tue Sep 1 18:09:36 CEST 2020


Hi Hiro,
   first of all I'm very sorry for the unacceptable delay in giving
you a reply.

If that's of any consolation we have not ignored your email, but it
has gone through several internal discussions, as it came at the
time when the JPEG support was being merged and the two things
collided a bit. Add a small delay due to leave, and here you have a
month of delay. Again, we're really sorry for this.

> On Thu, Aug 06, 2020 at 03:17:05PM +0900, Hirokazu Honda wrote:
> This is a proposal about how to map camera configurations and
> requested configurations in Android Camera HAL adaptation layer.
> Please also see the sample code in the following patch.
>
> # Software Stream Processing in libcamera
>
> _hiroh at chromium.org / Draft: 2020-08-06_
>
>

As an initial and unrelated note looking at the patch, I can see you
are following the ChromeOS coding style. Please note that libcamera
has its own coding style, which you can find documented at

- https://www.libcamera.org/coding-style.html#coding-style-guidelines

And we have a style checker, which can assist with this. The best way to
use the style checker is to install it as a git-hook.

I understand that this is an RFC, but we will need this style to be
followed to be able to integrate any future patches.

>
> # Objective
>
> Perform frame processing in libcamera to achieve requested stream
> configurations that are not supported natively by the camera
> hardware, but required by the Android Camera HAL interface.
>

As you can see in the camera_device.cpp file, we have tried to list
the resolutions and image formats that the Android Camera3
specification lists as mandatory or suggested.

Do you have a list of additional requirements to add ?
Are there ChromeOS specific requirements ?
Or is this meant to fulfil the above-stated requirements on
platforms that cannot satisfy them ?

>
> # Background
>
>
> ### Libcamera
>
> In addition to its native API, libcamera[^1] provides a number of
> camera APIs, for example, V4L2 Webcam API and Android Camera HAL3.
> The platform-specific implementations are wrapped in the libcamera
> core, and a caller of libcamera doesn't have to care about the
> platform.
>
>
> ### Android Camera HAL
>
> The Chrome OS camera stack uses the Android Camera HAL[^2]
> interface. Libcamera provides the Android Camera HAL through an
> adaptation layer[^3] between the libcamera core and the Android
> HAL, which is called the Android HAL adaptation layer in this
> document.
>
> To present a uniform set of capabilities to the API users, the
> Android Camera HAL API[^4] allows the caller to request stream
> configurations that are beyond the device capabilities. For
> example, while a camera device is able to produce only a single
> stream, a HAL caller may request three streams with possibly
> different resolutions (PRIV, YUV, JPEG). However, the libcamera
> core implementation only produces streams the camera is capable of.
> Therefore, we have to create three streams from the single stream
> produced by libcamera.
>
> Requests beyond the device capability are supported only in the
> Android HAL at this moment. This document describes a design where
> the stream processing is performed in the Android HAL adaptation
> layer.
>
>
> # Overview
>
>
> ## Current implementation
>
> The requested stream configuration is given by
> _camera3_device_t->ops->configure_streams()_ in Android Camera HAL.
> This delegates to CameraDevice::configureStreams()[^5] in
> libcamera. The current implementation attempts all the given
> configurations and succeeds if and only if the camera device can
> produce them without any adjustment.
>
>
> ### libcamera::CameraConfiguration
>
> It is CameraConfiguration[^6] that judges whether adjustments are
> required, or whether the requested configurations are infeasible
> altogether.
>
> The configuration procedure is that CameraDevice
>
> 1. Adds every configuration with
>    CameraConfiguration::addConfiguration().
> 2. Validates (and possibly adjusts) the added configurations with
>    CameraConfiguration::validate().
>
> CameraConfiguration, especially for validate(), is implemented per
> pipeline. For instance, the CameraConfiguration implementation for
> IPU3 is IPU3CameraConfiguration[^7].
>
> validate() returns one of the following:
>
> *   Valid
>     *   The camera can produce streams with the requested
>         configurations.
> *   Adjusted
>     *   The camera cannot produce streams with the requested
>         configurations as-is, but can produce streams with
>         different pixel formats or resolutions.
> *   Invalid
>     *   The camera cannot produce streams with either the requested
>         configurations or different pixel formats and resolutions.
>         For instance, this is presumably returned when a resolution
>         larger than the maximum supported one is requested.
>
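For reference, mostly for other readers of this thread, the flow
described above maps roughly onto the following use of the native
API. This is a minimal sketch with error handling left out, not the
actual CameraDevice::configureStreams() implementation:

#include <cerrno>
#include <memory>

#include <libcamera/camera.h>
#include <libcamera/formats.h>
#include <libcamera/stream.h>

using namespace libcamera;

/* Minimal sketch of the add-then-validate flow described above. */
int configureExample(std::shared_ptr<Camera> camera)
{
        /* Start from an empty configuration... */
        std::unique_ptr<CameraConfiguration> config =
                camera->generateConfiguration();

        /* ...add one entry per stream requested by the HAL caller... */
        StreamConfiguration cfg;
        cfg.size = { 1280, 720 };
        cfg.pixelFormat = formats::NV12;
        config->addConfiguration(cfg);

        /* ...and let the pipeline-specific validate() sort it out. */
        switch (config->validate()) {
        case CameraConfiguration::Valid:
                break;
        case CameraConfiguration::Adjusted:
                /* The entries now hold the closest supported settings. */
                break;
        case CameraConfiguration::Invalid:
                return -EINVAL;
        }

        return camera->configure(config.get());
}
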
> What we need to resolve, when Adjusted is returned, is how to map
> the adjusted camera streams to the requested streams and which
> processing is required.
>
>
> ## Stream processing
>
> The processing steps to be considered are the following.
>
>
>
> *   Down-scaling
>     *   We don't perform up-scaling because it affects stream
>         quality.
>     *   Down-scaling is allowed only when it preserves the aspect
>         ratio, to avoid producing distorted frames. For instance,
>         scaling from 1280x720 (16:9) to 480x360 (4:3) is not
>         allowed.
> *   Cropping
>     *   Cropping is executed only to change the frame aspect ratio.
>         Thus it must be done after down-scaling, if required. For
>         example, to convert 1280x720 to 480x360, first down-scale
>         to 640x360 and then crop to 480x360.
> *   Format conversion
>     *   Pixel format conversion
>     *   JPEG encoding
>
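Just to illustrate the arithmetic of the "scale on the same aspect
ratio first, then crop" rule above, a small standalone example (not
HAL code) could compute the intermediate size like this:

#include <algorithm>
#include <cstdio>

#include <libcamera/geometry.h>

/*
 * Illustration of the "down-scale on the same aspect ratio first, then
 * crop" rule: 1280x720 -> 480x360 goes through 640x360.
 */
libcamera::Size scaleBeforeCrop(const libcamera::Size &src,
                                const libcamera::Size &dst)
{
        /* Scale by the larger ratio so both dimensions cover the target. */
        double ratio = std::max(static_cast<double>(dst.width) / src.width,
                                static_cast<double>(dst.height) / src.height);

        return { static_cast<unsigned int>(src.width * ratio + 0.5),
                 static_cast<unsigned int>(src.height * ratio + 0.5) };
}

int main()
{
        libcamera::Size intermediate = scaleBeforeCrop({ 1280, 720 },
                                                       { 480, 360 });
        std::printf("scale to %ux%u, then crop to 480x360\n",
                    intermediate.width, intermediate.height);
        return 0;
}
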
>
> # Proposal
>
> Basically we only need to consider a mapping algorithm after
> validate(). However, to require less processing and obtain better
> stream quality, we should reorder the given configurations within
> validate().


The way the HAL layer works (and admittedly something has changed
since the recent merge of the JPEG support) is slightly more complex,
and boils down to the following steps:

1) Build the list of supported configurations

When a CameraDevice is initialized, a list of supported stream
configurations is built, in order to be able to report to Android
what it can ask for. See CameraDevice::initializeStreamConfigurations().

We currently report the libcamera::Camera supported formats and
sizes, plus the additional JPEG streams which are produced in the HAL.
This creates the first distinction between HAL-only streams and
libcamera streams, which you correctly identified in your summary.

Here, as we do (naively at the moment) for JPEG, you should inspect
the libcamera streams and pass them through your code that infers
what kind of HAL-only streams can be produced from the available
libcamera ones. If I'm not mistaken, Android only asks for stream
combinations reported through the
ANDROID_SCALER_AVAILABLE_STREAM_CONFIGURATIONS_OUTPUT metadata, and
if you do not augment that list at initialization time, you won't
ever be asked for non-native streams later.
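To make this a bit more concrete, such an augmentation step could
look roughly like the sketch below. The HalStreamConfiguration type,
the deriveHalOnlySizes() rule and the listed resolutions are all
illustrative assumptions, not existing HAL code:

#include <vector>

#include <libcamera/geometry.h>

/*
 * Illustrative sketch: for each natively supported (format, size) entry,
 * derive the HAL-produced entries to be reported alongside it in the
 * ANDROID_SCALER_AVAILABLE_STREAM_CONFIGURATIONS_OUTPUT metadata.
 * HalStreamConfiguration and deriveHalOnlySizes() are assumed names.
 */
struct HalStreamConfiguration {
        libcamera::Size resolution;
        int androidFormat;
        bool nativelySupported;         /* false for HAL-produced streams */
};

/*
 * Example rule: report the standard resolutions a native size can cover
 * by down-scaling/cropping (the list here is a placeholder).
 */
std::vector<libcamera::Size> deriveHalOnlySizes(const libcamera::Size &native)
{
        static const std::vector<libcamera::Size> standardSizes = {
                { 320, 240 }, { 640, 480 }, { 1280, 720 }, { 1920, 1080 },
        };

        std::vector<libcamera::Size> sizes;
        for (const libcamera::Size &size : standardSizes) {
                if (size.width <= native.width && size.height <= native.height)
                        sizes.push_back(size);
        }

        return sizes;
}

void augmentStreamConfigurations(std::vector<HalStreamConfiguration> &configs,
                                 const HalStreamConfiguration &native)
{
        for (const libcamera::Size &size : deriveHalOnlySizes(native.resolution))
                configs.push_back({ size, native.androidFormat, false });
}
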

2) Camera configuration

That's the part you focused on, and a good part of what you wrote
could indeed be used to move forward.

The problem here can be summarized as: 'for each stream Android
requested, the ones that cannot be natively produced by the
libcamera::Camera shall be mapped to the closest possible native
stream' (and here we could apply your implementation that identifies
the 'best matching' stream).
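As a sketch of what that 'best matching' selection could look like
(the selection criterion below is only an assumption; your proposal
may well use a different one):

#include <limits>
#include <vector>

#include <libcamera/geometry.h>
#include <libcamera/stream.h>

using namespace libcamera;

/*
 * Sketch of a "closest native stream" selection: prefer the smallest
 * native configuration that is at least as large as the request, so only
 * down-scaling and cropping are ever needed. The cost function is an
 * assumption.
 */
const StreamConfiguration *
closestNativeConfig(const Size &requested,
                    const std::vector<StreamConfiguration> &native)
{
        const StreamConfiguration *best = nullptr;
        unsigned long long bestArea =
                std::numeric_limits<unsigned long long>::max();

        for (const StreamConfiguration &cfg : native) {
                if (cfg.size.width < requested.width ||
                    cfg.size.height < requested.height)
                        continue;       /* Would require up-scaling, skip. */

                unsigned long long area =
                        static_cast<unsigned long long>(cfg.size.width) *
                        cfg.size.height;
                if (area < bestArea) {
                        bestArea = area;
                        best = &cfg;
                }
        }

        return best;
}
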

Unfortunately the problem breaks down into several others:

1) How to identify whether a stream is a native or a HAL-only one ?
Currently we get away with a trivial "if (!JPEG)" check, as all the
non-JPEG streams are native ones. This should be made smarter.
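A smarter check could, for instance, look the requested format/size
pair up in the list of native configurations built at initialization
time. NativeStreamConfiguration and nativeConfigs below are assumed
names, not existing HAL code:

#include <algorithm>
#include <vector>

#include <libcamera/geometry.h>
#include <libcamera/pixel_format.h>

/*
 * Sketch only: replace the "if (!JPEG)" special case with a lookup in
 * the table of natively supported configurations built at
 * initialization time.
 */
struct NativeStreamConfiguration {
        libcamera::PixelFormat format;
        libcamera::Size size;
};

bool isNativeStream(const std::vector<NativeStreamConfiguration> &nativeConfigs,
                    const libcamera::PixelFormat &format,
                    const libcamera::Size &size)
{
        return std::any_of(nativeConfigs.begin(), nativeConfigs.end(),
                           [&](const NativeStreamConfiguration &cfg) {
                                   return cfg.format == format &&
                                          cfg.size == size;
                           });
}
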

2) How to best map HAL streams to libcamera streams. Assume we
receive a request for two YUV streams in 1080p and 720p resolutions.
The libcamera::Camera claims to be able to support both, so we can
simply go and ask for those two streams. Then we receive a request
for the same streams plus a full-size JPEG one. What we have to do is
ask for the full-size YUV stream and use it to produce the JPEG, and
one 1080p YUV stream to produce both the 1080p and 720p YUV streams.
In that case we'll then have to down-scale one YUV stream, and
dedicate the full-size YUV one to JPEG. Alternatively we can produce
the 1080p stream from the same full-size YUV used to produce the
JPEG, and ask the camera for a 720p stream.
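Whichever option we pick, the outcome of this step is essentially a
table recording, for each Android stream, which libcamera stream it
is sourced from and what processing it needs. Purely as an
illustration of the first option above (the types, the Processing
enum and the full-size value are all assumptions):

#include <vector>

#include <libcamera/geometry.h>

/* All names below are illustrative assumptions, not existing HAL code. */
enum class Processing {
        None,           /* native stream passed through as-is */
        ScaleCrop,      /* down-scale and/or crop in the HAL */
        JpegEncode,     /* encode to JPEG in the HAL */
};

struct StreamMapping {
        unsigned int androidStreamIndex;   /* index in the Android stream list */
        unsigned int libcameraStreamIndex; /* index in the CameraConfiguration */
        Processing processing;
        libcamera::Size outputSize;
};

/* Arbitrary example value standing in for the sensor full resolution. */
const libcamera::Size fullSize{ 4096, 3072 };

/*
 * First option: one 1080p native stream feeds both YUV outputs, the
 * full-size native stream feeds the JPEG encoder.
 */
const std::vector<StreamMapping> exampleMapping = {
        { 0, 0, Processing::None,       { 1920, 1080 } },  /* 1080p YUV */
        { 1, 0, Processing::ScaleCrop,  { 1280, 720 } },   /* 720p YUV  */
        { 2, 1, Processing::JpegEncode, fullSize },        /* JPEG      */
};
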

Now, Android specifies some format/size requirements in the Camera3
specification, and I assume ChromeOS maybe has others. As we tried to
record the Camera3 requirements and satisfy them in the code, I
think the additional streams that are required should be somehow
listed first, in order to be able to create only the -required-
additional streams.

For an example, have a look at CameraDevice::camera3Resolutions and
CameraDevice::camera3FormatsMap, which encode the Camera3
specification requirements.
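If ChromeOS does have extra requirements, they could be recorded in a
similar table next to the existing ones, so that only the strictly
required additional streams get created. The shape below is only a
suggestion, with the entries intentionally left open:

#include <vector>

#include <libcamera/geometry.h>

/*
 * Suggested shape for recording additional (e.g. ChromeOS-specific)
 * stream requirements next to camera3Resolutions/camera3FormatsMap.
 * The entries are intentionally left to be filled in once the
 * requirements are formalized.
 */
struct AdditionalStreamRequirement {
        int androidFormat;              /* HAL_PIXEL_FORMAT_* value */
        libcamera::Size resolution;
        bool mandatory;
};

const std::vector<AdditionalStreamRequirement> additionalRequirements = {
        /* To be filled in. */
};
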

Once the additional requirements have been encoded, I would then
proceed to divide them into 3 categories (there might very well be
others):

  1) Format conversions: Convert from one pixel format to another,
     which is more or less what happens today with JPEG. We have an
     Encoder interface for that purpose and I guess a format converter
     should be implemented along the same lines, but that has to be
     discussed (see the sketch after this list).

  2) Down-scale/crop: Assuming it happens in the HAL, maybe using some
     external components, down-scaling/cropping produces additional
     resolutions from the list of natively supported ones. Given a
     powerful enough implementation we could produce ANY resolution <= a
     given native one, but that's not what we want, I guess. We shall
     establish a list of additional resolutions we want to report to the
     framework layer, and find out how to produce them from the native
     streams.

  3) Image transformations: A bit of a lateral issue, but I assume some
     'transformations' can be performed by HAL-only components. This
     mostly comes down to handling streams with some specific metadata
     associated, which needs to be handled in the HAL. The most trivial
     example is rotation. If the libcamera::Camera is for whatever reason
     unable to rotate the images, they have to be software-rotated in the
     HAL. This won't require any stream mapping, but rather inspecting
     metadata and passing the native streams through an additional
     processing layer.
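
Regarding point 1 above, a format converter interface modelled
loosely on the existing Encoder one could look like the following;
the exact signature is an assumption on my side and is of course up
for discussion:

#include <cstdint>

#include <libcamera/buffer.h>
#include <libcamera/span.h>
#include <libcamera/stream.h>

/*
 * Sketch of a format converter interface, loosely modelled on the HAL
 * Encoder interface: configure it once with the input and output stream
 * configurations, then convert frame by frame. The signature below is an
 * assumption, not existing code.
 */
class Converter
{
public:
        virtual ~Converter() = default;

        virtual int configure(const libcamera::StreamConfiguration &inCfg,
                              const libcamera::StreamConfiguration &outCfg) = 0;
        virtual int convert(const libcamera::FrameBuffer &source,
                            libcamera::Span<uint8_t> destination) = 0;
};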

3) Buffer allocation/handling:
   When performing any conversions between a HAL stream and a libcamera
   stream we may need to allocate an intermediate buffer to provide storage
   for processing the frame in libcamera, with the conversion entity
   reading from the libcamera buffer and writing into the Android buffer.
   This is likely possible with the existing FrameBufferAllocator classes,
   but may have extra requirements.
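
   For reference, the allocation itself would be along these lines;
   whether this is enough for the intermediate buffers (mappings,
   lifetime across requests, ...) is exactly what still needs to be
   verified:

#include <memory>
#include <vector>

#include <libcamera/buffer.h>
#include <libcamera/framebuffer_allocator.h>
#include <libcamera/stream.h>

using namespace libcamera;

/*
 * Sketch: allocate intermediate buffers for a libcamera stream whose
 * frames are post-processed in the HAL before being written into the
 * Android buffers. In real code the allocator must outlive the capture
 * session.
 */
int allocateIntermediateBuffers(FrameBufferAllocator &allocator, Stream *stream)
{
        int ret = allocator.allocate(stream);
        if (ret < 0)
                return ret;

        /*
         * The allocator owns the buffers; they are attached to capture
         * requests, and the conversion entity reads from them and writes
         * into the Android-provided buffer.
         */
        const std::vector<std::unique_ptr<FrameBuffer>> &buffers =
                allocator.buffers(stream);

        return buffers.size();
}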

My take is still that we should try to solve one problem at a time:

1) formalize the additional requirements that are not expressed by our
   CameraDevice::camera3Resolutions and CameraDevice::camera3FormatsMap
2) if no other requirements are necessary, identify a use case that
   cannot be satisfied by the current pipeline implementations we
   have. For example, a UVC camera that cannot produce NV12 and needs
   conversion might be a good start
3) address the buffer allocation issue, which I understand still
   needs to be worked on.

Sorry for the wall of text. Hope it helps.

Thanks
  j


