[libcamera-devel] [PATCH 0/1] Proposal of mapping between camera configurations and requested configurations

Tomasz Figa tfiga at chromium.org
Thu Sep 3 02:36:47 CEST 2020


Hi Jacopo,

On Tue, Sep 1, 2020 at 6:05 PM Jacopo Mondi <jacopo at jmondi.org> wrote:
>
> Hi Hiro,
>    first of all I'm very sorry for the unacceptable delay in giving
> you a reply.
>
> If that's of any consolation, we have not ignored your email, but it
> has gone through several internal discussions, as it came at the
> time when the JPEG support was being merged and the two things
> collided a bit. Add a small delay due to leave, and here you have a
> month of delay. Again, we're really sorry for this.
>
> > On Thu, Aug 06, 2020 at 03:17:05PM +0900, Hirokazu Honda wrote:
> > This is a proposal about how to map camera configurations and
> > requested configurations in Android Camera HAL adaptation layer.
> > Please also see the sample code in the following patch.
> >
> > # Software Stream Processing in libcamera
> >
> > _hiroh at chromium.org / Draft: 2020-08-06_
> >
> >
>
> As an initial and unrelated note, looking at the patch I can see you
> are following the ChromeOS coding style. Please note that libcamera
> has its own coding style, which you can find documented at
>
> - https://www.libcamera.org/coding-style.html#coding-style-guidelines
>
> And we have a style checker, which can assist with this. The best way to
> use the style checker is to install it as a git-hook.
>
> I understand that this is an RFC, but we will need this style to be
> followed to be able to integrate any future patches.
>
> >
> > # Objective
> >
> > Perform frame processing in libcamera to achieve requested stream
> > configurations that are not supported natively by the camera
> > hardware, but required by the Android Camera HAL interface.
> >
>
> As you can see in the camera_device.cpp file, we have tried to list
> the resolutions and image formats that the Android Camera3
> specification lists as mandatory or suggested.
>
> Do you have a list of additional requirements to add?
> Are there ChromeOS-specific requirements?
> Or is this meant to fulfill the above-stated requirements on
> platforms that cannot satisfy them?
>

There can be per-device resolutions that should be supported due to
product requirements. Our current HAL implementations use
configuration files which define the required configurations.

That said, I think it's an independent problem, which we can likely
ignore for now, and I believe what Hiro had in mind was the latter -
platforms that cannot satisfy them. This also includes the cases you
mentioned below, when a number of streams greater than the number of
native hardware streams is requested.

As usual, the Android Camera2 API documentation is the authoritative
source of information here:
https://developer.android.com/reference/android/hardware/camera2/CameraDevice.html#createCaptureSession(android.hardware.camera2.params.SessionConfiguration)

The tables lower on the page include required stream combinations for
various capability levels.

> >
> > # Background
> >
> >
> > ### Libcamera
> >
> > In addition to its native API, libcamera[^1] provides a number of
> > camera APIs, for example, the V4L2 Webcam API and Android Camera HAL3.
> > The platform-specific implementations are wrapped in the libcamera
> > core and a caller of libcamera doesn’t have to care about the
> > platform details.
> >
> >
> > ### Android Camera HAL
> >
> > The Chrome OS camera stack uses the Android Camera HAL[^2] interface.
> > libcamera provides the Android Camera HAL through an adaptation
> > layer[^3] between the libcamera core and the Android HAL, which is
> > called the Android HAL adaptation layer in this document.
> >
> > To present a uniform set of capabilities to the API users, the
> > Android Camera HAL API[^4] allows callers to request stream
> > configurations that are beyond the device capabilities. For example,
> > while a camera device may be able to produce only a single stream,
> > a HAL caller can request three streams with possibly different
> > resolutions (PRIV, YUV, JPEG). However, the libcamera core only
> > produces streams the camera is capable of. Therefore, we have to
> > create three streams from the single stream produced by libcamera.
> >
> > Requests beyond the device capability are supported only in the
> > Android HAL at the moment. This document describes a design where
> > the stream processing is performed in the Android HAL adaptation
> > layer.
> >
> >
> > # Overview
> >
> >
> > ## Current implementation
> >
> > The requested stream configuration is given by
> > _camera3_device_t->ops->configure_streams()_ in the Android Camera
> > HAL. This delegates to CameraDevice::configureStreams()[^5] in
> > libcamera. The current implementation attempts all the given
> > configurations and succeeds if and only if the camera device can
> > produce them without any adjustments.
> >
> >
> > ### libcamera::CameraConfiguration
> >
> > CameraConfiguration[^6] is what judges whether adjustments are
> > required, or whether the requested configurations are infeasible
> > altogether.
> >
> > The configuration procedure is that CameraDevice
> >
> > 1. adds every configuration with
> >    CameraConfiguration::addConfiguration(),
> > 2. validates the added configurations with
> >    CameraConfiguration::validate().
> >
> > CameraConfiguration, in particular validate(), is implemented per
> > pipeline. For instance, the CameraConfiguration implementation for
> > IPU3 is IPU3CameraConfiguration[^7].
> >
> > validate() returns one of the following:
> >
> > *   Valid
> >     *   The camera can produce streams with the requested
> >         configurations.
> > *   Adjusted
> >     *   The camera cannot produce streams with the requested
> >         configurations as-is, but can produce streams with different
> >         pixel formats or resolutions.
> > *   Invalid
> >     *   The camera cannot produce streams with either the requested
> >         configurations or different pixel formats and resolutions.
> >         For instance, is this returned when a resolution larger than
> >         the maximum supported one is requested?
> >
> > What we need to resolve is, when Adjusted is returned, how to map
> > the adjusted camera streams to the requested streams and what
> > processing is required.
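
For readers following along, the flow around validate() looks roughly
like the sketch below. This is written against the public libcamera
API and is not the actual camera_device.cpp code:

#include <errno.h>
#include <memory>

#include <libcamera/camera.h>
#include <libcamera/formats.h>
#include <libcamera/geometry.h>
#include <libcamera/stream.h>

using namespace libcamera;

int configureFromAndroidRequest(std::shared_ptr<Camera> camera)
{
        /* Start from a default configuration for a single stream. */
        std::unique_ptr<CameraConfiguration> config =
                camera->generateConfiguration({ StreamRole::Viewfinder });

        /* Overwrite it with what Android asked for. */
        StreamConfiguration &cfg = config->at(0);
        cfg.pixelFormat = formats::NV12;
        cfg.size = Size(1280, 720);

        switch (config->validate()) {
        case CameraConfiguration::Valid:
                /* The camera produces the stream natively. */
                break;
        case CameraConfiguration::Adjusted:
                /*
                 * cfg now holds what the camera can actually produce;
                 * the HAL has to map/process it into what Android
                 * requested.
                 */
                break;
        case CameraConfiguration::Invalid:
                return -EINVAL;
        }

        return camera->configure(config.get());
}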
> >
> >
> > ## Stream processing
> >
> > The processing steps to be considered are the following.
> >
> > *   Down-scaling
> >     *   We don't perform up-scaling because it degrades stream
> >         quality.
> >     *   Down-scaling is only done within the same aspect ratio to
> >         avoid producing distorted frames. For instance, scaling from
> >         1280x720 (16:9) to 480x360 (4:3) is not allowed.
> > *   Cropping
> >     *   Cropping is performed only to change the aspect ratio. Thus
> >         it must be done after down-scaling if required. For example,
> >         to convert 1280x720 to 480x360, first down-scale to 640x360
> >         and then crop to 480x360.
> > *   Format conversion
> >     *   Pixel format conversion
> >     *   JPEG encoding
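
To make the down-scale-then-crop rule above concrete: the intermediate
size can be computed by scaling the source so that it just covers the
target while keeping the source aspect ratio, and then cropping. A
small illustrative helper (not existing code), assuming libcamera::Size:

#include <algorithm>
#include <cmath>

#include <libcamera/geometry.h>

using libcamera::Size;

/*
 * Scale src so that it covers dst while keeping the aspect ratio of
 * src; the result is then cropped to dst. For 1280x720 -> 480x360
 * this returns 640x360.
 */
Size scaledToCover(const Size &src, const Size &dst)
{
        double scale = std::max(static_cast<double>(dst.width) / src.width,
                                static_cast<double>(dst.height) / src.height);
        return Size(std::lround(src.width * scale),
                    std::lround(src.height * scale));
}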
> >
> >
> > # Proposal
> >
> > Basically we only need to consider a mapping algorithm after
> > validate(). However, to reduce processing and obtain better stream
> > quality, we should reorder the given configurations within
> > validate().
>
> >
>
> The way the HAL layer works, and I agree something has changed since
> the recent merge of the JPEG support, is slightly more complex, and
> boils down to the following steps
>
> 1) Build the list of supported configurations
>
> When a CameraDevice is initialized, a list of supported stream
> configurations is built, in order to report to Android what it can
> request. See CameraDevice::initializeStreamConfigurations().
>
> We currently report the libcamera::Camera supported formats and
> sizes, plus additional JPEG streams which are produced in the HAL.
> This creates the first distinction between HAL-only-streams and
> libcamera-streams, which you correctly identified in your summary.
>
> Here, as we do (naively at the moment) for JPEG, you should inspect
> the libcamera-streams and pass them through your code that infers
> what kind of HAL-only-streams can be produced from the available
> libcamera ones. If I'm not mistaken Android only asks for stream
> combinations reported through the
> ANDROID_SCALER_AVAILABLE_STREAM_CONFIGURATIONS_OUTPUT metadata, and
> if you do not augment that list at initialization time, you won't
> ever be asked for non-native streams later.

I'm not entirely sure about this, because there are mandatory stream
configurations defined for the Camera2 API. If something is mandatory,
I suspect there is no need to query for the availability of it.

That said, I'd assume that CTS verifies whether all the required
configurations are both reported and supported, so perhaps there isn't
much to worry about here.
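
Either way, augmenting the list at initialization time could look
roughly like the following. This is only a sketch; the
Camera3StreamConfiguration struct and its members are placeholders
loosely modelled on camera_device.cpp, not necessarily the exact code:

#include <vector>

#include <system/graphics.h>    /* HAL_PIXEL_FORMAT_* */

#include <libcamera/geometry.h>

struct Camera3StreamConfiguration {
        libcamera::Size resolution;
        int androidFormat;
};

void addHalOnlyStreams(std::vector<Camera3StreamConfiguration> &configs)
{
        /*
         * Collect HAL-only entries separately so the vector isn't
         * grown while being iterated over.
         */
        std::vector<Camera3StreamConfiguration> halOnly;

        for (const Camera3StreamConfiguration &native : configs) {
                /*
                 * Any natively supported YUV stream can also be offered
                 * as a JPEG (BLOB) stream, since the encoding happens
                 * in the HAL.
                 */
                if (native.androidFormat == HAL_PIXEL_FORMAT_YCbCr_420_888)
                        halOnly.push_back({ native.resolution,
                                            HAL_PIXEL_FORMAT_BLOB });
        }

        configs.insert(configs.end(), halOnly.begin(), halOnly.end());
}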

>
> 2) Camera configuration
>
> That's the part you focused on, and a good part of what you wrote
> could indeed be used to move forward.
>
> The problem here can be summarized as: 'for each stream Android
> requested, the ones that cannot be natively produced by the
> libcamera::Camera shall be mapped to the closest possible native
> stream' (and here we could apply your implementation that identifies
> the 'best matching' stream).
>
> Unfortunately the problem breaks down into several others:
>
> 1) How to identify whether a stream is a native or a HAL-only one?
> Currently we get away with a trivial "if (!JPEG)" as all the non-JPEG
> streams are native ones. This should be made smarter.
>
> 2) How to best map HAL-streams to libcamera-streams. Assume we
> receive a request for two YUV streams in 1080p and 720p resolutions.
> The libcamera::Camera claims to be able to support both, so we can
> simply go and ask for those two streams. Then we receive a request
> for the same streams plus a full-size JPEG one. What we have to do is
> ask for a full-size YUV stream and use it to produce JPEG, and one
> 1080p YUV stream to produce both the YUV streams in 1080p and 720p
> resolutions. In that case we'll then have to derive the 720p stream
> from the 1080p one, and dedicate the full-size YUV stream to JPEG.
> Alternatively we can produce 1080p from the same full-size YUV used
> to produce JPEG, and ask the camera for a 720p stream.
>

Right, there are multiple possible choices. I've discussed this and
concluded that there might be some help needed from the pipeline
handler to tell the client which configuration is better from the
hardware point of view.
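
As a strawman for the HAL side, the selection could start from a
simple heuristic such as "map each requested stream to the smallest
native stream that covers it without up-scaling or changing the aspect
ratio", something along these lines (purely illustrative helper, not
existing code):

#include <optional>
#include <vector>

#include <libcamera/geometry.h>

using libcamera::Size;

std::optional<Size> bestNativeMatch(const std::vector<Size> &native,
                                    const Size &requested)
{
        std::optional<Size> best;

        for (const Size &n : native) {
                /* No up-scaling and no aspect ratio change. */
                bool covers = n.width >= requested.width &&
                              n.height >= requested.height;
                bool sameRatio = n.width * requested.height ==
                                 n.height * requested.width;
                if (!covers || !sameRatio)
                        continue;

                /* Prefer the smallest qualifying native size. */
                if (!best || n.width * n.height < best->width * best->height)
                        best = n;
        }

        return best;
}

The pipeline handler hint mentioned above could then act as a
tie-breaker when several native streams qualify.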

> Now, Android specifies some format/size requirements in the Camera3
> specification, and I assume ChromeOS may have others. As we tried to
> record the Camera3 requirements and satisfy them in the code, I
> think the additional streams that are required should be somehow
> listed first, in order to be able to create only the -required-
> additional streams.
>
> For example, have a look at CameraDevice::camera3Resolutions and
> CameraDevice::camera3FormatsMap, which encode the Camera3
> specification requirements.
>
> Once the additional requirements have been encoded, I would then
> proceed to divide them into 3 categories (there might very well be
> others):

I believe we don't have any additional requirements for now.

>
>   1) Format conversions: Convert from one pixel format to another.
>      That's more or less what happens today with JPEG. We have an
>      Encoder interface for that purpose and I guess a format converter
>      should be implemented along the same lines, but that has to be
>      discussed.
>

One thing that is also missing today is MJPEG decoding. This is also
required to fulfill the stream configuration requirements, since it's
assumed that the formats are displayable and explicit YUV streams are
included as well.

>   2) Down-scale/crop: Assuming it happens in the HAL, using maybe some
>      external components, down-scaling/cropping produces additional
>      resolutions from the list of natively supported ones. Given a
>      powerful enough implementation we could produce ANY resolution <=
>      a given native one, but that's not what we want, I guess. We shall
>      establish a list of additional resolutions we want to report to the
>      framework layer, and find out how to produce them from the native
>      streams.

Given the above, we should be able to stick to the resolutions we
already support in the adaptation layer.

>
>    3) Image transformations: A bit of a lateral issue, but I assume
>       some 'transformations' can be performed by HAL-only components.
>       This mostly depends on handling streams with some specific
>       metadata associated, which needs to be handled in the HAL. The
>       most trivial example is rotation. If the libcamera::Camera is for
>       whatever reason unable to rotate the images, they have to be
>       rotated in software in the HAL. This won't require any stream
>       mapping, but rather inspecting metadata and passing the native
>       streams through an additional processing layer.

Right. I honestly hope we won't need to do software rotation on any
reasonable hardware platform, but AFAIK we still have some platforms
in Chrome OS for which we do it in specific cases, such as a tablet
with the camera mounted in landscape orientation while the device is
used in portrait orientation.

>
> 3) Buffer allocation/handling:
>    When performing any conversions between a HAL stream and a libcamera
>    stream we may need to allocate an intermediate buffer to provide storage
>    for processing the frame in libcamera, with the conversion entity
>    reading from the libcamera buffer and writing into the android buffer.
>    This is likely possible with the existing FrameBufferAllocator classes,
>    but may have extra requirements.

I suppose we could have 3 types of buffers here:
1) buffers written by hardware driven by libcamera
   - without any software processing these would be directly provided
     by Android and imported to libcamera,
   - with processing, I assume libcamera would have to allocate its
     own, but I guess that would just end up being V4L2 MMAP buffers?
2) buffers between processing steps - if software only, an arbitrary
   malloc buffer could be used.
3) buffers for the end results - always provided by Android:
   - without processing they are the same thing as 1),
   - with processing they need to be mapped in the adaptation layer
     and wouldn't reach libcamera itself.

For consistency, one might be tempted to use some external allocator,
like gralloc, and import the hardware buffers to libcamera in both
cases, but there are limitations - gralloc only knows how to allocate
the end result buffers, so it wouldn't give us arbitrary buffers
without some ugly hacks and DMA-buf heaps still need some time to gain
adoption. Therefore I think we might need to live with special cases
like this until the world improves.
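
For the "with processing" variant of 1), I'd expect the allocation to
go through the existing FrameBufferAllocator, roughly as below (a
minimal sketch, error handling omitted):

#include <libcamera/framebuffer_allocator.h>
#include <libcamera/stream.h>

using namespace libcamera;

/*
 * Allocate intermediate buffers for a stream whose frames are
 * post-processed in the HAL instead of being written directly into
 * the Android buffers. The allocator is assumed to have been
 * constructed with the Camera in use.
 */
int allocateIntermediateBuffers(FrameBufferAllocator &allocator,
                                Stream *stream)
{
        /*
         * allocate() returns the number of buffers allocated for the
         * stream or a negative error code; the buffers can later be
         * retrieved with allocator.buffers(stream) and attached to
         * Requests.
         */
        return allocator.allocate(stream);
}

Without processing, the Android buffer handles would instead be
wrapped into libcamera FrameBuffers and imported directly, as the HAL
already does for native streams today, if I remember correctly.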

>
> My take is still that we should try to solve one problem at a time:
>
> 1) formalize additional requirements that are not expressed by our
>    CameraDevice::camera3Resolutions and CameraDevice::camera3FormatsMap

This hopefully shouldn't be needed, although we might want to double
check if those fully cover the Android requirements.

> 2) if no other requirements are necessary, identify a use case that
>    cannot be satisfied by the current pipeline implementations we
>    have. For example, a UVC camera that cannot produce NV12 and needs
>    conversion might be a good start

The use cases we encountered in practice:
a) a UVC camera which outputs only MJPEG for higher resolutions and
needs decoding (and possibly one more conversion) to output YUV 4:2:0.
b) a UVC camera which only outputs one stream, while Android requires
up to 2 YUV streams + JPEG for the LIMITED capability level.
c) IPU3/RKISP1, which can output up to 2 streams, while there is a
required stream configuration with 3 streams of possibly different
resolutions - 2 YUV up to, but not necessarily equal to, the maximum
PREVIEW size, and 1 JPEG with MAXIMUM resolution.

I don't remember if we had to deal with it in the end, but I also recall:
d) a hardware platform that doesn't support one of the smaller required
resolutions due to maximum scaling factor constraints.
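
For (c), just to illustrate with made-up resolutions, the mapping
could end up looking like:

  requested by Android:   YUV 1920x1080, YUV 1280x720, JPEG full size
  asked from the camera:  YUV full size  -> JPEG encoded in the HAL
                          YUV 1920x1080  -> passed through as-is, and
                                            down-scaled to 1280x720 for
                                            the second YUV stream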

> 3) Address the buffer allocation issues which I understand is still
>    to be addressed.

Agreed.

>
> Sorry for the wall of text. Hope it helps.

Yep, thanks for starting the discussion.

Best regards,
Tomasz

