[libcamera-devel] [PATCH RFC 7/7] WIP: android: add YUYV->NV12 format conversion via libyuv

Fri Sep 22 15:18:30 CEST 2023

Hi Mattijs

On Fri, Sep 15, 2023 at 09:57:31AM +0200, Mattijs Korpershoek via libcamera-devel wrote:
> For some platforms, it's possible that the gralloc implementation
> and the CSI receiver cannot agree on a pixel format. When that happens,
> there is usually a m2m converter in the pipeline which handles pixel format
> conversion.
>
> On platforms without pixel format converters, such as the AM62x, we need to do
> software conversion.
>
> The AM62x platform:
> * uses a CSI receiver (j721e-csi2rx), that only supports
>   packed YUV422 formats such as YUYV, YVYU, UYVY and VYUY.
> * Has a gralloc implementation that only supports of semi-planar
>   YUV420 formats such as NV12.
>
> Implement YUYV->NV12 format conversion using libyuv.
>
> This is mainly done by transforming the first stream from Type::Direct into
> Type::Internal so that it goes through the post-processor loop.
>
> ```
> The WIP: part is mainly around computeYUYVSize():
>
> Since gralloc and j721e-csi2rx are incompatible, we need a way to get
> gralloc to allocate (NV12) the kernel-requested buffer length (YUYV).
> In other words, we should make sure that the first plane of the NV12
> allocated buffer is long enough to fit a YUYV image.
>
> According to [1], NV12 has 8 bits (one byte) per component, and the
> first plane is the Y component.
> So a 1920x1080 image in NV12 has plane[0].length=1920*1080=2073600
>
> According to [2], YUYV stores 2 pixels per container of 32 bits, which
> gives us 16 bits (2 bytes for one pixel).
> So a 1920x1080 image in YUYV has plane[0].length=1920*1080*2=4147200
>
> So apply a *2 factor to make the kernel believe it's receiving a YUYV buffer.
>
> Note: this also means that we are wasting NV12's plane[1] buffer with
> each allocation.
>
> [1] https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/pixfmt-yuv-planar.html
> [2] https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/pixfmt-packed-yuv.html
> ```
>
> Signed-off-by: Mattijs Korpershoek <mkorpershoek at baylibre.com>
> ---
>  src/android/camera_capabilities.cpp | 90 ++++++++++++++++++++++++++++++++++++-
>  src/android/camera_capabilities.h   |  4 ++
>  src/android/camera_device.cpp       |  6 ++-
>  src/android/camera_stream.cpp       | 54 +++++++++++++++++++++-
>  src/android/camera_stream.h         |  5 +++
>  5 files changed, 154 insertions(+), 5 deletions(-)
>
> diff --git a/src/android/camera_capabilities.cpp b/src/android/camera_capabilities.cpp
> index 1bfeaea4b121..e2e0f7409e94 100644
> --- a/src/android/camera_capabilities.cpp
> +++ b/src/android/camera_capabilities.cpp
> @@ -124,6 +124,16 @@ const std::map<int, const Camera3Format> camera3FormatsMap = {
>  	},
>  };
>
> +/**
> + * \var yuvConversions
> + * \brief list of supported pixel formats for an input pixel format
> + *
> + * \todo This should be retrieved statically from yuv/post_processor_yuv instead
> + */
> +const std::map<PixelFormat, const std::vector<PixelFormat>> yuvConversions = {
> +	{ formats::YUYV, { formats::NV12 } },
> +};
> +
>  const std::map<camera_metadata_enum_android_info_supported_hardware_level, std::string>
>  hwLevelStrings = {
>  	{ ANDROID_INFO_SUPPORTED_HARDWARE_LEVEL_LIMITED,  "LIMITED" },
> @@ -582,8 +592,10 @@ int CameraCapabilities::initializeStreamConfigurations()
>  			LOG(HAL, Debug) << "Testing " << pixelFormat;
>
>  			/*
> -			 * The stream configuration size can be adjusted,
> -			 * not the pixel format.
> +			 * The stream configuration size can be adjusted.
> +			 * The pixel format might be converted via libyuv.
> +			 * Conversion check is done in another loop after
> +			 * testing native supported formats.
>  			 *
>  			 * \todo This could be simplified once all pipeline
>  			 * handlers will report the StreamFormats list of
> @@ -603,7 +615,46 @@ int CameraCapabilities::initializeStreamConfigurations()
>  			/* If the format is not mandatory, skip it. */
>  			if (!camera3Format.mandatory)
>  				continue;
> +		}
> +
> +		/*
> +		 * Test if we can map the format via a software conversion.
> +		 * This means that the converter can produce an "output" that is
> +		 * compatible with the format defined in Android.
> +		 */
> +		bool needConversion = false;
> +		for (const PixelFormat &pixelFormat : libcameraFormats) {
>
> +			LOG(HAL, Debug) << "Testing " << pixelFormat << " using conversion";
> +
> +			/* \todo move this into a separate function */

Might be a good idea

> +			for (const auto &[inputFormat, outputFormats] : yuvConversions) {
> +				/* check if the converter can produce pixelFormat */
> +				auto it = std::find(outputFormats.begin(), outputFormats.end(), pixelFormat);
> +				if (it == outputFormats.end())
> +					continue;
> +
> +				/*
> +				 * The converter can produce output pixelFormat, see if we can configure
> +				 * the camera with the associated input pixelFormat.
> +				 */
> +				cfg.pixelFormat = inputFormat;
> +				CameraConfiguration::Status status = cameraConfig->validate();
> +
> +				if (status != CameraConfiguration::Invalid && cfg.pixelFormat == inputFormat) {
> +					mappedFormat = inputFormat;
> +					conversionMap_[androidFormat] = std::make_pair(inputFormat, *it);
> +					needConversion = true;
> +					break;
> +				}
> +			}
> +
> +			/* We found a valid conversion format, so bail out */
> +			if (mappedFormat.isValid())
> +				break;
> +		}

I quite like this.

When I first thought about this problem I was considering expanding
camera3FormatsMap with an additional field to list there the possible
source formats an android-format could be software converted to.

Keeping a separate yuvConversions[] map and checking it here
separately is however nice, but as you noted, this should come from
the post-processor. For this first version I think it's good

> +
> +		if (!mappedFormat.isValid()) {
>  			LOG(HAL, Error)
>  				<< "Failed to map mandatory Android format "
>  				<< camera3Format.name << " ("
> @@ -619,6 +670,11 @@ int CameraCapabilities::initializeStreamConfigurations()
>  		LOG(HAL, Debug) << "Mapped Android format "
>  				<< camera3Format.name << " to "
>  				<< mappedFormat;
> +		if (needConversion) {
> +			LOG(HAL, Debug) << mappedFormat
> +					<< " will be converted into "
> +					<< conversionMap_[androidFormat].second;
> +		}

nit: no {} for single liners

Unless it's:
                if () {
                        instruction1;
                        instruction2;
                } else {
                        instruction1;
                }

Here and in other parts of the code

>
>  		std::vector<Size> resolutions;
>  		const PixelFormatInfo &info = PixelFormatInfo::info(mappedFormat);
> @@ -1457,6 +1513,36 @@ PixelFormat CameraCapabilities::toPixelFormat(int format) const
>  	return it->second;
>  }
>
> +/*
> + * Check if we need to do software conversion via a post-processor
> + * for an Android format code
> + */
> +bool CameraCapabilities::needConversion(int format) const
> +{
> +	auto it = conversionMap_.find(format);
> +	if (it == conversionMap_.end()) {
> +		LOG(HAL, Error) << "Requested format " << utils::hex(format)
> +				<< " not supported for conversion";
> +		return false;
> +	}
> +
> +	return true;

This could just be

        auto formats = conversionFormats(format);
        return formats != conversionMap_.end();

> +}
> +
> +/*
> + * Returns a conversion (input,output) pair for a given Android format code
> + */
> +std::pair<PixelFormat, PixelFormat> CameraCapabilities::conversionFormats(int format) const
> +{
> +	auto it = conversionMap_.find(format);
> +	if (it == conversionMap_.end()) {
> +		LOG(HAL, Error) << "Requested format " << utils::hex(format)
> +				<< " not supported for conversion";
> +	}
> +
> +	return it->second;
> +}
> +
>  std::unique_ptr<CameraMetadata> CameraCapabilities::requestTemplateManual() const
>  {
>  	if (!capabilities_.count(ANDROID_REQUEST_AVAILABLE_CAPABILITIES_MANUAL_SENSOR)) {
> diff --git a/src/android/camera_capabilities.h b/src/android/camera_capabilities.h
> index 6f66f221d33f..c3e6b48ab91d 100644
> --- a/src/android/camera_capabilities.h
> +++ b/src/android/camera_capabilities.h
> @@ -30,6 +30,9 @@ public:
>
>  	CameraMetadata *staticMetadata() const { return staticMetadata_.get(); }
>  	libcamera::PixelFormat toPixelFormat(int format) const;
> +	bool needConversion(int format) const;
> +	std::pair<libcamera::PixelFormat, libcamera::PixelFormat>
> +	conversionFormats(int format) const;
>  	unsigned int maxJpegBufferSize() const { return maxJpegBufferSize_; }
>
>  	std::unique_ptr<CameraMetadata> requestTemplateManual() const;
> @@ -77,6 +80,7 @@ private:
>
>  	std::vector<Camera3StreamConfiguration> streamConfigurations_;
>  	std::map<int, libcamera::PixelFormat> formatsMap_;
> +	std::map<int, std::pair<libcamera::PixelFormat, libcamera::PixelFormat>> conversionMap_;
>  	std::unique_ptr<CameraMetadata> staticMetadata_;
>  	unsigned int maxJpegBufferSize_;
>
> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> index d34bae715a47..842cbb06d345 100644
> --- a/src/android/camera_device.cpp
> +++ b/src/android/camera_device.cpp
> @@ -635,8 +635,12 @@ int CameraDevice::configureStreams(camera3_stream_configuration_t *stream_list)
>  			continue;
>  		}
>
> +		CameraStream::Type type = CameraStream::Type::Direct;
> +		if (capabilities_.needConversion(stream->format))
> +			type = CameraStream::Type::Internal;
> +

Ok, now patch #4 makes more sense indeed :)

I think it can be squashed here

>  		Camera3StreamConfig streamConfig;
> -		streamConfig.streams = { { stream, CameraStream::Type::Direct } };
> +		streamConfig.streams = { { stream, type } };
>  		streamConfig.config.size = size;
>  		streamConfig.config.pixelFormat = format;
>  		streamConfigs.push_back(std::move(streamConfig));
> diff --git a/src/android/camera_stream.cpp b/src/android/camera_stream.cpp
> index 4fd05dda5ed3..961ee40017f1 100644
> --- a/src/android/camera_stream.cpp
> +++ b/src/android/camera_stream.cpp
> @@ -95,6 +95,7 @@ int CameraStream::configure()
>
>  		switch (outFormat) {
>  		case formats::NV12:
> +		case formats::YUYV:
>  			postProcessor_ = std::make_unique<PostProcessorYuv>();
>  			break;
>
> @@ -107,6 +108,16 @@ int CameraStream::configure()
>  			return -EINVAL;
>  		}
>
> +		needConversion_ =
> +			cameraDevice_->capabilities()->needConversion(camera3Stream_->format);
> +
> +		if (needConversion_) {
> +			auto conv = cameraDevice_->capabilities()->conversionFormats(camera3Stream_->format);
> +			LOG(HAL, Debug) << "Configuring the post processor to convert "
> +					<< conv.first << " -> " << conv.second;
> +			output.pixelFormat = conv.second;
> +		}
> +
>  		int ret = postProcessor_->configure(input, output);
>  		if (ret)
>  			return ret;
> @@ -183,7 +194,12 @@ int CameraStream::process(Camera3RequestDescriptor::StreamBuffer *streamBuffer)
>  		streamBuffer->fence.reset();
>  	}
>
> -	const StreamConfiguration &output = configuration();
> +	StreamConfiguration output = configuration();
> +	if (needConversion_) {
> +		output.pixelFormat =
> +			cameraDevice_->capabilities()->conversionFormats(camera3Stream_->format).second;

nit: 80 cols preferred, 120 when necessary :)
also no {}

Does stride needs adjusment too or is it not considered during
post-processing ?

> +	}
> +
>  	streamBuffer->dstBuffer = std::make_unique<CameraBuffer>(
>  		*streamBuffer->camera3Buffer, output.pixelFormat, output.size,
>  		PROT_READ | PROT_WRITE);
> @@ -205,6 +221,39 @@ void CameraStream::flush()
>  	worker_->flush();
>  }
>
> +Size CameraStream::computeYUYVSize(const Size &nv12Size)
> +{
> +	/*
> +	 * On am62x platforms, the receiver driver (j721e-csi2rx) only
> +	 * supports packed YUV422 formats such as YUYV, YVYU, UYVY and VYUY.
> +	 *
> +	 * However, the gralloc implementation is only capable of semiplanar
> +	 * YUV420 such as NV12.
> +	 *
> +	 * To trick the kernel into believing it's receiving a YUYV buffer, we adjust the
> +	 * size we request to gralloc so that plane(0) of the NV12 buffer is long enough to
> +	 * match the length of a YUYV plane.
> +	 *
> +	 * for NV12, one pixel is encoded on 1.5 bytes, but plane 0 has 1 byte per pixel.
> +	 * for YUYV, one pixel is encoded on 2 bytes.
> +	 *
> +	 * So apply a *2 factor.
> +	 *
> +	 * See:
> +	 * https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/pixfmt-packed-yuv.html
> +	 * https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/pixfmt-yuv-planar.html
> +	 */
> +	constexpr unsigned int YUYVfactor = 2;
> +
> +	unsigned int width = nv12Size.width;
> +	unsigned int height = nv12Size.height;
> +
> +	if (needConversion_)
> +		width = width * YUYVfactor;
> +
> +	return Size{ width, height };
> +}
> +
>  FrameBuffer *CameraStream::getBuffer()
>  {
>  	if (!allocator_)
> @@ -222,8 +271,9 @@ FrameBuffer *CameraStream::getBuffer()
>  		 * \todo Store a reference to the format of the source stream
>  		 * instead of hardcoding.
>  		 */
> +		const Size hackedSize = computeYUYVSize(configuration().size);
>  		auto frameBuffer = allocator_->allocate(HAL_PIXEL_FORMAT_YCBCR_420_888,
> -							configuration().size,
> +							hackedSize,
>  							camera3Stream_->usage);

I see your point about this being problematic, and I wonder if we
shouldn't stop assuming HAL_PIXEL_FORMAT_YCBCR_420_888 and instead map
this to the actual format produced by libcamera (YUYV in this case).

CameraStream has access to the libcamera::StreamConfiguration it maps
to. If I'm not mistaken that StreamConfiguration::pixelFormat will be
== formats::YUYV, right ? Could we associated it to the corresponding
Android format (HAL_PIXEL_FORMAT_YCrCb_420_SP?) instead ? Would this
remove the need to trick gralloc into allocating a larger buffer ?

>  		allocatedBuffers_.push_back(std::move(frameBuffer));
>  		buffers_.emplace_back(allocatedBuffers_.back().get());
> diff --git a/src/android/camera_stream.h b/src/android/camera_stream.h
> index 4c5078b2c26d..52a5606399c5 100644
> --- a/src/android/camera_stream.h
> +++ b/src/android/camera_stream.h
> @@ -128,10 +128,13 @@ public:
>
>  	int configure();
>  	int process(Camera3RequestDescriptor::StreamBuffer *streamBuffer);
> +	libcamera::Size computeYUYVSize(const libcamera::Size &nv12Size);
>  	libcamera::FrameBuffer *getBuffer();
>  	void putBuffer(libcamera::FrameBuffer *buffer);
>  	void flush();
>
> +	bool needConversion() const { return needConversion_; }

Not used ?

Ok, lot of work, very nice! With a few adjustments I hope we can see
this as a proper patch series.

I understand this is very specific to your use case (YUYV-to-NV12) and
might not work out-of-the-box for other systems, but I think it's fine
and it's a good first step on which others can build on top.

Thanks!
   j

> +
>  private:
>  	class PostProcessorWorker : public libcamera::Thread
>  	{
> @@ -184,4 +187,6 @@ private:
>  	std::unique_ptr<PostProcessor> postProcessor_;
>
>  	std::unique_ptr<PostProcessorWorker> worker_;
> +
> +	bool needConversion_;
>  };
>
> --
> 2.41.0
>