[PATCH v5 10/10] libcamera: software_isp: Apply CCM in debayering

Milan Zamazal mzamazal at redhat.com
Tue Feb 4 14:59:51 CET 2025


Hi Stefan,

Stefan Klug <stefan.klug at ideasonboard.com> writes:

> Hi Milan,
>
> On Mon, Feb 03, 2025 at 11:26:14PM +0100, Milan Zamazal wrote:
>> Milan Zamazal <mzamazal at redhat.com> writes:
>> 
>> > Milan Zamazal <mzamazal at redhat.com> writes:
>> >
>> >> This patch applies color correction matrix (CCM) in debayering if the
>> >> CCM is specified.  Not using CCM must still be supported for performance
>> >> reasons.
>> >>
>> >> The CCM is applied as follows:
>> >>
>> >>             [r1 r2 r3]
>> >>   [r g b] * [g1 g2 g3]
>> >>             [b1 b2 b3]
>> >
>> > I think this should actually be:
>> >
>> >   [r1 g1 b1]   [r]
>> >   [r2 g2 b2] * [g]
>> >   [r3 g3 b3]   [b]
>
> This is the one that is usually used. The color is represented as
> column. But mathematically both are equivalent. You wrote the transposed
> notation.
>
> out = M * c
>
> is equivalent to 
>
> out^T = c^T * M^T
>
>> >
>> > Unless somebody corrects me, I'll change this in v6.
>> 
>> Or maybe not.  There is a bug in applying the CCM color tables in
>> debayering, which confused me.  If anybody knows which of the formats
>> above applies to the CCMs in the tuning files, please tell me.
>
> As noted above, the second form is used. In the tuning files they are
> written in row major order [r1, g1, b1, r2, g2, b2, r3, g3, b3]. A nice
> trick to check that is that every row usually sums up to one. This is
> needed so that the CCM does preserves grey colors.

Thank you for clarification.  I'll fix it in v6, together with two other
mistakes I discovered while implementing a saturation control (which
made the issues much more prominent than my previous testing).

> Best regards,
> Stefan
>
>> 
>> > I was unsure which is the correct version, I couldn't find it documented
>> > anywhere.  I can't remember why I selected the original version
>> > eventually.  But the second one produces much better images with the
>> > matrices I copied from another pipeline for my sensor and it's also used
>> > by matrices in RGB<->YUV conversion recipes so I guess that one is
>> > actually right.  Of course, it's just about the format of the input
>> > matrices, nothing else changes.
>> >
>> >> The CCM matrix (the right side of the multiplication) is constant during
>> >> single frame processing, while the input pixel (the left side) changes.
>> >> Because each of the color channels is only 8-bit in software ISP, we can
>> >> make 9 lookup tables with 256 input values for multiplications of each
>> >> of the r_i, g_i, b_i values.  This way we don't have to multiply each
>> >> pixel, we can use table lookups and additions instead.  Gamma (which is
>> >> non-linear and thus cannot be a part of the 9 lookup tables values) is
>> >> applied on the final values rounded to integers using another lookup
>> >> table.
>> >>
>> >> We use int16_t to store the precomputed multiplications.  This seems to
>> >> be noticeably (>10%) faster than `float' for the price of slightly less
>> >> accuracy and it covers the range of values that sane CCMs produce.  The
>> >> selection and structure of data is performance critical, for example
>> >> using bytes would add significant (>10%) speedup but would be too short
>> >> to cover the value range.
>> >>
>> >> The color lookup tables can be represented either as unions,
>> >> accommodating tables for both the CCM and non-CCM cases, or as separate
>> >> tables for each of the cases, leaving the tables for the other case
>> >> unused.  The latter is selected as a matter of preference.
>> >>
>> >> The tables are copied (as before), which is not elegant but also not a
>> >> big problem.  There are patches posted that use shared buffers for
>> >> parameters passing in software ISP (see software ISP TODO #5) and they
>> >> can be adjusted for the new parameter format.
>> >>
>> >> Color gains from white balance are supposed not to be a part of the
>> >> specified CCM.  They are applied on it using matrix multiplication,
>> >> which is simple and in correspondence with future additions in the form
>> >> of matrix multiplication, like saturation adjustment.
>> >>
>> >> With this patch, the reported per-frame slowdown when applying CCM is
>> >> about 45% on Debix Model A and about 75% on TI AM69 SK.
>> >>
>> >> Using std::clamp in debayering adds some performance penalty (a few
>> >> percent).  The clamping is necessary to eliminate out of range values
>> >> possibly produced by the CCM.  If it could be avoided by adjusting the
>> >> precomputed tables some way then performance could be improved a bit.
>> >>
>> >> Signed-off-by: Milan Zamazal <mzamazal at redhat.com>
>> >> ---
>> >>  .../internal/software_isp/debayer_params.h    | 15 ++++-
>> >>  src/ipa/simple/algorithms/lut.cpp             | 65 ++++++++++++++-----
>> >>  src/ipa/simple/algorithms/lut.h               |  1 +
>> >>  src/libcamera/software_isp/debayer.cpp        | 52 ++++++++++++++-
>> >>  src/libcamera/software_isp/debayer_cpu.cpp    | 35 ++++++++--
>> >>  src/libcamera/software_isp/debayer_cpu.h      |  4 ++
>> >>  6 files changed, 145 insertions(+), 27 deletions(-)
>> >>
>> >> diff --git a/include/libcamera/internal/software_isp/debayer_params.h b/include/libcamera/internal/software_isp/debayer_params.h
>> >> index 7d8fdd48..9a7dbe93 100644
>> >> --- a/include/libcamera/internal/software_isp/debayer_params.h
>> >> +++ b/include/libcamera/internal/software_isp/debayer_params.h
>> >> @@ -1,6 +1,6 @@
>> >>  /* SPDX-License-Identifier: LGPL-2.1-or-later */
>> >>  /*
>> >> - * Copyright (C) 2023, 2024 Red Hat Inc.
>> >> + * Copyright (C) 2023-2025 Red Hat Inc.
>> >>   *
>> >>   * Authors:
>> >>   * Hans de Goede <hdegoede at redhat.com>
>> >> @@ -18,11 +18,24 @@ namespace libcamera {
>> >>  struct DebayerParams {
>> >>  	static constexpr unsigned int kRGBLookupSize = 256;
>> >>  
>> >> +	struct CcmColumn {
>> >> +		int16_t r;
>> >> +		int16_t g;
>> >> +		int16_t b;
>> >> +	};
>> >> +
>> >>  	using ColorLookupTable = std::array<uint8_t, kRGBLookupSize>;
>> >> +	using CcmLookupTable = std::array<CcmColumn, kRGBLookupSize>;
>> >> +	using GammaLookupTable = std::array<uint8_t, kRGBLookupSize>;
>> >>  
>> >>  	ColorLookupTable red;
>> >>  	ColorLookupTable green;
>> >>  	ColorLookupTable blue;
>> >> +
>> >> +	CcmLookupTable redCcm;
>> >> +	CcmLookupTable greenCcm;
>> >> +	CcmLookupTable blueCcm;
>> >> +	GammaLookupTable gammaLut;
>> >>  };
>> >>  
>> >>  } /* namespace libcamera */
>> >> diff --git a/src/ipa/simple/algorithms/lut.cpp b/src/ipa/simple/algorithms/lut.cpp
>> >> index e5e995c7..7719aecc 100644
>> >> --- a/src/ipa/simple/algorithms/lut.cpp
>> >> +++ b/src/ipa/simple/algorithms/lut.cpp
>> >> @@ -1,6 +1,6 @@
>> >>  /* SPDX-License-Identifier: LGPL-2.1-or-later */
>> >>  /*
>> >> - * Copyright (C) 2024, Red Hat Inc.
>> >> + * Copyright (C) 2024-2025, Red Hat Inc.
>> >>   *
>> >>   * Color lookup tables construction
>> >>   */
>> >> @@ -80,6 +80,11 @@ void Lut::updateGammaTable(IPAContext &context)
>> >>  	context.activeState.gamma.contrast = contrast;
>> >>  }
>> >>  
>> >> +int16_t Lut::ccmValue(unsigned int i, float ccm) const
>> >> +{
>> >> +	return std::round(i * ccm);
>> >> +}
>> >> +
>> >>  void Lut::prepare(IPAContext &context,
>> >>  		  [[maybe_unused]] const uint32_t frame,
>> >>  		  [[maybe_unused]] IPAFrameContext &frameContext,
>> >> @@ -91,28 +96,52 @@ void Lut::prepare(IPAContext &context,
>> >>  	 * observed, it's not permanently prone to minor fluctuations or
>> >>  	 * rounding errors.
>> >>  	 */
>> >> -	if (context.activeState.gamma.blackLevel != context.activeState.blc.level ||
>> >> -	    context.activeState.gamma.contrast != context.activeState.knobs.contrast)
>> >> +	const bool gammaUpdateNeeded =
>> >> +		context.activeState.gamma.blackLevel != context.activeState.blc.level ||
>> >> +		context.activeState.gamma.contrast != context.activeState.knobs.contrast;
>> >> +	if (gammaUpdateNeeded)
>> >>  		updateGammaTable(context);
>> >>  
>> >>  	auto &gains = context.activeState.awb.gains;
>> >>  	auto &gammaTable = context.activeState.gamma.gammaTable;
>> >>  	const unsigned int gammaTableSize = gammaTable.size();
>> >> -
>> >> -	for (unsigned int i = 0; i < DebayerParams::kRGBLookupSize; i++) {
>> >> -		const double div = static_cast<double>(DebayerParams::kRGBLookupSize) /
>> >> -				   gammaTableSize;
>> >> -		/* Apply gamma after gain! */
>> >> -		unsigned int idx;
>> >> -		idx = std::min({ static_cast<unsigned int>(i * gains.r() / div),
>> >> -				 gammaTableSize - 1 });
>> >> -		params->red[i] = gammaTable[idx];
>> >> -		idx = std::min({ static_cast<unsigned int>(i * gains.g() / div),
>> >> -				 gammaTableSize - 1 });
>> >> -		params->green[i] = gammaTable[idx];
>> >> -		idx = std::min({ static_cast<unsigned int>(i * gains.b() / div),
>> >> -				 gammaTableSize - 1 });
>> >> -		params->blue[i] = gammaTable[idx];
>> >> +	const double div = static_cast<double>(DebayerParams::kRGBLookupSize) /
>> >> +			   gammaTableSize;
>> >> +
>> >> +	if (!context.ccmEnabled) {
>> >> +		for (unsigned int i = 0; i < DebayerParams::kRGBLookupSize; i++) {
>> >> +			/* Apply gamma after gain! */
>> >> +			unsigned int idx;
>> >> +			idx = std::min({ static_cast<unsigned int>(i * gains.r() / div),
>> >> +					 gammaTableSize - 1 });
>> >> +			params->red[i] = gammaTable[idx];
>> >> +			idx = std::min({ static_cast<unsigned int>(i * gains.g() / div),
>> >> +					 gammaTableSize - 1 });
>> >> +			params->green[i] = gammaTable[idx];
>> >> +			idx = std::min({ static_cast<unsigned int>(i * gains.b() / div),
>> >> +					 gammaTableSize - 1 });
>> >> +			params->blue[i] = gammaTable[idx];
>> >> +		}
>> >> +	} else if (context.activeState.ccm.changed || gammaUpdateNeeded) {
>> >> +		Matrix<double, 3, 3> gainCcm = { { gains.r(), 0, 0,
>> >> +						   0, gains.g(), 0,
>> >> +						   0, 0, gains.b() } };
>> >> +		auto ccm = gainCcm * context.activeState.ccm.ccm;
>> >> +		auto &red = params->redCcm;
>> >> +		auto &green = params->greenCcm;
>> >> +		auto &blue = params->blueCcm;
>> >> +		for (unsigned int i = 0; i < DebayerParams::kRGBLookupSize; i++) {
>> >> +			red[i].r = ccmValue(i, ccm[0][0]);
>> >> +			red[i].g = ccmValue(i, ccm[1][0]);
>> >> +			red[i].b = ccmValue(i, ccm[2][0]);
>> >> +			green[i].r = ccmValue(i, ccm[0][1]);
>> >> +			green[i].g = ccmValue(i, ccm[1][1]);
>> >> +			green[i].b = ccmValue(i, ccm[2][1]);
>> >> +			blue[i].r = ccmValue(i, ccm[0][2]);
>> >> +			blue[i].g = ccmValue(i, ccm[1][2]);
>> >> +			blue[i].b = ccmValue(i, ccm[2][2]);
>> >> +			params->gammaLut[i] = gammaTable[i / div];
>> >> +		}
>> >>  	}
>> >>  }
>> >>  
>> >> diff --git a/src/ipa/simple/algorithms/lut.h b/src/ipa/simple/algorithms/lut.h
>> >> index 889f864b..77324800 100644
>> >> --- a/src/ipa/simple/algorithms/lut.h
>> >> +++ b/src/ipa/simple/algorithms/lut.h
>> >> @@ -33,6 +33,7 @@ public:
>> >>  
>> >>  private:
>> >>  	void updateGammaTable(IPAContext &context);
>> >> +	int16_t ccmValue(unsigned int i, float ccm) const;
>> >>  };
>> >>  
>> >>  } /* namespace ipa::soft::algorithms */
>> >> diff --git a/src/libcamera/software_isp/debayer.cpp b/src/libcamera/software_isp/debayer.cpp
>> >> index 45fe6960..4c84a556 100644
>> >> --- a/src/libcamera/software_isp/debayer.cpp
>> >> +++ b/src/libcamera/software_isp/debayer.cpp
>> >> @@ -23,9 +23,39 @@ namespace libcamera {
>> >>   * \brief Size of a color lookup table
>> >>   */
>> >>  
>> >> +/**
>> >> + * \struct DebayerParams::CcmColumn
>> >> + * \brief Type of a single column of a color correction matrix (CCM)
>> >> + */
>> >> +
>> >> +/**
>> >> + * \var DebayerParams::CcmColumn::r
>> >> + * \brief Red (first) component of a CCM column
>> >> + */
>> >> +
>> >> +/**
>> >> + * \var DebayerParams::CcmColumn::g
>> >> + * \brief Green (second) component of a CCM column
>> >> + */
>> >> +
>> >> +/**
>> >> + * \var DebayerParams::CcmColumn::b
>> >> + * \brief Blue (third) component of a CCM column
>> >> + */
>> >> +
>> >>  /**
>> >>   * \typedef DebayerParams::ColorLookupTable
>> >> - * \brief Type of the lookup tables for red, green, blue values
>> >> + * \brief Type of the simple lookup tables for red, green, blue values
>> >> + */
>> >> +
>> >> +/**
>> >> + * \typedef DebayerParams::CcmLookupTable
>> >> + * \brief Type of the CCM lookup tables for red, green, blue values
>> >> + */
>> >> +
>> >> +/**
>> >> + * \typedef DebayerParams::GammaLookupTable
>> >> + * \brief Type of the gamma lookup tables for CCM
>> >>   */
>> >>  
>> >>  /**
>> >> @@ -43,6 +73,26 @@ namespace libcamera {
>> >>   * \brief Lookup table for blue color, mapping input values to output values
>> >>   */
>> >>  
>> >> +/**
>> >> + * \var DebayerParams::redCcm
>> >> + * \brief CCM lookup table for red color, mapping input values to output values
>> >> + */
>> >> +
>> >> +/**
>> >> + * \var DebayerParams::greenCcm
>> >> + * \brief CCM lookup table for green color, mapping input values to output values
>> >> + */
>> >> +
>> >> +/**
>> >> + * \var DebayerParams::blueCcm
>> >> + * \brief CCM lookup table for blue color, mapping input values to output values
>> >> + */
>> >> +
>> >> +/**
>> >> + * \var DebayerParams::gammaLut
>> >> + * \brief Gamma lookup table used with color correction matrix
>> >> + */
>> >> +
>> >>  /**
>> >>   * \class Debayer
>> >>   * \brief Base debayering class
>> >> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
>> >> index 0cd03a8f..5ddfdcf6 100644
>> >> --- a/src/libcamera/software_isp/debayer_cpu.cpp
>> >> +++ b/src/libcamera/software_isp/debayer_cpu.cpp
>> >> @@ -11,6 +11,7 @@
>> >>  
>> >>  #include "debayer_cpu.h"
>> >>  
>> >> +#include <algorithm>
>> >>  #include <stdlib.h>
>> >>  #include <sys/ioctl.h>
>> >>  #include <time.h>
>> >> @@ -51,8 +52,12 @@ DebayerCpu::DebayerCpu(std::unique_ptr<SwStatsCpu> stats)
>> >>  	enableInputMemcpy_ = true;
>> >>  
>> >>  	/* Initialize color lookup tables */
>> >> -	for (unsigned int i = 0; i < DebayerParams::kRGBLookupSize; i++)
>> >> +	for (unsigned int i = 0; i < DebayerParams::kRGBLookupSize; i++) {
>> >>  		red_[i] = green_[i] = blue_[i] = i;
>> >> +		redCcm_[i] = { 1, 0, 0 };
>> >> +		greenCcm_[i] = { 0, 1, 0 };
>> >> +		blueCcm_[i] = { 0, 0, 1 };
>> >> +	}
>> >>  }
>> >>  
>> >>  DebayerCpu::~DebayerCpu() = default;
>> >> @@ -62,12 +67,24 @@ DebayerCpu::~DebayerCpu() = default;
>> >>  	const pixel_t *curr = (const pixel_t *)src[1] + xShift_; \
>> >>  	const pixel_t *next = (const pixel_t *)src[2] + xShift_;
>> >>  
>> >> -#define STORE_PIXEL(b, g, r)        \
>> >> -	*dst++ = blue_[b];          \
>> >> -	*dst++ = green_[g];         \
>> >> -	*dst++ = red_[r];           \
>> >> -	if constexpr (addAlphaByte) \
>> >> -		*dst++ = 255;       \
>> >> +#define GAMMA(value) \
>> >> +	*dst++ = gammaLut_[std::clamp(value, 0, static_cast<int>(gammaLut_.size()) - 1)]
>> >> +
>> >> +#define STORE_PIXEL(b_, g_, r_)                                        \
>> >> +	if constexpr (ccmEnabled) {                                    \
>> >> +		const DebayerParams::CcmColumn &blue = blueCcm_[b_];   \
>> >> +		const DebayerParams::CcmColumn &green = greenCcm_[g_]; \
>> >> +		const DebayerParams::CcmColumn &red = redCcm_[r_];     \
>> >> +		GAMMA(blue.r + blue.g + blue.b);                       \
>> >> +		GAMMA(green.r + green.g + green.b);                    \
>> >> +		GAMMA(red.r + red.g + red.b);                          \
>> >> +	} else {                                                       \
>> >> +		*dst++ = blue_[b_];                                    \
>> >> +		*dst++ = green_[g_];                                   \
>> >> +		*dst++ = red_[r_];                                     \
>> >> +	}                                                              \
>> >> +	if constexpr (addAlphaByte)                                    \
>> >> +		*dst++ = 255;                                          \
>> >>  	x++;
>> >>  
>> >>  /*
>> >> @@ -757,6 +774,10 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
>> >>  	green_ = params.green;
>> >>  	red_ = swapRedBlueGains_ ? params.blue : params.red;
>> >>  	blue_ = swapRedBlueGains_ ? params.red : params.blue;
>> >> +	greenCcm_ = params.greenCcm;
>> >> +	redCcm_ = swapRedBlueGains_ ? params.blueCcm : params.redCcm;
>> >> +	blueCcm_ = swapRedBlueGains_ ? params.redCcm : params.blueCcm;
>> >> +	gammaLut_ = params.gammaLut;
>> >>  
>> >>  	/* Copy metadata from the input buffer */
>> >>  	FrameMetadata &metadata = output->_d()->metadata();
>> >> diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
>> >> index 21c08a2d..9f73a2fd 100644
>> >> --- a/src/libcamera/software_isp/debayer_cpu.h
>> >> +++ b/src/libcamera/software_isp/debayer_cpu.h
>> >> @@ -141,6 +141,10 @@ private:
>> >>  	DebayerParams::ColorLookupTable red_;
>> >>  	DebayerParams::ColorLookupTable green_;
>> >>  	DebayerParams::ColorLookupTable blue_;
>> >> +	DebayerParams::CcmLookupTable redCcm_;
>> >> +	DebayerParams::CcmLookupTable greenCcm_;
>> >> +	DebayerParams::CcmLookupTable blueCcm_;
>> >> +	DebayerParams::GammaLookupTable gammaLut_;
>> >>  	debayerFn debayer0_;
>> >>  	debayerFn debayer1_;
>> >>  	debayerFn debayer2_;
>> 



More information about the libcamera-devel mailing list