[PATCH v3 08/16] libcamera: software_isp: Add DebayerCpu class

Stefan Klug stefan.klug at ideasonboard.com
Tue Feb 20 13:26:14 CET 2024


Hi Andrei,

Am 20.02.24 um 11:54 schrieb Andrei Konovalov:
> Hi Stefan,
> 
> On 19.02.2024 20:09, Stefan Klug wrote:
>> Hi Hans,
>>
>> thanks for your work on the SoftISP.
>>
>> Am 14.02.24 um 18:01 schrieb Hans de Goede:
>>> Add CPU based debayering implementation. This initial implementation
>>> only supports debayering packed 10 bits per pixel bayer data in
>>> the 4 standard bayer orders.
>>>
>>> Doxygen documentation by Dennis Bonke.
>>>
>>> Tested-by: Bryan O'Donoghue <bryan.odonoghue at linaro.org> # sc8280xp 
>>> Lenovo x13s
>>> Tested-by: Pavel Machek <pavel at ucw.cz>
>>> Reviewed-by: Pavel Machek <pavel at ucw.cz>
>>> Co-developed-by: Dennis Bonke <admin at dennisbonke.com>
>>> Signed-off-by: Dennis Bonke <admin at dennisbonke.com>
>>> Co-developed-by: Andrey Konovalov <andrey.konovalov at linaro.org>
>>> Signed-off-by: Andrey Konovalov <andrey.konovalov at linaro.org>
>>> Co-developed-by: Pavel Machek <pavel at ucw.cz>
>>> Signed-off-by: Pavel Machek <pavel at ucw.cz>
>>> Signed-off-by: Hans de Goede <hdegoede at redhat.com>
>>> ---
>>> Changes in v3:
>>> - Move debayer_cpu.h to src/libcamera/software_isp/
>>> - Move documentation to .cpp file
>>> - Document how/why an array of src pointers is passed to
>>>    the debayer functions
>>> ---
>>>   src/libcamera/software_isp/debayer_cpu.cpp | 619 +++++++++++++++++++++
>>>   src/libcamera/software_isp/debayer_cpu.h   | 143 +++++
>>>   src/libcamera/software_isp/meson.build     |   1 +
>>>   3 files changed, 763 insertions(+)
>>>   create mode 100644 src/libcamera/software_isp/debayer_cpu.cpp
>>>   create mode 100644 src/libcamera/software_isp/debayer_cpu.h
>>>
>>> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp 
>>> b/src/libcamera/software_isp/debayer_cpu.cpp
>>> new file mode 100644
>>> index 00000000..53e90776
>>> --- /dev/null
>>> +++ b/src/libcamera/software_isp/debayer_cpu.cpp
>>> @@ -0,0 +1,619 @@
>>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
>>> +/*
>>> + * Copyright (C) 2023, Linaro Ltd
>>> + * Copyright (C) 2023, Red Hat Inc.
>>> + *
>>> + * Authors:
>>> + * Hans de Goede <hdegoede at redhat.com>
>>> + *
>>> + * debayer_cpu.cpp - CPU based debayering class
>>> + */
>>> +
>>> +#include "debayer_cpu.h"
>>> +
>>> +#include <math.h>
>>> +#include <stdlib.h>
>>> +#include <time.h>
>>> +
>>> +#include <libcamera/formats.h>
>>> +
>>> +#include "libcamera/internal/bayer_format.h"
>>> +#include "libcamera/internal/framebuffer.h"
>>> +#include "libcamera/internal/mapped_framebuffer.h"
>>> +
>>> +namespace libcamera {
>>> +
>>> +/**
>>> + * \class DebayerCpu
>>> + * \brief Class for debayering on the CPU
>>> + *
>>> + * Implementation for CPU based debayering
>>> + */
>>> +
>>> +/**
>>> + * \brief Constructs a DebayerCpu object.
>>> + * \param[in] stats Pointer to the stats object to use.
>>> + */
>>> +DebayerCpu::DebayerCpu(std::unique_ptr<SwStatsCpu> stats)
>>
>> I tried to use the ISP without statistics/regulation altogether and 
>> just set stats to a nullptr. This fails in a few places. IMHO it would 
>> improve flexibility & reusability to allow that.
>> Attched is a patch with the modifications as I needed them anyways. 
>> Feel free to include them.
> 
> Your patch itself looks OK for me (I haven't tested it though).
> 
> But I am curious what is the reason for not using stats?
> Leaving the debayer step only, one looses AWB (which doesn't need any 
> particular
> support from the hardware and prevents the typical raw bayer green tint) 
> and AE/AGC (which only
> needs at least one of the two camera sensor controls, and with wrong 
> exposure the
> image may loose the information, and this can't be fully compensated by 
> post processing).

I was working on the camera sensor implementation in libcamera and 
needed to quickly display the debayered sensor image. No auto regulation 
should happen as I wanted to manually control gain/exposure. One could 
even expose manual whitebalance gains for usecases under known 
conditions. Having the SoftIsp as a modular playground for such cases is 
quite helpful.

> 
> Thanks,
> Andrei
> 
>> Cheers Stefan
>>
>>> +    : stats_(std::move(stats)), gamma_correction_(1.0)
>>> +{
>>> +#ifdef __x86_64__
>>> +    enableInputMemcpy_ = false;
>>> +#else
>>> +    enableInputMemcpy_ = true;
>>> +#endif
>>> +    /* Initialize gamma to 1.0 curve */
>>> +    for (unsigned int i = 0; i < kGammaLookupSize; i++)
>>> +        gamma_[i] = i / 4;
>>> +
>>> +    for (unsigned int i = 0; i < kMaxLineBuffers; i++)
>>> +        lineBuffers_[i] = nullptr;
>>> +}
>>> +
>>> +DebayerCpu::~DebayerCpu()
>>> +{
>>> +    for (unsigned int i = 0; i < kMaxLineBuffers; i++)
>>> +        free(lineBuffers_[i]);
>>> +}
>>> +
>>> +// RGR
>>> +// GBG
>>> +// RGR
>>> +#define BGGR_BGR888(p, n, 
>>> div)                                                                \
>>> +    *dst++ = blue_[curr[x] / 
>>> (div)];                                                      \
>>> +    *dst++ = green_[(prev[x] + curr[x - p] + curr[x + n] + next[x]) 
>>> / (4 * (div))];       \
>>> +    *dst++ = red_[(prev[x - p] + prev[x + n] + next[x - p] + next[x 
>>> + n]) / (4 * (div))]; \
>>> +    x++;
>>> +
>>> +// GBG
>>> +// RGR
>>> +// GBG
>>> +#define GRBG_BGR888(p, n, div)                                    \
>>> +    *dst++ = blue_[(prev[x] + next[x]) / (2 * (div))];        \
>>> +    *dst++ = green_[curr[x] / (div)];                         \
>>> +    *dst++ = red_[(curr[x - p] + curr[x + n]) / (2 * (div))]; \
>>> +    x++;
>>> +
>>> +// GRG
>>> +// BGB
>>> +// GRG
>>> +#define GBRG_BGR888(p, n, div)                                     \
>>> +    *dst++ = blue_[(curr[x - p] + curr[x + n]) / (2 * (div))]; \
>>> +    *dst++ = green_[curr[x] / (div)];                          \
>>> +    *dst++ = red_[(prev[x] + next[x]) / (2 * (div))];          \
>>> +    x++;
>>> +
>>> +// BGB
>>> +// GRG
>>> +// BGB
>>> +#define RGGB_BGR888(p, n, 
>>> div)                                                                 \
>>> +    *dst++ = blue_[(prev[x - p] + prev[x + n] + next[x - p] + next[x 
>>> + n]) / (4 * (div))]; \
>>> +    *dst++ = green_[(prev[x] + curr[x - p] + curr[x + n] + next[x]) 
>>> / (4 * (div))];        \
>>> +    *dst++ = red_[curr[x] / 
>>> (div)];                                                        \
>>> +    x++;
>>> +
>>> +void DebayerCpu::debayer10P_BGBG_BGR888(uint8_t *dst, const uint8_t 
>>> *src[])
>>> +{
>>> +    const int width_in_bytes = window_.width * 5 / 4;
>>> +    const uint8_t *prev = (const uint8_t *)src[0];
>>> +    const uint8_t *curr = (const uint8_t *)src[1];
>>> +    const uint8_t *next = (const uint8_t *)src[2];
>>> +
>>> +    /*
>>> +     * For the first pixel getting a pixel from the previous column 
>>> uses
>>> +     * x - 2 to skip the 5th byte with least-significant bits for 4 
>>> pixels.
>>> +     * Same for last pixel (uses x + 2) and looking at the next column.
>>> +     * x++ in the for-loop skips the 5th byte with 4 x 2 lsb-s for 
>>> 10bit packed.
>>> +     */
>>> +    for (int x = 0; x < width_in_bytes; x++) {
>>> +        /* First pixel */
>>> +        BGGR_BGR888(2, 1, 1)
>>> +        /* Second pixel BGGR -> GBRG */
>>> +        GBRG_BGR888(1, 1, 1)
>>> +        /* Same thing for third and fourth pixels */
>>> +        BGGR_BGR888(1, 1, 1)
>>> +        GBRG_BGR888(1, 2, 1)
>>> +    }
>>> +}
>>> +
>>> +void DebayerCpu::debayer10P_GRGR_BGR888(uint8_t *dst, const uint8_t 
>>> *src[])
>>> +{
>>> +    const int width_in_bytes = window_.width * 5 / 4;
>>> +    const uint8_t *prev = (const uint8_t *)src[0];
>>> +    const uint8_t *curr = (const uint8_t *)src[1];
>>> +    const uint8_t *next = (const uint8_t *)src[2];
>>> +
>>> +    for (int x = 0; x < width_in_bytes; x++) {
>>> +        /* First pixel */
>>> +        GRBG_BGR888(2, 1, 1)
>>> +        /* Second pixel GRBG -> RGGB */
>>> +        RGGB_BGR888(1, 1, 1)
>>> +        /* Same thing for third and fourth pixels */
>>> +        GRBG_BGR888(1, 1, 1)
>>> +        RGGB_BGR888(1, 2, 1)
>>> +    }
>>> +}
>>> +
>>> +void DebayerCpu::debayer10P_GBGB_BGR888(uint8_t *dst, const uint8_t 
>>> *src[])
>>> +{
>>> +    const int width_in_bytes = window_.width * 5 / 4;
>>> +    const uint8_t *prev = (const uint8_t *)src[0];
>>> +    const uint8_t *curr = (const uint8_t *)src[1];
>>> +    const uint8_t *next = (const uint8_t *)src[2];
>>> +
>>> +    for (int x = 0; x < width_in_bytes; x++) {
>>> +        /* Even pixel */
>>> +        GBRG_BGR888(2, 1, 1)
>>> +        /* Odd pixel GBGR -> BGGR */
>>> +        BGGR_BGR888(1, 1, 1)
>>> +        /* Same thing for next 2 pixels */
>>> +        GBRG_BGR888(1, 1, 1)
>>> +        BGGR_BGR888(1, 2, 1)
>>> +    }
>>> +}
>>> +
>>> +void DebayerCpu::debayer10P_RGRG_BGR888(uint8_t *dst, const uint8_t 
>>> *src[])
>>> +{
>>> +    const int width_in_bytes = window_.width * 5 / 4;
>>> +    const uint8_t *prev = (const uint8_t *)src[0];
>>> +    const uint8_t *curr = (const uint8_t *)src[1];
>>> +    const uint8_t *next = (const uint8_t *)src[2];
>>> +
>>> +    for (int x = 0; x < width_in_bytes; x++) {
>>> +        /* Even pixel */
>>> +        RGGB_BGR888(2, 1, 1)
>>> +        /* Odd pixel RGGB -> GRBG*/
>>> +        GRBG_BGR888(1, 1, 1)
>>> +        /* Same thing for next 2 pixels */
>>> +        RGGB_BGR888(1, 1, 1)
>>> +        GRBG_BGR888(1, 2, 1)
>>> +    }
>>> +}
>>> +
>>> +static bool isStandardBayerOrder(BayerFormat::Order order)
>>> +{
>>> +    return order == BayerFormat::BGGR || order == BayerFormat::GBRG ||
>>> +           order == BayerFormat::GRBG || order == BayerFormat::RGGB;
>>> +}
>>> +
>>> +/*
>>> + * Setup the Debayer object according to the passed in parameters.
>>> + * Return 0 on success, a negative errno value on failure
>>> + * (unsupported parameters).
>>> + */
>>> +int DebayerCpu::getInputConfig(PixelFormat inputFormat, 
>>> DebayerInputConfig &config)
>>> +{
>>> +    BayerFormat bayerFormat =
>>> +        BayerFormat::fromPixelFormat(inputFormat);
>>> +
>>> +    if (bayerFormat.bitDepth == 10 &&
>>> +        bayerFormat.packing == BayerFormat::Packing::CSI2 &&
>>> +        isStandardBayerOrder(bayerFormat.order)) {
>>> +        config.bpp = 10;
>>> +        config.patternSize.width = 4; /* 5 bytes per *4* pixels */
>>> +        config.patternSize.height = 2;
>>> +        config.outputFormats = std::vector<PixelFormat>({ 
>>> formats::RGB888 });
>>> +        return 0;
>>> +    }
>>> +
>>> +    LOG(Debayer, Info)
>>> +        << "Unsupported input format " << inputFormat.toString();
>>> +    return -EINVAL;
>>> +}
>>> +
>>> +int DebayerCpu::getOutputConfig(PixelFormat outputFormat, 
>>> DebayerOutputConfig &config)
>>> +{
>>> +    if (outputFormat == formats::RGB888) {
>>> +        config.bpp = 24;
>>> +        return 0;
>>> +    }
>>> +
>>> +    LOG(Debayer, Info)
>>> +        << "Unsupported output format " << outputFormat.toString();
>>> +    return -EINVAL;
>>> +}
>>> +
>>> +/* TODO: this ignores outputFormat since there is only 1 supported 
>>> outputFormat for now */
>>> +int DebayerCpu::setDebayerFunctions(PixelFormat inputFormat, 
>>> [[maybe_unused]] PixelFormat outputFormat)
>>> +{
>>> +    BayerFormat bayerFormat =
>>> +        BayerFormat::fromPixelFormat(inputFormat);
>>> +
>>> +    if (bayerFormat.bitDepth == 10 &&
>>> +        bayerFormat.packing == BayerFormat::Packing::CSI2) {
>>> +        switch (bayerFormat.order) {
>>> +        case BayerFormat::BGGR:
>>> +            debayer0_ = &DebayerCpu::debayer10P_BGBG_BGR888;
>>> +            debayer1_ = &DebayerCpu::debayer10P_GRGR_BGR888;
>>> +            return 0;
>>> +        case BayerFormat::GBRG:
>>> +            debayer0_ = &DebayerCpu::debayer10P_GBGB_BGR888;
>>> +            debayer1_ = &DebayerCpu::debayer10P_RGRG_BGR888;
>>> +            return 0;
>>> +        case BayerFormat::GRBG:
>>> +            debayer0_ = &DebayerCpu::debayer10P_GRGR_BGR888;
>>> +            debayer1_ = &DebayerCpu::debayer10P_BGBG_BGR888;
>>> +            return 0;
>>> +        case BayerFormat::RGGB:
>>> +            debayer0_ = &DebayerCpu::debayer10P_RGRG_BGR888;
>>> +            debayer1_ = &DebayerCpu::debayer10P_GBGB_BGR888;
>>> +            return 0;
>>> +        default:
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    LOG(Debayer, Error) << "Unsupported input output format 
>>> combination";
>>> +    return -EINVAL;
>>> +}
>>> +
>>> +int DebayerCpu::configure(const StreamConfiguration &inputCfg,
>>> +              const 
>>> std::vector<std::reference_wrapper<StreamConfiguration>> &outputCfgs)
>>> +{
>>> +    if (getInputConfig(inputCfg.pixelFormat, inputConfig_) != 0)
>>> +        return -EINVAL;
>>> +
>>> +    if (stats_->configure(inputCfg) != 0)
>>> +        return -EINVAL;
>>> +
>>> +    const Size &stats_pattern_size = stats_->patternSize();
>>> +    if (inputConfig_.patternSize.width != stats_pattern_size.width ||
>>> +        inputConfig_.patternSize.height != stats_pattern_size.height) {
>>> +        LOG(Debayer, Error)
>>> +            << "mismatching stats and debayer pattern sizes for "
>>> +            << inputCfg.pixelFormat.toString();
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    inputConfig_.stride = inputCfg.stride;
>>> +
>>> +    if (outputCfgs.size() != 1) {
>>> +        LOG(Debayer, Error)
>>> +            << "Unsupported number of output streams: "
>>> +            << outputCfgs.size();
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    const StreamConfiguration &outputCfg = outputCfgs[0];
>>> +    SizeRange outSizeRange = sizes(inputCfg.pixelFormat, 
>>> inputCfg.size);
>>> +    std::tie(outputConfig_.stride, outputConfig_.frameSize) =
>>> +        strideAndFrameSize(outputCfg.pixelFormat, outputCfg.size);
>>> +
>>> +    if (!outSizeRange.contains(outputCfg.size) || 
>>> outputConfig_.stride != outputCfg.stride) {
>>> +        LOG(Debayer, Error)
>>> +            << "Invalid output size/stride: "
>>> +            << "\n  " << outputCfg.size << " (" << outSizeRange << ")"
>>> +            << "\n  " << outputCfg.stride << " (" << 
>>> outputConfig_.stride << ")";
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    if (setDebayerFunctions(inputCfg.pixelFormat, 
>>> outputCfg.pixelFormat) != 0)
>>> +        return -EINVAL;
>>> +
>>> +    window_.x = ((inputCfg.size.width - outputCfg.size.width) / 2) &
>>> +            ~(inputConfig_.patternSize.width - 1);
>>> +    window_.y = ((inputCfg.size.height - outputCfg.size.height) / 2) &
>>> +            ~(inputConfig_.patternSize.height - 1);
>>> +    window_.width = outputCfg.size.width;
>>> +    window_.height = outputCfg.size.height;
>>> +
>>> +    /* Don't pass x,y since process() already adjusts src before 
>>> passing it */
>>> +    stats_->setWindow(Rectangle(window_.size()));
>>> +
>>> +    /* pad with patternSize.Width on both left and right side */
>>> +    lineBufferPadding_ = inputConfig_.patternSize.width * 
>>> inputConfig_.bpp / 8;
>>> +    lineBufferLength_ = window_.width * inputConfig_.bpp / 8 +
>>> +                2 * lineBufferPadding_;
>>> +    for (unsigned int i = 0;
>>> +         i < (inputConfig_.patternSize.height + 1) && 
>>> enableInputMemcpy_;
>>> +         i++) {
>>> +        free(lineBuffers_[i]);
>>> +        lineBuffers_[i] = (uint8_t *)malloc(lineBufferLength_);
>>> +        if (!lineBuffers_[i])
>>> +            return -ENOMEM;
>>> +    }
>>> +
>>> +    measuredFrames_ = 0;
>>> +    frameProcessTime_ = 0;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +/*
>>> + * Get width and height at which the bayer-pattern repeats.
>>> + * Return pattern-size or an empty Size for an unsupported inputFormat.
>>> + */
>>> +Size DebayerCpu::patternSize(PixelFormat inputFormat)
>>> +{
>>> +    DebayerCpu::DebayerInputConfig config;
>>> +
>>> +    if (getInputConfig(inputFormat, config) != 0)
>>> +        return {};
>>> +
>>> +    return config.patternSize;
>>> +}
>>> +
>>> +std::vector<PixelFormat> DebayerCpu::formats(PixelFormat inputFormat)
>>> +{
>>> +    DebayerCpu::DebayerInputConfig config;
>>> +
>>> +    if (getInputConfig(inputFormat, config) != 0)
>>> +        return std::vector<PixelFormat>();
>>> +
>>> +    return config.outputFormats;
>>> +}
>>> +
>>> +std::tuple<unsigned int, unsigned int>
>>> +DebayerCpu::strideAndFrameSize(const PixelFormat &outputFormat, 
>>> const Size &size)
>>> +{
>>> +    DebayerCpu::DebayerOutputConfig config;
>>> +
>>> +    if (getOutputConfig(outputFormat, config) != 0)
>>> +        return std::make_tuple(0, 0);
>>> +
>>> +    /* round up to multiple of 8 for 64 bits alignment */
>>> +    unsigned int stride = (size.width * config.bpp / 8 + 7) & ~7;
>>> +
>>> +    return std::make_tuple(stride, stride * size.height);
>>> +}
>>> +
>>> +void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])
>>> +{
>>> +    const unsigned int patternHeight = inputConfig_.patternSize.height;
>>> +
>>> +    if (!enableInputMemcpy_)
>>> +        return;
>>> +
>>> +    for (unsigned int i = 0; i < patternHeight; i++) {
>>> +        memcpy(lineBuffers_[i], linePointers[i + 1] - 
>>> lineBufferPadding_,
>>> +               lineBufferLength_);
>>> +        linePointers[i + 1] = lineBuffers_[i] + lineBufferPadding_;
>>> +    }
>>> +
>>> +    /* Point lineBufferIndex_ to first unused lineBuffer */
>>> +    lineBufferIndex_ = patternHeight;
>>> +}
>>> +
>>> +void DebayerCpu::shiftLinePointers(const uint8_t *linePointers[], 
>>> const uint8_t *src)
>>> +{
>>> +    const unsigned int patternHeight = inputConfig_.patternSize.height;
>>> +
>>> +    for (unsigned int i = 0; i < patternHeight; i++)
>>> +        linePointers[i] = linePointers[i + 1];
>>> +
>>> +    linePointers[patternHeight] = src +
>>> +                      (patternHeight / 2) * (int)inputConfig_.stride;
>>> +}
>>> +
>>> +void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
>>> +{
>>> +    const unsigned int patternHeight = inputConfig_.patternSize.height;
>>> +
>>> +    if (!enableInputMemcpy_)
>>> +        return;
>>> +
>>> +    memcpy(lineBuffers_[lineBufferIndex_], 
>>> linePointers[patternHeight] - lineBufferPadding_,
>>> +           lineBufferLength_);
>>> +    linePointers[patternHeight] = lineBuffers_[lineBufferIndex_] + 
>>> lineBufferPadding_;
>>> +
>>> +    lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);
>>> +}
>>> +
>>> +void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>>> +{
>>> +    unsigned int y_end = window_.y + window_.height;
>>> +    /* Holds [0] previous- [1] current- [2] next-line */
>>> +    const uint8_t *linePointers[3];
>>> +
>>> +    /* Adjust src to top left corner of the window */
>>> +    src += window_.y * inputConfig_.stride + window_.x * 
>>> inputConfig_.bpp / 8;
>>> +
>>> +    /* [x] becomes [x - 1] after initial shiftLinePointers() call */
>>> +    if (window_.y) {
>>> +        linePointers[1] = src - inputConfig_.stride; /* 
>>> previous-line */
>>> +        linePointers[2] = src;
>>> +    } else {
>>> +        /* window_.y == 0, use the next line as prev line */
>>> +        linePointers[1] = src + inputConfig_.stride;
>>> +        linePointers[2] = src;
>>> +        /* Last 2 lines also need special handling */
>>> +        y_end -= 2;
>>> +    }
>>> +
>>> +    setupInputMemcpy(linePointers);
>>> +
>>> +    for (unsigned int y = window_.y; y < y_end; y += 2) {
>>> +        shiftLinePointers(linePointers, src);
>>> +        memcpyNextLine(linePointers);
>>> +        stats_->processLine0(y, linePointers);
>>> +        (this->*debayer0_)(dst, linePointers);
>>> +        src += inputConfig_.stride;
>>> +        dst += outputConfig_.stride;
>>> +
>>> +        shiftLinePointers(linePointers, src);
>>> +        memcpyNextLine(linePointers);
>>> +        (this->*debayer1_)(dst, linePointers);
>>> +        src += inputConfig_.stride;
>>> +        dst += outputConfig_.stride;
>>> +    }
>>> +
>>> +    if (window_.y == 0) {
>>> +        shiftLinePointers(linePointers, src);
>>> +        memcpyNextLine(linePointers);
>>> +        stats_->processLine0(y_end, linePointers);
>>> +        (this->*debayer0_)(dst, linePointers);
>>> +        src += inputConfig_.stride;
>>> +        dst += outputConfig_.stride;
>>> +
>>> +        shiftLinePointers(linePointers, src);
>>> +        /* next line may point outside of src, use prev. */
>>> +        linePointers[2] = linePointers[0];
>>> +        (this->*debayer1_)(dst, linePointers);
>>> +        src += inputConfig_.stride;
>>> +        dst += outputConfig_.stride;
>>> +    }
>>> +}
>>> +
>>> +void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>>> +{
>>> +    const unsigned int y_end = window_.y + window_.height;
>>> +    /*
>>> +     * This holds pointers to [0] 2-lines-up [1] 1-line-up [2] 
>>> current-line
>>> +     * [3] 1-line-down [4] 2-lines-down.
>>> +     */
>>> +    const uint8_t *linePointers[5];
>>> +
>>> +    /* Adjust src to top left corner of the window */
>>> +    src += window_.y * inputConfig_.stride + window_.x * 
>>> inputConfig_.bpp / 8;
>>> +
>>> +    /* [x] becomes [x - 1] after initial shiftLinePointers() call */
>>> +    linePointers[1] = src - 2 * inputConfig_.stride;
>>> +    linePointers[2] = src - inputConfig_.stride;
>>> +    linePointers[3] = src;
>>> +    linePointers[4] = src + inputConfig_.stride;
>>> +
>>> +    setupInputMemcpy(linePointers);
>>> +
>>> +    for (unsigned int y = window_.y; y < y_end; y += 4) {
>>> +        shiftLinePointers(linePointers, src);
>>> +        memcpyNextLine(linePointers);
>>> +        stats_->processLine0(y, linePointers);
>>> +        (this->*debayer0_)(dst, linePointers);
>>> +        src += inputConfig_.stride;
>>> +        dst += outputConfig_.stride;
>>> +
>>> +        shiftLinePointers(linePointers, src);
>>> +        memcpyNextLine(linePointers);
>>> +        (this->*debayer1_)(dst, linePointers);
>>> +        src += inputConfig_.stride;
>>> +        dst += outputConfig_.stride;
>>> +
>>> +        shiftLinePointers(linePointers, src);
>>> +        memcpyNextLine(linePointers);
>>> +        stats_->processLine2(y, linePointers);
>>> +        (this->*debayer2_)(dst, linePointers);
>>> +        src += inputConfig_.stride;
>>> +        dst += outputConfig_.stride;
>>> +
>>> +        shiftLinePointers(linePointers, src);
>>> +        memcpyNextLine(linePointers);
>>> +        (this->*debayer3_)(dst, linePointers);
>>> +        src += inputConfig_.stride;
>>> +        dst += outputConfig_.stride;
>>> +    }
>>> +}
>>> +
>>> +static inline int64_t timeDiff(timespec &after, timespec &before)
>>> +{
>>> +    return (after.tv_sec - before.tv_sec) * 1000000000LL +
>>> +           (int64_t)after.tv_nsec - (int64_t)before.tv_nsec;
>>> +}
>>> +
>>> +void DebayerCpu::process(FrameBuffer *input, FrameBuffer *output, 
>>> DebayerParams params)
>>> +{
>>> +    timespec frameStartTime;
>>> +
>>> +    if (measuredFrames_ < DebayerCpu::kLastFrameToMeasure) {
>>> +        frameStartTime = {};
>>> +        clock_gettime(CLOCK_MONOTONIC_RAW, &frameStartTime);
>>> +    }
>>> +
>>> +    /* Apply DebayerParams */
>>> +    if (params.gamma != gamma_correction_) {
>>> +        for (unsigned int i = 0; i < kGammaLookupSize; i++)
>>> +            gamma_[i] = UINT8_MAX * powf(i / (kGammaLookupSize - 
>>> 1.0), params.gamma);
>>> +
>>> +        gamma_correction_ = params.gamma;
>>> +    }
>>> +
>>> +    for (unsigned int i = 0; i < kRGBLookupSize; i++) {
>>> +        constexpr unsigned int div =
>>> +            kRGBLookupSize * DebayerParams::kGain10 / kGammaLookupSize;
>>> +        unsigned int idx;
>>> +
>>> +        /* Apply gamma after gain! */
>>> +        idx = std::min({ i * params.gainR / div, (kGammaLookupSize - 
>>> 1) });
>>> +        red_[i] = gamma_[idx];
>>> +
>>> +        idx = std::min({ i * params.gainG / div, (kGammaLookupSize - 
>>> 1) });
>>> +        green_[i] = gamma_[idx];
>>> +
>>> +        idx = std::min({ i * params.gainB / div, (kGammaLookupSize - 
>>> 1) });
>>> +        blue_[i] = gamma_[idx];
>>> +    }
>>> +
>>> +    /* Copy metadata from the input buffer */
>>> +    FrameMetadata &metadata = output->_d()->metadata();
>>> +    metadata.status = input->metadata().status;
>>> +    metadata.sequence = input->metadata().sequence;
>>> +    metadata.timestamp = input->metadata().timestamp;
>>> +
>>> +    MappedFrameBuffer in(input, MappedFrameBuffer::MapFlag::Read);
>>> +    MappedFrameBuffer out(output, MappedFrameBuffer::MapFlag::Write);
>>> +    if (!in.isValid() || !out.isValid()) {
>>> +        LOG(Debayer, Error) << "mmap-ing buffer(s) failed";
>>> +        metadata.status = FrameMetadata::FrameError;
>>> +        return;
>>> +    }
>>> +
>>> +    stats_->startFrame();
>>> +
>>> +    if (inputConfig_.patternSize.height == 2)
>>> +        process2(in.planes()[0].data(), out.planes()[0].data());
>>> +    else
>>> +        process4(in.planes()[0].data(), out.planes()[0].data());
>>> +
>>> +    metadata.planes()[0].bytesused = out.planes()[0].size();
>>> +
>>> +    /* Measure before emitting signals */
>>> +    if (measuredFrames_ < DebayerCpu::kLastFrameToMeasure &&
>>> +        ++measuredFrames_ > DebayerCpu::kFramesToSkip) {
>>> +        timespec frameEndTime = {};
>>> +        clock_gettime(CLOCK_MONOTONIC_RAW, &frameEndTime);
>>> +        frameProcessTime_ += timeDiff(frameEndTime, frameStartTime);
>>> +        if (measuredFrames_ == DebayerCpu::kLastFrameToMeasure) {
>>> +            const unsigned int measuredFrames = 
>>> DebayerCpu::kLastFrameToMeasure -
>>> +                                DebayerCpu::kFramesToSkip;
>>> +            LOG(Debayer, Info)
>>> +                << "Processed " << measuredFrames
>>> +                << " frames in " << frameProcessTime_ / 1000 << "us, "
>>> +                << frameProcessTime_ / (1000 * measuredFrames)
>>> +                << " us/frame";
>>> +        }
>>> +    }
>>> +
>>> +    stats_->finishFrame();
>>> +    outputBufferReady.emit(output);
>>> +    inputBufferReady.emit(input);
>>> +}
>>> +
>>> +SizeRange DebayerCpu::sizes(PixelFormat inputFormat, const Size 
>>> &inputSize)
>>> +{
>>> +    Size pattern_size = patternSize(inputFormat);
>>> +    unsigned int border_height = pattern_size.height;
>>> +
>>> +    if (pattern_size.isNull())
>>> +        return {};
>>> +
>>> +    /* No need for top/bottom border with a pattern height of 2 */
>>> +    if (pattern_size.height == 2)
>>> +        border_height = 0;
>>> +
>>> +    /*
>>> +     * For debayer interpolation a border is kept around the entire 
>>> image
>>> +     * and the minimum output size is pattern-height x pattern-width.
>>> +     */
>>> +    if (inputSize.width < (3 * pattern_size.width) ||
>>> +        inputSize.height < (2 * border_height + pattern_size.height)) {
>>> +        LOG(Debayer, Warning)
>>> +            << "Input format size too small: " << inputSize.toString();
>>> +        return {};
>>> +    }
>>> +
>>> +    return SizeRange(Size(pattern_size.width, pattern_size.height),
>>> +             Size((inputSize.width - 2 * pattern_size.width) & 
>>> ~(pattern_size.width - 1),
>>> +                  (inputSize.height - 2 * border_height) & 
>>> ~(pattern_size.height - 1)),
>>> +             pattern_size.width, pattern_size.height);
>>> +}
>>> +
>>> +} /* namespace libcamera */
>>> diff --git a/src/libcamera/software_isp/debayer_cpu.h 
>>> b/src/libcamera/software_isp/debayer_cpu.h
>>> new file mode 100644
>>> index 00000000..e0c4b9a8
>>> --- /dev/null
>>> +++ b/src/libcamera/software_isp/debayer_cpu.h
>>> @@ -0,0 +1,143 @@
>>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
>>> +/*
>>> + * Copyright (C) 2023, Linaro Ltd
>>> + * Copyright (C) 2023, Red Hat Inc.
>>> + *
>>> + * Authors:
>>> + * Hans de Goede <hdegoede at redhat.com>
>>> + *
>>> + * debayer_cpu.h - CPU based debayering header
>>> + */
>>> +
>>> +#pragma once
>>> +
>>> +#include <memory>
>>> +#include <stdint.h>
>>> +#include <vector>
>>> +
>>> +#include <libcamera/base/object.h>
>>> +
>>> +#include "debayer.h"
>>> +#include "swstats_cpu.h"
>>> +
>>> +namespace libcamera {
>>> +
>>> +class DebayerCpu : public Debayer, public Object
>>> +{
>>> +public:
>>> +    DebayerCpu(std::unique_ptr<SwStatsCpu> stats);
>>> +    ~DebayerCpu();
>>> +
>>> +    int configure(const StreamConfiguration &inputCfg,
>>> +              const 
>>> std::vector<std::reference_wrapper<StreamConfiguration>> &outputCfgs);
>>> +    Size patternSize(PixelFormat inputFormat);
>>> +    std::vector<PixelFormat> formats(PixelFormat input);
>>> +    std::tuple<unsigned int, unsigned int>
>>> +    strideAndFrameSize(const PixelFormat &outputFormat, const Size 
>>> &size);
>>> +    void process(FrameBuffer *input, FrameBuffer *output, 
>>> DebayerParams params);
>>> +    SizeRange sizes(PixelFormat inputFormat, const Size &inputSize);
>>> +
>>> +    /**
>>> +     * \brief Get the file descriptor for the statistics.
>>> +     *
>>> +     * \return the file descriptor pointing to the statistics.
>>> +     */
>>> +    const SharedFD &getStatsFD() { return stats_->getStatsFD(); }
>>> +
>>> +    /**
>>> +     * \brief Get the output frame size.
>>> +     *
>>> +     * \return The output frame size.
>>> +     */
>>> +    unsigned int frameSize() { return outputConfig_.frameSize; }
>>> +
>>> +private:
>>> +    /**
>>> +     * \brief Called to debayer 1 line of Bayer input data to output 
>>> format
>>> +     * \param[out] dst Pointer to the start of the output line to write
>>> +     * \param[in] src The input data
>>> +     *
>>> +     * Input data is an array of (patternSize_.height + 1) src
>>> +     * pointers each pointing to a line in the Bayer source. The middle
>>> +     * element of the array will point to the actual line being 
>>> processed.
>>> +     * Earlier element(s) will point to the previous line(s) and later
>>> +     * element(s) to the next line(s).
>>> +     *
>>> +     * These functions take an array of src pointers, rather then
>>> +     * a single src pointer + a stride for the source, so that when 
>>> the src
>>> +     * is slow uncached memory it can be copied to faster memory before
>>> +     * debayering. Debayering a standard 2x2 Bayer pattern requires 
>>> access
>>> +     * to the previous and next src lines for interpolating the missing
>>> +     * colors. To allow copying the src lines only once 3 buffers each
>>> +     * holding a single line are used, re-using the oldest buffer for
>>> +     * the next line and the pointers are swizzled so that:
>>> +     * src[0] = previous-line, src[1] = currrent-line, src[2] = 
>>> next-line.
>>> +     * This way the 3 pointers passed to the debayer functions form
>>> +     * a sliding window over the src avoiding the need to copy each
>>> +     * line more then once.
>>> +     *
>>> +     * Similarly for bayer patterns which repeat every 4 lines, 5 src
>>> +     * pointers are passed holding: src[0] = 2-lines-up, src[1] = 
>>> 1-line-up
>>> +     * src[2] = current-line, src[3] = 1-line-down, src[4] = 
>>> 2-lines-down.
>>> +     */
>>> +    typedef void (DebayerCpu::*debayerFn)(uint8_t *dst, const 
>>> uint8_t *src[]);
>>> +
>>> +    /* CSI-2 packed 10-bit raw bayer format (all the 4 orders) */
>>> +    void debayer10P_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);
>>> +    void debayer10P_GRGR_BGR888(uint8_t *dst, const uint8_t *src[]);
>>> +    void debayer10P_GBGB_BGR888(uint8_t *dst, const uint8_t *src[]);
>>> +    void debayer10P_RGRG_BGR888(uint8_t *dst, const uint8_t *src[]);
>>> +
>>> +    struct DebayerInputConfig {
>>> +        Size patternSize;
>>> +        unsigned int bpp; /* Memory used per pixel, not precision */
>>> +        unsigned int stride;
>>> +        std::vector<PixelFormat> outputFormats;
>>> +    };
>>> +
>>> +    struct DebayerOutputConfig {
>>> +        unsigned int bpp; /* Memory used per pixel, not precision */
>>> +        unsigned int stride;
>>> +        unsigned int frameSize;
>>> +    };
>>> +
>>> +    int getInputConfig(PixelFormat inputFormat, DebayerInputConfig 
>>> &config);
>>> +    int getOutputConfig(PixelFormat outputFormat, 
>>> DebayerOutputConfig &config);
>>> +    int setDebayerFunctions(PixelFormat inputFormat, PixelFormat 
>>> outputFormat);
>>> +    void setupInputMemcpy(const uint8_t *linePointers[]);
>>> +    void shiftLinePointers(const uint8_t *linePointers[], const 
>>> uint8_t *src);
>>> +    void memcpyNextLine(const uint8_t *linePointers[]);
>>> +    void process2(const uint8_t *src, uint8_t *dst);
>>> +    void process4(const uint8_t *src, uint8_t *dst);
>>> +
>>> +    static constexpr unsigned int kGammaLookupSize = 1024;
>>> +    static constexpr unsigned int kRGBLookupSize = 256;
>>> +    /* Max. supported Bayer pattern height is 4, debayering this 
>>> requires 5 lines */
>>> +    static constexpr unsigned int kMaxLineBuffers = 5;
>>> +
>>> +    std::array<uint8_t, kGammaLookupSize> gamma_;
>>> +    std::array<uint8_t, kRGBLookupSize> red_;
>>> +    std::array<uint8_t, kRGBLookupSize> green_;
>>> +    std::array<uint8_t, kRGBLookupSize> blue_;
>>> +    debayerFn debayer0_;
>>> +    debayerFn debayer1_;
>>> +    debayerFn debayer2_;
>>> +    debayerFn debayer3_;
>>> +    Rectangle window_;
>>> +    DebayerInputConfig inputConfig_;
>>> +    DebayerOutputConfig outputConfig_;
>>> +    std::unique_ptr<SwStatsCpu> stats_;
>>> +    uint8_t *lineBuffers_[kMaxLineBuffers];
>>> +    unsigned int lineBufferLength_;
>>> +    unsigned int lineBufferPadding_;
>>> +    unsigned int lineBufferIndex_;
>>> +    bool enableInputMemcpy_;
>>> +    float gamma_correction_;
>>> +    unsigned int measuredFrames_;
>>> +    int64_t frameProcessTime_;
>>> +    /* Skip 30 frames for things to stabilize then measure 30 frames */
>>> +    static constexpr unsigned int kFramesToSkip = 30;
>>> +    static constexpr unsigned int kLastFrameToMeasure = 60;
>>> +};
>>> +
>>> +} /* namespace libcamera */
>>> diff --git a/src/libcamera/software_isp/meson.build 
>>> b/src/libcamera/software_isp/meson.build
>>> index 92fc90f3..dd5e70c1 100644
>>> --- a/src/libcamera/software_isp/meson.build
>>> +++ b/src/libcamera/software_isp/meson.build
>>> @@ -2,5 +2,6 @@
>>>   libcamera_sources += files([
>>>       'debayer.cpp',
>>> +    'debayer_cpu.cpp',
>>>       'swstats_cpu.cpp',
>>>   ])
>>

-- 
Regards,

Stefan Klug



More information about the libcamera-devel mailing list