[PATCH v6 06/18] libcamera: software_isp: Add SwStatsCpu class

Laurent Pinchart laurent.pinchart at ideasonboard.com
Wed Mar 27 20:25:05 CET 2024


On Wed, Mar 27, 2024 at 04:18:45PM +0100, Hans de Goede wrote:
> On 3/27/24 2:51 PM, Laurent Pinchart wrote:
> > Hi Milan and Hans,
> > 
> > Thank you for the patch.
> > 
> > I understand all this is work in progress, so I won't comment on all the
> > improvements we could add in the future, but focus on merging this as
> > quickly as possible (aiming for one last new version for the series to
> > address simple issues only).
> 
> Thank you for the review. Milan is going to prepare a v7 of this
> series, so I'm going to <snip> all the trivial remarks and only
> comment on the remaining remarks.
> 
> > On Tue, Mar 19, 2024 at 01:35:53PM +0100, Milan Zamazal wrote:
> >> From: Hans de Goede <hdegoede at redhat.com>
> >>
> >> Add a CPU based SwStats implementation for SoftwareISP / SoftIPA use.
> >>
> >> This implementation offers a configure function + functions to gather
> >> statistics on a line by line basis. This allows CPU based software
> >> debayering to call into interlace debayering and statistics gathering
> > 
> > I think you meant interleave, not interlace.
> 
> Ack.
> 
> <snip>
> 
> >> +struct SwIspStats {
> >> +	/**
> >> +	 * \brief Holds the sum of all sampled red pixels.
> > 
> > Does it wrap around or saturate in case of overflow ?
> 
> Wrap, the idea being that it needs to be big enough to
> handle max-pixel-value * number-of-samples pixels.
> 
> Saturate would require an conditional for each samples
> pixel and benchmarking has shown that adding any conditionals
> which run for each pixel greatly slow things down.

I'm fine with wrap-around, but let's document it.

> > 
> >> +	 */
> >> +	unsigned long sumR_;
> > 
> > Any reason for using a platform-dependent type instead of uint32_t or
> > uint64_t ?
> 
> Not really, the sums really should be uint64_t-s to avoid
> the mentioned wrapping. Milan can you change the type here
> to uint64_t please.
> 
> >> +	/**
> >> +	 * \brief Holds the sum of all sampled green pixels.
> >> +	 */
> >> +	unsigned long sumG_;
> >> +	/**
> >> +	 * \brief Holds the sum of all sampled blue pixels.
> >> +	 */
> >> +	unsigned long sumB_;
> >> +	/**
> >> +	 * \brief Number of bins in the yHistogram.
> >> +	 */
> >> +	static constexpr unsigned int kYHistogramSize = 16;
> >> +	/**
> >> +	 * \brief A histogram of luminance values.
> >> +	 */
> >> +	std::array<unsigned int, kYHistogramSize> yHistogram;
> > 
> > Same comments here.
> 
> This also wraps, but since a single bin is increased by 1
> for a single pixel in the worst case scenario a single bin
> gets increased to the total amount of sampled pixels which
> will easily fit in an uint32_t, and this is an array so
> size matters (for cacheline usage) so lets make this
> an uint32_t array.
> 
> >> +/**
> >> + * \brief Reset state to start statistics gathering for a new frame.
> >> + *
> >> + * This may only be called after a successful setWindow() call.
> > 
> > It would be nice to design the API in a way that makes this kind of
> > logic error impossible, or at least more difficult.
> 
> Ack, I guess we can check for an empty window and then log
> an error?

Or maybe set a default window in configure() ?

> >> + */
> >> +void SwStatsCpu::startFrame(void)
> >> +{
> >> +	stats_.sumR_ = 0;
> >> +	stats_.sumB_ = 0;
> >> +	stats_.sumG_ = 0;
> >> +	stats_.yHistogram.fill(0);
> >> +}
> >> +
> >> +/**
> >> + * \brief Finish statistics calculation for the current frame.
> >> + *
> >> + * This may only be called after a successful setWindow() call.
> >> + */
> >> +void SwStatsCpu::finishFrame(void)
> >> +{
> >> +	*sharedStats_ = stats_;
> > 
> > Is it more efficient to copy the stats instead of operating directly on
> > the shared memory ?
> 
> I inherited doing things this way from Andrey. I kept this because
> we don't really have any synchronization with the IPA reading this.
> 
> So the idea is to only touch this when the next set of statistics
> is ready since we don't know when the IPA is done with accessing
> the previous set of statistics ...
> 
> This is both something which seems mostly a theoretic problem,
> yet also definitely something which I think we need to fix.
> 
> Maybe use a ringbuffer of stats buffers and pass the index into
> the ringbuffer to the emit signal ?

That would match how we deal with hardware ISPs, and I think that's a
good idea. It will help decoupling the processing side from the IPA.

> Milan can you add a SoftwareISP-TODO document for the next
> version and mention this as something which we need to look at
> in the future ?
> 
> >> +	statsReady.emit(0);
> 
> <snip>
> 
> >> +	int configure(const StreamConfiguration &inputCfg);
> >> +	void setWindow(Rectangle window);
> > 
> > const Rectangle &window
> > 
> > Any reason to separate the two functions ?
> 
> The inputCfg gets set once before starting streaming, the window
> can be updated on the fly, to e.g. track a face (if we ever add
> face tracking).

Good reason.

> >> +	void startFrame();
> >> +	void finishFrame();
> >> +
> >> +	/**
> >> +	 * \brief Process line 0.
> >> +	 * \param[in] y The y coordinate.
> >> +	 * \param[in] src The input data.
> >> +	 *
> >> +	 * This function processes line 0 for input formats with patternSize height == 1.
> >> +	 * It'll process line 0 and 1 for input formats with patternSize height >= 2.
> >> +	 * This function may only be called after a successful setWindow() call.
> >> +	 */
> >> +	void processLine0(unsigned int y, const uint8_t *src[])
> >> +	{
> >> +		if ((y & ySkipMask_) || y < (unsigned int)window_.y ||
> > 
> > C++ casts please.
> > 
> >> +		    y >= (window_.y + window_.height))
> >> +			return;
> >> +
> >> +		(this->*stats0_)(src);
> >> +	}
> > 
> > Does making the function inline improve performances ?
> 
> That is the idea yes. I've not measured this separately but this
> was part of a bigger rework for performance reasons and that
> entire rework did speed up things by a factor 2x - 3x .
> 
> >> +
> >> +	/**
> >> +	 * \brief Process line 2 and 3.
> >> +	 * \param[in] y The y coordinate.
> >> +	 * \param[in] src The input data.
> >> +	 *
> >> +	 * This function processes line 2 and 3 for input formats with patternSize height == 4.
> >> +	 * This function may only be called after a successful setWindow() call.
> >> +	 */
> >> +	void processLine2(unsigned int y, const uint8_t *src[])
> >> +	{
> >> +		if ((y & ySkipMask_) || y < (unsigned int)window_.y ||
> >> +		    y >= (window_.y + window_.height))
> >> +			return;
> >> +
> >> +		(this->*stats2_)(src);
> >> +	}
> >> +
> >> +	/**
> >> +	 * \brief Signals that the statistics are ready.
> >> +	 *
> >> +	 * The int parameter isn't actually used.
> > 
> > Then drop it :-)
> > 
> > 	Signal<> statsReady;
> > 
> > is perfectly valid.
> 
> Ah so just empty? This was remarked on a previous version (possibly by you?)
> and I think I tried adding <void> as the more of a C-programmer that I am
> and that definitely did not work.

Yes, if you don't want any argument to the signal just leave it empty. I
initially tried to write Signal<void> when implementing the Signal
class, but realized that syntax wouldn't work.

> Milan can you give this a try.
> 
> > But better, I wonder if the signal could be dropped completely. The
> > SwStatsCpu class does not operate asynchronously. Shouldn't whoever
> > calls the finishFrame() function then handle emitting the signal ?
> > 
> > Now, the trouble is that this would be the DebayerCpu class, whose name
> > doesn't indicate as a prime candidate to handle stats. However, it
> > already exposes a getStatsFD() function, so we're already calling for
> > trouble :-) Either that should be moved to somewhere else, or the class
> > should be renamed. Considering that the class applies colour gains in
> > addition to performing the interpolation, it may be more of a naming
> > issue.
> > 
> > Removing the signal and refactoring those classes doesn't have to be
> > addressed now, I think it would be part of a larger refactoring
> > (possibly also considering platforms that have no ISP but can produce
> > stats in hardware, such as the i.MX7), but please keep it on your radar.
> 
> Right, I think this is all stuff to add to the SoftwareISP-TODO doc
> which I suggested above.

Yes, I'm fine with that.

-- 
Regards,

Laurent Pinchart


More information about the libcamera-devel mailing list