[PATCH v6 08/18] libcamera: software_isp: Add DebayerCpu class
Laurent Pinchart
laurent.pinchart at ideasonboard.com
Wed Mar 27 20:37:29 CET 2024
Hi Hans,
On Wed, Mar 27, 2024 at 07:08:02PM +0100, Hans de Goede wrote:
> On 3/27/24 3:12 PM, Laurent Pinchart wrote:
> > Hi Milan and Hans,
> >
> > Thank you for the patch.
>
> Same as with the previous review: Thank you for the review.
>
> Milan is going to prepare a v7 of this series, so I'm going to <snip>
> all the trivial remarks and only comment on the remaining remarks.
>
> > On Tue, Mar 19, 2024 at 01:35:55PM +0100, Milan Zamazal wrote:
> >> From: Hans de Goede <hdegoede at redhat.com>
>
> <snip>
>
> >> +{
> >> +#ifdef __x86_64__
> >> + enableInputMemcpy_ = false;
> >> +#else
> >> + enableInputMemcpy_ = true;
> >> +#endif
> >
> > This is very much *not* nice :-( It needs to at least be explained
> > clearly somewhere.
>
> Yeah this really needs to be a parameter to the SoftwareISP
> constructor because some non x86 designs may also have
> fully cached buffers, so they can also skip the extra
> memcpy step.
>
> My suggestion would be to introduce a struct SoftwareIspParams
> with only a bool enableInputMemcpy_ in there for now and
> then in the simplepipeline handler instead of having a bool
> flag to indicate if the SoftwareISP should be list for a specific
> media-controller driver-name, have a pointer to this struct and
> have that pointer being non NULL indicate the swisp should
> be used.
>
> For now we would then have 2 SoftwareIspParams structs
> this pointer can point to (uncached and cached) but in the future
> we may very well have 1 SoftwareIspParams struct per supported
> mediactl driver.
>
> Actually since the IPU6 kernel side has not been merged yet,
> for now we can just replace this with a hardcoded:
>
> enableInputMemcpy_ = true;
>
> Milan, lets do that for the v7 submission.
Could you please also add a comment that explains why we want the memcpy
to be configurable ?
> Then I can add some mechanism to config this as a follow up
> patch when IPU6 has actually landed in the kernel and thus
> we can enable it in libcamera upstream.
>
> <snip>
>
> >> +void DebayerCpu::debayer10P_BGBG_BGR888(uint8_t *dst, const uint8_t *src[])
> >> +{
> >> + const int width_in_bytes = window_.width * 5 / 4;
> >
> > snakeCase here too, ane where applicable everywhere else.
> >
> >> + const uint8_t *prev = (const uint8_t *)src[0];
> >
> > Is the cast needed ? If so it should be a C++ cast.
>
> For the set of debayer10P_* functions introduced as the first
> supported bayer fmt these casts are not necessary and can
> be dropped.
>
> In "[PATCH v6 14/18] libcamera: debayer_cpu: Add support for 8, 10
> and 12 bpp unpacked bayer input" there are more casts and these
> are actually necessary and will need to be replaced with
> static c++ style casts.
Right, I hadn't seen that yet when reviewing this patch.
> >> + LOG(Debayer, Info)
> >
> > Shouldn't this be an Error mesage ? Same below.
>
> These functions are called when the simple pipeline handler
> is discovering supported formats so on CSI receivers or sensors
> supporting multiple formats this happen for one of the fmts
> is sorta expected behavior.
>
> I don't feel strongly about this though, so if you want this
> to be upgraded to an Error we can do that.
If it's an expected behaviour, I would make this a debug message then.
> >> + << "Unsupported input format " << inputFormat.toString();
> >> + return -EINVAL;
> >> +}
>
> <snip>
>
> >> + /* Measure before emitting signals */
> >> + if (measuredFrames_ < DebayerCpu::kLastFrameToMeasure &&
> >> + ++measuredFrames_ > DebayerCpu::kFramesToSkip) {
> >> + timespec frameEndTime = {};
> >> + clock_gettime(CLOCK_MONOTONIC_RAW, &frameEndTime);
> >> + frameProcessTime_ += timeDiff(frameEndTime, frameStartTime);
> >> + if (measuredFrames_ == DebayerCpu::kLastFrameToMeasure) {
> >> + const unsigned int measuredFrames = DebayerCpu::kLastFrameToMeasure -
> >> + DebayerCpu::kFramesToSkip;
> >> + LOG(Debayer, Info)
> >> + << "Processed " << measuredFrames
> >> + << " frames in " << frameProcessTime_ / 1000 << "us, "
> >> + << frameProcessTime_ / (1000 * measuredFrames)
> >> + << " us/frame";
> >> + }
> >> + }
> >
> > Is this debugging leftovers, or should it be cleaned up ? It seems quite
> > a bit of a hack. If we want to enable measurements, we shouldn't
> > hardcode them to frames 30 to 60.
>
> This is meant to be kept around, also see:
>
> [PATCH v6 16/18] libcamera: Add "Software ISP benchmarking" documentation
>
> the SoftwareISP is sensitive to performance regressions so this is
> intended as a simple builtin benchmark to check for those.
>
> So that means this falls under the "or should it be cleaned up" part
> of your question.
>
> You indicate you don't like the hardcoded frames 30 - 60. The idea is
> to warm up the caches a bit and then measure a bunch of frames, which
> is just what this does. What alternative approach are you suggesting?
I wonder if there would be a way to control at runtime when/how to
perform those measurements. Maybe that's a bit overkill. In any case, it
would be something to address later.
> >> + /**
> >> + * \brief Get the file descriptor for the statistics.
> >> + *
> >> + * \return the file descriptor pointing to the statistics.
> >> + */
> >> + const SharedFD &getStatsFD() { return stats_->getStatsFD(); }
> >
> > This,
>
> Note the statistics pass-through stuff is sort of a necessary evil
> since we want one main loop going over the data line by line and
> doing both debayering as well as stats while the line is still
> hot in the l2 cache. And things like the process2() and process4()
> loops are highly CPU debayering specific so I don't think we should
> move those out of the CpuDebayer code.
Yes, that I understood from the review. "necessary evil" is indeed the
right term :-) I expect it will take quite some design skills to balance
the need for performances and the need for a maintainable architecture.
> > plus the fact that this class handles colour gains and gamma,
> > makes me thing we have either a naming issue, or an architecture issue.
>
> I agree that this does a bit more then debayering, although
> the debayering really is the main thing it does.
>
> I guess the calculation of the rgb lookup tables which do the
> color gains and gamma could be moved outside of this class,
> that might even be beneficial for GPU based debayering assuming
> that that is going to use rgb lookup tables too (it could
> implement actual color gains + gamma correction in some different
> way).
>
> I think this falls under the lets wait until we have a GPU
> based SoftISP MVP/POC and then do some refactoring to see which
> bits should go where.
As long as we record this in the todo file (with a short summary of the
above), that's fine with me.
--
Regards,
Laurent Pinchart
More information about the libcamera-devel
mailing list