[libcamera-devel] [RFC PATCH] ipa: ipu3: Clear incoming parameter use flags
Laurent Pinchart
laurent.pinchart at ideasonboard.com
Fri Sep 10 22:15:23 CEST 2021
Hi Kieran,
On Fri, Sep 10, 2021 at 09:05:42PM +0100, Kieran Bingham wrote:
> On 10/09/2021 17:38, Laurent Pinchart wrote:
> > On Fri, Sep 10, 2021 at 05:24:12PM +0100, Kieran Bingham wrote:
> >> On 10/09/2021 16:49, Kieran Bingham wrote:
> >>> The incoming params buffer may contain uninitialised data, or the
> >>> parameters of previously queued frames. Clearing the entire buffer
> >>> may be an expensive operation, and the kernel will only read from
> >>> structures which have their associated use-flag set.
> >>>
> >>> It is the responsibility of the algorithms to set the use flags
> >>> accordingly for any data structure they update during prepare().
> >>
> >> To extend this commit message from the introduction:
> >>
> >> "Clear the use flags of the parameter buffer before passing the buffer
> >> to the algorithms during their prepare() operations."
> >>
> >>> Signed-off-by: Kieran Bingham <kieran.bingham at ideasonboard.com>
> >
> > Reviewed-by: Laurent Pinchart <laurent.pinchart at ideasonboard.com>
> >
> > But... this has led me to check ipu3_uapi_flags. It's defined as
> > follows:
> >
> > struct ipu3_uapi_flags {
> > __u32 gdc:1;
> > __u32 obgrid:1;
> > __u32 reserved1:30;
> >
> > __u32 acc_bnr:1;
> > __u32 acc_green_disparity:1;
> > __u32 acc_dm:1;
> > __u32 acc_ccm:1;
> > __u32 acc_gamma:1;
> > __u32 acc_csc:1;
> > __u32 acc_cds:1;
> > __u32 acc_shd:1;
> > __u32 reserved2:2;
> > __u32 acc_iefd:1;
> > __u32 acc_yds_c0:1;
> > __u32 acc_chnr_c0:1;
> > __u32 acc_y_ee_nr:1;
> > __u32 acc_yds:1;
> > __u32 acc_chnr:1;
> > __u32 acc_ytm:1;
> > __u32 acc_yds2:1;
> > __u32 acc_tcc:1;
> > __u32 acc_dpc:1;
> > __u32 acc_bds:1;
> > __u32 acc_anr:1;
> > __u32 acc_awb_fr:1;
> > __u32 acc_ae:1;
> > __u32 acc_af:1;
> > __u32 acc_awb:1;
> > __u32 reserved3:4;
> >
> > __u32 lin_vmem_params:1;
> > __u32 tnr3_vmem_params:1;
> > __u32 xnr3_vmem_params:1;
> > __u32 tnr3_dmem_params:1;
> > __u32 xnr3_dmem_params:1;
> > __u32 reserved4:1;
> > __u32 obgrid_param:1;
> > __u32 reserved5:25;
> > } __attribute__((packed));
>
> pahole gives:
>
> > struct ipu3_uapi_flags {
> > __u32 gdc:1; /* 0: 0 4 */
> > __u32 obgrid:1; /* 0: 1 4 */
> > __u32 reserved1:30; /* 0: 2 4 */
> > __u32 acc_bnr:1; /* 4: 0 4 */
> > __u32 acc_green_disparity:1; /* 4: 1 4 */
> > __u32 acc_dm:1; /* 4: 2 4 */
> > __u32 acc_ccm:1; /* 4: 3 4 */
> > __u32 acc_gamma:1; /* 4: 4 4 */
> > __u32 acc_csc:1; /* 4: 5 4 */
> > __u32 acc_cds:1; /* 4: 6 4 */
> > __u32 acc_shd:1; /* 4: 7 4 */
> > __u32 reserved2:2; /* 4: 8 4 */
> > __u32 acc_iefd:1; /* 4:10 4 */
> > __u32 acc_yds_c0:1; /* 4:11 4 */
> > __u32 acc_chnr_c0:1; /* 4:12 4 */
> > __u32 acc_y_ee_nr:1; /* 4:13 4 */
> > __u32 acc_yds:1; /* 4:14 4 */
> > __u32 acc_chnr:1; /* 4:15 4 */
> > __u32 acc_ytm:1; /* 4:16 4 */
> > __u32 acc_yds2:1; /* 4:17 4 */
> > __u32 acc_tcc:1; /* 4:18 4 */
> > __u32 acc_dpc:1; /* 4:19 4 */
> > __u32 acc_bds:1; /* 4:20 4 */
> > __u32 acc_anr:1; /* 4:21 4 */
> > __u32 acc_awb_fr:1; /* 4:22 4 */
> > __u32 acc_ae:1; /* 4:23 4 */
> > __u32 acc_af:1; /* 4:24 4 */
> > __u32 acc_awb:1; /* 4:25 4 */
> > __u32 reserved3:4; /* 4:26 4 */
> > __u32 lin_vmem_params:1; /* 4:30 4 */
> > __u32 tnr3_vmem_params:1; /* 4:31 4 */
> > __u32 xnr3_vmem_params:1; /* 8: 0 4 */
> > __u32 tnr3_dmem_params:1; /* 8: 1 4 */
> > __u32 xnr3_dmem_params:1; /* 8: 2 4 */
> > __u32 reserved4:1; /* 8: 3 4 */
> > __u32 obgrid_param:1; /* 8: 4 4 */
> > __u32 reserved5:25; /* 8: 5 4 */
> >
> > /* size: 12, cachelines: 1, members: 37 */
> > /* bit_padding: 2 bits */
> > /* last cacheline: 12 bytes */
> > };
>
> and it's used in the params structure:
>
> > struct ipu3_uapi_params {
> > struct ipu3_uapi_flags use __attribute__((__aligned__(32))); /* 0 12 */
> > struct ipu3_uapi_acc_param acc_param __attribute__((__aligned__(1))); /* 12 37408 */
> >
> > /* XXX last struct has 12 bytes of padding */
> >
> > /* --- cacheline 584 boundary (37376 bytes) was 44 bytes ago --- */
>
> So indeed, it's visibly only 12 bytes
>
> > The first group spans 32 bits (1+1+32). So does the last group
> > (1+1+1+1+1+1+1+25). Guess how many bits the second group spans? Hint:
> > it's not 32. I wonder if this operates as expected.
>
> This structure is read by the kernel, and userspace, not hardware as far
> as I'm aware.
>
> So both sides will be using the same offsets/alignments and padding to
> parse the structures.
>
> What part do you anticipate is not operating as expected?
Ah I was missing that part. I think the structure was intended to store
lin_vmem_params in the first bit of the last u32. But if it's not parsed
by hardware (or firmware), it's probably harmless.
> There are 2 bits of padding exposed by the structure. Is that what you
> are worried about?
>
> I would indeed expect that last reserved5 to be reserved5:27 if someone
> was trying to explicitly pad, but as long as the structures are
> identical on the user and kernel side they should use the same offsets.
I'd expect reserved3 to be 6 bits long instead. We could fix it, but
that will break the ABI.
> >>> ---
> >>>
> >>> Yes, the commit message is the same as the comment before the line.
> >>> I felt the text was worthy of documenting the clearing of the flags, and
> >>> ensuring that it's documented in the code that the algorithms are
> >>> responsible for setting their use flag of any structure they modify.
> >>>
> >>> Note that this is sent compile tested only, as it's something I noticed,
> >>> while writing documentation and wanted to check.
> >>>
> >>> src/ipa/ipu3/ipu3.cpp | 11 +++++++++++
> >>> 1 file changed, 11 insertions(+)
> >>>
> >>> diff --git a/src/ipa/ipu3/ipu3.cpp b/src/ipa/ipu3/ipu3.cpp
> >>> index c925cf642611..30d2a53903ec 100644
> >>> --- a/src/ipa/ipu3/ipu3.cpp
> >>> +++ b/src/ipa/ipu3/ipu3.cpp
> >>> @@ -457,6 +457,17 @@ void IPAIPU3::processControls([[maybe_unused]] unsigned int frame,
> >>>
> >>> void IPAIPU3::fillParams(unsigned int frame, ipu3_uapi_params *params)
> >>> {
> >>> + /*
> >>> + * The incoming params buffer may contain uninitialised data, or the
> >>> + * parameters of previously queued frames. Clearing the entire buffer
> >>> + * may be an expensive operation, and the kernel will only read from
> >>> + * structures which have their associated use-flag set.
> >>> + *
> >>> + * It is the responsibility of the algorithms to set the use flags
> >>> + * accordingly for any data structure they update during prepare().
> >>> + */
> >>> + params->use = {};
> >>> +
> >>> for (auto const &algo : algorithms_)
> >>> algo->prepare(context_, params);
> >>>
--
Regards,
Laurent Pinchart
More information about the libcamera-devel
mailing list