[libcamera-devel] libcamera transform control

Tue Jul 21 11:19:30 CEST 2020

Hi again

Replying again to my own discussion (I seem to be making a habit of this...)

On Mon, 20 Jul 2020 at 08:59, David Plowman
<david.plowman at raspberrypi.com> wrote:
>
> Hi Laurent, Naush and everyone!
>
> Thanks for all the discussion while I was away!
>
> On Mon, 13 Jul 2020 at 23:27, Laurent Pinchart
> <laurent.pinchart at ideasonboard.com> wrote:
> >
> > Hi Naush,
> >
> > On Mon, Jul 13, 2020 at 09:16:19AM +0100, Naushir Patuck wrote:
> > > Hi Laurent,
> > >
> > > David is away this week, so I'll reply with some thoughts and comments
> > > in his absence.
> > >
> > > On Sun, 12 Jul 2020 at 00:28, Laurent Pinchart wrote:
> > > > On Thu, Jul 09, 2020 at 09:13:56AM +0100, David Plowman wrote:
> > > > > Replying to my own post to add comments from Naush and Jacopo!
> > > > >
> > > > > On Mon, 6 Jul 2020 at 14:32, David Plowman wrote:
> > > > > >
> > > > > > Hi everyone
> > > > > >
> > > > > > This email takes the previous discussion of image transformations and
> > > > > > moves it into a thread of its own.
> > > > > >
> > > > > > The problem at hand is how to specify the 2D planar transformation to
> > > > > > be applied to a camera image. These are well known to us as flips,
> > > > > > rotations and so on (the usual eight members of dihedral group D4).
> > > > > >
> > > > > > Laurent's suggestion was to apply them as standard libcamera controls.
> > > > > > Some platforms may be able to apply them on a per-frame basis; others
> > > > > > will not (they will require the transform to be set before the camera
> > > > > > > is started - once this mechanism exists). (Is there any way to signal a
> > > > > > platform's capabilities in this regard?)
> > > >
> > > > Not at the moment, but I think we should add that information to the
> > > > ControlInfo class. It should be fairly straightforward, we can add a
> > > > flags field, with a single field defined for now to tell whether the
> > > > control can be modified per frame or has to be set before starting the
> > > > camera.
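
A minimal sketch of what such a flags field could look like. Everything
here (SketchControlInfo, ControlFlagPerFrame, canSet) is hypothetical and
is not the existing libcamera ControlInfo API; it only illustrates the
single-flag idea described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; only one is defined for now.
enum ControlFlag : std::uint32_t {
    ControlFlagPerFrame = 1u << 0, // may be changed while the camera runs
};

// Hypothetical stand-in for a ControlInfo with an added flags field.
struct SketchControlInfo {
    int min;
    int max;
    std::uint32_t flags;

    bool perFrame() const { return (flags & ControlFlagPerFrame) != 0; }
};

// A control without the per-frame flag can only be set while stopped.
bool canSet(const SketchControlInfo &info, bool cameraRunning)
{
    return !cameraRunning || info.perFrame();
}
```

An application could query this before queueing a request and fall back
to a stop/start sequence when canSet() returns false.
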
> > >
> > > That would work.  However, I do struggle to see what would require an
> > > application to change the orientation on a per-frame basis.  In
> > > particular, if the change in orientation requires a transpose
> > > operation, then the buffer sizing might also change due to alignment
> > > constraints, and these would almost certainly require a set of buffer
> > > re-allocations unless you pre-allocate for the largest possible size.
> > > > Of course, that is not to say that there may not be other controls that
> > > can only be set on startup (I cannot think of any specific ones right
> > > now).
> >
> > While we could in theory support a change of rotation at runtime
> > (assuming the kernel drivers and APIs would let us do so), I don't think
> > > that would be particularly valuable. The main use case I could imagine
> > would be flipping the image horizontally, a.k.a. mirroring, for video
> > conferencing use cases where the user may want to enable or disable
> > mirroring for the local display.  One could however argue that platforms
> > used for video conferencing with a local display would likely have a GPU
> > that would handle the composition, making mirroring easy to implement on
> > the display side, but there could be other similar use cases on less
> > powerful platforms. We could force a stop/start sequence in that case,
> > but are there drawbacks in allowing this to be changed at runtime ?
>
> To me it seems a reasonable compromise to go with the standard Control
> mechanism, but perhaps to add a flag indicating whether you have to
> apply the control _before_ starting the camera, or whether you can
> apply it after. As Naush says, it could get quite hairy trying to do
> it in the sensor, or if you had an inline ISP that was doing it, and
> figuring out which frame after the request would actually have the
> change might be awkward.

I wanted to add a couple of further comments to this. Quite commonly,
transforms may be implemented in the sensor. This means that whenever
you set or change the transform, the pixel format of the raw stream
may change too. In fact I've already seen this kind of problem - under
some circumstances the Bayer order recorded in qcam's DNG files can be
wrong.
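
To illustrate why the raw format moves, here is a minimal sketch. It
assumes the sensor implements flips by reversing the readout direction
on an even-sized array, without shifting the crop window to compensate
(a sensor that does compensate would keep its Bayer order). The function
name flipBayerOrder and the four-character string representation are
made up for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// `order` is the 2x2 Bayer tile in raster order, e.g. "RGGB" means
//   R G
//   G B
// Returns the tile seen at the top-left corner after the flips.
std::string flipBayerOrder(std::string order, bool hflip, bool vflip)
{
    assert(order.size() == 4);
    if (hflip) {
        // Mirroring left-right swaps the two columns of the tile.
        std::swap(order[0], order[1]);
        std::swap(order[2], order[3]);
    }
    if (vflip) {
        // Mirroring top-bottom swaps the two rows of the tile.
        std::swap(order[0], order[2]);
        std::swap(order[1], order[3]);
    }
    return order;
}
```

Under these assumptions an RGGB sensor reports GRBG with hflip, GBRG
with vflip and BGGR with both, so anything that caches the raw format
from before the transform was set ends up with the wrong order.
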

So we need to consider the interaction of the transform control with
the raw stream format. Specifically:

* If we change the transform at run time, are we prepared for the raw
stream format to change spontaneously? Would requests have to contain
metadata saying what the pixel format of the raw stream was for that
capture?

* We've talked about passing an initial ControlList into either the
configure or start method. Whichever approach one takes, you've got to
be careful about exactly when the raw stream pixel format may or may
not be correct.

All this keeps nudging me back to an initial thought that the
transform should be a bona fide field in the stream configuration. You
set it before calling configure, and it never changes. All these
problems simply disappear.

Thoughts?

Thanks! (and sorry for being awkward...)
David

>
> >
> > > > > > For the time being we are ignoring the possibility that some platforms
> > > > > > might be able to apply different transformations to different streams
> > > > > > (I believe there is in any case no mechanism for per-stream controls).
> > > >
> > > > > Indeed. That's something we may add in the future, but let's try to
> > > > avoid it if possible :-)
> > > >
> > > > > > Note also that raw streams will always have the orientation with which
> > > > > > they come out of the camera, though again I don't believe we have a
> > > > > > convenient way for a platform to indicate whether this includes the
> > > > > > transform or not  (perhaps a stream could indicate its transform
> > > > > > relative to the requested transform?).
>
> How do folks feel about this? It seems like a useful feature to me.
>
> > > > > >
> > > > > > We propose to represent the transforms as "int" controls, in fact
> > > > > > being members of an enum with eight entries. Further, we suggest that
> > > > > > the first four entries are "identity", "hflip", "vflip", "h+vflip",
> > > > > > meaning the control's maximum value can indicate whether transforms
> > > > > > that involve a transposition are excluded.
> > > >
> > > > > An enum sounds good. How would you name the next four entries ? :-)
>
> So I think I'd have a class or enum class (or equivalent) using 3 bits
> to represent a transform. For example, bit 0 = hflip, bit 1 = vflip,
> bit 2 = transpose. By convention we might say that the transpose is
> applied after the flips. I'd also add some aliases for transforms with
> familiar names. And we should define a * operator to compose
> transforms. Oh, and an inverse operator too.
>
> I always hate naming stuff, but here's an example. Maybe use MIRROR
> instead of FLIP?
>
> IDENTITY = ROT0 = 0,
> HFLIP = 1,
> VFLIP = 2,
> HVFLIP = ROT180 = 3,
> TRANSPOSE = 4,
> ROT270 = 5,
> ROT90 = 6,
> TRANSPOSE_ROT180 = 7
>
> You'll notice I have no natural name for the last entry! (any offers?)
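
To make the encoding concrete, here is a minimal sketch of the compose
and inverse operations for the table above. It is an illustration only,
not libcamera code. The helper names (swapFlips, compose, inverse) are
made up, and the argument order of compose (apply `first`, then
`second`) is an assumption; an operator* could wrap it once a convention
is agreed.

```cpp
#include <cassert>

// bit 0 = hflip, bit 1 = vflip, bit 2 = transpose (applied after the flips)
enum Transform : unsigned {
    IDENTITY = 0, ROT0 = 0,
    HFLIP = 1,
    VFLIP = 2,
    HVFLIP = 3, ROT180 = 3,
    TRANSPOSE = 4,
    ROT270 = 5,
    ROT90 = 6,
    TRANSPOSE_ROT180 = 7
};

// Exchange the hflip and vflip bits, leaving the transpose bit alone.
constexpr unsigned swapFlips(unsigned t)
{
    return (t & 4u) | ((t & 1u) << 1) | ((t & 2u) >> 1);
}

// Apply `first`, then `second`. If `first` ends with a transpose, moving
// that transpose past the flips of `second` turns hflips into vflips and
// vice versa; after that all three bits simply xor together.
constexpr Transform compose(Transform first, Transform second)
{
    return static_cast<Transform>(
        first ^ ((first & TRANSPOSE) ? swapFlips(second) : second));
}

// Pure flips are their own inverse; with a transpose the flips swap roles.
constexpr Transform inverse(Transform t)
{
    return static_cast<Transform>((t & TRANSPOSE) ? swapFlips(t) : t);
}

static_assert(compose(ROT90, ROT90) == ROT180, "two quarter turns");
static_assert(inverse(ROT90) == ROT270, "quarter turns invert each other");
```

With this encoding the unnamed eighth entry is its own inverse, like the
plain transpose.
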
>
> > > >
> > > > > > Naush:
> > > > > >
> > > > > > ... the pipeline handlers need to maintain a list of
> > > > > > controls supported, like in include/libcamera/ipa/raspberrypi.h
> > > > > > (RPiControls).  This then gets exported to the CameraData base class
> > > > > > > member controlInfo_.  A Request will not allow setting a
> > > > > > libcamera::control that is not listed in CameraData::controlInfo_
> > > > >
> > > > > Jacopo:
> > > > >
> > > > > That's correct, it seems to me from this discussion, a ControlInfo
> > > > > could be augmented with information about a control being supported
> > > > > > per-frame or just at configuration time (or else...)
> > > >
> > > > Seems we agree :-)
> > > >
> > > > Please also note that include/libcamera/ipa/raspberrypi.h lists controls
> > > > that are supported by the IPA. If there are controls that are fully
> > > > implemented on the pipeline handler side without involving the IPA (I'm
> > > > not suggesting this is or is not the case here, it's a general remark),
> > > > they can be added to the control info map by the pipeline handler.
> > > >
> > > > > > Notes on the Raspberry Pi implementation:
> > > > > > Only transpose-less transforms will be allowed, and only before the
> > > >
> > > > What do you mean by transpose-less transforms ?
>
> So you can look at the list I made above. Another way to think of it
> is that a transpose-less transform preserves input rows of pixels as
> output rows of pixels. A transform with a transpose in it turns input
> rows of pixels into output *columns*.
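
As a small illustration of the distinction, assuming the 3-bit encoding
from the table above (bit 2 = transpose): a transpose-less transform
keeps the frame dimensions, while one with a transpose swaps width and
height, which is where the buffer sizing concern comes from. FrameSize,
isTransposeless and transformedSize are made-up names for this sketch.

```cpp
#include <cassert>

struct FrameSize {
    unsigned width;
    unsigned height;
};

constexpr unsigned kTransposeBit = 4; // bit 2 in the proposed encoding

// Values 0..3 (identity, hflip, vflip, h+vflip) have no transpose bit,
// so a control whose maximum is 3 excludes every transposing transform.
constexpr bool isTransposeless(unsigned transform)
{
    return (transform & kTransposeBit) == 0;
}

// Rows stay rows without a transpose; with one, rows become columns
// and the output dimensions are exchanged.
constexpr FrameSize transformedSize(FrameSize in, unsigned transform)
{
    return isTransposeless(transform) ? in : FrameSize{ in.height, in.width };
}
```
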
>
> > >
> > > Anything that involves only a combination of horizontal and vertical
> > > > flips would be a transpose-less transform, e.g. rot180, rot180 +
> > > mirror.  Any rot 90/270 would require a transpose operation, and we
> > > cannot do that with the hardware block.
> > >
> > > > > > camera is started. We will support them by setting the corresponding
> > > > > > control bits in the camera driver (so raw streams will include the
> > > > > > transform).
> > > >
> > > > Does this mean configuring h/v flip at the camera sensor level ? I
> > > > assume that will be the most usual case. I wonder if offline ISPs
> > > > typically support h/v flip.
> > >
> > > The RPi ISP does support it.  However, we do find it much much easier
> > > overall if the flips occurred at the source.
> > >
> > > > > > This means the ALSC (Auto Lens Shading) algorithm will
> > > > > > have to know whether camera images are oriented differently from the
> > > > > > calibration images.
> > > >
> > > > How do you think we should convey the orientation of the calibration
> > > > data ?
> > >
> > > We could have a field in our tuning file specifying the orientation of
> > > the calibration table - or always enforce the calibration must happen
> > > at rot 0.  Then any orientation change can be passed into the IPA (via
> > > controls?) so that the calibration table can be flipped appropriately.
> > > Personally, I prefer to have all calibration data fixed at rot 0, but
> > > David may have other opinions :)  However, all this is up to vendors
> > > to decide for themselves I suppose.
> >
> > I think I agree with you here, it seems less confusing and error-prone
> > to standardize the calibration rotation, but I may be missing something.
>
> Down at the Raspberry Pi level, I think it might be natural to include
> the camera transform in our DeviceStatus metadata. This makes it
> immediately available to any control algorithms that care.
>
> Doing lens shading calibration with the identity transform certainly
> makes sense and reduces confusion. But you know someone somewhere will
> get themselves in a muddle and discover that they've got their
> calibration images the wrong way round.
>
> So we _will_ end up with a "calibration transform" field in ALSC
> recording the orientation of those calibration images. However, with
> those compose and inverse operations defined for transforms, it will
> merely be a (slightly head-scratching) one-liner to deduce the right
> transform for the output tables!
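
A sketch of that one-liner, under stated assumptions: both transforms
are expressed relative to the sensor's native orientation, the stored
tables are in the orientation of the calibration images, and transforms
use the 3-bit encoding from earlier in the thread. The names
(swapHV, applyThen, invert, tableTransform) are hypothetical and this is
not the actual ALSC code.

```cpp
#include <cassert>

// bit 0 = hflip, bit 1 = vflip, bit 2 = transpose (applied after the flips)
constexpr unsigned kTranspose = 4;

// Exchange the hflip and vflip bits, keeping the transpose bit.
constexpr unsigned swapHV(unsigned t)
{
    return (t & 4u) | ((t & 1u) << 1) | ((t & 2u) >> 1);
}

// Apply `first`, then `second`.
constexpr unsigned applyThen(unsigned first, unsigned second)
{
    return first ^ ((first & kTranspose) ? swapHV(second) : second);
}

constexpr unsigned invert(unsigned t)
{
    return (t & kTranspose) ? swapHV(t) : t;
}

// Undo the calibration orientation to get back to the native one,
// then apply the transform the camera is currently running with.
constexpr unsigned tableTransform(unsigned calibration, unsigned current)
{
    return applyThen(invert(calibration), current);
}
```

If the calibration images were captured with the same transform the
camera is using now, the result is the identity and the tables can be
used unchanged.
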
>
> Thanks and best regards
> David
>
> >
> > > > > > Thoughts?
> >
> > --
> > Regards,
> >
> > Laurent Pinchart