[libcamera-devel] [RFC PATCH 0/2] Sensor mode hints

David Plowman david.plowman at raspberrypi.com
Wed Sep 22 11:35:27 CEST 2021


Hi Jacopo, Naush

Thanks for joining this discussion, I think that's great!

I'll address some of Jacopo's questions further down, but it's
probably worth talking about the context a bit more first as that will
inform the answers I give.

The problem at hand is that of letting applications get hold of
"correct" sensor modes. This means things like "I want the fastest
frame rate possible", or "I only want full FoV modes". Or even things
like "I want low and high resolution modes that share the same FoV". I
see a couple of approaches.

Approach A: here we provide the ability to select known sensor modes
exactly, for example by specifying the resolution, bit-depth etc. This
does indeed work and is actually more-or-less where we are at the
moment.

Approach B: alternatively libcamera can provide applications with a
list of the available sensor modes so that they can (optionally)
figure out for themselves what they want. In time this would ideally
contain information about FoV, framerates and so on. It's obviously
more complicated than approach A, but it subsumes everything that
approach A can do.

But I think the most important difference is this:

* Approach A gives applications no way to figure out anything about
sensor modes for themselves. It is the approach that will lead us to
applications that hardcode particular sensor mode choices, or that
carry lookup tables of sensor modes for certain supported sensors.

To be honest I think I could live with approach A, but approach B
feels preferable to me. What do others think?
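To make approach B concrete, here's a minimal sketch of the kind of
selection an application could do over an advertised mode list. The
`SensorMode` struct and `pickMode` helper are illustrative stand-ins,
not the proposed libcamera API; the example modes happen to match the
documented IMX477 modes, but any sensor's list would do.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

/* Hypothetical stand-in for the proposed SensorMode; field names are
 * illustrative, not final libcamera API. */
struct SensorMode {
	unsigned int width;
	unsigned int height;
	unsigned int bitDepth;
};

/* Approach B in miniature: the application inspects the advertised
 * mode list and picks the smallest mode that still covers the
 * requested output size at the requested bit depth. */
std::optional<SensorMode> pickMode(const std::vector<SensorMode> &modes,
				   unsigned int minWidth,
				   unsigned int minHeight,
				   unsigned int minBitDepth)
{
	std::optional<SensorMode> best;

	for (const SensorMode &m : modes) {
		if (m.width < minWidth || m.height < minHeight ||
		    m.bitDepth < minBitDepth)
			continue;
		/* Prefer the smallest qualifying mode (least readout work). */
		if (!best || m.width * m.height < best->width * best->height)
			best = m;
	}

	return best;
}
```

The point is that no sensor-specific lookup table appears anywhere: the
policy ("smallest 12-bit mode covering 1080p", say) is generic, and the
mode list supplies the sensor-specific facts.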

On Wed, 22 Sept 2021 at 09:11, Naushir Patuck <naush at raspberrypi.com> wrote:
>
> Hi Jacopo,
>
> On Tue, 21 Sept 2021 at 14:47, Jacopo Mondi <jacopo at jmondi.org> wrote:
>>
>> Hi David,
>>
>> On Thu, Sep 16, 2021 at 02:20:13PM +0100, David Plowman wrote:
>> > Hi everyone
>> >
>> > Here's a first attempt at functionality that allows applications to
>> > provide "hints" as to what kind of camera mode they want.
>> >
>> > 1. Bit Depths and Sizes
>> >
>> > So far I'm allowing hints about bit depth and the image sizes that can
>> > be read out of a sensor, and I've gathered these together into
>> > something I've called a SensorMode.
>> >
>> > I've added a SensorMode field to the CameraConfiguration so that
>> > applications can optionally put in there what they think they want,
>> > and also a CameraSensor::getSensorModes function that returns a list
>> > of the supported modes (raw formats only...).
>>
>> This is more about the theory, but do you think it's a good idea
>> to allow applications to be written based on some sensor-specific
>> parameter? I mean, I understand it's entirely plausible to write a
>> specialized application for, say, RPi + imx477. This assumes the
>> developer knows the sensor modes, the sensor configuration used to
>> produce them, and low-level details of the sensor. Most of that
>> information is assumed to be known by the developer also because
>> there's currently no way for V4L2 to expose how a mode has been
>> realized in the sensor, whether through binning, skipping or
>> cropping. It can probably be deduced by looking at the analogue
>> crop and the full pixel array size, but it's a guessing exercise.
>>
>> Now, if we assume that a very specialized application knows the
>> sensor it deals with, what is the use case for writing a generic
>> application that instead inspects the (limited) information about
>> the sensor's modes and selects its favourite one generically?
>> Wouldn't it be better to just assume the application knows
>> precisely what RAW formats it wants and adds a StreamConfiguration
>> for them?

I agree this is the key question. Do we want applications that know
precisely what mode they want, or do we want there to be the option
(it's not compulsory!) for applications to make smarter mode choices
in a more general way?

>
>
> This type of usage may be more relevant for Raspberry Pi than for other
> vendors, though it does not need to be, of course.  Our current
> (non-libcamera based) raspicam applications allow the user to select a
> specific sensor mode with the "-md" command line parameter, an integer
> specifying the mode index to use.  The modes for each of our sensors are
> fully documented, so the user knows the exact width/height/crop/bin/bit-depth
> used.  This is one of those features that our users have found extremely
> useful in a wide range of situations, and we can't really live without it.
> Examples include using a low bit-depth mode for fast fps operation, using a
> binned mode for better noise performance, or forcing a specific FoV, to
> name a few.
>
> The idea behind this change is to empower users to do similar things with
> our libcamera based apps.  Currently, there is no full substitute for our
> existing mechanism.  The goal is not really to write an application that is
> specialised to a particular sensor; rather, we give the user a mechanism to
> choose a particular advertised mode.  This can be done either via some
> command line parameters or even programmatically; that's up to the
> user/app.  So applications still remain generic, but now allow the users a
> certain level of sensor-specific control.  Of course, an application may
> choose not to bother with any of this, and then we revert to the pipeline
> handlers having full choice of the sensor mode to use, like we do right now.
>
> I'll let David respond to the below comments in more detail, but I just wanted to
> provide some further context for what this change is about.  Hope that helps!
>
> Regards,
> Naush
>
>>
>>
>> >
>> > There are various ways an application could use this:
>> >
>> > * It might not care and would ignore the new field altogether. The
>> >   pipeline handler will stick to its current behaviour.
>> >
>> > * It might have some notion of what it wants, perhaps a larger bit
>> >   depth, and/or a range of sizes. It can fill some or all of those
>> >   into the SensorMode and the pipeline handler should respect it.
>> >
>> > * Or it could query the CameraSensor for its list of SensorModes and
>> >   then sift through them looking for the one that it likes best.
>>
>> This is the part I fail to fully grasp. Could you summarize what
>> parameters would drive the mode selection policy in the
>> application?

Bit-depth and size are the most obvious and are the initial ones I
proposed. But I think we should be expecting to extend this if
possible to FoV and framerate.
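The "hint" side of the proposal can be sketched the same way: the
application fills in only the fields it cares about, and the pipeline
handler must respect the ones that are set. This is a hypothetical
sketch of that matching logic, not the patch's actual code; the names
`SensorModeHint` and `matchesHint` are mine.

```cpp
#include <optional>

/* A partially-filled hint, as proposed for CameraConfiguration: any
 * field the application leaves unset is free for the pipeline handler
 * to choose. Names are illustrative, not final libcamera API. */
struct SensorModeHint {
	std::optional<unsigned int> bitDepth;
	std::optional<unsigned int> width;
	std::optional<unsigned int> height;
};

struct SensorMode {
	unsigned int width;
	unsigned int height;
	unsigned int bitDepth;
};

/* A mode satisfies the hint if every field the application filled in
 * matches; unset fields constrain nothing. */
bool matchesHint(const SensorMode &mode, const SensorModeHint &hint)
{
	if (hint.bitDepth && *hint.bitDepth != mode.bitDepth)
		return false;
	if (hint.width && *hint.width != mode.width)
		return false;
	if (hint.height && *hint.height != mode.height)
		return false;
	return true;
}
```

An empty hint matches every mode, which is exactly the "application
doesn't care, pipeline handler keeps its current behaviour" case from
the cover letter.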

>>
>> >
>> > 2. Field of View and Framerates
>> >
>> > The SensorMode should probably include FoV and framerate information
>>
>> FoV is problematic, I understand. We have a property that describes
>> the pixel array size, and I understand comparing the RAW output size
>> is not enough, as the same resolution can theoretically be obtained
>> by cropping or subsampling. I would not be opposed to reporting it
>> somehow, as it might represent an important selection criterion.
>>
>> Duration, on the other hand: can't it be read from the limits of
>> controls::FrameDurationLimits? I do expect those controls to be
>> populated with the sensor's durations (see
>> https://git.linuxtv.org/libcamera.git/tree/src/ipa/ipu3/ipu3.cpp#n256)
>>
>> Sure, applications would have to try all the supported RAW modes,
>> configure the camera with each of them, and read back the control
>> value.
>>
>> > so that applications can make intelligent choices automatically.
>> > However, this is a bit trickier for various reasons so I've left it
>> > out. There could be a later phase of work that adds these.
>> >
>> > Even without this, however, the implementation gets us out of our
>> > rather critical hole where we simply can't get 10-bit modes. It also
>>
>> Help me out here: why can't you select a RAW format with 10bpp ?

Mainly because I've been working on the assumption that we will
ultimately move to something more like "approach B", and so baking in
more of "approach A" now is bad. In particular, I don't really see any
future path from A to B. Also, I still have some niggles with the use
of the raw stream for this purpose:

* It still feels a bit kludgey to me if I have to ask for a raw stream
to force a sensor mode choice, even when I actually have no interest
in receiving the raw stream. But other opinions may differ there?

* The raw stream mixes up the sensor mode and its format in memory.
For example, if I want a particular sensor mode, why do I have to
specify the packing? I have no interest in the packing and want the
pipeline handler to make the most efficient choice.

* On the other hand, there will be times when I am interested in the
format in memory, presumably because I want to do something with the
pixels. Here, requesting the raw stream and specifying a packing
format (probably an unpacked one!) makes sense.

* As an aside, I see a future where we have compressed in-memory
formats too. Perhaps we just treat that as (a particularly opaque form
of) packing?
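The separation argued for in the last three points can be made concrete
with a tiny sketch: the sensor mode and the in-memory layout become
orthogonal axes, an unset packing means "pipeline handler decides", and
a compressed format is just one more packing value. All names here
(`Packing`, `RawStreamRequest`, `resolvePacking`) are illustrative
assumptions, not libcamera API, and the default chosen below is just
one plausible policy.

```cpp
#include <optional>

/* The in-memory layout of a raw stream, kept separate from the sensor
 * mode itself. A compressed format is treated as a particularly opaque
 * form of packing, as suggested in the aside above. */
enum class Packing { Unpacked, CSI2Packed, Compressed };

struct RawStreamRequest {
	unsigned int bitDepth;
	std::optional<Packing> packing; /* unset: pipeline handler decides */
};

/* If the application didn't ask for a layout, the pipeline handler is
 * free to pick the most bandwidth-efficient representation. */
Packing resolvePacking(const RawStreamRequest &req)
{
	return req.packing.value_or(Packing::CSI2Packed);
}
```

An application that only wants to force a sensor mode would leave
`packing` unset; one that intends to touch the pixels would likely
request `Packing::Unpacked` explicitly.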

>>
>> > provides a better alternative to the current nasty practice of
>> > requesting a raw stream specifically to bodge the camera mode
>> > selection, even when the raw stream is not actually wanted!
>>
>> I see it the other way around actually :) Tying applications to
>> sensor modes goes in the opposite direction of 'abstracting camera
>> implementation details from applications' (tm)
>>
>> It also goes in the opposite direction of the long-term dream of
>> having sensor drivers not tied to the few fixed modes that the
>> producers gave us to start with, but I get this is a bit far-fetched.
>>
>> Of course the line between abstraction and control is, as usual,
>> drawn in the sand, and I might be too concerned about exposing
>> sensor details in our API, even if in an opt-in way.

Going back to my original paragraphs: my view is that approach A ties
an application to particular sensors and lists of known modes for
those sensors. Approach B gives applications the possibility (if they
want it) to be free of such look-up tables.

I hope that explains the context a bit better, but most importantly, I
would agree that figuring out what we actually want is the first step!

Thanks!
David

>>
>> Thanks
>>    j
>>
>> >
>> > 3. There are 2 commits here
>> >
>> > The first adds the SensorMode class, puts it into the
>> > CameraConfiguration, and allows the supported modes to be listed from
>> > the CameraSensor. (All the non-Pi stuff.)
>> >
>> > The second commit updates our mode selection code to select according
>> > to the hinted SensorMode (figuring out defaults if it was empty). But
>> > it essentially works just the same, if in a slightly more generic way.
>> >
>> > The code here is fully functional and seems to work fine. Would other
>> > pipeline handlers be able to adapt to the idea of a "hinted
>> > SensorMode" as easily?
>> >
>> >
>> > As always, I'm looking forward to people's thoughts!
>> >
>> > Thanks
>> > David
>> >
>> > David Plowman (2):
>> >   libcamera: Add SensorMode class
>> >   libcamera: pipeline_handler: raspberrypi: Handle the new SensorMode
>> >     hint
>> >
>> >  include/libcamera/camera.h                    |   3 +
>> >  include/libcamera/internal/camera_sensor.h    |   4 +
>> >  include/libcamera/meson.build                 |   1 +
>> >  include/libcamera/sensor_mode.h               |  50 +++++++++
>> >  src/libcamera/camera_sensor.cpp               |  15 +++
>> >  src/libcamera/meson.build                     |   1 +
>> >  .../pipeline/raspberrypi/raspberrypi.cpp      | 105 +++++++++++++-----
>> >  src/libcamera/sensor_mode.cpp                 |  60 ++++++++++
>> >  8 files changed, 212 insertions(+), 27 deletions(-)
>> >  create mode 100644 include/libcamera/sensor_mode.h
>> >  create mode 100644 src/libcamera/sensor_mode.cpp
>> >
>> > --
>> > 2.20.1
>> >

