[libcamera-devel] Expected Request queuing behavior by apps

Naushir Patuck naush at raspberrypi.com
Fri Dec 16 11:21:17 CET 2022


Hi all,

A few thoughts from me below.

On Fri, 16 Dec 2022 at 09:05, Jacopo Mondi via libcamera-devel <
libcamera-devel at lists.libcamera.org> wrote:

> Hi Umang
>    thanks for starting the thread.
>
> On Fri, Dec 16, 2022 at 11:30:39AM +0530, Umang Jain via libcamera-devel
> wrote:
> > Hi everyone,
> >
> > Goal:
> > I am kicking off a thread to discuss the Request queuing behaviour that
> > libcamera can expect from the application's side. In recent discussions,
> > we have found that we don't yet advertise or document this aspect
> > explicitly. Hence, it would be helpful to do so, especially as it helps
> > with the internal plumbing of requests routing through the pipeline
> > handler(s), the IPA and the various request-holding placeholders used
> > for state tracking (queues, arrays, FCQueue, etc.) or even
> > post-processing.
> >
> > Assumptions up till this point:
> > At least what _I_ have assumed up until this point is that applications
> > queue a certain number of requests (let's say 4) and then continue
> > queuing requests from the `requestCompleted` callback handler (slot). I
> > see this behaviour implemented in our `libcamera/src/apps`. This
> > assumption inherently suggests that these applications are not trying to
> > over-queue to the hardware, hence respecting a certain depth of the
> > hardware pipeline. They make sure new requests get queued when previous
> > ones get completed. Another argument can be that reusing Request objects
> > is cheaper than constructing a new one for each queueRequest() and
> > queuing them all at once.
> >
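
This is indeed the pattern our sample apps follow.  For reference, a minimal
sketch of the recycling callback, roughly along the lines of simple-cam (it
assumes `camera` is a std::shared_ptr<Camera> kept at file scope and that
buffers were attached to the requests before the initial queueRequest()
calls):

static void requestComplete(Request *request)
{
        /* Requests cancelled at stop() time carry no valid data. */
        if (request->status() == Request::RequestCancelled)
                return;

        /* ... consume request->buffers() and request->metadata() here ... */

        /* Recycle the same Request (and its buffers) for the next frame. */
        request->reuse(Request::ReuseBuffers);
        camera->queueRequest(request);
}

/* Connected once, before Camera::start(): */
camera->requestCompleted.connect(requestComplete);
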
> > Defining the explicit behaviour expected from applications:
> > Note that in the last segment ("Assumptions") we do not explicitly say
> > that applications should queue requests in a certain manner.
> >
>
> I agree this is the canonical way, but it's certainly not the only one
>
> > Hence, it is time to define our expectations on what applications are
> > allowed or not allowed to do with respect to queuing requests. Below are
> > a few documentation headers to move the discussion forward.
> >
> > - Exposing the hardware capabilities
> > Should libcamera expose the underlying hardware capabilities/constraints
> > to applications and expect them to rate-limit the queuing of requests
> > based on these?
> >
>

An application ought to know to submit a minimum number of requests in
order to get the underlying hardware to generate an output stream.
However, I don't feel that an application should constrain itself on the
maximum number of requests.  This ought to be done in the framework, just
like your patch has done.  This sounds like what you describe in your "No
limit" section (*) below?

Additionally, my example of the application queueing 20 in-flight requests
and then just waiting for completion without recycling any further
requests may sound a bit contrived, but it does highlight that an
application is free to do it, and chances are one of them will!  If we
want to make restrictions here, it must be baked into the API.

>
> Let's start by clearly enumerating these, to see how we handle them
> today.
>
> I would start by listing the requirements of the current libcamera
> backend (V4L2) and see how these have surfaced to the current API and
> what we might want to change.
>
>
> 1) Startup:
>
>    V4L2 video devices have a requirement of having a minimum number of
>    buffers queued before streaming is started. The requirement comes
>    from the need to have enough memory buffers available to sustain
>    memory transfer and data capture operations.
>
>    The minimum number of buffers to be queued is a platform-specific
>    parameter.
>
>    The only documentation I found is in the vb2 framework documentation
>
> https://www.kernel.org/doc/html/latest/driver-api/media/v4l2-videobuf2.html?highlight=min_buffers_needed
>
>    libcamera currently queues buffers to the capture video devices
>    at queueRequest time only, hence applications need to queue enough
>    requests before streaming is actually started. The Camera state
>    machine demands that Camera::start() is called before any
>    Camera::queueRequest() can be called, but if applications do not
>    queue enough requests, streaming will never be started.
>
>    We currently do not have a clear way to express that, if not by
>    validating StreamConfiguration::bufferCount, which is set by
>    pipelines to known valid values for the platform, but nothing in
>    our API tells applications "you have to queue enough requests to
>    have the camera actually started", as far as I can see?
>
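
For reference, the usual application-side startup sequence looks roughly
like the sketch below: allocate buffers with FrameBufferAllocator, build
one request per buffer, and queue them all right after Camera::start() so
the capture device reaches its minimum queued-buffer threshold.  Error
handling and the CameraManager / acquire() / configure() boilerplate are
omitted:

/* Assumes "using namespace libcamera" and a configured camera/config. */
StreamConfiguration &streamConfig = config->at(0);
Stream *stream = streamConfig.stream();

FrameBufferAllocator allocator(camera);
allocator.allocate(stream);

std::vector<std::unique_ptr<Request>> requests;
for (const std::unique_ptr<FrameBuffer> &buffer : allocator.buffers(stream)) {
        std::unique_ptr<Request> request = camera->createRequest();
        request->addBuffer(stream, buffer.get());
        requests.push_back(std::move(request));
}

camera->start();
for (std::unique_ptr<Request> &request : requests)
        camera->queueRequest(request.get());
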
> 2) Runtime
>
>    Following the principle of "1 request - 1 buffer", which I guess we
>    aim to change in future, you can't queue more requests to a camera
>    than the number of buffers requested on a video device.
>
>    The number of request buffers is again usually defined by
>    StreamConfiguration::bufferCount, and it usually also dictates the
>    number of intermediate buffers the pipeline handler allocates
>    (think of the CIO2 buffers or the ISP parameter buffers as
>    examples).
>

>    Please note this is an issue for applications that allocate buffers
>    elsewhere (i.e. from a dmabuf-heap pool or on the display) and
>    import them into libcamera. If applications allocate buffers in the
>    Camera they will be limited to working with those buffers, and by
>    definition, they can't use more than what they have allocated.
>
>    It's interesting to notice how some pipelines had to take care of
>    that by themselves
>
>
> https://git.libcamera.org/libcamera/libcamera.git/tree/src/libcamera/pipeline/ipu3/ipu3.cpp#n853
>    89dae5844964 ("libcamera: pipeline: ipu3: Store requests in the case a
> buffer shortage")
>
>    Your work on throttling the number of requests in the base
>    PipelineHandler class according to the depth of the frame contexts
>    queue could be reconsidered in this new context: up to bufferCount
>    requests can be queued to the device at the same time, and the
>    PipelineHandler base class caches them on behalf of the pipeline
>    handlers. This will make StreamConfiguration::bufferCount a central
>    part of our API, and in that case the implications of choosing an
>    inaccurate bufferCount should be clearly documented.
>
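
To illustrate the idea (class and member names are entirely made up for
the example), the throttling could be a small helper that forwards at most
`limit` (e.g. bufferCount) requests to the device and parks the rest until
a previous request completes:

#include <queue>
#include <set>

class Request;  /* stand-in for libcamera::Request */

class RequestThrottle
{
public:
        explicit RequestThrottle(unsigned int limit) : limit_(limit) {}

        /*
         * Called from queueRequest(): returns the request to hand to the
         * device, or nullptr if it had to be parked for later.
         */
        Request *submit(Request *request)
        {
                if (inFlight_.size() >= limit_) {
                        waiting_.push(request);
                        return nullptr;
                }
                inFlight_.insert(request);
                return request;
        }

        /*
         * Called from the completion path: returns the next parked request
         * that can now be queued to the device, if any.
         */
        Request *complete(Request *request)
        {
                inFlight_.erase(request);
                if (waiting_.empty())
                        return nullptr;
                Request *next = waiting_.front();
                waiting_.pop();
                inFlight_.insert(next);
                return next;
        }

private:
        unsigned int limit_;
        std::set<Request *> inFlight_;
        std::queue<Request *> waiting_;
};
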

I've always been a bit unsure of the intended use of
StreamConfiguration::bufferCount!  We set a value for each StreamRole
based on what we think might be a "reasonable" number of buffers for a
particular use case.  The application is free to use this
number, or override with how many buffers it actually wants.
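
In application terms that's roughly the following (configuration
boilerplate omitted, and the value 4 is just an arbitrary example):

std::unique_ptr<CameraConfiguration> config =
        camera->generateConfiguration({ StreamRole::Viewfinder });
StreamConfiguration &cfg = config->at(0);

/* Treat the pipeline's suggested bufferCount as a hint and ask for more. */
if (cfg.bufferCount < 4)
        cfg.bufferCount = 4;

config->validate();
camera->configure(config.get());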

You could equally take the view that bufferCount gives the absolute minimum
number of buffers/requests in-flight for a stream to produce output.  But
then your stream will generate output but might compromise on performance.
On the RPi platform, StreamConfiguration::bufferCount == 1 in this case,
but I would never want an application to only ever use 1 request, as we
would essentially run at half our intended rate.

Additionally, on the RPi platform, this number was also set/updated by the
application to provide guidance on the number of internal buffer
allocations to use for a given stream.  But that's been removed now,
particularly with my pending series at
https://patchwork.libcamera.org/project/libcamera/list/?series=3663

If bufferCount is used going forward, we probably need a stricter
definition.  I should also mention that this field is marked to be removed
with:
https://patchwork.libcamera.org/patch/13470/


>    We can also say we expect applications to deal with that by
>    themselves, but that's not realistic at the moment, even more so in
>    the perspective that libcamera will increasingly be used by other
>    frameworks and adaptation layers, all of which would have to
>    re-implement the same safety mechanism.
>
> 3) Stop
>
>    That's where we offer guarantees: requests will all be completed,
>    and will be completed in order. We don't expect applications to do
>    much here.
>
> Do you agree with my understanding of these parts? Are there other
> aspects I have not considered?
>
> All in all, I feel like we have one big topic here: the V4L2 buffer
> handling requirements have surfaced in our API, not so much in the API
> signature, but rather in how the library behaves, with too much
> implicit behaviour and an assumption of a certain understanding of
> V4L2 from application developers.
>
>
> > - No limit
> > Advertise explicitly that there are no constraints *whatsoever* on
> > queuing Requests to libcamera from the application's side?
> >
> > - Number of requests to be queued by the application, to get X number of
> > frames
> >
> >         X   =    1 Frame
> >         X <=    frames less than processing blocks on a platform
> >         X   >    frames less than processing blocks on a platform
>
> I'm not sure I got this part :)
>

* I think this "No limit" case matches my understanding of application
constraints described above?

Regards,
Naush


>
> >
> > Depending on the request queuing behaviour libcamera expects, I see that
> > documenting the aforementioned might be useful for application developers
> > as well (maybe this goes in per-platform information?). lc-compliance can
> > incorporate checks for these as well.
> > These are just off the top of my head; new discussion points are welcome
> > as well.
>
> Let's start by clarifying how things work today and what implicit
> requirements we currently have, and ideally make a plan on whether our
> API should stabilize on the '1 request - 1 buffer' model.
>
> Thanks
>   j
>
> >
> > Thanks,
> > Umang
>

