[libcamera-devel] Docs about buffers

Kieran Bingham kieran.bingham at ideasonboard.com
Thu Nov 18 16:32:23 CET 2021


Quoting Dorota Czaplejewicz (2021-11-18 13:40:07)
> Hi,
> 
> I started drafting this response before Kieran's email, so I will
> incorporate the answers to both inline.
> 
> On Thu, 18 Nov 2021 10:51:06 +0100 Jacopo Mondi <jacopo at jmondi.org>
> wrote:
> 
> > Hi Dorota
> > 
> > On Wed, Nov 17, 2021 at 04:46:54PM +0100, Dorota Czaplejewicz wrote:
> > > Hi,
> > >
> > > thanks to Jacopo, I read the pipeline handler guide. Most of it
> > > was great, confirming some of my reasoning about the code, but I
> > > got stuck there in the same place that gave me the most trouble in
> > > code: frame buffers.
> > >
> > > I got the impression that docs about frame buffers are written for
> > > someone with a different background than I have. Perhaps it
> > > assumes some knowledge about V4L2 internals, whereas I'm just a
> > > lowly consumer of those APIs, and I haven't had to deal with
> > > media-device complexity with my driver (no ISP of any kind). As a
> > > result, I was unable to understand the vocabulary, and I didn't
> > > learn much from the docs that I couldn't guess already. Without
> > > false humility, I think that folks like me can be useful in
> > > writing pipeline handlers, so I wanted to give my perspective on
> > > what gave me trouble about frame buffers.  
> > 
> > Yes, indeed there is some assumed knowledge about the underlying
> > mechanisms. Not that much about anything related to MC, but rather
> > about a few key memory handling concepts, only some of them
> > V4L2-specific.
> > 
> > I can't explain all the details as I would probably get something
> > wrong and confuse you more, but I'll leave you some pointers to
> > hopefully clarify your questions and get to fix the documentation
> > where it's not good enough.
> > 
> > >
> > > When I look at some code, I ask myself the crucial questions of
> > > "what is the intention?", "how does this work?", "why this way?",
> > > and I will focus on that.
> > >
> > > I had some smaller difficulties with the FrameBuffer class itself.
> > > https://libcamera.org/api-html/classlibcamera_1_1FrameBuffer.html
> > >
> > > While the description does quite a good job at explaining the
> > > intention and its contents, it's leaving parts of the other
> > > questions unanswered. For example, where is the actual data
> > > stored? Some other place mentions that the buffer is a DMABUF
> > > buffer, which presumably means that the data is referenced using
> > > Plane::fd. Since buffers are usually just plain in-memory arrays,
> > > this could also use a "why?" explanation (to allow zero-copy
> > > between x and y?). I know that there's a sentence about that:
> > >  
> > > > The planes are specified when creating the FrameBuffer and are
> > > > expressed as a set of dmabuf file descriptors, offset and
> > > > length.  
> > >
> > > but I think the language is overly complicated, when it could say
> > > something like "The frame buffer data is stored in the dmabuf file
> > > descriptors for each Plane object" to easily answer "where is the
> > > data?". (I'm saying that cause I missed it myself while tired
> > > after reading the guide.)  
> > 
> > The thing is that, in my understanding at least, saying "the data is
> > stored in the dmabuf" is technically wrong. dmabuf is a cross-device
> > memory sharing and representation mechanism.
> > 
> > I'm afraid the "where is the data" very much depends on where the
> > memory is allocated and how the application makes use of the API.
> > 
> > Generally speaking, the memory for the frame buffer can be
> > allocated (but I'm sure this list is partial):
> > 
> > - By another device (DRM/KMS for example) as a dmabuf descriptor,
> >   and passed to libcamera wrapped in a FrameBuffer
> > - In the camera video device itself, and then exported to other
> >   consumers (including the Camera itself, see the
> >   FrameBufferAllocator class)
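
(As an aside, the FrameBufferAllocator path Jacopo mentions looks
roughly like this from an application's point of view. This is only a
sketch - the helper name is made up and error handling is omitted:)

    #include <iostream>
    #include <memory>

    #include <libcamera/camera.h>
    #include <libcamera/framebuffer.h>
    #include <libcamera/framebuffer_allocator.h>
    #include <libcamera/stream.h>

    using namespace libcamera;

    /* Ask the pipeline handler to allocate buffers suitable for 'stream'. */
    void allocateForStream(std::shared_ptr<Camera> camera, Stream *stream)
    {
            /*
             * The allocator owns the FrameBuffers; keep it alive for as
             * long as the buffers are in use.
             */
            FrameBufferAllocator allocator(camera);

            if (allocator.allocate(stream) < 0)
                    return;

            for (const std::unique_ptr<FrameBuffer> &buffer :
                 allocator.buffers(stream)) {
                    /* Each plane references its memory through a dmabuf fd. */
                    for (const FrameBuffer::Plane &plane : buffer->planes())
                            std::cout << "dmabuf fd " << plane.fd.get()
                                      << ", length " << plane.length
                                      << std::endl;
            }
    }
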
> > 
> Thanks for the explanation.
> > >
> > > The description to access the data is placed in the docs for
> > > Plane, which makes it slightly harder to discover. Perhaps it
> > > would be beneficial to place all the docs regarding Plane and
> > > FrameBuffer in one place, since the two classes are basically two
> > > aspects of one thing. Having one description would make writing
> > > easier: no more questions "should I mention memory mapping here or
> > > reference the docs for Plane instead?".  
> > 
> > That could be indeed considered, care to send a patch ?
> > 
> Sure. Do you mind if I send it as one with the attempt to make the
> description more accessible?
> > >
> > > A related thing that I think should get more visibility is to
> > > stress that the buffers don't necessarily live on the CPU. (Where
> > > do they live? Is FrameBuffer suitable to represent data flowing
> > > from the sensor to the MIPI controller? Can it live in any memory
> > > area accessible via DMA?)  
> > 
> > I'm afraid I'm not fully getting this.
> > 
> > With 'living on the CPU' do you mean they have a CPU-accessible
> > address that can be accessed by userspace and that gets obtained
> > with mmap? Because that's again a representation, and where the
> > memory area is allocated depends on the usage of the API.
> > 
> > When it comes to the second part, I'm afraid there's no such thing
> > as 'data flowing from the sensor to the MIPI controller'. I mean,
> > there of course is a stream of serial data flowing on the bus, but
> > there's no representation of that 'live' data which can be accessed,
> > as that data is not 'memory' yet.
> > 
> > It's rather the CSI-2 receiver which, depending on the SoC
> > architecture, buffers that data and transfers it in chunks into some
> > DMA accessible memory areas which are later accessible to the
> > application as a memory-mapped CPU address or a dmabuf descriptor.
> > 
> > The above "DMA accessible memory area" is where the buffers are
> > actually allocated and again it really depends on what the
> > application does.
> > 
> Could this be summarized as such:
> 
> A FrameBuffer object references memory areas used for storing data.
> It doesn't necessarily contain image data; it could also be a
> parameter buffer, or metadata or statistics from the ISP. It uses the
> dma-buf mechanism (see the dma-buf documentation [0]) to avoid
> unnecessary copying of frames when moving them around.
> 
> The FrameBuffer object does not need to be the only reference to its
> memory areas. It doesn't own them, and destroying a FrameBuffer does
> not affect the memory allocation.

Ah, I think I led you down a small error-path there. Destroying a
FrameBuffer 'could' cause the memory allocation to be destroyed.

The dma-bufs are a shared resource, so I think they are destroyed when
the last user closes the FD?

But I'm not 100% certain of that without digging into the details.


> Stored memory references are represented by the dma-buf file
> descriptors stored in each Plane object. Because dma-buf is used, the
> memory area can be physically located on any device, as long as it's
> accessible using DMA. That also means that FrameBuffer objects can
> reference memory which can't be mapped to userspace.
> 
> [0]https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html
> 
> (This link is not optimal from the perspective of a userspace
> consumer, but that's the best I could find. Notably, I haven't quickly
> identified how to get data out of the buffer, be it by copying or
> mapping.)
> 
> Now that I think of it, is the split into planes as defined by the
> format? They are always each the same size, right?

The planes are defined by the pixel format, yes. Although... there is
also a little bit of extra confusion there, as some 'semi-planar'
formats can mix things up a bit. And then there is a single-planar API
and a multi-planar API from V4L2 to contend with as to how to represent
those types. (And no, the planes aren't necessarily all the same size -
for a subsampled format like NV12 the chroma plane is smaller than the
luma plane.)
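
If it helps to make the plane/data relationship concrete: getting at
the pixel data from the CPU means mapping each plane's dmabuf and
applying the plane's offset/length. Roughly something like the below -
a sketch only, assuming the Plane fd/offset/length members as
documented; real code should map each distinct fd only once:

    #include <cstdint>
    #include <sys/mman.h>

    #include <libcamera/framebuffer.h>

    using namespace libcamera;

    /* Sketch: walk the planes of a completed buffer and map them for reading. */
    void inspectPlanes(const FrameBuffer *buffer)
    {
            for (const FrameBuffer::Plane &plane : buffer->planes()) {
                    /*
                     * Map from the start of the dmabuf and add the plane
                     * offset; several planes of a semi-planar format may
                     * share one fd with different offsets.
                     */
                    size_t mapLength = plane.offset + plane.length;
                    void *mem = mmap(nullptr, mapLength, PROT_READ, MAP_SHARED,
                                     plane.fd.get(), 0);
                    if (mem == MAP_FAILED)
                            continue;

                    const uint8_t *data =
                            static_cast<const uint8_t *>(mem) + plane.offset;
                    /* ... plane.length bytes of data are readable here ... */
                    (void)data;

                    munmap(mem, mapLength);
            }
    }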

> > > Another problem is the FrameBuffer::cancel() method. It says:
> > >  
> > > > Marks the buffer as cancelled.  If a buffer is not used by a
> > > > request, it shall be marked as cancelled to indicate that the
> > > > metadata is invalid.  
> > >
> > > I'm missing the intention/purpose of this. Why do I want to
> > > indicate the metadata is invalid? What can bring metadata to
> > > invalidity? I can imagine how it works (some flag), but I also
> > > don't know why it changes based on request. What if the buffer
> > > *is* used on request? Will the call get silently ignored?  
> > 
> > I guess the Metadata being cancelled implies the whole frame is
> > cancelled. I think we wanted to expose that through the Metadata,
> > but maybe it's confusing.
> > 
> 
> From Kieran's email:
> > The cancel method is there to indicate that on the object itself, so
> > that when an application gets the FrameBuffer back - it can know if
> > the buffer contains a usable picture, and metadata (such as
> > timestamp) which is valid - or - if it would be meaningless to read
> > from any of the fields.
> 
> Why is FrameBuffer::cancel() not private API? It sounds like it's
> useful only to send a message about failure from the producer to the
> consumer.

You are right. This looks to be simply because it hasn't been made
private yet.

The FrameBuffer object was made 'extensible' about 5 months ago, while
the cancel helper was added 8 months ago.

I would suspect that the move to make FrameBuffer 'extensible' hasn't
moved all possible things to the private implementation yet. Cancelling
a buffer shouldn't be exposed from the public API.

If you'd like to fix this as an exercise to help you learn the code
base, please go ahead. If you don't want to, I'll fix it and submit a
patch. But this might be an 'easy' task that helps you dig deeper
around the codebase, so I don't want to take the fix away from you if
you want it.

FrameBuffer::cancel() should likely be moved to
FrameBuffer::Private::cancel().

You'll see that there is an 'include/libcamera/internal/framebuffer.h'
with the 'private' implementation, while
'include/libcamera/framebuffer.h' has the public API version.
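
Roughly, I'd expect the end result to look something like this - a
sketch only, simplified from the real headers, and assuming the usual
Extensible _o() back-pointer is used to reach the public object's
metadata:

    /*
     * include/libcamera/internal/framebuffer.h - illustrative sketch;
     * the real Private class already carries other state (request
     * tracking etc.).
     */
    class FrameBuffer::Private : public Extensible::Private
    {
            LIBCAMERA_DECLARE_PUBLIC(FrameBuffer)

    public:
            /*
             * Invalidate the metadata: a consumer seeing
             * FrameMetadata::FrameCancelled knows the timestamp,
             * sequence and plane lengths are meaningless.
             */
            void cancel()
            {
                    _o()->metadata_.status = FrameMetadata::FrameCancelled;
            }
    };
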

We specifically haven't 'released' libcamera yet to make a stable API
because we are still moving all these things around as we develop the
framework.


> If a buffer is cancelled, can it be submitted again? Will it get
> un-cancelled? I'm seeing no assignments of the type
> 
> FrameMetadata::status = FrameSuccess;
> 
> and I'm worried that buffers become unusable after a single cancel
> (but that would make linguistic sense if it's "cancelled" like a
> ticket).

Buffers are only marked as FrameSuccess when they have successfully
been captured/dequeued from V4L2.

src/libcamera/v4l2_videodevice.cpp:

FrameBuffer *V4L2VideoDevice::dequeueBuffer()
{
    ...
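    /* The V4L2 error flag decides between FrameError and FrameSuccess. */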
    buffer->metadata_.status = buf.flags & V4L2_BUF_FLAG_ERROR
                             ? FrameMetadata::FrameError
                             : FrameMetadata::FrameSuccess;
    ...
}

So when a buffer completes successfully, its status is set correctly.
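
From the application side, the intent is that a completion handler can
check the status before trusting anything else in the metadata.
Something like this (a sketch, the handler name is made up):

    #include <libcamera/camera.h>
    #include <libcamera/framebuffer.h>
    #include <libcamera/request.h>

    using namespace libcamera;

    /* Hypothetical handler, connected to Camera::bufferCompleted. */
    static void onBufferCompleted(Request *request, FrameBuffer *buffer)
    {
            (void)request;

            const FrameMetadata &metadata = buffer->metadata();

            switch (metadata.status) {
            case FrameMetadata::FrameSuccess:
                    /* Timestamp, sequence and plane lengths are valid. */
                    break;
            case FrameMetadata::FrameError:
                    /* Capture failed; the image content is unreliable. */
                    break;
            case FrameMetadata::FrameCancelled:
                    /* The buffer never produced a frame; ignore the metadata. */
                    break;
            }
    }

    /* During setup: camera->bufferCompleted.connect(&onBufferCompleted); */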


> > > It's also not clear if the FrameBuffer is focused on the memory
> > > allocation (mutable place to put data, reusable), or on the data
> > > itself (no need to be mutable, used once). I suspect the former,
> > > but it could be spelled out.  
> > 
> > FrameBuffer is just a representation of a memory area in the form
> > of a set of dmabuf descriptors with an associated length.
> > 
> > >
> > > * * *
> > >
> > > Now, on to the parts that I can't get through: allocateBuffers,
> > > importBuffers, exportBuffers, and releaseBuffers.
> > >
> > > I know what "allocate" means, but when I try to read the
> > > description, I'm immediately confused:
> > >  
> > > > This function wraps buffer allocation with the V4L2 MMAP memory
> > > > type.  
> > >
> > > I'm not sure what "V4L2 memory types" are, somehow I managed to
> > > avoid coming across them. The next sentence is better:  
> > 
> > I'm afraid there's no other suggestion I can give other than
> > reading about REQBUFS and the memory-mapped, user pointer and
> > DMABUF-based I/O methods described there.
> > 
> > https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/vidioc-reqbufs.html
> > 
> > The class you're looking at models a V4L2 device, so it is tightly
> > coupled with the V4L2 infrastructure and unfortunately requires some
> > background knowledge about the V4L2 internals.
> > 
> > When it comes to documentation, there's no point in us trying to
> > explain again here what is already documented in the proper place.
> > Also adding a link would not be enough imho, as we would need plenty
> > of them :)
> > 
> I think it's better to add an abundance of links, as the alternative
> source of knowledge is reverse-engineering, or asking directly.
> 
> For example, "memory type" could link here:
> https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/buffer.html#enum-v4l2-memory
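
Agreed. And to make "memory type" concrete: it is literally the memory
field passed to VIDIOC_REQBUFS, which tells the driver whether it
allocates the memory itself (MMAP) or will import it at queue time
(DMABUF). Roughly (plain V4L2, not libcamera):

    #include <sys/ioctl.h>

    #include <linux/videodev2.h>

    /* Sketch: request 4 driver-allocated (MMAP) buffers on an open video fd. */
    int requestMmapBuffers(int fd)
    {
            struct v4l2_requestbuffers reqbufs = {};

            reqbufs.count = 4;
            reqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
            reqbufs.memory = V4L2_MEMORY_MMAP;      /* the "memory type" */

            return ioctl(fd, VIDIOC_REQBUFS, &reqbufs);
    }
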
> 
> > >  
> > > > It requests count buffers from the driver, allocating the
> > > > corresponding memory, and exports them as a set of FrameBuffer
> > > > objects in buffers.  
> > >
> > > I'm not really sure about where the allocations take place, but
> > > I'm reading that as "this allocates `count` buffers on the device,
> > > and makes them accessible to the CPU (to DMA?) via FrameBuffer
> > > objects. The FrameBuffer objects are appended to the vector
> > > `buffers`".
> > >
> > > Next is "export", which is like "allocate", but:
> > >  
> > > > Unlike allocateBuffers(), this function leaves the driver's
> > > > internal buffer management uninitialized.  
> > >
> > > Together, the purpose/intention is missing: when should one be
> > > chosen and when the other? Why do driver internals matter? Maybe
> > > it's worth linking some external resource here.  
> > 
> > A few prerequisites:
> > 
> > - V4L2 buffer orphaning (in the REQBUFS documentation)
> > - The V4L2 EXPBUF ioctl
> >   https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/vidioc-expbuf.html
> > - Laurent's cover letter for the patch series that introduced the
> >   use of orphaned buffers
> >   https://lists.libcamera.org/pipermail/libcamera-devel/2020-March/007224.html
> > 
> > For your question above about when to use one or the other: for
> > each function in the documentation pasted below there is a paragraph
> > that explains the intended use case. How could those paragraphs be
> > clarified, in your opinion?
> > 
> Actually, those do a pretty good job at explaining. But the guide
> doesn't link to V4L2VideoDevice, linking instead to each function
> separately. That combined catastrophically with two supporting facts:
> 
> - I was overwhelmed and hyper-focused, so somehow I didn't think to
>   look wider,
> - I noticed exportBuffers() first in SimpleCameraData as well as
>   SimpleCameraData::stream, and thought it was some general
>   abstraction.
> 
> I think that's a failure in splitting one thing across too many places
> again. Given that those 4 (5?) methods are intimately linked, I think
> the only reasonable solution is to add "See wider explanation in
> V4L2VideoDevice" in each method's docs. I'd also move what's currently
> under each method to the general section. Is that fine?

I agree, we definitely need more 'bigger picture' overviews to bring
together all the tiny bits of documentation everywhere.


> >  * - The allocateBuffers() function wraps buffer allocation with the V4L2 MMAP
> >  *   memory type. It requests buffers from the driver, allocating the
> >  *   corresponding memory, and exports them as a set of FrameBuffer objects.
> >  *   Upon successful return the driver's internal buffer management is
> >  *   initialized in MMAP mode, and the video device is ready to accept
> >  *   queueBuffer() calls.
> >  *
> >  *   This is the most traditional V4L2 buffer management, and is mostly useful
> >  *   to support internal buffer pools in pipeline handlers, either for CPU
> >  *   consumption (such as statistics or parameters pools), or for internal
> >  *   image buffers shared between devices.
> > 
> >  [For example, this last sentence could be changed imho, as for
> >  internal buffer pools allocating is not enough: one side would need
> >  to export (== allocate, export and orphan) + import (== set the
> >  queue in dmabuf mode) and the other side to import]
> > 
> >  * - The exportBuffers() function operates similarly to allocateBuffers(), but
> >  *   leaves the driver's internal buffer management uninitialized. It uses the
> >  *   V4L2 buffer orphaning support to allocate buffers with the MMAP method,
> >  *   export them as a set of FrameBuffer objects, and reset the driver's
> >  *   internal buffer management. The video device shall be initialized with
> >  *   importBuffers() or allocateBuffers() before it can accept queueBuffer()
> >  *   calls. The exported buffers are directly usable with any V4L2 video device
> >  *   in DMABUF mode, or with other dmabuf importers.
> >  *
> >  *   This method is mostly useful to implement buffer allocation helpers or to
> >  *   allocate ancillary buffers, when a V4L2 video device is used in DMABUF
> >  *   mode but no other source of buffers is available. An example use case
> >  *   would be allocation of scratch buffers to be used in case of buffer
> >  *   underruns on a video device that is otherwise supplied with external
> >  *   buffers.
> >  *
> >  * - The importBuffers() function initializes the driver's buffer management to
> >  *   import buffers in DMABUF mode. It requests buffers from the driver, but
> >  *   doesn't allocate memory. Upon successful return, the video device is ready
> >  *   to accept queueBuffer() calls. The buffers to be imported are provided to
> >  *   queueBuffer(), and may be supplied externally, or come from a previous
> >  *   exportBuffers() call.
> >  *
> >  *   This is the usual buffers initialization method for video devices whose
> >  *   buffers are exposed outside of libcamera. It is also typically used on one
> >  *   of the two video devices that participate in buffer sharing inside
> >  *   pipelines, the other video device typically using allocateBuffers().
> >  *
> >  * - The releaseBuffers() function resets the driver's internal buffer
> >  *   management that was initialized by a previous call to allocateBuffers() or
> >  *   importBuffers(). Any memory allocated by allocateBuffers() is freed.
> >  *   Buffers exported by exportBuffers() are not affected by this function.
>
> I'm seeing one thing missing, which is an explanation of "internal
> buffer management of the driver". Clearly, there's some resetting and
> requesting going on, so there's some kind of a contract being relied
> upon that should be spelled out/linked to. What is it?

V4L2 can operate in different modes: MMAP / DMABUF (and USERPTR, which
we won't use).

I think 'internal buffer management' refers to those different mode
types.

And it also corresponds to the 'V4L2 buffer structures' I referenced in
a previous mail. This is the 'part' that is allocated for 'count'
buffers when calling importBuffers(), but it could be simplified to
just always allocate the maximum.


Given that the term is used four times in that section, you're
certainly right that it deserves a better explanation of what it
actually is.
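
For what it's worth, following Jacopo's note above about internal
buffer pools, the calling pattern between two video devices sharing
buffers inside a pipeline handler ends up looking roughly like this -
a sketch with a made-up helper name and error handling trimmed:

    #include <memory>
    #include <vector>

    #include <libcamera/framebuffer.h>

    #include "libcamera/internal/v4l2_videodevice.h"

    using namespace libcamera;

    /*
     * Sketch: 'output' produces frames consumed by 'capture' (e.g. the
     * two ends of a mem2mem device). One side allocates and orphans,
     * then both queues import the resulting dmabuf-backed FrameBuffers.
     */
    int setupSharedBuffers(V4L2VideoDevice *output, V4L2VideoDevice *capture,
                           unsigned int count,
                           std::vector<std::unique_ptr<FrameBuffer>> *buffers)
    {
            /* Allocate with MMAP, export as FrameBuffers, orphan the queue. */
            int ret = output->exportBuffers(count, buffers);
            if (ret < 0)
                    return ret;

            /* Initialize both queues in DMABUF mode to accept those buffers. */
            ret = output->importBuffers(count);
            if (ret < 0)
                    return ret;

            return capture->importBuffers(count);
    }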



> The API here relies to a huge extent on implicit "action at a
> distance", so I think it's worth explaining failure modes stemming
> from things that might be off the screen. E.g. I can imagine debugging
> the errors on queueBuffers(5) after importBuffers(3), or forgetting to
> importBuffers at all, or calling something twice, etc. Since there's
> no abstraction enforcing those connections in a legible way (either
> compilation errors, or verbose runtime errors), mistakes will happen
> here. I'll copy the relevant parts from the V4L2 docs, without any hope
> that they cover the topic exhaustively.

We have a V4L2BufferCache which maps between the buffers being queued
and the 'V4L2 internal' buffer types. But you're right: currently, if
importBuffers(3) was called and queueBuffer() was then called with 5
different buffers, we would have issues. That needs work on the
V4L2BufferCache and V4L2VideoDevice to improve it.

The V4L2BufferCache currently does not like getting more buffers than
expected. It could be improved to re-use the 'oldest' V4L2 internal
buffer at a performance-penalty cost, as far as I understand.


> > > About "import", I would have guessed that it works in reverse to
> > > export, that is: takes buffers on the CPU, and makes them
> > > accessible to the device. Then it would take a collection of
> > > buffers, but instead it takes... a number? I'm unable to make any
> > > sense out of this alone, not the intention, not how it works, not
> > > when to use it.  
> > 
> > Hopefully reading how REQBUFS in V4L2_DMABUF mode works will help
> > 
> It does help indeed. I'll link those docs in the patch.
> 
> > Understanding also how the FrameBufferAllocator works might help
> > understanding the 'allocate-export-import' pattern
> > 
> > >
> > > "release" is also somewhat confusing, in that it's not a dual to
> > > "allocate": it seems to release ~all buffers associated with the
> > > device. What happens if those buffers are used anyway?  
> > 
> > Again from the V4L2 REQBUF ioctl documentation
> > 
> > Applications can call ioctl VIDIOC_REQBUFS again to change the
> > number of buffers. Note that if any buffers are still mapped or
> > exported via DMABUF, then ioctl VIDIOC_REQBUFS can only succeed if
> > the V4L2_BUF_CAP_SUPPORTS_ORPHANED_BUFS capability is set. Otherwise
> > ioctl VIDIOC_REQBUFS will return the EBUSY error code. If
> > V4L2_BUF_CAP_SUPPORTS_ORPHANED_BUFS is set, then these buffers are
> > orphaned and will be freed when they are unmapped or when the
> > exported DMABUF fds are closed. A count value of zero frees or
> > orphans all buffers, after aborting or finishing any DMA in
> > progress, an implicit VIDIOC_STREAMOFF.
> > 
> > >
> > >
> > > I haven't found a more general guide to buffers: when to allocate
> > > them, when to reuse them, how to pass them between devices, who
> > > owns FrameBuffer objects, etc. I think having something like this
> > > would make first encounters go smoothly.  
> > 
> > I'm afraid we've tried, but have not been able to completely
> > isolate the V4L2 specificities implemented by the V4L2VideoDevice
> > class from a more general description of FrameBuffer.
> > 
> > If you're not interested in dealing with the V4L2 internals you
> > shouldn't be required to understand what happens in the video
> > device, but you should indeed be allowed to easily use the
> > FrameBuffer class.
> > 
> > >
> > > Overall, I don't want to come out as a complainer :) So in
> > > exchange for an explanation over email, I offer including what I
> > > learn in the docs.  
> > 
> > Absolutely, that's helpful feedback and I hope the references here
> > help your understanding of the background and lead to a better
> > documentation of the FrameBuffer class.
> 
> Thanks for all the explanations, Jacopo and Kieran. I'll send patches
> once I get the answers to the remaining questions.

I'm not sure if I answered all, but hopefully some.

--
Kieran


> > 
> > Thanks j
> > >
> > > Regards, Dorota  
> > 
> > 
>

