[libcamera-devel] Docs about buffers

Jacopo Mondi jacopo at jmondi.org
Thu Nov 18 10:51:06 CET 2021


Hi Dorota

On Wed, Nov 17, 2021 at 04:46:54PM +0100, Dorota Czaplejewicz wrote:
> Hi,
>
> thanks to Jacopo, I read the pipeline handler guide. Most of it was great, confirming some of my reasoning about the code, but I got stuck there in the same place that gave me the most trouble in code: frame buffers.
>
> I got the impression that docs about frame buffers are written for someone with a different background than I have. Perhaps it assumes some knowledge about V4L2 internals, whereas I'm just a lowly consumer of those APIs, and I haven't had to deal with media-device complexity with my driver (no ISP of any kind). As a result, I was unable to understand the vocabulary, and I didn't learn much from the docs that I couldn't guess already. Without false humility, I think that folks like me can be useful in writing pipeline handlers, so I wanted to give my perspective on what gave me trouble about frame buffers.

Yes, indeed there is some assumed knowledge about the underlying
mechanisms. Not that much about anything related to MC, but rather
about a few key memory handling concepts, only some of them
V4L2-specific.

I can't explain all the details, as I would probably get something
wrong and confuse you more, but I'll leave you some pointers that will
hopefully clarify your questions and help us fix the documentation
where it's not good enough.

>
> When I look at some code, I ask myself the crucial questions of "what is the intention?", "how does this work?", "why this way?", and I will focus on that.
>
> I had some smaller difficulties with the FrameBuffer class itself. https://libcamera.org/api-html/classlibcamera_1_1FrameBuffer.html
>
> While the description does quite a good job at explaining the intention and its contents, it's leaving parts of the other questions unanswered. For example, where is the actual data stored? Some other place mentions that the buffer is a DMABUF buffer, which presumably means that the data is referenced using Plane::fd. Since buffers are usually just plain in-memory arrays, this could also use a "why?" explanation (to allow zero-copy between x and y?). I know that there's a sentence about that:
>
> > The planes are specified when creating the FrameBuffer and are expressed as a set of dmabuf file descriptors, offset and length.
>
> but I think the language is overly complicated, when it could say something like "The frame buffer data is stored in the dmabuf file descriptors for each Plane object" to easily answer "where is the data?". (I'm saying that because I missed it myself while tired after reading the guide.)

The thing is that, in my understanding at least, saying "the data is
stored in the dmabuf" is technically wrong. dmabuf is a cross-device
memory sharing and representation mechanism.

I'm afraid the "where is the data" very much depends on where memory is
allocated and how the application makes use of the API.

Generally speaking, the memory for the frame buffer can be allocated
(this list is surely partial):
- By another device (DRM/KMS for example) as a dmabuf descriptor and passed
  to libcamera wrapped in a FrameBuffer (a sketch of this case follows below)
- In the camera video device itself and then exported to other consumers
  (including the Camera itself, see the FrameBufferAllocator class)
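
Something like this, untested and only as an illustration of the first
case (the wrapDmabuf() helper and its parameters are mine, and the exact
Plane fields and fd wrapper type depend on the libcamera version):

#include <memory>
#include <vector>

#include <libcamera/libcamera.h>

/*
 * Hypothetical helper: wrap a dmabuf fd exported by another device (a
 * DRM/KMS dumb buffer for example) into a libcamera FrameBuffer, so it
 * can later be attached to a Request. Assumes a single-plane format.
 */
std::unique_ptr<libcamera::FrameBuffer> wrapDmabuf(int dmabufFd, unsigned int length)
{
	libcamera::FrameBuffer::Plane plane;
	plane.fd = libcamera::FileDescriptor(dmabufFd);
	plane.offset = 0;
	plane.length = length;

	std::vector<libcamera::FrameBuffer::Plane> planes{ plane };

	/* The FrameBuffer only references the dmabuf, it never copies data. */
	return std::make_unique<libcamera::FrameBuffer>(planes);
}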

>
> The description to access the data is placed in the docs for Plane, which makes it slightly harder to discover. Perhaps it would be beneficial to place all the docs regarding Plane and FrameBuffer in one place, since the two classes are basically two aspects of one thing. Having one description would make writing easier: no more questions "should I mention memory mapping here or reference the docs for Plane instead?".

That could indeed be considered, care to send a patch?

>
> A related thing that I think should get more visibility is to stress that the buffers don't necessarily live on the CPU. (Where do they live? Is FrameBuffer suitable to represent data flowing from the sensor to the MIPI controller? Can it live in any memory area accessible via DMA?)

I'm afraid I'm not fully getting this.

With 'living on the CPU' do you mean they have a CPU-accessible
address that can be accessed by userspace and that gets obtained by
mmap? Because that's again a representation, and where the memory
area is allocated depends on the usage of the API.

When it comes to the second part, I'm afraid there's no such thing as
'data flowing from the sensor to the MIPI controller'. I mean, there of
course is a stream of serial data flowing on the bus, but there's no
representation of those 'live' data which can be accessed, as those
data are not 'memory' yet.

It's rather the CSI-2 receiver which, depending on the SoC
architecture, buffers those data and transfers them in chunks into
DMA-accessible memory areas, which are later accessible to the
application as a memory-mapped CPU address or a dmabuf descriptor.

The above "DMA-accessible memory area" is where the buffers are
actually allocated, and again it really depends on what the application
does.
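
If a CPU-visible view of such a buffer is needed, the plane's dmabuf fd
can be mmap()ed. A rough sketch (whether a given dmabuf is mappable
depends on its exporter, and libcamera has internal helpers for this):

#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

#include <libcamera/libcamera.h>

/*
 * Sketch: map the first plane of a FrameBuffer to get a CPU-accessible
 * view. The mapping covers offset + length so the plane offset doesn't
 * need to be page aligned.
 */
void inspectFirstPlane(const libcamera::FrameBuffer &buffer)
{
	if (buffer.planes().empty())
		return;

	const libcamera::FrameBuffer::Plane &plane = buffer.planes()[0];
	std::size_t mapLength = plane.offset + plane.length;

	void *mem = mmap(nullptr, mapLength, PROT_READ, MAP_SHARED,
			 plane.fd.fd(), 0);
	if (mem == MAP_FAILED)
		return;

	const std::uint8_t *data = static_cast<const std::uint8_t *>(mem) + plane.offset;
	(void)data; /* ... read the frame contents here ... */

	munmap(mem, mapLength);
}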

>
>
> Another problem is the FrameBuffer::cancel() method. It says:
>
> > Marks the buffer as cancelled.
> > If a buffer is not used by a request, it shall be marked as cancelled to indicate that the metadata is invalid.
>
> I'm missing the intention/purpose of this. Why do I want to indicate the metadata is invalid? What can bring metadata to invalidity? I can imagine how it works (some flag), but I also don't know why it changes based on request. What if the buffer *is* used on request? Will the call get silently ignored?

I guess the Metadata being cancelled implies the whole frame is
cancelled. I think we wanted to expose that through the Metadata, but
maybe it's confusing.
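
From the application side this shows up in the FrameMetadata attached
to the buffer: a cancelled frame carries no valid image data or
metadata. A small sketch of a requestCompleted handler that skips such
buffers (the handler itself is hypothetical, the status check is the
point):

#include <libcamera/libcamera.h>

using namespace libcamera;

/*
 * Hypothetical requestCompleted slot: buffers of a cancelled request
 * are flagged through their metadata, so skip them instead of
 * consuming stale data.
 */
static void requestCompleted(Request *request)
{
	for (const auto &pair : request->buffers()) {
		FrameBuffer *buffer = pair.second;
		const FrameMetadata &metadata = buffer->metadata();

		if (metadata.status == FrameMetadata::FrameCancelled)
			continue; /* no valid frame data or metadata */

		/* ... use metadata.sequence, metadata.timestamp, the planes ... */
	}
}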

>
> It's also not clear if the FrameBuffer is focused on the memory allocation (mutable place to put data, reusable), or on the data itself (no need to be mutable, used once). I suspect the former, but it could be spelled out.

FrameBuffer is just a representation of a memory area in the form of
a set of dmabuf descriptors with an associated length.

>
> * * *
>
> Now, on to the parts that I can't get through: allocateBuffers, importBuffers, exportBuffers, and releaseBuffers.
>
> I know what "allocate" means, but when I try to read the description, I'm immediately confused:
>
> > This function wraps buffer allocation with the V4L2 MMAP memory type.
>
> I'm not sure what "V4L2 memory types" are, somehow I managed to avoid coming across them. The next sentence is better:

I'm afraid there's no other suggestion I can give than reading about
REQBUFS and the memory-mapped, user-pointer and DMABUF-based I/O
methods described there.

https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/vidioc-reqbufs.html

The class you're looking at models a V4L2 video device, so it is
tightly coupled with the V4L2 infrastructure and unfortunately requires
some background knowledge of the V4L2 internals.

When it comes to documentation, there's no point in us trying to
explain again here what is already documented in the proper place.
Also adding a link would not be enough imho, as we would need plenty
of them :)
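
For what it's worth, the "memory type" is just the field passed to
VIDIOC_REQBUFS that selects the I/O method. A bare-bones sketch at the
V4L2 UAPI level, outside of libcamera (capture queue assumed):

#include <linux/videodev2.h>
#include <sys/ioctl.h>

/*
 * The same VIDIOC_REQBUFS ioctl either allocates driver-owned buffers
 * (MMAP) or only prepares the queue to import dmabufs later (DMABUF).
 */
int requestBuffers(int fd, unsigned int count, bool importDmabuf)
{
	struct v4l2_requestbuffers reqbufs = {};

	reqbufs.count = count;
	reqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	reqbufs.memory = importDmabuf ? V4L2_MEMORY_DMABUF : V4L2_MEMORY_MMAP;

	return ioctl(fd, VIDIOC_REQBUFS, &reqbufs);
}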

>
> > It requests count buffers from the driver, allocating the corresponding memory, and exports them as a set of FrameBuffer objects in buffers.
>
> I'm not really sure about where the allocations take place, but I'm reading that as "this allocates `count` buffers on the device, and makes them accessible to the CPU (to DMA?) via FrameBuffer objects. The FrameBuffer objects are appended to the vector `buffers`".
>
> Next is "export", which is like "allocate", but:
>
> > Unlike allocateBuffers(), this function leaves the driver's internal buffer management uninitialized.
>
> Together, the purpose/intention is missing: when should one be chosen and when the other? Why do driver internals matter? Maybe it's worth to link some external resource here.

A few prerequisites:
- V4L2 buffer orphaning (in the REQBUFS documentation)
- The V4L2 EXPBUF ioctl
  https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/vidioc-expbuf.html
- Laurent's cover letter for the patch series that introduced the use of orphaned buffers
  https://lists.libcamera.org/pipermail/libcamera-devel/2020-March/007224.html

For your question above about when to use one or the other: for each
function in the documentation pasted below there is a paragraph that
explains the intended use case. How could those paragraphs be clarified,
in your opinion?


 * - The allocateBuffers() function wraps buffer allocation with the V4L2 MMAP
 *   memory type. It requests buffers from the driver, allocating the
 *   corresponding memory, and exports them as a set of FrameBuffer objects.
 *   Upon successful return the driver's internal buffer management is
 *   initialized in MMAP mode, and the video device is ready to accept
 *   queueBuffer() calls.
 *
 *   This is the most traditional V4L2 buffer management, and is mostly useful
 *   to support internal buffer pools in pipeline handlers, either for CPU
 *   consumption (such as statistics or parameters pools), or for internal
 *   image buffers shared between devices.

 [For example, this last sentence could be changed imho, as for internal
 buffer pools allocating is not enough: one side would need to
 export (== allocate, export and orphan) + import (== set the queue in
 DMABUF mode) and the other side to import]

 *
 * - The exportBuffers() function operates similarly to allocateBuffers(), but
 *   leaves the driver's internal buffer management uninitialized. It uses the
 *   V4L2 buffer orphaning support to allocate buffers with the MMAP method,
 *   export them as a set of FrameBuffer objects, and reset the driver's
 *   internal buffer management. The video device shall be initialized with
 *   importBuffers() or allocateBuffers() before it can accept queueBuffer()
 *   calls. The exported buffers are directly usable with any V4L2 video device
 *   in DMABUF mode, or with other dmabuf importers.
 *
 *   This method is mostly useful to implement buffer allocation helpers or to
 *   allocate ancillary buffers, when a V4L2 video device is used in DMABUF
 *   mode but no other source of buffers is available. An example use case
 *   would be allocation of scratch buffers to be used in case of buffer
 *   underruns on a video device that is otherwise supplied with external
 *   buffers.
 *
 * - The importBuffers() function initializes the driver's buffer management to
 *   import buffers in DMABUF mode. It requests buffers from the driver, but
 *   doesn't allocate memory. Upon successful return, the video device is ready
 *   to accept queueBuffer() calls. The buffers to be imported are provided to
 *   queueBuffer(), and may be supplied externally, or come from a previous
 *   exportBuffers() call.
 *
 *   This is the usual buffers initialization method for video devices whose
 *   buffers are exposed outside of libcamera. It is also typically used on one
 *   of the two video devices that participate in buffer sharing inside
 *   pipelines, the other video device typically using allocateBuffers().
 *
 * - The releaseBuffers() function resets the driver's internal buffer
 *   management that was initialized by a previous call to allocateBuffers() or
 *   importBuffers(). Any memory allocated by allocateBuffers() is freed.
 *   Buffers exported by exportBuffers() are not affected by this function.
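
To make the sharing pattern a bit more concrete, a rough sketch of what
a pipeline handler could do with two video devices, based on the
documentation above (the helper and its parameters are mine, and the
exact V4L2VideoDevice signatures should be double-checked against the
headers):

#include <memory>
#include <vector>

#include "libcamera/internal/v4l2_videodevice.h"

using namespace libcamera;

/*
 * The producer allocates buffers (MMAP) and exports them as
 * FrameBuffers, the consumer only prepares its queue to import the
 * dmabufs (DMABUF). Error handling omitted.
 */
int setupInternalPool(V4L2VideoDevice *producer, V4L2VideoDevice *consumer,
		      unsigned int count,
		      std::vector<std::unique_ptr<FrameBuffer>> *buffers)
{
	int ret = producer->allocateBuffers(count, buffers);
	if (ret < 0)
		return ret;

	ret = consumer->importBuffers(count);
	if (ret < 0)
		return ret;

	/*
	 * At run time the same FrameBuffer is then queued to both devices
	 * for each frame with queueBuffer().
	 */
	return 0;
}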

>
> About "import", I would have guessed that it works in reverse to export, that is: takes buffers on the CPU, and makes them accessible to the device. Then it would take a collection of buffers, but instead it takes... a number? I'm unable to make any sense out of this alone, not the intention, not how it works, not when to use it.

Hopefully reading how REQBUFS in DMABUF mode (V4L2_MEMORY_DMABUF) works
will help.

Understanding how the FrameBufferAllocator works might also help with
the 'allocate-export-import' pattern.
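
On the application side the FrameBufferAllocator hides that pattern: it
asks the Camera to export buffers for a stream, and the pipeline handler
imports the resulting dmabufs when requests are queued. A minimal
sketch, with camera acquisition and configuration assumed to have
happened already (createRequests() is a hypothetical helper):

#include <memory>
#include <utility>
#include <vector>

#include <libcamera/libcamera.h>

using namespace libcamera;

/*
 * Allocate buffers for a stream and attach each of them to a Request.
 * The allocator owns the FrameBuffers, so it must outlive the requests
 * that reference them.
 */
std::vector<std::unique_ptr<Request>>
createRequests(std::shared_ptr<Camera> camera, FrameBufferAllocator &allocator,
	       Stream *stream)
{
	std::vector<std::unique_ptr<Request>> requests;

	if (allocator.allocate(stream) < 0)
		return requests;

	for (const std::unique_ptr<FrameBuffer> &buffer : allocator.buffers(stream)) {
		std::unique_ptr<Request> request = camera->createRequest();
		if (!request)
			break;

		request->addBuffer(stream, buffer.get());
		requests.push_back(std::move(request));
	}

	/* The requests can then be queued with camera->queueRequest(). */
	return requests;
}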

>
> "release" is also somewhat confusing, in that it's not a dual to "allocate", in that it seems to release ~all buffers associated with the device. What happens if those buffers are used anyway?

Again, from the V4L2 REQBUFS ioctl documentation:

Applications can call ioctl VIDIOC_REQBUFS again to change the number
of buffers. Note that if any buffers are still mapped or exported via
DMABUF, then ioctl VIDIOC_REQBUFS can only succeed if the
V4L2_BUF_CAP_SUPPORTS_ORPHANED_BUFS capability is set. Otherwise ioctl
VIDIOC_REQBUFS will return the EBUSY error code. If
V4L2_BUF_CAP_SUPPORTS_ORPHANED_BUFS is set, then these buffers are
orphaned and will be freed when they are unmapped or when the exported
DMABUF fds are closed. A count value of zero frees or orphans all
buffers, after aborting or finishing any DMA in progress, an implicit
VIDIOC_STREAMOFF.
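
In other words, freeing (or orphaning) everything boils down to a
REQBUFS call with a count of zero, which is roughly what
releaseBuffers() relies on. At the UAPI level:

#include <cerrno>

#include <linux/videodev2.h>
#include <sys/ioctl.h>

/*
 * Requesting zero buffers frees them, or orphans them if they are still
 * mapped or exported and the driver supports orphaned buffers.
 */
int freeAllBuffers(int fd)
{
	struct v4l2_requestbuffers reqbufs = {};

	reqbufs.count = 0;
	reqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	reqbufs.memory = V4L2_MEMORY_MMAP;

	if (ioctl(fd, VIDIOC_REQBUFS, &reqbufs) < 0)
		return -errno; /* EBUSY if buffers are busy and orphaning is unsupported */

	return 0;
}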

>
>
> I haven't found a more general guide to buffers: when to allocate them, when to reuse them, how to pass them between devices, who owns FrameBuffer objects, etc. I think having something like this would make first encounters go smoothly.

I'm afraid we've tried, but we have not been able to completely separate
the V4L2 specificities implemented by the V4L2VideoDevice class from a
more general description of FrameBuffer.

If you're not interested in dealing with the V4L2 internals
you shouldn't be required to understand what happens in the video
device, but you should indeed be allowed to easily use the FrameBuffer
class.

>
> Overall, I don't want to come out as a complainer :) So in exchange for an explanation over email, I offer including what I learn in the docs.

Absolutely, that's helpful feedback and I hope the references here
help your understanding of the background and lead to better
documentation of the FrameBuffer class.

Thanks
   j
>
> Regards,
> Dorota

