[PATCH 03/27] libcamera: dma_buf_allocator: Favour udmabuf over cma heap allocations
Laurent Pinchart
laurent.pinchart at ideasonboard.com
Thu Apr 24 16:14:54 CEST 2025
On Thu, Apr 24, 2025 at 02:51:42PM +0100, Bryan O'Donoghue wrote:
> On 24/04/2025 14:17, Nicolas Dufresne wrote:
> >> even if the CPU only 'reads' the data ?
> > It's any CPU write -> GPU access that is in serious trouble on recent
> > Intel. That being said, if nothing either flushes or invalidates the
> > cache, and you read before, write with the GPU, and read again from
> > that same buffer, the data you'll read may come from a stale cache. In
> > practice, GPUs don't do anything in-place, so that is a non-issue here.
>
> Well, we have the GPU write to its own frame buffer, then currently
> get a GBM pointer to that surface and memcpy() out into the libcamera
> buffer, be that CMA or udmabuf.
>
> + if (data_len > framesize_) {
> + 	LOG(GBM, Error) << "Invalid read size " << data_len
> + 			<< " max is " << framesize_;
> + 	return -EINVAL;
> + }
> +
> + gbm_surface_lock_front_buffer(gbm_surface_);
> +
> + sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ;
> + ioctl(bo_fd_, DMA_BUF_IOCTL_SYNC, &sync);
> +
> + memcpy(data, map_, data_len);
> +
> + sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
> + ioctl(bo_fd_, DMA_BUF_IOCTL_SYNC, &sync);
> +
> + gbm_surface_release_buffer(gbm_surface_, gbm_bo_);
> +
>
> That, I believe, should be cache-coherent for both CMA and udmabuf
> across architectures.
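
For reference, that bracketing expands to roughly the following
self-contained sketch. This is only an illustration, not the patch
code: readFromDmabuf() and its parameters are made-up names, while
struct dma_buf_sync and DMA_BUF_IOCTL_SYNC come from linux/dma-buf.h.

#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/*
 * Copy data_len bytes out of a CPU-mapped dma-buf, with the cache
 * maintenance bracketing required by the dma-buf API. 'fd' is the
 * dma-buf file descriptor and 'map' its mmap()ed CPU mapping, both
 * assumed to be set up elsewhere.
 */
int readFromDmabuf(int fd, const void *map, void *data, size_t data_len)
{
	struct dma_buf_sync sync = {};

	/* Tell the exporter a CPU read access is starting. */
	sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ;
	if (ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
		return -errno;

	memcpy(data, map, data_len);

	/* ... and that it has ended. */
	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
	if (ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
		return -errno;

	return 0;
}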
>
> We had an interesting discussion on how render-to-texture would work
> if you created the render texture using eglCreateImageKHR.
>
> // render to texture
>
> dmabuf_fd = fd of the CMA or udmabuf buffer.
>
> EGLint attrs[] = {
> 	EGL_WIDTH, width,
> 	EGL_HEIGHT, height,
> 	EGL_LINUX_DRM_FOURCC_EXT, fourcc,
> 	EGL_DMA_BUF_PLANE0_FD_EXT, dmabuf_fd,
> 	EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
> 	EGL_DMA_BUF_PLANE0_PITCH_EXT, stride,
> 	EGL_NONE
> };
>
> EGLImageKHR output_image = eglCreateImageKHR(display, EGL_NO_CONTEXT,
> 					      EGL_LINUX_DMA_BUF_EXT,
> 					      NULL, attrs);
>
> glBindTexture(GL_TEXTURE_2D, mytexture_output);
> glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, output_image);
>
> glBindFramebuffer(GL_FRAMEBUFFER, someframebuffer);
> glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
> 		       GL_TEXTURE_2D, mytexture_output, 0);
>
>
> If dmabuf_fd points to a CMA heap buffer - who is responsible for
> flushing the cache before the CPU accesses the content of the dma-buf?
>
> Actually, as I type that, I think the answer is that it is always the
> responsibility of the CPU side to manage its own cache, so..
>
> glDrawArrays(...);
> /* make sure the GPU is done before the CPU touches the buffer */
> glFinish();
>
> sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ;
> ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
>
> /* CPU reads the buffer contents here */
>
> sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
> ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
>
> something like that ..
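
One detail worth noting on top of that: DMA_BUF_IOCTL_SYNC only does
cache maintenance, it doesn't wait for the GPU to finish. If you don't
want a heavyweight glFinish(), the dma-buf fd itself can be poll()ed
to wait for the implicit fences. A rough sketch, with the function
name made up for illustration:

#include <poll.h>

/*
 * Block until the implicit fences attached to the dma-buf have
 * signalled. POLLIN waits until the buffer is safe to read.
 */
int waitDmabufReadable(int fd, int timeout_ms)
{
	struct pollfd pfd = {};

	pfd.fd = fd;
	pfd.events = POLLIN;

	return poll(&pfd, 1, timeout_ms);
}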
>
> >> For GPU ISP, the only access the CPU should perform is reading the
> >> data to generate some statistics. And even that, I would hope, will
> >> in the future be turned into operations performed by the GPU ...
> > So read Bayer, process in the GPU, output YUV should just work.
> > Thanks for the clarification. Running out of CMA is pretty common, as
> > the default CMA reservation is usually very small. Apart from the
> > buggy drivers, virtual memory is a much better choice, and this is
> > mostly what this patch is about here.
>
> Thinking about this patch.
>
> A user on a system such as i.MX/Hantro is free to configure the system
> to support both CMA and udmabuf.
>
> If you never pass that buffer to the video encoder - where the encoder
> requires physically contiguous CMA heap memory - then you probably
> want udmabuf.
Add the display controller here, it's not just the video encoder.
Have you tried displaying images with cam (using the -D flag) on a
system without an IOMMU, with this patch applied? Does it still work?
> The reverse is also true.
>
> It's almost use-case specific. If you want the encoder you need CMA,
> and if you just want to, say, display the output in Cheese, you want
> udmabuf.
>
> Probably the right thing to do is to leave the CMA heap as the default
> and leave it to the system designer to configure the system as they
> wish.
>
> In my case that means switching off the CMA heap, or perhaps adding a
> libcamera config/env variable to specify which allocator to use...
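
Something along those lines could be as simple as an environment
variable checked when the allocator is created. Purely a sketch - the
LIBCAMERA_DMABUF_ALLOCATOR name and the surrounding types are made up
for illustration, not an existing libcamera interface:

#include <cstdlib>
#include <cstring>

/* Hypothetical allocator backends. */
enum class DmaBufProvider { CmaHeap, UDmaBuf };

/*
 * Default to the CMA heap, and let the system designer override the
 * choice through the environment.
 */
DmaBufProvider chooseProvider()
{
	const char *env = std::getenv("LIBCAMERA_DMABUF_ALLOCATOR");

	if (env && !std::strcmp(env, "udmabuf"))
		return DmaBufProvider::UDmaBuf;

	return DmaBufProvider::CmaHeap;
}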
--
Regards,
Laurent Pinchart