[libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't rely on bufferCount

Laurent Pinchart laurent.pinchart at ideasonboard.com
Mon Aug 2 01:42:53 CEST 2021


Hi Nícolas,

Thank you for the patch.

On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. Prado wrote:
> Pipelines have relied on bufferCount to decide on the number of buffers
> to allocate internally through allocateBuffers() and on the number of
> V4L2 buffer slots to reserve through importBuffers(). Instead, the
> number of internal buffers should be the minimum required by the
> algorithms to avoid wasting memory, and the number of V4L2 buffer slots
> should overallocate to avoid thrashing dmabuf mappings.
> 
> For now, just set them to constants and stop relying on bufferCount, to
> allow for its removal.
> 
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado at collabora.com>
> ---
> 
> No changes in v7
> 
> Changes in v6:
> - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and
>   INTERNAL_BUFFER_COUNT constant
> 
>  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------
>  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-
>  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------
>  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------
>  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------
>  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-
>  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++
>  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--
>  src/libcamera/pipeline/simple/converter.h         |  3 +++
>  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----
>  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--
>  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--
>  12 files changed, 35 insertions(+), 43 deletions(-)

Given that some of the pipeline handlers will need more intrusive
changes to address the comments below, you could split this with one
patch per pipeline handler (or perhaps grouping the easy ones together).

> 
> diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp b/src/libcamera/pipeline/ipu3/imgu.cpp
> index e955bc3456ba..f36e99dacbe7 100644
> --- a/src/libcamera/pipeline/ipu3/imgu.cpp
> +++ b/src/libcamera/pipeline/ipu3/imgu.cpp
> @@ -593,22 +593,22 @@ int ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,
>  /**
>   * \brief Allocate buffers for all the ImgU video devices
>   */
> -int ImgUDevice::allocateBuffers(unsigned int bufferCount)
> +int ImgUDevice::allocateBuffers()
>  {
>  	/* Share buffers between CIO2 output and ImgU input. */
> -	int ret = input_->importBuffers(bufferCount);
> +	int ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);
>  	if (ret) {
>  		LOG(IPU3, Error) << "Failed to import ImgU input buffers";
>  		return ret;
>  	}
>  
> -	ret = param_->allocateBuffers(bufferCount, &paramBuffers_);
> +	ret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &paramBuffers_);
>  	if (ret < 0) {
>  		LOG(IPU3, Error) << "Failed to allocate ImgU param buffers";
>  		goto error;
>  	}
>  
> -	ret = stat_->allocateBuffers(bufferCount, &statBuffers_);
> +	ret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &statBuffers_);
>  	if (ret < 0) {
>  		LOG(IPU3, Error) << "Failed to allocate ImgU stat buffers";
>  		goto error;
> @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int bufferCount)
>  	 * corresponding stream is active or inactive, as the driver needs
>  	 * buffers to be requested on the V4L2 devices in order to operate.
>  	 */
> -	ret = output_->importBuffers(bufferCount);
> +	ret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);
>  	if (ret < 0) {
>  		LOG(IPU3, Error) << "Failed to import ImgU output buffers";
>  		goto error;
>  	}
>  
> -	ret = viewfinder_->importBuffers(bufferCount);
> +	ret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);
>  	if (ret < 0) {
>  		LOG(IPU3, Error) << "Failed to import ImgU viewfinder buffers";
>  		goto error;
> diff --git a/src/libcamera/pipeline/ipu3/imgu.h b/src/libcamera/pipeline/ipu3/imgu.h
> index 9d4915116087..f934a951fc75 100644
> --- a/src/libcamera/pipeline/ipu3/imgu.h
> +++ b/src/libcamera/pipeline/ipu3/imgu.h
> @@ -61,7 +61,7 @@ public:
>  					    outputFormat);
>  	}
>  
> -	int allocateBuffers(unsigned int bufferCount);
> +	int allocateBuffers();
>  	void freeBuffers();
>  
>  	int start();
> @@ -86,6 +86,9 @@ private:
>  	static constexpr unsigned int PAD_VF = 3;
>  	static constexpr unsigned int PAD_STAT = 4;
>  
> +	static constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;
> +	static constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;

5 buffer slots is low. It means that if applications cycle more than 5
buffers, the V4L2VideoDevice cache that maintains associations between
dmabufs and buffer slots will the trashed. Due to the internal queue of
requests in the IPU3 pipeline handler (similar to what you have
implemented in "[PATCH 0/3] libcamera: pipeline: Add internal request
queue" for other pipeline handlers), we won't fail at queuing requests,
but performance will suffer. I thus think we need to increase the number
of slots to what applications can be reasonably expected to use. We
could use 8, or even 16, as buffer slots are cheap. The same holds for
other pipeline handlers.

The number of slots for the CIO2 output should match the number of
buffer slots for the ImgU input, as the same buffers are used on the two
video devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,
instead of CIO2_BUFFER_COUNT. However, the number of internal CIO2
buffers that are allocated by exportBuffers() in CIO2Device::start(), to
be used in case the application doesn't provide any RAW buffer, should
be lower, as those are real buffer and are thus expensive. The number of
buffers and buffer slots on the CIO2 thus needs to be decoupled.

For proper operation, the CIO2 will require at least two queued buffers
(one being DMA'ed to, and one waiting). We need at least one extra
buffer queued to the ImgU to keep buffers flowing. Depending on
processing timings, it may be that the ImgU will complete processing of
its buffer before the CIO2 captures the next one, leading to a temporary
situation where the CIO2 will have three buffers queued, or the CIO2
will finish the capture first, leading to a temporary situation where
the CIO2 will have one buffer queued and the ImgU will have two buffers
queued. In either case, shortly afterwards, the other component will
complete capture or processing, and we'll get back to a situation with
two buffers queued in the CIO2 and one in the ImgU. That's thus a
minimum of three buffers for raw images.

>From an ImgU point of view, we could probably get away with a single
parameter and a single stats buffer. This would however not allow
queuing the next frame for processing in the ImgU before the current
frame completes, so two buffers would be better. Now, if we take the IPA
into account, the statistics buffer will spend some time on the IPA side
for processing. It would thus be best to have an extra statistics buffer
to accommodate that, thus requiring three statistics buffers (and three
parameters buffers, as we associate them together).

This rationale leads to using the same number of internal buffers for
the CIO2, the parameters and the statistics. We currently use four, and
while the logic above indicates we could get away with three, it would
be safer to keep using four in this patch, and possibly reduce the
number of buffers later.

I know documentation isn't fun, but I think this rationale should be
captured in a comment in the IPU3 pipeline handler, along with a \todo
item to try and lower the number of internal buffers to three.

> +
>  	int linkSetup(const std::string &source, unsigned int sourcePad,
>  		      const std::string &sink, unsigned int sinkPad,
>  		      bool enable);
> diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp b/src/libcamera/pipeline/ipu3/ipu3.cpp
> index 5fd1757bfe13..4efd201c05e5 100644
> --- a/src/libcamera/pipeline/ipu3/ipu3.cpp
> +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp
> @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera *camera)
>  {
>  	IPU3CameraData *data = cameraData(camera);
>  	ImgUDevice *imgu = data->imgu_;
> -	unsigned int bufferCount;
>  	int ret;
>  
> -	bufferCount = std::max({
> -		data->outStream_.configuration().bufferCount,
> -		data->vfStream_.configuration().bufferCount,
> -		data->rawStream_.configuration().bufferCount,
> -	});
> -
> -	ret = imgu->allocateBuffers(bufferCount);
> +	ret = imgu->allocateBuffers();
>  	if (ret < 0)
>  		return ret;
>  
> diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp
> index d1cd3d9dc082..776e0f92aed1 100644
> --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp
> +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp
> @@ -1149,20 +1149,15 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera)
>  {
>  	RPiCameraData *data = cameraData(camera);
>  	int ret;
> +	constexpr unsigned int bufferCount = 4;
>  
>  	/*
> -	 * Decide how many internal buffers to allocate. For now, simply look
> -	 * at how many external buffers will be provided. We'll need to improve
> -	 * this logic. However, we really must have all streams allocate the same
> -	 * number of buffers to simplify error handling in queueRequestDevice().
> +	 * Allocate internal buffers. We really must have all streams allocate
> +	 * the same number of buffers to simplify error handling in
> +	 * queueRequestDevice().
>  	 */
> -	unsigned int maxBuffers = 0;
> -	for (const Stream *s : camera->streams())
> -		if (static_cast<const RPi::Stream *>(s)->isExternal())
> -			maxBuffers = std::max(maxBuffers, s->configuration().bufferCount);
> -
>  	for (auto const stream : data->streams_) {
> -		ret = stream->prepareBuffers(maxBuffers);
> +		ret = stream->prepareBuffers(bufferCount);

We have a similar problem here, 4 buffer slots is too little, but when
the stream has to allocate internal buffers (!importOnly), which is the
case for most streams, we don't want to overallocate.

I'd like to get feedback from Naush here, but I think this means we'll
have to relax the requirement documented in the comment above, and
accept a different number of buffers for each stream.

>  		if (ret < 0)
>  			return ret;
>  	}
> diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp b/src/libcamera/pipeline/rkisp1/rkisp1.cpp
> index 11325875b929..f4ea2fd4d4d0 100644
> --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp
> +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp
> @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera *camera)
>  	unsigned int ipaBufferId = 1;
>  	int ret;
>  
> -	unsigned int maxCount = std::max({
> -		data->mainPathStream_.configuration().bufferCount,
> -		data->selfPathStream_.configuration().bufferCount,
> -	});
> -
> -	ret = param_->allocateBuffers(maxCount, &paramBuffers_);
> +	ret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &paramBuffers_);
>  	if (ret < 0)
>  		goto error;
>  
> -	ret = stat_->allocateBuffers(maxCount, &statBuffers_);
> +	ret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &statBuffers_);
>  	if (ret < 0)
>  		goto error;
>  
> diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp
> index 25f482eb8d8e..fea330f72886 100644
> --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp
> +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp
> @@ -172,7 +172,7 @@ int RkISP1Path::start()
>  		return -EBUSY;
>  
>  	/* \todo Make buffer count user configurable. */
> -	ret = video_->importBuffers(RKISP1_BUFFER_COUNT);
> +	ret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);
>  	if (ret)
>  		return ret;
>  
> diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h b/src/libcamera/pipeline/rkisp1/rkisp1_path.h
> index 91757600ccdc..3c5891009c58 100644
> --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h
> +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h
> @@ -27,6 +27,9 @@ class V4L2Subdevice;
>  struct StreamConfiguration;
>  struct V4L2SubdeviceFormat;
>  
> +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;
> +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;

The situation should be simpler for the rkisp1, as it has a different
pipeline model (inline ISP as opposed to offline ISP for the IPU3). We
can allocate more slots (8 or 16, as for other pipeline handlers), and
restrict the number of internal buffers (for stats and parameters) to
the number of requests we expect to queue to the device at once, plus
one for the IPA.  Four thus seems good. Capturing this rationale in a
comment would be good too.

BTW, I may be too tired to think properly, or just unable to see the
obvious, so please challenge any rationale you think is incorrect.

> +
>  class RkISP1Path
>  {
>  public:
> diff --git a/src/libcamera/pipeline/simple/converter.cpp b/src/libcamera/pipeline/simple/converter.cpp
> index b5e34c4cd0c5..b3bcf01483f7 100644
> --- a/src/libcamera/pipeline/simple/converter.cpp
> +++ b/src/libcamera/pipeline/simple/converter.cpp
> @@ -103,11 +103,11 @@ int SimpleConverter::Stream::exportBuffers(unsigned int count,
>  
>  int SimpleConverter::Stream::start()
>  {
> -	int ret = m2m_->output()->importBuffers(inputBufferCount_);
> +	int ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);

Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? Overallocating is not
much of an issue I suppose.

>  	if (ret < 0)
>  		return ret;
>  
> -	ret = m2m_->capture()->importBuffers(outputBufferCount_);
> +	ret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);
>  	if (ret < 0) {
>  		stop();
>  		return ret;
> diff --git a/src/libcamera/pipeline/simple/converter.h b/src/libcamera/pipeline/simple/converter.h
> index 276a2a291c21..7e1d60674f62 100644
> --- a/src/libcamera/pipeline/simple/converter.h
> +++ b/src/libcamera/pipeline/simple/converter.h
> @@ -29,6 +29,9 @@ class SizeRange;
>  struct StreamConfiguration;
>  class V4L2M2MDevice;
>  
> +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;
> +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;

Let's name the variables kSimpleInternalBufferCount and
kSimpleBufferSlotCount, as that's the naming scheme we're moving to for
non-macro constants. Same comment elsewhere in this patch.

Those constants don't belong to converter.h. Could you turn them into
member constants of the SimplePipelineHandler class, as
kNumInternalBuffers (which btw should be removed) ? The number of buffer
slots can be passed as a parameter to SimpleConverter::start().

There's no stats or parameters here, and no IPA, so the situation is
different than for IPU3 and RkISP1. The number of internal buffers
should just be one more than the minimum number of buffers required by
the capture device, I don't think there's another requirement.

> +
>  class SimpleConverter
>  {
>  public:
> diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp
> index 1c25a7344f5f..a1163eaf8be2 100644
> --- a/src/libcamera/pipeline/simple/simple.cpp
> +++ b/src/libcamera/pipeline/simple/simple.cpp
> @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera, [[maybe_unused]] const ControlL
>  		 * When using the converter allocate a fixed number of internal
>  		 * buffers.
>  		 */
> -		ret = video->allocateBuffers(kNumInternalBuffers,
> +		ret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,
>  					     &data->converterBuffers_);
>  	} else {
> -		/* Otherwise, prepare for using buffers from the only stream. */
> -		Stream *stream = &data->streams_[0];
> -		ret = video->importBuffers(stream->configuration().bufferCount);
> +		ret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);
>  	}
>  	if (ret < 0)
>  		return ret;
> diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp
> index fd39b3d3c72c..755949e7a59a 100644
> --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp
> +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp
> @@ -91,6 +91,8 @@ private:
>  		return static_cast<UVCCameraData *>(
>  			PipelineHandler::cameraData(camera));
>  	}
> +
> +	static constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;
>  };
>  
>  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)
> @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera *camera,
>  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const ControlList *controls)
>  {
>  	UVCCameraData *data = cameraData(camera);
> -	unsigned int count = data->stream_.configuration().bufferCount;
>  
> -	int ret = data->video_->importBuffers(count);
> +	int ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);

For the uvc and vimc pipeline handlers, we have no internal buffers, so
it's quite easy. We should have 8 or 16 slots, as for other pipeline
handlers.

>  	if (ret < 0)
>  		return ret;
>  
> diff --git a/src/libcamera/pipeline/vimc/vimc.cpp b/src/libcamera/pipeline/vimc/vimc.cpp
> index e89d53182c6d..24ba743a946c 100644
> --- a/src/libcamera/pipeline/vimc/vimc.cpp
> +++ b/src/libcamera/pipeline/vimc/vimc.cpp
> @@ -102,6 +102,8 @@ private:
>  		return static_cast<VimcCameraData *>(
>  			PipelineHandler::cameraData(camera));
>  	}
> +
> +	static constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;
>  };
>  
>  namespace {
> @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera *camera,
>  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const ControlList *controls)
>  {
>  	VimcCameraData *data = cameraData(camera);
> -	unsigned int count = data->stream_.configuration().bufferCount;
>  
> -	int ret = data->video_->importBuffers(count);
> +	int ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);
>  	if (ret < 0)
>  		return ret;
>  

-- 
Regards,

Laurent Pinchart


More information about the libcamera-devel mailing list