<div dir="ltr"><div dir="ltr">Hi Laurent,<div><br></div><div>Thank you for your feedback!</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 11 Nov 2021 at 14:35, Laurent Pinchart <<a href="mailto:laurent.pinchart@ideasonboard.com">laurent.pinchart@ideasonboard.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Naush,<br>
<br>
Thank you for the patch.<br>
<br>
On Wed, Nov 10, 2021 at 10:08:01AM +0000, Naushir Patuck wrote:<br>
> For simplicity, the pipeline handler would look at the maximum number of<br>
> buffers set in the StreamConfiguration and allocate the same number of internal<br>
> buffers for all device nodes. This would likely overallocate buffers for some<br>
> nodes. Rework this logic to try and minimise overallcations without compromising<br>
> performance.<br>
> <br>
> For ISP nodes, we only ever need 1 set of internal buffers, as the hardware runs<br>
> synchronous with the requests and IPA.<br>
> <br>
> For Unicam nodes, allocate a minimum for 4 buffers (exported + internal), but<br>
> also require at least 1 internal buffer.<br>
> <br>
> Signed-off-by: Naushir Patuck <<a href="mailto:naush@raspberrypi.com" target="_blank">naush@raspberrypi.com</a>><br>
> ---<br>
> .../pipeline/raspberrypi/raspberrypi.cpp | 44 ++++++++++++++-----<br>
> 1 file changed, 33 insertions(+), 11 deletions(-)<br>
> <br>
> diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp<br>
> index 11d3c2b120dd..5f0f00aacd59 100644<br>
> --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp<br>
> +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp<br>
> @@ -1211,21 +1211,43 @@ int PipelineHandlerRPi::queueAllBuffers(Camera *camera)<br>
> int PipelineHandlerRPi::prepareBuffers(Camera *camera)<br>
> {<br>
> RPiCameraData *data = cameraData(camera);<br>
> + unsigned int numBuffers;<br>
<br>
I'd name the variable numRawBuffers.<br>
<br>
> + bool rawStream = false;<br>
> int ret;<br>
> <br>
> - /*<br>
> - * Decide how many internal buffers to allocate. For now, simply look<br>
> - * at how many external buffers will be provided. We'll need to improve<br>
> - * this logic. However, we really must have all streams allocate the same<br>
> - * number of buffers to simplify error handling in queueRequestDevice().<br>
<br>
Does the error logic still work after this change ?<br></blockquote><div><br></div><div>Yes, the comment above has been bit-rotted, and this is handled correctly now.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> - */<br>
> - unsigned int maxBuffers = 0;<br>
> - for (const Stream *s : camera->streams())<br>
> - if (static_cast<const RPi::Stream *>(s)->isExternal())<br>
> - maxBuffers = std::max(maxBuffers, s->configuration().bufferCount);<br>
> + for (Stream *s : camera->streams()) {<br>
> + if (isRaw(s->configuration().pixelFormat)) {<br>
> + numBuffers = s->configuration().bufferCount;<br>
> + rawStream = true;<br>
> + break;<br>
> + }<br>
> + }<br>
> <br>
> + /* Decide how many internal buffers to allocate. */<br>
> for (auto const stream : data->streams_) {<br>
> - ret = stream->prepareBuffers(maxBuffers);<br>
<br>
And here add a numBuffers local variable.<br>
<br>
> + if (stream == &data->unicam_[Unicam::Image] ||<br>
> + stream == &data->unicam_[Unicam::Embedded]) {<br>
> + /*<br>
> + * For Unicam, allocate a minimum of 4 buffers as we want<br>
> + * to avoid any frame drops. If an application has configured<br>
> + * a RAW stream, allocate additional buffers to make up the<br>
> + * minimum, but ensure we have at least 1 set of internal<br>
> + * buffers to use to minimise frame drops.<br>
> + */<br>
> + constexpr unsigned int minBuffers = 4;<br>
> + numBuffers = rawStream ? std::max<int>(1, minBuffers - numBuffers)<br>
> + : minBuffers;<br>
<br>
You could initialize numRawBuffers to 0 and drop rawStream, as if<br>
!rawStream, then numRawBuffers will be equal to 0, so<br>
<br>
numBuffers = std::max<int>(1, minBuffers - numRawBuffers);<br>
<br>
would do the right thing.<br>
<br>
> + } else {<br>
> + /*<br>
> + * Since the ISP runs synchronous with the IPA and requests,<br>
> + * we only ever need one set of internal buffers. Any buffers<br>
> + * the application wants to hold onto will already be exported<br>
> + * through PipelineHandlerRPi::exportFrameBuffers().<br>
> + */<br>
> + numBuffers = 1;<br>
> + }<br>
> +<br>
> + ret = stream->prepareBuffers(numBuffers);<br>
<br>
I have a hard time follow this. Before the patch, the pipeline handler<br>
calls prepareBuffers() with the number of buffers requested by the<br>
application. Now it calls it with a number of internal buffers only,<br>
without any change to the prepareBuffers() function itself. This would<br>
benefit from an explanation in the commit message, as it's not clear why<br>
it's correct.<br></blockquote><div><br></div><div>For clarification, prepareBuffers() would (and still will) allocate internal use</div><div>buffers, and call importBuffers() to allocate the buffer cache with the internal</div><div>+ user requested buffer count. Buffer allocations for the user requested buffer<br></div><div>count goes through PipelineHandler::exportFrameBuffers().</div><div><br></div><div>So this change (somewhat) decouples the user count from the internal count.</div><div>I will clarify this in the commit message.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Furthermore, what will happen if an application requests 4 buffers for<br>
the raw stream and any number of buffers for a processed streams, and<br>
repeatedly queues requests with no buffer for the raw stream ? It seems<br>
to me that you'll now allocate a single extra buffer for the raw stream,<br>
which won't be enough to keep the unicam queue running.<br></blockquote><div><br></div><div>This is a tricky one. Without any hints from the application as to what it intends</div><div>to do, I have to balance memory usage with performance. You are right that</div><div>with only a single internal buffer allocated, the performance may be degraded</div><div>if the application does not provide buffers through the Request. However,</div><div>If the application has allocated 4x 18Mbyte buffers (12mpix 12-bit), I really don't</div><div>want to be allocating 2x more of them when they possibly may never be used</div><div>(assuming the application is circulating all these buffers constantly). I would hope</div><div>application writers would be using those buffers sensibly :-)</div><div><br></div><div>Some of our platforms have very small CMA heaps to work with, memory will</div><div>get very tight for these use-cases, so I do want to limit memory usage at the</div><div>expense of performance. Some hints from the application on how it intends to</div><div>use buffers might help here with this balance.</div><div><br></div><div>David, do you think we can bump the internal allocation up to 2 buffers?</div><div><br></div><div>Regards,</div><div>Naush</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> if (ret < 0)<br>
> return ret;<br>
> }<br>
<br>
-- <br>
Regards,<br>
<br>
Laurent Pinchart<br>
</blockquote></div></div>