[libcamera-devel] [PATCH] libcamera: v4l2_videodevice: Handle unexpected buffers
Kieran Bingham
kieran.bingham at ideasonboard.com
Thu Aug 12 15:09:18 CEST 2021
Hi Laurent,
On 28/07/2021 12:26, Laurent Pinchart wrote:
> Hi Kieran,
>
> On Wed, Jul 28, 2021 at 09:51:19AM +0100, Kieran Bingham wrote:
>> On 28/07/2021 07:24, Laurent Pinchart wrote:
>>> On Thu, Jul 15, 2021 at 03:21:30PM +0100, Kieran Bingham wrote:
>>>> A kernel bug can lead to unexpected buffers being dequeued where we
>>>> haven't entered the buffer in our queuedBuffers_ list.
>>>>
>>>> This causes invalid accesses if not handled correctly within libcamera,
>>>> and while it is a kernel issue, we must protect against unpatched
>>>> kernels.
>>>>
>>>> Handle unexpected buffers by returning a nullptr, and move cache
>>>> management after the validation of the buffer.
>>>>
>>>> Signed-off-by: Kieran Bingham <kieran.bingham at ideasonboard.com>
>>>> ---
>>>> src/libcamera/v4l2_videodevice.cpp | 21 ++++++++++++++++++++-
>>>> 1 file changed, 20 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
>>>> index 3d2d99b46e4e..6c7f9daf24db 100644
>>>> --- a/src/libcamera/v4l2_videodevice.cpp
>>>> +++ b/src/libcamera/v4l2_videodevice.cpp
>>>> @@ -1519,9 +1519,28 @@ FrameBuffer *V4L2VideoDevice::dequeueBuffer()
>>>>
>>>> LOG(V4L2, Debug) << "Dequeuing buffer " << buf.index;
>>>>
>>>> + auto it = queuedBuffers_.find(buf.index);
>>>> + /*
>>>> + * If the video node fails to stream-on successfully (which can occur
>>>> + * when queing a buffer), a vb2 kernel bug can lead to the buffer which
>>>> + * returns a failure upon queing, being mistakenely kept in the kernel.
>>>
>>> s/queing,/queing/
>>>
>>> You've mistakenly spelled mistakenly mistakenely :-)
>>
>> You're too slow ;-)
>>
>> Highlighted in my reply to myself already.
>>
>>>> + * This leads to the kernel notifying us that a buffer is available to
>>>> + * dequeue, which we have no awareness of being queued, and thus we will
>>>> + * not find it in the queuedBuffers_ list.
>>>
>>> It may be worth expanding this a bit to explain about the
>>> stream-on-at-qbuf-time issue. If I recall correctly, given that the
>>> device fails to stream, the issue only occurs when we call
>>> VIDIOC_STREAMOFF and all buffers currently in queue (including the one
>>> erroneously kept in the queue) being signalled as complete in the error
>>> state. Is that right ?
>>
>> I don't think we're calling stream off because we haven't called stream
>> on (by nature of the fact that it was a failure at stream-on-at-qbuf-time?).
>
> Strream-on-at-qbuf means that STREAMON is called and succeeds, but
> doesn't reach the driver because vb2 blocks the STREAMON until enough
> buffers are available. When a QBUF provides the needed buffer, vb2 calls
> the .start_streaming() operation, which starts the device. A failure at
> that point is reported as a QBUF failure, not a STREAMON failure has
> that as completed long before. From a userspace point of view, the V4L2
> video node is streaming, and thus need to the stopped with STREAMOFF.
>
>> Although, perhaps we do as the queueBuffer could lead to an error path
>> that calls cio2_->stop() which does call streamOff regardless, but it's
>> a fairly moot point by that stage.
>>
>> The key issue is that the kernel notifies us of a buffer as completed in
>> error state, that we believed didn't queue - because it failed to
>> complete in Q_BUF.
>
> rMmy point is that that notification occurs at STREAMOFF time, when all
> queued buffers are given back to userspace. The device hasn't been
> started due to the .start_streaming() failure, so the kernel doesn't
> produce any buffer ready event until STREAMOFF time.
>
>>>> + *
>>>> + * Whilst this is a kernel bug and should be fixed there, ensure that we
>>>> + * safely ignore buffers which are unexpected to prevent crashes on
>>>> + * unpatched kernels.
>>>
>>> Is the fix on its way upstream ? If so, this could probably be reworded,
>>> and mentioning the target kernel version (and even better, the upstream
>>> commit) would be nice.
>>
>> I've given Tested-by: tags to Hans on his proof-of-concept patch, I
>> haven't seen any updated patch come through from him though, so I don't
>> currently know the progress.
>>
>> It may need chasing, or taking over if he doesn't have time.
>
> Let's start the chase :-)
The relevant kernel patch is posted, and headed to the stable trees, so
it will be picked up by CrOS soon enough.
Do you want to explicitly drop this patch, or are you happy to keep it?
I still think it has some merit, as it's better to report the issue
cleanly in the event that it happens ... however, its' a rare kernel bug
(I expect) that is already patched, and thus should become ... rarer ...
--
Kieran
>
>>>> + */
>>>> + if (it == queuedBuffers_.end()) {
>>>> + LOG(V4L2, Error)
>>>> + << "Dequeued an unexpected buffer:" << buf.index;
>>>
>>> Missing space after ':'.
>>
>> Also in my self-review and already fixed locally ;-)
>>
>>>> +
>>>
>>> You can drop the blank line.
>>
>> Ack.
>>
>>>> + return nullptr;
>>>> + }
>>>> +
>>>> cache_->put(buf.index);
>>>>
>>>> - auto it = queuedBuffers_.find(buf.index);
>>>> FrameBuffer *buffer = it->second;
>>>> queuedBuffers_.erase(it);
>>>>
>
More information about the libcamera-devel
mailing list