[libcamera-devel] [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification

Tue May 16 15:50:14 CEST 2023

Hello Laurent,

+CC: Andrii

On 06.05.23 10:16, Laurent Pinchart wrote:
> I'm also CC'ing the linux-media at vger.kernel.org mailing list for these
> discussions, I'm sure there are folks there who are interested in codec
> and camera virtualization.
>
> On Sat, May 06, 2023 at 11:12:29AM +0300, Laurent Pinchart via libcamera-devel wrote:
>> On Fri, May 05, 2023 at 04:55:33PM +0100, Alex Bennée via libcamera-devel wrote:
>>> Kieran Bingham writes:
>>>> Quoting Alexander Gordeev (2023-05-05 10:57:29)
>>>>> On 03.05.23 17:53, Cornelia Huck wrote:
>>>>>> On Wed, May 03 2023, Alex Bennée <alex.bennee at linaro.org> wrote:
>>>>>>> Cornelia Huck <cohuck at redhat.com> writes:
>>>>>>>> On Fri, Apr 28 2023, Alexander Gordeev <alexander.gordeev at opensynergy.com> wrote:
>>>>>>>>> On 27.04.23 15:16, Alexandre Courbot wrote:
>>>>>>>>>> But in any case, that's irrelevant to the guest-host interface, and I
>>>>>>>>>> think a big part of the disagreement stems from the misconception that
>>>>>>>>>> V4L2 absolutely needs to be used on either the guest or the host,
>>>>>>>>>> which is absolutely not the case.
>>>>>>>>>
>>>>>>>>> I understand this, of course. I'm arguing that it is harder to
>>>>>>>>> implement it, get it straight and then maintain it over years. Also it
>>>>>>>>> brings limitations that sometimes can be worked around in the virtio
>>>>>>>>> spec, but this always comes at a cost of decreased readability and
>>>>>>>>> increased complexity. Overall it looks clearly like a downgrade compared
>>>>>>>>> to virtio-video for our use-case. And I believe it would be the same for
>>>>>>>>> every developer who has to actually implement the spec, not just do
>>>>>>>>> the pass through. So if we think of V4L2 UAPI pass through as a
>>>>>>>>> compatibility device (which I believe it is), then it is fine to have
>>>>>>>>> both and keep improving the virtio-video, including taking the best
>>>>>>>>> ideas from the V4L2 and overall using it as a reference to make writing
>>>>>>>>> the driver simpler.
>>>>>>>>
>>>>>>>> Let me jump in here and ask another question:
>>>>>>>>
>>>>>>>> Imagine that, some years in the future, somebody wants to add a virtio
>>>>>>>> device for handling video encoding/decoding to their hypervisor.
>>>>>>>>
>>>>>>>> Option 1: There are different devices to choose from. How is the person
>>>>>>>> implementing this supposed to pick a device? They might have a narrow
>>>>>>>> use case, where it is clear which of the devices is the one that needs to
>>>>>>>> be supported; but they also might have multiple, diverse use cases, and
>>>>>>>> end up needing to implement all of the devices.
>>>>>>>>
>>>>>>>> Option 2: There is one device with various optional features. The person
>>>>>>>> implementing this can start off with a certain subset of features
>>>>>>>> depending on their expected use cases, and add to it later, if needed;
>>>>>>>> but the upfront complexity might be too high for specialized use cases.
>>>>>>>>
>>>>>>>> Leaving concrete references to V4L2 out of the picture, we're currently
>>>>>>>> trying to decide whether our future will be more like Option 1 or Option
>>>>>>>> 2, with their respective trade-offs.
>>>>>>>>
>>>>>>>> I'm slightly biased towards Option 2; does it look feasible at all, or
>>>>>>>> am I missing something essential here? (I had the impression that some
>>>>>>>> previous confusion had been cleared up; apologies in advance if I'm
>>>>>>>> misrepresenting things.)
>>>>>>>>
>>>>>>>> I'd really love to see some kind of consensus for 1.3, if at all
>>>>>>>> possible :)
>>>>>>>
>>>>>>> I think feature discovery and extensibility are a key part of the VirtIO
>>>>>>> paradigm which is why I find the virtio-v4l approach limiting. By
>>>>>>> pegging the device to a Linux API we effectively limit the growth of the
>>>>>>> device specification to as fast as the Linux API changes. I'm not fully
>>>>>>> immersed in v4l but I don't think it is seeing any additional features
>>>>>>> developed for it and its limitations for camera are one of the reasons
>>>>>>> stuff is being pushed to userspace in solutions like libcamera:
>>>>>>>
>>>>>>>     How is libcamera different from V4L2?
>>>>>>>
>>>>>>>     We see libcamera as a continuation of V4L2. One that can more easily
>>>>>>>     handle the recent advances in hardware design. As embedded cameras have
>>>>>>>     developed, all of the complexity has been pushed on to the developers.
>>>>>>>     With libcamera, all of that complexity is simplified and a single model
>>>>>>>     is presented to application developers.
>>>>>>
>>>>>> Ok, that is interesting; thanks for the information.
>>>>>>
>>>>>>>
>>>>>>> That said, it's not totally outside our experience to have virtio devices
>>>>>>> act as simple pipes for some higher level protocol. The virtio-gpu spec says
>>>>>>> very little about the details of how 3D devices work and simply offers
>>>>>>> an opaque pipe to push a (potentially proprietary) command stream to the
>>>>>>> back end. As far as I'm aware the proposals for Vulkan and Wayland
>>>>>>> device support don't even offer a feature bit but simply change the
>>>>>>> graphics stream type in the command packets.
>>>>>>>
>>>>>>> We could just offer a VIRTIO_VIDEO_F_V4L feature bit, document it as
>>>>>>> incompatible with other feature bits and make that the baseline
>>>>>>> implementation but it's not really in the spirit of what VirtIO is
>>>>>>> trying to achieve.
>>>>>>
>>>>>> I'd not be in favour of an incompatible feature flag,
>>>>>> either... extensions are good, but conflicting features is something
>>>>>> that I'd like to avoid.
>>>>>>
>>>>>> So, given that I'd still prefer to have a single device: How well does
>>>>>> the proposed virtio-video device map to a Linux driver implementation
>>>>>> that hooks into V4L2?
>>>>>
>>>>> IMO it hooks into V4L2 pretty well. And I'm going to spend next few
>>>>> months making the existing driver fully V4L2 compliant. If this goal
>>>>> requires changing the spec, then we still have time to do that. I don't
>>>>> expect a lot of problems on this side. There might be problems with
>>>>> Android using V4L2 in weird ways. Well, let's see. Anyway, I think all
>>>>> of this can be accomplished over time.
>>>>>
>>>>>> If the general process flow is compatible and it
>>>>>> is mostly a question of wiring the parts together, I think pushing that
>>>>>> part of the complexity into the Linux driver is a reasonable
>>>>>> trade-off. Being able to use an existing protocol is nice, but if that
>>>>>> protocol is not perceived as flexible enough, it is probably not worth
>>>>>> encoding it into a spec. (Similar considerations apply to hooking up the
>>>>>> device in the hypervisor.)
>>>>>
>>>>> I very much agree with these statements. I think this is how it should
>>>>> be: we start with a compact but usable device, then add features and
>>>>> enable them using feature flags. Eventually we can cover all the
>>>>> use-cases of V4L2 unless we decide to have separate devices for them
>>>>> (virtio-camera, etc). This would be better in the long term I think.
>>>>
>>>> Cameras definitely have their quirks - mostly because many use cases are
>>>> hard to convey over a single Video device node (with the hardware) but I
>>>> think we might expect that complexity to be managed by the host, and
>>>> probably offer a ready made stream to the guest. Of course how to handle
>>>> multiple streams and configuration of the whole pipeline may get more
>>>> difficult and warrant a specific 'virtio-camera' ... but I would think
>>>> the basics could be covered generically to start with.
>>>>
>>>> It's not clear who's driving this implementation and spec, so I guess
>>>> there's more reading to do.
>>>>
>>>> Anyway, I've added Cc libcamera-devel to raise awareness of this topic
>>>> to camera list.
>>>>
>>>> I bet Laurent has some stronger opinions on how he'd see cameras exist
>>>> in a virtio space.
>>
>> You seem to think I have strong opinions about everything. This may not
>> be a completely unfounded assumption ;-)
>>
>> Overall I agree with you, I think cameras are too complex for a
>> low-level virtualization protocol. I'd rather see a high-level protocol
>> that exposes webcam-like devices, with the low-level complexity handled
>> on the host side (using libcamera of course ;-)). This would support use
>> cases that require sharing hardware blocks between multiple logical
>> cameras, including sharing the same camera streams between multiple
>> guests.
>>
>> If a guest needs low-level access to the camera, including the ability
>> to control the raw camera sensor or ISP, then I'd recommend passing the
>> corresponding hardware blocks to the guest for exclusive access.
>>
>>> Personally I would rather see a separate virtio-camera specification
>>> that properly encapsulates all the various use cases we have for
>>> cameras. In many ways just processing a stream of video is a much
>>> simpler use case.
>>>
>>> During Linaro's Project Stratos we got a lot of feedback from members
>>> who professed interest in a virtio-camera initiative. However we were
>>> unable to get enough engineering resources from the various companies to
>>> collaborate in developing a specification that would meet everyone's
>>> needs. The problem space is wide from having numerous black and white
>>> sensor cameras on cars to the full on computational photography as
>>> exposed by modern camera systems on phones. If you want to read more
>>> words on the topic I wrote a blog post at the time:
>>>
>>>    https://www.linaro.org/blog/the-challenges-of-abstracting-virtio/
>>>
>>> Back to the topic of virtio-video: as I understand it, the principal
>>> features/configurations are:
>>>
>>>    - All the various CODECs, resolutions and pixel formats
>>>    - Stateful vs Stateless streams
>>>    - Whether we want to support grabbing single frames from a source
>>>
>>> My main concern about the V4L approach is that it pegs updates to the
>>> interface to the continuing evolution of the V4L interface in Linux. Now
>>> maybe video is a solved problem and there won't be (m)any new features
>>> we need to add after the initial revision. However I'm not a domain
>>> expert here so I just don't know.
>>
>> I've briefly discussed "virtio-v4l2" with Alex Courbot a few weeks ago
>> when we got a chance to meet face to face. I think the V4L2 kernel API
>> is quite a good fit in terms of its level of abstraction, when
>> applied to video codecs and "simple" cameras (defined, more or less, as
>> something resembling a USB webcam feature-wise). It doesn't mean that
>> the virtio-video or virtio-camera specifications should necessarily
>> reference V4L2 or use the exact same vocabulary; they could simply copy
>> the concepts, and stay loosely coupled with V4L2 in the sense that both
>> specifications should try to evolve in compatible directions.

Thanks for the info.

Would everybody agree to have only a simple USB webcam-like virtual
camera and expect more complex devices to be passed through for
exclusive access to a guest? I don't have my own opinion at the moment.
If we have an agreement here, then it would definitely help us move
forward with the virtio-video/virtio-v4l2 discussion. AFAIU this is what
Alex Bennée called "catering to the lowest common denominator" in his
article. Right? So he prefers to avoid this by using the feature negotiation
built into virtio. Well, I also like to have flexibility. Andrii, what do
you think?
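
As a rough illustration of the feature negotiation mentioned above: the
device offers a set of feature bits, the driver accepts the subset it
understands, and only that subset is used afterwards. This is a sketch
only. The feature names and bit positions below are hypothetical and are
not taken from any virtio-video or virtio-v4l2 draft.

```python
from enum import IntFlag


class Feature(IntFlag):
    # Hypothetical feature bits, for illustration only (not from any spec).
    CODEC_DECODE = 1 << 0
    CODEC_ENCODE = 1 << 1
    SIMPLE_CAMERA = 1 << 2


def negotiate(device_offers: Feature, driver_supports: Feature) -> Feature:
    """Return the features both sides agree on (bitwise AND of the two sets)."""
    return device_offers & driver_supports


# The device offers everything; this driver only knows decode and simple camera.
device = Feature.CODEC_DECODE | Feature.CODEC_ENCODE | Feature.SIMPLE_CAMERA
driver = Feature.CODEC_DECODE | Feature.SIMPLE_CAMERA
agreed = negotiate(device, driver)
```

With this model a driver that knows nothing about cameras would simply
never accept the camera bit, and a codec-only device would never offer it.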

Would V4L2 be enough for the virtual cameras in the future? For me the
existence of libcamera is already a sign that V4L2 (or the way it is
developed) might not be flexible enough for everybody. If somebody has
an issue in the future, they might want to create a new device with an
overlapping scope. Then the same questions would be discussed again.

If we don't yet know the answers to these questions, and we decide to
postpone the decision, then this means no devices could be merged,
right? For me this would be another argument to keep things separate.
Because we already know how to do the codecs. I think there is no
disagreement on this.

Well, basically virtio-video has taken a lot of ideas from V4L2. So in a
sense it is, or tries to be, a subset of V4L2 for the codecs only, but
adapted for the virtualization case. I think it is much better defined
than V4L2 for this scope. I believe it can be extended to support
simple cameras if necessary. However, at the moment I'd prefer to see
a dedicated virtio-camera device, as I said. So I agree with Alex Bennée.

--
Alexander Gordeev
Senior Software Engineer

OpenSynergy GmbH
Rotherstr. 20, 10245 Berlin
