[libcamera-devel] API design feedback

Fri Oct 8 12:35:28 CEST 2021

Hi Dorota,

Thanks for your feedback, and sorry for the delay in the reply.

On Sat, Sep 18, 2021 at 06:25:00PM +0200, Dorota Czaplejewicz wrote:
> Hi,
> 
> I was asked to review the libcamera API, so here's my perspective,
> colored by my Rust experience (prefer RAII, dumb data, explicit
> dependencies), and by being a relative newbie to libcamera (still
> remember what confused me at first).

That's very valuable, it points the shortcomings of the documentation,
if not the API design.

> One of the first things is that the `libcamera` namespace has both
> elements meant for the consumers of the API, and implementation
> details. I've been told that libcamera wants to abstract the
> underlying API like media controller, and meant to be private, so
> things like `class MediaEntity` feel out of place when they are placed
> in the root namespace.

Those classes are defined in headers in a separate directory
(include/libcamera/internal/) and they're not installed. They are
however documented using Doxygen and end up mixed with all public
classes. We're planning to split the documentation in two parts, so the
public API documentation will not show those API elements anymore. Would
that be enough in your opinion, or would a separate namespace be a good
additional improvement ?

> Nitpick about `CameraConfiguration::validate()` - I take "validate" to
> be a C++ const function giving a "valid/invalid" answer. This call
> actually modifies the data, so it should be rather named "adjust".

We have an API naming scheme that prefixes adjectives with "is" (e.g.
isValid()), and uses verbs in the imperative mood for functions that
perform an action (this is mostly derived from Qt). That's undocumented
though, and it should be fixed.

Regarding validate() vs. adjust(), I've noted the comment and we'll take
it into account when refactoring that API (which is needed for other
reasons, including some that you have pointed out below).

> About explicitness: it's not clear what validate() checks against.
> Checking SimpleCameraConfiguration, the config has free access to a
> Camera instance, and that means that validate() potentially calls out
> to the camera driver, but it's not clear from the code. Some problems
> show up in the API: the returned enum has no "the camera has died,
> impossible to answer". I see 2 ways to escape, depending on what the
> intention in the API is:
> 
> 1. If calling to the HW is needed, remove the camera from the config
> and do `Camera::validate(CameraConfguration)` (or the inverse),
> 2. If validation doesn't need to talk to HW, purge Camera instances
> from CameraConfiguration subclasses,

It depends on the pipeline handlers, I expect the more specific pipeline
handlers (RPi, IPU3, ...) to have enough knowledge about the device to
implement validate() without calling to the kernel, while the simple
pipeline handler or the UVC pipeline handler will have to validate the
configuration with the help of the drivers (and in the UVC case, even
the help of the hardware).

> That has the additional benefit that the user can construct the
> desired parameters without having to call out to the Camera class
> beforehand.

Note that we subclass the CameraConfiguration in pipeline handlers to
associate private data with the configuration (computed at validate()
time, and then used at configure() time), so applications would still
not be able to allocate a CameraConfiguration on their own. That could
be changed though, with a different mechanism to store the private data
without subclassing. This is something I'll experiment with as part of
the configuration API rework.

> I think that's on the docs, but there's nothing preventing the user
> from attempting to configure the camera with unvalidated config, but
> there's a kind of a hole saying "config must be validated" without
> saying what happens when it's not. Making config pure data solves that
> too.

Camera::configure() will return an error if the configuration isn't
valid (we could document that). The design rationale was to force
applications to validate the configuration, so they would be aware of
possible adjustments. With applications forced to validate the
configuration, the pipeline handler implementation of configure() can
then rely on a fully valid configuration, which simplifies
implementation of pipeline handlers by separating validation and
configuration. We could refactor that, but the separation of validation
and configuration on the pipeline handler side is something I'd really
like to keep.

> About RAII: I noticed that both CameraManager and Camera suffer from
> manual management. Both of them use something close to a stack (even
> though Camera has a state machine which IMO is too general):
> 
> CameraManager -> start (Started)
> 
> Here I'm not even sure why `start` is even useful. I don't see a
> reason to create a CameraManager and then not start it. Then the
> destructor can be responsible for stopping it.

That was mostly done to be able to return an error code to the
application, but in hindsight it was probably not a good idea. We'll
likely drop start() and stop(), in the uncommon use case where an
application would need to restart a camera manager, it can delete and
recreate it (that's cheap compared to start() and stop()).

> Camera -> acquire (Exclusive) -> configure (Configured) -> start (Started)
> 
> Here's also a stack, with only `release` making it somewhat
> complicated. 
> 
> The problem is that there are 4 states, and each state only exposes a
> subset of functionality. A lof of functionality is gated behind state
> checks, and it's all lumped inside a single class, making the docs
> harder to read - which functionality is available when? Each method
> needs a manual declaration for that, and manual work is fallible.
> 
> If the states were split into separate classes (3? 4?), then those
> problems are solved at the cost of some extra destructor checks, and
> more granular classes. Then, the user doesn't get their hopes crushed
> at runtime when they call the wrong method, but the compiler is going
> to stop them already, and there's no risk of leaving some calls on the
> wrong state. Additionally, the user doesn't need to call `release()`
> and `stop()` (call them in the destructor), and knows exactly when to
> do what (IDE completions).
> 
> An example usage:
> 
> ```
> {
> 	cm = CameraManager()
> 	camera = cm.get("foo")
> 	camera.disconnected.connect(disco_handler)
> 	controls = camera.controls()
> 	// is there anything interesting to do with an exclusive, unconfigured camera?
> 	// it's part of the problem that I don't know at a glance

One use case is for applications that use multiple cameras, they may
need to ensure exclusive access to all cameras before they configure
them.

> 	camera_exclusive = camera.configure(my_config)
> 	req = camera_exclusive.create_request()
> 	camera_exclusive.bufferCompleted.connect(buffer_handler)
> 	camera_stream = camera_exclusive.start()
> 	// there's some freedom about where the handlers belong to.
> 	// I think the following form is more explicit: the user can't forget about them.
> 	// camera_stream = camera_exclusive.start(buffer_handler, request_handler)

Our current API is based on signals (inspired by Qt) to decouple the
emitter objects from their users. Passing callback functions manually
make it more cumbersome. Pre-C++11 it was difficult, if not impossible,
to use class members as callback functions in a generic way that
wouldn't require an implementation of the horrible Java listener
pattern. With lambdas it got better, but I still don't really like
having to store pointers to the functions manually. Let's also note that
signals are thread-aware, if the emitter and receiver are bound to
different thread, signal delivery will go through a message queue. This
is a feature that would not be nice to replicate manually in every
emitter. Of course, we could still pass callback functions manually, and
connect them to signals internally. One upside would be that the Signal
class (and the associated BoundMethod classes) wouldn't need to be
exposed in the public API, reducing the API surface.

Based on this explanation, what do you think would be best ?

> } // destructors stop and release the camera and the manager without manual fiddling
> ```
> 
> The destructor tradeoff is that the `~Camera` destructor shouldn't be
> allowed to proceed while `CameraStream` is present. That might be
> exception-worthy.

We don't use exceptions, but we could ASSERT().

Would reference-counting Camera help here ? Same for CameraExclusive,
the CameraStream could take a reference to it. The downside is that
CameraExclusive would need to be handled as a shared pointer.

This being said, we trade a bit one problem for another, what happens if
the application calls camera_exclusive.start() multiple times ? Should
calls beyond the first one return an error (as a null unique pointer
in this case) ? We will still need a state machine in the Camera class
(or possibly CameraExclusive) to handle this.

> I haven't looked into the Allocator, but I suspect something similar
> is going on there, with a similar solution.
> 
> Stemming clearly from Rust and reinforced by untold hours lost
> debugging stupid mistakes, please don't leave the duty of correctness
> on the caller when it can be avoided, especially outside of hot loops.

Absolutely, that's something we've already done in the API design. You
don't want to hear about all the APIs that were proposed and that got
shot down, to be replaced by better APIs that made the error impossible
in the first place. That's one of our big API design rules (also
inspired by Qt): the API should be easy and intuitive to use correctly,
and impossible, or at least hard, to misuse. We of course have to do
with the tools at our disposal (in this case a C++ compiler), so there
are misuses that we can't make impossible or even very hard, but we try
as much as we can.

> Debugging undefined behaviour generally costs me more time than it
> would maintaining the needed bounds checks. As an example:
> 
> > `CameraConfiguration::at`  Calling this function with an invalid index results in undefined behaviour.
> 
> Can this instead throw an exception? That's much more explainable. I
> even volunteer to eliminate all places where UB is part of the
> contract.

Undefined behaviour pretty much means that the program will crash,
usually due to an ASSERT(). That's not very different from unhandled
exceptions. Unlike some undefined behaviours of the C language that are
used for clever purposes (mostly too clever ones, that then bite back),
here we're essentially telling that it will get very, very bad, so bad
that a crash is the most likely outcome.

Would it be better to explicitly state that the program will abort()
instead of mentioning undefined behaviour ?

> I may be back with more once I start using the library.

-- 
Regards,

Laurent Pinchart