[RFC PATCH v1 08/23] Documentation: design: Document `MetadataList`

Barnabás Pőcze barnabas.pocze at ideasonboard.com
Fri Jun 6 18:41:41 CEST 2025


Add a document describing the problem, the choices, and the design of
the separate metadata list data structure.

Signed-off-by: Barnabás Pőcze <barnabas.pocze at ideasonboard.com>
---
 Documentation/design/metadata-list.rst | 234 +++++++++++++++++++++++++
 Documentation/index.rst                |   1 +
 Documentation/meson.build              |   1 +
 3 files changed, 236 insertions(+)
 create mode 100644 Documentation/design/metadata-list.rst

diff --git a/Documentation/design/metadata-list.rst b/Documentation/design/metadata-list.rst
new file mode 100644
index 000000000..a42f94bdf
--- /dev/null
+++ b/Documentation/design/metadata-list.rst
@@ -0,0 +1,234 @@
+.. SPDX-License-Identifier: CC-BY-SA-4.0
+
+Design of the metadata list
+===========================
+
+This document explains the design and rationale of the metadata list.
+
+
+Description of the problem
+--------------------------
+
+Early metadata
+^^^^^^^^^^^^^^
+
+A pipeline handler might report numerous metadata items to the application about
+a single request. It is likely that different metadata items become available at
+different points in time while a request is being processed.
+
+Simultaneously, an application might desire to carry out potentially non-trivial
+extra processing on the image, etc. using certain metadata items. For such an
+application it is likely best if the final value of each metadata item is reported
+as soon as possible, thus allowing it to start processing as soon as possible.
+
+For this reason, libcamera provides the `metadataAvailable` signal on each `Camera` object.
+This signal is dispatched whenever new metadata items become available for a queued request.
+This mechanism is completely optional: only interested applications need to subscribe;
+others are free to ignore it completely. `Request::metadata()` will contain the sum of
+all early metadata items at request completion.
+
+Thread safety
+^^^^^^^^^^^^^
+
+At the moment, event handlers of the application are always dispatched in a private
+thread of libcamera. This requires that applications process the various events in a
+thread-safe manner wrt. themselves. The burden of correct synchronization falls
+upon the applications.
+
+Previously, a `ControlList` was used to store the metadata pertaining to a particular
+request. A `ControlList` is implemented using an `std::unordered_map`, meaning that
+its thread-safety is limited. This hints at a need for a separate data structure
+or at least some kind of thread-safe wrapper.
+
+
+Requirements
+------------
+
+We wish to provide a simple, easy-to-use, and hard-to-misuse interface for applications.
+Notably, applications should be able to delegate early metadata processing to their
+own separate threads safely wrt. the metadata list. Consider the following scenario:
+the pipeline handler sends early metadata items to the application, and the application
+delegates them to a separate thread. After that, the private libcamera thread is no
+longer blocked, thus the pipeline handler can continue working on the request: e.g.
+add more metadata items. Simultaneously, the application might be reading the metadata
+items on a separate thread. This situation should be safe and work correctly, ideally
+with any number of threads reading the completed metadata items, until the request
+is destroyed or reused, whichever happens first.
+
+Secondarily, efficiency should be considered: copies, locks, reference counting, etc.
+should be avoided if possible.
+
+Preferably, it should be possible to refer to a contiguous (in insertion order) subset
+of values reasonably efficiently (i.e. avoiding having to store a separate list of
+numeric identifiers, etc.).
+
+
+Options
+-------
+
+Keep using `ControlList`
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Using a `ControlList` (and hence `std::unordered_map`) with early metadata completion would
+be possible, but it would place a number of potentially non-intuitive and easy-to-violate
+restrictions on applications, making it harder to use safely. Specifically, the application
+would have to retrieve a pointer to the `ControlValue` object in the metadata `ControlList`,
+and then access it only through that pointer. It wouldn't be able to do lookups on the metadata
+list outside the event handler. As a consequence, the usual way of retrieving metadata using
+the pre-defined `Control<T>` objects would no longer be possible, losing type-safety.
+
+Send a copy
+^^^^^^^^^^^
+
+Passing a separate `ControlList` containing the just completed metadata, and disallowing access
+to the request's metadata list until completion works fine, and avoids the synchronization issues
+on the libcamera side. Nonetheless, it has two significant drawbacks:
+
+1. It moves the issue of synchronization from libcamera to the application: the application still has
+   to access its own data in a thread-safe manner and/or transfer the partial metadata list to its
+   *main* thread of execution.
+2. Early metadata can be reported multiple times for each request, thus making copies can have negative
+   performance implications.
+
+
+Design
+------
+
+A separate data structure is introduced to contain the metadata items pertaining to a given request.
+It is referred to as "metadata list" from now on.
+
+The current design of the metadata list places a number of restrictions on request metadata.
+A metadata list is backed by a pre-allocated (at construction time) contiguous block of
+memory sized appropriately to contain all possible metadata items. This means that the
+number and size of metadata items that a camera can report must be known in advance. The
+newly introduced `MetadataListPlan` type is used for that purpose. At the time of writing
+this does not appear to be a significant limitation since most metadata has a fixed size,
+and each pipeline handler (and IPA) has a fixed set of metadata that it can report. There
+are, however, metadata items that have a variably-sized array type. In those cases an upper
+bound on the number of elements must be provided.
+
+`MetadataListPlan`
+^^^^^^^^^^^^^^^^^^
+
+A `MetadataListPlan` collects the set of possible metadata items. It maps the numeric id
+of the control to a collection of static information (size, etc.). This is most importantly
+used to calculate the size required to store all possible metadata items.
+
+Each camera has its own `MetadataListPlan` object similarly to its `ControlInfoMap`. It is
+used to create requests for the camera with an appropriately sized `MetadataList`.
+Pipeline handlers should fill it during camera initialization or configuration, and they
+are allowed to modify it as long as the camera is not configured and during configuration.
+
+`MetadataList`
+^^^^^^^^^^^^^^
+
+The current metadata list implementation is a single-writer multiple-readers thread-safe
+data structure that provides lock-free lookup and access for any number of threads, while
+allowing a single thread at a time to add metadata items.
+
+The implemented metadata list has two main parts. The first part essentially contains
+a copy of the `MetadataListPlan` used to construct the `MetadataList`. In addition to
+the static information about the metadata item, it contains dynamic information such
+as whether the metadata item has been added to the list or not.
+
+The second part of a metadata list is a completely self-contained serialized list
+of metadata items. The number of bytes used for actually storing metadata items in
+this second part will be referred to as the "fill level" from now on. The self-contained
+nature of the second part leads to a certain level of data duplication between the two
+parts; however, the end goal is to have a serialized version of `ControlList` with the
+same serialized format. This would allow a `MetadataList` to be "trivially" reinterpreted
+as a control list at any point of its lifetime, simplifying the interoperability between the two.
+TODO: do we really want that?
+
+A metadata list, at construction time, calculates the number of bytes necessary to store
+all possible metadata items according to the supplied `MetadataListPlan`. Storage for
+all possible metadata items and the necessary auxiliary structures is then allocated.
+This allocation remains fixed for the entire lifetime of a `MetadataList`, which is
+crucial to satisfy the earlier requirements.
+
+Each metadata item can only be added to a metadata list once. This constraint does not pose
+a significant limitation; instead, it simplifies the interface and implementation: it is
+essentially an append-only list.
+
+Serialization
+'''''''''''''
+
+The actual values are encoded in the "second part" of the metadata list in a fairly
+simple fashion. Each control value is encoded as header + data bytes + padding. Each
+value has a header, which contains information such as the size, alignment, type, etc.
+of the value. The data bytes are aligned to the alignment specified in the header,
+and padding may be inserted after the last data byte to guarantee proper alignment
+for the next header. Padding is present even after the last entry.
+
+The minimum amount of state needed to describe such a serialized list of values is
+merely the number of bytes used, which can reasonably be limited to 4 GiB, meaning
+that a 32-bit unsigned integer is sufficient to store the fill level. This makes it
+possible to easily update the state in a wait-free fashion.
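
The header + data bytes + padding layout described above can be illustrated with a little offset arithmetic. This is only a minimal sketch: the `EntryHeader` fields and their widths, and the helper names `alignUp`, `dataStart` and `nextFillLevel`, are assumptions made for illustration and are not taken from the patch. Offsets are relative to the start of the serialized buffer, which is assumed to be aligned suitably for every value type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical header layout; the real field set and widths may differ.
struct EntryHeader {
	std::uint32_t id;        // numeric control id
	std::uint32_t size;      // number of data bytes
	std::uint32_t alignment; // alignment of the data bytes (power of two)
	std::uint32_t type;      // value type tag
};

// Round `offset` up to the next multiple of `alignment` (a power of two).
inline std::size_t alignUp(std::size_t offset, std::size_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (offset + alignment - 1) & ~(alignment - 1);
}

// Offset of the data bytes of an entry whose header starts at `fill`.
inline std::size_t dataStart(std::size_t fill, std::size_t alignment)
{
	return alignUp(fill + sizeof(EntryHeader), alignment);
}

// Fill level after appending an entry at `fill`: the data bytes are followed
// by padding so that the next header is properly aligned.
inline std::size_t nextFillLevel(std::size_t fill, std::size_t size,
				 std::size_t alignment)
{
	return alignUp(dataStart(fill, alignment) + size, alignof(EntryHeader));
}
```

With this layout the only mutable state of the serialized list is the value returned by `nextFillLevel()`, which fits in a 32-bit unsigned integer as long as the buffer is smaller than 4 GiB.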
+
+Lookup
+''''''
+
+Lookup in a metadata list is done using the metadata entries in the "first part".
+These entries are sorted by their numeric identifiers, hence binary search is used to
+find the appropriate entry. Then, it is checked whether the given control id has already
+been added, and if it has, then its data can be returned in a `ControlValueView` object.
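
The lookup steps above can be sketched as a binary search over entries sorted by id. The `PlanEntry` type, its fields and the `findOffset` helper are hypothetical names chosen for this sketch, and the example is deliberately single-threaded: a concurrent implementation would need to read the "set" state atomically.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical "first part" entry: static id plus dynamic state.
struct PlanEntry {
	std::uint32_t id;     // numeric control id, entries are sorted by it
	bool set;             // whether a value has been added for this id
	std::uint32_t offset; // where the value lives in the serialized part
};

// Return the offset of the value for `id`, or nothing if the id is not part
// of the plan or no value has been added yet.
inline std::optional<std::uint32_t>
findOffset(const std::vector<PlanEntry> &entries, std::uint32_t id)
{
	auto it = std::lower_bound(entries.begin(), entries.end(), id,
				   [](const PlanEntry &e, std::uint32_t key) {
					   return e.id < key;
				   });
	if (it == entries.end() || it->id != id || !it->set)
		return std::nullopt;

	return it->offset;
}
```

Because the set of ids is fixed when the list is constructed, the sorted array never changes shape, so the search itself needs no locking.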
+
+Insertion
+'''''''''
+
+Similarly to lookup, insertion also starts with binary searching the entry belonging
+to the given numeric identifier. If an entry is present for the given id and no value
+has yet been stored with that id, then insertion can proceed. The value is appended
+to the serialized list of control values according to the format described earlier.
+Then the fill level is atomically incremented, and the entry is marked as set. After
+that the new value is available for readers to consume.
+
+Having a single writer is an essential requirement to be able to carry out insertion in
+a reasonably efficient, and thread-safe manner.
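
The publication step of insertion can be sketched with a pre-allocated buffer and an atomic 32-bit fill level. This is a simplified illustration, not the patch's implementation: the `AppendOnlyBytes` class and its methods are hypothetical, headers and per-entry "set" flags are omitted, and the sketch assumes exactly one writer thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

class AppendOnlyBytes
{
public:
	explicit AppendOnlyBytes(std::size_t capacity)
		: storage_(capacity), fill_(0)
	{
		assert(capacity <= std::numeric_limits<std::uint32_t>::max());
	}

	// Writer side: must only ever be called by one thread at a time.
	bool append(const void *data, std::uint32_t size)
	{
		const std::uint32_t fill = fill_.load(std::memory_order_relaxed);
		if (size > storage_.size() - fill)
			return false;

		// Write beyond the published fill level; readers never look there.
		std::memcpy(storage_.data() + fill, data, size);

		// Publish: release ordering makes the bytes written above visible
		// to any reader that observes the new fill level.
		fill_.fetch_add(size, std::memory_order_release);
		return true;
	}

	// Reader side: callable from any thread. Bytes below the returned
	// value are complete and will not change.
	std::uint32_t fillLevel() const
	{
		return fill_.load(std::memory_order_acquire);
	}

	const std::uint8_t *data() const { return storage_.data(); }

private:
	std::vector<std::uint8_t> storage_; // allocated once, never resized
	std::atomic<std::uint32_t> fill_;
};
```

The storage is never reallocated, so pointers handed out to readers stay valid, and with a single writer the increment can never race with another update of the fill level.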
+
+Iteration
+'''''''''
+
+Iteration of a `MetadataList` is carried out only using the serialized list of controls
+in the "second part" of the data structure. An iterator can be implemented as a single
+pointer, pointing to the header of the current entry. The begin iterator simply points
+to the location of the header of the first value. The end iterator is simply the end of the
+serialized list of values, which can be calculated from the begin iterator and the fill
+level of the serialized list.
+
+The above iterator can model a `C++ forward iterator`_, that is, only increments of 1 are
+possible in constant time, and going backwards is not possible. Advancing to the next value
+can be simply implemented by reading the size and alignment from the header, and adjusting
+the iterator's pointer by the necessary amount.
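
Forward iteration over such a serialized list can be sketched as follows. The `EntryHeader` layout and the helpers `appendEntry`, `nextEntry` and `collectIds` are assumptions for illustration only; the sketch uses byte offsets instead of a raw pointer and assumes the buffer itself is aligned suitably for every value type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical header; the real one carries more information (type, etc.).
struct EntryHeader {
	std::uint32_t id;
	std::uint32_t size;      // number of data bytes
	std::uint32_t alignment; // alignment of the data bytes (power of two)
};

inline std::size_t alignUp(std::size_t value, std::size_t alignment)
{
	return (value + alignment - 1) & ~(alignment - 1);
}

// Serialize one entry at offset `fill` and return the new fill level.
inline std::size_t appendEntry(std::vector<std::uint8_t> &buf, std::size_t fill,
			       std::uint32_t id, const void *data,
			       std::uint32_t size, std::uint32_t alignment)
{
	const EntryHeader header{ id, size, alignment };
	const std::size_t start = alignUp(fill + sizeof(header), alignment);
	const std::size_t end = alignUp(start + size, alignof(EntryHeader));
	assert(end <= buf.size());

	std::memcpy(buf.data() + fill, &header, sizeof(header));
	std::memcpy(buf.data() + start, data, size);
	return end;
}

// Advance from the header at `pos` to the next header, using only the size
// and alignment stored in the current header.
inline std::size_t nextEntry(const std::vector<std::uint8_t> &buf, std::size_t pos)
{
	EntryHeader header;
	std::memcpy(&header, buf.data() + pos, sizeof(header));
	const std::size_t start = alignUp(pos + sizeof(header), header.alignment);
	return alignUp(start + header.size, alignof(EntryHeader));
}

// Walk all entries in [0, fill) in insertion order and collect their ids.
inline std::vector<std::uint32_t>
collectIds(const std::vector<std::uint8_t> &buf, std::size_t fill)
{
	std::vector<std::uint32_t> ids;
	for (std::size_t pos = 0; pos < fill; pos = nextEntry(buf, pos)) {
		EntryHeader header;
		std::memcpy(&header, buf.data() + pos, sizeof(header));
		ids.push_back(header.id);
	}
	return ids;
}
```

Going backwards would require extra information (for example a trailing size field), which is why this layout naturally yields a forward iterator only.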
+
+TODO: is a forward iterator enough? is a bidirectional iterator needed?
+
+.. _C++ forward iterator: https://en.cppreference.com/w/cpp/iterator/forward_iterator.html
+
+Clearing
+''''''''
+
+Removing a single value is not supported, but clearing the entire metadata list is.
+This should only be done when there are no readers, otherwise readers might run into
+data races if they keep reading the metadata when new entries are being added after
+clearing it.
+
+Clearing is implemented by resetting each metadata entry in the "first part", as well
+as resetting the stored fill level of the serialized buffer to 0.
+
+Partial view
+''''''''''''
+
+When multiple metadata items are completed early, it is important to provide a way
+for the application to know exactly which metadata items have just been added. The
+serialized values in the data structure are encoded such that a simple byte range
+is capable of representing any number of items that have been added in succession.
+
+The `MetadataList::Checkpoint` type is used to store the state of the serialized
+list (number of bytes and number of items) at a given point in time. From such a
+checkpoint object a `MetadataList::Diff` object can be constructed, which represents
+all values added since the checkpoint. This *diff* object is reasonably small, and
+trivially copyable, making it easy to provide to the application. It has many of
+the same features as a `MetadataList`, e.g. it can be iterated and one can do lookups.
+Naturally, both iteration and lookups only consider the values added after the checkpoint
+and before the creation of the `MetadataList::Diff` object.
diff --git a/Documentation/index.rst b/Documentation/index.rst
index 251112fbd..60cb77702 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -24,6 +24,7 @@
    Tracing guide <guides/tracing>
 
    Design document: AE <design/ae>
+   Design document: Metadata list <design/metadata-list>
 
 .. toctree::
    :hidden:
diff --git a/Documentation/meson.build b/Documentation/meson.build
index 0fc5909d0..79e687953 100644
--- a/Documentation/meson.build
+++ b/Documentation/meson.build
@@ -127,6 +127,7 @@ if sphinx.found()
         'conf.py',
         'contributing.rst',
         'design/ae.rst',
+        'design/metadata-list.rst',
         'documentation-contents.rst',
         'environment_variables.rst',
         'feature_requirements.rst',
-- 
2.49.0


