[PATCH v2 2/3] libcamera: Add face detection controls

Cheng-Hao Yang chenghaoyang at chromium.org
Fri Aug 30 23:03:57 CEST 2024


Thanks Jacopo,

On Wed, Aug 28, 2024 at 2:39 PM Jacopo Mondi <jacopo.mondi at ideasonboard.com>
wrote:

> Hello
>
> On Fri, Aug 23, 2024 at 02:29:09PM GMT, Harvey Yang wrote:
> > From: Yudhistira Erlandinata <yerlandinata at chromium.org>
> >
> > Add FaceDetectMode, FaceDetectFaceRectangles, FaceDetectFaceScores,
> > and FaceDetectFaceLandmark. Also add ControlTypePoint for supporting
> > FaceDetectFaceLandmark.
> >
> > BUG=b:308713855
> > TEST=emerge-geralt libcamera-mtkisp7
>
> Please drop these
>
Removed.


>
> >
> > Signed-off-by: Yudhistira Erlandinata <yerlandinata at chromium.org>
> > Co-developed-by: becker hsieh <beckerh at chromium.org>
> > Co-developed-by: Harvey Yang <chenghaoyang at chromium.org>
> > ---
> >  include/libcamera/controls.h        |  6 +++++
> >  src/libcamera/control_ids_core.yaml | 42 +++++++++++++++++++++++++++++
> >  src/libcamera/controls.cpp          |  6 +++++
> >  3 files changed, 54 insertions(+)
> >
> > diff --git a/include/libcamera/controls.h b/include/libcamera/controls.h
> > index 7c2bb2872..bf1b8609c 100644
> > --- a/include/libcamera/controls.h
> > +++ b/include/libcamera/controls.h
> > @@ -34,6 +34,7 @@ enum ControlType {
> >       ControlTypeString,
> >       ControlTypeRectangle,
> >       ControlTypeSize,
> > +     ControlTypePoint,
> >  };
> >
> >  namespace details {
> > @@ -87,6 +88,11 @@ struct control_type<Size> {
> >       static constexpr ControlType value = ControlTypeSize;
> >  };
> >
> > +template<>
> > +struct control_type<Point> {
> > +     static constexpr ControlType value = ControlTypePoint;
> > +};
> > +
> >  template<typename T, std::size_t N>
> >  struct control_type<Span<T, N>> : public
> control_type<std::remove_cv_t<T>> {
> >  };
> > diff --git a/src/libcamera/control_ids_core.yaml
> b/src/libcamera/control_ids_core.yaml
> > index 6381970b4..5ff46c71e 100644
> > --- a/src/libcamera/control_ids_core.yaml
> > +++ b/src/libcamera/control_ids_core.yaml
>
> I wonder if we can already made these core controls or as long as we
> don't have more users they should in control_ids_draft
>
> > @@ -860,6 +860,48 @@ controls:
> >              No further state changes or lens movements will occur until
> the
> >              AfPauseResume control is sent.
> >
> > +  - FaceDetectMode:
> > +      type: uint8_t
> > +      description: |
> > +        Reporting mode of face detection.
>
> I presume this is a metadata only control (iow can only be reported by
> the library to the application). If that's the case this should be
> made explicit.
>
> We don't do so consistently in our controls definition unfortunately,
> but some controls already specify that, like, in example,
> ColorTemperature:
>
>         The ColourTemperature control can only be returned in metadata.
>
> I would do the same here if that's the case.
>
> However, reading your next patch, this control doesn't seem to be for
> reporting only, but can also be set by applications to control how the
> pipeline behaves. That's not what the description says.
>

Right, in Android adapter, HAL returns a list of available
FaceDetectMode in static metadata. Application then sets a mode it
desires per-framely in controls. HAL returns the final mode that it
supports per-framely in the result metadata.

Therefore, I'll keep it this way :)


>
>
> > +
> > +      enum:
> > +        - name: FaceDetectModeOff
> > +          value: 0
> > +          description: |
> > +            Pipeline should not report face detection result.
> > +        - name: FaceDetectModeSimple
> > +          value: 1
> > +          description: |
> > +            Pipeline should at least report faces boundary rectangles
> and
>
> Why "at least" ?
>

In Android, there's another Full mode [1] that includes face landmarks and
face IDs. I think Yudhis didn't list Full mode here because mtkisp7 doesn't
support face IDs in their face detector lib/hardware...

Android CTS doesn't seem to care if more metadata is returned, when not
asked for though.

We can also list FULL mode here, and let mtkisp7 pipeline handler returns
Off & Simple modes in the static metadata. WDYT?

[1]:
https://developer.android.com/reference/android/hardware/camera2/CameraMetadata#STATISTICS_FACE_DETECT_MODE_FULL


>
> > +            confidence score for each of them.
>
> How ? Please describe here how these information are reported,
> mentioning explicitly
>
>                 \sa FaceDetectFaceRectangles
>                 \sa FaceDetectFaceScores
>
> I presume it is expected from the pipline handlers to report two lists
> of the same sizes of rectangles and scores.
>

Exactly. Thanks! Updated, please check again.


>
> Please keep in mind this documentation should be usable for other
> pipeline handler implementers to implement face detection and from
> application to understand how to interpret the results.
>

Right, I'll keep that in mind.


>
> > +
> > +  - FaceDetectFaceRectangles:
> > +      type: Rectangle
> > +      description: |
> > +        Boundary rectangle of the detected faces.
>
> This is reported only when FaceDetectMode is set to
> FaceDetectModeSimple I presume. It has to be specified. If it's a
> metadata only control the above suggestion of saying that explicitly
> applies.
>

Actually, as I mentioned above, Android doesn't seem to complain
about extra metadata. Therefore, it's `when` instead of `only when`.
Updated.


>
> > +
> > +      size: [n]
> > +
> > +  - FaceDetectFaceScores:
> > +      type: uint8_t
> > +      description: |
> > +        Confidence score of each of the detected faces by face detector.
> > +        The range of score is [0, 100], but usually those with low
> confidence
> > +        scores will not be included.
>
> I think you should eitehr clarify what determines what "low confidence" is
> or drop that part, as it's a pipeline handler specific decision.
>
Right, dropped.


>
> I presume a score needs a rectangle assigned, please mention that.
>
Do we need to mention that the size of faces in each metadata should
align in every control id involved? How about just mentioning once in
FaceDetectModeSimple?


>
> > +        Currently identical to ANDROID_STATISTICS_FACE_SCORES.
>
> This makes me think all of this should go in draft controls, where we
> have collected controls whose definition comes from the Android ones.
>

Right, moved to the draft ids.


>
> > +
> > +      size: [n]
> > +
> > +  - FaceDetectFaceLandmark:
> > +      type: Point
> > +      description: |
> > +        Array of human face landmark coordinates in format:
> > +        [..., left_eye_i, right_eye_i, mouth_i, left_eye_i+1, ...],
> > +        with i = index of face.
>
> This is quite vague I'm afraid.
>
> Which are the face landmarks ? eyes and mouth ? Why not the nose ? A more
> sophisticated mechanism is required to support this and have it usable
> for multiple platforms I'm afraid. If this is only useful for Android and
> its description comes from an Android control, move it to draf where
> we can be slighly more liberal in defining things, as draf controls
> are by definition not expected to be stable.
>

Yes, it's to be aligned with Android control [2]. Moved to the draft ids.

[2]:
https://android.googlesource.com/platform/system/media/+/refs/heads/main/camera/docs/docs.html#32906


>
> > +
> > +      size: [n]
> > +
> >    - HdrMode:
> >        type: int32_t
> >        description: |
> > diff --git a/src/libcamera/controls.cpp b/src/libcamera/controls.cpp
> > index 11d35321c..ce73ae9d7 100644
> > --- a/src/libcamera/controls.cpp
> > +++ b/src/libcamera/controls.cpp
> > @@ -61,6 +61,7 @@ static constexpr size_t ControlValueSize[] = {
> >       [ControlTypeString]             = sizeof(char),
> >       [ControlTypeRectangle]          = sizeof(Rectangle),
> >       [ControlTypeSize]               = sizeof(Size),
> > +     [ControlTypePoint]              = sizeof(Point),
> >  };
> >
> >  } /* namespace */
> > @@ -255,6 +256,11 @@ std::string ControlValue::toString() const
> >                       str += value->toString();
> >                       break;
> >               }
> > +             case ControlTypePoint: {
> > +                     const Point *value = reinterpret_cast<const Point
> *>(data);
> > +                     str += value->toString();
> > +                     break;
> > +             }
> >               case ControlTypeNone:
> >               case ControlTypeString:
> >                       break;
> > --
> > 2.46.0.295.g3b9ea8a38a-goog
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.libcamera.org/pipermail/libcamera-devel/attachments/20240830/7a4174f4/attachment.htm>


More information about the libcamera-devel mailing list