[libcamera-devel] [RFC PATCH v2 1/1] libcamera: controls: Controls for driving AF (autofocus) algorithms
Hanlin Chen
hanlinchen at chromium.org
Wed Feb 9 12:01:39 CET 2022
Hi David and Jacopo,
Thanks for the great work!
On Thu, Jan 20, 2022 at 1:11 AM David Plowman
<david.plowman at raspberrypi.com> wrote:
>
> Hi Jacopo
>
> Thanks for the comments!
>
> On Wed, 19 Jan 2022 at 10:13, Jacopo Mondi <jacopo at jmondi.org> wrote:
> >
> > Hi David,
> > sorry to catch up late and thanks everyone for the great discussion
> > on v1.
> >
> > I'll leave a few more comments here
> >
> > On Tue, Jan 18, 2022 at 11:37:50AM +0000, David Plowman wrote:
> > > This patch describes a series of controls that allow applications to
> > > drive AF algorithms:
> > >
> > > AfMode - manual, auto or continuous
> > > AfRange - full, macro or normal
> > > AfSpeed - fast or slow
> > > AfMethod - single or multi-spot
> > > AfWindow - AF window locations
> > > AfTrigger - start (trigger an AF scan) or cancel
> > > AfPause - pause continuous AF
> > > LensPosition - position of lens from lens driver
> > > AfState - reset, scanning, focused or failed
> > > ---
> > > src/libcamera/control_ids.yaml | 295 ++++++++++++++++++++++++++-------
> > > 1 file changed, 235 insertions(+), 60 deletions(-)
> > >
> > > diff --git a/src/libcamera/control_ids.yaml b/src/libcamera/control_ids.yaml
> > > index 9d4638ae..0b5ea9bd 100644
> > > --- a/src/libcamera/control_ids.yaml
> > > +++ b/src/libcamera/control_ids.yaml
> > > @@ -406,27 +406,6 @@ controls:
> > > The camera will cancel any active or completed metering sequence.
> > > The AE algorithm is reset to its initial state.
> > >
> > > - - AfTrigger:
> > > - type: int32_t
> > > - draft: true
> > > - description: |
> > > - Control for AF trigger. Currently identical to
> > > - ANDROID_CONTROL_AF_TRIGGER.
> > > -
> > > - Whether the camera device will trigger autofocus for this request.
> > > - enum:
> > > - - name: AfTriggerIdle
> > > - value: 0
> > > - description: The trigger is idle.
> > > - - name: AfTriggerStart
> > > - value: 1
> > > - description: The AF routine is started by the camera.
> > > - - name: AfTriggerCancel
> > > - value: 2
> > > - description: |
> > > - The camera will cancel any active trigger and the AF routine is
> > > - reset to its initial state.
> > > -
> > > - NoiseReductionMode:
> > > type: int32_t
> > > draft: true
> > > @@ -507,45 +486,6 @@ controls:
> > > The AE algorithm has started a pre-capture metering session.
> > > \sa AePrecaptureTrigger
> > >
> > > - - AfState:
> > > - type: int32_t
> > > - draft: true
> > > - description: |
> > > - Control to report the current AF algorithm state. Currently identical to
> > > - ANDROID_CONTROL_AF_STATE.
> > > -
> > > - Current state of the AF algorithm.
> > > - enum:
> > > - - name: AfStateInactive
> > > - value: 0
> > > - description: The AF algorithm is inactive.
> > > - - name: AfStatePassiveScan
> > > - value: 1
> > > - description: |
> > > - AF is performing a passive scan of the scene in continuous
> > > - auto-focus mode.
> > > - - name: AfStatePassiveFocused
> > > - value: 2
> > > - description: |
> > > - AF believes the scene is in focus, but might restart scanning.
> > > - - name: AfStateActiveScan
> > > - value: 3
> > > - description: |
> > > - AF is performing a scan triggered by an AF trigger request.
> > > - \sa AfTrigger
> > > - - name: AfStateFocusedLock
> > > - value: 4
> > > - description: |
> > > - AF believes has focused correctly and has locked focus.
> > > - - name: AfStateNotFocusedLock
> > > - value: 5
> > > - description: |
> > > - AF has not been able to focus and has locked.
> > > - - name: AfStatePassiveUnfocused
> > > - value: 6
> > > - description: |
> > > - AF has completed a passive scan without finding focus.
> > > -
> >
> > As Naush suggested, should the existing FocusFoM control be removed
> > too ?
>
> Ah yes, I forgot about that. Actually I think FocusFoM is useful. For
> example, on our HQ cams the lens position can only be changed (quite
> literally) "manually", so it's important for the user to be able to
> watch the FocusFoM change while they twiddle the lens.
>
> >
> > > - AwbState:
> > > type: int32_t
> > > draft: true
> > > @@ -690,4 +630,239 @@ controls:
> > > value. All of the custom test patterns will be static (that is the
> > > raw image must not vary from frame to frame).
> > >
> > > + - AfMode:
> > > + type: int32_t
> > > + draft: true
> > > + description: |
> > > + Control to set the mode of the AF (autofocus) algorithm. Applications
> > > + are allowed to set a new mode, and to send additional controls for
> > > + that new mode, in the same request. Furthermore, setting the mode to
> > > + the value it currently has is also permitted (with no effect).
> >
> > Is the last statement required ? Doesn't this apply to all controls
> > (re-setting them to the same is no-op) ?
>
> True enough, I'm happy to remove that. I just wanted to be sure
> everyone agrees that this is fine and that it won't start spitting out
> warnings!
>
> >
> > > + enum:
> > > + - name: AfModeManual
> > > + value: 0
> > > + description: |
> > > + The AF algorithm is in manual mode. In this mode it will never
> > > + perform any action nor move the lens of its own accord. The only
> > > + autofocus controls that have an immediate effect are AfMode (to
> > > + switch out of manual mode) and LensPosition (so that the lens can
> > > + be moved "manually").
> > > +
> > > + In this mode the AfState will always report AfStateReset.
> >
> > Indentation seems off here and in the descriptions of other controls.
>
> Not sure what happened there. Maybe some tabs crept in and got removed
> badly, but I'll fix it up.
>
> >
> > For what it's worth, I do prefer having a Manual mode where LensPosition
> > is valid. I understand it forces applications to move to a different
> > state before moving the lens, but as Han-lin reported it would help
> > with translating to/from Android and matches more closely the AE
> > controls definition we agreed on.
>
> I think a separate manual mode is OK if we can send both "manual mode"
> and a lens position in the same request.
>
> >
> > I think we will never really find out how much Manual helps with
> > translating to Android until we actually do so, but in the
> > meantime I would keep it and, if it proves unnecessary, remove it later.
> >
> > For RPi it implies apps have to do one extra step. Can this be hidden
> > in your applications maybe ?
>
> Yes, if one can send these commands at the same time then it really is
> very painless.
>
> >
> > One thing which is not clear to me is what happens when transitioning
> > to Manual. For AE controls we decided switching to Manual from Auto
> > retains the last computed exposure time until applications
> > submit an ExposureTime control. Does this make any sense for
> > autofocus ? Would 'retain the last lens position' when moving from
> > Auto/CAF actually freeze the autofocus and implement the 'pause' that
> > was discussed in v1 ?
>
> The behaviour when switching from anything else to manual should be
> that the lens doesn't move. You're right it would implement a "pause"
> command but I think the discussion with Hanlin moved us towards a
> separate control for that. "Pause" has a special control value where
> you can tell it to "pause when you are not scanning", and there's the
> question of what CAF does when you "resume". Does it resume from
> exactly where it left off? Or does it start a scan again? The way I've
> described things here:
>
> * If you switch from something else back to CAF, it will immediately
> proceed to do a scan (it has no reason to believe the lens is anywhere
> sensible).
> * If you "resume" after a "pause" it will resume from where it left
> off. So if it thought it was in focus, it will stay like that (until
> such time as some image statistics suggest it needs to rescan).
> * If it was scanning but got paused during the scan, resuming will
> carry on with the scan. (Actually I wonder if the scan should start
> over rather than carry on from exactly where it was, but maybe that's
> an implementation detail).
> * Another difference, as I've described things here, is that "Pause"
> leaves the state unchanged, so you can see what state CAF was in,
> whereas moving to "Manual" puts it back to "reset". This could be
> changed, of course, though I worry a bit about mixing the modes
> together (see next comment).
>
> >
> > > + - name: AfModeAuto
> > > + value: 1
> > > + description: |
> > > + The AF algorithm is in auto mode. This means that the algorithm
> > > + will never move the lens or change state unless the AfTrigger
> > > + control is used. The AfTrigger control can be used to initiate a
> > > + focus scan, the results of which will also be reported by AfState.
> > > +
> > > + If the autofocus algorithm is moved from AfModeAuto to another
> > > + mode while a scan is in progress, the scan is cancelled
> > > + immediately, without waiting for the scan to finish.
> > > +
> > > + When first entering this mode the AfState will report
> > > + AfStateReset. When a trigger control is sent, AfState will
> > > + report AfStateScanning for a period before spontaneously
> > > + changing to AfStateFocused or AfStateFailed, depending on the
> > > + outcome of the scan. It will remain in this state until another
> > > + scan is initiated by the AfTrigger control. If a scan is
> > > + cancelled (without changing to another mode), AfState will return
> > > + to AfStateReset.
> > > + - name: AfModeContinuous
> > > + value: 2
> > > + description: |
> > > + The AF algorithm is in continuous mode. This means that the lens
> > > + can re-start a scan spontaneously at any moment, without any user
> > > + intervention. The AfState still reports whether the algorithm is
> > > + currently scanning or not, though the application has no ability
> > > + to initiate or cancel scans, nor move the lens for itself.
> > > +
> > > + When set to AfModeContinuous, the system will immediately initiate
> > > + a scan so AfState will report AfStateScanning, and will settle on
> > > + one of AfStateFocused or AfStateFailed, depending on the scan
> > > + result.
> > > +
> > > + The continuous autofocus behaviour can be paused with the
> > > + AfPause control. Pausing the algorithm does not change the value
> > > + reported by AfState, so that applications can determine the
> > > + state of the algorithm when the pause control took effect. Once
> > > + un-paused ("resumed"), the algorithm starts again from exactly
> > > + where it left off when it paused.
> > > +
> >
> > I like the definition of these two modes, the only thing I'm not sure
> > is the Pause. As suggested [CAF|Auto]->Manual would have the effect to
> > pause the algorithm until the lens is moved explicitly, but would
> > report StateReset according to the current definition, while you here
> > suggested that [CAF]->Pause would not change the AfState. Why is this
> > useful for applications ? I might have missed that...
>
> It's hard to predict why this might be useful, but it feels like
> providing full information is usually a good thing. For example, if
> you pause CAF and do some captures, if the state reports "failed" (or
> even "scanning") rather than "focused" the UI could pop up one of
> those "your picture might be blurry" warnings!
Since AfPauseDeferred would not stop AF immediately if it is scanning,
and only pauses once the scan is done, the application might expect
AfState to transition from scanning to focused (or failed) over time,
and can decide whether to take a picture or wait for focus.

I would understand them as:

[CAF]->Pause: AF is still running, subject to the pause hint. Because it
is running, AfState is meaningful, and AF can resume from where it was.

[CAF|Auto]->Manual: AF is not running and the application has full
control. Because AF is not running, AfState has no meaning in this case.
The definition sets it to AfStateReset. In fact, the application should
ignore AfState while in Manual mode.

[Auto]->AF_TRIGGER: Restart a full scan, whatever the current state,
and run it until it focuses or fails.

The difference to me is the time needed to get a good focus:

[CAF|Auto]->Manual: Quick, but whether the focus is good is up to the
application.
[Auto]->AF_TRIGGER: Slow, but good focus.
[CAF]->Pause: Balanced: quick, with good focus. It can be resumed from
where it was, so the image won't suddenly become blurry.

The Manual mode for 3A also gives applications an opportunity to
implement their own 3A algorithms, since it delegates full control to
the application. For example, GCam has its own AE algorithm working in
manual mode, and uses the result for HDRNet processing.

Although it may not be a strong reason =.=, I'm more inclined to treat
manual mode as an independent case, which does not mix with other
modes.

Pause does "influence" AfState, since it hints the AF to stop
somewhere, and the AF needs a period of time to reach a stable state.
Should we have AF report states that reflect this progress, like
"pausing" and "already paused"?
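
To make the cases above concrete, here is a rough toy model of the mode,
trigger and pause semantics as described in this v2 patch. It is only a
sketch in plain Python, not libcamera code: the class AfModel, its
methods and the pause_pending flag are all hypothetical names invented
for illustration, and scan_finished() stands in for whatever the real
algorithm does when a scan completes.

```python
from enum import Enum


class AfMode(Enum):
    MANUAL = 0
    AUTO = 1
    CONTINUOUS = 2


class AfState(Enum):
    RESET = 0
    SCANNING = 1
    FOCUSED = 2
    FAILED = 3


class AfModel:
    """Toy model of the v2 control semantics; every name is hypothetical."""

    def __init__(self):
        self.mode = AfMode.MANUAL
        self.state = AfState.RESET
        self.paused = False
        self.pause_pending = False  # deferred pause requested during a scan

    def set_mode(self, mode):
        if mode == self.mode:
            return  # re-setting the current mode has no effect
        self.mode = mode
        self.paused = False
        self.pause_pending = False
        # Entering CAF starts a scan at once; Manual and Auto report reset.
        if mode == AfMode.CONTINUOUS:
            self.state = AfState.SCANNING
        else:
            self.state = AfState.RESET

    def trigger_start(self):
        # Only meaningful in Auto; ignored if a scan is already in progress.
        if self.mode == AfMode.AUTO and self.state != AfState.SCANNING:
            self.state = AfState.SCANNING

    def trigger_cancel(self):
        # Only meaningful in Auto; ignored if no scan is in progress.
        if self.mode == AfMode.AUTO and self.state == AfState.SCANNING:
            self.state = AfState.RESET

    def pause(self, deferred=False):
        if self.mode != AfMode.CONTINUOUS:
            return  # the pause control has no effect outside CAF
        if deferred and self.state == AfState.SCANNING:
            self.pause_pending = True  # pause once the scan has finished
        else:
            self.paused = True  # AfState is left unchanged

    def resume(self):
        if self.mode == AfMode.CONTINUOUS:
            self.paused = False
            self.pause_pending = False

    def scan_finished(self, success):
        """Stand-in for the algorithm reporting the end of a scan."""
        if self.state != AfState.SCANNING or self.paused:
            return
        self.state = AfState.FOCUSED if success else AfState.FAILED
        if self.pause_pending:
            self.paused = True
            self.pause_pending = False
```

With this model, a deferred pause during a CAF scan leaves AfState at
scanning until the scan ends and only then pauses, whereas switching to
Manual drops straight back to reset. Whether intermediate "pausing" and
"paused" states are needed is exactly the open question above; the
sketch keeps that information in separate flags rather than in AfState.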
>
> I suppose you could have manual mode report states other than just
> "reset", but I worry a bit that we're mixing our modes together, which
> means understanding manual mode properly means you have to understand
> CAF. I also quite like the idea that entering a mode (CAF in this
> case) completely resets and restarts it. CAF algorithms (especially
> contrast detect ones) can have a nasty habit of getting stuck in the
> wrong place, so it's reassuring if there's a guaranteed way to restart
> it.
>
> >
> > > + - AfRange:
> > > + type: int32_t
> > > + draft: true
> > > + description: |
> > > + Control to set the range of focus distances that is scanned.
> > > + enum:
> > > + - name: AfRangeNormal
> > > + value: 0
> > > + description: |
> > > + A wide range of focus distances is scanned, all the way from
> > > + infinity down to close distances, though depending on the
> > > + implementation, possibly not including the very closest macro
> > > + positions.
> > > + - name: AfRangeMacro
> > > + value: 1
> > > + description: Only close distances are scanned.
> > > + - name: AfRangeFull
> > > + value: 2
> > > + description: |
> > > + The full range of focus distances is scanned just as with
> > > + AfRangeNormal but this time including the very closest macro
> > > + positions.
> >
> > We should start thinking how to map the lens movement range to such
> > 'closest macro' and 'close distances'. I honestly don't know enough to
> > imagine if a linear mapping to the lens movement range is enough or
> > not.
>
> I remember working with some VCM modules where the driver range was
> mapped to 0-255. But in reality, infinity was around 220, hyperfocal
> around 200 and the closest focus would be around 50. Most of the AF
> time was spent searching the last few centimetres in front of the
> lens, which is the reason for having AfRangeNormal as well as
> AfRangeFull.
>
> >
> > > +
> > > + - AfSpeed:
> > > + type: int32_t
> > > + draft: true
> > > + description: |
> > > + Control that determines whether the AF algorithm is to move the lens
> > > + as quickly as possible or more steadily. For example, during video
> > > + recording it may be desirable not to move the lens too abruptly, but
> > > + when in a preview mode (waiting for a still capture) it may be
> > > + helpful to move the lens as quickly as is reasonably possible.
> > > + enum:
> > > + - name: AfSpeedNormal
> > > + value: 0
> > > + description: Move the lens at its usual speed.
> > > + - name: AfSpeedFast
> > > + value: 1
> > > + description: Move the lens more quickly.
> > > +
> > > + - AfMethod:
> > > + type: int32_t
> > > + draft: true
> > > + description: |
> > > + Control whether the AF algorithm uses a single window in the image to
> > > + determine the best focus position, or multiple windows simultaneously.
> > > + enum:
> > > + - name: AfMethodSingle
> > > + value: 0
> > > + description: |
> > > + A single window within the image, defaulting to the centre, is used
> > > + to select the best focus distance.
> > > + - name: AfMethodMultiSpot
> > > + value: 1
> > > + description: |
> > > + Multiple windows within the image are used to select the best focus
> > > + distance. The best focus distance is found for each one of the
> > > + windows, and then the distance that is closest to the camera is
> > > + selected.
> >
> > I have a hard time understanding why this control cannot be inferred
> > by the number of rectangles passed to AfWindow.
> >
> > If it's a shortcut for 'default to center' I'm in two minds. Can we
> > say that 'no AfWindow == default to center' ? Do we lose anything with
> > that ?
>
> Good point. For a while I was wondering whether there's a "method"
> where you can have multiple windows but treat them as if they were one
> region, so you'd sort of get an "average best focus".
>
> I also wondered whether AfMethodMultiSpot should be divided into
> AfMethodMultiSpotNear and AfMethodMultiSpotFar - in the latter case
> you'd choose the furthest focal distance. Does anyone think that might
> be useful?
>
> Maybe we drop "AfMethod" for now, and bring it back in future if we
> ever find a reason?
>
> >
> > > +
> > > + - AfWindow:
> > > + type: Rectangle
> > > + draft: true
> > > + description: |
> > > + Sets the focus windows used by the AF algorithm. The units used express
> > > + a proportion of the ScalerCrop control (or if unavailable, of the entire
> > > + image), as u0.16 format numbers.
> >
> > My first reaction was "we should refer to the ActivePixelArray not to
> > ScalerCrop", reason being ScalerCrop is optional and might not be of
> > interest of the application to specify one.
> >
> > But I'm not sure if the active pixel array is the right target either.
> > I assume these rectangles get directly translated to some ISP
> > parameters that allows to specify grids where to sample the pixels
> > contrast from ? If that's the case what they do refer to ? I assume
> > it's the input RAW picture size, something not exposed to apps by
> > libcamera, but that pipeline handlers have access to...
>
> I'm not sure what the answer is here. I guess it's worth imagining
> what happens to the focus windows if an application zooms in and out
> (digitally). As the zoom changes, do we want applications to have to
> recalculate the focus windows all the time? That feels quite onerous.
>
> I'm sure an application would prefer to say something like (for
> example) "put a window in the middle and one in each corner" and then
> those windows stay there (in the output image) regardless. Perhaps
> that's what we should be aiming for?
>
> It would certainly be unhelpful if you zoomed in and discovered you
> were focusing on things that aren't even in the output image!
It's interesting. I quote Android's definition here:
"Pixel coordinates within android.sensor.info.activeArraySize..."
"If the metering region is outside the used android.scaler.cropRegion
returned in capture result metadata, the camera device will ignore the
sections outside the crop region and output only the intersection
rectangle as the metering region in the result metadata. If the region
is entirely outside the crop region, it will be ignored and not
reported in the result metadata."
Although to me that doesn't solve the problem; it just avoids focusing
on items outside of the ScalerCrop :D.

I guess we can delegate the calculation to the application?
1. AfWindow and ScalerCrop are both in ActivePixelArray coordinates.
2. AfWindow and ScalerCrop are per-frame controls.
3. The application knows the ScalerCrop, and thus knows how to
calculate AfWindow if it wants the window to be at the centre of
ScalerCrop.
4. AF implicitly intersects ScalerCrop and AfWindow, to avoid
accidentally focusing on something strange.
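
Points 3 and 4 can be sketched roughly as follows. This is plain Python
for illustration only, not libcamera or Android code: the Rect type, the
centred_window() and intersect() helpers, and the 25% window size are
all assumptions made up for the example. All rectangles are taken to be
in ActivePixelArray coordinates.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Hypothetical rectangle in ActivePixelArray coordinates."""
    x: int
    y: int
    width: int
    height: int


def centred_window(crop: Rect, fraction: float = 0.25) -> Rect:
    """Application side: a window centred in the current ScalerCrop.

    The window covers `fraction` of the crop in each dimension
    (an arbitrary choice for this example).
    """
    w = int(crop.width * fraction)
    h = int(crop.height * fraction)
    return Rect(crop.x + (crop.width - w) // 2,
                crop.y + (crop.height - h) // 2,
                w, h)


def intersect(window: Rect, crop: Rect) -> Optional[Rect]:
    """AF side: clip a window to the crop; None if they do not overlap."""
    left = max(window.x, crop.x)
    top = max(window.y, crop.y)
    right = min(window.x + window.width, crop.x + crop.width)
    bottom = min(window.y + window.height, crop.y + crop.height)
    if right <= left or bottom <= top:
        return None  # entirely outside the crop: ignore this window
    return Rect(left, top, right - left, bottom - top)
```

For a ScalerCrop of Rect(1000, 500, 2000, 1000), centred_window() gives
Rect(1750, 875, 500, 250), and a window Rect(0, 0, 1200, 700) that only
partly overlaps the crop is clipped to Rect(1000, 500, 200, 200). The
application has to recompute the window whenever it changes ScalerCrop,
which is the cost of this approach.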
>
> >
> > > +
> > > + In order to be activated, a rectangle must be programmed with non-zero
> > > + width and height. If no rectangles are programmed in this way, then the
> > > + system will choose its own single default window in the centre of the
> > > + image.
> >
> > Ah!
> >
> > > +
> > > + If AfMethod is set to AfMethodSingle, then only the first Rectangle in
> > > + this list is used (or the system default one if it is unprogrammed).
> > > +
> > > + If AfMethod is set to AfMethodMultiSpot then all the valid Rectangles in
> > > + this list are used. The size of the control indicates how many such
> > > + windows can be programmed and will vary between different platforms.
> > > +
> > > + size: [platform dependent]
> > > +
> > > + - AfTrigger:
> > > + type: int32_t
> > > + draft: true
> > > + description: |
> > > + This control starts an autofocus scan when AfMode is set to AfModeAuto,
> > > + and can also be used to terminate a scan early.
> > > +
> > > + It is ignored if AfMode is set to AfModeContinuous.
> >
> > And in Manual too
>
> Yes!
>
> >
> > > +
> > > + enum:
> > > + - name: AfTriggerStart
> > > + value: 0
> > > + description: Start an AF scan. Ignored if a scan is in progress.
> > > + - name: AfTriggerCancel
> > > + value: 1
> > > + description: Cancel an AF scan. Ignored if no scan is in progress.
> > > +
> > > + - AfPause:
> > > + type: int32_t
> > > + draft: true
> > > + description: |
> > > + This control has no effect except when in continuous autofocus mode
> > > + (AfModeContinuous). It can be used to pause any lens movements while
> > > + (for example) images are captured. The algorithm remains inactive
> > > + until it is instructed to resume.
> > > +
> > > + enum:
> > > + - name: AfPauseImmediate
> > > + value: 0
> > > + description: |
> > > + Pause the continuous autofocus algorithm immediately, whether or
> > > + not any kind of scan is underway. The AfState will continue to
> > > + report whatever value it had when the control was enacted.
> > > + - name: AfPauseDeferred
> > > + value: 1
> > > + description: |
> > > + Pause the continuous autofocus algorithm as soon as it is no longer
> > > + scanning. The AfState will report AfStateFocused or AfStateFailed,
> > > + depending on whether the final scan succeeds or not. If no scan is
> > > + currently in progress, the algorithm will pause immediately.
> > > + - name: AfPauseResume
> > > + value: 2
> > > + description: |
> > > + Resume continuous autofocus operation. The algorithm starts again
> > > + from exactly where it left off, with AfState unchanged (one of
> > > + AfStateFocused, AfStateFailed or following AfPauseImmediate it
> > > + might also have been in the AfStateScanning state).
> > > +
> > > +
> > > + - LensPosition:
> > > + type: int32_t
> > > + draft: true
> > > + description: |
> > > + Acts as a control to instruct the lens to move to a particular position
> > > + and also reports back the position of the lens for each frame.
> > > +
> > > + The units are determined by the lens driver.
> >
> > This would make it impossible to write apps in a portable way. I know
> > nothing at the moment about lenses so I cannot propose a suitable
> > range, but we should indeed define one and translate to the opportune
> > lens positions using the CameraLens helper class ?
>
> Something like that might work. We could use the min/max of the
> V4L2_CID_FOCUS_ABSOLUTE control to give the lens driver range. Though
> I'm not sure how you would know which end is macro and which is
> infinity. The default value could give you the setting for hyperfocal.
>
> But there's a problem because the lens driver range is not the
> _useable_ range, which is normally quite a lot less. Perhaps the
> device tree can be used to adjust the values the driver reports? Or
> perhaps there should be some kind of database? Editing the device tree
> because you want to try a different module seems harsh.
>
> The other catch is that the useable range (and hyperfocal position)
> varies with the module type, even when they share a lens driver, and
> there's normally no way to determine what module you have. For the Pi,
> I'd probably expect to store the ranges I want to search in the tuning
> file, so we only have to set the LIBCAMERA_RPI_TUNING_FILE variable
> correctly, but that doesn't help an application to know what lens
> positions it should use (for example, when an application starts I'd
> expect "move the lens to hyperfocal" to be quite common).
>
> >
> > > +
> > > + The LensPosition control is ignored unless the AfMode is set to
> > > + AfModeManual.
> > > +
> > > + - AfState:
> > > + type: int32_t
> > > + draft: true
> > > + description: |
> > > + Reports the current state of the AF algorithm.
> > > + enum:
> > > + - name: AfStateReset
> > > + value: 0
> > > + description: |
> > > + The AF algorithm reports this state when:
> > > + * It is in manual mode (AfModeManual).
> > > + * The system has entered auto mode (AfModeAuto) but no scan
> > > + has yet been initiated.
> > > + * The system is in auto mode (AfModeAuto) and a scan has been
> > > + cancelled.
> > > + - name: AfStateScanning
> > > + value: 1
> > > + description: |
> >
> > This is instead indented a bit too much to the right.
> >
> > > + AF is performing a scan. This state can be entered spontaneously
> > > + if AfMode is set to AfModeContinuous, otherwise it requires the
> > > + application to use the AfTrigger control to start the scan.
> > > + - name: AfStateFocused
> > > + value: 2
> > > + description: |
> > > + An AF scan has been performed and the algorithm believes the
> > > + scene is in focus.
> > > + - name: AfStateFailed
> > > + value: 3
> > > + description: |
> > > + An AF scan has been performed but the algorithm has not been able
> > > + to find the best focus position.
> > > +
> >
> > I like AfState to be simple, but as suggested by Han-lin I'm afraid to
> > be able to translate back to Android PASSIVE_FOCUSED vs FOCUSED_LOCKED
> > we need an AfModeManual mode as you have in this v2.
> >
> > Thanks
> > j
>
> Thanks!
> David
>
> >
> > > ...
> > > --
> > > 2.30.2
> > >