[libcamera-devel] [PATCH v3 8/8] android: Implement flush() camera operation

Hirokazu Honda hiroh at chromium.org
Tue May 25 04:49:17 CEST 2021


Hi Jacopo, thank you for the patch.

On Mon, May 24, 2021 at 4:47 PM Jacopo Mondi <jacopo at jmondi.org> wrote:

> Hi Laurent,
>
> On Sun, May 23, 2021 at 09:50:46PM +0300, Laurent Pinchart wrote:
> > Hi Jacopo,
> >
> > Thank you for the patch.
> >
> > On Sun, May 23, 2021 at 04:22:51PM +0200, Jacopo Mondi wrote:
> > > On Sat, May 22, 2021 at 11:55:36AM +0200, Niklas Söderlund wrote:
> > > > On 2021-05-21 17:42:27 +0200, Jacopo Mondi wrote:
> > > > > Implement the flush() camera operation in the CameraDevice class
> > > > > and make it available to the camera framework by implementing the
> > > > > operation wrapper in camera_ops.cpp.
> > > > >
> > > > > The flush() implementation stops the Camera and the worker thread
> and
> > > > > waits for all in-flight requests to be returned. Stopping the
> Camera
> > > > > forces all Requests already queued to be returned immediately in
> error
> > > > > state. As flush() has to wait until all of them have been
> returned, make it
> > > > > wait on a newly introduced condition variable which is notified by
> the
> > > > > request completion handler when the queue of pending requests has
> been
> > > > > exhausted.
> > > > >
> > > > > As flush() can race with processCaptureRequest() protect the
> requests
> > > > > queueing by introducing a new CameraState::CameraFlushing state
> that
> > > > > processCaptureRequest() inspects before queuing the Request to the
> > > > > Camera. If flush() has been called while processCaptureRequest()
> was
> > > > > executing, return the current Request immediately in error state.
> > > > >
> > > > > Protect potentially concurrent calls to close() and
> configureStreams()
> >
> > Can this happen ? Quoting camera3.h,
> >
> >  * 12. Alternatively, the framework may call
> camera3_device_t->common->close()
> >  *    to end the camera session. This may be called at any time when no
> other
> >  *    calls from the framework are active, although the call may block
> until all
> >  *    in-flight captures have completed (all results returned, all
> buffers
> >  *    filled). After the close call returns, no more calls to the
> >  *    camera3_callback_ops_t functions are allowed from the HAL. Once the
> >  *    close() call is underway, the framework may not call any other HAL
> device
> >  *    functions.
> >
> > The important part is "when no other calss from the framework are
> > active". I don't think we need to handle close() racing with anything
> > else than process_capture_request().
>
> I've been discussing this with Hiro during v1, as initially I didn't
> consider close() and configureStreams().
>
> https://patchwork.libcamera.org/patch/12248/#16884
>
> I initially only considered processCaptureRequest() as a potential
> race, but got suggested differently by the cros camera team.
>
>
> >
> > > > > by inspecting the CameraState, and force a wait for any flush()
> call
> > > > > to complete before proceeding.
> > > > >
> > > > > Signed-off-by: Jacopo Mondi <jacopo at jmondi.org>
> > > > > ---
> > > > >  src/android/camera_device.cpp | 90
> +++++++++++++++++++++++++++++++++--
> > > > >  src/android/camera_device.h   |  9 +++-
> > > > >  src/android/camera_ops.cpp    |  8 +++-
> > > > >  3 files changed, 100 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/src/android/camera_device.cpp
> b/src/android/camera_device.cpp
> > > > > index 3fce14035718..899afaa49439 100644
> > > > > --- a/src/android/camera_device.cpp
> > > > > +++ b/src/android/camera_device.cpp
> > > > > @@ -750,16 +750,65 @@ int CameraDevice::open(const hw_module_t
> *hardwareModule)
> > > > >
> > > > >  void CameraDevice::close()
> > > > >  {
> > > > > -       streams_.clear();
> > > > > +       MutexLocker cameraLock(cameraMutex_);
> >
> > I'd add a blank line here.
> >
> > > > > +       if (state_ == CameraFlushing) {
> >
> > As mentioned above, I don't think you need to protect against close()
> > and flush() racing each other.
> >
> > > > > +               flushed_.wait(cameraLock, [&] { return state_ !=
> CameraStopped; });
> > > > > +               camera_->release();
> > > > >
> > > > > +               return;
> > > > > +       }
> > > > > +
> > > > > +       streams_.clear();
> > > > >         stop();
> > > > >
> > > > >         camera_->release();
> > > > >  }
> > > > >
> > > > > -void CameraDevice::stop()
> > > > > +/*
> > > > > + * Flush is similar to stop() but sets the camera state to
> 'flushing' and wait
> >
> > s/wait/waits/
> >
> > > > > + * until all the in-flight requests have been returned before
> setting the
> > > > > + * camera state to stopped.
> > > > > + *
> > > > > + * Once flushing is done it unlocks concurrent calls to camera
> close() and
> > > > > + * configureStreams().
> > > > > + */
> > > > > +void CameraDevice::flush()
> > > > >  {
> > > > > +       {
> > > > > +               MutexLocker cameraLock(cameraMutex_);
> > > > > +
> > > > > +               if (state_ != CameraRunning)
> > > > > +                       return;
> > > > > +
> > > > > +               worker_.stop();
> > > > > +               camera_->stop();
> > > > > +               state_ = CameraFlushing;
> > > > > +       }
> > > > > +
> > > > > +       /*
> > > > > +        * Now wait for all the in-flight requests to be completed
> before
> > > > > +        * continuing. Stopping the Camera guarantees that all
> in-flight
> > > > > +        * requests are completed in error state.
> >
> > Do we need to wait ? Camera::stop() guarantees that all requests
> > complete synchronously with the stop() call.
>
> I didn't get the API that way... I thought after stop we would receive
> a sequence of failed requests... Actually I don't see anything that
> suggests that in camera.cpp or pipeline_handler.cpp apart from an assertion
> in Camera::stop()
>
> >
> > Partly answering myself here, we'll have to wait for post-processing
> > tasks to complete once we'll process them in a separate thread, but that
> > will likely be handled by Thread::wait(). I don't think you need a
> > condition variable here. I'm I'm not mistaken, this should simplify the
> > implementation.
>
> If Camera::stop() is synchronous we don't need to wait indeed
>
> >
> > > > > +        */
> > > > > +       {
> > > > > +               MutexLocker requestsLock(requestsMutex_);
> > > > > +               flushing_.wait(requestsLock, [&] { return
> descriptors_.empty(); });
> > > > > +       }
> > > >
> > > > I'm still uneasy about releasing the cameraMutex_ for this section.
> In
> > > > patch 6/8 you add it to protect the state_ variable but here it's
> > >
> > > I'm not changing state_ without the mutex acquired, am I ?
> > >
> > > > ignored. I see the ASSERT() added to stop() but the patter of taking
> the
> > > > lock checking state_, releasing the lock and do some work, retake the
> > > > lock and update state_ feels like a bad idea. Maybe I'm missing
> > >
> > > How so, apart from the fact it feels a bit unusual, I concur ?
> > >
> > > If I keep the held the mutex for the whole duration of flush no other
> > > concurrent method can proceed until all the queued requests have not
> > > been completed. While flush waits for the flushing_ condition to be
> > > signaled, processCaptureRequest() can proceed and immediately return
> > > the newly queued requests in error state by detecting state_ ==
> > > CameraFlushing which signals that flush in is progress.
> > > Otherwise it would have had to wait for flush to end. But then we're
> back
> > > to a situation where we could serialize all calls and that's it, we
> > > would be done with a single mutex to be held for the whole duration of
> > > all operations.
> > >
> > > If it only was for close() or configureStreams() we could have locked
> > > for the whole duration of flush(), as they anyway wait for flush to
> > > complete before proceeding (by waiting on the flushed_ condition here
> > > below signaled).
> > >
> > > > something and this is not a real problem, if so maybe we can capture
> > > > that in the comment here?
> > > >
> > > > > +
> > > > > +       /*
> > > > > +        * Set state to stopped and unlock close() or
> configureStreams() that
> > > > > +        * might be waiting for flush to be completed.
> > > > > +        */
> > > > >         MutexLocker cameraLock(cameraMutex_);
> > > > > +       state_ = CameraStopped;
> > > > > +       flushed_.notify_one();
> >
> > You should drop the lock before calling notify_one(). Otherwise you'll
> > wake up the task waiting on flushed_, which will try to lock
> > cameraMutex_, which will block immediately. The scheduler will have to
> > reschedule this task for the function to return and the lock to be
> > released before the waiter can proceed. That works, but isn't very
> > efficient.
>
> Weird, the cpp reference shows example about notify_one where the
> caller always has the mutex held locked, but I see your point and
> seems correct..
>
>
This is correct.
https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one
> The notifying thread does not need to hold the lock on the same mutex as
the one held by the waiting thread(s);

I know that, but I haven't pointed it out because it is not false.
>From the Laurent explanation, yeah, we should definitely avoid that. TIL.


> >
> >       {
> >               MutexLocker cameraLock(cameraMutex_);
> >               state_ = CameraStopped;
> >       }
> >
> >       flushed_.notify_one();
> >
>
> So I could change to this one, if I don't have to drop this part
> completely if we consider close() and configureStreams() not as
> possible races...
>
> > > > > +}
> > > > > +
> > > > > +/* Calls to stop() must be protected by cameraMutex_ being held
> by the caller. */
> > > > > +void CameraDevice::stop()
> > > > > +{
> > > > > +       ASSERT(state_ != CameraFlushing);
> > > > > +
> > > > >         if (state_ == CameraStopped)
> > > > >                 return;
> > > > >
> > > > > @@ -1581,8 +1630,18 @@ PixelFormat CameraDevice::toPixelFormat(int
> format) const
> > > > >   */
> > > > >  int CameraDevice::configureStreams(camera3_stream_configuration_t
> *stream_list)
> > > > >  {
> > > > > -       /* Before any configuration attempt, stop the camera. */
> > > > > -       stop();
> > > > > +       {
> > > > > +               /*
> > > > > +                * If a flush is in progress, wait for it to
> complete and to
> > > > > +                * stop the camera, otherwise before any new
> configuration
> > > > > +                * attempt we have to stop the camera explictely.
> > > > > +                */
> >
> > Same here, I don't think flush() and configure_streams() can race each
> > other. I believe the only possible race to be between flush() and
> > process_capture_request().
> >
>
> Ditto.
>
> > > > > +               MutexLocker cameraLock(cameraMutex_);
> > > > > +               if (state_ == CameraFlushing)
> > > > > +                       flushed_.wait(cameraLock, [&] { return
> state_ != CameraStopped; });
> > > > > +               else
> > > > > +                       stop();
> > > > > +       }
> > > > >
> > > > >         if (stream_list->num_streams == 0) {
> > > > >                 LOG(HAL, Error) << "No streams in configuration";
> > > > > @@ -1950,6 +2009,25 @@ int
> CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> > > > >         if (ret)
> > > > >                 return ret;
> > > > >
> > > > > +       /*
> > > > > +        * Just before queuing the request, make sure flush() has
> not
> > > > > +        * been called after this function has been executed. In
> that
> > > > > +        * case, immediately return the request with errors.
> > > > > +        */
> > > > > +       MutexLocker cameraLock(cameraMutex_);
> > > > > +       if (state_ == CameraFlushing || state_ == CameraStopped) {
> > > > > +               for (camera3_stream_buffer_t &buffer :
> descriptor.buffers_) {
> > > > > +                       buffer.status =
> CAMERA3_BUFFER_STATUS_ERROR;
> > > > > +                       buffer.release_fence =
> buffer.acquire_fence;
> > > > > +               }
> > > > > +
> > > > > +               notifyError(descriptor.frameNumber_,
> > > > > +                           descriptor.buffers_[0].stream,
> >
> > As commented on a previous patch, I think you should pass nullptr for
> > the stream here.
> >
>
> The "S6. Error management:" section of the camera3.h header does not
> mention that, not the ? where does you suggestion come from ? I don't find
> any reference in the review of [1/8]
>
>
> > > > > +                           CAMERA3_MSG_ERROR_REQUEST);
> > > > > +
> > > > > +               return 0;
> > > > > +       }
> > > > > +
> > > > >         worker_.queueRequest(descriptor.request_.get());
> > > > >
> > > > >         {
> > > > > @@ -1979,6 +2057,10 @@ void CameraDevice::requestComplete(Request
> *request)
> > > > >                         return;
> > > > >                 }
> > > > >
> > > > > +               /* Release flush if all the pending requests have
> been completed. */
> > > > > +               if (descriptors_.empty())
> > > > > +                       flushing_.notify_one();
> >
> > This will never happen, as you can only get here if descriptors_.find()
> > has found the descriptor. Did you mean to do this after the extract()
> > call below ?
>
> Ugh. This works only because Camera::stop() is synchronous then ?
>
>
Ah, good catch!

With this fix, the code looks good to me.
I am happy to ongoingly join the discussion.

Reviewed-by: Hirokazu Honda <hiroh at chromium.org>


> >
> > > > > +
> > > > >                 node = descriptors_.extract(it);
> > > > >         }
> > > > >         Camera3RequestDescriptor &descriptor = node.mapped();
> > > > > diff --git a/src/android/camera_device.h
> b/src/android/camera_device.h
> > > > > index 7cf8e8370387..e1b3bf7d30f2 100644
> > > > > --- a/src/android/camera_device.h
> > > > > +++ b/src/android/camera_device.h
> > > > > @@ -7,6 +7,7 @@
> > > > >  #ifndef __ANDROID_CAMERA_DEVICE_H__
> > > > >  #define __ANDROID_CAMERA_DEVICE_H__
> > > > >
> > > > > +#include <condition_variable>
> > > > >  #include <map>
> > > > >  #include <memory>
> > > > >  #include <mutex>
> > > > > @@ -42,6 +43,7 @@ public:
> > > > >
> > > > >         int open(const hw_module_t *hardwareModule);
> > > > >         void close();
> > > > > +       void flush();
> > > > >
> > > > >         unsigned int id() const { return id_; }
> > > > >         camera3_device_t *camera3Device() { return
> &camera3Device_; }
> > > > > @@ -92,6 +94,7 @@ private:
> > > > >         enum State {
> > > > >                 CameraStopped,
> > > > >                 CameraRunning,
> > > > > +               CameraFlushing,
> > > > >         };
> > > > >
> > > > >         void stop();
> > > > > @@ -120,8 +123,9 @@ private:
> > > > >
> > > > >         CameraWorker worker_;
> > > > >
> > > > > -       libcamera::Mutex cameraMutex_; /* Protects access to the
> camera state. */
> > > > > +       libcamera::Mutex cameraMutex_; /* Protects the camera
> state and flushed_. */
> > > > >         State state_;
> > > > > +       std::condition_variable flushed_;
> > > > >
> > > > >         std::shared_ptr<libcamera::Camera> camera_;
> > > > >         std::unique_ptr<libcamera::CameraConfiguration> config_;
> > > > > @@ -134,8 +138,9 @@ private:
> > > > >         std::map<int, libcamera::PixelFormat> formatsMap_;
> > > > >         std::vector<CameraStream> streams_;
> > > > >
> > > > > -       libcamera::Mutex requestsMutex_; /* Protects descriptors_.
> */
> > > > > +       libcamera::Mutex requestsMutex_; /* Protects descriptors_
> and flushing_. */
> > > > >         std::map<uint64_t, Camera3RequestDescriptor> descriptors_;
> > > > > +       std::condition_variable flushing_;
> > > > >
> > > > >         std::string maker_;
> > > > >         std::string model_;
> > > > > diff --git a/src/android/camera_ops.cpp
> b/src/android/camera_ops.cpp
> > > > > index 696e80436821..8a3cfa175ff5 100644
> > > > > --- a/src/android/camera_ops.cpp
> > > > > +++ b/src/android/camera_ops.cpp
> > > > > @@ -66,8 +66,14 @@ static void hal_dev_dump([[maybe_unused]] const
> struct camera3_device *dev,
> > > > >  {
> > > > >  }
> > > > >
> > > > > -static int hal_dev_flush([[maybe_unused]] const struct
> camera3_device *dev)
> > > > > +static int hal_dev_flush(const struct camera3_device *dev)
> > > > >  {
> > > > > +       if (!dev)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       CameraDevice *camera = reinterpret_cast<CameraDevice
> *>(dev->priv);
> > > > > +       camera->flush();
> > > > > +
> > > > >         return 0;
> > > > >  }
> > > > >
> >
> > --
> > Regards,
> >
> > Laurent Pinchart
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.libcamera.org/pipermail/libcamera-devel/attachments/20210525/00857ad4/attachment-0001.htm>


More information about the libcamera-devel mailing list