[libcamera-devel] [PATCH v3 2/3] android: CameraDevice: Prevent race due to concurrent HAL calls

Laurent Pinchart laurent.pinchart at ideasonboard.com
Mon May 3 16:55:38 CEST 2021


Hello,

On Mon, May 03, 2021 at 04:41:04PM +0200, Jacopo Mondi wrote:
> On Mon, Apr 26, 2021 at 05:38:29PM +0900, Hirokazu Honda wrote:
> > HAL API functions might be called by different threads in Android
> > Camera HAL3 API, while process_capture_request() must be called
> > by a single thread. Furthermore, close() and flush() can be
> > called any time. Therefore, races to variables can occur among
> > the HAL API functions.
> > This prevents the races by enclosing HAL calls by mutex. It
> > might not be the best in terms of the performance, but the
> > simplicity is good as a first step and also further development.
> >
> > Signed-off-by: Hirokazu Honda <hiroh at chromium.org>
> > ---
> >  src/android/camera_device.cpp | 59 ++++++++++++++++++++---------------
> >  src/android/camera_device.h   |  7 +++--
> >  2 files changed, 39 insertions(+), 27 deletions(-)
> >
> > diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> > index b3def04b..96c8d4a9 100644
> > --- a/src/android/camera_device.cpp
> > +++ b/src/android/camera_device.cpp
> > @@ -743,7 +743,7 @@ int CameraDevice::open(const hw_module_t *hardwareModule)
> >
> >  void CameraDevice::close()
> >  {
> > -	streams_.clear();
> > +	std::scoped_lock<std::mutex> halLock(halMutex_);
> 
> Have you considered locking the wrappers functions in camera_ops.cpp ?
> Wouldn't it be simpler ? However locking with such a coarse-grained
> granularity I feel contradicts a bit the Camera3 API flush()
> documentation, which implies that flush() and
> process_capture_request() can race
> 
>      * Due to
>      * concurrency issues, it is possible that from the HAL's point of view, a
>      * process_capture_request() call may be started after flush has been invoked but has not
>      * returned yet. If such a call happens before flush() returns, the HAL should treat the new
>      * capture request like other in-flight pending requests (see #4 below).
> 
> Locking at function level rules out this use case: if we lock each
> function it's not possible for process_capture_request() to be started
> during flush() execution as the lock hold by flush will be released
> only at exit time.
> 
> If serializing operations with this granularity is fine (and I really
> with it is, as it would really simplify life for use)

I think it is as an initial step, but likely not in the longer term. It
defeats the point of having the camera service issue
process_capture_request() and stop()/flush() calls concurrently. If
serializing operations was good enough in every case, it would be done
in the camera service itself, and HAL implementations wouldn't need to
deal with this.

> the only racing
> condition we have to protect from is between the request completion
> signal emitted by libcamera, and the camera framework calling into the
> Camera HAL using the camera3 API as flush(), close() and
> process_capture_request() will always be serialized.
> 
> Assuming the above is ok with the Camera3 API, isn't it enough to
> 
> - flush() to call stop() and stop the Camera:
>         - All queued requests at the time Camera::stop() is called
>           will be returned one by one with an error state. There's no
>           reason to cancel the descriptors at stop() time, those
>           requests will be completed
> 
>         - A process_capture_request() called after flush() has stopped
>           the camera will return immeditely as the camera is stopped
>           and the capture request is returned with an error
>           (we might want a flag to keep track of this, otherwise we'll
>           hit a Camera state machine error, much like we do with
>           running_)
> 
>         - A close() called after flush() will be a no op

So far it seems fine to me.

>         - Only open() will allow the CameraDevice to move back from
>           stopped_ state

That's correct for close(). For flush(), we can to make sure that a
configureStreams() call will reopen the camera.

> - There's a tiny tiny window where a request can complete after flush
>   has been called but the camera has not been stopped yet. In

It's not that tiny, it can certainly be hit in practice.

>   requestComplete() we'll have to inspect the stopped_ flag state and
>   force the request to return with error even if it did not.

Why can't we complete the request without any error in that case ?

> - Once we move JPEG encoding to a worker thread, stop() will have to
>   forcefully stop it. Requests interrupted during encoding will be
>   returned with error.
> 
> A simple lock around stopped_ should protect it from concurrent
> accesses between stop() and requestComplete().

The issue is that the flush() function must wait for all requests to
complete. This could be implemented with a condition variable, allowing
to avoid serializing CameraDevice::requestComplete() with the API calls
from the camera service. I think that would be better indeed.

> Have you considered this option and ruled it out for reasons I'm
> failing to consider ? It feels simpler than decoupling requests and
> descriptors, maybe you tried and found out it's not :)
> 
> >  	stop();
> >
> > @@ -758,12 +758,17 @@ void CameraDevice::stop()
> >  	worker_.stop();
> >  	camera_->stop();
> >
> > -	descriptors_.clear();
> > +	{
> > +		std::scoped_lock<std::mutex> lock(mutex_);
> > +		descriptors_.clear();
> > +	}
> > +
> >  	running_ = false;
> >  }
> >
> >  void CameraDevice::setCallbacks(const camera3_callback_ops_t *callbacks)
> >  {
> > +	std::scoped_lock<std::mutex> halLock(halMutex_);
> >  	callbacks_ = callbacks;
> >  }
> >
> > @@ -1643,8 +1648,7 @@ PixelFormat CameraDevice::toPixelFormat(int format) const
> >   */
> >  int CameraDevice::configureStreams(camera3_stream_configuration_t *stream_list)
> >  {
> > -	/* Before any configuration attempt, stop the camera. */
> > -	stop();
> > +	std::scoped_lock<std::mutex> halLock(halMutex_);
> >
> >  	if (stream_list->num_streams == 0) {
> >  		LOG(HAL, Error) << "No streams in configuration";
> > @@ -1656,6 +1660,9 @@ int CameraDevice::configureStreams(camera3_stream_configuration_t *stream_list)
> >  		return -EINVAL;
> >  #endif
> >
> > +	/* Before any configuration attempt, stop the camera. */
> > +	stop();
> > +
> >  	/*
> >  	 * Generate an empty configuration, and construct a StreamConfiguration
> >  	 * for each camera3_stream to add to it.
> > @@ -1671,6 +1678,8 @@ int CameraDevice::configureStreams(camera3_stream_configuration_t *stream_list)
> >  	 * ensure the required entries are available without further
> >  	 * reallocation.
> >  	 */
> > +
> > +	std::scoped_lock<std::mutex> lock(mutex_);
> >  	streams_.clear();
> >  	streams_.reserve(stream_list->num_streams);
> >
> > @@ -1907,6 +1916,8 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> >  	if (!isValidRequest(camera3Request))
> >  		return -EINVAL;
> >
> > +	std::scoped_lock<std::mutex> halLock(halMutex_);
> > +
> >  	/* Start the camera if that's the first request we handle. */
> >  	if (!running_) {
> >  		worker_.start();
> > @@ -2022,29 +2033,26 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> >
> >  void CameraDevice::requestComplete(Request *request)
> >  {
> > +	std::scoped_lock<std::mutex> lock(mutex_);
> > +
> >  	const Request::BufferMap &buffers = request->buffers();
> >  	camera3_buffer_status status = CAMERA3_BUFFER_STATUS_OK;
> >  	std::unique_ptr<CameraMetadata> resultMetadata;
> >
> > -	decltype(descriptors_)::node_type node;
> > -	std::unique_ptr<CaptureRequest> captureRequest;
> > -	{
> > -		std::scoped_lock<std::mutex> lock(mutex_);
> > -		auto requestIt = requests_.find(request->cookie());
> > -		if (requestIt == requests_.end()) {
> > -			LOG(HAL, Fatal)
> > -				<< "Unknown request: " << request->cookie();
> > -			return;
> > -		}
> > -		captureRequest = std::move(requestIt->second);
> > -		requests_.erase(requestIt);
> > +	auto requestIt = requests_.find(request->cookie());
> > +	if (requestIt == requests_.end()) {
> > +		LOG(HAL, Fatal)
> > +			<< "Unknown request: " << request->cookie();
> > +		return;
> > +	}
> > +	auto captureRequest = std::move(requestIt->second);
> > +	requests_.erase(requestIt);
> >
> > -		auto it = descriptors_.find(request->cookie());
> > -		if (it == descriptors_.end())
> > -			return;
> > +	auto it = descriptors_.find(request->cookie());
> > +	if (it == descriptors_.end())
> > +		return;
> >
> > -		node = descriptors_.extract(it);
> > -	}
> > +	auto node = descriptors_.extract(it);
> >  	Camera3RequestDescriptor &descriptor = node.mapped();
> >
> >  	if (request->status() != Request::RequestComplete) {
> > @@ -2149,11 +2157,12 @@ void CameraDevice::notifyError(uint32_t frameNumber, camera3_stream_t *stream)
> >  	camera3_notify_msg_t notify = {};
> >
> >  	/*
> > -	 * \todo Report and identify the stream number or configuration to
> > -	 * clarify the stream that failed.
> > +	 * \todo Report and identify the stream number or configuration
> > +	 * to clarify the stream that failed.
> >  	 */
> > -	LOG(HAL, Error) << "Error occurred on frame " << frameNumber << " ("
> > -			<< toPixelFormat(stream->format).toString() << ")";
> > +	LOG(HAL, Error)
> > +		<< "Error occurred on frame " << descriptor.frameNumber_ << " ("
> > +		<< toPixelFormat(descriptor.buffers_[0].stream->format).toString() << ")";
> >
> >  	notify.type = CAMERA3_MSG_ERROR;
> >  	notify.message.error.error_stream = stream;
> > diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> > index 95d77890..325fb967 100644
> > --- a/src/android/camera_device.h
> > +++ b/src/android/camera_device.h
> > @@ -125,9 +125,12 @@ private:
> >
> >  	std::vector<Camera3StreamConfiguration> streamConfigurations_;
> >  	std::map<int, libcamera::PixelFormat> formatsMap_;
> > -	std::vector<CameraStream> streams_;
> >
> > -	std::mutex mutex_; /* Protect descriptors_ and requests_. */
> > +	/* Avoid races by concurrent Camera HAL API calls. */
> > +	std::mutex halMutex_;
> > +
> > +	std::mutex mutex_; /* Protect streams_, descriptors_ and requests_. */
> > +	std::vector<CameraStream> streams_;
> >  	std::map<uint64_t, Camera3RequestDescriptor> descriptors_;
> >  	std::map<uint64_t, std::unique_ptr<CaptureRequest>> requests_;
> >

-- 
Regards,

Laurent Pinchart


More information about the libcamera-devel mailing list