[libcamera-devel] [PATCH v2 6/9] android: camera_device: Add a queue for sending capture results

Wed Sep 15 03:28:56 CEST 2021

Hi Umang,

On Mon, Sep 13, 2021 at 12:31:46PM +0530, Umang Jain wrote:
> On 9/13/21 6:29 AM, Laurent Pinchart wrote:
> > On Fri, Sep 10, 2021 at 12:36:35PM +0530, Umang Jain wrote:
> >> When a camera capture request completes, the next step is to send the
> >> capture results to the framework via process_capture_results(). All
> >> the capture associated result metadata is prepared and populated. If
> >> any post-processing is required, it will also take place in the same
> >> thread (which can be blocking for a subtaintial amount of time) before
> >> the results can be sent back to the framework.
> >>
> >> Subsequent patches will move the post-processing to run in a separate
> >> thread. In order to do so, there is few bits of groundwork which this
> >> patch entails. Mainly, we need to preserve the order of sending
> >> the capture results back in which they were queued by the framework
> >> to the HAL (i.e. the order is sequential). Hence, we need to add a queue
> >> in which capture results can be queued with context.
> >>
> >> As per this patch, the post-processor still runs synchronously as
> >> before, but it will queue up the current capture results with context.
> >> Later down the line, the capture results are dequeud in order and
> >> sent back to the framework via sendQueuedCaptureResults(). If the queue
> >> is empty or unused, the current behaviour is to skip the queue and
> >> send capture results directly.
> >>
> >> The context is preserved using Camera3RequestDescriptor utility
> >> structure. It has been expanded accordingly to address the needs of
> >> the functionality.
> >> ---
> >> Check if we can drop unique_ptr to camera3descriptor
> >> ---
> >>
> >> Signed-off-by: Umang Jain <umang.jain at ideasonboard.com>
> >> ---
> >>   src/android/camera_device.cpp | 106 ++++++++++++++++++++++++++++++----
> >>   src/android/camera_device.h   |  22 +++++++
> >>   src/android/camera_stream.cpp |   2 +
> >>   3 files changed, 119 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> >> index 84549d04..7f04d225 100644
> >> --- a/src/android/camera_device.cpp
> >> +++ b/src/android/camera_device.cpp
> >> @@ -240,6 +240,8 @@ Camera3RequestDescriptor::Camera3RequestDescriptor(
> >>   	/* Clone the controls associated with the camera3 request. */
> >>   	settings_ = CameraMetadata(camera3Request->settings);
> >>   
> >> +	internalBuffer_ = nullptr;
> >> +
> >>   	/*
> >>   	 * Create the CaptureRequest, stored as a unique_ptr<> to tie its
> >>   	 * lifetime to the descriptor.
> >> @@ -989,6 +991,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> >>   			 * once it has been processed.
> >>   			 */
> >>   			buffer = cameraStream->getBuffer();
> >> +			descriptor.internalBuffer_ = buffer;
> >>   			LOG(HAL, Debug) << ss.str() << " (internal)";
> >>   			break;
> >>   		}
> >> @@ -1148,26 +1151,107 @@ void CameraDevice::requestComplete(Request *request)
> >>   			continue;
> >>   		}
> >>   
> >> +
> >> +		/*
> >> +		 * Save the current context of capture result and queue the
> >> +		 * descriptor before processing the camera stream.
> >> +		 *
> >> +		 * When the processing completes, the descriptor will be
> >> +		 * dequeued and capture results will be sent to the framework.
> >> +		 */
> >> +		descriptor.status_ = Camera3RequestDescriptor::Pending;
> >
> > The status should also be initialized to Pending in the
> > Camera3RequestDescriptor constructor.
> >
> >> +		descriptor.resultMetadata_ = std::move(resultMetadata);
> >
> > Let's move the resultMetadata to the descriptor right when creating it,
> > it will be cleaner.
> >
> >> +		descriptor.captureResult_ = captureResult;
> >
> > This causes a copy of a fairly large structure. You can prevent this by
> > replacing
> >
> > 	camera3_capture_result_t captureResult = {};
> >
> > above with
> >
> > 	camera3_capture_result_t &captureResult = descriptor.captureResult_;
> > 	captureResult = {};
> >
> > or better, in the definition of Camera3RequestDescriptor, replace
> >
> > 	camera3_capture_result_t captureResult_;
> >
> > with
> >
> > 	camera3_capture_result_t captureResult_ = {};
> >
> > and drop the initialization in this function. This will ensure that the
> > capture results are correctly initialized as soon as
> > Camera3RequestDescriptor is constructed, avoiding possibly
> > use-before-init issues.
> >
> >> +
> >> +		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
> >> +			std::make_unique<Camera3RequestDescriptor>();
> >> +		*reqDescriptor = std::move(descriptor);
> >
> > This will copy the whole class as there's no move constructor for
> > Camera3RequestDescriptor. I don't think that's what you want. If the
> > descriptor needs to be moved out of the map to the queue, without a
> > copy, then they should be stored in the map as unique_ptr already. This
> > should be done as a separate patch before this one.
> 
> So, you mean descriptors in the map should be unique_ptr beforehand? 
> That will avoid the copy on std::move?

The only thing that std::move() does is to return an rvalue reference to
the parameter passed to it. That's all. What it implies is that the
assignment will then pick the version of operator=() that takes an
rvalue reference (called the move assignment operator). It's really just
a way to pick one implementation of operator=() over the other.

For classes such as std::unique_ptr<>, the move assignment operator will
transfer ownership of the pointer. For classes such as std::vector<>, it
will transfer ownership of the internal data, so it's cheap. For
Camera3RequestDescriptor, we don't define any move assignement operator,
so all members will be moved one by one, and for most members, that will
just be a copy.

Bottom line, moving a std::unique_ptr<> makes sense, moving a
Camera3RequestDescriptor less so.

After reviewing the whole series, and thinking some more about it, I
think we should introduce a
std::queue<std::unique_ptr<Camera3RequestDescriptor>> that will own the
Camera3RequestDescriptor instances, and a std::map<uint64_t,
Camera3RequestDescriptor *> to lookup the Camera3RequestDescriptor in
CameraDevice::requestComplete(). The queue will have ownership of the
request descriptor, the map will store pointers to the instance stored
in the queue. You'll need to

- add an entry to the queue in CameraDevice::processCaptureRequest(),
  and add it to the map there too, right before queuing the request to
  the worker

- look up the entry from the map in CameraDevice::requestComplete(), and
  remove it from the map there

- handle completion in order using the queue, with a function like
  CameraDevice::sendQueuedCaptureResults()

It would be best, I think, if this rework could be done early in the
series to make it easier to review, and then add the async
post-processor. I haven't looked at it in details though, so maybe
that's not practical.

> > I wonder if we shouldn't keep the descriptor in the map until we finish
> > post-processing, as that would help implementing partial completion
> > support in the future (where we would connect the bufferComplete signal
> > and start post-processing of buffers before libcamera completes the
> > whole request). Maybe that will be best handled in the future, not now,
> > so I think I'm fine moving descriptors from the map to the queue for
> > now.
> 
> Ack.
> 
> >> +		queuedDescriptor_.push_back(std::move(reqDescriptor));
> >> +
> >> +		Camera3RequestDescriptor *currentDescriptor = queuedDescriptor_.back().get();
> >>   		int ret = cameraStream->process(src, *buffer.buffer,
> >
> > ret is unused, this causes a compilation failure. Could you please
> > compile-test individual patches in the series ?
> >
> > Now that camera stream processing completion is signaled, the process()
> > function should become void. This can go in a separate patch.
> >
> >> -						descriptor.settings_,
> >> -						resultMetadata.get(),
> >> -						&descriptor);
> >> +						currentDescriptor->settings_,
> >> +						currentDescriptor->resultMetadata_.get(),
> >
> > You can drop the resultMetadata parameter from the process() function
> > and access it through the descriptor instead.
> 
> As I mentioned in the previous replies, I think fields that are modified 
> as part of process(), should be passed explitcity. The descriptor I pass 
> down the line, is a const.
> 
> >> +						currentDescriptor);
> >> +		return;
> >> +	}
> >> +
> >> +	if (queuedDescriptor_.empty()) {
> >> +		captureResult.result = resultMetadata->get();
> >> +		callbacks_->process_capture_result(callbacks_, &captureResult);
> >> +	} else {
> >> +		/*
> >> +		 * Previous capture results waiting to be sent to framework
> >> +		 * hence, queue the current capture results as well. After that,
> >> +		 * check if any results are ready to be sent back to the
> >> +		 * framework.
> >> +		 */
> >> +		descriptor.status_ = Camera3RequestDescriptor::Succeeded;
> >> +		descriptor.resultMetadata_ = std::move(resultMetadata);
> >> +		descriptor.captureResult_ = captureResult;
> >> +		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
> >> +			std::make_unique<Camera3RequestDescriptor>();
> >> +		*reqDescriptor = std::move(descriptor);
> >> +		queuedDescriptor_.push_back(std::move(reqDescriptor));
> >> +
> >> +		sendQueuedCaptureResults();
> >> +	}
> >> +}
> >> +
> >> +void CameraDevice::streamProcessingComplete(CameraStream *cameraStream,
> >> +					    CameraStream::ProcessStatus status,
> >> +					    const Camera3RequestDescriptor *context)
> >> +{
> >> +	for (auto &d : queuedDescriptor_) {
> >
> > This function is called from the post-processing completion thread, so
> > it will race with CameraDevice::requestComplete(). Both access
> > queuedDescriptor_ with a lock, it won't end well.
> 
> There is no thread here in this patch. It's all synchronous as mentioned 
> in the commit message. The threading only appears 8/9 patch and there we 
> need to start looking at racing issues.

You're right, but could we introduce the required locking in this patch
already ? It's easier to review if the new data structures and the
locking come together.

> >> +		if (d.get() != context)
> >> +			continue;
> >
> > Iterating over a queue to find the element we need sounds like we're not
> > using the right type of container. I think you should keep the
> > Camera3RequestDescriptor in the map, and look it up from the map here.
> 
> No, The descriptor is already getting cleaned up in requestComplete().
> 
> Let me explain:
> 
> As per master right now, the descriptor is extracted from the map at the 
> start of requestComplete. The lifetime of the descriptor is now only in 
> the requestComplete() function. Once it returns, the descriptor is 
> destroyed.
> 
> What am I doing through this series is, adding a descriptor to a queue. 
> This happens because we don't want to destroy it, rather preserve it 
> until post-processing completes(well, it's going to happen async, so it 
> makes more sense). But still it should be removed from the map, just 
> that, not let it get destroyed.
> 
> Now, why a queue? Because we should be able to send capture results as 
> per ordering. The descriptors are queued up in order.

That's a very clear explanation, I agree with all of this.

My comment about looking it up from the map indeed doesn't make too much
sense. Reading this function again, can't you just access the
Camera3RequestDescriptor from the context parameter ? You'll need to
const_cast<> to remove the const, but that's fine, we know here that we
can do so because we own the Camera3RequestDescriptor.

> Not removing the descriptor from the map might have it's own 
> implications for e.g. you will need to Request around as well, as the 
> descriptors_ map is looked with with request->cookie(). I think it's 
> slightly invasive changes, no?

Yes, not a good idea.

> Plus I think the "android: Fix 
> descriptors_ clean up" will also conflict to some extend.

There could be some conflicts, yes :-S

Does my above proposal with a queue owning the request and a map to look
it up make sense ? We should ideally even be able to get rid of the map
by storing the pointer to the Camera3RequestDescriptor in the Request
cookie, but it's currently used to store the CaptureRequest pointer.
CaptureRequest should really be dropped as fence handling should go to
the libcamera core, but I don't want to include that in this series,
it's big enough already. A map is fine as an interim solution.

> >> +
> >>   		/*
> >>   		 * Return the FrameBuffer to the CameraStream now that we're
> >>   		 * done processing it.
> >>   		 */
> >>   		if (cameraStream->type() == CameraStream::Type::Internal)
> >> -			cameraStream->putBuffer(src);
> >> -
> >> -		if (ret) {
> >> -			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> >> -			notifyError(descriptor.frameNumber_, buffer.stream,
> >> -				    CAMERA3_MSG_ERROR_BUFFER);
> >> +			cameraStream->putBuffer(d->internalBuffer_);
> >> +
> >> +		if (status == CameraStream::ProcessStatus::Succeeded) {
> >> +			d->status_ = Camera3RequestDescriptor::Succeeded;
> >> +		} else {
> >> +			d->status_ = Camera3RequestDescriptor::Failed;
> >> +
> >> +			d->captureResult_.partial_result = 0;
> >> +			for (camera3_stream_buffer_t &buffer : d->buffers_) {
> >> +				CameraStream *cs = static_cast<CameraStream *>(buffer.stream->priv);
> >> +
> >> +				if (cs->camera3Stream().format != HAL_PIXEL_FORMAT_BLOB)
> >> +					continue;
> >> +				buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> >> +				notifyError(d->frameNumber_, buffer.stream,
> >> +					    CAMERA3_MSG_ERROR_BUFFER);
> >> +			}
> >>   		}
> >> +		break;
> >>   	}
> >>   
> >> -	captureResult.result = resultMetadata->get();
> >> -	callbacks_->process_capture_result(callbacks_, &captureResult);
> >> +	/*
> >> +	 * Send back capture results to the framework by inspecting the queue.
> >> +	 * The framework can defer queueing further requests to the HAL (via
> >> +	 * process_capture_request) unless until it receives the capture results
> >> +	 * for already queued requests.
> >> +	 */
> >> +	sendQueuedCaptureResults();
> >> +}
> >> +
> >> +void CameraDevice::sendQueuedCaptureResults()
> >> +{
> >> +	while (!queuedDescriptor_.empty()) {
> >> +		std::unique_ptr<Camera3RequestDescriptor> &d = queuedDescriptor_.front();
> >> +		if (d->status_ == Camera3RequestDescriptor::Pending)
> >> +			return;
> >> +
> >> +		d->captureResult_.result = d->resultMetadata_->get();
> >> +		callbacks_->process_capture_result(callbacks_, &(d->captureResult_));
> >> +		queuedDescriptor_.pop_front();
> >> +	}
> >>   }
> >>   
> >>   std::string CameraDevice::logPrefix() const
> >> diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> >> index b59ac3e7..e7318358 100644
> >> --- a/src/android/camera_device.h
> >> +++ b/src/android/camera_device.h
> >> @@ -7,6 +7,7 @@
> >>   #ifndef __ANDROID_CAMERA_DEVICE_H__
> >>   #define __ANDROID_CAMERA_DEVICE_H__
> >>   
> >> +#include <deque>
> >>   #include <map>
> >>   #include <memory>
> >>   #include <mutex>
> >> @@ -46,6 +47,20 @@ struct Camera3RequestDescriptor {
> >>   	std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
> >>   	CameraMetadata settings_;
> >>   	std::unique_ptr<CaptureRequest> request_;
> >> +
> >> +	/*
> >> +	 * Below are utility placeholders used when a capture result
> >> +	 * needs to be queued before completion via process_capture_result()
> >> +	 */
> >> +	enum completionStatus {
> >> +		Pending,
> >> +		Succeeded,
> >> +		Failed,
> >> +	};
> >> +	std::unique_ptr<CameraMetadata> resultMetadata_;
> >> +	camera3_capture_result_t captureResult_;
> >
> > This patch is a bit hard to review. Could you split the addition of
> > resultMetadata_ and captureResult_ to Camera3RequestDescriptor to a
> > separate patch (including addressing the comments above) ?
> 
> Yes, it's hard since it setups with the ground work for post-processor. 
> It has many pieces linked together giving very little room for splitting up.
> 
> >> +	libcamera::FrameBuffer *internalBuffer_;
> >> +	completionStatus status_;
> >>   };
> >>   
> >>   class CameraDevice : protected libcamera::Loggable
> >> @@ -78,6 +93,9 @@ public:
> >>   	int processCaptureRequest(camera3_capture_request_t *request);
> >>   	void requestComplete(libcamera::Request *request);
> >>   
> >> +	void streamProcessingComplete(CameraStream *stream,
> >> +				      CameraStream::ProcessStatus status,
> >> +				      const Camera3RequestDescriptor *context);
> >>   protected:
> >>   	std::string logPrefix() const override;
> >>   
> >> @@ -106,6 +124,8 @@ private:
> >>   	std::unique_ptr<CameraMetadata> getResultMetadata(
> >>   		const Camera3RequestDescriptor &descriptor) const;
> >>   
> >> +	void sendQueuedCaptureResults();
> >> +
> >>   	unsigned int id_;
> >>   	camera3_device_t camera3Device_;
> >>   
> >> @@ -126,6 +146,8 @@ private:
> >>   	libcamera::Mutex descriptorsMutex_; /* Protects descriptors_. */
> >>   	std::map<uint64_t, Camera3RequestDescriptor> descriptors_;
> >>   
> >> +	std::deque<std::unique_ptr<Camera3RequestDescriptor>> queuedDescriptor_;
> >> +
> >>   	std::string maker_;
> >>   	std::string model_;
> >>   
> >> diff --git a/src/android/camera_stream.cpp b/src/android/camera_stream.cpp
> >> index 996779c4..c7d874b2 100644
> >> --- a/src/android/camera_stream.cpp
> >> +++ b/src/android/camera_stream.cpp
> >> @@ -134,6 +134,8 @@ void CameraStream::postProcessingComplete(PostProcessor::Status status,
> >>   		processStatus = ProcessStatus::Succeeded;
> >>   	else
> >>   		processStatus = ProcessStatus::Failed;
> >> +
> >> +	cameraDevice_->streamProcessingComplete(this, processStatus, context);
> >>   }
> >>   
> >>   FrameBuffer *CameraStream::getBuffer()

-- 
Regards,

Laurent Pinchart