[libcamera-devel] [PATCH v2] cam: event_loop: Stop processing calls when event loop is exiting

Niklas Söderlund niklas.soderlund at ragnatech.se
Wed Jan 20 10:12:49 CET 2021


Hi Laurent,

Thanks for your feedback.

On 2021-01-20 05:56:46 +0200, Laurent Pinchart wrote:
> Ni Niklas,
> 
> Thank you for the patch.
> 
> On Wed, Jan 20, 2021 at 12:33:15AM +0100, Niklas Söderlund wrote:
> > Terminating the event loop with EventLoop::exit() does not grantee that
> > it will terminate. If the event loops 'call later' queue can be kept
> > non-empty by continuously queuing new calls using EventLoop::callLater()
> > either from a different thread or from callbacks in the loop itself
> > EventLoop::dispatchCalls() will never complete and the loop will run
> > forever.
> > 
> > Solve this by evaluating the exit condition between each callback and
> > only enter the idle loop if there are no callbacks to process.
> > 
> > Reported-by: Sebastian Fricke <sebastian.fricke at posteo.net>
> > Fixes: f49e93338b6309a6 ("cam: event_loop: Add deferred calls support")
> > Signed-off-by: Niklas Söderlund <niklas.soderlund at ragnatech.se>
> 
> This seems to be even more of a hack than the previous version :-( Can
> we fix this properly please ?

I would like noting more then to fix this properly.

The heart of the problem is that some of our pipelines require more then 
N requests to be queued for it to complete N. I agree it's bad that our 
pipelines behaves differently and something should be done to align 
them.

It feels natural that such work would be paired with a compliance tool 
so situations like this can be detected early and avoided. I have been 
tempted many times to start a compliance tool but have postponed it. My 
primary reason for doing so is that we never really defined the number 
of requests queued vs. pipeline depth and pipeline stalls and what a 
correct behavior in such situations is. I was pleased with the camera 
model work had was hoping it would lead to a more strict definition wich 
such tools could be built upon.

How ever solving the core problem is a rather large task to sort out.  
How can we handle the issue in the short time frame? No matter how I 
dice it I see a hack in cam is needed ;-(

- The best option I see is to use the 'process call in batches' solution
  as your suggestion in v1 and then either remove support for 
  --capture=N or accept that on "broken" pipelines N+pipeline depth 
  frames will be processed, but at least cam won't deadlock.

- Another other option (that I really don't like) is to revert the 
  deferred calls support.

- We can also fix this properly in cam by only queuing N requests and 
  expecting N requests to complete. This will break --capture=N on some 
  pipelines, but perhaps that is the pressure we need to push the fix 
  upwards.

> 
> > ---
> > * Changes since v1
> > - Was [PATCH 0/2] cam: Fix races in event loop and long request
> >   processing times
> > - Rework dispatchCalls() instead of callLater() for a simpler fix.
> > ---
> >  src/cam/event_loop.cpp | 22 +++++++++++++---------
> >  src/cam/event_loop.h   |  2 +-
> >  2 files changed, 14 insertions(+), 10 deletions(-)
> > 
> > diff --git a/src/cam/event_loop.cpp b/src/cam/event_loop.cpp
> > index 94c5d1d362455f33..18786b4500be9b6f 100644
> > --- a/src/cam/event_loop.cpp
> > +++ b/src/cam/event_loop.cpp
> > @@ -41,7 +41,9 @@ int EventLoop::exec()
> >  	exit_.store(false, std::memory_order_release);
> >  
> >  	while (!exit_.load(std::memory_order_acquire)) {
> > -		dispatchCalls();
> > +		if (dispatchCall())
> > +			continue;
> > +
> >  		event_base_loop(event_, EVLOOP_NO_EXIT_ON_EMPTY);
> >  	}
> >  
> > @@ -70,17 +72,19 @@ void EventLoop::callLater(const std::function<void()> &func)
> >  	interrupt();
> >  }
> >  
> > -void EventLoop::dispatchCalls()
> > +bool EventLoop::dispatchCall()
> >  {
> >  	std::unique_lock<std::mutex> locker(lock_);
> >  
> > -	for (auto iter = calls_.begin(); iter != calls_.end(); ) {
> > -		std::function<void()> call = std::move(*iter);
> > +	if (calls_.empty())
> > +		return false;
> >  
> > -		iter = calls_.erase(iter);
> > +	auto iter = calls_.begin();
> > +	std::function<void()> call = std::move(*iter);
> > +	calls_.erase(iter);
> >  
> > -		locker.unlock();
> > -		call();
> > -		locker.lock();
> > -	}
> > +	locker.unlock();
> > +	call();
> > +
> > +	return true;
> >  }
> > diff --git a/src/cam/event_loop.h b/src/cam/event_loop.h
> > index 408073c50594d09d..b84e1c9e432086e3 100644
> > --- a/src/cam/event_loop.h
> > +++ b/src/cam/event_loop.h
> > @@ -38,7 +38,7 @@ private:
> >  	std::mutex lock_;
> >  
> >  	void interrupt();
> > -	void dispatchCalls();
> > +	bool dispatchCall();
> >  };
> >  
> >  #endif /* __CAM_EVENT_LOOP_H__ */
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
Regards,
Niklas Söderlund


More information about the libcamera-devel mailing list