[libcamera-devel] [PATCH v2] cam: event_loop: Stop processing calls when event loop is exiting
Niklas Söderlund
niklas.soderlund at ragnatech.se
Wed Jan 20 10:12:49 CET 2021
Hi Laurent,
Thanks for your feedback.
On 2021-01-20 05:56:46 +0200, Laurent Pinchart wrote:
> Ni Niklas,
>
> Thank you for the patch.
>
> On Wed, Jan 20, 2021 at 12:33:15AM +0100, Niklas Söderlund wrote:
> > Terminating the event loop with EventLoop::exit() does not grantee that
> > it will terminate. If the event loops 'call later' queue can be kept
> > non-empty by continuously queuing new calls using EventLoop::callLater()
> > either from a different thread or from callbacks in the loop itself
> > EventLoop::dispatchCalls() will never complete and the loop will run
> > forever.
> >
> > Solve this by evaluating the exit condition between each callback and
> > only enter the idle loop if there are no callbacks to process.
> >
> > Reported-by: Sebastian Fricke <sebastian.fricke at posteo.net>
> > Fixes: f49e93338b6309a6 ("cam: event_loop: Add deferred calls support")
> > Signed-off-by: Niklas Söderlund <niklas.soderlund at ragnatech.se>
>
> This seems to be even more of a hack than the previous version :-( Can
> we fix this properly please ?
I would like noting more then to fix this properly.
The heart of the problem is that some of our pipelines require more then
N requests to be queued for it to complete N. I agree it's bad that our
pipelines behaves differently and something should be done to align
them.
It feels natural that such work would be paired with a compliance tool
so situations like this can be detected early and avoided. I have been
tempted many times to start a compliance tool but have postponed it. My
primary reason for doing so is that we never really defined the number
of requests queued vs. pipeline depth and pipeline stalls and what a
correct behavior in such situations is. I was pleased with the camera
model work had was hoping it would lead to a more strict definition wich
such tools could be built upon.
How ever solving the core problem is a rather large task to sort out.
How can we handle the issue in the short time frame? No matter how I
dice it I see a hack in cam is needed ;-(
- The best option I see is to use the 'process call in batches' solution
as your suggestion in v1 and then either remove support for
--capture=N or accept that on "broken" pipelines N+pipeline depth
frames will be processed, but at least cam won't deadlock.
- Another other option (that I really don't like) is to revert the
deferred calls support.
- We can also fix this properly in cam by only queuing N requests and
expecting N requests to complete. This will break --capture=N on some
pipelines, but perhaps that is the pressure we need to push the fix
upwards.
>
> > ---
> > * Changes since v1
> > - Was [PATCH 0/2] cam: Fix races in event loop and long request
> > processing times
> > - Rework dispatchCalls() instead of callLater() for a simpler fix.
> > ---
> > src/cam/event_loop.cpp | 22 +++++++++++++---------
> > src/cam/event_loop.h | 2 +-
> > 2 files changed, 14 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/cam/event_loop.cpp b/src/cam/event_loop.cpp
> > index 94c5d1d362455f33..18786b4500be9b6f 100644
> > --- a/src/cam/event_loop.cpp
> > +++ b/src/cam/event_loop.cpp
> > @@ -41,7 +41,9 @@ int EventLoop::exec()
> > exit_.store(false, std::memory_order_release);
> >
> > while (!exit_.load(std::memory_order_acquire)) {
> > - dispatchCalls();
> > + if (dispatchCall())
> > + continue;
> > +
> > event_base_loop(event_, EVLOOP_NO_EXIT_ON_EMPTY);
> > }
> >
> > @@ -70,17 +72,19 @@ void EventLoop::callLater(const std::function<void()> &func)
> > interrupt();
> > }
> >
> > -void EventLoop::dispatchCalls()
> > +bool EventLoop::dispatchCall()
> > {
> > std::unique_lock<std::mutex> locker(lock_);
> >
> > - for (auto iter = calls_.begin(); iter != calls_.end(); ) {
> > - std::function<void()> call = std::move(*iter);
> > + if (calls_.empty())
> > + return false;
> >
> > - iter = calls_.erase(iter);
> > + auto iter = calls_.begin();
> > + std::function<void()> call = std::move(*iter);
> > + calls_.erase(iter);
> >
> > - locker.unlock();
> > - call();
> > - locker.lock();
> > - }
> > + locker.unlock();
> > + call();
> > +
> > + return true;
> > }
> > diff --git a/src/cam/event_loop.h b/src/cam/event_loop.h
> > index 408073c50594d09d..b84e1c9e432086e3 100644
> > --- a/src/cam/event_loop.h
> > +++ b/src/cam/event_loop.h
> > @@ -38,7 +38,7 @@ private:
> > std::mutex lock_;
> >
> > void interrupt();
> > - void dispatchCalls();
> > + bool dispatchCall();
> > };
> >
> > #endif /* __CAM_EVENT_LOOP_H__ */
>
> --
> Regards,
>
> Laurent Pinchart
--
Regards,
Niklas Söderlund
More information about the libcamera-devel
mailing list