[PATCH v3] libcamera: software_isp: Handle signals in the proper thread
Laurent Pinchart
laurent.pinchart at ideasonboard.com
Sun Feb 16 01:32:27 CET 2025
Hi Milan,
On Fri, Feb 14, 2025 at 09:57:22PM +0100, Milan Zamazal wrote:
> Stanislaw Gruszka writes:
> > On Fri, Jan 31, 2025 at 08:59:28PM +0100, Milan Zamazal wrote:
> >> inputBufferReady ready signal in the simple pipeline is handled in the
> >> pipeline handler thread. outputBufferReady and ispStatsReady signals
> >> should be handled there too.
> >>
> >> Rather than relying on the user of the SoftwareIsp instance, let
> >> SoftwareIsp inherits Object. SoftwareIsp serves as a signal proxy, the
> >> signals above are emitted from signal handlers. This means that if
> >> SoftwareIsp inherits Object then the slots are invoked in SoftwareIsp
> >> thread. Which is the camera manager thread because the SoftwareIsp
> >> instance is created there.
> >>
> >> Signed-off-by: Milan Zamazal <mzamazal at redhat.com>
> >> Reviewed-by: Laurent Pinchart <laurent.pinchart at ideasonboard.com>
> >
> > Unfortunately I have assertion failures like below with the change.
> > Reproducible approximately 1 per 4 times when stopping qcam.
>
> I can reproduce it occasionally with cam too.
>
> > Could you please take a look?
>
> I think what happens there is:
>
> - The ISP stats signal is emitted at the "right" moment to cause the
> race and gets queued.
>
> - The pipeline is asked to stop and the IPA proxy state is set to
> ProxyStopping.
>
> - As part of stopping the IPA, pending messages of the given thread are
> invoked.
>
> - The ISP stats message causes a call to IPA to process the stats, which
> fails on the assertion due to ProxyStopping IPA state. The stats
> related IPA call itself is not different from what hardware pipelines
> do but the circumstances when it happens are different, either
> preventing the race in hardware pipelines or making it much less
> likely (I don't know).
We indeed tried to model the software ISP similarly to hardware devices,
in that they would run separately from the pipeline handler thread (the
hardware obviously runs on its own separately from the CPU, and the
software ISP runs in a separate thread to mimick that), and emit a
signal received in the pipeline handler thread. There's a crucial
difference though, related to how the signal is delivered.
For hardware devices, the file descriptor of the kernel device is added
to the event loop's even sources (through an EventNotifier). When the
pipeline handler thread goes back to event loop processing, it sees an
event on the file descriptor, and calls the event notifier's signal
handler, in the same thread. For video devices used to capture
statistics, that's an instance of the V4L2VideoDevice class, which
decodes the event and emits the corresponding signal (e.g. bufferReady).
That signal is connected to the pipeline handler's buffer ready event,
which dequeues the statistics buffer. All of this happens synchronously
in the same thread. When the video device is stopped, the notifier is
disabled synchronously with the stop() call. As stop() is called in the
pipeline handler thread, this guarantees that no event will be received
by the pipeline handler when stop() returns. Races are not possible.
For the software ISP, the event is delivered by emitting a signal from
the ISP thread, in the DebayerCpu::process() function. The signal is
connected to a function of the SoftwareIsp class, using queued signal
delivery. The function then emits another signal, connected to the
pipeline handler. Queued signal delivery means that emitting a signal
queues a message to the receiver's message queue. That message is
processed when the receiver's thread runs the event loop, and calls the
signal handler synchronously from there. As for hardware devices, the
signal handler is guaranteed to be called in the pipeline handler
thread. However, because the signal message can be queued before the
software ISP thread is stopped with SoftwareIsp::stop(), its delivery
can occur after the stop() function returns.
> Now the question is what can be done about it. Things should be already
> run in the right thread as discussed previously. The signal emitter
> cannot do anything about the issue. So the pending message should be
> either discarded or the IPA call should be blocked if the IPA state is
> not ProxyRunning.
>
> The only way I can see to discard the messages is to destroy the
> corresponding object, which is not an option in this case.
>
> I can't see no public way to check the IPA state from outside, excluding
> the option of blocking the IPA call too. I'm not sure how to interpret
> the fact that ProxyState enum is public while its only use is to declare
> a protected class member; does it indicate that the option of dealing
> with the state from outside is left open?
>
> At the moment, what can help is tracking the stop action in SoftwareIsp
> class itself and preventing forwarding the signal from there in such a
> case. This apparently fixes the issue. I can post a patch next week if
> there is no better suggestion.
Event handling for hardware device was designed to avoid race
conditions, and I think it has served us well. As seen here, handling
race conditions with threads is difficult. I think the best approach is
to design components and APIs to isolate the rest of libcamera from
handling those problems. Handling the race condition in pipeline
handlers is likely possible, but it will be more error-prone, especially
if we consider using the software ISP (or other CPU- or GPU-based image
processing) in other pipeline handlers in the future. We should
therefore avoid event delivery after SoftwareIsp::stop() returns.
Your proposed solution, of blocking the signal delivery in the
SoftwareIsp class when stop() is called, seems good to me. It's as close
as we can get to disabling the event notifier for hardware devices.
The stop() function is currently implemented as
void SoftwareIsp::stop()
{
ispWorkerThread_.exit();
ispWorkerThread_.wait();
ipa_->stop();
}
After stopping the thread, we know that it will not touch any buffer
anymore, and won't send any new event. We may have messages queued at
that point, but they won't be processed asynchronously. Normally those
messages won't get delivered before we return to the event loop, but
ipa_->stop() calls dispatchMessages() to address the exact same issue as
the one we're dealing with here. This causes the queued messages from
the software ISP thread (if any) to be delivered immediately. Even if
that wasn't the case, we would still have an issue, as the messages
would get delivered when the pipeline handler thread returns to the
event loop. We can set a running state variable to false just before
calling ipa_->stop(), and block delivery of the signals when the
variable is false.
I think care also needs to be taken to return all queued buffers to
the pipeline handler, like we do with hardware devices. Before returning
to the caller from the stop() function, any buffer that has been queued
with SoftwareIsp::queueBuffers() should be returned to the pipeline
handler, the same way V4L2VideoDevice::streamOff() will return buffers
in the FrameCancelled state. The documentation of SoftwareIsp::stop()
should explain this.
There's one thing that makes me slightly uncomfortable with this though.
Blocking signal delivery based on the running state means that we leave
messages in the queue. ipa_->stop() will flush those, but if its
implementation changes later, then the messages would still be in the
queue when SoftwareIsp::stop() returns. That's mostly fine, control will
quickly return to the event loop, those messages will be delivered, and
the running state being set to false will block their delivery. However,
if the pipeline handler were to restart the software ISP right after
stopping it (I'm not sure why it would do so, but let's assume there
would be a use case), then the old messages will be delivered when
control returns to the event loop, as the running variable would be true
at that point. I'm wondering if it wouldn't be better to drop the
messages from the queue just before calling ipa_->stop(), with
Thread::dispatchMessages(). I think the function should be extended with
a 'Object *receiver' parameter, and only handle messages for that
receiver. The IPAProxy::stop() function could then limit message
dispatching to the appropriate receiver.
> > I can dig more into it, but I think it could be easier for you :-) Can
> > provide more data if needed.
> >
> > Thanks
> > Stanislaw
> >
> > 2:12:50.912442715] [11753] FATAL default soft_ipa_proxy.cpp:456 assertion "state_ == ProxyRunning" failed
> > in processStatsThread()
> > Backtrace:
> > libcamera::ipa::soft::IPAProxySoft::processStatsThread(unsigned int, unsigned int, libcamera::ControlList
> > const&)+0xd6 (src/libcamera/proxy/soft_ipa_proxy.cpp:457)
> > libcamera::ipa::soft::IPAProxySoft::processStats(unsigned int, unsigned int, libcamera::ControlList
> > const&)+0x57 (src/libcamera/proxy/soft_ipa_proxy.cpp:449)
> > libcamera::SoftwareIsp::processStats(unsigned int, unsigned int, libcamera::ControlList const&)+0x117
> > (../src/libcamera/software_isp/software_isp.cpp:174)
> > libcamera::SimpleCameraData::ispStatsReady(unsigned int, unsigned int)+0x76
> > (../src/libcamera/pipeline/simple/simple.cpp:894)
> > libcamera::BoundMethodMember<libcamera::SimpleCameraData, void, unsigned int, unsigned
> > int>::activate(unsigned int, unsigned int, bool)+0xa4 (../include/libcamera/base/bound_method.h:180)
> > libcamera::Signal<unsigned int, unsigned int>::emit(unsigned int, unsigned int)+0x99
> > (../include/libcamera/base/signal.h:152)
> > libcamera::SoftwareIsp::statsReady(unsigned int, unsigned int)+0x31
> > (../src/libcamera/software_isp/software_isp.cpp:360)
> > libcamera::BoundMethodMember<libcamera::SoftwareIsp, void, unsigned int, unsigned int>::invoke(unsigned
> > int, unsigned int)+0x80 (../include/libcamera/base/bound_method.h:191)
> > std::enable_if<std::is_void<void>::value, void>::type libcamera::BoundMethodArgs<void, unsigned int,
> > unsigned int>::invokePack<0ul, 1ul, void>(libcamera::BoundMethodPackBase*, std::integer_sequence<unsigned
> > long, 0ul, 1ul>)+0x5f (../include/libcamera/base/bound_method.h:116)
> > libcamera::BoundMethodArgs<void, unsigned int, unsigned
> > int>::invokePack(libcamera::BoundMethodPackBase*)+0x27 (../include/libcamera/base/bound_method.h:125)
> > libcamera::InvokeMessage::invoke()+0x46 (../src/libcamera/base/message.cpp:154)
> > libcamera::Object::message(libcamera::Message*)+0x7a (../src/libcamera/base/object.cpp:213)
> > libcamera::Thread::dispatchMessages(libcamera::Message::Type)+0x317
> > (../src/libcamera/base/thread.cpp:650)
> > libcamera::ipa::soft::IPAProxySoft::stopThread()+0x167 (src/libcamera/proxy/soft_ipa_proxy.cpp:285)
> > libcamera::ipa::soft::IPAProxySoft::stop()+0x39 (src/libcamera/proxy/soft_ipa_proxy.cpp:268)
> > libcamera::SoftwareIsp::stop()+0x64 (../src/libcamera/software_isp/software_isp.cpp:332)
> > libcamera::SimplePipelineHandler::stopDevice(libcamera::Camera*)+0xb6
> > (../src/libcamera/pipeline/simple/simple.cpp:1396)
> > libcamera::PipelineHandler::stop(libcamera::Camera*)+0x54 (../src/libcamera/pipeline_handler.cpp:367)
> > libcamera::BoundMethodMember<libcamera::PipelineHandler, void,
> > libcamera::Camera*>::invoke(libcamera::Camera*)+0x7f (../include/libcamera/base/bound_method.h:191)
> > std::enable_if<std::is_void<void>::value, void>::type libcamera::BoundMethodArgs<void,
> > libcamera::Camera*>::invokePack<0ul, void>(libcamera::BoundMethodPackBase*,
> > std::integer_sequence<unsigned long, 0ul>)+0x4a (../include/libcamera/base/bound_method.h:116)
> > libcamera::BoundMethodArgs<void, libcamera::Camera*>::invokePack(libcamera::BoundMethodPackBase*)+0x27
> > (../include/libcamera/base/bound_method.h:125)
> > libcamera::InvokeMessage::invoke()+0x46 (../src/libcamera/base/message.cpp:154)
> > libcamera::Object::message(libcamera::Message*)+0x7a (../src/libcamera/base/object.cpp:213)
> > libcamera::Thread::dispatchMessages(libcamera::Message::Type)+0x317
> > (../src/libcamera/base/thread.cpp:650)
> > libcamera::EventDispatcherPoll::processEvents()+0x38
> > (../src/libcamera/base/event_dispatcher_poll.cpp:149)
> > libcamera::Thread::exec()+0x70 (../src/libcamera/base/thread.cpp:310)
> > libcamera::CameraManager::Private::run()+0x121 (../src/libcamera/camera_manager.cpp:91)
> > libcamera::Thread::startThread()+0xf2 (../src/libcamera/base/thread.cpp:290)
> > void std::__invoke_impl<void, void (libcamera::Thread::*)(),
> > libcamera::Thread*>(std::__invoke_memfun_deref, void (libcamera::Thread::*&&)(),
> > libcamera::Thread*&&)+0x6a (/usr/include/c++/13/bits/invoke.h:74)
> > std::__invoke_result<void (libcamera::Thread::*)(), libcamera::Thread*>::type std::__invoke<void
> > (libcamera::Thread::*)(), libcamera::Thread*>(void (libcamera::Thread::*&&)(), libcamera::Thread*&&)+0x3b
> > (/usr/include/c++/13/bits/invoke.h:97)
> > void std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*>
> >>::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>)+0x47 (/usr/include/c++/13/bits/std_thread.h:292)
> > std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> >::operator()()+0x1c
> > (/usr/include/c++/13/bits/std_thread.h:299)
> > std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(),
> > libcamera::Thread*> > >::_M_run()+0x20 (/usr/include/c++/13/bits/std_thread.h:244)
> > execute_native_thread_routine+0x14
> > (/build/gcc-14-ig5ci0/gcc-14-14.2.0/build/x86_64-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:93)
> > start_thread+0x384 (./nptl/pthread_create.c:447)
> > __clone3+0x2c (../sysdeps/unix/sysv/linux/x86_64/clone3.S:80)
> > Aborted (core dumped)
> >
> >> ---
> >> .../internal/software_isp/software_isp.h | 3 ++-
> >> src/libcamera/pipeline/simple/simple.cpp | 16 +---------------
> >> 2 files changed, 3 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/include/libcamera/internal/software_isp/software_isp.h b/include/libcamera/internal/software_isp/software_isp.h
> >> index d51b03fd..440a296d 100644
> >> --- a/include/libcamera/internal/software_isp/software_isp.h
> >> +++ b/include/libcamera/internal/software_isp/software_isp.h
> >> @@ -18,6 +18,7 @@
> >>
> >> #include <libcamera/base/class.h>
> >> #include <libcamera/base/log.h>
> >> +#include <libcamera/base/object.h>
> >> #include <libcamera/base/signal.h>
> >> #include <libcamera/base/thread.h>
> >>
> >> @@ -43,7 +44,7 @@ struct StreamConfiguration;
> >>
> >> LOG_DECLARE_CATEGORY(SoftwareIsp)
> >>
> >> -class SoftwareIsp
> >> +class SoftwareIsp : public Object
> >> {
> >> public:
> >> SoftwareIsp(PipelineHandler *pipe, const CameraSensor *sensor,
> >> diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp
> >> index 8ac24e6e..6e039bf3 100644
> >> --- a/src/libcamera/pipeline/simple/simple.cpp
> >> +++ b/src/libcamera/pipeline/simple/simple.cpp
> >> @@ -537,21 +537,7 @@ int SimpleCameraData::init()
> >> << "Failed to create software ISP, disabling software debayering";
> >> swIsp_.reset();
> >> } else {
> >> - /*
> >> - * The inputBufferReady signal is emitted from the soft ISP thread,
> >> - * and needs to be handled in the pipeline handler thread. Signals
> >> - * implement queued delivery, but this works transparently only if
> >> - * the receiver is bound to the target thread. As the
> >> - * SimpleCameraData class doesn't inherit from the Object class, it
> >> - * is not bound to any thread, and the signal would be delivered
> >> - * synchronously. Instead, connect the signal to a lambda function
> >> - * bound explicitly to the pipe, which is bound to the pipeline
> >> - * handler thread. The function then simply forwards the call to
> >> - * conversionInputDone().
> >> - */
> >> - swIsp_->inputBufferReady.connect(pipe, [this](FrameBuffer *buffer) {
> >> - this->conversionInputDone(buffer);
> >> - });
> >> + swIsp_->inputBufferReady.connect(this, &SimpleCameraData::conversionInputDone);
> >> swIsp_->outputBufferReady.connect(this, &SimpleCameraData::conversionOutputDone);
> >> swIsp_->ispStatsReady.connect(this, &SimpleCameraData::ispStatsReady);
> >> swIsp_->setSensorControls.connect(this, &SimpleCameraData::setSensorControls);
--
Regards,
Laurent Pinchart
More information about the libcamera-devel
mailing list