[PATCH v3] libcamera: software_isp: Handle signals in the proper thread
Milan Zamazal
mzamazal at redhat.com
Mon Feb 17 14:26:42 CET 2025
Hi Laurent,
thank you for thorough explanation and clarifications.
Laurent Pinchart <laurent.pinchart at ideasonboard.com> writes:
> Hi Milan,
>
> On Fri, Feb 14, 2025 at 09:57:22PM +0100, Milan Zamazal wrote:
>> Stanislaw Gruszka writes:
>> > On Fri, Jan 31, 2025 at 08:59:28PM +0100, Milan Zamazal wrote:
>> >> inputBufferReady ready signal in the simple pipeline is handled in the
>> >> pipeline handler thread. outputBufferReady and ispStatsReady signals
>> >> should be handled there too.
>> >>
>> >> Rather than relying on the user of the SoftwareIsp instance, let
>> >> SoftwareIsp inherits Object. SoftwareIsp serves as a signal proxy, the
>> >> signals above are emitted from signal handlers. This means that if
>> >> SoftwareIsp inherits Object then the slots are invoked in SoftwareIsp
>> >> thread. Which is the camera manager thread because the SoftwareIsp
>> >> instance is created there.
>> >>
>> >> Signed-off-by: Milan Zamazal <mzamazal at redhat.com>
>> >> Reviewed-by: Laurent Pinchart <laurent.pinchart at ideasonboard.com>
>> >
>> > Unfortunately I have assertion failures like below with the change.
>> > Reproducible approximately 1 per 4 times when stopping qcam.
>>
>> I can reproduce it occasionally with cam too.
>>
>> > Could you please take a look?
>>
>> I think what happens there is:
>>
>> - The ISP stats signal is emitted at the "right" moment to cause the
>> race and gets queued.
>>
>> - The pipeline is asked to stop and the IPA proxy state is set to
>> ProxyStopping.
>>
>> - As part of stopping the IPA, pending messages of the given thread are
>> invoked.
>>
>> - The ISP stats message causes a call to IPA to process the stats, which
>> fails on the assertion due to ProxyStopping IPA state. The stats
>> related IPA call itself is not different from what hardware pipelines
>> do but the circumstances when it happens are different, either
>> preventing the race in hardware pipelines or making it much less
>> likely (I don't know).
>
> We indeed tried to model the software ISP similarly to hardware devices,
> in that they would run separately from the pipeline handler thread (the
> hardware obviously runs on its own separately from the CPU, and the
> software ISP runs in a separate thread to mimick that), and emit a
> signal received in the pipeline handler thread. There's a crucial
> difference though, related to how the signal is delivered.
>
> For hardware devices, the file descriptor of the kernel device is added
> to the event loop's even sources (through an EventNotifier). When the
> pipeline handler thread goes back to event loop processing, it sees an
> event on the file descriptor, and calls the event notifier's signal
> handler, in the same thread. For video devices used to capture
> statistics, that's an instance of the V4L2VideoDevice class, which
> decodes the event and emits the corresponding signal (e.g. bufferReady).
> That signal is connected to the pipeline handler's buffer ready event,
> which dequeues the statistics buffer. All of this happens synchronously
> in the same thread. When the video device is stopped, the notifier is
> disabled synchronously with the stop() call. As stop() is called in the
> pipeline handler thread, this guarantees that no event will be received
> by the pipeline handler when stop() returns. Races are not possible.
>
> For the software ISP, the event is delivered by emitting a signal from
> the ISP thread, in the DebayerCpu::process() function. The signal is
> connected to a function of the SoftwareIsp class, using queued signal
> delivery. The function then emits another signal, connected to the
> pipeline handler. Queued signal delivery means that emitting a signal
> queues a message to the receiver's message queue. That message is
> processed when the receiver's thread runs the event loop, and calls the
> signal handler synchronously from there. As for hardware devices, the
> signal handler is guaranteed to be called in the pipeline handler
> thread. However, because the signal message can be queued before the
> software ISP thread is stopped with SoftwareIsp::stop(), its delivery
> can occur after the stop() function returns.
>
>> Now the question is what can be done about it. Things should be already
>> run in the right thread as discussed previously. The signal emitter
>> cannot do anything about the issue. So the pending message should be
>> either discarded or the IPA call should be blocked if the IPA state is
>> not ProxyRunning.
>>
>> The only way I can see to discard the messages is to destroy the
>> corresponding object, which is not an option in this case.
>>
>> I can't see no public way to check the IPA state from outside, excluding
>> the option of blocking the IPA call too. I'm not sure how to interpret
>> the fact that ProxyState enum is public while its only use is to declare
>> a protected class member; does it indicate that the option of dealing
>> with the state from outside is left open?
>>
>> At the moment, what can help is tracking the stop action in SoftwareIsp
>> class itself and preventing forwarding the signal from there in such a
>> case. This apparently fixes the issue. I can post a patch next week if
>> there is no better suggestion.
>
> Event handling for hardware device was designed to avoid race
> conditions, and I think it has served us well. As seen here, handling
> race conditions with threads is difficult. I think the best approach is
> to design components and APIs to isolate the rest of libcamera from
> handling those problems. Handling the race condition in pipeline
> handlers is likely possible, but it will be more error-prone, especially
> if we consider using the software ISP (or other CPU- or GPU-based image
> processing) in other pipeline handlers in the future. We should
> therefore avoid event delivery after SoftwareIsp::stop() returns.
>
> Your proposed solution, of blocking the signal delivery in the
> SoftwareIsp class when stop() is called, seems good to me. It's as close
> as we can get to disabling the event notifier for hardware devices.
>
> The stop() function is currently implemented as
>
> void SoftwareIsp::stop()
> {
> ispWorkerThread_.exit();
> ispWorkerThread_.wait();
>
> ipa_->stop();
> }
>
> After stopping the thread, we know that it will not touch any buffer
> anymore, and won't send any new event. We may have messages queued at
> that point, but they won't be processed asynchronously. Normally those
> messages won't get delivered before we return to the event loop, but
> ipa_->stop() calls dispatchMessages() to address the exact same issue as
> the one we're dealing with here. This causes the queued messages from
> the software ISP thread (if any) to be delivered immediately. Even if
> that wasn't the case, we would still have an issue, as the messages
> would get delivered when the pipeline handler thread returns to the
> event loop. We can set a running state variable to false just before
> calling ipa_->stop(), and block delivery of the signals when the
> variable is false.
Yes, this is what I attempted on Friday and it worked fine.
> I think care also needs to be taken to return all queued buffers to
> the pipeline handler, like we do with hardware devices. Before returning
> to the caller from the stop() function, any buffer that has been queued
> with SoftwareIsp::queueBuffers() should be returned to the pipeline
> handler, the same way V4L2VideoDevice::streamOff() will return buffers
> in the FrameCancelled state. The documentation of SoftwareIsp::stop()
> should explain this.
Good point.
> There's one thing that makes me slightly uncomfortable with this though.
> Blocking signal delivery based on the running state means that we leave
> messages in the queue. ipa_->stop() will flush those, but if its
> implementation changes later, then the messages would still be in the
> queue when SoftwareIsp::stop() returns. That's mostly fine, control will
> quickly return to the event loop, those messages will be delivered, and
> the running state being set to false will block their delivery. However,
> if the pipeline handler were to restart the software ISP right after
> stopping it (I'm not sure why it would do so, but let's assume there
> would be a use case), then the old messages will be delivered when
> control returns to the event loop, as the running variable would be true
> at that point. I'm wondering if it wouldn't be better to drop the
> messages from the queue just before calling ipa_->stop(), with
> Thread::dispatchMessages(). I think the function should be extended with
> a 'Object *receiver' parameter, and only handle messages for that
> receiver. The IPAProxy::stop() function could then limit message
> dispatching to the appropriate receiver.
Let's try and see.
>> > I can dig more into it, but I think it could be easier for you :-) Can
>> > provide more data if needed.
>> >
>> > Thanks
>> > Stanislaw
>> >
>> > 2:12:50.912442715] [11753] FATAL default soft_ipa_proxy.cpp:456 assertion "state_ == ProxyRunning" failed
>> > in processStatsThread()
>> > Backtrace:
>> > libcamera::ipa::soft::IPAProxySoft::processStatsThread(unsigned int, unsigned int, libcamera::ControlList
>> > const&)+0xd6 (src/libcamera/proxy/soft_ipa_proxy.cpp:457)
>> > libcamera::ipa::soft::IPAProxySoft::processStats(unsigned int, unsigned int, libcamera::ControlList
>> > const&)+0x57 (src/libcamera/proxy/soft_ipa_proxy.cpp:449)
>> > libcamera::SoftwareIsp::processStats(unsigned int, unsigned int, libcamera::ControlList const&)+0x117
>> > (../src/libcamera/software_isp/software_isp.cpp:174)
>> > libcamera::SimpleCameraData::ispStatsReady(unsigned int, unsigned int)+0x76
>> > (../src/libcamera/pipeline/simple/simple.cpp:894)
>> > libcamera::BoundMethodMember<libcamera::SimpleCameraData, void, unsigned int, unsigned
>> > int>::activate(unsigned int, unsigned int, bool)+0xa4 (../include/libcamera/base/bound_method.h:180)
>> > libcamera::Signal<unsigned int, unsigned int>::emit(unsigned int, unsigned int)+0x99
>> > (../include/libcamera/base/signal.h:152)
>> > libcamera::SoftwareIsp::statsReady(unsigned int, unsigned int)+0x31
>> > (../src/libcamera/software_isp/software_isp.cpp:360)
>> > libcamera::BoundMethodMember<libcamera::SoftwareIsp, void, unsigned int, unsigned int>::invoke(unsigned
>> > int, unsigned int)+0x80 (../include/libcamera/base/bound_method.h:191)
>> > std::enable_if<std::is_void<void>::value, void>::type libcamera::BoundMethodArgs<void, unsigned int,
>> > unsigned int>::invokePack<0ul, 1ul, void>(libcamera::BoundMethodPackBase*, std::integer_sequence<unsigned
>> > long, 0ul, 1ul>)+0x5f (../include/libcamera/base/bound_method.h:116)
>> > libcamera::BoundMethodArgs<void, unsigned int, unsigned
>> > int>::invokePack(libcamera::BoundMethodPackBase*)+0x27 (../include/libcamera/base/bound_method.h:125)
>> > libcamera::InvokeMessage::invoke()+0x46 (../src/libcamera/base/message.cpp:154)
>> > libcamera::Object::message(libcamera::Message*)+0x7a (../src/libcamera/base/object.cpp:213)
>> > libcamera::Thread::dispatchMessages(libcamera::Message::Type)+0x317
>> > (../src/libcamera/base/thread.cpp:650)
>> > libcamera::ipa::soft::IPAProxySoft::stopThread()+0x167 (src/libcamera/proxy/soft_ipa_proxy.cpp:285)
>> > libcamera::ipa::soft::IPAProxySoft::stop()+0x39 (src/libcamera/proxy/soft_ipa_proxy.cpp:268)
>> > libcamera::SoftwareIsp::stop()+0x64 (../src/libcamera/software_isp/software_isp.cpp:332)
>> > libcamera::SimplePipelineHandler::stopDevice(libcamera::Camera*)+0xb6
>> > (../src/libcamera/pipeline/simple/simple.cpp:1396)
>> > libcamera::PipelineHandler::stop(libcamera::Camera*)+0x54 (../src/libcamera/pipeline_handler.cpp:367)
>> > libcamera::BoundMethodMember<libcamera::PipelineHandler, void,
>> > libcamera::Camera*>::invoke(libcamera::Camera*)+0x7f (../include/libcamera/base/bound_method.h:191)
>> > std::enable_if<std::is_void<void>::value, void>::type libcamera::BoundMethodArgs<void,
>> > libcamera::Camera*>::invokePack<0ul, void>(libcamera::BoundMethodPackBase*,
>> > std::integer_sequence<unsigned long, 0ul>)+0x4a (../include/libcamera/base/bound_method.h:116)
>> > libcamera::BoundMethodArgs<void, libcamera::Camera*>::invokePack(libcamera::BoundMethodPackBase*)+0x27
>> > (../include/libcamera/base/bound_method.h:125)
>> > libcamera::InvokeMessage::invoke()+0x46 (../src/libcamera/base/message.cpp:154)
>> > libcamera::Object::message(libcamera::Message*)+0x7a (../src/libcamera/base/object.cpp:213)
>> > libcamera::Thread::dispatchMessages(libcamera::Message::Type)+0x317
>> > (../src/libcamera/base/thread.cpp:650)
>> > libcamera::EventDispatcherPoll::processEvents()+0x38
>> > (../src/libcamera/base/event_dispatcher_poll.cpp:149)
>> > libcamera::Thread::exec()+0x70 (../src/libcamera/base/thread.cpp:310)
>> > libcamera::CameraManager::Private::run()+0x121 (../src/libcamera/camera_manager.cpp:91)
>> > libcamera::Thread::startThread()+0xf2 (../src/libcamera/base/thread.cpp:290)
>> > void std::__invoke_impl<void, void (libcamera::Thread::*)(),
>> > libcamera::Thread*>(std::__invoke_memfun_deref, void (libcamera::Thread::*&&)(),
>> > libcamera::Thread*&&)+0x6a (/usr/include/c++/13/bits/invoke.h:74)
>> > std::__invoke_result<void (libcamera::Thread::*)(), libcamera::Thread*>::type std::__invoke<void
>> > (libcamera::Thread::*)(), libcamera::Thread*>(void (libcamera::Thread::*&&)(), libcamera::Thread*&&)+0x3b
>> > (/usr/include/c++/13/bits/invoke.h:97)
>> > void std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*>
>> >>::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>)+0x47 (/usr/include/c++/13/bits/std_thread.h:292)
>> > std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> >::operator()()+0x1c
>> > (/usr/include/c++/13/bits/std_thread.h:299)
>> > std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(),
>> > libcamera::Thread*> > >::_M_run()+0x20 (/usr/include/c++/13/bits/std_thread.h:244)
>> > execute_native_thread_routine+0x14
>> > (/build/gcc-14-ig5ci0/gcc-14-14.2.0/build/x86_64-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:93)
>> > start_thread+0x384 (./nptl/pthread_create.c:447)
>> > __clone3+0x2c (../sysdeps/unix/sysv/linux/x86_64/clone3.S:80)
>> > Aborted (core dumped)
>> >
>> >> ---
>> >> .../internal/software_isp/software_isp.h | 3 ++-
>> >> src/libcamera/pipeline/simple/simple.cpp | 16 +---------------
>> >> 2 files changed, 3 insertions(+), 16 deletions(-)
>> >>
>> >> diff --git a/include/libcamera/internal/software_isp/software_isp.h b/include/libcamera/internal/software_isp/software_isp.h
>> >> index d51b03fd..440a296d 100644
>> >> --- a/include/libcamera/internal/software_isp/software_isp.h
>> >> +++ b/include/libcamera/internal/software_isp/software_isp.h
>> >> @@ -18,6 +18,7 @@
>> >>
>> >> #include <libcamera/base/class.h>
>> >> #include <libcamera/base/log.h>
>> >> +#include <libcamera/base/object.h>
>> >> #include <libcamera/base/signal.h>
>> >> #include <libcamera/base/thread.h>
>> >>
>> >> @@ -43,7 +44,7 @@ struct StreamConfiguration;
>> >>
>> >> LOG_DECLARE_CATEGORY(SoftwareIsp)
>> >>
>> >> -class SoftwareIsp
>> >> +class SoftwareIsp : public Object
>> >> {
>> >> public:
>> >> SoftwareIsp(PipelineHandler *pipe, const CameraSensor *sensor,
>> >> diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp
>> >> index 8ac24e6e..6e039bf3 100644
>> >> --- a/src/libcamera/pipeline/simple/simple.cpp
>> >> +++ b/src/libcamera/pipeline/simple/simple.cpp
>> >> @@ -537,21 +537,7 @@ int SimpleCameraData::init()
>> >> << "Failed to create software ISP, disabling software debayering";
>> >> swIsp_.reset();
>> >> } else {
>> >> - /*
>> >> - * The inputBufferReady signal is emitted from the soft ISP thread,
>> >> - * and needs to be handled in the pipeline handler thread. Signals
>> >> - * implement queued delivery, but this works transparently only if
>> >> - * the receiver is bound to the target thread. As the
>> >> - * SimpleCameraData class doesn't inherit from the Object class, it
>> >> - * is not bound to any thread, and the signal would be delivered
>> >> - * synchronously. Instead, connect the signal to a lambda function
>> >> - * bound explicitly to the pipe, which is bound to the pipeline
>> >> - * handler thread. The function then simply forwards the call to
>> >> - * conversionInputDone().
>> >> - */
>> >> - swIsp_->inputBufferReady.connect(pipe, [this](FrameBuffer *buffer) {
>> >> - this->conversionInputDone(buffer);
>> >> - });
>> >> + swIsp_->inputBufferReady.connect(this, &SimpleCameraData::conversionInputDone);
>> >> swIsp_->outputBufferReady.connect(this, &SimpleCameraData::conversionOutputDone);
>> >> swIsp_->ispStatsReady.connect(this, &SimpleCameraData::ispStatsReady);
>> >> swIsp_->setSensorControls.connect(this, &SimpleCameraData::setSensorControls);
More information about the libcamera-devel
mailing list