[libcamera-devel] [PATCH v3] android: camera_device: Fix crash in calling CameraDevice::close()
Jacopo Mondi
jacopo at jmondi.org
Thu Sep 2 11:56:31 CEST 2021
Hello,
On Thu, Sep 02, 2021 at 11:51:02AM +0300, Laurent Pinchart wrote:
> Hi Jacopo,
>
> On Thu, Sep 02, 2021 at 10:25:59AM +0200, Jacopo Mondi wrote:
> > On Thu, Sep 02, 2021 at 10:40:39AM +0300, Laurent Pinchart wrote:
> > > On Thu, Sep 02, 2021 at 08:59:34AM +0200, Jacopo Mondi wrote:
> > > > On Thu, Sep 02, 2021 at 06:11:40AM +0300, Laurent Pinchart wrote:
> > > > > On Wed, Sep 01, 2021 at 05:47:08PM +0200, Jacopo Mondi wrote:
> > > > > > On Wed, Sep 01, 2021 at 03:37:39AM +0900, Hirokazu Honda wrote:
> > > > > > > The problem is happening because we seem to add a CameraStream
> > > > > > > associated buffer(depending on the CameraStream::Type) to the Request,
> > > > > > > in CameraDevice::processCaptureRequest().
> > > > > > >
> > > > > > > However, when the camera stops, all the current buffers are marked with
> > > > > > > FrameMetadata::FrameCancelled and proceed to completion. But the buffer
> > > > > > > associated with the CameraStream (that was previously added to the
> > > > > > > request) has now been cleared out with a part of streams_.clear(), even
> > > > > > > before the camera stop() has been invoked. Any access to those request
> > > > > > > buffers after they have been cleared, shall result in a crash.
> > > > > >
> > > > > > I recall it was painful at the time when we had to deal with flush()
> > > > > > to get the right ordering, and I had done the same. However, since
> > > > > > stop() is only called by configureStreams() I removed the
> > > > > > streams_.clear() there and had issues, so I gave up.
> > > > >
> > > > > Do you mean you have tried removing streams_.clear() from
> > > > > configureStreams() while testing this patch and ran into issues, or that
> > > > > you tried that when implementing support for flush() ?
> > > >
> > > > I meant when developing flush().
> > > > I haven't tried to do the same on top of this patch, but it might be
> > > > good to do so ?
> > >
> > > I think so, because if it works, it's a good cleanup (possibly as part
> > > of this patch actually, not on top), and if it doesn't, there's
> > > something fishy :-) Would you be able to test it ?
> >
> > With this applied
> >
> > --- a/src/android/camera_device.cpp
> > +++ b/src/android/camera_device.cpp
> > @@ -544,6 +544,7 @@ int CameraDevice::configureStreams(camera3_stream_configuration_t *stream_list)
> > LOG(HAL, Error) << "No streams in configuration";
> > return -EINVAL;
> > }
> > + streams_.reserve(stream_list->num_streams);
> >
> > #if defined(OS_CHROMEOS)
> > if (!validateCropRotate(*stream_list))
> > @@ -560,14 +561,6 @@ int CameraDevice::configureStreams(camera3_stream_configuration_t *stream_list)
> > return -EINVAL;
> > }
> >
> > - /*
> > - * Clear and remove any existing configuration from previous calls, and
> > - * ensure the required entries are available without further
> > - * reallocation.
> > - */
> > - streams_.clear();
> > - streams_.reserve(stream_list->num_streams);
> > -
> > std::vector<Camera3StreamConfig> streamConfigs;
> > streamConfigs.reserve(stream_list->num_streams);
> >
> > I can easily get a segfault running CTS.
> > I analyzed the coredump and the stack trace doesn't help much
> >
> > (soraka-libcamera-gdb) bt
> > #0 0x000000043983e000 in ?? ()
> > #1 0x00007eee396ee5b5 in base::OnceCallback<void ()>::Run() && (this=0x7eee1c009310) at ../../../libchrome-0.0.1/platform2/libchrome/base/callback.h:102
> > #2 base::TaskAnnotator::RunTask (this=<optimized out>, trace_event_name=<optimized out>, pending_task=0x7eee1c009310) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/common/task_annotator.cc:163
> > #3 0x00007eee396ff465 in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl (this=this at entry=0x7eee2c035ed0, continuation_lazy_now=continuation_lazy_now at entry=0x7eee34bd0910)
> > at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:332
> > #4 0x00007eee396ff27c in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWork (this=0x7eee2c035ed0)
> > at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:252
> > #5 0x00007eee396a322d in base::MessagePumpDefault::Run (this=0x7eee1c00cc70, delegate=<optimized out>) at ../../../libchrome-0.0.1/platform2/libchrome/base/message_loop/message_pump_default.cc:39
> > #6 0x00007eee396ff730 in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run (this=<optimized out>, application_tasks_allowed=<optimized out>, timeout=...)
> > at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:446
> > #7 0x00007eee396d0325 in base::RunLoop::Run (this=0x7eee34bd0aa0) at ../../../libchrome-0.0.1/platform2/libchrome/base/run_loop.cc:124
> > #8 0x00007eee3971d04e in base::Thread::ThreadMain (this=0x7eee2c034090) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/thread.cc:382
> > #9 0x00007eee39718f3d in base::(anonymous namespace)::ThreadFunc (params=0x7eee2c033c50) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/platform_thread_posix.cc:87
> > #10 0x00007eee395adfbf in start_thread (arg=0x7eee34bd1640) at pthread_create.c:463
> > #11 0x00007eee393f735f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> > (soraka-libcamera-gdb) quit
> >
> > The last ?? in #0 makes me think I might not have symbols for the
> > camera service, which I cannot easily recompile due to the cros-debug
> > USE flag issue. I need to rebuild all the packages and I've now kicked
> > it.
> >
> > One day I'll learn to shut up and ack patches without asking for
> > reasons to fall into rabbit holes like this one.
>
> Thanks for the quick test.
I can confirm removing streams_.clear() from configureStreams() easily
triggers a segfault when running CTS. That's indeed the same issue I
was facing when developing flush().
I really cannot figure out why it happens looking at the code. I got
a few more stack traces with debuga symbols in the camera service from the
coredumps but they do not seem that helpful at a first look.
[Current thread is 1 (Thread 0x7c372ffff640 (LWP 8265))]
(soraka-libcamera-gdb) bt
#0 std::__1::__cxx_atomic_fetch_sub<int> (__a=0x400000000, __delta=1, __order=std::__1::memory_order_acq_rel) at /usr/bin/../include/c++/v1/atomic:1072
#1 std::__1::__atomic_base<int, true>::fetch_sub (this=0x400000000, __op=1, __m=std::__1::memory_order_acq_rel) at /usr/bin/../include/c++/v1/atomic:1719
#2 base::AtomicRefCount::Decrement (this=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/atomic_ref_count.h:39
#3 base::subtle::RefCountedThreadSafeBase::ReleaseImpl (this=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/ref_counted.h:218
#4 base::subtle::RefCountedThreadSafeBase::Release (this=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/ref_counted.h:170
#5 base::RefCountedThreadSafe<base::internal::BindStateBase, base::internal::BindStateBaseRefCountTraits>::Release (this=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/ref_counted.h:398
#6 scoped_refptr<base::internal::BindStateBase>::Release (ptr=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/scoped_refptr.h:322
#7 scoped_refptr<base::internal::BindStateBase>::~scoped_refptr (this=<optimized out>) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/scoped_refptr.h:224
#8 base::internal::CallbackBase::~CallbackBase (this=<optimized out>) at ../../../libchrome-0.0.1/platform2/libchrome/base/callback_internal.cc:85
#9 0x00007c37561eb82b in base::sequence_manager::internal::AtomicFlagSet::Group::~Group (this=0x7c373000ef80) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/atomic_flag_set.cc:132
#10 std::__1::default_delete<base::sequence_manager::internal::AtomicFlagSet::Group>::operator() (this=<optimized out>, __ptr=0x7c373000ef80) at /usr/bin/../include/c++/v1/memory:1335
#11 0x00007c37561eaede in base::sequence_manager::internal::AtomicFlagSet::RemoveFromAllocList (group=0x7c373000ef80, this=<optimized out>) at /usr/bin/../include/c++/v1/memory:1595
#12 base::sequence_manager::internal::AtomicFlagSet::AtomicFlag::ReleaseAtomicFlag (this=0x7c3730013b90) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/atomic_flag_set.cc:77
#13 0x00007c37561f3cec in base::sequence_manager::internal::TaskQueueImpl::UnregisterTaskQueue (this=0x7c37300139e0) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/task_queue_impl.cc:201
#14 0x00007c37561ec1df in base::sequence_manager::internal::SequenceManagerImpl::~SequenceManagerImpl (this=0x7c37300008d0) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/sequence_manager_impl.cc:234
#15 0x00007c37561ec52e in base::sequence_manager::internal::SequenceManagerImpl::~SequenceManagerImpl (this=0x7c37300008d0) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/sequence_manager_impl.cc:212
#16 0x00007c3756218317 in std::__1::default_delete<base::sequence_manager::internal::SequenceManagerImpl>::operator() (this=0x7c3730014848, __ptr=0x400000000) at /usr/bin/../include/c++/v1/memory:1335
#17 std::__1::unique_ptr<base::sequence_manager::internal::SequenceManagerImpl, std::__1::default_delete<base::sequence_manager::internal::SequenceManagerImpl> >::reset (this=0x7c3730014848, __p=0x0)
at /usr/bin/../include/c++/v1/memory:1596
#18 std::__1::unique_ptr<base::sequence_manager::internal::SequenceManagerImpl, std::__1::default_delete<base::sequence_manager::internal::SequenceManagerImpl> >::~unique_ptr (this=0x7c3730014848)
at /usr/bin/../include/c++/v1/memory:1550
#19 base::(anonymous namespace)::SequenceManagerThreadDelegate::~SequenceManagerThreadDelegate (this=0x7c3730014840) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/thread.cc:67
#20 base::(anonymous namespace)::SequenceManagerThreadDelegate::~SequenceManagerThreadDelegate (this=0x7c3730014840) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/thread.cc:67
#21 0x00007c3756218098 in std::__1::default_delete<base::Thread::Delegate>::operator() (this=0x7c3744039bc0, __ptr=0x400000000) at /usr/bin/../include/c++/v1/memory:1335
#22 std::__1::unique_ptr<base::Thread::Delegate, std::__1::default_delete<base::Thread::Delegate> >::reset (this=0x7c3744039bc0, __p=0x0) at /usr/bin/../include/c++/v1/memory:1596
#23 base::Thread::ThreadMain (this=0x7c3744039b40) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/thread.cc:400
#24 0x00007c3756213f3d in base::(anonymous namespace)::ThreadFunc (params=0x7c3730011840) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/platform_thread_posix.cc:87
#25 0x00007c37560a8fbf in start_thread (arg=0x7c372ffff640) at pthread_create.c:463
#26 0x00007c3755ef235f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Hiro, is this something you can look at ?
>
> > > > > > But I have this patch gone through cros_camera_test and also CTS seems
> > > > > > fine. Also looking at the code it seems an harmless change so I wonder
> > > > > > what was wrong at the time.
> > > > > >
> > > > > > As far as I can tell:
> > > > > >
> > > > > > Tested-by: Jacopo Mondi <jacopo at jmondi.org>
> > > > > > Acked-by: Jacopo Mondi <jacopo at jmondi.org>
> > > > > >
> > > > > > > Signed-off-by: Hirokazu Honda <hiroh at chromium.org>
> > > > > > > ---
> > > > > > > src/android/camera_device.cpp | 4 ++--
> > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> > > > > > > index 8ca76719..fda77db4 100644
> > > > > > > --- a/src/android/camera_device.cpp
> > > > > > > +++ b/src/android/camera_device.cpp
> > > > > > > @@ -423,8 +423,6 @@ int CameraDevice::open(const hw_module_t *hardwareModule)
> > > > > > >
> > > > > > > void CameraDevice::close()
> > > > > > > {
> > > > > > > - streams_.clear();
> > > > > > > -
> > > > > > > stop();
> > > > > > >
> > > > > > > camera_->release();
> > > > > > > @@ -457,6 +455,8 @@ void CameraDevice::stop()
> > > > > > > camera_->stop();
> > > > > > >
> > > > > > > descriptors_.clear();
> > > > > > > + streams_.clear();
> > > > > > > +
> > > > > > > state_ = State::Stopped;
> > > > > > > }
> > > > > > >
>
> --
> Regards,
>
> Laurent Pinchart
More information about the libcamera-devel
mailing list