[libcamera-devel] [PATCH v3] android: camera_device: Fix crash in calling CameraDevice::close()

Jacopo Mondi jacopo at jmondi.org
Fri Sep 3 19:06:25 CEST 2021


On Fri, Sep 03, 2021 at 05:26:41PM +0900, Hirokazu Honda wrote:
> Hi Jacopo,
>
> On Thu, Sep 2, 2021 at 6:55 PM Jacopo Mondi <jacopo at jmondi.org> wrote:
> >
> > Hello,
> >
> > On Thu, Sep 02, 2021 at 11:51:02AM +0300, Laurent Pinchart wrote:
> > > Hi Jacopo,
> > >
> > > On Thu, Sep 02, 2021 at 10:25:59AM +0200, Jacopo Mondi wrote:
> > > > On Thu, Sep 02, 2021 at 10:40:39AM +0300, Laurent Pinchart wrote:
> > > > > On Thu, Sep 02, 2021 at 08:59:34AM +0200, Jacopo Mondi wrote:
> > > > > > On Thu, Sep 02, 2021 at 06:11:40AM +0300, Laurent Pinchart wrote:
> > > > > > > On Wed, Sep 01, 2021 at 05:47:08PM +0200, Jacopo Mondi wrote:
> > > > > > > > On Wed, Sep 01, 2021 at 03:37:39AM +0900, Hirokazu Honda wrote:
> > > > > > > > > The problem is happening because we seem to add a CameraStream
> > > > > > > > > associated buffer(depending on the CameraStream::Type) to the Request,
> > > > > > > > > in CameraDevice::processCaptureRequest().
> > > > > > > > >
> > > > > > > > > However, when the camera stops, all the current buffers are marked with
> > > > > > > > > FrameMetadata::FrameCancelled and proceed to completion. But the buffer
> > > > > > > > > associated with the CameraStream (that was previously added to the
> > > > > > > > > request) has now been cleared out with a part of streams_.clear(), even
> > > > > > > > > before the camera stop() has been invoked. Any access to those request
> > > > > > > > > buffers after they have been cleared, shall result in a crash.
> > > > > > > >
> > > > > > > > I recall it was painful at the time when we had to deal with flush()
> > > > > > > > to get the right ordering, and I had done the same. However, since
> > > > > > > > stop() is only called by configureStreams() I removed the
> > > > > > > > streams_.clear() there and had issues, so I gave up.
> > > > > > >
> > > > > > > Do you mean you have tried removing streams_.clear() from
> > > > > > > configureStreams() while testing this patch and ran into issues, or that
> > > > > > > you tried that when implementing support for flush() ?
> > > > > >
> > > > > > I meant when developing flush().
> > > > > > I haven't tried to do the same on top of this patch, but it might be
> > > > > > good to do so ?
> > > > >
> > > > > I think so, because if it works, it's a good cleanup (possibly as part
> > > > > of this patch actually, not on top), and if it doesn't, there's
> > > > > something fishy :-) Would you be able to test it ?
> > > >
> > > > With this applied
> > > >
> > > > --- a/src/android/camera_device.cpp
> > > > +++ b/src/android/camera_device.cpp
> > > > @@ -544,6 +544,7 @@ int CameraDevice::configureStreams(camera3_stream_configuration_t *stream_list)
> > > >                 LOG(HAL, Error) << "No streams in configuration";
> > > >                 return -EINVAL;
> > > >         }
> > > > +       streams_.reserve(stream_list->num_streams);
> > > >
> > > >  #if defined(OS_CHROMEOS)
> > > >         if (!validateCropRotate(*stream_list))
> > > > @@ -560,14 +561,6 @@ int CameraDevice::configureStreams(camera3_stream_configuration_t *stream_list)
> > > >                 return -EINVAL;
> > > >         }
> > > >
> > > > -       /*
> > > > -        * Clear and remove any existing configuration from previous calls, and
> > > > -        * ensure the required entries are available without further
> > > > -        * reallocation.
> > > > -        */
> > > > -       streams_.clear();
> > > > -       streams_.reserve(stream_list->num_streams);
> > > > -
> > > >         std::vector<Camera3StreamConfig> streamConfigs;
> > > >         streamConfigs.reserve(stream_list->num_streams);
> > > >
> > > >  I can easily get a segfault running CTS.
> > > >  I analyzed the coredump and the stack trace doesn't help much
> > > >
> > > > (soraka-libcamera-gdb) bt
> > > > #0  0x000000043983e000 in ?? ()
> > > > #1  0x00007eee396ee5b5 in base::OnceCallback<void ()>::Run() && (this=0x7eee1c009310) at ../../../libchrome-0.0.1/platform2/libchrome/base/callback.h:102
> > > > #2  base::TaskAnnotator::RunTask (this=<optimized out>, trace_event_name=<optimized out>, pending_task=0x7eee1c009310) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/common/task_annotator.cc:163
> > > > #3  0x00007eee396ff465 in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl (this=this at entry=0x7eee2c035ed0, continuation_lazy_now=continuation_lazy_now at entry=0x7eee34bd0910)
> > > >     at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:332
> > > > #4  0x00007eee396ff27c in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWork (this=0x7eee2c035ed0)
> > > >     at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:252
> > > > #5  0x00007eee396a322d in base::MessagePumpDefault::Run (this=0x7eee1c00cc70, delegate=<optimized out>) at ../../../libchrome-0.0.1/platform2/libchrome/base/message_loop/message_pump_default.cc:39
> > > > #6  0x00007eee396ff730 in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run (this=<optimized out>, application_tasks_allowed=<optimized out>, timeout=...)
> > > >     at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:446
> > > > #7  0x00007eee396d0325 in base::RunLoop::Run (this=0x7eee34bd0aa0) at ../../../libchrome-0.0.1/platform2/libchrome/base/run_loop.cc:124
> > > > #8  0x00007eee3971d04e in base::Thread::ThreadMain (this=0x7eee2c034090) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/thread.cc:382
> > > > #9  0x00007eee39718f3d in base::(anonymous namespace)::ThreadFunc (params=0x7eee2c033c50) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/platform_thread_posix.cc:87
> > > > #10 0x00007eee395adfbf in start_thread (arg=0x7eee34bd1640) at pthread_create.c:463
> > > > #11 0x00007eee393f735f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> > > > (soraka-libcamera-gdb) quit
> > > >
> > > > The last ?? in #0 makes me think I might not have symbols for the
> > > > camera service, which I cannot easily recompile due to the cros-debug
> > > > USE flag issue. I need to rebuild all the packages and I've now kicked
> > > > it.
> > > >
> > > > One day I'll learn to shut up and ack patches without asking for
> > > > reasons to fall into rabbit holes like this one.
> > >
> > > Thanks for the quick test.
> >
> > I can confirm removing streams_.clear() from configureStreams() easily
> > triggers a segfault when running CTS. That's indeed the same issue I
> > was facing when developing flush().
> >
> > I really cannot figure out why it happens looking at the code. I got
> > a few more stack traces with debuga symbols in the camera service from the
> > coredumps but they do not seem that helpful at a first look.
> >
> > [Current thread is 1 (Thread 0x7c372ffff640 (LWP 8265))]
> > (soraka-libcamera-gdb) bt
> > #0  std::__1::__cxx_atomic_fetch_sub<int> (__a=0x400000000, __delta=1, __order=std::__1::memory_order_acq_rel) at /usr/bin/../include/c++/v1/atomic:1072
> > #1  std::__1::__atomic_base<int, true>::fetch_sub (this=0x400000000, __op=1, __m=std::__1::memory_order_acq_rel) at /usr/bin/../include/c++/v1/atomic:1719
> > #2  base::AtomicRefCount::Decrement (this=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/atomic_ref_count.h:39
> > #3  base::subtle::RefCountedThreadSafeBase::ReleaseImpl (this=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/ref_counted.h:218
> > #4  base::subtle::RefCountedThreadSafeBase::Release (this=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/ref_counted.h:170
> > #5  base::RefCountedThreadSafe<base::internal::BindStateBase, base::internal::BindStateBaseRefCountTraits>::Release (this=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/ref_counted.h:398
> > #6  scoped_refptr<base::internal::BindStateBase>::Release (ptr=0x400000000) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/scoped_refptr.h:322
> > #7  scoped_refptr<base::internal::BindStateBase>::~scoped_refptr (this=<optimized out>) at ../../../libchrome-0.0.1/platform2/libchrome/base/memory/scoped_refptr.h:224
> > #8  base::internal::CallbackBase::~CallbackBase (this=<optimized out>) at ../../../libchrome-0.0.1/platform2/libchrome/base/callback_internal.cc:85
> > #9  0x00007c37561eb82b in base::sequence_manager::internal::AtomicFlagSet::Group::~Group (this=0x7c373000ef80) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/atomic_flag_set.cc:132
> > #10 std::__1::default_delete<base::sequence_manager::internal::AtomicFlagSet::Group>::operator() (this=<optimized out>, __ptr=0x7c373000ef80) at /usr/bin/../include/c++/v1/memory:1335
> > #11 0x00007c37561eaede in base::sequence_manager::internal::AtomicFlagSet::RemoveFromAllocList (group=0x7c373000ef80, this=<optimized out>) at /usr/bin/../include/c++/v1/memory:1595
> > #12 base::sequence_manager::internal::AtomicFlagSet::AtomicFlag::ReleaseAtomicFlag (this=0x7c3730013b90) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/atomic_flag_set.cc:77
> > #13 0x00007c37561f3cec in base::sequence_manager::internal::TaskQueueImpl::UnregisterTaskQueue (this=0x7c37300139e0) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/task_queue_impl.cc:201
> > #14 0x00007c37561ec1df in base::sequence_manager::internal::SequenceManagerImpl::~SequenceManagerImpl (this=0x7c37300008d0) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/sequence_manager_impl.cc:234
> > #15 0x00007c37561ec52e in base::sequence_manager::internal::SequenceManagerImpl::~SequenceManagerImpl (this=0x7c37300008d0) at ../../../libchrome-0.0.1/platform2/libchrome/base/task/sequence_manager/sequence_manager_impl.cc:212
> > #16 0x00007c3756218317 in std::__1::default_delete<base::sequence_manager::internal::SequenceManagerImpl>::operator() (this=0x7c3730014848, __ptr=0x400000000) at /usr/bin/../include/c++/v1/memory:1335
> > #17 std::__1::unique_ptr<base::sequence_manager::internal::SequenceManagerImpl, std::__1::default_delete<base::sequence_manager::internal::SequenceManagerImpl> >::reset (this=0x7c3730014848, __p=0x0)
> >     at /usr/bin/../include/c++/v1/memory:1596
> > #18 std::__1::unique_ptr<base::sequence_manager::internal::SequenceManagerImpl, std::__1::default_delete<base::sequence_manager::internal::SequenceManagerImpl> >::~unique_ptr (this=0x7c3730014848)
> >     at /usr/bin/../include/c++/v1/memory:1550
> > #19 base::(anonymous namespace)::SequenceManagerThreadDelegate::~SequenceManagerThreadDelegate (this=0x7c3730014840) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/thread.cc:67
> > #20 base::(anonymous namespace)::SequenceManagerThreadDelegate::~SequenceManagerThreadDelegate (this=0x7c3730014840) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/thread.cc:67
> > #21 0x00007c3756218098 in std::__1::default_delete<base::Thread::Delegate>::operator() (this=0x7c3744039bc0, __ptr=0x400000000) at /usr/bin/../include/c++/v1/memory:1335
> > #22 std::__1::unique_ptr<base::Thread::Delegate, std::__1::default_delete<base::Thread::Delegate> >::reset (this=0x7c3744039bc0, __p=0x0) at /usr/bin/../include/c++/v1/memory:1596
> > #23 base::Thread::ThreadMain (this=0x7c3744039b40) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/thread.cc:400
> > #24 0x00007c3756213f3d in base::(anonymous namespace)::ThreadFunc (params=0x7c3730011840) at ../../../libchrome-0.0.1/platform2/libchrome/base/threading/platform_thread_posix.cc:87
> > #25 0x00007c37560a8fbf in start_thread (arg=0x7c372ffff640) at pthread_create.c:463
> > #26 0x00007c3755ef235f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> >
> > >
> > > Hiro, is this something you can look at ?
>
> I guess this is because streams_ is not cleared when state_ is Stopped.
> Could you try again by modifying as follows?
>
> --- a/src/android/camera_device.cpp
> +++ b/src/android/camera_device.cpp
> @@ -448,8 +448,10 @@ void CameraDevice::flush()
>  void CameraDevice::stop()
>  {
>         MutexLocker stateLock(stateMutex_);
> -       if (state_ == State::Stopped)
> +       if (state_ == State::Stopped) {
> +               streams_.clear();
>                 return;
> +       }

Oh I see your reasoning!
It seems right, if we have a flush, it will stop the camera, and the
next call to stop() won't clear streams_.

However, I've not been able to run a single meaningful test the whole
day due to spurious crashes in the FrameBuffer class.

I'll wait for the dust to settle and re-test this change.
>
> Thanks,
> -Hiro
> > >
> > > > > > > > But I have this patch gone through cros_camera_test and also CTS seems
> > > > > > > > fine. Also looking at the code it seems an harmless change so I wonder
> > > > > > > > what was wrong at the time.
> > > > > > > >
> > > > > > > > As far as I can tell:
> > > > > > > >
> > > > > > > > Tested-by: Jacopo Mondi <jacopo at jmondi.org>
> > > > > > > > Acked-by: Jacopo Mondi <jacopo at jmondi.org>
> > > > > > > >
> > > > > > > > > Signed-off-by: Hirokazu Honda <hiroh at chromium.org>
> > > > > > > > > ---
> > > > > > > > >  src/android/camera_device.cpp | 4 ++--
> > > > > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> > > > > > > > > index 8ca76719..fda77db4 100644
> > > > > > > > > --- a/src/android/camera_device.cpp
> > > > > > > > > +++ b/src/android/camera_device.cpp
> > > > > > > > > @@ -423,8 +423,6 @@ int CameraDevice::open(const hw_module_t *hardwareModule)
> > > > > > > > >
> > > > > > > > >  void CameraDevice::close()
> > > > > > > > >  {
> > > > > > > > > - streams_.clear();
> > > > > > > > > -
> > > > > > > > >   stop();
> > > > > > > > >
> > > > > > > > >   camera_->release();
> > > > > > > > > @@ -457,6 +455,8 @@ void CameraDevice::stop()
> > > > > > > > >   camera_->stop();
> > > > > > > > >
> > > > > > > > >   descriptors_.clear();
> > > > > > > > > + streams_.clear();
> > > > > > > > > +
> > > > > > > > >   state_ = State::Stopped;
> > > > > > > > >  }
> > > > > > > > >
> > >
> > > --
> > > Regards,
> > >
> > > Laurent Pinchart


More information about the libcamera-devel mailing list