[PATCH v5 1/1] libcamera: add method to set thread affinity

Cheng-Hao Yang chenghaoyang at chromium.org
Thu Oct 17 07:59:58 CEST 2024


Hi Laurent,

I'll update a new version when all changes are applied.

On Thu, Oct 17, 2024 at 6:08 AM Laurent Pinchart
<laurent.pinchart at ideasonboard.com> wrote:
>
> Hi Harvey, Han-Lin,
>
> Thank you for the patch. This is taking shape.
>
> On Wed, Oct 16, 2024 at 08:43:44AM +0000, Harvey Yang wrote:
> > From: Han-Lin Chen <hanlinchen at chromium.org>
> >
> > Add method to set thread affinity to Thread class.
> >
> > Signed-off-by: Han-Lin Chen <hanlinchen at chromium.org>
> > Co-developed-by: Harvey Yang <chenghaoyang at chromium.org>
> > Signed-off-by: Harvey Yang <chenghaoyang at chromium.org>
> > ---
> >  include/libcamera/base/thread.h |  5 ++++
> >  src/libcamera/base/thread.cpp   | 46 +++++++++++++++++++++++++++++++++
> >  2 files changed, 51 insertions(+)
> >
> > diff --git a/include/libcamera/base/thread.h b/include/libcamera/base/thread.h
> > index 4f33de63d..3209d4f7c 100644
> > --- a/include/libcamera/base/thread.h
> > +++ b/include/libcamera/base/thread.h
> > @@ -15,6 +15,7 @@
> >
> >  #include <libcamera/base/message.h>
> >  #include <libcamera/base/signal.h>
> > +#include <libcamera/base/span.h>
> >  #include <libcamera/base/utils.h>
> >
> >  namespace libcamera {
> > @@ -35,6 +36,8 @@ public:
> >       void exit(int code = 0);
> >       bool wait(utils::duration duration = utils::duration::max());
> >
> > +     int setThreadAffinity(const Span<const unsigned int> &cpus);
> > +
> >       bool isRunning();
> >
> >       Signal<> finished;
> > @@ -54,6 +57,8 @@ private:
> >       void startThread();
> >       void finishThread();
> >
> > +     void setThreadAffinityInternal();
> > +
> >       void postMessage(std::unique_ptr<Message> msg, Object *receiver);
> >       void removeMessages(Object *receiver);
> >
> > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > index 8735670b8..5dc787dba 100644
> > --- a/src/libcamera/base/thread.cpp
> > +++ b/src/libcamera/base/thread.cpp
> > @@ -9,6 +9,7 @@
> >
> >  #include <atomic>
> >  #include <list>
> > +#include <optional>
> >  #include <sys/syscall.h>
> >  #include <sys/types.h>
> >  #include <unistd.h>
> > @@ -128,6 +129,8 @@ private:
> >       int exitCode_;
> >
> >       MessageQueue messages_;
> > +
> > +     std::optional<cpu_set_t> cpuset_ = std::nullopt;
>
> Is there a need to initialize this to std::nullopt, as that's what the
> default constructor does ?

Removed.

>
> >  };
> >
> >  /**
> > @@ -254,6 +257,8 @@ void Thread::start()
> >       data_->exit_.store(false, std::memory_order_relaxed);
> >
> >       thread_ = std::thread(&Thread::startThread, this);
> > +
> > +     setThreadAffinityInternal();
> >  }
> >
> >  void Thread::startThread()
> > @@ -410,6 +415,47 @@ bool Thread::wait(utils::duration duration)
> >       return hasFinished;
> >  }
> >
> > +/**
> > + * \brief Set the CPU affinity mask of the thread
> > + * \param[in] cpus The list of CPU indices that the thread is set affinity to
> > + *
> > + * The CPU indices should be within [0, std::thread::hardware_concurrency()).
> > + * If any index is invalid, no cpu indices will be set.
>
>  * If any index is invalid, this function won't modify the thread affinity and
>  * will return an error.

Done.

>
> > + *
> > + * \return 0 if all indices are valid, -EINVAL otherwise
> > + */
> > +int Thread::setThreadAffinity(const Span<const unsigned int> &cpus)
> > +{
> > +     const unsigned int numCpus = std::thread::hardware_concurrency();
>
> https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency
> says:
>
>   Returns the number of concurrent threads supported by the
>   implementation. The value should be considered only a hint.
>
>   [snip]
>
>   Return value
>
>   Number of concurrent threads supported. If the value is not well
>   defined or not computable, returns 0.
>
> libcxx seems to have the following implementation:
>
> unsigned thread::hardware_concurrency() noexcept {
> #if defined(_SC_NPROCESSORS_ONLN)
>   long result = sysconf(_SC_NPROCESSORS_ONLN);
>   // sysconf returns -1 if the name is invalid, the option does not exist or
>   // does not have a definite limit.
>   // if sysconf returns some other negative number, we have no idea
>   // what is going on. Default to something safe.
>   if (result < 0)
>     return 0;
>   return static_cast<unsigned>(result);
> #elif defined(_LIBCPP_WIN32API)
>   SYSTEM_INFO info;
>   GetSystemInfo(&info);
>   return info.dwNumberOfProcessors;
> #else // defined(CTL_HW) && defined(HW_NCPU)
>   // TODO: grovel through /proc or check cpuid on x86 and similar
>   // instructions on other architectures.
> #  if defined(_LIBCPP_WARNING)
>   _LIBCPP_WARNING("hardware_concurrency not yet implemented")
> #  else
> #    warning hardware_concurrency not yet implemented
> #  endif
>   return 0; // Means not computable [thread.thread.static]
> #endif // defined(CTL_HW) && defined(HW_NCPU)
> }
>
> while libstdc++ uses
>
> #if defined(_GLIBCXX_USE_GET_NPROCS)
> # include <sys/sysinfo.h>
> # define _GLIBCXX_NPROCS get_nprocs()
> #elif defined(_GLIBCXX_USE_PTHREADS_NUM_PROCESSORS_NP)
> # define _GLIBCXX_NPROCS pthread_num_processors_np()
> #elif defined(_GLIBCXX_USE_SYSCTL_HW_NCPU)
> # include <stddef.h>
> # include <sys/sysctl.h>
> static inline int get_nprocs()
> {
>  int count;
>  size_t size = sizeof(count);
>  int mib[] = { CTL_HW, HW_NCPU };
>  if (!sysctl(mib, 2, &count, &size, NULL, 0))
>    return count;
>  return 0;
> }
> # define _GLIBCXX_NPROCS get_nprocs()
> #elif defined(_GLIBCXX_USE_SC_NPROCESSORS_ONLN)
> # include <unistd.h>
> # define _GLIBCXX_NPROCS sysconf(_SC_NPROCESSORS_ONLN)
> #elif defined(_GLIBCXX_USE_SC_NPROC_ONLN)
> # include <unistd.h>
> # define _GLIBCXX_NPROCS sysconf(_SC_NPROC_ONLN)
> #else
> # define _GLIBCXX_NPROCS 0
> #endif
>
>   unsigned int
>   thread::hardware_concurrency() noexcept
>   {
>     int __n = _GLIBCXX_NPROCS;
>     if (__n < 0)
>       __n = 0;
>     return __n;
>   }
>
> get_nprocs() and sysconf(_SC_NPROCESSORS_ONLN) are identical (the latter
> is a wrapper around the former). So it seems that, while the C++ API
> doesn't guarantee that thread::hardware_concurrency() will provide an
> accurate value, the implementations we have will do the right thing.
> They're however a bit expensive, as they open a file in /sys, but that's
> probably the best we can do.
>
> > +
> > +     MutexLocker locker(data_->mutex_);
> > +     data_->cpuset_ = cpu_set_t();
> > +     CPU_ZERO(&data_->cpuset_.value());
> > +
> > +     for (const unsigned int &cpu : cpus) {
> > +             if (cpu >= numCpus) {
> > +                     LOG(Thread, Error) << "Ignore an invalid cpu index: " << cpu;
>
> It's not ignored anymore
>
>                         LOG(Thread, Error)
>                                 << "Invalid CPU " << cpu << " for thread affinity";
>

Right, updated.

> > +                     return -EINVAL;
> > +             }
> > +
> > +             CPU_SET(cpu, &data_->cpuset_.value());
> > +     }
> > +
> > +     if (data_->running_)
> > +             setThreadAffinityInternal();
> > +
> > +     return 0;
> > +}
> > +
> > +void Thread::setThreadAffinityInternal()
> > +{
> > +     if (!data_->cpuset_)
> > +             return;
> > +
> > +     const cpu_set_t &cpuset = data_->cpuset_.value();
> > +     pthread_setaffinity_np(thread_.native_handle(), sizeof(cpuset), &cpuset);
>
> With this implementation, it's not possible to set the thread's CPU
> affinity before starting the thread. That's a limitation of the C++
> std::thread API, it could be done if we used the native pthread API
> instead. Is that a problem ? Could you actually point me to code that
> uses Thread::setThreadAffinity() ?

There are two main usages in mtkisp7:
1. Setting pipeline handler's thread [1].
2. Setting IPA's two new threads [2].

The first one sets thread affinity to a thread that has already been
started, while the second one set new threads.

By "the native pthread API", you mean `pthread_attr_setaffinity_np`?
This means that we need to implement libcamera::Thread with pthread,
instead of std::thread?

[1]: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/libcamera/mtkisp7/src/libcamera/pipeline/mtkisp7/mtkisp7.cpp;l=511
[2]: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/libcamera/mtkisp7/src/ipa/mtkisp7/mtkisp7.cpp;l=119

>
> Even without the ability to set the affinity before starting the thread,
> we could set it from within the thread before calling the run()
> function. This would avoid the main workload being run on incorrect
> CPUs. Is that something that would be useful ?

You mean calling `setThreadAffinityInterna()l` in `startThread()`
(before `run()`), instead of at the end of `start()`?

>
> Regardless of which option we implement, Thread::setThreadAffinity()
> should document what happens if it's called before calling
> Thread::start().
>
> > +}
> > +
> >  /**
> >   * \brief Check if the thread is running
> >   *
>
> --
> Regards,
>
> Laurent Pinchart

BR,
Harvey


More information about the libcamera-devel mailing list