[libcamera-devel] [PATCH] cam: convert to libfmt
Eric Curtin
ecurtin at redhat.com
Mon May 23 11:40:13 CEST 2022
On Mon, 23 May 2022 at 09:23, Laurent Pinchart
<laurent.pinchart at ideasonboard.com> wrote:
>
> Hi Eric,
>
> On Sun, May 22, 2022 at 09:00:04PM +0100, Eric Curtin wrote:
> > On Sun, 22 May 2022 at 10:58, Laurent Pinchart wrote:
> > > On Sat, May 21, 2022 at 07:33:03PM +0100, Eric Curtin wrote:
> > > > On Tue, 10 May 2022 at 15:21, Laurent Pinchart wrote:
> > > > > On Tue, May 10, 2022 at 05:05:05PM +0300, Tomi Valkeinen wrote:
> > > > > > On 10/05/2022 16:22, Laurent Pinchart wrote:
> > > > > > > On Tue, May 10, 2022 at 10:16:17AM +0300, Tomi Valkeinen wrote:
> > > > > > >> This is just a conversation starter, not for merging. I really like
> > > > > > >> libfmt, and I really hate the C++ stream-based string formatting. libfmt
> > > > > > >> is the first library I add to any new C++ project I make.
> > > > > > >
> > > > > > > You've got it wrong. The first library you should add to any new C++
> > > > > > > project you make is libcamera :-)
> > > > > > >
> > > > > > >> You can find more information about libfmt from:
> > > > > > >>
> > > > > > >> https://github.com/fmtlib/fmt
> > > > > > >> https://fmt.dev/latest/index.html
> > > > > > >>
> > > > > > >> This patch is just a crude conversion with ugly macros to showcase what
> > > > > > >> the formatting code might look like.
> > > > > > >>
> > > > > > >> libfmt pages suggest that libfmt would be faster and smaller (exe size)
> > > > > > >> than iostreams, but for the size it didn't seem to be true in cam's case
> > > > > > >> as the tests below show. However, simple prints did reduce the exe size,
> > > > > > >> but the few more complex ones increased the size.
> > > > > > >>
> > > > > > >> Size tests with gcc 11.2.0-19ubuntu1
> > > > > > >>
> > > > > > >> - Without libfmt
> > > > > > >>
> > > > > > >> debug 3523400
> > > > > > >> debug lto 3269368
> > > > > > >> release 223056
> > > > > > >> release lto 172280
> > > > > > >>
> > > > > > >> - With libfmt
> > > > > > >>
> > > > > > >> debug 4424256
> > > > > > >> debug lto 4143840
> > > > > > >> release 303952
> > > > > > >> release lto 252640
> > > > > > >>
> > > > > > >> Above shows that cam's size clearly increases with libfmt. However, the
> > > > > > >> increase really comes only from one case, the use of fmt::memory_buffer
> > > > > > >> and std::back_inserter.
> > > > > > >
> > > > > > > Is this because libfmt is a header-only library ? What if it is built as
> > > > > > > a shared object, what's the impact on code size in cam ?
> > > > > >
> > > > > > libfmt supports header-only, but it's not header-only here.
> > > > > >
> > > > > > >> Converting that code to use fmt::format() and
> > > > > > >> naive string append gives:
> > > > > > >>
> > > > > > >> release with string append 233680
> > > > > > >> release lto with string append 186936
> > > > > > >
> > > > > > > What's the impact of switching to string concatenation on runtime ?
> > > > > >
> > > > > > What do you mean?
> > > > >
> > > > > I mean what is the impact on CPU usage of using the "naive" string
> > > > > append, compared to std::stringstream or fmt::memory_buffer ?
> > > > >
> > > > > > > Would you consider the option of keeping std::stringstream ?
> > > > > >
> > > > > > You mean formatting mostly with libfmt, but sometimes with streams? Or
> > > > > > using libfmt but appending the formatted string to stringstream?
> > > > >
> > > > > Mostly using fmt::, but using std::stringstream without
> > > > > fmt::memory_buffer when a string needs to be assembled. I'm not saying
> > > > > it's necessarily a good idea, just wondering.
> > > >
> > > > Something I wrote which you can use to format C++ strings in a printf-style
> > > >
> > > > ```
> > > > #pragma once
> > > >
> > > > #include <string.h>
> > > > #include <string>
> > > >
> > > > // function that will sprintf to a C++ string starting from std::string::size()
> > > > // so if you want to completely overwrite a string or start at a specific point
> > > > // use std::string::clear() or std::string::resize(). str is a std::string.
> > > > #define STRING_PRINTF(str, ...) \
> > > > do { \
> > > > const int size = snprintf(NULL, 0, __VA_ARGS__); \
> > > > const size_t start_of_string = str.size(); \
> > > > str.resize(start_of_string + size); \
> > > > snprintf(&str[start_of_string], str.size() + 1, __VA_ARGS__); \
> > > > } while (0)
> > > > ```
> > > >
> > > > This way you can format C++ strings in a sprintf-style. It would give
> > > > you many of the benefits of fmt with just a few lines of code
> > > > alternatively. I know we don't only use macros, but it makes sense
> > > > here because when the compiler see's the snprintf, it warns against
> > > > incorrect format strings when you do it this way.
> > >
> > > Beside the fact that snprintf() is called twice which may be less
> > > efficient (although maybe still more efficient than libfmt), the trouble
> > > here is that __VA_ARGS__ will be evaluated twice.
> > >
> > > unsigned int index;
> > > std::string str;
> > >
> > > index = 0;
> > > STRING_PRINTF(str, "%u", index++);
> >
> > These simple cases could use:
> >
> > std::string str = std::to_string(index++);
>
> Of course. It was just an example of potential issues with multiple
> evaluation of macro arguments. It should be possible to turn the macro
> into a function instead, but I think we'll then end up reinventing the
> wheel.
Yeah I get that, I'm not exactly hell bent on changing this, just
curiosity at this point (although admittedly I prefer printf or libfmt
style, and it's generally faster).
I did a crude benchmark here, just to solve curiosity:
https://github.com/ericcurtin/staging/blob/master/string_format_bench.cpp
$ g++ -O3 string_format_bench.cpp; ./a.out > /dev/null
folly: 3.447348 seconds
folly2: 3.343155 seconds
string_printf: 7.059757 seconds
string_printf2: 3.468643 seconds
to_string: 4.890018 seconds
ostringstream: 5.817828 seconds
string_printf2 takes some influence from folly and the first time it
parses, it optimistically guesses the output string could be less than
128 characters, which it probably is in many cases.
I actually wrote folly2 as a further optimization and opened a pull
request in Meta's folly, they missed a trick there by not writing
directly into the string, they do an extra unnecessary copy.
STRING_PRINTF2 is my favorite, but each to their own :)
>
> > It's unlikely that you are gonna beat std::to_string for a simple
> > conversion like this (well, libfmt beats everything in terms of
> > execution speed but it's another dependency, I've done a few
> > benchmarks in the past). STRING_PRINTF would easily convert to
> > std::format in future. The solution is efficient in terms of memory
> > but will do an evaluation twice as you said. The above macro was
> > written with readability and future compatibility to std::format in
> > mind, and is not very different to what folly from Meta does (only
> > much simpler and compiler friendly), see stringPrintf
> >
> > https://github.com/facebook/folly/blob/main/folly/String.cpp
> >
> > STRING_PRINTF suits the case where you have a longer format string and
> > you want to do something a bit more complex, so something like:
> >
> > std::stringstream info;
> > info << ts / 1000000000 << "."
> > << std::setw(6) << std::setfill('0') << ts / 1000 % 1000000
> > << " (" << std::fixed << std::setprecision(2) << fps << " fps)";
> >
> > for (const auto &[stream, buffer] : buffers) {
> > const FrameMetadata &metadata = buffer->metadata();
> >
> > info << " " << streamNames_[stream]
> > << " seq: " << std::setw(6) << std::setfill('0')
> > << metadata.sequence
> > << " bytesused: ";
> >
> > unsigned int nplane = 0;
> > for (const FrameMetadata::Plane &plane : metadata.planes()) {
> > info << plane.bytesused;
> > if (++nplane < metadata.planes().size())
> > info << "/";
> > }
> > }
> >
> > if (sink_) {
> > if (!sink_->processRequest(request))
> > requeue = false;
> > }
> >
> > std::cout << info.str() << std::endl;
> >
> > would become:
> >
> > std::string info
> > STRING_PRINTF(info, "%.6f (%.2f fps)", ts / 1000000000.0, fps);
> >
> > for (const auto &[stream, buffer] : buffers) {
> > const FrameMetadata &metadata = buffer->metadata();
> >
> > STRING_PRINTF(info,
> > " %s seq: %06d bytesused: ", streamNames_[buf.first].c_str(),
> > metadata.sequence);
> >
> > unsigned int nplane = 0;
> > for (const FrameMetadata::Plane &plane : metadata.planes()) {
> > frame_str += std::to_string(plane.bytesused);
> > if (++nplane < metadata.planes().size())
> > info += "/";
> > }
> > }
> >
> > if (sink_) {
> > if (!sink_->processRequest(request))
> > requeue = false;
> > }
> >
> > printf("%s\n", info.c_str());
> >
> > > /* Now index is equal to 2 */
> > >
> > > > Just an option...
>
> --
> Regards,
>
> Laurent Pinchart
>
More information about the libcamera-devel
mailing list