[libcamera-devel] [RFC PATCH 1/1] test: cam tool testing with valgrind

Laurent Pinchart laurent.pinchart at ideasonboard.com
Wed Oct 7 15:10:23 CEST 2020


Hi Kieran,

On Wed, Oct 07, 2020 at 01:26:40PM +0100, Kieran Bingham wrote:
> On 07/10/2020 13:03, Niklas Söderlund wrote:
> > On 2020-10-07 12:25:44 +0100, Kieran Bingham wrote:
> >> Implement a test which runs cam to capture 10 frames,
> >> and validates there are no memory leaks with valgrind.
> >>
> >> Signed-off-by: Kieran Bingham <kieran.bingham at ideasonboard.com>
> >> ---
> >>
> >> A few points to consider here:
> >>
> >> Duration
> >> --------
> >> Unfortunately this test adds an extra 13 seconds to the test run on my
> >> system. We could reduce the duration of course, but it will still be a
> >> reasonably substantial addition.
> > 
> > I think this is indeed a good tool to have in our arsenal. I'm starting 
> > to wonder if we need to split our test directory into a 'release' and a 
> > 'normal' mode. In release we would have tests that take longer and are 
> > more focused on catching deeper errors such as memory leaks or other 
> > issues that require a longer runtime, while in 'normal' we would have 
> > tests similar to those we have today, which focus on incorrect behavior 
> > of the API. What do you think?
> > 
> > I think I would be unhappy for this test to run (without any way to 
> > disable it) for every ninja test run as I already think our tests take 
> > too much time ;-)
> 
> I'm fine with that, a 'smoke-test' suite, and a full suite. I'll leave
> the definition of what goes in the smoke-test suite to later, after we
> determine if it's possible. But I certainly don't object ;-)

Deciding which category a test will go to will be an interesting issue.

> >> TAP
> >> ---
> >> This runs a test for multiple cameras, and meson now supports 'tap'
> >> format tests (multiple tests in one execution). In this patch, the
> >> "protocol : 'tap'" argument is disabled, because meson doesn't like the
> >> output of the rest of the test being mixed with the tap output. I'm not
> >> sure of a good way to handle that. The output is useful, so we want it
> >> somewhere, ideally in the test log, but we also want the tap parsing to
> >> work in that case too.
> >> Anyway, with the tap protocol disabled, it just falls back to the return
> >> status which is also handled here. Still, this is a good opportunity to
> >> see how 'tap' could help our tests, and might revive my old patches to
> >> use common test helpers across the rest of test/

I'd certainly be happy to see improvements in our test infrastructure,
including better logging support that could help with TAP support in a
way that would be transparent for individual tests. We're not doing that
badly on the unit test front, but we've slowly accumulated technical debt
that would be very nice to see handled.

> > I'm a huge fan of TAP, do you think it would require a lot of work to 
> > change our current tests' output to be valid TAP? If I recall correctly, 
> > prepending a # to all non-TAP lines is enough to turn any text stream 
> > into a valid (yet not so useful) TAP stream.
> 
> Ok - prepending sounds like a good route then. Maybe we can throw
> something together that automatically redirects stdout/stderr to two
> different prefixes.
> 
> 1..3
> #out: A standard out message
> #err: A standard error message/warning
> not ok 1 - There was an error in this test.

There's also the option of redirecting the ancillary output to a log
file.
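
Both options can be sketched in a few lines of shell. This is only an
illustration, not something from the patch: run_as_tap_comments and
run_with_log are made-up helper names, the "#out:" and "#err:" prefixes
are the ones proposed above, and the first helper does not propagate the
exit status of the wrapped command.

```shell
#!/bin/sh

# Hypothetical helper: run a command and turn its output into TAP comments.
# stdout lines get a "#out: " prefix and stderr lines a "#err: " prefix, and
# both end up on our stdout. The command's exit status is lost, because the
# status of a pipeline is the status of its last element (sed here).
run_as_tap_comments() {
	{
		{ "$@" | sed 's/^/#out: /' >&3; } 2>&1 | sed 's/^/#err: /'
	} 3>&1
}

# Hypothetical helper: keep the TAP stream clean by sending all ancillary
# output to a log file instead. The command's exit status is preserved.
run_with_log() {
	log_file=$1
	shift
	"$@" > "$log_file" 2>&1
}
```

The two prefixed streams go through separate pipes, so the relative
order of "#out:" and "#err:" lines isn't guaranteed to match the order
in which the test wrote them.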

> > One thing about TAP I learnt the hard way is that the number of planned 
> > tests (keyword "1..5") does not need to go first in the TAP stream, it's 
> > perfectly valid to provide this last. The results should only be 
> 
> Yes, I discovered this recently too. That helps a lot when you don't
> know how many tests you are going to run until you run them ;-)
> 
> > considered incomplete if there is no plan provided anywhere. This has 
> > helped me in the past when turning things into being TAP compatible.
> > 
> > Lastly, a word of caution about the TAP13 specification: please don't :-)
> 
> Ack. I haven't looked at TAP13. I'll avoid it ;-)
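
For reference, a trailing plan could be produced along the lines of the
sketch below. tap_result and tap_plan are hypothetical helpers, not
existing libcamera test code, and I haven't checked how meson's parser
treats a plan that comes last.

```shell
#!/bin/sh

# Number of results reported so far.
tap_count=0

# Report one result. $1 is an exit status (0 means pass), the remaining
# arguments form the description.
tap_result() {
	status=$1
	shift
	tap_count=$((tap_count + 1))
	if [ "$status" -eq 0 ]; then
		echo "ok $tap_count - $*"
	else
		echo "not ok $tap_count - $*"
	fi
}

# Emit the plan once all results are known.
tap_plan() {
	echo "1..$tap_count"
}
```

A test script would call tap_result once per check and tap_plan right
before exiting, so the number of cameras doesn't need to be known up
front.
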
> 
> >> Valgrind
> >> --------
> >> Parsing valgrind is a real pain. This test *only* checks to see if there
> >> are any memory leaks, and does not report on any of the other topics
> >> valgrind might highlight. (They could be added later of course).
> >> Ideally we would want a machine-parsable output from valgrind. It can
> >> output an xml report ... but ... that's a lot more parsing too ;-)
> >>
> >> Thoughts/Suggestions?
> > 
> > Is it really useful information to parse this out? As long as the 
> > valgrind output is kept in a log file I think a PASS/FAIL of the run is 
> > enough, as someone who debugs it would likely find more information in 
> > the log than from parsed values presented in the test run.
> 
> The issue is that it's hard to (easily/simply) decide whether a
> valgrind run should count as a pass or a fail.
> 
> grepping the logs seems fragile and awkward --- It would be nice if
> valgrind had a better summary that said:
>   - leaks: 5
>   - uninitialised data: 2
>   - Total errors: 7
> 
> Or something really easy to consider a pass/fail.
> 
> For example, without Tomi's current fixup patches, this test 'passes'
> even though valgrind has (correctly) identified and reported the use of
> uninitialised variables, and an invalid close(-1) syscall.
> 
> It would be nice if we could (easily) report all valgrind errors here.
> It feels too obvious - so I must have missed something staring me in the
> face :-(

I don't have the answer to that problem, but I also feel there should be
something :-)
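
One candidate, which shows up further down in this mail, is valgrind's
--error-exitcode option. A rough sketch of how a per-camera check might
use it instead of grepping the log follows. The assumptions are mine:
run_under_valgrind, VALGRIND and VG_ERROR_CODE are made-up names, 99 is
an arbitrary value picked so that errors reported by valgrind can be
told apart from a failure of the program itself, and --leak-check=full
is there because memcheck only counts leaks as errors during a full leak
check. This is untested against libcamera.

```shell
#!/bin/sh

# Hypothetical knobs: the valgrind binary to use and the exit code valgrind
# should return when it has reported errors.
VALGRIND=${VALGRIND:-valgrind}
VG_ERROR_CODE=99

# Run a command under valgrind, logging valgrind's report to $1.
# Prints a one-word verdict and returns the raw exit status.
run_under_valgrind() {
	log_file=$1
	shift
	"$VALGRIND" --error-exitcode="$VG_ERROR_CODE" --leak-check=full \
		--log-file="$log_file" "$@"
	status=$?
	case $status in
	0) echo "pass" ;;
	"$VG_ERROR_CODE") echo "valgrind-errors" ;;
	*) echo "program-failed" ;;
	esac
	return "$status"
}
```

Without --error-exitcode, valgrind returns the exit status of the
program it runs, which is why a distinctive value helps here.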

> >> https://github.com/adobe/chromium/blob/master/tools/valgrind/memcheck_analyze.py
> >> seems to show an existing tool that parses the valgrind output as one
> >> possible path - but I'm happy to hear of any better ideas too.
> > 
> > As a last point I wonder if we should create a new binary in tests that 
> > iterates over all cameras while using the same CameraManager instance 
> > and loops over them 2-3 times. I think that could exercise more code 
> > paths than creating a new CameraManager for each run (as we do with cam).  
> > But this can of course be done on top or maybe even in addition.
> 
> Yes, that sounds like another internal test to add.
> 
> Meson does have the facility to wrap everything with valgrind too ...
> so perhaps we should also consider that - but I thought this route might
> be interesting too.
> 
> $ meson test --wrap='valgrind --tool=helgrind -v --error-exitcode=1'
> 
> Ok - so I grabbed that from a search - and look at that
> --error-exitcode=1 ! That looks like what I might have been missing ;-)
> 
> 
> Well well well ... so I've run meson test --wrap=valgrind and I get:
> 
> $ meson test --wrap=valgrind
> 14/57 libcamera:ipc / unixsocket                       FAIL
> 16/57 libcamera:log / log_process                      FAIL
> 22/57 libcamera:process / process_test                 FAIL
> 55/57 libcamera / timer                                FAIL
> 
> (trimmed of course)
> 
> and worse:
> 
> meson test --wrap='valgrind --tool=helgrind -v --error-exitcode=1'
> <trim all the failures>
>  1/57 libcamera:integration / cam-capture-test         OK
> 38/57 libcamera / span                                 OK
> 
> Ok:                 2
> Expected Fail:      0
> Fail:               55
> Unexpected Pass:    0
> Skipped:            0
> Timeout:            0
> 
> It seems that might be potentially due to a common failure in the
> CameraTest::CameraTest though.
> 
> Time for some digging I guess...

We have leaks flagged for instance due to one-time initialization inside
libudev that has no cleanup API. We will need a valgrind suppression
file if we want to be able to wrap all tests with valgrind.
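
As a starting point, something along these lines could generate a
suppression file and the matching wrapper command. The suppression entry
is a guess: I haven't checked which libudev frames show up in the leak
reports, so the entry name and the obj: pattern are placeholders, and
write_udev_suppression and valgrind_wrapper are hypothetical helper
names. Running valgrind with --gen-suppressions=all prints ready-made
entries that could replace the guess.

```shell
#!/bin/sh

# Hypothetical helper: write a suppression file that hides leaks whose call
# stack passes through libudev. The entry name and the obj: pattern are
# placeholders to be replaced with what valgrind actually reports.
write_udev_suppression() {
	cat > "$1" << 'EOF'
{
   libudev_one_time_init
   Memcheck:Leak
   ...
   obj:*/libudev.so*
}
EOF
}

# Hypothetical helper: print a wrapper command line for 'meson test --wrap'.
# It uses memcheck (valgrind's default tool) rather than helgrind.
valgrind_wrapper() {
	echo "valgrind --error-exitcode=1 --leak-check=full --suppressions=$1"
}
```

The output of valgrind_wrapper would be passed to meson test through
--wrap, after write_udev_suppression has created the file. An absolute
path to the suppression file is probably the safer choice, in case the
tests don't run from the directory meson test is invoked in.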

> >>  test/cam/capture-test.sh | 71 ++++++++++++++++++++++++++++++++++++++++
> >>  test/cam/meson.build     | 10 ++++++
> >>  test/meson.build         |  1 +
> >>  3 files changed, 82 insertions(+)
> >>  create mode 100755 test/cam/capture-test.sh
> >>  create mode 100644 test/cam/meson.build
> >>
> >> diff --git a/test/cam/capture-test.sh b/test/cam/capture-test.sh
> >> new file mode 100755
> >> index 000000000000..ab808976be72
> >> --- /dev/null
> >> +++ b/test/cam/capture-test.sh
> >> @@ -0,0 +1,71 @@
> >> +#!/bin/sh
> >> +
> >> +# SPDX-License-Identifier: GPL-2.0-or-later
> >> +
> >> +TestPass=0
> >> +TestFail=255
> >> +TestSkip=77
> >> +
> >> +# Initialise success, set for failure.
> >> +ret=$TestPass
> >> +
> >> +ok() {
> >> +	echo "ok $*"
> >> +}
> >> +
> >> +nok() {
> >> +	echo "not ok $*"
> >> +	ret=$TestFail
> >> +}
> >> +
> >> +valgrind=$(command -v valgrind)
> >> +
> >> +if [ x"" = x"$valgrind" ] ; then
> >> +	echo "skip 1 - Valgrind unavailable ..."
> >> +	exit $TestSkip
> >> +fi
> >> +
> >> +# Tests expect to be run from the meson.project_build_root()
> >> +cam=${1:-src/cam/cam}
> >> +
> >> +if [ ! -e "$cam" ] ; then
> >> +	nok "1 - failed to find cam utility."
> >> +	exit $TestFail
> >> +fi
> >> +
> >> +# Unfortunately, we don't have a 'machine interface', so we rely on parsing the
> >> +# output of cam...
> >> +num_cameras=$("$cam" -l | grep -v "Available" | wc -l)
> >> +
> >> +# Enter TAP plan
> >> +echo "1..$num_cameras"
> >> +
> >> +for i in $(seq 1 1 "$num_cameras");
> >> +do
> >> +	"$cam" -c "$i" -C10
> >> +	rc=$?
> >> +	if [ "$rc" != 0 ] ; then
> >> +		nok "$i - $cam returned $rc"
> >> +		continue
> >> +	fi
> >> +
> >> +	log_file="valgrind-cam-$i.log"
> >> +	"$valgrind" "$cam" -c "$i" -C10 > "$log_file" 2>&1
> >> +	rc=$?
> >> +	if [ "$rc" != 0 ] ; then
> >> +		nok "$i - $valgrind returned $rc"
> >> +		continue
> >> +	fi
> >> +
> >> +	# I'd prefer a better way of checking there are no leaks, as well as reporting
> >> +	# the different categories from valgrind as distinct tests.
> >> +	if ! grep "no leaks are possible" "$log_file" > /dev/null; then
> >> +		nok "$i - Valgrind Errors detected"
> >> +		cat "$log_file" >&2
> >> +		continue
> >> +	fi
> >> +
> >> +	ok "$i - Camera $i reports no leaks"
> >> +done;
> >> +
> >> +exit $ret
> >> diff --git a/test/cam/meson.build b/test/cam/meson.build
> >> new file mode 100644
> >> index 000000000000..834c9bcf6b86
> >> --- /dev/null
> >> +++ b/test/cam/meson.build
> >> @@ -0,0 +1,10 @@
> >> +# SPDX-License-Identifier: CC0-1.0
> >> +
> >> +cam_capture = files('capture-test.sh')
> >> +
> >> +test('cam-capture-test', cam_capture,
> >> +    args : cam,
> >> +    suite : 'integration',
> >> +    is_parallel : false,
> >> +    #protocol : 'tap',
> >> +    timeout : 60)
> >> diff --git a/test/meson.build b/test/meson.build
> >> index 0a1d434e3996..d1b24220dc7c 100644
> >> --- a/test/meson.build
> >> +++ b/test/meson.build
> >> @@ -2,6 +2,7 @@
> >>  
> >>  subdir('libtest')
> >>  
> >> +subdir('cam')
> >>  subdir('camera')
> >>  subdir('controls')
> >>  subdir('ipa')

-- 
Regards,

Laurent Pinchart

