[libcamera-devel] [RFC PATCH v2 05/14] test: yaml-parser: Test dictionary items ordering

Thu Jun 16 10:48:24 CEST 2022

Hi David,

On Thu, Jun 16, 2022 at 09:32:21AM +0100, David Plowman wrote:
> Hi Laurent
> 
> Sorry for the delay!

Thanks for the quick reply :-)

> On Thu, 16 Jun 2022 at 09:10, Laurent Pinchart wrote:
> > On Mon, Jun 13, 2022 at 12:48:55PM +0300, Laurent Pinchart via libcamera-devel wrote:
> > > On Mon, Jun 13, 2022 at 10:25:15AM +0100, David Plowman wrote:
> > > > Hi Laurent, everyone
> > > >
> > > > Thanks for the suggestions. I'm fine to change the syntax to make
> > > > things clearer, but I wonder if we could avoid breaking existing JSON
> > > > files? There are probably low numbers of them out there beyond the
> > > > ones that we've supplied, but you never quite know and backwards
> > > > compatibility is still a nice thing. Do you think that's something we
> > > > can arrange?
> > >
> > > This patch series guarantees ordering of entries in mapping (at the
> > > expense of duplicated key storage in memory, but that could probably be
> > > fixed), so there should be no breakage. The main drawback is that
> > > YamlParser and YamlObject will then expose an order that is
> > > implementation-specific and shouldn't be relied on by applications
> > > according to the JSON and YAML specifications. There's thus a risk of
> > > introducing new code that relies on mappings being ordered, which
> > > wouldn't be the case if the implementation didn't guarantee ordering
> > > (although one may argue that in the case the users of YamlObject may
> > > still rely on a different implementation-specific order without
> > > realizing it, we would need to randomize the order to avoid that, which
> > > I don't think is a good idea).
> > >
> > > With versioned tuning files the IPA could fairly easily support both the
> > > current format and the new format, so that shouldn't be a problem. I
> > > think we should then provide a Python script to convert the old format
> > > to the new one, and print a warning to the log. Would you expect the
> > > Raspberry Pi IPA module to support the current format forever, or only
> > > for a fixed duration to help users transition ?
> >
> > I'll post a v3 series of the YamlObject changes, and I'd like to decide
> > on which direction to take (if possible :-)). Could you share your
> > thoughts on this ?
> 
> I wasn't totally sure what the precise question is, but here are a few
> random answers:
> 
> - I'd like existing turning files to continue working, could they
> implicitly be regarded as "version 0" or something if they don't say
> otherwise? It also means not having to touch the tuning tool just yet.

We can, the v2 I've posted can handle this. The only drawback is that it
risks introducing another dependency on YAML mapping ordering, but I
suppose that's worth it to avoid breaking all existing tuning files.

Would you consider dropping support for v0 at some point ? If so, any
idea when ?

> - I don't like having different priorities stored with different
> algorithms. You'll forever be hunting through the file reverse
> engineering the actual order. I'd rather have an "order" field (or
> something) that simply lists the correct order.
> 
> - The "order" field would default to the "standard order" if not
> present. The order can list algorithms that aren't present and that
> would just be ignored. It would be common simply not to have all
> algorithms, or to comment some out for debugging.
> 
> I think there is some complexity in the order matching. For example,
> I'd want "rpi.awb" to match just "awb". But if "rpi.awb" is listed in
> the order, that would take precedence.
> 
> The idea is that we can list the standard order like "black_level",
> "dpc", "lux", "noise", "geq", "sdn", "awb", and so on. But if someone
> writes an algorithm "foo.awb" they can force it to go somewhere else
> if they have too, but by default it would take up the standard "awb"
> position.
> 
> Hmm, that's starting to get a bit annoying, isn't it? Storing
> priorities in the algorithm would be less of a problem in this
> respect, but I really don't like it...

It's getting complex indeed. I have no issue keeping the existing
mechanism working for as long as we support v0. In v1 it would work with
the modification to the JSON file explained below. The only cost will be
one more level in the hierachy. I've also shown how it looks like in
YAML as it's less verbose and thus easier to discuss by e-mail, but that
doesn't mean I want to convert existing files to YAML.

> - Don't want to lose json, but happy to have both with the appropriate
> file extension.

The YamlParser supports JSON just fine :-) The only caveat is that
spaces have to be used for indentation, not tabs.

> Did I answer everything or was there anything I overlooked?

There's the question of how the v1 format should look like. I'd propose
the following.

{
    "version": 1
    "algorithms": [
        {
            "rpi.black_level":
            {
                "black_level": 4096
            },
        },
        {
            "rpi.noise":
            {
                "reference_constant": 0,
                "reference_slope": 3.67
            },
        }
    ]
}

As the contents of the "algorithms" key is a list, it is ordered, so we
keep the current feature of algorithm ordering without abusing a mapping
order.

Another option would be

{
    "version": 1
    "algorithms": [
        {
            "name": "rpi.black_level",
            "data": {
                "black_level": 4096
            },
        },
        {
            "name": "rpi.noise",
            "data": {
                "reference_constant": 0,
                "reference_slope": 3.67
            },
        }
    ]
}

I don't know what's better, if it's a good practice to make keys well
known (as in the second option, where an algorithm entry has two keys
that are known at compile time, "name" and "data") or if it's fine (or
even better) to make keys more dynamic (which then requires iterating
over the parsed object, as we don't know the key name in advance).

> > > > On Mon, 13 Jun 2022 at 09:46, Laurent Pinchart wrote:
> > > > > On Mon, Jun 13, 2022 at 09:05:21AM +0100, Naushir Patuck wrote:
> > > > > > On Mon, 13 Jun 2022 at 07:18, Laurent Pinchart wrote:
> > > > > > > On Mon, Jun 13, 2022 at 01:22:19PM +0900, paul.elder at ideasonboard.com wrote:
> > > > > > > > On Sat, Jun 04, 2022 at 09:59:30PM +0300, Laurent Pinchart wrote:
> > > > > > > > > The order of items in a YAML dictionary may matter. Update the test to
> > > > > > > > > ensure that it is preserved. The test currently fails at the YamlParser
> > > > > > > >
> > > > > > > > My understanding is that YAML mappings are unordered [1] [2], and if
> > > > > > > > order in the mapping is significant, then either a sequence of mappings
> > > > > > > > [3] or flow mapping adjacent values [4] should be used.
> > > > > > >
> > > > > > > That's a very good point. [5] even mentions
> > > > > > >
> > > > > > > "while imposing a order on mapping keys is necessary for flattening YAML
> > > > > > > representations to a sequential access medium, this serialization detail
> > > > > > > must not be used to convey application level information."
> > > > > > >
> > > > > > > The same applies to JSON ([6]).
> > > > > > >
> > > > > > > Fixing this would require changing the syntax of the tuning files. It's
> > > > > > > inconvenient, but not doing so opens the door to more issues in the
> > > > > > > future :-S
> > > > > >
> > > > > > Given that the IPA does require ordering and we have been lucky with
> > > > > > Boost preserving order in the JSON parser, I think we probably ought
> > > > > > to specify ordering in the config file with a specific key.
> > > > >
> > > > > Sounds good to me.
> > > > >
> > > > > It could be done through a priority key in each algorithm, or by
> > > > > converting the mapping to a list. In YAML format, that would be moving
> > > > > from
> > > > >
> > > > > {
> > > > >     "rpi.black_level":
> > > > >     {
> > > > >         "black_level": 4096
> > > > >     },
> > > > >     "rpi.noise":
> > > > >     {
> > > > >         "reference_constant": 0,
> > > > >         "reference_slope": 3.67
> > > > >     },
> > > > > }
> > > > >
> > > > > to
> > > > >
> > > > > [
> > > > >     {
> > > > >         "rpi.black_level":
> > > > >         {
> > > > >             "black_level": 4096
> > > > >         },
> > > > >     },
> > > > >     {
> > > > >         "rpi.noise":
> > > > >         {
> > > > >             "reference_constant": 0,
> > > > >             "reference_slope": 3.67
> > > > >         },
> > > > >     }
> > > > > ]
> > > > >
> > > > > In YAML format, it would translate as a move from
> > > > >
> > > > > rpi.black_level:
> > > > >   black_level: 4096
> > > > > rpi.noise:
> > > > >   reference_constant: 0
> > > > >   reference_slope: 3.67
> > > > >
> > > > > to
> > > > >
> > > > > - rpi.black_level:
> > > > >     black_level: 4096
> > > > > - rpi.noise:
> > > > >     reference_constant: 0
> > > > >     reference_slope: 3.67
> > > > >
> > > > > As we have to change the tuning files, I'd like to take this as an
> > > > > opportunity to add a format version. Something along those lines maybe ?
> > > > >
> > > > > version: 1.0
> > > > > algorithms:
> > > > >   - rpi.black_level:
> > > > >       black_level: 4096
> > > > >   - rpi.noise:
> > > > >       reference_constant: 0
> > > > >       reference_slope: 3.67
> > > > >
> > > > > And while we're discussing this, does someone know about best practice
> > > > > rules to design JSON/YAML grammars ? I've been wondering for a long time
> > > > > if the following grammar would have any advantage:
> > > > >
> > > > > version: 1.0
> > > > > algorithms:
> > > > >   - name: rpi.black_level
> > > > >     data:
> > > > >       black_level: 4096
> > > > >   - name: rpi.noise:
> > > > >     data:
> > > > >       reference_constant: 0
> > > > >       reference_slope: 3.67
> > > > >
> > > > > > When I get a
> > > > > > change, I'll look to add a patch to allow this on the existing codebase,
> > > > > > and this series ought to "just work" after that.
> > > > > >
> > > > > > > > [1] https://yaml.org/spec/1.2.2/#mapping
> > > > > > > > [2] https://yaml.org/spec/1.2.2/#3221-mapping-key-order
> > > > > > > > [3] https://yaml.org/spec/1.2.2/#example-ordered-mappings
> > > > > > > > [4] https://yaml.org/spec/1.2.2/#example-flow-mapping-adjacent-values
> > > > > > >
> > > > > > > [5] https://yaml.org/spec/1.2.2/#32-information-models
> > > > > > > [6] https://www.json.org/json-en.html
> > > > > > >
> > > > > > > > > doesn't correctly preserve the order, this will be fixed by the next
> > > > > > > > > commit.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Laurent Pinchart <laurent.pinchart at ideasonboard.com>
> > > > > > > > > ---
> > > > > > > > >  test/yaml-parser.cpp | 7 +++----
> > > > > > > > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/test/yaml-parser.cpp b/test/yaml-parser.cpp
> > > > > > > > > index 5ff4c3236dbf..582c9caed836 100644
> > > > > > > > > --- a/test/yaml-parser.cpp
> > > > > > > > > +++ b/test/yaml-parser.cpp
> > > > > > > > > @@ -29,8 +29,8 @@ static const string testYaml =
> > > > > > > > >     "  - Mary\n"
> > > > > > > > >     "dictionary:\n"
> > > > > > > > >     "  a: 1\n"
> > > > > > > > > -   "  b: 2\n"
> > > > > > > > >     "  c: 3\n"
> > > > > > > > > +   "  b: 2\n"
> > > > > > > > >     "level1:\n"
> > > > > > > > >     "  level2:\n"
> > > > > > > > >     "    - [1, 2]\n"
> > > > > > > > > @@ -428,7 +428,6 @@ protected:
> > > > > > > > >             }
> > > > > > > > >
> > > > > > > > >             auto memeberNames = dictObj.memberNames();
> > > > > > > > > -           sort(memeberNames.begin(), memeberNames.end());
> > > > > > > > >
> > > > > > > > >             if (memeberNames.size() != 3) {
> > > > > > > > >                     cerr << "Dictionary object fail to extra member names" << std::endl;
> > > > > > > > > @@ -436,8 +435,8 @@ protected:
> > > > > > > > >             }
> > > > > > > > >
> > > > > > > > >             if (memeberNames[0] != "a" ||
> > > > > > > > > -               memeberNames[1] != "b" ||
> > > > > > > > > -               memeberNames[2] != "c") {
> > > > > > > > > +               memeberNames[1] != "c" ||
> > > > > > > > > +               memeberNames[2] != "b") {
> > > > > > > > >                     cerr << "Dictionary object fail to parse member names" << std::endl;
> > > > > > > > >                     return TestFail;
> > > > > > > > >             }

-- 
Regards,

Laurent Pinchart