[libcamera-devel] [PATCH 2/2] checkstyle: Add support for pre-commit hook
Nicolas Dufresne
nicolas at ndufresne.ca
Fri Jan 17 16:55:38 CET 2020
Le vendredi 17 janvier 2020 à 16:55 +0200, Laurent Pinchart a écrit :
> Hi Nicolas,
>
> On Fri, Jan 17, 2020 at 08:50:56AM -0500, Nicolas Dufresne wrote:
> > Le vendredi 17 janvier 2020 à 09:46 +0200, Laurent Pinchart a écrit :
> > > On Thu, Jan 16, 2020 at 08:23:36PM -0500, Nicolas Dufresne wrote:
> > > > Le vendredi 17 janvier 2020 à 00:31 +0200, Laurent Pinchart a écrit :
> > > > > On Thu, Jan 16, 2020 at 10:25:31PM +0000, Kieran Bingham wrote:
> > > > > > On 16/01/2020 18:26, nicolas at ndufresne.ca wrote:
> > > > > > > From: Nicolas Dufresne <nicolas.dufresne at collabora.com>
> > > > > > >
> > > > > > > This adds support for pre-commit hook workflow. In pre-commit hook we
> > > > > > > check the style on the changes currently staged. Note that this patch
> > > > > > > also includes a bit of sugar to support --amend.
> > > > > > >
> > > > > > > Signed-off-by: Nicolas Dufresne <nicolas.dufresne at collabora.com>
> > > > > > > ---
> > > > > > > utils/checkstyle.py | 55 +++++++++++++++++++++++++++++-------------
> > > > > > > utils/hooks/pre-commit | 16 ++++++++++++
> > > > > > > 2 files changed, 54 insertions(+), 17 deletions(-)
> > > > > > > mode change 100644 => 100755 utils/checkstyle.py
> > > > > > > create mode 100755 utils/hooks/pre-commit
> > > > > > >
> > > > > > > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > > > > > > old mode 100644
> > > > > > > new mode 100755
> > > > > > > index 7edea25..23eb3f6
> > > > > > > --- a/utils/checkstyle.py
> > > > > > > +++ b/utils/checkstyle.py
> > > > > > > @@ -458,11 +458,16 @@ class StripTrailingSpaceFormatter(Formatter):
> > > > > > > # Style checking
> > > > > > > #
> > > > > > >
> > > > > > > -def check_file(top_level, commit, filename):
> > > > > > > +def check_file(top_level, commit, filename, staged):
> > > > > > > # Extract the line numbers touched by the commit.
> > > > > > > - diff = subprocess.run(['git', 'diff', '%s~..%s' % (commit, commit), '--',
> > > > > > > - '%s/%s' % (top_level, filename)],
> > > > > > > - stdout=subprocess.PIPE).stdout
> > > > > >
> > > > > > I wonder if we could/should automate this by detecting staged changes,
> > > > > > and then setting staged based on that?
> > > > > >
> > > > > > "git status --porcelain | grep ^M"
> > > > > >
> > > > > > Could detect if there are staged changes.
> > > > > >
> > > > > > But equally it might be better to allow the caller to be explicit.
> > > > >
> > > > > I've been wondering about this too, I have a slight preference for
> > > > > auto-detection, but that's not a hard requirement if too difficult to
> > > > > implement or impractical to use.
> > > >
> > > > I don't think it hard, but I wasn't sure it was a good idea. One thing
> > > > that I've been a little slowpy is with the meaning of commit here (of
> > > > the rev list in fact). Before I added amend support, I was simply
> > > > ignoring that (no meaning). But then I recycled it to support the
> > > > commit --amend case, where you want to included the staged changes and
> > > > the previous commit.
> > > >
> > > > What I found worked to get that information was to only pass one hash,
> > > > which is the commit before the last commit (HEAD~). Anyway, I think I
> > > > should fix this, and so '%s~' % commit instead if I want to maintained
> > > > the behaviour. But by doing that, there is no longer anything to
> > > > describe that when I commit --amend, I want the combined report, not
> > > > just what was staged. And that I also don't want the previous commit to
> > > > be checked independently from the staged, because that would produce
> > > > that warning I'm trying to fix.
> > > >
> > > > So my conclusion, is that to disambiguate all this, I should in fact be
> > > > even more explicit, and also introduce --amend. A better implementation
> > > > would be to find a way to integrated these special revisions into the
> > > > revlist, this way (not sure if there is a use case) but you could
> > > > request a report for specific commits, the staging changes and the
> > > > amended changes in one call.
> > >
> > > That would be neat indeed. From a quick look at git-rev-parse there's no
> > > revision syntax to express this that I could find. We could extend the
> > > revision syntax, but that may be dangerous.
> >
> > I was thinking to keep the --staged/amend options, but to use mixed
> > type in the python list, and do type checks when iterating. That would
> > avoid the binary parameter, which is just bad API really.
> >
> > > If we instead go for explicit arguments, --staged or --cached would be
> > > fine with me, but I'd rather use --worktree (or something similar) than
> > > --amend to reflect that the revision corresponds to the working tree.
> >
> > I believe worktree is a much worst choice as it matches a completely
> > unrelated concept in git (see git worktree --help). The concept of an
> > amendment isn't defined, but can logically match the combination of the
> > previous commit and the staged changes, since this is what git commit
> > --amend will combine to replace the last commit.
>
> "Working tree" had an established meaning before worktree was introduced
> :-) That's how git-diff is documented:
>
> "Show changes between the working tree and the index or a tree, changes
> between the index and a tree, changes between two trees, changes between
> two blob objects, or changes between two files on disk."
>
> --working-tree could work too but is a bit long, although if it's only
> used by the pre-commit hook (and/or if we add a short -w option) it
> should work.
I believe you didn't interpret this correctly. The working tree is the
code as seen on your file system, the index is the changes that are to
be commited. What I check is the index (pre-commit changes), not he
working tree.
Right now, I'm just paying for git's mistake. What I want to deal with
is the index, which in git-diff is shown through --cached or --staged
command line option for unknown reasons. Looking at the changes in the
working tree to totally irrelevant to a pre-commit hook, but could be
an interesting feature to add if you feel it's useful to you.
But here, my goal is to add support for pre-commit hook, which is a
check run before you fill in the message. So I need to pick what git is
about to commit. In the case of commit, that's just the index, and for
commit --amend, it's the combination of the last commit and the index.
>
> It would be really nice if git had a reference specifier for the index
> in the working tree, we could then do
>
> checkstyle.py HEAD..INDEX
> checkstyle.py HEAD..WORK
>
> (or similar). Unless I'm mistaken, the pre-commit hook would then use
>
> checkstyle.py HEAD..INDEX
>
> for the normal git commit case, and
>
> checkstyle.py HEAD~..INDEX
>
> for the --amend case. This would have my preference. And now that I've
> written this, --worktree seems a bad name indeed if it's an alias for
> HEAD~ on the left-hand side. I proposed it thinking it would be an alias
> for <commit>..WORK, but that doesn't seem needed for the pre-commit
> hook.
>
> > > > > > > + if staged:
> > > > > > > + diff = subprocess.run(['git', 'diff', '--staged', commit, '--',
> > > > > > > + '%s/%s' % (top_level, filename)],
> > > > > > > + stdout=subprocess.PIPE).stdout
> > > > > > > + else:
> > > > > > > + diff = subprocess.run(['git', 'diff', '%s~..%s' % (commit, commit), '--',
> > > > > > > + '%s/%s' % (top_level, filename)],
> > > > > > > + stdout=subprocess.PIPE).stdout
> > > > > > > diff = diff.decode('utf-8').splitlines(True)
> > > > > > > commit_diff = parse_diff(diff)
> > > > > > >
> > > > > > > @@ -476,8 +481,12 @@ def check_file(top_level, commit, filename):
> > > > > > >
> > > > > > > # Format the file after the commit with all formatters and compute the diff
> > > > > > > # between the unformatted and formatted contents.
> > > > > > > - after = subprocess.run(['git', 'show', '%s:%s' % (commit, filename)],
> > > > > > > - stdout=subprocess.PIPE).stdout
> > > > > > > + if staged:
> > > > > > > + after = subprocess.run(['git', 'show', ':%s' % (filename)],
> > > > > > > + stdout=subprocess.PIPE).stdout
> > > > > > > + else:
> > > > > > > + after = subprocess.run(['git', 'show', '%s:%s' % (commit, filename)],
> > > > > > > + stdout=subprocess.PIPE).stdout
> > > > > > > after = after.decode('utf-8')
> > > > > > >
> > > > > > > formatted = after
> > > > > > > @@ -521,13 +530,20 @@ def check_file(top_level, commit, filename):
> > > > > > > return len(formatted_diff) + len(issues)
> > > > > > >
> > > > > > >
> > > > > > > -def check_style(top_level, commit):
> > > > > > > - # Get the commit title and list of files.
> > > > > > > - ret = subprocess.run(['git', 'show', '--pretty=oneline','--name-only', commit],
> > > > > > > - stdout=subprocess.PIPE)
> > > > > > > - files = ret.stdout.decode('utf-8').splitlines()
> > > > > > > - title = files[0]
> > > > > > > - files = files[1:]
> > > > > > > +def check_style(top_level, commit, staged):
> > > > > > > + if staged:
> > > > > > > + # Get list of staged changed files
> > > > > > > + ret = subprocess.run(['git', 'diff', '--staged', '--name-only', commit],
> > > > > > > + stdout=subprocess.PIPE)
> > > > > > > + files = ret.stdout.decode('utf-8').splitlines()
> > > > > > > + title = "Pre-commit"
> > > > > >
> > > > > > This is not necessarily a "Pre-commit" (hook). It's just staged.
> > > > > > A user could run the tool directly to validate some staged changes
> > > > > > before committing...
> > > > > >
> > > > > > So I'd perhaps make this
> > > > > > title = "Staged"
> > > > >
> > > > > Or maybe "Staged changes" to be slightly more explicit ?
> > > >
> > > > It started "Staged", then I splitted it between amend and staged, and
> > > > for some reason ended up like this. I think I'll split it again, and
> > > > use "Staged changes" or "Amended changes".
> > >
> > > Maybe "Work tree changes" instead of "Amended changes" ?
> >
> > See my comment above on why Worktree isn't a great choice. What I maybe
> > suggest, is to title this one after the previous commit. That would
> > look like "Amend: <subject line>". The truth though is that nobody
> > using pre-commit will ever even look at the subject, so we are putting
> > a little too much effort in naming here.
> >
> > > > > > > + else:
> > > > > > > + # Get the commit title and list of files.
> > > > > > > + ret = subprocess.run(['git', 'show', '--pretty=oneline', '--name-only', commit],
> > > > > > > + stdout=subprocess.PIPE)
> > > > > > > + files = ret.stdout.decode('utf-8').splitlines()
> > > > > > > + title = files[0]
> > > > > > > + files = files[1:]
> > > > > > >
> > > > > > > separator = '-' * len(title)
> > > > > > > print(separator)
> > > > > > > @@ -541,11 +557,11 @@ def check_style(top_level, commit):
> > > > > > > files = [f for f in files if len([p for p in patterns if fnmatch.fnmatch(os.path.basename(f), p)])]
> > > > > > > if len(files) == 0:
> > > > > > > print("Commit doesn't touch source files, skipping")
> > > > > > > - return
> > > > > > > + return 0
> > > > > > >
> > > > > > > issues = 0
> > > > > > > for f in files:
> > > > > > > - issues += check_file(top_level, commit, f)
> > > > > > > + issues += check_file(top_level, commit, f, staged)
> > > > > > >
> > > > > > > if issues == 0:
> > > > > > > print("No style issue detected")
> > > > > > > @@ -554,6 +570,8 @@ def check_style(top_level, commit):
> > > > > > > print("%u potential style %s detected, please review" % \
> > > > > > > (issues, 'issue' if issues == 1 else 'issues'))
> > > > > > >
> > > > > > > + return issues
> > > > > > > +
> > > > > > >
> > > > > > > def extract_revlist(revs):
> > > > > > > """Extract a list of commits on which to operate from a revision or revision
> > > > > > > @@ -595,6 +613,8 @@ def main(argv):
> > > > > > > parser = argparse.ArgumentParser()
> > > > > > > parser.add_argument('--formatter', '-f', type=str, choices=['astyle', 'clang-format'],
> > > > > > > help='Code formatter. Default to clang-format if not specified.')
> > > > > > > + parser.add_argument('--staged', '-s', action='store_true',
> > > > > > > + help='Looks at the staged changes, used for pre-commit, defaults to False')
> > > > > > > parser.add_argument('revision_range', type=str, default='HEAD', nargs='?',
> > > > > > > help='Revision range (as defined by git rev-parse). Defaults to HEAD if not specified.')
> > > > > > > args = parser.parse_args(argv[1:])
> > > > > > > @@ -632,11 +652,12 @@ def main(argv):
> > > > > > >
> > > > > > > revlist = extract_revlist(args.revision_range)
> > > > > > >
> > > > > > > + issues = 0
> > > > > > > for commit in revlist:
> > > > > > > - check_style(top_level, commit)
> > > > > > > + issues += check_style(top_level, commit, args.staged)
> > > > > > > print('')
> > > > > > >
> > > > > > > - return 0
> > > > > > > + return issues
> > > > > >
> > > > > > I'd be tempted to split the
> > > > > > "checkstyle: Report issue count as a failure"
> > > > > > and
> > > > > > "checkstyle: Support validating staged git state"
> > > > > >
> > > > > > to two patches, but perhaps that's not necessary unless you see benefit
> > > > > > in splitting.
> > > > >
> > > > > I don't think there's a need to resubmit just for this, but if a v2 is
> > > > > needed, it may be nice to split the changes indeed, to explain why
> > > > > changing the return value is needed.
> > > >
> > > > I think a v2 is needed, so I'll do.
> > > >
> > > > > According to the exit man page,
> > > > >
> > > > > RATIONALE
> > > > > As explained in other sections, certain exit status values have been reserved for special uses and should be used by applications only for those purposes:
> > > > >
> > > > > 126 A file to be executed was found, but it was not an executable utility.
> > > > >
> > > > > 127 A utility to be executed was not found.
> > > > >
> > > > > >128 A command was interrupted by a signal.
> > > > >
> > > > > And according to exit(),
> > > > >
> > > > > The value status & 0377 is returned to the parent process as the process's exit status, and can be collected using one of the wait(2) family of calls.
> > > > >
> > > > > In the unfortunate event that a multiple of 256 issues would be found,
> > > > > the caller would think everything went fine. Should we thus return 0 on
> > > > > success and 1 if any issue was found ?
> > > >
> > > > Agreed, will turn the number of issues into 0 and 1.
> > > >
> > > > > > > if __name__ == '__main__':
> > > > > > > diff --git a/utils/hooks/pre-commit b/utils/hooks/pre-commit
> > > > > > > new file mode 100755
> > > > > > > index 0000000..57d23ef
> > > > > > > --- /dev/null
> > > > > > > +++ b/utils/hooks/pre-commit
> > > > > >
> > > > > > Though I really think this deserves it's own commit.
> > > > > >
> > > > > > "utils/hooks: Provide a pre-commit checkstyle hook"
> > > > > >
> > > > > > > @@ -0,0 +1,16 @@
> > > > > > > +#!/bin/sh
> > > > > > > +
> > > > > > > +# Execute the checkstyle script before committing any code, failing the commit
> > > > > > > +# if needed. With this workflow, false positive can be ignored using:
> > > > > > > +# git commit -n
> > > > > > > +#
> > > > > > > +# To utilise this hook, install this file with:
> > > > > > > +# cp utils/hooks/pre-commit .git/hooks/pre-commit
> > > > > >
> > > > > > One of the things I had toyed with is installing the hook by setting the
> > > > > > git hook directory to utils/hooks/ :
> > > > > >
> > > > > > git config core.hooksPath utils/hooks
> > > > > >
> > > > > > But doing so now would invoke both the pre-commit and the post-commit hook.
> > > > > >
> > > > > > I think it's good to provide the option of how someone might install
> > > > > > their hook
> > > > > >
> > > > > > > +
> > > > > > > +commit=
> > > > > > > +if ps -ocommand= -p $PPID | grep -- "--amend"
> > > > > > > +then
> > > > > > > + commit="HEAD~"
> > > > > >
> > > > > > Is this really needed?
> > > > > >
> > > > > > Edit: Yes, I've just played around with it - and I see what's happening.
> > > > > > Good catch to find that something extra was required here.
> > > > > >
> > > > > > > +fi
> > > > > > > +
> > > > > > > +./utils/checkstyle.py --staged $commit
> > > > > >
> > > > > > I wonder if the checkstyle.py could detect if we had staged changes if
> > > > > > we could fall back to needing only a single variant of the hook which
> > > > > > the user could choose to install as either a pre-commit or a post-commit
> > > > > > hook.
> > > > > >
> > > > > > But perhaps it's more straight-forward to have two hooks to keep it
> > > > > > easier to support any future differences too.
> > > > >
> > > > > I'm fine either way.
> > > >
> > > > I think having two files in hook/ directory will be less confusing
> > > > even if they are identical in the end. But I don't think the auto
> > > > detection is a good idea, it's makes the CLI of the tool ambiguous at
> > > > least.
More information about the libcamera-devel
mailing list