[libcamera-devel] [PATCH 2/2] checkstyle: Add support for pre-commit hook

Thu Jan 16 23:31:30 CET 2020

Hi Kieran,

On Thu, Jan 16, 2020 at 10:25:31PM +0000, Kieran Bingham wrote:
> On 16/01/2020 18:26, nicolas at ndufresne.ca wrote:
> > From: Nicolas Dufresne <nicolas.dufresne at collabora.com>
> > 
> > This adds support for pre-commit hook workflow. In pre-commit hook we
> > check the style on the changes currently staged. Note that this patch
> > also includes a bit of sugar to support --amend.
> > 
> > Signed-off-by: Nicolas Dufresne <nicolas.dufresne at collabora.com>
> > ---
> >  utils/checkstyle.py    | 55 +++++++++++++++++++++++++++++-------------
> >  utils/hooks/pre-commit | 16 ++++++++++++
> >  2 files changed, 54 insertions(+), 17 deletions(-)
> >  mode change 100644 => 100755 utils/checkstyle.py
> >  create mode 100755 utils/hooks/pre-commit
> > 
> > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > old mode 100644
> > new mode 100755
> > index 7edea25..23eb3f6
> > --- a/utils/checkstyle.py
> > +++ b/utils/checkstyle.py
> > @@ -458,11 +458,16 @@ class StripTrailingSpaceFormatter(Formatter):
> >  # Style checking
> >  #
> >  
> > -def check_file(top_level, commit, filename):
> > +def check_file(top_level, commit, filename, staged):
> >      # Extract the line numbers touched by the commit.
> > -    diff = subprocess.run(['git', 'diff', '%s~..%s' % (commit, commit), '--',
> > -                           '%s/%s' % (top_level, filename)],
> > -                          stdout=subprocess.PIPE).stdout
> 
> I wonder if we could/should automate this by detecting staged changes,
> and then setting staged based on that?
> 
> 	"git status --porcelain | grep ^M"
> 
> Could detect if there are staged changes.
> 
> But equally it might be better to allow the caller to be explicit.

I've been wondering about this too, I have a slight preference for
auto-detection, but that's not a hard requirement if too difficult to
implement or impractical to use.

> > +    if staged:
> > +        diff = subprocess.run(['git', 'diff', '--staged', commit, '--',
> > +                               '%s/%s' % (top_level, filename)],
> > +                              stdout=subprocess.PIPE).stdout
> > +    else:
> > +        diff = subprocess.run(['git', 'diff', '%s~..%s' % (commit, commit), '--',
> > +                               '%s/%s' % (top_level, filename)],
> > +                              stdout=subprocess.PIPE).stdout
> >      diff = diff.decode('utf-8').splitlines(True)
> >      commit_diff = parse_diff(diff)
> >  
> > @@ -476,8 +481,12 @@ def check_file(top_level, commit, filename):
> >  
> >      # Format the file after the commit with all formatters and compute the diff
> >      # between the unformatted and formatted contents.
> > -    after = subprocess.run(['git', 'show', '%s:%s' % (commit, filename)],
> > -                           stdout=subprocess.PIPE).stdout
> > +    if staged:
> > +        after = subprocess.run(['git', 'show', ':%s' % (filename)],
> > +                               stdout=subprocess.PIPE).stdout
> > +    else:
> > +        after = subprocess.run(['git', 'show', '%s:%s' % (commit, filename)],
> > +                               stdout=subprocess.PIPE).stdout
> >      after = after.decode('utf-8')
> >  
> >      formatted = after
> > @@ -521,13 +530,20 @@ def check_file(top_level, commit, filename):
> >      return len(formatted_diff) + len(issues)
> >  
> >  
> > -def check_style(top_level, commit):
> > -    # Get the commit title and list of files.
> > -    ret = subprocess.run(['git', 'show', '--pretty=oneline','--name-only', commit],
> > -                         stdout=subprocess.PIPE)
> > -    files = ret.stdout.decode('utf-8').splitlines()
> > -    title = files[0]
> > -    files = files[1:]
> > +def check_style(top_level, commit, staged):
> > +    if staged:
> > +        # Get list of staged changed files
> > +        ret = subprocess.run(['git', 'diff', '--staged', '--name-only', commit],
> > +                             stdout=subprocess.PIPE)
> > +        files = ret.stdout.decode('utf-8').splitlines()
> > +        title = "Pre-commit"
> 
> This is not necessarily a "Pre-commit" (hook). It's just staged.
> A user could run the tool directly to validate some staged changes
> before committing...
> 
> So I'd perhaps make this
> 	title = "Staged"

Or maybe "Staged changes" to be slightly more explicit ?

> > +    else:
> > +        # Get the commit title and list of files.
> > +        ret = subprocess.run(['git', 'show', '--pretty=oneline', '--name-only', commit],
> > +                             stdout=subprocess.PIPE)
> > +        files = ret.stdout.decode('utf-8').splitlines()
> > +        title = files[0]
> > +        files = files[1:]
> >  
> >      separator = '-' * len(title)
> >      print(separator)
> > @@ -541,11 +557,11 @@ def check_style(top_level, commit):
> >      files = [f for f in files if len([p for p in patterns if fnmatch.fnmatch(os.path.basename(f), p)])]
> >      if len(files) == 0:
> >          print("Commit doesn't touch source files, skipping")
> > -        return
> > +        return 0
> >  
> >      issues = 0
> >      for f in files:
> > -        issues += check_file(top_level, commit, f)
> > +        issues += check_file(top_level, commit, f, staged)
> >  
> >      if issues == 0:
> >          print("No style issue detected")
> > @@ -554,6 +570,8 @@ def check_style(top_level, commit):
> >          print("%u potential style %s detected, please review" % \
> >                  (issues, 'issue' if issues == 1 else 'issues'))
> >  
> > +    return issues
> > +
> >  
> >  def extract_revlist(revs):
> >      """Extract a list of commits on which to operate from a revision or revision
> > @@ -595,6 +613,8 @@ def main(argv):
> >      parser = argparse.ArgumentParser()
> >      parser.add_argument('--formatter', '-f', type=str, choices=['astyle', 'clang-format'],
> >                          help='Code formatter. Default to clang-format if not specified.')
> > +    parser.add_argument('--staged', '-s', action='store_true',
> > +                        help='Looks at the staged changes, used for pre-commit, defaults to False')
> >      parser.add_argument('revision_range', type=str, default='HEAD', nargs='?',
> >                          help='Revision range (as defined by git rev-parse). Defaults to HEAD if not specified.')
> >      args = parser.parse_args(argv[1:])
> > @@ -632,11 +652,12 @@ def main(argv):
> >  
> >      revlist = extract_revlist(args.revision_range)
> >  
> > +    issues = 0
> >      for commit in revlist:
> > -        check_style(top_level, commit)
> > +        issues += check_style(top_level, commit, args.staged)
> >          print('')
> >  
> > -    return 0
> > +    return issues
> 
> I'd be tempted to split the
>   "checkstyle: Report issue count as a failure"
> and
>   "checkstyle: Support validating staged git state"
> 
> to two patches, but perhaps that's not necessary unless you see benefit
> in splitting.

I don't think there's a need to resubmit just for this, but if a v2 is
needed, it may be nice to split the changes indeed, to explain why
changing the return value is needed.

According to the exit man page,

RATIONALE
       As explained in other sections, certain exit status values have been reserved for special uses and should be used by applications only for those purposes:

        126    A file to be executed was found, but it was not an executable utility.

        127    A utility to be executed was not found.

       >128    A command was interrupted by a signal.

And according to exit(),

The value status & 0377 is returned to the parent process as the process's exit status, and can be collected using one of the wait(2) family of calls.

In the unfortunate event that a multiple of 256 issues would be found,
the caller would think everything went fine. Should we thus return 0 on
success and 1 if any issue was found ?

> >
> >
> >  if __name__ == '__main__':
> > diff --git a/utils/hooks/pre-commit b/utils/hooks/pre-commit
> > new file mode 100755
> > index 0000000..57d23ef
> > --- /dev/null
> > +++ b/utils/hooks/pre-commit
> 
> Though I really think this deserves it's own commit.
> 
>  "utils/hooks: Provide a pre-commit checkstyle hook"
> 
> > @@ -0,0 +1,16 @@
> > +#!/bin/sh
> > +
> > +# Execute the checkstyle script before committing any code, failing the commit
> > +# if needed. With this workflow, false positive can be ignored using:
> > +#   git commit -n
> > +#
> > +# To utilise this hook, install this file with:
> > +#   cp utils/hooks/pre-commit .git/hooks/pre-commit
> 
> One of the things I had toyed with is installing the hook by setting the
> git hook directory to utils/hooks/ :
> 
>  git config core.hooksPath utils/hooks
> 
> But doing so now would invoke both the pre-commit and the post-commit hook.
> 
> I think it's good to provide the option of how someone might install
> their hook
> 
> > +
> > +commit=
> > +if ps -ocommand= -p $PPID | grep -- "--amend"
> > +then
> > +   commit="HEAD~"
> 
> Is this really needed?
> 
> Edit: Yes, I've just played around with it - and I see what's happening.
> Good catch to find that something extra was required here.
> 
> > +fi
> > +
> > +./utils/checkstyle.py --staged $commit
> 
> I wonder if the checkstyle.py could detect if we had staged changes if
> we could fall back to needing only a single variant of the hook which
> the user could choose to install as either a pre-commit or a post-commit
> hook.
> 
> But perhaps it's more straight-forward to have two hooks to keep it
> easier to support any future differences too.

I'm fine either way.

-- 
Regards,

Laurent Pinchart