[PATCH 1/3] utils: checkstyle.py: Rework commit message parsing

Dan Scally dan.scally at ideasonboard.com
Mon Aug 12 14:42:03 CEST 2024


Hi Laurent

On 10/08/2024 01:58, Laurent Pinchart wrote:
> When parsing commit messages, the Commit class tries to optimize the
> process by invoking git-show once only, extracting both the commit
> author, title and modified files in one go. As a result, the derived
> Amendment class needs to implement the commit parsing separately, as the
> modified files need to be extracted differently for amendments (and the
> commit ID also needs to be retrieved differently). Furthermore, because
> of the list of named files, extracting the trailers needs to invoke
> git-show separately.
>
> Improve the situation by reworking the commit message parsing in three
> steps. In the first step, use git-show to extract the commit ID, author,
> title and body. In the second step, invoke git-interpret-trailers to
> extract the trailers from the body that was previously extracted. The
> third and final step extracts the list of modified files, using
> different methods for regular commits and amendments.
>
> This allows sharing code for the first two steps between the Commit and
> Amendment classes, making the code simpler. The Commit class still
> invokes git three times, while the Amendment class runs it three times
> instead of four, improving performance slightly.
>
> Signed-off-by: Laurent Pinchart <laurent.pinchart at ideasonboard.com>
> ---


Reviewed-by: Daniel Scally <dan.scally at ideasonboard.com>

>   utils/checkstyle.py | 60 +++++++++++++++++++++++++--------------------
>   1 file changed, 33 insertions(+), 27 deletions(-)
>
> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> index dae5d518920a..aa0731abdb5a 100755
> --- a/utils/checkstyle.py
> +++ b/utils/checkstyle.py
> @@ -211,33 +211,42 @@ class CommitFile:
>   
>   class Commit:
>       def __init__(self, commit):
> -        self.commit = commit
> +        self._commit = commit
>           self._author = None
>           self._trailers = []
>           self._parse()
>   
> -    def _parse_trailers(self):
> -        proc_show = subprocess.run(['git', 'show', '--format=%b',
> -                                    '--no-patch', self.commit],
> -                                   stdout=subprocess.PIPE)
> +    def _parse_commit(self):
> +        # Get and parse the commit message.
> +        ret = subprocess.run(['git', 'show', '--format=%H%n%an <%ae>%n%s%n%b',
> +                              '--no-patch', self.commit],
I should read the whole commit when I get confused; it took me way too long to spot that this now 
references the property.
> +                             stdout=subprocess.PIPE).stdout.decode('utf-8')
> +        lines = ret.splitlines()
> +
> +        self._commit = lines[0]
> +        self._author = lines[1]
> +        self._title = lines[2]
> +        self._body = lines[3:]
> +
> +        # Parse the trailers. Feed git-interpret-trailers with a full commit
> +        # message that includes both the title and the body, as it otherwise
> +        # fails to find trailers when the body contains trailers only.
> +        message = self._title + '\n\n' + '\n'.join(self._body)
>           trailers = subprocess.run(['git', 'interpret-trailers', '--parse'],
> -                                  input=proc_show.stdout,
> +                                  input=message.encode('utf-8'),
>                                     stdout=subprocess.PIPE).stdout.decode('utf-8')
>   
>           self._trailers = trailers.splitlines()
>   
>       def _parse(self):
> -        # Get the commit title and list of files.
> -        ret = subprocess.run(['git', 'show', '--format=%an <%ae>%n%s',
> -                              '--name-status', self.commit],
> +        self._parse_commit()
> +
> +        # Get the list of files. Use an empty format specifier to suppress the
> +        # commit message completely.
> +        ret = subprocess.run(['git', 'show', '--format=', '--name-status',
> +                              self.commit],
>                                stdout=subprocess.PIPE).stdout.decode('utf-8')
> -        lines = ret.splitlines()
> -
> -        self._author = lines[0]
> -        self._title = lines[1]
> -        self._files = [CommitFile(f) for f in lines[2:] if f]
> -
> -        self._parse_trailers()
> +        self._files = [CommitFile(f) for f in ret.splitlines()]
>   
>       def files(self, filter='AMR'):
>           return [f.filename for f in self._files if f.status in filter]
> @@ -246,6 +255,10 @@ class Commit:
>       def author(self):
>           return self._author
>   
> +    @property
> +    def commit(self):
> +        return self._commit
> +
>       @property
>       def title(self):
>           return self._title
> @@ -284,21 +297,14 @@ class StagedChanges(Commit):
>   
>   class Amendment(Commit):
>       def __init__(self):
> -        Commit.__init__(self, '')
> +        Commit.__init__(self, 'HEAD')
>   
>       def _parse(self):
> -        # Create a title using HEAD commit and parse the trailers.
> -        ret = subprocess.run(['git', 'show', '--format=%an <%ae>%n%H %s',
> -                             '--no-patch'],
> -                             stdout=subprocess.PIPE).stdout.decode('utf-8')
> -        lines = ret.splitlines()
> +        self._parse_commit()
>   
> -        self._author = lines[0]
> -        self._title = 'Amendment of ' + lines[1]
> +        self._title = f'Amendment of "{self.title}"'
>   
> -        self._parse_trailers()
> -
> -        # Extract the list of modified files
> +        # Extract the list of modified files.
>           ret = subprocess.run(['git', 'diff', '--staged', '--name-status', 'HEAD~'],
>                                stdout=subprocess.PIPE).stdout.decode('utf-8')
>           self._files = [CommitFile(f) for f in ret.splitlines()]
>
> base-commit: 62760bd2605a83e663b9003244ff42f8946f8955


More information about the libcamera-devel mailing list