Check if the last Git commit has test coverage
##############################################

:date: 2018-07-26T12:49:52Z
:category: blog
:tags: python,development,testing
:url: 2018/07/26/check-if-last-git-commit-has-test-coverage/
:save_as: 2018/07/26/check-if-last-git-commit-has-test-coverage/index.html
:status: published
:author: Gergely Polonkai

I use Python at work and for private projects.  I also aim to write tests for my code, especially
recently.  As I usually don’t start from 100% code coverage (TDD is not my game), I at least
want to know if the code I just wrote has full coverage.

The trick is to collect all the lines that changed, and all the lines that have no coverage.  Then
compare the two, and you have the uncovered lines that changed!
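
As a minimal sketch of that comparison, assume both sides have already been collected into
dictionaries that map file names to line numbers.  The function name and the sample data below are
made up for illustration:

.. code-block:: python

   def changed_and_uncovered(changed, uncovered):
       """Return the line numbers present in both mappings, per file."""
       result = {}

       for filename, lines in changed.items():
           common = sorted(set(lines) & set(uncovered.get(filename, ())))

           if common:
               result[filename] = common

       return result


   # Hypothetical sample data
   changed = {'app/utils.py': [10, 11, 12, 40], 'app/models.py': [5]}
   uncovered = {'app/utils.py': [11, 12, 13, 99]}

   print(changed_and_uncovered(changed, uncovered))
   # {'app/utils.py': [11, 12]}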

Getting the list of changed lines
=================================

Recently, I bumped into
`this article <https://adam.younglogic.com/2018/07/testing-patch-has-test/>`_.  It is a great awk
script that lists the lines that changed in the latest commit.  I really have no problem with awk,
but I’m pretty sure it can be done in Python, as that is my main language nowadays.

.. code-block:: python

   import re
   import subprocess


   def get_changed_lines():
       """Get the line numbers that changed in the last commit
       """

       git_output = subprocess.check_output(['git', 'show']).decode('utf-8')

       current_file = None
       lines = {}
       left = 0
       right = 0

       for line in git_output.split('\n'):
           # Hunk header; Git omits the line count when it is 1 (e.g. "@@ -3 +3 @@")
           match = re.match(r'^@@ -([0-9]+)(?:,[0-9]+)? [+]([0-9]+)(?:,[0-9]+)? @@', line)

           if match:
               left = int(match.group(1))
               right = int(match.group(2))

               continue

           if re.match(r'^\+\+\+', line):
               # Strip the "+++ b/" prefix to get the file name
               current_file = line[6:]

               continue

           if re.match(r'^-', line):
               # Removed lines only exist on the old (left) side
               left += 1

               continue

           if re.match(r'^[+]', line):
               # Save this line number as changed
               lines.setdefault(current_file, [])
               lines[current_file].append(right)
               right += 1

               continue

           # Context lines exist on both sides
           left += 1
           right += 1

       return lines

OK, not as short as the awk script, but it works just fine.
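
To try the parsing logic without a repository, the same idea can be sketched as a pure function
over diff text.  This is only an illustration: ``added_lines`` and the sample diff are made up, and
the regular expression accepts hunk headers with or without line counts, since the unified diff
format leaves the count out for one-line hunks.

.. code-block:: python

   import re

   HUNK_RE = re.compile(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@')


   def added_lines(diff_text):
       """Map file names to the new-side line numbers added by a diff."""
       lines = {}
       current_file = None
       right = 0

       for line in diff_text.splitlines():
           match = HUNK_RE.match(line)

           if match:
               right = int(match.group(1))
           elif line.startswith('+++ '):
               # '+++ b/path' names the new version of the file
               current_file = line[6:]
           elif line.startswith('+'):
               lines.setdefault(current_file, []).append(right)
               right += 1
           elif not line.startswith('-'):
               # Context lines advance the new-side counter, removed lines do not
               right += 1

       return lines


   # Hypothetical diff, built line by line to keep the leading spaces visible
   SAMPLE = '\n'.join([
       '--- a/hello.py',
       '+++ b/hello.py',
       '@@ -1,3 +1,4 @@',
       ' import sys',
       '+import os',
       " print('hi')",
       " print('bye')",
       '@@ -10 +11 @@',
       '-old = 1',
       '+new = 2',
   ])

   print(added_lines(SAMPLE))
   # {'hello.py': [2, 11]}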

Getting the uncovered lines
===========================

Coverage.py can list the uncovered lines with ``coverage report --show-missing``.  For
Calendar.social, this looks something like this:

.. code-block:: log

   Name                                     Stmts   Miss  Cover   Missing
   ----------------------------------------------------------------------
   calsocial/__init__.py                      173     62    64%   44, 138-148, 200, 239-253, 261-280, 288-295, 308-309, 324-346, 354-363
   calsocial/__main__.py                        3      3     0%   4-9
   calsocial/account.py                       108     51    53%   85-97, 105-112, 125, 130-137, 148-160, 169-175, 184-200, 209-212, 221-234
   calsocial/app_state.py                      10      0   100%
   calsocial/cache.py                          73     11    85%   65-70, 98, 113, 124, 137, 156-159
   calsocial/calendar_system/__init__.py       10      3    70%   32, 41, 48
   calsocial/calendar_system/gregorian.py      77      0   100%
   calsocial/config_development.py             11     11     0%   4-17
   calsocial/config_testing.py                 12      0   100%
   calsocial/forms.py                         198     83    58%   49, 59, 90, 136-146, 153, 161-169, 188-195, 198-206, 209-212, 228-232, 238-244, 252-253, 263-267, 273-277, 317-336, 339-342, 352-354, 362-374, 401-413
   calsocial/models.py                        372     92    75%   49-51, 103-106, 177, 180-188, 191-200, 203, 242-248, 257-268, 289, 307, 349, 352-359, 378, 392, 404-409, 416, 444, 447, 492-496, 503, 510, 516, 522, 525, 528, 535-537, 545-551, 572, 606-617, 620, 652, 655, 660, 700, 746-748, 762-767, 774-783, 899, 929, 932
   calsocial/security.py                       15      3    80%   36, 56-58
   calsocial/utils.py                          42      5    88%   45-48, 52-53
   ----------------------------------------------------------------------
   TOTAL                                     1104    324    71%

All we have to do is convert these ranges into lists of numbers, and compare them with the
result of the previous function:

.. code-block:: python

   import re
   import sys


   def get_uncovered_lines(changed_lines):
       """Get the changed lines that have not been covered by tests

       Reads the output of ``coverage report --show-missing`` from standard input.
       """

       column_widths = []
       uncovered_lines = {}

       for line in sys.stdin:
           line = line.strip()

           if line.startswith('---'):
               continue

           if line.startswith('Name '):
               # The header row tells us where each column starts
               match = re.match(r'^(Name +)(Stmts +)(Miss +)(Cover +)Missing$', line)
               assert match

               column_widths = [len(col) for col in match.groups()]

               continue

           # The file name is in the first column, the ranges come after the last one
           name = line[:column_widths[0]].strip()
           missing = line[sum(column_widths):].strip()

           for value in missing.split(', '):
               if not value:
                   continue

               try:
                   number = int(value)
               except ValueError:
                   first, last = value.split('-')
                   lines = range(int(first), int(last) + 1)
               else:
                   lines = range(number, number + 1)

               for lineno in lines:
                   # Keep only the uncovered lines that the last commit touched
                   if name in changed_lines and lineno in changed_lines[name]:
                       uncovered_lines.setdefault(name, [])
                       uncovered_lines[name].append(lineno)

       return uncovered_lines

At the end we have a dictionary that has filenames as keys, and lists of changed but uncovered
line numbers as values.
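
The range expansion is the part that is easiest to get wrong, so here it is in isolation.  This is
a small sketch with a made-up helper name, ``expand_missing``, and it assumes the ``Missing``
column only contains plain numbers and ``first-last`` ranges, as in the report above:

.. code-block:: python

   def expand_missing(missing):
       """Expand a string such as '44, 138-140' into a list of line numbers."""
       numbers = []

       for chunk in missing.split(','):
           chunk = chunk.strip()

           if not chunk:
               continue

           # For a single number, partition() leaves `last` empty
           first, _, last = chunk.partition('-')
           numbers.extend(range(int(first), int(last or first) + 1))

       return numbers


   print(expand_missing('44, 138-140, 200'))
   # [44, 138, 139, 140, 200]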

Converting back to ranges
=========================

To make the final result more readable, let’s convert them back to a nice ``from_line-to_line``
range list first:

.. code-block:: python

   def line_numbers_to_ranges():
       """List the lines that have not been covered
       """

       changed_lines = get_changed_lines()
       uncovered_lines = get_uncovered_lines(changed_lines)

       line_list = []

       for filename, lines in uncovered_lines.items():
           lines = sorted(lines)
           last_value = None

           ranges = []

           for lineno in lines:
               if last_value and last_value + 1 == lineno:
                   ranges[-1].append(lineno)
               else:
                   ranges.append([lineno])

               last_value = lineno

           range_list = []

           for range_ in ranges:
               first = range_.pop(0)

               if range_:
                   range_list.append(f'{first}-{range_[-1]}')
               else:
                   range_list.append(str(first))

           line_list.append((filename, ', '.join(range_list)))

       return line_list
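
For comparison, the grouping step can also be written with ``itertools.groupby``.  This is a
different technique from the loop above, not a replacement for it, and ``collapse_ranges`` is a
name invented for this sketch.  It relies on the fact that, in a sorted list, consecutive numbers
all have the same difference between their value and their position:

.. code-block:: python

   from itertools import groupby


   def collapse_ranges(numbers):
       """Turn [1, 2, 3, 7] into '1-3, 7'."""
       parts = []
       ordered = sorted(set(numbers))

       # Consecutive numbers share the same (value - position) difference
       for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
           run = [value for _, value in group]

           if len(run) == 1:
               parts.append(str(run[0]))
           else:
               parts.append(f'{run[0]}-{run[-1]}')

       return ', '.join(parts)


   print(collapse_ranges([44, 138, 139, 140, 200]))
   # 44, 138-140, 200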

Printing the result
===================

Now all that is left is to print the result on the screen in a format digestible by a human being:

.. code-block:: python

   def tabular_print(uncovered_lines):
       """Print the list of uncovered lines on the screen in a tabbed format
       """

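       if not uncovered_lines:
           # Nothing to report; max() would raise ValueError on an empty sequence
           return

       max_filename_len = max(len(data[0]) for data in uncovered_lines)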

       for filename, lines in uncovered_lines:
           print(filename.ljust(max_filename_len + 2) + lines)

And we are done.

Conclusion
==========

This task never seemed hard to accomplish, but somehow I never put enough energy into it to make
it happen.  Kudos to Adam Young for doing some legwork for me!