Testing

Original notebook by Jarrod Millman, part of the Python-bootcamp.

Modifications Hans Fangohr, Sept 2013:

  • Add py.test example
  • minor edits

Move to Python 3, Sept 2016.

Motivation

Computing is error prone

In ordinary computational practice by hand or by desk machines, it is the custom to check every step of the computation and, when an error is found, to localize it by a backward process starting from the first point where the error is noted.

    - Norbert Wiener (1948)

More computing, more problems

The major cause of the software crisis is that the machines have become several orders of magnitude more powerful! To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.

    - Edsger W. Dijkstra (1972)

What testing is and is not...

Testing and debugging

  • debugging is what you do when you know a program is broken
  • testing is a determined, systematic attempt to break a program
  • writing tests is more interesting than debugging

Program correctness

Program testing can be used to show the presence of bugs, but never to show their absence!

    - Edsger W. Dijkstra (1969)

In the imperfect world ...

  • avoid writing code if possible
  • write code as simple as possible
  • avoid cleverness
  • use code to generate code

Programming languages play an important role

Programmers are always surrounded by complexity; we cannot avoid it. Our applications are complex because we are ambitious to use our computers in ever more sophisticated ways. Programming is complex because of the large number of conflicting objectives for each of our programming projects. If our basic tool, the language in which we design and code our programs, is also complicated, the language itself becomes part of the problem rather than part of its solution.

    - C.A.R. Hoare, The Emperor's Old Clothes, Turing Award Lecture (1980)

Testing and reproducibility

In the good old days physicists repeated each other's experiments, just to be sure. Today they stick to FORTRAN, so that they can share each other's programs, bugs included.

    - Edsger W. Dijkstra (1975)

Pre- and post-condition tests

  • what must be true before a method is invoked
  • what must be true after a method is invoked
  • use assertions
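
As a minimal sketch of these three points, the hypothetical function integer_sqrt below (not part of the original notebook) asserts its precondition on entry and its postcondition just before returning:

```python
def integer_sqrt(n):
    """Return the largest integer r with r * r <= n (illustrative helper)."""
    # Precondition: what must be true before the function does any work.
    assert isinstance(n, int) and n >= 0, "n must be a non-negative integer"

    root = 0
    while (root + 1) ** 2 <= n:
        root += 1

    # Postcondition: what must be true about the result before returning it.
    assert root ** 2 <= n < (root + 1) ** 2, "result is not the integer square root"
    return root


print(integer_sqrt(10))
```

If either assertion fails, the AssertionError points directly at the violated condition rather than at some later symptom.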

Program defensively

  • out-of-range index
  • division by zero
  • error returns
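
One way to read these bullets is sketched below. The helper ratio_of_elements and its error messages are made up for illustration. It checks indices and the divisor before using them, and the caller handles the reported errors instead of ignoring them:

```python
def ratio_of_elements(values, i, j):
    """Return values[i] / values[j], checking the inputs first (illustrative helper)."""
    n = len(values)
    # Out-of-range index: reject it explicitly; a negative index would
    # otherwise silently wrap around to the end of the list.
    if not (0 <= i < n and 0 <= j < n):
        raise IndexError("indices %d and %d must lie in range(%d)" % (i, j, n))
    # Division by zero: fail with a message that names the offending element.
    if values[j] == 0:
        raise ZeroDivisionError("values[%d] is zero" % j)
    return values[i] / values[j]


data = [4.0, 2.0, 0.0]
print(ratio_of_elements(data, 0, 1))

rejected = []
for i, j in [(0, 2), (0, 5)]:
    try:
        ratio_of_elements(data, i, j)
    except (IndexError, ZeroDivisionError) as err:
        # The caller checks for the failure instead of carrying on regardless.
        rejected.append(type(err).__name__)
        print("rejected:", type(err).__name__, err)
```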

Be systematic

  • incremental
  • simple things first
  • know what to expect
  • compare independent implementations

Automate it

  • regression tests ensure that changes don't break existing functionality
  • verify conservation
  • unit tests (white box testing)
  • measure test coverage

Interface and implementation

  • an interface is how something is used
  • an implementation is how it is written

Testing in Python

Landscape

  • errors, exceptions, and debugging
  • assert, doctest, and unit tests
  • logging, unittest, and nose

Errors & Exceptions

Syntax Errors

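  • Caught by the Python parser, prior to execution
  • an arrow marks the place in the source where parsing failed

Note that the cell below is syntactically valid: true is parsed as an ordinary name, so Python raises a NameError at run time rather than a SyntaxError (the Boolean constant is spelled True).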
In [52]:
while true:
    print('Hello world')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-52-07d9d27fa722> in <module>()
----> 1 while true:
      2     print('Hello world')

NameError: name 'true' is not defined
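
For a genuine parse-time failure, a minimal sketch (the source string is made up for illustration) hands deliberately malformed code to the built-in compile(), which parses the text without executing it:

```python
source = "while True print('Hello world')"   # the colon after True is missing

try:
    # compile() only parses the text; the code is never executed.
    compile(source, "<example>", "exec")
    caught = None
except SyntaxError as err:
    caught = err
    print("SyntaxError at line", err.lineno, "column", err.offset)
```

The exact wording of the error message depends on the Python version, so it is not printed here.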

Exceptions

  • Detected at run time, while the program executes
In [53]:
1/0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-53-05c9758a9c21> in <module>()
----> 1 1/0

ZeroDivisionError: division by zero
In [54]:
factorial
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-54-62dd995547c1> in <module>()
----> 1 factorial

NameError: name 'factorial' is not defined
In [55]:
'1' + 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-55-2e87809dd063> in <module>()
----> 1 '1' + 1

TypeError: Can't convert 'int' object to str implicitly

Exception handling

In [56]:
try:
    # open() raises FileNotFoundError if the file does not exist
    f = open('filenamethatdoesnotexist.txt')
except FileNotFoundError:
    print('No such file')
No such file

Raising exceptions

In [57]:
def newfunction():
    raise NotImplementedError("Still need to write this code")

newfunction()
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-57-c1922a8a630b> in <module>()
      2     raise NotImplementedError("Still need to write this code")
      3 
----> 4 newfunction()

<ipython-input-57-c1922a8a630b> in newfunction()
      1 def newfunction():
----> 2     raise NotImplementedError("Still need to write this code")
      3 
      4 newfunction()

NotImplementedError: Still need to write this code

Debugging

In [60]:
def foo(x):
    return 1/x

def bar(y):
    return foo(1-y)

bar(1)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-60-e30cf484dce8> in <module>()
      5     return foo(1-y)
      6 
----> 7 bar(1)

<ipython-input-60-e30cf484dce8> in bar(y)
      3 
      4 def bar(y):
----> 5     return foo(1-y)
      6 
      7 bar(1)

<ipython-input-60-e30cf484dce8> in foo(x)
      1 def foo(x):
----> 2     return 1/x
      3 
      4 def bar(y):
      5     return foo(1-y)

ZeroDivisionError: division by zero
In [61]:
%debug
> <ipython-input-60-e30cf484dce8>(2)foo()
      1 def foo(x):
----> 2     return 1/x
      3 
      4 def bar(y):
      5     return foo(1-y)

ipdb> p x    # can x really be zero?
0
ipdb> up
> <ipython-input-60-e30cf484dce8>(5)bar()
      3 
      4 def bar(y):
----> 5     return foo(1-y)
      6 
      7 bar(1)

ipdb> p y     # what is y (one function call UP)
1
ipdb> exit   

Fixing bugs

In [62]:
def foo(x):
    if x==0:
        return float('Inf')
    else:
        return 1/x

bar(1)
Out[62]:
inf
In [63]:
def foo(x):
    try:
        return 1/x
    except ZeroDivisionError:
        return float('Inf')

bar(1)
Out[63]:
inf

Test as you code

Type checking

In [64]:
s = input("Please enter an integer: ")  # input() always returns a str in Python 3
if not isinstance(s, int):              # so this check is always true here
    print("Casting ", s, " to integer.")
    i = int(s)
Please enter an integer: 5
Casting  5  to integer.
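
Because input() returns a str in Python 3, an isinstance check on its result cannot tell valid from invalid input. A minimal alternative sketch, using a hypothetical helper parse_integer that is not part of the original notebook, attempts the conversion and treats a ValueError as the signal for bad input:

```python
def parse_integer(text):
    """Convert text to int, or return None if it is not an integer literal."""
    try:
        return int(text)
    except ValueError:
        return None


for candidate in ["5", " 42 ", "3.7", "five"]:
    print(repr(candidate), "->", parse_integer(candidate))
```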

Assert invariants

In [65]:
if i % 3 == 0:
    print(1)
elif i % 3 == 1:
    print(2)
else:
    assert i % 3 == 2
    print(3)
3

Example

Let's make a factorial function.

In [66]:
%%file myfactorial.py

def factorial2(n):
    """ Details to come ...
    """

    raise NotImplementedError

def test():
     from math import factorial
     for x in range(10):
         print(".", end="")
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x
Overwriting myfactorial.py

Let's test it ...

In [67]:
import myfactorial
myfactorial.test()

Looks like we will have to implement our function, if we want to make any progress...

In [68]:
%%file myfactorial.py

def factorial2(n):
    """ Details to come ...
    """

    if n == 0:
        return 1
    else:
        return n * factorial2(n-1)

def test():
     from math import factorial
     for x in range(10):
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x
Overwriting myfactorial.py

Let's test it ...

In [69]:
import importlib
importlib.reload(myfactorial)
myfactorial.test()

Seems to be okay so far. However, calling factorial2 with a negative number, say, never reaches the base case n == 0: the recursion continues until Python gives up with a RecursionError. Thus:

What about preconditions

What happens if we call factorial2 with a negative integer? Or something that's not an integer?
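
A small experiment answers both questions. The sketch below uses an illustrative copy of the unchecked function (named unchecked_factorial here so that it does not interfere with myfactorial.py) and assumes Python 3.5 or later, where RecursionError is available:

```python
def unchecked_factorial(n):
    """Recursive factorial without any input checks (illustrative copy)."""
    if n == 0:
        return 1
    return n * unchecked_factorial(n - 1)


failed_inputs = []
for bad_input in (-1, 2.5):
    try:
        unchecked_factorial(bad_input)
    except RecursionError:
        # Neither -1 nor 2.5 ever counts down to exactly 0.
        failed_inputs.append(bad_input)
        print("n =", bad_input, "never reaches the base case: RecursionError")
```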

In [70]:
%%file myfactorial.py
def factorial2(n):
    """ Find n!. Raise an AssertionError if n is negative or non-integral.
    """

    assert type(n) is int and n >= 0, "Unrecognized input"

    if n == 0:
        return 1
    else:
        return n * factorial2(n - 1)

def test():
     from math import factorial
     for x in range(10):
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x
Overwriting myfactorial.py

doctests -- executable examples

In [71]:
importlib.reload(myfactorial)
from myfactorial import factorial2
[factorial2(n) for n in range(5)]
Out[71]:
[1, 1, 2, 6, 24]
In [72]:
%%file myfactorial.py
def factorial2(n):
    """ Find n!. Raise an AssertionError if n is negative or non-integral.

    >>> from myfactorial import factorial2
    >>> [factorial2(n) for n in range(5)]
    [1, 1, 2, 6, 24]
    """

    assert type(n) is int and n >= 0, "Unrecognized input"

    if n == 0:
        return 1
    else:
        return n * factorial2(n - 1)

def test():
     from math import factorial
     for x in range(10):
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x
Overwriting myfactorial.py

Running doctests

In [73]:
!python -m doctest -v myfactorial.py
Trying:
    from myfactorial import factorial2
Expecting nothing
ok
Trying:
    [factorial2(n) for n in range(5)]
Expecting:
    [1, 1, 2, 6, 24]
ok
2 items had no tests:
    myfactorial
    myfactorial.test
1 items passed all tests:
   2 tests in myfactorial.factorial2
2 tests in 3 items.
2 passed and 0 failed.
Test passed.
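
Doctests can also document the error case promised in the docstring. The following is a sketch only: it redefines factorial2 locally with a second, added example, and it runs the examples through doctest's DocTestFinder and DocTestRunner classes so that it works outside a module file as well:

```python
import doctest


def factorial2(n):
    """Find n!. Raise an AssertionError if n is negative or non-integral.

    >>> [factorial2(n) for n in range(5)]
    [1, 1, 2, 6, 24]
    >>> factorial2(-1)
    Traceback (most recent call last):
        ...
    AssertionError: Unrecognized input
    """
    # Check the type first so that non-numeric input never reaches the comparison.
    assert type(n) is int and n >= 0, "Unrecognized input"
    return 1 if n == 0 else n * factorial2(n - 1)


# Collect and run the docstring examples of this one function.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for doc_test in finder.find(factorial2, "factorial2", globs={"factorial2": factorial2}):
    runner.run(doc_test)

results = runner.summarize(verbose=False)
print(results)
```

For an expected exception, doctest compares only the exception type and message; the traceback body can be abbreviated with an ellipsis.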

Real world testing and continuous integration

unittest and nose

Test fixtures (unittest)

  • create self-contained tests
  • setup: open file, connect to a DB, create data structures
  • teardown: tidy up afterward
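
A minimal sketch of a fixture with the standard-library unittest module follows. The class name, the temporary data file and its contents are invented for illustration:

```python
import os
import tempfile
import unittest


class TestReadData(unittest.TestCase):
    def setUp(self):
        # Setup: runs before every test method and creates a fresh data file.
        handle, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(handle, "w") as f:
            f.write("1\n2\n3\n")

    def tearDown(self):
        # Teardown: runs after every test method, even if the test failed.
        os.remove(self.path)

    def test_sum_of_values(self):
        with open(self.path) as f:
            values = [int(line) for line in f]
        self.assertEqual(sum(values), 6)

    def test_number_of_lines(self):
        with open(self.path) as f:
            self.assertEqual(len(f.readlines()), 3)


# Build and run the suite explicitly so the sketch also works inside a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestReadData)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because setUp and tearDown wrap each test method separately, the two tests do not share state and can run in any order.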

Test runner (nose, pytest)

  • nosetests, py.test
  • test discovery: any callable beginning with test in a module beginning with test

Testing scientific computing libraries

Such libraries often ship with their own test routines, for example:

In [74]:
import scipy.integrate
scipy.integrate.test()
...................................................................
Running unit tests for scipy.integrate
NumPy version 1.10.4
NumPy relaxed strides checking option: False
NumPy is installed in //anaconda/lib/python3.5/site-packages/numpy
SciPy version 0.17.0
SciPy is installed in //anaconda/lib/python3.5/site-packages/scipy
Python version 3.5.1 |Anaconda 4.0.0 (x86_64)| (default, Dec  7 2015, 11:24:55) [GCC 4.2.1 (Apple Inc. build 5577)]
nose version 1.3.7
........................................................................................................................................K.........................................
----------------------------------------------------------------------
Ran 245 tests in 3.009s

OK (KNOWNFAIL=1)
Out[74]:
<nose.result.TextTestResult run=245 errors=0 failures=0>

Assertions revisited - numerical mathematics

Mathematically

$x = (\sqrt{x})^2$.

So what is happening here:

In [75]:
import math
assert 2 == math.sqrt(2)**2
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-75-7136ce5c9672> in <module>()
      1 import math
----> 2 assert 2 == math.sqrt(2)**2

AssertionError: 
In [ ]:
math.sqrt(2)**2
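
The assertion fails because math.sqrt(2) is only the closest floating-point number to the true square root, so squaring it does not give exactly 2. A short sketch using only the standard library (math.isclose requires Python 3.5 or later) makes the size of the discrepancy visible:

```python
import math
import sys

value = math.sqrt(2) ** 2
print(repr(value))                 # shows the digits that differ from 2
print(value == 2)                  # the exact comparison
print(abs(value - 2) < 4 * sys.float_info.epsilon)   # within a few rounding errors?
print(math.isclose(value, 2))      # comparison with a relative tolerance
```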

NumPy Testing

What if we consider two values x and y almost equal when they differ only by rounding error? Can we modify our assertion?

In [ ]:
import numpy as np
np.testing.assert_almost_equal(2, math.sqrt(2) ** 2)
In [76]:
x=1.000001
y=1.000002
np.testing.assert_almost_equal(x, y, decimal=5)
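
The decimal argument sets how close the two values must be. The sketch below reuses x and y from the cell above; the stricter setting decimal=7 and the rtol value are chosen purely for illustration. It shows the same pair passing at one tolerance and failing at another, and then the related function np.testing.assert_allclose:

```python
import numpy as np

x = 1.000001
y = 1.000002

# Passes: |x - y| is about 1e-6, well below the tolerance set by decimal=5.
np.testing.assert_almost_equal(x, y, decimal=5)

# At decimal=7 the same difference is too large and the assertion fails.
try:
    np.testing.assert_almost_equal(x, y, decimal=7)
    strict_check_passed = True
except AssertionError:
    strict_check_passed = False
print("decimal=7 passes:", strict_check_passed)

# assert_allclose expresses the tolerance as relative (and absolute) error instead.
np.testing.assert_allclose(x, y, rtol=1e-5)
```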

Testing with py.test

Going beyond doctest and unittest, two frameworks are widely used for regression testing:

  • nose
  • pytest

Here, we focus on pytest.

The example we use is the myfactorial.py file created earlier:

In [77]:
# %load myfactorial.py
def factorial2(n):
    """ Find n!. Raise an AssertionError if n is negative or non-integral.

    >>> from myfactorial import factorial2
    >>> [factorial2(n) for n in range(5)]
    [1, 1, 2, 6, 24]
    """

    assert type(n) is int and n >= 0, "Unrecognized input"

    if n == 0:
        return 1
    else:
        return n * factorial2(n - 1)

def test():
     from math import factorial
     for x in range(10):
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x

Providing test functions

(Addition to original notebook, Hans Fangohr, 21 Sep 2013)

py.test is an executable that searches a given file for all functions whose names start with test and executes them. Any failed assertion is reported as a test failure.

For example, py.test can run the test() function that has been defined already in myfactorial:

In [79]:
!py.test myfactorial.py
============================= test session starts ==============================
platform darwin -- Python 3.5.1, pytest-2.8.5, py-1.4.31, pluggy-0.3.1
rootdir: /Users/fangohr/hg/teaching-python/notebook, inifile: 
plugins: nbval-0.3.1, cov-2.2.1
collected 1 items 

myfactorial.py .

=========================== 1 passed in 0.01 seconds ===========================

This output (the '.' after myfactorial.py) indicates success. We can get a more detailed output using the -v switch for extra verbosity:

In [80]:
!py.test -v myfactorial.py
============================= test session starts ==============================
platform darwin -- Python 3.5.1, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- //anaconda/bin/python
cachedir: .cache
rootdir: /Users/fangohr/hg/teaching-python/notebook, inifile: 
plugins: nbval-0.3.1, cov-2.2.1
collected 1 items 

myfactorial.py::test PASSED

=========================== 1 passed in 0.00 seconds ===========================

Sometimes, we like having the tests for myfactorial.py gathered in a separate file, for example in test_myfactorial.py. We create such a file, and within the file we create a number of test functions, each with a name starting with test:

In [81]:
%%file test_myfactorial.py

from myfactorial import factorial2 

def test_basics():
    assert factorial2(0) == 1
    assert factorial2(1) == 1
    assert factorial2(3) == 6
    
def test_against_standard_lib():
    import math
    for i in range(20):
        assert math.factorial(i) == factorial2(i)
        
def test_negative_number_raises_error():
    import pytest

    with pytest.raises(AssertionError):    # this will pass if 
        factorial2(-1)                     # factorial2(-1) raises 
                                           # an AssertionError
      
    with pytest.raises(AssertionError):
        factorial2(-10)
    
Overwriting test_myfactorial.py

We can now run the tests in this file using

In [82]:
!py.test -v test_myfactorial.py 
============================= test session starts ==============================
platform darwin -- Python 3.5.1, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- //anaconda/bin/python
cachedir: .cache
rootdir: /Users/fangohr/hg/teaching-python/notebook, inifile: 
plugins: nbval-0.3.1, cov-2.2.1
collected 3 items 

test_myfactorial.py::test_basics PASSED
test_myfactorial.py::test_against_standard_lib PASSED
test_myfactorial.py::test_negative_number_raises_error PASSED

=========================== 3 passed in 0.01 seconds ===========================

The py.test command can also be given a directory. It then searches that directory and its subdirectories for files whose names start with test (by default, files matching test_*.py or *_test.py) and attempts to run all the tests in them.

Or we can provide a list of test files to work through:

In [83]:
!py.test -v test_myfactorial.py myfactorial.py
============================= test session starts ==============================
platform darwin -- Python 3.5.1, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- //anaconda/bin/python
cachedir: .cache
rootdir: /Users/fangohr/hg/teaching-python/notebook, inifile: 
plugins: nbval-0.3.1, cov-2.2.1
collected 4 items 

test_myfactorial.py::test_basics PASSED
test_myfactorial.py::test_against_standard_lib PASSED
test_myfactorial.py::test_negative_number_raises_error PASSED
myfactorial.py::test PASSED

=========================== 4 passed in 0.01 seconds ===========================
In [84]:
!hg tip
changeset:   616:a8ee6aafe5a7
tag:         tip
user:        Hans Fangohr [bin] <fangohr@soton.ac.uk>
date:        Fri Sep 16 19:57:15 2016 +0100
summary:     update to Python3, minor improvements
