Testing¶

Original notebook by Jarrod Millman, part of the Python-bootcamp.

Modifications Hans Fangohr, Sept 2013:

Add py.test example
minor edits

Move to Python 3, Sept 2016.

Motivation¶

Computing is error prone¶

In ordinary computational practice by hand or by desk machines, it is the custom to check every step of the computation and, when an error is found, to localize it by a backward process starting from the first point where the error is noted.
    - Norbert Wiener (1948)

More computing, more problems¶

The major cause of the software crisis is that the machines have become several orders of magnitude more powerful! To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.
    - Edsger W. Dijkstra (1972)

What testing is and is not...¶

Testing and debugging¶

debugging is what you do when you know a program is broken
testing is a determined, systematic attempt to break a program
writing tests is more interesting than debugging

Program correctness¶

Program testing can be used to show the presence of bugs, but never to show their absence!
    - Edsger W. Dijkstra (1969)

In the imperfect world ...¶

avoid writing code if possible
write code as simple as possible
avoid cleverness
use code to generate code

Programming languages play an important role¶

Programmers are always surrounded by complexity; we cannot avoid it. Our applications are complex because we are ambitious to use our computers in ever more sophisticated ways. Programming is complex because of the large number of conflicting objectives for each of our programming projects. If our basic tool, the language in which we design and code our programs, is also complicated, the language itself becomes part of the problem rather than part of its solution.

    - C.A.R. Hoare, The Emperor's Old Clothes, Turing Award Lecture (1980)

Testing and reproducibility¶

In the good old days physicists repeated each other's experiments, just to be sure. Today they stick to FORTRAN, so that they can share each other's programs, bugs included.
    - Edsger W. Dijkstra (1975)

Pre- and post-condition tests¶

what must be true before a method is invoked
what must be true after a method is invoked
use assertions
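
A minimal sketch of how pre- and post-conditions can be written as assertions. The function integer_sqrt is a hypothetical example chosen for illustration; it is not part of the original notebook.

```python
def integer_sqrt(n):
    """Return the largest integer r with r*r <= n (hypothetical example)."""
    # Precondition: what must be true before the computation starts.
    assert isinstance(n, int) and n >= 0, "n must be a non-negative integer"

    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1

    # Postcondition: what must be true about the result.
    assert r * r <= n < (r + 1) * (r + 1), "result is not the integer square root"
    return r


print(integer_sqrt(17))  # prints 4
```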

Program defensively¶

out-of-range index
division by zero
error returns
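
The three points above could look as follows in a small function. This is only a sketch: safe_mean and its convention of returning None on bad input are assumptions made for this illustration.

```python
def safe_mean(values, start, stop):
    """Mean of values[start:stop], or None for bad input (hypothetical helper)."""
    # Out-of-range index: reject it explicitly, because slicing would
    # silently truncate the range instead of reporting a problem.
    if not (0 <= start <= stop <= len(values)):
        return None
    selected = values[start:stop]
    # Division by zero: an empty selection has no mean.
    if len(selected) == 0:
        return None
    return sum(selected) / len(selected)


data = [2.0, 4.0, 6.0]
result = safe_mean(data, 0, 5)
# Error return: the caller has to check the returned value before using it.
if result is None:
    print("could not compute the mean")
```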

Be systematic¶

incremental
simple things first
know what to expect
compare independent implementations
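
As a sketch of "compare independent implementations": the same quantity is computed in two unrelated ways, and the results are checked against each other. Both function names are hypothetical.

```python
def sum_of_squares_loop(n):
    """Sum 1**2 + 2**2 + ... + n**2 by explicit summation."""
    total = 0
    for k in range(1, n + 1):
        total += k * k
    return total


def sum_of_squares_formula(n):
    """The same sum from the closed form n(n+1)(2n+1)/6."""
    return n * (n + 1) * (2 * n + 1) // 6


# Simple things first: start at n = 0 and work upwards.
for n in range(50):
    assert sum_of_squares_loop(n) == sum_of_squares_formula(n), n
```

A shared bug is unlikely when the two implementations have nothing in common, so agreement gives some confidence in both.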

Automate it¶

regression tests ensure that changes don't break existing functionality
verify conservation
unit tests (white box testing)
measure test coverage
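
One way to read "verify conservation" is to check automatically that a quantity which should stay constant really does. The sketch below uses a made-up diffusion step on a periodic grid (the function diffuse and its rate parameter are assumptions for illustration) and asserts that the total is preserved up to floating-point rounding.

```python
import math


def diffuse(cells, rate=0.1):
    """One explicit diffusion step on a periodic 1-d grid (hypothetical model)."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left = cells[(i - 1) % n]
        right = cells[(i + 1) % n]
        new_cells.append(cells[i] + rate * (left - 2 * cells[i] + right))
    return new_cells


state = [1.0, 0.0, 0.0, 4.0, 0.0]
total_before = math.fsum(state)
for _ in range(100):
    state = diffuse(state)

# Conservation check: the exchange between neighbours must not create or destroy mass.
assert math.isclose(math.fsum(state), total_before)
```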

Interface and implementation¶

an interface is how something is used
an implementation is how it is written

Testing in Python¶

Landscape¶

errors, exceptions, and debugging
assert, doctest, and unit tests
logging, unittest, and nose

Errors & Exceptions¶

Syntax Errors¶

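Caught by the Python parser, prior to execution
arrow marks the last parsed command / syntax, which gave an error
note: the example below is syntactically valid, so the parser accepts it; it fails only at run time, with a NameError, because the name true is not defined (the Boolean constant is spelled True)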

while true:
    print('Hello world')

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-52-07d9d27fa722> in <module>()
----> 1 while true:
      2     print('Hello world')

NameError: name 'true' is not defined
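
For contrast, a genuine syntax error can be provoked without stopping the session by handing a faulty source string to the built-in compile(). This is a sketch; the source string, with its missing colon, is made up for this illustration.

```python
source = "while True print('Hello world')"   # colon after True is missing

caught = None
try:
    # compile() only parses the source; nothing is executed.
    compile(source, "<example>", "exec")
except SyntaxError as err:
    caught = err
    print("SyntaxError:", err.msg)
```

The error is raised while parsing, before any of the code in the string has run.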

Exceptions¶

Caught during runtime

1/0

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-53-05c9758a9c21> in <module>()
----> 1 1/0

ZeroDivisionError: division by zero

factorial

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-54-62dd995547c1> in <module>()
----> 1 factorial

NameError: name 'factorial' is not defined

'1' + 1

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-55-2e87809dd063> in <module>()
----> 1 '1' + 1

TypeError: Can't convert 'int' object to str implicitly

Exception handling¶

try:
   file = open('filenamethatdoesnotexist.txt')
except FileNotFoundError:
   print('No such file')

No such file
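
The same pattern can be wrapped in a function. The sketch below adds an else branch, which runs only if no exception was raised; count_lines is a hypothetical helper, not part of the original notebook.

```python
def count_lines(filename):
    """Return the number of lines in filename, or None if the file is missing."""
    try:
        f = open(filename)
    except FileNotFoundError as err:
        print("No such file:", err.filename)
        return None
    else:
        # Reached only if open() succeeded; 'with' closes the file afterwards.
        with f:
            return sum(1 for _ in f)


print(count_lines('filenamethatdoesnotexist.txt'))  # prints a message, then None
```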

Raising exceptions¶

def newfunction():
    raise NotImplementedError("Still need to write this code")

newfunction()

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-57-c1922a8a630b> in <module>()
      2     raise NotImplementedError("Still need to write this code")
      3 
----> 4 newfunction()

<ipython-input-57-c1922a8a630b> in newfunction()
      1 def newfunction():
----> 2     raise NotImplementedError("Still need to write this code")
      3 
      4 newfunction()

NotImplementedError: Still need to write this code
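
Exceptions can also be raised to reject bad input, and the caller can then decide how to react. A minimal sketch, in which the function reciprocal and its error message are assumptions for illustration:

```python
def reciprocal(x):
    """Return 1/x; refuse x == 0 with a descriptive error."""
    if x == 0:
        raise ValueError("x must be non-zero")
    return 1 / x


try:
    reciprocal(0)
except ValueError as err:
    message = str(err)
    print("Caught:", message)   # prints: Caught: x must be non-zero
```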

Debugging¶

def foo(x):
    return 1/x

def bar(y):
    return foo(1-y)

bar(1)

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-60-e30cf484dce8> in <module>()
      5     return foo(1-y)
      6 
----> 7 bar(1)

<ipython-input-60-e30cf484dce8> in bar(y)
      3 
      4 def bar(y):
----> 5     return foo(1-y)
      6 
      7 bar(1)

<ipython-input-60-e30cf484dce8> in foo(x)
      1 def foo(x):
----> 2     return 1/x
      3 
      4 def bar(y):
      5     return foo(1-y)

ZeroDivisionError: division by zero

%debug

> <ipython-input-60-e30cf484dce8>(2)foo()
      1 def foo(x):
----> 2     return 1/x
      3 
      4 def bar(y):
      5     return foo(1-y)

ipdb> p x    # can x really be zero?
0
ipdb> up
> <ipython-input-60-e30cf484dce8>(5)bar()
      3 
      4 def bar(y):
----> 5     return foo(1-y)
      6 
      7 bar(1)

ipdb> p y     # what is y (one function call UP)
1
ipdb> exit

Fixing bugs¶

def foo(x):
    if x==0:
        return float('Inf')
    else:
        return 1/x

bar(1)

inf

def foo(x):
    try:
        return 1/x
    except ZeroDivisionError:
        return float('Inf')

bar(1)

inf

Test as you code¶

Type checking¶

s = input("Please enter an integer: ")  # s is a string
if not isinstance(s, int):
    print("Casting ", s, " to integer.")
    i = int(s)

Please enter an integer: 5
Casting  5  to integer.
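
In Python 3, input() always returns a string, so the step that can actually fail is the conversion: int() raises a ValueError for text that is not an integer literal. A sketch of guarding that step, where to_int is a hypothetical helper:

```python
def to_int(s):
    """Convert the string s to an int; return None if s is not an integer literal."""
    try:
        return int(s)
    except ValueError:
        return None


for text in ["5", "five", "5.0"]:
    print(repr(text), "->", to_int(text))
```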

Assert invariants¶

if i % 3 == 0:
    print(1)
elif i % 3 == 1:
    print(2)
else:
    assert i % 3 == 2
    print(3)

3

Example¶

Let's make a factorial function.

%%file myfactorial.py

def factorial2(n):
    """ Details to come ...
    """

    raise NotImplementedError

def test():
     from math import factorial
     for x in range(10):
         print(".", end="")
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x

Overwriting myfactorial.py

Let's test it ...¶

import myfactorial
myfactorial.test()

Looks like we will have to implement our function, if we want to make any progress...

%%file myfactorial.py

def factorial2(n):
    """ Details to come ...
    """

    if n == 0:
        return 1
    else:
        return n * factorial2(n-1)

def test():
     from math import factorial
     for x in range(10):
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x

Overwriting myfactorial.py

Let's test it ...¶

import importlib
importlib.reload(myfactorial)
myfactorial.test()

Seems to be okay so far. However, calling factorial2 with a negative number, say, never reaches the base case n == 0, so the recursion continues until Python aborts it with a RecursionError. Thus:

What about preconditions¶

What happens if we call factorial2 with a negative integer? Or something that's not an integer?

%%file myfactorial.py
def factorial2(n):
    """ Find n!. Raise an AssertionError if n is negative or non-integral.
    """

    assert n >= 0 and type(n) is int, "Unrecognized input"

    if n == 0:
        return 1
    else:
        return n * factorial2(n - 1)

def test():
     from math import factorial
     for x in range(10):
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x

Overwriting myfactorial.py
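
Two caveats apply to assertion-based preconditions. Assert statements are skipped when Python runs with the -O option. With the assertion as written above, a non-numeric argument such as a string fails in the comparison n >= 0 with a TypeError before the type check is reached. A possible alternative, sketched here under the hypothetical name factorial3, raises explicit exceptions instead:

```python
def factorial3(n):
    """Find n!. Raise TypeError for non-integers, ValueError for negative n."""
    # Check the type first, so that the comparison below is always valid.
    if type(n) is not int:
        raise TypeError("n must be an int")
    if n < 0:
        raise ValueError("n must be non-negative")

    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


print([factorial3(n) for n in range(5)])  # prints [1, 1, 2, 6, 24]
```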

`doctests` -- executable examples¶

importlib.reload(myfactorial)
from myfactorial import factorial2
[factorial2(n) for n in range(5)]

[1, 1, 2, 6, 24]

%%file myfactorial.py
def factorial2(n):
    """ Find n!. Raise an AssertionError if n is negative or non-integral.

    >>> from myfactorial import factorial2
    >>> [factorial2(n) for n in range(5)]
    [1, 1, 2, 6, 24]
    """

    assert n >= 0. and type(n) is int, "Unrecognized input"

    if n == 0:
        return 1
    else:
        return n * factorial2(n - 1)

def test():
     from math import factorial
     for x in range(10):
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x

Overwriting myfactorial.py

Running doctests¶

!python -m doctest -v myfactorial.py

Trying:
    from myfactorial import factorial2
Expecting nothing
ok
Trying:
    [factorial2(n) for n in range(5)]
Expecting:
    [1, 1, 2, 6, 24]
ok
2 items had no tests:
    myfactorial
    myfactorial.test
1 items passed all tests:
   2 tests in myfactorial.factorial2
2 tests in 3 items.
2 passed and 0 failed.
Test passed.
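
Doctests can also be run from within Python using the classes in the standard doctest module. The following is a sketch under stated assumptions: the function double is a made-up example, and the docstring is parsed and run explicitly rather than by scanning a whole module.

```python
import doctest


def double(x):
    """Return twice x (hypothetical example function).

    >>> double(2)
    4
    >>> double("ab")
    'abab'
    """
    return 2 * x


# Extract the examples from the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(double.__doc__, {"double": double}, "double", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.attempted, "examples run,", results.failed, "failed")
```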

Real world testing and continuous integration¶

`unittest` and `nose`¶

Test fixtures (Unittest)¶

create self-contained tests
setup: open file, connect to a DB, create data structures
teardown: tidy up afterward
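
A minimal sketch of a fixture with unittest: setUp creates a temporary file before each test, and tearDown removes it afterwards. The class name, the test names, and the file contents are assumptions for illustration.

```python
import os
import tempfile
import unittest


class TestNumberFile(unittest.TestCase):
    def setUp(self):
        # setup: runs before every test method
        fd, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(fd, "w") as f:
            f.write("1\n2\n3\n")

    def tearDown(self):
        # teardown: runs after every test method, even if the test failed
        os.remove(self.path)

    def test_line_count(self):
        with open(self.path) as f:
            self.assertEqual(len(f.readlines()), 3)

    def test_sum(self):
        with open(self.path) as f:
            self.assertEqual(sum(int(line) for line in f), 6)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNumberFile)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test gets its own fresh file, so the tests stay independent of one another.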

Test runner (nose, pytest)¶

nosetests, py.test
test discovery: any callable beginning with test in a module beginning with test

Testing scientific computing libraries¶

Such libraries often have testing routines, for example:

import scipy.integrate
scipy.integrate.test()

...................................................................

Running unit tests for scipy.integrate
NumPy version 1.10.4
NumPy relaxed strides checking option: False
NumPy is installed in //anaconda/lib/python3.5/site-packages/numpy
SciPy version 0.17.0
SciPy is installed in //anaconda/lib/python3.5/site-packages/scipy
Python version 3.5.1 |Anaconda 4.0.0 (x86_64)| (default, Dec  7 2015, 11:24:55) [GCC 4.2.1 (Apple Inc. build 5577)]
nose version 1.3.7

........................................................................................................................................K.........................................
----------------------------------------------------------------------
Ran 245 tests in 3.009s

OK (KNOWNFAIL=1)

<nose.result.TextTestResult run=245 errors=0 failures=0>

Assertions revisited - numerical mathematics¶

Mathematically

$x = (\sqrt{x})^2$.

So what is happening here:

import math
assert 2 == math.sqrt(2)**2

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-75-7136ce5c9672> in <module>()
      1 import math
----> 2 assert 2 == math.sqrt(2)**2

AssertionError:

math.sqrt(2)**2
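
The assertion fails because floating-point arithmetic rounds intermediate results, so the computed value differs from 2 by a tiny amount. With the standard library alone, such comparisons can be made tolerant with math.isclose. A short sketch:

```python
import math

value = math.sqrt(2) ** 2

print(value == 2)               # prints False: exact comparison fails
print(abs(value - 2) < 1e-12)   # prints True: the difference is tiny
print(math.isclose(value, 2))   # prints True: equal within a relative tolerance

# A tolerant assertion in place of the exact one:
assert math.isclose(value, 2, rel_tol=1e-9)
```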

NumPy Testing¶

What if we consider x and y almost equal? Can we modify our assertion?

import numpy as np
np.testing.assert_almost_equal(2, math.sqrt(2) ** 2)

x=1.000001
y=1.000002
np.testing.assert_almost_equal(x, y, decimal=5)

Testing with py.test¶

Going beyond doctest and unittest, there are two widely used frameworks for regression testing:

nose (http://nose.readthedocs.org/en/latest/)
pytest (http://pytest.org)

Here, we focus on pytest.

The example we use is the myfactorial.py file created earlier:

# %load myfactorial.py
def factorial2(n):
    """ Find n!. Raise an AssertionError if n is negative or non-integral.

    >>> from myfactorial import factorial2
    >>> [factorial2(n) for n in range(5)]
    [1, 1, 2, 6, 24]
    """

    assert n >= 0. and type(n) is int, "Unrecognized input"

    if n == 0:
        return 1
    else:
        return n * factorial2(n - 1)

def test():
     from math import factorial
     for x in range(10):
         assert factorial2(x) == factorial(x), \
                "My factorial function is incorrect for n = %i" % x

Providing test functions¶

(Addition to original notebook, Hans Fangohr, 21 Sep 2013)

py.test is an executable that will search through a given file and find all functions that start with test, and execute those. Any failed assertions are reported as errors.

For example, py.test can run the test() function that has been defined already in myfactorial:

!py.test myfactorial.py

============================= test session starts ==============================
platform darwin -- Python 3.5.1, pytest-2.8.5, py-1.4.31, pluggy-0.3.1
rootdir: /Users/fangohr/hg/teaching-python/notebook, inifile: 
plugins: nbval-0.3.1, cov-2.2.1
collected 1 items 

myfactorial.py .

=========================== 1 passed in 0.01 seconds ===========================

This output (the '.' after myfactorial.py) indicates success. We can get a more detailed output using the -v switch for extra verbosity:

!py.test -v myfactorial.py

============================= test session starts ==============================
platform darwin -- Python 3.5.1, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- //anaconda/bin/python
cachedir: .cache
rootdir: /Users/fangohr/hg/teaching-python/notebook, inifile: 
plugins: nbval-0.3.1, cov-2.2.1
collected 1 items 

myfactorial.py::test PASSED

=========================== 1 passed in 0.00 seconds ===========================

Sometimes, we like having the tests for myfactorial.py gathered in a separate file, for example in test_myfactorial.py. We create such a file, and within the file we create a number of test functions, each with a name starting with test:

%%file test_myfactorial.py

from myfactorial import factorial2 

def test_basics():
    assert factorial2(0) == 1
    assert factorial2(1) == 1
    assert factorial2(3) == 6
    
def test_against_standard_lib():
    import math
    for i in range(20):
        assert math.factorial(i) == factorial2(i)
        
def test_negative_number_raises_error():
    import pytest

    with pytest.raises(AssertionError):    # this will pass if 
        factorial2(-1)                     # factorial2(-1) raises 
                                           # an AssertionError
      
    with pytest.raises(AssertionError):
        factorial2(-10)

Overwriting test_myfactorial.py

We can now run the tests in this file using

!py.test -v test_myfactorial.py

============================= test session starts ==============================
platform darwin -- Python 3.5.1, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- //anaconda/bin/python
cachedir: .cache
rootdir: /Users/fangohr/hg/teaching-python/notebook, inifile: 
plugins: nbval-0.3.1, cov-2.2.1
collected 3 items 

test_myfactorial.py::test_basics PASSED
test_myfactorial.py::test_against_standard_lib PASSED
test_myfactorial.py::test_negative_number_raises_error PASSED

=========================== 3 passed in 0.01 seconds ===========================

The py.test command can also be given a directory. It will then search that directory and its subdirectories for files whose names start with test, and will attempt to run all the tests in those.

Or we can provide a list of test files to work through:

!py.test -v test_myfactorial.py myfactorial.py

============================= test session starts ==============================
platform darwin -- Python 3.5.1, pytest-2.8.5, py-1.4.31, pluggy-0.3.1 -- //anaconda/bin/python
cachedir: .cache
rootdir: /Users/fangohr/hg/teaching-python/notebook, inifile: 
plugins: nbval-0.3.1, cov-2.2.1
collected 4 items 

test_myfactorial.py::test_basics PASSED
test_myfactorial.py::test_against_standard_lib PASSED
test_myfactorial.py::test_negative_number_raises_error PASSED
myfactorial.py::test PASSED

=========================== 4 passed in 0.01 seconds ===========================

Final thoughts¶

Learn more¶

!hg tip

changeset:   616:a8ee6aafe5a7
tag:         tip
user:        Hans Fangohr [bin] <fangohr@soton.ac.uk>
date:        Fri Sep 16 19:57:15 2016 +0100
summary:     update to Python3, minor improvements
