Mikael Lundin

June 04, 2015

Property-Based Testing with F#

This post is based on a talk I gave at Valtech Tech Day on 4 June 2015. You can find the slides for this talk here and the code examples in a zip archive here.

See the writeup of the whole talk in English below the video and slides.

Video (in Swedish)

Slides (in English)

Tech Day 2015 - Property-based testing med F# from Valtech AB

Introduction

Before diving into property-based testing, let's recap testing in software development today.

Unit Testing

The purpose of unit testing is to verify the correctness of our code. We write tests that verify that the code of our system under test (SUT) does what we expect it to. We also write unit tests to drive good design in our code. Unit tests force design patterns on a macro level, such as separation of concerns and low coupling / high cohesion.

An example of a unit test is one that verifies that our code sets the status to 'Committed' after committing a transaction.
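The original listing is not part of this writeup, so the following is only a minimal sketch of such a test. The Status and Transaction types are hypothetical, invented here for illustration, and the test is written as a plain function returning bool so it does not depend on any test framework.

```fsharp
// Hypothetical domain model, invented for this illustration.
type Status =
    | Pending
    | Committed

type Transaction() =
    let mutable status = Pending
    member this.Status = status
    member this.Commit() = status <- Committed

// The unit test: arrange, act, then check one expected outcome.
let ``Commit sets status to Committed`` () =
    let transaction = Transaction()
    transaction.Commit()
    transaction.Status = Committed
```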

Summary

Unit testing is white box testing
Unit testing specifies what your code is supposed to do
Unit testing affects your code design on a macro level

Integration Testing

The point of integration testing is to verify how our system depends on other systems and how other systems affect our system.

Integration Tests will verify how our system fits together with other systems

In the following integration test scenario we're testing how the database's auto-incremented ID affects our system's entities.

Summary

We use integration tests to verify how our system depends on other systems
We use integration tests to verify how our system is affected by other systems

Functional Testing

The functional tests that we write operate on a higher abstraction level. The point of these tests is to specify what is expected of each feature and what functionality it must keep delivering. This makes functional tests the obvious choice for regression testing, as they are the first thing to break when a feature no longer fulfills its promise.

Functional tests work at the feature abstraction level, which brings them closer to the promise of what is being delivered

One way to do functional testing is by executable specification, of which this is an example.

Summary

Functional tests work well as regression tests
Functional tests work well as documentation
Functional tests are closer to the user story that is being developed

Manual Testing

Developers often underestimate manual testing, because they think they're doing it, but in fact they're only checking that their feature is rendered correctly in the browser.

Instead, manual testing is performed by a person called a 'tester', who experiments with the system to find out whether it solves the problem it was designed to solve. This doesn't mean that a developer can't be a tester. Anyone could be a tester, but a developer is not a suitable tester for a system they are developing themselves.

Any person who is testing can be called a tester, but a developer is not a suitable tester for their own code

In contrast to the act of writing tests (unit, integration, functional), manual testing is a creative task. Automated tests are a specification of what is going to be written, or of what has already been written. Manual testing is the act of exploring, investigating and challenging the status quo of the system under test.

Property-Based Testing™

This is where property-based testing comes into play. It is a tool that lets us developers think about our system like a manual tester and challenge its functionality. We do this by defining properties for our system.

A property is a high-level specification of behavior that should hold for a range of data points.

In more concrete terms, we state something that is true about our system, and this should be true for any kind of data we feed that statement with.

The property is a high-level specification of behavior that should hold for a range of data points

Example of a property

Let's say that our system is a sort algorithm. Then we could write a unit test testing its functionality like this.
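The original test is not reproduced in this writeup. A minimal sketch of such an example-based test could look like the following, assuming the SUT is a function from int[] to int[]. The sort function here is only a stand-in that delegates to Array.sort, and the input is taken to be the first seven Fibonacci numbers starting from 1.

```fsharp
// Stand-in for the system under test (an assumption for this sketch);
// any int[] -> int[] sort function would fit here.
let sort (input: int[]) : int[] = Array.sort input

// Example-based unit test: one hand-picked input, one expected output.
let ``Sorts the first seven Fibonacci numbers`` () =
    let input = [| 13; 1; 8; 2; 5; 1; 3 |]
    let expected = [| 1; 1; 2; 3; 5; 8; 13 |]
    sort input = expected
```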

This test will provide 100% code coverage, but is it really enough in terms of verifying the functionality?

An optimistic developer would say the system is 100% verified. A pessimistic developer would say the system is 100% verified for the first 7 Fibonacci numbers

Instead we could define a set of properties that always hold true for this system.

Sort Properties

Sorting the list once is equal to sorting the list twice
The first item in the sorted list is the smallest
The last item in the sorted list is the largest
All items from the original list are present in the sorted list
The items in the sorted list are ordered

Let's start looking at the implementation.

Implementation

The system that we're testing is a sort function. As a fun exercise, let's implement BubbleSort.

The first property that we intend to implement is Sorting the list once is equal to sorting the list twice.

We start by creating an F# project and installing FsCheck.

Create new F# project
Create F# code file, ArrayUtilsProperties.fs
Package Manager Console: Install-Package FsCheck

And then we can input the following code into the code window.
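The original listing is not included in this writeup, so here is a minimal sketch of what the property might look like. ArrayUtils.bubbleSort is assumed to be the name of the SUT. The module below is only a stand-in that delegates to Array.sort so that the snippet compiles on its own, so swap in the real implementation when following along.

```fsharp
// Stand-in for the SUT so that this block compiles on its own (an assumption);
// the real bubble sort implementation would live here instead.
module ArrayUtils =
    let bubbleSort (input: int[]) : int[] = Array.sort input

// A property is an ordinary function from generated input to bool.
let ``Sorting the list once is equal to sorting the list twice`` (input: int[]) =
    let sortedOnce = ArrayUtils.bubbleSort input
    let sortedTwice = ArrayUtils.bubbleSort sortedOnce
    sortedOnce = sortedTwice
```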

This looks very much like a standard unit test in F# with the exception that it takes an input argument. This argument will be randomly generated by FsCheck.

We can verify the property by evaluating the following in F# Interactive.

Check.Quick ``Sorting the list once is equal to sorting the list twice``

The result will not be successful. Instead we will get the following error.

Falsifiable, after 3 tests (0 shrinks) (StdGen (1204045486,296013961)):
[||]
with exception:
System.IndexOutOfRangeException: Index was outside the bounds of the array.

This means that after generating 3 data sets for our property, one was found to break the functionality, namely when FsCheck sent in an empty array. It seems like our code cannot handle it.

We'll easily fix that in our SUT with a check for empty arrays.
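The original implementation is not shown in this writeup, so the following is a sketch of what the guarded function could look like. Only the early return for the empty array is taken from the text. The loop structure and the choice to sort a copy rather than the input itself are assumptions.

```fsharp
module ArrayUtils =
    /// Returns a sorted copy of the input; the input array is left untouched.
    let bubbleSort (input: int[]) : int[] =
        // Guard added after FsCheck falsified the property with [||].
        if Array.isEmpty input then
            [||]
        else
            let arr = Array.copy input
            let mutable swapped = true
            // Keep making passes until a full pass needs no swap.
            while swapped do
                swapped <- false
                for i in 0 .. arr.Length - 2 do
                    if arr.[i] > arr.[i + 1] then
                        let tmp = arr.[i]
                        arr.[i] <- arr.[i + 1]
                        arr.[i + 1] <- tmp
                        swapped <- true
            arr
```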

When we reverify the property we will get a successful result.

Ok, passed 100 tests.

This means that the property returned true for 100 randomly generated data sets. The next step is to formalize this into a test and make it reproducible.

Turning properties into tests

To make a property-based test we need to turn our property into a test. This is done quite easily by installing a few more packages into our F# project.

Install-Package FsCheck.xUnit
Install-Package Xunit.runner.visualstudio

Now we can add a simple Property attribute to the function, just like we would add a Fact attribute to a unit test. The Property attribute is actually a subclass of Fact, and it lets FsCheck identify the property-based tests and generate data for their input arguments.
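As a sketch, and assuming the FsCheck.Xunit 2.x package where the attribute lives in the FsCheck.Xunit namespace, the annotated property might look like this. The ArrayUtils module is again only a stand-in for the real SUT.

```fsharp
open FsCheck.Xunit

// Stand-in for the SUT so that this block compiles on its own (an assumption).
module ArrayUtils =
    let bubbleSort (input: int[]) : int[] = Array.sort input

// The attribute makes the test runner discover the function;
// FsCheck generates the input argument on every run.
[<Property>]
let ``Sorting the list once is equal to sorting the list twice`` (input: int[]) =
    let sortedOnce = ArrayUtils.bubbleSort input
    sortedOnce = ArrayUtils.bubbleSort sortedOnce
```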

After compiling this code, the test will turn up in Test Explorer and you can execute it like any other test in Visual Studio.

Our property-based test appears in the Visual Studio Test Explorer and we can run it like any unit test

Conditions in FsCheck

Sometimes we have properties that aren't valid under all circumstances of generated input data. In these cases we would like to add conditions for when a property should be run.

Consider the following property

The first item in the sorted list is the smallest

This property should always be true, but it is not applicable when the input is empty, because there is no first item in an empty list or array.

The condition is implemented with the ==> operator, where the left-hand side is an expression that returns true when the property is eligible, and the right-hand side is the property itself. To keep the property from being evaluated before the condition is checked, we can make the whole property lazily evaluated.
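A minimal sketch of such a conditional property, assuming FsCheck 2.x where ==> is available after open FsCheck. The ArrayUtils module is a stand-in for the real SUT, and the helper name firstIsSmallest is made up for this example.

```fsharp
open FsCheck

// Stand-in for the SUT so that this block compiles on its own (an assumption).
module ArrayUtils =
    let bubbleSort (input: int[]) : int[] = Array.sort input

let ``The first item in the sorted list is the smallest`` (input: int[]) =
    // Wrapped in a function so that sorted.[0] is never touched for empty input.
    let firstIsSmallest () =
        let sorted = ArrayUtils.bubbleSort input
        sorted.[0] = Array.min input
    // Left: when the property applies. Right: the lazily evaluated property.
    (input.Length > 0) ==> (lazy (firstIsSmallest ()))
```

Inputs that fail the condition are discarded rather than counted as passed, so FsCheck keeps generating data until it has enough eligible cases.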

Named property asserts

In some cases we want to make several checks within one property, and then it can be tricky to know which check failed. To solve this, we can name the internal checks in our property so that the reported error tells us which one failed.

The property does this by pairing up adjacent items of the sorted array and then validating, for each pair, that the left value is less than or equal to the right value.
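The following sketch shows one way this could be written, assuming FsCheck 2.x, where |@ attaches a label to a check and .&. combines two checks. The second check (length is preserved) and both label texts are additions made for illustration, and ArrayUtils is a stand-in for the real SUT.

```fsharp
open FsCheck

// Stand-in for the SUT so that this block compiles on its own (an assumption).
module ArrayUtils =
    let bubbleSort (input: int[]) : int[] = Array.sort input

let ``The items in the sorted list are ordered`` (input: int[]) =
    let sorted = ArrayUtils.bubbleSort input
    // Pair up neighbours: [|1; 2; 3|] gives (1, 2) and (2, 3).
    let ordered =
        sorted
        |> Seq.pairwise
        |> Seq.forall (fun (left, right) -> left <= right)
    // Illustrative second check, so there is more than one label to report.
    let sameLength = (sorted.Length = input.Length)
    (ordered |@ "each item is <= its successor") .&.
    (sameLength |@ "output has the same length as input")
```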

Concrete Property-Based Testing Example: Rating

Implementing a BubbleSort algorithm is not a common scenario in a LOB application. It is a rather academic example, but good for learning the basics of property-based testing.

I was asked to implement a rating functionality where a user would be able to rate articles with a star rating 1-5. The system would then display the average rating in a discrete number of stars, with a unit of 0.5 stars.

The public interface of my Rating class looks like this.
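The original class is not reproduced in this writeup. As a rough F# sketch of the two properties discussed here, a Rating type could look like the following. The constructor taking a list of ratings and the rule for rounding midpoints are assumptions.

```fsharp
open System

/// Hypothetical F# rendering of the Rating class; the constructor is an assumption.
type Rating(ratings: int list) =
    /// Arithmetic mean of the individual 1-5 ratings (throws on an empty list).
    member this.Average : float =
        ratings |> List.averageBy float
    /// The average rounded to the nearest half star.
    member this.Stars : float =
        Math.Round(this.Average * 2.0, MidpointRounding.AwayFromZero) / 2.0
```

For example, the ratings 3, 4, 4 and 4 average to 3.75, which this sketch displays as 4 stars.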

The interesting part to test here is the Stars property which could have any of the following values.

1 star
1.5 stars
2 stars
2.5 stars
3 stars
3.5 stars
4 stars
4.5 stars
5 stars

The Stars value would be decided by the Average property value.

Property: Number of stars should be closest proximity discrete star to the average value

The model of selecting the number of stars to display as average, based on the average value of user ratings

The way to read this is that the Stars property value (on top) is decided by the interval (on the bottom) that the Average falls into.

Implementing a Generator

Before we start implementing our property we have a problem: the input data to our Rating class can only be integers in the range 1 to 5. This is not something we would use a condition for, as the restriction is not in our property but in the SUT.

Instead we need to create a random value generator that will only generate values between 1 and 5.

A range is a generated integer less than 6 and larger than 0. Then generate a list of such ranges.

Now we can register this generator with the following statement, Arb.registerByType (typeof<RatingProperties>), and the full property that checks the Stars property will look like this.
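The original code is not included in this writeup, so the following is a sketch under stated assumptions: FsCheck 2.x, a compact stand-in for the Rating class with a made-up constructor, and a generator that only produces non-empty lists so that the average is always defined. The property expresses "closest discrete star" as being at most a quarter star away from the average.

```fsharp
open System
open FsCheck

// Compact stand-in for the Rating class (the constructor is an assumption).
type Rating(ratings: int list) =
    member this.Average = ratings |> List.averageBy float
    member this.Stars =
        Math.Round(this.Average * 2.0, MidpointRounding.AwayFromZero) / 2.0

// A single rating: an integer from 1 to 5 inclusive.
let ratingGen = Gen.choose (1, 5)

type RatingProperties =
    // Once registered, FsCheck uses this for every int list argument.
    // Non-empty lists only: an assumption, since an average needs one rating.
    static member Ratings () : Arbitrary<int list> =
        ratingGen |> Gen.nonEmptyListOf |> Arb.fromGen

Arb.registerByType (typeof<RatingProperties>) |> ignore

let ``Number of stars is the discrete value closest to the average`` (ratings: int list) =
    let rating = Rating(ratings)
    // Half-star steps mean the nearest one is never more than 0.25 away.
    abs (rating.Stars - rating.Average) <= 0.25
```

Note that the registration is global: every property taking an int list will receive ratings in the range 1 to 5 from then on.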

Quite neat, huh?

Summary

Property-Based Testing is...

High-level explorative properties of your system under test
Challenging your system with loads of data
Easily finding edge cases with your code
Making you think of your program outside-in