Chapter 5 - Objections

In the previous chapters we covered the rationale for software testing and provided some practical guidelines and techniques for creating effective tests. These concepts may be met with resistance from incumbent personnel with established views on how testing ought to be conducted.

Dealing with objections

The following are some common objections that may arise when adopting a new testing methodology, along with suggestions for dealing with them.

We need integration tests to “really” test the code

There is often a belief that until an entire function is exercised (i.e. end-to-end), a true test has not been performed. This belief assumes that the software can only be considered “effectively tested” when a real-world scenario is recreated.

Consider an example used earlier:

public void doSomething(String data) {
    DatabaseWriter writer = new DatabaseWriter();

    if (data != null) {
        Emailer emailer = new Emailer();

        try {
            Serializer serializer = new Serializer();
            Object value = serializer.deserialize(data);
            writer.writeValueToDatabase(value);
            emailer.sendConfirmationEmail();
        }
        catch (Exception e) {
            writer.logErrorToDatabase(e);
            emailer.sendErrorEmail();
        }
    }
    else {
        writer.logErrorToDatabase("No data");
    }
}

A proponent of the “we need integration tests” view might suggest that, in order for this feature to be considered “tested”, we would need to perform a range of black-box tests covering various scenarios, all of which would actually send email and write to a database.

The problem with this view is that requiring integration tests as the baseline for all software testing necessarily inherits all of the problems and challenges already discussed.

The salient question here is,

“What exactly are we testing?”

In this case we would be testing both the logic of the code in the method and the functionality of the email system and the database system.

If we knew that the email system and the database system were working correctly, would we really need them to be used in the test?

The view that an integration test is the only way to ensure things are working correctly is usually predicated on the assumption that some systems will fail (the email server might go down, the database might run out of disk space, etc.), but these are really infrastructure concerns: they belong in infrastructure tests and should not be included in software tests.

There is a simple way to summarize the preferred approach:

Key point:

Consider every possible point of failure in the system and create a single dedicated test for that point of failure.

Following this approach in the above example we would create:

  1. An infrastructure test to verify that the email server is working
  2. An infrastructure test to verify that the database is working
  3. A unit test to ensure the Emailer is behaving correctly
  4. A unit test to ensure the DatabaseWriter is behaving correctly
  5. A unit test to ensure the Serializer is behaving correctly
  6. A unit test to ensure the doSomething method is behaving correctly

With these tests in place there is no need for an integration test for the doSomething method because there is nothing left to test.

Additionally the creation of dedicated infrastructure tests ensures that the cause of system failures can be accurately and quickly pinpointed without needing to trace through code that is actually working.
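As an illustration of what such an infrastructure test might look like, the following is a minimal sketch of a database connectivity check, assuming JUnit 4 and a JDBC driver on the classpath; the connection details shown here are placeholders and would normally come from configuration:

import java.sql.Connection;
import java.sql.DriverManager;

import org.junit.Assert;
import org.junit.Test;

public class DatabaseInfrastructureTest {

    // Placeholder connection details; in practice these come from configuration
    private static final String URL = "jdbc:postgresql://localhost:5432/appdb";
    private static final String USER = "app";
    private static final String PASSWORD = "secret";

    @Test
    public void databaseIsReachable() throws Exception {
        // Open a connection and ask the driver whether it is usable
        try (Connection connection = DriverManager.getConnection(URL, USER, PASSWORD)) {
            Assert.assertTrue("Database connection is not valid", connection.isValid(5));
        }
    }
}

An equivalent check for the email server might do nothing more than open a socket to its SMTP port. The point is that these tests exercise the infrastructure and nothing else, so a failure here points directly at the environment rather than at the application code.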

Now an immediate objection to this approach may be that we have just replaced a single integration test with 6 new test cases!

The difference is that these tests only need to be written once, and will verify the behavior of the tested components no matter where they are used within the application, as opposed to testing them each and every time they are used.

Key point:

If unit tests are implemented correctly, integration tests may become redundant.

We shouldn’t test implementation, only behavior

One of the commonly accepted tenets of testing is that tests should assert behavior independent of implementation.

That is, if the internal mechanics of a unit of code change but the outcome of its execution does not, then pre-existing tests should not fail.

When we look at a unit test that makes extensive use of mocks, the code in the test can look suspiciously like the code in the method being tested.

Consider the following:

// The method to be tested
public void someMethod(Emailer emailer, DatabaseReader reader) {

    DatabaseObject value = reader.readObject();

    if(value.isActive()) {
        emailer.sendEmail();
    }
}

// The test
public void testSomeMethod() {

    // Arrange: mock all of the dependencies
    Emailer emailer = Mockito.mock(Emailer.class);
    DatabaseReader reader = Mockito.mock(DatabaseReader.class);
    DatabaseObject object = Mockito.mock(DatabaseObject.class);

    // Stub the mocks so the "active" branch is exercised
    Mockito.when(reader.readObject()).thenReturn(object);
    Mockito.when(object.isActive()).thenReturn(true);

    // Act: invoke the method under test
    someMethod(emailer, reader);

    // Assert: verify the expected interaction occurred
    Mockito.verify(emailer).sendEmail();
}

In this example the lines of code that comprise the test appear to emulate the lines of code in the function under test. This could easily be mistaken for a test that asserts implementation rather than behavior; however, if we adopt a model of mocking dependencies, a test that looks like this is simply the natural consequence.

In reality the above test does merely assert behavior. The lines of code that may be confusing are simply there to orchestrate the mocks. In a traditional integration test this orchestration may sit outside the test code (e.g. setting up test databases), and so a newcomer to the mocked-dependencies approach may see this as testing implementation when in fact it is simply testing behavior using mocks.

There are cases, however, where changing the mechanics of a method without altering the outcome does in fact lead to a test failure. This is generally due to one of the following:

  • The assertions in the test are too stringent. Many mocking frameworks, for example, allow the engineer to assert that a particular mocked method will be called exactly once. Adding an additional call, even though it may not affect the outcome, will cause an assertion failure in these cases (see the sketch after this list). Often the assertion conditions are simply too stringent, although there are also cases where the failure is valid.
  • The failure is valid and would not have been detected by an integration test. The duplicate update problem we saw earlier is an example of this. An engineer who inadvertently adds a duplicate update call will cause a stringently written mock assertion to fail even though the outcome of the method has not changed. In this case the failure reported by the mock is correct.
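As a sketch of the first case, using Mockito as in the earlier example (the emailer variable is assumed to be a mock created in the same way as above): a verification pinned to an exact call count will fail if the production code later makes a second, harmless call, whereas a looser verification will not.

// Too stringent: fails if sendEmail() is called more than exactly once
Mockito.verify(emailer, Mockito.times(1)).sendEmail();

// More tolerant: passes as long as sendEmail() was called at least once
Mockito.verify(emailer, Mockito.atLeastOnce()).sendEmail();

Which of the two is appropriate depends on whether a duplicate call would actually be harmful, which is exactly the distinction drawn in the second case above.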

Key point:

Tests that mock dependencies often appear to be testing implementation, but they are generally testing behavior.

Automated tests cannot replace QA

Making a claim that QA is not required can often be perceived as contentious, particularly when talking with QA professionals; however, all the evidence suggests that a traditional QA testing approach simply does not effectively provide assurance as to the ongoing integrity of software.

This was covered in detail in Chapter 2, the salient points of which were:

  • Manual testing doesn’t scale.
  • Manual testing requires humans, who make mistakes.
  • Engineers won’t test if they don’t have to.

As covered previously, this does not necessarily include UAT, which can be considered an essential part of the development lifecycle, but equally does not necessarily need to be conducted by staff internal to the project.

Key point:

UAT staff can be used for testing usability, but software tests can and should be automated.

Testing increases development time

This is a tricky one because it is difficult to confidently claim that the creation of tests does not increase development time. Clearly, if an engineer has to write BOTH product code and test code, they are spending more time writing code than they would if they didn’t have to write tests. But “development time” is not just the time it takes to write the first version of the product. We need a better measure, because time-to-first-version takes no account of the concept of value.

A software product that can be produced in record time but delivers no value is of little use. Likewise a product that delivers maximum value but is never actually delivered is equally of little use. Hence the goal should be to maximize the balance between value and time.

Typically product owners (company or customer stakeholders, etc.) will have well-cultivated expectations of software quality. Products that exhibit many faults will often be rejected, either by the group commissioning the work or, in many cases, by the end users themselves as they seek competing solutions. The only mitigation is to resolve these faults, but there is a problem with this, and it relates to another key concept in software testing: software faults increase exponentially with software complexity.

As the complexity of a software product increases, the number of possible conditions under which a fault may occur increases exponentially. It’s fairly simple to see how this works:

If the average number of software faults for every “x” amount of code (use whatever metric you like here: lines of code, function points, etc.) is consistent, then one would think that doubling the amount of code would double the number of faults. But in reality the number of faults increases by MORE than a factor of two. This is because the new code may introduce new faults in existing code. As complexity increases, the requirements of the original code change; a function that once had a very simple task to perform now has many more things it needs to do in order to satisfy the increased complexity being introduced.
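A rough way to picture this is to assume that faults can arise not only within individual modules but also from the interactions between them. A system of n modules then has on the order of n(n-1)/2 potential pairwise interactions: going from 10 modules (45 interactions) to 20 modules (190 interactions) doubles the code but more than quadruples the opportunities for interaction faults. This is only an illustrative model, but it captures why fault counts outpace code size.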

Conversely, if the original code was covered by tests, any change that would have introduced a fault into that code would have been caught by the tests, and the time later spent diagnosing and resolving that fault is saved.

The other, somewhat more subtle but arguably more powerful, reason that “testing increases development time” is more or less a fallacy lies in how testing changes the way code is written. Code that was not written specifically with testing in mind is often difficult to test. In many cases either the original code needs to be rewritten simply to make it “testable”, or the tests that are created are unnecessarily complex because they need to work around assumptions that were made without testing in mind. Not injecting dependencies is a classic example of this (see the sketch below). Code that is written with tightly bound dependencies can be extremely difficult and time-consuming to test because the dependencies need to be orchestrated simply to execute the code under test.
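As a brief sketch of the dependency injection point, re-using the DatabaseWriter from the earlier example (the saveRecord method itself is hypothetical): the first version constructs its collaborator internally and cannot be exercised without a real database, while the second accepts the collaborator as a parameter and can simply be handed a mock.

// Hard to test: the dependency is constructed inside the method,
// so a real DatabaseWriter (and a real database) is required to execute it
public void saveRecord(String data) {
    DatabaseWriter writer = new DatabaseWriter();
    writer.writeValueToDatabase(data);
}

// Easier to test: the dependency is injected,
// so a test can pass in Mockito.mock(DatabaseWriter.class) and verify the call
public void saveRecord(String data, DatabaseWriter writer) {
    writer.writeValueToDatabase(data);
}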

Key point:

Testing will actually reduce overall development time by mitigating complexity.