A test plan review with a colleague yesterday left me kind of baffled. Test Rationalisation is a great tool in capable hands: it reduces waste, shortens the feedback loop, and helps diagnose problems as rapidly as possible with the minimum of effort.
Whilst I’m all for using the leanest viable approach and cutting waste where necessary, sometimes being too concise causes confusion – and in the case of test cases, it can leave tests no longer testing anything much at all.
We are testing a new count included in an export.
- Types A, B and C count as 1.
- Also, multiples of A, B, and/or C count as 1 (eg AB, BC, ABC).
- Types X, Y and Z don’t contribute to the count, and no combination of X, Y or Z counts either.
- A combination of A/B/C and X/Y/Z is systematically impossible.
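Pinned down in code, the counting rule might look like this (a minimal sketch – the naming and the string-of-letters representation of a record are my own, hypothetical):

```python
# A minimal sketch of the counting rule above, assuming each record is a
# string of type letters such as "A", "AB" or "YZ" (hypothetical format).

COUNTED = set("ABC")   # types (and multiples thereof) that count as 1
IGNORED = set("XYZ")   # types that never contribute to the count

def expected_count(records):
    """Return the expected export count for a list of records."""
    total = 0
    for record in records:
        types = set(record)
        # A combination of A/B/C and X/Y/Z is systematically impossible.
        assert not (types & COUNTED and types & IGNORED), record
        # Any record containing at least one of A, B or C counts exactly once.
        if types & COUNTED:
            total += 1
    return total

print(expected_count(["A", "AB", "YZ"]))  # A and AB count once each, YZ doesn't: 2
```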
My colleague was busily setting up the scenarios to create types A, B, C, X, Y and Z. They also had two multiples: AB (to check it was counted once, rather than as two separate counts) and YZ.
Assuming everything goes to plan, they should get a count of 4 – that is, A, B, C and AB.
See the problem?
One possible interpretation of the count 4 is everything worked. 4! We passed the test! We have working code! We all get a free car and two months paid holiday!
Another interpretation is that AB counted as 2, A counted, B counted, but C didn’t. 4!
Another is that X counted as 2, Y and Z each counted as 1, and A, B, C and AB weren’t counted. 4!
Another is that C counted as 4, and none of the others were counted at all. 4!
Another is that A counted as 8, B and C counted as -2 each, and AB didn’t count. 4!
Etc etc etc… the permutations are essentially endless. 4!
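The point can be made concrete with a quick brute-force sketch: treating each of A, B, C and AB as an unknown per-item weight (a hypothetical model of buggy counting), a huge number of assignments sum to 4, and the total alone distinguishes none of them:

```python
from itertools import product

# Hypothetical per-item weights for A, B, C and the multiple AB.
# Correct behaviour would be a weight of exactly 1 for each.
items = ["A", "B", "C", "AB"]
weights = range(-2, 5)  # a small, arbitrary space of plausible bug weights

explanations = [w for w in product(weights, repeat=len(items)) if sum(w) == 4]

print(len(explanations))             # hundreds of distinct weightings total 4
print((1, 1, 1, 1) in explanations)  # correct behaviour is just one of them
```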
From this sort of “one hit”, “happy path” test, in this scenario, you really can’t tell whether the system behaved as expected – just that the test passed. Has the code worked? Both yes and no… we got 4, but we don’t know how.
Killing No Birds with One Massive Stone
It’s easy to fall into this trap, isn’t it? You think you’re being sensible by building contiguous test runs, typically in system areas where the process (for example, creating scenarios and then performing a lengthy export) takes a lot of time. If we can hit multiple variables in a single pass, why wouldn’t we?
The problem is that without testing each of these rules individually, you really don’t know how each individual rule behaves. As it happens, the code paths for A, B and C are all completely separate, as are those for X, Y and Z, and for combinations thereof. Finding this out was as easy as checking with the developer who was working on the story. This sort of information is crucial in determining our approach.
So, the reason we shouldn’t test multiple variables in a single pass in instances like this is that we have no visibility over individual passes or fails as we go. Where we have distinct outputs for the various variables, there may be more of an argument for merging test cases – although there is a risk of cross-pollination here, where certain combinations are themselves problematic – but a more sensible approach would be to check each rule individually first, then hit these combinations as a further round of tests.
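Checking each rule individually might be sketched as below – one check per rule, so each has its own pass or fail. (`run_export` is a hypothetical stand-in for the real create-scenarios-then-export process; I’ve stubbed it with the intended behaviour so the sketch runs.)

```python
# One scenario per rule, rather than one combined run.

def run_export(records):
    # Stand-in for the system under test: create the scenarios, perform
    # the export, return the count. Stubbed here with the intended rule.
    return sum(1 for record in records if set(record) & set("ABC"))

def test_type_a_counts_once():
    assert run_export(["A"]) == 1

def test_type_c_counts_once():
    assert run_export(["C"]) == 1

def test_multiple_ab_counts_once_not_twice():
    assert run_export(["AB"]) == 1

def test_type_x_does_not_count():
    assert run_export(["X"]) == 0

def test_combination_yz_does_not_count():
    assert run_export(["YZ"]) == 0
```

A failure now points at one rule, not at the whole run.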
Just because testing something is painful, that doesn’t mean we get to skip it.
The Unhappy Path of Brittleness
If we assume things will work, we’re fundamentally failing to do our job. This is why “happy path” testing gets so much ire – “happy path” meaning we assume everything will work just fine and set out merely to prove it, much like the test case at the top of this article. I mean, we got 4, right? Good enough?
But this leads to the other crucial issue: if something’s wrong, by approaching our testing this way we have no way of diagnosing the problem. Sure, we know something went wrong. But what? Say the count in our test case ended up being 6. Or (null). Or X. What caused that? Was it the behaviour of A? The combination of B and X? What are the reproduction steps? All we have is a blank “fail”, with no real intel on why it failed.
By designing and executing more tests up front, we can avoid this situation entirely – we know A worked, B worked, but C? Well, C blew up the export. We have certain code paths which a dev can exclude as unproblematic, and we have the focus area ready to go. Our whole test doesn’t live or die by its weakest link; instead we have a number of smaller, more granular tests which have individual passes or fails.
Your test plan should be able to withstand bugs – it’s there to find them!
Chopping things down into smaller chunks like this increases visibility, reduces the overhead of diagnosis, increases confidence in system behaviour and, frankly, makes an indisputable amount of sense.
Granularity: Follow The Thread
Granularity is the key to all this. Things may well be as cut and dried as the example above, but often they are less clear. Where can we make rationalisations? Where can we cut or merge tests?
When I started my life as a tester I was forever making matrices, alternating common variables and testing lots of permutations in longer runs. This was, in all honesty, immaturity as a tester. I hit every major combination, right? That’s the same as knowing every part works? It’s a heuristic I rely on less and less, as I have experienced a world of pain in false passes, undiagnosable failures and other errors introduced by this coarse granularity in my testing.
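For the record, the matrix habit looked something like this (both variables here are invented for illustration): enumerate every major combination up front and push the lot through one long run.

```python
from itertools import product

# Hypothetical variables a coarse-grained test matrix might alternate.
record_types = ["A", "B", "C", "X"]
export_formats = ["csv", "xml"]  # an invented second variable

# Every major combination in one long run: plenty of coverage on paper,
# but a single ambiguous pass/fail at the end, with no clue which cell broke.
matrix = list(product(record_types, export_formats))
print(len(matrix))  # 8 combinations
```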
As I have developed, as I have moved into more agile, multi-functional co-located environments, I have learned to just ask the devs I work with the damn questions. Are the code paths distinct (ie do I need to hit each component individually)? Is there a point where you can expose what’s being processed and how to me (sometimes these multi-threaded pathways can be broken down behind the scenes, such as by stepping through them with a debugging tool)? If not, the tests need to stay distinct, with a fine enough granularity that I can identify that each component variable is behaving as expected.