Kill Your Darlings – Why Deleting Tests Raises Software Quality

“In writing, you must kill all your darlings.”

William Faulkner

Testing is often thought of as a “how long is a piece of string” activity. Whilst there’s no clear definition of how many tests a feature needs, and no coding-to-testing ratio which makes sense outside its context, a common rule of thumb persists: the more time you spend on testing, the more likely you are to identify bugs… right?

This article is here to challenge that preconception, and uncover times when too many tests, too much testing, might actually result in a lower quality release.

The Problem

We don’t know if there are bugs in the code/solution/platform. We don’t know that, when we give it to our customers, the software will behave as we would like. We don’t know every permutation of behaviour, interaction or sequencing our code/solution/platform is likely to encounter. We don’t know it’s going to work.

The (Proposed) Solution

Throw everything at it. Everything we can think of, the kitchen sink (in multiple styles and finishes, with and without mixer taps, with and without a plug-riser…). Test in as many ways as we can think of. Test in all the ways we can come up with. Test each unit, integration, feature (with automation and manual tests). Perform UAT, OAT, Security, Accessibility, Performance tests. Think up some new styles of testing. Test test test. Then test more.

Sometimes in testing… less is more

Yeah but…

You’ve got to release this software someday, you know.

Also, you won’t think of everything to try with the software. Sorry. You just won’t. And should you try, your company will likely go under waiting 5–10 years for every release. There is simply too much to test to test everything, all the time. This is especially true of regression testing – when you add a new feature, this attitude to testing says “test everything again”, because you never know, right? There’s change in the system, so that’s invalidated everything. Not just the stuff we think the change has invalidated – E V E R Y T H I N G!

Writing unnecessary tests is a waste of your time, your team’s time, your business’ time. It consumes your creative juices, lulls you into a false sense of security, obscures meaningful data with noise. It’s not just a waste of time, it’s actively damaging the quality of your software. And you need to stop doing it.

This style of testing is also incredibly tedious, very expensive to maintain (imagine updating thousands of tests every time you make a change… time you could spend learning about your software and avoiding this situation in future), very expensive on your testers’ morale. It’s unsustainable. And it’s not the only option.

The Painful Truth

Some of your testing is waste. Sorry.

The Answer

There is a solution. By building an understanding of the risk – that is, realised risk – in our system, we get better at assessing what we really need to test. Some of this is really easy: if you run the same test every release and it always passes (or always fails), it’s not providing any information and can potentially be stripped out. There’s also a good chance some of your tests in feature development add nothing to your understanding of the product, and get run “just because”. They have no realistic possibility of failing, and a test only adds value if it risks failing. Learning which tests to delete, and which tests never to write again, can really boost the amount of productive testing you do.
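To make that concrete, here’s a minimal sketch (not tied to any particular test tool – the data shape and names are invented for illustration) of mining a CI results export for tests that have only ever produced one outcome:

```python
from collections import defaultdict

def flag_deletion_candidates(results, min_runs=10):
    """Flag tests whose history contains only a single outcome.

    `results` is an iterable of (test_name, outcome) pairs, e.g. pulled
    from a CI results export. A test that has run at least `min_runs`
    times and only ever produced one outcome is yielding no new
    information – a candidate for review and, probably, deletion.
    """
    history = defaultdict(list)
    for name, outcome in results:
        history[name].append(outcome)
    return [
        name
        for name, outcomes in history.items()
        if len(outcomes) >= min_runs and len(set(outcomes)) == 1
    ]

# Example: test_login has only ever passed; test_export has both outcomes.
runs = (
    [("test_login", "pass")] * 12
    + [("test_export", "pass")] * 6
    + [("test_export", "fail")] * 6
)
print(flag_deletion_candidates(runs))  # → ['test_login']
```

The `min_runs` threshold is the “bedding in” allowance: a brand-new test that has passed three times isn’t noise yet, it just hasn’t had the chance to fail.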

How does one go about building this “risk assessment” skill? It’s contextual and depends on the product. One way is to get closer to Support and the bug triage/prioritisation process. By seeing which bugs actually cause customers pain, you develop a sense of what’s really a problem and what’s minor. Another is to get closer to developers, and ask questions about the downstream impact of code changes. Developers may have a great quality focus and provide this up front, but it never hurts to ask more questions. What haven’t they considered? This is just good testing practice, and very valuable in making assessments of risk. Another thing to consider: what have the developers already covered with unit or integration tests? What testing is just duplication by the time a tester sits down with a mouse and keyboard?

Building your skill at assessing risk requires regular practice

It’s fair to say the skill of assessing risk is like a muscle, developed by use. If you never do it, you can’t expect to be any good at it, and should make your first decisions on a smaller scale to avoid exposing the product to too much risk. Trying to find one test you can kill may be enough at first – next sprint, find two, next sprint, look for three.

Testing vs Checking rears its head again here, too – something you expect to happen, happening, is not a test. Checking has a place in good quality practice, but it’s sensible to use it mindfully rather than pretending a check is testing something. Checks are binary, pass/fail. A test can uncover something unexpected. It can provide new information beyond a simple tick or cross – exploratory testing is non-binary in this way, and provides teams with information about more than just “what we thought to look for”. This is a great way of adding value to testing and reducing waste.

Another critical dichotomy in this conversation is Risk vs Fear. Acting out of fear (“if this goes wrong I’m in trouble!”) rather than from a consideration of risk (“this is likely to have gone wrong for x reason”) motivates a lot of wasteful testing. Working from a basis of understanding risk in your business context makes it a lot easier to pick out what’s a valid test, focused on something your business actually cares about, rather than something unimportant.

The Problem of Regression

Regression packs (automated and manual) are a particular pain point for this kind of thing, and indeed the reason I’ve started thinking about this at all. A large and largely static regression pack is a nightmare. Tests were added several years ago, perhaps by people who don’t work here anymore, perhaps for reasons now unclear – meaning we err on the side of caution and keep running the tests “just in case”, because we can’t know the impact of not doing so.

Let’s not forget… regression bugs are things we didn’t anticipate introducing into our software. They’re an unexpected consequence of a change. They can be avoided like any other bug, and found in development like any other bug… and yet we have this whole form of testing which is often run independently of any development work. We accept regression bugs because we’re not prepared to think through the consequences of our work to their full extent. We’ve gotta keep delivering! It’s worth challenging this attitude. We can get better at identifying risk vs fear. We can pick which tests to run every time, which to run when we have doubts about an area, and which we don’t need to run at all anymore. Doing so reduces the overall number of tests run, whilst building the understanding of our software required to assess risk properly.
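That three-way split can be sketched very simply: tier the pack, run the core tier every time, run the doubtful tier only when the change touches a related area, and stop scheduling the retired tier altogether. The tier names and test inventory below are invented for illustration, not from any real tool:

```python
# Illustrative tiering of a regression pack.
REGRESSION_PACK = {
    "test_checkout_happy_path": "always",    # core flow: run every release
    "test_invoice_pdf_layout": "on_doubt",   # run when billing-adjacent code changes
    "test_legacy_csv_export": "retire",      # one result for years: stop scheduling
}

def select_tests(pack, touched_areas=()):
    """Pick the tests to run for this release.

    'always' tests run every time; 'on_doubt' tests run only when the
    change touches a related area; 'retire' tests are never scheduled.
    """
    selected = []
    for name, tier in pack.items():
        if tier == "always":
            selected.append(name)
        elif tier == "on_doubt" and any(area in name for area in touched_areas):
            selected.append(name)
    return selected

# A billing change pulls in the on-doubt invoice test; a change elsewhere doesn't.
print(select_tests(REGRESSION_PACK, touched_areas=["invoice"]))
print(select_tests(REGRESSION_PACK))
```

The matching here is naive string containment, purely to show the shape of the decision; in practice the “related area” mapping is exactly the risk knowledge this article argues you should be building.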


Of course, when something in the software changes, many of those tests will need updating too. That maintenance cost is something we just have to accept while we don’t understand enough about the tests or the software they test. The good thing is, we have a fairly straightforward measure of the information a test provides: if it always passes (or, whisper it, always fails), it isn’t giving us information. It’s adding nothing to our ability to assess the quality of our software. Oh sure, we might like to give a feature some “bedding in” time, we might accept a test passing for 6 months before failing – but if a test has only ever yielded one result for 2 YEARS?! That test isn’t giving you information. That test is giving you noise. That test is giving you maintenance overhead. That test is ready to be deleted.

Is that always true? Obviously, context varies, and if you have certain core workflows which are absolutely essential to your product (I’m thinking… things where you’re going to get a phone call at 2am if they go wrong), you may want to maintain some testing in that area, even if it’s fairly stable. I’d advocate stripping this testing down to the essentials – perhaps a few “canary” tests, a few exploratory test charters and a few automated tests are enough?

Finally, to be clear, there’s a difference between deleting tests and deleting the data those tests hold. In most test management tools, you can remove a test from future execution plans without actually deleting it. Sometimes you’ll want to kill an outdated, information-free test outright… and sometimes you’ll just want to decommission it, so it won’t be run “automatically” next time.
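Code-level suites have the same decommissioning move. As a sketch using Python’s standard-library unittest (the test itself and `legacy_report_totals` are hypothetical), a skip decorator keeps the test and its reason in version control while excluding it from every run:

```python
import unittest

class LegacyReportTests(unittest.TestCase):
    # Decommissioned, not deleted: the test (and its history) stays in
    # version control, but it's excluded from future runs. The reason
    # string stops the next reader resurrecting it "just in case".
    @unittest.skip("One result for 2 years; flow covered by unit tests. Decommissioned 2019-04.")
    def test_legacy_report_totals(self):
        # Original assertions stay intact; they simply never execute.
        self.assertEqual(legacy_report_totals(), 42)  # hypothetical function

# Running the suite records the test as skipped, not passed or failed.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(LegacyReportTests).run(result)
print(f"skipped: {len(result.skipped)}")  # → skipped: 1
```

The advantage over deletion is reversibility: if the area becomes risky again, reinstating the test is one line, and the skip reason tells you exactly what changed the calculus last time.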

The Challenge

I challenge you to do some digging and figure out what that waste looks like. In the process of doing so, you also build an understanding of what testing is NOT waste in the context of your product, your dev culture, your environment.

I found a really nice summation of this whole idea in an article about unit tests, by Chairat Onyaem. I agree with every word here, and apply it to both manual and automated testing in any form – at a certain point, more tests are just noise:

[Screenshot: excerpt from “Why Most Unit Testing is Waste — Tests Don’t Improve Quality: Developers Do” by Chairat Onyaem]

How do you assess risk in your software? How do you reduce waste and cut your focus on fear? How do you go about assessing if a test is providing information or noise? Let me know in the comments!

Do Testers Understand Testing? – Jeff Nyman

Why Most Unit Testing is Waste — Tests Don’t Improve Quality: Developers Do – Chairat Onyaem
