Top 5 Automated Testing Mistakes: A Tale of Dev Confidence

Confidence Scenarios

Scenario 1: Lacking Automated Testing leads to developer uncertainty

Software developer Anna is new. It’s her second week and she’s finally gotten everything up and running. HR is no longer making her jump through hoops, her dev environment is all set up, and her first (mostly) solo change to the code base is complete. She sends the pull request off for code review – then suddenly notices an alert about a test failure. Her mentor comes over, and after a quick discussion shows her how to run the test locally so she can fix everything herself. He’s encouraging, complimenting her on her quick work to that point. An excellent example of automated testing gone right. But not all of us are so lucky.

Meet Dan; he’s pretty much in the same boat as Anna. However, his unwittingly broken code made it through the test suite, if you could call it that. It’s also made it through code review and has just been deployed. Unbeknownst to him, as he sits there feeling accomplished, the website he now develops has just gone down. About fifteen minutes pass before he notices a bunch of chatter at the architect’s desk. Curious, he investigates. As he walks over, the architect exclaims, “Got it, it’s Dan’s check-in!” Dan watches as the architect backs out his change. His anxiety rises as he realizes just how much of the website is down. After nervously refreshing the browser on his phone for what seems like the thousandth time, Dan sees a round of high fives as the architect announces, “Alright, it should all be up! Dan – let’s go talk about this change of yours.”

Sufficient automated test coverage is key to building developer confidence.

Sadly, the second scenario is all too common. Instead of spreading confidence around, this kind of development environment awards most of it to emergency-response heroes. As for everyone else, the only confidence they have is that their changes are going to break something – until they too can become an emergency-response hero. Sufficient automated test coverage is key to building developer confidence.

Scenario 2: Contentious Test environment

“Mike and Charles getting into it again?” John asks Elaine.
“Yeah, something about testing private methods with reflection,” Elaine replies, quickly adding, “This is why I don’t like writing unit tests here.”
“Every time I try to write one, I get hit up in code review that I’m doing it wrong,” she quips. “The senior devs can’t make up their minds. I know how to write tests for my code, but it’s not how they feel it should be done.”
“I’ll keep that in mind – I’m going to talk to our manager about this.” John says as he starts to walk away.

Writing test code is not the same as writing production code. Yet developers who write unit tests tend to bring the same mindset to both – and developers are often highly opinionated and unwilling to budge from those opinions. But unit testing doesn’t have to be that hard. A well-communicated standard around unit tests is required, and it needs to be less restrictive than the one for regular code. Making devs doubt whether a unit test is even worth adding leads straight back to Scenario 1. Build confidence by checking that devs are testing the code they wrote, but don’t care so much about how they did it. The biggest exception is when test performance suffers.
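As a concrete illustration of “test what they wrote, not how”: the usual alternative to reaching into private methods with reflection is simply to exercise them through the public API. Here is a minimal Python sketch (the article’s teams are arguing over PHP, but the principle is language-agnostic); `PriceCalculator` and its 8% tax rate are hypothetical.

```python
import unittest

# Hypothetical class for illustration. The "private" helper is never
# tested directly -- no reflection needed -- because asserting on the
# public result exercises it anyway.
class PriceCalculator:
    def total(self, prices):
        return self._apply_tax(sum(prices))

    def _apply_tax(self, amount):  # private helper
        return round(amount * 1.08, 2)

class PriceCalculatorTest(unittest.TestCase):
    def test_total_includes_tax(self):
        calc = PriceCalculator()
        # One assertion on the public API implicitly covers _apply_tax.
        self.assertEqual(calc.total([10.00, 5.00]), 16.20)
```

Either approach tests the code that was written; a standard that accepts both keeps the code review argument from ever starting.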

Scenario 3: Code reuse is not a friend

It’s the end of the day and Ed is adding some new functionality. He’s finished his code work, tested it locally quite thoroughly, and even built an automated test for it. He commits his code and waits for feedback. It doesn’t take long: 30% of the tests fail, though his own test passed before the cascade began. Ed starts feeling quite discouraged. He did everything right, but now he has to spend the next half day working through the test failures. After all that time, Ed realizes his override of the base class persisted state in a static variable. He cleans up his change and everything passes. Had his test been entirely encapsulated, a lot of the discouraging troubleshooting would have been avoided.
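Ed’s bug fits in a few lines. This hypothetical Python `Repository` keeps a class-level (static) cache, so state written by one test is still visible to the next; resetting the shared state in per-test setup is one way to restore encapsulation. A minimal sketch, with all names invented:

```python
import unittest

# Hypothetical repository whose cache is class-level (static), so it is
# shared across every instance -- and across every test that touches it.
class Repository:
    _cache = {}

    def save(self, key, value):
        Repository._cache[key] = value

    def load(self, key):
        return Repository._cache.get(key)

class RepositoryTest(unittest.TestCase):
    def setUp(self):
        # Without this reset, a value saved by one test leaks into the
        # next -- Ed's half-day of troubleshooting in miniature.
        Repository._cache = {}

    def test_load_missing_key(self):
        self.assertIsNone(Repository().load("id"))

    def test_save_then_load(self):
        Repository().save("id", "ed")
        self.assertEqual(Repository().load("id"), "ed")
```

Without the `setUp` reset, `test_load_missing_key` passes or fails depending on which test ran first, which is exactly the non-local failure Ed spent half a day chasing.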

A great developer knows when Worse is Better

A good developer is familiar with DRY and code reuse. A great developer knows when Worse is Better. In PHPUnit testing, I’ve seen many teams create helper methods and abstract ‘base’ test classes to extend. In all of these instances, the piece built to help caused frail tests. Doing things the right way for development is not the same as doing things the right way for testing. An example of good reuse is the mock object. Examples of bad reuse are helper methods that perform actions, and class extensions of PHPUnit or Codeception classes.
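To make the mock-object point concrete, here is a minimal Python sketch using `unittest.mock` (in the article’s PHPUnit context, a mock created with `createMock()` plays the same role). `Notifier` and its injected mailer are hypothetical; the reuse lives in the mocking library, not in a shared base class or action-performing helper.

```python
import unittest
from unittest.mock import Mock

# Hypothetical service: sends a welcome email via an injected mailer.
class Notifier:
    def __init__(self, mailer):
        self.mailer = mailer

    def welcome(self, address):
        self.mailer.send(address, "Welcome!")
        return True

class NotifierTest(unittest.TestCase):
    def test_welcome_sends_mail(self):
        # Good reuse: the mock is created fresh inside this one test, so
        # nothing is shared or inherited between tests.
        mailer = Mock()
        self.assertTrue(Notifier(mailer).welcome("anna@example.com"))
        mailer.send.assert_called_once_with("anna@example.com", "Welcome!")
```

Each test builds exactly the collaborators it needs, so changing one test can never silently break thirty others the way a shared base class can.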

Scenario 4: Missing the point

Alex is looking through George’s code review. The functionality of the code looks great, and there are even some well-built unit tests covering every method. Alex can see that the test harness already ran and passed, and goes on to approve the code review. Time passes. Jenny is modifying the code that George wrote. She doesn’t know much about it, but she’s used to trusting the automated test suites to keep her safe. She makes some changes, checks them in, and after they pass and make it through code review, she remarks to herself how glad she is the tests are in place. Once again, a website goes down and her confidence is shaken. What Alex and George failed to notice is that even though every method had a ‘test’ and there was 100% code coverage, the tests only checked that a value was returned. They didn’t care about its content.

This is a simplified scenario, but it happens frequently. Devs often focus on the quantity of tests instead of their quality, or they conflate quality with ‘live’ code standards, as in Scenario 2. A false sense of security from poorly built tests that always pass is missing the point. The point is to fail fast and fail often, to keep things that aren’t fully understood from going out the door, and to prevent unintended consequences. A test exists to verify that a given input, run through the code in between, produces the expected output. Code coverage is great and all – but outcome coverage is better.
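The gap between “a return occurred” and outcome coverage fits in a few lines. In this hypothetical Python sketch, both tests execute `apply_discount` (so both count identically toward code coverage), but only the second would catch a broken formula:

```python
import unittest

# Hypothetical function under test.
def apply_discount(price, percent):
    return price - price * percent / 100

class DiscountTest(unittest.TestCase):
    def test_weak(self):
        # George's style: 100% code coverage, zero outcome coverage.
        # This passes even if the math is completely wrong, as long as
        # *something* comes back.
        self.assertIsNotNone(apply_discount(100, 20))

    def test_meaningful(self):
        # Outcome coverage: pins a known input to its expected output.
        self.assertEqual(apply_discount(100, 20), 80)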

Scenario 5: False Positives

“Hey Lacey, why so down?”
“Oh, hi there Steve – looks like the pipeline is blocking me again.”
“Again? Really? What is it this time, Martians?” Lacey slightly smiled at that – Steve was always trying to cheer people up.
“No,” she replied, “it’s the functional tests. They take forever to run, and then they fail for non-issues. I don’t know if I can even trust them; they fail so often on false positives.”
“Yeah, I know what you mean.”

Lacey and Steve are running into the problem of false-positive fail* cases. These are often environmental, and they largely stem from ‘catch-all’ style testing. Instead of building out tests at the functional level, then the integration level, then the GUI level**, someone has decided a single end-to-end test is sufficient coverage. While end-to-end tests can be a great catch-all for problems, they are expensive to build, more expensive to maintain, and downright frustrating to troubleshoot. They lead, once again, to emergency-response heroes.
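One way down the pyramid is to push the environmental dependency behind an injected test double, so the lower-level test is deterministic. A minimal Python sketch; `StatusChecker` and its injected `fetch` callable are hypothetical stand-ins for whatever network call is making Lacey’s pipeline flaky:

```python
import unittest

# Hypothetical status checker. The real version would hit the network --
# exactly the environmental flakiness Lacey is fighting -- so the test
# injects a stand-in instead.
class StatusChecker:
    def __init__(self, fetch):
        self.fetch = fetch  # injected dependency, e.g. an HTTP getter

    def is_healthy(self):
        return self.fetch("/health") == "ok"

class StatusCheckerTest(unittest.TestCase):
    def test_healthy(self):
        # Deterministic: no network, no environment, no false alarms.
        self.assertTrue(StatusChecker(lambda path: "ok").is_healthy())

    def test_unhealthy(self):
        self.assertFalse(StatusChecker(lambda path: "error").is_healthy())
```

The logic gets exercised quickly and reliably at the unit level, leaving only a thin slice of genuinely end-to-end checks to run against the real environment.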

*(Scenario 4 also covers false positives, but in regard to tests passing when they should fail.)
** (see the Testing Pyramid)

Confidence Unleashed

Testing is an incredibly powerful weapon in a developer’s arsenal, most recognized for keeping ‘live’ code safe from bugs. Its hidden Easter-egg feature is developer confidence. Building a great unit test suite doesn’t drive business value directly. But I suspect that, given a study of dev workspaces, those with good testing environments would show improved developer confidence, which in turn leads to increased productivity. Fewer bugs and improved productivity? A development team would be wise to implement good testing practices.