Friday 18 March 2016

Better Unit Test Structures

I have issues with the conventional structuring of unit tests in .NET. 

The prevailing convention follows the idea of having (at least) 1 test class for each class to test, with at least 1 test method per method in the class under test. Some folks get very particular about how test methods are named, claiming that there's a naming "standard". The reality is, there is no such standard. There's convention. And sadly, the most common convention is a poor one. 

In this post I'll attempt to show you a different - better - way of structuring your unit tests, along with the benefits of why you should change. I'll be using the example of a simple CRUD (Create, Retrieve, Update, Delete) repository. You can download the code for this blog entry on Github


The Example

In this example we have a simple POCO model with a couple of fields. Some of the fields are decorated with data annotations for validation. 


POCO model

The POCO object interacts with a simple CRUD repository which accepts a generic type. Some fairly straightforward stuff.  


Simple CRUD repository


Even though this is a simple example, it's a pragmatic one. All professional .NET developers have seen/written/maintained a CRUD repository in some form or other. 


Current Convention

Possibly the most popular convention when it comes to naming tests in .NET is
 
<MethodName>_<Scenario>_<Expectation>

The idea being that when you're testing your class, you prefix your test methods with the name of the method you're testing, followed by the feature/scenario you're testing, followed by what the expected result/state is. 

Sounds about right, yeah? Seems legit? The resulting test class for the repository could probably end up looking something like this. 


Handful of methods from the RepositoryTests test class


While this is a handy convention it comes with some problems (in no particular order): 


  • It assumes - even promotes - that you have one test class for the entire class under test. This can lead to gaps in test coverage, especially when it comes to large or complex classes. Not cool at all. 
  • It becomes easy to hide code smell under tons of tests in a single test class. It looks like there's good test coverage of the class but it's not that easy to see whether you've covered all possible/realistic scenarios. It's just too much to absorb at once.
  • Related to the point above, it doesn't intuitively reveal which tests you should be writing. Again, it's because there's too much to absorb at once. Having a single test class means you need mental discipline to focus on writing a test for the method you're interested in as in the back of your mind you may actually be thinking about how to write tests for the entire class. Writing tests for the entire class usually means thinking about every execution path in that class, exceptions thrown within/by the method or by dependencies in the class, etc. This is counter-intuitive to the point of unit tests - we write tests to test methods, the unit in unit tests! When you're thinking about testing the class, even if it is just in the back of your head, your focus is in the wrong place. 
  • It gets unwieldy pretty quickly - even our simple repository example is already over 200 lines of code in a single test class and we haven't really written anything more than some trivial tests! Can you imagine how huge this test class could grow in real world, production quality software?
  • Prefixing each test with the method name, while necessary when using this structuring technique, is a pain in the ass. It's repetitive, boring and feels like redundancy. Who wants to write boring, repetitive code? Nobody. 
  • And finally - the underscores - they break readability, not improve it. In .NET, namespaces, classes and methods are not recommended to be named with underscores. This means that when reading test names, it's unlike reading any other code. That's a mental break and gear shift you don't really need however minor. Especially when you're in the zone. 
It almost seems that when it comes to unit tests, the usual care and attention to detail goes out the window. Why? It's not different code. Why should we treat it differently when it comes to the names of methods or the size of classes? 

All in all, this usual convention can get messy. Messy is bad. Messy is annoying. Messy doesn't inspire confidence or make you want to write tests. Messy is anti-TDD and leads to tests being added as an afterthought. Bad. Bad. Bad. Tests should come first - without writing tests first, you may end up writing code which is impossible to test. Or re-writing code just so that it passes the test - a dangerous road to lost productivity, lost time and shortcut hacks. I know. I've been that guy. 

Writing tests should feel natural. It should be fun. You should be writing tests not only because it's the right thing to do, but because you actually want to. 


The Alternative

I picked up on this technique from a Phil Haack blog entry a few years ago and have been using it ever since. I've refined it a little since, making it even easier/cleaner/more beneficial. 

If I could distill the technique into a single sentence, the basic gist of it would be to have a test class per method

I've taken it one step further and don't use nested classes or any inheritance within my test code. I have a separate test file per method, grouped in a folder named after the class under test. 


Test class per method FTW

This comes with a number of benefits:

  • Grouping in a folder by the name of the class under test helps keep namespaces sane. 
  • It's easy to navigate to the set of tests you're most interested in.
  • It makes it easier to add new tests to the method(s) you're interest in - simply append a new test to the end of the file. No need to navigate through a huge test class or nested classes. No need to try find where your Create_xxx test methods end and your Update_xxx test methods begin.
  • When you open an individual test class it makes it easier to see which tests you still need to write - have you covered all input parameters? Have you considered exceptions thrown by the method? etc. etc.
  • It helps keeps the size of the individual test classes more manageable.
  • It helps you focus your thoughts on how to test a particular method. Your focus is on testing the method, not the class. This is a subtle, positive reinforcement of unit testing/TDD.
  • Common setup code/utility methods/helpers/etc. can (and should) be made available through separate classes within your testing namespace instead of creating inheritance structures. This promotes code reuse and keeping stuff DRY.
  • You can easily group related tests together in subfolders of your test class folder. For instance, you may want to have integration or performance tests to compliment your unit tests. Grouping them this way keeps your tests neat, clean and low maintenance. Adding new test types and where to keep them becomes intuitive. 

Intuitively support multiple test categories


  • Splitting our tests into individual files/classes also shows up where we may have gaps in our test coverage. For instance when we take another look at our Create tests, it's easy to notice that although we're testing for validation failures from the repository on the Name field of our model, we don't have any test coverage for our Id or Quantity fields. Gap spotted and simple to correct - append more tests. 
Spot gaps easily


The Naming Situation...

Having a test class per method makes the MethodName prefix of test methods instantly redundant. Score! Less repetitive code to write. That still leaves us with <Scenario>_<Expectation> portion. These are still useful so we'll kinda keep 'em but that underscore has to go

My rationale is based - in small part - on the idea that we've already moved away from conventional unit test structures. So I'm kinda thinking we needn't remain bound to the same naming either. The real reason is because I find the underscore in test names to be really, really annoying. 

A quick search on unit test naming will spit out a few links, including this post from StackOverflow on Unit test naming best practices, Roy Osherove's Naming standards of unit tests (also linked to from the SO post), and DZone's 7 Popular Unit Test Naming Conventions. A cursory glance at these posts shows that the underscore is way popular in unit test names.

My preference is to use the "Feature to be tested" convention. It compliments the folder/file structure and gives you the option of having succinct test names which still convey their intent. For instance our Create_ValidModel_WillSaveRecord method could now be named much more succinctly as WillSaveValidRecord - it's already in our Create test class. Looking at it from a higher level you could read this with a pseudo qualified name as "Repository.Create.WillSaveValidRecord" which is pretty much exactly what you would expect the happy path of your code to do. 

Also, if you use NCrunch (and you absolutely should) grouping your tests by Namespace, Fixture, Test in the Tests window shows that short, succinct names actually improve the readability of your tests without sacrificing their intent.


NCrunch Tests Window
It becomes even more valuable when you have a solution with thousands of tests where you might only want feedback on failing tests. You have instant information on which area of the code is failing, exactly which tests are failing and why they're failing. 

Although that's more of a feature of NCrunch, the way you structure your tests compliments this feedback loop very, very well. This reduces the time you spend in code/compile/debug cycles, making you more productive in less time. I'll hopefully get around to publishing a post on why NCrunch is awesome. 


Failure feedback



Conclusion

While there may be some who will read this and consider it patently wrong the point of unit testing/TDD is not to enforce strict rules, it's to support the writing of high quality software rapidly. This structuring technique works very, very well and provides many more benefits than more vanilla/traditional conventions.

Conventions, while great for establishing some commonality, should not be confused for rules or standards. The world of software engineering is constantly evolving. How we go about developing our software needs to evolve with it.