CommentReader - Place your test data next to the test code

Published: 2012-04-15 by Lars Thorup and Sune Gynthersen. Tags: code, testing

Have you ever had to write unit tests that rely on textual test data? This is a classic issue: you usually end up putting the test data in a string variable or in an external file, depending on the amount of text. Neither of these options is particularly elegant. In this article, however, we will show you how we came up with a more developer-friendly approach.

In our current project, we are writing code that manipulates source code. Of course we like to write unit tests, and these unit tests often need to pass some sample source code as parameters to the method under test. Consider a function that extracts all string constants from a source file. The test case should take an input source file like the following:

class Sample
{
  private string s = "some value";
  public string Method()
  {
    return "Here is " + s + "\r\n";
  }
}

The result of calling our function should be this list:

{"some value", "Here is ", "\r\n"}

To write this unit test we need a way to represent the input source file. Two options come to mind immediately:

  1. Use an external file for the source and load this file from the unit test
  2. Encode the file as a string constant inside the unit test

Using an external file looks like this:

[Test]
public void ExtractCommentTest1()
{
  string fileContent = File.ReadAllText(@"..\..\Data\ExtractCommentTest.cs");
  string[] comments = ExtractComments(fileContent);
  Assert.AreEqual(3, comments.Length);
}

The problem with using an external file is that it makes the test case harder to read and modify: you need to look in two different files, the unit test itself and the external file, not to mention that you have to remember where the external file is located.

Using a string constant inside the unit test to hold the file looks like this:

[Test]
public void ExtractCommentTest2()
{
  string fileContent =
    "class Sample\r\n" +
    "{\r\n" +
    "	private string s = \"some value\";\r\n" +
    "	public string Method()\r\n" +
    "	{\r\n" +
    "		return \"Here is \" + s + \"\\r\\n\";\r\n" +
    "	}\r\n" +
    "}\r\n";
  string[] comments = ExtractComments(fileContent);
  Assert.AreEqual(3, comments.Length);
}

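The problem with using an embedded string literal is that most programming languages lack adequate support for file-like string literals (Python's triple-quoted strings being an exception). In C#, regular string literals need an explicit carriage return and line feed on every line and, worse, every double quote and backslash within the file has to be escaped. Verbatim literals (@"...") accept line breaks and backslashes as they are, but each double quote must still be doubled.

But there is a third option: embed the file in a documentation comment right above the test. It could look like this: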

/// <code> 
/// class Sample 
/// { 
///   private string s = "some value";
///   public string Method()
///   { 
///     return "Here is " + s + "\r\n"; 
///   } 
/// } 
/// </code>
[Test]
public void ExtractCommentTest3()
{
  string fileContent = CommentReader.GetElement("code");
  string[] comments = ExtractComments(fileContent);
  Assert.AreEqual(3, comments.Length);
}

The CommentReader class takes advantage of the XML documentation file that the C# compiler can generate from the triple-slash comments when compiling the source code. To support several different embedded files in a single unit test, you surround each file with an XML element and use the element name to look up the right embedded file.
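To make the lookup concrete, here is a minimal sketch of how such a reader could work. It is not the CommentReader.cs from this article: the class name CommentReaderSketch and the explicit xmlDoc and memberName parameters are assumptions made for illustration, whereas the GetElement call shown earlier takes only the element name and presumably works out the calling test method on its own.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sketch, not the CommentReader.cs referenced in the article.
public static class CommentReaderSketch
{
    // xmlDoc:      content of the compiler-generated XML documentation file
    // memberName:  documentation ID of the test method,
    //              e.g. "M:Tests.SampleTests.ExtractCommentTest3"
    // elementName: name of the element wrapping the embedded file, e.g. "code"
    public static string GetElement(string xmlDoc, string memberName, string elementName)
    {
        XDocument doc = XDocument.Parse(xmlDoc, LoadOptions.PreserveWhitespace);

        // Each documented member appears as <member name="..."> in the file.
        XElement member = doc.Descendants("member")
            .FirstOrDefault(m => (string)m.Attribute("name") == memberName);
        if (member == null)
            throw new ArgumentException("No documentation found for " + memberName);

        XElement element = member.Element(elementName);
        if (element == null)
            throw new ArgumentException("No <" + elementName + "> element in " + memberName);

        // Value returns the text content with XML escapes such as &lt; decoded.
        // Indentation written by the compiler is returned untouched.
        return element.Value;
    }
}
```

A call such as CommentReaderSketch.GetElement(xmlDoc, "M:Tests.SampleTests.ExtractCommentTest3", "code") would then return the text between the code tags of that test's comment. The namespace and class in the member name are made up for the example.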

Triple comment markers are quite easy to use for this purpose, because Visual Studio automatically inserts the markers whenever you press [Enter], making it easy to write the embedded file. It is also fairly easy to paste code from elsewhere: comment out the pasted lines twice (Ctrl-K, Ctrl-C, Ctrl-K, Ctrl-C), which gives four slashes, and then remove the first column (Shift+Alt+arrows, Delete).

If you employ continuous integration, you need to make sure that the generated XML file is available on the integration machine. This will normally be the case, since the integration machine has just compiled the source code before running the unit tests.

Embedding the file in comments within the source file avoids the problems associated with the other two options. The file is now located right beside the unit test that references it, which makes it easy to read and modify. And there is no need to escape anything (except minimal XML escaping), so almost any code can be used verbatim.
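One way to find the XML file at test time is to look next to the test assembly. The sketch below assumes that the documentation file is written to the same folder as the assembly and shares its base name; the XmlDocLocator class and its methods are hypothetical names, and test runners that copy assemblies to another folder before running them may need a different lookup.

```csharp
using System;
using System.IO;

// Hypothetical helper; assumes MyTests.dll and MyTests.xml sit side by side.
public static class XmlDocLocator
{
    // Maps an assembly path to the expected documentation file path.
    public static string GetXmlDocPath(string assemblyPath)
    {
        return Path.ChangeExtension(assemblyPath, ".xml");
    }

    // Reads the documentation file, failing with a clear message if it is missing.
    public static string ReadXmlDoc(string assemblyPath)
    {
        string xmlPath = GetXmlDocPath(assemblyPath);
        if (!File.Exists(xmlPath))
            throw new FileNotFoundException(
                "XML documentation file not found; is it enabled in the build?", xmlPath);
        return File.ReadAllText(xmlPath);
    }
}
```

From a test, the assembly path could be obtained with typeof(SomeTestClass).Assembly.Location, where SomeTestClass stands for any class in the test project.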

The need to reference complete files as input or expected output in unit tests is not restricted to implementations of programming tools. Any program that works with data files, configuration files or other types of files will need unit tests that test the parsing and generation of these files.

An implementation can be found here: CommentReader.cs

Getting Started

  1. Set up Visual Studio to output an XML documentation file (Project | Properties | Build | "XML documentation file")
  2. Suppress warning 1591 to stop the compiler from warning you that every public member should have an XML comment
  3. Start writing unit tests!
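If you prefer to edit the project file by hand, the first two steps correspond roughly to the MSBuild properties below. This is a sketch: the output path is an assumption and should match wherever your test assembly is built.

```xml
<PropertyGroup>
  <!-- Step 1: write the XML documentation file next to the assembly (path is an assumption) -->
  <DocumentationFile>bin\$(Configuration)\$(AssemblyName).xml</DocumentationFile>
  <!-- Step 2: suppress CS1591 (missing XML comment) while keeping any existing suppressions -->
  <NoWarn>$(NoWarn);1591</NoWarn>
</PropertyGroup>
```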

This blog post was originally written in June 2006 in collaboration with Sune Gynthersen, BestBrains, and is now rehosted here at fullstackagile.eu.
