Simple Programming Tip #2 by Charlie Calvert
Copyright © 2003 by Charlie Calvert
This programming tip describes an infrequently mentioned set of
benefits derived from using testing tools such as JUnit for Java or DUnit for Delphi. More
particularly, this article shows how to use testing tools to help you
create small, robust, reusable modules out of which complete programs can
be created. Metaphorically, the image to keep in mind mirrors the process
you go through to create a model out of
LEGOs. In the ideal architectural model championed in this article,
each class
you create should be small and robust, just like a LEGO piece. Using
these classes should be just as simple and reliable a process as building
a model with LEGOs. Of course, in the real world things won't be quite
that simple, but that is the ideal put forward in this article.
Before starting the main body of this article, let me say a few
words for those who have not yet started writing test code. My
favorite tools for testing are JUnit for
Java and DUnit for
Delphi. I use Ant for Java
or Want for Delphi
to tie my test suites together so that I can run large numbers of tests
at one time. C# developers use NUnit and some C++ programmers use CppUnit.
For further information on the basics of creating test suites, follow
the links found in this paragraph, or see my Java article on using JUnit.
Please remember that this article is not a broad
description of testing in general, but rather a discussion of one
particular benefit I believe that testing can produce. The main benefit
to be derived from testing your code is increased robustness and
reliability. I'm taking that much for granted, and instead going after a
related subject which I hope some readers will find useful or
informative.
The code and theory shown in this tip apply equally to the C++,
C#, Delphi and Java languages. I have striven to keep the text found
here as short as possible, so that you can read this tip in just a few
minutes.
A Modest Proposal
The controversial formulation used by the most "extreme" advocates of
testing suites runs as follows: "Before you write a new class or
method, first create one or more routines for testing it." I'll
confess that I rarely follow this recommendation to the letter, but
nevertheless I like the spirit it fosters.
Suppose you are given an assignment to write code that will open a
file, parse its contents, and then return certain pieces of information
from that file. According to the radical proposition listed in the
previous paragraph, the first thing you should do is write one or more
tests. That is, you first write code for testing your real code, then
you write your real code and run the test suite to see if it works.
Such a proposition goes against the grain of most programmers'
instincts. If we have a job to do, we want to first write code that
accomplishes the task before us. We want to go away to that quiet place,
pound out the code that will dazzle our peers and impress our
superiors, and return in a matter of hours covered with glory. But our
radical formulation of the testing theory says that we should first
write code that will test the code that we are about to create. To most
of us, this seems like an impediment, like a step backwards rather than
a step forward. But let's take a moment to consider why testing of this
kind might be useful.
Create Small, Easy to Test Classes
The argument in this tip is simply that writing lots of tests
can help you create appropriately granular, easy to understand, robust
classes that tend to be easily reusable. There is no room in this tip to
explain in detail why I think such classes are worth creating.
Furthermore, there is nothing in creating test suites that forces you to
produce classes of this type. Nevertheless, if you agree that such
classes are indeed useful, I will try to show how writing test suites can
help you create code that adheres to this often praised architecture.
We all know what it is like to have a huge body of code, and to
suspect that a problem is in a particular part of it, and to find that
the only way to test the suspect portions of the code is to initialize
the entire program so that you can correctly call the code you want to
test. After launching this monolithic program, you can no longer be sure
that the problem is indeed isolated in the suspect portion of the code.
For instance, the error might be an artifact of memory corruption
problems encountered elsewhere in your code.
It is my contention that writing test suites can help you avoid the
problem described in the previous paragraph. Once again, test suites
don't force you to avoid monolithic architectures. It is my contention,
however, that they can help you avoid creating such programs.
The first thing you discover when you start writing test routines
is that you must create a second program, much smaller than your main
program. This second program will hold your test suite.
When writing this smaller test suite, it is only natural that you will
want it to be capable of testing just one small portion of your program.
It is very difficult, and not particularly useful, to try to test an
entire program all at one time. Instead, you want to be able to test
small, easy to understand subsections of your main program. This simple
fact can have some fairly interesting ramifications. In particular,
your desire to create simple test suites can strongly encourage you to
break your program up into small, well designed classes. The end result
is a program that is relatively easy to understand, and relatively easy
to maintain.
It is precisely at this point that you can see the wisdom of writing
your test suite first, and then writing your actual code second. If your
first priority is writing a test suite, then you will naturally want to
craft classes that are easy to test. In short, you will want to create
small, modular classes with few dependencies. If you write the code for
your main program first, then you won't feel quite so strongly compelled
to write small, modular classes. Instead, your priority will be to write
code that adds a new feature to your program as quickly as possible. Such
code is often monolithic in nature. It is simpler to stuff your new
feature into the current class you are working on, rather than creating a
new class that is more modular. In short, thinking about testing your
code helps you think in a modular mode that leads to robustness and reuse.
Conversely, thinking in terms of adding features to your current program
can encourage you to write large, monolithic programs that are hard to
maintain. Therefore, it is arguable that writing test suites encourages
you to pursue good programming practices such as writing small, modular,
easy to reuse classes.
A More Concrete Example
Suppose you are creating a class that is designed to calculate the
time the Voyager spacecraft will take to travel between various planets
in our solar system. Suppose further that the data about the distances
between planets needed for making those calculations is stored in an XML
file that you have been asked to parse. More particularly, suppose you
are working with a class that looks something like this:
public class MyPlanetTravelTimeCalculator
{
  private:
    Double CalculateDistance(String PlanetA, String PlanetB);
  public:
    Double CalculateTravelTime(String PlanetA, String PlanetB,
      int Speed);
    Double CalculateTravelTime(String PlanetA, String PlanetB,
      String PlanetC, int Speed);
    Double CalculateTravelTime(String PlanetA, String PlanetB,
      String PlanetC, String PlanetD, int Speed);
}
Your test class, on the other hand, is located in a different
module, and looks like this:
public class MyTestSuite
{
  public:
    void TestCalculateDistance();
}
As hinted at earlier in this article, your job is not to calculate
the time for traveling between the two planets, but instead to write the
single private routine called CalculateDistance in the class
called MyPlanetTravelTimeCalculator. The CalculateDistance
routine will need to open up an XML file, and retrieve data about the
distance between two planets. Such a routine is not difficult to write
or test. But it does have several points of failure, such as locating
the file, getting the rights to use the file, properly parsing the
file, and properly retrieving the correct data from the file. Writing
a test suite encourages you to carefully check each of these features to
ensure they are working properly.
As mentioned above, you want to create a small, simple test suite
that deals with a small and relatively isolated portion of your code.
By thinking this way, it should be obvious that you will be better off
if you create a separate class designed to parse the XML file and
retrieve the sought after data. In short, instead of working with two
classes, you want to be working with the following three classes:
public class MyPlanetTravelTimeCalculator; // Shown above
public class MyPlanetDistanceCalculator; // Contains your code
public class MyTestSuite; // Contains test code
As you can see, the very idea of creating a test suite has encouraged
us to adopt good programming practices. Creating a separate class for
each task in your program is the right thing to do, and the very act of
creating a test suite has encouraged you to engage in this practice. If
you did not create the test suite, you might tend to stuff the new
functionality you are creating into the
MyPlanetTravelTimeCalculator class. This would indeed be the
easiest way to proceed, but not necessarily the wisest.
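To make the split concrete, here is a minimal sketch in Java of what the separate class and its test suite might look like. It is only an illustration under stated assumptions: the XML layout, the attribute names from, to and km, the choice of exceptions, and the placeholder distance of 100.0 are all invented for this example, and the tests are written as plain static methods rather than against any particular testing framework. Each test exercises one of the points of failure listed earlier.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch of the separate class. The XML layout it expects is an assumption:
// <distances><distance from="Earth" to="Mars" km="100.0"/></distances>
class MyPlanetDistanceCalculator {
    private final File dataFile;

    MyPlanetDistanceCalculator(File dataFile) {
        this.dataFile = dataFile;
    }

    double calculateDistance(String planetA, String planetB) {
        // Failure points 1 and 2: the file must exist and be readable.
        if (!dataFile.isFile() || !dataFile.canRead()) {
            throw new IllegalStateException("Cannot read " + dataFile);
        }
        Document doc;
        try {
            // Failure point 3: the file must parse as XML.
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(dataFile);
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse " + dataFile, e);
        }
        // Failure point 4: the requested pair must actually be in the data.
        NodeList entries = doc.getElementsByTagName("distance");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            if (planetA.equals(entry.getAttribute("from"))
                    && planetB.equals(entry.getAttribute("to"))) {
                return Double.parseDouble(entry.getAttribute("km"));
            }
        }
        throw new IllegalArgumentException("No distance for " + planetA + " to " + planetB);
    }
}

// The test code lives in its own class and needs nothing but a small data file.
class MyTestSuite {
    static final String SAMPLE =
            "<distances><distance from=\"Earth\" to=\"Mars\" km=\"100.0\"/></distances>";

    // Writes the given text to a temporary file so each test controls its own data.
    static File tempFileWith(String text) {
        try {
            File file = File.createTempFile("distances", ".xml");
            file.deleteOnExit();
            Files.write(file.toPath(), text.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Passes only if the action throws an exception of the expected type.
    static void expectFailure(Class<? extends RuntimeException> type, Runnable action) {
        try {
            action.run();
        } catch (RuntimeException e) {
            if (type.isInstance(e)) return;
            throw new AssertionError("Wrong exception: " + e);
        }
        throw new AssertionError("Expected " + type.getSimpleName());
    }

    static void testCalculateDistance() {
        double km = new MyPlanetDistanceCalculator(tempFileWith(SAMPLE))
                .calculateDistance("Earth", "Mars");
        if (km != 100.0) throw new AssertionError("Expected 100.0 but got " + km);
    }

    static void testMissingFile() {
        File missing = tempFileWith(SAMPLE);
        missing.delete();
        expectFailure(IllegalStateException.class,
                () -> new MyPlanetDistanceCalculator(missing).calculateDistance("Earth", "Mars"));
    }

    static void testMalformedFile() {
        File broken = tempFileWith("<distances><distance");
        expectFailure(IllegalStateException.class,
                () -> new MyPlanetDistanceCalculator(broken).calculateDistance("Earth", "Mars"));
    }

    static void testUnknownPlanet() {
        File file = tempFileWith(SAMPLE);
        expectFailure(IllegalArgumentException.class,
                () -> new MyPlanetDistanceCalculator(file).calculateDistance("Earth", "Vulcan"));
    }
}
```

Notice that none of these tests needs the travel time class at all. In a real project the same checks would more likely be written with JUnit, but the shape of the tests would be much the same.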
Of course, there will be those who think it is a waste of time to
create a separate class for opening an XML file and parsing out the data
defining the distances between planets. In programming, however, things
are rarely as simple as they seem. For instance, calculating the time for
traveling between two planets is a nontrivial task. Both planets are
moving around the Sun at different rates, and both planets will be in
different positions at different times of the year. When looked at from
that perspective, suddenly the calculations made in the
MyPlanetTravelTimeCalculator class no longer seem so trivial.
Furthermore, you don't want to burden that class with having to worry
about the validity of the data it is using. Instead, you will most
sincerely want to create a separate class for calculating the distance
between planets, and you will be glad for a test suite that ensures that
this secondary class that you depend on is absolutely foolproof. You have
enough troubles without having to worry about being fed invalid data.
Working with Layers
Savvy readers might note that the MyPlanetTravelTimeCalculator
class described in the previous section is more "monolithic" than the
MyPlanetDistanceCalculator class. In particular, the former class
depends on the latter class and can't be tested on its own.
Having a dependency one level deep is not a serious problem in terms
of initializing or testing a class. However, experienced programmers
know that these dependencies can grow over time, until finally one is
working with a class that has dependencies six or seven levels deep. In
such cases, it is hard to see how testing suites can help you keep your
program modular.
The danger of deeply nested hierarchies is discussed
very well in Item 14 of Joshua Bloch's excellent book entitled
"Effective Java."
Test suites can help you recognize such deeply nested dependencies, and
can encourage you to refactor your program so that it is not quite so
monolithic. In particular, if you are creating a test suite, and find
that your deep nesting of dependencies is making it hard to create a test
program for your classes, then you can see a problem emerging and attempt
to fix it. In particular, it is best to strive to keep dependencies
shallow, that is, to never create hierarchies more than three classes
deep.
However, in the real world, despite our best efforts, there will be
times when deep hierarchies must be created. In such cases, the right
thing to do is to break out your code into "layers." Then you should
write test suites that "prove" the correctness of a particular layer.
Once that layer is shown to be valid, you can build on top of it with
confidence.
All programmers have experience working with such "layers" of code.
For instance, Delphi programmers build on top of a "layer" of code
called the VCL, which is in turn built on top of a layer of code
comprising the core features of the Pascal language as it is defined by
the compiler and the System unit. Java programmers typically build on
top of J2SE. In fact, Java programmers typically build on top of
multiple layers. For instance, J2EE is dependent on J2SE, and tools
like JAXB or SOAP are built on top of the core Java classes.
If you have a well tested layer of code, then you need not even
think of it as a dependency that needs testing. For instance, C++
programmers who write a class that uses the STL don't usually need to
think of their code as being nested several layers deep. Such
programmers can "assume" that the STL is solid, just as Delphi
programmers "trust" the VCL and Java programmers "trust" the core
classes in J2SE. Just how well founded our trust and assumptions may be
is fortunately not a question I have room to consider in this article!
The lesson to learn from reading this section is simple. Try to create
simple, modular classes that are easy to test. Do everything you can to
create simple architectures of this type whenever possible. However, if
you must create deeply layered hierarchies, then try to find ways to break
your code up into layers that can be thoroughly tested on their own. For
instance, if the IO operations to retrieve data from a set of XML files
end up being quite complex, then separate all those operations off into
their own "layer" and test them thoroughly. Then you can write code that
uses that layer without having to worry about its validity. The Java
language with its built-in support for packages
and jar files is particularly conducive to this kind of architecture.
C++ and Delphi programmers can achieve similar functionality by creating
components or DLLs.
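One way to keep an upper layer testable on its own is sketched below in Java. This is not something the discussion above prescribes: it uses a small interface plus an in-memory stand-in (often called a stub), which is a technique I am swapping in for illustration. The names DistanceSource, TravelTimeCalculator and FixedDistances are hypothetical, the distances are placeholder numbers, and the travel time formula assumes a straight run at constant speed, which is far simpler than any real calculation would be.

```java
import java.util.HashMap;
import java.util.Map;

// Lower layer contract (hypothetical): anything that can supply a distance in km.
interface DistanceSource {
    double distanceBetween(String planetA, String planetB);
}

// Upper layer: simple arithmetic built on top of whatever DistanceSource it is given.
class TravelTimeCalculator {
    private final DistanceSource distances;

    TravelTimeCalculator(DistanceSource distances) {
        this.distances = distances;
    }

    // Hours needed at a constant speed in km per hour (a deliberate simplification).
    double calculateTravelTime(String planetA, String planetB, double speedKmPerHour) {
        if (speedKmPerHour <= 0) {
            throw new IllegalArgumentException("Speed must be positive");
        }
        return distances.distanceBetween(planetA, planetB) / speedKmPerHour;
    }
}

// In-memory stand-in for the lower layer, used only by the tests.
class FixedDistances implements DistanceSource {
    private final Map<String, Double> table = new HashMap<>();

    FixedDistances put(String planetA, String planetB, double km) {
        table.put(planetA + "->" + planetB, km);
        return this;
    }

    @Override
    public double distanceBetween(String planetA, String planetB) {
        Double km = table.get(planetA + "->" + planetB);
        if (km == null) {
            throw new IllegalArgumentException("No distance for " + planetA + " to " + planetB);
        }
        return km;
    }
}

class TravelTimeCalculatorTest {
    static void testTravelTime() {
        DistanceSource stub = new FixedDistances().put("Earth", "Mars", 1000.0);
        double hours = new TravelTimeCalculator(stub).calculateTravelTime("Earth", "Mars", 50.0);
        if (hours != 20.0) {
            throw new AssertionError("Expected 20.0 hours but got " + hours);
        }
    }

    static void testRejectsZeroSpeed() {
        DistanceSource stub = new FixedDistances().put("Earth", "Mars", 1000.0);
        try {
            new TravelTimeCalculator(stub).calculateTravelTime("Earth", "Mars", 0.0);
        } catch (IllegalArgumentException expected) {
            return;
        }
        throw new AssertionError("Expected IllegalArgumentException for zero speed");
    }
}
```

In the finished program the stub would be replaced by the real, already tested class that reads the XML files. The upper layer neither knows nor cares which one it has been handed.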
NOTE: A word needs to be added here about different
kinds of dependencies. Consider, for instance, the hierarchy found in
the Java ArrayList class:
java.lang.Object
  |
  +--java.util.AbstractCollection
        |
        +--java.util.AbstractList
              |
              +--java.util.ArrayList
Here we can see that the ArrayList class is "dependent" on
three other classes, in that it has a fairly deep hierarchy. However, if
you are familiar with Java and with this particular class, then you
know that from the point of view of a test suite, the ArrayList class is very
straightforward. It is easy to initialize, and easy to test. There is
little need to worry about the validity of simple classes such as AbstractList or AbstractCollection. Problems with
those classes could be solved either by the tests written for those
classes, or in the tests you write for the ArrayList class. In fact, this
example helps to illustrate why I think writing test suites helps you
create modular, properly designed classes. The fact that the ArrayList class is so easy to test
is in a sense a "proof" that it is properly designed. This class is
modular, reusable, and easy to maintain. In this sense, it is a "layer"
on which you can safely construct other classes. My point here is that
you don't need to worry that the hierarchy of this class is four layers
deep. The "proof" that this hierarchical depth is not a problem is
demonstrated by how easily you can test it with JUnit. In other words, it is not
always the depth of the hierarchy with which you need be concerned, but
rather the ease with which you can test a particular class.
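To show how little setup such a class needs, here is a short sketch of a few checks against ArrayList. It is written as plain static methods in Java so that it stands alone. The class name ArrayListSmokeTest and the helper check are made up for this example, and in practice you would probably express the same checks as JUnit test methods.

```java
import java.util.ArrayList;
import java.util.List;

class ArrayListSmokeTest {
    // Tiny assertion helper so the sketch does not depend on a test framework.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Creating the object under test takes one line and no other setup.
    static List<String> twoPlanets() {
        List<String> planets = new ArrayList<>();
        planets.add("Jupiter");
        planets.add("Saturn");
        return planets;
    }

    static void testAddAndGet() {
        List<String> planets = twoPlanets();
        check(planets.size() == 2, "size after two adds");
        check("Saturn".equals(planets.get(1)), "element at index 1");
    }

    static void testRemove() {
        List<String> planets = twoPlanets();
        check(planets.remove("Jupiter"), "remove reports success");
        check(planets.size() == 1, "size after remove");
        check("Saturn".equals(planets.get(0)), "remaining element shifts to index 0");
    }

    static void testGetOutOfRange() {
        List<String> planets = twoPlanets();
        try {
            planets.get(5);
        } catch (IndexOutOfBoundsException expected) {
            return;
        }
        throw new AssertionError("get(5) should have thrown IndexOutOfBoundsException");
    }
}
```

None of these tests has to know anything about AbstractList or AbstractCollection, which is the point being made here.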
Summary
As you recall, this article is an examination of the following
controversial statement: "Before you write any new routine, first
create a routine for testing it." When I first heard that
statement it sounded a bit too fussy, a bit too compulsive. But after
considering the problem for a while, I saw that there might be more to
the proposition than I at first supposed. I still don't follow the rule
to the letter, but I think it helps point us toward a very valuable
set of programming practices.
In particular, we have seen that writing test routines can help us
achieve some worthwhile goals:
- Writing test routines strongly encourages us to separate our solution
to a problem into a very modular class that solves one simple problem.
Limiting the problems tackled by any one class is a good programming
practice that can help in surprising ways. In particular, it helps ensure
that any problems with your code are not masked by a problem with other
code in your program.
- The act of writing a test suite encourages reuse. By writing a test
suite, we proved that our code is built so that it can be used in two
places: our main program and the
test suite. In short,
encouraging reuse is a benefit that tends to emerge automatically when you
write test suites. It is, as they say, an emergent property of test suite
writing. Again, anything that encourages reuse is good programming
practice.
- By isolating our problem so that it could be tested, we simplified it.
Though I did not stress this point in the article, it is nonetheless
centrally important to my overall theme. Test suites encourage us to
write simple, easy to test classes. Simple classes are easier to
understand, easier to analyze, easier to troubleshoot, and easier to
maintain than big, unwieldy classes with lots of dependencies and multiple
uses. In particular, remember how easy it was to think of weak points in
the MyPlanetDistanceCalculator class once we broke it out into its
own discrete problem.
Besides these benefits, I should add that the act of creating a test
suite can get our creative juices flowing. It forces us to think about
the kinds of problems we are likely to encounter. When testing a class,
we tend to think about what can go wrong for a user of a class. Instead
of looking for an expedient solution to the problem that we can plug into
a bigger program, we instead think about what could go wrong with a
simple, easy to understand class. This kind of thinking can help us find
ways to write robust code.
After considering these benefits of creating test suites, it is perhaps easy
to see why some people endorse this technology. Of course,
there are many other reasons to write test suites than those discussed
in this article. But I believe the benefits discussed here are important in
and of themselves.