
Test-Driven Development (TDD) is a method from Extreme Programming (XP) that reverses the traditional order of coding and then testing. It works like this: Before you add a feature to a module, you devise a test that will demonstrate that the module properly supports that feature. You run the test, and it fails, since you haven't implemented the feature yet. You code up the feature, then run the test again, and this time it passes. Then you go on to the next feature, repeating the same procedure, but running the previous test as well as the new test. You continue this cycle, building up a collection of tests as you add features to the module.

Now, why would anyone want to work in this backwards fashion? After all, how do you even know what to test for before you have written the code?

That last question is a clue that something is wrong, and helps explain the reason for TDD.

In Agile development, each user story is required to have acceptance criteria, which are a means to answer the question "Did this coding change satisfy what the story asked for?" For example, consider a story that says "Users signing on to the system should be authenticated." There are all manner of solutions that could meet this criterion, and not all of them would satisfy what the author of the story wanted. On the other hand, consider a story with these acceptance criteria:

  • Sign onto the system, using a user ID and password that are in the user database. The signon should be successful.
  • Sign onto the system, passing in an OAuth 2.0 bearer token from Facebook. The signon should be successful.
  • Attempt to sign onto the system with an invalid user ID or password. The signon should fail.
  • Attempt to sign onto the system with an invalid OAuth 2.0 bearer token. The signon should fail.
  • Perform three failed signon attempts, then attempt to sign on with a valid user ID and password. The attempt should fail until 30 minutes have passed.
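Criteria written this way translate almost directly into executable checks. The following is a minimal sketch in Ruby, assuming a hypothetical in-memory Authenticator class (the class name, its constructor, the sign_on method, and the injectable clock are all invented for illustration and do not come from any real library). The OAuth criteria are left out to keep it short.

```ruby
# Hypothetical in-memory authenticator, for illustration only.
class Authenticator
  MAX_FAILURES = 3
  LOCKOUT_SECONDS = 30 * 60

  # users: hash of user ID => password
  # clock: callable returning the current time (injectable so tests need not wait)
  def initialize(users, clock: -> { Time.now })
    @users = users
    @clock = clock
    @failures = Hash.new(0)
    @locked_until = {}
  end

  def sign_on(user_id, password)
    now = @clock.call
    return false if @locked_until[user_id] && now < @locked_until[user_id]

    if @users.key?(user_id) && @users[user_id] == password
      @failures[user_id] = 0
      @locked_until.delete(user_id)
      true
    else
      @failures[user_id] += 1
      @locked_until[user_id] = now + LOCKOUT_SECONDS if @failures[user_id] >= MAX_FAILURES
      false
    end
  end
end

now = Time.at(0) # fake clock, so the lockout can be exercised instantly
auth = Authenticator.new({ "alice" => "s3cret" }, clock: -> { now })
3.times { auth.sign_on("alice", "wrong") }
puts auth.sign_on("alice", "s3cret") # still locked out
now += 30 * 60
puts auth.sign_on("alice", "s3cret") # lockout has expired
```

Passing the clock in as a parameter is a design choice made for testability: the 30-minute rule can be checked without a 30-minute test.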

These acceptance criteria give us a much clearer target to code for. TDD takes this a step further, by developing acceptance criteria for each step of the development, in the form of tests. You add tests and features in very small steps. Sometimes they feel unnaturally small, but they keep you thinking about edge conditions, as each test suggests the next one. Suppose, for example, you are writing a routine to format money. You start with a test that passes in a number, and when you implement the change, the output looks good. "Hmmm," you say. "What if the number is negative?" So you write a test for that, and the result does not look so good. So you change the code to do something sane with negative numbers, like displaying an error message, and you run the test again. "But what if the number is really big?" you think, so you write another test, and so on. Each time you do this, you write the simplest code that will pass the test, then you refactor it, if necessary, to make it cleaner and to avoid duplicate code. You can do this safely, because with the suite of tests you build up, you can be assured that the refactoring has not altered the behavior of the module.
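Here is a sketch of where the first three tests might leave the money-formatting routine. The name format_money, the dollar sign, the two-decimal layout, the thousands separators, and the choice to raise an ArgumentError for negative amounts are all assumptions made for illustration; a real routine would follow whatever its own tests demanded.

```ruby
# Hypothetical routine grown one test at a time; names and output format are assumptions.
def format_money(amount)
  raise ArgumentError, "amount must not be negative" if amount.negative?

  dollars, cents = format("%.2f", amount).split(".")
  # Insert a comma between each group of three digits, counting from the right.
  dollars = dollars.reverse.scan(/\d{1,3}/).join(",").reverse
  "$#{dollars}.#{cents}"
end

# Test 1: an ordinary number.
raise "plain amount" unless format_money(12.5) == "$12.50"

# Test 2: "What if the number is negative?"
begin
  format_money(-1)
  raise "negative amount should have been rejected"
rescue ArgumentError
  # expected: the routine refuses negative amounts
end

# Test 3: "But what if the number is really big?"
raise "large amount" unless format_money(1_000_000) == "$1,000,000.00"

puts "all tests passed"
```

Each test stays in the file after it passes, so the refactoring step can rerun all of them.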

One of the best resources I have found for starting out with TDD is Test-Driven Development by Example, by Kent Beck. Part I of the book, chapters 1 through 17, takes a fictitious application for managing bond portfolios and adds to it the ability to handle multiple currencies. The author leads us through the test-code-test-refactor-test cycle little by little, showing how one test suggests another, and how it is possible to vary the size of our steps from baby steps, when things are confusing, to large steps when things are clearer.

Part II of the book, chapters 18 through 24, uses TDD to develop an xUnit harness in the Python language. xUnits are test harnesses that make TDD go much faster by keeping track of the tests for each module, running them, comparing their output with the expected output, and indicating which tests, if any, failed. This makes TDD far less painful than if one had to run each test manually and visually verify the results. xUnit is a generic term for TDD testing harnesses. Examples of more specific ones are JUnit for Java, CppUnit for C++, NUnit for .NET, and UUnit for Unity. This section of the book, in addition to giving a practical demonstration of TDD, also imparts a good understanding of how xUnit harnesses work, and is invaluable for anyone who needs to write one for a language that is not currently supported.
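To give a feel for how little machinery an xUnit harness needs, here is a toy one in Ruby. It is only a sketch and is not the harness developed in the book (which is written in Python); the names TinyTestCase, run, and assert_equal are hypothetical.

```ruby
# A toy xUnit-style harness; class and method names are hypothetical.
class TinyTestCase
  # Run every method whose name starts with "test_" and return the failure messages.
  def self.run
    failures = []
    test_names = instance_methods(false).grep(/\Atest_/).sort
    test_names.each do |name|
      begin
        new.public_send(name) # fresh instance for every test
      rescue StandardError => e
        failures << "#{self.name}##{name}: #{e.message}"
      end
    end
    puts "#{test_names.size} run, #{failures.size} failed"
    failures.each { |failure| puts "  #{failure}" }
    failures
  end

  # Compare actual output with expected output; raise if they differ.
  def assert_equal(expected, actual)
    return if expected == actual

    raise "expected #{expected.inspect}, got #{actual.inspect}"
  end
end

class ArithmeticTest < TinyTestCase
  def test_addition
    assert_equal 4, 2 + 2
  end

  def test_deliberately_broken
    assert_equal 5, 2 + 2
  end
end

ArithmeticTest.run
```

The harness does the four jobs described above: it finds the tests, runs them, compares actual with expected, and reports which ones failed.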

Part III of the book, chapters 25 through 32, deals with design patterns for tests, and for getting tests working. This includes answers to fundamental questions such as:

  • What do we mean by testing?
  • When do we test?
  • How do we choose what logic to test?
  • How do we choose what data to test?

It also includes special testing considerations, such as mocks (programs that simulate slow or hard-to-set-up things such as database calls, in order that the tests can run quickly), and fakes (a method used in some complicated cases where you start by hard-coding the result you want, which tests the test itself, before refactoring the code to do the desired processing).
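As a small illustration of the mock idea, the sketch below replaces a slow price lookup with an object that returns canned values and records what was asked of it. Every name here (PortfolioReport, MockPriceSource, price_for) is hypothetical and chosen only for the example; it is not code from the book.

```ruby
# Hypothetical names throughout: a sketch of substituting a mock for a slow lookup.
class PortfolioReport
  def initialize(price_source)
    @price_source = price_source
  end

  # holdings: hash of symbol => number of shares
  def total_value(holdings)
    holdings.sum { |symbol, shares| shares * @price_source.price_for(symbol) }
  end
end

# Stands in for a real database client: returns canned prices and records each request.
class MockPriceSource
  attr_reader :requested

  def initialize(prices)
    @prices = prices
    @requested = []
  end

  def price_for(symbol)
    @requested << symbol
    @prices.fetch(symbol)
  end
end

mock = MockPriceSource.new({ "IBM" => 25, "GE" => 10 })
report = PortfolioReport.new(mock)
puts report.total_value({ "IBM" => 2, "GE" => 3 })
puts mock.requested.inspect
```

Because the report receives its price source as a constructor argument, the test can hand it the mock and the production code can hand it the real thing.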

The book is a good introduction to TDD, and does a good job of explaining the rationale, demonstrating how it is done, and giving you tools to help get you started.

My favorite quote from the book is "I taught Bethany, my oldest daughter, TDD as her first programming style when she was about age 12. She thinks you can't type in code unless there is a broken test. The rest of us have to muddle through reminding ourselves to write the tests."

The book is available from Amazon here.

When I was a young programmer, a colleague was writing an engine-test system to run on a minicomputer. The only available high-level language was FORTRAN II.

Edsger Dijkstra had just published his "Go-to statement considered harmful" paper, and there was much talk about how you could not code in FORTRAN without GOTOs, since it was not a block-structured language. (Modern FORTRAN is a completely different story. As they say, "It's not your grandfather's FORTRAN.")

Unconcerned, my colleague designed his programs using Nassi-Shneiderman structured flowcharts, and coded them using GOTOs only to build the structured-programming constructs he needed. Higher-level languages had block-structured constructs built in, which made structured programming easier, but given sufficient desire and self-discipline, it was possible to achieve similar results. (I used this approach for many years in coding assembly language.)

Object-oriented languages brought us encapsulation, where you can instantiate a class and call its methods, without ever being able to see its internal data. In 1967, George Dodd, of General Motors Research Labs, developed a package called APL ("Associative Programming Language") for processing computer-graphics data. (Within GM, when someone mentioned APL, the listener immediately responded "Dodd's or Iverson's?") Upon first calling APL, you were given a handle, which you passed in on subsequent calls. The data structures APL used were pointed to by the handle, but they were opaque to the caller. This was encapsulation, a dozen years before Bjarne Stroustrup started his work on C++.

A few years ago I wrote some new code for an assembly-language system, and I wanted to use Test-Driven Development. There was nothing like JUnit or NUnit or UUnit available for assembly language, so I wrote my own simple test harness, and structured my internal routines to facilitate independent testing. I was amazed by the number of small errors I caught during the red-green-refactor process and was happy to discover them now, rather than waiting for a customer to find them. The benefits were definitely worth the effort, even though I would have preferred to use a higher-level language.

In all these cases, it was possible to get by without modern tools. No doubt, tools can make things easier. When you code in an object-oriented language, your focus stays on the problem being solved and the structure of the solution, without getting distracted by which hardware instruction to use (something that modern compilers are more qualified to judge than we are). Test runners like JUnit, NUnit and UUnit take the drudgery out of unit testing, making TDD much smoother. But they are just tools.

Sometimes I think about the cartoons I watched as a child. Today, computers can take a starting and ending pose and fill in all the frames in-between. Back then, each frame, 24 per second, was drawn by hand. It is hard to imagine anything more tedious. But with all that drudgery, they were still able to tell stories.

Tools are good, because they help eliminate mistakes and remove drudgery. Any time you can remove drudgery, you reduce the risk that you will say "I'm really busy, I'll just skip the tests this one time." But you don't always need them.

This applies to agile development tools as well as engineering tools. I get unhappy whenever I hear a Scrum team say "We can't do this because Version One, or Rally, or RTC, doesn't support it." The tool's job is to make our work easier and better. If we have to bend what we do to suit the tool, the tool is not doing its job, and we need to find, or make, a different tool.

Remember, it's just a tool.

"The quality goes in before the name goes on"--Zenith Electronics slogan.

"Quality is job one"--Ford Motor Company slogan.

When I worked on an automotive assembly line in the early 1970s, they had a Quality Control department. It acted like a sort of quality high-pass filter: incoming parts and assembled trucks were inspected; if they were high enough quality, they were used or shipped; if not, they were scrapped or sent to the rework department.

Over the years, Quality Control departments transformed themselves into Quality Assurance departments, the idea being that rather than just filtering out low-quality products, they would endeavor to make sure the products got built right in the first place.

The Toyota Production System refined this idea with poka-yoke (error proofing) and the andon light (where anyone on the line could signal a problem, and the line would stop while they figured out how to keep the problem from happening again).

Sadly, many software development organizations, particularly in mature companies, are still stuck in the era of Quality Control. The QA department finds bugs and reports them, the bugs are fixed, and charts are distributed to management so they know which groups to beat up about having too many bugs.

What is missing is the big picture of quality: it is not a department that acts as a gate, or even worse, gets rated on how many bugs they find. (I once worked with a QA department that sent very vague bug reports to development, things like "This message may be wrong." Their explanation was that they got rated on how many bugs they found, so their boss didn't want them to waste any time doing research on whether or not something really was a bug.)

Since we're talking advertising slogans, remember the Maytag repairman, who never had any work because Maytag washing machines were so good? That's kind of what we are shooting for, but we don't have a QA department that is sitting idle. Instead, we have a QA presence on each Scrum team (as required in order to have cross-functional teams), and in addition to figuring out how best to test things, he or she also helps the team design for quality and for testing.

Poka-yoke applies to software in many ways. Modules should be small, and should adhere to the Single-Responsibility Principle. (Robert C. Martin: "Functions should do one thing. They should do it well. They should do it only.") Their APIs should be well documented, preferably in extractable comments, like Doxygen or Javadoc, so they are more likely to stay in sync with the code. Their actions should make sense; in other words, when a programmer makes an assumption about what the module is going to do, he or she should usually be right.

Program source tends to get sloppy over time, particularly when you have new programmers who are accustomed to different bracketing styles. This can cause errors when a bracket isn't where it is expected and is overlooked. (Keeping modules small makes this a lot less of a problem, though.) Program source can be standardized to a common bracketing style using tools like Artistic Style.

Following the Single-Responsibility Principle makes it easier to unit-test modules. If a bunch of small modules that have been thoroughly unit tested are assembled to make a product, the amount of testing required for the complete product is reduced. We still have to test to make sure we didn't have misunderstandings in the design or assembly, but if we know the modules are well behaved, we no longer have to throw everything but the kitchen sink at the product.

With small, simple modules, static analysis tools like SonarQube can help you analyze your test coverage, and make sure modules are not too complicated to test adequately. You also might look at the techniques used by teams who write safety-critical software, like The Power of 10: Rules for Developing Safety-Critical Code. Some of their rules, like no dynamic memory allocation, are too stringent for a lot of products, but it is worth looking at what they do.

Another area where design can improve testing is separating data from its presentation, as in the Model-View-Controller, Model-View-Presenter, or Model-View-ViewModel design patterns. Rather than having to test via screen-scraping tools that work with the GUI, you can test at the Model API, which makes it a lot easier to automate tests. You still have to test the View and Controller, Presenter, or ViewModel to make sure they accurately reflect the Model, but the effort is a lot less.
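A very small sketch of the idea, with a hypothetical CounterModel and CounterView (both names invented for this example): the logic lives in the model, so most tests can call the model directly and never touch a screen.

```ruby
# Hypothetical model and view; the names are illustrative only.
class CounterModel
  attr_reader :count

  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end

  def reset
    @count = 0
  end
end

# The view only formats what the model holds, so it needs very little testing.
class CounterView
  def initialize(model)
    @model = model
  end

  def render
    "Count: #{@model.count}"
  end
end

model = CounterModel.new
3.times { model.increment }
puts CounterView.new(model).render
```

The behavior (incrementing, resetting) is tested at the model's API; the only thing left to check in the view is that it reflects the model accurately.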

Debugging flags can be helpful too. I have mentioned them elsewhere in the context of shipping with partially implemented features, but they are also useful for displaying information that the customer normally does not want to see but that can be useful for debugging and testing. When the customer calls in with a problem, you tell them to turn on this debugging flag and send you the output, which makes it a lot easier to triage the problem remotely. And during testing, you can turn them on to see intermediate results that help you know the product is operating properly. And since you have followed the Single-Responsibility Principle, you can be pretty confident that turning on the debugging flags will not suddenly make the product act differently.
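As a sketch of such a flag, the hypothetical ShippingCalculator below writes its intermediate results to a log only when debug is turned on. The class, the flag name, the surcharge rule, and the trace format are all assumptions made for the example.

```ruby
require "stringio"

# Hypothetical module: the flag name, surcharge rule, and trace format are assumptions.
class ShippingCalculator
  def initialize(debug: false, log: $stderr)
    @debug = debug
    @log = log
  end

  def cost(weight_kg, rate_per_kg)
    subtotal = weight_kg * rate_per_kg
    trace("subtotal=#{subtotal}")
    surcharge = weight_kg > 20 ? 5 : 0
    trace("surcharge=#{surcharge}")
    subtotal + surcharge
  end

  private

  # Tracing only writes to the log; it never changes the calculation.
  def trace(message)
    @log.puts("[debug] #{message}") if @debug
  end
end

log = StringIO.new
quiet = ShippingCalculator.new
verbose = ShippingCalculator.new(debug: true, log: log)
puts quiet.cost(25, 2) == verbose.cost(25, 2) # same answer with or without the flag
puts log.string
```

Keeping the tracing in one private method that has no effect on the result is what makes it safe to switch on at a customer site.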

Finally, automated test runners, as have been made popular by Test-Driven Development, let you run tests automatically whenever a module changes. Some examples of these are Jenkins (formerly known as Hudson) and TeamCity.

The Quality Control Inspector of the 1970s has given way to the Quality Assurance Designer of the new millennium, who is less concerned with finding bugs, and more concerned with making sure they do not happen.

Domain-Specific Languages (DSLs) are a new buzzword these days. But DSLs have been around for a long time, and are not always computer related. For example, various notations for chess moves, like N-QB3 or Nf6, have been in use since at least the 1700s, and probably earlier. They can describe a chess game much more concisely than any natural language. Some SWAT teams use a visual DSL called SWATCOM, a series of hand signals that can say things like, “I see two men and one woman with guns, moving towards you.” These languages are very concise, but at a cost of being very narrow in scope. For example, neither chess notation nor SWATCOM can say “How did you like the new Britney Spears video?”

This same characteristic applies to computer DSLs. A programmer can use a standard programming language, such as C++, Java, or assembly language, to write a program. But sometimes you need to have programs written by subject-matter experts who are not programmers. Teaching them a general-purpose language is a major undertaking. But you can probably teach them a simple language tailored to their domain.

The RTSTRAN story

In the mid-1970s, I was working for a division of General Motors that needed a way to describe vehicle wiring diagrams to an automated test system. If you have ever looked at the wiring diagram in the back of the owner’s manual for your car, it is drawn like a plate of spaghetti, with devices located based on where they are in the car, and lines crisscrossing each other to connect the devices. It would be very difficult to enter this kind of diagram into the computer without making mistakes. Clearly, we needed another approach.

The first step was to get the Electrical department to redraw the electrical diagrams in the style of industrial ladder diagrams. These diagrams have vertical lines representing the power supply and ground, and a series of horizontal lines that go from the power supply wire to the ground, showing all the devices in between. There are as many lines as are required to describe all the circuits. No attempt is made to represent where the device is physically located.

Here is an example of how a horn circuit looked in the new style of diagram:

Once we had the wiring diagrams in this new form, we developed a domain-specific language, RTSTRAN, that could describe them. Here is how the diagram above could be described in this language:

Since the wiring diagrams were drawn in a standard way, it was fairly easy to teach someone how to convert them to RTSTRAN and enter them into the test system.

By the way, if I were doing this today, three decades later, I would consider having the engineers enter the diagrams using a graphical ladder diagram editor, which would eliminate re-keying the information completely. Here is one example of such an editor.

There are several other available ladder diagram editors, both proprietary and open source, as well as compilers that will compile the diagrams into XML. For more information about them, do a search on the Internet for “IEC 61131-3”, “LD”, and “compiler”. (IEC 61131-3 is an international standard that defines programming languages for programmable logic controllers. The LD section describes graphical ladder diagram representations.)

Once we had the diagrams entered in RTSTRAN, we needed to process them in order to convert them to tables that would be downloaded to the minicomputer that ran the test system. W.H. McKeeman had published a book called A Compiler Generator. It described, and provided the source code for, a compiler for a PL/I-like language called XPL, as well as a grammar analyzer that produced parsing tables that could be used to process other languages. These were written in XPL. (It was a self-compiling compiler.)

Luckily, Mary Pickett, Heinrik Schultz, and Fred Krull at General Motors Research Labs were working on various industrial computer languages and had ported the XPL compiler and analyzer to PL/I, so they could run on the IBM 370.

Using these programs, along with the RTSTRAN grammar (which was written in BNF), I produced a language translator with stubs for processing each element of the language. All I had to do was fill in the processing in each stub, which was a lot easier than writing a compiler from scratch would have been.

Here is what the beginning of the RTSTRAN grammar specification looked like. (In all, there were 111 rules):

Today, there are several open-source compiler-generating tools available, such as the following:

Other uses

You might be thinking “That’s nice, but I don’t have any electrical wiring diagrams to process in my product.” That may be, but before you reject the idea of a DSL, think about whether you have any complicated configuration information that the customer has to enter. For example, does the customer need to describe his or her network to your product, or perhaps the devices in his or her data center? For complicated input of this nature, a DSL can make it much easier and faster for the customer to enter the necessary information.

ColmoneBOL

Implementing a DSL does not have to be as complicated as writing a full compiler. For example, in the late 1980s, my colleague, Ron Colmone, worked on a security product for a minicomputer. He developed a DSL that, with a nod to COBOL, was nicknamed ColmoneBOL. The purpose of the DSL was to process command packets sent up from the minicomputer and issue the appropriate commands on the mainframe security system. There were many types of command packets, and the DSL provided the building blocks necessary to add conditional logic to some very powerful runtime functions. The DSL also provided a debugging/trace facility that assisted in diagnostics and debugging.

In this case, the DSL was processed by the MVS High-Level Assembler, using a macro that did an AREAD for the entire script, and then generated blocks that could be executed by the interpreter.

Domain-Specific Languages by Martin Fowler with Rebecca Parsons

This brings us to the main subject of this review, Domain-Specific Languages, by Martin Fowler with Rebecca Parsons. It was published in 2011 as part of the Martin Fowler Signature Series. The book begins with some introductory material to ease the reader into the topic of DSLs, then provides a number of chapters about various aspects of DSLs.

Chapter 1 begins with a hypothetical example that is simple and fun, but also lays out some of the reasons one might want to use a DSL. It describes a “Gothic Security System”, based on old Gothic movies, where one opens a secret panel in the wall by pulling the candle holder at the top of the stairs and tapping the wall twice, or something like that. It discusses how one might design a language so this system could be easily adapted for each haunted mansion where it is installed.

In the example, a customer wants a secret panel to open when she closes her bedroom door, opens a drawer in her dresser, and turns on the bedside light. The toy DSL for this project defines the events, shown below in green (the designations like D1CL refer to the inputs from the various sensors), the commands, shown below in red (the designations like PNUL refer to outputs to control locks and such), and then a series of states, shown below in blue. Each state lists events that cause a change of state, and what the new state is. They can optionally specify commands to be issued while in that state. The “resetEvents” section lists events that immediately put the system back into the idle state.

events
      doorClosed        D1CL
      drawerOpened      D2OP
      lightOn           L1ON
      doorOpened        D1OP
      panelClosed       PNCL
end

resetEvents
      doorOpened
end

commands
      unlockPanel       PNUL
      lockPanel         PNLK
      lockDoor          D1LK
      unlockDoor        D1UL
end

state idle
      actions {unlockDoor lockPanel}
      doorClosed => active
end

state active
      drawerOpened => waitingForLight
      lightOn => waitingForDrawer
end  

state waitingForLight
      lightOn => unlockedPanel
end

state waitingForDrawer
      drawerOpened => unlockedPanel
end

state unlockedPanel
      actions {unlockPanel lockDoor}
      panelClosed => idle
end

Here is a diagram that shows the state transitions of the finite state machine described by the above language. (The commands issued in each state are not shown.) The reset events are not shown, as they would make the diagram difficult to read. They would appear as lines from each node below the “idle” node, going back to the “idle” node and labeled with “doorOpened”.
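The state machine that the script describes can be recovered with very little code. The following Ruby sketch is not the book's implementation; it is a minimal line-oriented parser written for this example, with hypothetical names (parse_gothic, final_state) and two assumptions: the machine starts in the state named idle, and an event with no transition from the current state is ignored.

```ruby
SCRIPT = <<~DSL
  events
    doorClosed    D1CL
    drawerOpened  D2OP
    lightOn       L1ON
    doorOpened    D1OP
    panelClosed   PNCL
  end
  resetEvents
    doorOpened
  end
  commands
    unlockPanel   PNUL
    lockPanel     PNLK
    lockDoor      D1LK
    unlockDoor    D1UL
  end
  state idle
    actions {unlockDoor lockPanel}
    doorClosed => active
  end
  state active
    drawerOpened => waitingForLight
    lightOn => waitingForDrawer
  end
  state waitingForLight
    lightOn => unlockedPanel
  end
  state waitingForDrawer
    drawerOpened => unlockedPanel
  end
  state unlockedPanel
    actions {unlockPanel lockDoor}
    panelClosed => idle
  end
DSL

# Hypothetical parser: turns the script into nested hashes.
def parse_gothic(source)
  machine = { events: {}, commands: {}, reset_events: [], states: {} }
  section = state = nil
  source.each_line do |raw|
    line = raw.strip
    next if line.empty?

    if line == "end"
      section = state = nil
    elsif %w[events commands resetEvents].include?(line)
      section = line
    elsif line.start_with?("state ")
      section = "state"
      state = line.split[1]
      machine[:states][state] = { actions: [], transitions: {} }
    elsif section == "events" || section == "commands"
      name, code = line.split
      machine[section.to_sym][name] = code
    elsif section == "resetEvents"
      machine[:reset_events] << line
    elsif section == "state" && line.start_with?("actions")
      machine[:states][state][:actions] = line[/\{(.*)\}/, 1].split
    elsif section == "state"
      event, target = line.split("=>").map(&:strip)
      machine[:states][state][:transitions][event] = target
    end
  end
  machine
end

# Follow a list of event names from "idle" and return the state reached.
def final_state(machine, events)
  events.reduce("idle") do |current, event|
    next "idle" if machine[:reset_events].include?(event)

    machine[:states][current][:transitions].fetch(event, current)
  end
end

machine = parse_gothic(SCRIPT)
puts final_state(machine, %w[doorClosed drawerOpened lightOn])
```

A production parser would also report unknown keywords and undefined state names; this sketch silently skips anything it does not recognize.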

The above language, while perhaps not intuitive, is concise, but flexible, since it can trigger actions based on any series of events (e.g., close the door, open each dresser door in order, tap on the wall three times, turn on the TV, close each dresser door in order, etc.). And it doesn’t have a lot of syntactic noise, like semicolons and continuation characters.

The book mentions the possibility of using XML to configure the Gothic Security system, and quickly rejects it. XML has a lot of syntactic noise, and a lot of opportunities for syntax errors, with its nested tags, and opening and closing angle brackets that have to match, and that sort of thing. This all makes it fairly unfriendly to humans, particularly those who are not programmers by trade.

The book then talks about internal versus external DSLs. An external DSL is a standalone language, like the one described above. An internal DSL (sometimes referred to as an embedded DSL) is a general-purpose language, like Java or Ruby, that is warped, through special usage, into a DSL. For example, here is the same description of a Gothic Security system, but described in specially formatted Ruby.

event :doorClosed,    "D1CL"
event :drawerOpened,  "D2OP"
event :lightOn,       "L1ON"
event :doorOpened,    "D1OP"
event :panelClosed,   "PNCL"

command :unlockPanel, "PNUL"
command :lockPanel,   "PNLK"
command :lockDoor,    "D1LK"
command :unlockDoor,  "D1UL"

resetEvents :doorOpened

state :idle do
  actions :unlockDoor, :lockPanel
  transitions :doorClosed => :active
end

state :active do
  transitions :drawerOpened => :waitingForLight, 
              :lightOn => :waitingForDrawer
end 

state :waitingForLight do 
  transitions :lightOn => :unlockedPanel 
end 

state :waitingForDrawer do 
  transitions :drawerOpened => :unlockedPanel 
end 

state :unlockedPanel do 
  actions :unlockPanel, :lockDoor 
  transitions :panelClosed => :idle
end

This looks very much like the original DSL, with a bit more syntactic noise, but it is actually valid Ruby. The keywords event, command, resetEvents, and state are actually methods. The blocks like “state :unlockedPanel do” use a Ruby construct where the contents of the block, method calls and all, are passed to the method.

Thus it is possible to tack the user’s configuration specification onto the end of the program that defines the methods, and run the whole thing through the Ruby interpreter. This has pros and cons: you don’t have to write a separate compiler for your DSL, but it is syntactically noisier, and the interpreter will probably give confusing messages if the user makes any errors.
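To make this concrete, here is a sketch of Ruby code that could sit behind such a script. It is an assumption-laden variant, not the book's code: the class names GothicConfig and StateBuilder and the define method are hypothetical, and instead of defining top-level methods it evaluates the script against a configuration object using instance_eval.

```ruby
# Sketch of methods behind the internal DSL; all class names are hypothetical.
class StateBuilder
  attr_reader :action_names, :transition_table

  def initialize
    @action_names = []
    @transition_table = {}
  end

  def actions(*names)
    @action_names.concat(names)
  end

  def transitions(table)
    @transition_table.merge!(table)
  end
end

class GothicConfig
  attr_reader :events, :commands, :reset_events, :states

  def initialize
    @events = {}
    @commands = {}
    @reset_events = []
    @states = {}
  end

  def event(name, code)
    @events[name] = code
  end

  def command(name, code)
    @commands[name] = code
  end

  def resetEvents(*names)
    @reset_events.concat(names)
  end

  # The do...end block after "state :name" arrives here as &block.
  def state(name, &block)
    builder = StateBuilder.new
    builder.instance_eval(&block)
    @states[name] = builder
  end

  # Evaluate a whole script so that its bare words call the methods above.
  def self.define(&script)
    config = new
    config.instance_eval(&script)
    config
  end
end

config = GothicConfig.define do
  event :doorClosed, "D1CL"
  command :unlockDoor, "D1UL"
  command :lockPanel, "PNLK"
  resetEvents :doorOpened

  state :idle do
    actions :unlockDoor, :lockPanel
    transitions :doorClosed => :active
  end
end

puts config.states[:idle].transition_table.inspect
```

The trade-off described above shows up here too: a misspelled keyword in the script surfaces as a Ruby NameError or NoMethodError rather than as a message phrased in the user's terms.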

The rest of Chapter 1 deals with semantic models, code generation, language workbenches, and visualization. This is followed by 56 chapters in six parts:

  • Narratives
  • Common Topics
  • External DSL Topics
  • Internal DSL Topics
  • Alternative Computational Models
  • Code Generation

With 57 chapters total, the book seems a bit intimidating at first, and it would be, except for the way it is organized, which is similar to the “design patterns” books in the Martin Fowler Signature Series. The book consists of very short chapters, often just three or four pages, that talk about a particular topic. When that topic is referenced, the page where its chapter starts is given in parentheses, like this:

With each event declaration, I can create an event from the Semantic Model and put it into a Symbol Table (165).

Using the page number in the reference, rather than a chapter or section number, or referring to a footnote, means you do not have to go to the table of contents or look somewhere else to find the page you want. And the book has a built-in ribbon to hold your place, so if you want to spend some time in the referenced chapter, you do not have to keep your finger on the page you came from to hold your position. This seems like a simple thing, but in practice, it makes it much easier to follow the references.

The short chapters, each about a single topic, make it easy to look up what you need to, and skip things you already know or do not need. If you are implementing a DSL on a deadline, rather than taking a college course on compilers, this organization works much better than the older compiler books that would have a large chapter on each general area of compiler design, long on theory and short on practice. This is a book for the working programmer who needs to implement a DSL.

Now, you may never need to implement a DSL. But it is also possible that doing so would make your life a lot easier, and you have just not considered the possibility. This book will help you decide, and show you some of the possibilities, without bogging you down with too much theory.

One parting thought: some of us still spend most of our working lives coding in assembly language. It is possible to implement a DSL in assembly language. (In fact, ColmoneBOL, mentioned earlier, was implemented in assembly language.) But there are so many more tools and facilities available in high-level languages, not to mention faster coding (studies have shown that programmers code about the same number of lines of code per day, regardless of whether they are coding in assembly language or a high-level language, and high-level languages do more per line) and fewer error possibilities, that the cost of learning a new language may be outweighed by the improved efficiency it brings.