
Code Kata 1: The Closer You Get

I’ve already written about Code Katas and how they can help you become a better programmer; this time, I’m getting serious about it.

Like all craftsmen, we need to practice a lot; eight hours of professional software development per day is not enough, especially if a great share of those hours is taken up by email and meetings. A carpenter doesn’t just build houses; he also tries out new ideas on the workbench in his basement. Painters sketch and explore new combinations of paint and color. Photographers do it, musicians do it: everyone who wants to become better at their craft has to practice in a safe and dry place.

Today’s kata is a simple one. Imagine you have a set of numeric values. If I give you another value (which may or may not be within the set), find a value from the set that matches the given value as closely as possible. Don’t write any code yet! Proceed as indicated below.

  1. Think about the problem and draw a picture that illustrates the problem.
  2. What special cases do you see?
  3. Try to come up with a simple algorithm (use pseudo code, flowcharts, whatever you prefer).
  4. Write down a few good test cases and mentally check your algorithm against them. Select a programming language of your choice and code a dummy implementation that always returns the same (wrong) result.
  5. Code your test cases and execute them against your dummy implementation. You don’t have to use unit testing frameworks like JUnit; use the simplest way of comparing results to expected results and flag an error if there is no match.
  6. Watch your tests fail.
  7. Implement your algorithm.
  8. Execute your tests and debug your code until all tests pass.
  9. If you haven’t done it yet: single-step through every line of your code.

Bonus exercise:

  1. Turn your algorithm into a generic routine that is suitable for a library; that is, support different value types (floating point types, integer types).
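
If you want to compare notes once you have worked through all the steps yourself, here is one possible sketch of the bonus routine in C++. The name FindClosest, the tie-breaking rule and the use of std::vector are my own choices, not part of the kata:

    #include <cassert>
    #include <cmath>
    #include <cstddef>
    #include <stdexcept>
    #include <vector>

    // Returns the element of 'values' that lies closest to 'target'.
    // In case of a tie, the element encountered first wins.
    // Throws std::invalid_argument if 'values' is empty.
    template <typename T>
    T FindClosest(const std::vector<T>& values, T target) {
        if (values.empty()) {
            throw std::invalid_argument("FindClosest: empty value set");
        }
        T best = values[0];
        double bestDist = std::fabs(static_cast<double>(values[0]) - static_cast<double>(target));
        for (std::size_t i = 1; i < values.size(); ++i) {
            double dist = std::fabs(static_cast<double>(values[i]) - static_cast<double>(target));
            if (dist < bestDist) {
                best = values[i];
                bestDist = dist;
            }
        }
        return best;
    }

    int main() {
        std::vector<int> set;
        set.push_back(1); set.push_back(5); set.push_back(10); set.push_back(42);
        assert(FindClosest(set, 7) == 5);      // 7 is closer to 5 than to 10
        assert(FindClosest(set, 42) == 42);    // exact match
        assert(FindClosest(set, 1000) == 42);  // value far outside the set
        return 0;
    }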

Comment-Driven Development

Many developers hate documenting their code: they view writing comments as a nuisance, something that slows them down. All they want to do is get the darn code to run; comments just mess up the code and — behold! — are for wimps, anyway.

Sometimes even followers of this school of thought write comments, but these comments are frequently a mixture of self-indulgence, inside jokes, or just plain insults.

Then there are folks who take the exact opposite position. They want comments everywhere. They believe that comments should be so dense that by reading the comments alone they would get a perfect understanding of the whole system. Never would they have to look at such ugly things as — behold! — source code again.

Like it or not: I have observed that many of the highly gifted developers belong to the first group.

But let’s face it: neither position is right. As professional software developers, we need to understand that we have an obligation to preserve the value that we create. Good comments do help in understanding the code and thus improve maintainability; and we are in maintenance mode most of our lives. As the old saying goes: “Be nice to the next guy”; and often that next guy is you (or me).

On the other hand, crafting good comments is hard and time-consuming, and as professional software developers we also have an obligation to spend the resources of our companies or clients prudently. Therefore, we need to strike a balance: sometimes it is best to leave the nitty-gritty details to the code.

My rules of thumb so far have been:

1. Focus on good layout and self-explanatory identifier names.
2. Explain the WHAT and not the HOW (see the example right after this list).
3. Comment surprises; that is, unexpected/unusual things.
4. Comment all non-private parts (public, protected, default).
5. The more public, the more detailed the documentation has to be.
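
To make rule 2 a little more concrete, here is a small illustrative example; the function and every name in it are invented for this purpose and are not from the original post:

    #include <cctype>
    #include <string>

    // WHAT: Returns true if 'name' is a syntactically valid user name,
    // i.e. non-empty and consisting of letters and digits only.
    // It does not check whether the user actually exists.
    //
    // A HOW-style comment would merely narrate the loop below
    // ("walk over all characters and test each one with isalnum"),
    // which adds nothing that the code doesn't already say.
    bool isValidUserName(const std::string& name) {
        if (name.empty()) {
            return false;
        }
        for (std::string::size_type i = 0; i < name.size(); ++i) {
            if (!std::isalnum(static_cast<unsigned char>(name[i]))) {
                return false;
            }
        }
        return true;
    }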

I used to defer writing API documentation comments as long as possible. My reasoning was that otherwise I would have to update/rewrite the API documentation while iteratively developing my code, which I considered a big waste of time and energy. But I’ve changed my mind completely in this respect.

I strongly believe that writing the API documentation before doing any coding is of great benefit. By clearly spelling out the purpose of a piece of code, developers engage in a brainstorming session with themselves that leads to further insights. As long as this purpose cannot be stated concisely, there is no point in doing any coding. Let me repeat that: if you cannot crisply describe WHAT a method or class is supposed to do, you shouldn’t start implementing it. Granted, most developers do have at least a partial model in their minds before they start coding. If they also wrote it down, they would win in several ways.
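
As a hypothetical illustration (the function is invented, not taken from the original post): an API documentation comment written before a single line of the body exists forces you to pin down exactly that WHAT:

    #include <string>

    /// Returns a copy of 'text' with all leading and trailing whitespace
    /// removed and every inner run of whitespace collapsed to a single blank.
    /// An empty or all-whitespace input yields an empty string; 'text' itself
    /// is never modified.
    ///
    /// Written before the implementation exists: if this paragraph cannot be
    /// stated crisply, the coding shouldn't start yet.
    std::string normalizeWhitespace(const std::string& text);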

Naturally, even API documentation comments need to be developed iteratively and often need some fine-tuning during the course of implementing the API. In fact, the process is very similar to test-driven development: one of the biggest benefits of TDD is that by writing tests before doing any coding, you imagine (and in fact see and experience) early on how the interface is used by clients; that’s valuable feedback which helps make interfaces better. The same holds for writing down the WHAT of a method or class.

The notion of writing comments or documentation before doing the coding can be extended to the implementation of a routine as well. In his landmark book “Code Complete”, Steve McConnell describes a comment-first process for implementing routines, the “Pseudocode Programming Process”. You start by typing in the high-level steps that the routine takes, in plain English, using vocabulary from the problem domain:
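
(The original listing is not preserved in this copy. A reconstruction in the same spirit, using the CalculateTaxRate() routine that the closing note below refers to, might start out like this; the threshold handling is my guess, based on that note.)

    double CalculateTaxRate(double income)
    {
        // if the income is below the tax-free threshold
        //     report an error to the caller
        // otherwise
        //     look up the tax bracket that the income falls into
        //     return the rate of that bracket

        return 0.0; // placeholder so that the skeleton compiles
    }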

Once you are satisfied with your pseudo code, you fill in the real code. As a final step, you turn the original pseudo code into comments — voilà!
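
(Continuing the reconstructed example, the filled-in routine might end up like this; TAX_FREE_THRESHOLD, TaxBracket and findTaxBracket() are invented just to make the sketch self-contained.)

    #include <stdexcept>

    // Invented supporting code, not from the original post.
    const double TAX_FREE_THRESHOLD = 8000.0;
    struct TaxBracket { double upperLimit; double rate; };

    const TaxBracket& findTaxBracket(double income) {
        static const TaxBracket brackets[] = { { 30000.0, 0.20 }, { 1e12, 0.40 } };
        return (income <= brackets[0].upperLimit) ? brackets[0] : brackets[1];
    }

    double CalculateTaxRate(double income)
    {
        // if the income is below the tax-free threshold, report an error
        if (income < TAX_FREE_THRESHOLD) {
            throw std::invalid_argument("income below tax-free threshold");
        }
        // otherwise, look up the tax bracket that the income falls into
        const TaxBracket& bracket = findTaxBracket(income);
        // and return the rate of that bracket
        return bracket.rate;
    }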

[Note: I know that this example is a little bit contrived. A cooler implementation of CalculateTaxRate() would return a tax rate of zero in case the income is below the threshold. This would result in shorter, branch-free code.]

Writing good comments is essential for every professional software developer. Since writing comments is hard, it is usually skipped when treated as an afterthought, especially under deadline pressure. Done upfront, it can serve as a valuable design tool that yields great documentation for free.

Playgrounds

I can still remember the day when two colleagues had an argument about a piece of code similar to this one:
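
(The original snippet is not preserved in this copy; it was something along these lines, with names and values reconstructed by me.)

    void countMatches(int n)
    {
        for (int i = 0; i < n; ++i) {
            int k = 0;          // initialized once, or on every iteration?
            k += i;
            // ... do something with k ...
        }
    }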

While one of them argued that ‘k’ was initialized upon every loop iteration, the other claimed that the variable was only initialized (to zero) once. After almost a quarter of an hour, they wanted to hear my opinion on the matter.

I didn’t tell them the answer, but instead asked: “Why don’t you just try it out? It shouldn’t take more than a minute…”. They looked baffled, so I did it on my machine while they were watching. Granted, it actually took me two minutes, but, hey, I was not used to working in front of such a big audience.

During a typical development day, lots of questions arise. Some are simple, some are not so simple, and some of them are really tough. In many cases it is necessary to look up the missing information in a manual; in other cases it is best to consult a friend. But very often, one should just try things out in a safe place. That’s learning by doing, or — ideally — learning by playing, which is probably the best way of acquiring knowledge.

Playing should be fun, but at the very least it has to be painless — otherwise it won’t be done. If you have to close down/restart your IDE and/or go through an eight-step wizard to create a toy project before you can start, you won’t do it. Experimenting shouldn’t be disruptive — it should have as little impact on your main work as possible. So what we need is a safe place that is easily accessible and fun to work in — a playground.

Let me tell you about my solution, my playground. A virtual desktop manager (VDM) is at its center and I wouldn’t want to live without it anymore.

Normally, you only have one desktop, but by using a VDM you get many, as many as you desire. Most Linux desktops like KDE and GNOME already support virtual desktops out of the box, and when I’m doing development on a Windows machine, I use VirtuaWin (virtuawin.sourceforge.net).

The main benefit of VDMs is that they allow you to separate different activities: you can go from one activity to another without closing/reopening applications; restoring context is effortless: just press the corresponding shortcut to switch from one desktop to another.

For me, four virtual desktops are sufficient: desktop 1 is for my main development work, desktop 2 for any urgent, higher-priority interrupts (like fixing the build breaker that I checked in last night — damn!), desktop 4 for mail, instant messaging and Internet browsing. Desktop 3, if you haven’t guessed it by now, is my playground.

On desktop 3 there is always a terminal waiting for me, and its current directory points to ~/pg, short for ‘playground’. Within this directory there are several subdirectories.

Each of these directories contains various code snippets, results from experiments that I did to find out certain aspects of a programming language, tool or environment.

In most directories I keep a write-protected file named ‘pg’, which serves as a default template to save me from having to type the same boilerplate code over and over again. As an example, here is the ‘pg’ file of the ‘cpp’ subdirectory:
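
(The original file is not reproduced here; a minimal template of this kind might look like the following.)

    // pg -- write-protected C++ boilerplate; copy it, rename the copy, start playing.
    #include <iostream>

    int main(int argc, char* argv[])
    {
        using namespace std;

        // experiment goes here

        return 0;
    }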

With a playground like this in place, tackling the loop initialization problem becomes easy:
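
(My reconstruction, not the original session: copy the template, paste in the disputed loop, add a print statement, compile and run.)

    // loop.cpp -- a copy of 'pg' with the disputed loop pasted in
    #include <iostream>

    int main(int argc, char* argv[])
    {
        using namespace std;

        for (int i = 0; i < 3; ++i) {
            int k = 0;
            k += i;
            cout << "k = " << k << endl;
        }

        return 0;
    }
    // Prints k = 0, k = 1, k = 2: 'k' is freshly initialized on every
    // iteration (otherwise the values would accumulate to 0, 1, 3).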

Actually, it’s even easier, as I have defined some shortcuts that let me compile and run code from within my editor; that is, without having to alt-tab to my bash console.

But this is just my playground, a place that gives me lots of joy. Maybe you work with Microsoft’s Visual Studio or Eclipse. In that case you could place links to template projects on your desktop, or put batch files in your playground directories that launch a new instance of your IDE and load a template project. Set up your playground such that it is a fun place for you, but make sure you have one.

Like I said earlier, I keep my experiments for future reference, and I suggest you do so, too. Over time, you not only improve your programming kung fu, you also build up a valuable knowledge repository.

Single-step your code!

I was so proud: after I had gotten rid of some minor compile-time issues (i.e. typos), my unit tests ran over my newly written code without any errors. Granted, the changes I made comprised less than 500 lines, but still, it meant something to me. Feeling happy and content, I hummed R. Kelly’s “The World’s Greatest” while carrying on with other work.

A few days later, I wrote some black-box tests and — to my big surprise — I got a couple of “fails”. After some debugging, I found more than five bugs in the code that had passed my unit tests so nicely. I was completely puzzled. What went wrong? Why didn’t my unit tests catch these trivial bugs?

As it turned out, I had forgotten to register my test code with the CppUnit test framework, so my tests were not executed at all! Once I had added the missing registration line to my test suite class, all five bugs surfaced in an instant. I was so angry! My first reaction was to curse CppUnit: with JUnit this would not have happened. I would have used the @Test annotation and my test would have been auto-registered — unless I had forgotten to tag it with the @Test annotation…

Later that day, I realized that the actual mistake was a violation of Steve Maguire’s powerful principle: “Step through every line of code that you added or changed with your debugger”. Had I set a breakpoint in my code, I would have seen that it was never executed.

Years ago, I used to be a passionate follower of this principle, but somehow unlearned it, largely — I presume — due to the rising unit testing hype. Don’t get me wrong: I think that unit testing is great (and black-box testing is great, too), but it is no replacement for single-stepping through your code.

Reviewing your own code is good, but actually stepping through your code is much cooler. The cursor showing the next statement to be executed focuses your attention, and you really experience the program flow instead of having to make guesses about it. Further, you have all the data available and you can even modify it. You can invoke functions from your debugger (e.g. ‘call myfunc()’ in gdb), play with different combinations of parameters, member variables and the like, or re-execute code you have just run, without restarting the debugger, by setting the “next statement to execute” a couple of lines up. Probably the biggest benefit is that you get a deeper understanding of your code: maybe you step over a library call that works as expected but takes two seconds to execute; or you observe that you unnecessarily visit the remaining elements of a collection after you have found what you were looking for — no unit test would give you this kind of insight.

Often, it is difficult to unit test for certain failure causes, like malloc() returning NULL on out-of-memory conditions:
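
(The original example is not preserved; a typical case looks like this, with CopyString being a made-up name.)

    #include <cstdlib>
    #include <cstring>

    char* CopyString(const char* src)
    {
        char* dst = static_cast<char*>(std::malloc(std::strlen(src) + 1));
        if (dst == NULL) {
            // out of memory -- how do you get a unit test to reach this branch?
            return NULL;
        }
        std::strcpy(dst, src);
        return dst;
    }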

How would you unit test that? Such error handling code is usually left untested and is the reason why so much software crashes under heavy load. While you’re in a debugger, testing is easy: just set the “next statement to execute” to the error-handling code (right before stepping over the call to malloc), step through it and convince yourself that it works as expected. Again, how would you unit test that? Answer: factor out the error-handling code:
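
(Sticking with the made-up CopyString() example, the factoring might look like this; HandleOutOfMemory() is, of course, also invented.)

    #include <cstdio>
    #include <cstdlib>
    #include <cstring>

    // Now callable (and thus testable) on its own, without provoking a real
    // out-of-memory situation.
    char* HandleOutOfMemory(const char* where)
    {
        std::fprintf(stderr, "out of memory in %s\n", where);
        return NULL;
    }

    char* CopyString(const char* src)
    {
        char* dst = static_cast<char*>(std::malloc(std::strlen(src) + 1));
        if (dst == NULL) {
            return HandleOutOfMemory("CopyString");
        }
        std::strcpy(dst, src);
        return dst;
    }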

Now, you can call your error handling code from your unit tests. Still, testing the code by using the debugger is easier, doesn’t require any context set-up and gives more insight.

It helps, of course, if you write your code such that debugging is as painless as possible. A line like this is fine:
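
(The original one-liner is lost; this fragment is a reconstruction that uses invented names alongside the convertSensorReading() mentioned below.)

    // correct, but none of the intermediate results can be inspected while
    // single-stepping (readRawValue, calibrationFactor and offset are invented)
    double temperature = convertSensorReading(readRawValue(sensorPort) * calibrationFactor + offset);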

But writing it like this is (probably) more readable, and you can inspect (and alter) intermediate values in your debugger:
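
(Again a reconstructed fragment with the same invented names.)

    // the same computation, spread out: every intermediate value can be
    // inspected, and altered, while single-stepping
    int raw = readRawValue(sensorPort);
    double calibrated = raw * calibrationFactor + offset;
    double temperature = convertSensorReading(calibrated);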

If you think this is too much typing, get better at typing and/or get yourself a better editor. If you think this wastes lines of code, bear in mind that we don’t live in the 1970s anymore. If you think that you can always step inside convertSensorReading() and inspect/change the parameters there, you are right, at least as long as you have access to the source code of the function you want to step into.

Macros are bad because you cannot step into them. Use them only if you have no other choice; prefer (inline) functions and template functions instead: they come with the same efficiency advantages, and you get type safety and debuggability as a bonus.
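
A classic illustration (my example, not from the original post):

    // You cannot step into the macro, and its argument is not type-checked.
    #define SQUARE(x) ((x) * (x))

    // Same efficiency once inlined, but type-safe and visible to the debugger.
    template <typename T>
    inline T square(T x)
    {
        return x * x;
    }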

And, speaking of the preprocessor, stop using #define’d symbolic constants: all preprocessor symbols are substituted away during the preprocessing phase, and I don’t know of any debugger that can resolve their values. Instead, use enums or, even better, const variables:
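
(The original listing is gone; judging from the next sentence, it contrasted a #define with a const along these lines. MIN_COUNT’s value is a stand-in; MAX_COUNT’s value follows from “the answer”.)

    #define MIN_COUNT 10        // preprocessor symbol: gone after preprocessing,
                                // so the debugger has nothing to show
    const int MAX_COUNT = 42;   // a real constant with a real address: the
                                // debugger can display (and watch) its value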

Mouse over MIN_COUNT in your debugger and you will see nothing; mouse over MAX_COUNT and you will get “the answer” ;-)

Automated unit tests are great, but stepping through your code gives quick feedback and a lot of insight into what is happening at run-time. Sometimes, hard-to-write unit tests can be avoided by consistently following the “step through all of your code” paradigm. As a simple guideline, write unit tests — if you like — before starting with the implementation. Then single-step your code by executing your unit tests in a debugger, and watch your step.

The Answer To The Last Question

Today is Towel Day, but due to higher priorities I have to celebrate this important day all by myself. I can’t make it to Innsbruck this year, but I swear to you that I’m wearing my towel around my neck while I’m typing this blog entry, which I dedicate to Douglas Adams.

Few people know that his idea about “The great question of life, the universe, and everything” (to which the answer is, as everybody knows, “forty-two”) was in fact a little bit inspired by Isaac Asimov’s great short story “The Last Question”, in which generation after generation builds ever more powerful computers to find out how to reverse entropy and thus prevent the universe from becoming an infinite, starless nothingness. “The Last Question” is a great read with a surprising ending. I won’t spoil it, don’t worry.

While it is impossible for humans to stop “real” entropy from increasing (let alone reverse it), it is certainly doable in the software world. But how?

It’s not by carrying out the big refactorings and redesigns that nobody wants to do and that no profit-oriented organization can afford: for months, valuable resources are busy cleaning up instead of implementing the cool features that customers are willing to pay for. It’s the small stuff that counts: the teeny-weeny improvements that you make on a regular basis. As James O. Coplien said: “Quality is the result of a million selfless acts of care.”

I very much like Uncle Bob’s boy scout rule analogy: “Always leave the campground cleaner than you found it”. This principle is helpful for life in general and software development in particular. If the system you are working on is a complete mess, don’t resign. Even if you just improve a comment or rename a badly named variable, you have made the system better. Then, if everybody acts like you, software entropy will be reversed.

The Code is the Design, Baby!

What exactly is the “design” of a software system? Is it a detailed UML diagram? Is it 300 pages of prose? I recently came across an article from 1992 by Jack W. Reeves, mentioned in an interview on the subject of TDD; in his article, Jack argues that the design is actually the source code itself.

What!? The code is the design? This proposition is surprising at first, but not really if you think about it: only the tested and debugged code is a precise description of how the system behaves and how it is to be built. And the process of “building” a software system is merely the feeding of that code to tools like make, Ant, compilers and linkers. Thus, “designing” software is hard; building it is straightforward.

Often, people demand that the design should be a UML diagram, something that is easy to read and comprehend by anyone. However, for any non-trivial system, the diagram(s) would become at least as complicated as the real code. It’s always the same: you start out with a couple of classes that have only a few relations, and everything looks nice; but by the time you have 200 classes, you suddenly cannot see the forest for the trees anymore. UML is good for highlighting certain parts and aspects of a system, but a complete specification, one that can be handed down to the “code monkeys”, is a myth. Even the author of the most successful UML book ever, Martin Fowler, is very skeptical about UML as a blueprint design tool. As Fred Brooks pointed out long ago, “software is not inherently embedded in space and has no ready geometric representation” (like hardware or buildings do).

There are technical issues, too. How do multiple “designers” collaborate on a UML design? How do they create parallel versions and merge them when conflicts arise? How do they keep track of variants? Countless tools exist for designs based on plain text (a.k.a. source code); next to nothing exists for UML.

Why do so many folks fall prey to this “everything-has-to-be-designed-in-UML” idea? Some (most likely managers or sales people) simply confuse programming with typing. Others (most likely developers who never got the hang of programming) want to introduce a white-collar/blue-collar developer system. Yet others think that there must be a method (in this case UML) that renders a complex system in such a way that it can be understood fully by anyone (including managers, sales people and developers who never got the hang of programming).

The design of a modern airplane comprises hundreds of thousands of pages. Nobody can, and nobody has to, comprehend everything. But even laymen can look at the 3D model and understand a lot: ah, this is where the engines are, and yep, here we have the anti-lock brake system and, boy-oh-boy, here we have the flight controller.

Such design overview diagrams are very important for comprehension and communication and no source code can ever replace them. It’s important to have such a diagram (call it high-level design or system architecture diagram or whatever you like). If you want, you can draw it in UML but I suggest you don’t: UML is way too boring. Have a look at the Google Android architecture diagram — it’s shiny, it’s been reduced to the absolute minimum, it’s freakin’ awesome!

Is there a need for designers, anyway? I definitely think so! You need somebody who takes care of the big-picture technical decisions, like the high-level composition of a system, the coding conventions, the hardware decisions and so on. Let’s call this person the systems architect. And, of course, you need somebody who takes care of the look & feel, the user interface, a Steve Jobs kind of person. And, last but not least, you need many small-to-medium-scale “designers” who iteratively detail the “design” through so-called “programming”, such that compilers, linkers and the like can build the rest easily.