Documenting is a Team Sport, Too!

Everyone likes good documentation — unless they have to write it themselves, right?

One reason for this is that writing good documentation is hard, very hard in fact. It took Joseph Heller eight years to complete “Catch-22” and many other novels took even longer to write. As a countermeasure, some authors use a pipelined approach to writing (see Gerald M. Weinberg, “Weinberg on Writing: The Fieldstone Method“) that nevertheless allows them to release in shorter time-frames by working on many projects in parallel.

Speaking as a developer, documentation gets into my way of doing other, more enjoyable things, like, well, coding, for instance. I’m writing (!) this article to the defense of the poor chap who has been given the ungrateful job of writing version 1 of a document.

Imagine this situation. One of your team members, let’s call him Jack, is given the task of finding out how to setup a new development environment for some embedded Linux development board. After a week of trial and error he finally gets everything to work properly. Now — of course — he is expected to document what he did so that everyone else on the team can set up their own boards, too. Being a professional developer he sits down and types away; an hour later, he is finished.

What happens next is typical: Harry, the first guy who tries out Jack’s HOWTO, runs into problems. Not one — many. In some cases essential steps are missing, while others are confusing or just plain wrong.

Harry is upset. He runs about and whines how bad the documentation is, what a poor job Jack did and how unfair life is in general…

For sure, in a perfect world, Jack would have written a perfect document that lays out the shortest route from A to B; it would be instructive, entertaining, a work of great pedagogical value. In real life, Jack is exhausted. He had been a pioneer for an extended period of time, tried out many things that didn’t work, suffered hours of frustration and got stuck in a rut many times. Most likely he operated under time pressure and even more likely he doesn’t exactly remember what he did (and did not). Isn’t it a bit too much to expect that he now takes the perspective of the uninitiated developer and writes the perfect manual?

In my view, Harry shouldn’t complain — he should rather spend his energy on improving the document. He benefits tremendously from Jack’s pioneering work and I think it is only fair if he contributes a share. And what he can contribute is something that the original author can’t: When he reads Jack’s document his mind is fresh and clear, without any assumptions, so he is the best person to tune the document for the same kind of audience. And Jack is always there to support him — provided Harry didn’t insult him for not doing his job properly…

But even the next guy after Harry might spot mistakes or inconsistencies; and many month later people will discover parts that are obsolete because the environment has changed in the meantime. Then, it is their job to clean up; they are again the best persons to do it.

Writing good documentation takes both, a different mindset and time; and as the writer’s saying goes: “All good writing is rewriting”. Especially in an agile environment it is a bit too much to expect to get everything from a single person. XPers have long been used to this mode of software development through the principles of collective ownership and refactoring. I believe, these principles apply to writing documentation as well.

Where Richard Feynman Was Wrong

I’ve always been a great admirer of Richard Feynman. Too me, his intelligence combined with his ability to explain even the most complicated facts in easy-to-grasp words is unparalleled.
When he was asked what his recipe for solving problems was, he gave the following advice, which has become known as the “Feynman approach to problem solving”:

1. Define the problem.
2. Sit down and think hard about the problem.
3. Write down the solution.

This is a good example of why I like him so much: he was a joker, a prankster, a guy who never took himself and life too seriously.

Alas, according to what we know about how our brain works, his advice doesn’t work, at least not for really hard problems.

While focusing on the topic and tormenting your brains works for many problems (logic problems, like solving typical math problems or Sudokus), solving hard problems requires just the opposite: complete detachment from the problem.

The reason for this counterintuitive approach is that the part of our brain that solves hard problems (the creative part) is not only slow, but also works asynchronously. In fact, thinking hard about a problem is more than useless: it actually disturbs the the creative part and often prevents it from doing its job.

Does this mean you shouldn’t think about the problem at all? By no means! You should try to gather all kinds of information and facts about a problem, without paying attention to possible solutions. Just load your brains with information and than get away from the problem. Go for a walk, take a nap, or have a beer. Don’t stare at the screen for hours. Relax, even if it is hard. I know, this is the hardest part about solving hard problems.

Habits for Safer Coding

If you have been coding in C and C++ for some time, you know that it is easy to introduce bugs: type in a single wrong character and you wreak havoc. It is therefore a good idea to avoid using certain (dangerous) language features or at least using them in a way that is safer.

My favorite example is always putting braces around blocks, even if the block only consists of a single statement:


if (retries <= MAX_RETRIES) {
    enableOutput();
}

if (retries <= MAX_RETRIES) {

enableOutput();

}

I’ve witnessed cases where really experienced programmers omitted these two “redundant” characters for brevity only to be later faced with this classic bug:


if (retries <= MAX_RETRIES)  // Added today.
    enableOutput();          // Added today.
    restartEngine();         // Added two weeks later.

if (retries <= MAX_RETRIES) // Added today.

enableOutput(); // Added today.

restartEngine(); // Added two weeks later.

It is so easy to make this mistake — especially during merging — that MISRA (Guidelines for the Use of the C++ Language in Critical Systems) made curly braces around if/else/do/while blocks a “required rule”. Note that by following this rule, you are also protected from “dangling else” mistakes:


if (a < b)
    if (c < d)
        foo();
else    // This else actually belongs to the inner 'if'
    bar();

if (a < b)

if (c < d)

foo();

else // This else actually belongs to the inner 'if'

bar();

In general, I follow such “best practices” if they meet these criteria:

– They must solve a real problem
– They must not require any thinking

The “always-put-braces-around-blocks” rule does solve a real problem: I’m not aware of any compiler (not even Sun’s Java compiler) that warns you about the mistake shown above. (Of course, static code analysis tools (like PC-Lint) do check for these errors, and many, many more, but unfortunately only few developers use them.)

Does it require any thinking? I don’t think so. The only thing I have to remember is this: whenever I use ‘if’, ‘else’, ‘for’, ‘while’, ‘do’ I put curly braces around my code. That’s all!

Maybe you have come across this convention:


if (MAX_RETRIES == retries) {  // Constant comes first.
    ...
}

if (MAX_RETRIES == retries) { // Constant comes first.

...

}

Advocates of this style reason as follows: in C/C++ you can easily — accidentally — use an assignment operator when you actually wanted to use an equality operator. Since you cannot assign to a constant, the compiler will refuse to compile your buggy code:


if (MAX_RETRIES = retries) // Good: caught at compile-time.
if (retries = MAX_RETRIES) // Bad: not caught.

if (MAX_RETRIES = retries) // Good: caught at compile-time.

if (retries = MAX_RETRIES) // Bad: not caught.

I must admit that I’ve never liked this convention. One reason is that it looks unnatural. Another is that it lacks symmetry; that is, it is not consistent with other comparison operators. What if you need to change the comparison some day to also include values greater than MAX_RETRIES?


if (MAX_RETRIES <= retries)     // Looks strange.
if (retries >= MAX_RETRIES)     // Looks natural, but requires an
                                // argument swap.

if (MAX_RETRIES <= retries) // Looks strange.

if (retries >= MAX_RETRIES) // Looks natural, but requires an

// argument swap.

Or what if MAX_RETRIES is changed from a compile-time constant to a “normal” variable whose value is obtained from a config file? Do you swap then? In fact, this convention doesn’t provide any help for mistakes in variable/variable comparisons, which occur also very frequently in day-to-day programming:


if (cfgMaxRetries = retries) // Bug!

if (cfgMaxRetries = retries) // Bug!

For my taste, the utility of this convention is rather limited and it requires too much thinking: “If I compare a compile-time constant (or a literal) against a variable using the equality operator, I put the constant first”. Isn’t this a clear hindsight rule? It’s a bit like attaching a tag saying “always lock the door” to your key.

Why not, instead, turn this into an habit: “If I do an equality comparison, I double check that there are two equal signs”. That’s not any more difficult to adopt, doesn’t impair readability and has the additional benefit of working in all cases — not just when comparing against constants.

I’m prepared to follow even conventions that look funny if they solve a real problem but I don’t think that this convention does: today, most compilers produce warnings when you make this kind of mistake (for GCC it’s -Wparenthesis, which is part of -Wall).

In C, there are numerous ways to make silly mistakes (in C++ there are many, many more). Instead of using questionable practices that address one tiny area, it is much better to enable all compiler warnings or — even better — use static code checkers like PC-Lint. In my opinion, that’s the most worthwhile habit for safer coding of all.

When bytes bite

Not long ago I had a discussion with a friend who had just bought a new 2 TB hard disk. He complained that those hard disk manufacturers always cheat: they sell you 2 TB but in the end the drives only have 1.8 TB of memory available. For them, a kilobyte comprises only 1000 bytes and not 1024, as for the rest of mankind.

We use the term “byte” everyday but it is surprising how man developers don’t know exactly what a byte is. “Byte” — and especially byte quantifiers like kilo, mega, and giga — seem to be surrounded by many misuses and misconceptions.

Traditionally, a byte is a collection of bits used to encode a single character for a system. It could be 4, 7, 8, 9, or any other number that the designer of a system happens to choose. This is the main reason for the CHAR_BITS symbolic constant in ISO C’s limits.h: it specifies precisely how many bits there are in a character (char).

Today, of course, we can safely assume that a byte comprises exactly 8 bits, but it is important to note that this is no universal, standardized definition. That’s why the term “octet” is used in RFCs and ASN.1: an octet is defined to be always 8 bits.

But what the heck is a kilobyte? Is it 1024 bytes? Or 1000? We use byte quantifiers so frequently, but not always correctly.

Some folks use the well known SI prefixes to mean powers of 1024:


1 kB = 1024^1 bytes = 1024 bytes
1 MB = 1024^2 bytes = 1048576 bytes
1 GB = 1024^3 bytes = 1073741824 bytes
...

1 kB = 1024^1 bytes = 1024 bytes

1 MB = 1024^2 bytes = 1048576 bytes

1 GB = 1024^3 bytes = 1073741824 bytes

...

While hard disk manufacturers usually have a different definition:


1 kB = 1000^1 bytes = 1000 bytes
1 MB = 1000^2 bytes = 1000000 bytes
1 GB = 1000^3 bytes = 1000000000 bytes
...

1 kB = 1000^1 bytes = 1000 bytes

1 MB = 1000^2 bytes = 1000000 bytes

1 GB = 1000^3 bytes = 1000000000 bytes

...

Which makes quite a difference, especially if sizes grow larger: for a 1 TB hard disk you might get about 10% less than you thought you’d paid for…

But don’t blame it on the hard disk manufacturers. They’re right, of course. Using SI prefixes to mean ‘powers of 1024’ is strongly discouraged by the SI institute.

Still, it is sometimes useful to use powers of 1024 but how should this be done without annoying the SI institute, or — more importantly — without confusing people?

Fortunately, there is an ISO standard (with roots back to work done by IEC in 1998) that addresses this problem: ISO/IEC IEC 80000-13:2008.

According to this standard, you use binary prefixes like “kibi” and it’s friends, where kibi is short for “kilo binary”; and instead of using SI’s “k” prefix you use “Ki”. Have a look at this table:


1 KiB   1 kibibyte  1024^1 bytes = 1024 bytes
1 MiB   1 mebibyte  1024^2 bytes = 1048567 bytes
1 GiB   1 gibibyte  1024^3 bytes = 1073741824 bytes
...

1 KiB 1 kibibyte 1024^1 bytes = 1024 bytes

1 MiB 1 mebibyte 1024^2 bytes = 1048567 bytes

1 GiB 1 gibibyte 1024^3 bytes = 1073741824 bytes

...

Binary prefixes can (and should) be applied to any other unit when the value is based on 1024: 1 Mib/s is 1024^2 bits per second, while 1 Mb/s is 1000^2 bits per second.

The international system of units is a wonderful achievement of the 20th century. We, especially as developers, should honor it and finally stop misusing its well-defined prefixes and instead use ISO binary prefixes.

Dangerously Confusing Interfaces

Designing intuitive interfaces that are easy to use and easy to learn is hard, often very hard; and for economic reasons it might not always be possible to strive for perfection. Nevertheless, in my view, at the very least, interfaces should be designed such that obvious, day-to-day usage doesn’t lead to damage.

In his classic book “Writing Solid Code”, Steve Maguire calls confusing interfaces that lead to unexpected bugs “Candy Machine Interfaces”. He tells a story from a vending machine at Microsoft that used to cause him grief: The machine displayed “45 cent” for “number 21”, but after he had finally inserted the last coin he would sometimes enter “45” instead of “21” (and would get a jalapeño flavored bubble-gum instead of the peanut butter cookie that he wanted so much — Ha Ha Ha!). He suggests an easy fix: replace the numeric keypad with a letter keypad and no confusion between money and items would be possible anymore.

The other day I did something like this:


rsync -r /media/backup/gamma/ /home/ralf

rsync -r /media/backup/gamma/ /home/ralf

My goal was to recursively copy the ‘gamma’ folder to my home folder. What I expected was a ‘gamma’ folder within my home directory, but instead I ended up with hundreds of files from the ‘gamma’ directory right at the top-level of my home directory — the ‘gamma’ directory simply wasn’t created!

I have to confess that similar things sometimes happen to me with other recursive-copy-like tools, too — this seems to be my candy machine problem. Now you know it.

As for ‘rsync’, there is a feature that allows you to copy just the contents of a directory, without creating the directory, flat into a target directory. Granted, this is sometimes useful, but do you know how to activate this mode? By appending a trailing slash to the source directory! That’s what happened in my case. But I didn’t even add the slash myself: if you use Bash’s TAB completion (like I did) a trailing slash is automatically appended for directories…

But good old ‘cp’ puzzles me even more. If you use it like this


cp -r /from1/from2/from3 /to1/to2

cp -r /from1/from2/from3 /to1/to2

it will copy ‘from3’ to a folder named ‘to2’ under ‘to1’ such that both directories (‘from3’ and ‘to2’) will have the same contents, which is more or less a copy-and-rename-at-the-same-time operation. Unless ‘to2’ already exists, in which case ‘from3’ will be copied in ‘to2’ resulting in ‘to1/to2/from3’. Unless, as an exception within an exception, there is already a ‘from3’ directory under ‘to2’; in this case ‘cp’ will copy ‘from3’ flat into the existing ‘to2/from3’ which might overwrite existing files in that folder.

Both, ‘cp’ and ‘rsync’ suffer from fancy interfaces that try to add smart features — which is normally good — but they do it in an implicit, hard-to-guess, hard-to-remember way — which is always bad. Flat copies are sometimes useful but they might be dangerous as they could inadvertently overwrite existing files or at least deluge a target directory. A potential cure could be an explicit ‘–flat’ command-line option.

To me, a wonderfully simple approach is the one taken by Subversion: checkouts are always flat and I’ve never had any problems with it:


svn checkout http://someurl.com/marble/trunk ~/work/marble

svn checkout http://someurl.com/marble/trunk ~/work/marble

This copies (actually checks-out) the contents of the ‘trunk’ flat into the specified destination directory — always, without any exceptions. That’s the only thing you have to learn and remember. There are no trailing backslashes or any other implicit rules. It will also create the target parent directories up to any level, if needed.

Naturally, dangerously confusing interfaces exist in programming interfaces, too. Sometimes the behavior of a method depends on some global state, sometimes it is easy to confuse parameters. The ‘memset’ function from the C standard library is a classic example:


memset(buffer, 32, 40);

memset(buffer, 32, 40);

Does this put 40 times the value of 32 in ‘buffer’ or is it the other way around?

I have no idea how many programmers don’t know the answer to this question or how many bugs can be attributed to this bad interface, but I suspect that in both cases the answer must be “way too many”. I don’t want to guess or look up the specification in a manual — I want the compiler to tell me if I’m wrong. Here is an alternative implementation:


typedef struct {
    char fill;
} memset_fill_t;

void memset(void* p, memset_fill_t fill, size_t n);

typedef struct {

char fill;

} memset_fill_t;

void memset(void* p, memset_fill_t fill, size_t n);

Now you write


memset_fill_t fill = { 32 };
memset(buffer, fill, 40);

memset_fill_t fill = { 32 };

memset(buffer, fill, 40);

If you confuse the fill character with the length parameter the compiler will bark at you — a parameter mix-up is impossible. Even though this is more to type than the original (dangerous) interface: it is usually worth the while if there are two parameters of the same (or convertible) type next to each other.

Like I said in the beginning: designing intuitive interfaces is hard but spending extra effort to avoid errors for the most typical cases is usually a worthwhile investment: don’t make people think, make it difficult for them to do wrong things — even if it sometimes means a little bit more typing.

The Principle of Proximity

It is common wisdom that opposites attract. In programming, however, it is desirable to keep things that are related together — that’s at least what the “Principle of Proximity” states.

This principle has many manifestations, some of which are well known by most software developers, for instance:

-Keep the documentation (comments) as close as possible to the code;
-Initialize variables as close as possible to the point where you use them;
-Limit the scope of declarations (i. e. use namespaces and don’t make constants public if private is sufficient);

As opposed to opposites, related things not always attract, or — as a matter of fact — attract in a suboptimal way.

Here is an example. Assume that you have to process a list of different objects (let’s call them “boxes”, for the sake of this example) that you have just received, maybe over a socket connection. This list always consists of a blue box, a red box, and a green box, exactly in that order. These boxes are encrypted and protected by an integrity checksum. Before actually processing them, you need to perform decryption and integrity checking. (Also assume that the boxes are completely different. They have different content, different security mechanisms, and require different processing.) Below is one way to go about it:


void onReceiveBoxes1(void* boxes) throw(BoxSecurityException) {
    // Get pointers to boxes.
    blueBox_t* blueBox = (blueBox_t*)boxes;
    redBox_t* redBox = (redBox_t*)(blueBox + 1);
    greenBox_t* greenBox = (greenBox_t*)(redBox + 1);

    // Check box integrity and decrypt box content.
    applySecurityToBlueBox(blueBox);
    applySecurityToRedBox(redBox);
    applySecurityToGreenBox(greenBox);

    // Process the actual boxes.
    processBlueBox(blueBox);
    processRedBox(redBox);
    processGreenBox(greenBox);
}

void onReceiveBoxes1(void* boxes) throw(BoxSecurityException) {

// Get pointers to boxes.

blueBox_t* blueBox = (blueBox_t*)boxes;

redBox_t* redBox = (redBox_t*)(blueBox + 1);

greenBox_t* greenBox = (greenBox_t*)(redBox + 1);

// Check box integrity and decrypt box content.

applySecurityToBlueBox(blueBox);

applySecurityToRedBox(redBox);

applySecurityToGreenBox(greenBox);

// Process the actual boxes.

processBlueBox(blueBox);

processRedBox(redBox);

processGreenBox(greenBox);

}

At first glance, this code doesn’t look bad at all. It is grouped in such a way that the three steps are clearly visible: 1. get a box; 2. apply security to box; 3. process box. If you zoom out a little, the structure looks like this:


    operation 1
        object a
        object b
        object c
    operation 2
        object a
        object b
        object c
    operation 3
        object a
        object b
        object c

operation 1

object a

object b

object c

operation 2

object a

object b

object c

operation 3

object a

object b

object c

Is this the principle of proximity in action? Are related things close together?

Not really. The things that are close together are the objects under each operation, but the objects themselves have little in common. Contrast this with this approach:


void onReceiveBoxes2(void* boxes) throw(BoxSecurityException) {
    // Handle blue box.
    blueBox_t* blueBox = (blueBox_t*)boxes;
    applySecurityToBlueBox(blueBox);
    processBlueBox(blueBox);

    // Handle red box.
    redBox_t* redBox = (redBox_t*)(blueBox + 1);
    applySecurityToRedBox(redBox);
    processRedBox(redBox);

    // Handle green box.
    greenBox_t* greenBox = (greenBox_t*)(redBox + 1);
    applySecurityToGreenBox(greenBox);
    processGreenBox(greenBox);
}

void onReceiveBoxes2(void* boxes) throw(BoxSecurityException) {

// Handle blue box.

blueBox_t* blueBox = (blueBox_t*)boxes;

applySecurityToBlueBox(blueBox);

processBlueBox(blueBox);

// Handle red box.

redBox_t* redBox = (redBox_t*)(blueBox + 1);

applySecurityToRedBox(redBox);

processRedBox(redBox);

// Handle green box.

greenBox_t* greenBox = (greenBox_t*)(redBox + 1);

applySecurityToGreenBox(greenBox);

processGreenBox(greenBox);

}

The structure is now inverted:


    object a
        operation 1
        operation 2
        operation 3
    object b
        operation 1
        operation 2
        operation 3
    object c
        operation 1
        operation 2
        operation 3

object a

operation 1

operation 2

operation 3

object b

operation 1

operation 2

operation 3

object c

operation 1

operation 2

operation 3

The objects and their operations are close together; in fact, they are completely encapsulated. I like to call this ‘encapsulation at runtime’, which is not to be confused with traditional object-oriented encapsulation where you put data and its related operations close together at coding time, in a class. (Which is another instance of the principle of proximity, BTW.)

What I don’t like about onReceiveBoxes1 is that it mixes up things that are unrelated: order of boxes and order of box actions. Just because the boxes are ordered in a particular way, doesn’t mean that we have to perform the box actions in that particular box-order. Unnecessary dependencies are usually bad for maintenance.

Ah, maintainability, that’s where the second implementation really shines! If you have to add a yellow box someday, you just copy and paste the block of an existing box and do some minor modifications. And if the order in which boxes arrive changes, adapting onReceiveBoxes2 is likewise trivial. Better maintainability means that the risk of introducing an error is much lower, which in turn means that you spend less time debugging and have more time for doing code katas.

Honoring the principle of proximity almost always gives you better efficiency, either. Notice that in the first implementation, the pointers to all boxes have a fairly long lifetime and must be kept in memory (or CPU registers) as they are needed until operation 3 has finished. onReceiveBoxes2 only needs a pointer to the box that is currently worked on, which means that the compiler only needs to allocate one pointer.

Approxion

Code – People – Everything

Documenting is a Team Sport, Too!

Where Richard Feynman Was Wrong

When bytes bite

Dangerously Confusing Interfaces

The Principle of Proximity