Circular Adventures V: The Christmas Edition

“And the Grinch, with his Grinch-feet ice cold in the snow,
stood puzzling and puzzling, how could it be so?
It came without ribbons. It came without tags.
It came without packages, boxes or bags.
And he puzzled and puzzled ’till his puzzler was sore.
Then the Grinch thought of something he hadn’t before.
What if Christmas, he thought, doesn’t come from a store?
What if Christmas, perhaps, means a little bit more?”

— Dr. Seuss
“How the Grinch Stole Christmas”

Oh well, oh well, it’s Christmas time again. Year after year, in this dark season, people contemplate the circle of life, watch reruns on TV, and rush to shopping malls to buy overpriced items for their loved ones; it is not unlikely that, as soon as the shops are open again, these loved ones will rush to return all the stuff they don’t really need and buy something cool instead. You can also be sure that the level of madness increases every year, due to a well-known effect that economists call “inflation”.

Since so many things recur around Christmas, this is the perfect time for me to share more “circular” thoughts with you.

But first, some background on the origins of Christmas. The reason why people celebrate Christmas on December 25 has little to do with the birth of Jesus, but rather with an ancient Roman cult called “Sol Invictus” (which means something like “unconquerable or invincible sun”). When the Julian calendar was introduced around 45 BC, the winter solstice fell on December 25 (as opposed to today, when it falls on either December 21 or 22); this day was celebrated as the birth of the sun: on December 25, the sun would come back and rise to its full power over the following months.

It is believed that early Christians also took part in these festivities and celebrated together with the pagans by kindling lights. When the “Sol Invictus” cult was finally replaced by Christianity around AD 300, Christians decided to keep that special day but celebrate the birth of Jesus Christ instead. (Today, it is assumed that Jesus Christ was born around March of 4 BC.)

But now back to more technical stuff. You might have observed that the weekday of a particular date within a year gets advanced by one weekday in the following year. That is, if the first of August is a Tuesday this year, it will be a Wednesday next year. But how come?

Being a veteran circular adventurer by now, you should be able to come up with an answer yourself — at least you should try. Don’t cheat. Don’t read on. Think about it.

OK, here you go. A regular year (no leap year) has 365 days and a week comprises seven days. 365 mod 7 is 1, which means that after 52 7-day weeks, there is still this one day left in the old year to be filled with a weekday. You can think of it like this: the old year nibbles away one weekday from the new year and thus weekdays in the new year are “rotated left” by one. For leap years, the “rotate value” is two, since 366 mod 7 equals 2.
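
In code, the whole argument boils down to a few lines of C (a toy sketch; the helper name and the weekday numbering are my choice):

    #include <stdio.h>

    /* Weekday of a fixed date in the following year.
       weekday: 0 = Monday ... 6 = Sunday.
       has_leap_day: 1 if a February 29 lies between the two dates. */
    int next_years_weekday(int weekday, int has_leap_day)
    {
        int shift = has_leap_day ? 366 % 7 : 365 % 7;    /* 2 or 1 */
        return (weekday + shift) % 7;
    }

    int main(void)
    {
        /* A Tuesday (1) this year becomes a Wednesday (2) next year. */
        printf("%d\n", next_years_weekday(1, 0));
        return 0;
    }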

Regardless of your convictions, regardless of how strange they might look to adherents of other convictions, regardless of whether or what you celebrate, I wish all of you, dear readers of my blog, a Great Time and the best for the New Year.

More circular adventures…

Being Tolerant Towards NULLs


KING LEAR: “What can you say to draw a third [of the kingdom] more opulent than your sisters? Speak.”
CORDELIA: “Nothing, my lord.”
KING LEAR: “Nothing!”
CORDELIA: “Nothing.”
KING LEAR: “Nothing will come of nothing; speak again.”

Around the year 2000, when the mindless “outsourcing-to-India-will-solve-all-our-problems” hype was at its peak, I saw an interview with an Indian minister on TV. When the reporter asked why there were so many talented software engineers in India, the minister replied: “Well, the number zero was invented in India, and programming is all about ones and zeros…”.

Now, this could have been a good joke, but trust me: that man was dead serious about his statement. My first reaction was anger, my second compassion: this poor guy clearly didn’t know what he was talking about. At least on the matter of programming, he was the human equivalent of a null device.

Some months ago, I worked on a Java program. In one of the dialogs the user was asked to specify a path to a file in an edit field; the default value of this edit field came from a property file which I used to store previous path entries made by the user:
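
    // A reconstruction from memory; 'props' and 'pathField' are stand-in
    // names for the properties object and the edit field.
    String path = props.getProperty("filename");
    if (path != null) {
        pathField.setText(path);
    }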

Even though the code above can be shortened by using the ternary operator, I thought that this ‘null’ handling business unnecessarily cluttered up my code. All I wanted was just this: “if there is a value in the property file, use it; otherwise leave the edit field as it is (i.e. empty)”. I wanted to write something that was easy on the eye, like this:
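
    // What I wanted to write (again, 'props' and 'pathField' are stand-ins):
    pathField.setText(props.getProperty("filename"));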

But had I done so, I would have gotten a dreaded NullPointerException in cases where there was no filename property in the property file. Darn! I really wish that APIs could sometimes cope with NULL values; that is, silently ignore them.

Since the early days of databases we have used NULL values to signal the absence of a real value. In SQL, it is well defined how NULL values are interpreted in the context of SQL operators (for instance, if you perform a logical OR operation like TRUE OR NULL, you will get TRUE as a result). In contemporary programming languages, NULL values are often treated like orphans.

In C, for instance, using NULL values usually puts you in the realm of undefined behavior:
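
    /* A sketch; memcpy() expects valid pointers, so passing NULL is
       undefined behavior: */
    #include <string.h>

    int main(void)
    {
        int value = 42;

        memcpy(NULL, &value, sizeof(value));    /* copy 42 to nowhere */
        return 0;
    }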

Depending on the platform you are on, this might either do nothing or lead to a core dump — you cannot tell for sure. Java is more strict in this respect: if you pass a NULL reference to the VM it will complain by throwing a NullPointerException:
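
    // System.arraycopy() checks its arguments: a null source (or
    // destination) array makes the VM throw a NullPointerException.
    int[] src = null;
    int[] dest = new int[42];
    System.arraycopy(src, 0, dest, 0, 42);    // throws NullPointerException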

But wouldn’t it be equally valid to expect that nothing happens in these cases? If you copy 42 to nowhere (or from nowhere), why shouldn’t this be valid? In this case, NULL would behave like a black hole: it sucks up everything and there is no way to get anything out of it.

In OOP, people often apply the Null Object Pattern to get no-op behavior. But wouldn’t it be much nicer if methods would automatically do nothing if invoked on NULL references?
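
As a reminder, here is the pattern in its simplest form (a made-up Logger example):

    interface Logger {
        void log(String message);
    }

    class ConsoleLogger implements Logger {
        public void log(String message) {
            System.out.println(message);
        }
    }

    // The null object: it implements the interface but deliberately does
    // nothing, so callers never have to check for null.
    class NullLogger implements Logger {
        public void log(String message) {
            // intentionally left blank
        }
    }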

Now that I’ve thought about it for a while, what I really want is that APIs and/or higher-level languages behave like /dev/null in Unix: if there is nothing, do nothing. I believe this is just another variant of the Rule of Silence: when a program has nothing surprising to say, it should say nothing.

Contrary to what most people believe, doing ‘nothing’ is often harder than one might expect. Here are some examples (a C sketch of two of these follows the list):

  • Setting/printing a NULL string: don’t output anything.
  • Writing to a file through a file pointer that is NULL: don’t write, don’t complain.
  • Writing NULL to a properly opened file: don’t write anything, leave file pointer unchanged.
  • Opening a file when filename is NULL: don’t open it, just return NULL.
  • Using memcpy where dest is NULL: don’t copy, just return NULL.
  • Using memcmp where either src or dest is NULL: return -1 (or +1).
  • Reading 42 bytes from a socket where dest array is NULL: receive 42 bytes and discard them.
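
In C, such tolerance has to be hand-coded; two of the items above could look something like this (the safe_* names are mine):

    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    /* Print a string; silently do nothing if the string (or the file
       pointer) is NULL. */
    void safe_puts(const char *s, FILE *fp)
    {
        if (s != NULL && fp != NULL)
            fputs(s, fp);
    }

    /* Copy memory; silently do nothing if either pointer is NULL. */
    void *safe_memcpy(void *dest, const void *src, size_t n)
    {
        if (dest == NULL || src == NULL)
            return NULL;    /* the "black hole" behavior */
        return memcpy(dest, src, n);
    }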

Even though I’m convinced that NULL-tolerant APIs would often simplify a programmer’s life, in a low-level programming language like C, where performance is everything, NULL tolerance is probably not an option. But higher-level languages (as well as dynamic languages and high-level APIs implemented in lower-level languages) could certainly benefit.

The True Value of PC-lint


“An ounce of prevention is worth a pound of cure.”

— Benjamin Franklin

When you ask folks who sell static analysis tools why you should buy their expensive products, they will all tell you: “to find bugs, of course!”. Very likely, they will show you a diagram displaying the exponential growth of the cost of fixing a bug, depending on the stage at which it is detected. Also very likely, they will brag that — compared to PC-lint — they have a very low “false positive” rate.

Fair enough, but the story doesn’t end here. Take this code, for instance:
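
    // Reconstructed from memory -- not the original listing, but it
    // exhibits every issue discussed below.
    #include <iostream>    // never used

    #define ADD(x, y) x + y    // unparenthesized

    class Base {               // no default constructor
    public:
        Base(int value)        // not explicit
        : m_value(value)
        {
        }

        ~Base()                // not virtual
        {
        }

        int add(int n)         // could be const
        {
            return ADD(m_value, n);
        }

    private:
        int m_value;
    };

    class Derived : public Base {    // Base has no virtual functions
    public:
        Derived(int value)
        : Base(value)
        {
        }
    };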

The code, as it stands, is 100% error-free. Yet, PC-lint will not like it for several good reasons. For instance:

  1. There are no virtual functions in Base, so what’s the point of deriving from it? Was it an oversight?
  2. There is no default constructor, so you cannot put objects of type Base in standard library containers.
  3. The Base constructor is not explicit, so plain integers might get silently converted into Base objects.
  4. The add method could be declared const; adding const-correctness later is usually difficult.
  5. Neither the ADD macro nor its parameters are parenthesized; this gives rise to all sorts of operator precedence problems.
  6. Base’s destructor is not virtual, which means the behavior is undefined when somebody deletes a Derived object through a pointer to Base.
  7. The iostream header file is not used and hence not needed; removing the #include statement improves clarity and compilation times.

So there are no bugs. But are these issues flagged by PC-lint really false positives?

To me, the reported warnings are a sign of the poor quality of this code; it is full of latent bugs, just waiting to come alive in the future or in somebody else’s hands. Shady code like this increases technical debt and makes maintenance harder, riskier, and more expensive.

I want such issues reported and resolved before I check in my code; actually, before I even execute it. Right after a successful compile I pay attention to PC-lint’s feedback and resolve real bugs, latent bugs, and bad coding practices. I don’t want to get a ticket three weeks later from a software QA guy, after I’m done with debugging and the mental distance has become large. So large that it would take a lot of effort to recall what I was thinking at the time of writing. I want quick and easy desktop checking, such that my bad code is never seen by anyone but me.

Finding bugs and code smells at the earliest possible time, and thus keeping maintenance costs low; not just focusing on finding bugs, but on preventing bugs — today and tomorrow — in the first place. That’s the true value of PC-lint.

 

The Principle of Least Surprise


“He reached out and pressed an invitingly large red button on a nearby panel. The panel lit up with the words PLEASE DO NOT PRESS THIS BUTTON AGAIN. He shook himself.”

— Douglas Adams
“The Hitchhiker’s Guide to the Galaxy”

I’ve written elsewhere that to me, the source code itself constitutes the design of a software product. All other forms of (design) documentation fall behind, sooner or later. If you have other documentation in parallel — however legitimate your reasons may be — you have to pay the price that all violations of the DRY principle incur.

If the code is the documentation of the design, it should be easy to read. Good identifier names, short routines that do just one thing, no super-clever hacks and so on. Most importantly, it should be free of surprises such that it can be read (and skimmed) without major effort. Enter the Principle of Least Surprise (PoLS).

A for-loop like this is a surprise:
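
    /* A sketch of such a loop ('gross_len' as in the text): */
    size_t i;

    for (i = 0; i <= gross_len - 1; ++i) {
        /* do something with element i */
    }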

Why? If a programmer sees a start index of 0, (s)he assumes an iteration over a half-open range; that is, the upper bound is excluded. Even if it is OK in this case and not a bug (who is to tell without browsing a lot of other code?), it results in quite some head-scratching. Contrast it with this rewrite:
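
    /* The same loop, written as a half-open range (sketch): */
    size_t i;

    for (i = 0; i < gross_len; ++i) {
        /* do something with element i */
    }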

Even without comments, everyone gets it: “Do something ‘gross_len’ times”.

Recently, I came across code that initialized a state machine:
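
    /* Reconstructed from memory; the names are approximate: */
    uint32_t state = STATE_PERMANENT;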

I was hunting down a bug. Since no events had occurred, I expected the state machine to be still in state ‘permanent’ but the behavior of the component made me believe that it wasn’t. So I loaded the code into a debugger and set a write breakpoint on variable ‘state’ to find out which code (actually who) reset the ‘permanent’ flag. But apart from the initialization of ‘state’ I didn’t get a hit. After walking around for some time (it is always a good idea to get away from the computer when solving difficult problems) I had another desperate idea: maybe ‘state’ was never initialized to ‘permanent’. I opened up the header file that defined the flags and what I stared at with horror was this:
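
    /* The relevant part of the header file (again reconstructed; the
       names are approximate): */
    #ifdef FEATURE_PERMANENT_STATES
    #define STATE_PERMANENT 0x0001u
    #else
    #define STATE_PERMANENT 0x0000u    /* "sets" nothing at all! */
    #endif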

I guess I must have looked like Steve McConnell’s “Coding Horror” guy.

Now, I consider it OK if a platform doesn’t support all features; but this way of dealing with it is probably the worst. And it is a clear violation of PoLS: when I see code that looks like it sets a flag, I expect that it sets the darn flag. Period.

So the general advice for adhering to PoLS is “Write WYSIWYG Code”. Code that uses well-known idioms consistently, code that can be grasped without debugging and jumping back and forth between files and declarations. Put all cards on the table; say what you mean and mean what you say.

Avoid Array Parameters in C

Below is a function that implements a 128-bit counter in C. It expects an array containing 16 bytes (totaling 128 bits) and increments its value by one. The byte at index 0 is the low-order byte, thus the counter is in fact a little-endian counter. Note how any carry is elegantly propagated to higher-order bytes by checking for wrap-around of the currently incremented byte.
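
    /* Reconstructed from memory; the function name is my choice. */
    #include <stddef.h>
    #include <stdint.h>

    void increment(uint8_t counter[16])
    {
        size_t i;

        for (i = 0; i < sizeof(counter); ++i) {
            ++counter[i];
            if (counter[i] != 0)    /* no wrap-around: carry stops here */
                break;
        }
    }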

This code looks neat, compiles fine and works — but only partially. Depending on your platform, it might only use the lower 4 or 8 bytes of the 16 bytes that you pass to the function. How come?

Even though counter looks like an array of 16 bytes, it is actually a pointer. In C, looks can certainly be deceiving!

As a consequence, sizeof(counter) yields 4 on a 32-bit platform (and 8 on a 64-bit platform). According to the language rules of C/C++, arrays are never passed by value: what is actually passed is a pointer to the first array element, and there is nothing we can do about it.

Don’t get fooled into believing that dereferencing the counter pointer fixes the problem:
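
    /* A tempting but futile "fix" (reconstructed): */
    void increment(uint8_t counter[16])
    {
        size_t i;

        /* sizeof(*counter) is sizeof(uint8_t), i.e. 1 -- not 16: */
        for (i = 0; i < sizeof(*counter); ++i) {
            ++counter[i];
            if (counter[i] != 0)
                break;
        }
    }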

The type of counter is ‘pointer to uint8_t’, the dimension [16] is completely ignored by the compiler. Even worse: providing a dimension lures people into a false sense of type-safety. Hence my advice: avoid using array parameters in C (and C++).

Obviously, the unlucky programmer wanted to prevent people from making mistakes by explicitly showing that an array of 16 bytes is required. The irony is that in the end he screwed up himself.

What our programmer wanted to achieve was this:
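
    /* Reconstructed; a pointer to an array of 16 bytes: */
    void increment(uint8_t (*counter)[16])
    {
        size_t i;

        for (i = 0; i < sizeof(*counter); ++i) {    /* now really 16 */
            ++(*counter)[i];
            if ((*counter)[i] != 0)
                break;
        }
    }

    /* Callers pass the array's address: increment(&my_counter),
       with uint8_t my_counter[16]. */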

Here, counter is declared to be a pointer to an array of 16 bytes and everything works as expected. But isn’t this somewhat ugly?

As almost always, a typedef can clean things up a bit:
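
    /* The typedef (and its name) are my reconstruction: */
    typedef uint8_t counter128_t[16];

    void increment(counter128_t *counter)
    {
        size_t i;

        for (i = 0; i < sizeof(*counter); ++i) {
            ++(*counter)[i];
            if ((*counter)[i] != 0)
                break;
        }
    }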

But my preferred solution is to wrap the array in a struct:
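
    /* Again, the names are my reconstruction: */
    typedef struct {
        uint8_t bytes[16];
    } counter128_t;

    void increment(counter128_t *counter)
    {
        size_t i;

        for (i = 0; i < sizeof(counter->bytes); ++i) {
            ++counter->bytes[i];
            if (counter->bytes[i] != 0)
                break;
        }
    }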

For most people, array members of structs are easier to grok than pointers to arrays, which is why I try to avoid the latter whenever possible. Further, having a struct opens the door for future extensions and has the additional benefit of making direct value assignments possible without having to resort to memcpy:
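
    counter128_t a, b = { { 0 } };

    /* ... */

    a = b;    /* plain struct assignment copies all 16 bytes; no memcpy needed */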