The True Value of PC-lint

nobugs

“An ounce of prevention is worth a pound of cure.”

— Benjamin Franklin

When you ask folks who sell static analysis tools why you should buy their expensive products, they all will tell you “to find bugs, of course!”. Very likely, they will show you a diagram that displays the exponential cost growth of fixing a bug, depending on the stage where it is detected. Also very likely, they will brag that — compared to PC-lint — they have a very low “false positive” rate.

Fair enough, but the story doesn’t end here. Take this code, for instance:


#include <iostream>

#include <iostream>


#define ADD(x, y) x + y

#define ADD(x, y) x + y


class Base {
public:
    Base(int i) : _i(i) { ... }
    int add(int a, int b) { return ADD(a, b); }
    ~Base() { ... }
private:
    int _i;
};

class Base {

public:

Base(int i) : _i(i) { ... }

int add(int a, int b) { return ADD(a, b); }

~Base() { ... }

private:

int _i;

};


class Derived : public Base {
    void increment() { ... }
};

class Derived : public Base {

void increment() { ... }

};

The code, as it stands, is 100% error-free. Yet, PC-lint will not like it for several good reasons. For instance:

There are no virtual functions in Base, so what’s the point deriving from it? Was it an oversight?
There is no default constructor, and hence you cannot put objects of type Base in standard library containers
The Base constructor is not explicit, so plain integers might get silently converted into Base objects
The add method could be declared const; adding const-correctness later is usually difficult
The ADD macro is not parenthesized and neither are its parameters; this gives rise to all sorts of operator precedence problems
Base’s destructor is not virtual, which means that the behavior is undefined when somebody deletes Derived objects through a pointer to Base
The iostream header file is not used and hence not needed; removing the #include statement improves clarity and compilation times

So there are no bugs. But are these issues flagged by PC-lint really false positives?

Too me, the reported warnings are a sign of the poor quality of this code; this code is full of latent bugs, just waiting to become alive in the future or in somebody else’s hands. Shady code like this increases technical debt, makes maintenance harder, riskier, and more expensive.

I want such issues reported and resolved before I check in my code, actually, before I even execute it. Right after a successful compile I pay attention to PC-lint’s feedback and resolve real bugs, latent bugs and any bad coding practices. I don’t want to get a ticket three weeks later from a software QA guy, after I’m done with debugging and when the mental distance has become large. So large that it would take a lot of effort to recall what I was thinking at the time of writing. I want quick and easy desktop checking such that my bad code is never seen by anyone but me.

Finding bugs and code smells at the earliest possible time and thus keeping maintenance cost lost; not just focusing on finding bugs, but preventing bugs — today and tomorrow — in the first place. That’s the true value of PC-lint.

The Principle of Least Surprise

jokey_smurf

“He reached out and pressed an invitingly large red button on a nearby panel. The panel lit up with the words PLEASE DO NOT PRESS THIS BUTTON AGAIN. He shook himself.”

“The Hitchhiker’s Guide to the Galaxy”

I’ve written elsewhere that to me, the source code itself constitutes the design of a software product. All other forms of (design) documentation fall behind, sooner or later. If you have other documentation in parallel — however legitimate your reasons may be — you have to pay the price that all violations of the DRY principle incur.

If the code is the documentation of the design, it should be easy to read. Good identifier names, short routines that do just one thing, no super-clever hacks and so on. Most importantly, it should be free of surprises such that it can be read (and skimmed) without major effort. Enter the Principle of Least Surprise (PoLS).

A for-loop like this is a surprise:


for (int i = 0; i <= len; i++) ...

for (int i = 0; i <= len; i++) ...

Why? If a programmer sees a start index of 0 (s)he assumes an iteration over a half-open range; that is, the upper bound is excluded. Even if it is OK in this case and not a bug (who is to tell without browsing a lot of other code?) it results in quite some head-scratching. Contrast this with this rewrite:


const int SPARE_ELEMENT_COUNT = 1;
...
int gross_len = len + SPARE_ELEMENT_COUNT;
for (int i = 0; i < gross_len; i++) ...

const int SPARE_ELEMENT_COUNT = 1;

...

int gross_len = len + SPARE_ELEMENT_COUNT;

for (int i = 0; i < gross_len; i++) ...

Even without comments, everyone gets it: “Do something ‘gross_len’ times”.

Recently, I came about code that initialized a state machine:


state = STATE_READY | STATE_PERMANENT | STATE_PRIO_HIGH;

state = STATE_READY | STATE_PERMANENT | STATE_PRIO_HIGH;

I was hunting down a bug. Since no events had occurred, I expected the state machine to be still in state ‘permanent’ but the behavior of the component made me believe that it wasn’t. So I loaded the code into a debugger and set a write breakpoint on variable ‘state’ to find out which code (actually who) reset the ‘permanent’ flag. But apart from the initialization of ‘state’ I didn’t get a hit. After walking around for some time (it is always a good idea to get away from the computer when solving difficult problems) I had another desperate idea: maybe ‘state’ was never initialized to ‘permanent’. I opened up the header file that defined the flags and what I stared at with horror was this:


static const int STATE_READY      = 0x01;

static const int STATE_READY = 0x01;


static const int STATE_PRIO_HIGH  = 0x02;

static const int STATE_PRIO_HIGH = 0x02;


// static const int STATE_PERMANENT  = 0x04;

// static const int STATE_PERMANENT = 0x04;


// Our platform doesn't support 'persistent' --> ignore!
static const int STATE_PERMANENT  = 0x00;

// Our platform doesn't support 'persistent' --> ignore!

static const int STATE_PERMANENT = 0x00;

I guess I must have looked like Steve McConnell’s “Coding Horror” guy.

Now, I consider it OK if a platform doesn’t support all features; but this way of dealing with it is probably the worst. And it is a clear violation of PoLS: when I see code that looks like it sets a flag, I expect that it sets the darn flag. Period.

So the general advice for adhering to PoLS is “Write WYSIWYG Code”. Code that uses well-known idioms consistently, code that can be grasped without debugging and jumping back and forth between files and declarations. Put all cards on the table; say what you mean and mean what you say.

In Defense of B Players

A couple of months ago, I came across Jon Soberg’s post “Why hiring B players will kill your startup”. According to him, B players (who actually do a good-enough job) are the worst hires you can make. I totally disagree.

I’m convinced that having only A players on a team is neither possible nor desirable. Just imagine you would assign rock stars developers like Anders Hejlsberg, Linus Torvalds, and James Gosling to a project. What do you think the outcome would be? I wouldn’t be surprised to find them all dead within 24 hours…

You certainly do need A players. They are usually full of energy and highly creative. But in my experience, they often lack persistence and they get bored easily when they are assigned mundane work, especially over an extended period of time. Just like athletes permanently want to put their muscles to the test, A-level coders permanently want to challenge their brains.

Unfortunately, they don’t always use their brains to invent and implement awesome features that customers are willing to pay for. Instead they love to try out new technologies and spend countless hours tinkering with questionable language features.

Very often, these alpha coders have strong opinions, behave like prima donnas and are — let’s face it — not exactly what you would call great team players.

On a real-world project, you need B players, developers who don’t shun away from grunt work. Even if they are not brilliant all the time, their main strength is getting things done. Instead of just conceptualizing features, they actually implement them — in good-enough quality. They rarely gold-plate or over-engineer; instead, they live by the Extreme Programming maxim “The simplest design that could possibly work”. They are what Joel Spolsky calls “Duct Tape Programmers”.

To be successful as a company, you will need both, A players and even more B players. Use A players as lead developers, to generate ideas and to develop prototypes; employ steady performing B players to ensure that your products actually ship.

Just having A players can certainly ruin your business, too.

Avoid Array Parameters in C

Below is a function that implements a 128-bit counter in C. It expects an array containing 16 bytes (totaling 128 bits) and increments its value by one. The byte at index 0 is the low-order byte, thus the counter is in fact a little-endian counter. Note how any carry is elegantly propagated to higher-order bytes by checking for wrap-around of the currently incremented byte.


static void incrementCounter(uint8_t counter[16]) {
    int i;
    for (i = 0; i < sizeof(counter); ++i) {
        if (++counter[i] != 0) {
            break;
        }
    }
}

static void incrementCounter(uint8_t counter[16]) {

int i;

for (i = 0; i < sizeof(counter); ++i) {

if (++counter[i] != 0) {

break;

}

This code looks neat, compiles fine and works — but only partially. Depending on your platform, it might only use the lower 4 or 8 bytes of the 16 bytes that you pass to the function. How come?

Even though counter looks like an array of 16 bytes, it is actually a pointer. In C, looks can certainly be deceiving!

As a consequence, on a 32-bit platform sizeof(counter) yields 4 (8 on a 64-bit platform, respectively). According to the language rules of C/C++, arrays are always passed by reference (by passing a pointer to the first array element) and there is nothing we can do about it.

Don’t get fooled into believing that dereferencing the counter pointer fixes the problem:


for (i = 0; i < sizeof(*counter); ++i) {

for (i = 0; i < sizeof(*counter); ++i) {

The type of counter is ‘pointer to uint8_t’, the dimension [16] is completely ignored by the compiler. Even worse: providing a dimension lures people into a false sense of type-safety. Hence my advice: avoid using array parameters in C (and C++).

Obviously, the unlucky programmer wanted to prevent people from making mistakes by explicitly showing that an array of 16 bytes is required. The irony is that in the end he screwed up himself.

What our programmer wanted to achive was this:


static void incrementCounter(uint8_t (*counter)[16]) {
    int i;
    for (i = 0; i < sizeof(*counter); ++i) {
        if (++(*counter)[i] != 0) {
            break;
        }
    }
}

static void incrementCounter(uint8_t (*counter)[16]) {

int i;

for (i = 0; i < sizeof(*counter); ++i) {

if (++(*counter)[i] != 0) {

break;

}

Here, counter is declared to be a pointer to an array of 16 bytes and everything works as expected. But isn’t this somewhat ugly?

As almost always, a typedef can clean up things a bit:


typedef uint8_t counter_t[16];

static void incrementCounter(counter_t* counter) {
    int i;
    for (i = 0; i < sizeof(*counter); ++i) {
        if (++(*counter)[i] != 0) {
            break;
        }
    }
}

typedef uint8_t counter_t[16];

static void incrementCounter(counter_t* counter) {

int i;

for (i = 0; i < sizeof(*counter); ++i) {

if (++(*counter)[i] != 0) {

break;

}

But my preferred solution is to wrap the array in a struct:


typedef struct {
    uint8_t value[16];
} counter_t;

static void incrementCounter(counter_t* counter) {
    int i;
    for (i = 0; i < sizeof(counter->value); ++i) {
        if (++counter->value[i] != 0) {
            break;
        }
    }
}

typedef struct {

uint8_t value[16];

} counter_t;

static void incrementCounter(counter_t* counter) {

int i;

for (i = 0; i < sizeof(counter->value); ++i) {

if (++counter->value[i] != 0) {

break;

}

For most people, array members of structs are easier to grok than pointers to arrays. That’s why I try to avoid them whenever possible. Further, having a struct opens the door for future extensions and has the additional benefit of making direct value assignments possible without having to resort to memcpy:


counter_t counter1, counter2;
...
counter2 = counter1;

counter_t counter1, counter2;

...

counter2 = counter1;

The Art of Writing Efficient Software

I’ve just uploaded my talk “The Art of Writing Efficient Software” to slideshare.
Share and Enjoy!

Bitten by the Python

python It’s amazing how one can get used to something nice, up to a point where you don’t recognize it anymore. Sometimes, you don’t recognize it even when its not there anymore — at least initially.

After a night of Python coding, I was back at my embedded C++ project, when I wrote code similar to this:


...
// If item is in range.
int item = ItemStore.getCurrentItem();
if (0 <= item < ItemStore.getItemCount()) {
    ...
}

...

// If item is in range.

int item = ItemStore.getCurrentItem();

if (0 <= item < ItemStore.getItemCount()) {

...

}

The expression that I used to check whether the item index is in range is valid in both, Python and C++. Alas, only in Python it does what it should; in C++ it’s a clear bug. Obviously, the if statement should have looked like this:


if (0 <= item && item < ItemStore.getItemCount()) {
    ...
}

if (0 <= item && item < ItemStore.getItemCount()) {

...

}

Duh! Of course the comparison operator doesn’t work like this in Java/C#/C/C++, but while Java and C# compilers reject this code, WindRiver’s Diab C/C++ compiler happily accepts it.

I wasn’t so much annoyed about the fact that the Diab compiler accepted it, since it is syntactically correct according to the rules of C and C++; what upset me was the “happily” — it didn’t produce any warning about my nonsense!

There is, as I found out later, a warning level that would have reported my mistake, but this warning level was not enabled in our makefile, probably because it would have generated many false positives.

Even though I always pay attention to compiler warnings before flashing my code to the target, I had to find this bug the hard way: by using a debugger. I guess that I wasted more than one hour in total to track it down.

What I expect from compiler vendors is that (a) by default, warnings are on and that (b) this default warning level includes all warnings that are easy to detect and are almost always bugs or latent bugs, including (but not limited to):

– Use of unitinitalized variables
– Test of assignment, eg. if (a = b)
– Expressions with no side-effects, eg. a == b;
– Returning addresses of automatic variables
– Forgetting to return a value from a non-void function
– My stupid comparison mistake

I don’t expect all C/C++ compilers to be as thorough as PC-Lint. Ironically, most of today’s compilers are already able to report the aforementioned issues (and many more), but they do this only at higher warning levels that nobody usually enables.

Approxion

Code – People – Everything

The Principle of Least Surprise

In Defense of B Players

Avoid Array Parameters in C

The Art of Writing Efficient Software