Bug Hunting Adventures #7: Random Programming

If you want to draw m random samples out of a set of n elements, Knuth’s algorithm ‘S’ (ref. The Art of Computer Programming, Vol 2, 3.4.2) comes in handy. It starts out by attempting to select the first element with a probability of m/n; subsequent elements are selected with a probability that depends on whether a previous element was drawn or not.

Let’s do an example. To select 2 random samples out of a set of 5 (say, the digits ‘0’, ‘1’, ‘2’, ‘3’, and ‘4’), we select ‘0’ with a probability of 2/5. If ‘0’ is not chosen, we attempt to select ‘1’ with a probability of 2/4 (since there are only 4 candidates left); otherwise (if ‘0’ has been selected) with a probability of 1/4, since one attempt to draw has already been used up. It’s easy to show the corresponding element selection probabilities in a binary tree:

In this tree, the nodes represent the likelihood that the element in the column on the right will be picked. Transitions from a node to the right mean that the element was picked, while transistions to the left denote that the corresponding element was not picked. In the ‘selected’ case, the numerator is decreased by one; the denominator is decreased in either case.

Here is an implementation of Knuth’s algorithm in C:

For the sake of demonstration, assume that bigrand() is a function that returns a random number that is “big enough”, meaning a value that is greater or equal to 0 and has an upper bound of at least n – 1 (a larger upper bound does not harm). Spend some time to understand how the ‘if’ statement implements the selection with the correct probablity.

So far, so good. But how would this clever algorithm look in another language? Let’s port it to Ruby!

Not being a Ruby programmer (but being an experienced Online Programmer), I had to ask Auntie Google many questions in order to cobble together this code:

This Ruby code looks remarkably similar to the C version. It doesn’t contain any syntax errors, but nevertheless doesn’t output what it is supposed to. Can you see why?

For those folks who don’t want to waste their time on Ruby, here is the Python version, sporting the same behavior and thus the same blunder:

Solution

The Game of Life

game_of_life

“Imagine there’s no heaven
It’s easy if you try
No hell below us
Above us only sky”

— John Lennon “Imagine”

Once again, like every year, time has come to celebrate Towel Day, a great occasion to ponder Life, the Universe and Everything.

Speaking of Life — a surprising number of people, including software developers, don’t know about LIFE, also called Conway’s Game of Life; an even smaller number is aware of the corollaries, let alone accept them as a fact of life (pun intended!). So what is LIFE?

In LIFE, which isn’t really a game, but rather a simulation, there is an infinite field of cells; cells can be in either of two states: dead or alive. Conway defined four simple rules:

  1. A live cell with fewer than two live neighbors dies (think: dies of loneliness).
  2. A live cell with two or three live neighbors continues to live.
  3. A live cell with more than three live neighbors dies (think: dies of overpopulation).
  4. A dead cell with exactly three live neighbors becomes a live cell (think: birth of a cell).

You start with an initial (e.g. random) set of live and dead cells, let time increase in discrete steps and after every step you apply these four rules. That’s all. After every step the board contains a new set of live and dead cells.

It is quite fascinating to see how structures, patterns (or objects) emerge, move, disappear and reappear. Some of these objects have been given names that aptly describe their nature, like “pulsars”, “gliders”, “glider guns”, just to name a few.

What is even more fascinating is that these objects are governed by higher-order “laws” that are not obvious from the four simple rules. For instance, you can observe that “blocks” never move, “gliders” always move diagonally and “lightweight spaceships” always move from right to left. (Here is a great site for trying out various patterns yourself.)

Isn’t this very much like our own universe? In our universe, we have some fundamental laws, which give rise to higher-level structures and laws, up to highest-level laws of physics or principles of human behavior.

What Conway proved was that complex structures can emerge from simple rules; he proved that you don’t need a Creator to obtain a complex universe, just simple rules, time and favorable circumstances.

Religious people often have a hard time accepting that. One the one hand, they argue, Conway didn’t prove the absence of a Creator, and second, Conway himself acted as a Creator himself since he — after all — created the rules of the game. Isn’t there, in our world, at least room for such a “Creator of Rules”?

Nobody knows, but I personally don’t think so. What Conway did was not create the rules — he rather zoomed in on a particular universe with a particular rule set, chosen from an infinite set of rules: He just shed light on one particular universe that is favorable of life in an infinite multiverse.

Let me close this post with the words of Stephen Hawking. Like all human beings, he doesn’t know everything, but he is probably one of the persons who has the best grasp of our universe:

“We are each free to believe what we want and it is my view that the simplest explanation is there is no God. No one created the universe and no one directs our fate. This leads me to a profound realization. There is probably no heaven, and no afterlife either. We have this one life to appreciate the grand design of the universe, and for that, I am extremely grateful.”

Oh No! The ‘in’ Operator Strikes Again!

depressed

“Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.”
— Scott Adams

When I discovered the eerie truth about JavaScript’s ‘in’ operator, I thought that it couldn’t get any weirder. But boy-oh-boy, was I wrong!

Let’s first recap what’s so bad about JavaScript’s implementation of ‘in’. If you have a list (aka. array) in JavaScript and use the ‘in’ operator to test for membership, you are asking for trouble:

Contrary to what you might expect, the ‘in’ operator doesn’t test whether the left-hand side value is a member of the array, but rather whether it is a valid index into the array. This makes absolutely no sense for arrays, but it does for dictionaries, since it tests whether a given key is present or not:

The moral of the story? Don’t use the ‘in’ operator in JavaScript on arrays to test for membership; with dictionaries everything is fine.

The other day, I found another shocking behavior of the ‘in’ operator, but this time in Groovy; not when used on arrays — when used on dictionaries! Consider this routine. It has two parameters, a list of words (the input text) and a dictionary which contains an entry for every word whose number of occurrences in the input text is to be counted:

The initial word counts in ‘wordsToCount’ are set to 0; the word counts are increased for every corresponding word found in ‘wordsList’. For instance:

should print:

At least, this is what most people would expect and what one would get in Python or JavaScript. Alas, what you will get in Groovy is this:

When used on dictionaries, Groovy’s ‘in’ operator not only checks whether the given key on its left-hand side exists in the dictionary, it also overeagerly checks if the corresponding value is ‘true’ according to the Groovy Truth Rules; that is, 0, null, empty strings and empty lists will make the ‘in’ operator return false.

The ‘in’ operator should only test for membership; that is, it should test if the left-hand side is in the collection on the right-hand side. On no account should it interpret an element’s value.

Not following a well-established convention is bad enought, but being inconsistent within the same language is a lot worse: when used with lists/arrays, the ‘in’ operator just checks for membership and doesn’t care a bag of beans about the value of a member:

It won’t get any weirder — it simply can’t!

Bug Hunting Adventures #6: Logarithmic Errors

Sometimes, it is necessary to find out how an application uses dynamically memory. Especially on constrained (think embedded) systems you should know, for instance, what block sizes are typically requested, how much memory is allocated at any one time, and how much memory has been requested at most so far (allocation high-water mark). Data like this allows you to fine-tune your memory allocator and/or roll out your own highly-efficient, special-purpose allocator for popular block sizes.

I recently implemented such a statistics-enhanced version of operator new/delete and one of its tasks was to find out how frequently blocks of certain sizes were requested. Instead of keeping track of exact request sizes (which would have been prohibitively expensive on the target system) I came up with a simpler, but sufficient scheme where I counted allocations to block sizes, rounded to the next base-2 boundary:

log2ceil is the ‘ceiled’ (rounded-up) binary logarithm of a given value. As an example, log2ceil(100) and log2ceil(128) both yield the same value of 7. Hence, the value g_allocations[7] would tell me how many allocations of block sizes in the range of 65 – 128 bytes there have been.

Below is the code of a log2ceil function; it works by counting the number of leading (left-most) zero bits in a word and subtracting this value from 31. In order to get the desired rounding behavior, a well-known trick is applied: first, decrement the argument by one and then increment the result by one.

This implementation is straightforward and all the tests pass. Still, it has a subtle portability issue. Can you see it?

Code
Solution

The Reason We Have a Job

“There’s truths you have to grow into.”
― H.G. Wells, Love and Mr. Lewisham

Let me ask you a simple question. Why do you have a job?

Maybe you are convinced that you have a job because you have a lot of obligations, like hungry mouth to feed and bills to pay. Or because someone told you that every decent citizen simply must have a job these days.

But no, that’s not the answer I am looking for. These are all valid reasons why you want or need or should have a job.

Or maybe you believe the reason is that you are such an irresistibly smart person that every company just wants to posses.

No, again, such factors only make you attractive to your employer, they give you a competitive edge over other people.

So what is the real reason you have a job? Here is my answer: unless you work for a non-profit organization (or the government) the only reason why you have a job is that someone — the business owner — believes that, overall, you generate much more money than you cost.

The verb believes is essential here. It could be that you, objectively, turn out not to be such a great investment for your company, maybe even a massive waste of money, but that doesn’t matter much: you can still keep your job as long as your boss’ belief in your money-generating abilities remains strong.

However, once this belief (or hope) is gone, your days are numbered. You may stay on the payroll for quite some time, but you will be walking around neither dead nor alive, just like a zombie. The question is not if you will lose your job — the question is when.

But how cruel are these greedy company owners? What about social responsibility? Is it always just about the money?

This attitude arises from a common misconception, and yes, it is always just about the money. It is not the responsibility of a business owner to make employees rich (not even happy) — it is just the other way around. Sure, there are hip companies which treat their employees extremely well, pay high salaries and grant extraordinary benefits; but let’s not forget: it is only because there is (still) an enormous belief in a high return on investment. If everything turns out well in this gamble there is a win-win situation: happy business owner, happy employees — all the ingredients for a long-lasting virtuous circle.

In the not-so-fortunate case — and every company will face stormy weather, even the hippest — a business owner has to eliminate cost to ensure that his business stays profitable; not performing these life-saving amputations would put his and his employees’ future at risk. Unprofitable companies cannot exist for an extended period of time in a free market; they will sooner or later be eaten by the competition. And that’s not a statement — that’s a fact.

So wise developers should avoid becoming dependent on an employer’s dreams and beliefs. Instead, developers should constantly and aggressively invest in their knowledge portfolio and keep their resume up-to-date. They should neither mentally nor financially become too attached to a company or technology. Instead, they should always be ready to move on.

Give an outstanding performance for your money, but treat every day as if it would be your last. Get rid of those personal items on your desk. Accept that employments are transient. This not only gives you independence and self-confidence, it also makes you indispensable for your current employer — and your next.

Bug Hunting Adventures #5: False Diagnosis

Today’s cars host dozens of ECUs (electronic control units) that exchange tens of thousands of messages and signals on various buses (eg. CAN, MOST, Flexray, Ethernet). On top of that, ECUs provide hundredths of diagnostic services, that allow factories, maintenance personnel and even devices on the car to obtain (and change) various parameters, including vehicle speed, diagnostic trouble codes, and sensor/actuator data. By using diagnostic services, you can literally remote control your car, provided you get passed the security mechanisms, of course.

Diagnostic commands are exchanged based on the UDS protocol, as specified by ISO 14229-1:2013, via request/response pairs. Requests and responses are coded like this:

Request: SA, DA, SID1, ..., SIDn
Response (OK): SA, DA, SID1|0x40, SID2, ..., SIDn, RESP-PAYLOAD
Response (error): SA, DA, 0x7F, SID1, ERROR-CODE

Every ECU in a car has a unique 1-byte diagnostic address that uniquely identifies it (signified as source address SA and destination address DA, depending on whether a device acts as a sender or receiver) and each diagnostic service of an ECU is addressed by a 1 to n byte long service ID (SID). (Note that in reality, the n-byte “SID” consists of the real SID and a sub-level function ID, but this detail is of no importance for our considerations.)

If, for example, a device with SA = 0x20 wants to read all diagnostic trouble codes from an ECU with DA = 0x60, it would issue this request:

0x20 0x60 0x19 0x0A

where 0x19 0x0A is the two-byte service ID for the “READ DIAGNOSTIC TROUBLE CODES” service.

The positive response of the targeted ECU might look like this:

0x60 0x20 0x59 0x0A 0xFF 0x02 0x40 0x00 0x50 0x02 ...

Notice that the values of SA and DA are swapped; the replying ECU is now the sender and the requesting ECU the receiver. 0x59 is the first byte of the service ID (SID1) or’ed with the “response OK” flag 0x40. The response payload (the requested information, that is, the actual trouble codes) starts with 0xFF 0x02… .

If an ECU cannot fulfill a request, it sends a negative response indicator, followed by an error code that gives details on what went wrong:

0x60 0x20 0x7F 0x019 0x11

Here, 0x7F denotes ‘error’, 0x19 is the first SID byte of the original request (this time without the “response OK” flag) and 0x11 is the error code for “SERVICE NOT SUPPORTED”.

There are two error code values that usually require special treatment: If an ECU responds with error code 0x21 it means that it is currently busy and unable to process the request; if it responds with 0x78 it denotes that it is able to process the request but needs more time to complete it.

This description of the UDS protocol is of course an oversimplification, but it is enough for our bug-hunting purposes. Attached is an extract of an application that incorrectly processes the response to a diagnostic service request. Can you diagnose the problem?

Code
Solution