— Scott Meyers
Dangerously confusing interfaces can lurk anywhere, even in the venerable (yuck!) DOS batch scripting language. Some time ago, I burnt my fingers when I made a tiny tweak to an existing batch file, deploy.bat, which was part of a larger build script:
    :
    rem Deploy executable to release folder.
    copy image.bin %DEPLOY_PATH%
    :
Because we had seen the ‘copy’ command fail in the past, I tried to improve things a little by adding an ‘if’ statement to ensure that we would get a clear error message in such events:
    :
    rem Deploy executable to release folder.
    copy image.bin %DEPLOY_PATH%
    if %ERRORLEVEL% NEQ 0 (
        echo Deploying to %DEPLOY_PATH% failed.
        exit /b 1
    )
    :
Alas, it didn’t work. There was still no error message when the copy command failed. Worse yet, the outer build script happily continued to run. Puzzled, I opened a DOS box and did some experiments:
    H:\>copy foo bar
    The system cannot find the file specified.

    H:\>echo %ERRORLEVEL%
    1

    H:\>if %ERRORLEVEL% NEQ 0 echo Errorlevel is set
    Errorlevel is set
Hmm. Everything worked as expected. Why didn’t it work in deploy.bat? Next, I changed deploy.bat to output the exit code:
    rem Deploy executable to release folder.
    copy foo bar
    echo %ERRORLEVEL%
    if %ERRORLEVEL% NEQ 0 (
        echo Errorlevel not equal to 0.
        exit /b 1
    )
And tried again:
    The system cannot find the file specified
    0
What? The copy command failed and yet the exit code was zero? How can this be? After some head scratching, I vaguely remembered that there was another (arcane) way of checking the exit code, namely ‘if ERRORLEVEL n’ (without the percent signs), so I tried it out:
    rem Deploy executable to release folder.
    copy foo bar
    echo %ERRORLEVEL%
    if ERRORLEVEL 1 (
        echo Errorlevel not equal to 0.
        exit /b 1
    )
I never really liked this style of checking the exit code, because ‘ERRORLEVEL n’ doesn’t actually test whether the last exit code was n; rather, it checks whether the last exit code was at least n. Thus, this statement
    if ERRORLEVEL 0 (
        echo Hi
    )
doesn’t check if the exit code is zero (i.e., no error occurred). What it really does is check if the exit code is greater than or equal to zero, which is more or less always true, no matter the value of the exit code. That’s pretty confusing, if you ask me.
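For readers who want to test for success or for one exact value with this syntax, the "at least n" semantics can be turned around with ‘not’ and with a pair of nested checks. The following is a minimal sketch rather than anything from deploy.bat; the ‘cmd /c exit 3’ line is only an assumption made here to produce a known exit code.

```batch
@echo off
rem Run a command that succeeds, so the exit code is 0.
ver > nul

rem "if not ERRORLEVEL 1" is true only when the exit code is below 1,
rem which for the usual non-negative exit codes means exactly 0.
if not ERRORLEVEL 1 echo Last command succeeded.

rem Produce a known exit code of 3 (illustration only).
cmd /c exit 3

rem Exactly n means: at least n, but not at least n+1.
if ERRORLEVEL 3 if not ERRORLEVEL 4 echo Exit code was exactly 3.
```

Run as a script, both echo lines should fire. Note that a negative exit code would also pass the ‘if not ERRORLEVEL 1’ test, so the idiom is only safe for programs that report errors with positive values.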
Anyway, for some reason, it seemed to work nicely in deploy.bat:
    The system cannot find the file specified
    0
    Errorlevel not equal to 0.
I could hardly believe my eyes. The copy command obviously failed, %ERRORLEVEL% was obviously zero, and yet the if statement detected a non-zero exit code. What was going on? I delved deeply into the documentation of the DOS batch language. After some searching I found this paragraph:
%ERRORLEVEL% will expand into a string representation of the current value of ERRORLEVEL, provided that there is not already an environment variable with the name ERRORLEVEL, in which case you will get its value instead.
Whoa! There are two kinds of ERRORLEVEL, who knew? One is the internal exit code that ‘if ERRORLEVEL n’ tests. The other (the one whose value you can query with %ERRORLEVEL%) expands to that internal exit code only if there is no environment variable named ERRORLEVEL already. Now I had a suspicion what was going on. I opened the parent batch file and came across the following:
    rem Reset error level.
    set ERRORLEVEL=0
In an attempt to clear the error level, some unlucky developer introduced an environment variable named ERRORLEVEL, which shadowed the real exit code in every %ERRORLEVEL% expansion from this point on. This can be easily verified:
    H:\>dir does_not_exist
    The system cannot find the file specified

    H:\>echo %ERRORLEVEL%
    1

    H:\>set ERRORLEVEL=0

    H:\>echo %ERRORLEVEL%
    0

    H:\>dir does_not_exist
    The system cannot find the file specified

    H:\>echo %ERRORLEVEL%
    0
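The same experiment can be extended to show that checks which never expand %ERRORLEVEL% are not fooled by the shadowing variable. This is only a sketch: the ‘cmd /c exit 1’ calls are an assumption used here to produce a failing exit code on demand, and ‘setlocal’ keeps the deliberate shadowing confined to the script.

```batch
@echo off
setlocal

rem Shadow the dynamic value on purpose, as the parent script did.
set ERRORLEVEL=0

rem Produce a known non-zero exit code (illustration only).
cmd /c exit 1

rem The expansion now reads the environment variable, not the exit code.
echo Variable expands to: %ERRORLEVEL%

rem This form tests the real exit code and still detects the failure.
if ERRORLEVEL 1 echo "if ERRORLEVEL 1" still sees the failure.

rem The conditional operator also reacts to the real exit code.
cmd /c exit 1 || echo The "||" operator still sees the failure.

endlocal
```

The first echo should print 0 while the two checks below it still report the failure, which matches what happened in deploy.bat.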
Once the problem was understood, it was easy to fix: clear the error level in an “accepted way” (yuck, again!) instead of wrongly tying it to zero:
    H:\>dir does_not_exist
    The system cannot find the file specified

    H:\>echo %ERRORLEVEL%
    1

    H:\>ver > nul

    H:\>echo %ERRORLEVEL%
    0
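If a script cannot be sure that nobody upstream has already defined such a variable, it can delete the variable before resetting the exit code. A minimal sketch, assuming it runs at the top of a batch file that wants a clean slate:

```batch
@echo off
rem Delete any environment variable named ERRORLEVEL, so that
rem %ERRORLEVEL% expands to the real exit code again.
set "ERRORLEVEL="

rem Then reset the exit code itself by running a command that succeeds.
ver > nul

rem With the shadowing variable gone, this shows the real exit code.
echo %ERRORLEVEL%
```

The order matters here: the reset comes last, so that whatever exit code the ‘set’ command leaves behind is overwritten by the successful ‘ver’.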
Even though the interface to DOS exit codes is dangerously confusing (and disgusting as well), it facilitates a nice practical joke: next time a colleague leaves the room without locking the screen, open Windows control panel, create a new global environment variable called ERRORLEVEL and set it to 0.