2024-12-26

The complete text of "All Watched Over by Machines of Loving Grace"

Richard Brautigan's poem "All Watched Over by Machines of Loving Grace" is somewhat well known in tech. circles but I couldn't find a complete PDF of the original 1967 publication of it (and other poems) online. 

The copyright notice in Brautigan's collection reads:

  © Copyright 1967 by Richard Brautigan 

  Permission is granted to reprint
  any of these poems in magazines,
  books and newspapers if they are
  given away free. 

I am interpreting "print" as including the availability of a free PDF and making a scan of the complete book available here. 


2024-12-15

badkeming: a site for kerning so bad it's keming

I made another silly Tumblr (to go along with Movie Code and Low Background Steel). This times it's a Tumblr for all those font disasters and other typographical amusements. Welcome to the wonderful world of badkeming.com:


Submissions gladly accepted.

2024-12-08

"All your base are belong to us" introduction to my 2004 MIT Spam Conference talk

As I've mentioned before my talk at the 2004 MIT Spam Conference was about using one machine learning system to beat another. In that case a spammer using machine learning to beat a machine learning spam filter.

The talk started with a short video playing off the All your base are belong to us meme. I very carefully edited the images to change the text to be about spam and incorporate Paul Graham (who was the instigator of the MIT Spam Conference and an early machine learning spam filter pioneer). 

I thought I'd lost the audio of that introduction and randomly came across it today. So here is the restored introduction to the talk made on a Mac in 2004 (I still haven't found the audio of the actual talk).

2024-11-15

Personalized voice recordings by Elwood "You've got mail!" Edwards

If you're old enough to have ever used, seen or overheard the once ubiquitous AOL software you'll have heard the voice of Elwood Edwards. His voice was known to millions for saying "You've got mail!", "Welcome", "File's done" and "Goodbye" when using the AOL software. He died last week which reminded me of the time I paid him to record a customized greeting for me.

Here's Elwood Edwards explaining how that came about:


Back in the early 2000s I was working on a machine learning email filtering program called POPFile and I discovered that Elwood Edwards had a web site makinwavs.com where you could order custom messages recorded by him in that "AOL voice".


So, I ordered "Mail classified by POPFile" and "Use the source, Luke!" from him for a total cost of $30. Here's the original PayPal receipt:

    Date: Wed, 13 Nov 2002 11:54:25 -0800
    From: [email protected]
    Subject: Receipt for your Payment

    This email confirms that you have paid EVO, Inc. $30.00 
    using PayPal.
    This credit card transaction will appear on your bill as  
    "PAYPAL*EVO INC".

    SHOPPING CART CONTENTS:

    1.

      Item Name:   Custom set of 5 non-AOL  .WAV Files
      Option 1:    Enter your 5 scripts:: "Use the source, Luke!", 
      "Mail classified by POPFile"
      Option 2:    (If needed, enter more info):: (Yep, I know it's 
      only two, but that's all I need...)
      Item Number: EVO-4
      Item Amount: $30.00
      Quantity:    1
         Total:    $30.00

         Cart Subtotal:  $30.00
              Handling:  $0.00
              Shipping:  
    Sales Tax (0.000%):  $0.00
            Cart Total:  $30.00

    ------------------------------
    Payment Details:
    ------------------------------

    Amount: $30.00
    Buyer: John Graham-Cumming

    ------------------------------
    Business Information:
    ------------------------------

    Business: EVO, Inc.
    Contact E-Mail: [email protected]

Three days later I had three voice files (the two I'd ordered plus he threw in "You've got mail, John!" for free).

    From: [email protected]
    Date: Sat, 16 Nov 2002 22:16:04 -0500
    Subject: Your files :)

    Hi, John.

    Thanks for your order.  Here are your files... and I included a 
    "You've got mail, John" file, too.  Enjoy!!

    El Edwards

You can hear all three original files below.

2024-10-08

Rabbit hole: stumbling across two Portuguese punched cards

Look, here's the thing, I don't like mysteries that involve technology. I think I hate them because I know that some other human created the thing I'm staring at and, dammit, I should be able to understand it. 

And, then one day I was browsing in one of those places where people leave books for others to take. I flipped through an old Portuguese book about computers published in the 1970s and two punched cards fell out.

Nerd. Sniped. By. Gravity.

One punched card contains the data M000015 JOAO A. FERNANDES 150000190 and the other 1000015 100170476. OK, so 000015 is probably some sort of identifier (employee ID?) but the other numbers are a mystery. Could they be Portuguese tax identification numbers (NIFs)? Those start with 1, 2 or 3 (for individuals) and are nine digits.

Nope. I wrote some code to verify the NIF's check digit and both aren't NIFs. 

use strict;
use warnings;
sub nif_check {
    my ($nif) = @_;
    my $check = 0;
    foreach my $i (reverse 0..7) {
        $check += substr($nif, $i, 1) * (9-$i);
    }
    $check %= 11;
    if ($check < 2) {
        $check = 0;
    } else {
        $check = 11 - $check;
    }

    print "$nif $check\n";
}

nif_check($ARGV[0]);

$ perl nif.pl 100170476
100170476 1
$ perl nif.pl 150000190
150000190 7

So what's encoded on these cards? And does 000015 mean they are related?

I found the answer starting with the fact that these punched cards are custom printed with the letters In Es Me on them. What's In Es Me? It took a while to answer that, but it turns out that In Es Me was a business called "Instituto de Estudos Mecanográficos" based in Lisbon in the late 1960s and early 1970s. 

It appears to have been a school that taught people the exciting new world of computing, which, they claim in their advertisements was "Uma profissão actual e rentavel" (A modern and profitable profession) and also "Uma das profissões de melhor futuro" (One of the professions with a great future).




Great, but apparently no closer to knowing about João A. Fernandes and those numbers. But then I discovered someone selling a book called "Introdução aos Ordenadores" (Introduction to Computers) which was apparently distributed by In Es Me exclusively in Portugal:



So, I bought it. And, inside that book on page 1-31, are the very two cards that had fallen from that other book. So, it looks like the cards were examples (perhaps from the training course) drawn from the book.


The cards in the book are life size:


(Can you spot the printing error in the book?)

The good news is that the previous pages describe the numbers on those cards via the columns.


Card code: column 1 (1)
Employee number: columns 2 to 7
Payment period: columns 33 to 38
Total hours worked: columns: 39 to 41


For the other card...

Card code: column 1 (M)
Employee number: columns 2 to 7
Name: columns: 8 to 32
Hourly salary: 42 to 46
Fixed discounts for the week: columns 47 to 50

So, 000015 is the employee number. 150000190 indicates that João A. Fernandes is paid 15$000 (15 Portuguese escudos) per hour and there's a weekly discount (deductions from the salary) of 0$190. Note that for historical reasons sometimes the escudo is written with a three digit decimal, and because earlier on the book we're introduced to João and his colleagues I know that's the format being used.


The other card is a record of his hours worked for a specific week: 100170476. 10-01-70 is January 10, 1970 and indicates the week for which the record is generated. 476 is 47.6 hours of work.

Mystery solved.

Well, almost, In Es Me didn't write this book. If you look at the title page it was apparently created by "ENIASA" and I've found very little information about who they were. Oh well.

PS Here's a little taste of what the book contains:





2024-09-27

I made a rubbish clock

With separate plastic, metal, paper, and glass recycling, composting, and pick up of all the rest, knowing which days to put out which bins can be complicated. Some have turned to high-tech solutions like Darren Tarbard's wonderful "bindicator".

There's even a commercial version of that.

I thought about making something similar, but decided that a high-tech solution isn't always the right one. Instead I made a clock, or rather a clock face. My rationale was as follows: the bin days don't change frequently so no need to call an API to get them, and anyway most councils don't have an API for this sort of thing. Also, I really didn't want yet another thing with a wall wart, or WiFi to configure, or code to debug (there comes a time in every programmer's life when they can't face debugging yet another thing that should be simple and just work). 

Not everything needs to be Turing Complete! But you can buy cheap clock mechanisms where the hand go full circle in seven days instead of 12 hours. Something like this

So, I wrote a bunch of code to produce an SVG (and PDF and PNG) for the clock face. The code is here. You end up with something like this which can be printed.



The colours and wording are all determined by the schd and bins variables. The code should be easy to customize for your location's schedule. It supports up to two different bin types per day. If you need more you'll have to modify the code.

    bins = {
        "none": ["white", ""],
        "lixo": ["#72859E", "Lixo"],
        "papl": ["#255FC9", "Papel"],
        "embl": ["#DED044", "Embalagens"],
    }

    days = ["Sun", "Mon", "Tues", "Wednes", "Thurs", "Fri", "Satur"]
    schd = [["none"],
            ["lixo"],
            ["papl", "embl"],
            ["lixo"],
            ["papl"],
            ["lixo", "embl"],
            ["papl"]]

There's no need for you to use Portuguese; I have because I'm dealing with Portuguese rubbish collection.

The only interesting piece of code is the generation of the curves for the seven daily segments (and also for the curved wording). This is done by the function path:

    def getXY(p, r, s):
        a = 2.0 * math.pi * (p * pps + rot)
        return s % (r * math.cos(a),
                    r * math.sin(a))

    def path(p, r):
        s = " %.2f %.2f "
        pa = "M"
        pa += getXY(p, r, s)
        pa += "A %.2f %.2f 0 0 1" % (r, r)
        pa += getXY((p+1), r, s)
        return pa

In SVG you can define a path element which can be used to make all sorts of curves, arcs, and lines. My path function takes two parameters: p is a number between zero and six representing the seven daily segments needed on the clock; r is the radius of the arc. The function uses getXY to find the position of a point on the arc. It does this twice to find the start and end point. 

So, path ends up returning something like M x0 y0 A r r 0 0 1 x1 y1 where x0, y0, x1, y1 are the calculated end points of the arc and r is the radius of the arc. The M x0 y0 means "move to (x0, y0)" The A r r 0 0 1 x1 y1 means "draw an arc of radius r to the point (x1, y1)". The 0 0 1 in the middle correspond to three parameters: x-axis-rotation, large-arc-flag, and sweep-flag; it's best to refer to the documentation for those because they determine the direction in which the arc points and whether it is rotated.

The final part of the path SVG element, L 0 0, is added elsewhere in the code. It sets the centre of the arc at (0, 0). I used the viewBox attribute of the SVG to make the SVG geometry be between (-1, -1) and (1, 1) thus making the centre of the image (0, 0).  

    print('<svg xmlns="http://www.w3.org/2000/svg" viewBox="-1 -1 2 2" width="%s" height="%s">' % (wh, wh))

Armed with a suitable printed clock face, I got a simple square picture frame, a few bits of wood for spacers, and put it all together into a "rubbish clock".






2024-09-07

Cracking an old ZIP file to help open source the ANC's "Operation Vula" secret crypto code

It's not often that you find yourself staring at code that few people have ever seen, code that was an important part in bringing down the apartheid system in South Africa, and code that was used for secure communication using one-time pads smuggled into South Africa by a flight attendant on floppy disks. But I found myself doing that one morning recently after having helped decrypt a 30 year old PKZIP file whose password had long been forgotten.

Some time ago I became interested in the secure communications used by the ANC as part of Operation Vula in the late 1980s. Operation Vula was the infiltration of ANC leaders (and materiel) into South Africa to set up an underground network bringing together the various elements of ANC activism operating inside the country. 

The operation needed secure communications to be successful and these were established using 8-bit computers, DTMF tones, acoustic couplers and a variety of other equipment for exchanging one-time pad encrypted messages using programs written in PowerBASIC.

I won't go into the detail of how this worked as Tim Jenkin, the person primarily responsible for the encryption system, has now open sourced the original code, and which can be found here. Tim's write up on the encryption system can be found here. I thoroughly recommend reading it for the details.

The code hadn't been open sourced before for one simple reason: on leaving the UK for South Africa in 1991 he had zipped up all the source code and set a password on it. In the intervening years he'd simply forgotten the password! So, when I emailed him to ask if it had been open sourced he replied:

I still have the Vula source code but unfortunately most of it I can't access because when I left the UK in 1991 to return to South Africa, I zipped up all the files with a password. I was able to decode and extract one of the files but it was a very early version of the software. The rest I couldn't extract because I forgot the password. When I got back to SA there was no need to access the code. I thought I would never forget the password but when I tried to decode it a few years later, I couldn't remember it.

If you could find out how to decode zipped files that would release the software, which I would be more than happy to share. I have made a few attempts to crack the code but so far have been unsuccessful.

I readily agreed and he sent me two files: ALLBAS.ZIP and CODMAY93.ZIP. These were both created with an early version of PKZIP and had passwords set on them. Luckily, there is a known plain text attack against the ZipCrypto scheme used in that era's ZIP format. And an open source implementation of the attack called bkcrack.

So, it was a "simple matter" of predicting 12 bytes of plain text at a known location inside the ZIP file. Here's a sample of what's inside the ZIP file:

$ bkcrack -L ALLBAS.ZIP | head -n 20

bkcrack 1.7.0 - 2024-05-26
Archive: ALLBAS.ZIP
Index Encryption Compression CRC32    Uncompressed  Packed size Name
----- ---------- ----------- -------- ------------ ------------ ----------------
    0 ZipCrypto  Shrink      b0f86b1d          163          117 A1PSW.BAS
    1 ZipCrypto  Shrink      8fa662d4          163          118 A2PSW.BAS
    2 ZipCrypto  Shrink      0c5a7295          163          119 A3PSW.BAS
    3 ZipCrypto  Shrink      49907f86          179          125 A4PSW.BAS
    4 ZipCrypto  Shrink      3d20eb7a          163          120 A5PSW.BAS
    5 ZipCrypto  Shrink      f8b558f0          136          128 BIOS.INC
    6 ZipCrypto  Implode     799074ed          377          278 CHKERR.INC
    7 ZipCrypto  Implode     c44ea0a5        17906         5401 CODSUBS.INC
    8 ZipCrypto  Implode     7bd7e23d        27287         8297 COMAID.BAS
    9 ZipCrypto  Implode     03dc63da         2109         1001 COMKEY.BAS
   10 ZipCrypto  Store       3500d320         2372         2384 CONFIG.TIM
   11 ZipCrypto  Shrink      35a85089          147          111 CONPSW.BAS
   12 ZipCrypto  Implode     55be75ce         2094          825 DOS.INC
   13 ZipCrypto  Shrink      3387d043          134          127 DOSVER.INC
   14 ZipCrypto  Implode     28a32efa         1304          535 DOSX.INC
   15 ZipCrypto  Implode     6578a66c         3196          966 EDDY.BAS

Tim had some unencrypted .BAS files lying around but they were different versions than those in the file and trying the bkcrack attack using them (after running them through original PKZIP in DOSBox) was unsuccessful and I decided to apply some brain power before attempting further attacks.

ALLBAS.ZIP contained a number of files that were uncompressed, because they were already binary and not worth compressing. These files are marked as Store:

$ bkcrack -L ALLBAS.ZIP | grep Store
   10 ZipCrypto  Store       3500d320         2372         2384 CONFIG.TIM
   23 ZipCrypto  Store       14a285ac            2           14 KEYCOD.EXE
   25 ZipCrypto  Store       d6343ce1         4767         4779 KEYONE.ZIP
   26 ZipCrypto  Store       650778b7         6523         6535 KEYTHREE.ZIP
   30 ZipCrypto  Store       12a711cd        58172        58184 OLDCOD.ZIP
   41 ZipCrypto  Store       00000000            0           12 TAPCOD.EXE
   44 ZipCrypto  Store       55000714        12716        12728 TECOD5.ZIP
   45 ZipCrypto  Store       f4f4366c         9230         9242 TECOD6.ZIP

Files that are Store'd are fruitful for plaintext prediction because they have not undergone compression and there's no need to have the original file to compress in order to obtain plaintext. Focussing on the ZIP files, since the ZIP files start with a PK header, seemed like a good place to find predictable plaintext at a known position. Here are the fields in the standard PK header at the very start of a ZIP file:


A viable attack appeared to be to predict the name of the first file in the archive. If the file name was at least 8 characters (which seemed pretty easy since at least four characters were used for .BAS, .INC etc.) then at least 12 characters of plaintext would be available when the file name size (offset 0x1A, 0x1B) and the length of the extra field (which appeared to be 0x00, 0x00 in all the ZIPs Tim sent) was added.

In the worst case, it would be possible to bruteforce the potential names of files given that they all appear to be uppercase/number combinations with a maximum length of eight characters plus extension. But that turned out not to be necessary.

Happily, Tim had a different version of OLDCOD.ZIP (one of the ZIP files inside ALLBAS.ZIP) and was able to tell me that the first file in it was COMKEY.BAS. So, I whipped up a quick Perl program to create the necessary plaintext in that hope that the OLDCOD.ZIP inside ALLBAS.ZIP did start with COMKEY.BAS:

$ cat maken.pl
use strict;
use warnings;

my $outfile = "hexname-$$.txt";

while (<>) {
    chomp;
    my $bas = $_;
    print("$bas / $outfile\n");
    my $n = sprintf("%c\x00\x00\x00$bas",length($bas));
    open G, ">$outfile";
    print G $n;
    close G;
    system("bkcrack -C ALLBAS.ZIP -c OLDCOD.ZIP -p $outfile -o 26 -j 8");
}

23 minutes later bkcrack spit out the key to the ALLBAS.ZIP file and was able to decrypt it. The same key worked for CODMAY93.ZIP also.

$ time echo "COMKEY.BAS" | perl maken.pl

COMKEY.BAS / hexname-41227.txt

bkcrack 1.7.0 - 2024-05-26

[07:49:38] Z reduction using 6 bytes of known plaintext

100.0 % (6 / 6)

[07:49:38] Attack on 925073 Z values at index 33

Keys: 98e0f009 48a0b11a c70f8499

80.6 % (745571 / 925073) 

Found a solution. Stopping.

You may resume the attack with the option: --continue-attack 745571

[18:13:49] Keys

98e0f009 48a0b11a c70f8499


real23m4.371s

user162m3.520s

sys0m37.752s

bkcrack does the decryption once the key has been found:

$ bkcrack -C ALLBAS.ZIP -k 98e0f009 48a0b11a c70f8499 -D ALLBAS-DECRYPTED.ZIP

bkcrack 1.7.0 - 2024-05-26

[07:52:22] Writing decrypted archive ALLBAS-DECRYPTED.ZIP

100.0 % (81 / 81)

$ bkcrack -C CODMAY93.ZIP -k 98e0f009 48a0b11a c70f8499 -D CODMAY93-DECRYPTED.ZIP

bkcrack 1.7.0 - 2024-05-26

[07:58:31] Writing decrypted archive CODMAY93-DECRYPTED.ZIP

100.0 % (40 / 40)


And with that the long encrypted source code used to help set up secure communications for the ANC was available!


Had that failed I was going to attack one of the other ZIP files using the same attack (before resorting to bruteforceing file names). I'd guessed that TECOD5.ZIP was probably a ZIP of just the file TECOD.BAS (or maybe TECOD5.BAS) based on the compressed size of TECOD.BAS in ALLBAS.ZIP). Turns out I wouldn't have had to wait 23 minutes if I'd started there:


$ time echo "TECOD5.BAS" | perl maken.pl

TECOD5.BAS / hexname-41544.txt

bkcrack 1.7.0 - 2024-05-26

[18:14:51] Z reduction using 6 bytes of known plaintext

100.0 % (6 / 6)

[18:14:51] Attack on 880113 Z values at index 33

Keys: 98e0f009 48a0b11a c70f8499

2.4 % (20737 / 880113)

Found a solution. Stopping.

You may resume the attack with the option: --continue-attack 20737

[18:15:29] Keys

98e0f009 48a0b11a c70f8499


real0m38.152s

user4m35.318s

sys0m0.897s


The known plaintext attack against ZipCrypto works quickly with the right plaintext. If you ever have to do something similar it's worth spending time thinking about the plaintext. In particular, files that are Store'd in the ZIP file are useful to examine since they are uncompressed and it may be easier to predict their contents (rather than having to find an original file and compress it to match what's stored in the ZIP).


Running the code


I compiled two of the programs and ran them using DOSBox. The first, RANDOM.BAS, was used to create disks of random numbers to be used as a one time pad, the other, TECOD.BAS, was used to encrypt and decrypt messages sent via email. The code I compiled and the generated executables can be found here.


Compilation is simply running the PowerBASIC compiler as follows:


C:\>EXE\PBC TECOD.BAS PowerBASIC Compiler Version 3.00b Copyright (c) 1989-1993 by Robert S. Zale Spectra Publishing, Sunnyvale, CA, USA C:\TECOD.BAS 2575 statements, 2329 lines Compile time: 00:12.0 Compilation speed: 12600 stmts/minute 45984 bytes code, 4880 bytes data, 2048 bytes stack Segments(1): 46k C:\>EXE\PBC RANDOM.BAS PowerBASIC Compiler Version 3.00b Copyright (c) 1989-1993 by Robert S. Zale Spectra Publishing, Sunnyvale, CA, USA C:\RANDOM.BAS 2194 statements, 1940 lines Compile time: 00:10.1 Compilation speed: 12600 stmts/minute 33328 bytes code, 4704 bytes data, 3072 bytes stack Segments(1): 34k C:\>


The first step is to create random data on disk to be used as a one time pad. RANDOM.EXE uses three different randomness generating algorithms (one based on a random key you type in yourself).



Encryption and decryption is done through TECOD.EXE which is password protected. 


Although the password is embedded in the program and quite simple Tim Jenkin did obfuscate it as follows:

DIM PW$(PL)
PW$(9)=CHR$(66):PW$(4)=CHR$(66):PW$(1)=CHR$(84):PW$(5)=CHR$(79):PW$(2) = CHR$(73)
PW$(3)=CHR$(77):PW$(6)=CHR$(66):PW$(8)=CHR$(77):PW$(10)=CHR$(79):PW$(7)=CHR$(73)

In this particular version of the program that makes the password TIMBOBIMBO which when entered takes you to the main menu. Note that each version of these programs being sent to different members of the ANC had different passwords.

If you're interested in running these programs yourself, the manual is here.

Here are three short videos showing the creation of random data in RANDATA.1 for the key using RANDOM.EXE, followed by encrypting a message stored in PLAIN.TXT on a RAM disk (all crypto operations were meant to happen on a RAM disk) and turning it into PLAIN.BIN (and the reverse).

Creating random data to be used as an encryption key



Encrypting a file


Here the programs (TECOD.EXE/TECOD.CNF) are on a floppy disk in A:, the data disk (containing the key file created above) are in B: and there's a RAM disk on R:. To get this to work the RANDATA.1 file created in the step above needs to be renamed to SNUM. 



Decrypting a file


Here the programs (TECOD.EXE/TECOD.CNF) are on a floppy disk in A:, the data disk (containing the key file created above) are in B: and there's a RAM disk on R:. The RANDATA.1 needs to be called RNUM on B:. 


There are lots of interesting details of how these programs work that deserve another longer blog post when I have time. Or a detailed study by someone else. For example, the key material is destroyed after use, the RANDOM.EXE program has multiple ways of making randomness and code to check the distribution of the random bytes created. There's an emphasis on using the RAM disk for all cryptographic operations. 

2024-09-03

Steve Ballmer's incorrect binary search interview question

In this short video Steve Ballmer talks about a puzzle question he would ask candidates interviewing at Microsoft. Solving it is based on binary search and the expected value.


Here's what he says: "I'm thinking of a number between 1 and 100. You can guess, after each guess I'll tell you whether high or low. You get it the first guess I'll give you five bucks. Four bucks, three, two, one, zero, you pay me a buck, you pay me two, you pay me three". 

The question is "Should you accept to play this game?". In the interview, Ballmer states that the answer is "No" for two reasons: firstly, because he can pick numbers that'll be the most difficult for you to determine, secondly because the expected value of the game (assuming Ballmer chooses randomly) is negative: you end up paying Ballmer.

He's right on the first count. If you follow a binary search strategy (which will be optimal if he's choosing randomly) and he chooses one of 2, 5, 8, 11, 14, 17, 20, 22, 24, 27, 30, 33, 36, 39, 42, 45, 47, 49, 52, 55, 58, 61, 64, 67, 70, 72, 74, 77, 80, 83, 85, 87, 90, 93, 96, 98 or 100 then you owe him $1. For all other numbers you get $0 (if he chose 1, 4, 7, 10, 13, 16, 19, 23, 26, 29, 32, 35, 38, 41, 44, 48, 51, 54, 57, 60, 63, 66, 69, 73, 76, 79, 82, 86, 89, 92, 95 or 99) or a positive outcome (some of his money!).

In the video above Ballmer chooses 59 which a binary search strategy would have found in 5 steps resulting in the interviewer, Emily Chang, winning $1. She was actually pretty close to doing that. The binary search steps would be 50, 75, 62, 56, 59 and she guessed 50, 75, 60, 55, 57, 58, 59. 

On the second count (Baller implies the expected value is negative), if he's choosing randomly, then he's wrong. The expected value is $0.20 (calculated discretely using the code below). The code calculates the number of guesses for each value and an overall expected value assuming Ballmer chooses randomly.

use strict;
use warnings;

my @v = (0, 5, 4, 3, 2, 1, 0, -1, -2);

my $ev = 0;
my $ec = 0;

my @range = (1..100);

foreach my $r (@range) {
    my $l = $range[0];
    my $h = $range[$#range];

    my $s = 0;
    while (1) {
        $s += 1;
        my $g = int(($l + $h)/2);

        if ($r == $g) {
            print "$r found in $s steps (" . dollar($v[$s]) . ")\n";
            $ev += $v[$s];
            $ec += 1;
            last;
        }

        if ($g < $r) {
            $l = $g + 1;
            next;
        }

        if ($g > $r) {
            $h = $g - 1;
            next;
        }
    }
}

$ev /= $ec;
print "Game expected value is " . dollar($ev) . "\n";

sub dollar {

    my ($d) = @_;

    my $f = (int($d) == $d)?'%d':'%.2f';
    return sprintf("%s\$$f", ($d<0)?'-':'', abs($d));
}

This chart shows the expected winnings (or loss) depending on the number Ballmer chooses. The shape of the binary search can be seen in the chart itself.

A different way to think about the expected value and binary search is as follows:

1. On the first guess you choose 50 and win $5 with a probability of 1/100

2. On the second guess you choose 25 or 75 and win $4 with a probability of 2/100

3. On the third guess you choose 12, 37, 62 or 88 and win $3 with a probability of 4/100

4. On the fourth guess you choose 6, 18, 31, 43, 56, 68, 81 or 94 and win $2 with a probability of 8/100

5. And so on.

The gives the expected value as 5 * 1/100 + 4 * 2/100 + 3 * 4/100 + 2 * 8/100 + 1 * 16/100 + 0 * 32/100 + -1 * 37/100 (note the last term is the remaining possible numbers having reached the end of the binary search). That's 0.2.

Why was Ballmer wrong?

One possibility is that he didn't mean to have the $0 for 6 guesses. If he'd said "I'm thinking of a number between 1 and 100. You can guess, after each guess I'll tell you whether high or low. You get it the first guess I'll give you five bucks. Four bucks, three, two, one, you pay me a buck, you pay me two, you pay me three" then the expected value is -$0.49.