Showing posts with label popfile. Show all posts

2024-11-15

Personalized voice recordings by Elwood "You've got mail!" Edwards

If you're old enough to have ever used, seen or overheard the once ubiquitous AOL software, you'll have heard the voice of Elwood Edwards. Millions knew his voice from the AOL software, where it said "You've got mail!", "Welcome", "File's done" and "Goodbye". He died last week, which reminded me of the time I paid him to record a customized greeting for me.

Here's Elwood Edwards explaining how that came about:


Back in the early 2000s I was working on a machine learning email filtering program called POPFile, and I discovered that Elwood Edwards had a web site, makinwavs.com, where you could order custom messages recorded by him in that "AOL voice".


So, I ordered "Mail classified by POPFile" and "Use the source, Luke!" from him for a total cost of $30. Here's the original PayPal receipt:

    Date: Wed, 13 Nov 2002 11:54:25 -0800
    From: [email protected]
    Subject: Receipt for your Payment

    This email confirms that you have paid EVO, Inc. $30.00 
    using PayPal.
    This credit card transaction will appear on your bill as  
    "PAYPAL*EVO INC".

    SHOPPING CART CONTENTS:

    1.

      Item Name:   Custom set of 5 non-AOL  .WAV Files
      Option 1:    Enter your 5 scripts:: "Use the source, Luke!", 
      "Mail classified by POPFile"
      Option 2:    (If needed, enter more info):: (Yep, I know it's 
      only two, but that's all I need...)
      Item Number: EVO-4
      Item Amount: $30.00
      Quantity:    1
         Total:    $30.00

         Cart Subtotal:  $30.00
              Handling:  $0.00
              Shipping:  
    Sales Tax (0.000%):  $0.00
            Cart Total:  $30.00

    ------------------------------
    Payment Details:
    ------------------------------

    Amount: $30.00
    Buyer: John Graham-Cumming

    ------------------------------
    Business Information:
    ------------------------------

    Business: EVO, Inc.
    Contact E-Mail: [email protected]

Three days later I had three voice files (the two I'd ordered plus he threw in "You've got mail, John!" for free).

    From: [email protected]
    Date: Sat, 16 Nov 2002 22:16:04 -0500
    Subject: Your files :)

    Hi, John.

    Thanks for your order.  Here are your files... and I included a 
    "You've got mail, John" file, too.  Enjoy!!

    El Edwards

You can hear all three original files below.

2023-07-05

How to beat an adaptive/Bayesian spam filter (2004)

That was the title of my talk at the 2004 MIT Spam Conference on January 16, 2004. As I recently recovered the slides, I am creating this blog post for posterity.

The core of the talk was that it was possible to take one machine learning spam filter and use a second, identical one to learn the characteristics of the first. That way one machine learning system would fight spam and the other would automatically identify the first one's weaknesses. Thus a machine learning algorithm could learn how to write spam that would get through a tuned machine learning spam filter. This is now referred to as "Adversarial Machine Learning".
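As a rough illustration of that idea (not the method from the talk), here is a minimal Python sketch in which an attacker probes a filter and learns which added words help a message get through. Everything in it is an assumption made for the example: the target filter is a hypothetical fixed word-weight scorer standing in for a trained filter, and the attacker uses simple smoothed pass-rate counts rather than a second machine learning filter.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stand-in for the filter under attack. A real target would be a
# trained statistical filter; these weights and the threshold are made up.
TARGET_WEIGHTS = {
    "viagra": 3.0, "free": 2.0,                      # spammy words
    "meeting": -2.0, "budget": -2.0, "agenda": -1.5,  # hammy words
}
SPAM_THRESHOLD = 1.2


def target_filter_blocks(message):
    """Return True if the (toy) target filter would block the message."""
    score = sum(TARGET_WEIGHTS.get(word, 0.0) for word in message.lower().split())
    return score > SPAM_THRESHOLD


def learn_evasion_words(base_spam, candidates, words_per_probe=2):
    """Probe the target with base_spam plus extra words and rank the words.

    The attacker only observes blocked/passed for each probe, then ranks each
    candidate word by a Laplace-smoothed estimate of how often probes
    containing it got through.
    """
    passed = Counter()
    tried = Counter()
    for extra in combinations(candidates, words_per_probe):
        probe = base_spam + " " + " ".join(extra)
        got_through = not target_filter_blocks(probe)
        for word in extra:
            tried[word] += 1
            if got_through:
                passed[word] += 1
    rate = {w: (passed[w] + 1) / (tried[w] + 2) for w in candidates}
    return sorted(candidates, key=lambda w: rate[w], reverse=True)


ranking = learn_evasion_words(
    "free viagra", ["meeting", "budget", "agenda", "lottery", "holiday"]
)
print(ranking[:2])
```

In this toy setup the attacker never sees the target's weights, only whether each probe was blocked, which is the kind of feedback loop the talk described exploiting.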

The talk also pointed out that spammers were trying a technique dubbed "Word Salad": including random words in a message to try to evade filtering.

Slides are here as a PDF and embedded below as images.

2023-05-24

Bringing the POPFile web site back from the dead

Over 20 years ago I wrote some code to scratch an itch. The specific itch was that I was getting a lot of email and I wanted it to be automatically sorted into folders. At work we were using Microsoft Outlook and I figured there had to be a way to do this with machine learning.

I got into a discussion with Brent Welch at work and he pointed me to an extension for exmh called ifile. The chap who wrote ifile, Jason Rennie, had also written a paper about it and I read that. It describes using naive Bayesian text classification to sort email. Just what I was looking for. Except I was using Microsoft Outlook.
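For readers who haven't seen naive Bayesian text classification, here is a minimal Python sketch of the general technique applied to sorting mail into folders. It is a toy under stated assumptions, not the code from ifile, AutoFile or POPFile: the class name, tokenizer, add-one smoothing and folder names are all chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesSorter:
    """Toy multinomial naive Bayes classifier that sorts text into folders."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # folder -> word -> count
        self.message_counts = Counter()          # folder -> messages seen
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Deliberately simple tokenizer: lowercase words and numbers only.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, folder, text):
        words = self.tokenize(text)
        self.word_counts[folder].update(words)
        self.message_counts[folder] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        if not self.message_counts:
            raise ValueError("classifier has not been trained")
        total_messages = sum(self.message_counts.values())
        vocab_size = len(self.vocabulary)
        words = self.tokenize(text)
        best_folder, best_score = None, -math.inf
        for folder, seen in self.message_counts.items():
            # log P(folder) + sum of log P(word | folder), add-one smoothed.
            score = math.log(seen / total_messages)
            folder_total = sum(self.word_counts[folder].values())
            for word in words:
                count = self.word_counts[folder][word]
                score += math.log((count + 1) / (folder_total + vocab_size))
            if score > best_score:
                best_folder, best_score = folder, score
        return best_folder


sorter = NaiveBayesSorter()
sorter.train("work", "meeting agenda for the quarterly budget review")
sorter.train("work", "please review the project schedule before the meeting")
sorter.train("personal", "dinner on saturday with the family")
sorter.train("personal", "photos from the family holiday")
print(sorter.classify("budget meeting schedule"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together, and the smoothing keeps a single unseen word from ruling out a folder entirely.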

So, in around 2000 I wrote a Visual Basic extension for Microsoft Outlook that did exactly the same sort of classification and automatically learned the right categories by observing the folder structure and when mail was moved from one folder to another. The user literally did nothing but sort out mail that wasn't in the right spot.

This was... AutoFile. The Visual Basic code wrapped the bow toolkit from CMU. I've made the code (from a 2002 attempt to make this into a shareware program) available here.

But AutoFile was Microsoft Outlook only and relied on someone else's machine learning toolkit. To make things really easy to use I created POPFile in 2001. POPFile intercepted a mail program downloading mail via POP3 (and later IMAP) and performed classification. It added a header or altered the subject line and so it was possible to use simple filters to get mail sorted into folders automatically.
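The tagging step can be sketched in Python with the standard library's email package. This is only an illustration of the idea, not POPFile's Perl code: the function name, the header name `X-Text-Classification` and the bracketed subject prefix are assumptions made for the example, and the classification result is simply passed in as a string.

```python
from email import message_from_string


def tag_message(raw_message, bucket, modify_subject=False):
    """Return the message text with its classification recorded in the headers.

    The header name and the "[bucket] " subject prefix are illustrative
    assumptions, chosen so a mail client filter could match on either one.
    """
    msg = message_from_string(raw_message)

    # Remove any earlier tag (deleting a missing header is a no-op),
    # then record the new classification.
    del msg["X-Text-Classification"]
    msg["X-Text-Classification"] = bucket

    if modify_subject:
        subject = msg.get("Subject", "")
        del msg["Subject"]
        msg["Subject"] = f"[{bucket}] {subject}"

    return msg.as_string()


raw = (
    "From: alice@example.com\n"
    "Subject: Quarterly budget\n"
    "\n"
    "See attached.\n"
)
print(tag_message(raw, "work", modify_subject=True))
```

A mail client rule such as "if the classification header equals work, move to the Work folder" is then all that is needed on the client side.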

To make it possible to reclassify badly sorted mail I built a web-based user interface (which was somewhat of a novelty in 2001!) which looked like this:

And I decided to make it open source (I was active in Steve Gibson's newsgroups and a lot of early POPFile testers were from there), and forget about shareware.

A whole community developed around POPFile with people writing documentation (one of whom was... a very young Tom Scott!), building installers, translating it into multiple languages, and running a dedicated website for it.


For a while POPFile was in the news and in 2005 won a Jolt Productivity Award.

That website was run by a volunteer for years and years until his server died a couple of years ago. He kindly gave me a backup and I've brought it back from the dead. It was based on ancient versions of DokuWiki and Trac (thankfully both upgraded smoothly), Subversion, Apache and Python 2.

Thanks to the efforts of the DokuWiki and Trac teams I could jump from this ancient version to the latest with only minor troubles (e.g. the latest Trac uses SQLite 3 and so a manual SQLite 2 to 3 upgrade was needed). I ditched Subversion and opted to put the code on GitHub (it's here with the full version history back to 2002).


Unfortunately, I haven't been able to bring back to life the discussion forums as the Trac plugin used is now no longer maintained.


The website is running on a dedicated server connected to Cloudflare with Cloudflare Tunnel. And here's where the past meets the present. Because of POPFile I got invited to something called the MIT Spam Conference run by some LISP hacker called Paul Graham who had his own Plan for Spam.

The opening of my presentation the second year I was invited started with a custom version of All your base are belong to us. Sadly, the videotape on which it was recorded has been lost, but a silent version is here.

(Now Cloudflare CEO) Matthew Prince was also invited one year. And that's how we met and that's how I ended up at Cloudflare many, many moons later.


The website is a historical archive at this point. https://getpopfile.org/ for the curious. I'm sure it's full of brokenness but most of it seems to work. If you've ever felt the urge to read a very large Perl program... start here. POPFile is 30,000 lines of Perl.

And a big thank you to everyone who contributed to POPFile over many years. It was a joy to work on and you all contributed to its success. And, hey, a tech billionaire once sent me $20 as a thank you for "saving his personal email from spam".