I use POPFile Bayesian filtering to keep email spam at bay. With a little training, this works amazingly well-- I'm at 99.8% accuracy, and that's after a little over a month of "training" precipitated by a recent server migration. But Bayesian filtering has one big weakness that I'm seeing more and more: spoofed emails.
How many people send from mail servers other than the name they’re actually claiming nowadays? I know for a long while I was using name@pritchetts.us but sending from mail.comcast.net. If most people don’t do that, maybe blacklist mail whose sender address doesn’t match the sending server? I don’t have much spam blocking experience (I just stick with whatever Thunderbird does for me), but I did have to run a spam filter on a Win2k3 server for a few months. I used GFI MailEssentials, which has several different forms of filtering that work independently:
Blacklists, whitelists, Bayesian, keyword, and some other stuff I don’t remember.
The upside was that I could pretty easily add and remove keywords (“prozac”, for example) in the word filter and let the Bayesian filter take care of the rest. One of my favorite features was the auto whitelist, which whitelisted anyone I ever sent an outgoing mail to.
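To make the layering concrete, here is a minimal sketch of how independent whitelist, blacklist, and keyword stages might sit in front of a Bayesian stage. This is not how GFI MailEssentials is actually implemented; the class name `LayeredFilter`, its methods, and the `bayes_score` parameter (a stand-in for whatever probability a Bayesian filter such as POPFile would produce) are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class LayeredFilter:
    """Hypothetical layered spam filter: cheap list checks first, Bayesian last."""
    whitelist: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)
    keywords: set = field(default_factory=lambda: {"prozac"})

    def record_outgoing(self, recipient: str) -> None:
        # Auto whitelist: anyone we send mail to becomes a trusted sender.
        self.whitelist.add(recipient.lower())

    def classify(self, sender: str, body: str, bayes_score: float = 0.0) -> str:
        sender = sender.lower()
        if sender in self.whitelist:
            return "ham"
        if sender in self.blacklist:
            return "spam"
        # Keyword stage: any listed word in the body marks the mail as spam.
        words = set(re.findall(r"[a-z0-9']+", body.lower()))
        if words & self.keywords:
            return "spam"
        # Fall through to the (assumed) Bayesian score for everything else.
        return "spam" if bayes_score >= 0.5 else "ham"


f = LayeredFilter()
f.record_outgoing("friend@example.com")
print(f.classify("friend@example.com", "cheap prozac"))    # whitelist wins
print(f.classify("stranger@example.com", "cheap prozac"))  # keyword stage
```

Note that the whitelist stage trusts the claimed sender address, so in this sketch a spoofed address from someone on the whitelist would sail straight through.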
About web services (eBay etc.) not sending email, I don’t know if the world is ready for that yet. It’s still the only ubiquitous form of net communication that a lot of people are willing to give out their connection details for. What’s next, instant messenger?
Sorry for not bothering to read your POPFile entry until after I commented. It looks like you’re on the right track in using multiple filtering tools to get from 98% to 100%. I certainly can’t think of anything better at the moment.
“How many people send from mail servers other than the name they’re actually claiming nowadays?”
That is the other way to attack the problem: actually validate the identity of the sender (or at least the server sending the email). There have been some baby steps in this direction from Yahoo and Hotmail, but I’m not sure if anything substantive has come of them yet.
It’s definitely a good idea, but the architecture of POP3/SMTP isn’t built around identity or even security-- so it’s hard to retrofit.
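The naive version of the check suggested above can be sketched in a few lines. This assumes the receiving side already knows the hostname of the server that delivered the message (passed in here as `server_host`); both function names are hypothetical, and this is not what Yahoo or Hotmail actually deployed.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_domain(raw_message: str) -> str:
    """Return the domain part of the From: header, lowercased ('' if absent)."""
    msg = message_from_string(raw_message)
    _name, addr = parseaddr(msg.get("From", ""))
    return addr.rpartition("@")[2].lower()


def server_matches_sender(raw_message: str, server_host: str) -> bool:
    """True if the delivering host is the sender's domain or a subdomain of it."""
    domain = sender_domain(raw_message)
    host = server_host.lower().rstrip(".")
    return bool(domain) and (host == domain or host.endswith("." + domain))


raw = "From: Name <name@pritchetts.us>\nSubject: hello\n\nJust saying hi.\n"
print(server_matches_sender(raw, "mail.comcast.net"))
print(server_matches_sender(raw, "mail.pritchetts.us"))
```

The first call is exactly the legitimate setup described earlier (a pritchetts.us address sent through mail.comcast.net), and it fails the check. That is the retrofit problem in miniature: a simple name comparison cannot tell a relayed-but-honest sender from a spoofer.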
Actually, the logical choice would be for them to offer personalized RSS / ATOM / RDF feeds for their users. So I can get one feed from them that has all their general news (customized by me) and all news specific to me. Throw in some HTTPS if they want, and I know I am getting the straight dope from the horse’s mouth.
Both spoof@paypal.com and spoof@ebay.com work fine. I always report PayPal and eBay spoof messages. Note, though, that spoof@ebay.com won’t accept an attached message - you have to forward the message to them.
Gmail actually has a decent spam filter, and it also has good spoof e-mail detection. I got a phishing e-mail the other day that spoofed the address service@paypal.com, and Gmail showed a bright red message at the top that said the following:
“Warning: This message may not be from whom it claims to be. Beware of following any links in it or of providing the sender with any personal information.”
Now admittedly, anybody in our line of business should know immediately whether an e-mail is legitimate or not, but it’s still a good thing for the less technical people using e-mail.
On an amusing note, the e-mail I’m referring to was the worst attempt at phishing I’ve ever seen. Check out this snippet from the body of the e-mail:
“U need to update ur account once again, u forgot fill in ATM PIN at from update, come to link below and do it.”
I mean, if you’re going to do something, at least try a little harder.