“…that we all feared might happen someday…”? Where has this chick been?
ANYWAY…..
Whenever I hear about an email worm going around and infecting people left and right, I kind of chuckle to myself. These are absurdly easy to block, yet no one seems to do it. I’m in charge of all the network operations at the company that I work for and it’s a relatively small company, yet we’ve never been hit by any of the major email worms that have surfaced over the years. Why?
Well, it’s simple.. just like spam, there are certain characteristics that are static across all of the emails that are being generated. Yes, sometimes they’re more difficult to pinpoint than others, but usually (as is the case here), it’s trivial.
From all of the samples that I’ve seen, we’re dealing with two different subject lines. Obviously, the creators of this worm were not interested in filter evasion, otherwise they would have created an array of thousands of different subject lines and messages. That, in addition to thousands of random file names and websites that are hosting the file, and you’ve got yourself a worm that’s moderately difficult to block. So, you future worm writers of the world, go big or go home.
Anyway, as I’ve mentioned in a previous post, I rely heavily upon Mimedefang to filter email at the gateway. It’s a very straightforward milter for Sendmail that allows you to write custom filters in Perl. And, since I’m sure all of you know about my love for Perl, you can see why I’m immediately drawn to this.
If you’ve never played with Mimedefang, I highly recommend you check it out.
So, on to the filtering. My filter file has, well, lots of rules that accomplish a variety of different tasks. In order to filter this one, though, it’s a simple pattern match:
    if (/^.*?Subject\:\s+here you have.*/i || /^.*?Subject\:\s+just for you.*/i) {
        foreach $recip (@Recipients) {
            delete_recipient($recip);
        }
        add_recipient('spam@localhost');
        action_change_header('Subject', "BANNED - VIRUS - $Subject");
    }
Trivial. You place this under the “filter_begin” routine. Also, in order to get access to the headers, there are two files which Mimedefang stores. One is called “COMMANDS” and one is called “INPUTMSG”. You’ll want to open the “INPUTMSG” file in order to parse the subject line.
Example:
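A minimal sketch of that step, written in Python rather than the Perl a real Mimedefang filter would use, just to show the logic: read the header block of the raw message (everything up to the first blank line), pull out the Subject line, and test it against the two subject strings from this worm. The function names and the default path argument are assumptions for illustration, not part of Mimedefang.

```python
import re

# The two subject lines seen in the samples, matched at the start of the value.
WORM_SUBJECT = re.compile(r"^(here you have|just for you)", re.IGNORECASE)


def read_subject(lines):
    """Return the Subject header value from raw message lines, or None."""
    subject = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            break  # a blank line ends the header block
        if subject is not None:
            if line.startswith((" ", "\t")):
                subject += " " + line.strip()  # folded continuation line
                continue
            break  # the next header has started, so the subject is complete
        if line.lower().startswith("subject:"):
            subject = line[len("subject:"):].strip()
    return subject


def is_worm_subject(subject):
    """True when the subject starts with one of the known worm strings."""
    return subject is not None and WORM_SUBJECT.search(subject) is not None


def check_message_file(path="INPUTMSG"):
    """Open a raw message file (hypothetical helper) and test its subject."""
    with open(path, "r", encoding="latin-1") as handle:
        return is_worm_subject(read_subject(handle))
```

Stopping at the first blank line matters: without it, a line in the body that happens to start with "Subject:" would trigger the rule too.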
Anyway, this is just a simple example of a way to take care of some of these lame worms at the gateway so that your end users never even see them. Mimedefang, like pretty much everything else I use, is free. So, it always gives me a warm fuzzy when large companies (ABC, NASA, etc.) get pwned by some lame worm when they run like, $50,000 email filtering systems, yet my simple little Perl script and Mimedefang somehow have kept us protected.
Some of you are probably thinking, “Yeah, but if you had the amount of emails coming through that those guys do, your Perl script would kill your box”
Trust me, I thought about that ahead of time. The one nice thing about milters, in general, is that they’re written in C and are usually pretty quick and not super resource intensive.. so the idea of running a milter that delegates the task of filtering to a scripting language was definitely a concern. However, on this box, which is a Pentium 2.8GHz dual core (11,205.53 bogomips total), the load average stayed below 2 all the way up to 250 emails a minute. Sure, this is not a solution for the Gmails and Yahoos of the world, but does your company receive 250 emails a minute? Probably not.
Also, I’ve written in some throttling to ensure that it backs off if the load increases too significantly.
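The throttling code itself isn’t shown, so this is only a guess at the general shape, in Python for illustration: look at the 1-minute load average and, above some threshold, answer with a temporary failure so the sending server retries later. The threshold of 4.0, the reply text, and the function names are all assumptions.

```python
import os

MAX_LOAD = 4.0  # assumed threshold; tune it for the box


def should_defer(load_1min, max_load=MAX_LOAD):
    """True when the box is too busy and mail should be temp-failed."""
    return load_1min > max_load


def current_load():
    """1-minute load average as reported by the OS (Unix only)."""
    return os.getloadavg()[0]


def smtp_reply(load_1min):
    """Pick a reply line: 4xx asks the sender to retry, 250 accepts."""
    if should_defer(load_1min):
        return "451 System busy, try again later"
    return "250 OK"
```

In use, something like `smtp_reply(current_load())` would run before any of the expensive filtering starts.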
Anyway, if you have any questions about filtering with Mimedefang, let me know.. I’ve inadvertently become pretty well versed in it.
Great article, my company has also not been hit with this worm due to the filters that I put in place. In your article you state the following “…just like spam, there are certain characteristics that are static across all of the emails that are being generated….” What would those standard characteristics be?
@Pegoto: Well, there’s quite a few of them. Bayesian filtering works on this theory.
A few that I specifically look for are:
1) Country of origin.. Most spam originates from Korea, Russia, China, etc. These would be characteristics… so, I have a filter that bans all emails except for emails from countries that I specifically allow. Similar to when configuring your firewall – default policy should be “deny all”.
2) On that same note, one issue is emails that are sent through Yahoo. Yahoo is a US company, so relying specifically on the domain “yahoo.com” won’t work. However, Yahoo includes a header called “X-Originating-IP”. This is the real IP address of the sender, so I wrote a filter that checks that against my country database. If that’s not on the list of “allowed countries”, straight to the spam filter it goes.
3) Rarely do spammers use legitimate email servers, so the first time a sender emails us, we return a 450 (temporary failure) response. A mail server that follows RFC 821 will queue the mail and attempt delivery again. When it does, we match the sending email address against the one recorded in the database on the first attempt. If it exists, we allow the email to go through and the sender is considered “legitimate” for 30 days. But most spam senders will never attempt to resend when they get the 450 because they don’t know how to handle it properly. This system is called “greylisting”.
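The greylisting in item 3 can be sketched roughly like this, in Python and with an in-memory dictionary standing in for the database. Keying on the sender address and the 30-day window come from the description above; the 60-second minimum retry delay, the class name, and the method names are assumptions.

```python
import time

RETRY_MIN_SECONDS = 60                # assumed: retries sooner than this don't count
WHITELIST_SECONDS = 30 * 24 * 3600    # sender is "legitimate" for 30 days


class Greylist:
    def __init__(self):
        self.first_seen = {}      # sender -> time of the first, deferred attempt
        self.allowed_until = {}   # sender -> when the 30-day pass expires

    def check(self, sender, now=None):
        """Return an SMTP-style code: 250 to accept, 450 to defer."""
        now = time.time() if now is None else now
        sender = sender.lower()
        if self.allowed_until.get(sender, 0) > now:
            return 250            # already proved itself by retrying
        first = self.first_seen.get(sender)
        if first is None:
            self.first_seen[sender] = now
            return 450            # first contact: ask the server to retry
        if now - first < RETRY_MIN_SECONDS:
            return 450            # retried too quickly to count
        del self.first_seen[sender]
        self.allowed_until[sender] = now + WHITELIST_SECONDS
        return 250                # a real retry: let it through
```

Many greylisting setups key on the sending IP and recipient as well as the sender address; this sketch sticks to the sender address because that is what the description uses.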
The route an email takes through our filtering system goes something like this:
Email ----> RBL (DynIP/Zombie/etc) checks ----> Greylisting ----> Bayes ----> Custom filters via Mimedefang ----> Virus scanning ----> Delivery
Essentially, it goes from the least CPU intensive to the most.
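The point of that ordering is that a cheap check can reject a message before the expensive ones ever run. A small Python sketch of the idea follows; the stage names echo the chain above, but the checks are placeholders and the message dictionary is a made-up format.

```python
def run_pipeline(message, stages):
    """Run (name, check) stages in order and stop at the first rejection.

    Returns the name of the rejecting stage, or None if every stage passed.
    """
    for name, check in stages:
        if not check(message):
            return name
    return None


# Placeholder checks, cheapest first. Real ones would query an RBL,
# a greylist database, a Bayes classifier, a virus scanner, and so on.
STAGES = [
    ("rbl", lambda msg: msg["client_ip"] != "192.0.2.1"),
    ("greylist", lambda msg: msg["seen_before"]),
    ("custom", lambda msg: "here you have" not in msg["subject"].lower()),
]

if __name__ == "__main__":
    sample = {
        "client_ip": "198.51.100.7",
        "seen_before": True,
        "subject": "Here you have",
    }
    print(run_pipeline(sample, STAGES))  # prints "custom"
```

Because the loop returns on the first failure, anything caught by the RBL stage never costs a Bayes lookup or a virus scan.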
The success rate of this system has been rather surprising, to be honest. We receive about 1500-2000 emails a day. Of those, about 2% are legitimate. I’ll see, maybe, one false positive a month, and maybe one spam that makes it through a month to one of our users.
I hope this gives you some ideas.. I’ve spent a lot (a LOT) of time working on effectively blocking spam and through my testing/experience/etc. I’ve come up with a solution that has been incredibly effective and I think a lot of other companies could benefit from it.. because, honestly, I have no idea how a company could operate without the ability to effectively filter spam.
How do you “test” the spam filters? Do you have copies of emails that you’ve personally checked and know they’re one or the other and then run them through the chain and see if it identifies them correctly? Or do you do something else?
@Matt: I collect samples from various sources and then just run them through whatever search pattern that I’m using to identify them.
Also, before implementing a blocking rule, I write it as a flagging rule first. So, basically, it flags an email (by adding an additional header) indicating that it would have been marked as spam based on whatever rule I’m testing. If I notice that the header is being added to legitimate emails, I’ve obviously screwed up somewhere.
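Here is roughly what a flag-first rule could look like, sketched in Python. The header name `X-Would-Block`, the `enforce` switch, and the function name are made up for the example; the idea is just that the same rule either tags the message or acts on it.

```python
import re


def apply_rule(headers, body, pattern, rule_name, enforce=False):
    """Return (action, headers). In flag mode the mail is always accepted."""
    if not re.search(pattern, body, re.IGNORECASE):
        return "accept", headers
    if enforce:
        return "quarantine", headers
    flagged = dict(headers)                 # leave the caller's dict untouched
    flagged["X-Would-Block"] = rule_name    # hypothetical header name
    return "accept", flagged
```

Once the flag header has only been showing up on mail that really is spam, flipping `enforce` to `True` turns the flagging rule into a blocking rule without touching the pattern.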
I also manually run known spam through search patterns.
All of the spam that our network receives gets saved in individual files, which I use as samples to test filters against.. they’re raw files, which means I can just copy them into the mail queue and allow sendmail to process them to see if they’ll trigger a filter, or I can just ‘cat file|filter’ to see if whatever pattern matching I’m using hit or not.
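The ‘cat file|filter’ approach scales up to a whole directory of samples. A small Python sketch, assuming one raw message per file; the directory layout and the function name are assumptions.

```python
import re
from pathlib import Path


def match_rate(pattern, sample_dir):
    """Return (hits, total) for the raw message files directly in sample_dir."""
    regex = re.compile(pattern, re.IGNORECASE | re.MULTILINE)
    hits = total = 0
    for path in sorted(Path(sample_dir).iterdir()):
        if not path.is_file():
            continue
        total += 1
        # latin-1 never fails to decode, which suits arbitrary raw mail
        text = path.read_text(encoding="latin-1")
        if regex.search(text):
            hits += 1
    return hits, total
```

Running the same pattern over a directory of known-good mail gives a rough false-positive check: the hit count there should be zero.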
Tada? lol
Ah, I run my own mailserver for my domain, and while I’ve got it mostly secured, I haven’t set up any real spam filtering. It hasn’t become a problem yet since I’m the only one who uses the domain. I’m looking into ways of spam/virus filtering and this post proved very useful.
@Matt: Glad you found it helpful. Definitely play with the Mimedefang stuff.. especially if you’re at all proficient with Perl. And, to be honest, it’s more about regex than it is about Perl itself.
By the way, you and I don’t live too far away from each other.