Effective Spam Filtering With Eudora

 



EUDORA SPAM FILTER 11
"Click Here!"
 

Catches: "Click" or "Clicking", followed by up to 8 random characters, followed by any number of spaces, followed by one of (Here, On, To, The, Below, Unsub, Http) or the "<" character (which opens an HTML tag).

The single most common phrase in all the spam I receive is the word "click" followed by "here", "on", "to", or "the". Some spammers know this is too easy to filter on, so they go to some pains to change or disguise the words, either visibly or in the source code.

Of the first 1100 pieces of spam I collected, 645 of them (58%) contained the word "click" or "clicking" somewhere in the body of the message in conjunction with "here, on, to, the, below, unsubscribe, HTTP", or the start of an HTML tag (as seen in "view source"). This filter will catch all 645 of those spams.

 

Match: Incoming and Manual
Header «Body»
Verb: matches regexp (case insensitive)
Value: CLICK(ING)?(.?){8} *(HERE|ON|TO|THE|BELOW|UNSUB|HTTP|\<)
Actions: Transfer To Spam.mbx
   Make Label 6
   Make Priority Normal
   Skip Rest

NOTE - There is a space between the {8} and the asterisk (*) in the Value.

Breaking It Down:

CLICK(ING)?(.?){8} *(HERE|ON|TO|THE|BELOW|UNSUB|HTTP|\<)

Question marks (?) make the previous character, or (group) of characters, optional: maybe it's there, and maybe it's not. So "clicki?n?g?" or "click(ing)?" will find click or clicking ("clicki?n?g?" will also find clicki and clickin).

The period character is a "wildcard", meaning it can be one of anything at all. If you follow a period with a question mark, you get one, or none, of anything. I use 8 periods with question marks, meaning I want to find between 0 and 8 of any random character. I wrote that in the shorter form (.?){8} - it could also have been done the long way, .?.?.?.?.?.?.?.? and it would work exactly the same. This is very handy for finding some of the hidden tricks, such as new lines or tags in the source code, that some spammers now use to conceal what looks to the naked eye like a simple "Click Here".

Some are also inserting multiple spaces in the source code between "Click" and "Here" that don't show up on the rendered page, so we must include a space character after the {8} and follow that with the asterisk ( *). The asterisk is a multiplier for the previous character, meaning that it will find zero, 1, 100 or more of the previous character. In our case that's the space character, and no matter how many of them are inserted between "Click" and "Here", we will find them.

Everything in the parentheses ( ) is one group. And since the items are divided here by the OR symbol "|", we will look for "HERE" or "ON" or "TO" or "THE" or "BELOW" or "UNSUB" or the start of a link "HTTP" or the start of an HTML tag "<" - all the things I found in spam containing the word "CLICK".
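To experiment with the pattern outside Eudora, here is a minimal sketch in Python. It assumes Python's `re` module is a close enough stand-in for Eudora's regexp engine, which may not hold in every detail. The sample strings and the names `SHORT_FORM`, `LONG_FORM` and `is_caught` are made up for illustration. The `re.DOTALL` flag is an assumption added so that the period also matches a new line, as described above; Python does not do that by default.

```python
import re

# Assumption: Python's `re` stands in for Eudora's regexp engine.
# re.IGNORECASE mirrors the "case insensitive" verb; re.DOTALL lets "."
# match a new line, which Python does not do by default.
FLAGS = re.IGNORECASE | re.DOTALL
TAIL = r" *(HERE|ON|TO|THE|BELOW|UNSUB|HTTP|\<)"

# The short form used in the filter, and the equivalent "long way".
SHORT_FORM = re.compile(r"CLICK(ING)?(.?){8}" + TAIL, FLAGS)
LONG_FORM = re.compile(r"CLICK(ING)?" + r".?" * 8 + TAIL, FLAGS)


def is_caught(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


# Hypothetical sample strings, not taken from the spam collection.
samples = [
    "Click here",               # the plain case
    "clicking      below",      # optional ING plus a run of spaces
    "Click right here",         # 7 filler characters before HERE
    "Click\nhere",              # new line hidden between the words
    "Click your mouse button",  # nothing from the list within reach
    "Please reply",             # no "click" at all
]

results = {text: is_caught(SHORT_FORM, text) for text in samples}
for text, caught in results.items():
    print(f"{caught!s:5}  {text!r}")
```

With these samples, the first four are caught and the last two are not, and the short and long forms agree on every one.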

Code examples caught by "Click" and the "<" open HTML tag symbol:

  • Click</font>
  • click </font>
  • Click</a>
  • "click <a href"
  • CLICK -</a>
  • CLICK HE<!--cecilw-->RE
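The fragments listed above can be run through the same pattern as a quick check. This is only a sketch: it again assumes Python's `re` module behaves like Eudora's regexp engine for this pattern. The names `FILTER_11`, `spam_fragments` and `control` are illustrative, and the control sentence is invented.

```python
import re

# Assumption: Python's `re` approximates Eudora's "matches regexp" verb.
FILTER_11 = re.compile(
    r"CLICK(ING)?(.?){8} *(HERE|ON|TO|THE|BELOW|UNSUB|HTTP|\<)",
    re.IGNORECASE | re.DOTALL,
)

# The source-code fragments from the list above.
spam_fragments = [
    "Click</font>",
    "click </font>",
    "Click</a>",
    '"click <a href"',
    "CLICK -</a>",
    "CLICK HE<!--cecilw-->RE",
]

# An invented, harmless sentence for comparison.
control = "Thanks for the report."

caught = [f for f in spam_fragments if FILTER_11.search(f)]
control_caught = FILTER_11.search(control) is not None

for fragment in spam_fragments:
    print(f"{'caught' if fragment in caught else 'missed'}: {fragment}")
print(f"control caught: {control_caught}")
```

Every fragment in the list is caught, and the control sentence is not.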

The "Make Label" and "Make Priority" actions are optional, but they are very useful for determining which filter caught any particular email.