silver, 25-March-2003, 10:56
Location: Bournemouth, UK

Really neat way to stop spam (and sort any email)

Yesterday I got 10 or 20 spam / UCE emails, all with the same title but from different email addresses. Normally I just hit delete w/o opening or downloading them (using popcorn - which rocks!).

I have seen peeps mention mailwasher so I thought I'd give it a look. Here is my view on what it does and some of the drawbacks (as I see them):

Mailwasher behaves like a pop client: it connects to the server, tries to work out what is spam / junk, and flags anything it spots as potential spam. You then have a choice: delete any emails before they are fully downloaded and / or send a (fake) bounce message. Not bad...

Going by the faq here and from information given by users (blaming Gem here!), it appears that the latest free version, 2.0.40, is limited to one account. This was apparently not the case for the free version in earlier releases (!)

Mailwasher works out what is spam or junk by using various open relay lists (which you can add to) and some keywords which are kept in a file (which you can also add to). I have certain reservations about this technique. Open relay lists are nice but they do not always have all open relays listed, and, unlikely perhaps, you could get valid email sent via an open relay. Not all spam comes via open relays either, and in that case you are left with 'key word' matching. The keyword file does not sound flexible enough; I don't think it has a 'good word' list (it may do), which might help make it more accurate. Are people really going to go and edit a file to put in a new keyword they have thought up? Perhaps, but not everyone.
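To make the 'good word' idea concrete, here is a rough Python sketch of keyword matching where good words count against the bad ones. This is not mailwasher's actual code or file format; the word lists, the function names and the threshold are all made up for illustration.

```python
import re

# Hypothetical word lists; mailwasher's real keyword file is not shown in this post.
BAD_WORDS = {"viagra", "mortgage", "winner"}
GOOD_WORDS = {"popfile", "invoice", "bournemouth"}


def keyword_score(text, bad=BAD_WORDS, good=GOOD_WORDS):
    """Return (number of bad-word hits) minus (number of good-word hits)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return sum(w in bad for w in words) - sum(w in good for w in words)


def looks_like_spam(text, threshold=1):
    """Flag the text when bad words outweigh good words by at least threshold."""
    return keyword_score(text) >= threshold
```

With these lists, a message that mentions 'mortgage' once but also 'invoice' and 'Bournemouth' scores below the threshold and is left alone, which is the sort of correction a plain bad-word list cannot make.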

There would appear to be a 'race condition': if you have mailwasher and your normal pop client running, both could try to connect at the same time, or your email client could download spam or other unwanted email before mailwasher has had its chance to deal with it. I would guess that most users turn off automatic email retrieval in their pop client and manually download the email once mailwasher has run (note this does not eliminate the 'race condition' but makes it less likely).

Presuming that mailwasher must download at least some of each email to work out if it's spam, you are downloading it twice (though it has to be said that mailwasher probably only downloads a small amount of each message). This is also a benefit, since you can decide to delete the mail from the server w/o needing to download the whole thing (tho this is just as easy using something like 'popcorn' perhaps?).
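For anyone curious how a client can look at a message without pulling the whole thing, POP3 has a TOP command that returns the headers plus a chosen number of body lines. Below is a minimal Python sketch using the standard poplib module. Whether mailwasher or popcorn actually use TOP is my assumption, and the function names are invented for this example.

```python
import poplib
from email.parser import BytesParser


def peek_subjects(conn: poplib.POP3, body_lines=0):
    """Yield (message number, Subject) using TOP, so bodies are not downloaded."""
    _resp, listing, _octets = conn.list()
    for entry in listing:
        num = int(entry.split()[0])  # each entry looks like b"1 1234"
        _resp, lines, _octets = conn.top(num, body_lines)
        headers = BytesParser().parsebytes(b"\r\n".join(lines), headersonly=True)
        yield num, headers.get("Subject", "")


def delete_matching(conn: poplib.POP3, is_unwanted):
    """Mark for deletion every message whose subject is_unwanted() accepts."""
    deleted = []
    for num, subject in peek_subjects(conn):
        if is_unwanted(subject):
            conn.dele(num)  # the server only removes it once the session QUITs
            deleted.append(num)
    return deleted
```

Against a real server you would create the connection with poplib.POP3(host), log in with user() and pass_(), call delete_matching(), then quit() to commit the deletions. The host and login details are yours to supply.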

Only runs under Windows (or Linux if you install Wine!).

Some manual intervention appears necessary whenever you want mailwasher to 'do something'; it does not appear to operate silently in the background. You must use its GUI each time in order to make it delete or bounce emails.

Mailwasher may do everything you want, and it will do its thing with very little need for any sort of setting up. It's also free, so I think it may well be usable for what most people want.

..

Which brings me onto what I think is a much better and more powerful way of doing email filtering..

( I run a mailserver in my home network; it collects mail from 5 different accounts every 10 minutes. I also have an incoming server that listens for smtp, which I do a pop fetch from. Email is then stored locally and I connect to my local mailserver rather than connecting out to the internet. )

I had heard of using 'Bayesian Statistics' to determine the likelihood that a particular email is spam or not spam. It seemed there must be a pop proxy that allowed this type of sorting... a quick search and I found popfile.

Popfile basically sits between your email client and the pop server (well, actually it's just a pop proxy, so it can also sit between your popfetch and the pop server - which is where I have it). It handles multiple accounts transparently and there is no limit in this regard.

Once installed, you tell popfile what types of email you wish to sort between. A simple setup would be 'spam' and 'ok', which is how I started out. I soon realised that it can be much more powerful than that and now have it sorting between 'spam', 'ok', 'lists', 'redhat' and 'virus', which it is doing nicely. Once it has worked out the type of email, it adds [spam] or [ok] etc. to the subject line, ready for your email client to sort on (you can in fact turn this off and use the 'X-Text-Classification' header instead, so the subject line is not altered).
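As an illustration of the tagging step, here is a small Python sketch that marks a message with its bucket, either in the subject line or only in a header. It is not popfile's own code (popfile is written in Perl). The function name tag_message and the alter_subject switch are invented for this example, and always adding the header is a simplifying assumption.

```python
from email import message_from_string


def tag_message(raw, bucket, alter_subject=True):
    """Return the raw message text with the bucket recorded in its headers."""
    msg = message_from_string(raw)
    # Header name as given in the post; this sketch adds it in both modes.
    msg["X-Text-Classification"] = bucket
    if alter_subject:
        tagged = f"[{bucket}] {msg.get('Subject', '')}"
        if "Subject" in msg:
            msg.replace_header("Subject", tagged)  # keeps the header in place
        else:
            msg["Subject"] = tagged
    return msg.as_string()
```

Your email client can then filter on either the [spam] prefix or the header, whichever its rules handle better.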

Popfile works by you 'training' it. This means it doesn't work right out of the box, but in my experience it's actually very quick and easy to get it to do something useful. You set up any number of different things for it to differentiate between (it calls these 'buckets'). When it receives an email of a given type for the first time, you tell it which bucket (e.g. 'spam') that email belongs in, and it then uses 'Naïve Bayes' to determine whether new emails are of a similar type. Sometimes it will make a wrong decision, but you simply go to its config screen, where it keeps a history of recent emails (configurable), and tell it to reclassify that particular email. The more you teach it, the 'smarter' or more 'precise' it will become.
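For a feel of what 'training' and 'buckets' mean in practice, here is a minimal Naive Bayes sketch in Python. It is only an illustration of the general technique, not popfile's implementation: the class name, the tokenizer and the add-one smoothing are my own choices, and correcting a mistake is modelled simply as training the message into the right bucket.

```python
import math
import re
from collections import Counter, defaultdict


class BucketClassifier:
    """Toy Naive Bayes text classifier with user-defined buckets."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.message_counts = Counter()          # bucket -> trained messages

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, bucket, text):
        """Teach the classifier that text belongs in bucket."""
        self.word_counts[bucket].update(self.tokenize(text))
        self.message_counts[bucket] += 1

    def classify(self, text):
        """Return the most likely bucket, or None if nothing has been trained."""
        vocab = set().union(*self.word_counts.values())
        total_messages = sum(self.message_counts.values())
        words = self.tokenize(text)
        best_bucket, best_score = None, -math.inf
        for bucket, counts in self.word_counts.items():
            # log prior: how common this bucket has been so far
            score = math.log(self.message_counts[bucket] / total_messages)
            total_words = sum(counts.values())
            for word in words:
                # add-one smoothing so unseen words do not zero the score
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket
```

Usage is train() a few times per bucket, then classify() on new mail. Working in log probabilities avoids underflow on long messages, and each extra trained message shifts the word counts, which is why it gets more accurate the more you teach it.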

All config and 'training' of popfile is done via your web browser, which connects to popfile's local http server (by default only connectable from the PC running popfile). You can also see statistics for how much mail of each type you have got and add / remove buckets / sorting types. The screens are very well laid out and easy to follow. Screenshot of one of the screens is here

Popfile works on any OS that runs Perl (so nearly any OS!). I have it running under Windows; when you download popfile for Windows you can choose to install Perl separately or use a popfile installer which adds Perl at the same time.

Popfile faq is here

I am very impressed with the way it works and the ease of use. The software is written totally in Perl, open source and free to use.

Sil

edit, added the bit abt one account for mailwasher - apparently older versions did let you check more than one account and also connect to hotmail - these features are now only available from the 'paid for' version (!)

edit2, some additions to mailwasher text, added part abt manual intervention and improved description of 'race condition' scenario (hopefully!)

Last edited by silver; 15-July-2003 at 10:16.
Copyright ©1999-2009 The Scream!