Bogofilter Worthless?

undoIT
Registered Member
Tue May 04, 2010 9:06 pm
I recently installed Kubuntu 10.04 with Kmail version 1.13.2. After configuring all my accounts, I ran the spam wizard for Bogofilter. It does not do anything.

I'm currently looking at emails from "Viagra manufactured by Pfizer" and "Erection Remedies on-line" among dozens of other spam emails. If Bogofilter can't filter these out, what is it good for? This was also an issue with Kmail in Kubuntu Karmic.

Are there any spam filters I can add to Kmail that work? Also, is there an adaptive spam filter like what is available with Thunderbird?


bcooksley
Administrator
Wed May 05, 2010 5:55 am
Have you trained Bogofilter using known "Ham" and "Spam" mails?


KDE Sysadmin


NickElliott
Registered Member
Wed May 05, 2010 8:37 am
As bcooksley says, you need to train Bogofilter using 'good' and 'bad' samples from the e-mails you receive; the more you train it, the better it gets. A couple of hundred of each should be enough to get it going. It requires a little effort at first but is worthwhile, and it should improve the longer you use it.
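
For anyone who prefers to do the initial training in bulk rather than by clicking, here is a minimal sketch. It assumes the `bogofilter` command-line tool is installed and that each message is stored as its own file (maildir-style). `-s` and `-n` are Bogofilter's options for registering a message as spam or as non-spam; the function names and folder paths are hypothetical, so adjust them to your own setup.

```python
import subprocess
from pathlib import Path


def training_command(is_spam):
    """Return the bogofilter command that registers one message read from stdin."""
    # -s registers the message as spam, -n registers it as non-spam (ham).
    return ["bogofilter", "-s" if is_spam else "-n"]


def train_folder(folder, is_spam, run=subprocess.run):
    """Feed every file in `folder` to bogofilter and return how many were sent."""
    count = 0
    for message in sorted(Path(folder).iterdir()):
        if not message.is_file():
            continue  # skip sub-directories
        with message.open("rb") as handle:
            run(training_command(is_spam), stdin=handle, check=True)
        count += 1
    return count


# Example calls (hypothetical paths, not the ones KMail necessarily uses):
# train_folder(Path.home() / "Mail" / "training-spam" / "cur", is_spam=True)
# train_folder(Path.home() / "Mail" / "training-ham" / "cur", is_spam=False)
```

The `run` parameter only exists so the function can be tried out without touching the real word list.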

In my experience Bogofilter still misses some spam but has NEVER mistakenly identified good mail as spam, which I can't say for other mail filters I have used in the past.

You might want to look at SpamAssassin, but I have no experience with it.


NickElliott, proud to be a member of KDE forums since 2008-Oct.


annew
Manager
Wed May 05, 2010 11:09 am
It's important, too, to make sure that you feed it a representative sample of your good/ham messages. Create a temporary mail folder and copy into it selections from your mailing lists, personal mail and any other kinds you have. The more information that goes into initial training, the better it works. I've used bogofilter for years now, and love it. New types of spam are missed at first, but if you move them to a training folder and re-train whenever you have a few of them you'll find that it learns very well. What's more, it is much lighter on resources than SpamAssassin; it doesn't slow things down as SA does on many systems.
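
To show why a statistical filter has no opinion until it has seen both kinds of mail, here is a toy word-frequency classifier. It is NOT Bogofilter's actual algorithm, only a rough illustration of the idea; the class name, the sample messages and the smoothing choice are all made up for this sketch.

```python
import math
from collections import Counter


class ToyFilter:
    """Toy word-frequency filter; an illustration, not Bogofilter's algorithm."""

    def __init__(self):
        self.spam_words = Counter()
        self.ham_words = Counter()
        self.spam_count = 0
        self.ham_count = 0

    def train(self, text, is_spam):
        # Count each distinct word once per message.
        words = set(text.lower().split())
        if is_spam:
            self.spam_words.update(words)
            self.spam_count += 1
        else:
            self.ham_words.update(words)
            self.ham_count += 1

    def spam_probability(self, text):
        # Add up the log-odds of every word; add-one smoothing keeps
        # words the filter has never seen neutral.
        log_odds = 0.0
        for word in set(text.lower().split()):
            p_spam = (self.spam_words[word] + 1) / (self.spam_count + 2)
            p_ham = (self.ham_words[word] + 1) / (self.ham_count + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


toy = ToyFilter()
untrained = toy.spam_probability("cheap pills online")  # 0.5: no data, no opinion

toy.train("cheap pills online", is_spam=True)
toy.train("pills on-line now", is_spam=True)
toy.train("meeting notes attached", is_spam=False)
toy.train("kde mailing list digest", is_spam=False)

spam_score = toy.spam_probability("cheap pills")      # now leans towards spam
ham_score = toy.spam_probability("kde mailing list")  # now leans towards ham
```

The same reasoning suggests why a lopsided training set hurts: words that only ever appear in one pile get pushed hard in that direction.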


annew, proud to be a member of KDE forums since 2008-Oct and a KDE user since 2002.
Join us on http://userbase.kde.org


undoIT
Registered Member
Wed May 05, 2010 5:45 pm
Thank you for the replies everyone :)

I have a few suggestions. I am not exactly a new Linux user: I have been using it between 8 and 16 hours daily for the past two years, and I have different versions installed on both my laptops as well as my netbook. I recently switched from Thunderbird to Kmail.

When I see something like "Spam Wizard" I expect it to be a "set it and forget about it" solution, unless other instructions are given. Nowhere in the spam wizard does it mention anything about having to feed it examples or that it will learn by marking emails as spam.

I took a series of screenshots of the process, as well as of the error message that occurs when I click the Help button, but I can't post them here because it is not possible to add attachments.

Anyhow, it seems that this Wizard needs to learn some magic spells. At the very least, one of the windows that appears during the spam wizard should mention that you have to train it by marking messages as "spam" and by marking as "ham" any messages that were mistakenly identified as spam.

I'm pretty sure that I marked quite a few messages as spam (in the hundreds) while using Kubuntu Karmic, but the spam filter still wasn't working. I'll give it another try.

I don't know whether this is a Kubuntu-specific issue or if every distro with KDE 4.4 has the same uninformative spam wizard with a broken Help button. If it is only Kubuntu, I'll report it over there. What could be more frustrating than something that doesn't work, and then getting an error message when you click the Help button? This is the kind of stuff that makes people shy away from Linux, which is unfortunate because it would be so easy to improve this with a simple paragraph of explanation during the spam wizard and by removing the Help button when it is not applicable.


annew
Manager
Wed May 05, 2010 7:16 pm
The wizard helps you set it up. You have to train any spam filter if you want it to be reliable. You need to google for information, as different ways of using it need different training: my on-server setup is trained quite differently from that of the average POP mail user. As for screenshots, it's usual to use pastebin or a similar service and post the URL here.


NickElliott
Registered Member
Thu May 06, 2010 9:08 am
annew wrote:New types of spam are missed at first, but if you move them to a training folder and re-train whenever you have a few of them you'll find that it learns very well.

Is this re-training really necessary after the initial training exercise? Or, to put it another way, is this method more effective than clicking on the 'spam' button for individual messages? Surely Bogofilter learns through that process as well.


gauthma
Registered Member
Thu May 06, 2010 6:15 pm
Hello,

I'm a fairly new KMail user, and have also been using Bogofilter for a couple of months. In short, it was a disappointment. I receive most of my email in Portuguese and English (mailing lists mostly), and have trained it considerably: I have a junk folder, where I stockpile junkmail (duh), and regularly run Bogofilter on said folder. Regarding my Inbox folder and subfolders, all the mail there is marked as 'Ham' (it goes without saying that all the mail in the junk folder is marked as spam).

But given all that I've described, the fact is that I still get a LOT of good mail wrongly classified as spam, even after months of training. The reverse, however (getting junk mail classified as Ham), does not happen. I was (and still am) considering switching to some other filter. But now that I've found this thread, are there any suggestions you can give me regarding what (if anything) I am doing wrong?

Thanks in advance.


undoIT
Registered Member
Thu May 06, 2010 7:18 pm
gauthma wrote:...I still get a LOT of good mail wrongly classified as spam, even after months of training...


Thank you for reporting this. I do like Kmail and the integration available with Kontact. However, now I am thinking it might be a good idea to switch back to Thunderbird + Lightning. The adaptive spam filter for Thunderbird is pretty good, it doesn't crash all the time when deleting emails like Kmail, and it can be used on all operating systems. I'll hold off for a little while; hopefully somebody can convince me otherwise.


NickElliott
Registered Member
Thu May 06, 2010 8:47 pm
gauthma wrote:...I still get a LOT of good mail wrongly classified as spam, even after months of training...

Unfortunately I don't know what to suggest. Perhaps the dual-language scenario is causing Bogofilter problems, though I can't see why that would be the case.

My experience with Bogofilter is that NONE of my good mail is ever classified as spam but every once in a while it misses some spam. In short, Bogofilter works well for me.

undoIT wrote:...it doesn't crash all the time when deleting emails like Kmail...

Don't like the sound of that, can't say I have read of other people experiencing the same problem with KMail.


undoIT
Registered Member
Thu May 06, 2010 8:57 pm
NickElliott wrote:
undoIT wrote:...it doesn't crash all the time when deleting emails like Kmail...

Don't like the sound of that, can't say I have read of other people experiencing the same problem with KMail.


This has been an issue for me with Kubuntu Karmic and now Lucid. My sister and my mom also mentioned experiencing crashes to some extent in Karmic, all using different computers. My sister stopped using Kontact and just logs into Gmail probably because it is easier to have access to her contacts, but maybe also because of the crashes. For me, Kontact crashes at least once a day.

Anyways, the Kontact bugs are a bit off-topic. Bogofilter does seem to be working now that I have classified a bunch of emails as spam; not sure why it didn't work in Karmic. It also filters out a bunch of valid emails. Hopefully it stops doing this after I train it some more on ham.

It would be nice if Kmail would automatically move the emails marked as ham back into their appropriate email account folders. Does anybody know how to do this?


NickElliott
Registered Member
Thu May 06, 2010 9:15 pm
undoIT wrote:It would be nice if Kmail would automatically move the emails marked as ham back into their appropriate email account folders. Does anybody know how to do this?

I cheat by setting my other filters before Bogofilter!
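
Why the ordering helps can be sketched with a toy filter chain. This assumes each filter stops processing once it matches; the filter tuples, folder names and message fields below are invented for illustration and are not KMail's internals.

```python
def first_match(message, filters):
    """Return the folder of the first filter whose test matches the message."""
    for test, folder in filters:
        if test(message):
            return folder
    return "inbox"


# Hypothetical filters: one files mailing-list mail, one acts on the spam flag.
list_filter = (lambda m: m.get("list") == "kde-devel", "lists/kde-devel")
spam_filter = (lambda m: m.get("flagged_spam", False), "spam")

# A legitimate list message that the spam check has wrongly flagged.
message = {"list": "kde-devel", "flagged_spam": True}

sorted_first = first_match(message, [list_filter, spam_filter])  # filed with the list
spam_first = first_match(message, [spam_filter, list_filter])    # lost to the spam folder
```

With the sorting filters first, mail from known lists or senders is filed before the spam check ever gets a say, so a false positive cannot bury it.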


undoIT
Registered Member
Thu May 06, 2010 9:22 pm
What I did was change the Bogofilter "Spam Handling" filter to mark the moved emails as "New". That way I can keep track of what I am manually marking as spam and what Bogofilter is filtering. Problem is, Bogofilter is filtering a lot of legit emails. I have 14 email accounts. It is a drag having to mark them as ham and then move them back into their appropriate inboxes.

Is there any "undelete" or other such function that will automatically move emails back into the inbox that they originated from?
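
Outside of KMail, one possible way to work out where a rescued message belongs is to look at its recipient headers. The sketch below uses only Python's standard `email` package; the `ACCOUNT_FOLDERS` mapping, the addresses and the folder names are hypothetical, and it assumes the account address actually appears in `Delivered-To`, `To` or `Cc`. It only picks a folder name and does not move anything.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical mapping from account address to that account's inbox folder.
ACCOUNT_FOLDERS = {
    "me@example.org": "inbox-personal",
    "work@example.com": "inbox-work",
}


def original_inbox(raw_message, default="inbox"):
    """Guess the inbox a message came from by its recipient headers."""
    msg = message_from_string(raw_message)
    headers = (
        msg.get_all("Delivered-To", [])
        + msg.get_all("To", [])
        + msg.get_all("Cc", [])
    )
    for _name, address in getaddresses(headers):
        folder = ACCOUNT_FOLDERS.get(address.lower())
        if folder:
            return folder
    return default  # no known account address found
```

Mail that reached you via Bcc or a mailing list may not carry your address in any of these headers, in which case this falls back to the default folder.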
sdertsr6
Banned
Fri May 07, 2010 1:33 pm
Is this re-training really necessary after the initial training exercise?

thanks


annew
Manager
Fri May 07, 2010 3:46 pm
You do have to use some discretion, too, about the mails that you feed it for training. For instance, I handle some mailing lists and I get emails telling me that I need to moderate some messages. I want to receive those - but if I fed them into the ham-training folder I'd be telling it that the quoted email is ham, when 9/10 times it isn't. Similarly, you might have messages from one source that you consider spam but very similar ones that you consider ham.

I've been using bogofilter for about 5 years. It does miss some spam when a new form comes out, and it struggles with those blue-tablet messages that consist of an image rather than text, but I can't remember when I last saw a false spam - it must be years ago.

