#1
Bayesian Spam Filters for Message Boards?
Has anyone seen an implementation of a probability-based spam filter for message boards? Paul Graham wrote about this a long time ago in "A Plan for Spam," and I think it might be possible to do this on message boards as well as inboxes. For example, a lot of the spam that gets posted here looks a lot like e-mail spam, with a link and some bit of relevant text.
What are the major obstacles to implementing this kind of technology to help fight our spam problem here at 2+2? Has this been attempted on a message board before? Aren't I asking a lot of questions?
#2
Re: Bayesian Spam Filters for Message Boards?
I actually think the mods do a nice job of keeping spam off here. I know you are asking a theoretical question, and not hammering the mods. I guess I'm raising an issue of prudence. Chances are, it could filter out otherwise relevant posts with a link through statistical anomaly -- kind of like the one you just posted. :)
It's not a bad idea in theory -- I'm just not sure we need it.
#3
Re: Bayesian Spam Filters for Message Boards?
The difference between e-mail and forum spam is that e-mail spam can be automatically generated with a bot and set to run for hours on end.
Forum spam actually requires a human on the other end to register a username, authenticate the account, and then start posting. Most of these guys get caught pretty quickly by people using the "Report Moderator" function.
#4
Re: Bayesian Spam Filters for Message Boards?
My point certainly was not to bash the moderators here; they do a fine job. But if this proved to be an effective tool against message board spam and became widely used, perhaps message board spam volume would drop as it became harder to post spam messages.
Also, I singled out 2+2 because this is the only forum I frequent. :D
#5
Re: Bayesian Spam Filters for Message Boards?
[ QUOTE ]
Forum spam actually requires a human on the other end to register a username, authenticate the account, and then start posting.
[/ QUOTE ]
I don't think this is necessarily true. I have a domain where I can receive e-mail sent to <anything>@domain.com in a single mailbox. With this, I could set up an automated signup service using randomly generated e-mail addresses. Then one just has to figure out the appropriate GET or POST URLs for the sign-in and post pages to sign in and post without any human intervention at all.
#6
Re: Bayesian Spam Filters for Message Boards?
Yes, you could probably authenticate automatically.
However, not all sign-up pages are the same, and how would you pick which forum to post in? Plus, different forums use different software . . .
#7
Re: Bayesian Spam Filters for Message Boards?
OK, so to write a good message board spambot you would have to "support" a lot of different varieties of message board software. This is not really relevant to my original question, except to the extent that message board spam may not be a widespread enough problem to warrant such measures.
A Bayesian filter doesn't know or care whether a message was written by a computer or a person; if it classifies a message as spam, the message is rejected.
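To make that concrete, here is a rough sketch of the kind of filter I mean. It is a plain naive Bayes classifier with add-one smoothing, not Graham's exact formula, and everything in it (the class name, the tokenizer, the toy training posts) is made up for illustration rather than taken from any real forum software.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word-like tokens; a URL is simply split into its word parts.
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class NaiveBayesFilter:
    """Hypothetical toy filter: it only ever looks at the words in a post."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.posts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # label must be "spam" or "ham"
        self.counts[label].update(tokenize(text))
        self.posts[label] += 1

    def spam_probability(self, text):
        if not self.posts["spam"] or not self.posts["ham"]:
            raise ValueError("train on at least one spam and one ham post first")
        vocab_size = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        total_posts = self.posts["spam"] + self.posts["ham"]
        tokens = tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            total_tokens = sum(self.counts[label].values())
            # Start from the log prior, then add one log likelihood per token.
            score = math.log(self.posts[label] / total_posts)
            for tok in tokens:
                # Add-one (Laplace) smoothing so unseen words never give log(0).
                score += math.log(
                    (self.counts[label][tok] + 1) / (total_tokens + vocab_size)
                )
            log_score[label] = score
        # Turn the two log scores into P(spam | text); clamp to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


# Toy training data, invented for this example.
spam_filter = NaiveBayesFilter()
spam_filter.train("Cheap pills online, click here http://example.com/buy", "spam")
spam_filter.train("Win money now, click this link", "spam")
spam_filter.train("I folded pocket kings preflop, was that too tight?", "ham")
spam_filter.train("What is the best way to play a short stack in a tournament?", "ham")

print(spam_filter.spam_probability("click here for cheap pills"))
print(spam_filter.spam_probability("how should I play pocket kings"))
```

Nothing in there checks who or what submitted the post, which is the point: a bot and a human typing the same text get the same score. A real board would need far more training posts than this, and the cutoff for rejecting a post would have to be chosen with care.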
#8
Re: Bayesian Spam Filters for Message Boards?
Autodelete any post with the word Thursday in it.
#9
Re: Bayesian Spam Filters for Message Boards?
Theoretically, I believe it could certainly be done with a naive Bayes filtering method.
Accuracy would be over 97% with a suitable training set. I have a friend who is an expert in Bayesian spam filtering.
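A figure like that would have to be checked against posts the filter has never seen. Here is a rough sketch of how one might measure it, assuming you already have some function that returns a spam probability for a post and a hand-labelled set of posts. The function name, the 0.9 cutoff, and the dictionary keys are all my own assumptions. The false positive rate is probably the number to watch, since a legitimate post with a link getting rejected is the failure people here would notice most.

```python
def evaluate(classify, labelled_posts, threshold=0.9):
    """Score a filter on held-out posts.

    classify:       callable taking post text, returning a spam probability
    labelled_posts: iterable of (text, is_spam) pairs labelled by hand
    threshold:      probability at or above which a post is rejected (assumed value)
    """
    tp = fp = tn = fn = 0
    for text, is_spam in labelled_posts:
        flagged = classify(text) >= threshold
        if flagged and is_spam:
            tp += 1  # spam correctly rejected
        elif flagged and not is_spam:
            fp += 1  # legitimate post wrongly rejected
        elif not flagged and is_spam:
            fn += 1  # spam that slipped through
        else:
            tn += 1  # legitimate post correctly allowed
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "spam_caught": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

Raising the threshold should trade some missed spam for fewer wrongly rejected posts, so the numbers are worth comparing at a few different cutoffs.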