
The Stupid Filter: Or, How to Destroy a Forum with Statistical Science

Status
Not open for further replies.

Eric P

Member
The concept behind the StupidFilter Project originated during a conversation between Gabriel Ortiz and Paul Starr. StupidFilter was conceived out of necessity. Too long have we suffered in silence under the tyranny of idiocy. In the beginning, the internet was a place where one could communicate intelligently with similarly erudite people. Then, Eternal September hit and we were lost in the noise. The advent of user-driven web content has compounded the matter yet further, straining our tolerance to the breaking point.

It's time to fight back.

The solution we're creating is simple: open-source filter software that can detect rampant stupidity in written English. This will be accomplished with weighted Bayesian or similar analysis and some rules-based processing, similar to spam detection engines. The primary challenge inherent in our task is that stupidity is not a binary distinction, but rather a matter of degree. To this end, we're collecting a ranked corpus of stupid text, gleaned from user comments on public websites and ranked on a five-point scale.

Eventually, once the research is completed, we plan to release core engine source code for incorporation into content management systems, blogs, wikis and the like. Additionally, we plan to develop a fully implemented Firefox plugin and a WordPress plugin.

http://stupidfilter.org/main/index.php?n=Main.HomePage

How are you going to manage to recognize stupidity programmatically?
Pretty much the same way you can programmatically recognize spam: we'll look for things that characterize stupidity and assign particular tokens different weights based on how often they occur in hand-picked examples of idiotic comments. For more information, see Naive Bayes Classifiers and CRM114.
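
For anyone who wants to see what "assign particular tokens different weights" could look like in practice, here is a minimal sketch of a two-class naive Bayes scorer with add-one smoothing. It is not the StupidFilter's code (none has been released); the function names, the "stupid"/"ok" labels, and the four training comments are all made up for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Crude whitespace tokenizer; a real filter would likely do more.
    return text.lower().split()

def train(examples):
    """examples: list of (text, label) pairs, label is "stupid" or "ok"."""
    counts = {"stupid": Counter(), "ok": Counter()}
    docs = Counter()
    for text, label in examples:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts["stupid"]) | set(counts["ok"])
    return counts, docs, vocab

def log_odds_stupid(text, model):
    """Positive result: more likely stupid than ok under the model."""
    counts, docs, vocab = model
    score = math.log(docs["stupid"] / docs["ok"])  # prior log-odds
    total_s = sum(counts["stupid"].values())
    total_o = sum(counts["ok"].values())
    v = len(vocab)
    for tok in tokenize(text):
        if tok not in vocab:
            continue  # tokens never seen in training carry no evidence
        p_s = (counts["stupid"][tok] + 1) / (total_s + v)  # add-one smoothing
        p_o = (counts["ok"][tok] + 1) / (total_o + v)
        score += math.log(p_s / p_o)  # this token's weight
    return score

# Hypothetical hand-picked training comments.
model = train([
    ("lol ur dum lol", "stupid"),
    ("omg lol u r so dum", "stupid"),
    ("I think the argument holds up well", "ok"),
    ("The article makes a fair point", "ok"),
])
```

With that toy model, `log_odds_stupid("lol ur so dum", model)` comes out positive and `log_odds_stupid("a fair argument", model)` comes out negative. Each token's weight is just the log of how much more often it shows up in the stupid pile than in the ok pile.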

When do you expect to have usable code?
We expect to start releasing alpha source code around December, 2007.

Isn't filtering stupidity elitist?
Yes. Yes, it is. That's sort of the whole point.

Do you really expect to be able to detect and filter anything that's conceivably stupid?
No, of course not. You'd need real AI for that, and beyond a certain point it's simply subjective; after all, a sufficiently advanced AI would probably filter out the whole of human discourse, which isn't the idea.

So what do you plan to filter?
The idea is that the most egregiously stupid comments will also be the easiest to detect while remaining ignorant of context; comments with too much or too little capitalization, too many text-message abbreviations, excessive use of "LOL," exclamation points, and so on.
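
The context-free signals listed above (capitalization, text-message abbreviations, "LOL," exclamation points) are easy to measure. A rough sketch follows, assuming a hypothetical abbreviation list and hypothetical feature names; the project has not said which features or thresholds it actually uses.

```python
import re

# Hypothetical abbreviation list, for illustration only.
TEXT_ABBREVIATIONS = {"u", "ur", "r", "y", "2", "4", "dnt", "thx", "plz"}

def surface_features(comment):
    """Measure style-only signals without looking at what the comment means."""
    letters = [c for c in comment if c.isalpha()]
    words = re.findall(r"[A-Za-z0-9']+", comment)
    upper = sum(c.isupper() for c in letters)
    abbrevs = sum(w.lower() in TEXT_ABBREVIATIONS for w in words)
    return {
        # Share of letters that are uppercase (0.0 to 1.0).
        "upper_ratio": upper / len(letters) if letters else 0.0,
        # Share of words that are text-message abbreviations.
        "abbrev_ratio": abbrevs / len(words) if words else 0.0,
        # Standalone "lol", "lolol", "LOLLLLL" and so on.
        "lol_count": len(re.findall(r"\b(?:lo)+l+\b", comment, flags=re.IGNORECASE)),
        "exclamations": comment.count("!"),
    }

features = surface_features("LOL U R SO DUM!!!")
```

Numbers like these could feed a rules layer or sit alongside token weights in a classifier; which way the project goes is not stated.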

How do you rate stupidity?
Since we're trying to build a detailed database that serves as a very verbose example of What Not To Do, we look for comments whose prose style we can point to and say, "I don't even have to understand the content of this comment to know that it's stupid," -- based on the gross prose style alone, its stupidity is self-evident. It is then useful as an example for our parser to integrate into its database of stupidity.

I looked at some of the results from the Random Stupidity page, and they don't seem that stupid to me; what gives?
Keep in mind we grade stupidity on a scale of 1 to 5. Someone might get a 1 or 2 for a comment that used no punctuation, whereas a comment consisting of nothing but text message abbreviations with a dash of LOLLLLL thrown in for good measure would probably rate a solid 4 or 5. There is a certain amount of subjectivity, and our software is aware of that; scoring will be normalized to eliminate excessively generous or harsh estimations of stupidity.
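
The FAQ doesn't say how scores get normalized. One common approach, shown here purely as an assumption, is to convert each grader's 1-to-5 ratings into z-scores against that grader's own mean and spread, so a habitually harsh grader and a habitually generous one end up on the same footing. The grader names and ratings below are invented.

```python
from statistics import mean, pstdev

def normalize_by_grader(ratings):
    """ratings: list of (grader, comment_id, score) with score from 1 to 5.

    Returns (grader, comment_id, z) where z is the score measured in
    standard deviations from that grader's own average.
    """
    by_grader = {}
    for grader, _, score in ratings:
        by_grader.setdefault(grader, []).append(score)
    stats = {g: (mean(s), pstdev(s)) for g, s in by_grader.items()}
    normalized = []
    for grader, comment_id, score in ratings:
        mu, sigma = stats[grader]
        # A grader who gives every comment the same score carries no signal.
        z = (score - mu) / sigma if sigma > 0 else 0.0
        normalized.append((grader, comment_id, z))
    return normalized

# Hypothetical data: "harsh" rates everything two points above "generous".
ratings = [
    ("harsh", "a", 3), ("harsh", "b", 4), ("harsh", "c", 5),
    ("generous", "a", 1), ("generous", "b", 2), ("generous", "c", 3),
]
result = normalize_by_grader(ratings)
```

After normalizing, both graders agree that comment "a" is below their personal average and "c" is above it by the same margin, even though their raw numbers never matched.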

What about ironic uses of "stupid" diction?
The StupidFilter is blind to irony. Our intent is that one or two instances of "lol" or "ur dum" in several paragraphs of otherwise-cogent text won't result in a false positive. However, we consider the StupidFilter's irony-ignorance to be a feature, insofar as even if an allegedly-smart person makes a short, stupid comment, their smartness doesn't make the comment any less stupid. If your mom had designed the StupidFilter, she might say "If you can't say anything smart, don't say anything at all."

Won't people just try to defeat the filter, the way spammers try to get around spam filtering?
We certainly hope they will -- that implies they're no longer generating text statistically likely to be stupid. It's true that an obvious attack on the StupidFilter would be to salt a short, stupid comment with a long excerpt copy-pasted from, say, Project Gutenberg, but we think it's reasonable to count on the laziness of the stupidest commenters not to do this.

Aren't you just trying to eliminate comments and discourse that you consider to be stupid?
As much as that might be nice, no. The StupidFilter does not understand, in a meaningful sense, the text that it parses, and our graders select comments that are formally stupid -- that is, their diction, not their content, marks them as stupid. It is not our intent to eliminate debate or disagreement, but rather to programmatically enforce a certain quality of expression. Put another way: The StupidFilter will cheerfully approve an eloquent, properly-capitalized defense of mandatory, state-subsidized rocket-launcher ownership for all schoolchildren.

Isn't that a very prescriptivist position to take?
Yes, and we are equally aware that this will make us few friends in linguistics circles. But effective textual communication requires at least some formal rigor, and we feel such rigor is worth encouraging and, at times, enforcing.

i look forward to the release of this and its eventual unleashing on Gaf.
 
I can't actually poke around the internet while at work so I'll just ask this simple question in this thread: how does this actually filter text? In my fantasy, it eliminates stupid comments like AdBlock removes unwanted advertisements. Is that how this amazing project will work? As a Firefox extension that will make browsing the internet far less treacherous? Or will it highlight stupid comments so I can move right past them without missing a beat?
 
Gigglepoo said:
I can't actually poke around the internet while at work so I'll just ask this simple question in this thread: how does this actually filter text?

It tracks down the address of the offending poster via their IP address, then sends thugs armed with hacksaws around to their homes to chop off the hands that make such horrible posts. Remember, your donations help keep StupidFilter alive! Contribute today!

FnordChan
 
FnordChan said:
Be sure to check out these randomized samples of stupidity that you will soon be editing out of your web browsing experience. Also, the application to become a moderator is a thing of beauty.

FnordChan

Jump off a fuckin bridge, u dumb cunt. U aint got no friends thats y ur aranging this gay shit u prick. Dnt ever let me bump into u in the street. ill break ur fuckin nose. DNT COME 2 LONDON!!! go somewere else 4 ur fuckin wankin competitions, U CUNT!!!

This thing is a goldmine. Everything seems to come from YouTube. :lol
 
[ID:108661] [Source:youtube] [Comment:LOL lvl 10 at pianus JKLJFASFLLOLOLOLOL] [Moderator:Gabe] [Rating:3]
:D

Sometimes when I'm bored and feel I'm letting my somewhat limited ability with Chinese script escape me, I'll get a notebook and a pen and just write random characters in random combinations.

I can't help but assume... wish rather, that most of those comments are in fact the work of bored Han in a similar situation.
 
FnordChan said:
Contribute today!

If this actually happened, would people try to actually improve the manner in which they communicate or would they have such little confidence in their feeble brains that they would cease posting completely?
 
 
Gigglepoo said:
If this actually happened, would people try to actually improve the manner in which they communicate or would they have such little confidence in their feeble brains that they would cease posting completely?

That, to me, sounds exactly like a can't-lose proposition.
 
Gigglepoo said:
If this actually happened, would people try to actually improve the manner in which they communicate or would they have such little confidence in their feeble brains that they would cease posting completely?

Either works for me. This thing is like the Genesis device for the internet. The world needs a do-over.
 
I feel we should mandate the disclosure of any comment in the random sample set that a Gaf poster can positively attribute to him or herself. :D
 
I think over-policing comments would kill the character of all conversations on here. Especially when the mods do a decent enough job killin' the nonsense.

that FAQ is funny tho.
 
Eric P said:
i don't know if people are that patient and meticulous in their stupidity.

if they are, then they are being belligerently stupid and you can just killfile them.

Exactly. I have no Earthly idea how one would do that quickly and efficiently.
 