Bend Over and Say Splog
Blogs are getting pounded by spam blogs: splogs. Weasels post a comment and slip in a plug for their website. They do this not once or twice: they do it 100 or 1,000 times. Google and the other search engines march through the blogs, index them, and spit out the results. Two factors make splogging viable: trend-hungry, trigger-happy search engines and crappy programmers.
When I started doing stuff on the web (nine or ten years ago), you could go to a search engine, submit a site and, voila, it would appear in the search results. The weasels of the day flooded the engines with submissions, so the engines got picky: they would hold onto a link for weeks and vet it before indexing. That stymied the weasels.

Then, in the last couple of years, blogs came on the scene. Blogs are nothing new: Blogger is just a honed CMS for posting material; crap-ass LiveJournal is a discussion forum with a lot more popularity and a lot less functionality than you would find elsewhere. Anyway, they make posting so easy that there is a diarrhetic flood of opinion, information and navel-gazing. If the search engines used their usual slow vetting, they would miss firsthand accounts of events as they happen, so Google, MSN and the others took the brakes off their indexing engines. Post something now and it shows up in the index quickly. Really, check it out. Because the engines are working so hard and so fast, they don't wait to see whether they're indexing crap. Weasels have exploited that weakness. If you link to something, that destination gains value. Link to it twice and it gains more. Do it 1,000 times and that destination becomes the place to go, so much so that its manufactured popularity will outshine similarly relevant destinations.
What makes splogging so easy is that programmers are largely crappy at their job. There's a long video by Chris Pirillo where he shows off all of the notifications he has received from splog comments. It's like showing off a great car that cost you next to nothing; then you get in an accident and lament the lack of brakes, seat belts, bumpers, a horn and a metal body. There's a reason to vet links. There's a reason to limit user commenting.
When programmers built these systems, they didn't ruggedize and weasel-proof them. Here are some easy ways they could:
One post per user per second. If you're typing comments, no matter how fast you are, you will not post faster than one per second. So comment-accepting scripts should identify users and accept only one comment per user per second. Can a splogger tame their script to post once per second? Sure. But a bot wins through processing power: I've watched attacks as they happen, and when a bot hits a site, it throws 20-100 hits per second at it. If a splogger throttled their script to one comment per second to get past this limit, their bot would be running at 1%-5% of its capacity. Where before they could have seeded 1,000 comments in ten seconds, it would now take almost seventeen minutes, and in seventeen minutes an analysis script could catch on and lock the user out.
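As a sketch of what that throttle could look like (assuming the comment handler can identify the commenter by login, session or IP; the class and names here are illustrative, not any platform's actual API):

```python
import time


class CommentThrottle:
    """Reject comments that arrive faster than one per user per interval."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        # user id -> time of that user's last accepted comment
        self.last_post: dict[str, float] = {}

    def allow(self, user_id: str) -> bool:
        now = time.monotonic()
        last = self.last_post.get(user_id)
        if last is not None and now - last < self.min_interval:
            return False  # too fast: likely a bot, reject and flag the user
        self.last_post[user_id] = now
        return True


throttle = CommentThrottle()
if throttle.allow("commenter-42"):
    pass  # save the comment
else:
    pass  # drop it and let the analysis script take a look
```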
No hyperlinks. This would make comments much safer. A splogger could still game the indexing by dropping a phrase from their site into the comment and letting their own site's indexing do the rest of the work. Nevertheless, without inbound links, splogging loses most of its value.
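A sketch of how a comment script might enforce that, with a deliberately simple pattern (real sanitizers are more thorough; this regex is illustrative, not exhaustive):

```python
import re

# Anchor tags, bare http(s) URLs, and schemeless www. links
LINK_PATTERN = re.compile(
    r"<a\b[^>]*>.*?</a>"
    r"|https?://\S+"
    r"|www\.\S+",
    re.IGNORECASE | re.DOTALL,
)


def has_links(comment: str) -> bool:
    """True if the comment contains anything that looks like a link."""
    return LINK_PATTERN.search(comment) is not None


def strip_links(comment: str) -> str:
    """Neuter links so the comment carries no SEO value for a splogger."""
    return LINK_PATTERN.sub("[link removed]", comment)
```

Whether you reject the comment outright or just strip the links is a policy choice; either way, the splogger's destination gets no inbound link.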
Make everyone log in and renew their login. Make login with a captcha the default for blogs; disallow anonymous comments and make disabling the captcha an overt act. Also, periodically email registered users a question that a script couldn't answer automatically: "What do you get when you combine blue and yellow?" or "Who was the US President in 1962?" If the registered user can't answer, poof goes their account.
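The email-question idea could be sketched like this, assuming a hypothetical send_email() helper supplied by the blog platform; the question list and the in-memory token store are illustrative only:

```python
import random
import secrets

# Questions a human can answer but a script can't, paired with expected answers
CHALLENGES = [
    ("What do you get when you combine blue and yellow?", "green"),
    ("Who was the US President in 1962?", "kennedy"),
]

pending: dict[str, str] = {}  # challenge token -> expected answer


def send_challenge(user_email: str, send_email) -> str:
    """Email a registered user a renewal question; return the challenge token."""
    question, answer = random.choice(CHALLENGES)
    token = secrets.token_urlsafe(16)
    pending[token] = answer
    send_email(user_email, f"{question} Include the token {token} in your reply.")
    return token


def verify_reply(token: str, reply: str) -> bool:
    """True if the reply answers the challenge; otherwise, poof goes the account."""
    expected = pending.pop(token, None)
    return expected is not None and expected in reply.strip().lower()
```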
Prosecute Sploggers. Change the Terms of Use to disallow splogging. If someone does it anyway, hand them a bill. Or treat splogging the same as computer hacking and go after the sploggers. They mention their website: they want to be found. Accommodate them. Find them at home at 3 AM and show them how handy you are with a taser.
Comments
Also, the word verification for this comment is "jelijiz." Um, gross.