The Day The Trackbacks Died

You might read a post on this blog and decide I'm full of crap. That's fine. I often am full of crap. I encourage you to leave a comment explaining why you feel this way. And, while you're at it, feel free to point out any errors or inaccuracies in anything I've written. This kind of simple, immediate, highly visible public dialog is why I believe so strongly in comments as an essential part of blogging.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2006/12/the-day-the-trackbacks-died.html

Yeah, security in general should always be kept in mind. There's a line between treating your users like customers and being insecure.

This begs the question, how would you go about designing Trackbacks 2.0?

I feel that true users are usually not evil. The evil-doers are out to abuse whatever system they can for their own gain. It is highly likely that it is the same base of “evil-users” who are responsible for spam in comments, trackback abuse, pop-up ads and spy/malware, and the “blink” tag. Well, maybe not that last one. True users can be bastards, but it is usually because they are demanding and intimately in touch with the software or service. They have high standards and expect evolutionary positive change in their app/service of choice.

If only there was some way to tag the “evil-doers” themselves and differentiate them from the mass user base…but until then, it does pay to plan.

Would it be any easier to maintain a white-list of legit blogs that you regularly get trackbacks from?

You’d still have to manually approve new blogs, but the approval list would only contain a small fraction of total trackbacks.
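A minimal sketch of the whitelist idea, assuming trackbacks are triaged by the host name of the pinging URL. The `TRUSTED_HOSTS` set and the `triage_trackback` function are hypothetical names chosen for illustration, not part of any blog engine.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist: hosts whose trackbacks skip manual approval.
TRUSTED_HOSTS = {"mikepope.com", "www.intertwingly.net"}


def triage_trackback(source_url, trusted_hosts=TRUSTED_HOSTS):
    """Return 'approve' for whitelisted hosts and 'moderate' for all others."""
    # hostname is None for strings that do not parse as a URL with a host.
    host = (urlsplit(source_url).hostname or "").lower()
    return "approve" if host in trusted_hosts else "moderate"
```

Only pings from hosts outside the set land in the manual approval queue, which is the "small fraction" described above.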

Still taking my time to perfect the trackback, and I still got love for the streets.

Your old captcha post implied that you’d already disabled trackbacks, so I thought they’d been gone since you first instituted the captcha. Probably would have been a decent idea, since even legitimate trackbacks are largely spam, of the “great post!” variety.

Did you change the font on the blog somehow? I can’t put my finger on whether it’s larger or also a different face; or if it’s just me.

Great!

For my trackback implementation, I do a reverse lookup – when a trackback ping comes in, I read the putative trackback URL and look for a link to my post there. No link, no trackback. (This was Nikhil’s idea, to be clear.)

http://mikepope.com/blog/DisplayBlog.aspx?permalink=1262

It’s worked great. I know it’s working because I hardly ever get trackback spam (like, a total of 6 since I implemented this), but I get referrer spam all the time, so they’re hitting the blog all right.
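The reverse lookup described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation behind that blog: the function names are made up, the match is a plain comparison of `href` values against the permalink, and the fetcher is injectable so the check can be exercised without network access.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value.strip())


def page_links_to(html, permalink):
    """True if the page contains an <a href> pointing at the permalink."""
    collector = _LinkCollector()
    collector.feed(html)
    target = permalink.rstrip("/")
    return any(href.rstrip("/") == target for href in collector.hrefs)


def fetch_page(url, timeout=10):
    """Download a page as text (assumes UTF-8, replaces undecodable bytes)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def verify_trackback(source_url, permalink, fetch=fetch_page):
    """No link, no trackback: accept only if the pinging page links back."""
    try:
        html = fetch(source_url)
    except (OSError, ValueError):
        # Unreachable or malformed URLs are treated as failed verification.
        return False
    return page_links_to(html, permalink)
```

Note that every incoming ping triggers an outbound request here, so the cost of the check is paid by the receiving server.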

Of course, I can tweak anything I want, since I run a MeWare blog. :slight_smile:

I decide to comment now that you aren’t taking trackbacks. “Your full of crap.”

Actually, TypePad has been very good at automatically catching trackback spam. The trick they use is to examine the trackback url and verify that the url links to the post.

Because a link must be provided, an authentication mechanism is not really needed. You can also screen out links that contain certain words or whitelist links that contain .NET related keywords.
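The keyword screening mentioned above might look something like this. The word lists are invented for illustration, and matching on raw substrings of the URL is an assumption; a real filter would likely need something less crude.

```python
# Hypothetical word lists, chosen only to illustrate the idea.
BLOCKED_WORDS = {"casino", "pills"}
ALLOWED_WORDS = {"dotnet", "asp.net", "csharp"}


def screen_link(url):
    """Classify a trackback URL as 'reject', 'approve', or 'moderate'."""
    lowered = url.lower()
    # Blocked words win over allowed words if both appear.
    if any(word in lowered for word in BLOCKED_WORDS):
        return "reject"
    if any(word in lowered for word in ALLOWED_WORDS):
        return "approve"
    return "moderate"
```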

If only there was some way to tag the “evil-doers” themselves and differentiate them from the mass user base…

Right, but this implies logins and persistent identity, too.

when a trackback ping comes in, I read the putative trackback URL and look for a link to my post there. No link, no trackback. (This was Nikhil’s idea, to be clear.)

This is a reasonable idea, but it doesn’t scale. Furthermore, it could easily become a huge DDOS (distributed denial of service) vector. The last time I checked, I was getting 75 spam trackbacks PER HOUR-- more than one every minute! That means our server would be overloaded with bandwidth and CPU overhead of going out and retrieving all that spammy content to look for my blog post’s link.

So, if I was an evil user, I’d create a 3 megabyte HTML page, and I’d “trackback” your site every second. Or, I could have my zombie web farm send you a bunch of trackbacks, hundreds per second, pointing to garbage URLs.

Of course, these attacks are possible with other means. But making trackbacks do a reverse lookup makes DDOS attacks far easier-- they’d get our server to do all the work!
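To put rough numbers on that argument, here is a back-of-the-envelope calculation. The 75 pings per hour, the 3 megabyte page, and the one-ping-per-second rate come from the comment above; the assumption that the server downloads the whole page for every ping is mine.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pings_per_day(pings_per_hour):
    """Outbound fetches per day if every ping triggers one reverse lookup."""
    return pings_per_hour * 24


def daily_fetch_volume_mb(page_size_mb, pings_per_second):
    """Megabytes downloaded per day, assuming each page is fetched in full."""
    return page_size_mb * pings_per_second * SECONDS_PER_DAY


print(pings_per_day(75))             # 1800 lookups per day at the observed spam rate
print(daily_fetch_volume_mb(3, 1))   # 259200 MB per day for the 3 MB page attack
```

Under those assumptions, a single attacker pinging once per second with a 3 MB page makes the server pull down roughly 259 GB a day, while the attacker only sends tiny ping requests.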

I assume you’ve seen what John Gruber thinks of trackbacks?

http://daringfireball.net/2003/06/take_your_trackbacks_and_dangle

As for alternative avenues, you might want to check out the magic that Sam Ruby has implemented in his weblog:

http://www.intertwingly.net/blog/2005/05/08/Sincerest-Form-Of-Flattery

Spooky, Jeff. I blogged a response about this post and mentioned that reading the origin link could become a DOS vector. 'Course, I was thinking along the lines of a Bayesian-style content analysis, but the gist is the same.

I’m somewhat perplexed that capable people have been buying into the trackback technology for so long.

Wow! Cool new font :slight_smile:

On the trackback spec. improvements

So the additional processing should be on the trackback posting side right? For example after the POST you could return an identifying code and an image (like that “orange” thing you have here), and the posting side could show the image to the user and ask him to decode it into some text, then post it again with an identifying code and the decoded text appended to the usual trackback parameters. That way you’d be sure someone is sitting on the other side to process the images (or some other stuff you give them). Of course this should be implemented in all of the blogging software, but it’s not too difficult and it seems quite secure to me. Am I missing something?..
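A rough sketch of the receiving side of that two-step exchange, assuming the first POST is answered with a challenge ID plus an image, and the second POST carries the ID and the decoded text. The class and method names are hypothetical, image generation is left out entirely, and nothing here is part of the TrackBack specification.

```python
import secrets


class TrackbackChallenges:
    """Tracks outstanding CAPTCHA challenges for incoming trackback pings."""

    def __init__(self):
        self._pending = {}  # challenge ID -> expected answer

    def issue(self, answer):
        """Step 1: remember the expected answer and return an opaque ID."""
        challenge_id = secrets.token_hex(16)
        self._pending[challenge_id] = answer
        return challenge_id

    def redeem(self, challenge_id, decoded_text):
        """Step 2: accept the ping only if the decoded text matches."""
        # pop() makes each challenge single-use, so a solved one cannot be replayed.
        expected = self._pending.pop(challenge_id, None)
        if expected is None:
            return False
        return secrets.compare_digest(
            expected.lower().encode("utf-8"),
            decoded_text.strip().lower().encode("utf-8"),
        )
```

For example, `issue("orange")` returns an ID to send back with the image, and a later `redeem(that_id, "orange")` succeeds exactly once.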

An absolute travesty? Really?

If you’re interested in helping fix TrackBack, you’re more than welcome to help join the standardization effort:

http://www.sixapart.com/pronet/weblog/2006/02/submitting_trac.html

From that same post:

“As many familiar with the protocol will attest, TrackBack, despite its wide market adoption, is far from perfect – largely due to the fact that TrackBack was invented for a blogosphere that was much different in size and makeup. Today, blogging has exploded in popularity, presenting TrackBack with a whole new set of challenges to address.”

Was it an absolute travesty to design a spec that was appropriate for the audience it was delivered to? Or should Ben and Mena have assumed there would be hundreds of millions of bloggers? Now granted, they’re part of the reason that there are so many millions of bloggers today, but just as HTML 1.0 didn’t do everything the modern web needs, so too did the first version of TrackBack have shortcomings.

Very little would get done if everybody asked “what if this gets as popular as SMTP?” Not to say we shouldn’t take that responsibility seriously, but I think it’s understandable to be naive about social abuse in the same way that the architects of email, feeds, tags, and the web itself were.

Trackbacks are nothing more than a worldwide circle-jerk.

I removed them from all the sites I’m involved with a long time ago.

It should be possible to use a CAPTCHA-based system also for trackbacks. Blog authors would only have to visit the site they refer to once, to receive a personalized security code (e.g., a GUID). The security code would then be registered in the author’s blogging engine, and be used automatically for all subsequent trackback registrations (as a trivial extension to Six Apart’s original metadata). In case of misuse, the code would simply be revoked, and all connected trackback posts could be removed automatically. It shouldn’t take more than a few hours to implement support for it in a blogging engine.
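The per-author code scheme could be sketched like this. It is only an illustration under assumptions: the CAPTCHA step that gates `issue_code` is omitted, and the `TrackbackRegistry` class and its methods are invented names, not an extension that any blogging engine actually ships.

```python
import uuid


class TrackbackRegistry:
    """Accepts trackbacks only from authors holding an unrevoked code."""

    def __init__(self):
        self._codes = {}  # security code -> trackback URLs posted with it

    def issue_code(self):
        """Hand out a GUID once the author has passed the CAPTCHA (not shown)."""
        code = str(uuid.uuid4())
        self._codes[code] = []
        return code

    def submit(self, code, source_url):
        """Record a trackback if the code is known, otherwise refuse it."""
        if code not in self._codes:
            return False
        self._codes[code].append(source_url)
        return True

    def revoke(self, code):
        """Revoke a code and return every trackback that was posted with it."""
        return self._codes.pop(code, [])

    def all_trackbacks(self):
        """All trackbacks currently attached to unrevoked codes."""
        return [url for urls in self._codes.values() for url in urls]
```

Revoking a code both blocks future pings and drops everything that was posted with it, which matches the automatic cleanup described above.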

Jeff, I think Trackbacks are really not enough. In a lot of situations a blog could point back at anything that links to it: articles, forum posts, as well as other blog entries. In my blog I take any inbound link and, like Mike, check back on it to see if the link’s there; if so, I link it. A timed routine that runs once a day then goes out and re-checks links over time to ensure that links haven’t gone dead. If they’re not there anymore, the trackback is removed. This actually works great for things that end up on home page links…
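The once-a-day cleanup pass could be as simple as the sketch below. The `still_links` callable stands in for whatever check fetches a page and looks for the link; it is a placeholder I have introduced, as is the function name.

```python
def prune_dead_backlinks(backlinks, still_links):
    """Split stored backlinks into those that still link here and those that don't.

    `still_links` is a caller-supplied function taking a URL and returning
    True if that page still contains a link to this blog.
    """
    kept, removed = [], []
    for url in backlinks:
        # Each stored backlink is re-checked exactly once per pass.
        (kept if still_links(url) else removed).append(url)
    return kept, removed
```

A scheduled job would call this once a day, store `kept` as the new backlink list, and drop everything in `removed`.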

It’s not perfect - there’s noise there at times - but I haven’t seen any trackback spam because it usually gets thrown out before it ever gets linked. It also helps to have an easy way to get rid of trackbacks - I have my blog set up so that all views become editable in admin mode so I can breeze through and remove garbage comments and postbacks very quickly without even hitting the admin interface. Like you though - comment spam has nearly completely died by adding Captcha, so most of the cleanup comes from backlinks but it’s pretty minor.

OTOH, I question how much value there really is in trackbacks these days. How often do you really follow a trackback when reading a blog, especially if a topic already has a number of comments? The trackback mechanism simply doesn’t tell the target site enough to be truly informative, or to let the reader see what they’re getting suckered into…