Everybody Loves BitTorrent

The traditional method of distributing large files is to put them on a central server. Each client then downloads the file directly from the server. It's a gratifyingly simple approach, but it doesn't scale. For every download, the server consumes bandwidth equal to the size of the file. You probably don't have enough bandwidth to serve a large file to a large audience, and even if you did, your bandwidth bill would go through the roof. The larger the file, the larger the audience, the worse your bandwidth problem gets. It's a popularity tax.
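To put rough numbers on that popularity tax, here is a back-of-the-envelope sketch in Python. The file size and download count are made-up figures for illustration, not numbers from the original post.

```python
def server_bytes_served(file_size_bytes: int, downloads: int) -> int:
    """Total bytes a central server uploads when every client pulls the whole file from it."""
    return file_size_bytes * downloads


# Hypothetical figures: a 700 MB file downloaded 10,000 times.
FILE_SIZE = 700 * 10**6
DOWNLOADS = 10_000

total = server_bytes_served(FILE_SIZE, DOWNLOADS)
print(f"{total / 10**12:.1f} TB served by the central server")
```

Doubling either the file size or the audience doubles the total, which is the scaling problem described above.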


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2007/02/everybody-loves-bittorrent.html

Have you tried enabling encryption in your BitTorrent client? It’s a scorched earth solution, but many clients support encryption now.

Encryption doesn’t matter. My ISP monitors the aggregate upload volume. They don’t care what I am trying to do. If I upload too much, they throttle ALL upload bandwidth.

Even normal web browsing is sometimes affected by the throttle (it’s pretty slow).

Maybe their VOIP port might not be throttled. Maybe if I signed up for their VOIP service they might have to lift the throttle. They have not notified me of the throttle, maybe if I call customer support and pretend I don’t know what is going on, I could convince them to lift the throttle. (“Oh yeah, I have a wireless router for my laptop. No, I did not turn on WEP. Gee, why would I need to do that?”)

Bob Cringely has posted several articles claiming that P2P should actually be the bandwidth savior instead of the bandwidth demon that ISPs make it out to be ( http://www.pbs.org/cringely/ ). But unfortunately the ISPs have not learned that yet. His latest article hints that maybe Apple will shove it down their throats.

I don’t understand why companies don’t simply provide a .torrent link, and then put a massive-bandwidth seeder to start it off. Ensure that the seeder’s always up, and then it works like a centralized link but with all the benefits of reduced bandwidth. It has all the benefits of both approaches. Why isn’t this being done?

Malderi – I’d imagine because many many many companies are very stupid, still, when it comes to the internet. Too many companies simply do not understand the underlying architecture to make informed decisions like that. Some, however, do exactly what you’re suggesting. Not enough, though. Not nearly enough.

If I remember correctly, BT clients watch the upload rates of other peers and give more bandwidth to the ones that upload, so you are rewarded for uploading with more download speed. This only really matters in a famine situation; in a feast situation, when there’s plenty to go around, everyone gets as much as they want.

Also, even if the last seed disconnects from the swarm, as long as all the pieces are available somewhere the torrent will live on. The rarest pieces are prioritised by the clients so that there is the least chance that the torrent will die out.
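Here is a minimal sketch of that rarest-first idea, assuming each peer's holdings are known as a set of piece indices. The function name and the lowest-index tie-break are illustrative choices, not what any particular client does (real clients typically add randomness).

```python
from collections import Counter
from typing import Iterable, Optional, Set


def pick_rarest_piece(have: Set[int], peer_pieces: Iterable[Set[int]]) -> Optional[int]:
    """Return the index of the missing piece held by the fewest peers, or None."""
    counts: Counter = Counter()
    for pieces in peer_pieces:
        counts.update(pieces - have)  # only count pieces we still need
    if not counts:
        return None  # nothing we need is available from these peers
    # Lowest availability wins; ties go to the lowest piece index for determinism.
    return min(counts, key=lambda idx: (counts[idx], idx))
```

For example, if we hold piece 0 and three peers hold `{0, 1, 2}`, `{1, 2}` and `{2, 3}`, the function picks piece 3, because only one peer has it.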

Another part of how it all works that I like to visualise in my head is when a bad peer joins the swarm and starts dishing out bad data. The other peers see this bad peer and cut it off, effectively self healing the swarm!

It truly is a work of art and makes me glad I’m a developer and can understand the beauty of it.

“Is there any mechanism to ensure the integrity (i.e., non-modified nature) of the downloads/uploads?” - David A. Lessnau

Yes, there are several layers to this:

  1. The .torrent file itself contains the integrity data: a SHA-1 hash for each piece of the torrent.

  2. For each piece/chunk you transfer, your client hashes the data it received and compares the result against the hash for that piece in the .torrent file.
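As a rough illustration of the per-piece check, the sketch below follows the published protocol description at bittorrent.org, in which the .torrent metainfo stores a 20-byte SHA-1 digest for every piece and the downloader compares each received piece against it. The function names are hypothetical, and parsing the .torrent file itself is left out.

```python
import hashlib
from typing import List


def split_piece_hashes(pieces_field: bytes) -> List[bytes]:
    """Split the concatenated 20-byte SHA-1 digests into one digest per piece."""
    if len(pieces_field) % 20 != 0:
        raise ValueError("pieces field length must be a multiple of 20")
    return [pieces_field[i:i + 20] for i in range(0, len(pieces_field), 20)]


def piece_is_valid(index: int, data: bytes, expected: List[bytes]) -> bool:
    """True if the SHA-1 of the received data matches the digest for that piece."""
    return hashlib.sha1(data).digest() == expected[index]
```

A piece that fails the check is simply discarded and requested again, possibly from a different peer.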

–
To prevent “poisoning” of the torrent (i.e., someone deliberately sending false data), clients record those who consistently send bad data and refuse to communicate with them.

–
To prevent someone consistently leeching and giving nothing in return, (most) clients operate on a “tit-for-tat” arrangement, i.e., if you send me data, I’ll send you data more regularly/faster.
If you don’t return any data, you will still get some data, but at a slower rate.
This also stops someone with access to high-bandwidth connections “soaking up” all the available bandwidth (i.e., to prevent/slow down the distribution of the torrents).
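A minimal sketch of that tit-for-tat arrangement, assuming we track how fast each peer has recently been sending us data. The slot count of four, the function name, and the peer labels are illustrative assumptions. The one randomly chosen extra peer mirrors the "optimistic unchoke" in the protocol description, which is why a peer that returns nothing still gets some data.

```python
import random
from typing import Dict, List, Optional


def choose_unchoked(
    download_rates: Dict[str, float],
    slots: int = 4,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick which peers we upload to this round (slots must be >= 1)."""
    rng = rng or random.Random()
    # Rank peers by how fast they have recently been sending data to us.
    ranked = sorted(download_rates, key=lambda p: download_rates[p], reverse=True)
    # All but one slot go to the best uploaders: send me data, I send you data.
    regular = ranked[: slots - 1]
    # The last slot goes to a random other peer, so newcomers and non-uploaders
    # still receive a trickle instead of nothing.
    leftover = ranked[slots - 1:]
    optimistic = [rng.choice(leftover)] if leftover else []
    return regular + optimistic
```

Calling this every few seconds with fresh rates means a peer that stops uploading drops out of the regular slots and is left with the occasional random one.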

“you don’t need a seed, as long as the peers have all the pieces amongst them.” - nordsieck

This needs reiterating - a lot of people seem to think that you always need a seed. If, between all peers, there is at least one copy of all pieces in the torrent, the seed can disconnect and the remaining clients will share completed pieces. It operates slower (you lose a peer which had all of the data), but will recover given time.
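The condition is easy to state in code. This is a sketch with a hypothetical function name, assuming each peer's holdings are known as a set of piece indices:

```python
from typing import Iterable, Set


def swarm_is_complete(num_pieces: int, peer_pieces: Iterable[Set[int]]) -> bool:
    """True if every piece index exists on at least one connected peer."""
    available: Set[int] = set()
    for pieces in peer_pieces:
        available |= pieces  # union of everything the swarm holds
    return available >= set(range(num_pieces))
```

With four pieces, peers holding `{0, 1}`, `{2}` and `{1, 3}` can finish the torrent among themselves even though none of them is a seed. Take away the third peer and piece 3 is gone.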

“Stability. I’ve tried various Bit Torrent clients (and client versions) and with the exception of the WoW updater, all have caused my firewall, antivirus, or both to crash. Then the Bit Torrent clients crash themselves” - Tom.

This is a sign of an incorrectly configured system, client, and/or network - and possibly failing hardware too.

Some routers don’t handle having many open connections, and some don’t clear their NAT tables (for instance, the default Linksys WRT54G router firmware won’t flush NAT tables - you need to power-cycle it). Some routers can’t handle high volumes of traffic from many sources at once. You can mitigate this by reducing the number of torrents open, capping traffic to less than the full line-speed of your connection, and configuring your Bittorrent client to keep fewer connections open at once.

If your antivirus software is locking up, then I suggest making these changes:

  1. Some Bittorrent clients (such as uTorrent and Azureus) allow you to configure an “incomplete” folder. Configure your Bittorrent client to download to a separate folder (e.g., “X:\Bittorrent-Incomplete”) and, when the download completes, move it to your normal folder (e.g., “X:\Bittorrent”).

  2. Remove the Bittorrent-Incomplete folder from your Antivirus scanner’s watch list.

This will stop your AV software trying to lock and scan files that change constantly. (Which can in turn cause your torrent client to lock up/crash when it needs to modify the file again)

  3. If your firewall is crashing due to bittorrent, then get another firewall, or configure it properly (i.e., allow all incoming connections to your BT client). If you are using a modern software firewall on your PC, then it can open those ports automatically when you load your Bittorrent client.

Regarding the super slow downloads/availability: This sounds like either a problem with your network configuration, or using poorly shared torrents.
If you are using a NAT router or similar, you need to either enable UPnP, or manually forward certain ports - then configure your BT client to use those ports for incoming connections.

As far as I’m concerned, there should be a Nobel prize for computing, and the inventor of BitTorrent should be its first recipient.

The Turing Award (http://en.wikipedia.org/wiki/Turing_Award) is generally considered the equivalent. I don’t know how seriously you meant that statement, but I’d assume BT probably has some less popular prior art, and you certainly have to do a little more to justify being in such prominent company.

If it was a legitimate protocol it would sit on top of HTTP(S) outgoing on port 80. Or so says my work’s firewall rules anyway.

To be fair, commenters on reddit pointed out that the BT protocol does define “a tit-for-tat-ish algorithm to ensure that [the client] gets a consistent download rate”:

http://www.bittorrent.org/protocol.html

However, I still maintain that torrents need seeds to be viable. Relying on a 100% perfect distribution of data from an arbitrary number of peers seems improbable to me. I’ve participated in many, many torrents where there were lots of peers, but nobody had 100% of the data… the plaintive cry of the torrent is always, always “need a seed!”

I don’t understand why companies don’t simply provide a .torrent link, and then put a massive-bandwidth seeder to start it off.

Totally agree, but I think before that can happen, we need mass penetration of a “don’t make me think” torrent client. Perhaps something in the web browser, like Opera, or as a native part of the OS itself.

So in that case Email is not legitimate … it’s not on port 80…
ports 25/110/143 etc …

(oh, and neither is HTTPS … port 443)

Yes, BitTorrent is great.

BitTorrent does have a mechanism to encourage altruism: other peers are more likely to send you data if you send them data. That said, this mechanism is still flawed – I access the internet over an ADSL line, and I’ve discovered that I can increase my download rate by ~decreasing~ my upload rate. If I let my uplink saturate, ack packets for the download get hung up, which drops my download rate. With a properly tuned Azureus, I can usually finish downloads with a share ratio of 0.3 or so.

There are interesting questions about what altruism is – for instance, some people say you should always have a share ratio above 1.0. That’s crazy – it’s impossible for everyone to have a share ratio of 1.0 if a torrent is stable in size. A torrent that has 80 seeds and 5 peers and lets me finish with an 0.05 share ratio doesn’t need me as a seed. My participation in a torrent that has 2 seeds (me and somebody else) and 5 peers that I finished with a share ratio of 1.7 really would improve performance for the peers.
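The bookkeeping behind that claim can be sketched with a toy swarm. It assumes a closed swarm where every byte downloaded was uploaded by some participant, and the names and megabyte figures are invented for illustration.

```python
from typing import Dict, Tuple


def share_ratios(transfers: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map each peer that downloaded something to uploaded / downloaded."""
    return {peer: up / down for peer, (up, down) in transfers.items() if down > 0}


# Hypothetical closed swarm, in MB: peer -> (uploaded, downloaded).
swarm = {
    "seed": (100, 0),     # original seeder, downloaded nothing
    "alice": (150, 100),
    "bob": (50, 100),
    "carol": (0, 100),
}

total_up = sum(up for up, _ in swarm.values())
total_down = sum(down for _, down in swarm.values())
print(total_up, total_down)   # the two totals always match in a closed swarm
print(share_ratios(swarm))
```

Because the totals must balance, alice can only finish above 1.0 if someone else finishes below it, so a blanket "always stay above 1.0" rule cannot hold for every downloader at once.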

So far as centralization vs decentralization, I think BT is decentralized enough – unlike Napster or Kazaa, there isn’t one organization that you can sue to stop it. The MPAA and RIAA can play whac-a-mole with web sites that host .torrent files, but people can set up new ones faster than they get torn down.

Google already provides an excellent discovery mechanism – type in the name of what you want + torrent, and you’ll probably get it.

BitTorrent is unsuitable for small files, even if they are extremely popular.

Depends on your definition of “small” though; BT is used for files in the 5 MB range. The main issue is that the smaller the file, the faster problem 1 gets felt. On the other hand, problem 4 becomes much less of an issue with small files: since you don’t get a lot, you don’t have to share much for everything to work out, whereas big files (in the 100s of MB) require much higher absolute amounts of sharing (even at equivalent ratios).

There’s no way to punish bad peers for not sharing, or reward good peers for sharing more.

Actually there is: trackers and clients keep track of the data received and sent by clients, and the more you send the more you may get. Non-altruistic clients are usually choked out of the swarm, or only get a trickle of the total bandwidth.

Furthermore, every torrent needs a “seed”-- a peer with 100% of the file downloaded-- connected at all times.

No. Having seeds is best, because they share without taking anything (they add to the total bandwidth of the swarm without drawing from it) and they ensure that your download will complete. But multiple peers with parts of the file are enough, as long as they have different parts and the whole file can be recreated from them. That is usually the point that fails (and everyone ends up with 99.8% of the data), unless a few clients have high completion ratios (90%) when the last seed quits, in which case there’s quite a large chance that their missing data doesn’t overlap.
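For a rough feel of those odds, here is a sketch under a strong simplifying assumption: each peer is missing each piece independently with the same probability. Real swarms do not behave this way (piece selection is not uniform), so treat the result as an intuition pump rather than a prediction. The function name and the example figures are hypothetical.

```python
def prob_swarm_complete(num_pieces: int, num_peers: int, completion: float) -> float:
    """Chance that every piece survives on at least one of the remaining peers."""
    # Probability that one particular piece is missing from every peer.
    missing_everywhere = (1.0 - completion) ** num_peers
    # All pieces must avoid that fate, treated as independent events.
    return (1.0 - missing_everywhere) ** num_pieces


# Hypothetical torrent of 1,000 pieces with peers that are each 90% complete.
for peers in (2, 3, 5):
    print(peers, round(prob_swarm_complete(1000, peers, 0.9), 3))
```

Under this model, each additional high-completion peer raises the odds of a complete swarm sharply, while a handful of peers is not enough for a torrent with many pieces.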

If it was a legitimate protocol it would sit on top of HTTP(S) outgoing on port 80.

So your work’s firewall doesn’t consider IMAP, SMTP, FTP or SSH to be legitimate protocols? Whoa, talk about harsh

+1 to Masklinn

That said, this mechanism is still flawed – I access the internet over an ADSL line, and I’ve discovered that I can increase my download rate by ~decreasing~ my upload rate. If I let my uplink saturate, ack packets for the download get hung up, which drops my download rate. With a properly tuned Azureus, I can usually finish downloads with a share ratio of 0.3 or so.

That’s an issue of saturating your bandwidth. It’s not an issue with the BT protocol per se; it has more to do with TCP, on which BT is built. Which is why BT saturating your uplink can even choke your web surfing.

That’s why you should always limit your upload bandwidth to 50% to 75% of your max upload bandwidth at most.

for instance, some people say you should always have a share ratio above 1.0.

I usually shoot for a ratio of 3

That’s crazy – it’s impossible for everyone to have a share ratio of 1.0 if a torrent is stable in size. A torrent that has 80 seeds and 5 peers and lets me finish with an 0.05 share ratio doesn’t need me as a seed.

Of course it doesn’t, yet if it doesn’t need you it won’t use you.

Modern clients such as Azureus or uTorrent allow you to configure the torrents on which you want to seed based on various factors including but not limited to the seeds/peers ratio and the absolute number of seeds. And to cycle between your various completed torrents.

This means that you can just leave your completed torrent in your client and not bother with it, if in 2 or 3 weeks most of the seeds are gone you may still be there and help people get the files they want/need, if the torrent doesn’t seem to need you your client will just switch it out and seed something else in your queue.

Another item: Even if your firewall is configured so others can’t connect to you, you can still connect to other clients and upload data to them. Back when I had a cable modem, and a 10.10 address, I could still share data, even as a seed, IIRC. It just wasn’t as efficient since others couldn’t connect to me.

I, too, love BitTorrent, but there’s a problem for me: it crashes my router. I can usually download files okay, depending on filesize and popularity, but after I close the program I must go to my router settings and update the port forwarding to drop the forwarding for the range I use for BitTorrent. If I don’t, the sheer amount of connection requests will crash my router or, at the very least, make my network unstable.

I realize this is a hardware problem, but it’s only evident when I’m using BitTorrent. It’s not the software’s fault–it is doing what it’s supposed to do. This is just an unfortunate byproduct.

I haven’t actually tested this, but it seems you don’t actually need someone with 100% of the file downloaded. If two people each have half of the file, and they have exact opposite halves, it should be capable of combining them into one whole.

Of course, the odds of a situation like that are a little ridiculous.

Does BitTorrent really matter to people who are only using legitimate content? I honestly don’t know the answer. I’ve never used BitTorrent or any of the other peer to peer networks. I’ve simply never had a need to. Is there a legitimate use for these networks other than sharing stolen software or multimedia files?

@Matt: I’ve only used BitTorrent a few times, mostly to grab very large ISO files and videos (archived MIT lectures, etc) that I have trouble downloading directly. It’s been a blessing during those instances!

On a similar note, Larkware posted this yesterday:

http://www.mono-project.com/Bitsharp