The Economics of Bandwidth

One of the sadder recent news stories is the disappearance of Turing award-winning researcher Jim Gray. I've written about Jim's research before; he has a knack for explaining fundamental truths of computer architecture in uniquely clear ways.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2007/02/the-economics-of-bandwidth.html

Sneakernet absolutely makes sense; this is a problem I’ve run into in my own work. We occasionally had to ship several terabytes: sometimes we’d do it over the internet, sometimes by shipping hard drives. Your math makes perfect sense, but when you take into account the fact that your pipe usually has a heck of a lot more to do than just send that one job, it makes much, much more sense. One of my first real jobs in the industry was writing software to take our low-priority jobs, break them into pieces, and send them at night when the load was low, so they didn’t inconvenience the other users.

I always thought it fairly amusing that the low-priority jobs got sent at night, over a period of days, whereas the high-priority stuff got stuffed in a box and given to the postman. It says a lot about the current state of networking, both how far it’s come (in years past, it would be a no-brainer; like they say, never underestimate the bandwidth of a station wagon full of mag tape) and how far we have yet to go.

As the old saying goes…

Never underestimate the bandwidth of a station wagon full of tapes driving down the freeway.

Jim Gray may well have been paraphrasing the well-known (in English circles anyway), “Never underestimate the bandwidth of a Ford Transit full of CDs”. I guess you could scale that up to HDs quite easily :slight_smile:

P.S. Just don’t expect a ping that you can use in CounterStrike.

That “old saying”, I believe, is originally from Andrew Tanenbaum’s “Computer Networks” and goes like this (I just looked it up in my Second Edition copy from 1989, page 57): “The moral of the story is: Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.”

I still remember laughing out loud when I read that line for the first time in college. It was such an eye opener.

Plus, there is something to be said for the fidelity of sneakernets. When the guy in the brown shorts shows up with your package, the transfer is done and you know (or hope) the package is complete. ZMODEM, FTP, and HTTP can’t touch that.

Indeed, computer networks only give you a better response time. You can establish a connection in milliseconds, but when it comes to bandwidth, networks lose even in theory when compared to physically moving media. Networks use very narrow (in physical size) channels and a very limited set of media states to transfer signals. Even with an optical network from one end to the other, you can use that same technology to fit orders of magnitude more data onto 5"-sized media and then deliver it by mail. Not to mention the cost of a dedicated optical channel.

By the way, what about using airliners or ships loaded with DVDs for intercontinental networking? :slight_smile:

You would easily run into scenarios, though, where you would want your empty HD shipped back. In which case you would have to include the cost of shipping back the two HDs, thus nearly doubling the cost, though saving the 16 hours at either end for upload and download. This translates to a round-trip total time of 56 hours, a total cost of $120, a data transfer rate of 5.08 MB/sec, and a cost of $0.12 per GB transferred.

I’m also not sure where you got your transfer rate figure of $0.08 per GB for Sneakernet. It seems like it should be about $0.06 per GB.

Cheers

where you would want your empty HD shipped back

Why? If they’re empty, what am I getting back? But maybe you’re right; if I don’t get them back eventually, my real cost was $360 ($150 per drive, plus $60 FedEx), which makes the transfer cost $0.36 per gigabyte. Ideally you’d get them back with some other data that person needed to send you.

I’m also not sure where you got your transfer rate figure of $0.08 per GB for Sneakernet

Not sure what I was doing there; the time taken isn’t relevant, it’s just the price divided by total size of data transferred. Corrected.
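
For anyone who wants to check the arithmetic in this exchange, here is a rough sketch. Cost per gigabyte is the total price divided by the data moved, and effective throughput is the data moved divided by door-to-door time. The 1,000 GB payload is an assumption inferred from the figures quoted above ($120 working out to $0.12 per GB), and the function names are made up for illustration.

```python
def cost_per_gb(total_cost_usd: float, payload_gb: float) -> float:
    """Price divided by data moved; transit time plays no part."""
    return total_cost_usd / payload_gb


def throughput_mb_per_sec(payload_gb: float, hours: float) -> float:
    """Effective rate over the whole trip, counting 1 GB as 1000 MB."""
    return payload_gb * 1000 / (hours * 3600)


PAYLOAD_GB = 1000  # assumed payload, inferred from the thread's figures

one_way = cost_per_gb(60, PAYLOAD_GB)         # shipping only
round_trip = cost_per_gb(120, PAYLOAD_GB)     # drives shipped back empty
drives_lost = cost_per_gb(360, PAYLOAD_GB)    # two $150 drives plus $60 shipping
rate = throughput_mb_per_sec(PAYLOAD_GB, 56)  # 56-hour round trip

print(f"one way:        ${one_way:.2f}/GB")
print(f"round trip:     ${round_trip:.2f}/GB")
print(f"drives not back: ${drives_lost:.2f}/GB")
print(f"round-trip rate: {rate:.2f} MB/sec")
```

The round-trip rate lands near 5 MB/sec, in line with the figure quoted above; the exact value depends on whether a terabyte is counted as 1,000 or 1,024 GB.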

Networks use very narrow (in physical size) channels and a very limited set of media states to transfer signals

That’s right, and one of the biggest problems we’re facing right now is the storage explosion – we have mountains of cheaply stored data. That’s not a bad thing. But the pipe to get to that data is growing very, very slowly. It’s the disconnect between these two growth rates that’s the problem.

One interesting thing happens as hard drive sizes increase, without any comparable increase in bandwidth to get to the disk: you have to treat them like sequential, tape-style devices.

http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=43

JG Certainly we have to convert from random disk access to sequential access patterns. Disks will give you 200 accesses per second, so if you read a few kilobytes in each access, you’re in the megabyte-per-second realm, and it will take a year to read a 20-terabyte disk.

If you go to sequential access of larger chunks of the disk, you will get 500 times more bandwidth—you can read or write the disk in a day. So programmers have to start thinking of the disk as a sequential device rather than a random access device.

DP So disks are not random access any more?

JG That’s one of the things that more or less everybody is gravitating toward. The idea of a log-structured file system is much more attractive. There are many other architectural changes that we’ll have to consider in disks with huge capacity and limited bandwidth.
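
The numbers in that interview excerpt can be sanity-checked with a few lines. This is only a sketch: the 200 accesses per second, the 20-terabyte disk, and the 500x sequential speedup come from the quote, while the 4 KB read size is my assumption for "a few kilobytes".

```python
ACCESSES_PER_SEC = 200       # from the interview
BYTES_PER_ACCESS = 4 * 1000  # assumption: "a few kilobytes" taken as 4 KB
DISK_BYTES = 20 * 1000**4    # 20-terabyte disk, decimal units
SEQUENTIAL_SPEEDUP = 500     # from the interview

# Random access: each seek returns only a small chunk of data.
random_rate = ACCESSES_PER_SEC * BYTES_PER_ACCESS  # bytes per second
# Sequential access: the same disk, read in large contiguous runs.
sequential_rate = random_rate * SEQUENTIAL_SPEEDUP

random_days = DISK_BYTES / random_rate / 86400
sequential_hours = DISK_BYTES / sequential_rate / 3600

print(f"random access:     {random_rate / 1e6:.1f} MB/sec, {random_days:.0f} days")
print(f"sequential access: {sequential_rate / 1e6:.0f} MB/sec, {sequential_hours:.1f} hours")
```

Under these assumptions the random-access read takes most of a year and the sequential one fits comfortably inside a day, which matches the orders of magnitude in the quote.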

Yep. They’ve been doing this for years in astronomy. Since the data isn’t time critical, the main form of transfer was tape, rather than running some high-bandwidth link to some remote part of the world.

Also, Sneakernet scales upwards well. Want 10 terabytes? Your shipping costs go up a little, and your disk copy speed becomes more important, but otherwise, you’re laughing.

This reminds me of the back-of-the-envelope calculation of the “bandwidth” of medium-format (2.25" square) film. Assume film captures an image of roughly 16 megapixels in 1/1000 of a second; at three bytes per pixel, that’s 48 GB/second. I think that’s right…
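
A quick way to check that envelope math, as a sketch only. The 16 megapixels and the 1/1000 second exposure are from the comment; three bytes per pixel (24-bit colour) is an assumption, chosen because it is what makes the 48 GB/second figure come out.

```python
PIXELS = 16_000_000     # roughly 16 megapixels per frame
BYTES_PER_PIXEL = 3     # assumption: 24-bit colour
EXPOSURE_SEC = 1 / 1000  # one frame captured in a thousandth of a second

frame_bytes = PIXELS * BYTES_PER_PIXEL
bandwidth_gb_per_sec = frame_bytes / EXPOSURE_SEC / 1e9

print(f"frame size: {frame_bytes / 1e6:.0f} MB")
print(f"capture bandwidth: {bandwidth_gb_per_sec:.0f} GB/sec")
```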

“I looked at the bandwidth bill for Wikipedia, for instance, and it is actually substantially lower in the last year than the year before, despite traffic growing by a factor of 4.”
—Jimmy Wales, http://blogs.zdnet.com/open-source/?p=899

I think you mean KB not Kb, MB not Mb, GB not Gb, and TB not Tb.

Only when talking about memory space.
Data transfer rates are still measured in bits/second – thus, the use of a lower-case ‘b’.

I’m only pointing this out because every value, in the main post, is in bytes/second (which is inaccurate).
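
For anyone converting between the two conventions, the only thing that matters is the factor of eight. A minimal sketch, with a made-up helper name, that ignores protocol overhead and assumes decimal prefixes on both sides:

```python
BITS_PER_BYTE = 8


def mbit_to_mbyte(mbit_per_sec: float) -> float:
    """Convert a line rate in megabits/sec to megabytes/sec (no overhead)."""
    return mbit_per_sec / BITS_PER_BYTE


for line_rate in (10, 100):
    print(f"{line_rate} Mbit/s = {mbit_to_mbyte(line_rate)} MB/s")
```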

I can tell you this right now: sneakernet would not work anywhere in Scandinavia or in any Nordic European country.

Here I pay the equivalent of 24.7887 USD/month for about 1.3MB/s for my home connection. It’s not even that fast, with its 10Mbit/s speed up and down, when you compare it with others in Sweden. I just don’t need that much at home, but some have 100Mbit/s at home. Mine goes through fiber cables built into my house; most apartment houses have them here, and a lot of neighborhoods with houses are working together to have them installed.

My apartment building is part of a project running all over my city. Most cities have this, even smaller ones with around 20,000-30,000 people. It’s usually a cooperation between the local electric company and the various owners of apartment buildings. These usually own large parts of the city’s apartment buildings, so all their houses get this advantage.

At my job we get 100Mbit/s redundant capacity from a dedicated fiber, and I can’t say how much we pay for it, but it’s about a fifth of what you have in your chart for the OC-3 with about 155Mbit/s.

I hate math but whatever the exact values are, the US people are being ripped off. :frowning:

Perhaps the best scenario is to rent the HD from a “SneakerNet provider” (delivery companies such as FedEx are in a very good position). Every time you send a box, they replace it with an empty one, ready for copying files for the next delivery. The cost should be even cheaper. That way no HD travels empty. Waiting for delivery of an empty HD is like waiting for a huge download full of zeros! :slight_smile:

You write off the cost of the hard drive, but basically ignore the fact that, apart from the 56K modem, the others are the same monthly cost regardless of whether you download 1 byte or 1 TB. The cost for a 20GB file is a little meaningless, therefore, unless you assume that you saturate the connection for the entire month.
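
To put a number on that point, here is a rough sketch of how the per-gigabyte cost of a flat-rate link depends on how much of the month it is actually busy. The $50 fee and the 8 Mbit/s rate are hypothetical values picked for illustration, not figures from the post, and the function name is made up.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month


def flat_rate_cost_per_gb(monthly_fee: float, mbit_per_sec: float,
                          utilization: float) -> float:
    """Cost per GB on a flat monthly fee, given the fraction of time in use."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    gb_per_sec = mbit_per_sec / 8 / 1000  # Mbit/s -> MB/s -> GB/s
    gb_moved = gb_per_sec * SECONDS_PER_MONTH * utilization
    return monthly_fee / gb_moved


# Hypothetical link: $50 per month at 8 Mbit/s.
saturated = flat_rate_cost_per_gb(50, 8, 1.0)     # busy all month
mostly_idle = flat_rate_cost_per_gb(50, 8, 0.01)  # busy 1% of the time

print(f"saturated:   ${saturated:.4f}/GB")
print(f"mostly idle: ${mostly_idle:.4f}/GB")
```

The fee is the same either way, so the per-gigabyte figure is only meaningful once a utilization is stated; at 1% utilization the same link costs a hundred times more per gigabyte.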

I’m paying 25 per month for an ADSL connection that can achieve 600K/48K - my ISP offers the raw connection, without some of the additional features (webspace, email addresses, static IP and better tech support), for 20/month. It’s officially a maximum of 8Mbps down, 800Kbps up, but you’d have to virtually live in the phone exchange to get that (I live about 100 metres away and get a reported line speed of about 7Mbps).

I’m ignoring the cost of the phone line rental - I assume that you still want to use the line for a phone. At work we also have ADSL (running at 6Mbps reported line speed because we’re a few miles from the exchange) but there isn’t a phone connected, so add 10 or so on top of the ADSL cost.

I’ve seen 100Mbit uplinks at colos for something like $300 / mo.

If you buy that uplink crap then you’re probably also a member of the scientology church. A lot of web hosting companies tell you what uplink you get just to sucker you into ordering their package. In reality the uplink has nothing to do with what they rate your bandwidth at. The uplink is just where your cable is connected. Usually they’re talking about the switch port you’re connected to or the Ethernet card you get in your dedicated machine.

I’m only pointing this out because every value, in the main post, is in bytes/second (which is inaccurate).

This is intentional. I hate the “bits” nomenclature. Megabits? Really? Is this 1993? Are we using Sega’s Blast Processing to transfer our data?

Bytes is the unit of measure on computers, so I’m using Bytes.