a companion discussion area for blog.codinghorror.com

Gigabyte: Decimal vs. Binary


#81

Yes, hard drive manufacturers use SI prefixes correctly, as you observed. So do CPU manufacturers, as Sean observed. Did you complain about the GHz of your recent CPU purchase not being binary GHz?

When you hire someone for $100K, you pay them $102,400, right?

When you measure response time in milliseconds, that’s how many times 1/1024 of a second, right? You buy a 47K resistor, that means 47 x 1024 ohms? No … because you said you don’t use metric. In your recent power supply purchase, you measured it in BTUs because you don’t use metric. 1K BTU is 1024 BTUs, right?

Don’t think about petabytes. You don’t want to estimate how much spam there is each day.


#82

To me, it seems there is a perfectly logical reason why hard drives are measured in derivatives of base-2 - and that is that any and all programs, or any other bits of data, when viewed in their rawest form, are stored in binary on the actual disc. As such, it makes complete and total sense to use base-2, or some derivative of it, to measure the size of the disc.

Oh, and John Pirie? To most people buying hard drives, whether it be by themselves, or as part of a pre-built system, ‘500GB’ means nothing, really, by itself. To them, the only thing it means is that it should be the case that this is how much space Windows says the drive has. Currently, Windows, just like most other operating systems, will report a ‘500GB’ drive as 465GB. In my opinion, drive manufacturers should measure the size of their drives the same way most computers would, simply in the interests of being clear to the end user.
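
For anyone who wants to check the 465 figure, here is a rough Python sketch. The function name is made up for illustration, and it assumes the OS simply divides the raw byte count by 2^30 and labels the result "GB".

```python
GB = 10**9    # decimal (SI) gigabyte, as printed on the box
GIB = 2**30   # binary gigabyte, which many OSes label "GB"

def advertised_to_reported(advertised_gb: float) -> float:
    """Convert an advertised decimal-GB capacity to binary gigabytes."""
    return advertised_gb * GB / GIB

print(f"{advertised_to_reported(500):.2f}")  # 465.66
```

Truncated to whole units, that is the 465 a "500GB" drive shows up as.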


#83

Damn whippersnappers.

Drives have been measured in powers of ten since the dawn of time, back when 32K (yes, K) was a big expensive disk. There’s no conspiracy here. There’s simply NO GOOD REASON to measure hard drives in powers of two, never was, never will be. If you want to blame someone, try DOS. They probably converted by (binary) shifting.

Memory, on the other hand, always grows by powers of two (add an address line, get double the memory.) There it makes some sense to measure in powers of two.
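
A tiny Python sketch of the address-line point, assuming byte-addressable memory (the helper name is just for illustration):

```python
def addressable_bytes(address_lines: int) -> int:
    """Distinct byte addresses reachable with this many address lines."""
    return 2 ** address_lines

# Each extra address line doubles the addressable memory.
for lines in (15, 16, 17):
    print(lines, addressable_bytes(lines))
```

With 15 lines you get 32768 bytes, with 16 you get 65536, and so on, which is why memory sizes fall naturally on powers of two.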

Next you young turks are going to say you want to measure your broadband speed in powers of two! Arrgh!

Now get offa my lawn!

-hans


#84

I can’t even bring myself to read all the bs in the responses. Here’s a quick fix. Look at Western Digital’s new drive. It’s called a 1TB drive. Here are the physical specs:

Physical Specifications
Formatted Capacity 1,000,204 MB
Capacity 1 TB
Interface SATA 3 Gb/s
User Sectors Per Drive 1,953,525,168

see the rest at:
http://www.westerndigital.com/en/products/Products.asp?DriveID=336
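
The numbers in that spec sheet can be cross-checked with a few lines of Python. This is only a sketch: the 512-byte sector size is an assumption (the spec excerpt above does not state it), though it is the size that makes the figures line up.

```python
SECTOR_BYTES = 512            # assumption: traditional 512-byte sectors
USER_SECTORS = 1_953_525_168  # "User Sectors Per Drive" from the spec sheet

total_bytes = USER_SECTORS * SECTOR_BYTES
decimal_mb = total_bytes // 10**6   # megabytes as the manufacturer counts them
binary_gib = total_bytes / 2**30    # binary gigabytes, as many OSes count them

print(total_bytes)  # 1000204886016
print(decimal_mb)   # 1000204
print(round(binary_gib, 2))
```

The decimal figure matches the "Formatted Capacity 1,000,204 MB" line, while the binary figure comes out a little over 931.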

The Scientists see units at 2^10; that’s just the way it is. Consumers see units in base 10. Advertisers used the discrepancy to line their pockets. Western Digital was sued over this and has made the change. Now, we cannot make the Scientists change from counting data in binary units (which is how we got here to begin with), so let’s just grow up and get over it.


#85

Aaron G is completely right. The fact is, there is a meaning that most people have been assigning to terms like ‘kilobyte’ for decades. Some standards organization cannot come in and dictate that we all change how we use these terms any more than the French can be told not to use ‘e-mail’. You just cannot dictate human language. It won’t change how people use these terms, but it will cause a lot of confusion instead.

I can understand the usability reasons for using decimal notation, but redefining decades-old common terms won’t do it. Instead, all it does is cause people either to ignore the “standards” or simply to change to the new funny-sounding binary prefixes. In the end, the only ones using the decimal system are marketers. If they really wanted people to switch to base 10, new terms should have been invented for that instead (such as ‘kidebyte’).


#86

One thing that leapt out at me from the table in the article is the etymological derivation of the prefixes – the higher ones, at least.

peta, exa, zetta, yotta = penta, hexa, septa, octo = 5, 6, 7, 8

Now if those prefixes stand for 2^50, 2^60, 2^70 and 2^80 they make sense, etymologically-speaking. 5, 6, 7, 8, see?

But if they stand for 10^15, 10^18, 10^21 and 10^24 there’s no obvious relation between the names and the values – they seem entirely arbitrary, and that’s not good for memorability.

(Actually they’re not arbitrary, they’re a factor of three out, but that’s not really going to help them from a mnemonic point of view.)
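
To make the "factor of three" concrete, here is a small Python sketch. The list and function name are only for illustration; the idea is that the n-th prefix means 1000^n = 10^(3n) in decimal and 1024^n = 2^(10n) in binary.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]

def prefix_values(name):
    """Return (decimal value, binary value) for the given prefix."""
    n = PREFIXES.index(name) + 1
    return 1000 ** n, 1024 ** n

for name in PREFIXES[4:]:
    n = PREFIXES.index(name) + 1
    print(f"{name}: n={n}, decimal=10^{3 * n}, binary=2^{10 * n}")
```

So peta is n=5 either way: 2^50 in binary, and 10^15 (three times five) in decimal.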

Considering that not many people who aren’t storage nerds use peta- yet, let alone the others, the SI might as well come up with different decimal-friendly prefixes and leave the ones that actually refer to the binary powers alone. Surely?


#87

I think it’s a bit out of line to characterize adoption of the International System of Units as “an old trick perpetuated by hard drive makers.”

To a whole lot of buyers of those drives, 500G just means five hundred billion.


#88

Yobibyte sounds funny in Russian. “Yob” is a root of the Russian equivalent of “f#ck”.


#89

Makes me want to pay for the drive in Canadian dollars.


#90

@ Matias When I was 10, I remember my dad trying to explain to me the relative capacity of the 20 MEGABYTE hard drive he got in a new computer. I asked him if it was possible to ever fill up that much space on a hard drive. He said that, practically, it was not possible.

Consider the amount of work you have to do to generate simple text (without all the cruft of Word files) - if you have to do it yourself, 20mb means a lot of typing. If that’s what your dad had in mind, his point is quite understandable for that time.

I recall a friend of mine getting one of the first Pentiums (at 90MHz, which wasn’t spectacular because our 486 ran at 80) with a gigabyte of harddisk space. I thought the same - how to fill all of that? Luckily, games that came on CD wanting to have their content installed from the harddisk because double-speed drives weren’t fast enough solved that :).

Anyway, back to the 20mb drive. Go to a more advanced mode of content generation; image editing. Autodesk Animator makes 320 x 240 GIF files in 256 colors - if the computer could handle this in the first place. I think the time of 20 megabyte drives still had amber or greenscreen most of the time (I only had a C64, never an ST or Amiga). Every file is at most 60kb (if you’d make it random noise; if it contains actual art it’s probably less). By the time you’ve made several of these files, you’d use floppies anyway (because you don’t want to completely fill up the harddisk).

Now, shoot a few 4 megapixel pictures and you’re through those 20 mb.


#91

Sextillion sounds funny in English, for the same reason.


#92

Yup, that happened to me. I bought two 500 GB drives, installed them in a RAID configuration, and installed my OS expecting to see 1TB, and all I got was 931 GB. :frowning:

If the drive manufacturers are going to do that, then they should be selling 537 GB drives instead of 500 GB drives.

I WANT MY “1 TB”!!!
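
As a rough check on how big the label would have to be, here is a Python sketch going the other way: how many decimal GB a drive needs for the OS to report a given number of binary "GB". The function name is made up, and it assumes the OS divides bytes by 2^30.

```python
def decimal_gb_needed(binary_gb: float) -> float:
    """Decimal GB required for an OS (dividing by 2**30) to show `binary_gb`."""
    return binary_gb * 2**30 / 10**9

print(round(decimal_gb_needed(500), 2))  # 536.87
```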


#93

…any more than the French can be told not to use ‘e-mail’…

Which is being replaced by a French equivalent not without success. In the same way that they (the French) forced upon us the metric system, the use of family names, a birth register, etc.

Things like these are possible and done all the time.


#94

Instead of forcing these ridiculous kibibytes and gibibytes on us, which is impossible for us macho ADA .Net programmers, and has created even more confusion, why can’t the HDD manufacturers stay in SI-land without overloading our KB’s, MB’s and GB’s?

My solution is simple: write 1K B (1000 bytes) instead of 1 KB (1024 bytes for programmers and computers, but 1000 bytes for HDD manufacturers and buyers).

Everybody can save face: HDD manufacturers won’t be lying any more, just by subtly moving a space, and programmers won’t have to agree on disagreeing or agreeing with the kibibytes whenever they talk about volumes of bytes.


#96

Let us not forget that screen measurements are on the diagonal.
I still remember taking a tape measure with me to buy a TV, because at that time there was “confusion” over the “proper” way to measure the display’s size. IMO, it should always have both dimensions listed, just like the video cards do.


#97

I’d rather wait for a price drop in the terabyte drives. I want to fill a T with my 0wn rainbow tables :slight_smile:


#98

I know using 1024 as 1k etc. is part of computing history, and it especially makes sense in memory referencing. But why the hell does Windows (and other programs are just as guilty) count files on hard drives this way? Ever tried comparing file sizes across two different programs when one’s in megabytes and one’s in gigabytes, and you don’t know if they’re working off 1024 or 1000? You end up having to look at the full number in bytes to be sure you’re correct.
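
Here is a quick Python sketch of why the same file looks different in two programs. The `format_size` helper is hypothetical; it just formats a byte count with either base and labels the units so you can tell which one was used.

```python
def format_size(num_bytes: int, base: int = 1024) -> str:
    """Format a byte count using decimal (base 1000) or binary (base 1024) units."""
    if base == 1000:
        units = ["B", "kB", "MB", "GB", "TB"]
    elif base == 1024:
        units = ["B", "KiB", "MiB", "GiB", "TiB"]
    else:
        raise ValueError("base must be 1000 or 1024")
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that fits, or at the largest unit available.
        if value < base or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= base

size = 1_500_000_000
print(format_size(size, 1000))  # 1.50 GB
print(format_size(size, 1024))  # 1.40 GiB
```

Same file, two numbers. If both programs just print "GB", the only safe comparison is the raw byte count.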


#99

Mainstream computers (this means every PC, Mac or otherwise) are binary computers without exception. We store our data on hard drives that write the data as binary (ones and zeros only please).

Data is written to the hard drive in blocks whose sizes are ALWAYS multiples of 512 bytes. For example, 512 bytes, 1024 bytes, 4096 bytes.

There was an example of using the Linux command “dd”, which was:

The problem has an easy solution:

The OS needs to start using SI prefixes correctly. Linux already does.

$ dd if=/dev/zero of=test bs=1MB count=10
10+0 records in
10+0 records out
10000000 bytes (10 MB) copied, 0.0261481 seconds, 382 MB/s

There’s no reason to use powers of two for displaying file sizes. It’s ridiculous and makes it more confusing for the user.
Sean on September 11, 2007 03:38 PM

This is a horrible example of I/O use. The bs variable should also be one of the example block sizes above (512, 1024, 4096), and specifically should be the block size of the filesystem to which you are writing. I agree that Linux is literal in its interpretation of MB, but you are using the “dd” command poorly, Sean. Linux will do what you are asking, but please don’t waste your lovely OS’s time with commands like that. :stuck_out_tongue:
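
To show the alignment point, here is a small Python sketch. It assumes 512-byte sectors and GNU dd's suffix convention, where bs=1MB means 1000*1000 bytes and bs=1M means 1024*1024 bytes; the helper name is made up.

```python
SECTOR = 512  # assumption: 512-byte sectors

def is_sector_aligned(block_size: int) -> bool:
    """True if the block size is a whole number of sectors."""
    return block_size % SECTOR == 0

block_sizes = {"bs=1MB": 1000 * 1000, "bs=1M": 1024 * 1024, "bs=4096": 4096}
for label, size in block_sizes.items():
    whole, remainder = divmod(size, SECTOR)
    print(f"{label}: {whole} sectors, {remainder} bytes left over, "
          f"aligned={is_sector_aligned(size)}")
```

A decimal megabyte leaves 64 bytes dangling past the last full sector, while the binary sizes divide evenly.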

Finally, when you are referencing data in the processor you use binary to do so, and we humans read hexadecimal more easily. In hexadecimal, 1024 KiB is 0x400 KiB and 1000 KB is 0x3E8 KB.
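
Those hex values are easy to confirm in Python with the built-in `hex` and `bin` functions (the variable names are only for illustration):

```python
binary_k = 1024
decimal_k = 1000

print(hex(binary_k), bin(binary_k))    # 0x400 0b10000000000
print(hex(decimal_k), bin(decimal_k))  # 0x3e8 0b1111101000
```

1024 is a single set bit; 1000 is not a round number in either binary or hex.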

If you want to use metric, be an engineer. Meanwhile, leave my metrics alone. 1024 bytes is 1 KB. The computer knows this. The programmer knows this. Even web developers use hex. Color codes are hexadecimal.

Do mechanical engineers want me to redefine 1 meter as 99.53 cm? No, they don’t. Leave your decimal out of my pure binary computer.


#100

I think the problem is that in CS, we can use whatever word for whatever purpose and it’s OK. Now, CS is a large field that has a great tendency to overlap other fields, like finance (for example). For a computer scientist, it’s OK for a kB to be 1024 bytes. For a manager, it’s not. Before, kB was used only by computer scientists; now it is used everywhere.

I personally think this all comes down to whether we teach the world about the 1024 convention (like in school, when we learn about kilometers) or we switch to using kilo to mean one thousand.

I also think that 1 kB as 1024 is useful when programming.