Gigabyte: Decimal vs. Binary

Peter_K · September 11, 2007, 12:00am

Sean: Yes it is correct, it’s just one eighth of “1 megabyte in bytes”.

Sean · September 11, 2007, 12:00am

Peter: megabit is 1 million bits. It has never meant 1,048,576 bits.

I still don’t get why you are so stuck on measuring file size in powers of two.

Is my 3 GHz machine running at 3.2 billion hertz? No. It’s running at exactly 3.0 billion hertz.

When you start seeing a file that’s 1,500,000 bytes as 1.5 megabytes and not 1.43 megabytes, your brain will feel much better.

LucM · September 11, 2007, 12:00am

The problem is not when you compare 2 hard drives.

The problem comes when you have a 500GB drive (1000-based) and you need a real 500GB (1024-based).
People who work with computers (admins, programmers…) can deal with that.
But ordinary people are very confused by it.
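
To put a number on that gap, here is a minimal Python sketch. It assumes the drive is advertised in 1000-based gigabytes and that the operating system reports 1024-based ones; the function name `advertised_gb_to_gib` is made up for illustration.

```python
# Convert an advertised (1000-based) capacity into 1024-based gigabytes.
GB = 10**9    # decimal gigabyte
GIB = 2**30   # binary gigabyte (gibibyte)

def advertised_gb_to_gib(gb: float) -> float:
    """Return how many 1024-based gigabytes fit in `gb` 1000-based gigabytes."""
    return gb * GB / GIB

reported = advertised_gb_to_gib(500)
print(f"A 500 GB drive holds about {reported:.2f} GiB")
print(f"Apparent shortfall: about {500 - reported:.2f}")
```

Under those assumptions the "500GB" drive shows up as roughly 465.66, which is the gap an ordinary buyer runs into.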

coderprof · September 11, 2007, 12:00am

Memory is one area where the power of 2 meaning makes complete sense. Since it’s all manufactured as 2^n, it would just confuse people to call your 1GB RAM “1.073 GB of memory”.

Josh77 · September 11, 2007, 12:00am

Yobibi and Zebibi were the names of Saddam’s sons, right? =P

AndrewR · September 11, 2007, 12:00am

Bah!

If you ask me, the standards organisations should be making sure that our conventions are well documented and standardised. Not running around self-importantly defining and redefining things we’ve used for years. There is a reason we use binary measurements.

Ideally they could have gone out and defined “when is kilo binary and when is it decimal?” - after all - that is often confusing. Heck they could have simply defined the standard to say “kilobyte often means this, but sometimes that”.

Then they could have run around adding well-defined “kibi” stuff to their heart’s content.

Instead they made the confusion worse by actively trying to delete the common definition. A bunch of nerds with no understanding of the social outcome of their actions, if you ask me.

GenoH · September 11, 2007, 12:00am

Isn’t the important thing that, when we start saying “tebibyte”, that our theme song is already mostly written? http://youtube.com/watch?v=19MNzKL5Swk

Sean · September 11, 2007, 12:00am

“There is a reason we use binary measurements.”

Andrew: care to fill us in?

Please, just list one benefit for having a file that’s 2,500 bytes be represented as 2.44 KiB instead of 2.5 KB.
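
For reference, the two representations being argued about can be produced side by side. This is only a sketch: `format_size` is a hypothetical helper, not what Explorer or any other app actually does, and the decimal unit labels follow the spelling used in this thread ("KB") rather than the SI "kB".

```python
def format_size(num_bytes: int, binary: bool = True) -> str:
    """Format a byte count with 1024-based (KiB, MiB…) or 1000-based (KB, MB…) units."""
    base = 1024 if binary else 1000
    units = (["B", "KiB", "MiB", "GiB", "TiB"] if binary
             else ["B", "KB", "MB", "GB", "TB"])
    size = float(num_bytes)
    # Step up one unit at a time until the value fits below the base.
    for unit in units[:-1]:
        if size < base:
            return f"{size:.2f} {unit}"
        size /= base
    return f"{size:.2f} {units[-1]}"

for n in (2_500, 1_500_000):
    print(n, "bytes =", format_size(n, binary=True), "=", format_size(n, binary=False))
```

The 2,500-byte file comes out as 2.44 KiB or 2.50 KB, and the 1,500,000-byte file mentioned earlier as 1.43 MiB or 1.50 MB.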

Eric · September 11, 2007, 12:00am

Most annoying about the SI prefix are the graph legends…
when you see a graph with a little ‘m’ next to the y axis, is that milli or mega?
The general rule of lowercase being less than one helps, except that ‘k’ is kilo, meaning 1000 (or preferably 1024 depending on the actual metric). Shame there is always an exception!

J__Stoever101 · September 11, 2007, 12:00am

Let’s not kid ourselves: the only reason we all already KNOW that difference is because practically ALL of us got cheated once and then asked someone and learned why we were some MBs short. I’m pretty sure that in a lawsuit it could be reasonably argued that no normal first-time customer can be expected to know the difference, and hence that it is intentionally misleading or something of that sort.

Kit · September 11, 2007, 12:00am

I work for a networked storage solutions company that does both hardware and software, and I will tell you that the IEEE notation is never used. In fact, I wrote a .NET class to manage storage capacities into the ‘yotta’ range, and during the work came across much of the research you did. End result: most people don’t care, since we are talking about 3 orders of magnitude between prefixes, even though the discrepancy grows larger with each one.
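
How fast the discrepancy grows can be checked with a few lines of Python. This is an illustrative sketch, not the .NET class mentioned above; `binary_excess_percent` is a made-up name.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]

def binary_excess_percent(power: int) -> float:
    """Percent by which 1024**power exceeds 1000**power."""
    return (1024**power / 1000**power - 1) * 100

# One line per prefix: the gap between the binary and decimal reading.
for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:>5}: {binary_excess_percent(power):.1f}%")
```

The gap starts at 2.4% for kilo and reaches roughly 21% by yotta.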

AndrewR · September 11, 2007, 12:00am

Sean (“care to fill us in?”):

Sure. Pretty much anything that involves working with memory involves measuring in binary units. Because that’s how memory gets addressed. A 32-bit computer is so named because it uses 32 binary bits to address… wait for it… 4GB (sigh: GiB) of memory!

As a programmer I have no interest in running around talking about 4.294967296 “GB” of memory. Or convoluting my tongue with silly made-up words.
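
The arithmetic behind those figures, as a small Python sketch (it assumes byte-addressable memory, i.e. one byte per address):

```python
ADDRESS_BITS = 32

# Number of distinct addresses, and therefore bytes, reachable with 32 address bits.
addressable_bytes = 2**ADDRESS_BITS

print(addressable_bytes)            # 4294967296
print(addressable_bytes / 2**30)    # 4.0 in 1024-based gigabytes
print(addressable_bytes / 10**9)    # 4.294967296 in 1000-based gigabytes
```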

Andrew · September 11, 2007, 12:00am

@sean

I am a different Andrew, but I will answer your question. Most things are stored, accessed, or otherwise dealt with in powers of two. There is a technical reason why you use powers of two, mostly dealing with the fact that you have a binary state (high voltage vs. low voltage) which is used to compute everything. Memory, for instance, is a series of binary elements indexed by an address defined by a series of binary elements. Wikipedia can give you more information (look up a muxer, for instance) but basically everything is stored in a medium (almost everything, anyway) that in the end is a power of two. As was previously mentioned, 1 gibibyte of RAM is an exact number, because that is how it is built (1024 mebibytes). So it makes sense to advertise it as such, rather than as 1.074 (decimal) gigabytes. Early geeks just thought it was kind of cool how you could round down the numbers and not lose much, which is what got us to where we are today.

Nowadays, however, the difference between 1000 and 1024 (ambiguous) gigabytes is five movies, or your music library, or your photography collection. There is a practical aspect to the binary notation; it is just unfortunate that there is so much momentum to stick with it.

Hope that helps clear up some confusion about the topic.

GT14 · September 11, 2007, 12:00am

Many fields of study invent their own terminology for their use by co-opting words from general use. Using these terms differently does not make them “wrong”, it just makes them technical jargon specific to the field.

Within the field of computer science, one kilobyte = 1024 bytes. This isn’t wrong, in fact, the other view (1 KB = 1000 bytes) IS wrong. It’s wrong on several levels.

First, it attempts to use the wrong meaning of an overloaded word, rather than the one that is correct in context. You don’t complain when a physicist talks about the color or spin of a fundamental particle, even if you know the particle has no “color” nor is it “spinning”. The physicist isn’t wrong, he’s just using terms correctly in a physics context, where their meaning differs from other contexts. Or is the assertion here that physicists can co-opt words for their own meaning, but computer scientists are “wrong” for doing the same?

Second, it mistakes “byte” for an SI unit. According to SI, a kilometer is 1000 meters. This is very true. However, it’s absolutely false that, according to SI, a kilobyte is 1000 bytes. SI has no more to say on how many bytes are in a kilobyte than it has to say on how many feet are in a mile, since neither miles nor bytes are SI units. Incidentally, kilobyte has traditionally been abbreviated KB; note the capital K. The SI “kilo” prefix is abbreviated with a small “k”, but since the “kilo” in kilobyte is NOT the SI prefix, this is irrelevant.

Sean · September 11, 2007, 12:00am

andrews: you both explained why memory is sized in powers of two (actually it’s multiples of bus width, but that’s a power of two).

Neither explained why it’s better for explorer and other apps to describe filesize and diskspace in powers of two. Why is the average user exposed to this?

lubos · September 11, 2007, 12:00am

"Google’s wrong.
http://www.google.com/search?hl=en&q=1+megabit+in+bytes&btnG=Search
That’s not correct by anyone’s definition”

Actually what google says: 1 megabit = 131 072 bytes
is in fact correct… do the math: 1024*1024/8 = 131072
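
Spelled out in Python, the two readings of "1 megabit in bytes" look like this. Which one Google's calculator intends is taken from the figure quoted above; the variable names are just for illustration.

```python
BITS_PER_BYTE = 8

# Binary reading: 1 megabit = 1024 * 1024 bits.
binary_megabit_bytes = 1024 * 1024 // BITS_PER_BYTE

# Decimal reading: 1 megabit = 1,000,000 bits.
decimal_megabit_bytes = 1000 * 1000 // BITS_PER_BYTE

print(binary_megabit_bytes)    # 131072
print(decimal_megabit_bytes)   # 125000
```

So the 131 072 figure matches the binary reading, while the "1 million bits" definition argued for earlier in the thread gives 125 000 bytes.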

Rob_Janssen · September 11, 2007, 12:00am

I don’t know about you, but I can grasp “30 hours of video” better than “500 GB”.

Yeah, but then you get to the next Monty Python question - are those 30 hours MPEG-2, DivX, DVD-quality, HD-quality at 720, 1080, or…? Same with the “songs” metric for iPods - bitrate isn’t taken into account, just the default length of a pop song.

As long as we can get ’em to switch after tera, then I’ll be happy enough.

gwenhwyfaer · September 12, 2007, 12:00am

Karel: “Baud is not bit per second, it is symbol per second.”

Indeed. That’s why I wrote “naturally encapsulates” rather than “is” and took care to specify a serial line.

apeinago5 · September 12, 2007, 12:00am

gi-bE-BYE-TUH

bin-ary
or

gi-bye-bye-tuh?

as in bye-nary

TijmenS · September 12, 2007, 12:00am

What I don’t see explained, and didn’t know, is: how can 1TB drives have different capacities? I thought they always sported 10^12 raw bytes.

You know what is even more of a cheat than the fact above? Some laptops and computers have a hidden “rescue partition” with a copy of the OS installed in it for easy recovery. I still remember a friend of mine who got so angry at a computer store that sold him a 60 GB laptop that had only 25 GB of free space after a clean installation - apparently Windows XP + all the “apps” took 15 GB and there was a 20 GB recovery partition. After a lot of swearing he got an 80 GB laptop of the same brand though (which still had only 50 GB of free space - here the rescue partition of an otherwise identical install was only 15 GB).