Gigabyte: Decimal vs. Binary

Joost wrote: “My solution is simple: write 1K B (1000 bytes) instead of 1 KB (1024 bytes for programmers and computers, but 1000 bytes for HDD manufacturers and buyers).”

And how would you pronounce “1K B” so it’s distinguishable from “1 KB”? Not all communication is written or electronic. Sometimes you actually have to talk to people.

“I am a CS major. I am utterly unaware of the world outside my tiny little shell. EVERYBODY IN THE WHOLE WORLD thinks that the SI prefixes mean powers of 2, and there is SO much history behind this usage – literally DOZENS of years! Nobody uses the peta- prefix except for people talking about HARD DRIVES!”

(I’m a computer engineer myself, see above, just seriously amused by all this.)

DOZENS == sixteens, right?

" It’s an old trick perpetuated by hard drive makers-- they intentionally use the official SI definitions "

Actually it isn’t intentional. There is a good reason why hard disks use SI prefixes while other things like RAM use binary ones. It’s a bit confusing for customers, though. I already wrote about this on my blog.

http://instantfundas.blogspot.com/2007/08/1-gigabyte-is-not-equal-to-1024_22.html

I don’t know if this would be clearer.

I had made a comment to the effect that 1024 bytes is 0x400 bytes in hexadecimal and that 1000 bytes is 0x3E8 bytes in hexadecimal.

Maybe if I were to use binary it would be more obvious.

1024 bytes is 10000000000 bytes in binary.
1000 bytes is 01111101000 bytes in binary.

I call 1,048,576 bytes a megabyte, or 1 MB.

1,048,576 bytes is 100000000000000000000 bytes in binary.
1,000,000 bytes is 011110100001001000000 bytes in binary.
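For anyone who wants to check those bit patterns, here is a minimal sketch. The helper name `show` is made up for illustration; the padding widths (11 and 21 digits) are chosen only to line the numbers up the way they appear above.

```python
# Hypothetical helper: render a byte count in decimal, hexadecimal and
# zero-padded binary so the power-of-two pattern is easy to see.
def show(n: int, width: int) -> str:
    return f"{n:>9,} = 0x{n:X} = {n:0{width}b}"

for n, width in [(1024, 11), (1000, 11), (1_048_576, 21), (1_000_000, 21)]:
    print(show(n, width))
```

The powers of two come out as a single 1 followed by zeros, while the powers of ten have no obvious pattern in binary.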

Our computers run countless billions of binary operations all day long and only convert to decimal when we humans need to see the data. Often it will display in hexadecimal for a kind human willing to meet the computer halfway.

At the end of the day the computer is the final judge, and it clearly prefers to think of KB, MB and GB in terms of a binary number in the form of a 1 followed by 10, 20 or 30 zeros respectively.

This lovely machine has been programmed to convert the binary to decimal when needed, so let’s not force our “arbitrary” metric on it.

I know the poor marketing sod is a soulless bag of crap and is lying his/her booty off on the front of the box with a statement that 1,000,000,000 bytes is a GB.

I mean really now! Are you telling me that you are not skeptical of advertising already? Don’t we as a planet take it for granted that all marketing people on earth are liars? They have gotten degrees in deception making and make a living distorting truth for the financial gain of their employer to the detriment of everyone else on earth.

There is no need for this debate. Computers will continue to use KB, MB and GB internally as powers of two. Marketing people will use powers of ten because it is a convenient lie. (They love the convenient ones, as they make their worthless lives easier.) Educated consumers already know the exchange rate of marketing to computer science MBs.

As a final note, I would like to suggest that all marketing professionals commit suicide.

Again, that is simply a suggestion.

It took me about a day to get over the retarded names of the SI units. Then I realized that they are not much worse than the base 10 units.

It’s obvious that memory is addressed via bits and therefore the maximum theoretical addressable memory is always a power of two. Memory requirements don’t necessarily scale like that.

Also it’s funny that Knuth comments on this, where he basically agrees but says the names are too funny to be taken seriously.

http://www-cs-staff.stanford.edu/~knuth/news99.html

Once we slam a spacecraft or two into something in space we’ll probably think they are less funny.

I cannot believe the reaction from people decrying the new binary prefixes. Here is the issue.

Point #1. It can make sense to talk about both a decimal gigabyte and about a binary gigabyte. Can people accept that?

Result: We need two different units in order to talk about these things unambiguously. This gives us three options: Use the existing usage to mean the binary unit, use it to mean the decimal unit or define two new prefixes for each ‘type’ of gigabyte.

Two new units is rather foolish. And which sounds more reasonable: to state that the kilo prefix has an exception for certain units, but that there’s this new prefix for that unit that maps to the more standard usage of kilo? Or to make it so that kilo always, always, always means 10^3, and create the new prefix to always mean 2^10?

I think you know the answer.

Quite relevant to this post.

http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=storage&articleId=9045141&taxonomyId=19&intsrc=kc_top

What we have:
•kB means 1000 B by official SI definition.
•kB means 1024 B in traditional memory-related computer domain.
•KiB means 1024 B by official IEEE 1541 definition (note the capital “K”).

So kB is more or less ambiguous, depending on how the context relates to memory:
-RAM uses k=1024, else we get holes in the address space (yikes!).
-Bandwidth uses k=1000, because it has no power-of-2 constraint.
-Hard drive capacity uses k=1000, but chunk allocation uses k=1024 to fit nicely in RAM.
-Audio CD uses non-power-of-2 chunk allocation because it is a streaming medium (data rate being more important than addressing).
-Flash memory is treated like RAM if it holds a BIOS, like a hard drive if it’s assembled into a USB pen.
etc. etc.

Some suggestions to remove ambiguity:

•State how much your kB is (visible every time a size is displayed, not buried deep in the doc).
Verbose, but easy fix to add.

•Use “KdB” to mean 1000 B (d standing for decimal).
Non-standard, but as compact as can be.

•State both kB and KiB.
Might look bloated, but is also the most informative.

Use k=1024 only when really necessary.
Remember the rest of the world uses k=1000, and rightly so.
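The “state both kB and KiB” suggestion could look something like this. It is only a sketch; the function name `both_units` and the two-decimal formatting are my own choices, not anything standard.

```python
# Hypothetical formatter: show one byte count in both the SI unit
# (kB, k = 1000) and the IEC unit (KiB, k = 1024).
def both_units(n_bytes: int) -> str:
    kb = n_bytes / 1000    # SI kilobytes
    kib = n_bytes / 1024   # IEC kibibytes
    return f"{kb:.2f} kB ({kib:.2f} KiB)"

print(both_units(1024))  # 1.02 kB (1.00 KiB)
print(both_units(1000))  # 1.00 kB (0.98 KiB)
```

It is a few characters longer than a single unit, but nobody has to guess which k was meant.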

@Luc: “We have a problem when we have a 500GB (1000 based) and you need a real 500GB (1024 based).
Workers in computer (Admins, programmers…) can deal with that.
But ordinary people are very confused about that.”

I think we’re using a very odd value of “ordinary”. If you need a 500GiB(base 2) hard-drive (which are pretty rare), surely it’s more sensible just to get a 600GB(base 10). Easier to find, and one touch more space.
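A quick check of that arithmetic, as a sketch (the variable names are just for illustration):

```python
GIB = 2**30   # binary gigabyte (gibibyte)
GB = 10**9    # decimal gigabyte

needed_bytes = 500 * GIB        # a "real" 500 GiB of data
print(f"{needed_bytes:,} bytes")
print(f"{needed_bytes / GB:.1f} GB needed")  # 536.9 GB needed
print(needed_bytes <= 600 * GB)              # True
```

So 500 GiB is roughly 537 decimal GB, and a 600GB drive covers it with room to spare.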

I think you are right, and I think that we are being short changed. I also bought a 500GB hard drive and found the same problem.
When I buy a car, I expect to get it home with four seats - not three. I also expect to get four wheels - not three.
If computers are logical systems, then we should try to talk about them in the same way. In maths, it is acceptable to round figures up or down, as long as it is done accurately, for simple notation. Thus if the drive is 465GB (rounded up or down to suit), then that is what it should be sold as. If all manufacturers followed suit, there would be no problem, no exaggeration, and no feeling among the public of being short changed.

I think software should start incorporating units as if they were fonts. If you are European, you leave the inch and foot units out and you never encounter them anywhere on your system again! Same with bits, bytes and kilos. If you are a nerd, you install a 1024 kilo unit, if you are a stock trader you install 1000 as kilo, if you are a drug dealer you install K as kilo and if you are blond you install “big” as kilo. And if this function is not useful enough to incorporate for the kilo-nerds, then please do it to get rid of that freakin imperial shit. Oh, and make page sizes font-like too! Trash the letter and tabloid. A0-A6 is all we need.

“they intentionally use the official SI definitions of the Giga prefix so they can inflate the sizes of their hard drives”

What a load of crap. Hard drives have been measured in powers of ten since they were first invented; long before the DOSes and MacOSes of the world started reporting sizes in powers of two.

The problem here is Microsoft, not marketing. What conceivable benefit is there to reporting a 100,000,000,000 byte drive as “93 GB” in one place and “95,367 MB” in another place? None. Microsoft’s notation is stupid and useless.
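Those two figures are easy to reproduce. This is only a sketch of the arithmetic, not a claim about how Windows formats sizes internally:

```python
drive_bytes = 100_000_000_000   # the drive as sold: 100 decimal GB

gib = drive_bytes / 2**30       # divided by a binary "gigabyte"
mib = drive_bytes / 2**20       # divided by a binary "megabyte"

print(f"{gib:.0f} GB")          # 93 GB
print(f"{mib:,.0f} MB")         # 95,367 MB
```

Both numbers are binary units wearing SI labels, which is why neither matches the 100 GB on the box.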

Western Digital was absolutely correct in their response to getting sued:

‘Surely Western Digital cannot be blamed for how software companies use the term “gigabyte”—a binary usage which, according to Plaintiff’s complaint, ignores both the historical meaning of the term and the teachings of the industry standards bodies. In describing its HDD’s, Western Digital uses the term properly. Western Digital cannot be expected to reform the software industry. … Apparently, Plaintiff believes that he could sue an egg company for fraud for labeling a carton of 12 eggs a “dozen,” because some bakers would view a “dozen” as including 13 items.’ http://paulhutch.com/wordpress/?p=214

Using “G-” to mean “1,073,741,824” is just wrong, plain and simple.

For those who haven’t yet, you’ll want to check XKCD for a definitive standard on the topic:

http://xkcd.com/394/

Well if it’s tradition to always use binary prefixes then someone should change the Ethernet spec and other networking standards which have always used decimal prefixes, not binary… the only thing that’s naturally binary is memory (RAM)… hard drives shouldn’t necessarily be.

What it comes down to is that consumers are STUPID! They believe Microsoft Windows when it tells them a file’s size is 1 GB when really it’s 1 GiB… Maybe the dumbasses should try suing Microsoft for supplying faulty software instead of going after hard drive manufacturers.

The MBR disk format for hard drives has an upper limit of 2TiB per partition. If you have a disk that’s more than 2TiB in size, you need to switch to the GPT format, which most OSes have only recently made available (e.g. Windows XP 32-bit doesn’t support GPT).

That definitely is a case where the binary/decimal confusion arises; while 2TB drives are pretty rare, it’s not hard to build a big RAID array over the 2TiB limit. You do have to keep track of the fact that it’s a 2TiB limit, not a 2TB limit. Single HDDs are up to 1.5TB, so it really won’t be long before HDD manufacturers make disks that don’t work with Windows XP. That will really annoy the anti-Vista zealots.
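Here is where the 2TiB figure comes from, as a sketch. It assumes the usual 512-byte sectors; MBR partition entries store sector counts in 32-bit fields.

```python
SECTOR_BYTES = 512    # assumption: traditional 512-byte sectors
MAX_SECTORS = 2**32   # largest count a 32-bit MBR field can describe

mbr_limit_bytes = SECTOR_BYTES * MAX_SECTORS
print(f"{mbr_limit_bytes:,} bytes")
print(mbr_limit_bytes == 2 * 2**40)                    # True: exactly 2 TiB
print(f"{mbr_limit_bytes / 10**12:.3f} decimal TB")    # 2.199 decimal TB
```

In decimal terms the limit sits at about 2.2 TB, so a drive sold as “2 TB” still fits under it, with about 0.2 TB of headroom.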

http://www.newegg.com/Product/Product.aspx?Item=N82E16822116084 vs http://www.newegg.com/Product/Product.aspx?Item=N82E16822136322
One of them is advertised as a 147 GB hard drive and the other as 150 GB. Is one actually bigger than the other, or is Fujitsu just more honest than Western Digital?

Honestly, it never used to be a problem. SI prefixes have always been powers of two for binary quantities (which is only bytes) and powers of ten for decimal quantities.

It’s well known by anyone that actually needs to know - I never get confused. Bytes are always powers of two. (Network communications speeds are number of bits, not bytes and take powers of ten prefixes.)

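The worst was the “1.44MB” disks. These are actually 1440 kilobytes. (1440 * 2 ^ 10 bytes, in case people aren’t keeping up.) So the “MB” on the label is 1000 * 1024 bytes, which is neither the decimal nor the binary megabyte.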
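For the record, the floppy arithmetic works out like this, assuming the standard 1440 KiB capacity of a 3.5-inch high-density disk:

```python
floppy_bytes = 1440 * 1024            # 1440 binary kilobytes

print(floppy_bytes)                   # 1474560
print(floppy_bytes / 10**6)           # 1.47456 decimal megabytes
print(floppy_bytes / 2**20)           # 1.40625 binary megabytes
print(floppy_bytes / (1000 * 1024))   # 1.44 of the mixed "megabyte" on the label
```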

Hard drive manufacturers definitely used to be more generous. I remember a 20 MB hard disk drive that had a bit more than (from memory) 21 million bytes capacity. Back in those days, they actually made sure they met what it said on the box, and you actually got more, no matter how you measured your megabyte.

(And kibibyte? It’s stupid. Sounds like the unit of food eaten by an ISO standard cat in a cat food eating time unit. And a bunch of nerds trying to be nerdy.)

I can picture future cyber punks and general underground hoodlums now…

“Hey homey, you bustin’ some yo yo yo worth of yobibyte warez for me?” 8^D

Jeff, You’re losing it.

The 1024 vs 1000 issue is so irrelevant. Every hard drive manufacturer uses the 1000 measurement so when you’re deciding which drive to buy you can safely compare and know that you aren’t getting one product with a smaller capacity than the other.

So what if you don’t get a nice round free space number when you install the drive in your PC. The only time the issue might be a problem is if you have exactly 500GB (1024) of data and you try to buy a hard drive to hold it.

This article was just padding.