- Prefixes for binary multiples
When is a kilobyte really a kibibyte? And an MB really an MiB? The short answer: quite often.
Have you ever wondered why your hard drive, which nominally has a storage capacity of 500 gigabytes, actually shows only about 466 gigabytes? The reason lies in a peculiar situation relating to prefixes for units. Let's dig further into it.
Digital computers are fundamentally binary systems, so their low-level behaviour is described in terms of the binary number system, based on two digits, the famous "zeros and ones". The prefixes for multiples of quantities such as file size and disk capacity, however, are based on the decimal system, which has ten digits, from zero to nine. Strictly speaking, a prefix like "kilo" has only one meaning: one thousand times the given unit (so that a kilometre is 1000 metres, and so on). In practice, though, the mismatch between the two systems is often handled by informally referring to a "binary kilo", which equals 1024. Are you confused?
Don't be surprised: even the most tech-savvy people often confuse the actual kilo with the somewhat mysterious binary kilo.
Figure: Metric prefixes from yocto (10⁻²⁴) to yotta (10²⁴)
Bits and bytes
The basic measurement unit for binary data is the bit, where 1 bit is the quantity of information conveyed by one binary digit, either 0 or 1. Computers handle data whose amount is measured in bits or bytes (a byte being a sequence of 8 bits), or in bits per second when data is transferred, for example through an internet connection or from a hard disk to the main memory of a computer. In most cases, however, computers deal with large numbers of bits (and bytes!), so large that resorting to prefixes is an obvious requirement.
Years ago, at a time when computer capacities barely matched the few tens of thousands of bytes required by this single web page, computer engineers noticed that the binary 2¹⁰ (1024) was very nearly equal to the decimal 10³ (1000) and, purely as a matter of convenience, they began referring to 1024 bytes as a kilobyte. Yes, a kilobyte should be 1000 bytes, but it was, after all, only a 2,4 % difference, and all the professionals generally knew what they were talking about among themselves. The (il)logic was set: one megabyte was understood to be 2²⁰ (1 048 576) bytes instead of 10⁶ (1 000 000) bytes, and so on.
Despite its inaccuracy and the inappropriate use of the decimal SI prefix "kilo" for binary values, the term was also easy for salesmen and shops to use, and it caught on with the public.
As time has passed, data has become big (!), and kilobytes have grown into megabytes, then gigabytes, and now terabytes. The problem is that, at the SI tera-scale (10¹²), the discrepancy with the binary equivalent (2⁴⁰) is no longer the 2,4 % seen at kilo-scale but approaches 10 %. At exascale (10¹⁸ and 2⁶⁰), it is about 15 %. Mathematics simply dictates that the larger the prefix, the larger the relative difference, so the inaccuracies, for engineers, marketing staff and the public alike, are set to become more and more significant.
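The growth of the discrepancy can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function name `binary_excess_percent` and the prefix list are illustrative choices, not part of any standard.

```python
# Decimal prefixes from kilo (10**3) up to exa (10**18), one step of 10**3 each.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa"]


def binary_excess_percent(step: int) -> float:
    """Percentage by which 2**(10*step) exceeds 10**(3*step).

    step = 1 compares 1024 with 1000, step = 2 compares 1 048 576
    with 1 000 000, and so on.
    """
    binary_value = 2 ** (10 * step)
    decimal_value = 10 ** (3 * step)
    return (binary_value / decimal_value - 1) * 100


for step, name in enumerate(PREFIXES, start=1):
    excess = binary_excess_percent(step)
    print(f"{name:>4}: 2^{10 * step} vs 10^{3 * step} -> {excess:.1f} % larger")
```

Each step up the prefix ladder multiplies the ratio by another factor of 1,024, which is why the relative difference keeps growing rather than staying at the kilo-scale value.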
Similar confusions arose between the computing and the telecommunications sectors of the IT world, where data transmission rates have grown enormously over the past few years. Network designers have generally used megabits per second (Mbit/s) to mean 1 048 576 bit/s, while telecommunications engineers have traditionally used the same term to mean 1 000 000 bit/s. Even the usually stated bandwidth of a PCI bus, 133,3 MB/s based on it being four bytes wide and running at 33,3 MHz, is inaccurate because the M in MHz means 1 000 000 while the M in MB means 1 048 576.
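The PCI figure can be worked through as a short calculation. This is only a sketch: it assumes the 33,3 MHz clock is exactly one third of 100 MHz, and the variable names are illustrative.

```python
# Assumption: "33,3 MHz" is taken as 100/3 MHz, with M meaning 10**6.
CLOCK_HZ = 100e6 / 3
BUS_WIDTH_BYTES = 4  # the bus is four bytes wide

# One transfer of four bytes per clock cycle.
bytes_per_second = BUS_WIDTH_BYTES * CLOCK_HZ

# Decimal interpretation: M = 1 000 000.
decimal_mb_per_s = bytes_per_second / 10**6

# Binary interpretation: M = 1 048 576 (strictly, this is MiB/s).
binary_mib_per_s = bytes_per_second / 2**20

print(f"{decimal_mb_per_s:.1f} MB/s with M = 10^6")
print(f"{binary_mib_per_s:.1f} MiB/s with M = 2^20")
```

The quoted 133,3 figure only comes out when the M in MB is read as 1 000 000; read as 1 048 576, the same bus delivers roughly 127 per second.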
Mathematics dictates that the disparities resulting from the mixed and incorrect use of decimal prefixes will become increasingly significant as capacities and data rates continue to grow. In the IEC 80000-13:2008 standard, all branches of the IT industry, and in fact society at large, have a tool with which to iron out this inconsistency. For each decimal prefix, the standard defines a binary counterpart: the decimal “kilo” is matched by a “kilobinary” prefix, named “kibi”, with the symbol “Ki”; the decimal “mega” is matched by a “megabinary” prefix, named “mebi”, with the symbol “Mi”; and so on. This makes it possible to state the difference in a technically correct way: 1000 bytes are 1 kilobyte (kB), whereas 1024 bytes are 1 kibibyte (KiB).
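To illustrate the two families of prefixes side by side, here is a minimal sketch of a byte-count formatter in Python. The function `format_bytes` and its `binary` parameter are hypothetical names chosen for this example, and the symbol "B" for byte is assumed; the prefix symbols follow the decimal "k, M, G, ..." and binary "Ki, Mi, Gi, ..." pattern described above.

```python
DECIMAL_SYMBOLS = ["", "k", "M", "G", "T", "P", "E"]        # powers of 1000
BINARY_SYMBOLS = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei"]   # powers of 1024


def format_bytes(n_bytes: int, binary: bool = False) -> str:
    """Format a byte count with decimal (default) or binary prefixes."""
    base = 1024 if binary else 1000
    symbols = BINARY_SYMBOLS if binary else DECIMAL_SYMBOLS
    value = float(n_bytes)
    index = 0
    # Divide by the base until the value fits under the next prefix.
    while value >= base and index < len(symbols) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {symbols[index]}B"


print(format_bytes(1000))                      # 1.00 kB
print(format_bytes(1024, binary=True))         # 1.00 KiB
print(format_bytes(500 * 10**9))               # 500.00 GB
print(format_bytes(500 * 10**9, binary=True))  # 465.66 GiB
```

The last two lines show the same 500 × 10⁹ bytes expressed both ways: the number of bytes is identical, and only the prefix convention changes the figure in front of the unit.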
The differences in the terms and symbols, from the decimal to the binary system, are purposely minor to ease the switch.
Will this endeavour be successful? Only time will tell if habits will prevail over technical accuracy.