For the Same File, Why Windows and Ubuntu Show Different Sizes?

Recently, I ‘partially’ converted one of my friends to Ubuntu, and he is now using it alongside Windows. After a few days of using it, he came to see me (with his laptop) and said, ‘Hey Gayan, mate, I’m experiencing something weird here!’ …

… Because of its nature, I thought this particular conversation might come in handy for someone else who is new to Ubuntu. So I decided to write about it, and here it goes.

So I said, ‘What do you mean?’

He answered, ‘The other day, I downloaded a file in Ubuntu and saved it to the Windows partition. But as soon as I logged into Windows and moved my mouse pointer over it, I realized that the size had changed!’ (Windows was showing a smaller size than Ubuntu did.)

An example …

So, being a non-geek, he got scared, thinking that Ubuntu might have corrupted the file, and had decided not to use it until he figured out what was going on.

But before explaining to him ‘what was going on’, I asked him to note down the sizes of a few other files on his Windows partition. Then I asked him to reboot into Ubuntu, view those same files, and note down their sizes again.

Ubuntu nuts?, or is it Windows ? 😉 …

And then he realized that it was not just the file he had downloaded: Ubuntu and Windows were showing different sizes for the other existing files too.

Now, of course, he had a basic understanding of ‘bytes’, ‘kilobytes’, ‘megabytes’ and ‘gigabytes’ and how they work. So I brought it to his attention that, though the sizes of these files in ‘KB’, ‘MB’ or ‘GB’ are different, the size displayed in ‘bytes’ is the same under both operating systems.

So, without boring him any further, I told him that this happens because, when displaying file sizes in anything other than ‘bytes’, Microsoft Windows and GNU/Linux use two different multipliers (‘unit prefixes’) while converting between units (bytes -> kilobytes, kilobytes -> megabytes, megabytes -> gigabytes etc).

For example …

In Windows …

Windows assumes that there are 1024 bytes in a kilobyte, 1024 kilobytes in a megabyte, and so on.

In Ubuntu (GNU/Linux) …

Ubuntu assumes that 1000 bytes constitute a kilobyte (KB), 1000 kilobytes a megabyte (MB), and so on.
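To make the difference concrete, here is a minimal Python sketch. The function name `to_megabytes` and the 5,000,000-byte file are made up for illustration; only the two divisors, 1024 and 1000, come from the explanation above.

```python
def to_megabytes(size_in_bytes, bytes_per_step):
    """Divide twice: bytes -> 'kilobytes' -> 'megabytes'."""
    return size_in_bytes / bytes_per_step / bytes_per_step


size = 5_000_000  # hypothetical file size in bytes

print(round(to_megabytes(size, 1024), 2))  # Windows-style divisor: 4.77
print(round(to_megabytes(size, 1000), 2))  # Ubuntu-style divisor: 5.0
```

The byte count never changes; only the divisor does, and that is why the two numbers differ.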

This ‘confusion’ came into existence in the old days, because computer memory hardware such as ‘RAM’ and ‘ROM’ uses 1024 as the ‘unit prefix’ (for technical reasons) when converting between units (except for ‘bytes’).

But most other storage devices, such as HDDs and flash drives, use 1000 as the base ‘unit prefix’ when calculating sizes. So a bit of confusion arose among the experts over which one to use while displaying file sizes in different units.

For a better explanation, I showed him this Wikipedia entry:

The computer industry currently uses terms such as kilobyte, megabyte, and gigabyte, and corresponding symbols KB, MB, and GB, in two different ways. In citations of main memory or RAM capacity, gigabyte customarily means 1073741824 bytes. This is a power of 1024 (specifically 1024³), and 1024 is a power of 2 (specifically 2¹⁰), therefore this usage is referred to as a binary prefix.

In most other contexts, the industry uses kilo, mega, giga, etc., in a manner consistent with their meaning in the International System of Units (SI): as powers of 1000. For example, a 500 gigabyte hard drive holds 500000000000 bytes, and a 100 megabit per second Ethernet connection transfers data at 100000000 bit/s.

In contrast with “binary prefix”, this usage is referred to as a “decimal prefix“, as 1000 is a power of 10.

So later, to avoid confusion, the IEC and NIST standardized them, and changed the symbols into …

In usage, products and concepts typically described using powers of 1024 would continue to be, but with the new IEC prefixes.

For example, a memory module of 536870912 bytes (512×1048576) would be referred to as 512 MiB or 512 mebibytes instead of 512 MB or 512 megabytes. Conversely, since hard drives have historically been marketed using the SI convention that “giga” means 1000000000, a “500 GB” hard drive would still be labeled as such.

According to these recommendations, operating systems and other software would also use binary and SI prefixes in the same way, so the purchaser of a “500 GB” hard drive would find the operating system reporting either “500 GB” or “466 GiB“, while 536870912 bytes of RAM would be displayed as “512 MiB”.
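The figures in the quoted passage are easy to check yourself. This is only a quick sketch; the variable names are my own, and the numbers are the ones quoted above.

```python
GIB = 1024 ** 3  # bytes in one gibibyte (GiB)
MIB = 1024 ** 2  # bytes in one mebibyte (MiB)

drive_bytes = 500 * 1000 ** 3  # a "500 GB" hard drive, decimal prefix
ram_bytes = 536_870_912        # the memory module from the quote

drive_gib = drive_bytes / GIB
ram_mib = ram_bytes / MIB

print(f"{drive_gib:.2f} GiB")  # 465.66 GiB, which rounds to the quoted 466 GiB
print(f"{ram_mib:.0f} MiB")    # 512 MiB
```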

In simple terms, if an operating system uses the term ‘megabyte’ (MB), then it should use 1000 bytes per kilobyte (KB), 1000 kilobytes per megabyte (MB) and so on (the ‘decimal prefix’) while converting between the units.

If it uses the value 1024 (the ‘binary prefix’), then it should call them ‘kibibytes’ (KiB), ‘mebibytes’ (MiB) etc.

So in that sense, it does not matter whether the OS uses the ‘binary prefix’ or the ‘decimal prefix’; what’s important is whether it uses the correct symbols while displaying them.

It is apparent that Windows is using the ‘binary prefix’: if you take the first image, you will see that it lists the size as ‘710,934,528 bytes’. Now take a calculator and divide it by 1024, which gives you its size in ‘kibibytes’ (694,272). Then divide that by 1024 again and it will give you the value 678, which is in ‘mebibytes’ (MiB).

Now do the same using the second image, which was taken in Ubuntu. But this time, divide by 1000 twice instead of 1024, and you will get the size in megabytes (MB): 710.9.
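If you do not have a calculator at hand, the two exercises above can be sketched in Python. The `format_size` helper and its unit lists are my own illustration, not how either operating system actually does it; the point is simply that each divisor is paired with its matching symbols.

```python
SI_UNITS = ["bytes", "kB", "MB", "GB", "TB"]       # decimal prefix, steps of 1000
IEC_UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB"]  # binary prefix, steps of 1024


def format_size(size_in_bytes, binary):
    """Scale a byte count and label it with the symbols that match the divisor."""
    base = 1024 if binary else 1000
    units = IEC_UNITS if binary else SI_UNITS
    value = float(size_in_bytes)
    index = 0
    # Keep dividing until the value fits under one step of the chosen base.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.1f} {units[index]}"


size = 710_934_528  # the byte count both operating systems agree on

print(format_size(size, binary=True))   # 678.0 MiB
print(format_size(size, binary=False))  # 710.9 MB
```

Same file, same number of bytes; only the divisor and the label change.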

I honestly do not know about the ‘IEC’ and ‘NIST’ rules and how they are applied, but since Windows uses the ‘binary prefix’, it should be using the symbols KiB, MiB, GiB etc rather than KB, MB, GB etc, and therefore seems to be in direct violation of them (the text only says ‘would’, though).

Hey ‘Windows!’, where’s ‘i’ ? 😉 …

And Ubuntu, or GNU/Linux, is using it in its ‘proper’ form. Not only that: as briefly mentioned, due to their technical nature, the sizes of hardware devices such as RAM modules are calculated using the ‘binary prefix’.

Therefore, as you can see above and below, Ubuntu displays my RAM size, and the programs that are loaded into it, in ‘kibibytes’ (KiB), ‘mebibytes’ (MiB) and ‘gibibytes’ (GiB).

But when showing files stored on an HDD, for example, it uses the ‘decimal prefix’, as most disk drive manufacturers label them under the ‘decimal prefix’ standard. Then again, what matters is not whether you use ‘1000’ or ‘1024’, but whether the OS is using the proper symbols.

So GNU/Linux is respecting both the technical implementations and the standards (as far as I can see), and after hearing that, though he was really bored, my friend was very happy. End of the story :P.

P.S: You can also read another brief explanation by running the below command in your Terminal window.

man 7 units

17 thoughts on “For the Same File, Why Windows and Ubuntu Show Different Sizes?”

  1. Windows 7 ist not even able to change that behaviour. What a crappy os. Wow I have a 1836GB harddrive, because they use “Giga” in terms of 1024ies. Seems switching completly to linux is the best. Fuck windows. Fuck MS.

  2. You have mistake here:
    1000 kilobytes to a megabyte (‘MB‘) etc perceptual value (‘binary unit prefix),

    and here:
    f it uses the value 1024 (‘decimal prefix’)

    The terms (‘binary unit prefix) and (‘decimal prefix’) should be used vice versa

  3. I’m an old-school comp-sci major that hasn’t kept up with some of those developments. Thank you very much for explaining this. I was recently perplexed by this same issue. Much like in the world of web development, hopefully the use of the correct standards will improve with time. :\

  4. Sorry, I grew up in the ’80s and as far as I’m concerned the ONLY measurement for computers – an inherently binary device – is base-2. The binary-base measurement was standard throughout the industry until about the early ’90s when hard drives exceeding a gigabyte started to appear. That was when some hard drive manufacturer’s marketing department discovered that they could make their hard drives sound larger than they actually were just by switching to base-10 numbering. Other manufacturers quickly fell in line so as not to lose sales. Nobody was confused until those marketing a$$wipes got involved.

    It’s a lot like sound systems. Nowadays systems are sold that claim ridiculously high output wattage. What wold have been called a 30-watt stereo back in the day is now sold as 120-watts simply because they changed how they measured it so the marketing department could claim more sales. It’s borderline dishonest, but technically legal and sadly it works. Can’t decide what disgusts me more: the companies that pull this crap. the goverment for allowing it, or the moron consumers who buy into it because all they see is a higher number (so it MUST be better, amiright?!?).

    I’m just making the switch to Linux and I REALLY am considering switching back to Windows over this issue alone. I am having a helluva time finding a way to force the proper BINARY file sizes to be displayed in ALL programs I use.

    • Preach it, brother. 😉

      Personally I blame Europe’s obsession with the metric system. It does not need to be applied to storage sizes. Plus kibibytes, mebibytes etc. sound really stupid.

        • I agree. Its all well and good to give people the warm fuzzies with technical stories about what (or who) is right. But that doesn’t change the fact that MiB is the more useful number to display.

          So many UIs still use MiB (mislabled Mb or whatever) , so when you’re comparing file sizes or whatever, its useful if the UIs are displaying the same metric (see what I did there)

          I thought 1024 (binary prefix) was pretty much the norm until the marketeers came along pushing their agenda (of making hard drives appear bigger). When did the geeks jump on board with that?

          1000 is really only useful to people that can’t count past 10 on their fingers. When we all know you can count to more than 100x that (+24)

          If you’re gonna make a new standard, I reckon don’t steal the old symbols. Is that how we ended up with ton vs tonne?

  5. Ain’tNobody hit the nail in its head! This whole confusion started because of lies perpetrated by marketng departments. Its easier so sell a 2TB hard drive that a 1.8TB. He got his wrong thou: “The binary-base measurement was standard throughout the industry until about the early ’90s when hard drives exceeding a gigabyte started to appear.” Wrong. This has been the case since comercial hard drives hit the market. I still have a 345MB Maxtor that MSDOS reported as having 330MB… 🙁 And this was at a time when $/MB was not exactly cheap!

  6. I would argue that the only place it’s sensible and desirable to use binary counting is when measuring the size of your RAM – it’s a lot neater to say that you have 1, 2, 4, 8 or 16 GiB in your system than 1.1, 2.1, 4.3, 8.6 or 17.2 GB.

    When it comes to files and storage sizes which are more continuous than discrete, surely decimal counting is best. I’m currently working on a 1,191,465,227 byte file – what is the sense in calling it 1.10 GiB or, worse “1.10 GB”, rather than the logical and straightforward 1.19 GB? If a hard drive manufacturer managed to fit 2,314,325,221,120 bytes of “space” on a HDD, I have no problem with them labelling it as 2.31 TB capacity, not 2.10 TiB.

    So, to me, Ubuntu has it spot on.

  7. oh yes what if for the sake of the argument some hard drive manufacturer release a 100 terabyte hard drive, how would they advertise it? as far as concerned a 100 terabyte is equal to 90.9 terabyte in Windows terms.

  8. CD-R is 700 MiB while DVD-R is 4.7 GB (4.5 GiB)

    Ubuntu and Android should allow users to display everything in GiB including data transmission speed.

    No need for bits/sec.

    So should Microsoft but Nadella is disappointing.

  9. @Charlie I totally agree with you! the decimal prefix is convenient for most use cases, excluding RAM sizes, where the RAM manufactures should definitely label their sizes as GiB instead of GB!

    @Jacob You mean “CD-R is 703.13 MiB (737,280,000 bytes (2,048 bytes/block), 737.28 MB)) while DVD-R is 4.7 GB (4,707,319,808 bytes (2,048 bytes/block), 4.38 GiB)”.
