Yesterday I Tweeted the suggestion that one could use NTFS compression to gain more storage space on the Surface or other Windows RT tablet. People quickly pointed out there are a number of questions around, and potential issues associated with, doing this. I’m not going to fully explore this subject, but did want to get the conversation going. Perhaps someone into doing this for a living will fully explore the subject.
The only Surface I have is my go-to machine, so I’m not going to muck around with it much. But I did do a simple test just to see if exploring the use of compression might make sense. I found a large folder (2.38GB) and compressed it. The directory was C:\Users\xxxxxx\AppData\Local\Packages. It compressed down to 1.67GB, a 30% savings that in concrete terms gave me an additional 700MB of space. A not insubstantial improvement, and that is with just one folder compressed.
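For anyone who wants to check the arithmetic behind those figures, here is a quick sketch. The variable names are mine, and treating 1 GB as 1024 MB is an assumption about how the sizes were reported.

```python
# Sizes reported above for the Packages folder.
original_gb = 2.38
compressed_gb = 1.67

saved_gb = original_gb - compressed_gb      # space reclaimed, in GB
saved_fraction = saved_gb / original_gb     # share of the original size
saved_mb = saved_gb * 1024                  # assumption: 1 GB = 1024 MB

print(f"saved {saved_gb:.2f} GB ({saved_fraction:.0%}), roughly {saved_mb:.0f} MB")
```

The result lands at about 30% and a little over 700MB, in line with the numbers quoted.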
Now my Surface has 128GB in it, 64GB internally and I added a 64GB SD card, so I’m not exactly in need of compression. I uncompressed the folder and started to write this blog post. I do have my old W500 sitting unused at home, so I could dig it out and do more testing (including actual performance tests) on an older Atom/SSD/Windows 8 system. But I don’t know if that will be interesting enough to suggest what Surface owners might want to do. Hence my call for someone more into doing this for a living to give it a try.
Now to address some of the things that were raised over on Twitter.
Is using compression bad on an SSD? I don’t think so, but I haven’t really studied the issue. Michael Fortin wrote a blog post back when Windows 7 came out talking about SSD support in which he recommended use of compression for “infrequently modified directories and files” and against using it for actively written files. Tom’s Hardware did an extensive review on the topic a little over a year ago and also recommended use of compression on an SSD. The real question around using NTFS compression on an SSD isn’t whether it is good or bad, but rather which directories it would be good for and which it would be a bad idea for.
Compression is basically a tradeoff in which you use more CPU time in order to take up less disk space. If you have files that are infrequently accessed and an abundance of free CPU time then you have a perfect situation for using compression. If you are trying to maximize CPU efficiency and/or client responsiveness then compression is contraindicated. That is why, for example, Microsoft wouldn’t simply ship Surface with compression turned on. They would rather trade away some storage space than risk degrading the user experience. In compressing the directory mentioned above (which is where apps store their local data) I probably impacted application startup times a bit; I just don’t know how much. In previous experience on earlier versions of Windows I was never able to detect performance differences with compression enabled, even when I compressed the entire system disk. But Windows RT systems are very carefully balanced to provide their user experience, so it is possible that misuse of compression would lead to noticeable degradation. We need a guinea pig….
A related issue concerns one of the more worrisome characteristics of SSDs: they wear out from write activity. If compression leads to more writes, or a write pattern that isn’t balanced across the cells of the device, then you could have wear issues. On the surface (no pun intended) I don’t see it as an issue. Generally compression reduces the number of writes by compressing the data. But what it might do is increase the number of cells written on random updates, as compression changes the update from as small as a single sector into chunks as large as 64KB. So on a file getting a high volume of very small updates you could see theoretically bad behavior. But I doubt this is a relevant problem for today’s tablets, particularly given how SSDs’ caching and internal file system behavior actually work. If you are planning to use your Surface as a transaction processing machine updating a database hundreds or thousands of times per second then contact me and we can discuss the wisdom of compressing the database file. But otherwise the wear issue is a non-issue over the expected lifetime of the device.
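To put a rough number on that worst case, here is a back-of-the-envelope sketch. The 512-byte sector size and the 100-byte update are assumptions of mine, and the model ignores both the fact that the rewritten chunk would itself be stored compressed and whatever the SSD controller does internally, so treat the result as an upper bound rather than a measurement.

```python
SECTOR_BYTES = 512          # assumption: classic 512-byte logical sector
CHUNK_BYTES = 64 * 1024     # the 64KB chunk size mentioned above

def bytes_rewritten(update_bytes: int, unit_bytes: int) -> int:
    """Bytes written when an update must be rounded up to whole units."""
    units = -(-update_bytes // unit_bytes)   # ceiling division
    return units * unit_bytes

update = 100  # a hypothetical tiny update, in bytes
uncompressed = bytes_rewritten(update, SECTOR_BYTES)
compressed = bytes_rewritten(update, CHUNK_BYTES)
print(uncompressed, compressed, compressed // uncompressed)
```

Under these assumptions a tiny update touches 128 times as many bytes on a compressed file, which is why only files with a high volume of very small updates are a concern.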
There is another point that was made in two ways, which is basically what if there is some other form of compression already at work? Some file types like JPEG or MP3 already compress the data, so adding more compression isn’t really going to help. True. Also, some SSD controllers do compression at the controller level (which, if the controller is fast enough, is all gain for no performance pain). But, as my test showed, not the Surface.
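The point about already-compressed data is easy to see with a small experiment. This sketch uses Python’s zlib as a stand-in for NTFS compression (an assumption; the two are different algorithms from the same LZ family) and random bytes as a stand-in for JPEG or MP3 payloads.

```python
import os
import zlib

def ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)

# Highly repetitive text, typical of logs or plain-text app data.
text_like = b"The quick brown fox jumps over the lazy dog. " * 2000
# Random bytes look, statistically, like data that is already compressed.
random_like = os.urandom(len(text_like))

print(f"repetitive text: {ratio(text_like):.3f}")
print(f"random bytes:    {ratio(random_like):.3f}")
```

The repetitive input shrinks to a tiny fraction of its size, while the random input does not shrink at all and even grows slightly from format overhead.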
The largest amount of user data on a client device is likely to be already compressed files, so the purpose of using NTFS compression would be to reduce the amount of space taken up by Windows and applications rather than user data. If you need more space for your photos, videos, and music then use an SD card. They are cheap! The one in my Surface has my entire music collection on it, and when I travel that is where movies and TV programs I want to watch are stored. It is also where my camera gets backed up every night during a vacation. I just wish that Microsoft’s related Windows Store apps had better Library support (which is another topic and one of my few real complaints about the Surface).
I’d love it if someone took a deep look at this question and came up with a set of recommendations (and preferably a script) of what to compress on a Surface or other Windows RT device. I suspect one could easily free up 1-2GB with no noticeable impact on user experience. But I have other fish to fry so I won’t be the one to do it.
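As a starting point for such a script, here is a minimal sketch that estimates how compressible each subdirectory of a folder is. It only reads files and never changes anything. It uses zlib in 64KB chunks as a rough proxy for NTFS compression, which is an assumption, so the real savings will differ; the function names are hypothetical.

```python
import os
import zlib

CHUNK = 64 * 1024  # compress in 64KB units, mirroring the chunk size discussed earlier

def estimate_file(path: str) -> tuple[int, int]:
    """Return (original_bytes, estimated_compressed_bytes) for one file."""
    original = compressed = 0
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            original += len(chunk)
            # Assume a chunk that does not shrink would be stored as-is.
            compressed += min(len(chunk), len(zlib.compress(chunk, 1)))
    return original, compressed

def estimate_tree(root: str) -> dict[str, tuple[int, int]]:
    """Map each immediate subdirectory of root to its (original, compressed) totals."""
    results = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total_o = total_c = 0
        for dirpath, _dirs, files in os.walk(entry.path):
            for name in files:
                try:
                    o, c = estimate_file(os.path.join(dirpath, name))
                except OSError:
                    continue  # skip files we cannot read (locked, permissions)
                total_o += o
                total_c += c
        results[entry.name] = (total_o, total_c)
    return results

def report(root: str) -> None:
    """Print subdirectories ordered by estimated bytes saved, largest first."""
    rows = sorted(estimate_tree(root).items(),
                  key=lambda kv: kv[1][0] - kv[1][1], reverse=True)
    for name, (o, c) in rows:
        print(f"{name:40s} {o:>12d} -> {c:>12d}  saves {o - c}")
```

Calling `report()` on a folder such as the Packages directory would list which subfolders are worth a closer look. Whether compressing them hurts responsiveness is a separate question this sketch does not answer.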
I would be worried about battery life — not performance. CPUs are so fast these days that they could easily keep up with the disk bandwidth. However, LZ77 compression (as used in NTFS) is not hardware-accelerated.
This contrasts with the AES encryption situation. AES is implemented efficiently in hardware on the Nvidia Tegra and on the Core i5, which covers the Surface RT and the Surface Pro. Thus, the Surface can turn on BitLocker by default, with only a negligible impact on battery life. Running AES on dedicated hardware is a lot less wasteful than running AES on a general-purpose architecture like x86 or ARM.
(I actually once ran BitLocker on a desktop machine with no AES acceleration. I didn’t see any performance degradation. But then, I wasn’t worried about battery life either; it was a desktop.)
Hal —
I can appreciate everyone’s concern over the space used by the operating system repair image. However, a very real and troublesome issue in Windows 8 is that it doesn’t seem to delete old versions of apps downloaded from the Windows Store.
In my Program Files\WinApps folder I have six versions of Bing News (Microsoft.BingNews_1.7.0.31_x64__8wekyb3d8bbwe) and each version consumes over 50MB of space! So if we want to talk about actual space used versus free on a clean Surface install, we have to also consider that each built-in Microsoft app is going to be patched and these additional patches consume more space. See screenshot here – http://imgur.com/k70cpvv
Interestingly, I also found old versions of apps that were uninstalled still in the folder. These zombie apps consume space and there appears to be no way from the Windows 8 UI to remove them. For example, a game I downloaded from the App store but never played, Adera (Microsoft.Adera_1.2.0.10540_x86__8wekyb3d8bbwe), consumes 1.3 GB per install and there are three separate versions of the game sitting in the Program Files\WinApps folder. I deleted the game but the files remain and they occupy over 4GB of space on the hard drive since there are three versions of it in \WinApps. See screenshot here – http://imgur.com/93gECPx
I have over 10GB of wasted space on my Surface dedicated to separate versions of Microsoft patches and downloaded apps that were deleted but never uninstalled properly.
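For anyone who wants to check their own machine, here is a small sketch that groups package folder names by app and shows which ones have more than one version. It assumes the folder names follow the Name_Version_Architecture__PublisherId pattern seen in the two examples above. The second Bing News version number is made up for illustration, and in practice the list would come from a directory listing of the apps folder.

```python
from collections import defaultdict

def parse_package_folder(folder: str) -> tuple[str, str]:
    """Split 'Name_Version_Arch__PublisherId' into (name, version)."""
    parts = folder.split("_")
    return parts[0], parts[1]

def group_versions(folders: list[str]) -> dict[str, list[str]]:
    """Map each package name to the list of versions found for it."""
    grouped = defaultdict(list)
    for folder in folders:
        name, version = parse_package_folder(folder)
        grouped[name].append(version)
    return dict(grouped)

folders = [
    "Microsoft.BingNews_1.7.0.31_x64__8wekyb3d8bbwe",
    "Microsoft.BingNews_1.6.0.0_x64__8wekyb3d8bbwe",   # hypothetical older version
    "Microsoft.Adera_1.2.0.10540_x86__8wekyb3d8bbwe",
]
for name, versions in group_versions(folders).items():
    if len(versions) > 1:
        print(name, "has", len(versions), "versions:", versions)
```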
So as a follow-on question to “Why does Surface and the OS use so much space”, we should also be asking, “Why isn’t the OS cleaning up after itself when applications are updated or removed?”
Early last week I found a page on Microsoft’s MSDN/dev site that explains the rationale here, but I can’t seem to find it today. It basically says this is a feature so that other users who share the Surface can log in and use different versions of the same app. That sounds great in theory until you accumulate a half GB of space for each built-in Microsoft app. This feels like #FAIL.
Shaun Tonstad
Isn’t this similar to the way that Windows from Vista up to 8 now handles apps? I’ve noticed the same thing on my Windows 7 system but I wasn’t so worried about it because I’ve got a gi-normous hard drive. This seems like it wouldn’t be such a great thing on a tablet, though.
John — in Windows Vista and 7 it is up to the application developer to decide how an upgrade occurs and whether old files should be preserved. In the case of a Windows 8 Store developer, he/she does not have control over this process. Unfortunately, it doesn’t appear that end users have any control either…
Vista/Win7 cache installers so you can uninstall via the Add/Remove Programs feature in control panel. Windows theoretically guts the installers when doing this to save space; but I’ve seen enough massive installers get cached to know the process isn’t foolproof. I suspect this may be related to MS’s preference for tiny installers that download everything else when you run them.
Dan, I think that you may be on to something. I’ve seen the same thing. I was curious as to why MS had those tiny installers that download the real installer. Your theory is an interesting one.
This would only be part of it. The other half is that offline installers either need to contain everything you need for each supported platform (OS version, 32/64-bit, languages, etc.), including dependencies that may have been installed separately already and optional features the user may or may not want; or you need to have a large number of customized installers. The former means large amounts of extra data to download that won’t be needed. The latter will confuse non-technical users. Both cases produce executables with limited shelf life due to security patches, while the auto downloader will always pull the latest and greatest.
I don’t think it’s an issue now, but years ago the MS downloader avoided problems where a browser’s download tool didn’t support suspend/resume. It could also allow them to do fine-grained error recovery for in-flight data corruption caused by a bad connection, although I don’t know if they actually do so.
The discussion about installers is very interesting to me. I am surprised people aren’t up in arms about Windows 8 keeping duplicate copies of installed Windows Store Apps. These apps are not like traditional installed applications — instead of a Windows Installer process an .appx package is unzipped for each installed version of the software. In my post I call out the issue concerning multiple copies of an entire application residing on the file system, not remnant installers. So if I download a 1GB application from the Microsoft store and it is upgraded, I now have two 1 GB applications installed and as an end user there is no apparent remedy to this behavior.
What does Microsoft say about this?
For me compression isn’t about having more space to dump stuff but about decreasing disk page reads. SQL Server has done much with compression, not to save space (HDDs are relatively cheap) but to decrease disk activity and thereby increase performance. I don’t know the file system that well; would that apply to folders?
If you have excess CPU bandwidth then compression can help with performance by reducing disk I/O in the file system case. But I doubt it is ever dramatic in real-life use. In the database case the benefit can be more dramatic because of how extensively data is cached. Compression can, in some cases, allow the database system to keep vastly more information in memory and thus dramatically reduce the need to do I/O.
It depends on how much RAM you have and how fast your processor is. If the thing is speedy and has enough RAM, then it’s very possible that you’ll gain more time back by reading less disk than you lose in CPU time, compressing and decompressing all that stuff. I wouldn’t recommend installing SQL Server databases into a compressed folder, but for general stuff it’s just fine.
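The tradeoff described in the last few comments can be sketched with a toy model. All the numbers are hypothetical and chosen only to show where the break-even sits. The model assumes the disk read and the decompression do not overlap, which is pessimistic for compression.

```python
def read_seconds(size_mb: float, disk_mb_s: float) -> float:
    """Time to read uncompressed data straight from disk."""
    return size_mb / disk_mb_s

def read_seconds_compressed(size_mb: float, ratio: float,
                            disk_mb_s: float, decompress_mb_s: float) -> float:
    """Time to read the compressed bytes, then decompress them (no overlap assumed)."""
    return (size_mb * ratio) / disk_mb_s + size_mb / decompress_mb_s

# Hypothetical workload: 100 MB of data that compresses to 70% of its size.
size, ratio = 100.0, 0.7
for disk, cpu in [(50.0, 400.0), (200.0, 400.0)]:
    plain = read_seconds(size, disk)
    packed = read_seconds_compressed(size, ratio, disk, cpu)
    print(f"disk {disk:.0f} MB/s: plain {plain:.2f}s, compressed {packed:.2f}s")
```

With the slow disk, compression wins; with the fast disk and the same CPU, it loses. In this model compression pays off only when the decompression rate exceeds the disk rate divided by (1 - ratio).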
Wow.. is it ~20 years ago, again? The last time I remembered having to deal with compressing a normal OS drive was almost 20 years ago on a 130MB (yes, meg, not gig) drive. I didn’t think anyone had to bother anymore.. Yes, I’ve seen the compression option in the Windows filesystem dialogs for years, but didn’t see much point.
“2013 – Its the new 1990’s. Thanks M$!” 😉
I’d read a bit more on the subject before jumping on the MS pitchfork bandwagon, my friend. SSDs and modern processors make NTFS compression somewhat positive in comparison to legacy hardware.