A few weeks ago I explored how Microsoft could release Windows 8 and Windows Phone 8 in June of 2012. With yesterday’s announcement that Windows 8 Beta won’t begin until late February, people are wondering about hitting 2012 at all, let alone June. There is little chance Microsoft will miss the holiday 2012 selling season with Windows 8. Not having a competitive tablet in the market in 2012 is a potentially existential problem, and would make Windows Vista’s delays, deficiencies, and ultimate market failure seem minor by comparison. But will Windows 8 ship (more precisely, reach GA) in June, Summer, or Fall of 2012? If you read my previous arguments then Summer seems most likely. I can (and in a moment will) make an argument for June still being feasible, but I have my doubts. In fact, if Windows 8 does indeed reach GA in June then I will spend the entire month of July parading around in t-shirts with Steven Sinofsky’s picture on front and pictures of his directs on back.
Since the late 70s the industry has been trying to figure out how to eliminate the need to put the testing burden on customers (aka “Field” or “Beta” testing). With products like DEC’s TOPS-10, I really don’t recall there being any formal testing. You would write up a change on a listing (!) of the source code, the code would be reviewed at a weekly meeting and, if approved, edited into the source tree by release engineering. They’d do the weekly build and deploy the result on the main development system. Testing was simply by usage. Over time the builds would deploy to additional machines. And then we’d send it out to customers for field test. Basically customers were the primary way of testing.
Fast forward a decade to DEC Rdb/VMS circa 1988 and things had evolved, but not so much. When a developer checked in a change they wrote a test for it and checked it into a test suite. There was no test plan, and no review of the tests the developer wrote. Some did decent positive and negative test case coverage, but many did quite a minimal job. We would build the product and run the test suite every night and hold a meeting each morning to review test results and assign regressions to developers. For Rdb/VMS V3.0 I’d pushed to get a contractor to independently write a more extensive set of tests, but after several months senior management decided not to renew the contract. They just didn’t see the value. So we had better tests, but not nearly the test suite I’d envisioned. Once we were done with development, off the product went to customers for field test. Again the vast majority of testing was left to customers. One problem with this approach is that it is hard to know when field testing should end. You basically live with a plan that says you’ll RTM when the bugs stop being reported.
One particular story highlights the flaws in relying on customers to perform product testing. About a year after Rdb/VMS V3.0 shipped, a major customer sued DEC over a bug that caused massive data corruption. This customer had previously been one of our best, and indeed was a field test customer for V3.0. During the discovery process it was found that the customer had discovered the bug during field test, but the report had sat on the customer’s system administrator’s desk rather than ever being submitted to the Rdb team! Months later they put V3.0 into production and…. The bug required a particular stress condition that only they, amongst all our field test customers, achieved.
Fast forward another decade as Microsoft SQL Server 7.0 was being readied for release. We had separate test teams writing complete test suites of positive and negative test cases for the entire product. We did code coverage analysis to see areas that needed more test attention. We had several massive stress tests designed to push the limits of the product (exactly the thing that would have uncovered the data corruption problem back with DEC Rdb/VMS). We took traces of many dozens of customer systems and ran them regularly against our builds. We worked closely with ISVs to run their software and test suites against our builds from very early in the process. And we had internal Microsoft IT systems running in production on builds from pre-beta onwards. We gave out builds, called IDWs, to internal development partners (e.g., Visual Basic, Access, etc.), ISVs, customers who were part of the Early Adopter Program, book authors, and some others. A sea-change had clearly occurred, but we still had a reliance on beta testing to finish finding bugs and help us polish the product.
Jump forward yet another decade and there had been more evolution (again using SQL Server as an example). Attention to specifications and having complete plans prior to any coding is more evident and reduces the amount of rework (which tends to generate more bugs than the original coding) needed later in a project. Features can’t be checked in until they are completed end-to-end (e.g., tools support, replication support, etc.) and fully (positive and negative) tested (for 7.0 we required positive test cases for check-in, with negative test case completion coming later). More tools (e.g., PREfast) are available that eliminate coding errors. The testing processes developed for SQL Server 7.0 have continued to evolve. Builds are kept close to shippable state at all times (whereas with SQL Server 7.0 we would allow the bugs to back up and then spend months in a separate stabilization milestone fixing them in order to have a beta or release quality build). And the SQL Server team has eliminated a formal beta. The IDW builds of old have evolved into Community Technology Previews (CTPs) that are made available from time to time during the course of development. And the intent of the CTPs has evolved from being primarily about finding bugs to being primarily about getting people to start using new capabilities as soon as possible. Yes, there remain some intensive customer testing programs, such as TAP/RDP (the successor to EAP), and ISV engagement. But what they’ve tried to do is get away from a philosophy that you roll the product into a release-like thing called “Beta”, based on some arbitrary date, and throw it over the wall to customers for finding bugs. It’s this philosophical change that is the real difference between CTPs and Beta.
There are major differences between the processes used by SQL Server and by Windows (e.g., Windows’ controlled information release policy precludes SQL Server’s CTP process), but the Windows processes go even further in trying to avoid having customers be the means by which bugs or design deficiencies are discovered. Attention to design and specifications takes dramatic precedence in the current Windows processes. Ask most developers within Microsoft and they are shocked by how few actual coding weeks and milestones there were in the Windows 8 development cycle. Amongst other things this forces work to be done in smaller chunks and hopefully keeps the overall product from being destabilized. The amount of testing that goes on after each coding milestone is extensive. Customer input is something you collect early in the project, not after you are nearly done. You see this in how often they quote telemetry in the Building Windows 8 blog, though telemetry is not the only input (e.g., usability testing, feedback from the previous release, etc.) you rely on. Thousands of Microsoft employees have no doubt been using Windows 8 as their primary operating system for many months, and by the February beta it will probably be tens of thousands. In other words, most of the heavy lifting of making sure a release of Windows is ready for customer production use is done long before it goes to beta. Beta, in actuality, is becoming a formality and not a required part of a release process. In the extreme you see this with on-line offerings, such as from Google, where things stay in beta for years and the primary purpose of the label is as a way to say “we reserve the right to pull the rug out from under you at any time”. So what does “Beta” mean for Windows 8?
Beta is mostly about getting the ecosystem ready for the release. This started with the Developer Preview (and earlier private builds to key partners), but Beta will be where they say to partners “we are done, get yourself ready”. Sure they will continue to work on bugs, both internally discovered and reported by beta testers. But that is actually a secondary goal. They will avoid changes that impact OEMs, developers, documentation (because of localization lead times), etc. or anything they feel could destabilize the release. Basically Windows 8 Beta is more a traditional RC0 than it is a traditional beta. So, can you go from RC0 to RTM in 3-4 months? Of course. And that, combined with what I talk about in the earlier blog entry, would still allow for June availability.
The fly in the ointment of my justification that June is still possible is how dramatic some of the changes in Windows 8 are. I’ve been playing out two scenarios in my head (as though I were running the project). The first is that feedback (on both the new user experience and the new application model) from the Developer Preview and other activities is of sufficient quality and quantity that I’m comfortable not having any runway to react to input once Windows 8 beta starts. That is the one in which a June (or shortly thereafter) RTM occurs. The other scenario is if I’m uncomfortable with the feedback received to date. In that case I would want to build in some opportunity to react after beta, leaving me to target more of a September RTM and GA. The Windows philosophy clearly is more lined up with the first scenario. But only Microsoft has the data to know if they are on track or might need to go with the latter scenario.
I think it is most likely that Microsoft received enough feedback from earlier activities to know Windows 8 needed a bit more work, but not that it required a traditional beta. This is all but confirmed by the late February beta date being so far beyond what most observers expected. It seems like Microsoft added another milestone in order to fully accommodate the feedback now rather than ship the beta on some original schedule and then add a beta update to pick up changes it was still working on (which was the traditional approach). But I’m also thinking that even if they make a June RTM it will be late June and thus GA will move into July (or later if my earlier assertions about changes to the end game are incorrect).