WinFS, Integrated/Unified Storage, and Microsoft – Part 4

I knew that the Cairo Object File Store (OFS) was in trouble my first week at Microsoft.  I’d been asked to attend a design meeting for OLE DB that would start at 6PM.  On Friday.  Why such a seemingly important meeting would be scheduled for Friday evening would soon become apparent.  OLE DB was being created as Microsoft’s new unified storage API, and it would be an API for OFS.  So you can imagine my confusion when the Program Manager from OFS reported he had to delay completing a part of the OLE DB spec because of “his day job”.   In what shape could OFS be if they couldn’t make specification of its API, the API that Developer Division would focus tools support on and evangelize, part of someone’s day job?  That was just the first hint of trouble.

My initial job at Microsoft was to take the 100,000-foot vision for Microsoft in the Enterprise (Servers in particular) that David Vaskevitch had presented to Bill, and gained approval for, and drive it to a set of real engineering plans.  The second clue that OFS was going down was that David (who along with Bill and Jim Allchin was one of the primary executives pushing for Integrated Storage) didn’t include figuring out how OFS fit into the Enterprise plan in my effort.

The third clue was the definitive one.  Because I was new to Microsoft (and thus could be objective) I was asked to intervene in a spat between the Exchange team (working on the first version of Exchange Server, nee Exchange 4.0) and the JET-Blue database engine over the performance of the Mailbox Store.  What I learned along the way was that the intent was for Exchange Server to be built on OFS, but since OFS wasn’t ready Exchange was doing its own interim store for Exchange 4.0.  The plan of record was for the second version of Exchange to move to OFS.  However, in an email discussing the performance of the existing mailbox store the Exchange General Manager mentioned that he didn’t think Exchange would ever move to OFS.  While the OFS project was still alive, it was clear to me that everyone in the company had already written it off.

I didn’t pay any attention to OFS and some months later word came down that it had been canceled.  I can’t really tell you what happened with it.  Was it a conceptual problem?  A timing problem?  Or mostly an execution problem?  I don’t know.  My guess, based on the subsequent Integrated Storage efforts, was all of the above.

While OFS was fading from the scene my efforts included a focus on another part of the unified storage puzzle.  Going back to DEC I’d put a lot of focus on the question of Integration (a single image store) vs. Federation (making separate stores appear as a unified one).  OLE DB was a piece of that puzzle and we put significant resources into its ability to support Federation.  I proposed and led the acquisition of Netwise so we could bring mainframe data into the picture.  Later, when I was running the Relational Engine team for SQL Server 7.0 I promised David that I’d find a way to get heterogeneous query into the release even though it was “below the line” and I had no resources to do it.  It was one of those calculated risks (making a personal commitment that wasn’t supported by the official commitments) that paid off.  Soon other stores within Microsoft started exposing data via OLE DB, including Exchange and Active Directory.  Federation and OLE DB would be another topic, so I won’t say more about it here.

For a while I thought that federation was going to end up being Microsoft’s answer to the unified storage riddle, but the itch to create an integrated store was one that still had to be scratched.  About mid-way through development of SQL Server 7.0 Peter Spiro and I were summoned by Jim Allchin and told that we were being locked in a room with representatives from Exchange and Windows until we came out with a design for Integrated Storage.  The minimum lockup was set at two weeks (and Jim was serious about this as even though we finished our work a couple of days early he wouldn’t let us return to our regular jobs).

The best part of Jim locking us all away was that it was the first time that the leaders of the unstructured, semi-structured, and structured storage communities had really gotten together to level-set on how these three storage types worked.  Before you get too shocked by this keep in mind that in any earlier meeting, such as during OFS’ conception, Microsoft was not a significant player in the structured storage world.  The discussions at the time would likely have been around how you could build Access on top of OFS, not on how the guys who create engines for structured storage were going to build or contribute to an integrated storage engine.  Also, I should point out that intensive two-way discussions had always gone on between the file system group and structured and semi-structured communities.  It was getting the structured and semi-structured guys in a room for a long enough time for them to really understand one another that was novel.

At the end of a few days of level setting we got down to defining an integrated store.  We named the design proposal we came up with JAWS, which if memory serves came from Jim Allchin’s Windows Storage.  We all returned to our day jobs, which in Peter’s and my case meant the death march that was SQL Server 7.0.  I think a week or two went by and then David called me over to let me know that Bill and Jim had decided to proceed with JAWS and I was a candidate to lead the effort.  I declined, and the other candidate (a Product Unit Manager from Exchange who had been one of their participants in the design) took the role.

While I was a strong believer in integrated storage, and very interested in JAWS, there was just no way I would leave the SQL Server 7.0 effort at that point in the cycle.  And given the success of 7.0 and the SQL Server business I don’t regret my decision one bit.  But at the time I was worried I was making a horrible career choice, because Integrated Storage was of greater strategic importance to Microsoft than the database business.

Being buried in SQL Server 7.0 meant I didn’t get to follow what was going on with JAWS, but after a year or so of development the powers that be took a look at it and decided it wasn’t going to work.  Without the ability to get any support from the underlying system JAWS had turned into a complex and very heavyweight layer on top of SQL Server.  The project was cancelled with the idea that the SQL Server team would pick up responsibility and create a lighter-weight solution as part of the SQL Server 2000 release.  That responsibility fell to me and I was back in the Integrated Storage business.

SQL Server 2000 was initially thought of as a 12-18 month release because we were concerned we’d missed something in SQL Server 7.0, and wanted to be ready to respond quickly to customer feedback.  The Integrated Storage work thus created a tension between our thinking of what was best for the database business vs. what was best for the Integrated Storage charter.  We allowed the release to stretch into the 18-24 month range to accommodate this, but it wasn’t without creating inter-team issues and planning miscues.

Besides doing design work on turning SQL Server into a usable store for semi-structured and unstructured data types we worked with potential clients around Microsoft to figure out who was going to be our customer.  For example we hoped that Outlook would sign on to use SQL Server 2000 as a replacement for its PST files.  But although we had the official charter for coming up with an Integrated Storage solution, an alternative effort was underway in the Exchange team.

The JAWS project had largely been staffed out of Exchange (plus new hires) and when it was cancelled many of them returned to the Exchange team.  They pursued creating an Exchange File System as part of Exchange Server 2000.  And this is where politics does rear its ugly head.  At some point a Microsoft reorganization brought Exchange, Office, and Developer Division together under Bob Muglia but left SQL Server in a different reporting chain.  Bob’s organization rallied around the Exchange File System creating a proposal that would have Office use it as their store, Developer Division create a new code management system on top of it as well as provide tools support for apps written against the store, etc.  As a result the key clients that would have used a SQL Server-based Integrated Store were lost to us.

In a meeting with Bill to decide the direction for Integrated Storage he had to choose between two options.  One was the technology base that he thought was the right one for the long-term vision of Integrated Storage, but it was a store with no one committed to use it.  The other was a solid plan and commitment to deliver something that unified the unstructured and semi-structured worlds within Microsoft.  Bill chose to let the Exchange-based plan proceed, but also encouraged us to continue to work on SQL Server as the basis for a future Integrated Storage solution.

While some of what we were working on for SQL Server 2000-based Integrated Storage continued, such as a plan to evolve full-text indexing over the course of a few releases, we mostly put Integrated Storage on the back-burner and turned our attention to the database business.  Plans for Win32 file API access to Blobs were pulled from the release, for example.

The Exchange File System work continued until shortly before Exchange 2000 was to be released.  The powers that be then took a look at what had been done and decided that it was not the right thing to promote as Microsoft’s Integrated Storage solution.  The grand plan was rolled back at the last minute, and responsibility for Integrated Storage once again fell to the SQL Server team (including reorganizing Exchange into the Server Apps organization that SQL Server was part of).

At this point I was ready for a break from the storage world.  Not only had I been focused primarily on database software for 25 years, I’d spent 5 years in death march mode.  So I decided to go help David Vaskevitch with his new efforts to launch Microsoft into the small business software arena.  Before making the change I drove creation of the engineering plan for Yukon (SQL Server 2005), during which the deal to acquire Great Plains Software was made.  When I moved over to work for David my first assignment was to drive the engineering side of integrating Great Plains into Microsoft.  A few months into this effort I took a long-planned two-month sabbatical.  Just before leaving it became apparent that as it made no sense to have two Senior VPs in the small business software arena (Doug Burgum had joined as part of the Great Plains acquisition) David would be taking a new position.  Bill wanted him to take a CTO role and before I left for sabbatical I told David I’d go with him if he took that position.  So I returned from sabbatical to find myself working for the CTO.

The first thing on David’s plate as CTO was to revitalize the Integrated Storage effort.  And so I returned from sabbatical to find the main thing on my plate was Integrated Storage!  In the six months since I’d left SQL Server they’d made some progress on this front.  A couple of people from Exchange had joined the SQL Server team and created something called Mighty Mouse, essentially a SQL Server image suitable for use inside Windows.  And the Exchange team had started design work on a redesign based on SQL Server.

While I was on sabbatical David had neatly partitioned the problem in an attempt to simplify the effort, with SQL Server as the store and a set of schema definitions for People, Places, and Things that we wanted to promote across Microsoft and to third-parties.  So, for example, we could have a common definition of Contacts that would be used by the Windows Shell, Outlook, Exchange, Great Plains, etc.  And we could evangelize SAP and others to use our schema as well.  For the next few months we would engage many teams around Microsoft in an effort to get them to adopt SQL Server and the P.P.T. schema.

A plan for Integrated Storage was finally coming together.  The Windows Shell team had decided to build the new Longhorn shell on Mighty Mouse and use the P.P.T. schema.  Exchange was on board.  The SharePoint team, already users of SQL Server in an idiosyncratic way, was on board for a redesign.  Outlook was back to taking a serious look at switching to SQL Server for a PST replacement.  Active Directory was committing to doing their future work, starting with a meta directory, on SQL Server.

Why the apparent success this time?  I think two things were different.  First of all was the previously mentioned industry-wide attitude towards SQL databases.  Whereas during the mid-90s they were considered suitable just for large-scale data processing applications, by the turn of the century they were accepted as a storage medium for a broad array of storage needs.  Second, rather than having to deal with a future Integrated Storage file system, projects were looking at SQL Server itself.  We’d ended up with a much more incremental strategy: move everyone to SQL Server (plus some common schema), then evolve it to the full Integrated Storage solution.

I agreed to move back to the SQL Server organization to become the Integrated Storage General Manager.  This looked like it would work out both professionally and personally.  I could build an organization and deliver the first version of Integrated Storage in Longhorn, put designs and a plan in place for the more complete Integrated Storage solution in Blackcomb, and then retire to Colorado on the timeline my wife and I had set.  Sadly this is not how things would play out.

Microsoft had decided that NT would become the base for mainstream Windows and the plan was to have one release to merge the two streams and then follow that up with another major release to move Windows forward (i.e., in the direction that Cairo was originally intended).  Windows 2000 was supposed to be the merged release, but finishing up the application and device compatibility work proved to be more than could be accomplished.  So the plan now became a quick turnaround release called Whistler to finish the merge and then follow that up with Blackcomb as the leap in user experience etc.

Whistler became Windows XP and during its development the Windows team decided they needed to do another modest release before moving on to the major overhaul that was envisioned for Blackcomb.  This release was named Longhorn.  While Longhorn work started based on the original modest requirements and schedule, a number of factors led to a change in thinking.  Microsoft couldn’t afford to keep delaying the reinvention of Windows, and so effectively Longhorn took on the Blackcomb requirements.

From an Integrated Storage perspective the new direction for Longhorn meant that the incremental strategy would have to be replaced by a full-on Integrated Storage File System.  For me personally it put the effort out of scope.  I’d already worked out with my wife that if Longhorn slipped (as releases often do) she’d move to Colorado and I’d stay in Redmond for a few months to finish up.  But now Longhorn had officially moved out a year, and unofficially everyone believed it would take at least two years beyond the original schedule.  I briefly considered being a commuter, but that wasn’t going to be viable (as I’d have to be in Redmond five days a week, every week, and often on weekends) for as long as Longhorn would take.  So I bowed out and Peter Spiro took on the Integrated Storage responsibilities.  The accelerated Integrated Storage effort would eventually get the name WinFS.

Shortly before I left the effort Hailstorm was cancelled and some of its charter and teams (e.g., synchronization) moved into my organization.  Basically the distractions to finally delivering on Integrated Storage were being removed.  But as the Integrated Storage charter grew so did the problem of deciding what to address in the first release.  For example, could you address client, server, and cloud storage all in one release cycle?  No, at least not equally.  So this became one of many tensions in the system.

A few months ago someone asked me why Exchange wasn’t built on SQL Server.  And earlier in this post I mentioned that they’d actually started such an effort.  But with the creation of WinFS they were stuck between a rock and a hard place.  Should they target SQL Server or WinFS?  WinFS was prioritizing client first, which suggests they should have continued with their SQL Server port.  But then Exchange would have faced the prospect of porting once to SQL Server and later to WinFS, so I believe they decided to wait for a server version of WinFS.  Apparently they are still waiting.

Longhorn itself turned out to be too aggressive an effort and have too many dependencies.  For example, if the new Windows Shell was built on WinFS and the .NET CLR, and WinFS itself was built on the CLR, and the CLR was a new technology itself that needed a lot of work to function “inside” Windows, then how could you develop all three concurrently?  One story I heard was that when it became clear that Longhorn was failing and they initiated the reset it started with removing CLR.  Then everyone was told to take a look at the impact of that move and what they could deliver without CLR by a specified date.  WinFS had bet so heavily on CLR that it couldn’t rewrite around its removal in time and so WinFS was dropped from Longhorn as well.

The WinFS project continued with the thought that it would initially ship asynchronous to a Windows release before being incorporated into a future one.  But now it had two problems.  First, it was back to the problem of having no Microsoft internal client that was committed to use it.  And second, they eventually concluded that there was no chance in the foreseeable future of shipping WinFS in a release of Windows.  With the move of Steven Sinofsky, who had been a critic of WinFS, to run Windows that conclusion was confirmed.  WinFS was dead.

Of course the SQL Server team had learned a lot about the needs of non-traditional database applications and created a lot of technologies while working on WinFS.  And they had a strong ship vehicle of their own in SQL Server.  So they decided to incorporate much of what they’d learned on WinFS in future releases of SQL Server as part of making it a broader Data Platform.  For example Sparse Columns, released as part of SQL Server 2008, is a feature that the semi-structured storage community had been asking for since the JAWS discussions over a decade earlier.  The Entity Data Model, File Streams, Semantic Search, etc. are all outgrowths of the long history of work on Integrated Storage.

In the final part of this series I’ll speculate a bit on the future of Integrated Storage.


23 Responses to WinFS, Integrated/Unified Storage, and Microsoft – Part 4

  1. Thanks for this great history lesson.

    Two ideas I championed during Win7 incubation of a new “Windows Application Model” (an effort led by then Core Arch + an amazing team of devs led by Suyash Sinha chartered to tackle the “Windows Hard Problem” list) were a generalized way to annotate stores with enough meta data so that the OS could programmatically reason over their semantics (this was part of effort to arrive at a generalized form of state separation and exorcise the evil that is the Windows Registry) and the other was a generalized way to define extensibility contracts for the OS, and for applications.

    Both ideas were debated at length but not adopted. Part of this was my fault for not explaining well enough. I was often told “you’re trying to build WinFS and we’re not doing that.” Without this history lesson I never really knew what that meant other than realizing that was my cue to shut up 🙂

    Postscript: I tried again in the new Windows 8 organization. You can judge my success by the lack of native OS support for extending Win8 apps, and the fragmented stores used for the metadata associated with installed AppX packages.

    veni vidi defui

  2. Pingback: SQL, OFS, Exchange file system, WinFS, and other murky dealings | Thoughtsofanidlemind's Blog

  3. There is something I don’t understand about this: When it comes to such fundamentally new technologies, isn’t it more appropriate to have a Microsoft Research-like team work in isolation first, figure out what this “Integrated Storage” should be and build a functional prototype?

    Only then can all the various teams work on using that technology to build their products. Contacting them earlier than that is a recipe for disaster (“too many cooks” analogy).

    Many ground-breaking technologies were created that way (including .NET), so I don’t understand why that wasn’t tried for this seemingly important project.

    I’m guessing you will cover that in your speculation in the next part.

    • halberenson says:

      What makes you think .NET was created that way? Back in 1994 David Vaskevitch, whose role at the time included being DevDiv’s chief architect, proposed a common runtime. DevDiv was so busy with the problem of how to unify its IDEs (recall that each language, particularly C++ and VB, had its own) that they couldn’t deal with the common runtime idea. Later, after Java entered the market and David had become the SVP of DevDiv, Microsoft got on the common runtime path first with Project 42 and then scaling that back to what we now know as .NET. But I don’t recall this being based on any research project (not that MSR might not have participated).

      The problem with the approach you describe is that it only works when no one actually is interested in addressing the problem. That is, you can do research and advanced development in advance of someone saying “let’s go build a product”. But once a product team decides it is ready to build one, it goes and builds one. They don’t sit around and wait for researchers to tell them what to build.

      That’s not just an observation of Microsoft, it’s an observation of DEC as well. And from discussions with IBM folks it is (or was) true of them too.

      • I hadn’t read the history of .NET… My bad (my perspective was colored by all the Channel 9 videos I watched of Anders Hejlsberg & cie).

        I see what you mean about the product teams not waiting for “researchers”. And from what I read in this post, these teams did indeed go ahead and built their own isolated solutions.

        But when I look at the figure on the following page:
        It tells me that Microsoft has been interested in Integrated Storage for more than 20 years now.

        So, putting the politics aside (which I know isn’t easy), if a more isolated/independent team had built a prototype years ago (in parallel with these teams’ efforts) and proven that it was the “best” (most efficient, extensible, scalable, etc) solution, all these teams could have been forced into converging on that technology.

        So, is it possible and more likely that this project would have been more successful if that approach was taken? (Was it even tried? And if not, why?)

    • Brian says:

      A few comments on this:

      1) If you want to spelunk the history of .NET, thumb through really old MSJ/MSDN magazines from the late nineties. Look at the articles on what was planned for later releases of COM+, and look at what Microsoft was doing to Java. If you screw the two of them together, you get a kinda-sorta view into what became the .NET Framework.
      2) MSR’s big contribution to .NET was the implementation of generic types in the v2.0 version of the Framework. It was really a masterful thing – silver bullets are rare in software, but the .NET generics implementation was pretty silver-bullet-like.
      3) MSR can do lots of great work, but doing something like Integrated Storage (where *key* Microsoft products like the OS (for the file system), SQL Server, Exchange, and other products that need some sort of store will be the primary customers) really isn’t where MSR excels. MSR can help with blue-sky work here, but someone in the product group(s) needs to make sure that the key customers (the OS, SQL, Exchange) have their very specific needs met.

      • halberenson says:

        Brian, I think you kind of captured it. MSR (and any research organization) is good at coming up with technologies that are then picked up by product groups. Sometimes they do assemble pieces of their research into more complex research systems, but it is very very rare that such a system is productizable. There is a different type of non-product development that at DEC we called Advanced Development but has never had a clear name or methodology at Microsoft. You fire up an AD project when the technology exists (i.e., someone like MSR has done the invention) but you don’t have enough understanding to craft a full set of product requirements, scenarios, etc. So you build prototypes and prove out your ideas, then when you understand what you want to build you transition it to a formal product effort. At least you try. When the relational database AD effort at DEC was getting ready to transition to development the direction it had chosen was rejected, the project killed, and two other (competing) product efforts started up instead. In the end all that came out of the AD work was enough proof of concept to show that you could indeed build a commercial relational database product.

        Basically, you have to build products in a context where you have something grounded in reality as an anchor for your work. You need real users. The reason Azure took so long to get traction is that the first version did not have a customer in mind. Had they partnered with an internal property as their first client then they would have built something useful. Instead they built “something”. “If you build it they will come” rarely works.

  4. Tim says:

    I remember reading MSDN Magazine articles on WinFS back in 2004 or so. Very interesting to hear what was actually going on behind the scenes within Microsoft. Until reading your blog (and others’) in the past year or so and hearing about all the maneuvering that actually takes place within the business, I had naively assumed there was a unified effort on the part of Microsoft whenever I heard about the things that were “coming soon.”

    Must have been difficult for those working on initiatives that span many years only to have them cancelled.

  5. Samuel Smith says:

    Hal, thanks for this great post. I love reading your take on many different topics, but this one … let’s just say that it isn’t often that I get to read something that helps illuminate the events of my own past!

    I read the history you describe here, and for some of it I thought, yeah, I was there, I remember that – but from a much different perspective. As a low-level SQL engine developer, I saw only a little of what you describe directly, but I saw so much of its effect as we were yanked back and forth by what seemed to be ever-changing plans and product directions. From the grand plans of WinFS (eventually surviving as “mere” File Streams), to the promise of .Net CLR’s nirvana of server-hosted extensibility (with years to get basic hosting right, the other dependencies you mention had no hope!), from the mostly successful heterogeneous query access via OLE DB (so much was special-cased for distributed but “homogenous” access), to the less successful attempts to integrate search at the storage-level (using approaches that fit a client-side application) … those were definitely interesting times! I actually miss (some of) it.

    Thanks again for your post.

  6. Vitaly Akulov says:

    thanks for bringing this information to the surface. As someone who was in SQL Server group at that time (although things I did were not related to WinFS) I found it very fascinating. I think there were more reasons “helping” WinFS to die which you didn’t mention:
    – Semi-official merging SQL Server with Exchange when tens of Exchange developers were “forced” to work on SQL Server code, including WinFS and not having previous database experience
    – Many of them leaving SQL Server group as soon as they could, back to Exchange or other groups
    These factors disturbed talent pool distribution between SQL/WinFS and created some unhealthy tensions, from what I remember.

    Thanks again for the great post,

  7. Qone says:

    Hi Hal, I am an avid reader of your blog, always great insights.
    Would you be able to do a blog on HTML5 vs native apps and the future of HTML5 in mobile dev?

  8. yuhong says:

    What about the Jet 4.0 story and how Access 2007 ended up having to fork it to create ACE?

  9. Fascinating series of articles. Looking forward to part 5. 😉

  10. AndyLawrence says:

    It looks like you never got around to writing part 5 of WinFS and what you think the future holds for integrated storage solutions. I am currently working on a whole new, general-purpose data management system that manages structured and unstructured data, so I found your series (Parts 1-4) a fascinating read. I have worked with file systems for over 25 years and my new system is designed to be a replacement for conventional file systems that integrates a lot of database features. It works far better than I imagined it would when I first started coding it from my design (now 3 years ago), but I still have a lot of work to go before it is a finished product. Hopefully it will go a lot farther than WinFS ever did.

    • halberenson says:

      It’s still on my list of things to do. I get very frustrated whenever I try to write it and end up abandoning the effort. Someday it will just jell in my head and I’ll put it down.

      Good luck with your design!

  11. meiru says:

    I know that there is a product that is what WinFS (in my opinion) would have been if the project was not cancelled. Contact me for more information…

    • meiru says:

      … to be more precise: It’s a project that’s still in its final testing phase and not yet publicly available.
