Now I get it, the Consumer guys at Microsoft are just plain wrong!

I’m an Enterprise guy.  In the mid-70s I set a career goal of dethroning IBM from its then total dominance of the enterprise computing space.  By the end of the 80s I needed a new goal, and it’s always been around furthering the use of computing in the Enterprise.  So I like to think, and have plenty of evidence to support, that I know enterprise computing as well as anyone on the planet.  But outside of being a consumer, and applying common sense and observation, I claim no particular expertise in bringing technology to consumers.  So I tend to give the so-called experts in the consumer area the benefit of the doubt when they claim some behavior that violates my sensibilities.  But I’ve concluded they are wrong.  Horribly wrong.

In the Enterprise space we realized something decades ago:  Customers don’t so much buy your existing product, they BUY-IN to your strategy.  Your product can have numerous weaknesses and even look bad against the competition in some critical areas, but as long as the CIO and other key decision makers like where you are going they will still choose you.  And so we always have been willing to tell our customers, often but not always under NDA, where we were going.  It worked.  They bought.

But the consumer experts, using Apple as an example of a successful strategy, have argued that you don’t say anything until shortly before shipping.  There are good reasons for this.  You want maximum press coverage very close to availability.  You want to avoid the Osborne Effect.  You want to avoid over-promising and under-delivering.  A CIO might understand that you had to change plans because of strategy changes, technology shifts, or just engineering expediency.  Consumers aren’t so forgiving.  Once you tell them about something they aren’t very tolerant of a failure to deliver.  Basically, in the consumer realm “shock and awe” rules.

Problem number one with “shock and awe” is that it doesn’t work when you need developers and other partners to succeed.  Apple doesn’t say anything about iPhone x more than days before availability, but it does release the SDK for the new version of iOS months in advance.  Microsoft did this for Windows Phone 7.0 and 7.5, but for Windows Phone 8 it didn’t release the SDK in advance.  So four months after Windows Phone 8 devices started shipping we still see very few apps that take advantage of new features.  That’s a FAIL, Microsoft.

The latest catch-phrase being attributed to the Windows Phone team is “shut up and ship”.  Really?  That’s what you did for Windows Phone 8 and it didn’t work.  You alienated developers.  You alienated power users.  You alienated the faithful.  You alienated the influencers.  And you are continuing down that path.  True, the bulk of buyers don’t care a lot about these things.  But they take their cue from those that do.

Of course this doesn’t just apply to the Windows Phone team, but to Windows as well.  Microsoft had a history of saying too much too far in advance and then being unable to deliver.  The purpose of PDC was to give developers an advance peek at what was being worked on, and get their feedback.  This gave both developers and Microsoft time to react before a product was finalized.  The disaster that was Longhorn let Steven Sinofsky bring his philosophy to Windows, and it was the complete opposite.  Don’t talk to people, even under NDA, about what you are doing until it is almost fully baked.  Don’t let developers talk to customers.  Impose secrecy (and impose it even on enterprise customers, which is a horrible mistake).  What did this get Microsoft?  A bunch of rookie mistakes, including failure to address the Start Menu/Desktop situation in a way that traditional form factor users find acceptable.  80%, 90%, perhaps 99.99% of dissatisfaction with Windows 8 is all tied up in this one issue.  Failure to disclose intent early, and respond to the resulting feedback, is holding Windows 8 back.

And the Windows 8 mistake continues.  Early disclosure of the direction for Windows Blue and Windows 9, or whatever the next couple of releases are called, would go a long way towards assuaging power user discontent.  It would also give Microsoft feedback on whether what they are doing is sufficient, in time to actually react to it.  But no, that is not the philosophy of the Windows team.  Nor Microsoft in general these days.  The irony is that when Microsoft was considered at the height of its arrogance it was actually listening to customers very closely.  Now that it is fighting for its right to continue to be called an industry leader it displays the greatest sense of arrogance.  “We know what’s right for you and we’ll tell you what that is when we attempt to shove it down your throat”.  If my Microsoft friends don’t believe that is their attitude, then it’s because they are on the inside looking out.  The view from outside is enlightening.

Microsoft needs to improve customer engagement, dramatically.  And the first thing that will take is to recognize that the consumer teams’ attitude towards communicating futures is just plain wrong.  That doesn’t mean abandonment of controlled information release, it means applying it more intelligently.  It means disclosing platform direction early.  It means bringing developers on board early.  It means giving general technology direction to the public early (à la what BillG used to do) without talking specific releases or products.  It does not mean releasing every detail of a product in advance.

Microsoft has to make a clear differentiation between platform and product.  They need to nurture the ecosystem around the platform.  That requires openness.  They need to reserve “shock and awe” for the product.  The problem I see from the consumer guys is that as much as they say the word “platform” they don’t get it.  Even those that used to get it now seem to have amnesia.  They drank the consumer Kool-Aid.

Let me contrast three strategic thrusts going on at Microsoft.  Windows, Windows Phone, and Azure.  Windows and Windows Phone are in the “shut up and ship” camp.  Azure is in the ENGAGE camp.  It seems like every week Scott Guthrie is announcing new Azure technology previews or releases.  Everything about Azure is exciting.  Amazon, Salesforce, and a few others defined cloud computing.  Azure is displacing them.  It has the Big Mo.  Let me make this clear, AZURE IS GOING TO WIN the cloud computing infrastructure and platform battle.  Meanwhile Windows and Windows Phone continue to alienate their ecosystems.  It is unclear if Windows Phone will ever amount to a significant third ecosystem.  It is unclear that Windows will be able to halt an overall market share decline against iOS and Android tablets.  Azure developers are excited.  No, it’s beyond excitement.  Windows and Windows Phone developers?  Not so much.  They are, at best, conflicted.  Azure is doing platforms right.  Windows and Windows Phone?  They prefer to “shut up and ship”, even if it risks no one caring what they ship.

It turns out that the consumer ecosystem as well as the community of influencers craves direction and interaction with the vendor every bit as much as Enterprise CIOs do.  Put simply, the consumer guys are wrong about secrecy.  At least at Microsoft.  I hope they figure it out soon, because “shut up and ship” is not helping their cause.

In closing let me remind everyone that I think Windows Phone and Windows 8 are great products.   It is the failure to engage with the ecosystem in a way that Microsoft well understands, and continues to do successfully in the enterprise (STB) space, that I’m criticizing.  Forget that conventional “shock and awe” consumer wisdom.  It was wrong.  Return platform evangelism, including willingness to discuss “futures”, to the forefront and watch Windows and Windows Phone adoption explode.

Posted in Cloud, Computer and Internet, Microsoft, Windows, Windows Phone | 50 Comments

Losing patience with Windows Phone

I have to say that my patience with Windows Phone is wearing thin.  There are two reasons for this.  The first is the lack of real progress on the application front.  The second is the feeling of abandonment surrounding the Nokia Lumia 900 (or more broadly, Microsoft’s inability to even get a minor update like Windows Phone 7.8 fully rolled out).

Back in April and June of 2012 I wrote blog entries bemoaning the lack of real world apps on Windows Phone.  I won’t remake those arguments, so I suggest you read those pieces to understand my frustration.  The bottom line here is that Windows Phone does not help me manage my life nearly as well as an iPhone (or Android-based phone) would.  It is 11 months since I wrote that first blog entry and as far as I can tell NONE of the real world interaction apps I was missing have appeared on Windows Phone.  And I include both WP7.x and Windows Phone 8, so it isn’t that moving to a WP8 device would help.

The second issue is less important, but actually related to the first.  Microsoft “screwed the pooch” on the rollout of Windows Phone 7.8.  It is now March and AT&T still hasn’t offered me the update for my Lumia 900.  Well, right now I think updates are on hold while Microsoft fixes a bug with Live Tiles.  But every day that goes by I care less and less about the update.  Basically, thanks for the cosmetic improvements but if and when 7.8 actually comes to my Lumia 900 it won’t make the device measurably more useful to me.  Microsoft has managed to turn WP7.8 from something intended to mollify WP7 device owners into something that rubs their nose in the lack of upgradeability of devices they acquired less than a year ago.

The reason I feel these two issues are related is simple.  If WP8 solved my app problem I’d be salivating over WP8 devices and forget all about the WP7/WP8 transition. Instead I just view WP8 devices as more major cosmetic improvements that don’t get me even 1% of the way towards closing the usefulness gap with the iPhone.  They aren’t giving me what I really need, and thus any momentary lust for new technology is overwhelmed by the fact that I’ll shell out $500+ of my own money and after a few weeks of technological masturbation be just as unsatisfied as I am today.

So next month, when I would normally do my mid-contract upgrade, I’m not sure what to do.  One thing is clear, the certainty that I’ll be getting a new Windows Phone is gone.  I just don’t see the compelling value proposition.

Posted in Computer and Internet, Microsoft, Mobile, Windows Phone | 51 Comments

WinFS, Integrated/Unified Storage, and Microsoft – Part 3

Although there are several ways to interpret the phrase “integrated storage” (or “unified storage”) one of the most important ones to focus on is that it creates a single store for Unstructured, Semi-Structured, and Structured types of storage.  The differences between these storage types, often seemingly small, are at the core of the technical, engineering, and political challenges involved in creating a new store.  So before diving into the history of Microsoft’s efforts it is valuable to discuss these three types of storage.

Unstructured Storage, the classic storage provided by operating system file systems, is something I’ve already discussed quite a bit in the previous parts of this series but want to add more clarity.  File Systems historically treat files as a bag of bits which can only be interpreted by an application.  They concern themselves with making it very fast to open a file, allocate space to it, stream bits to and from the file, and navigate to specific starting points in the file for performing streaming.  They also pay a lot of attention to maintaining the integrity of the storage device on which the file resides, and of providing certain very specific behaviors upon which an application (which might include a DBMS) can build more robust integrity.

The developers of File Systems tend to rebel against changes that violate the basic Unstructured Storage premises.  They want a very restricted fixed set of metadata about a file so they can make File Open very fast.  They don’t want to introduce concepts that require a lot of processing in the path of moving data between a raw storage device and the application (or network stack in the case of TransmitFile).  They don’t want to introduce complexity into kernel mode that risks the overall reliability of the operating system.  And they pay a huge amount of attention to the overall integrity of a “volume” and what happens when you move it between computer systems.

It isn’t that File System developers haven’t responded to pressures for richer file systems, it is that they have done so in very careful and precise ways that mirror their core mission.  At DEC, for example, they introduced Record Management Services (RMS) to add some measure of structure on top of the core file system.  RMS turns a bag of bits into a collection of records of bits.  In the case of keyed access a set of bits within the records could be identified as a key which was then indexed allowing retrieval by key.  But once a record was retrieved the application was responsible for interpreting its contents.  Importantly, RMS existed as a layer on top of the core file system, and didn’t run in kernel mode.

At Microsoft you can see numerous ways that the File System team tried to accommodate greater richness in the file system without perverting the core file system concepts.  For example, the need for making metadata dynamic or adding some of the things that the Semi-Structured Storage world needs was met by adding a secondary stream capability to files.  That is, the traditional concept of a file was that you had a single series of allocation units pointed to by a catalog entry.  NTFS gained the ability to have that catalog entry point to more than one stream of allocation units.  The primary stream represented the file as we normally know it.  An application could attach another stream to the file to hold whatever it wanted.  The file system guys really didn’t care, and they didn’t interpret the stream.  So this was a very natural extension.  They also created File System Filters as a means to allow extensions to the file system without modifying the core file system itself.

From an engineering and political standpoint you can see what might happen when you start discussing replacing something like NTFS with an Integrated Storage solution like WinFS.  How does it impact the boot path of the operating system?  How does it impact the reliability of the operating system?  What happens to scenarios like Web Servers or Network File Servers, which serve up bags of bits using standardized protocols and are evaluated by benchmarks, and against competition, that will neither benefit from nor suffer the cost of a richer file system?  How would the new file system impact minimum system requirements?  Does the namespace cross multiple volumes?  How would that impact the portability of volumes?  All very good questions that need to be addressed.

The natural progression would be to talk about semi-structured storage next, but since it is the youngest of the storage types I’ll first focus on Structured Storage.  While the file system guys have always treated files as a bag of bits, applications need some way of interpreting those bits.  That knowledge can be completely encapsulated in the application itself, or parts of it can be shared.  One of the earliest motivators of the library mechanisms we find in programming languages today was as a way to share the definitions of how to interpret the contents of a file.  COBOL’s Copy statement was a prime example.  Data Dictionaries, and their modern evolution to being a Repository, were further evolutions of this concept.  To commercial data processing, as opposed to technical/scientific, applications a file was a collection of records each of which adhered to a specific format.  That format information was shared across any application that desired to process the file.  So you had a customer file with customer records.  Each record was xxx bytes long.  The first two bytes contained an integer Customer ID, the next 30 bytes had a Customer Name, etc.
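To make the fixed-format record idea concrete, here is a minimal Python sketch of a shared record definition.  The layout follows the customer example above (a two-byte integer Customer ID, then a 30-byte Customer Name), but the byte order, the space padding, the ASCII encoding, and every name in the code are my assumptions for illustration.  A real record would carry more fields and be longer than the 32 bytes used here.

```python
import struct

# Hypothetical layout shared by every program that reads the customer file:
# 2-byte unsigned Customer ID, then a 30-byte Customer Name.
# Big-endian byte order and space padding are assumptions.
CUSTOMER_RECORD = struct.Struct(">H30s")


def pack_customer(customer_id: int, name: str) -> bytes:
    """Encode one customer as a fixed-length record."""
    raw_name = name.encode("ascii")[:30].ljust(30, b" ")
    return CUSTOMER_RECORD.pack(customer_id, raw_name)


def unpack_customer(record: bytes):
    """Interpret one fixed-length record as (customer_id, name)."""
    customer_id, raw_name = CUSTOMER_RECORD.unpack(record)
    return customer_id, raw_name.decode("ascii").rstrip(" ")


def read_customers(data: bytes):
    """Walk a 'file' (here just bytes) one record at a time."""
    size = CUSTOMER_RECORD.size
    for offset in range(0, len(data), size):
        yield unpack_customer(data[offset:offset + size])


# To the file system this is just a bag of bits; the shared layout is
# what lets an application see customers in it.
customer_file = pack_customer(1, "Acme Corp") + pack_customer(2, "Globex")
customers = list(read_customers(customer_file))
```

The point of the sketch is that the file itself carries no description of its contents.  Every application has to agree on `CUSTOMER_RECORD` out of band, which is the job the COBOL Copy statement and later Data Dictionaries took on.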

Pretty soon this evolved to deal with the fact that apps didn’t process one file with one record type.  You had orders, and order line items, and parts, and the bill of materials for those parts, and inventory information, and the customer, and customer contact information, and so on.  You needed to manage and share these as collections.  Then notions of cross-file integrity entered the picture and transactions, logging/recovery, etc. were added.  And there was recognition that apps not only didn’t care about the physical structure of the “files”, putting that knowledge in apps made it hard to evolve them.  So separation of logical file and physical file ensued.  And making every app responsible for the integrity of the data led to logical data integrity problems, so the ability to pull some of that responsibility into what is now called a database management system was added.  And application backlogs became a key problem so there was a push for reporting and query tools that allowed non-programmers to make use of the data collection.  And high-productivity “4GL” development tools to allow lower-expertise programmers to write applications.  And this all led to the modern concept of a relational database management system.

So when we talk about Structured Storage we are talking about the classic database management concepts.  We’ve replaced Files/Records with Tables/Rows.  Each table has a well known logical structure that each row in the table conforms to.  There are good mechanisms for making tables extensible, such as adding a new data element (column) that is “null” in rows in which no value has been specified.  And a relational database by its nature transforms tables into other tables so we can actually have virtual table definitions (or views) that applications use.  But basically we are talking about groups of things with well known, externally described, structure.
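The three ideas in that paragraph (externally described structure, extensibility through nullable columns, and views as virtual tables) can be seen in a few lines.  This is a small sketch using Python's built-in sqlite3 module; the table, column, and view names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A table has a well known logical structure that every row conforms to.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Acme Corp')")

# Extensibility: a new column is simply NULL in rows written before it existed.
conn.execute("ALTER TABLE customer ADD COLUMN region TEXT")
conn.execute(
    "INSERT INTO customer (customer_id, name, region) VALUES (2, 'Globex', 'West')"
)

# A view is a virtual table derived from other tables.
conn.execute(
    "CREATE VIEW west_customer AS "
    "SELECT customer_id, name FROM customer WHERE region = 'West'"
)

all_rows = conn.execute(
    "SELECT customer_id, name, region FROM customer ORDER BY customer_id"
).fetchall()
view_rows = conn.execute("SELECT customer_id, name FROM west_customer").fetchall()
```

Here `all_rows` shows the older row with no region value, and `view_rows` shows what an application written against the view would see.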

Most of the world of commerce we are used to was made possible by the creation and growth of the concept of Structured Storage.  The modern world of Credit Cards and ATMs is 100% predicated on this work.  Amazon.com was in the realm of science fiction in the 1940s.  By the 1970s the conceptual basis for everything you needed to create it was in place.  It took until the 1990s for those concepts to mature sufficiently to let Amazon happen.  For structured storage we had database management system concepts and (hierarchical and network) implementations appear in the 1960s.  Ted Codd described the relational model in 1969, and during the 1970s the System R and Ingres projects explored how to implement his model.  They also defined most of the integrity concepts we take for granted today such as ACID.  But it wasn’t until the late 1980s that relational database management systems, which found their earliest adoption in “decision support”, became suitable for transaction processing.   And it was the 90s by the time they were the preferred solution for high performance transaction processing.

Moreover, it wasn’t until the late 90s that developers in all application areas embraced relational database management systems.  In fact, in the mid-90s most applications that weren’t clearly in the commercial data processing camp preferred to use unstructured storage even when they were storing structured data.  Today we have smartphone applications using SQLite (and other small relational systems) as a primary means of storage.  My how Structured Storage has evolved.

During the commercialization of relational database management systems (RDBMS) in the 1980s it was recognized that not all data you’d want to store in them was actually structured.  During the development of DEC’s Rdb products Jim Starkey invented the concept of a BLOB (Binary Large Object) as a way to store this data, a concept that was embraced by virtually all RDBMS.  The simple idea here was that you could do something like store an employee’s picture in a blob that was logically inside the employee’s row in the Employee table.  Other ideas quickly developed, such as a document management system with the documents stored in blobs.  But blobs were rather weakly implemented and received minimal attention from RDBMS development groups.  This will play an important role in our later exploration.

Meanwhile a third category of storage had emerged, primarily out of the Information Worker environment, called Semi-Structured Storage.  I like to think about this as having two periods of evolution.  In the first, files remained a bag of bits whose internal structure was private to an application but that also carried around a set of public metadata.  In the second, the internal structure was exposed to any application though they might not be able to actually operate on it.  The latter is the world brought about by XML and I’ll discuss that a bit later.

So what are examples of Semi-Structured Storage?  A Microsoft Word document is one.  Forget that today Word documents are stored as XML using the Open XML standard, they used to be a fully proprietary binary format.  But they exposed metadata such as Title, Author, etc. as a group of Properties known as a Property Bag.  In other words, they promoted certain information from their private format to a publicly accessible one.  Email is another example of something in which there is the content of the message and then a set of metadata about the message.  Who sent it, who was it sent to, what is its Read/Unread status, etc.  For something non-IW think about JPEG files.  There is the image and then there is a set of properties about the image.  Things like the camera it was taken with, GPS coordinates, etc.  Applications, including Outlook or the Windows Shell, can make use of these Property Bags without having the ability to interpret the contents of the file itself.

One of the characteristics of a Property Bag is that new properties can be added rather arbitrarily.  A law firm might create a “CaseNumber” property that it requires employees to tag all Word documents with.  Or Nikon could add specific properties about photos taken with their cameras to a JPEG image that neither the standard defines nor that any app other than their own could make sense of.  But it’s not just top level organizations that can define properties, anyone can.  So the PR department can define a property for its documents such as “ApprovedForRelease” with values such as “Draft” or “Pending” or “Approved”.  Or an individual could define a property such as “LookAtLater” for email messages.

The notion of a Property Bag seems easy enough and painless enough to understand, but it clashes with the world of Structured Storage.  How does arbitrary definition of metadata clash with a world in which schema evolution is (mostly) tightly controlled?  Do you add a column to a table every time someone specifies a new property?  If two people create properties with the same name are they the same property?  If a table with thousands of columns, all of which are Null 99.99% of the time, seems unwieldy then what is an alternate storage structure?  And can you make it perform?
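One commonly used alternative to the thousand-column table is to store each property as its own row, keyed by item and property name.  The sketch below shows that shape using Python's sqlite3 module.  It is only an illustration of the trade-off, not a description of what any Microsoft store actually did, and all table, function, and property names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_id INTEGER PRIMARY KEY, path TEXT NOT NULL)")

# One row per (item, property) instead of one column per property,
# so defining a new property never changes the schema.
conn.execute(
    """CREATE TABLE item_property (
           item_id INTEGER NOT NULL REFERENCES item(item_id),
           name    TEXT NOT NULL,
           value   TEXT,
           PRIMARY KEY (item_id, name)
       )"""
)


def set_property(item_id, name, value):
    """Add a property to an item, or overwrite its current value."""
    conn.execute(
        "INSERT OR REPLACE INTO item_property (item_id, name, value) VALUES (?, ?, ?)",
        (item_id, name, value),
    )


def find_items(name, value):
    """Return the paths of items whose property `name` equals `value`."""
    rows = conn.execute(
        "SELECT i.path FROM item i JOIN item_property p ON p.item_id = i.item_id "
        "WHERE p.name = ? AND p.value = ? ORDER BY i.path",
        (name, value),
    )
    return [row[0] for row in rows]


conn.execute("INSERT INTO item (item_id, path) VALUES (1, 'brief.doc')")
conn.execute("INSERT INTO item (item_id, path) VALUES (2, 'press.doc')")
set_property(1, "CaseNumber", "2013-0042")
set_property(2, "ApprovedForRelease", "Draft")
set_property(2, "ApprovedForRelease", "Approved")  # overwrites "Draft"

approved = find_items("ApprovedForRelease", "Approved")
```

This answers the unwieldy-table question but not the others.  Every value is untyped text, every property lookup costs a join, and nothing here says whether two people who both invent a "CaseNumber" property mean the same thing.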

XML didn’t exist until 1998, so when I start talking about Microsoft’s Integrated Storage history it is important to note that it didn’t play a role in the first two major attempts at a solution (OFS and JAWS).  Prior to XML it was assumed that either a file was explicitly a semi-structured storage type (with a Property Bag, stored in a secondary stream for example) or implicitly one because an application-provided content filter (IFilter) could extract the Property Bag from a proprietary bag of bits.  In either case the application controlled the set of properties that were externalized.  With XML though anyone can examine and process the content of the file, making arbitrary structured storage-like queries possible.  The world of semi-structured storage exploded.

There are numerous ways one can combine these three views of storage.  BLOBs were an early attempt to address use cases where unstructured storage was needed in an application that was based on structured storage.  My “ah ha” moment around the importance of XML came during a customer visit and involved a favorite (from the earliest days of my career) application, Insurance Claims Processing.

During the waning days of SQL Server 7.0 Adam Bosworth approached me about this new industry effort, XML, that he and his team were driving.  XML as an interchange effort made a lot of sense, but as a database guy I was a skeptic on using it to store data.  So I set up a series of customer visits to early adopters of XML.  One customer was using it in an insurance claims processing app to address an age old problem.  The claims processing guys were evolving their application extremely rapidly, much more rapidly than the Database Administration department could evolve the corporate schema.  So what they would do is store new artifacts as XML in a BLOB they’d gotten the DBAs to give them and have their apps work on the XML.  As soon as the DBAs formalized the storage for an artifact in the corporate schema they would migrate that part out of the XML.  This way they could move as fast as they wanted to meet business needs, but still be good corporate citizens (and share data corporate-wide) when the rest of the organization was ready.
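The pattern that customer used can be sketched in a few lines.  This is my reconstruction of the idea in Python with sqlite3 and ElementTree, not their actual application; the claim table, the `extras` column, and the element names are all invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")

# The "corporate schema" plus one BLOB column the app team controls.
conn.execute(
    "CREATE TABLE claim (claim_id INTEGER PRIMARY KEY, policy_no TEXT NOT NULL, extras BLOB)"
)

# New artifacts go into the XML first, with no schema change needed.
extras = ET.Element("extras")
ET.SubElement(extras, "adjuster").text = "J. Smith"
ET.SubElement(extras, "photo_count").text = "3"
conn.execute(
    "INSERT INTO claim (claim_id, policy_no, extras) VALUES (1, 'P-100', ?)",
    (ET.tostring(extras),),
)


def read_extra(claim_id, tag):
    """Read one not-yet-formalized artifact out of the XML blob."""
    (blob,) = conn.execute(
        "SELECT extras FROM claim WHERE claim_id = ?", (claim_id,)
    ).fetchone()
    node = ET.fromstring(blob).find(tag)
    return None if node is None else node.text


def promote(tag):
    """Formalize an artifact: give it a real column and take it out of the XML.

    `tag` is used as a column name, so it must come from trusted code.
    """
    conn.execute(f"ALTER TABLE claim ADD COLUMN {tag} TEXT")
    rows = conn.execute("SELECT claim_id, extras FROM claim").fetchall()
    for claim_id, blob in rows:
        if blob is None:
            continue
        root = ET.fromstring(blob)
        node = root.find(tag)
        if node is None:
            continue
        root.remove(node)
        conn.execute(
            f"UPDATE claim SET {tag} = ?, extras = ? WHERE claim_id = ?",
            (node.text, ET.tostring(root), claim_id),
        )


before = read_extra(1, "adjuster")
promote("adjuster")
after = read_extra(1, "adjuster")
column_value = conn.execute(
    "SELECT adjuster FROM claim WHERE claim_id = 1"
).fetchone()[0]
```

Before the promotion only the claims app can find the adjuster, by parsing the blob.  Afterwards it is an ordinary column any corporate application can query, and the remaining artifacts stay in the XML until their turn comes.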

I returned from that trip convinced we had to add formal support for XML in SQL Server 2000.  So convinced that I encouraged my boss to bring Adam into the SQL organization and combine his efforts with others to create the Webdata org.  And, in a move that caused some consternation with the rest of the Server team, let the Webdata team make changes to the relational server code base.  And so independent of, though actually very much in line with integrated storage thinking, SQL Server was on its way in semi-structured storage.  Something I’ll return to in Part 4.

The existence of three types of storage, three sets of often conflicting requirements, three (or more) shipping product streams with different schedules, three classes of experts who deeply understood their type of storage but not both of the others, and three organizational centers of activity for those types of storage would make trying to create an Integrated Storage solution a continuing challenge.  It actually gets worse though in that various efforts which weren’t specifically under the storage or integrated storage umbrellas had deep overlap with storage.  Hailstorm is one example.  And it seemed like everyone in Microsoft had their own sync/replication service.  What was different about WinFS is that most of these barriers, including the organization structure, were addressed.  And the failure to deliver an Integrated Storage File System when the conditions were as close to ideal as they’ll ever be is why the concept will probably never be realized.  Meanwhile the world of storage has moved on in interesting ways.

In the next part of this series I’ll go through the actual history of Microsoft’s efforts.  Depending on its length I’ll either wrap up there with thoughts about the future or finish up with a fifth part.

 

Posted in Computer and Internet, Database, Microsoft, SQL Server, Windows | 6 Comments

WinFS, Integrated/Unified Storage, and Microsoft – Part 2

Hopefully Part 1 gave you some idea of the scenarios that Integrated Storage was intended to address and why you would want support in the storage system to help address them.  And yes, I barely scratched the surface of what one could imagine being possible if you did have that support.  I know many people just want the dirt…I mean history…behind Microsoft’s Integrated Storage efforts, but you are going to have to wait for Part 4 before I get to that.  In this part I wanted to discuss some of the real challenges in creating Integrated Storage.  Basically I want to explain why it is such a difficult nut to crack.

Which came first, the chicken or the egg?  This classic question drives a lot of the innovation problems in technology, particularly platform technology, and plays a huge role in trying to come up with an Integrated Storage strategy.  Ok, let’s use a couple of different and perhaps even more appropriate sayings:  “Build it and they will come” or  (to paraphrase) “Suppose you built a storage system and nobody used it?”  These questions dominate any discussion about how to bring the concept of Integrated Storage to reality.  Microsoft thought it had the answers, part of which was that you make it a (or rather THE) file system.

Creating Integrated Storage as a file system has both psychological and practical purposes.  It declares that Integrated Storage is the primary store for the platform which is important to attract developer interest.  This creates a commitment to applications that would build on Integrated Storage that the store will always be present on the platform.  Maybe even more importantly, it allows other platform components to use the new store.  And (as envisioned by Microsoft at least) it creates a means by which applications that don’t explicitly know anything about Integrated Storage can still manipulate the artifacts in the store.

Before I get into talking about file systems in more detail let me tie this back to one of my scenarios.  By the start of the 21st century it was clear that Photos was the next “killer app” for PCs.  It was also clear that traditional file systems were totally not up to the task of being an organizing tool for Photos.  Third party products like ThumbsPlus and ACDSee had appeared to fill the void.  If Photos were going to become such a critical data type then you needed to make them first class citizens in your platform.  So out of the box you wanted Windows (and particularly Windows Explorer aka Windows File Explorer) to provide a full out-of-box photo organization and basic manipulation experience.  To do that would require capabilities not present in the traditional file system.  But unless your Integrated Storage solution was part of the platform then components like Windows Explorer couldn’t rely on it and couldn’t provide a great OOBE for photos.

The file systems we use today, across all operating systems, are (externally) no different from the ones I used in the 1970s and that had their origins in the 1960s.  A file is a set of allocation units on a storage medium that externally is just a bag of bits (or blocks) without structure, without a name, and without any real way to navigate to it.  External to the data structures that deal with allocations and the basic concept of a container is a catalog structure that exposes a name and navigation (directory/file a.k.a. folder/file) system to users and applications.  At the leaf nodes of the catalog there are pointers to the allocation system’s container.  So applications (including something like Windows Explorer) use one set of APIs to navigate the catalog and then take another set to manipulate the bag of bits (or stream) they find at the other end.   Internally we’ve made lots of advances in how to organize and maintain the allocation units.   Long gone are the days where files had to be contiguous, for example.  But to an end-user or application, outside the switch to long file names, I’m hard pressed to describe any significant changes in the last 40 years.

File system stability has both up and down sides.  The upside is that every application knows how to deal with a traditional concept of file.  That’s the downside too.  So take our photo example.  You don’t need to implement Integrated Storage as a file system in order for Windows Explorer to be able to provide a great organizing experience for it.  But what happens when the user wants to run Adobe Photoshop to edit the photo?  You could evangelize Adobe to support the new store through a new (non-file oriented) API, but even if successful that doesn’t help until the user buys a new version of Photoshop.  From their perspective if the photos aren’t stored in the file system, and specifically a file system accessed with existing Win32 APIs, you’ve broken their application.  This same scenario applies to Microsoft Word.

New versions of Word might support a new Integrated Storage-based document store, but forcing purchase of a new version of Word in order to access documents in the store meant dramatically slower (if not nonexistent) adoption.  Thinking about a worst case scenario where a customer had a dozen apps, any one app’s failure to support Integrated Storage could have prevented the customer from making any use of Integrated Storage.

So from the earliest discussions I recall Integrated Storage was always a new, Win32-compatible, file system.  Accessing new functionality would be done by a new API, but you always had to be able to expose traditional file artifacts in a way that a legacy Win32 app could manipulate them.  Double-click on a photo in an Integrated Storage-based Windows Explorer and it had to be able to launch a copy of Photoshop that didn’t know about Integrated Storage.  And since that version of Photoshop didn’t know about Integrated Storage it also couldn’t update metadata in the store, it could just make changes to the properties inside the JPEG file.  So when it closed the file Integrated Storage had to look inside the file and promote any JPEG properties that had been changed into the external metadata it maintained about the object.

Much of the complexity of Microsoft’s attempts at delivering Integrated Storage is owed to all this legacy support.  Property promotion and demotion (e.g., if you changed something in the external metadata it might have to be pushed down into the legacy file format) was one nightmare that wasn’t a conceptual requirement of Integrated Storage but was a practical one.  Dealing with Win32 file access details was another.
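For readers who want to see the shape of promotion and demotion, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the "file format" is just a JSON header line followed by payload bytes (a stand-in for properties embedded in a JPEG), and `promote` and `demote` are hypothetical names, not anything from an actual Microsoft implementation.

```python
import json


def read_embedded(file_bytes):
    """Parse the properties stored inside the (toy) file format."""
    header, _, _payload = file_bytes.partition(b"\n")
    return json.loads(header)


def write_embedded(file_bytes, props):
    """Rewrite the in-file properties, leaving the payload untouched."""
    _header, _, payload = file_bytes.partition(b"\n")
    return json.dumps(props, sort_keys=True).encode() + b"\n" + payload


def promote(file_bytes, store_metadata):
    """After a legacy app closes the file, copy changed in-file properties up."""
    embedded = read_embedded(file_bytes)
    changed = {k: v for k, v in embedded.items() if store_metadata.get(k) != v}
    store_metadata.update(changed)
    return changed


def demote(file_bytes, store_metadata, keys):
    """After a store-side edit, push the named properties down into the file."""
    embedded = read_embedded(file_bytes)
    for key in keys:
        embedded[key] = store_metadata[key]
    return write_embedded(file_bytes, embedded)


photo = write_embedded(b"{}\n<pixels>", {"title": "Wedding", "rating": 3})
meta = {"title": "Wedding", "rating": 3, "album": "Aunt Jean"}

# A legacy editor that knows nothing about the store changes a property in the file.
photo = write_embedded(photo, {"title": "Wedding", "rating": 5})
changed = promote(photo, meta)

# A store-aware client renames the photo; the file has to be brought in line.
meta["title"] = "Jean's wedding"
photo = demote(photo, meta, ["title"])
```

Even in this toy the awkward questions show up quickly: which side wins when both changed, and what to do with store-only properties (like `album` here) that the legacy format has no place for.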

In the early post-OFS days the big obstacle to making Integrated Storage a Win32 file system was the kernel/user mode transition problem.  An application would make a Win32 call that would end up running in kernel mode.  That would then call down into a user mode process, which itself could make a bunch of kernel mode calls to access the data.  Eventually you’d return the data back through kernel mode and back into the user mode process of the application that made the file system call.  It sounds slow.  And moreover it has the potential for deadlocks.

Another problem had to do with the optimizations Windows had made for dealing with network access to files.  For example, Windows had implemented the TransmitFile function for optimizing transmission of files from a web server by doing all the work in kernel mode.  It understood how to walk the allocation unit structure in NTFS in order to do this.  If one imposed a different or higher-level allocation structure on top of this, such as database blobs, then TransmitFile could no longer work as intended.  Dramatically reducing Windows’ ability to serve up web pages was considered a non-starter, particularly in an era when battles over web server market share were at their peak.

Even perfectly emulating all the file access capabilities of a Win32 file system would prove daunting.  A number of attempts at it were shown to achieve application compatibility in the high 90s percent range.  Sounds great, doesn’t it?  Well, one of the applications that used a highly idiosyncratic feature that was impossible to emulate was Microsoft Word.  It didn’t really matter if you hit 99.5% app compatibility if that 0.5% miss included the single most important application in the entire portfolio!

Just to finish up with describing how difficult this problem is I’ll mention the Windows boot path.  It was clear from the earliest post-OFS days, and after considerable discussion that would be repeated with each attempt at Integrated Storage, that you couldn’t put the new store in the Windows boot path.  Certainly not initially.  Once you accept that you can focus on when does the new store load and what facilities in Windows can take a dependency on it.  As you work through how a Windows system functions you can find many cases where there are things that should be using the new store, but they have to run in environments where the new store can’t yet be run.  I went through a lot of Excedrin in those days.

Of course if everything just uses your Integrated Storage solution as a Win32 File System then you won’t get much benefit out of it.  Better search (or maybe discovery would be a better description) being one of the things you might get, because part of the Win32 solution was the property promotion/demotion idea that I mentioned previously.  But you really want some clients that will natively use your Integrated Storage solution and take full advantage of it.  While those clients could be internal applications or customer (nee ISV) applications, having internal clients to work with is highly desirable.  Particularly if you want to establish your solution as part of the platform (that is, why would a customer rely on it if you aren’t using it yourself).  You need clients to know what tradeoffs to make in your design and implementation schedule.  Lack of real clients either delays, or completely tanks, adoption of a new service.

Finding appropriate clients to work with you on, and commit to using, a new Integrated Storage solution turns out to be a daunting task.  Their schedules, priorities, risk profiles, etc. do not necessarily match yours.  And yes, even the org structure can get in the way.    One alternative is to take the “Build it and they will come” approach.  We repeatedly considered, and rejected, that approach.  Another approach was to forget about internal clients and just work with a few close ISV partners (e.g., SAP) for the first wave of an Integrated Storage solution.  Again, considered but rejected (largely because this was a Windows platform initiative and not specifically a database product initiative).  When I get to the history you’ll see how this influenced the direction of Integrated Storage.

Also needed is a shipment vehicle.  If you want Integrated Storage to be a platform service then you need a way to ship it as part of the platform.  One can argue the definition of platform, for example Microsoft’s platform is more than just Windows.  However to achieve its vision, including having Windows use Integrated Storage internally and having ISVs be able to count on its presence on every PC and Server, you pretty much have to be part of Windows.  Alternate strategies look good on paper, and might have been acceptable as interim solutions, but in the end the goal was to build an Integrated Storage file system for Windows.

In Part 3 I’m going to talk about the different perspectives of the unstructured (File System), Semi-Structured (Office Document), and Structured (Database) worlds and how difficult it can be to marry these three world-views.  It will serve as a transitional piece that goes from explaining more of the difficulties in building an Integrated Storage solution to the history of Microsoft’s attempts at delivering a solution.

Posted in Computer and Internet, Database, Microsoft, SQL Server, Windows | 5 Comments

WinFS, Integrated/Unified Storage, and Microsoft – Part 1

People have been bugging me to write about Integrated Storage for some time, and with Bill Gates having just disclosed that failure to ship WinFS was his biggest product regret, now seemed like a good time.  In Part 1 I’ll give a little introduction and talk about scenarios and why you’d want an Integrated (also referred to as unified) Store.  In a future part (or parts) I’ll talk more about Microsoft’s specific history trying to tackle this problem and what I think the future holds.

To position myself in all this, of the five attempts that Microsoft made at directly attacking this problem I had a hand in three of them as well as helping with a lot of the ancillary strategy.  My last position before leaving Microsoft the first time was as the General Manager of what became known as WinFS, so I have a lot of insight into how it started but only limited second-hand knowledge about how it ended.

I’ve noticed that a lot of people on the periphery have made comments that they never understood what WinFS or, more broadly, Integrated/Unified Storage was about.  The common thread being that anyone listening to a description came away with the impression that it was about “search”.  Now maybe that is to be expected given the simplest scenarios that people presented.  In fact, maybe Bill was most responsible for this.

When trying to express his frustration over the multiple stores situation at Microsoft Bill would use an example of “I know I saw a spreadsheet a couple of weeks ago; when I want to find it again do I look in my file system or do I look in my email?”.  Bill was trying to make multiple points with this simple example, but the primary one was not that there should be a way to search across disparate stores.  His primary frustration was that spreadsheets were stored in many different places each with their own semantics, APIs, “contracts”, management tools, and user experiences.  If you can’t solve the simple problem that Bill expressed of knowing where to look, then how can you hope to solve the problems involved in complex collaborative information worker scenarios or interoperable multi-data type enterprise applications?

So making it easier to find information was a critical goal of any of the integrated storage efforts.  By the way, this should be no surprise as the Integrated Storage efforts grew out of the vision for “Information At Your Fingertips”.  Nor should it be a surprise that Bill was focused very much on end-user scenarios given the IAYF vision and Microsoft’s background.  At the time of the first integrated storage effort, Cairo’s Object File System (OFS), Microsoft had no presence in the enterprise server or apps space.  So many scenarios that drove integrated storage were end-user scenarios.  Often those were Information Worker scenarios, but sometimes they were Consumer scenarios.

A somewhat simple set of consumer scenarios, and one that was a big focus for WinFS, was around the storage of photos.  Let’s say you are on a trip and take a bunch of photos.  You take photos at the wedding you attended, and photos of your kids at Disney World, and photos of a launch from Kennedy Space Center, and some pictures late one night at the hot tub that no one but you and your spouse should see.  Now you transfer them to your computer and store them in the file system, but how can you organize them?  The file system provides very few tools for doing so.  They get stored with a meaningless file name, any given photo can be in only one place (and by default just as a collection from that download), and they have a fixed set of attributes that the file system knows about (e.g., creation date).  But you want photos that live in multiple places.  For example, you might want an album with pictures of Aunt Jean.  But you also want the pictures of Aunt Jean at the wedding to be in the wedding album.  You also want to share about 50 of the 500 photos you took (and make sure you don’t share any of the hot tub pictures).  How do you do that without copying the pictures to a separate share location?  Maybe you want to organize photos from all visits to Disney World together, but also keep them together by broader trip.

So integrated storage is about creating a rich organizational system.  One that isn’t tied to the rigid structure of file systems but rather to the organizational principles of the domain, application, and/or user preference.  Of course you also want to be able to find photos by far richer information than a file system stores in its metadata.  Perhaps tagged by the camera it was taken with or the person who actually took the shot.  Perhaps you want to query for photos taken within 50 miles of particular GPS coordinates.  And so on.  Thus search is very important and enabling rich searches based on semantics rather than simply pattern matching is important.
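As a rough illustration of what "organization by domain rather than by folder" means, here is a small Python sketch. It assumes an in-memory list of photo records with invented names and approximate coordinates; `Photo`, `in_album`, and `near` are hypothetical helpers, not part of any WinFS API. A photo belongs to any number of albums without being copied, and the GPS query uses the standard haversine great-circle formula.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Photo:
    name: str
    lat: float
    lon: float
    albums: set = field(default_factory=set)  # one photo, many albums, no copies


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in statute miles (haversine formula)."""
    radius = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def in_album(photos, album):
    return [p.name for p in photos if album in p.albums]


def near(photos, lat, lon, miles):
    return [p.name for p in photos if miles_between(lat, lon, p.lat, p.lon) <= miles]


# Coordinates are approximate and only for illustration.
photos = [
    Photo("wedding_017", 28.54, -81.38, {"Wedding", "Aunt Jean"}),
    Photo("castle_003", 28.4177, -81.5812, {"Disney World", "Aunt Jean"}),
    Photo("launch_001", 28.5729, -80.6490, {"Launch"}),
]

aunt_jean = in_album(photos, "Aunt Jean")
close_to_disney = near(photos, 28.4177, -81.5812, 50)
```

None of this is hard in an application. The hard part, as the rest of this post argues, is getting every other application that touches the same photos to see and maintain the same information.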

You can solve many of the problems I described for photos by putting an external metadata layer on top of the file system and using an application or library to interact with the photos instead of interacting directly with the file system.  And that is exactly how it is done without integrated storage.  This causes problems of its own as applications typically won’t understand the layer and operate just on the filesystem underneath it.  That can make functionality that the layer purports to provide unreliable (e.g., when the application changes something about the photo which is not accurately propagated back into the external metadata store).  And with photos now stored in a data type-specific layer it is ever more difficult to implement scenarios or applications in which photos are but one data type.

Let me cross over into the enterprise app space and talk about an Insurance Claims Processing scenario.  Claims processing is interesting for a number of reasons, the key one being that it was one of the first enterprise applications to really embrace the notion of multiple data types.  When you file a claim, for a car accident for example, it goes into a traditional transactional database system.  But each claim has an associated set of artifacts such as photos of the accident scene, the police report, photos taken by the insurance adjuster, photos taken at the repair shop, witness statements, etc. that don’t neatly fit into the classic transactional database.  Yes you can store these artifacts in a database BLOB, but then they lose all semantics.  Not only that, you have to copy them out of the database into the file system so that applications that only know how to deal with the filesystem (e.g., Photoshop) can work against them.  And copy them back.  That creates enormous workflow difficulties, introduces data integrity problems, and prevents use of functionality that was embedded in the photos storage application.

The claims processing scenario is one that demonstrates where the name integrated storage came from.  What you really want is for the same store that holds your transactional structured data about a claim to hold the non-transactional semi-structured artifacts, and not just as blobs.  You want the semi-structured artifacts to expose their metadata and semantics to the application, or applications, built on that store.  As soon as you do that the ability to create richer apps, and/or use the data in complex information worker scenarios, climbs dramatically.

Rather than just using the photos as part of processing a specific claim they now become usable artifacts for risk analysis, fraud analysis, highway planning, or any number of other applications.  Data mining applications could run against them seeking patterns that weren’t captured in the transactional data.  Indeed all kinds of linkages could be made amongst the photos, police reports, etc. that just aren’t possible from the transactional data alone.

The multi-data type scenarios are huge in the information worker world and we’ve developed numerous application level technologies to deal with them.  OLE, for example, allows you to embed one Office data type within another.  ODBC started out life as a way to bring structured data into Excel.  But these application-layer solutions have significant flaws.  They basically use an import model and you generally aren’t looking at the actual data but rather at a snapshot.  And you’ve probably discovered times where it was impossible to refresh a document with current information because you didn’t have access to the location where it was stored.  Imagine submitting a settlement brief in a legal case to the judge with the numbers being out of date because of the complex series of steps from an ODBC query populating an Excel spreadsheet that is then embedded in a Word document and somewhere along the line something didn’t update.  This could be a disaster.

Even organizing data for information worker projects is difficult.  Imagine you are building a proposal for a new business.  How do you organize and control all the artifacts amongst a set of people working on the project?  SharePoint will do this for you, by creating another store on top of underlying stores.  Each application must understand how to work with a SharePoint-like document management system (DMS), or the end-user must use a checkin/checkout system to copy artifacts from the DMS into the filesystem and then put them back.

How about another simple task, like setting up a video conference between a few people in your company and a few at a customer?  Contact information about your peers is stored in your company’s Exchange Server and the scheduling is done via Outlook, but your customer contacts are stored in a CRM system.  Working with the different sets of contacts can be painful, often involving cut and paste rather than seamless operation.  And this is a case where the CRM vendors actively work to integrate with Outlook.  Imagine you have a CRM system that hasn’t written a specific Outlook extension.  Where the names of common data elements aren’t the same.  And when they are the same, where the data formats for them differ.  Today we largely treat contacts as an MDM problem, with problem being the operative word.  For example, I recently noticed that one of the email addresses I have for Microsoft’s Dave Campbell is actually the email address from another of our former DEC colleagues.  Another Dave.  Some tool mistakenly merged it into my contact record for Campbell.

Finally let me give a system management scenario.  Many systems that need to combine structured (i.e., typical database data) and semi-structured/unstructured data (e.g., a photo or document) do so by having the database contain a pointer (e.g., URI) to the unstructured data.  How do you back up and restore this data in a consistent manner?  Imagine going to repair an aircraft and finding that the diagram associated with the area you are working on is out of sync with the database that contains information on the set of changes that have been applied to that specific aircraft.  Without a storage system that can be the primary store for structured, semi-structured, and unstructured data types you always have the situation of being unable to manage the collection of data that make up an application as a unit.
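To show how the pointer approach drifts, here is a small Python sketch under stated assumptions: the database rows carry a URI plus a SHA-256 digest of the file they expect to find, and the files live in a separately backed-up store. The row layout, the URIs, and `find_mismatches` are all hypothetical. The digest check detects the inconsistency after a restore; it does nothing to prevent it, which is the gap an integrated store is meant to close.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).hexdigest()


def find_mismatches(rows, files):
    """Compare database pointers against a restored file store.

    rows:  list of dicts with 'uri' and 'sha256' (what the database expects)
    files: dict mapping uri -> bytes (what the file backup actually holds)
    """
    problems = []
    for row in rows:
        blob = files.get(row["uri"])
        if blob is None:
            problems.append((row["uri"], "missing"))
        elif digest(blob) != row["sha256"]:
            problems.append((row["uri"], "stale"))
    return problems


# The database backup was taken after revision B of the diagram was recorded...
rows = [
    {"uri": "file:///diagrams/wing-panel.svg", "sha256": digest(b"diagram rev B")},
    {"uri": "file:///diagrams/tail.svg", "sha256": digest(b"tail rev A")},
]

# ...but the file backup was taken earlier and is missing a newer file entirely.
restored_files = {"file:///diagrams/wing-panel.svg": b"diagram rev A"}

mismatches = find_mismatches(rows, restored_files)
```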

So what is Integrated Storage?  It is taking the storage concepts necessary to address these kinds of scenarios and moving them from the application layer, where each application addresses them individually, into a storage layer where they are addressed in a common way.  It is a storage system that provides rich and flexible organization, sharing, extensibility, discoverability, control, and manageability across the entire spectrum of data types that need to be stored.

At Microsoft Integrated Storage has repeatedly shown up positioned as a new file system (e.g., WinFS), which many see as a pejorative.  There are hints of why you’d want to do this at the file system level in many of my scenarios.  So I’ll start off Part 2 by drilling in to why this is, and why it has been the pivot point on which all attempts to create an Integrated Storage system have failed.

And for those who found this section to be too much rambling I apologize.  If I were doing this as a formal paper or presentation I’d go through scenarios first in a more pure form and then get into problems with current solutions.  But this is a blog, so you get to live with stream of consciousness and my time constraints on cleaning it up.

Posted in Computer and Internet, Database, Microsoft, SQL Server | 9 Comments

Quick note on Surface Pro being sold out

I have no idea how many Microsoft had available yesterday but I did want to make a few observations:

  1. Microsoft Stores had been giving away Surface Pro reservation cards for the week before availability.  My local store ran out of reservations this past Wednesday or Thursday I believe.  Most likely the bulk of their inventory was thus pre-committed and there were few Surface Pros available for walk-ins.
  2. I saw one report that Best Buy had allowed reservations with purchase of a $50 gift card.  I’m not sure if that is the case, but if so it could mean that their inventory was also largely pre-committed and not available for walk-ins.
  3. Reports of dismal supplies yesterday are based on anecdotal conversations with Best Buy and Staples employees.  But it isn’t clear how much inventory Microsoft would have committed to each store (or rather, how much each of those stores would have ordered).  Performance of the Surface, or of other tablet devices, at those chains might have suggested sales rates that dictated a steady stream of a low number of devices over having large initial stocks.
  4. In my discussions with Microsoft Store employees last week I got the impression that they expected to have adequate supply on hand to meet first day walk-ins.  That suggests demand was greater than expected, though of course each store could have had fewer devices on hand than they were expecting.
  5. That microsoftstore.com sold out and doesn’t allow backorders suggests to me that considerable additional supplies are being targeted at retailers.  My thinking here is that Microsoft would be concerned that if they allowed backorders then they would be “soft”, meaning that buyers would keep checking retailers and as soon as they got their hands on the Surface Pro they would cancel the online order.   That would create an inventory imbalance problem for them.  And it is a hint that Microsoft is focused on making the Surface Pro a success in traditional retail in the short-term over moving product through its most profitable channel (and perhaps alienating those retailers).

 

Posted in Computer and Internet, Microsoft | 3 Comments

Microsoft is, and deserves to be, judged by a different standard

The Microsoft Surface Pro went on sale yesterday and immediately sold out, leading many pundits and other observers to declare the launch a failure.  Meanwhile the Surface RT has sold an unknown number of units (because Microsoft won’t reveal actual numbers) but let’s use the estimate of 1 million that was popular for a while.  Oh, it isn’t popular right now because any time an analyst, any analyst, speculates on a lower number people love to glom onto that.  Even at 1 million the Surface RT is considered a dismal failure by pundits.  At the same time Google’s Nexus 4 smartphone, far cheaper (e.g., $50 with a two-year mobile plan commitment) and available at far more retail outlets than the Microsoft Surface, took a few weeks longer than the Surface to hit 1 million units.  And it is considered a runaway success!  You see the Nexus 4 is supply limited.  But wait, so is the Surface Pro and that is a “failure”.  And how about that iPhone 5 introduction?  My wife waited weeks to get her hands on an iPhone 5, because they were sold out from the moment of claimed availability.

Doesn’t it seem like Microsoft is being judged by a higher standard than the rest of the industry?  They are.  And to a surprising extent, as frustrating as it is, it is fair.  Apple has nothing to prove.  Google has nothing to prove.  Amazon has nothing to prove.  Microsoft has a lot to prove.  In the court of public opinion, or at least pundit opinion, Microsoft is expected to have big runaway success stories before it can leave its 20th century legacy behind and deserve to be uttered in the same breath with Apple, Google, and Amazon.

For pundits to have declared the Surface a success it would have needed a blowout introduction on the order of the Kinect.  Recall the Kinect sold 8 million units in 60 days and was the blowout consumer electronics product of the 2010 holiday season.  That is the standard by which all Microsoft product introductions are now measured.  And it is a tough standard to meet.

While Microsoft would love to see products like the Surface, Surface Pro, Windows 8, and Windows RT getting love from pundits and sales that blow away all expectations, that isn’t their central focus.  They know they are running a marathon, not a sprint, and that what really matters is where they are in two, three, four, or more years.  They know they could have goosed the short-term results for the Surface RT by using a lower price and making it more broadly available.  The headlines would have been great as many millions of units (assuming they overcame supply constraints) were purchased.  With feverish demand, and a clear “winner”, being constantly sold out would have then been a plus.

Unfortunately any blow-out short-term success would have come at a high price.  It would have irreparably damaged the OEM channel.  It would have set a precedent that Microsoft was a vendor of low-price rather than of high-value devices.  With low price comes the risk of a “race to the bottom” against commodity device manufacturers and the inability for Microsoft to ever make money selling devices.  Microsoft needs to make money, Apple-like money, if it hopes to be in the devices business in the long-term.  Moreover, a key purpose of having its own devices is to bring its latest innovations and viewpoint to market.  Low price means low cost, and low cost is the antithesis of new technology introduction.

I know some are thinking it is silly for Microsoft to have worried about setting a precedent by using a low price to ensure quick adoption of the Surface and Surface Pro, but Microsoft worries about precedent a lot (in many different areas).  It is far easier to lower prices than to raise them.  The time will come when Microsoft decides it is appropriate to lower prices, either directly or by repackaging (e.g., always include a Touch Cover with the Surface RT at the current tablet-only price; or eliminate the 32GB version and sell the 64GB version for the 32GB version’s price).  Meanwhile it will have established its position as a vendor of premium devices and retained its ability to target the market segments that most interest it.

Precedent also plays a huge role in why Microsoft has avoided giving out numbers for Surface, Surface Pro, and Windows Phone 8 unit sales.  It doesn’t matter if they are happy or unhappy with the numbers, nor how good or bad the numbers are, they are trying to avoid being drawn into the numbers game.  Once they start disclosing weekend, monthly, quarterly, or any other absolute numbers the expectation is that they’ll continue to do so on a regular basis.  And that these numbers would then dominate every discussion they tried to have about products.

Let’s face it, if they confirmed numbers that were low then it would just add to the damage caused by speculating they were low.  If they announced numbers that were at or slightly above expectations it wouldn’t help them (and the headlines would still shout out “mediocre” or “modest”).  The only really helpful number would be something crazy high, and that would show up in so many other metrics that Microsoft wouldn’t need to confirm them.  The frenzied speculation would do the positive PR job for them.  So they have chosen not to play the game.

In the short-run Microsoft’s approach means taking a lot of body-blows in the press and blogosphere and risking slower adoption rates as a result.  In the long-term Microsoft’s success or failure in its approach to the “post-PC era” will become evident and, in the case of success, it will have changed the nature of the conversation.  Perhaps not only meeting the higher standards to which it is held, but setting a new standard by which Apple, Google, Amazon and others are judged.

So for now all we can do is be frustrated by Microsoft being held to different standards than others.  And wait for the day when we can look back and, hopefully, correct the perceived injustice.

Posted in Computer and Internet, Microsoft | 21 Comments