SQL Server 2014 Delayed Durability/Lazy Commit

I am having a lot of fun watching everyone get excited over SQL Server 2014’s Delayed Durability feature, mostly because I invented it back in 1986.  At the time no one was particularly excited by the idea.  It’s possible someone invented it before me, but I never found any evidence of that.

Not long after taking over as Project Leader for DEC’s Rdb product I was looking at ways to address one of its performance bottlenecks: the wait to flush the log on a commit.  For those not schooled in database technology, a key part of ensuring no data is lost on a system failure (a.k.a. durability) is to require that changes be written (forcibly if necessary) to the log file before you acknowledge that a transaction has been committed.  The log file is sequential, so writing to it is enormously faster than writing changes back to the database itself.  But you still have to wait for the write to complete.  The architecture of Rdb 1.x and 2.x did not allow for what is now known as Group Commit, or a number of other techniques for speeding up commit.  Further, each database connection had its own log, so even log writing typically required a seek (i.e., it was still random rather than sequential), limiting throughput and typically imposing a 100+ ms delay on commit.  On heavily loaded systems I remember this climbing to 250 ms or more.  Since we couldn’t implement Group Commit in a minor release, I was thinking about other answers and had a revelation.

For many applications a business transaction (e.g., add a customer) is actually a series of database transactions.  From the application perspective, the customer add is not complete until the final commit of the series of database transactions, and thus they already have (or could easily be written to have) recovery logic that deals with failures (e.g., system crashes) between those individual database transactions.  In effect, the durability of those individual database transactions was optional, until the final one that completed the business transaction.

With this in mind I went and prototyped Delayed Durability as an option for Rdb transactions.  On Rdb it was quite simple to implement and I literally had it working in one evening.  But these were short turn-around releases, I was treading into another team’s area (the KODA storage engine), and there just wasn’t time to finish productizing it.  So a couple of weeks later I pulled out the change and in Rdb 3.x we started working on other (app transparent) solutions to the synchronous commit performance problem.

Now jump forward to 1994 after I’ve joined Microsoft.  There is somewhat of a battle going on between the team working on the first version of Microsoft Exchange (nee 4.0) and the JET Blue storage engine team over performance issues Exchange was having.  Because I was new to Microsoft and had no biases I was asked to look into Exchange’s performance problems.  That was quite the experience but I’ll limit this to just the relevant story.  I learned that to send an email the client did a series (30-40 pops into mind as typical) of MAPI property set calls.  And each one of those calls was turned into an individual database transaction.  Which of course meant 30-40 synchronous log flushes per email message!  No wonder they were having significant performance problems.  While my major recommendation was that they find a way to group those property sets into a single transaction, I had another trick up my sleeve.

After confirming that Exchange was fully designed to handle system failures between those MAPI Property Set “transactions” I suggested to Ian Jose, the Development Lead for JET Blue, that he implement Delayed (I think I called it Deferred at the time) Durability.  The next day he told me they’d added it, and so to the best of my knowledge the first product to ship with Delayed Durability was Exchange 4.0 in April 1996.  A full decade after I first proposed the idea.  Of course that wasn’t visible to anyone except through the greatly improved performance it provided.  But still I was quite proud to see my almost trivial little child born.

With SQL Server 2014 shipping Delayed Durability as a user visible feature my little child is finally reaching maturity.  It only took 28 years.
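For anyone who wants to try it, the SQL Server 2014 syntax is straightforward.  A minimal sketch (the database and table names here are just placeholders):

```sql
-- Let individual transactions opt in to delayed durability.
-- (FORCED would apply it to every transaction in the database;
--  DISABLED, the default, turns it off entirely.)
ALTER DATABASE SalesDb SET DELAYED_DURABILITY = ALLOWED;

BEGIN TRANSACTION;
    INSERT INTO dbo.AuditLog (EventText) VALUES (N'customer added');
-- Commit returns before the log records are flushed to disk;
-- a crash in that window can lose this transaction.
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);
```

Note that the trade-off is exactly the one described above: the application must be prepared to redo any transaction committed this way if the system fails before the log is flushed.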

Update: My friend Yuri pointed out that Oracle implemented Asynchronous Commit in Oracle 10gR2 in 2006.  So it only took 20 years, not 28, from my invention until the feature appeared in a commercial product.

This entry was posted in Computer and Internet, Database, Microsoft, SQL Server. Bookmark the permalink.

22 Responses to SQL Server 2014 Delayed Durability/Lazy Commit

  1. Joseph Williams says:

    Hal, you bust me up! Nicely told.


  2. Bob - Former DECie says:

    Thanks for telling this story.

  3. Yuri says:

Does Oracle have this feature in their database? Was it added in 10g?

    • halberenson says:

      Ok, looks like Oracle implemented it in 10gR2 shipping in 2006. So it only took 20 years from when I invented it until it appeared in a commercial product, not 28!

      • Yuri says:

        …. and I would not be surprised if this feature was designed/coded by one of the ex-DEC folks on the Oracle RDBMS team, familiar with Rdb/VMS !! 8^)

        • Bob - Former DECie says:

          I agree.
          I think the sale of Rdb to Oracle finally convinced me that there was no hope of DEC ever recovering. I waited around awhile after that hoping to get TFSO’d, but then left on my own when I decided that getting on with my career was more important than any 5 figure severance pay.

          • halberenson says:

            The sale of Rdb was one of the shoes that dropped in my deciding to leave. I was told of the negotiations with Oracle months before the Rdb team learned of the sale because I started spouting off to Bill Strecker and others that we should spin it off into its own company. To shut me up I was brought in on the secret. I wasn’t interested in working for Oracle, though I would have tried to go with Rdb had it been spun off.

            My job wasn’t databases specifically at the time, it was all of Production Systems (i.e., Enterprise) software and I’d relocated back east specifically to help save software at DEC. So the Rdb sale, as painful as it was, wasn’t a completely fatal blow. The other shoe that dropped was that a group of us known as Strecker’s Pizza Cabinet had devised a new software strategy. Bill bought into the new strategy, but then put an old guard hardware guy in charge of the most important elements. He immediately veered off course and destroyed its viability.

The nail in the coffin was that we had a strategy to put our software on other platforms so that the efforts were financially viable. We had a deal just about signed for HP to adopt ACMS as their TP Monitor. A new VP of the Software Business (i.e., marketing) was hired from IBM, and in our first meeting with him he laid down the law: it was OK to sell software on other platforms as long as it didn’t help our key competitors. That killed the HP deal and made the software business non-viable. Roger Heinen had approached me a year earlier about coming to Microsoft, and I went home that night and called him to tell him I was ready.

The Microsoft strategy was remarkably similar to the one that we had proposed (and in theory at least gotten approval for) to Strecker. Except they were actually going to do it. So I resigned from DEC to join a real software company.

            • Bob - Former DECie says:

              Wow! Another ex-IBMer helped kill DEC. I remember when DEC hired an ex-IBMer, I’ve forgotten his name, to head up sales. He went around the country telling all the sales folks to sell more or they were gone. The entire field staff, not just sales, was ready to revolt after that.

  4. rule30 says:

    Hey there

    I recently did a small article on performance testing with Delayed Durability switched on. Any feedback welcome!


    Best regards

    • halberenson says:

      I tried to post a comment to your blog, but it gave me an error. So here it goes again.

Your performance test is pretty much a best-case scenario for Delayed Durability. First, it is a single-stream test case, so it pretty much defeats all the log optimizations that have been done over the years. With Group Commit the log flush for User X will also flush the log for Users A, C, D, M, Q, etc. There are even optimizations like delaying the flush for User X by n ms to see who else comes along with a log flush of their own. The result is that really only 1/t transactions actually see a delay from the log flush. An oversimplification, but hopefully you get the point.

      The second thing is that single statement default transactions have always been the worst case scenario since you have to flush once per statement. Thus Delayed Durability would yield the best performance gain with those. But this also isn’t a rational model for most apps to use, especially those that require logical consistency or impact physical world events (your example) such as moving money. It is a rational model for bulk loading of data, which is a place I see Delayed Durability being used rather extensively.

      A good but still reasonably simple performance test would be something like order processing. In one transaction create the order header, with full durability. Then in individual transactions with delayed durability add the line items. Then in a transaction with full durability finalize the order. If you do this in a single stream you are simulating processing orders coming to you from an external feed. Then you could run several streams in parallel to get a multi-user read on performance. Obviously still not like running a TPC-C class benchmark, but still it would give some decent data about the kinds of scenarios where I’d put Delayed Durability into practice.
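A sketch of that pattern in SQL Server 2014 syntax (the table and parameter names are invented for illustration):

```sql
-- Order header: fully durable, so the order's existence survives a crash.
BEGIN TRANSACTION;
    INSERT INTO dbo.OrderHeader (OrderId, CustomerId, Status)
    VALUES (@OrderId, @CustomerId, 'Open');
COMMIT TRANSACTION;  -- default full durability

-- Line items: delayed durability; on restart, recovery logic for any
-- order still in 'Open' status re-adds whatever line items were lost.
BEGIN TRANSACTION;
    INSERT INTO dbo.OrderLine (OrderId, LineNo, Sku, Qty)
    VALUES (@OrderId, 1, @Sku, @Qty);
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);
-- ...one such transaction per line item...

-- Finalize: a fully durable commit, which also hardens the earlier
-- delayed commits since the log is flushed sequentially up to this point.
BEGIN TRANSACTION;
    UPDATE dbo.OrderHeader SET Status = 'Complete' WHERE OrderId = @OrderId;
COMMIT TRANSACTION;
```

Run one such stream to simulate an external order feed, then several streams in parallel for a multi-user read on performance.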

      • rule30 says:

        Hey matey

        Yeah thanks for letting me know about the comments thing – I’ll look into that tonight. Totally agree that the perf test I used was very naive.

        There are a couple of types of tests that I would love to see but don’t have the time to set up yet.

        One would be to actually get a number of different clients streaming more realistic transactions with cross dependencies at the server (similar to what you are saying I think)

        The other would be to have the log storage getting stressed by transactions being committed to multiple log files. I’d also like to see some numbers against an actual enterprise class storage array – my tests were just against a reasonably powerful laptop!

        I’m short on time at the moment but might be able to sort out some slightly less basic tests involving multi-table updates and more realistic workflow patterns



  5. Bob - Former DECie says:

    When do we get part 3?

  6. Remi says:

It seems that our recollections differ – Jet Blue had lazy commit before you suggested it.

  7. Pingback: SQL 2014′s link to Exchange 4.0 | Thoughtsofanidlemind's Blog

Comments are closed.