Hal's (Im)Perfect Vision

ARM in the Cloud

Posted on December 3, 2019 by halberenson

I know I’m long overdue on a “Part 2”, but wanted to slip this in first. I’ve long been a skeptic on ARM becoming a mainstream processor choice for servers. But today’s announcement by Amazon Web Services of the ARM architecture Graviton2 processor and the M6g, C6g, and R6g instance families has me rethinking my position. As is often the case I’ll start with some historical perspective and then discuss today’s AWS announcement.

In the late 1980s and early 1990s it was widely believed that RISC (Reduced Instruction Set Computer) architectures would replace CISC (Complex Instruction Set Computer) architectures such as the then leading VAX, x86, IBM 360/370,Motorola 68000, etc. instruction set architectures (ISA). The reasons for this were two-fold. First, it was believed that in any given semiconductor process technology a RISC processor would have 2x the performance of a CISC processor. This was largely because with CISC you were devoting an ever increasing percentage of the available transistors to overcoming the inherent bottlenecks of the ISA, while with RISC those transistors could be devoted to increasing performance. Second, the complexity of designing CISC ISA processors had reached the point where the semiconductor technology could advance more quickly than you could design a processor for it, so you were always behind the curve of taking advantage of Moore’s Law. RISC ISA processors were easier to design, and thus would better track the semiconductor process improvement timing.

One thing to keep in mind was the original RISC concept was to create a new ISA every time you made a new processor. So you never really had to waste silicon on legacy, you did a new optimal ISA for each processor and made it a compiler problem to re-target to each processor. Of course software people quickly made it clear to the hardware folks that this was a non-starter, that the cost of supporting new processors effectively (finding and fixing ISA-specific bugs, tuning performance, issuing patches to released software, etc) would outweigh the performance improvements of not having a fixed ISA. So we moved on to fixed RISC ISAs that would survive through multiple generations of processors. By 1995 RISC was well on its way to world domination. IBM, Apple (Mac), and Motorola had moved to Power ISA. DEC moved to Alpha and HP to HP/PA. Acorn Machines was a PC manufacturer that created its own RISC processor (the Acorn RISC Machine) and operating system (RISC OS). Acorn would later shift its focus away from PCs to its RISC ISA, dropping “Acorn” in favor of “Advanced”, renaming the company ARM, and licensing its architecture and designs to various companies. Other RISC chips also appeared including the Intel i960 and the MIPS line. MIPS in particular looked that it would become “the Intel” of RISC processors, though it would eventually falter. And as we now know, ARM would be the only RISC ISA to really thrive, by riding the growth of the market for mobile devices. But at the start of 1995 it looked like we were going to have RISC everywhere.

So what happened in 1995? The Intel Pentium Pro. The Pentium Pro could hold its own on performance with that year’s RISC chips while maintaining full x86 compatibility. How did Intel do it? First off they clearly had made advances in chip design tools that let them move much faster than other companies working on CISC. And they adopted an approach of compiling CISC instructions into RISC-lie ROPS and then making the rest of the processor work like a RISC processor. But maybe more importantly, they had a generational lead on introducing semiconductor manufacturing processes. So even if the assumption that in any given semiconductor process technology RISC would be 2x CISC held, Intel being a process technology generation ahead negated that RISC vs CISC advantage.

Intel’s process technology advantage held for twenty years, allowing the x86 to extend its dominance from the desktop to the data center. With the exception of Power, which IBM continued to advance, RISC ISAs disappeared from the server and desktop world. But RISC had another advantage, its simplicity made it easy to scale down to smaller low-power microprocessors for embedded and mobile applications. Today pretty much every mobile phone and mainstream tablet uses a processor based on the ARM ISA.

A number of years ago ARM and its semiconductor partners began trying to bring RISC back to the server and PC markets where it was originally expected to dominate. On the PC front ARM has made some limited progress, particularly with ChromeOS PCs and more recently in Windows PCs such as Microsoft’s Surface Pro X. But so far that progress represents a tiny portion of the PC business. In servers we’ve seen several abortive efforts and essentially no adoption. Until now.

Last year Amazon Web Services introduced an instance family, the A1, based on an ARM ISA processor of its own design called Graviton. Side note, in most cases (Apple is the counter-example) semiconductor designers license not the ARM ISA but an actual “core” design from ARM. That is the case with AWS. This was a pretty niche offering, and (to me at least) signaled likely another failed attempt to bring ARM to the mainstream server market. For example the A1 was not something you could benchmark against their Intel-based instances and end up with a direct comparison. It was more niche targeted.

Today AWS brought their second generation ARM processor, the Graviton2, to its three most mainstream instance families. Those are the M (general purpose, or balanced), C (compute intensive), and R (memory intensive) families and we now have the M6g, C6g, and R6g families. They even did some performance comparisons of the M6g against the Intel Skylake SP-powered M5. And they were quite favorable to the M6g. But Skylake SP is an older Intel generation, and a comparison with the Coffee Lake SP and AMD’s Rome would be more telling. These have already made their way into some models in the C5 and C5a families. Intel is also accelerating its product cycles so I expect it to regain a performance lead, though perhaps not enough to deter the growth of Graviton. Graviton is likely to retain a price/performance lead in any case.

So what happened to allow ARM to (apparently) catch up to Intel in the data center? I think there are three factors at play. First, recall the original RISC premise that in any given semiconductor process technology RISC should be 2x CISC performance and that this turned out not to matter with the x86 because Intel was a generation ahead on semiconductor process. Intel no longer has that generational advantage, and by some measures (e.g., smallest feature size) is behind semiconductor foundries such as the one AWS uses, TSMC. The second factor is we have a modern, some might say the most important modern, “system” vendor, AWS, leading the charge. Instruction Set Architectures have tended to thrive when backed by a powerful system vendor, not as pure artifacts of the semiconductor industry. x86 is the dominant PC and Server chip today because of selecting it for the IBM PC. ARM’s success came from DEC adopting it to create the StrongARM family, which was the dominant processor used in PDAs and early smartphones. Even ARM originator Acorn used StrongARM in its systems. Earlier dominant ISAs came from the system vendors themselves, particularly DEC and IBM. Now, just as DEC boosted ARM into dominance in the mobile device market, it looks like AWS will do the same for servers. Third, because AWS can optimize the ARM architecture and licensed core into its own chips for the controlled environment that is the AWS Cloud it can tune chip designs far more than someone trying to create a general purpose offering for on-premise servers.

So is it game over for the x86? After decades of watching, and working with, Intel I doubt it. Though it isn’t just AWS that Intel has to worry about. If AWS starts to show real success with Graviton than Microsoft and Google will be empowered to go full-bore with ARM in the cloud as well. And then there is the persistent rumor that Apple wants to move the Mac to ARM processors of its own design. With failed efforts to break into the mobile device market, pressure from all corners growing in the PC market, and now an apparently competitive mainstream ARM offering in the server market, Intel can’t stay on the modest improvement at high unit cost bandwagon much longer.

Posted in AWS, Cloud, Computer and Internet | Tagged ARM, Graviton2 | 2 Comments

Endings and Beginnings (Part 1 – AWS)

Posted on August 13, 2019 by halberenson

Last week’s announcement of Amazon Aurora Multi-Master being generally available marked a kind of ending for me. It also served as a reminder that I haven’t written anything about my new venture, Gaia Platform. So nearly two years after I tried (and once again failed) at retirement, let me wrap up my Amazon Web Services (AWS) adventure and tell you about my new one.

The lure of working on databases for a new computing era, The Cloud, is what drew me out of semi-retirement and to AWS. I was running Amazon Relational Database Service (RDS), parts of Amazon Aurora (it’s complicated in that I had the control plane and product management reporting to me, and then the Aurora PostgreSQL project fully reported to me, but my peer Anurag Gupta owned Aurora MySQL and the Aurora storage system and is the father of Aurora; I get embarrassed when assigned credit that rightfully belongs to Anurag), the Database Migration Service (DMS), Performance Insights, and a few things that aren’t externally visible (e.g., the DBAs for AWS’ control plane databases, an operations team for a bunch of services in AWS under the CIA’s Commercial Cloud Services (C2S) program).

There were a lot of challenges in this new role for me and I relished them, even when I struggled. For example, I’d always forced my hand into the business side of the products I worked on but never had actual responsibility for the business. At AWS I owned the relational database business. While confidentiality considerations keep me from talking actual sizes, it was one of the largest AWS businesses and the fastest growing of those larger businesses. In the weeks before I stepped down we passed one of the (even) more household name services in revenue. I have no real idea on the current business size, but doing some very conservative projections it must be an unbelievably big business today. What amazes me when I look back on the experience is not that they trusted me with engineering and operations for RDS, I had the track record to suggest I could succeed at that, but that they trusted me with such major business responsibility. That turned out to be an incredible career highlight for me, and I thank Andy Jassy, Charlie Bell, and Raju Gulabani for giving me that opportunity. Particularly Raju, because I know the only way it happened is because he committed to have my back.

After three years I announced I was going back into retirement. For those who don’t know, I live in Colorado and commuted every week (early Monday morning out, late Friday return) to Seattle. That I sustained it for three years is only a bit of a mystery to me, that my wife survived three years of it is amazing! But neither of us could sustain it longer, and didn’t want to move to Seattle. Plus we had some family things to take care of. There is more to this story, and we almost found a way where I would keep working for Amazon part-time. But I realized I’d never contribute to Amazon in a way I found satisfying as a part-timer. So after a few months I pulled the plug on a staged retirement and did a cold-turkey retirement. Or so I thought. Once again a little credit here, I couldn’t have worked from Colorado without Raju having my back (i.e., in 2014/2015 Amazon literally did not allow people to do work when in Colorado, so Raju had to cover for me if there was an operational issue that needed VP involvement over a weekend), and he was the one who proposed a staged retirement.

So why was last week’s launch of Aurora Multi-Master a good end point of the AWS story for me? My one major regret from my days at Microsoft had been that we never shipped a “single system image”, multi-master, SQL Server clustering solution. When we did the original planning for building our own database business (out of the ashes of Sybase SQL Server) we’d put clustering in our 3 version plan. Yukon (SQL Server 2005) was supposed to include a single system image clustering solution. By single system image I mean that an application can talk to and update a database on any node in the cluster completely transparent to the fact that the database is distributed over multiple nodes. In other words, it looks just like you are talking to a single system. That’s what we’d done at DEC with Rdb (conceptually copied by Oracle to become RAC). Others had done variants as well, but after a burst of energy in this space in the 80s to mid 90s, vendors (except for Oracle) lost interest. The SQL Server team made a number of stabs at it, but they always faltered in the wake of either higher priority work or technical challenges. Doing single-system image is hard. So sharding, or dropping some of the transparency (Spanner is an example), or going to NoSQL models that had far fewer transparency demands, became alternate answers. I’ve been away from Microsoft for almost 9 years, and the SQL Server team for over 15, and SQL Server (or Azure SQL) still doesn’t have a single-system image clustering solution. But with AWS, that was once again the vision for Aurora.

I didn’t get to be the one that built Aurora Multi-Master, and I’m fine with that. When driving back from an Andy Jassy OP1 offsite in the summer of 2015 Anurag and I talked about single-system image clustering and how desperately we both wanted to see it done. No matter how we rejiggered the organization structure over time, we would make this happen. Anurag got to drive it, although he too left AWS before Aurora Multi-Master GA, it’s done now. Oh they have plenty more to do to complete the vision (e.g., multi-region multi-master), but the solution is out now. Take your credit card and go give it a try. From my standpoint there is always a ton more to do in meeting customer database needs in the cloud. But in terms of a feeling of completeness and ability to move on, having Amazon Aurora Multi-Master available lets me focus on what other interesting problems there are out there. I’ll talk about that in part 2.

Posted in Amazon, Aurora, AWS, Cloud, Computer and Internet, Database, Microsoft, RDS, SQL Server | 4 Comments

Ad-blocker Wars

Posted on May 31, 2019 by halberenson

About a year ago I wrote my Adblockers are the new AntiVirus piece. In the intervening period the war between ad blocking and web sites that depend on advertising has gone exponential. Many sites put up a warning asking you to unblock ads on their site, others block access entirely. And now Google, the tech company almost entirely based on serving ads, is using their control of the dominant web browser, Chrome, to limit ad blockers. Make no mistake, I am OK with the concept of advertising on the web. It is a great way to democratize access to content, whereas pay walls (however appropriate in many situations) limit information flow. But as I wrote in the earlier piece, as long as advertising remains a huge channel for distributing malicious content I will be blocking it. Because I refuse to white list them there are several web sites that I can no longer access, but it is a small price to pay for better security and privacy. On the positive side for some, there is one site I found valuable enough to pay for access rather than allow ads. But just one so far, and it was a very small charge.

While I use all three major browsers to some extent, Firefox remains my primary browser. That’s partially because it offers the most options for incorporating ad-blocking and other filtering options. It even has a built-in content blocker, though you must know to configure it to use for general browsing. One of my favorite Firefox features is that it allows you to specify a DNS server to use independent of what your system is set to use. So my family notebook computers have Firefox set to use Quad9‘s malware-filtering DNS no matter what network they attach to, without having to manually change network settings each time (when on our home network our router is set to use Quad9). I could use the same mechanism to point to an ad-blocking DNS.

Of course ad-blocking extensions for browsers are insufficient, and with Google limiting their capabilities on Chrome, are becoming the wrong point in the technology stack to block ads. There is also the problem of non-browser applications that bypass the extensions, as I talked about in last year’s entry. Fortunately there are other options. Ad-blocking DNS may be the easy and free alternative, with AdGuard DNS currently the leading option. Some routers also offer built-in ad-blockers, though they may be part of a paid service. For example, the eero Plus service for eero routers supports ad blocking. That feature has been available for years, but is still labeled as being in beta, so caveat emptor. For those who like to really hack, you can download new firmware such as Tomato or DD-WRT to your router, or build your own Pi-hole. I keep getting tempted to add a Pi-hole to my network, but it is down a long list of things I may never get time to do. More consumer-friendly hardware solutions such as the little known eBlocker are available. I suspect as this category grows the mainstream vendors will increasingly include ad blocking options on new routers, which is great because my experience with whole-home devices that sit beside the router is decidedly poor.

There are also paid system-wide solutions. I’ve mentioned AdGuard for Windows before, but still haven’t given it a serious try. There is also a version for the Mac. I did pay for AdGuard for iOS Pro, which can perform adblocking across an iOS device rather than just in Safari. Don’t confuse this with the free AdGuard for iOS, which is a Safari extension. Not that it too isn’t a good adblocker.

And then there is Microsoft (and Apple, but I don’t follow MacOS developments). It is unclear how Microsoft’s adoption of Chromium as the basis for Edge will be impacted by Google’s latest change to Chrome. Will Microsoft follow Google’s lead,or continue to support a fully featured interface for ad blocking extensions? Microsoft abdicated its leadership role in this space when they failed to move the Tracking Protection List feature forward from Internet Explorer into Edge. They could either return to leadership by adding new features to the Chromium-based Edge, emulating Firefox, add new features to Windows that work across all browsers and applications, continue to leave this to others, or adopt Google’s privacy and security unfriendly behavior. While disappointing, I suspect they will take the middle road and leave this to others.

What you should notice is the one option that would save ad-supported websites, a move by the advertising industry to truly protect security and privacy, is absent. Maybe there is some work going on there, but so far it hasn’t made it to the mainstream. As I said a year ago, they are running out of time to save themselves. The escalating ad blocking war tells us that it is just about too late.

Posted in Computer and Internet, Microsoft, Privacy, Security, Windows | Tagged ad-blocker, adblocker | 1 Comment

Prime 1-Day Delivery Really is Different

Posted on May 3, 2019 by halberenson

At last week’s earnings call Amazon announced it was moving Amazon Prime from its historical 2-day shipping to 1-day shipping. Inevitably there were articles saying how Walmart or Target or whoever already had this. Or even better than Amazon, had same day delivery for some common products. All because they delivered from their large network of stores. I’m going to call BS on that, because “delivering from their stores” turns out to be more a symptom of a problem then a means of solving it.

Go back to Amazon’s origins as an on-line bookseller and Jeff Bezos’ recognizing that he could offer access to a vastly larger number of books (basically all those in print) than you would ever find in your local bookstore. Far more even than in the giant bookstores being built by chains like Barnes & Noble and Borders. This observation holds true more than ever in today’s retail environment. A retail outlet, even one as large as a Walmart Supercenter, only stocks a tiny fraction of the products, brands, styles, colors, sizes, etc. that are available. And in one of the most frustrating parts of the shopping experience, they frequently don’t have what you are looking for when you go into the store.

It is a very rare event when I go out shopping in local retailers that I come home with every item I was looking for. Even going to a store I know carries an item I want is often an unsatisfying experience. “Sorry you just drove 30 minutes and dealt with parking issues, crowds, etc. we are out of stock on that.” ^&%$(. “Oh, you like those shoes? Sorry, we don’t carry that size in store but you can order it on our website.” “We only carry the 2′ version of that cable in the store, if you want the 4′ you’ll have to order it on our website.” Brand of a particular nutritional supplement? Lets roll the dice and see if this store carries it and as it in stock at this very moment. My preferred brand/scent of antiperspirant? The Safeway stores in Denver seem to stock it, but not the ones around Seattle. And so on. As a result, I don’t bother going to stores. When I need something I just order it. Most of the time from Amazon.

While being able to deliver in one day, or same day, from a local retail outlet can be a very useful part of a fulfillment system, any attempt to make it the center of the experience replicates its bad characteristics to the online world. I don’t really care if Walmart or Target can deliver to my house in 20 minutes if neither carries the antiperspirant I want. Or if they are out of stock on the style, color, and size jeans I am looking for.

I’ve been living in an area where Amazon already offers free 1-day Prime delivery on many items for orders over $35. On Tuesday I realized I’d lost my Apple Pencil I had a new one on Wednesday, despite my SUV being in the shop. Amazon also offers various same-day delivery programs in my area, though I haven’t made use of those services. The news in Amazon announcing that they were moving Prime to 1-day delivery as the default is that they are building out their logistics system to support doing so for a very large portion of the items available on Amazon.com. And that is a whole different beast, both in complexity and in customer offering, than adding a delivery service from your local poorly stocked store. It’s the very same advantage that Jeff Bezos’ had over bricks and mortar bookstores on Day One. And not a surprise for a company where “it is always Day One”.

Posted in Amazon, Computer and Internet, Retail | 3 Comments

DMARC or Die

Posted on March 2, 2019 by halberenson

Let me ask a simple question, when are we going to get serious about dealing with unauthenticated email and its associated Phishing and Malware risks? If you think the industry is already taking this seriously, and that it is simply a hard problem, you are (IMHO) just wrong. Take this little snippet from the Microsoft Office 365 documentation on their handling of inbound mail that fails a Domain-based Message Authentication, Reporting, and Conformance (DMARC) check:

If the DMARC policy of the sending server is p=reject, EOP marks the message as spam instead of rejecting it. In other words, for inbound email, Office 365 treats p=reject and p=quarantine the same way.

In other words, in Microsoft’s infinite wisdom they ignore instructions from the domain owner to shred, incinerate, and bury deep in the earth mail that fails the checks they established to prove it comes from them, and instead put that mail in the Junk folder where 100s of millions of naive users will find it and believe it might be legitimate. This may have been a wise step back when DMARC was fresh and new in 2012, today it is simply irresponsible of Microsoft to favor legacy behaviors over a domain owner’s explicit instructions.

I don’t really want to pick on Microsoft, other than as a representative of the industry overall. We have the tools (SPF/DKIM/DMARC) to dramatically impact the SPAM problem but aren’t driving adoption, and proper usage, at a rate commensurate with the danger that unauthenticated email represents. SPF and DKIM have been with us for about 15 years. After 15 years we should no longer accept excuses such as SPF breaking legacy (pre-)Internet systems like listservers, there has been plenty of time for alternate compliant systems to be deployed. Unfortuntately nearly every SPF record seems to end with a soft-fail indicator, meaning “I don’t know who might legitimately send email on my behalf so don’t actually reject anything”. DMARC, which really brings SPF and DKIM into a useful framework, has only been adopted by 50% of F500 companies. And nearly all of them have DMARC policies of NONE, meaning just go ahead and deliver mail that fails authentication to the user’s inbox. WTF? And if they do take DMARC seriously only to have Microsoft ignore instructions to REJECT mail that fails authentication, it’s enough to make a CISO drink.

Is it going to take legislation to make the industry get serious? Maybe if Microsoft were subject to a lawsuit with treble damages because they delivered a malicious email to people’s junk folder rather than honor the DMARC REJECT policy we’d see some action. Not just by Microsoft, but by every organization fearful that new legislation had made it clear that failure to adopt well established anti-SPAM techniques subjected them to unlimited financial exposure.

We need a hard timetable for DMARC adoption, and if industry doesn’t do it then perhaps it will take a legislative push. In either case, we need a date by which all domains either establish a DMARC policy or have their mail rejected by recipient servers. We need a date by which a DMARC policy must be either REJECT or QUARANTINE. We need a date by which servers must enforce the DMARC policy rather than just check it. The later is actually the first thing to be tackled. If someone has taken the trouble to establish a policy, a server should enforce it! Hear that Microsoft? And we need a date by which REJECT is the only acceptable policy. Want to install some other milestones, fine. But let’s stop with the excuses. It really doesn’t matter if this is a problem of the perfect being the enemy of the good, or of competing interests, or just inertia. Throw out the excuses and DMARC or Die.

Posted in Computer and Internet, Microsoft, Phishing, Privacy, Security | Tagged DMARC, malware, Phishing, SPAM | 5 Comments

The Travel Buddy PC

Posted on November 19, 2018 by halberenson

There have been a flurry of articles and blog posts lately on the topic of can the iPad Pro replace a PC (Windows or MacOS), and I thought it was time to wade in to the muck. If you’ve read my blog these last (almost) 9 years then you’ll guess some background on where I land on this topic. But before we get there, let’s talk a little about observations.

The other day one of our friends came to visit for a few days. While she has been an iPad (both the 9.7″ and iPad Mini) user ever since Apple released them, in the past whenever she needed to do work she pulled out a MacBook. Not on this trip however, instead she sat on our couch with a 12.9” iPad Pro. When I queried her about it she said that when she traveled she preferred to take the iPad Pro with her rather than the MacBook. It hadn’t replaced the MacBook, but it addressed an overlapping requirement.

Another friend, a former Microsoft C-level executive, uses the iPad Pro in much the same way. He has a few Windows notebooks as well as a Mac, yet the device he always has with him is an iPad Pro. I have seen him use it to present to another former Microsoft C-level executive on a large monitor, and present to senior (including C-level) executives at both a large bank and a medium-large technology company. He can also frequently be seen using the Apple Pencil and iPad Pro to take notes in OneNote.

Our friends’ use cases caused me to reflect further on my wife and my own iPad Pro usage. While we both have notebooks, and a family desktop, for serious productivity work, our go to devices for portable personal computing are our iPad Pros. Indeed, this blog is being written on my new iPad Pro 11″. It is sitting on the ottoman, its base and keyboard stable (actually more stable than my top heave Surface Book 2 would be) while I sit on the edge of the couch. The viewing angle is almost right, I do wish I could adjust it a little more, but certainly not a problem. While the Surface Pro is more adjustable, most traditional notebooks really don’t offer a better angle for how I’m sitting, their screens don’t go back far enough. I haven’t tried the 11″ on an airplane yet, but my 9.7″ worked great on an airline tray and it worked adequately on my lap. The 11″ keyboard/case design is more stable than that used on earlier iPad Pros and passes the lapability test far better than many of the 2-in-1 Windows PCs I’ve tried.

I regularly use an iPad Pro to do management of my AWS account resources, research problem solutions, and of course write emails. I also use it for spreadsheets and preparing presentations. Some of those (famous) narratives that are the bread and butter behind all decisions at Amazon were partially written on my iPad Pro 9.7″. I’ve even done some limited software development on it, by using it to connect to both a remote Windows desktop and an Ubuntu Linux development machine. While I wouldn’t recommend the iPad Pro be a primary computer for any of these things, it does most of them adequately enough to let you leave your Windows or MacOS notebook behind much of the time.

My iPad Pro is almost always with me. I slide it under the seat of the car when I am out and about, take it into restaurants when I dine alone, take it to the doctor’s office or car dealer etc. when I know I’m going to be waiting around. Take it to business meetings so I can take notes or do research. And usually it is the only computer I take with me when doing non-business travel.

I only take the Surface Book 2 when I am in full work mode. Then it travels between my home, a client, whatever I am currently using as an office, etc. I take it if I’m in the middle of building something serious, where the advantages of having a full WIMP user interface at my disposal makes me more productive. But that’s my 10% use case. Most of the time my SB2 sits docked to a large monitor in my home office.

When people ask the question “Can an iPad Pro replace my notebook” the answer is a clear “much of the time”. For me the iPad Pro is ideal as what I call the Travel Buddy computer (or even Travel Buddy PC). It retains the application library and content consumption strengths of the original iPad, while getting to the 80-90% mark on content creation compared to similar Windows tablets/2-in-1s. Recent Windows systems like Microsoft’s Surface Go also fall into the Travel Buddy category, but are too weak in tablet usability and limited application library to address most user’s non-work desires of a Travel Buddy PC.

So what are the biggest limitations of the iPad Pro as a notebook replacement? As others have noted the lack of a mouse or equivalent pointing device makes some work painful. In particular, cut/paste. The iPad Pro has an advantage over a PC in terms of broad adoption within applications of sharing entire objects, and sometimes that makes them feel superior. But if you need to take a precise region of data, like part of a list within a document, and copy it to another document, then the PC wins. PCs are also much better at multiple windows than the iPad Pro, although this is somewhat a matter of taste. As I’ve written in the past, I mostly run in full-screen mode no matter how big the screen I’m using. Sometimes I use two windows so I can look up data at the same time I’m filling in a form or writing a document. Well, the iPad Pro can do that. But if your work style is to keep 3, 4, 5+ windows open on the screen at the same time then…what on earth are you doing buying an 11″ or 12.9″ display device of any variety?

So could you replace your notebook with an iPad Pro? For many scenarios absolutely. For all scenarios no way. And in particular, could an iPad Pro become your only (non-phone) computing device? I think for a surprising number of the 1.5B PC users out there it could, but that is because many of them don’t really rely on the PC’s strengths. And for all of us, the computer you have with you always beats the computer you left at home. Which makes the iPad Pro a good alternative to a Windows notebook or MacBook as a Travel Buddy.

Posted in Computer and Internet, Mobile | Tagged Apple, iPad Pro | 4 Comments

Snatching defeat from the jaws of victory

Posted on October 31, 2018 by halberenson

Once again Microsoft appears to have snatched defeat from the jaws of victory, this time repeating a key mistake from the Windows 8 era. Microsoft was on the path to a coupe, launching the seemingly excellent Surface Go well ahead of Apple’s launch of the next generation of iPad Pros. It also launched the Surface Pro 6 ahead of Apple’s launch, though with a much smaller lead. So where did Microsoft go wrong? NO LTE. Oh they promise LTE in the future, but futures don’t cut it in this case. This is exactly where Microsoft (and its ecosystem) screwed up back in 2013, and has continued to screw up in successive launch cycles.

Back in 2013 the excellent Dell Venue 8 Pro, and other Windows tablets, launched with a promise of LTE, and then it never appeared. Within the Surface line Microsoft has always either ignored LTE, delayed it for well beyond initial launch, and if it did arrive they made it hard to buy (i.e., targeted the business sales channel) rather than featuring it. Now we have Microsoft singing the praises of “Always-Connected PCs”, but they don’t walk the talk. For Microsoft, being “always connected” only applies to low-end ARM-based Windows 10 systems. And they so far haven’t even offered one of those themselves.

With Apple you just select WiFi-Only or WiFi+LTE as part of its normal sales processes, both online and in-store. And they launch (and generally ship) the LTE models concurrently with the WiFi-Only models.

I was completely ready to spring for a Surface Go the moment I could get one with LTE, and then yesterday Apple launched the new generation of iPad Pros. There are a few things that the iPad Pro is not good at, like software development, but for my daily on-the-go needs it is near perfect. And most importantly, I will have one in my hands, WITH LTE, in a couple of weeks. So the moment has passed Microsoft, and while you keep talking about being always connected Apple is doing a much better job of walking the talk. The Surface Go likely isn’t going anywhere, and I’m not particularly hopeful about the “Always-Connected PC” initiative either.

Posted in Computer and Internet, Microsoft, Windows | Tagged Always-Connected PC, LTE, Surface | 6 Comments

Google goes to the dark side on JEDI

Posted on October 9, 2018 by halberenson

Every time I read an article on the U.S. Department of Defense large Cloud project known as JEDI I find myself suppressing an urge to comment. Google dropping out of the bidding finally made that urge difficult to suppress.

It is almost certainly true that only Amazon (AWS) and Microsoft have the current breadth of offering to meet much of the JEDI requirements. It is equally true that neither of them have all the pieces needed for this contract, they are going to have to build new capabilities as well. Most articles I’ve read focus on certifications as a differentiator, and while those may represent a minimum bar for selling into this market and a demonstration of a Cloud’s maturity, they seem neither a significant differentiator nor a significant hinderance to a vendor’s ability to compete for the RFP. Put another way, if the rest of the RFP response showed overwhelming leadership then a roadmap for achieving the needed certifications would be sufficient to overcome the AWS and Microsoft existing certification leads.

The problem for every potential bidder is that they either need to partner to meet the full RFP requirements and/or commit to significant developments that could negatively impact (e.g., in opportunity cost) their commercial offering roadmaps. The whining about JEDI being a single source contract says more to me about tech industry disdain for partnering amongst major players than it does about the nature of the contract. DOD is used to many, if not most, major contracts involving partnerships amongst the top suppliers (aka, competitors). Boeing/Lockheed, Lockheed/Boeing, Lockheed/Northrop Grumman, Boeing/Saab, etc. The right bid from a lead/prime with a lot of DOD experience would have a strong chance to challenge AWS and Microsoft. For example, IBM has lots of the pieces for a bid and decades of experience being a Prime contractor for DOD. It is the latter, not their fragmented commercial cloud offerings, that make them a serious contender to win JEDI.

The real question about JEDI, and likely the real meaning behind Google’s using lack of certifications as an excuse to drop out, is how much a vendor is willing to let the JEDI requirements impact their commercial roadmap. AWS’ Andy Jassy likes to say that there is no compression algorithm for experience. While that sometimes sounds like a marketing sound-bite, there is a lot of truth to it. When the cloud was new, and enterprise adoption was near non-existent, AWS aggressively went after a number of deals for the experience they would provide. Those deals were key to getting AWS to its current leadership position, because they prepared an organization with only eCommerce DNA to address industries it otherwise couldn’t understand or relate to. One of those was the U.S. Intelligence Communities’ Commercial Cloud Services (C2S) contract, which many point to as one of AWS’ key strengths in the JEDI bid. Certainly AWS wouldn’t be in a good position to win the JEDI deal without C2S, because it would face the “no compression algorithm for experience” dilemma. And while others may not have the direct classified cloud experience C2S gave AWS, Microsoft, IBM, and Oracle have decades of experience working with DOD and meeting their most demanding IT needs.

C2S is most important in the context of how much a young and small AWS was willing to impact its commercial roadmap to gain experience at working in the toughest public sector environments. Both the learnings, and yes the optics, of being able to support the most demanding security environment have had a huge impact on AWS’ ability to attract large enterprises to its cloud. This is where Amazon’s focus on the long-term comes into play. C2S was a drop in the bucket on public sector IT spending. JEDI is still just a toe in the water. AWS will value JEDI not only for the business it brings, but for the things it forces them to do to meet DOD’s requirements. Many of which it will bring back into its commercial offerings. Oracle will value it for giving their cloud a legitimacy they have yet to achieve. It could actually save their IaaS/PaaS offerings from oblivion. IBM seem more likely to value the revenue than other benefits. Microsoft likely sees it as validation that the direction(s) they’ve taken with Azure (including Azure Stack) has them equal to or ahead of AWS (without having to fall back on winning because the customer is an Amazon-retail competitor, or buying the business with a “strategic investment”). Sorry, I couldn’t resist taking a little dig at my Microsoft friends.

And Google? Google’s primary marketing thrust is you should use Google Cloud because everyone wants to do things just like Google does. But if Google doesn’t want government to use AI like they do, and may in the future not want government to use some of their other technologies, and doesn’t want to disrupt their commercial roadmap to meet DOD requirements, then Google can’t bid on the deal. The same applications that Google doesn’t want its AI technology to be used in could make use of technologies like BigQuery and Spanner, so how can Google offer those as part of JEDI? And how much does Google want to focus its infrastructure work on being able to quickly standup a new region at a newly established military base vs continued development of its commercial regions? How hungry are they for this business? Apparently not very as they’ve decide to go dark on the bidding.

The company that wins this business is going to be a company that is hungry for it, and not just for the revenue it brings. That is always important of course, and being able to make a profit at it is just as important. But in the end the winner is going to be, or at least should be, someone with a passion for the DOD customer base and for applying the learnings from JEDI to moving the Cloud up another notch in addressing broader customer needs. I obviously see that from AWS and Microsoft, and Google already made it clear that isn’t the case for them.

Posted in AWS, Azure, Cloud, Computer and Internet | Tagged DOD, JEDI | Comments Off

The Big Non-Hack?

Posted on October 6, 2018 by halberenson

This week Bloomberg Businessweek (BBW) published “The Big Hack: How China Used a Tiny Chip to Infiltrate U.S. Companies” which claimed that 30 companies, most notably Apple and Amazon Web Services, had servers using hacked Chinese-made motherboards from U.S. manufacturer SuperMicro. Apple, Amazon, SuperMicro, and even the Chinese government issued strong denials. Additional denials are coming in as well, and right now BBW seems pretty far out on a limb with the story. True or not, the article publicized real concerns about the security of the technology supply chain. Concerns we are not taking seriously enough.

One bit of clarification (which is important, particularly if you don’t read the article carefully) is that the Amazon-related comment is about a company it acquired, Elemental Technologies. Allegedly the hardware hack in Elemental server products was discovered as part of Amazon’s pre-acquisition due diligence and nearly scuttled the deal. If there is any truth to the story, and Amazon gave quite a detailed response saying there isn’t, it should give some measure of assurance to AWS customers that AWS’ security processes caught this before the Elemental acquisition. One weird part of the story vis a vi AWS is that some of the hacked motherboards showed up in the AWS Beijing region. While I won’t say exactly why, that part of the story set off my BS detector. Otherwise, the AWS servers that run customer virtual machines (AMIs) and service control planes were not implicated in the story.

For all three major cloud providers I expect security practices that would either prevent or quickly uncover a hack such as the one discussed in the story. I have no personal knowledge of Google, but both Amazon and Microsoft are extremely thorough, sophisticated, and usually quite aggressive on the security front. Particularly when it comes to their own infrastructure. At AWS security is considered the #1 priority, and failure is treated as the ultimate risk for destroying customer trust. If the story about Elemental is even remotely true, the result of having discovered an actual hardware hack would have led AWS to implement numerous additional checks in its hardware acquisition and acceptance processes.

But to the meat of the issue, China is increasingly seen as a bad actor. When you combine repeated concerns about back doors in Chinese-made technology products with ongoing Intellectual Property theft concerns, rising wage costs, rising shipping costs, rapidly growing national security concerns, and the nascent trade war, I have to wonder how long until western companies just start removing China from the supply chain. That doesn’t necessarily mean moving manufacturing “back” to the U.S. (or western Europe), it may mean moving to other low-cost countries. Countries where, presumably, there is better protection of Intellectual Property and privacy. And far less national security risk as well. Basically, how long before western companies say the risks of having China in your supply chain far exceed the rewards? For those wanting to sell to the U.S. Government, and likely many allies, the day of reckoning is already here. That noose will just keep getting tightened.

When will we see an accelerated move away from including China in the supply chain of technology products? If the BBW story turns out to be true, that will certainly accelerate things somewhat. If the trade war lasts for more than a few months, that will have a major impact. Few, if any, companies are going to try to figure out how to remove China from the supply chain of existing or well along in development products. But probably every (non-startup) western company is looking at products just entering the development cycle and trying to figure out if there is a sensible way to not make that product in China or with Chinese-sourced components. Most will likely conclude there isn’t currently a sensible alternative, or decide to take the risk the trade war will be resolved before they go into production. Many will at least take some initial steps to reduce their China supply chain exposure, such as seeking second sources outside China for key components. The longer the trade war goes on the more they will conclude tariffs are a long-term part of the cost equation and shift away from China. And if another, confirmed, story of Chinese hardware hacking comes out during these deliberations? There will be a mad rush for the exit.

As for BBW, I’m concerned that the story doesn’t seem to have legs. And if the story is false, or at least got a lot of the facts wrong, then it gives a serious black eye to reporting on the technology business.

Posted in Amazon, AWS, Cloud, Computer and Internet, Security | Tagged Supermicro | Comments Off

The Product Shipping Tax

Posted on September 21, 2018 by halberenson

There is an observation I had back in the 1980s that both holds in today’s Cloud world and remains one of the toughest messages to communicate to senior leadership. When you ship a new product or service, major release, or even a major feature (in the Cloud world), your people resources for new feature development is permanently cut in half. The short-term message is no more palatable, but perhaps easier to communicate. For the first 6-12 months after a major release nearly 100% of your people will be unavailable (or their efforts severely degraded) for new feature development. So each budget cycle product teams end up asking for more staffing, even as it seems we are delivering less in the way of features. It isn’t that senior leaders don’t get that there is a tax on supporting existing products, from bug fixing to operations, but they do have trouble with the magnitude of it. For example, non-engineers (or those who haven’t done engineering recently) struggle with how costly yet necessary it is to pay down technical debt.

Two things happened to me in the 80s that lead to my 50%/100% rule of thumb. The first was my experience as a project leader of multiple releases. Each time we did a release I would end up finding the number of person-months I had to schedule to investigate customer issues, fix bugs, perform cleanups of code that had become unmaintainable, deal with dependencies (e.g., a new OS version breaking an existing product), revamp build systems, respond to corporate initiatives (e.g., you must switch to this new setup/installation system), etc. would go up. And over time I realized it would stabilize at about half the team.

The other thing that happened in the 80s is I went back and looked at multiple releases, including those I hadn’t been involved in, and plotted the incoming Software Performance Report (SPR) rate by month against a number of other metrics. SPRs were a means for DEC customers to report bugs, request features, and otherwise communicate with the engineering team about issues. There was no filter on these, even customers without support contracts could submit SPRs, so a complex feature might generate a lot of SPRs even though those resulted in a low unique bug rate. There were two interesting data points here. The first was that incoming SPR rate started to rise dramatically about 60 days after release, the peak occurring around the 6 month mark. While the incoming rate dropped off, it plateaued at a higher level after each release. There were two causes for that, one being just having more features that needed support. The other was that, thankfully, there was a rapidly growing customer base. So even if you drove SPRs per Customer (one of my favorite overall product quality metrics) down, the growth in customers meant more SPRs.

The second data point was that there was a clear correlation between the number of check-ins for the release and the incoming SPR rate, so major releases not surprisingly resulted in more SPRs than minor releases. I was actually able to predict the SPR rate for a new major release would be terrifyingly high based on this metric, a prediction that sadly was accurate. At peak nearly the entire development team was required to respond to SPRs, and for about 90 days before and after there was a high interrupt load on most developers as SPRs hit for their area rendering them unproductive at working on new features.

The Cloud changes none of this, and perhaps makes it even worse. Before you enter a beta or preview period you have no operational burden, minimal deployment burden, only modest urgency on fixing most bugs, etc. The preview is as much about making sure you can operate at hyperscale as it is about traditional beta things like verifying that customers can use the service as intended. Then the day you declare General Availability (GA) you have a 24×7 operational burden. Production-impacting bugs become urgent. It’s the day you start learning where you missed on preparing for hyperscale (see https://hal2020.com/2018/01/20/challenges-of-hyperscale-computing-part-2/ and https://hal2020.com/2018/08/25/challenges-of-hyperscale-computing-part-3/). It’s the day customers start trying to do things you never intended, or perhaps never expected. It’s the day that you start having to plan on paying down technical debt built up during development. It’s the day you have to start dealing with disruptions like the Meltdown and Spectre security issues with an urgency that distracts from feature work. Etc. So just like with a 1980s packaged product, for the first 6-12 months nearly the entire team will be unavailable for feature work and on an ongoing basis only half the team you had at launch will be available for feature work.

I tried for years to find ways to avoid the 50%/100% tax, but never succeeded. So each budget cycle I’d look at all we wanted to do, all that our customers wanted us to do, and go and ask for a significant headcount increase. Each year I would face the pain of telling senior leadership how little feature work we could do without that increase. Each year they would challenge me, and I didn’t blame them. I never found a way to communicate the magnitude of the situation in the context of the budgeting exercise. In retrospect I realize was I should have done at Amazon is written a narrative, outside the “OP1” process, that made all this clear. I could have looked at data for numerous projects that would have (likely) supported my career-long observation. But that would have been too late to help with the decades at DEC and Microsoft where I failed to fully explain the need for the additional people. To be clear, I just about always got the people I needed. It was just more painful than it should have been.

So what prompted me to write this now? I’m watching as the first signs appear that Aurora PostgreSQL is getting past its “V1.0” 100% stage. For example, although Aurora PostgreSQL has not yet announced PostgreSQL 10 support in some regions you can actually find it (10.4 specifically) in the version selector for creating Aurora PostgreSQL instances. Launch must be fairly imminent, with hopefully many more features coming in the next few months. Overall though, it reminded me my 50%/100% rule still applies.

Posted in Computer and Internet | 2 Comments

Hal's (Im)Perfect Vision

ARM in the Cloud

Endings and Beginnings (Part 1 – AWS)

Ad-blocker Wars

Prime 1-Day Delivery Really is Different

DMARC or Die

The Travel Buddy PC

Snatching defeat from the jaws of victory

Google goes to the dark side on JEDI

The Big Non-Hack?

The Product Shipping Tax

Recent Posts

Archives

Categories

Meta

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Recent Posts

Archives

Categories

Meta