or… Drop(The Ball)Box – How To (Mis)handle An Outage
Late on Friday (January 10, 2014), Dropbox apparently experienced a service outage. If I am to believe the noise on Twitter, it was a globally-impacting unavailability affecting a large proportion of their user base.
These things happen – usually at the most inopportune moments – but what makes the difference to most customers is how you handle things when they do. And Dropbox, I’m sad to say, seems to have failed to learn anything from all the other companies who have been through this before them, and their response so far has been nothing less than appalling.
For reference, I’m a “free” user, so I accept that I don’t get any compensation when issues arise, and my choice is to put up with it or to move somewhere else. This isn’t, therefore, a personal rant about how butthurt I am about the outage. However, there are many paying customers, and if I were one of them I would be very upset. Dropbox is providing an object lesson in mismanaging communication with its customers.
Communicate, Communicate, Communicate
I host my sites with NameCheap. Whether you happen to think they’re a great host or not, the one thing they do well is let their customers know what’s going on. I follow their RSS feed and Twitter accounts, and get regular updates on planned maintenance, unscheduled maintenance, upstream provider events, specific servers experiencing problems, and so on. You could argue it’s too much information, which it is, right up until it’s my hosting server that’s impacted, at which point I’m extremely grateful to see their timely updates.
There’s a saying that I’ve heard a lot over the years that addresses the importance of good and early communication rather succinctly:
Bad news doesn’t get better with age.
It really doesn’t.
So with that in mind, let’s examine the communication from Dropbox in the face of an apparently huge outage. We’ll use their Twitter stream as an example. Here’s the first “announcement” from @dropbox_support acknowledging the problems:
Well, at least it’s an announcement. It’s not a good one – it doesn’t clearly say “We are experiencing some server issues, and you may be unable to access your files”, but it at least acknowledges that people have been complaining that there’s a problem. Still, I’m sure that they’ll give regular updates after that, given the scale of the problem.
Oh. Three hours of radio silence, then the news that things are fixed! In those three hours, there was nothing they could say? Well, at least it’s fixed now. The main @dropbox account shared the same news, which was retweeted by @dropbox_support:
It turns out that the outage wasn’t over after all, and users continued to complain about lack of service availability all night. Again, we wait for the status update; perhaps an acknowledgement that things might not be entirely fixed? And we got it – twelve hours later:
Am I missing something? I know it’s Friday night, but when you have a catastrophic outage under way, everybody is called in to work and nobody goes home until it’s fixed. That’s how the industry works – you don’t leave outages unresolved until the next morning; at least not the big ones. So why were there no status updates over a 12-hour period? No reassurances? No acknowledgements of ongoing problems? In fact, worse, no responses to users either – the Twitter account was silent for 12 hours, undoubtedly because somebody went home on Friday night, but seriously, did nobody think that staying in contact with customers was important? Are companies still this ignorant about how important their social media presence is to the current generation of users?
Many Dropbox users took to the forums to complain. Finally, somewhere around the time that the outage was announced as being resolved via Twitter, this thread was started on the site by an administrator (Update: the admin in question is a “Super User” – a moderator who may not necessarily be a Dropbox employee), compassionately apologizing for the outage:
Did I say compassionately? It did OK until the thread deletion comment, I suppose. The complaints kept pouring in, and as usual a number of users acted like over-entitled asses about the availability of the service, but the point was clear – users were angry about the outage, and even over just a couple of pages of posts overnight, were getting angrier still at the lack of response from Dropbox. Eight hours later, a Dropbox employee (Ryan M) finally posted in the thread:
So let me get this straight; the plan is to post and say it’s fixed, then everybody goes home for the night, but it’s actually not fixed (at least, not entirely) and nobody thought to communicate about it? Cue page after page of customer rants. Some users sympathized, some just shrugged and said “it happens”, and some were in full-on crisis mode. But the general mood I got was mutinous; they were furious that the service was down. When you have a group of angry users, you can either hide away and hope they’ll go away (highly unlikely based on previous experience with Internet users), or you respond and try to calm them down before there’s a snowball effect. Opportunity missed, Dropbox.
“You should have sent us an email!” say users. In the last hour, the forum administrator pointed out in response that Dropbox has 200 million users, and that sending out an email notification would probably take longer than the outage lasted. Ok, fair point. How about for your paying customers then, which I assume is a much smaller number?
The outage is, apparently, still affecting some customers. The Twitter stream has warmed up, even though most of the responses are a variant of “We apologize for the site troubles. We’re working hard to restore service to all users ASAP!” But it’s a response, at least. For many though it may be too late. On the forums just now, user Buddy H sums it up perfectly:
The damage has been done already. Even now, communication is still not where it should be, and the fact that there’s still no ETA for a fix nor transparency as to the nature of the outage and progress towards the fix suggests that Dropbox is still barely understanding how badly they are handling the situation. I can only guess that there was not a communication plan developed and ready to roll for a crisis like this, but there will be soon.
Will users move? Meh. I won’t, but then it didn’t really affect me. But many will (out of pure butthurt retaliation), and paying users I’m sure will be demanding compensation as well as investigating other service options. Dropbox has a lot of apologizing to do before they can even start to re-earn the trust of their user base.
I can’t wait to see the Root Cause / Post Mortem when they publish it. Assuming, that is, that Dropbox will choose to make it public. Again, I think we’ve learned from many other outages that transparency around outages is seen by users as a positive thing, not a sign of failure.
Update 1/17: Here’s the Dropbox Post Mortem
Fingers crossed, Dropbox. My sympathies to all who had – I sincerely hope – to work around the clock to restore service – I know that sucks. Most of all I’m sorry that the engineers who caused (!) and are repairing the problems are being so badly let down by the communications of others in the company. That’s a huge disconnect and does a big disservice to the folks behind the scenes.