When Azure Goes Down, Do You Get Proactively Notified?

September 28, 2020: Microsoft’s Azure Active Directory (Azure AD) had an outage.

The outage ran from 2:25 p.m. to 5:23 p.m. PDT and affected authentication services, preventing users from logging on to and using resources hosted on Azure. Microsoft posted a root cause analysis detailing the incident.

Scoutbees, ControlUp’s synthetic testing application, picked up on the outage immediately and proactively alerted its users to the issue in their environment. The screenshot below was captured from a browser at 5:15 p.m. EDT.

ControlUp detects Azure Active Directory (Azure AD) outage
Outage (12:15 a.m. IDT, 5:15 p.m. EDT) as seen by monitoring Citrix Cloud resources using Azure AD.

Scoutbees correctly identified the issue as an authentication failure, giving our customers the detail they needed to pinpoint the root cause of the problem!

Scoutbees identifies authentication failure in Microsoft Azure Active Directory

Without Scoutbees, IT’s first source of information is usually end users calling the help desk to complain about the error messages they get when trying to log in to their applications. In a case like this, they might tell your help desk that they were getting “HTTP ERROR 503”:

End users report HTTP ERROR 503 for Azure Active Directory outage

“HTTP ERROR 503” doesn’t really provide much information about the problem. Googling the error message will return the generic “Service Unavailable.” You might think, “Service Unavailable? For Azure?” Further investigation would be required.
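To make that concrete, here is a minimal Python sketch of how a scripted check might separate “the server answered with a 503” from “no answer at all,” which is the distinction a help desk ticket rarely captures. This is an illustration only, not how Scoutbees works internally. The function names `probe` and `classify_probe` and the category labels are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server did answer, just with an error status such as 503.
        return err.code
    except OSError:
        # DNS failure, timeout, refused connection: nothing came back.
        # (URLError and TimeoutError are both subclasses of OSError.)
        return None


def classify_probe(status: Optional[int]) -> str:
    """Map one probe result to a rough, hypothetical failure category."""
    if status is None:
        return "unreachable"          # may be a network problem on the probe side
    if status == 503:
        return "service_unavailable"  # server reachable but unable to serve
    if 500 <= status <= 599:
        return "server_error"
    if status in (401, 403):
        return "auth_rejected"
    if 400 <= status <= 499:
        return "client_error"
    return "ok"
```

A 503 seen from a monitoring location outside the user’s network points at the service rather than at anyone’s home WiFi, while an “unreachable” result from a single location leaves that question open.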

Being in IT, you might have authenticated to Azure before the outage to manage your resources. If so, your experience would not match your users’ experience, because Azure was working fine for you. “Azure is down? What? Could it just be my user’s home WiFi connection?”

In a situation like this, your help desk would be overwhelmed with calls from users reporting the same thing. Tickets would get escalated, and panic would set in as you investigated what the heck was going on. Is your first step checking social media to see if it’s just you experiencing the outage? Should it be?!

IT SHOULD NOT BE. With Scoutbees and its best-in-class synthetic testing, you’d have received a proactive alert about the outage and would have known what was going on before it even popped up on Twitter:

Scoutbees provides proactive alerts for outages

With this information, you can give your help desk advance warning, or even proactively alert your users to a potential problem and point them to your support page for outage information.
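As a rough illustration of the idea behind a proactive alert, the Python sketch below raises an alert only after several scheduled checks fail in a row, so that a single blip does not page anyone. It is a generic pattern under stated assumptions, not Scoutbees’ implementation. The `FailureTracker` class, its `threshold` default, and the `"alert"` and `"recovered"` labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureTracker:
    """Track consecutive failed checks and decide when to notify."""

    threshold: int = 3            # failures in a row before alerting (assumed value)
    consecutive_failures: int = 0
    alerted: bool = False

    def record(self, ok: bool) -> Optional[str]:
        """Record one check result; return "alert", "recovered", or None."""
        if ok:
            was_alerted = self.alerted
            self.consecutive_failures = 0
            self.alerted = False
            # Only announce recovery if an alert was actually sent earlier.
            return "recovered" if was_alerted else None
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.alerted:
            self.alerted = True   # alert once per outage, not on every failure
            return "alert"
        return None
```

Feeding it the result of each scheduled check gives the help desk one notification when the outage starts and one when service returns, rather than a stream of duplicates.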

“Bee” Proactive. Try Scoutbees and proactively monitor your environment today!


Trentent Tye

Trentent Tye, a Tech Person of Interest, is based out of Canada and its many, many feet of snow. FUN FACT: Trentent came to ControlUp because, as a former customer, the product improved his life in so many ways (less stress, faster time to remediation, greater job satisfaction, and more) that he had to be our evangelist. Now an integral part of ControlUp’s Product Marketing Team, he educates our customers, pours his heart and soul into the product, and generally makes ControlUp a better place. Trentent recently moved to be closer to family. He does not recommend moving during a pandemic.