Between 19:29 and 22:35 UTC on 2 May 2019, customers may have experienced connectivity issues with Microsoft cloud services including Azure, Microsoft 365, Dynamics 365 and Azure DevOps. Most services were recovered by 21:40 UTC, with the remainder recovered by 22:35 UTC.
Root cause: As part of a planned maintenance activity, Microsoft engineers executed a configuration change to update one of the name servers for DNS zones used to reach several Microsoft services, including Azure Storage and Azure SQL Database. A failure in the change process resulted in one of the four name server records for these zones pointing to a DNS server that had blank zone data and returned negative responses. As a result, approximately 25% of the queries for domains used by these services (such as database.windows.net) produced incorrect results, and reachability to these services was degraded. Consequently, multiple other Azure and Microsoft services that depend upon these core services were also impacted to varying degrees.
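The 25% figure follows from one of four delegated name servers being broken. The Python sketch below reproduces that arithmetic under a simplifying assumption that is not stated in the incident report: each query is sent to one name server chosen uniformly at random, with no retry against another server. Real resolvers often weight servers by response time and may retry, so the observed rate can differ. The function names are illustrative only.

```python
from fractions import Fraction
import random


def failure_fraction(total_servers: int, broken_servers: int) -> Fraction:
    """Expected share of queries that land on a broken name server,
    assuming uniform random selection and no retries (an assumption)."""
    if total_servers <= 0 or not 0 <= broken_servers <= total_servers:
        raise ValueError("need 0 <= broken_servers <= total_servers and total_servers > 0")
    return Fraction(broken_servers, total_servers)


def simulate(total_servers: int, broken_servers: int, queries: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same share, using a seeded generator
    so the run is repeatable."""
    rng = random.Random(seed)
    # Servers 0..broken_servers-1 are treated as the broken ones.
    failures = sum(
        1 for _ in range(queries) if rng.randrange(total_servers) < broken_servers
    )
    return failures / queries


if __name__ == "__main__":
    print(float(failure_fraction(4, 1)))   # exact expectation for 1 broken of 4
    print(simulate(4, 1, queries=100_000)) # simulated estimate, close to the exact value
```

Under this model the impact scales linearly with the number of bad delegations, which is why a single incorrect name server record was enough to degrade a quarter of lookups.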
To mitigate the issue, Microsoft engineers corrected the delegation issue by reverting the name server value to the previous setting. Engineers verified that all responses were then correct, and the DNS resolvers began returning correct results within 5 minutes. Some applications and services that accessed the incorrect values and cached the results may have experienced longer restoration times until the expiration of the incorrect cached information.
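The longer restoration times for some clients come from caching: an answer obtained during the incident keeps being served until its time-to-live (TTL) expires, even after the delegation is fixed. The following is a minimal sketch of that behaviour, not a model of any particular resolver. The host name, the 300-second TTL, the address 203.0.113.10 (from a documentation-only range) and the class and function names are all hypothetical; the incident report does not give the actual TTLs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedAnswer:
    value: str
    expires_at: float  # time in seconds after which the entry is stale


class TtlCache:
    """A tiny TTL cache; the caller passes the current time explicitly."""

    def __init__(self) -> None:
        self._entries: Dict[str, CachedAnswer] = {}

    def put(self, name: str, value: str, ttl: float, now: float) -> None:
        self._entries[name] = CachedAnswer(value, now + ttl)

    def get(self, name: str, now: float) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None or now >= entry.expires_at:
            # Missing or expired: drop it so the next lookup goes upstream.
            self._entries.pop(name, None)
            return None
        return entry.value


def resolve(cache: TtlCache, name: str, now: float, upstream_answer: str, ttl: float) -> str:
    """Return the cached answer if still valid, else cache and return the upstream one."""
    cached = cache.get(name, now)
    if cached is not None:
        return cached
    cache.put(name, upstream_answer, ttl, now)
    return upstream_answer


if __name__ == "__main__":
    cache = TtlCache()
    host = "example.database.windows.net"  # hypothetical host name
    # t=0: the broken server answers negatively, and the client caches it.
    print(resolve(cache, host, now=0, upstream_answer="NXDOMAIN", ttl=300))
    # t=60: the delegation is fixed, but the cached negative answer is still served.
    print(resolve(cache, host, now=60, upstream_answer="203.0.113.10", ttl=300))
    # t=300: the entry has expired, so the corrected answer is fetched.
    print(resolve(cache, host, now=300, upstream_answer="203.0.113.10", ttl=300))
```

In this sketch the client keeps failing for the remainder of the TTL after the fix, which matches the pattern of most resolvers recovering quickly while some cached clients lagged behind.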
Duncan is a technology professional with over 20 years' experience working in various IT roles. He has an interest in cyber security and a wide range of other skills in radio, electronics and telecommunications.