For today’s enterprise, being offline for even a few hours can have a detrimental effect on both the business and its customers. The consequences of a data center outage can be costly, from lost revenue, data and productivity to consequential brand damage. High availability and disaster recovery have never been so critical, as more and more companies move applications and services to XaaS solutions hosted on infrastructure which they cannot control.
That’s why it is of the utmost importance to have a bulletproof business continuity and backup plan, to help prevent data loss and downtime. So, what are some of the things to look out for when partnering with a SaaS vendor, to ensure that they care about continuity just as much as you do?
iasset.com made the decision many years ago to host our customer solutions exclusively within Microsoft Azure, allowing us to deploy applications as required into their data centers around the world. Part of the reasoning was the resilience of the Microsoft infrastructure, which offers solutions for High Availability and Disaster Recovery. These solutions, along with a scalable platform for customer deployments, made the choice straightforward for iasset.com.
Got a Backup Plan?
At iasset.com, our customers can choose from high availability and comprehensive disaster recovery options. As a result, they have peace of mind knowing their applications will continue to run in a healthy state with minimal downtime and can recover from rare but major events.
We experienced one of these events on the morning of September 4, 2018, when high-energy storms hit southern Texas in the United States of America, resulting in a complete loss of Microsoft’s South Central US data center. Thousands of businesses were impacted by the loss, which took Microsoft 6 days to recover from, a significant delay for those who rely on their services being available.
Those affected by the outage included several iasset.com customers. However, the impact for these customers was minimal. Although Microsoft states that the majority of services were up again in just over a day, it was not until the 7th of September that “full mitigation” was complete. iasset.com customers were back online and operational in less than 4 hours, with no data loss.
So How Did We Do It? (The Techy Stuff)
In the first instance, we achieved this thanks to our ability to set up applications to be highly available (HA) within Azure. Using Azure Traffic Manager, we are able to deploy application services in multiple data centers, which gives us highly available, highly scalable options for deploying our customer solutions. When connections to the primary endpoint fail, requests are serviced from the secondary services in a different data center. Whilst this introduces some additional latency, it provides resilience at the application layer to intermittent service outages within a particular data center. This is automated, seamless and requires no recovery once service has been restored to the primary app service.
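The failover behaviour described above can be sketched in a few lines of Python. This is only an illustrative model of priority-based routing, not Traffic Manager's actual implementation or API: the `Endpoint` class, the `route` function and the endpoint names are all hypothetical, and health is set by hand rather than by real probes.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str       # hypothetical endpoint name, for illustration only
    priority: int   # lower number = preferred endpoint
    healthy: bool   # in a real system this would come from health probes


def route(endpoints):
    """Return the healthy endpoint with the lowest priority number, or None."""
    candidates = [e for e in endpoints if e.healthy]
    return min(candidates, key=lambda e: e.priority, default=None)


primary = Endpoint("primary-app", priority=1, healthy=True)
secondary = Endpoint("secondary-app", priority=2, healthy=True)
endpoints = [primary, secondary]

# Normal operation: traffic goes to the primary.
normal_target = route(endpoints).name

# Primary data center becomes unreachable: traffic moves to the secondary.
primary.healthy = False
outage_target = route(endpoints).name

# Primary comes back: traffic returns to it with no manual recovery step.
primary.healthy = True
restored_target = route(endpoints).name

print(normal_target, outage_target, restored_target)
```

Because the routing decision is recomputed from current health each time, nothing has to be "undone" when the primary recovers, which is the property the paragraph above relies on.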
Additionally, for customers on our Enterprise Level service, iasset.com utilizes active geo-replication at the database layer using Azure SQL. This allows us to monitor the application state and manually fail over to the secondary database in the event of a disaster. In the case of the Microsoft outage, the HA app services’ data connections point to the primary database and will fail if that database server is not available. iasset.com can then determine, in conjunction with our customers, the point at which we call a disaster and promote the secondary to primary.
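As a rough illustration of that manual step, the sketch below models a geo-replicated database in which connections fail while the primary is down, and the secondary is promoted only once a disaster has been explicitly declared. It is a simplified model under stated assumptions, not Azure SQL's API: the class names, the `promote_secondary` method and the region names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    region: str             # hypothetical region label
    role: str               # "primary" or "secondary"
    available: bool = True


class GeoReplicatedDatabase:
    """Toy model of one primary and one geo-replicated secondary."""

    def __init__(self, primary_region, secondary_region):
        self.replicas = {
            primary_region: Replica(primary_region, "primary"),
            secondary_region: Replica(secondary_region, "secondary"),
        }

    def _by_role(self, role):
        return next(r for r in self.replicas.values() if r.role == role)

    def connect(self):
        """Applications always connect to whichever replica is primary."""
        primary = self._by_role("primary")
        if not primary.available:
            raise ConnectionError(f"primary in {primary.region} is unavailable")
        return primary.region

    def promote_secondary(self, disaster_declared):
        """Swap roles, but only after a human decision to declare a disaster."""
        if not disaster_declared:
            raise PermissionError("failover requires a declared disaster")
        old_primary = self._by_role("primary")
        new_primary = self._by_role("secondary")
        old_primary.role, new_primary.role = "secondary", "primary"
        return new_primary.region


db = GeoReplicatedDatabase("region-a", "region-b")
db.replicas["region-a"].available = False  # simulate the data center outage

try:
    db.connect()
    connection_failed = False
except ConnectionError:
    connection_failed = True  # app services cannot reach the primary

new_primary = db.promote_secondary(disaster_declared=True)
print(connection_failed, new_primary, db.connect())
```

Keeping the promotion behind an explicit flag mirrors the design choice in the text: the application tier fails over automatically, while the database failover is a deliberate decision made together with the customer.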
This was the first time in our 10-year operational history that we have had to instigate a customer Disaster Recovery plan. Whilst no plan is perfect, utilizing Microsoft Azure as our hosting platform and leveraging the services provided has proven to be a wise choice. The ease with which we were able to move a production environment to operate from a different data center was proof of that, even given the extenuating circumstances around this particular outage. Our customers experienced a short inconvenience rather than a lengthy outage.