Monday, 16 May 2016

Sitemap Infrastructure Upgrade

On 16 April our hosting provider 123-reg made a catastrophic blunder, deleting hundreds of customer virtual servers. They have since been unable to recover the servers, and many customers have reported permanent loss of data. You can read more about the outage and how it unfolded in our previous blog post and on the BBC News website.

Having been through a rocky period, we are pleased to say that we are now coming out the other side of the disaster, having relocated our services to Microsoft Azure.

Our previous setup


Our previous setup with 123-reg gave us limited options in terms of resilience and scalability. We essentially hosted the service on a number of manually managed virtual private servers, relying on the rudimentary backup provision offered by 123-reg.

A simplified view of our 123-reg hosting setup

Our 123-reg setup was in a single UK data centre with basic hosting facilities, giving rise to a number of challenges including:

  • Difficulty abstracting and managing key services
  • No guaranteed fault domains
  • Problematic load balancing
  • Very limited backup options
  • Labour-intensive manual scaling
  • Lack of performance and security management

Our Microsoft Azure setup


Our new Azure setup has allowed us to easily abstract the various components and services and host them in much more resilient environments, with automatic redundancy/failover and much more opportunity to scale.

Azure allows us to host services around the globe at the click of a button. In the first instance we have provisioned our primary services in the West Europe Azure region, which provides a good central operating base.


An overview of our Azure setup

We now have a dedicated load balancer fronting the service, allowing us to better manage traffic across a number of server nodes and to carry out maintenance and updates much more transparently.

The servers themselves sit in separate fault domains, meaning they do not depend on the same infrastructure should a fault occur.
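To illustrate the idea, here is a minimal Python sketch of a round-robin balancer that skips nodes taken out of service for maintenance, plus a check that the remaining nodes still span more than one fault domain. It is not how the Azure load balancer is implemented or configured; the `Node` and `RoundRobinBalancer` classes, the node names and the fault domain numbers are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Node:
    """A hypothetical server node behind the load balancer."""
    name: str
    fault_domain: int
    in_service: bool = True


class RoundRobinBalancer:
    """Hands out nodes in turn, skipping any that are out of service."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._ring = cycle(self.nodes)

    def pick(self):
        # Look at each node at most once per call so we never loop forever.
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node.in_service:
                return node
        raise RuntimeError("no nodes in service")


def survives_single_domain_fault(nodes):
    """True if in-service nodes are spread over at least two fault domains."""
    domains = {n.fault_domain for n in nodes if n.in_service}
    return len(domains) >= 2


# Illustrative node names and fault domains (assumptions, not our real setup).
nodes = [Node("web-0", fault_domain=0), Node("web-1", fault_domain=1)]
balancer = RoundRobinBalancer(nodes)
```

Marking a node with `in_service = False` before an update means `pick()` simply routes around it, which is the sense in which maintenance becomes more transparent to visitors.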

We're also able to take server snapshots and scale servers up and out should we need to meet peaks and troughs in demand. This can even be done automatically in real time!
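As a rough illustration of what an automatic scale-out rule does, the Python sketch below decides how many nodes to run from the average CPU load. It is only an assumption-laden example: the function name, the 70% and 30% thresholds and the node limits are hypothetical and are not Azure's defaults or the settings we actually use.

```python
def desired_node_count(current, avg_cpu_percent, *,
                       scale_out_above=70.0, scale_in_below=30.0,
                       min_nodes=2, max_nodes=10):
    """Return the node count a simple threshold rule would aim for.

    All thresholds and limits are illustrative assumptions.
    """
    if avg_cpu_percent > scale_out_above:
        target = current + 1   # peak in demand: add a node
    elif avg_cpu_percent < scale_in_below:
        target = current - 1   # trough in demand: remove a node
    else:
        target = current       # load is within the comfortable band
    # Never drop below the minimum or exceed the maximum.
    return max(min_nodes, min(max_nodes, target))
```

Keeping a minimum of two nodes in this sketch reflects the point about fault domains: a single node would leave no spare capacity if its infrastructure failed.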

Azure has enabled us to easily abstract our file and database services into dedicated managed backends with inherent resilience, fault domains and automated backups.

Azure also affords us an awesome set of tools, including Application Insights and performance and security monitoring, enabling us to proactively manage the server.

What's next


As the service beds in, we hope to tune and optimise it further, so there may be a few more bumps along the way, but ultimately we hope to provide a more stable platform.

With Azure being a flexible global platform, scaling and delivering compute power where it is needed to reduce network latency becomes a real possibility. We hope to be able to improve and expand the service further in the future as budget and resources permit.

We appreciate all the support and offers of help we received during this problematic period and would encourage you, if possible, to contribute and help us keep sitemaps free.