Can a ScaleOut cluster on AWS span availability zones for resilience?

#1
A few years ago I trialled ScaleOut on AWS and had to reject it because we could not get the resilience we required: a single cluster could not span three or more availability zones within one region.

Please can someone advise if this constraint still applies?
Thank you.
 

admin

Administrator
#1
Yes, the constraint still applies. When you launch ScaleOut StateServer and related products from the AWS Marketplace, all hosts launch into the same availability zone. This is by design, for two reasons:
  • Host-to-host latency is higher in multi-AZ deployments, which hurts the distributed membership protocol because it relies on low-latency heartbeating between hosts. The added latency can trigger false-positive recovery scenarios, in which healthy hosts are treated as failed. A core requirement of our product (whether on-premises or in the cloud) is that all hosts must be within the same datacenter.
  • Multi-AZ deployments of any distributed software in which all hosts actively participate in replication-based high availability provide a false sense of high availability: the failure of a single AZ causes several hosts to fail simultaneously, potentially resulting in significant data loss and a complex recovery.
For high availability in this case, we recommend the ScaleOut GeoServer option, which can be deployed in either a separate availability zone or a separate region.
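
To make the first point concrete, here is a minimal sketch of a timeout-based heartbeat check. It is not ScaleOut's actual membership protocol; the function name, the heartbeat interval, the timeout, and the latency figures are all hypothetical and chosen only for illustration. It counts how often a perfectly healthy peer would be suspected of failure simply because a heartbeat arrived late.

```python
from typing import List


def count_false_suspicions(latencies_ms: List[float],
                           interval_ms: float,
                           timeout_ms: float) -> int:
    """Count how often a healthy peer would be wrongly suspected.

    Heartbeat i is sent at i * interval_ms and arrives latencies_ms[i] later.
    The receiver suspects the peer whenever the gap between two consecutive
    arrivals exceeds timeout_ms. (Hypothetical model, for illustration only.)
    """
    arrivals = sorted(i * interval_ms + lat
                      for i, lat in enumerate(latencies_ms))
    suspicions = 0
    for prev, cur in zip(arrivals, arrivals[1:]):
        if cur - prev > timeout_ms:
            suspicions += 1
    return suspicions


# Made-up one-way latencies in milliseconds, not measurements.
same_az = [0.3, 0.4, 0.3, 0.5, 0.3, 0.4]    # low and steady
cross_az = [1.0, 1.2, 9.0, 1.1, 1.0, 12.0]  # higher, with occasional spikes

print(count_false_suspicions(same_az, interval_ms=10, timeout_ms=15))   # 0
print(count_false_suspicions(cross_az, interval_ms=10, timeout_ms=15))  # 2
```

With these assumed numbers the peer never actually fails, yet the cross-AZ trace trips the timeout twice. Raising the timeout would hide the spikes, but it would also slow detection of real failures, which is the trade-off behind the single-datacenter requirement.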

That said, it is our understanding that Amazon has made several improvements to its cross-AZ latency since we first developed the tooling to support the AWS Marketplace infrastructure. Our engineering team is already evaluating several AWS Marketplace deployment improvements for our next major release, and we will certainly re-evaluate the single-AZ constraint as well.
 