2021-12-22 Amazon Web Services Outage (Riva Cloud)

 
Last updated: 22 Dec, 2021

 

Service restored for Riva Cloud customers at 07:15 am MST on December 22, 2021.

Summary

An Amazon Web Services (AWS) outage affected Riva Cloud connectivity, which could have caused delayed sync cycles.

Root Cause Analysis

Our review determined that the disruption was caused by a power outage and network connectivity issues that Amazon Web Services (AWS) reported in the US-EAST-1 Region.

Amazon confirmed that it was working to resolve an outage for some instances in a single Availability Zone (USE1-AZ4) in the US-EAST-1 Region. In Amazon's words: "This is the result of a loss of power within a single data center within a single Availability Zone (USE1-AZ4) in the US-EAST-1 Region. This is affecting availability and connectivity to EC2 instances that are part of the affected data center within the affected Availability Zone. We are also experiencing elevated RunInstance API error rates for launches within the affected Availability Zone. Connectivity and power to other data centers within the affected Availability Zone, or other Availability Zones within the US-EAST-1 Region are not affected by this issue."

For more detail: https://status.aws.amazon.com/ 

Impact on Affected Customers

Riva Cloud (www.rivacloud.com) remained accessible for customers in our US data center; however, connectivity issues impacted sync cycles.

Riva Cloud Status updates: http://status.rivacloud.com/

Action Plan

Our Riva Cloud Operations team completed a failover to alternate internet gateways in the US-EAST-2 AWS Region.

Next Steps

We are continuously monitoring all systems to ensure the initial cause of the issue does not recur. We realize your successful daily customer interactions, sales, and support processes depend on a trusted system providing reliable, secure, and high-quality processing – this is our top priority. We sincerely apologize for any impact these disruptions may have caused.

If there are any outstanding questions, please do not hesitate to contact us.



Timeline

Technical Details

Note: All times are in Mountain Standard Time (UTC-07:00).

Service restored: Service confirmed as healthy on December 22, 2021, at 07:15 am MST.
 

Wednesday, December 22, 2021

  • 05:06 - Automated Riva Cloud environment alerted the team to the issue
  • 05:30 - Issue identified as originating with AWS, and ticket opened with AWS support
  • 06:00 - Internal review of options with Riva Cloud Operations team
  • 07:01 - Working on failover to re-route affected nodes to alternate internet gateways
  • 07:15 - Re-route to alternate internet gateways completed and sync resumed on affected pods. Issue resolved and service fully restored.
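
For readers who want the elapsed time at each step, the timeline above can be turned into durations with a short script. This is only an illustrative sketch: the timestamps are copied from the list above, while the names TIMELINE, minutes_between, and elapsed_from_first are hypothetical helpers introduced here and are not part of any Riva Cloud tooling.

```python
from datetime import datetime

# Timestamps copied from the timeline above (same day, Mountain Time).
# The short labels are paraphrases added for this sketch.
TIMELINE = [
    ("05:06", "Automated alert raised"),
    ("05:30", "Issue identified as AWS; ticket opened"),
    ("06:00", "Internal review of options"),
    ("07:01", "Failover work in progress"),
    ("07:15", "Re-route completed; service restored"),
]


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def elapsed_from_first(timeline):
    """Pair each event label with the minutes elapsed since the first entry."""
    first_stamp = timeline[0][0]
    return [(label, minutes_between(first_stamp, stamp)) for stamp, label in timeline]


if __name__ == "__main__":
    for label, minutes in elapsed_from_first(TIMELINE):
        print(f"+{minutes:3d} min  {label}")
```

Under these inputs, the span from the first alert (05:06) to confirmed restoration (07:15) works out to 129 minutes.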
