Mar 15, 23:26 UTC
Resolved - We have identified the source of the network bottleneck and addressed it. We will be conducting an RCA to better understand the failure case.
Mar 15, 23:16 UTC
Investigating - We are looking into reports of elevated latency for some indices in US-East.
Mar 29, 14:20 UTC
Resolved - The node has been repaired and is confirmed to be serving traffic normally.
Mar 29, 14:10 UTC
Identified - We have identified an unhealthy node in the region and are working to resolve it.
Mar 29, 14:00 UTC
Investigating - We are investigating reports of persistent HTTP 503 errors in the US-East region.
Apr 24, 18:54 UTC
Resolved - Service on the impacted server has been restored.
Apr 24, 18:45 UTC
Identified - We have been automatically paged to respond to a server issue in the US-East region. We expect to resolve the issue within the next several minutes.
Apr 27, 07:02 UTC
Resolved - The incident has been resolved.
Apr 27, 06:55 UTC
Identified - We have identified the issue and are working on a fix.
Apr 27, 06:30 UTC
Investigating - Our systems have detected an issue impacting about a dozen Cobalt and Staging indices in US-East, and we are investigating.
May 5, 09:40 UTC
Resolved - This incident has been resolved.
May 5, 08:36 UTC
Update - Starting at 07:29 UTC, a server failure caused increased 503 errors for roughly 3% of indices in the US-East region. The issue was detected and a fix was implemented at 08:04 UTC. Index traffic is now stable, and follow-up maintenance is being performed on the affected indices.
May 5, 08:04 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
May 5, 07:52 UTC
Investigating - We are currently investigating this issue.
May 23, 18:12 UTC
Resolved - This issue has been resolved.
May 23, 17:36 UTC
Investigating - We are currently investigating this issue.
Aug 3, 00:45 UTC
Resolved - This incident has been resolved.
Aug 3, 00:07 UTC
Update - From AWS:
4:58 PM PDT We can confirm that some instances are unreachable and some EBS volumes are experiencing degraded performance in a single Availability Zone in the US-EAST-1 Region. Engineers are engaged and we are working to resolve the issue.
5:05 PM PDT We have identified the root cause and are beginning to see recovery for instances and EBS volumes in the affected Availability Zone in the US-EAST-1 Region. We continue to work toward full resolution.
Aug 2, 23:53 UTC
Monitoring - We're detecting a network connectivity event affecting a single Availability Zone in our AWS Virginia region. General system redundancy is operating as designed, with no impact to customer traffic at this time. However, we are standing by to intervene if necessary. https://status.aws.amazon.com/
Aug 29, 21:54 UTC
Resolved - This incident has been resolved.
Aug 29, 19:45 UTC
Identified - We are observing an elevated error rate in US-East. The regression is limited in scope, affecting <0.1% of all requests. We have identified a root cause and are working on a fix.
Sep 1, 09:15 UTC
Resolved - Service has been restored.
Oct 2, 18:33 UTC
Resolved - We have identified and fixed the problem.