Apr 7, 23:26 EDT
Resolved - Our primary message bus is still undergoing maintenance, and we have implemented all the mitigations we can without making larger changes. Given that, and because we have not seen a disruption that would cause out-of-order processing within the last hour, we've decided to close this incident.
Apr 7, 21:02 EDT
Monitoring - We are continuing to monitor the situation as our upstream provider's maintenance on our primary message bus continues. We're currently discussing options for a longer-term fix, but we expect that fully solving the issue will take longer than the maintenance itself.
Apr 7, 20:12 EDT
Identified - We have received reports of false alerts being sent for some short-interval Snitches. The issue is related to maintenance that an upstream provider is performing on our main message bus service, which is causing some check-ins to be processed out of order. We've taken an initial step to mitigate the issue but have more work to do to fully address it.
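To illustrate why out-of-order processing can produce false alerts, and one common way to guard against it, here is a minimal Python sketch. It is not our actual implementation: the `SnitchState` class, its fields, and the "keep only the newest timestamp" rule are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


class SnitchState:
    """Hypothetical per-Snitch state (illustrative only)."""

    def __init__(self, interval):
        self.interval = interval      # expected time between check-ins
        self.last_check_in = None     # newest check-in timestamp seen so far

    def record_check_in(self, received_at):
        """Apply a check-in, ignoring ones older than what we already have.

        Returns True if the stored timestamp advanced, False if the
        check-in arrived out of order and was ignored.
        """
        if self.last_check_in is None or received_at > self.last_check_in:
            self.last_check_in = received_at
            return True
        return False

    def is_overdue(self, now):
        """A Snitch is overdue once more than one interval has passed."""
        if self.last_check_in is None:
            return False
        return now - self.last_check_in > self.interval


# Arbitrary example timestamps.
t0 = datetime(2000, 1, 1, 12, 0, tzinfo=timezone.utc)
snitch = SnitchState(interval=timedelta(minutes=1))
snitch.record_check_in(t0 + timedelta(seconds=50))  # newer check-in arrives first
snitch.record_check_in(t0)                          # older one arrives late, ignored
```

Without the timestamp comparison, the late-arriving older check-in would overwrite the newer one, and a short-interval Snitch could look overdue even though it had checked in on time.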
Jan 28, 10:39 EST
Resolved - Alerts and emails are once again being sent in a timely manner. We've identified the root cause as a job with a missing timeout, which caused problems when an upstream connection issue occurred. This highlighted that the monitoring we have in place for our worker system needs to be reviewed, as we didn't receive an alert that there was an issue.
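As a rough illustration of the kind of safeguard a missing timeout calls for, the Python sketch below bounds how long a worker waits on a single job. It is not our actual fix: `run_with_timeout` and `slow_push` are hypothetical names, and a thread pool is only one possible approach.

```python
import concurrent.futures
import time


def run_with_timeout(job, timeout_seconds):
    """Run job() in a helper thread and stop waiting after timeout_seconds.

    Returns ("ok", result) or ("timed_out", None). The helper thread is not
    killed on timeout; it is only abandoned so the caller can move on.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(job)
    try:
        return ("ok", future.result(timeout=timeout_seconds))
    except concurrent.futures.TimeoutError:
        return ("timed_out", None)
    finally:
        # Do not block here waiting for a job that may still be running.
        executor.shutdown(wait=False)


def slow_push():
    """Stand-in for a job stuck on a slow upstream connection."""
    time.sleep(0.5)
    return "sent"


status, _ = run_with_timeout(slow_push, timeout_seconds=0.05)
```

Because the abandoned thread keeps running, setting a timeout on the underlying network call itself is usually the more robust option; a wrapper like this mainly keeps one stuck job from holding up the rest of the queue.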
Jan 28, 09:57 EST
Monitoring - We've deployed a fix for the issue and our worker system has worked through all of the queued jobs. We're monitoring to make sure the system is stable and will be reviewing steps to avoid this in the future.
Jan 28, 09:48 EST
Update - We believe we've identified the issue which has been causing one job (sending iOS push notifications) to block our queue. We're working to deploy a fix now.
Jan 28, 09:39 EST
Identified - The system we use to send alerts and emails is currently backlogged due to an upstream service issue. We're
Dec 8, 21:04 EST
Resolved - This incident has been resolved.
Dec 8, 19:28 EST
Monitoring - DNS update is propagating and the site is coming back online.
Dec 8, 19:21 EST
Identified - We have traced the problem to the DNS record for deadmanssnitch.com and are working on a fix.
Dec 8, 19:14 EST
Investigating - We are investigating why deadmanssnitch.com is not responding.
Feb 28, 20:15 EST
Resolved - Both Amazon S3 and our email check-in volume have returned to normal. Thank you for your patience.
Feb 28, 15:39 EST
Identified - The major outage at Amazon S3 has also impacted the Simple Email Service (Amazon SES) in US-EAST-1, which we use for receiving emails. We have migrated email receiving to US-WEST-2 to work around the issue. This required a DNS change, which may take some time to propagate to everyone.
Thank you for your patience as we work around this issue. I'm sorry for the confusion that false alerts can cause, especially as many of you are dealing with Amazon's issues as well.
Feb 28, 14:59 EST
Investigating - We have been experiencing issues receiving check-ins via email since approximately 17:30 UTC. We are investigating the issue but believe it is related to the ongoing outage with Amazon S3.
Jun 5, 10:25 EDT
Completed - The scheduled maintenance has been completed.
Jun 5, 10:16 EDT
Verifying - We're verifying that everything completed successfully.
Jun 5, 10:05 EDT
Update - Beginning our scheduled website maintenance.
Jun 5, 10:00 EDT
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Jun 5, 08:29 EDT
Scheduled - Our website will be undergoing planned maintenance on Monday, June 5 between 10:05 and 11:05 a.m. EDT. deadmanssnitch.com and alerts may be delayed. Snitch check-ins will not be affected.