Jul 19, 10:28 UTC
Resolved - The issue has been resolved.
Jul 19, 10:23 UTC
Investigating - We are deploying a fix that should address the issue in the next 30 min.
Jan 24, 09:20 UTC
Resolved - We are still receiving a high load of requests, but we are now fulfilling them thanks to increased capacity, so this issue is being marked resolved.
Requests for DataCite DOIs should be unaffected going forward; requests for Crossref DOIs made directly against DataCite Content Negotiation may occasionally return 404s when the service is unable to process them.
Where possible, it is preferred to use Content Negotiation via doi.org, which will redirect you to the appropriate registration agency's content negotiation service.
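For illustration, a minimal sketch of this recommended pattern in Python (using the requests library) is shown below; the DOI and media type are placeholder examples and are not taken from this incident.

```python
# Minimal sketch: request DOI metadata via content negotiation against doi.org,
# letting doi.org redirect to the relevant registration agency's service.
import requests

doi = "10.1234/example"  # placeholder DOI, for illustration only
response = requests.get(
    f"https://doi.org/{doi}",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},  # CSL JSON metadata
    allow_redirects=True,  # doi.org redirects to the registration agency's endpoint
    timeout=30,
)
response.raise_for_status()
print(response.json().get("title"))
```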
Jan 19, 18:56 UTC
Monitoring - The service appears stable and we are processing requests. We are now monitoring the changes we've made.
As mentioned in the previous update, please note that requests for Crossref DOIs via DataCite Content Negotiation may still return a 404; this is expected behaviour at present.
Jan 19, 15:06 UTC
Identified - We have identified a large number of requests specifically attempting to resolve Crossref DOIs via the DataCite Content Negotiation service, and we are having ongoing conversations about this use case.
To remedy this, we have implemented timeout logic and additional rate limiting. This may have a knock-on effect: those attempting Content Negotiation for Crossref DOIs via DataCite Content Negotiation may receive a 404 (Crossref Content Negotiation, or going via doi.org, is unaffected).
Requests for DataCite DOIs via Content Negotiation should now start resolving.
Jan 11, 11:00 UTC
Investigating - During the past week or so we've noticed a large surge in traffic to the Content Negotiation service. This has unfortunately caused various problems with requests not being resolved, affecting both the Content Negotiation service and the Citation Formatter service.
We've made some initial improvements to support the increased load on the service, but it is unfortunately still causing issues.
We will continue to investigate.
Jan 25, 18:34 UTC
Resolved - Due to a deployment problem, Search and the OAI-PMH service could not be redeployed into our infrastructure after they were routinely cycled. They have now been upgraded to use our new standard workflow and have deployed successfully.
Jan 25, 17:05 UTC
Update - search.datacite.org is also affected. commons.datacite.org is not affected; please use it for any front-end search queries.
Jan 25, 16:57 UTC
Investigating - The OAI-PMH service is currently down; we are investigating the cause.
Sep 9, 20:38 UTC
Resolved - This incident has been resolved. We will continue to monitor the MDS and REST APIs for performance issues.
Sep 9, 18:37 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Sep 9, 18:00 UTC
Investigating - We are currently investigating an issue that may be causing degraded performance for the MDS and REST APIs in our test and production systems.
Dec 2, 10:38 UTC
Resolved - We believe the performance issues were due to increased load. We have increased our capacity and are still investigating further options. We hope the changes made yesterday have restored stability, but we will continue monitoring.
Dec 1, 14:12 UTC
Update - There are still some reported issues with the MDS API; we are investigating and continuing to monitor for any further issues.
Dec 1, 12:56 UTC
Monitoring - A fix has been deployed. There was an issue with automatic scaling of servers; this has now been corrected, and the service should be operating as normal.
We will continue to monitor.
Dec 1, 11:06 UTC
Investigating - We are currently investigating this issue.
Feb 24, 00:39 UTC
Resolved - This incident has been resolved.
Feb 24, 00:24 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Feb 24, 00:13 UTC
Identified - The issue has been identified and a fix is being implemented.
Feb 23, 22:20 UTC
Investigating - We are currently investigating errors in Commons. Other services are not affected.
Mar 24, 09:56 UTC
Resolved - The service has appeared stable for a few hours now. This incident is resolved, but we will continue monitoring.
Mar 23, 20:06 UTC
Monitoring - We have identified that our services are experiencing high load.
The load is primarily affecting MDS; however, the REST API for DOI registrations may also be affected, and consequently our frontend application Fabrica is potentially affected.
We have already compensated with a higher number of servers running in our production cluster and are now monitoring the situation.
Apr 5, 08:17 UTC
Resolved - All services appear normal.
Apr 4, 15:47 UTC
Monitoring - The initial capacity increases look to have provided the needed support for services. There may be some slowness in the indexing of DOIs, which may take longer than the usual few minutes, as a backlog queue is currently being processed after the load spike. We are monitoring for further problems.
Apr 4, 15:13 UTC
Investigating - Load has been increasing over the past couple of hours. Some automatic and manual mitigation has already taken place, but we are investigating further.
Jun 2, 11:50 UTC
Resolved - All services appear normal now, so we are closing this incident. Services remain marginally scaled up to prevent further issues.
Jun 2, 07:59 UTC
Monitoring - We have increased our capacity to cope with the increased load on our services. We are still monitoring at this stage, but the affected systems appear operational.
Jun 1, 15:46 UTC
Investigating - We have received reports, and our internal monitoring shows errors, when attempting to register DOIs. This does not appear to affect all registrations; rather, we are seeing intermittent performance issues. We are currently investigating.
Jun 6, 23:49 UTC
Resolved - All services appear normal now, so we are closing this incident. Services remain marginally scaled up to prevent further issues.
Jun 6, 20:27 UTC
Monitoring - Services have been scaled to cope with the load; we are monitoring the situation.
Jun 6, 19:23 UTC
Investigating - We are investigating an issue with slow response times for DOI registration. This appears to be related to increased service load, and we are scaling services to compensate.