Multiple sites showing down/under maintenance
Incident Report for Opsgenie
Postmortem

Earlier this month, several hundred Atlassian customers were impacted by a site outage. We have published a Post-Incident Review which includes a technical deep dive on what happened, details on how we restored customers’ sites, and the immediate actions we’ve taken to improve our operations and approach to incident management.

https://www.atlassian.com/engineering/post-incident-review-april-2022-outage

Posted Apr 29, 2022 - 20:30 UTC

Resolved
We have restored impacted Opsgenie customer sites and the service is operating normally.

If you need assistance, please reply to your support ticket so that our engineers can work with you. If you have any trouble accessing your support ticket, contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).

Our teams will be working on a detailed Post Incident Report to share publicly by the end of April.
Posted Apr 17, 2022 - 21:46 UTC
Monitoring
We have restored impacted Opsgenie customer sites and the service is operating normally.

If you need assistance, please reply to your support ticket so that our engineers can work with you. If you have any trouble accessing your support ticket, contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
Posted Apr 17, 2022 - 19:10 UTC
Update
We have now restored 99% of users impacted by the outage and have reached out to all affected customers.
Our teams are available to help customers with any concerns. If you need assistance, please reply to your support ticket so that our engineers can work with you.
If you have any trouble accessing your support ticket, contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
Posted Apr 17, 2022 - 04:19 UTC
Update
We have now restored 85% of users impacted by the outage and will continue to get sites back to customers for validation over the weekend.
As we hand your restored site over to you for validation, please reach out to our teams should you find any issues so that our support engineers can work to get you fully operational.

You can contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
Posted Apr 16, 2022 - 20:08 UTC
Update
We have now restored 78% of users impacted by the outage as we continue to move with more speed and accuracy. Our teams will continue to restore sites through the weekend, and we expect to have all sites restored no later than end of day Tuesday, April 19th PT. As we restore your site and hand it over to you for validation, please reach out to our teams should you find any issues so that our support engineers can work to get you fully restored.
You can contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
Posted Apr 16, 2022 - 01:42 UTC
Update
We have made significant progress over the last 24 hours and have now restored functionality for 62% of users impacted by the outage.

We have also doubled the size of the batches we are pushing through the restoration process, a result of optimizing our automated processes and accelerating our restoration speed. Our global engineering teams continue to work 24/7, and we expect to progress quickly through technical restoration of remaining customer sites over the weekend.

If you do not have access to your open ticket, please contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
Posted Apr 15, 2022 - 20:08 UTC
Update
We have now restored functionality for 55% of users impacted by the outage.
With automation in full effect, we have significantly increased the pace at which we are conducting technical restoration of affected customer sites, and we have reduced the time required for the validation of restored sites by half.
If you are still experiencing an outage and do not have access to your open ticket, please contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
Posted Apr 15, 2022 - 02:24 UTC
Update
We have now restored functionality for 53% of users impacted by the outage.

As outlined in yesterday’s update, we are restoring affected customers using a three-step process:

1. Technical restoration of affected sites
2. Internal validation of restored sites
3. Validating with affected customers before enabling their users

By automating some of our validation steps, we have now reduced the time for internal validation of restored sites by half, which allows our support engineers to engage restored customers more quickly for validation and full site handover.

If you are still experiencing an outage and do not have access to your open ticket, please contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
Posted Apr 14, 2022 - 20:03 UTC
Update
We have restored functionality for 49% of users impacted by the outage.
We are taking a batch-based approach to restoring customers, and to date, this process has been semi-automated. We are beginning to shift towards a more automated process to restore sites. That said, there are still a number of steps required before we hand a site to customers for review and acceptance.
We are restoring affected customers in groups of up to 60 at a time, with groups selected by a mix of variables including site size, complexity, edition, tenure, and several other factors. The full restoration process involves our engineering teams, our customer support teams, and the customer, and has three steps:
1. Technical restoration involving metadata recovery, data restores across a number of services, and ensuring the data across the different systems is working correctly for product and ecosystem apps
2. Verification of site functionality to ensure the technical restoration has worked as expected
3. Lastly, working directly with the affected customer to enable them to verify their data and functionality before enabling for their users
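As an illustration only, the batch-based flow described above could be sketched as follows. This is not our actual restoration tooling: the `Site` type, the `priority` score, the step names, and the `run_step` callback are all hypothetical, and the only figures taken from this update are the batch size of up to 60 and the three steps.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List

BATCH_SIZE = 60  # "groups of up to 60 at a time"

# The three steps listed above, under hypothetical names.
STEPS = ("technical_restoration", "internal_verification", "customer_validation")


@dataclass
class Site:
    name: str
    priority: int  # hypothetical score; lower values are restored first
    completed_steps: List[str] = field(default_factory=list)


def batches(sites: List[Site], size: int = BATCH_SIZE) -> Iterator[List[Site]]:
    """Yield sites in priority order, in groups of at most `size`."""
    ordered = sorted(sites, key=lambda s: s.priority)
    for start in range(0, len(ordered), size):
        yield ordered[start:start + size]


def restore(sites: List[Site], run_step: Callable[[Site, str], bool]) -> List[Site]:
    """Run every step, in order, for each site in each batch.

    `run_step` is a hypothetical callback that returns True on success.
    A site is handed over only if all three steps succeed; a failed step
    stops further work on that site so it can be followed up manually.
    """
    handed_over = []
    for batch in batches(sites):
        for site in batch:
            for step in STEPS:
                if not run_step(site, step):
                    break
                site.completed_steps.append(step)
            else:
                handed_over.append(site)
    return handed_over
```

In this sketch a site that fails internal verification never reaches customer validation, which mirrors the ordering of the steps: each one gates the next.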
We have also contacted all customers who are *up next* for step 3 in the site restoration process described above. These customers are aware that they are next in queue through their support ticket and/or via a support engineer.
We have proactively reached out to technical contacts and system admins at all impacted customers, and opened support tickets for each of them. However, we learned that some customers have not yet heard from us or engaged with our support team. If you are experiencing an outage and do not have access to your open ticket, please contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
For more information from our engineering team, please read our update from our CTO, Sri Viswanath: https://www.atlassian.com/engineering/april-2022-outage-update
Posted Apr 14, 2022 - 04:13 UTC
Update
The team is continuing the restoration process for the ~400 impacted customers. We have restored functionality for 45% of impacted users. As a reminder, this incident is not the result of a cyberattack, and most of our restored customers have not seen any data loss. You can read a more detailed technical overview of the situation directly from our CTO, Sri Viswanath, on our Engineering blog: https://www.atlassian.com/engineering/april-2022-outage-update
Posted Apr 13, 2022 - 03:37 UTC
Update
The team is moving through the restoration process this week and is accelerating toward recovery. Functionality for 40% of impacted users has been restored.
Posted Apr 12, 2022 - 09:36 UTC
Update
A small number of Atlassian customers continue to experience service outages and are unable to access their sites. Our global engineering teams are working 24/7 to make progress on this incident. At this time, we have rebuilt functionality for over 35% of the users who are impacted by the service outage, with no reported data loss. The rebuild stage is particularly complex due to several steps that are required to validate sites and verify data. These steps require extra time, but are critical to ensuring the integrity of rebuilt sites. We apologize for the length and severity of this incident and have taken steps to avoid a recurrence in the future.
Posted Apr 11, 2022 - 15:35 UTC
Update
A small number of Atlassian customers continue to experience service outages and are unable to access their sites. Our global engineering teams are working 24/7 to make progress on this incident. At this time, we have rebuilt functionality for over 35% of the users who are impacted by the service outage, with no reported data loss. The rebuild stage is particularly complex due to several steps that are required to validate sites and verify data. These steps require extra time, but are critical to ensuring the integrity of rebuilt sites. We apologize for the length and severity of this incident and have taken steps to avoid a recurrence in the future.
Posted Apr 11, 2022 - 12:11 UTC
Update
A small number of Atlassian customers continue to experience service outages and are unable to access their sites. Our global engineering teams are working 24/7 to make progress on this incident. At this time, we have rebuilt functionality for over 35% of the users who are impacted by the service outage, with no reported data loss. The rebuild stage is particularly complex due to several steps that are required to validate sites and verify data. These steps require extra time, but are critical to ensuring the integrity of rebuilt sites. We apologize for the length and severity of this incident and have taken steps to avoid a recurrence in the future.
Posted Apr 11, 2022 - 08:28 UTC
Update
A dedicated team continues to work 24/7 to expedite service recovery. Restoration of all customers remains our top priority. We hear and appreciate all the feedback from our valued customers and are taking every necessary step to both restore full service and ensure site integrity as soon as possible.
Posted Apr 11, 2022 - 05:01 UTC
Update
A dedicated team continues to work 24/7 to expedite service recovery. Restoration of all customers remains our top priority. We hear and appreciate all the feedback from our valued customers and are taking every necessary step to both restore full service and ensure site integrity as soon as possible.
Posted Apr 11, 2022 - 02:00 UTC
Update
A dedicated team continues to work 24/7 to expedite service recovery. Restoration of all customers remains our top priority. We hear and appreciate all the feedback from our valued customers and are taking every necessary step to both restore full service and ensure site integrity as soon as possible.
Posted Apr 10, 2022 - 22:49 UTC
Update
We are still working 24/7 to restore service to affected customers. We have restored partial access for some customers and will be continuing to restore access into next week.
Posted Apr 10, 2022 - 20:21 UTC
Update
We are still working 24/7 to restore service to affected customers. We have restored partial access for some customers and will be continuing to restore access into next week.
Posted Apr 10, 2022 - 17:34 UTC
Update
We continue to work 24/7 to restore service to affected customers. We have restored partial access for some customers and will be continuing to restore access into next week.
Posted Apr 10, 2022 - 13:13 UTC
Update
Our teams are committed to restoring each customer’s service as soon as possible and are working through the weekend toward recovery.
Posted Apr 10, 2022 - 09:36 UTC
Update
Our teams are committed to restoring each customer’s service as soon as possible and are working through the weekend toward recovery.
Posted Apr 10, 2022 - 06:24 UTC
Update
Our teams are committed to restoring each customer’s service as soon as possible and are working through the weekend toward recovery.
Posted Apr 10, 2022 - 03:17 UTC
Update
The restoration process is underway. At this time we have no new significant updates, but the team continues to work around the clock to bring our customers back online.
Posted Apr 09, 2022 - 23:08 UTC
Update
The restoration process is underway. At this time we have no new significant updates, but the team continues to work around the clock to bring our customers back online.
Posted Apr 09, 2022 - 20:45 UTC
Update
The restoration process is underway. At this time we have no new significant updates, but the team continues to work around the clock to bring our customers back online.
Posted Apr 09, 2022 - 17:47 UTC
Update
Our team is working 24/7 to progress through site restoration work. Core functionality has been restored across a number of sites. We are continuously improving the process with the aim of accelerating the restoration process from here.
Posted Apr 09, 2022 - 14:33 UTC
Update
Our team is working 24/7 to progress through site restoration work. Core functionality has been restored across a number of sites. We are continuously improving the process with the aim of accelerating the restoration process from here.
Posted Apr 09, 2022 - 10:47 UTC
Update
The team is continuing the restoration process through the weekend and working toward recovery. We are continuously improving the process based on customer feedback and applying those learnings as we bring more customers online.
Posted Apr 09, 2022 - 06:34 UTC
Update
The team is continuing the restoration process through the weekend and working toward recovery. We are continuously improving the process based on customer feedback and applying those learnings as we bring more customers online.
Posted Apr 09, 2022 - 03:33 UTC
Update
The team is continuing the restoration process through the weekend and working toward recovery. We are continuously improving the process based on customer feedback and applying those learnings as we bring more customers online.
Posted Apr 09, 2022 - 00:28 UTC
Update
Work to restore sites is underway and will continue into the weekend. We are taking a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of these site restorations.
Posted Apr 08, 2022 - 20:07 UTC
Update
Work to restore sites is underway and will continue into the weekend. We are taking a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of these site restorations.
Posted Apr 08, 2022 - 17:05 UTC
Update
We have started successfully restoring sites and continue to work on restoration to a wider cohort of customers. We are taking a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of these site restorations.
Posted Apr 08, 2022 - 14:01 UTC
Update
We have started successfully restoring sites and continue to work on restoration to a wider cohort of customers. We are taking a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of this first round of restorations. This plan remains the same as in our last update.
Posted Apr 08, 2022 - 10:59 UTC
Update
We have started successfully restoring sites and continue to work on restoration to a wider cohort of customers. We are taking a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of this first round of restorations. This plan remains the same as in our last update.
Posted Apr 08, 2022 - 07:38 UTC
Update
We continue to work on partial restoration to a cohort of customers. The plan to take a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of this first round of restorations remains the same from our last update.
Posted Apr 08, 2022 - 04:34 UTC
Update
We continue to work on partial restoration to a cohort of customers. The plan to take a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of this first round of restorations remains the same from our last update.
Posted Apr 08, 2022 - 01:12 UTC
Update
We continue to work on partial restoration to the first cohort of customers. The plan to take a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of this first round of restorations remains the same from our last update.
Posted Apr 07, 2022 - 21:41 UTC
Update
We are beginning partial restoration to a cohort of customers. The early stages of this process will be controlled and hands-on, as we work with customers live to get feedback and ensure that restoration is working well before we accelerate the process for the next cohort. We will continue to post updates here as we move along this process.
Posted Apr 07, 2022 - 18:05 UTC
Update
We are continuing work in the verification stage on a subset of instances. Once sites are reenabled, support will update accounts via opened incident tickets. Restoration of customer sites remains our first priority and we are coordinating with teams globally to ensure that work continues 24/7 until all instances are restored.
Posted Apr 07, 2022 - 12:28 UTC
Update
We are continuing work in the verification stage on a subset of instances. Once sites are reenabled, support will update accounts via opened incident tickets. Restoration of customer sites remains our first priority and we are coordinating with teams globally to ensure that work continues 24/7 until all instances are restored.
Posted Apr 07, 2022 - 09:36 UTC
Update
We are continuing work in the verification stage on a subset of instances. Once sites are reenabled, support will update accounts via opened incident tickets. Restoration of customer sites remains our first priority and we are coordinating with teams globally to ensure that work continues 24/7 until all instances are restored.
Posted Apr 07, 2022 - 04:58 UTC
Update
We are continuing to move through the various stages for restoration. The team is currently in the verification stage on a subset of instances. Successful verification will then allow us to move to reenabling those sites. Once reenabled, support will update accounts via opened incident tickets. Our efforts will continue 24x7 through this process until all instances are restored.
Posted Apr 06, 2022 - 22:14 UTC
Update
We are continuing to work on the resolution of the incident for some Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud customers. We will provide updates every 3 hours.
Posted Apr 06, 2022 - 18:51 UTC
Update
We continue to work on the resolution of the incident for a number of our Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud customers. We can confirm this is not impacting all customers, but it remains a high priority for Atlassian as our dedicated team of SMEs works 24/7 to restore the sites as soon as possible. We will provide more detail as we progress through resolution.
Posted Apr 06, 2022 - 12:06 UTC
Update
We continue to work on the resolution of the incident for a number of our Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud customers. This continues to be a high priority for Atlassian, and while we have made progress, these applications still remain unavailable for some customers. We will provide more detail as we progress through resolution.
Posted Apr 06, 2022 - 06:15 UTC
Update
We continue to work through the defined processes toward resolution of the issues impacting some customers of: Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud. We are progressing through the stages defined and will continue to update this Statuspage as further details become available. We will provide more detail as we progress through resolution.
Posted Apr 06, 2022 - 00:10 UTC
Update
We continue to work through the defined processes toward resolution of the issues impacting some customers of: Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud. We are progressing through the stages defined and will update Statuspage again in one hour. We will provide more detail as we progress through resolution.
Posted Apr 05, 2022 - 20:56 UTC
Update
We have defined two processes for resolving the issues impacting some customers of: Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud. These processes each involve multiple stages of work. We are currently working on the processes and will update Statuspage again in one hour. We will provide more detail as we progress through resolution.
Posted Apr 05, 2022 - 18:43 UTC
Update
We continue to work on issues with multiple instances showing as under maintenance, impacting some Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud customers. We have identified the root cause and have a two-pronged strategy. We are currently working on manual restoration for a handful of tenants.
Posted Apr 05, 2022 - 15:12 UTC
Identified
We continue to work on issues with multiple instances showing as under maintenance, impacting some Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, and Opsgenie Cloud customers. We have identified the root cause and are planning mitigation steps.
Posted Apr 05, 2022 - 11:12 UTC
Investigating
We are investigating an issue with multiple Cloud instances showing as under maintenance that is impacting some Jira Work Management, Confluence, Jira Service Management, Jira Software, Atlassian Access, and Opsgenie Cloud customers. We will provide more details within the next hour.
Posted Apr 05, 2022 - 09:44 UTC