On April 4, 2023, between 13:32 and 14:50 UTC, Atlassian customers using Opsgenie experienced significant delays in receiving Android push notifications.
The delays were caused by an incident in a third-party messaging service that is responsible for Android push notification delivery, which in turn affected our systems.
The incident was immediately detected by our monitoring tools, our on-call engineers were paged, and at 14:50 UTC our systems recovered successfully.
The total time to resolution was about 80 minutes.
IMPACT
Opsgenie was impacted on April 4, 2023, between 13:32 and 14:50 UTC. The incident resulted only in delays to Android push notifications. These notifications were delivered successfully after the FCM (Firebase Cloud Messaging) service was restored, and no data was lost.
ROOT CAUSE
The issue was caused by an incident in a third-party messaging service that is responsible for delivering push notifications to Android devices.
REMEDIAL ACTIONS PLAN & NEXT STEPS
We know that outages impact your productivity. Our monitoring tools caught the impact immediately, and the responsible team began analyzing the incident right away. We value transparency with our customers and will continue to notify you and take any necessary actions promptly during an incident.
To handle degradation or an outage of a messaging channel, Opsgenie recommends that users configure multiple notification delivery channels, including push notifications, SMS, phone calls, and email.
To improve our response in the future, we will also analyze whether autoscaling can help our systems cope with an outage or high load affecting a single notification channel.
We apologize to customers whose services were impacted during this incident.
Thanks,
Atlassian Customer Support