Start time: 19 Jan 16:30 UTC
End time: 19 Jan 17:44 UTC
Following an unprecedented surge in traffic, our team identified an issue affecting one of the services in our application infrastructure. The incident caused verification processing to be held for approximately 50 minutes. All verifications pending during this time were automatically queued for later processing. After identifying and fixing the underlying issue, our team confirmed that all queued operations completed successfully, with no impact on users and no action required from our clients.
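For illustration only, here is a minimal sketch of the queue-and-replay behavior described above. All names in it (pending_queue, process_verification, submit) are assumptions made for this example, not our actual service code.

import queue

# Minimal sketch of the queue-and-replay pattern described above.
# Names are illustrative assumptions, not production code.

pending_queue = queue.Queue()

def process_verification(request):
    """Placeholder for the downstream verification call."""
    print(f"processed {request}")

def submit(request, service_healthy):
    if service_healthy:
        process_verification(request)
    else:
        # Service degraded: hold the verification rather than failing it.
        pending_queue.put(request)

def drain_after_recovery():
    # Once the underlying issue is fixed, replay everything that was held,
    # so no client action is required.
    while not pending_queue.empty():
        process_verification(pending_queue.get())

In this pattern, requests received during the outage are never rejected; they are buffered and replayed once the service recovers, which is why clients saw delayed rather than failed verifications.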
The incident was primarily caused by temporary downtime in an isolated service cluster, compounded by complications in our failover process. Under peak load, excessive resource usage led to node instability, and the failover mechanism did not operate as intended, which delayed the restoration of normal service.
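As a hedged sketch of this failure mode: a failover check based on a single resource probe can lag behind real instability when load oscillates around its threshold. The threshold, probe interval, and function names below are assumptions for this example, not our actual configuration.

import random
import time

# Assumed values for illustration only, not our real settings.
CPU_FAILOVER_THRESHOLD = 0.95  # assumed utilization threshold
PROBE_INTERVAL_SECONDS = 30    # assumed probe cadence

def node_cpu_usage(node):
    """Placeholder health probe returning CPU utilization in [0, 1]."""
    return random.uniform(0.85, 1.0)  # stub: load oscillating near the threshold

def monitor(node, promote_standby):
    # A single-probe check like this can repeatedly sample just below the
    # threshold during a traffic surge, so failover may trigger late even
    # while the node is effectively unstable.
    while node_cpu_usage(node) <= CPU_FAILOVER_THRESHOLD:
        time.sleep(PROBE_INTERVAL_SECONDS)
    promote_standby()

A common hardening for this pattern is to require several consecutive unhealthy probes, or to track a rolling average, before promoting a standby node.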
Our team has planned several improvements to the components involved, including hardening the failover mechanism and improving resource management in the affected service cluster under peak load.
We sincerely apologize for any inconvenience this incident may have caused. Ensuring the reliability and stability of our systems is our highest priority, and we are committed to learning from this event. The changes and improvements outlined above will strengthen our infrastructure and reduce the likelihood of similar incidents in the future.
Thank you for your understanding and continued trust in our team and product.
If you have any questions or concerns, please don’t hesitate to contact our Support team.