How LoginRadius Stayed Up During the AWS US-EAST-1 Outage

Last month, AWS had a major service disruption in its US-EAST-1 (N. Virginia) region. The impact was felt across the internet, with businesses facing everything from latency spikes to full downtime.

Meanwhile, LoginRadius stayed completely operational. No downtime. No degraded performance. No surprises for our customers.

For us, this wasn’t luck. It was our reliability-by-design architecture doing exactly what it was built to do.

Context on What Happened at AWS

The incident started late on October 19, 2025 (11:49 PM PDT) and stretched well into the next day. AWS reported increased error rates across several services in the US-EAST-1 region, which they traced back to DNS resolution failures for regional DynamoDB endpoints.

This issue cascaded through other AWS systems. AWS mitigated the core DNS problem by 2:24 AM PDT on October 20, but full recovery wasn’t confirmed until 3:01 PM PDT. That is roughly 15 hours of intermittent or limited service for the largest AWS region.

Resilient by Design

While the cloud ecosystem was scrambling back to normal, the LoginRadius platform kept running smoothly. Our monitoring showed no anomalies, no latency spikes, and no API errors across any of our environments.

Key outcomes during the outage:

Zero Downtime: Customer-facing APIs and services stayed online throughout.
Consistent Performance: We maintained normal API traffic and delivered 500 ms response times on 100% of requests, with no latency spikes.
Data Integrity: No data loss, no consistency issues.
Automated Validation: Health checks across all regions confirmed that failover mechanisms were ready, though not needed, thanks to efficient automated routing.

How We Stayed Resilient: The Architecture

Our ability to absorb the US-EAST-1 disruption wasn’t reactive. It was the result of deliberate architectural decisions. LoginRadius runs an active-active, multi-region infrastructure built on four core layers of defense:

1. Global Traffic Management with Two-Tier Failover

Every request passes through two layers of routing to ensure:

resilience
low latency
seamless failover

Our first layer of defense reroutes traffic away from unhealthy regions. If more granular failover is needed, a second layer of application-level routing kicks in. Combined, this ensures every request has a healthy path.
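
The two-tier idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LoginRadius’s actual routing code: the `Region` class, the `pick_region` function, and the region names in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    healthy: bool      # hypothetical result of region-level health checks (tier 1)
    app_healthy: bool  # hypothetical result of application-level checks (tier 2)


def pick_region(regions: list[Region]) -> Region:
    """Return the first region, in preference order, that passes both tiers."""
    # Tier 1: drop regions that fail region-level health checks.
    tier1 = [r for r in regions if r.healthy]
    # Tier 2: among the survivors, drop regions whose application is unhealthy.
    tier2 = [r for r in tier1 if r.app_healthy]
    if not tier2:
        raise RuntimeError("no healthy region available")
    return tier2[0]
```

For example, given three regions listed in preference order where the first fails tier 1 and the second fails tier 2, `pick_region` returns the third. A real deployment would typically run tier 1 at the DNS or global load balancer level rather than in application code.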

2. Multi-Region Compute Clusters

Our stateless microservice architecture runs across multiple independent AWS regions. Because our services are stateless, they can scale up or down on demand. When the primary EKS cluster was impacted, the multi-tier routing kicked in and redirected traffic to the secondary cluster, ensuring continuous service.

3. Distributed Data Layer

We have active-active replication so data stays synchronized in real time across regions. This redundancy ensured complete consistency and zero data loss, even as one region experienced issues.

4. Continuous, Multi-Layered Monitoring

Visibility is foundational to resilience. Our observability stack brings together a real-time APM solution, a public endpoint health check tool, a dedicated logs and anomaly detection platform, and an alerting and escalation tool.
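
As a rough illustration of how endpoint health checks can feed alerting and escalation, the sketch below counts consecutive probe failures. The `HealthTracker` class, its state names, and its thresholds are hypothetical and are not taken from LoginRadius’s tooling.

```python
class HealthTracker:
    """Tracks consecutive probe failures for a single endpoint."""

    def __init__(self, alert_after: int = 3, escalate_after: int = 6):
        self.alert_after = alert_after        # assumed threshold for paging on-call
        self.escalate_after = escalate_after  # assumed threshold for escalation
        self.failures = 0

    def record(self, ok: bool) -> str:
        """Record one probe result; return 'ok', 'degraded', 'alert', or 'escalate'."""
        if ok:
            # Any successful probe resets the failure streak.
            self.failures = 0
            return "ok"
        self.failures += 1
        if self.failures >= self.escalate_after:
            return "escalate"
        if self.failures >= self.alert_after:
            return "alert"
        return "degraded"
```

Requiring several consecutive failures before alerting is a common way to avoid paging on a single dropped probe; the right thresholds depend on probe frequency and tolerance for noise.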

What This Means for Our Customers

This outage is a reminder of a simple truth: cloud incidents are inevitable. But your service doesn’t have to go down because of them.

LoginRadius is designed to handle failures beneath the surface through:

automated traffic steering
redundant compute across regions
distributed, consistent data layers
continuous global monitoring and failover automation

Your authentication flows, APIs, and hosted pages stay fast, secure, and uninterrupted—even when underlying infrastructure isn’t.

You can focus on growing your business knowing your identity platform is built on resilient engineering.

Looking Ahead: The Journey Continues

While this event was a tremendous validation of our engineering principles, resilience is a journey, not a milestone.

We’ll continue to:

expand disaster recovery playbooks with simulated regional failures
enhance observability so that health metrics and anomaly detection remain predictive, not reactive
strengthen multi-cloud readiness to provide even greater operational independence and flexibility

Our vision stays the same: to deliver a global identity platform that’s resilient by design, so your business stays online even when parts of the internet don’t.
