The Day It Happened

At 4:58 a.m. EDT on Friday, July 8, 2022, Rogers Communications — one of Canada's Big Three telecom providers — went dark. Not partially. Not in one region. The entire network, wireless and wireline, collapsed simultaneously. More than 12 million customers lost internet, cellular service, and everything that depended on them.

By mid-morning, the cascade had become national. Interac, the backbone of Canadian debit transactions, went offline — 18 million daily transactions, gone. Debit cards stopped working at grocery stores, gas stations, and restaurants from Vancouver to Halifax. ATMs went dark. Small business owners sat in empty shops because nobody carries cash anymore. A Toronto plant shop owner described it simply: "It pretty much stopped my business."

But it got worse. Calls to 911 failed in Toronto, Ottawa, and other cities. In Hamilton, a man's father couldn't call 911 when the man's aunt suffered an aneurysm in a parking lot; the father ran through the street looking for anyone with a working phone on a different carrier. Hospitals rescheduled radiation therapy appointments. Government services went down. Border crossings slowed. Flight bookings froze.

The outage lasted 26 hours. The economic cost was estimated at $142 million. And the root cause was a single configuration error during a routine maintenance window.

What Actually Broke

Rogers was in the sixth phase of a seven-phase IP core network upgrade when a staff member removed an Access Control List (ACL) policy filter from a distribution router configuration. The ACL was there for a reason: it controlled which routing information flowed through the network. Removing it unleashed a flood of IP routing data that exceeded the core routers' capacity, causing them to malfunction and triggering a cascading failure across the entire core network.
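
The mechanism is simple enough to model in a few lines. The sketch below uses made-up numbers throughout (Rogers has never published its routers' actual limits) to show why deleting the filter was fatal:

```python
# Toy model of the failure mode; every number here is made up for illustration.
FILTERED_ROUTES   = 10_000     # routes admitted with the policy filter in place
UNFILTERED_ROUTES = 900_000    # on the order of a full internet routing table
CORE_ROUTE_LIMIT  = 500_000    # hypothetical capacity of a core router

def core_survives(filter_installed: bool) -> bool:
    """The core stays up only while admitted routes fit within capacity."""
    admitted = FILTERED_ROUTES if filter_installed else UNFILTERED_ROUTES
    return admitted <= CORE_ROUTE_LIMIT

print(core_survives(filter_installed=True))   # True  -- normal operation
print(core_survives(filter_installed=False))  # False -- the July 8 change
```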

The critical detail: Rogers operates a converged network architecture. Wireless traffic and wireline traffic flow through the same IP core. This is efficient under normal conditions and catastrophic under failure conditions. When the core went down, it didn't take out internet or cellular — it took out both, simultaneously, nationwide. There was no failover because there was nothing to fail over to. The backup path and the primary path ran through the same infrastructure.

The CRTC later commissioned an independent assessment by Xona Partners, published in November 2024. The report found that Rogers lacked several protections and redundancies that could have prevented the outage or shortened its duration. The converged architecture was a textbook single point of failure — one that nobody had stress-tested against a core routing failure.
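
The converged-core problem is exactly the kind of thing a mechanical dependency check can surface before an outage does. Below is a minimal sketch, with a hypothetical topology rather than Rogers' actual design, that flags any node whose individual failure severs every path between an access network and the internet:

```python
from collections import deque

def reachable(graph, start, end, removed=None):
    """Breadth-first reachability test, optionally pretending one node failed."""
    if removed in (start, end):
        return False
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            return True
        for nxt in graph.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def single_points_of_failure(graph, start, end):
    """Nodes whose individual failure severs every start-to-end path."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    return sorted(n for n in nodes - {start, end}
                  if not reachable(graph, start, end, removed=n))

# Hypothetical converged topology: wireless and wireline share one IP core.
topology = {
    "wireless_ran": ["ip_core"],
    "wireline_cpe": ["ip_core"],
    "ip_core":      ["internet"],
}

print(single_points_of_failure(topology, "wireless_ran", "internet"))
# ['ip_core'] -- the "backup" path and the primary path are the same path
```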

The Interac Cascade: How a Telecom Outage Became a Payment Crisis

The Rogers outage by itself was serious. What made it a national emergency was what depended on Rogers without adequate redundancy: Interac.

Interac processes virtually every debit transaction in Canada. When Canadians tap their card at a terminal, the transaction routes through Interac's network to their bank and back. On July 8, that network was almost entirely dependent on Rogers for connectivity. Telecom experts later called this "bizarre" — an essential financial service for an entire country, running through a single carrier with no backup path.

When Rogers went down, Interac didn't degrade gracefully. It didn't fall back to a secondary carrier. It went down completely. For over 14 hours, Canadians couldn't use debit cards. In a country where cash usage has been declining for years, this effectively shut down commerce for millions of people. Café owners served coffee on trust, asking customers to come back the next day with cash. Others simply closed.

This is what dependency chains look like in practice. Interac didn't have a service outage. Rogers did. But Interac's dependency on a single carrier meant that Rogers' failure cascaded directly into the financial system. The dependency map — if anyone had drawn one — would have shown a single red line between Interac and Rogers with no alternative path. That's the kind of risk that looks obvious after the fact and invisible before.
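
A dependency map of this shape is small enough to check mechanically. The sketch below uses hypothetical entries (the real map was never published) and treats each listed provider as an alternative path; any service whose transitive carrier set contains exactly one member is the single red line described above:

```python
# Hypothetical dependency map; each list holds alternative providers,
# any one of which is enough to keep the service up.
dependencies = {
    "debit_payments": ["interac"],
    "interac":        ["rogers"],                     # the single red line
    "911_access":     ["rogers", "bell", "telus"],    # any carrier will do
}

def transitive_carriers(service, deps):
    """Expand a service down to the carriers it ultimately rests on."""
    if service not in deps:
        return {service}                  # a leaf: an actual carrier
    carriers = set()
    for upstream in deps[service]:
        carriers |= transitive_carriers(upstream, deps)
    return carriers

for service in dependencies:
    carriers = transitive_carriers(service, dependencies)
    if len(carriers) == 1:
        print(f"{service}: no alternative path, rests entirely on {min(carriers)}")
```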

The Human Cost Nobody Planned For

Business continuity plans are written in terms of recovery time objectives (RTOs) and recovery point objectives (RPOs). They measure recovery in hours and data loss in transactions. But the Rogers outage demonstrated what those numbers actually mean when they fail.

A man in Hamilton couldn't call 911 while his family member was dying. Not because 911 was down — the service itself was operational — but because the phone network that connected him to it wasn't. His recovery time wasn't measured in hours. It was measured in the minutes he spent running through a parking lot looking for help.

Oncology patients had radiation therapy postponed because hospital scheduling systems depended on the network. Those aren't transactions you reschedule casually.

These aren't edge cases. They're the direct, predictable consequences of a single-provider dependency in critical infrastructure. The plans existed. The dependencies weren't mapped. The cascade wasn't simulated. And when the failure happened, nobody was prepared for how far it would reach.

What It Revealed About Canadian Infrastructure

The Rogers outage exposed three structural vulnerabilities that went beyond one company's network architecture.

Telecom concentration creates systemic risk. Canada's telecom market is dominated by three carriers: Rogers, Bell, and Telus. When one of them fails, roughly a third of the country's connectivity disappears. This isn't a redundancy problem that individual organizations can solve; it's a market structure problem. Companies that think they have network diversity because they use Rogers as their primary and a Rogers subsidiary such as Fido as their backup don't have diversity at all.
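
The check for false diversity is only meaningful at the parent-company level. A minimal sketch of the idea follows; the ownership map is illustrative and should be verified against current corporate structure:

```python
# Brand-level "diversity" versus parent-level diversity. The ownership map
# below is illustrative; verify it against current corporate structure.
PARENT = {
    "rogers": "rogers", "fido": "rogers", "chatr": "rogers",
    "bell": "bell",     "virgin_plus": "bell",
    "telus": "telus",   "koodo": "telus",
}

def distinct_networks(carriers):
    """Collapse brands to parent networks before counting diversity."""
    return {PARENT.get(c, c) for c in carriers}

links = ["rogers", "fido"]        # looks like two carriers on paper...
print(distinct_networks(links))   # {'rogers'} -- ...but it is one network
```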

Financial infrastructure depends on telecom in ways nobody tested. Interac's single-carrier dependency wasn't a secret, but it hadn't been stress-tested against a total carrier failure because nobody believed a total carrier failure was plausible. The scenario wasn't in the tabletop exercise because it seemed too extreme. Then it happened.

Emergency services have hidden telecom dependencies. 911 is supposed to be resilient. It has redundancy built in at the service level. But the phones that connect citizens to 911 depend on commercial telecom networks. When the network fails, the redundancy at the 911 service level doesn't matter because the call never reaches it. The dependency chain extends beyond the service boundary into infrastructure the service owner doesn't control.

The Regulatory Response: OSFI E-21

The Rogers outage didn't just trigger corporate remediation. It accelerated a regulatory shift that was already building momentum internationally, informed by the UK's FCA/PRA operational resilience framework and the EU's Digital Operational Resilience Act (DORA).

In August 2024, the Office of the Superintendent of Financial Institutions (OSFI) published a significantly revised Guideline E-21, Operational Resilience and Operational Risk Management. Originally issued in 2016 as a general operational risk framework, the revised E-21 reflects a fundamentally different philosophy: disruptions will happen, and institutions must be able to absorb them while continuing to deliver critical operations.

E-21 requires federally regulated financial institutions in Canada to:

- Identify their critical operations: the activities whose disruption would harm customers or the financial system
- Set tolerances for disruption: the maximum level of disruption each critical operation can withstand
- Map the dependencies behind each critical operation, including the people, technology, data, facilities, and third parties it relies on
- Test themselves against severe-but-plausible scenarios, including the failure of a critical third party
- Demonstrate that critical operations can keep operating within tolerance when disruptions occur

E-21 works alongside Guideline B-13 (Technology and Cyber Risk Management) and Guideline B-10 (Third-Party Risk Management) to create a comprehensive resilience framework. The message from OSFI is clear: operational risks can become financial risks if left unmanaged. The Rogers outage proved that this wasn't theoretical.
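
None of this prescribes tooling, but the pillars translate naturally into data an institution can keep current and query. A minimal sketch, with hypothetical field names and thresholds rather than anything OSFI mandates:

```python
from dataclasses import dataclass, field

@dataclass
class CriticalOperation:
    """One record per critical operation; fields are hypothetical, not OSFI's."""
    name: str
    dependencies: list[str]           # everything this operation rests on
    tolerance_hours: float            # maximum tolerable disruption
    tested_scenarios: set[str] = field(default_factory=set)

    def untested_dependencies(self):
        """Dependencies with no severe-but-plausible failure scenario on file."""
        return [d for d in self.dependencies if d not in self.tested_scenarios]

debit = CriticalOperation(
    name="retail_debit",
    dependencies=["primary_carrier", "secondary_carrier", "core_banking"],
    tolerance_hours=2.0,
    tested_scenarios={"core_banking"},
)

print(debit.untested_dependencies())  # ['primary_carrier', 'secondary_carrier']
```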

Full adherence to E-21 is expected by September 1, 2026. For Canadian financial institutions, the deadline isn't distant — it's imminent.

What Changed After

To Interac's credit, they moved fast. By January 2023, they had implemented a private backup connectivity mode for e-Transfer participants. By June 2023, they had added a secondary network carrier and installed a third backup link, with 13 financial institutions already enabled on the secondary carrier. They embedded the lessons from the crisis into their enterprise business continuity management processes.
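
What that redundancy buys is the ability to detect a dead path and move off it. Below is a minimal sketch of the idea, assuming hypothetical probe endpoints per carrier; real failover would live at the routing layer, not in application code:

```python
import socket

# Minimal failover sketch with hypothetical probe endpoints per carrier.
CARRIER_PROBES = [
    ("primary",   "203.0.113.1",  443),   # TEST-NET addresses, illustrative only
    ("secondary", "198.51.100.1", 443),
    ("backup",    "192.0.2.1",    443),
]

def first_healthy(probes, timeout=2.0):
    """Return the first carrier whose probe endpoint accepts a TCP connection."""
    for name, host, port in probes:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return name
        except OSError:
            continue                       # dead path: try the next carrier
    return None                            # total loss: the July 8 scenario

print(first_healthy(CARRIER_PROBES) or "no carrier reachable")
```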

Interac went from single-carrier dependency to multi-carrier redundancy in under a year. The technical fix wasn't complicated. Adding a second carrier and a backup link is standard network engineering. The question that should bother every practitioner is: why wasn't it there before?

The answer is the same reason dependency maps are always wrong: nobody mapped the risk, nobody tested the scenario, and nobody wanted to fund the redundancy for a failure that seemed implausible. Until it happened.

At the government level, the federal government required Canada's major carriers to sign a formal agreement committing them to emergency roaming, mutual assistance during outages, and clear public communication during major failures. Rogers itself implemented network architecture changes identified in the Xona Partners assessment. The entire Canadian telecom and financial services ecosystem shifted from assuming resilience to requiring evidence of it.

The Lesson for Every Organization

The Rogers outage is often discussed as a telecom story. It's not. It's a dependency mapping story.

Every organization that was affected on July 8, 2022, had the same problem: they depended on something they hadn't mapped, tested, or built redundancy for. Interac depended on Rogers. Businesses depended on Interac. Citizens depended on Rogers for 911. Hospitals depended on Rogers for scheduling. Each dependency was invisible until the single point of failure at the bottom of the chain gave way.

The technical fix — carrier diversity, backup links, private network failover — was straightforward. Interac implemented it in months, not years. The hard part was recognizing the risk before the outage forced the recognition. That's the gap that dependency chain analysis and continuous documentation are designed to close.

For Canadian financial institutions facing the September 2026 E-21 deadline, the Rogers outage isn't history. It's the case study that explains why the regulation exists. And for every organization — Canadian or otherwise — that depends on third-party infrastructure without mapping the dependency chain, it's a preview of what happens when the chain breaks.

The fault lines are already there. The Rogers outage just made one of them visible.