How the Internet Leaks
A few weeks ago large swathes of the internet stopped working around the world. The cause was quickly found: a computer networking mistake at a specialty metals company based in Pittsburgh, Pennsylvania. Earlier that month, a similar accident took down payment terminals in the Netherlands after a misconfiguration in a Swiss data center rerouted key parts of the internet to China.
These ‘route leaks’ expose a fundamental weakness of the internet, but this is a weakness that has simultaneously delivered the flexibility that made the internet a success. What’s going on?
The internet is a decentralized place, meaning that how traffic flows from A to B is not coordinated by any single organization. Instead, to participate in the internet, operators electronically signal to adjacent networks which internet addresses they host or can reach, and the internet will generally respond by sending traffic destined for those addresses their way - whether the operator actually owns those addresses or not.
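As a rough illustration of why an unverified announcement attracts traffic, here is a minimal sketch of a router that believes whatever it is told. It assumes the simplest possible selection rule, namely that the most specific matching address block wins; real BGP routers also weigh path length and local policy. The class name ToyRouter, the address blocks and the network numbers are all hypothetical, taken from ranges reserved for documentation.

```python
import ipaddress


class ToyRouter:
    """A deliberately naive router: it trusts every announcement it hears."""

    def __init__(self):
        # Maps an announced address block to the network number that announced it.
        self.routes = {}

    def receive_announcement(self, prefix, origin_as):
        # No ownership check of any kind, which is the weakness described above.
        self.routes[ipaddress.ip_network(prefix)] = origin_as

    def destination_for(self, address):
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        # Simplifying assumption: the most specific (longest) prefix wins.
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


router = ToyRouter()
router.receive_announcement("198.51.100.0/24", 64496)  # the legitimate operator
before = router.destination_for("198.51.100.10")

router.receive_announcement("198.51.100.0/25", 64511)  # a mistaken or hostile operator
after = router.destination_for("198.51.100.10")

print(before, after)
```

In this sketch the second announcement is never questioned, so traffic for 198.51.100.10 moves from network 64496 to network 64511 the moment it arrives.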
If this sounds terrifying, that is because it is. The Border Gateway Protocol (BGP) that is used to exchange such routing instructions was designed in a friendlier and possibly more naive time, back when access to high levels of the internet was restricted to well-known operators who supposedly made no mistakes and could be trusted. Neither assumption turns out to have been correct.
To prevent accidents and fraud, responsible operators put filters in place that limit which announcements they are willing to trust and propagate. For example, a specialty metals company should not be believed when it claims to host a quarter of the internet on its premises. Filters can be specific (‘only announce your part of the internet’) but can also simply cap how many blocks of internet addresses an operator is allowed to announce.
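The two kinds of filter described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular router implements filtering: the function name filter_announcements and its parameters are hypothetical, and the choice to reject every announcement once the cap is exceeded is a simplification made for this example.

```python
import ipaddress


def filter_announcements(announced, allowed=None, max_prefixes=None):
    """Return the announced address blocks that a cautious neighbor would accept.

    allowed      -- optional list of blocks the customer is known to hold
                    (the specific filter)
    max_prefixes -- optional cap on how many blocks may be announced
                    (the coarse filter)
    """
    nets = [ipaddress.ip_network(p) for p in announced]

    # Coarse filter: far more announcements than expected suggests a leak,
    # so in this sketch nothing at all is accepted.
    if max_prefixes is not None and len(nets) > max_prefixes:
        return []

    # Specific filter: keep only blocks that fall inside the allowed ones.
    if allowed is not None:
        allowed_nets = [ipaddress.ip_network(p) for p in allowed]
        nets = [
            n for n in nets
            if any(n.version == a.version and n.subnet_of(a) for a in allowed_nets)
        ]

    return [str(n) for n in nets]


# A small company that holds one block but also announces one it does not hold.
accepted = filter_announcements(
    ["203.0.113.0/24", "198.51.100.0/24"],
    allowed=["203.0.113.0/24"],
)
print(accepted)
```

The specific filter needs an accurate list of what each neighbor holds, which takes effort to maintain. The cap needs only a rough idea of the neighbor's size, which is why it is a common last line of defense.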
In the case of the Pittsburgh metals company and the Swiss data center, no filters were in place, and consequently the internet believed these small-scale operators when they suddenly announced erroneous new routes to vast numbers of websites and services.
Filtering not only stops accidents from propagating, it also helps prevent fraud. Recent internet hijacks allowed attackers to capture parts of the internet they did not own. In one high-profile case this worked long enough to capture logins to a well-known bitcoin wallet site.
Route filtering measures are voluntary and not baked into the fabric of the internet. Operators that care can already prevent a lot of problems by consulting public route databases, but not all operators are as diligent.
Might automated technology offer a solution? Over the years, various ways of using cryptography to authenticate routing information have been explored, but until recently none of these have gained traction. A key worry is that while a stray mistaken announcement currently does not take down the whole internet (except sometimes), the tiniest mistake in a cryptographic certificate would make entire companies disappear from the network. In short, operators are not ready to potentially harm their own network in order to protect the internet.
Recently, however, a start has been made with a defensive technology called RPKI Origin Validation. Origin Validation relies on cryptographic certificates that state which network may originate announcements for a given block of addresses, which somewhat restricts how networks can be announced to the internet. By judicious design, RPKI is far more likely to stop mistakes than to accidentally cause downtime, at the cost of providing less than perfect protection. But as a gateway technology it is very effective.
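To make the trade-off concrete, here is a minimal sketch of the decision Origin Validation makes for each announcement, following the three outcomes defined in RFC 6811: valid, invalid and not found. The cryptographic checking of the certificates is left out entirely; the sketch assumes the authorizations have already been verified and reduced to simple records. The names Roa and validate_origin are hypothetical, and the address blocks and network numbers come from ranges reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    """One already-verified authorization: who may originate which block."""
    prefix: str      # the address block covered by the authorization
    max_length: int  # the most specific announcement that is permitted
    origin_as: int   # the network number allowed to originate it


def validate_origin(prefix, origin_as, roas):
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Only authorizations that cover the announced block are relevant.
        if roa_net.version != announced.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.origin_as == origin_as and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered but not matched means someone else holds the block.
    # Not covered at all means nothing has been published for it.
    return "invalid" if covered else "not-found"


roas = [Roa("203.0.113.0/24", 24, 64496)]
print(validate_origin("203.0.113.0/24", 64496, roas))  # the rightful origin
print(validate_origin("203.0.113.0/24", 64511, roas))  # a different network
print(validate_origin("192.0.2.0/24", 64511, roas))    # nothing published
```

The "not-found" outcome is what keeps the risk of self-inflicted downtime low: operators typically reject only announcements that come back invalid, so a network that has published nothing stays reachable. It is also why the protection is less than perfect, since only the originating network is checked and not the path the announcement took.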
Partly because of headline-grabbing incidents, RPKI adoption has recently accelerated remarkably, with large-scale network providers getting on board. While RPKI and other filtering solutions are being deployed, expect the internet to continue to go astray from time to time, but at a decreasing rate.