On the Border of the Internet #risk
The age-old lie told by ISP support desks, “The Internet is down,” was briefly reality again yesterday.
Over the past couple of days I’d been seeing and hearing comments that there was a disturbance in the force of the Internet. Initially a message was posted to NANOG about a general malaise or instability in the Internet; some humorous quips were posted in response, and the matter was soon forgotten.
A network operator, speaking with hindsight, said that they had seen a higher than normal number of BGP updates, which is usually a sign that network instability is being resolved by rerouting around the problem. That is all part of the normal operation of the Internet. Then, sometime yesterday morning, as the east coast of the US was getting to work, the looming disaster struck.
Juniper network devices started core dumping and restarting because of a bug in the code that handles BGP UPDATE messages, just as another large update was arriving. The self-healing properties of the Internet broke, and the Internet went with them. The Great Juniper Outage of 2011 was born.
Almost certainly. The reliance of large ISPs – backbone carriers – on the hardware of one specific vendor creates a single point of failure, which is bad – mkay. A failover arrangement should always be in place, and not just at the ISPs. Companies that rely on the Internet for business should take this into account too. Recent outages at some of the companies I have consulted for showed that, by placing their faith in one specific vendor, they had created a single point of failure, and it caused some high-profile repercussions.
Do you have a single point of failure?