This site (together with a number of other sites that I run on .uk domains and many domains run by other people) returned to life sometime this morning having fallen off the internet some time on Friday evening.
I use 123 Reg to handle the DNS for all of my .uk domains. It seems that this was a bad idea. They had some kind of DNS outage. It took them twelve hours to acknowledge the problem on their status page and somewhere between another twelve and twenty-four hours to fix it.
Of course, this shouldn’t happen. All domains have at least two DNS servers. And they should be on different network segments. So it’s currently unclear how both of the DNS servers for my domains could be broken at the same time.
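To make "different network segments" a little more concrete, here is a minimal sketch in Python. It assumes you have already looked up the IPv4 addresses of a domain's name servers, and it treats a shared /24 as "the same segment", which is only a crude proxy for real network topology. The function names and the example addresses (taken from the reserved documentation ranges) are hypothetical and have nothing to do with 123 Reg's actual setup.

```python
import ipaddress


def distinct_segments(addresses, prefix_len=24):
    """Return the set of IPv4 networks (of size prefix_len) that contain the addresses.

    Assumption: two servers in the same /24 count as sharing a segment.
    """
    return {
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }


def has_network_redundancy(addresses, prefix_len=24):
    """True if the name server addresses fall in at least two distinct networks."""
    return len(distinct_segments(addresses, prefix_len)) >= 2


# Hypothetical name server addresses from the documentation ranges.
same_segment = ["192.0.2.10", "192.0.2.20"]
different_segments = ["192.0.2.10", "198.51.100.20"]

print(has_network_redundancy(same_segment))
print(has_network_redundancy(different_segments))
```

A check like this says nothing about shared hardware, shared upstream links or overload, so passing it is no guarantee that two servers can't fail together.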
I’ve been using 123 Reg for the past few years because Gandi, my preferred DNS supplier, didn’t support .uk domains. They recently added that support and this weekend’s problems have galvanised me into making the switch. My .uk domains will all be moving over the next week or so.
But I’m sorry if you’ve been unable to read any of my sites this weekend. And whilst the outage was short enough that any mail should still be queued for delivery somewhere on the internet, if you’ve sent something that I haven’t replied to, then please resend it.
Update: I’ve just received an email from Pipex (who own 123 Reg) in response to this blog posting. It’s interesting that they respond (in private!) to a public blog posting before they respond to the support mail that I sent them on Friday.
The email doesn’t add much useful information. I was going to ask for permission to quote it here, but I see it’s almost identical to the statement from 123 Reg quoted in this story on The Register covering the outage.
123-reg experienced intermittent performance issues on its DNS servers between late afternoon on Friday 16 November and Sunday 18 November. This meant that some customers have encountered difficulties with their domain names during this period.
This problem was caused by a combination of excessive loading on the DNS servers and a rare hardware failure. During this time, 123-reg engineers have replaced the hardware and full service has been resumed.
We apologise to our customers for the inconvenience that the outage would have caused and we have begun an investigation to identify the cause of the failure, and any necessary actions required will be implemented without delay.
They still haven’t explained how they managed to lose all DNS capability, despite the redundancy that is built into DNS.
And if anyone from Pipex is reading this and is thinking of sending another mail, why not leave a comment instead? That is, after all, how blogs are supposed to work.
Update: Tee hee. I just replied to that mail to see if I can get some more information. But the mail bounced back. Apparently that user’s mailbox is over quota. I wonder why?