Recently in internet Category

And We're Back

This site (together with a number of other sites that I run on .uk domains and many domains run by other people) returned to life sometime this morning having fallen off the internet some time on Friday evening.

I use 123 Reg to handle the DNS for all of my .uk domains. It seems that this was a bad idea. They had some kind of DNS outage. It took them twelve hours to acknowledge the problem on their status page and somewhere between another twelve and twenty-four hours to fix it.

Of course, this shouldn't happen. All domains have at least two DNS servers. And they should be on different network segments. So it's currently unclear how both of the DNS servers for my domains could be broken at the same time.
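To make the "different network segments" point concrete, here is a minimal Python sketch. It assumes you already know the IP addresses of a domain's name servers (the addresses below are made-up documentation addresses, not 123 Reg's), and it treats a /24 network as a "segment", which is only a crude stand-in for real network topology.

```python
import ipaddress

def distinct_segments(addresses, prefix_len=24):
    """Return the set of networks (of the given prefix length) the addresses fall into."""
    return {
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }

def looks_redundant(addresses, prefix_len=24):
    """True if there are at least two servers spread over at least two networks."""
    return len(addresses) >= 2 and len(distinct_segments(addresses, prefix_len)) >= 2

# Hypothetical name server addresses (RFC 5737 documentation ranges).
same_segment = ["192.0.2.10", "192.0.2.11"]
different_segments = ["192.0.2.10", "198.51.100.10"]

print(looks_redundant(same_segment))        # False
print(looks_redundant(different_segments))  # True
```

Passing this check doesn't prove that a setup is resilient (two networks can still share a data centre or an upstream link), but failing it is a warning sign.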

I've been using 123 Reg for the past few years because Gandi, my preferred DNS supplier, didn't support .uk domains. They recently added that support and this weekend's problems have galvanised me into making the switch. My .uk domains will all be moving over the next week or so.

But I'm sorry if you've been unable to read any of my sites this weekend. And whilst the outage was short enough that any mail should still be queued for delivery somewhere on the internet, if you've sent something that I haven't replied to, then please resend it.

Update: I've just received an email from Pipex (who own 123 Reg) in response to this blog posting. It's interesting that they respond (in private!) to a public blog posting before they respond to the support mail that I sent them on Friday.

The email doesn't add much useful information. I was going to ask for permission to quote it here, but I see it's almost identical to the statement from 123 Reg quoted in this story on The Register covering the outage.

123-reg experienced intermittent performance issues on its DNS servers between late afternoon on Friday 16 November and Sunday 18 November. This meant that some customers have encountered difficulties with their domain names during this period.

This problem was caused by a combination of excessive loading on the DNS servers and a rare hardware failure. During this time, 123-reg engineers have replaced the hardware and full service has been resumed.

We apologise to our customers for the inconvenience that the outage would have caused and we have begun an investigation to identify the cause of the failure, and any necessary actions required will be implemented without delay.

They still haven't explained how they managed to lose all DNS capability, despite the redundancy that is built into DNS.

And if anyone from Pipex is reading this and is thinking of sending another mail, why not leave a comment instead. That is, after all, how blogs are supposed to work.

Update: Tee hee. I just replied to that mail to see if I can get some more information. But the mail bounced back. Apparently that user's mailbox is over quota. I wonder why?

Using TinyURL

The BBC Backstage mailing list has briefly turned its attention from the iPlayer's DRM and Ashley Highfield's estimates of Linux usage and is actually having an interesting conversation about URL schemes.

This was all set off by an email sent out to participants in the BBC archive trial. The mail used TinyURL to shorten a couple of long URLs that it mentioned. The URLs in question were

Long-time readers might remember that I'm interested in the problems that people have with URLs, so you won't be surprised that this discussion piqued my interest.

The first interesting point that was raised was that URL-shortening services like TinyURL can be used to disguise dubious addresses in a phishing attack. When clicking on links in mail it's always a good idea to ensure that you know which site the link is taking you to. URL-shortening services prevent you from doing this, as the only host you see in the URL is, for example, tinyurl.com. It's unlikely, of course, that anyone wants to get your login details to the BBC archive trial, but it's certainly a bad habit for an organisation like the BBC to be encouraging.
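As a rough illustration of why a shortened link tells you nothing about where you'll end up, here's a small Python sketch that flags links whose visible host is a shortening service. The list of hosts is a deliberately tiny, hypothetical one and the example links are made up.

```python
from urllib.parse import urlsplit

# Hypothetical, incomplete list of shortener hosts, for illustration only.
SHORTENER_HOSTS = {"tinyurl.com"}

def hides_destination(url):
    """True if the visible host is a known shortener, so the real target is unknown."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in SHORTENER_HOSTS

# Made-up example links.
print(hides_destination("http://tinyurl.com/abc123"))      # True
print(hides_destination("http://www.bbc.co.uk/archive/"))  # False
```

All the check can say is "you can't tell from the link". Finding the real destination means following the redirect, which is exactly what a cautious reader wants to avoid doing blind.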

The main point, for me, of a URL shortening service is that it's an easy way to share URLs from sites that have nasty addressing schemes which lead to unmanageably long URLs - like the URLs created by most e-commerce and content management systems (or, at least, most of the ones that I see being used). It's just a fact of internet life that you often want to share a URL which is far too long for sane people to deal with. And URL-shortening services are perfect for cases like that. You can shorten a long URL to a short link that won't get broken by your friends' email program.

But I see that as a solution to a temporary problem. At some point in the future, we will no longer have unmanageably long URLs. Everyone designing URL schemes will understand how they should work and no-one will encode session information in URLs. Well, I can dream, can't I?

More practically, URL-shortening is a solution to the problem of sharing problematic URLs when you have no control over the URL scheme in question. In other words, the problem of sharing other people's URLs. If you're trying to share one of your own URLs and you find yourself wanting to use a URL shortening service, then perhaps you should be reconsidering your URL scheme.

And that's why I don't think that the BBC should be using things like TinyURL. They shouldn't need to as they control the URLs that they are sharing. Personally I think that the two URLs in question are pretty good URLs. They are both easily readable and they aren't too long. Oh, you can make picky suggestions for improving them (I'd want to lose the '2' from 'login2.shtml' at least) but they are a vast improvement over most of the URLs you see out on the web. But if whoever composed the email thought that they were too long to include, then they should have fixed the URLs rather than resorting to TinyURL. I realise that in an organisation like the BBC getting the relevant web server configuration in place might take time, but that's just another argument for getting your URL scheme right from the start.

Your URLs are your address on the web. They are how people find the information that you want to share with them. It's well worth putting some effort into them.

Selling Domains

I've been dabbling in web sites for some time now and over the years I've had my fair share of projects that either never got started or started well but eventually withered and died[1]. Most of these projects had an associated domain name.

Previously, once I've decided that a project is moribund, I've just let the domain name lapse. But now I wonder if that's the best approach. Maybe it would make sense to sell them. Some domain names can be worth quite a lot of money. Obviously I don't think I've got anything as valuable as business.com but perhaps I can make a couple of quid selling these domains.

Currently I'm considering selling these:

  • standup.co.uk - this is one of the first domains I registered. Many years ago I was running a UK standup comedy news site there. I even used it to blag press passes for the Edinburgh Fringe one year. But I ran out of ideas for it almost four years ago.
  • wishlistwatch.com - this is far more recent. Last year I saw a good talk about Amazon Web Services and this was the site where I was going to experiment with the API. It was going to be a site where you'd register your wish list and you'd get notifications (email, RSS feeds - all that kind of stuff) if anything on your wish list went down in price. I still think it's a good idea, but so does someone else and now I've lost all enthusiasm for implementing it.

So now I have a couple of problems. Firstly I have no idea how to value a domain name. And whilst there are plenty of people on the web who will do that for you, all of the decent ones (or, at least, the ones who look decent) charge for the service. And secondly I need to find somewhere to advertise and sell the domains. I've got no idea which of those (many, many) sites I can trust.

So while I ponder these issues I've just bunged Google Ads and a "this domain for sale" sign on them. Perhaps I'd be better off parking them with Sedo or someone like that.

This is all new ground to me and I'd be grateful for any advice from anyone who has done this before.

[1] I've had successful projects too. Don't want to make it sound like everything I do is doomed to failure. It's just that successful projects aren't the subject of this post.

I've been a bit busy this week, so I haven't mentioned the Amazon Web Services talk that I went to on Monday evening. Amazon Web Service Evangelist (cool job title) Jeff Barr talked for almost two hours about what Amazon are making available. It was all very interesting and I wish I had more time to investigate it in depth.

I had an idea for an application during the talk but then Jeff mentioned Wish List Buddy as an example. It doesn't do all that I was thinking of, but it's a start. Maybe I'll find time to work on my version at some point in the next few weeks.

Oh, and I loved the phrase "artificial artificial intelligence" to describe Amazon's Mechanical Turk service where you can use human input as part of your application.

Thanks to Jeff for giving the talk and Dean for organising it.

Basic URL Advice

It's about time for another look at some basic mistakes that people make on the internet. Today we're going to be looking at URLs. It's important to design a useful URL scheme for your web site. The easier your URLs are to understand, the more likely it is that people will share them with their friends.

The basic premise behind all of these ideas is that by making it easier for people to link to specific parts of your site then they will link to your site and will bring you more visitors. Of course this means that you'll be encouraging people to visit your site using routes that don't bring them through the front page. Some people have a problem with that. I believe that sites that don't encourage that flexibility will slowly lose visitors to sites that do.

Domain names

We'll start with the first part of a URL - the domain name. I'm not going to talk about registering a domain, I'll assume that you've already done that. But how is your web server configured? Do you insist that people type 'www' at the start of your URL? And if so, why? There is no good reason why a visitor should type those extra three characters (four including the dot) each time they want to visit your site. It's very simple to configure your web server to respond to both names.
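In practice you'd do this in your web server's configuration, but the logic is simple enough to sketch in a few lines of Python. Everything here is an assumption for illustration: example.com is a placeholder domain, handle_host is a made-up helper, and redirecting the 'www' name to the bare name (rather than just serving both) is my choice of one reasonable approach.

```python
CANONICAL_HOST = "example.com"  # hypothetical domain
ACCEPTED_HOSTS = {CANONICAL_HOST, "www." + CANONICAL_HOST}

def handle_host(host_header):
    """Decide what to do with a request based on its Host header."""
    # Drop any port, ignore case and a trailing dot.
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    if host not in ACCEPTED_HOSTS:
        return ("reject", None)
    if host == CANONICAL_HOST:
        return ("serve", None)
    # The 'www' name works too: send the visitor on to the canonical name.
    return ("redirect", "http://" + CANONICAL_HOST + "/")

print(handle_host("example.com"))         # ('serve', None)
print(handle_host("WWW.Example.com:80"))  # ('redirect', 'http://example.com/')
print(handle_host("other.org"))           # ('reject', None)
```

Either way, the visitor who types the name with or without 'www' ends up at the same site.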

Memorable

A good URL is memorable. If you have an article about sheep farming that has the URL http://your_domain.com/sheep_farming.html then people will be able to find it more easily than if the URL is http://your_domain.com/0,,1704174,00.html or something like that.

This often isn't an easy problem to solve. In my experience all commercial Content Management Systems produce horrible URLs (which is why most newspaper sites have horrible URLs) and most blog software isn't much better (I realise that the URLs on this site aren't at all memorable - I have plans to improve that).

Simple

Part of being memorable is making your URLs as simple as possible. A good example of not doing that is Amazon. The URL for any product on Amazon is very simple. It looks like this http://www.amazon.co.uk/exec/obidos/ASIN/0596004761. The only bit that changes is the number at the end which defines the particular product that we are looking at.

So far, so good. That's the URL as you need it if you want to pass it on to someone else or to link to the product. But Amazon never shows you that URL. It always adds some tracking information to the end of the URL. So Amazon URLs always appear more complex than they need to be. Of course I don't expect Amazon to take this advice and change the way their systems work overnight. This seems to be a good example of a company that is so successful that they can ignore good practice whenever they want.

Other good examples of overcomplex URLs are often found on mapping sites. Try searching for a postcode on Multimap. I just got a URL back containing 13 parameters. It looked like this (I've inserted spaces so that it wraps):

http://www.multimap.com/map/browse.cgi?client=public &search_result= &db=pc &cidr_client=none &lang= &keepicon=true &pc=SW129RW &advanced= &client=public &addr2= &quicksearch=sw12+9rw &addr3= &addr1=

A bit of experimentation revealed that only one of them was necessary:

http://www.multimap.com/map/browse.cgi?pc=SW129RW

Which of those two would you rather send to a friend?
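If you find yourself trimming URLs like this regularly, the trimming itself is easy to automate. Here's a minimal Python sketch using the standard urllib.parse module. The keep_params helper is a made-up name, and the set of parameters worth keeping is something you still have to discover by experiment, as I did above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def keep_params(url, wanted):
    """Return the URL with every query parameter not named in `wanted` removed."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key in wanted]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# The Multimap URL from above, without the extra spaces.
long_url = (
    "http://www.multimap.com/map/browse.cgi?client=public&search_result="
    "&db=pc&cidr_client=none&lang=&keepicon=true&pc=SW129RW&advanced="
    "&client=public&addr2=&quicksearch=sw12+9rw&addr3=&addr1="
)

print(keep_params(long_url, {"pc"}))
# http://www.multimap.com/map/browse.cgi?pc=SW129RW
```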

Permanent

If you want people to link to your pages then you need to give them fixed places to link to. A few years ago I was trying to discuss a particular news story with some friends over email. The site in question (and I can't remember what it was) had ten news stories on its front page. The newest story was always http://some_news_site.com/news1.html and so on through to news10.html. As a new story was published, all the existing stories had their URLs changed. It was difficult to hold a conversation about the story. It would have been impossible to link to it in a blog entry.

Not all permanence problems are so obvious. Some news sites have free access to stories for a few days after they are published but later move them behind a registration screen (or require payment). Others move stories to a different location after some time. If your content isn't always available at the same URL then people won't link to it. Of course, some newspapers might see that as an advantage.

That's not to say that you shouldn't also have transient URLs. It's perfectly acceptable to have a latest.html link which always points to your latest article. But each article should also have a permanent URL and it should be easy to find out what that URL is.

Hackable

(And I'm using "hack" in the positive sense here.)

If your URLs reflect the structure of your site then people will be able to navigate round the site by editing the URL in the location bar of their browser. For example if /news/uk/politics/blair_resigns.html is the URL of a particular story, then /news/uk/politics should contain a list of current UK political news stories and /news/uk should contain a list of UK news.

One side-effect of this is that you need to work out all possible URLs that someone might try to visit and put something in place. You can't assume that people will only visit URLs that you publish. In the previous example you might not ever publish a link to /news/uk but you still need to put some kind of content there as otherwise anyone trying to visit that URL will get either an error page or (probably worse) a list of files in that directory.
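One way to find the URLs you need to cover is to work them out from the pages you do publish. This Python sketch lists every "directory" URL that a visitor could reach by trimming a path from the right. The ancestor_paths helper is a made-up name, and writing the results with a trailing slash is my assumption about how the site would serve them.

```python
import posixpath

def ancestor_paths(path):
    """List the directory URLs above a path, nearest first, ending at '/'."""
    ancestors = []
    current = posixpath.dirname(path.rstrip("/"))
    while True:
        # Normalise every ancestor to end with a slash.
        ancestors.append(current if current.endswith("/") else current + "/")
        if current in ("", "/"):
            break
        current = posixpath.dirname(current)
    return ancestors

print(ancestor_paths("/news/uk/politics/blair_resigns.html"))
# ['/news/uk/politics/', '/news/uk/', '/news/', '/']
```

Run that over every published URL, collect the results into a set, and you have a checklist of places where a curious visitor should find something useful.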

Anchors

It's always worth adding internal anchors within your page so that people can link to specific sections of the page. For example, all comments on my blog have their own anchor so that you can link to them.

Frames

Only one thing to say about frames - don't use them. The URL for a frameset refers to its initial state. Once you've clicked on something and changed the view, it becomes impossible for someone to construct a URL which will bring someone else back to that exact view.


This is part of an occasional series of articles about basic internet technologies. The previous articles in the series are Basic Password Handling and Basic Bulk Emailing.

Google Web Survey

See, this is a good example of what you can do if you're parsing every page you can find on the web. Google has been examining the HTML used in the pages that it trawls and has published its findings. The results are interesting but not altogether surprising. The executive summary seems to be "most people use invalid HTML".

Hacking URLs

How do you surf the web? Chances are that you're like most people and you just click on links to move from page to page. It seems that most people don't use the location bar in their browser. That's the text box near the top of your browser window that contains the URL (or, in plain English, the address) of the current web page. Even fewer people realise that they can edit that address and thereby go to different pages. For example, if I follow a link to http://example.com/some/interesting/page and I then want to see more of the site I'll often just edit the URL to remove "some/interesting/page" and end up at http://example.com/ which is hopefully the site's main page.

For me, and most of my geekier friends, that's a common part of our day. We'll often poke around on sites like that. It's not "hacking" (at least not in the nasty meaning of the word used by most mainstream media), it's just curiosity.

But it looks like this has just become a potentially dangerous activity. On New Year's Eve, Daniel Cuthbert was using the DEC web site to make a donation to the tsunami appeal. Something went wrong with his transaction and he became suspicious and began to think that the site might be a phishing site[1]. As a bit of a geek, he poked around on the site a bit to find out what was going on. After a couple of probes he gave up and thought no more of it.

But his probes had set off an intruder detection system and his actions were reported to the police. They were able to track him down using the details of his credit card and he was prosecuted under the Computer Misuse Act.

Here's where it gets really surreal. Even though the judge accepted that there was no malicious intent in anything that Cuthbert had done, he said that he had no choice but to follow the letter of the law and to find Cuthbert guilty. He was fined £400 and ordered to pay £600 in costs. Full details of the case are here and there is comment from various security experts here.

I find this whole story incredible. There is now a precedent that says that any time you visit a web site in a way not foreseen by the site's owners, you are liable to be prosecuted. And that might cost you £1,000. As someone who regularly "hacks" URLs, I now need to be a lot more careful about the sites that I visit. Any site could potentially be monitoring accesses and looking for unusual ones. Does this mean that every time I get a 404 error, I could get fined?

It also has potential impact on me as a site owner. All web sites come under attack. Every day my web servers get probed to see if they are running software that has security holes. I just shrug and ignore it. Should I report all of these to the police? Should I report all 404 errors to the police? Can the police handle the thousands of new reports they've just opened themselves up to each day? Haven't they got more important things to do?

It just goes to show that laws which affect the ways that people use technology should really be written by people who understand that technology.

[1] A web site that pretends to be something it isn't in order to get confidential information from visitors.

Update: More detail here and the original posting about the story (from January) is here.

Neologism

Is Lloyd the first person to use the word "pingosphere"? I've never seen it before. Oh wait, Google has one previous usage.

Does anyone else just think of penguins when they read it?

Rojo is an RSS reader. I tried it out for a week or so last year, but soon decided that I still preferred Bloglines and forgot about Rojo.

That is, I forgot about them until a couple of weeks ago. Then I got an email from them. They had decided that they would start to send a weekly email to all of their users which summarised the last week's exciting news in the world of RSS feeds.

There were a couple of problems with this. Firstly, I had never signed up for this. Whenever I sign up for a user account on a web site I never tick the boxes that say "I'd like you to send me marketing emails whenever you like". I suppose that when I signed up with Rojo, this option didn't exist. So they've now added that option. But they've added it for all users with the "please spam me" option turned on. So I had to go into my user account and specifically turn it off. They should have added it with the option turned off and allowed users to opt in rather than having to opt out.

The second problem was worse. I've mentioned before how (and why) I don't read HTML email. My email program is configured to show me the plain text version of any mail I am sent. Except Rojo sent me an HTML email which was labelled as plain text. So in my mail program I got a dump of raw HTML. For a technical company this is a basic error. It just makes them look completely unprofessional.
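For comparison, here's a minimal sketch of how a correctly labelled message can be built with Python's standard email library. This is not how Rojo's system works (I have no idea what they use). The addresses, subject and build_newsletter helper are made up for illustration. The point is that the plain text and HTML versions each travel in their own part with the right content type, so a mail program like mine can pick the one it wants.

```python
from email.message import EmailMessage

def build_newsletter(plain_text, html_text):
    """Build a message carrying both plain text and HTML versions."""
    msg = EmailMessage()
    msg["Subject"] = "Weekly summary"   # hypothetical subject
    msg["From"] = "news@example.com"    # hypothetical addresses
    msg["To"] = "reader@example.com"
    msg.set_content(plain_text)                      # labelled text/plain
    msg.add_alternative(html_text, subtype="html")   # labelled text/html
    return msg

msg = build_newsletter("This week in feeds...", "<p>This week in feeds...</p>")
print(msg.get_content_type())
# multipart/alternative
for part in msg.iter_parts():
    print(part.get_content_type())
# text/plain
# text/html
```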

I replied to their mail, explaining these problems. Only to find that their email had been sent from an email address that didn't accept email. I know this is becoming more common, but to my mind it's just plain rude. But by rummaging around in the HTML I found a feedback address and sent my complaints there. I got a nice reply saying that my mail had been passed on to the technical department.

And that was the situation a week ago. I had changed my account options to ask them not to send me mail and my points about the HTML email problem had been passed on to the technical department.

So imagine my surprise when I got another email from them this morning in exactly the same (broken) format. Their "please don't spam me" account option doesn't work. And they haven't fixed their email.

I really wouldn't consider using them. They obviously have either a total lack of knowledge about basic internet standards or they have chosen to completely ignore them.

Update: I've exchanged a couple of emails with Chris Alden, the CEO of Rojo, and I'm convinced that these were honest mistakes. They were mistakes that a serious technology company shouldn't make, but they were mistakes none the less. Chris has also blogged on the subject.

The Internet is 10

The internet is 10 this week. Well, no, of course it isn't. It's been around in some form or another since 1969. But a leader in today's Guardian says that this week is being celebrated as the tenth anniversary of the internet as a mass phenomenon - and I can't really argue with that.

Interestingly, the leader goes on to emphasise the connections between the internet and the Open Source movement.

Although, contrary to the instincts of its early protagonists, the web has long since been colonised by commerce, it still nurtures its founding community spirit. Nowhere is this more apparent than in the startling success of the open source movement which enables enthusiasts and professionals all over the world to work together from remote locations to produce services that are freely available for anyone with a computer linked to the internet. The thousands of products so far released include the Linux operating system (a free alternative to Microsoft's pervasive Windows), OpenOffice (an alternative to Microsoft's Word and Excel) and Wikipedia, the online encyclopedia, with well over a million entries written entirely by its readers.

Ten years ago you would never have read about Open Source software[1] in the leader column of a national newspaper. Now that's progress.

[1] Or, as it was called back then, "free software".
