Tag Archives: tech

Social Networking 101

If you have a blog and a Twitter account then it’s nice to feed your tweets onto the front page of your blog. It can be an effective way to let your friends see what you’re saying in both places.

If, however, you later delete your Twitter account then it’s probably a good idea to remove the widget from your blog.

There’s one very important reason for doing this. Eventually Twitter will allow your deleted account name to be recycled. And then someone else will be able to post tweets which automatically appear on your blog.

Say, for example, you’re an MP who has made a few enemies in her time. And say that you’ve flounced away from Twitter claiming that it is a “sewer”. In that situation you probably don’t want to leave a way open for people who don’t like you to post whatever they want on your web site.

I mean, if you’re currently campaigning about abstinence and sex education, you probably don’t want your web site to say:

I think sex before marriage should be discouraged. It’s better if at least one of you is married, doesn’t matter who to particularly.

Or:

I suppose with fisting there’s no risk of pregnancy.. ..maybe kids should be taught about that?

Sometimes I wonder if the money that Nadine Dorries spent on “PR” wouldn’t have been better spent on IT consultancy.

They’ll fix it eventually, so Tim has captured it for us.

Update: And it’s gone. That was slightly quicker than I expected. I’m now expecting a blog post from her accusing someone (probably Tim) of hacking her computer.

Independent URLs

Today Twitter got very excited about a story on the Independent web site. Actually, it wasn’t the story that got people excited, it was the URL that was being shared for the story. The story was some nonsense about Kate Middleton’s face being seen in a jelly bean. The URL was:

http://www.independent.co.uk/life-style/food-and-drink/utter-PR-fiction-but-people-love-this-shit-so-fuck-it-lets-just-print-it-2269573.html

And if you click on it, sure enough, it takes you to the story on the Independent web site. Some people presented this as evidence of a joker (or, worse, a republican) taking control of the web site. But the actual explanation is a little more complex than that.

The real URL – the one that the Independent published on the site and in its web feed – was somewhat different. It was:

http://www.independent.co.uk/life-style/food-and-drink/kate-middleton-jelly-bean-expected-to-fetch-500-2269573.html

That seems far more reasonable, doesn’t it? (Well, of course, the story is still completely ridiculous, but we’ll ignore that.) So what was going on?

Well, if you look closely at both URLs you’ll see that the number at the end of them (2269573) is the same. That number is obviously the unique identifier for this story in the Independent’s database. That is the only information that the web site needs in order to present a visitor with the correct story. So the web site is being quite clever and ignoring any text that precedes that number. This means that you can put any text that you want in the URL and it will still work correctly as long as you have the correct identifier at the end. So the URL could just as easily have been one of these:

http://www.independent.co.uk/life-style/food-and-drink/why-do-people-still-read-the-indy-2269573.html

http://www.independent.co.uk/life-style/food-and-drink/you-can-put-any-text-here-2269573.html
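
As an aside for the geeks: here’s a minimal sketch (in Perl, as that’s what I know best) of how routing like this might be implemented. I have no knowledge of the Independent’s actual code, so treat every name here as hypothetical – the point is just how little of the URL the code needs to look at.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical router: pull the story id out of a URL path,
    # ignoring whatever text precedes it.
    sub story_id_from_path {
        my ($path) = @_;

        # Match the run of digits immediately before ".html" at the
        # end of the path; everything before it is irrelevant.
        return $path =~ m{(\d+)\.html$} ? $1 : undef;
    }

    for my $path (
        '/life-style/food-and-drink/kate-middleton-jelly-bean-expected-to-fetch-500-2269573.html',
        '/life-style/food-and-drink/you-can-put-any-text-here-2269573.html',
    ) {
        print story_id_from_path($path), "\n";    # prints 2269573 both times
    }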

The slight problem that the Independent had was that the alternative version of the URL was being shared so widely that Google was ranking it higher than the official version. So when people were Googling for the “kate middleton jelly bean” story, Google was presenting them with the dodgy version of the URL.

So why do the Independent use such a clever system if it’s so open to abuse?

One reason is for search engine optimisation. As I said above, you only really need the unique identifier for the story in order to find it in the database. And that means that the URL can be simplified to:

http://www.independent.co.uk/life-style/food-and-drink/2269573.html

But that doesn’t give Google much information about the content. So it’s generally considered good practice to have some text in the URL as well. And I suppose one of the simplest ways to implement that is to ignore everything in the URL except the last sequence of digits. That’s apparently what the Independent do.

There’s an alternative approach. And that’s to include both the text and the identifier in the URL. And to only accept a URL as valid if both match exactly. I can think of a good reason why that might not work for a newspaper web site. Sometimes newspapers change the headline on a story. And sometimes that change is for legal reasons. In cases like that you really don’t want to have the old headline left around in the URL. And you don’t want to change the URL as any links to the original URL will no longer work. In cases like that, the Independent’s approach works well. You can change the headline (and, hence, the URL) as often as you like and everything will still work.

Incidentally, whilst researching this post I found that the Daily Mail had written a rather gloating article about the Independent’s problems today. The URL for that article is:

http://www.dailymail.co.uk/sciencetech/article-1378504/Embarrassment-Independent-URL-twitter-fiasco.html

What’s interesting to note is that the text portion of that link is just as flexible as the Independent link. I can change it to:

http://www.dailymail.co.uk/sciencetech/article-1378504/The-Daily-Mail-Is-A-Bit-Crap.html

And everything still works correctly. The big difference between the two implementations is that the Mail version will redirect the browser to the canonical version of the URL whereas the Independent will leave the alternative URL in the browser address bar. I have to say that, in this case, I think the Daily Mail is right.
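
Here’s a rough sketch of the Mail-style approach, again in Perl and again with every name being my own invention rather than anything I know about their code: look up the canonical slug for the story id, and issue a permanent redirect if the requested slug doesn’t match it.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical lookup: a real system would query the story
    # database for the current headline slug. Hard-coded here.
    sub canonical_slug {
        my ($id) = @_;
        return 'embarrassment-independent-url-twitter-fiasco';
    }

    # Decide what to do with a requested slug/id pair.
    sub response_for {
        my ($slug, $id) = @_;

        my $canonical = canonical_slug($id);
        if ($slug ne $canonical) {
            # Permanent redirect to the canonical URL: joke links
            # still resolve, but the address bar ends up showing
            # the real headline.
            return (301, "/sciencetech/article-$id/$canonical.html");
        }
        return (200, 'serve the story');
    }

    my ($status, $location) = response_for('the-daily-mail-is-a-bit-crap', 1378504);
    print "$status $location\n";    # 301 and the canonical URL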

It’s not just newspapers that have this flexible approach to URLs. Amazon URLs have a flexible text section in them too. Each item that Amazon sells has a unique identifier, so the canonical Amazon URL looks like:

http://www.amazon.co.uk/dp/B003U9VLKG/

But whenever you see a URL on Amazon, it has a descriptive text field added:

http://www.amazon.co.uk/Harry-Potter-Deathly-Hallows-Part/dp/B003U9VLKG/

But, as with the newspaper URLs, that text field can be changed to anything. It’s only the identifier that is required.

http://www.amazon.co.uk/Three-Hours-Of-A-Boy-Looking-Glum-In-A-Tent/dp/B003U9VLKG/

Hours of fun for all the family.

Your mission, should you choose to accept it, is to find other web sites where there’s an ignored text section in their URLs. Please post the best ones you find in the comments.

Bonus points for getting one of the papers to write about your prank.

Update: Here’s Independent editor Martin King’s take on the incident. He says that the system is used for exactly the two reasons I mentioned above – “The feature has search engine benefits but from an editorial perspective it enables us to change repeatedly a headline on a moving article.”

Moonfruit and Techcrunch

For the past few weeks I’ve been working with Moonfruit, who have been replacing their rather ageing web site with something that looks a lot more contemporary.

Today was the day that the new version went live. And it was also the day that I got an interesting lesson in how marketing works in our digital world.

The company’s co-founder Wendy Tan White had been interviewed by Techcrunch and we were expecting that article to be published at about lunchtime. In order to get an idea of when the article went live, I set up a search panel on TweetDeck watching for mentions of “moonfruit” on Twitter.

During the morning there was a steady stream of mentions. This was largely people pushing their Moonfruit-hosted web sites. Then at about 12:25 that all changed. Where previously each update of the search was bringing in two or three new results, suddenly there were twenty in one go. And then another twenty. And another. And another.

On closer inspection I saw that the vast majority of them were exact reposts of this tweet from @techcrunch.

500 Startups Bites Into Moonfruit’s Simple Site Builder For Design Fans http://tcrn.ch/dYbp98 by @mikebutcher

Some of them were retweets, but most of them were automated reposts (often using Twitterfeed). In the first twenty-five minutes I estimate that the story was reposted 400 times. By now (about nine hours later) the number must be two or three times that.

I was astonished to see this volume of reposts. I knew that a story on Techcrunch was good publicity, but I had no idea just how good it was. That’s an incredible number of people who have been told about this article – and, hence, the Moonfruit relaunch.

But there’s another side to this. Why are there so many automated systems set up to repost tweets from Techcrunch? I know that Techcrunch is a useful source of tech news, but doesn’t that mean that anyone who is interested in tech news will already be following @techcrunch on Twitter? If every tweet from @techcrunch is repeated a few hundred times and @techcrunch posts a few dozen tweets each day, isn’t that a few thousand pointless tweets? I’m sure that these two or three hundred reposters aren’t amplifying Techcrunch’s reach by two or three hundred times. I’d be surprised if they were amplifying it by even ten times.

So what is the point of these hundreds of reposting engines? Is it some kind of spam system? Or an SEO trick? Or are there really hundreds of people out there who think that their followers benefit from reposted content from Techcrunch?

You might be wondering why I haven’t linked to any of the reposts. Well, of course, in the nine hours it’s taken me to get round to writing this post, most of them have vanished from Twitter’s search engine. Does that mean they were scams that Twitter has cleaned up? Or do tweets just have a really short lifespan in Twitter’s search index?

Where’s Your Data

We hear a lot of talk about how cloud computing is the future. Those of us who still run some of our own internet infrastructure are increasingly seen as slightly eccentric and old-fashioned. Why would anyone host their own mail server when we have Gmail, or run their own blog when there is WordPress or Posterous? In fact, why have your own server at all when you can just use Amazon EC2?

Well, during September I was reminded of the downside of the cloud when I almost lost two old blogs.

One of the earliest blogs I wrote was on the use.perl web site. Yes, it all looks a bit ropey now, but back in 2001, it was cutting edge stuff. Everyone in the Perl community was using it. But it never really had a service level agreement. It was run on someone’s employer’s network. And, of course, that was never going to last forever. Earlier this month the person running it announced that he was leaving that job and that use.perl would be closing down. Currently, I think that the site is in read-only mode and there are some people in the Perl community who are trying to set up alternative hosting for the site. I hope that comes off. There’s almost ten years of Perl history stored up in that site. It would be a shame to see all those URLs turn into 404s.

And then there’s Vox. I never really used Vox that heavily, but I dabbled with it for a while. And now it’s also closing down. Six Apart put in place some procedures to transfer your blog posts to TypePad, but for reasons I couldn’t work out, that didn’t work for me. What I really wanted was to import the data into this blog (which runs on Movable Type, another Six Apart product) but for some reason that option wasn’t available. In the end I managed to import the posts into Posterous, but I seem to have lost all of the tags (not really a problem) and the comments (a pretty big problem). Oh, and I’ve just noticed that the images are still being hosted on Vox. Better fix that before Vox closes down – tonight.

So I’ve learnt an important lesson about trusting the cloud. It’s all very well putting your data up there, but be sure that you have an exit strategy. Find out how you can get your data out. And how much of your data you can get out easily. I put all of my photos on Flickr, but I keep copies locally as well. But then again, that’s not really enough, is it? Sure, I’ve got the photos, but if Flickr closes down tomorrow, I won’t have all the social interactions that have built up around my photos.

These scares have made me start to think about these issues. And I’ve been tracking down some other old stomping grounds. I’m pleased to report that my first ever blog (hosted by Blogger, which is now owned by Google) is still available.

Where’s your data? How much could you reconstruct if Facebook closed down tomorrow?

Email From The PM

We’re all, no doubt, used to getting 419 scams in email. I get several a day, but they’re not often as brazen as this.

PRIME MINISTER’S OFFICE

TREASURY AND MINISTER FOR THE CIVIL SERVICE,

LONDON, UNITED KINGDOM.

Our ref: ATM/13470/IDR

Your ref:…Date: 14/09/2010

IMMEDIATE PAYMENT NOTIFICATION

I am The Rt Hon David Cameron MP,Prime Minister, First Lord of the Treasury and Minister for the Civil Service British Government. This letter is to officially inform you that (ATM Card Number 7302 7168 0041 0640) has been accredited with your favor. Your Personal Identification Number is 1090.The VISA Card Value is £2,000,000.00(Two Million, Great British Pounds Sterling).

This office will send to you an Visa/ATM CARD that you will use to withdraw your funds in any ATM MACHINE CENTER or Visa card outlet in the world with a maximum of £5000 GBP daily.Further more,You will be required to re-confirm the following information to enable;The Rt Hon William Hague MP First Secretary of State for British Foreign and Commonwealth Affairs. begin in processing of your VISA CARD.

(1)Full names: (2)Address: (3)Country: (4)Nationality: (5)Phone #: (6)Age: (7)Occupation: (8) Post Codes

Rt Hon William Hague MP.

First Secretary of State for Foreign and Commonwealth Affairs

Email; bfcaffairs@info.al
Tel: +447405235350

TAKE NOTICE: That you are warned to stop further communications with any other person(s) or office(s) different from the staff of the State for Foreign and Commonwealth Affairs to avoid hitches in receiving your payment.

Regards,

Rt Hon David Cameron MP

Prime Minister

I’ve left the contact details in there as I feel sure that William Hague doesn’t really use an Albanian email address :-)

The email pretends to come from an address at the directgov.uk domain (note, not direct.gov.uk) and the reply-to address goes to Thailand.

Opentech 2010

On Saturday I was at the Opentech conference. Some brief notes about the sessions I saw.

The day was sponsored by data.gov.uk, so it seemed polite to see one of their sessions first. I watched Richard Stirling and friends talk about some of the work they’re doing on releasing lots and lots of linked data. There were some interesting-looking demonstrations (using a tool that, I believe, was called Datagrid [Update: Sam Smith reminds me that it was actually Gridworks]) but I was in the back half of the room and it was a little hard to follow the details. The session also had a demonstration of the new legislation.gov.uk site.

The next session I attended was in the main hall. Hadley Beeman talked about the LinkedGov project which aims to take a lot of the data that the government are releasing and to improve it by adding metadata, filling in holes and generally cleaning it up.

Hadley was followed by Ben Goldacre and Louise Crow who have a cracking idea for a web site. They want to expose all of the clinical trial data which never gets published (presumably because the trial didn’t go the way that the people running it wanted it to go). They already have a prototype that demonstrates which pharmaceutical companies are particularly bad at this.

The final talk in this session was by Emma Mulqueeny and a few friends. They were introducing Rewired State, which runs hackdays to encourage people to build cool things out of government data. I was particularly impressed with Young Rewired State, which runs similar events aimed at people under the age of 18.

It was then lunchtime. That went disastrously wrong and I ended up not eating and getting back late, so that I missed the start of the next session. Unfortunately I missed half of Louise Crow’s talk about MySociety’s forthcoming project FixMyTransport. I stayed to watch Tom Steinberg give an interesting explanation of why he thought GroupsNearYou hadn’t taken off. Finally in this session, Tim Green and Edmund von der Berg talked about how three separate groups had worked together on some interesting projects during the last general election.

I was speaking in the next session. Unusually for Opentech, the organisers decided to have a session about the technology that underlies some of the projects that the conference is about. I talked about Modern Perl, Mark Blackman covered Modern FreeBSD and Tom Morris introduced Modern Java (or, more accurately, Scala).

The next session I attended was largely about newspapers. Phil Gyford talked about why he dislikes newspaper web sites and why he built Today’s Guardian – a newspaper web site that looks more like a newspaper. Gavin Bell talked about the future of social networking sites and Chris Thorpe talked about automating the kind of serendipity that makes newspapers such a joy to read.

For the final session I went back to the main hall. Mia Ridge talked about why the techies who work for museums really want to open up their data in the same way as the government is now doing and asked us to go banging on the museums’ doors asking for access to their data. And finally Robin Houston told some interesting stories about the 10:10 campaign.

As always the conference was really interesting. As always there were far too many things that I wanted to see and in every session I could have just as easily gone to see one of the other tracks. And as always, I have come away from the conference fired with enthusiasm and wanting to help all of the projects that I heard about.

Of course, that’s not going to happen. I’m going to have to pick one or two of them.

If you weren’t at Opentech, then you missed a great day out. You should make an effort to come along next year.

BBC Radio Streams

I’ve just written this over on my BBC Radio Streams page:

I’ve got email from a couple of people saying that the Real Audio radio streams were finally turned off overnight. This means that the few links left on these pages (and any links that you have saved from earlier versions of this page) will no longer work.

I expected this day to come at some point. The BBC really want everyone to use the Radio iPlayer instead.

This does, however, pose a problem for people who were using the Real Audio streams to power internet radios and similar devices. I’m not sure that there’s a solution to this problem, but I’ll have a poke around and see if I can find a way around it.

Other than that, I’d just like to say thanks for using these pages during the five and a half years that they have been live. When I sat down to hack out a quick solution in November 2004 I had no idea how many people would find the pages so useful.

I’d also like to thank the BBC for the enlightened approach they took to my pages. They could easily have just asked me to close the site down, but instead they chose to turn a blind eye and take my pages as an indication of something that was missing from their site.

Update: I’ve just found this entry on the BBC Internet Blog. The BBC have introduced live streaming of their radio stations to various mobile devices. I haven’t investigated in detail, but this looks like it might be a replacement for the Real Audio streams.

The Political Web

Long-time readers might remember The Political Web, a web site that I threw together at a BBC hack day a couple of years ago.

The site has languished as I haven’t had time to do anything with it for well over a year, but last night I refreshed the database that powers it so that it now contains details of all of the new constituencies and MPs.
I have other plans too (just no real idea when I’ll have time to implement them).

Modern Campaigning

I got in touch with all of the Battersea candidates who aren’t publishing web feeds to ask them if there was anything I had missed. I only got a reply from one of them.

But that’s ok. They’re probably busy. Campaigning is a time-consuming business.

The response I got was marked as “not for publication” so I’m not going to quote from it. I’m not even going to say which of the candidates it was from. But I do want to paraphrase and reply to the main couple of points that were raised as I think they indicate a lack of understanding about digital campaigning that is probably more common than we’d like to believe.

Firstly, the candidate expressed a concern that starting to use something like Twitter would set up an expectation for two-way communication that would be hard to meet. And it’s true, of course, that I really like to see Twitter being used for dialogues rather than monologues. I’ve written about that several times. But given a choice between people using tools in ways I don’t really like and them not using the tools at all, I’m very happy to lower my standards. And it’s not as though treating Twitter as a one-way medium is an unusual way to use it. Many people use Twitter like that. Here’s Tory MP Douglas Carswell telling me that he sees Twitter as an “RSS feed” – by which he means something that he publishes for people to read rather than something that he uses as a source of information.

Secondly, the candidate claims not to have the time to keep web feeds updated. And I think that just comes back to using the wrong tools (something else we’ve discussed on this blog). If your web site is run using decent software then it will automatically publish a web feed whenever you write a new entry. Tie that up with something like TwitterFeed and you’ve got an automatically updated Twitter account too. I know that the people standing for election will not usually be geeks who know this kind of thing, but digital communication is important and I would expect that any candidate will be able to find a tame geek to help out with things like this.
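
If a tame geek wanted to roll their own version of that rather than relying on TwitterFeed, a few lines of Perl using XML::Feed and Net::Twitter would get them most of the way there. This is only a sketch – the feed URL and the OAuth credentials are placeholders, and a real version would need to remember which entries it had already tweeted.

    #!/usr/bin/perl
    use strict;
    use warnings;

    use URI;
    use XML::Feed;      # handles both RSS and Atom feeds
    use Net::Twitter;

    # Placeholder: point this at the candidate's web feed.
    my $feed = XML::Feed->parse(URI->new('http://example.com/atom.xml'))
        or die XML::Feed->errstr;

    # Placeholder OAuth credentials from dev.twitter.com.
    my $nt = Net::Twitter->new(
        traits              => [qw(API::RESTv1_1)],
        consumer_key        => 'KEY',
        consumer_secret     => 'SECRET',
        access_token        => 'TOKEN',
        access_token_secret => 'TOKEN_SECRET',
    );

    # Tweet the newest entry. Run from cron; a real version would
    # track which entries have already been posted.
    my ($latest) = $feed->entries;
    $nt->update($latest->title . ' ' . $latest->link) if $latest;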

The candidate heavily implied that “the old ways are the best”. That time spent knocking on people’s doors was far more useful than time spent playing with computers. And whilst I would never suggest that time spent knocking on doors isn’t useful and important, I think that time spent playing on computers is just as important and has the capability of reaching a far higher percentage of the electorate far more efficiently. Imagine if candidates had reached a similar conclusion about campaign leaflets (“oh no, we need to actually speak to the voters – we can’t just leave a leaflet”) or party political broadcasts (“one-way communication can’t work – it needs to be a conversation”).

It’s all about getting your message across to as many people as possible as efficiently as possible. You might get away with ignoring all of this at this election. But by the next one, a candidate who doesn’t use digital communication efficiently will look hopelessly outdated.

The People’s Pamphlet

Update: Ok, yes, we admit it. It was an April Fool’s joke. Well most of it was. I’m not really going to be taking a month off to live in a camper van with Tim and Sim-O (though I’m sure it would have been fun!)

But the wiki really exists. And we really want your help to create a pamphlet that we can distribute to the voters of Mid-Bedfordshire.

I expect that Tim and Sim-O will also be coming clean about now. Here are the full details from Tim.


Hopefully you’ll have seen this morning’s posts by Tim and Sim-O about our new project aiming to bring the politics of accountability to the good burghers of Mid-Narnia. Their MP, Nadine Dorries, is famous for avoiding questions that she doesn’t want to answer so we’re going to do our best to ensure that the Mid-Narnians get the answers they deserve during the election campaign. Tim is in charge of high level strategy, Sim-O has sorted the wheels and I’m the project geek.

A project like this has a few interesting challenges for a geek. Firstly I had to hack a GPS system so that it would guide us through the back of the wardrobe. But secondly, and more importantly, I had to come up with a wiki.

“A wiki?”, I hear you cry, “What would a political campaign want with a wiki?” And I’m glad you asked. Because I’m going to tell you. You see, this isn’t just any old political campaign. No, this is Politics 2.0. We’ll be using the power of Social Media. We’ll be crowd-sourcing some of the campaign’s contents [Is that enough buzzwords, Tim?]

We all have our own ideas for what questions Mad Nad should be answering. Personally, I’d like to ask how many foetuses she saw ripping holes in their mothers’ stomachs whilst she was a nurse. But we need to realise that what’s important to us might not be important to the people of Mid-Narnia. Hence the need for the wiki. This afternoon we’ll be throwing it open for people to suggest questions for Ms Dorries. Once we have broad agreement on the contents of the “people’s pamphlet” we’ll lock the page and print copies of the pamphlet to be distributed in Narnia.

But a wiki is a dangerous thing. Particularly on a contentious subject like this. We need to be sure that everyone who contributes is doing so constructively. So we’ve put some measures in place to try and minimise the amount of vandalism. We’re using a standard installation of MediaWiki to which we added the Confirm Accounts extension. This means that only registered account holders will be able to edit the wiki. And we’ll only be handing out accounts to people with confirmed email addresses. So if anyone starts being stupid, we’ll know exactly where to send our strongly-worded emails of rebuke.

However, it seemed to me that this might not be enough. And late last night I had another idea which I was up until 3am implementing. I’ve written another extension which increases security even more. Now you’ll only be able to edit the wiki if you have a webcam attached to your computer. And the webcam will take photos of you whilst you are editing. The photos will be uploaded to a secure server in Switzerland where they will only be accessed in case of a dispute over the authorship of particular changes. I’m sure I don’t need to emphasise the importance of remaining fully clothed whenever editing the wiki.

Still a few wrinkles to iron out – but once I’m happy with it I’ll be releasing the source code under an open source licence.

Looking forward to seeing some of you in Mid-Narnia over the next few weeks.