Hack Day Plans

This weekend is Yahoo!’s Hack Day. And as in the last two years, I’m going to be there. Although (also like the last two years) I’m far too old and soft to consider staying up and hacking through the night. I’ll be leaving at a reasonable time on Saturday evening to get home to a comfortable bed. This will be easier than in previous years as this year’s venue is near the tube network (Alexandra Palace is a lovely venue – but a real bugger to get to).

So the question is, what to hack on. Actually I already have some ideas. And (unsurprisingly for those of you who are regular readers) it’ll be based around the local community stuff that I’ve been writing about (and talking to Lloyd about) recently.

Here’s the current plan.

Building local planets is all very well, but it can be hard work to get a good one going. As I’ve mentioned before, you need good local knowledge to pick up interesting feeds about a location. This certainly doesn’t scale to building local community sites for the whole of the UK (well, not without a lot of help). But I think you can get a lot of the way there – close enough to be useful – with an automated process. Last month I mentioned some feeds that I was using as a basis for all of my local planets. I think that’s an idea that is worth exploring further. There are other feeds that can be added to that list. Things like MySociety’s FixMyStreet and GroupsNearYou. There are also things like TheyWorkForYou’s feeds of when your MP has said something in Parliament.

One problem with this approach is that localities aren’t named consistently. For some of these feeds you need a placename (a Google news search for news mentioning “Balham”) and for others you need a postcode (which MP represents SW12). I’ve been looking at Yahoo!’s GeoPlanet API and it looks like it will get me some way towards solving this problem (as a bonus, there’s already a Perl module for it).

All of which leads me to my plan. A service that builds automated web sites providing local information for communities in the UK. I’m imagining that you put in a post code (or, perhaps, a placename) and it goes away and builds a useful and interesting web site for you.

I have no idea how close I’ll get in 24 hours of hacking, but it will be an interesting experiment. If you’re going to be at Hack Day and this sounds interesting to you, then please get in touch.

Local Planets

Over the last few years I’ve written a few times about how I’ve been building planets. A planet is a web site which aggregates web feeds on a particular topic and republishes them as a combined web site (almost certainly with a combined web feed as well). One of my earliest planets was Planet Balham which combines feeds about Balham, the area of London where I live.
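
As a rough illustration of what a planet does, here is a minimal Python sketch that merges entries from several feeds into one newest-first list. The Entry type, the merge_feeds function and the sample data are all hypothetical. This is not the software my planets are actually built with, and it skips fetching and parsing the feeds entirely.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class Entry:
    """One item from a feed (hypothetical structure for this sketch)."""
    feed: str
    title: str
    link: str
    published: datetime


def merge_feeds(feeds: Iterable[Iterable[Entry]], limit: int = 20) -> List[Entry]:
    """Combine entries from every feed, newest first, dropping duplicate links."""
    seen = set()
    merged = []
    for feed in feeds:
        for entry in feed:
            if entry.link in seen:
                continue  # the same story can turn up in more than one feed
            seen.add(entry.link)
            merged.append(entry)
    merged.sort(key=lambda e: e.published, reverse=True)
    return merged[:limit]


# Made-up sample data standing in for two already-parsed feeds.
news = [
    Entry("news", "Market reopens", "http://example.org/news/1", datetime(2009, 5, 2, 9, 0)),
    Entry("news", "Road closed", "http://example.org/news/2", datetime(2009, 5, 4, 8, 30)),
]
photos = [
    Entry("photos", "High Road at dusk", "http://example.org/photos/7", datetime(2009, 5, 3, 18, 15)),
]

planet = merge_feeds([news, photos])
for entry in planet:
    print(entry.published.date(), entry.feed, entry.title)
```

A real planet would then render the merged list as a web page and as a combined feed of its own.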

The idea of using the internet to bring together local communities has been gaining a lot of traction recently, so I’ve been doing a bit of work on Planet Balham firstly to improve the design and secondly to make the content as interesting as possible. I’ve also been promoting it a bit and, as a result of that work, someone suggested to me earlier today that a Planet Streatham might also be useful. For those of you who don’t have an encyclopaedic knowledge of London geography I should probably point out that Streatham is the area to the east of Balham.

I have a pretty good system in place for building planets quickly so in my lunch break I threw together a quick prototype for Planet Streatham. And then (because I was on a roll) I did Planet Tooting and Planet Clapham too.

Of course, the problem with building planets is finding good content. For Balham, I have some local knowledge and I’m pretty happy with what I have[1]. For Streatham, Tooting and Clapham I have less local knowledge and, to be honest, less inclination to research the subject. But I’ve discovered that you can actually build a decent local planet with just a few standard feeds. And those are what I’ve used for the new planets. I think these would be a good start for any local planet. The great thing about them is that it’s easy to customise them for any other location.

In all cases, I hope it’s obvious how to customise the link.

Automated searches aren’t without their problems, of course. Since I’ve been following Google’s news search for “Balham”, I’ve learned more than I really wanted to about Nebraskan basketball player Chris Balham. But that’s only to be expected and the “real” results far outweigh the problems. Initial results indicate that the problem might be worse for Planet Tooting. Tuning the search terms – perhaps to include “London” – might be an improvement.

So I have basic planets for Streatham, Tooting and Clapham. I don’t intend to spread my empire any further. But I’d really like to see more local planets like this springing up. I’ve already had a couple of people contact me on Twitter about creating others. There’s already a Planet SE16, but there’s no reason why every part of London shouldn’t have one.

The technology isn’t hard. My planets are built with my own software, but I expect pretty much any language will have some kind of planet application available. I’ll write in more detail later this week about how I’ve built mine, but if I’ve inspired you to build one, please let me know and I’ll start some kind of directory.

[1] Still interested in adding more though, let me know if you know of a good local feed that I’m missing.

Overcomplicating Matters

It is, of course, a truism that the larger and more complex a project is, the less likely it is to come in on budget, on spec or in time. When the project in question involves IT, the chances of any of the original targets being met approach zero.

This is why one of the tenets of the Agile Programming movement is “the simplest thing that can possibly work”. When faced with a problem, solve only the current problem. Don’t waste time complicating matters by adding extra features (“you ain’t gonna need it” is another of their slogans).

Writing a blog engine is a pretty simple project. A reasonable programmer could produce a pretty good first attempt in a weekend given the right tools. But ask any of the world’s most talented programmers for a blog engine and the chances are that they won’t spend the next two days writing one. Because that’s not the simplest solution.

The world already has more blog engines than it really needs. And many of the existing tools are quite capable of handling just about any requirement that you might have. Unless you have very specific requirements there is just no need to write your own.

Hold that thought.

Five or six years ago, blogs started to become popular. Tim Ireland wrote this article. What Tim realised was that the blog format was a great tool that politicians could use to communicate with the electorate. By their very nature blogs encourage two-way communication. The blogger posts information that they want people to be aware of and people can add their comments. Also, the web feeds that are an essential feature of blogging engines make it easy to disseminate and aggregate this information.

Later on, Tim launched the Political Weblog Project, where he offered to set up blogs for any politician who wanted one. What Tim and his team would have done was to set up blogs using the existing free tools like Blogger and Movable Type. They also wanted to offer advice on the most effective way to use blogs. As I understand it, only two MPs (Tom Watson and Boris Johnson) took Tim up on his offer.

Time passes. At some point in the last couple of years the popularity and usefulness of blogs finally started to seep into Westminster. MPs started to want blogs. A small number of MPs did “the simplest thing that could possibly work” and got a blog on Blogger or WordPress. Many others didn’t do that. And that’s where things start going wrong.

I have an interest in blogging MPs. I run a site called Planet Westminster which attempts to aggregate all of the blog postings from Westminster MPs. At the last count there were about 40 MPs blogging. I say “about” as it’s hard to keep track. Some MPs have a burst of enthusiasm for a few weeks and then give up. It’s hard to be sure when their blogs are dead so I can remove them from the list.

But the biggest problem I have is that most MPs’ blogs are crap. And I don’t mean that what they write in them is crap (though that’s certainly true for many of them). I mean that they are crap from a technical perspective. When faced with the desire to set up a blog, it seems that most MPs had no idea where to start. And that for some reason many of them ended up with horrible proprietary systems that bolted on to their existing web site. These systems were often written by people who really didn’t seem to understand the simplest things about how blogs or the web worked[1]. One good example is Nadine Dorries’ blog. A basic requirement for a blog is that each entry has an address (a “permalink”) which refers to that entry uniquely and permanently. Dorries’ blog has some weird date-based system which gets horribly confused if she blogs more than once a day.
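
To show why an address built only from the date can’t work as a permalink, here is a small Python sketch. The function names and the URL layout are made up for illustration. I have no idea how the system in question actually builds its addresses.

```python
import re
from datetime import date


def date_permalink(posted: date) -> str:
    """Date-only address: every post made on the same day gets the same URL."""
    return f"/blog/{posted:%Y/%m/%d}/"


def slug_permalink(posted: date, title: str) -> str:
    """Date plus a slug derived from the title, so same-day posts stay distinct."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"/blog/{posted:%Y/%m/%d}/{slug}/"


day = date(2009, 5, 4)
titles = ["Morning surgery", "Evening debate"]

date_links = {date_permalink(day) for _ in titles}
slug_links = {slug_permalink(day, t) for t in titles}

print(len(date_links))  # both posts collapse onto one address
print(len(slug_links))  # one address per post
```

Even the slug version would collide if two posts on the same day shared a title, so a real blog engine would also need a uniqueness check or a numeric ID.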

Planet Westminster is a feed aggregator. So most of the contact that I have with MPs’ blogs is through the web feeds that they produce. Web feeds seem simple enough to produce, but the various formats are picky enough that it’s a hard job to get exactly right. That’s another reason for using the existing tools. They get it right far more than some home-brewed system will. My local MP is Martin Linton and it was a problem with his web feed that galvanised me into writing this article. I’ve been tracking Linton’s web feed for several years. It often vanishes without warning and, on further investigation, I find that it has changed address (there are methods for handling that without losing existing subscribers – but that’s another area where MPs’ IT knowledge seems to be lacking).

Earlier today I realised that I had seen nothing from Linton’s feed for some weeks. Checking his site I saw that the feed has moved again. The new address is:


Now, I know I can be picky about wanting nice-looking URLs. But, honestly, how much faith can you put in a system that produces URLs like that? Unsurprisingly, the answer is “not very much”. Checking the feed with a feed validator reveals a relatively small number of errors, but they are really serious ones. In particular, having incomplete URLs in a web feed renders it almost completely useless. I should run all of the MPs’ feeds through the validator. Well, really, the people creating their feeds should. They might learn something useful.
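
The incomplete URL problem is easy to check for mechanically. Here is a minimal Python sketch that looks for link elements without a scheme or host in an RSS 2.0-style feed. The sample feed and the incomplete_links function are made up for illustration. It is no substitute for a proper feed validator, which checks far more than this.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Made-up feed: one item has a complete URL, the other does not.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example MP</title>
    <link>http://example.org/</link>
    <description>Made-up feed for illustration</description>
    <item>
      <title>Good item</title>
      <link>http://example.org/blog/good-item</link>
    </item>
    <item>
      <title>Broken item</title>
      <link>/blog/broken-item</link>
    </item>
  </channel>
</rss>
"""


def incomplete_links(rss_text: str) -> list:
    """Return every <link> value in the feed that lacks a scheme or a host."""
    root = ET.fromstring(rss_text)
    bad = []
    for link in root.iter("link"):
        url = (link.text or "").strip()
        parts = urlparse(url)
        if not (parts.scheme and parts.netloc):
            bad.append(url)
    return bad


print(incomplete_links(SAMPLE_RSS))
```

A feed reader sees each link out of context, so a link like the broken one above gives it nothing to resolve against.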

So we have a situation where a small number of MPs are publishing blogs and most of the ones who are doing so are using seriously sub-standard tools. And this is where we come back to my original point. The simple blog systems that are already out there are perfectly adequate for what our MPs want to do. In most cases using the tools is free and it takes less than an hour to set up a blog that is more functional than the ones that most MPs currently have.

I know that most MPs run their office on a tiny staff. And that they probably don’t employ IT experts. But every week thousands of people set up blogs and they do it using the existing tools – because it is cheap and effective. Even people like Iain Dale who know nothing at all about blogging have been able to choose a decent blog platform. Why do our MPs feel that they need something more complex? Why do they waste time and money on systems that aren’t as good as the free solutions? It doesn’t need to be that complicated.

I believe that Tim Ireland’s points from 2003 are still valid. Blogs can be an important and useful tool for politicians. And in the run-up to next year’s General Election they will become more and more important. I predict we’ll see a large increase in the number of blogging MPs over the next year.

So I’m going to repeat Tim’s offer from 2005. If any MP wants a blog set up for them, then I’m happy to help them or to put them in touch with someone who can help them. It needn’t be expensive. It needn’t be complex. But it can be very effective. And it will work.

[1] A lesser writer might make the point that a large proportion of these broken systems are written in ASP. But we’re way above such petty point-scoring here.

Internet Genealogy

In 1992 I started tracing my family history. The two main tools for amateur genealogists (at least until they get back to about 1840) are the indexes of registrations of births, deaths and marriages and the returns from the census which has been taken every ten years since 1841 (there are earlier censuses, but they don’t record individual names).

Back when I started, accessing these records was a painfully manual process. The BMD indexes were held in large leather-bound volumes in St Catherine’s House on the Aldwych. The members of the public were free to search these volumes looking for references to their ancestors. Once you had the reference numbers you needed, you could fill in a form, pay £5.50 and a week or so later a copy of the certificate would drop through your letterbox.

The census records were slightly easier to deal with. They had been scanned onto film and microfiche, so you had to go to the Public Record Office on Chancery Lane to spend hours searching for your ancestors’ names – often written in a hard to read nineteenth century hand. And, of course, the records were ordered by parish, so if your ancestors moved it became a very hit and miss affair. I spent many days hunched over a microfiche reader or risking physical damage by lugging the oversized BMD indexes around and I still have piles of notebooks full of the notes I took over fifteen years ago.

I largely stopped research several years ago. It was just too hard to make much progress. I didn’t have the time to put into it. Towards the end of the period I was working on the project the census and BMD records were both brought together in the Family Records Centre in Islington, but the basic process was still just the same.

Recently I decided to get back into tracing my family tree. And I’m amazed to see how much things have changed in the intervening years. These days you can do most of what I did fifteen years ago from the comfort of your own home. All of the census records are available online as are a large proportion of the BMD indexes. I put this down to a combination of two factors. Firstly the Public Record Office (who own the census) and the General Register Office (who own the BMD data) have become more aware of the potential of sharing this data across the internet. And secondly there has been a massive increase in the public’s interest in genealogy. This is obvious from the success of TV shows like Who Do You Think You Are and the large number of family history magazines that are published each month.

It hasn’t all been successful. When the 1901 census was first put online in 2002, the site soon collapsed under the strain and remained unavailable for over six months. These days the government just licenses the data to commercial organisations like Ancestry and FindMyPast and lets them deal with the scaling issues. This leads to a slightly confusing situation where different companies have access to different sets of census data and you might end up having to pay more than one company in order to have access to all of the data you need. It’s not ideal but it’s far better than it was when I started out.

For example, all of the census search sites have indexed the data. That means that you’re no longer just skimming scans of the original documents. You can search for names and you’ll be given a list of matching records from anywhere in the country. That has helped me track down a large number of previously missing ancestors. Of course, you’re relying on someone else’s interpretation of nineteenth century handwriting, but you get used to typical transcription errors. I’m finding that my mother’s surname, Sowman, is often mistranscribed as “Lowman”. An easy mistake to make if you see the original document.

BMD records are also being indexed. But at a slower rate. A wonderful project called FreeBMD are working on it. Currently their coverage is great for the nineteenth century, but patchier for the twentieth. They’re working on that though and are still looking for volunteers to help with the project.

Soon after I started out, in the 90s, I bought a book called “The Genealogist’s Internet”. To be honest it was rather a desultory affair. There wasn’t much out there and what there was had been created by genealogists with very little knowledge of the power of the internet. Recently I bought a copy of the fourth edition. And what a change there has been. These days the internet has amazing amounts of genealogical data available. The book’s web site has a links section which I’m still working my way through almost a month later. Plenty of interesting stuff there.

If you’re interested in tracing your family tree then now is a great time to start. You can make great progress just sitting in front of your computer. I’ve got my tree back to the late eighteenth century without any trouble at all. And I’m from a line of complete peasants who made no mark on the world at all.

If you try to trace your family (or if you already have), I’d be very interested in hearing how you did.

TV Licence by Email

I’m not the world’s most organised person. I’m forever losing important pieces of paper. Over the Christmas break I went through some of what I laughably call my filing system and attempted to impose a little more order.

One of the important pieces of paper I found was my TV licence. In an attempt to cut down the number of important pieces of paper I have to keep track of, I visited the TV Licensing web site and was happy to see that they have a Licence by Email facility. That seemed to be a good idea, so I signed up.

Today I got my first email from them. Opening it up I saw the text:

Your TV Licence is available

And nothing else. No information on how I could get my licence, or anything useful like that.

I realised what had happened. For various reasons I always read incoming mail in plain text mode. When I switched Thunderbird to HTML mode I saw that there was a carefully constructed HTML section as well and that this section had lots more information, including a link to a PDF of my new licence.

It’s nice that they bothered to create a text portion of the mail (many people still don’t). But I’m sure it could have been a bit more useful. Perhaps they could have included the link to my licence in the text portion. That way I wouldn’t have had to open the HTML version at all.
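
For what it’s worth, putting the same essential information in both parts isn’t hard. Here is a minimal sketch using Python’s standard email library. The addresses and the licence URL are hypothetical, and I’m only guessing at the shape of the message. It builds the mail but doesn’t send it.

```python
from email.message import EmailMessage

# Hypothetical link, used only for illustration.
LICENCE_URL = "https://example.org/licence/12345.pdf"

msg = EmailMessage()
msg["Subject"] = "Your TV Licence is available"
msg["From"] = "licensing@example.org"
msg["To"] = "viewer@example.org"

# Plain text part: carries the same essential information as the HTML part.
msg.set_content(
    "Your TV Licence is available.\n\n"
    f"Download it here: {LICENCE_URL}\n"
)

# HTML alternative for mail clients that prefer it.
msg.add_alternative(
    "<p>Your TV Licence is available.</p>"
    f'<p><a href="{LICENCE_URL}">Download your licence (PDF)</a></p>',
    subtype="html",
)

# What a reader in plain text mode would see.
text_part = msg.get_body(preferencelist=("plain",))
print(text_part.get_content())
```

The result is a multipart/alternative message, so each mail client picks whichever part it prefers and nobody loses the link.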

It’s a real shame when you see people trying to do the right thing, but just not understanding enough to get it right. I bet someone made a lot of money designing that system.

Oh, and the mail came from a “do not reply” email. That’s just rude. What’s the point of an email address that you can’t reply to?

Combining Google Accounts

Somehow over the last few years I have acquired two Google accounts. One of them is associated with my Gmail email address and the other is associated with my dave.org.uk address. Recently I heard that the G1 phone ties itself to a single Google account when you activate it, so if I’m going to get a G1[1] then I need to combine them as far as I can.

This has proved to be a bit of a battle. And as it gives an interesting insight to how Google’s tools aren’t quite as integrated as they would like you to think they are, I thought I’d write up my experiences so far.

My first approach was to find some way to just merge the two accounts. That would have been great – just take the data from both accounts and combine it. But there wasn’t an option to do that. I could add other email addresses to my dave.org.uk account, but they explicitly stop you from adding Gmail accounts. So I was left with trying to combine things a product at a time. I decided that I wanted to move everything over to the Gmail account.

Google Calendar
I’ve been using Google Calendar a lot recently. But it was on the dave.org.uk account. So I wanted to move control of that calendar to my Gmail account. That proved to be impossible. I could give the Gmail account complete access rights to the calendar, but I couldn’t give it ownership. In the end I exported the calendar to a .ics file and imported it into the other account.

Google Docs
I have a number of documents in Google Docs. As with Calendar, it’s easy enough to give another account complete rights to access and update your documents. And, even better, there’s a new feature to transfer ownership of documents to another account. There’s a rather scary-looking warning that you can only transfer ownership to another account from the same domain, but that didn’t seem to be a problem as I was able to transfer documents from my dave.org.uk account to my Gmail account. Well, I could transfer some of my documents. For some reason, this feature isn’t currently supported for spreadsheets. So I have about a dozen spreadsheets that are still owned by the wrong account. I suppose I can download them as OpenOffice files and then recreate them in the other account. But it seems rather a roundabout approach.

Google Analytics
This worked well. I could add another account as an administrator of my Google Analytics account. And then that account could remove access from the original account. If only all the transfers were as simple as this one.

Google Adwords
I have a couple of small ad campaigns running through Google Adwords. This transfer was supposed to be simple. You can replace the owning Google account with another Google account. Except, apparently, my Gmail account was already the owner of another Adwords account. This might be to do with the connection between Adwords and Analytics. Anyway, I just closed down the old account and opened a new one.

Google Adsense
This is the one that it’s most important to get right. I don’t want to lose any money from my Adsense account. And I’d really like to hold on to all of the historical data from the existing account. I can’t see any way to transfer control to another account, so currently I’m thinking that I might have to keep the old account open. If anyone has any advice, I’d love to hear it.

Google Maps
Trivial but annoying. I’ve got a map stored in Google Maps (it’s the one on my Livery Companies site). As with Calendar, I can share it with the other account but I can’t actually transfer ownership (as far as I can see). It would only take an hour or so to recreate it, but it’s annoying to have to take that time.

Google Groups
Another slightly annoying one. I can obviously unsubscribe from all of my dave.org.uk groups and resubscribe from Gmail (there are only eleven of them). But then I’d get the mail in Gmail and I’d really rather that it continued going to dave.org.uk. I suppose what I’d like to do is to make another email address the main address on the old dave.org.uk account, then move the dave.org.uk address into the Gmail account. I haven’t looked to see if you can do that yet. Something to try this evening. [Update: I’ve just looked. It seems you can’t remove the primary email address on an account]

Having so many linked services run by one company is supposed to make life easier. But having battled with this over the last couple of weeks, it’s clear that these services aren’t as closely linked as you think they are.

I wonder what proportion of Google’s customers have multiple accounts, and how many of them have tried to correct that. I bet most of them just give up.

If anyone has any stories about this (or, even better, inside information) I’d love to hear them.

[1] Actually, having had experiences similar to Nik’s it’s becoming less and less likely that I’ll get a G1. But I still think this is a useful exercise.

Livery Companies – Project Complete

(Well, stage one of the project, anyway.)

A couple of years ago whilst I was working in the heart of the City of London, I noticed that my lunchtime wanders were taking me past a few of the City Livery Halls. I’d always been aware of the Livery Companies, but I’d never really investigated them, so I didn’t know how many of them there were or how many still had Livery Halls. So I decided to find out a bit more about them.

I also started taking photos of the halls that I passed. Of course, when you have the collector gene that I have, just taking pictures as you wander past buildings randomly isn’t enough. I had to find out where all of the remaining halls were and get pictures of them.

And finally, a couple of months ago (as I was walking to a London.pm meeting) I took photos of the last three. I only uploaded them to Flickr last night as I had some trouble with Shozu (which may or may not be related to the general phone weirdness I mentioned last week). But anyway, I fixed the phone last night and was able to upload the final pictures. So now I have a set of photos which (as far as I know) contains all of the Livery Halls. There are forty-one pictures in the set, but one of them is a plaque marking the site of the Cooks’ Hall which is no longer there (they kept burning it down). If you know of any I’ve missed, I’d love to hear about it.

Why do I say that this is just the end of stage one of the project? Well, I was a bit disappointed to see that there was no good site on the web to get information about the Livery Companies. What information there is out there is scattered amongst a number of sites. So I decided to put that right. I’m in the process of building liverycompany.org.uk which will hopefully become the definitive place on the web to find information about these fascinating institutions.

More Password Idiocy

When will web sites start to be careful with people’s passwords? Oh, I know that a few sites get it right, but it seems to me that the vast majority still don’t have a clue what they are doing. Here is today’s example.

I got an email this morning from a company called RAM (that’s Research and Analysis of Media). Somehow they knew that I was an (occasional) Observer reader and they were inviting me to join a panel that would (as I understand it) answer occasional surveys about the Observer. It sounded like a good cause, so I signed up. As part of that process I gave them both a username and a password. They immediately confirmed my sign-up by sending them both back to me in an email.

That is, of course, a serious cause for concern, but there’s a slim chance that they aren’t storing my password in an accessible form in their database. The mail might have been generated from the data in the web form I filled in. However, an hour or so later I got another mail from them telling me how to log into my account and including my username and password. In fact, that one email contained all of the information needed to log into my account (web site address, username and password). So they have established themselves as a company who can’t be trusted with your password.
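
For anyone wondering what the alternative looks like: a site never needs to keep your password in a form it can mail back to you. Here is a minimal Python sketch of salted, slow hashing using only the standard library. The function names and the iteration count are my own assumptions for illustration, not a recommendation tuned for any real system.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this sketch; tune for real use


def hash_password(password: str) -> tuple:
    """Return (salt, digest); only these are stored, never the password itself."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-derive the digest from the login attempt and compare in constant time."""
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(attempt, digest)


salt, digest = hash_password("correct horse")
print(check_password("correct horse", salt, digest))
print(check_password("wrong guess", salt, digest))
```

With a scheme like this the site can check a login attempt but has nothing it could put in a reminder email, which is exactly the point.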

On the off-chance that they wouldn’t be sending me any more mails containing those details, I thought I’d try to return at least a small amount of security to my account by changing my password. Except that there is apparently no way to change your password from within your account. By this stage they are breaking records for password stupidity.

I’ve contacted them about the problems and sent them a link to my basic guide to password handling article. I’ll let you know if I get any response. I hope their surveys are constructed with a little more thought than their web site.

Update: I heard back from them about not being able to change my password. You can do that in an “update profile” screen. Not sure why I didn’t spot that last week. Nothing from them about the password storage issues though.

Twitter Hierarchy

For most of the last year, I’ve been working behind a corporate firewall which blocks most social networking sites. It’s therefore only in the last month or so that I’ve been able to use Twitter all day every day.

It seems to me that many of Twitter’s users have slightly distorted the site’s original purpose. It was originally intended to be used for posting brief “I’m doing this” messages. But many people seem to be using it to hold conversations with their friends. It’s become a sort of “non-instant messaging”. Interestingly, the site’s developers noticed this change and added features (like replies) which made it easier to use the site in this way.

But there are still places where the site’s origins are obvious. On anyone’s profile page you can see two numbers listing the number of people that person is following and the number of people who follow that person. But actually the Twitterverse doesn’t break down into two sets like that. There is a more interesting set of three numbers. For most people their sets of followers and followees aren’t disjoint sets. There is another set of people who both follow you and are being followed by you. Let’s call them your peers.

So we have three sets of people. The people who you follow but who don’t follow you in return (people you think are interesting but who don’t think you are interesting enough to follow), your peers and the people who follow you but who you don’t follow in return (people who think you are interesting but who you don’t think are interesting enough to follow). There’s probably a whole cyber-sociology paper in analysing the ratios between the sizes of those three groups for different types of people.
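
The three groups fall straight out of basic set operations. Here is a tiny Python sketch with made-up account names. A real version would need to fetch the two lists from Twitter, which I’m not attempting here.

```python
# Made-up account names for illustration.
following = {"alice", "bob", "carol", "dave"}
followers = {"carol", "dave", "erin"}

peers = following & followers           # you follow each other
only_following = following - followers  # you follow them; they don't follow you
only_followers = followers - following  # they follow you; you don't follow them

print(sorted(only_following), sorted(peers), sorted(only_followers))
```

The sizes of the three sets give the ratios mentioned above.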

But the important thing is that you can only carry on a conversation with people in your peer group. It reminds me of the old Frost Report sketch about class differences. The people higher than you in the food chain don’t listen to what you say. A few times I’ve missed things that people said to me because I’m not following them and simply adding “@davorg” to your message doesn’t add it to my home page (think of the spam potential if it did).

I get round this by using Twitter Search (previously Summize) to search for messages to me. Actually I go a step further than that and have a feed from that query in Bloglines. Is that a common solution to the problem? What do other people do? Is it a problem that you’ve noticed?

Another, related, issue is how do you move up the hierarchy? Is there an etiquette for contacting people who you follow but who don’t follow you? Can you just send them a direct message saying “hey I’m interesting, follow me”? And is anyone being inclusive and automatically following anyone who follows them?

Oh and what does Twitter have that Pownce, Jaiku or identi.ca don’t have? Is it just the number of users? Will we ever see a big move from Twitter to identi.ca like the MySpace to Facebook move of last year?

Update: hanakomu points out (on Twitter of course) that if someone replies to you then the message appears in your ‘replies’ tab whether or not you’re following them. Also, people get a mail when you follow them – but I think that’s probably optional.