Recently in xml Category

Martin's article today about the Daily Express web site reminded me that it's been some months since I looked at my list of Newspaper RSS feeds. As the list is created by screen-scraping the individual papers' web sites, it's no surprise that it all goes out of date as the sites are redesigned and updated.

And sure enough, it was a real mess. When I ran the program that generates the pages, about half of them were broken. But it wasn't too serious, and after half an hour or so of tinkering with regular expressions, it all seems to be working again.

But all in all, it's a good lesson in why screen-scraping is a really bad idea. This would be far easier (in fact it would pretty much be unnecessary) if the papers took the next step and released OPML files of their feeds, rather than free-form web pages.
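
To make the contrast concrete, here is a minimal sketch in Python of what reading a paper's feed list could look like if it were published as OPML. The sample document, the `paper.example` URLs and the `list_feeds` helper are all made up for illustration; this is not the program that generates the newsfeeds pages. It assumes each feed is an `outline` element carrying an `xmlUrl` attribute, which is the usual convention for subscription lists.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML file of the kind a newspaper might publish.
SAMPLE_OPML = """<opml version="1.1">
  <head><title>Example Paper feeds (hypothetical)</title></head>
  <body>
    <outline text="News">
      <outline text="UK News" type="rss" xmlUrl="http://paper.example/rss/uk.xml"/>
      <outline text="World News" type="rss" xmlUrl="http://paper.example/rss/world.xml"/>
    </outline>
    <outline text="Sport" type="rss" xmlUrl="http://paper.example/rss/sport.xml"/>
  </body>
</opml>
"""


def list_feeds(opml_text):
    """Return (title, url) pairs for every outline that points at a feed."""
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() walks nested outlines too, so category groupings don't matter.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # grouping outlines have no xmlUrl and are skipped
            feeds.append((outline.get("text", ""), url))
    return feeds


if __name__ == "__main__":
    for title, url in list_feeds(SAMPLE_OPML):
        print(f"{title}: {url}")
```

No regular expressions are involved, and a site redesign wouldn't break anything as long as the OPML file stayed in place.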

Anyway, it's all back again now. Please take a look and let me know if I'm missing anything obvious.

smok0.com

Mike has emailed me to point out smok0.com which shows the top news stories from a number of different news outlets. Apparently he's making use of the OPML files from my newsfeeds page.

Nice to know some of my stuff is useful.

I've been a bit busy this week, so I haven't mentioned the Amazon Web Services talk that I went to on Monday evening. Amazon Web Services Evangelist (cool job title) Jeff Barr talked for almost two hours about what Amazon are making available. It was all very interesting and I wish I had more time to investigate it in depth.

I had an idea for an application during the talk but then Jeff mentioned Wish List Buddy as an example. It doesn't do all that I was thinking of, but it's a start. Maybe I'll find time to work on my version at some point in the next few weeks.

Oh, and I loved the phrase "artificial artificial intelligence" to describe Amazon's Mechanical Turk service where you can use human input as part of your application.

Thanks to Jeff for giving the talk and Dean for organising it.

Dean Wilson is fast becoming known as the person who organises the best geek talks in London. He has previously arranged hugely successful nights for both GLLUG and london.pm. The web frameworks night that he organised last November is already legendary.

And he has another extravaganza on the way. Next Monday he has arranged for Jeff Barr of the Amazon Web Services group to give a talk at the New Cavendish St campus of Westminster University. Full details are on Dean's blog.

It looks like it'll be a great night. Hope to see you there.

Another book review. This time it's XML Hacks. O'Reilly's Hacks books are great and this one is no exception.

The BBC Backstage project has announced a new data feed which contains details of BBC TV and radio programmes for the next seven days. Looks interesting but, of course, what I really want is details of the "listen again" radio streams so I don't have to screen-scrape them.


