Greasemonkey Arms Race

A few random thoughts came together in a vaguely coherent form on the way home from Opentech yesterday. Allow me to share them with you.

  1. For years we’ve been trying to persuade web designers to move away from nasty “tag soup” HTML and to use clean semantic markup with stylesheets to control the presentation. This persuasion is starting to be effective.
  2. Greasemonkey allows end-users to remix web pages and change the presentation in ways that suit the users, not necessarily the owner of the web page. For example, there is a Greasemonkey script that rewrites Amazon pages, adding a list of the prices of the current item on other web sites together with links to buy the item from these other sites.
  3. Some web sites may not be altogether happy with end-users being able to do this.
  4. There will therefore be some kind of arms race where content providers try to make it harder for technologies like Greasemonkey to change their pages and the authors of Greasemonkey scripts work to overcome these obstacles.
  5. A well marked-up web page is far easier to alter with Greasemonkey than one constructed of “tag soup”.
  6. Therefore one weapon in the war to prevent your web site being reconstructed by Greasemonkey will be the use of increasingly baroque HTML.
  7. Therefore Greasemonkey is likely to be a major setback in the attempt to encourage sites to use cleaner markup.

What do you think?

Update: Er… thanks everyone for pointing out the obvious errors in my thinking. The major problem was in point 5. It’s Firefox that parses the page, not Greasemonkey. Greasemonkey just traverses the DOM tree built by Firefox. If you break your page enough that the DOM tree is broken, then Firefox (and probably any other browser) isn’t going to be able to display your page.
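To make the correction concrete, here is a rough sketch of the same idea outside the browser. It is only an analogy: Python's standard `html.parser` stands in for Firefox's parser, and the `price` class name, the `extract` helper and both sample snippets are made up for illustration. The point is that a lenient parser absorbs the sloppy markup, so the code looking for the content sees much the same thing either way.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element carrying a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.open_tag = None  # tag name of the element being captured, if any
        self.buffer = []
        self.found = []

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or valueless, hence the "or".
        classes = (dict(attrs).get("class") or "").split()
        if self.open_tag is None and self.class_name in classes:
            self.open_tag = tag
            self.buffer = []

    def handle_data(self, data):
        if self.open_tag is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        # Stray or mis-nested closing tags are simply ignored.
        if tag == self.open_tag:
            self.found.append("".join(self.buffer).strip())
            self.open_tag = None


def extract(html, class_name="price"):
    """Hypothetical helper: return the text of elements with class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical markup: one tidy snippet, one deliberately messy one
# (upper-case tags, unquoted attributes, mis-nested and unclosed elements).
clean = '<div class="item"><span class="price">9.99</span></div>'
soup = '<TABLE><tr><td><FONT color=red><b class=price>9.99</FONT></b><td>'

print(extract(clean))
print(extract(soup))
```

Both calls find the same price. A real Greasemonkey script would be JavaScript walking the DOM that Firefox has already built, but the moral is the same: the parser does the hard work, and baroque markup costs the script author very little.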

I did say my thoughts were only “vaguely coherent”.

And I should clarify that I think that Greasemonkey and semantic markup are both damn fine ideas.

Move along please. Nothing to see here. Just some idiot waffling.

5 comments

  1. While there may be some reactions like this, they are likely to be self-defeating, as sites that obfuscate will necessarily become less useful in the process. Conversely, what we have seen with microformats is that the presence of tools that manipulate these common elements encourages publishers to use them.

  2. I left the Web services session yesterday with similar impressions. The pro-standards, pro-semantic arguments we’ve heard and voiced over the past few years focus on making information available to a wide range of circumstances (devices, software, people, etc.).

    Making information easier to deal with empowers people who wish to deal with that information. Greasemonkey has raised the visibility and ease of repurposing Web content, but Perl modules like WWW::Mechanize and before that LWP have helped developers reuse well developed Web sites before.

    If content owners choose to obfuscate their sites, they make it harder for everyone to interpret their content: search engine spiders, alternative devices, obscure browsers, not just Greasemonkey.

    Despite the potential, I suspect Greasemonkey won’t appeal to most Web users for some time, if ever. Advert blockers, for example, didn’t become popular for some years after they first existed. We’ll see.

  3. It doesn’t seem to have any basis in fact. Greasemonkey does not parse pages; Firefox does. As long as the page shows up in Firefox, Greasemonkey scripts will be able to fiddle with it.

    Now, you might expand the argument to say content providers may try to make their sites unusable for Firefox users, thereby preventing Greasemonkey tricks indirectly. But I don’t think any content provider will be able to afford to do so in the long term, nor do I believe that the number of Greasemonkey users is particularly significant. Only geeks really play with it; even the non-alpha-geeks playing with it are still geeks.

    Then, remember that Opera 8.5 has user scripts; effectively, it comes with a different flavour of Greasemonkey built right into the browser. And I doubt that it will be long before we see something comparable for IE7, once it’s out. In both cases, again, it’s the browser that does the parsing, and once it has eaten the markup, however baroque, scripts have access to it. So breaking the scripts again would require breaking the browser as a whole.

    In light of these facts, I see only one possible course for content providers to deal with user-installed page-modifying scripts: resign themselves to the new reality.

    It is possible that some sort of arms race will initially ensue, but it will take place on the Javascript level, not the markup level, and it won’t last for very long.

    This is what I’m seeing in the future.

  4. I don’t buy it. Sure it’s slightly easier to do Greasemonkey things against semantic markup, but it really isn’t that hard to do them against tag soup. Additionally, the percentage of the web browsing public using Greasemonkey is never likely to be high enough to even come on the radar of most organisations. The benefits of good markup (especially the financial benefits – cheaper to maintain, less bandwidth to pay for) are huge, while using bad markup against Greasemonkey is pretty much ineffective. Hence you’d have to be very, very stupid to use bad markup just to set back Greasemonkey script authors by a few minutes.

    At any rate, this isn’t a new problem: people have been writing screenscraping tools for years, and sites that want to guard against them could have been using deliberately poor markup to do it. Again, this is a very short-sighted position to take.

  5. Firstly, is GreaseMonkey likely to take off in a big way, or is it just going to be one of those things that never really escapes from the hacker community? That will be the key to any arms race. I’d recommend integration with Firefox to kick-start things for GreaseMonkey, since Firefox really does seem to have escaped into the wider world.

    In whose interests is it to write decent HTML though? Only the poor coders who have to maintain it, I’d guess. The mythical man-in-the-street presumably couldn’t care less. So why does an arms race actually matter?

    Is an arms race doomed to fail for one side anyway? For example, given that all HTML code has to be ultimately rendered by a browser, theoretically I’d guess that a program can be written to clean up any code, no matter how obfuscated. On the other side, perhaps it is possible to build an obfuscator, or do some kind of distributed computing trick, such that the code becomes impenetrable. Personally, I’d bet on the GreaseMonkey side always being able to clean up the code.

    Now for my new specialist topic after almost completing my MBA: business strategy. It’s interesting to see what Amazon is doing with its web services strategy, with Don Young doing his stuff (see previous posts).

    At the risk of stating the obvious, there are (at least) two parties with both opportunities and threats here: the modifiers (GreaseMonkey), and the modified (Amazon).

    Let’s look at the sites being modified. In my view, legal threats are unlikely to work, and would not have global reach anyway. I don’t know to what extent a legal argument can be made that whatever you do to the HTML on your own computer is your own business, especially if you don’t redistribute the modified version. It would be difficult to enforce, anyway! So if you can’t stop it (and it’s likely that the bigger modified sites will try), then how do you embrace, extend and make money from it?

    Will it level the playing field for the “parasites” (para-sites, ho ho) at the expense of the big boys? Actually I wouldn’t be surprised if it’s the other way around. The big sites should have economies of scale, so unless the smaller sites are selling at below cost (which could be classed as predatory pricing and thus would be illegal) then the big site should be able to prove that it is more competitive on price than the smaller sites, and that there is no point in shopping around. The trust built up in a brand like Amazon may well also count for a lot. The problem is probably with a Clash of the Titans, where both sites are big and have a lot of brand equity (trust, etc.) built up within them, because the only differentiator ends up being price. Either way, I think big and small competitors must be treated differently to outwit a GreaseMonkey threat.

    If the only differentiator is price, then you’ve effectively got a commodity product. Perhaps the effect of GreaseMonkey, if it takes off, will be to move a lot of products further towards being a commodity. There are two worldviews on commodities: you either hate them and run a million miles to avoid being a commodity supplier due to the low margins, or you embrace them and concentrate on volume, making all those tiny margins add up into a significant total. Which of these views a supplier should choose varies according to many factors, so there’s unlikely to be any generally applicable advice.

    If the supplier wants to avoid this commoditisation by price comparison (which it has to be said the web is still patchy on delivering), it will have to come up with another way to differentiate. Could this be service? Bundling of products? Strategic alliances and partnerships to offer a broader or less comparable offering that cannot easily be imitated or matched?

    It all comes down to those classic cliches: creativity and innovation. Most businesses pay lip service to this; few businesses understand the processes and environments required to make it happen. (Yes folks, creativity and innovation *can* be structured processes; you *can* manage them.)

    Changing the focus now, how can the modifiers make money out of this? I know that a lot of this software is open-source and done for the love of doing it, but I think that the business side of software is interesting so I’m going to talk about it anyway! Could there be some way in which rival suppliers pay the modifier in order to be listed, matching keywords like Google ads? (This is yet another reason why I question Google’s long term future, and certainly its current valuation.) Will sites like Kelkoo or PriceRunner build a toolbar to modify the pages and capitalise on their existing relationships, business model and infrastructure?

    SHAMELESS SALES PITCH: This is potentially a *very* disruptive technology. Hire me as a consultant and I’ll work with you to build a practical strategy for getting the most out of it.
