alan little’s weblog

fencing the commons

13th December 2005

Every couple of months I look in the logs of and, among other things, worry about the number of bad links in there and decide to do something about them. What I did about them this time was write a link checker and let it loose on the site.

As of last night there are 3,585 internal links within, counting both pages and links within pages, 49 of which are bad. Well done me – that’s not a bad level of output, and nearly 99% correctness, for one man in five years of his spare time, during which he also managed to have major adventures travelling the world, and get married and start a family.
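For anyone wondering what a link checker of this sort involves, here is a minimal sketch in Python. It is not the script that produced the numbers above: the function names, the `site.example` host and the choice of the standard library's `html.parser` and `urllib` are all assumptions made for illustration. It pulls the `href` values out of a page, resolves them against the page's own address, separates internal from external links, and reports the ones a reachability check rejects.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import urllib.error
import urllib.request


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return absolute URLs for every link in the page (hypothetical helper)."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


def split_internal_external(links, site_host):
    """Internal links share the site's host name; everything else is external."""
    internal, external = [], []
    for link in links:
        if urlparse(link).netloc == site_host:
            internal.append(link)
        else:
            external.append(link)
    return internal, external


def is_reachable(url, timeout=10):
    """True if the URL answers with a non-error status. Needs network access."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        return False


def find_bad_links(links, check=is_reachable):
    """Return the links for which the reachability check fails."""
    return [link for link in links if not check(link)]
```

The sketch only tests whether a page answers at all; checking links within pages would also mean confirming that the target page actually contains the named anchor. For the record, 49 bad links out of 3,585 works out to about 98.6% good ones.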

The entropy level for external links is much higher. One external site I have a lot of links to is the old yahoo ashtanga yoga discussion group; these all now redirect to a yahoo logon page. I worried five years ago that this might happen:

Usenet had advantages over yahoo clubs. It was a decentralised, distributed system - anybody could set up a news server and carry messages from whatever newsgroups they chose, so it wasn't dependent on the whims of one company, whereas yahoo could theoretically choose to delete or deny access to message archives any time they felt like it.

Yahoo haven’t deleted or denied access to anything, nor do I think it’s particularly likely that they would. They’re not even charging money for access – registration for a yahoo account is free as long as you’re prepared to put up with looking at adverts. They have, however, made it unlikely that many people are going to read any of the things I laboriously linked to, some of which may still have some value. (I stopped using that discussion group years ago, but that was because the noise-to-signal ratio got too high, not because of anything yahoo did.)

I don’t have anything in principle against yahoo’s behaviour here. They are a commercial organisation, and I freely chose to put my time and effort into writing and linking to things on their servers without ever (as far as I can remember) paying them a penny for the resources I was using, knowing that they were bound to try to make money from it somehow sooner or later. Good luck to them; I have nothing whatsoever against a couple of early web enthusiasts having been in the right place at the right time, having a good idea and becoming billionaires as a result. People have become rich doing far worse things.

Nevertheless, I’m old enough to remember and be a little bit nostalgic for the (very end of the) pre-web Internet and the early days of the web, when most of the interesting things were being done for free by idealistic volunteers. Nowadays the question uppermost in everybody’s minds is “how are we going to pay for/make money from this?” It may look ugly at first sight, but this is actually a more honest question than the number one question from the good old days, which was often “how are we going to misappropriate (mostly university) resources paid for by other people’s (mostly the taxpayer’s) money, in order to do what we feel like doing with them instead?”

In related news, yahoo bought last week. Congratulations and good luck to the people behind, who had a good idea, implemented it very well and thus provided a useful service to a lot of people, including me. I do hope yahoo don’t start slapping advertisements all over my precious links or charging me money to see them, even though in principle I admit they would be perfectly within their rights to do so.

UPDATE: a little oh, shit moment this evening when my script for backing up my links reports a socket timeout error. Oh no, they disabled the API already! No they didn’t – the second attempt works. I’m glad I have that backup script though, and will be using it more often.
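A backup script can be made to shrug off a one-off timeout like this by simply trying again. The following is a minimal sketch in Python, not the actual backup script: `with_retries`, its parameters and the `fetch` callable it wraps are hypothetical names, and the only assumption about the real script is that a timeout surfaces as `socket.timeout`.

```python
import socket
import time


def with_retries(fetch, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch() and return its result, retrying if it times out.

    fetch is any zero-argument callable (hypothetical: e.g. the function
    that downloads the links). Errors other than timeouts are not retried.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except socket.timeout as error:
            last_error = error
            # Pause before the next try, but not after the final failure.
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error
```

Wrapping the download step as `with_retries(download_links)`, where `download_links` stands in for whatever the script already does, would have turned this evening's scare into a short pause.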

UPDATE UPDATE: (19th December) a week later, is down for a day. Power outages happen, though, and being bought by yahoo a week or two earlier would probably have helped in this particular situation if it meant the service had already been transferred into one (or more) of yahoo's proper industrial grade server rooms.

all text and images © 2003–2008