alan little's weblog
traffic to alanlittle.org
24th July 2003
I've been doing some analysis of the visitors to alanlittle.org, and thought the results could conceivably be of interest to somebody.
First, to establish my credentials as the owner of a small, obscure website with a statistically insignificant amount of log traffic to analyse, here's a chart of page views per month since the beginning of alanlittle.org in October 2000.
Traffic has been growing steadily - or had been up until the last couple of months, when I became a father and didn't have time to write anything any more. In total I have around 1.1 million raw hits in my logs, of which around 200,000 represent successful views of html pages by what appear to be real people using browsers. (The rest are errors, page views by robots and hits on image files. Image hits are a high proportion of the total because much of the site is a photo gallery.)
What browsers are my human viewers using?
No surprises there. Netscape 4 is rapidly dying out but most of its share is being taken by Internet Explorer. Modern standards-compliant browsers like Mozilla and Safari are only increasing slowly, although Mozilla has finally overtaken Netscape 4 as the number two browser. (I do also get some hits from other minor browsers like Opera and OmniWeb, but I've excluded anything with under one percent share from the charts to avoid clutter)
I'm particularly interested in what percentage of viewers I'm getting using Macs. I bought my first mac this time last year and have been very happy with it. I've also suspected for quite a while that the subject matter of my website, primarily photography and yoga, might appeal to a disproportionately Mac-using audience.
Well, my share of page views by Mac users is around 10%, which is certainly significantly higher than Apple's market share of new computers sold (2.3% in the US, the last I heard) and probably somewhat higher than their share of installed base (I'm guessing - haven't seen any reliable estimates for the Apple installed base figure). On the other hand, it's gently-but-steadily declining as Windows gently-but-steadily increases, and Linux and other Unix variants remain insignificant. (I would be interested in being able to distinguish between OS X and Mac Classic visitors, but I don't know how to do that reliably using User Agent Strings.)
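For what it's worth, here is a rough sketch of the kind of user agent matching involved in bucketing hits by platform. The substrings it looks for are assumptions based on commonly seen user agent strings, not a reliable rule, and the class and enum names are made up for illustration. It deliberately keeps a separate "Mac, version unknown" bucket, because a string that only says something like `Mac_PowerPC` doesn't reveal whether the machine is running OS X or Classic.

```java
import java.util.Locale;

class PlatformGuess {
    enum Platform { MAC_OS_X, MAC_UNSPECIFIED, WINDOWS, LINUX, OTHER }

    // Heuristic only: the tokens below are assumptions, not an exhaustive list.
    static Platform classify(String userAgent) {
        String ua = userAgent.toLowerCase(Locale.ROOT);
        if (ua.contains("mac os x")) {
            return Platform.MAC_OS_X;            // browser says so explicitly
        }
        if (ua.contains("macintosh") || ua.contains("mac_powerpc") || ua.contains("mac_ppc")) {
            return Platform.MAC_UNSPECIFIED;     // could be Classic or OS X
        }
        if (ua.contains("windows") || ua.contains("win9") || ua.contains("winnt")) {
            return Platform.WINDOWS;
        }
        if (ua.contains("linux")) {
            return Platform.LINUX;
        }
        return Platform.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(classify(
            "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85 (KHTML, like Gecko) Safari/85"));
        System.out.println(classify("Mozilla/4.0 (compatible; MSIE 5.23; Mac_PowerPC)"));
    }
}
```

Anything that lands in `MAC_UNSPECIFIED` would have to be reported as plain "Mac", which is why the charts here don't attempt the split.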
The browser picture on the Mac is somewhat different.
Netscape 4 was still close to parity with Internet Explorer on the Mac two years ago, but already declining rapidly. But on the Mac, Internet Explorer market share has levelled off and a lot of Netscape 4's traffic share is being taken by Mozilla and, more recently, Safari. Let's take a closer look at the bottom right hand corner of that last chart - what's been happening with non-IE browsers on the Mac since the beginning of last year.
The most striking feature is Safari's rapid rise to the number 2 position with 15% share. Safari seems to have killed off OmniWeb. Chimera seems to have been completely stillborn - which is a pity, I liked Chimera and used it as my standard Mac browser for a while before I switched to Safari. That is why I chose to count it separately from other Mozilla variants - those Chimera hits are probably all me. It will be interesting to watch what happens with Safari over the next few months, particularly once it starts shipping as the standard browser on new Macs.
notes, definitions & assumptions
I'm a novice at do-it-yourself http log analysis. I used to use analog for basic statistics, but I got frustrated with it because it can't produce any cross-category reports like "browser share by platform", so I decided to write my own. I have a little Java programme that parses my server log files and writes them into a MySQL database where I can do ad hoc reports and analysis of whatever I happen to be interested in. Using a database would probably be unacceptably slow if I was running a millions-of-hits-an-hour site, but as we've already established, I'm not, and I like the flexibility it gives me.
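This isn't the actual programme, just a minimal sketch of what the parsing step might look like. It assumes the server writes the common "combined" log format (host, timestamp, request, status, bytes, referrer, user agent), which may not match every server configuration. The names `LogLineParser` and `Hit` are made up for illustration, and the step that writes each parsed hit to the database is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogLineParser {
    // Assumed layout: host ident user [time] "METHOD path protocol" status bytes "referrer" "user-agent"
    private static final Pattern COMBINED = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"(\\S+) (\\S+)[^\"]*\" (\\d{3}) (\\S+) \"([^\"]*)\" \"([^\"]*)\"$");

    static final class Hit {
        final String host, timestamp, method, path, referrer, userAgent;
        final int status;

        Hit(String host, String timestamp, String method, String path,
            int status, String referrer, String userAgent) {
            this.host = host;
            this.timestamp = timestamp;
            this.method = method;
            this.path = path;
            this.status = status;
            this.referrer = referrer;
            this.userAgent = userAgent;
        }
    }

    // Returns null for lines that don't fit the assumed format, so the caller can skip or log them.
    static Hit parse(String line) {
        Matcher m = COMBINED.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new Hit(m.group(1), m.group(2), m.group(3), m.group(4),
                       Integer.parseInt(m.group(5)), m.group(7), m.group(8));
    }

    public static void main(String[] args) {
        String sample = "192.0.2.1 - - [24/Jul/2003:10:15:32 +0200] "
            + "\"GET /weblog/index.html HTTP/1.1\" 200 5120 \"-\" "
            + "\"Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85 (KHTML, like Gecko) Safari/85\"";
        Hit hit = parse(sample);
        System.out.println(hit.path + " " + hit.status + " " + hit.userAgent);
    }
}
```

Once each hit is a row with separate columns for path, status and user agent, a cross-category report like "browser share by platform" becomes a grouping query rather than something the log analyser has to support.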
Here are the assumptions and definitions that the figures above are based on (some of which may be wrong):
- anything I don't otherwise recognise that reads /robots.txt is a robot
- ... and so is anything that appears in this list of robot user agent strings from pgts.com.au
- anything that isn't a robot is a browser
- only hits on html pages count. (This is actually a conservative assumption - sometimes when I'm lazy or in a hurry I link embedded thumbnail images directly to larger jpegs without wrapping them in html pages, so there could also be quite a few jpeg file views that a human being has explicitly clicked to)
- anything with an http result status in the 200s, or 304, is successful
- "Mozilla" includes Netscape 6 & 7
- "Netscape 4" includes all older Netscape versions
- I would be interested in looking at Mac Classic versus OS X, but I don't know how to reliably detect OS X using user agent strings. Some browsers only exist for OS X, and/or explicitly say "OS X" in their user agent strings, but I don't think all OS X browsers do.
- charts produced with the excellent JFreeChart (which doesn't seem to have a "powered by JFreeChart" promotional logo, otherwise I would use it)
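The counting rules in the list above can be sketched roughly as follows. This is an illustration under some assumptions of its own, not the real code: robots are identified by exact user agent string (the real analysis might key on host or match substrings instead), a path counts as an html page if it ends in `.html`, `.htm` or `/`, and the names `HitClassifier` and `Hit` are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class HitClassifier {
    static final class Hit {
        final String path;
        final int status;
        final String userAgent;

        Hit(String path, int status, String userAgent) {
            this.path = path;
            this.status = status;
            this.userAgent = userAgent;
        }
    }

    // Anything with a status in the 200s, or 304, is successful.
    static boolean isSuccessful(int status) {
        return (status >= 200 && status < 300) || status == 304;
    }

    // Assumption: directory requests ending in "/" are served an html index page.
    static boolean isHtmlPage(String path) {
        String p = path.toLowerCase(Locale.ROOT);
        int query = p.indexOf('?');
        if (query >= 0) {
            p = p.substring(0, query);
        }
        return p.endsWith(".html") || p.endsWith(".htm") || p.endsWith("/");
    }

    // Robots are the known list plus any user agent that ever read /robots.txt.
    static Set<String> findRobots(List<Hit> hits, Set<String> knownRobotAgents) {
        Set<String> robots = new HashSet<>(knownRobotAgents);
        for (Hit h : hits) {
            if (h.path.equals("/robots.txt")) {
                robots.add(h.userAgent);
            }
        }
        return robots;
    }

    // Successful html page views by anything that isn't a robot.
    static long countHumanPageViews(List<Hit> hits, Set<String> knownRobotAgents) {
        Set<String> robots = findRobots(hits, knownRobotAgents);
        long count = 0;
        for (Hit h : hits) {
            if (!robots.contains(h.userAgent) && isHtmlPage(h.path) && isSuccessful(h.status)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
            new Hit("/index.html", 200, "Safari"),        // counts
            new Hit("/robots.txt", 200, "SomeCrawler"),   // marks SomeCrawler as a robot
            new Hit("/index.html", 200, "SomeCrawler"),   // robot, excluded
            new Hit("/photos/a.jpg", 200, "Safari"),      // image, excluded
            new Hit("/yoga.html", 304, "MSIE"),           // counts
            new Hit("/missing.html", 404, "MSIE"),        // error, excluded
            new Hit("/index.html", 200, "ListedBot"));    // on the known list, excluded
        Set<String> known = new HashSet<>(Arrays.asList("ListedBot"));
        System.out.println(countHumanPageViews(hits, known)); // prints 2
    }
}
```

Note that the robot check needs a first pass over all the hits, because a crawler may request pages before it gets around to reading /robots.txt.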
all text and images © 2003–2008