Bell Curve Pagination - For The Google Juice!
Something that was on our backlog of work at Global Radio for a loooooong time (we're talking two years plus here) was 'bell-curve' pagination (or maybe 'concertina' pagination - different people called it different things).
What follows is (mostly) a description of the problem, followed by a (sketchy) description of the solution and, finally, some proof that this stuff actually works.
But if you want to see the demo without ploughing through all that, feel free!
Bell Curve Pagination Demo
And now, the blurb....
The Problem
The main area we wanted to target was heart.co.uk's article archive. Previous pagination looked something like this:
Previous 1 2 3 4 5 6 7 8 9 Next
Which is all fine and dandy, but let's say there are 40 pages in total. To get from page 1 to page 20, the shortest click path for a user (or, more crucially, a search engine spider) would be:
1) Click on '9' which gives us:
Previous 5 6 7 8 9 10 11 12 13 Next
2) Click on '13', which gives us:
Previous 9 10 11 12 13 14 15 16 17 Next
3) Click on '17' which gives us:
Previous 13 14 15 16 17 18 19 20 21 Next
And, finally, click on '20'. Four clicks to get from page 1 to page 20. Now imagine heading for page 40 instead:
Previous 17 18 19 20 21 22 23 24 25 Next
Previous 21 22 23 24 25 26 27 28 29 Next
Previous 25 26 27 28 29 30 31 32 33 Next
Previous 29 30 31 32 33 34 35 36 37 Next
Previous 33 34 35 36 37 38 39 40 41 Next
That's another 5 clicks away - 9 clicks in total from the first page of the archive. As time passes the older pages get further and further away.
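To make that click counting concrete, here's a rough Python sketch of the old nine-link sliding window. The function names (`window`, `clicks_to_reach`) are made up for illustration. It assumes the visitor only moves forwards and always clicks the visible link that gets them furthest towards the target.

```python
def window(current, width=9):
    """Page numbers shown by the old pagination: `width` links centred on
    `current`. Like the listings above, it does not clamp at the last page."""
    start = max(1, current - width // 2)
    return list(range(start, start + width))


def clicks_to_reach(target, start=1, width=9):
    """Count the clicks needed to get from `start` forwards to `target`,
    always clicking the furthest visible link until the target is on screen."""
    if target < start:
        raise ValueError("this sketch only walks forwards")
    current, clicks = start, 0
    while current != target:
        visible = window(current, width)
        # Click the target if it is visible, otherwise the right-most link.
        current = target if target in visible else visible[-1]
        clicks += 1
    return clicks


print(clicks_to_reach(20))  # 4, matching the walk-through above
print(clicks_to_reach(40))  # 9
```

The click count grows linearly with the depth of the target page, which is exactly the problem.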
Given that this article archive exists, in the main, to allow Google to easily spider the site's articles (the juiciest content), this is bad. This is also bad from a user point of view. If a user has a vague idea of how deep into the pagination they want to go (which is possible given that this set of results is ordered by date), it would be a frustratingly long sequence of clicks to get to the specific page. To show what effect this might have in a Google context: the heart.co.uk article archive stood at 50 pages but, using Google's 'cache preview', we knew that the search engine only kept a cached version of the first 15 or so pages - it seemed that, after 3 clicks, the Google spider got bored and stopped exploring the article index.
The Solution
One solution might be to do what ukdata.com does for their company listings pages (screengrab)... but that, for obvious reasons, isn't ideal ;)
Our proposed solution was an attempt to let the user (and Google) get to any page within three clicks, via an interface that was concise and easy to understand.
The list of requirements looked something like this:
- pagination should all be on a single line (with font sizes set to the browser default)
- the first and last page should always be linked
- all pages should be accessible from any other page, within three clicks
- this should be a solution that scales effectively
Clearly, these requirements contradict each other - having pagination 'that fits all on a single line' and having all pages accessible within three clicks isn't going to be possible when the pages exceed a certain number and/or the amount of space for 'one line' becomes severely limiting.
But we had some parameters that we considered to be common cases - the number of pages was unlikely to exceed 100 and the width was likely to be around 500px, with a typical font size of 12px.
Given these parameters, we were aiming to have every page in the range 1 to 100 accessible within three clicks, using 13 pagination links.
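One way to check a candidate scheme against that three-click target is a breadth-first search over the links each page exposes. This is only a sketch: `max_clicks` and `old_links` are hypothetical names, and `links_for` stands in for whatever function produces the page numbers linked from a given page.

```python
from collections import deque


def max_clicks(total_pages, links_for):
    """Worst-case number of clicks between any two pages, found by running a
    breadth-first search from every page. Returns None if some page cannot
    be reached at all."""
    worst = 0
    for start in range(1, total_pages + 1):
        depth = {start: 0}
        queue = deque([start])
        while queue:
            page = queue.popleft()
            for nxt in links_for(page):
                # Ignore links that point outside the archive.
                if 1 <= nxt <= total_pages and nxt not in depth:
                    depth[nxt] = depth[page] + 1
                    queue.append(nxt)
        if len(depth) < total_pages:
            return None
        worst = max(worst, max(depth.values()))
    return worst


def old_links(page, width=9):
    """The old sliding window of nine links (an assumption based on the
    listings earlier in this post)."""
    start = max(1, page - width // 2)
    return range(start, start + width)


if __name__ == "__main__":
    # Far more than three for the old scheme on a 100-page archive.
    print(max_clicks(100, old_links))
```

Any replacement scheme can be passed in as `links_for` and compared against the target of three.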
Python AND Javascript (a side note, really)
The project I finally managed to crowbar this pagination stuff into was a Gallery project. The greater value of it will be seen elsewhere - but that's the project that we launched it on. This proved something of a problem in that the gallery user experience was enhanced with javascript so that new images were loaded into an existing document rather than re-loading a new HTML document. Which means we'd ideally have the pagination update via javascript too. With the greatest value in this pagination lying in the google juice we hoped it would generate, this meant having the pagination render on each page without javascript. Which meant writing this pagination code in both JS and Python.
I figured it would be more tricky to write in JS, so I wrote that code first and ported it as simply as I could to Python so that if we needed to fix bugs or upgrade the code, it would be similar code we'd be changing. It *could* be way tidier Python code, but that would mean the code as a whole would be less maintainable. What I've ended up with is this weird javascript-onic Python, which I'm not proud of, but it's the best way to do it. I could have reloaded the in-page pagination by making an AJAX call to the server, generating the links server side and returning an array which the client-side JS could use to render the updated pagination. The down side would be an extra HTTP request - given that users click quickly through a gallery it seemed to be a big WIN for this pagination code to be available in JS.
Generating The Pagination Numbers
The existing pagination code we were using was the Digg Style Pagination by X - I borrowed the arguments used in that code to generate the pagination, these being:
- Total number of pages
- Current Page
- the 'padding' either side of the current page
- Total number of pagination links
So, to generate pagination, a function call looks like so:
pagination.paginate(200, 50, 4, 11)
which returns a list like so:
[1, '...', 46, 47, 48, 49, 50, 51, 52, 53, 54, '...', 200]
This can then be used to generate a list of pagination links.
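As a rough illustration of that last step, here's a minimal sketch that turns such a list into HTML. The function name `render_links` and the `gap` class are invented for this example. The `archive-page` query parameter is borrowed from the heart.co.uk archive URL, and other sites would need their own template.

```python
from html import escape


def render_links(items, current, url_template="?archive-page={page}"):
    """Render a list like [1, '...', 46, 47, ..., 200] as a line of links.
    The current page is emphasised rather than linked."""
    parts = []
    for item in items:
        if item == "...":
            parts.append('<span class="gap">...</span>')
        elif item == current:
            parts.append(f"<strong>{item}</strong>")
        else:
            href = escape(url_template.format(page=item), quote=True)
            parts.append(f'<a href="{href}">{item}</a>')
    return " ".join(parts)


print(render_links([1, "...", 49, 50, 51, "...", 200], current=50))
```

Because the same list can be produced in JS and in Python, the rendering can live on whichever side is handy.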
The Code Approach, Step-By-Step
We generate the numbers with the following steps:
- generate the single increments to the left and right
- allocate a number of links to the left and right sides, based on how large these ranges are and how many links we have left
- generate the left and right ranges as separate arrays, ensuring they 'bell-curve' away from the start point
- rationalise the numbers, rounding them to human-palatable increments (e.g. rounding to the nearest 5 or 10, if that's suitable)
- stitch the three arrays together
- add ellipses where the increments stop being between single numbers
et voila!
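The steps above can be sketched in Python roughly as follows. This is not the code we shipped, just a minimal illustration under stated assumptions: the argument order follows the example call (total, current, padding, number of links), the 'bell curve' is approximated with quadratic spacing, numbers are rounded to 5s or 10s only when the local gap is at least twice that, and an ellipsis marks each edge of the run of single steps. The helper names `_tidy` and `_side` are made up.

```python
def _tidy(page, gap):
    """Step 4: snap to a multiple of 10 or 5 when the local gap is wide enough."""
    for unit in (10, 5):
        if gap >= 2 * unit:
            return (page + unit // 2) // unit * unit
    return page


def _side(near, far, count):
    """Steps 3 and 4: up to `count` pages strictly between `near` and `far`,
    spaced quadratically so the gaps widen with distance from `near`."""
    span = abs(far - near)
    sign = 1 if far > near else -1
    picked = set()
    for i in range(1, count + 1):
        offset = span * i * i // (count + 1) ** 2
        previous = span * (i - 1) ** 2 // (count + 1) ** 2
        page = _tidy(near + sign * offset, offset - previous)
        if min(near, far) < page < max(near, far):
            picked.add(page)
    return picked


def paginate(total, current, padding, max_links):
    # Step 1: single increments either side of the current page.
    lo, hi = max(1, current - padding), min(total, current + padding)
    core = set(range(lo, hi + 1))
    ends = {1, total}  # first and last pages are always linked

    # Step 2: share the remaining links by the size of each side's range.
    left_room = max(0, lo - 2)           # pages strictly between 1 and lo
    right_room = max(0, total - hi - 1)  # pages strictly between hi and total
    room = left_room + right_room
    spare = max(0, min(max_links - len(core | ends), room))
    left_n = min(left_room, round(spare * left_room / room)) if room else 0
    right_n = min(right_room, spare - left_n)

    # Step 5: stitch everything together (rounding may merge duplicates,
    # so the result can hold fewer than max_links numbers).
    pages = sorted(core | ends | _side(lo, 1, left_n) | _side(hi, total, right_n))

    # Step 6: an ellipsis where the run of single steps starts and stops.
    out = []
    for i, page in enumerate(pages):
        if i and page - pages[i - 1] > 1 and (page == lo or pages[i - 1] == hi):
            out.append("...")
        out.append(page)
    return out


print(paginate(200, 50, 4, 11))
# [1, '...', 46, 47, 48, 49, 50, 51, 52, 53, 54, '...', 200]
print(paginate(100, 50, 2, 13))
```

With a padding of 4 and 11 links there is nothing left over for the curve, so the first call reproduces the plain Digg-style list shown earlier. Dropping the padding to 2 frees up links for the widening steps on each side.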
The Benefit
The one provable advantage we've gained from applying this pagination is a higher rate of Google indexing. Previously, Google would show us its cached version of only the first 15 pages of the article archive on heart.co.uk.
Googling for a string such as "cache:http://www.heart.co.uk/help/article-archive/?archive-page=16" would return a "Sorry, Google doesn't know about this URL" message for pages beyond page 15.
A little over a week later Google had cached every single page in the archive :) We'd gone from 20% coverage to 100% coverage. The direct result of this is that all the articles on the site are known to Google - giving us a longer tail on all our most valuable content. The content is now more findable and we have increased traffic via Google as a result.