
Gareth53.co.uk - the online home of Gareth Senior

AutoCasing Javascript - Correcting Bad Data

4:01p.m., Sat 16 May 2009

I blogged some time ago about a script that would auto-correct the casing of artist names.

The impulse to do this was bad data being supplied by a 3rd party system I had no control over - the playout system used by Xfm.

Xfm staff were twitchy about auto-correcting the case, worried that it could be more harmful than helpful.

Since then I've written a JavaScript script to auto-correct case. I'm not going to go into great detail about what it does, since that would just be going over old ground.

On a technical level though, here's what it does to a submitted string:

1 - uses a regular expression to strip all characters that aren't alpha-numeric and then compares this string against a list of known exceptions to the autocasing

2 - if it finds a match by looping through the array of known exceptions it returns that

3 - if not it auto-cases using two regular expressions.
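
The three steps above could be sketched roughly like this. It's a minimal illustration rather than the code from the demo page: the exception list, the function names (`normalise`, `autoCase`) and both casing regular expressions are assumptions made up for the example.

```javascript
// Hypothetical exception list; the real script's list is not reproduced here.
var EXCEPTIONS = ['AC/DC', 'R.E.M.', 'MGMT', 'will.i.am'];

// Step 1: strip everything that isn't a letter or digit, then lowercase,
// so that punctuation and casing don't affect the comparison.
function normalise(str) {
  return str.replace(/[^a-z0-9]/gi, '').toLowerCase();
}

function autoCase(input) {
  var key = normalise(input);

  // Step 2: loop through the known exceptions and return the first match.
  for (var i = 0; i < EXCEPTIONS.length; i++) {
    if (normalise(EXCEPTIONS[i]) === key) {
      return EXCEPTIONS[i];
    }
  }

  // Step 3a (assumed regex): capitalise the first letter of each word.
  var cased = input.toLowerCase().replace(
    /(^|[\s\-\(\/])([a-z])/g,
    function (match, separator, letter) {
      return separator + letter.toUpperCase();
    }
  );

  // Step 3b (assumed regex): lowercase small joining words that sit
  // between two other words.
  return cased.replace(/\s(And|Of|The|In|A)(?=\s)/g, function (match) {
    return match.toLowerCase();
  });
}
```

With these assumptions, `autoCase('ac dc')` hits the exception list and comes back as `AC/DC`, while `autoCase('BELLE AND SEBASTIAN')` falls through to the regular expressions and comes back as `Belle and Sebastian`.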

The demo is here. The set of test cases I used during development is also available. The code is available to download from the demo page.

What finally nudged me to blog about it was a development over on MusicBrainz: their GuessCase script. There's much more detailed and rigorous casing here, backed by in-depth debate about what should be uppercase and what shouldn't. Go read about it for yourself and I'll stop writing about it.

Still reading? OK. I was just going to mention the testing that I had in place. It takes 211 examples of incorrectly cased artist names and runs them through the script. Those 211 include all the exceptions, plus an example that tests each part of the regular expressions.
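
A test runner along those lines might look something like the sketch below. Everything named here is hypothetical: the three sample cases stand in for the real 211, the `autoCase` function is a simplified stand-in for the real script, and `runSuite` and `averageSeconds` are names invented for the example.

```javascript
// Simplified stand-in for the script under test; swap in the real function.
function autoCase(str) {
  return str.toLowerCase().replace(/(^|\s)([a-z])/g, function (match, sep, ch) {
    return sep + ch.toUpperCase();
  });
}

// Illustrative cases only; the real suite has 211 entries.
var TEST_CASES = [
  { input: 'THE STONE ROSES', expected: 'The Stone Roses' },
  { input: 'radiohead', expected: 'Radiohead' },
  { input: 'bLUR', expected: 'Blur' }
];

// Run every case through fn, collect the failures and time the whole pass.
function runSuite(fn, cases) {
  var failures = [];
  var start = new Date().getTime();
  for (var i = 0; i < cases.length; i++) {
    var actual = fn(cases[i].input);
    if (actual !== cases[i].expected) {
      failures.push({
        input: cases[i].input,
        expected: cases[i].expected,
        actual: actual
      });
    }
  }
  var seconds = (new Date().getTime() - start) / 1000;
  return { total: cases.length, failures: failures, seconds: seconds };
}

// Average the elapsed time over several passes to smooth out noise.
function averageSeconds(fn, cases, runs) {
  var sum = 0;
  for (var i = 0; i < runs; i++) {
    sum += runSuite(fn, cases).seconds;
  }
  return sum / runs;
}
```

Calling `runSuite(autoCase, TEST_CASES)` returns the number of cases, a list of any mismatches and the elapsed time in seconds. `averageSeconds(autoCase, TEST_CASES, 5)` repeats the pass five times and returns the mean.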

I'm also capturing how long the test takes. Here are the response times for A-grade browsers (and Chrome). Each time is an average of five 'scores'.

OS            Browser                  Time taken (secs)
OSX           Safari 4 Public Beta     0.084
OSX           Firefox 3.0.8            0.149
Windows XP    Firefox 3.0.8            0.195
Windows XP    Google Chrome 1.0        0.215
Windows XP    Opera 9.52               0.225
Windows XP    Internet Explorer 8      0.315
OSX           Opera 9.52               0.462
Windows XP    Internet Explorer 7.0    0.579
Windows XP    Internet Explorer 6      0.692

These figures should only be used for comparison, of course. They broadly tally with other regular expression benchmarks that you can find with a quick Google search:

http://www.codinghorror.com/blog/archives/001023.html
