OED, again

A little more on the OED. The idea of creating a publicly-accessible version has obviously been floating around for a few years. As well it might: not only would an open OED be fantastically useful, but there’s a certain justice in bringing it back to the community. As Kragen Sitaker writes, the original OED

is one of the earliest instances of what are now called “pro-am” or “commons-based peer production” projects. From 1857 to 1928, thousands of readers collected examples of uses of words their dictionaries didn’t define; they mailed these examples on slips of paper to a small number of editors, who undertook to collate them into a dictionary.

Kragen’s attempt to liberate the OED was the most effective: not only did he get one set of the OED scanned, he also cooked up some code making it possible to look up individual words. Alas, his system is now offline – such is the fate of one-man projects. Rufus Pollock’s attempt to revive it, within the framework of the Open Knowledge Foundation, seems not to have got anywhere.

More ambitious are the Distributed Proofreaders, a group who take OCR’ed books, edit and correct them by hand, and pass them on to Project Gutenberg. They’ve been contemplating the idea of tackling the OED for some time now. But it’s a pretty daunting project – both in scale, and in the complexity of the typography – and every attempt seems to peter out.

Which is all a bit of a disappointment. I’m not quite foolhardy enough to launch myself into digitising the OED just yet, but there must be at least some prospect of making those scans slightly more user-friendly.
