The Oxford English Dictionary: focus areas and goals for 2021
In the run-up to the 2028 centenary of the completion of OED’s First Edition, the OED team is undertaking a series of projects to update the content more dynamically than in the past. Traditional entry-by-entry revision sits at the core of our programme, but we also aim to implement improvements across the text, in order to bring the benefits of our scholarly research programme to the widest range of our readers as rapidly as possible. A key part of this programme is to explain and present our plans and our results to our readers with greater transparency and frequency, and that process starts here.
A look back at what we’ve done so far
First, a quick summary of work to date, and a preview of our main editorial progress targets for the year ahead. We’ve now rewritten almost half the entries in OED2 (48.8%). If you count in terms of senses, slightly more than half the current senses (51%) are revised. But we continue to add new words and senses as we go, so the overall sense total is a moving target.
That high figure for new senses merits a passing comment. It reflects not just newly added content, but existing senses which may (when updated) have been split into two or three subsenses, or compounds now separately treated which were previously treated collectively. It’s also important to note that what we call new in OED isn’t always very new at all. We’re updating the whole historical record. OED’s latest quarterly update, in December 2020, contained a new entry for a familiar sense of follow (meaning to pursue and observe surreptitiously) that dates back to the eleventh century. By contrast, this year saw what must be the fastest ever progression from coinage to inclusion in the OED: Covid-19 was first recorded in February, and by early April OED had published an entry as part of the first of two special updates devoted to terminology associated with the pandemic.
Looking forward to 2021 (and beyond)
We set out in our summary above some key targets for our planned editorial progress in the year ahead. We want to explain a little more about the work itself: both the traditional entry-by-entry updates, and some new projects to make targeted improvements across the text (in both revised and unrevised entries).
Entry-by-entry revision: prioritizing function words and foundation words
We continue to prioritize revising some of the more complex material in the dictionary. The table below shows a list of the top 100 most frequently used words in English: the 79 words marked in bold are ones we’ve edited or are editing now. (The eagle-eyed will spot that not all the listed words here correlate to separate dictionary entries: a and an, for example, are one entry in OED.)
Many of these words can be described broadly as function words: the glue of the English language. They are perhaps not the kind of words that get looked up in synchronic or desk dictionaries, but OED’s entries are repositories of scholarly information about the cornerstones of English grammar and usage, information that simply isn’t available elsewhere in the same form. Many of these words date back to Old English, and the other major scholarly dictionary project that would cover them – the Dictionary of Old English – hasn’t yet reached the later parts of the alphabet. In prioritizing this work, as with so many other parts of OED’s work, we’re therefore often undertaking new primary research on the history of English, as well as inheriting others’ scholarship.
In another stream of work, we are updating OED’s coverage of words with the greatest longevity and frequency, and which exhibit the greatest historical, semantic, and cultural complexity. We are calling these foundation words, and some examples are shown in the word cloud below. Typically, these are familiar words with sustained and prolific use in multiple subject domains. They often show a similar elasticity and adaptiveness in their relationships with other words, by forming many compounds and phrases.
Due to their grammatical or semantic complexity, revising function words and foundation words inevitably slows the rate at which we work through individual senses and entries – put simply, harder work takes longer – but these words, with their rich and complex histories, often demonstrate most clearly the unique contribution that the research and revision of the OED makes to scholarship.
Work across the dictionary
Since the launch of OED Online in 2000, OED’s editorial progress has been quantified almost exclusively by counting senses and entries which are new or fully updated in the Third Edition; this was conceptually how the project of the Third Edition was defined. The sense metric reflects the work that has been and will remain the core activity of the OED team, but it doesn’t fully reflect the diversity of work we now do to enhance OED’s content across the whole text.
In the past ten years, we have undertaken an increasingly diverse range of tasks designed to enhance the text and the data across the entire dictionary. Most of this work can’t be quantified in terms of sense-by-sense progress, but the scale of the work is significant, and greatly improves the usefulness and usability of the OED.
From the outset, work on OED’s bibliographical citations and illustrative quotations has involved cross-text validation and consolidation. Stylistic conventions varied over time (and sometimes from editor to editor), so the 2 million-plus quotations added over decades, before we had the means to interrogate the data electronically, inevitably showed considerable variation, most notably in how the titles of quoted works were abbreviated. But for the dictionary in its digital form to offer maximum utility to users, we needed to audit, link, and homogenize our bibliographical records and citation styles, being scrupulously careful not to misaggregate. This is such a significant long-term undertaking that a short report here can’t do justice to the breadth and complexity of the work, so we plan to report separately on this topic in a future update.
Another cross-dictionary project we’ve been carrying on over the past several years has been a complete overhaul of our pronunciation transcriptions across the whole of the OED. At the same time, we have been recording our own high-quality audio files to sit alongside the transcriptions, rather than using any of the various kinds of synthesized speech, so that users can hear the words pronounced by native speakers. 89% of OED entries not marked as obsolete now have current pronunciations and audio files. We’ve also done something no other standard dictionary has, by providing regional pronunciations – phonetic transcriptions and audio files – for words labelled as specific to a region.
Most of our etymologists’ effort is focussed on providing etymologies and variant forms sections for existing entries as they are revised, and for newly drafted entries. Alongside this, we have recently added two new streams of cross-dictionary work. First, we are adding etymologies to unrevised OED entries where previously none was offered; perhaps surprisingly, this is the case for about 15,000 unrevised first and second edition entries. Second, we are updating the etymologies for a prioritized list of unrevised entries with long histories, such as numerals. Undertaking this work steadily alongside the main revision and publishing programme, we think that over the next 8 years we can fill all the gaps. By 2028, when we reach that centenary milestone, all OED entries will have a full etymology, variant forms list, pronunciation transcription, and audio.
Historical Thesaurus of the OED
The Historical Thesaurus is a remarkable work created over several decades at Glasgow University. We continue to collaborate with the team in Glasgow on their Second Edition. As a semantic network of English past and present, the Thesaurus has significant potential to support a wide range of linguistic and historical research, both as a tool in its own right and (we believe) as data underpinning other research tools. But its usefulness has been inhibited by two factors: first, it is not yet fully integrated in OED Online, and second, we have not yet integrated into the Thesaurus many of the entries and senses newly added in the Third Edition. The consequence is that lists of synonyms ordered chronologically sometimes end in the first half of the 20th century. In an ongoing project, OED editors have linked some 45,000 newly added OED senses to Thesaurus categories, and (where necessary) added new categories for things that hadn’t been conceived of when the project began – like extreme sports or the internet.
World English projects
The OED has always included words from across the English-speaking world. What’s changed – drastically – since the first edition is the size and breadth of the English-speaking population. The British Council estimates 1.75 billion people worldwide can speak English to what it calls ‘some useful degree’.
For the past few years we have been undertaking a series of projects to improve our coverage of words and senses from those parts of the world in which English is used most prolifically and distinctively. In each of these projects we have formed partnerships with external experts from or in the region. Increasingly, terms arising in these varieties will spread internationally. One example from the earliest projects here: in 2016, OED added an entry for wet market, with examples dating back to the late 1970s, from the Philippines, Malaysia, Hong Kong, and Singapore. At that time the term wasn’t much known or used outside of southeast Asia, but in 2020 the whole world has learnt about wet markets.
These world English projects are not massive in scale – in general we add and update between 50 and 200 new words or senses – but they punch above their weight. We can measure their impact not just in the extensive regional media coverage, but also in the regional peaks we regularly see in lookups on oed.com for these words and phrases.
Continuous improvement and general maintenance
The projects mentioned above are just some of the areas in which we’re making significant improvements to entries, and completing elements of OED content, well in advance of full revision of the dictionary. But what about maintenance of the material we’ve already revised? Over the years we’ve made noteworthy changes to around 11% of Third Edition entries since they were revised or drafted: that’s 16,000 of 149,000 revised entries where we have made significant further change. This is an aspect of the work we expect to see increase over time.
More broadly, in the last year alone we made 115,840 changes of some kind, large or small, to entries across the database. Most of those changes will be small things – improving bibliographical consistency and linking, maintaining cross-references, correcting typos (they do happen) – but they also include refinements to definitions, adding further quotation evidence, or updating encyclopaedic information in light of world events. We maintain careful records of this work, and material changes to content will be flagged for users of OED Online. The sheer extent of change demonstrates how dynamically the OED is being maintained and improved.
A changing approach
Why have we diversified our approach to revising and updating the OED to include cross-text improvements and lexicographical projects, as well as entry-by-entry editing? Fundamentally, we believe it makes the dictionary more useful sooner. It also allows us to engage more dynamically with current language change in English across the globe, to incorporate the discoveries of new scholarship, and to adapt the OED to meet the evolving needs of researchers worldwide. By updating key aspects of unrevised entries, and revisiting entries we’ve already revised, we ensure that the OED, once updated, stays up to date as the definitive record of the English language.