OED100: The OED in 2022
2028 will mark the centenary of completion of OED’s First Edition. In the run-up to that milestone, the OED team is undertaking a series of projects (collectively named OED100) designed to extend OED’s usefulness, value, and reach, and which reflect our commitment to OED’s evolution and continued development. The scope of this work is wide: updating the dictionary content more dynamically than in the past, building a new platform for the oed.com site, and establishing productive partnerships with OED’s users and stakeholders. In this annual update, I’ll set out some of the key elements in OED’s diverse programme of work.
Updating the OED: Progress to Date
The core objective of OED100, of course, is comprehensively updating the OED’s previous editions with 21st-century scholarship, adding new words and senses, and filling historical gaps. Here’s a high-level summary of how things stand right now.
With the release of our March 2022 update, just over half (54%) of the entries are new or fully revised. If you count in terms of senses (as dictionary editors tend to), 55% of senses currently in OED are either new or updated. I emphasize currently because as well as updating OED’s existing content, we continually add new words and senses as English usage evolves and expands globally. Language is a moving target.
For everyone, the past two years have presented numerous unexpected challenges. For OED, some of these were logistical, particularly around lack of access to libraries and to our own in-house research materials. Nonetheless we set ourselves challenging, rising progress targets for this and coming years.
The OED team has worked diligently and often ingeniously throughout the pandemic, and this has been a productive year. With our March update the project will hit and slightly exceed that ambitious annual target of 14,000 senses (35% more than in 2020/21). We have also been recruiting prolifically: two researchers, a science editor, and four new general/new words editors. There will be a lot of new talent coming onstream over the next year, and this represents a major investment in training the next generation of historical lexicographers.
Traditional entry-by-entry revision sits at the core of our editorial programme, but we are also implementing targeted improvements across the text, to bring OED’s scholarly research to our readers as rapidly as possible. This is a natural continuation of the principle we’ve followed since we first launched oed.com in 2000: to make new work-in-progress available quickly and regularly, rather than publish only when the full project is complete.
In the last year we started to prioritize editorial work in three distinct streams. This graphic highlights a selection of the words in each category that OED editors have updated over the past year.
Our aim is to prioritize work which brings the greatest benefit to readers. Foundation Words (blue in the graphic) are those with the greatest longevity and frequency, or historical, semantic, and cultural complexity. Typically, these are familiar words with sustained and prolific use in multiple subject domains. They often show an elasticity and adaptiveness in their relationships with other words, prolifically forming compounds and phrases.
In the function words category (shown in green), we continue to prioritize some of the most complex linguistic material in the dictionary. The table below shows a list of the top 100 most frequently used words in English: the words marked in bold are ones already updated or currently in progress. (Note: not all the listed words here form separate entries: a and an, for example, are one entry in OED.)
Many of these words are perhaps not the kind that get looked up in most dictionaries, as their meaning and usage are generally understood. But OED’s entries are repositories of scholarly information about these cornerstones of English grammar and usage: information that simply isn’t available elsewhere in the same form. In prioritizing this work, as well as inheriting others’ scholarship, we’re often undertaking new primary research on the history of English.
Due to their grammatical or semantic complexity, revising function words and foundation words can slow the rate at which we work through individual senses and entries. But these words, with their rich and complex histories, often demonstrate most clearly the unique contribution that the research and revision of the OED makes to scholarship.
Dynamic Updating Across the OED Content
Since the launch of OED Online in 2000, OED’s editorial progress has been quantified almost exclusively by counting senses and entries which are new or fully updated; this was conceptually how the Third Edition was defined. That sense metric reflects the work that has been and will remain the core activity of the OED team, but it doesn’t fully reflect the diversity of work we now do to enhance OED’s content across the whole text.
Everything ages, even history, which is why rewriting the OED involves not only adding new words and senses, but also making prolific changes to the historical record of English: in etymologies, pronunciations, spelling variants, quotation evidence, and definitions. Most of these changes are factual, but some are more tonal. All texts inevitably show traces of the cultural attitudes and linguistic habits which prevailed at the time of writing. With unrevised sections of the OED that can be a complex proposition, as the text has been written over a period of 150 years, and parts of the same entry may date from different eras of editorial work. Some of the older content can seem culturally dissonant, and we are working to address that. So alongside full-entry updates, we’re undertaking a range of priority updates and spot corrections to rectify known issues and deficiencies across the OED text. By updating key aspects of unrevised entries, and – for words exhibiting rapid ongoing change – revisiting them as needed, we ensure that the OED, once updated, stays up to date, avoiding the periodic cycles of editorial dilapidation and renovation that previously marked its publication history. Some of this work can’t be quantified in terms of sense-by-sense progress, but the scale of the work is significant, and greatly improves the usefulness and usability of the OED, so I’d like to highlight some key initiatives.
Special Projects: Ecowords
In 2020, OED published two special updates outside of the usual quarterly publication cycle, to cover the emerging and rapidly changing language associated with the Covid-19 pandemic. This year we conducted a similar but larger-scale project to review, update, and extend our coverage of language relating to environmental issues and climate change, publishing our update just ahead of the COP26 conference. This link will take you to the main blog post we published alongside the new content, which not only discusses the history, meaning, and usage of the words covered in the update, but also surfaces some of the corpus-based research analysis that supports our editorial decision-making.
Historical Thesaurus to the OED
This ongoing project is to link newly added OED content to Historical Thesaurus categories. As a semantic network of English past and present, the Thesaurus has the potential to support a wide range of linguistic and historical research, both as a tool in OED Online and as data underpinning other research tools.
The usefulness of the HTOED content was inhibited by its incompleteness: we had not integrated into the Thesaurus many of the entries and senses newly added since the Second Edition, with the result that lists of synonyms ordered chronologically sometimes end in the first half of the 20th century. To date, we have linked more than 55,000 further OED senses to Thesaurus categories, and where necessary added new categories for things that hadn’t been conceived of when the Thesaurus project began in the 1960s – like extreme sports or the internet. We’ve also been working to improve OED-to-HTOED linking for senses dating from Early Modern English, and for the highest-frequency compounds.
From the outset, work on OED’s bibliographical citations and illustrative quotations has involved cross-text validation and consolidation. OED was written over many decades and stylistic conventions varied over time (and sometimes from editor to editor). The 2 million-plus quotations added before we had the means to interrogate the data electronically inevitably showed considerable variation, most notably in how the titles of quoted works were abbreviated. But in its digital form, to offer maximum utility to users, OED needed to audit, link, and homogenize its bibliographical records and citation styles, being scrupulously careful not to misaggregate. This is such a significant long-term undertaking that a short report here can’t do justice to the breadth and complexity of the work, but its benefits filter continuously through to OED’s users.
Another cross-dictionary project we’ve been carrying on over the past several years has been a complete overhaul of our pronunciation transcriptions across the whole of the OED. At the same time, we have been recording our own high-quality audio files to sit alongside the transcriptions, rather than using any of the various kinds of synthesized speech, so that users can hear the words pronounced by native speakers. Over 80% of OED entries not marked as obsolete now have current pronunciations and audio files, and there are over 500,000 distinct pronunciations currently in oed.com. We’ve also done something no other standard dictionary has, by providing regional pronunciations – both phonetic transcriptions and audio files – for words labelled as specific to a region.
Most of our etymologists’ efforts are focused on providing etymologies and variant forms sections for entries updated or newly drafted for our quarterly updates. Since last year, they’ve also been working on a new task: to improve the quality of the unrevised entries by adding etymologies and forms lists (hence ETFL) where none previously existed, or by updating and improving them where they did. Perhaps surprisingly, about 15,000 unrevised entries lacked an etymology.
We set an annual target of 1,800 entries, which the team exceeded by an impressive 10%. The benefit of this stream of work is that we can incorporate new philological research into OED entries now, rather than having to wait for the full entry to be revised. Naturally, we’re carefully flagging entries to show where etymologies are more up-to-date than other components. If we continue this work steadily alongside the main revision and publishing programme, we should be able to fill all the gaps over the next 8 years. By 2028, when we reach that centenary milestone, all (non-obsolete) OED entries will have a full etymology, variant forms list, pronunciation transcription, and audio.
Over the past few years, OED has undertaken a series of projects to improve our coverage of words and senses from regions in which English is used most prolifically and distinctively. In each of these projects we have formed partnerships with external experts from or in the region. Increasingly, terms arising in these varieties will spread internationally. While the OED has always included words from across the English-speaking world, what’s changed – dramatically – over the last century is the size and breadth of the English-speaking population. The British Council estimates 1.75 billion people worldwide regularly speak English.
This summary of projects completed, in progress, and in prospect illustrates the growing pace and scope of this work. Many of these projects are relatively small in scale, typically adding or updating between 50 and 200 new words or senses, but they are high-impact. This year’s Korean English project both reflected and coincided with the increased international profile of South Korean culture – in music, K-pop as a global phenomenon; in film and TV, the Oscar-winning Parasite and the Netflix hit Squid Game. OED’s release was met with extensive coverage in Korean and Asian media (as well as in the UK and the US). More important than the flare of media coverage is the creation of lasting educational value. The project coincided with OED Online being made available, through a consortium deal, to students of English throughout South Korea.
As we work on these projects, we are building an extensive set of research resources. We’ve now collected these and made them available (free to all) in a new Varieties of English site which you can find here. From this page you can find detailed descriptions and guides to the regional varieties OED covers, and to the pronunciation models we’ve devised for our audio recordings and IPA transcriptions. There are links to webinars and videos, and to external resources. You’ll also find blog posts written by our editors and our expert consultants, describing the projects and the words and meanings we’ve researched.
And as an expression and extension of this work, in April OED is sponsoring an online conference, the Oxford World English Symposium. Speakers from all over the world will share their insights into a wide range of linguistic topics. You can see the full programme and register to join the (free) event here.
OED’s growing coverage of regional Englishes reflects a larger shift to new centres of lexical innovation, and signals OED’s commitment to reflecting 21st-century global culture and language, and to extending the dictionary’s reach, value, and relevance.
As we begin to design a new platform for oed.com, we have launched the OED Labs site as a public incubator and testing ground for tools, interfaces, and research project collaborations. This initiative will help to shape the future of the OED, and we believe the research community is an integral part of this evolution. At present, OED’s uniquely rich language data can be searched only through the oed.com website. Our aim is to offer researchers new, more direct, and more flexible ways to access this massive curated dataset and to gain deeper insights into the English language than ever before. Through OED Labs, researchers can help shape the development of new digital tools that help us to reimagine how the power of OED’s data can be shared and accessed to support and enrich academic research.
In this work, as in our programme of editorial revision, the OED’s aim is to bring the benefits of our scholarly research programme to the widest range of our readers as rapidly as possible, and to deliver these in the most useful and flexible ways. Continuous, prioritized revision of the dictionary will allow OED to engage more dynamically with language change in English across the globe, to incorporate the discoveries of new scholarship, and to adapt to meet the evolving needs of researchers worldwide.