OED100: Expanding the Historical Thesaurus of the OED

The Historical Thesaurus of the OED (HTOED) is a semantic network of OED senses arranged by concept or meaning. It allows users to explore the different ways that meanings have been expressed in English over time, and to discover which synonyms were available for particular words in different periods. Its data has also been used to support various other research projects, such as the semantic tagging of historical corpora and the analysis of metaphorical patterns in English.

HTOED was created at the University of Glasgow, beginning in the late 1960s, and was largely based on data from the second edition of the OED. But as the OED continues to be revised and new words, senses, and phrases are added, these need to be linked to HTOED in order to present a complete semantic history of the language, and to enable HTOED to fully realize its potential to support a wide range of linguistic and historical research. Over the past few years, as part of the OED100 project, we have been working to expand HTOED in this way, using a partially automated process: we developed a tool to automatically identify possible classifications of senses (based on features such as definition wording, cross-references, and labels), and a team of editors manually check and modify the suggested classifications. Through this project we have linked more than 55,000 further OED senses to HTOED categories so far.
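To give a rough sense of what automated suggestion of this kind could look like, here is a minimal Python sketch. It is not the tool used in the project, whose internals are not described here. It simply scores each category by counting overlaps between a sense's definition wording, cross-references, and labels and some category data. The class names, the weights, the keyword sets, and the sample definition are all hypothetical and invented for illustration; only the category names and the words themselves come from the examples in this post.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Sense:
    """A hypothetical, simplified stand-in for a dictionary sense."""
    lemma: str
    definition: str
    cross_refs: list[str] = field(default_factory=list)
    labels: list[str] = field(default_factory=list)


@dataclass
class Category:
    """A hypothetical, simplified stand-in for a thesaurus category."""
    name: str
    keywords: set[str]                 # words typical of definitions in this category
    members: set[str]                  # lemmas already linked to this category
    labels: set[str] = field(default_factory=set)


def tokenize(text: str) -> set[str]:
    """Lowercase the text and strip simple punctuation from each word."""
    return {w.strip(".,;:()").lower() for w in text.split()} - {""}


def suggest_categories(sense: Sense, categories: list[Category], top_n: int = 3):
    """Return up to top_n (score, category name) pairs, best first.

    Assumed weighting (illustrative only): 1 point per shared definition
    word, 2 points per cross-reference to an existing member, 1 point per
    shared label. Categories scoring zero are not suggested.
    """
    def_words = tokenize(sense.definition)
    scored = []
    for cat in categories:
        score = (
            len(def_words & cat.keywords)
            + 2 * len(set(sense.cross_refs) & cat.members)
            + len(set(sense.labels) & cat.labels)
        )
        if score > 0:
            scored.append((score, cat.name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_n]


# Invented sample data: the definition, cross-reference, and keywords are made up.
categories = [
    Category("excellent", {"excellent", "good", "impressive"},
             {"kicking", "awesomesauce", "epic"}, {"slang"}),
    Category("types of rock music", {"rock", "music", "genre"},
             {"darkwave", "queercore"}),
]
sense = Sense("amazeballs", "Extremely good; excellent.",
              cross_refs=["awesomesauce"], labels=["slang"])

suggestions = suggest_categories(sense, categories)
print(suggestions)
```

In a workflow like the one described above, a ranked list of this sort would only be a starting point: the suggestions still go to editors, who accept, correct, or reject each one.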

Let’s take, for example, the category excellent. In the print edition of HTOED, the most recent word in this category was the slang term kicking, first recorded in 1983. We’ve added numerous other words to bring this category up to date with late 20th-century and early 21st-century coinages, including pukka, lovely jubbly, epic, awesomesauce, amazeballs, nang, peng, and daebak. Or, to take a more concrete example, we’ve expanded the category types of rock music with terms such as darkwave, queercore, and nu metal.

We’re also improving coverage of English in earlier periods, adding links to older senses which were previously not covered in OED or not linked to HTOED. For example, the already large category stupid person, dolt, blockhead has been extended not just with new terms (of which there are many: dingus, muppet, dough ball, and numpty are a few recent additions) but also with Early Modern synonyms in this area, including noddyship (1589), oatmeal-groat (1594), and simpletonian (1652 as a noun). Another category which we have expanded is light meal or snacks, which now includes more regional and World English uses such as nacket (chiefly Scottish, first recorded in 1694), docky (English regional, 1846), namkeen (Indian English, 1942), and biting (Kenyan English, 1997).

Thus far we have mainly been updating HTOED by adding new senses to existing categories, such as the ones mentioned above. However, there are some semantic areas which did not exist or were not salient when the Thesaurus project began – such as extreme sports, social media, and environmental activism. The next stage of the project will be to update HTOED’s taxonomy so as to enable more comprehensive coverage of these and other semantic fields.

Additional HTOED resources:

The University of Glasgow version of the Historical Thesaurus is available at https://ht.ac.uk/, and associated projects are listed at https://ht.ac.uk/associated-projects/

The opinions and other information contained in the OED blog posts and comments do not necessarily reflect the opinions or positions of Oxford University Press.