Prof. Bolette Sandford Pedersen
July 6th, 2022
Short bio
Bolette Sandford Pedersen is Professor of Computational Linguistics, Deputy Head of the Department of Nordic Studies and Linguistics, and Centre Leader of the Centre for Language Technology. Her main research interests include computational lexicography, lexical semantics, and linguistic ontologies.
Bolette Sandford Pedersen was coordinator of the Nordic NORFA network SPINN on harmonisation of language resources in the Nordic countries, coordinator of the Danish Senseval2 participation on sense tagging, project manager of DanNet, and package leader for lexical resources in DK-CLARIN (2008-2011). She was also Danish coordinator of two EU projects: CLARA (Common Language Resources and their Applications), a Marie Curie Initial Training Network (2011-2014), and META-NORD (2011-2013). In addition, she was project co-leader of Semantic Processing Across Domains, a project financed by the Danish Research Council (2013-2016).
She has been a member of scientific committees at ACL, COLING, the Global WordNet Conference, the Euralex Congress, LREC, and OntoLex, among others.
Talk abstract
Lexical Conceptual Resources in the Era of Neural Language Models
Lexical conceptual resources such as wordnets, framenets, terminologies, and ontologies have been compiled for many languages over the last decades in order to provide NLP systems with formally expressed information about the semantics of words and phrases, and about how they refer to the world. In recent years, neural language models, which are based solely on text from large corpora, have become a game-changer in the NLP field. It is time we ask ourselves: What is the role of lexical conceptual resources in the era of neural language models? The claim of my talk is that they still play a crucial role, since NLP systems based on textual distribution alone will always be insufficient and biased to some extent. Through my own work, which over the years has taken place in close collaboration with leading lexicographers in Denmark, I will illustrate how such conceptual resources can be compiled from existing high-quality, continuously updated lexicographical resources, and how they can be further curated by examining the distributional patterns captured in word embeddings.