Prof. Dragomir Radev (Department of Electrical Engineering and Computer Science, University of Michigan) | Computational Linguistics in Bulgaria (CLIB-2026)

Short bio

Dragomir Radev is a Professor of Computer Science and Engineering, Information, and Linguistics at the University of Michigan. He also has an appointment in the Michigan Institute for Data Science (MIDAS).

Dragomir grew up in Bulgaria and became interested in Computational Linguistics in high school, when he participated in a number of contests in mathematical linguistics. He has a PhD in Computer Science from Columbia University, where he currently holds a Visiting Professor title. His research is in Natural Language Processing, Applied Machine Learning, and Information Retrieval. He works on text summarization, lexical semantics, sentiment analysis, open-domain question answering, and the application of NLP to other areas such as Bioinformatics and Political Science.

Dragomir is a past secretary of the Association for Computational Linguistics (ACL). He is also a co-founder of the North American Computational Linguistics Olympiad (NACLO) and the coach of the US team for the International Linguistics Olympiad (IOL). He has close to 200 international publications as well as three patents. He is the co-author (with Rada Mihalcea) of the book “Graph-based Natural Language Processing and Information Retrieval” and the editor of two volumes of “Puzzles in Logic, Languages and Computation”.

Dragomir has worked or consulted for IBM, Yahoo, Microsoft, AT&T, and other companies. In 2013, he received the University of Michigan’s Distinguished Faculty Award. He is an associate editor of the Journal of Artificial Intelligence Research (JAIR). He also teaches an introductory Natural Language Processing course on Coursera. He became an Association for Computing Machinery (ACM) Fellow in 2015.

Talk abstract

Natural Language Processing for Collective Discourse

Natural Language Processing (NLP) has become very popular in recent years thanks to new technologies like IBM’s Watson, Apple’s Siri, Google Translate, and Yahoo’s text summarization system. One of the fundamental challenges in NLP is to automatically recognize similar words and sentences. I will talk about research done in the Computational Linguistics and Information Retrieval lab (CLAIR) on graph-based methods for similarity recognition and their applications to NLP tasks. These projects are related to Collective Discourse (text collections produced by large numbers of users) and its inherent properties, such as centrality and diversity. In the first project, we team up with the New Yorker magazine. Each week, the magazine publishes a captionless cartoon, and thousands of readers try to come up with funny captions for it. In our work, we try to uncover the topics of the jokes in the submitted captions. The second project analyzes a corpus of word clues used in New York Times crossword puzzles. We compare different clustering methods for word sense disambiguation using these crossword clues. The third project is about the automatic generation of citation-based summaries of research articles. These summaries describe what readers of the papers find most important in the cited papers. If there is time, I will also briefly mention some applications to bioinformatics, political science, and social network analysis.