Uncategorized | Computational Linguistics in Bulgaria (CLIB-2016)

Test

Posted by Svetlozara Lesseva On August 30th, 2016

Gold sponsor of CLIB 2016:

Prof. Dragomir Radev (Department of Electrical Engineering and Computer Science, University of Michigan)

Posted by Svetlozara Lesseva On June 16th, 2016

Short bio

Dragomir Radev is a Professor of Computer Science and Engineering, Information, and Linguistics at the University of Michigan. He also has an appointment in the Michigan Institute for Data Science (MIDAS).

Dragomir grew up in Bulgaria and got interested in Computational Linguistics in high school when he participated in a number of contests in mathematical linguistics. Dragomir has a PhD in Computer Science from Columbia University, where he currently holds a Visiting Professor title. Dragomir’s research is in Natural Language Processing, Applied Machine Learning, and Information Retrieval. He works in the fields of text summarization, lexical semantics, sentiment analysis, open domain question answering, and the application of NLP to other areas such as Bioinformatics and Political Science.

Dragomir is the past secretary of the Association for Computational Linguistics (ACL). Dragomir is also co-founder of the North American Computational Linguistics Olympiad (NACLO) and the coach of the US team for the International Linguistics Olympiad (IOL). Dragomir has close to 200 international publications as well as three patents. He is the co-author (with Rada Mihalcea) of the book “Graph-based Natural Language Processing and Information Retrieval” and the editor of two volumes of “Puzzles in Logic, Languages and Computation”.

Dragomir has worked or consulted for IBM, Yahoo, Microsoft, AT&T, and other companies. In 2013, Dragomir received the University of Michigan’s Distinguished Faculty Award. He is an associate editor of the Journal of Artificial Intelligence Research (JAIR). Dragomir also teaches Introduction to Natural Language Processing on Coursera. Dragomir became an Association for Computing Machinery (ACM) Fellow in 2015.

Talk abstract

Natural Language Processing for Collective Discourse

Natural Language Processing (NLP) has become very popular in recent years thanks to new technologies like IBM’s Watson, Apple’s Siri, Google Translate, and Yahoo’s text summarization system. One of the fundamental challenges in NLP is to automatically recognize similar words and sentences. I will talk about research done in the Computational Linguistics And Information Retrieval lab (CLAIR) on graph-based methods for similarity recognition and their applications to NLP tasks. These projects are related to Collective Discourse (text collections produced by large numbers of users) and its inherent properties such as centrality and diversity. In the first project, we team up with the New Yorker magazine. Each week a captionless cartoon is published in the magazine and thousands of readers try to come up with funny captions for it. In our work, we try to uncover the topics of the jokes in the submitted captions. The second project is about analysing a corpus of word clues used in New York Times crossword puzzles. We compare different clustering methods for word sense disambiguation using these crossword clues. The third project is about the automatic generation of citation-based summaries of research articles. These summaries describe what readers of the papers find most important in the cited papers. If there is time, I will also briefly mention some applications to bioinformatics, political science, and social network analysis.
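To give a rough idea of what a graph-based view of similarity and centrality can look like, here is a minimal sketch. It is not the CLAIR lab's method: the bag-of-words vectors, the cosine measure, the threshold value, and the function names `cosine` and `degree_centrality` are all assumptions chosen for illustration. Each text is a node, two nodes are linked when their similarity reaches the threshold, and a text's centrality is the number of links it has.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def degree_centrality(texts, threshold=0.3):
    """Count, for each text, how many other texts are similar to it.

    The threshold is an illustrative assumption, not a value from the talk.
    """
    vectors = [Counter(t.lower().split()) for t in texts]
    degree = [0] * len(vectors)
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            # Add an undirected edge when the pair is similar enough.
            if cosine(vectors[i], vectors[j]) >= threshold:
                degree[i] += 1
                degree[j] += 1
    return degree


captions = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "stock prices fell sharply",
]
print(degree_centrality(captions))
```

In a collection such as caption submissions, texts with a high degree sit in dense regions of the graph, while texts with degree zero are outliers.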

Dr. Preslav Nakov (Qatar Computing Research Institute, HBKU)

Posted by Svetlozara Lesseva On June 16th, 2016

Short bio

This year our invited speaker will be Dr. Preslav Nakov of the Qatar Computing Research Institute, HBKU. His primary research interests include computational linguistics, machine translation, question answering, lexical semantics, Web as a corpus, and biomedical text processing.

Preslav Nakov holds a PhD degree in Computer Science from the University of California at Berkeley and an MSc degree from Sofia University. He was a Research Fellow at the National University of Singapore (2008-2011), an honorary lecturer at Sofia University (2008), a researcher at the Bulgarian Academy of Sciences (2008), and a visiting researcher at the University of Southern California, Information Sciences Institute (2005).

Preslav Nakov is a co-author of a book on Semantic Relations between Nominals, two books on computer algorithms, and over 100 research papers, including over 40 in top-tier conferences and journals.

He received the Young Researcher Award at the Recent Advances in Natural Language Processing Conference 2011 (RANLP’2011) and was the first to receive the Bulgarian President’s John Atanasoff award. His research in machine translation won competitions at the Seventh and the Ninth Workshops on Statistical Machine Translation (WMT’12 and WMT’14), as well as at the 10th International Workshop on Spoken Language Translation (IWSLT’13).

Preslav Nakov is an Associate Editor of the AI Communications journal and an elected member of the ACL SIGLEX board (since 2013). He has served on the programme committees of major conferences and workshops in computational linguistics, including as a co-chair of SemEval 2014-2016, and as an area chair of *SEM’13 and EMNLP’16.

Talk abstract

Exposing Paid Opinion Manipulation Trolls in News Community Forums

The practice of using opinion manipulation trolls has been a reality since the rise of the Internet and community forums. It has been shown that user opinions about products, companies and politics can be influenced by posts by other users in online forums and social networks. This makes it easy for companies and political parties to gain popularity by paying people or companies for “reputation management”, that is, for posting fake opinions from fake profiles in discussion forums and social networks.

During the 2013-2014 Bulgarian protests against the Oresharski cabinet, social networks and news community forums became the main “battle grounds” between supporters and opponents of the government. In that period, government supporters were notably present and active in Web forums. A series of leaked documents published by the independent Bulgarian media outlet Bivol alleged that the ruling Socialist party was paying Internet trolls with EU Parliament money. Allegedly, these trolls were hired by a PR agency and were given specific instructions on what to write.

A natural question is whether such trolls can be found and exposed automatically. This is a very hard task, as there is not enough data to train a classifier. Some test data can be obtained, because these trolls are sometimes caught and widely exposed (e.g., by Bivol), but training data is still needed. We solve the problem by assuming that a user who is called a troll by several different people is likely to be one, and that a user who has never been called a troll is unlikely to be one. We compare the profiles of (i) paid trolls vs. (ii) “mentioned” trolls vs. (iii) non-trolls, and we further show that a classifier trained to distinguish (ii) from (iii) also does quite well at telling apart (i) from (iii).
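The labelling assumption in the abstract can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the speaker's implementation: the input format (pairs of accuser and accused), the function name `weak_labels`, the label strings, and the cut-off of three distinct accusers standing in for "several different people" are all hypothetical.

```python
from collections import defaultdict


def weak_labels(accusations, all_users, min_accusers=3):
    """Derive heuristic training labels from troll accusations.

    accusations: iterable of (accuser, accused) pairs (hypothetical format).
    min_accusers: assumed cut-off for "several different people".
    Users accused by some, but fewer than min_accusers, distinct people
    are left unlabelled because the heuristic says nothing about them.
    """
    accusers = defaultdict(set)
    for accuser, accused in accusations:
        if accuser != accused:  # ignore self-accusations
            accusers[accused].add(accuser)

    labels = {}
    for user in all_users:
        count = len(accusers.get(user, ()))
        if count >= min_accusers:
            labels[user] = "mentioned_troll"
        elif count == 0:
            labels[user] = "non_troll"
    return labels


accusations = [("a", "x"), ("b", "x"), ("c", "x"), ("a", "y"), ("a", "x")]
users = ["a", "b", "c", "x", "y", "z"]
print(weak_labels(accusations, users))
```

Labels produced this way could then serve as training data for a classifier separating groups (ii) and (iii), with the exposed paid trolls held out as test data.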

Plenary Talks

Posted by Svetlozara Lesseva On June 15th, 2016
