Prof. Galya Angelova (Institute of Information and Communication Technologies) | Computational Linguistics in Bulgaria (CLIB-2026)

Short bio

Galya Angelova is a Professor of Computer Science and Doctor of Sciences, and director of the Institute of Information and Communication Technologies (IICT) at the Bulgarian Academy of Sciences. She studied Mathematics and Informatics at Sofia University “St. Kliment Ohridski” and received her PhD from MTA SZTAKI (Computer and Automation Institute at the Hungarian Academy of Sciences). Her major fields of research are knowledge-based natural language processing (information extraction from text, automatic acquisition of conceptual information from text, analysis of clinical patient records in Bulgarian, analysis of image tags and automatic tag sense recognition); big data analytics and visualization; and digitization and intelligent management of digital content.

Prof. Angelova has authored more than 150 scientific publications in journals, book chapters, and edited conference volumes. She has been the coordinator or principal investigator of more than 25 projects with international or national funding. From 2012 to 2016 she coordinated the project AComIn “Advanced Computing for Innovation”, a 3.2 million euro grant from the European Commission (FP7 Capacity), which the European Commission included in the book “Achievements of FP7: examples that make us proud”. Prof. Angelova received the Big Award PITAGOR of the Bulgarian Ministry of Education and Science in the category “Successful leader of international projects for 2015”. She acts as a reviewer and evaluator for the European Commission.

From 2002 to 2013, Prof. Angelova was a member of the Editorial Board of the International Conference on Conceptual Structures (ICCS), whose peer-reviewed proceedings are published by Springer in Lecture Notes in Artificial Intelligence. Since 2001, she has chaired the Organising Committee of the International Conference RANLP (Recent Advances in Natural Language Processing), held biennially in Bulgaria. The RANLP proceedings have been included in the ACL Anthology since 2009 and indexed by Scopus since 2007, with a current SJR rank of 0.143.


Talk abstract

Tag Sense Disambiguation in Large Image Collections: Is It Possible?

Automatic identification of intended tag meanings is a challenge in large annotated image collections, where human authors assign tags inspired by emotional or professional motivations. This task can be viewed as part of the AI-complete problem of integrating language and vision. Algorithms for automatic Tag Sense Disambiguation (TSD) need “golden” collections of manually created tags to establish baselines for accuracy assessment. This talk presents the TSD task with its background, complexity and possible solutions. An approach that uses WordNet senses and the Lesk algorithm proves successful, but it was evaluated manually on only a small number of tags. Another experiment, with the MIRFLICKR-25000 image collection, will be presented as well. Word embeddings provide a specific baseline against which the results can be compared. The accuracy achieved in this experiment is 78.6%.
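To make the Lesk idea concrete, here is a minimal sketch of the simplified Lesk variant: it picks the sense whose gloss shares the most words with the context. It is not the speaker's implementation. The sense inventory, the sense identifiers, and the choice of an image's other tags as context are all assumptions made for illustration; a real system would take its glosses from WordNet.

```python
# Simplified Lesk: choose the sense whose gloss overlaps most with the context.
# SENSES is a hypothetical, hand-written stand-in for WordNet glosses.

STOPWORDS = {"a", "an", "the", "of", "or", "and", "in", "on", "for", "to", "with"}

SENSES = {
    "jaguar": {
        "jaguar.animal": "large spotted wild cat of the forest in central and south america",
        "jaguar.car": "luxury car brand vehicle with engine and wheels",
    },
}


def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {word for word in text.lower().split() if word not in STOPWORDS}


def simplified_lesk(tag, co_tags, senses=SENSES):
    """Return the sense id of `tag` whose gloss best overlaps the co-occurring tags.

    Assumption: the other tags on the same image serve as the context.
    Ties (including zero overlap) go to the first sense listed.
    Returns None when the tag has no entry in the inventory.
    """
    context = content_words(" ".join(co_tags))
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.get(tag, {}).items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense
```

With this toy inventory, an image tagged "jaguar", "forest", "cat" resolves to the animal sense, while "jaguar", "car", "engine" resolves to the car sense. Falling back to the first listed sense on a tie is a design choice of this sketch; it mirrors the common practice of defaulting to the most frequent sense.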

By improving TSD and obtaining high-quality synsets for the image tags, we in effect support the machine translation of large annotated image collections into languages other than English.
