9 September 2024

House of Europe (124 G. Rakovski St.)


8:30 – Registration

9:00 – 9:15 – Conference Opening


Neural Networks, Large Language Models and Language Modelling


9:15 – 10:00 – Plenary Talk: Dr. Veselin Stoyanov (TOME AI, USA): Large Language Models for the Real World: Explorations of Sparse, Cross-lingual Understanding and Instruction-Tuned LLMs


10:00 – 11:15 – Session 1: Large Language Models and Language Learning | Chair: Stoyan Mihov (Institute of Information and Communication Technologies, BAS)


10:00 – 10:25 – Radu Ion, Verginica Barbu Mititelu, Vasile Pais, Elena Irimia, Valentin Badea: A Cross-model Study on Learning Romanian Parts of Speech with Transformer Models

10:25 – 10:50 – Ekaterina Goliakova, David Langlois: What do BERT Word Embeddings Learn about the French Language?

10:50 – 11:15 – Camille Lavigne, Alex Stasica: Whisper-TAD: A General Model for Transcription, Alignment and Diarization of Speech


11:15 – 11:30 – Coffee Break


 


11:30 – 12:45 – Session 2: Large Language Models in Analysis and Generation | Chair: Ivan Koychev (Sofia University St. Kliment Ohridski)


11:30 – 11:55 – Iglika Nikolova-Stoupak, Gael Lejeune, Eva Schaeffer-Lacroix: Contemporary LLMs and Literary Abridgement: An Analytical Inquiry

11:55 – 12:20 – Milica Ikonic Nesic, Sasa Petalinkar, Mihailo Skoric, Ranka Stankovic, Biljana Rujevic: Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model

12:20 – 12:45 – Lyuboslav Karev, Ivan Koychev: Generating Phonetic Embeddings for Bulgarian Words with Neural Networks


12:45 – 13:45 – Lunch and Poster Session


13:45 – 14:30 – Plenary Talk: Prof. Joakim Nivre (Uppsala University and RISE, Sweden): Ten Years of Universal Dependencies


14:30 – 15:45 – Session 3: Treebanks and Parsers in Universal Dependencies | Chair: Milena Dobreva (University of Strathclyde)


14:30 – 14:55 – Nelda Kote, Rozana Rushiti, Anila Cepani, Alba Haveriku, Evis Trandafili, Elinda Kajo Mece, Elsa Skenderi Rakipllari, Lindita Xhanari, Albana Deda: Universal Dependencies Treebank for Standard Albanian: A New Approach

14:55 – 15:20 – Verginica Barbu Mititelu, Tudor Voicu: Function Multiword Expressions Annotated with Discourse Relations in the Romanian Reference Treebank

15:20 – 15:45 – Atanas Atanasov: Dependency Parser for Bulgarian


15:45 – 16:00 – Coffee Break


 


16:00 – 17:40 – Session 4: Modeling Multiword Expressions | Chair: Aleksandra Bagasheva (Sofia University St. Kliment Ohridski)


16:00 – 16:25 – Madalina Chitez, Ana-Maria Bucur, Andreea Dinca, Roxana Rogobete: Towards a Romanian Phrasal Academic Lexicon

16:25 – 16:50 – Laura Rituma, Gunta Nespore-Berzkalne, Agute Klints, Ilze Lokmane, Madara Stade, Peteris Paikens: Classifying Multi-Word Expressions in the Latvian Monolingual Electronic Dictionary Tezaurs.lv

16:50 – 17:15 – Laura Occhipinti: Complex Word Identification for Italian Language: A Dictionary-based Approach

17:15 – 17:40 – Ivana Brac, Matea Birtic: Verbal Multiword Expressions in the Croatian Verb Lexicon


10 September 2024


Datasets, Corpora and Lexical-semantic Resources


9:00 – 9:45 – Plenary Talk: Prof. Vito Pirrelli (NRC, Institute for Computational Linguistics, Pisa, Italy): Written Text Processing and the Adaptive Reading Hypothesis


9:45 – 10:35 – Session 5: Language Technologies and Language Acquisition | Chair: Mariana Damova (Mozaika LTD.)


9:45 – 10:10 – Alessandro Lento, Andrea Nadalini, Marcello Ferro, Claudia Marzi, Vito Pirrelli, Tsvetana Dimitrova, Hristina Kukova, Valentina Stefanova, Maria Todorova, Svetla Koeva: Assessing Reading Literacy of Bulgarian Pupils with Finger-tracking

10:10 – 10:35 – Denitza Charkova: Educational Horizons: Mapping the Terrain of Artificial Intelligence Integration in Bulgarian Educational Settings


10:35 – 11:25 – Session 6: Corpus-based Studies: Part 1 | Chair: Svilena Georgieva (DG Translation, EU)


10:35 – 11:00 – Ekaterina Tarpomanova: Evidential Auxiliaries as Non-reliability Markers in Bulgarian Parliamentary Speech

11:00 – 11:25 – Iglika Nikolova-Stoupak, Eva Schaeffer-Lacroix, Gael Lejeune: Extended Context at the Introduction of Complex Vocabulary in Abridged Literary Texts


11:25 – 11:40 – Coffee Break


 


11:40 – 12:55 – Session 6: Corpus-based Studies: Part 2 | Chair: Elitza Horozova (Translation Agency Sofita)


11:40 – 12:05 – Junya Morita: Corpus-based Research into Derivational Morphology: A Comparative Study of Japanese and English Verbalization

12:05 – 12:30 – Ivan Derzhanski, Olena Siruk: The Verbal Category of Conditionality in Bulgarian and Its Ukrainian Correspondences

12:30 – 12:55 – Natalia Dankova: Lexical Richness of French and Quebec Journalistic Texts


12:55 – 13:55 – Lunch and Poster Session


 


13:55 – 15:10 – Session 7: Language Resources and Datasets | Chair: Irina Temnikova (GATE Institute)


13:55 – 14:20 – Maria Khokhlova, Mikhail Koryshev: A Corpus of Liturgical Texts in German: Towards Multilevel Text Annotation

14:20 – 14:45 – Valentin Zmiycharov, Ivan Koychev, Todor Tsonkov: EurLexSummarization – A New Text Summarization Dataset on EU Legislation in 24 Languages with GPT Evaluation

14:45 – 15:10 – Petya Osenova: On a Hurtlex Resource for Bulgarian


15:10 – 15:30 – Coffee Break


 


15:30 – 17:10 – Session 8: WordNets, FrameNets and Ontologies | Chair: Tinko Tinchev (Sofia University St. Kliment Ohridski)


15:30 – 15:55 – Ivelina Stoyanova: Semantic Features in the Automatic Analysis of Verbs of Creation in Bulgarian and English

15:55 – 16:20 – Svetlozara Leseva: A ‘Dip-dive’ into Motion: Exploring Lexical Resources towards a Comprehensive Semantic and Syntactic Description

16:20 – 16:45 – Ivelina Stoyanova, Hristina Kukova, Maria Todorova, Tsvetana Dimitrova: Multilingual Corpus of Illustrative Examples on Activity Predicates

16:45 – 17:10 – Svetla Koeva: Large Language Models in Linguistic Research: The Pilot and the Copilot


17:10 – 17:30 – Conference Closing


POSTER SESSION

Chair: Svetlozara Leseva (Institute for Bulgarian Language, BAS)


The Poster Session will take place during the lunch break on 9 and 10 September.


The posters are listed in alphabetical order of the first authors’ surnames.


Fabio Maion, Tsvetana Dimitrova, Andrej Bojadziev: A Unified Annotation of the Stages of the Bulgarian Language. First Steps

Amal Haddad Haddad, Damith Premasiri: ChatGPT: Detection of Spanish Terms Based on False Friends

Jordan Kralev: Deep Learning Framework for Identifying Future Market Opportunities from Textual User Reviews

Ruslana Margova, Bastiaan Bruinsma: Look Who’s Talking: The Most Frequently Used Words in the Bulgarian Parliament 1990–2024

Sabrina Mennella, Maria Di Maro, Martina Di Bratto: Estimating Commonsense Knowledge from a Linguistic Analysis on Information Distribution

Georgi Pashev, Silvia Gaftandzhieva: Pondera: A Personalized AI-Driven Weight Loss Mobile Companion with Multidimensional Goal Fulfillment Analytics

Stanislav Penkov: Mitigating Hallucinations in Large Language Models via Semantic Enrichment of Prompts: Insights from BioBERT and Ontological Integration

Maria Todorova: Commercially Minor Languages and Localization