TUTORIALS | Computational Linguistics in Bulgaria (CLIB-2026)

PROVISIONAL SCHEDULE

PROGRAMME | 7 September 2026

Tutorial: Ollama, No Drama: Step-by-Step Guide of Practical Local AI
Lecturer: Dimitar Hristov
Affiliation: Institute for Bulgarian Language, Bulgarian Academy of Sciences

Tutorial: Hack the Agent: Vulnerabilities and Security in AI Systems
Lecturer: Ivan Ivanov
Affiliation: Stihia.ai

PROGRAMME | 10 September 2026

Tutorial: The Work of Linguists in Computational Linguistics
Lecturer: Irina Temnikova
Affiliation: Big Data for Smart Society Institute (GATE)

Tutorial: Composing Theoretical and Computational Linguistic Problems
Lecturer: Ivan Derzhanski
Affiliation: Institute of Mathematics and Informatics, Bulgarian Academy of Sciences

TUTORIALS DESCRIPTION

Tutorial: Ollama, No Drama: Step-by-Step Guide of Practical Local AI | Download PDF

Abstract: This tutorial introduces participants to Ollama, a lightweight and user-friendly framework for running large language models locally. The session covers installation, model selection, loading and switching between models, running prompts and chats directly in the terminal, and exploring simple integrations – all without requiring programming experience. The tutorial emphasizes hands-on interaction, transparency, and demystification of local AI workflows.
The primary goal is to give beginners – including first-year computer science and engineering students and participants with non-engineering backgrounds – the knowledge and confidence to use local LLMs for learning, experimentation, and small-scale research tasks. By the end of the tutorial, attendees will be able to:

Install and configure Ollama on their own machines
Understand the differences between available open models
Run prompts and interactive chats in the terminal
Configure models for specific tasks and experiments
Use local LLMs for practical tasks such as summarization, brainstorming, and text exploration
Connect Ollama to external tools or interfaces
Develop intuition about how local AI fits into broader NLP and research workflows

The session is designed intentionally code-light, focusing on conceptual clarity and practical utility rather than software engineering.

Tutorial type: Introductory

Target audience: The course is aimed at students and young researchers and professionals of various backgrounds, without necessary prior experience in code-writing or command-line interfaces.

Prerequisites: Some experience with common AI chatbots and assistants is beneficial. The attendees will be expected to bring their own computers for the hands-on part of the tutorial.

Lecturer: Dimitar Hristov (Institute for Bulgarian Language, Bulgarian Academy of Sciences)

Short bio: Dimitar Hristov is a graduate of the Sofia High School of Mathematics, with a BSc from the University of Southampton and an MSc from Sofia University. He is currently a PhD student at the Department of Computational Linguistics of the Institute for Bulgarian Language at the Bulgarian Academy of Sciences, focusing on the development, fine-tuning, and optimisation of large language models, including their specialisation for on-device execution. He has long collaborated with the Department on research in NLP, corpus development, and WordNet resources for English and Bulgarian. Dimitar also has experience leading educational sessions in both non-formal and professional settings, including workshops in student organisations and technical tutorials for apprentices in Cleversoft Bulgaria’s training programme.

Tutorial: Hack the Agent: Vulnerabilities and Security in AI Systems | Download PDF

Tutorial description: Prompt injection – the act of tricking an AI system into following malicious instructions hidden in its context – is fundamentally a linguistic attack. Unlike traditional software exploits that target memory layouts or protocol weaknesses, prompt injection exploits the fact that large language models cannot reliably distinguish between legitimate instructions and adversarial data when both are expressed in natural language. The attack surface is language itself.

This makes prompt injection a first-class problem for computational linguistics. Detecting these attacks requires NLP classification, semantic analysis, and an understanding of how meaning is constructed and manipulated across languages. Defending against them demands the same skills that the computational linguistics community applies to sentiment analysis, intent detection, and discourse parsing – but in an adversarial setting where attackers actively craft inputs to evade detection.

The cross-lingual dimension is especially relevant to the CLIB community. Current prompt injection research and defense tooling is overwhelmingly English-centric. Models trained primarily on English data exhibit measurably lower resilience when processing prompts in under-resourced languages such as Bulgarian,
Greek, Romanian, Hungarian, Polish, or Czech. Attackers already exploit this gap: switching to a low-resource language is a documented evasion technique. This means that the languages and communities CLIB serves are disproportionately vulnerable to prompt injection, and that progress on detection and defense for these languages requires exactly the kind of linguistic expertise this conference fosters.

This tutorial bridges AI security and computational linguistics by treating prompt injection as adversarial NLP. Participants will learn how attacks work, see live demonstrations (including multi-language attacks),
and explore NLP-based defense strategies – with particular attention to the challenges and opportunities posed by under-resourced European languages.

Tutorial type: Cutting-edge. Prompt injection in agentic AI systems is an active, unsolved research problem. The OWASP Top 10 for Agentic Applications was published in December 2025, and new attack vectors are discovered regularly. No consensus defense exists, and the cross-lingual dimension remains largely unexplored. This tutorial presents the current state of the art alongside open problems that computational linguists are uniquely positioned to address.

Target audience: NLP researchers, computational linguists, AI/ML practitioners, and cybersecurity professionals interested in the intersection of language and security.

Prerequisites: Basic familiarity with large language models (understanding of prompts, completions, and basic machine learning concepts). No prior cybersecurity expertise is required – the tutorial introduces all necessary security concepts from the ground up.

Lecturer: Ivan Ivanov (Stihia.ai)

Short bio: Ivan Ivanov is the Founder of Stihia.ai, company developing a security observability system for AI agents. With over a decade of industrial experience in Data Science and Machine Learning, Ivan possesses a deep understanding of the real-world risks and requirements involved in deploying AI in business environments. Throughout his career, he has served as a technical leader for high-growth companies, managing data science teams and delivering scalable AI systems for the retail, manufacturing, education, and healthcare sectors. His portfolio spans use cases ranging from predictive maintenance and demand forecasting to intelligent document processing, including platforms serving over 40,000 active users. Ivan holds a Master’s degree in Computer Science from the University of Bonn, Germany, and a Bachelor of Engineering in Computing from TU-Varna, Bulgaria. He has been active in the Bulgarian data science community through organizations including Data for Good Bulgaria.

Tutorial: The Work of Linguists in Computational Linguistics | Download PDF

Tutorial description: This is an introductory tutorial, based on the presenter’s experience as a linguist, translator, annotator, and researcher in the Computational Linguistics (CL) field. The tutorial is for linguists, beginning their careers in the Computational linguistics field, such as University students. It presents an overview of the tasks which are completed by linguists, assigned by more experienced researchers and companies, such as data collection, annotation, tools’ evaluation, and post-editing of machine translation (including the use of language models). It provides definitions and explains some basic concepts, such as the main CL methods, the levels of text processing, Corpus Linguistics, Translation Technologies, and Natural Language Processing (NLP) applications. Some practical exercises will be given.

The tutorial also provides practical tips for beginning linguists regarding where they can find related jobs in the field. The tutorial is compiled from previous University lectures of the presenter. This tutorial is of great value for linguists, especially those with no mathematical or programming skills, who begin their journey in the CL/NLP fields.

Tutorial type: Introductory

Target audience: The tutorial is designed to train linguists, beginning their careers in the Computational Linguistics / Natural Language Processing fields, as well as translators, planning to use translation memories and machine translation. Linguists with existing annotation, statistical and programming skills are also welcome. More experienced researchers and annotators are welcome to join the discussion with their suggestions and personal experiences.

Prerequisites: The prerequisite background for the course is linguistic knowledge or linguistic (university-level) education.

Lecturer: Irina Temnikova (Big Data for Smart Society Institute (GATE))

Short bio: Dr. Irina Temnikova holds a B.A. degree in Italian linguistics, as well as an M.A. and a PhD in computational linguistics. She started her career as a translator and linguist with basic Perl and Python programming skills. During the first years of her career, she conducted pure linguistic and psycholinguistic research applied to the field of natural language processing (NLP). During the last few years she has self-taught herself and made a breakthrough progress in understanding, fine-tuning and training NLP models, as well as in statistical calculations, while leading a research team and successfully completing the externally funded research project TRACES. During all of her career, she has worked as an annotator in freelance projects, and has designed and run annotation and NLP systems evaluation experiments. Her research interests cover interdisciplinary and practical approaches to emotion detection, text readability and simplification, controlled languages and sublanguages, NLP for disaster computing, post-editing and evaluation of machine translation, extracting linguistic insights from interpreting transcriptions, corpus linguistics, and detecting deception and disinformation.
Irina has extensive experience teaching Italian language courses. At the university level, she has delivered lectures on topics related to her research interests to both undergraduate and master’s students at the University of Wolverhampton, UK. She has also contributed to a course and has independently designed and successfully delivered another full undergraduate (B.A.) course focused on the work of linguists in computational linguistics, both taught at Sofia University, Bulgaria.

Tutorial: Composing Theoretical and Computational Linguistic Problems | Download PDF

Tutorial description: This tutorial will present the genre of the self-contained linguistic problem as a form of teaching facts about languages and concepts from linguistics, as well as the major techniques of composing problems and problem sets for linguistic contests, both classical and computational, but with a special focus on the latter. It is intended for all who are interested in linguistic problems, whether because of current or envisaged participation in organising a contest, or interest in trying one’s hand as a problem author, or involvement in teaching in any form, or plain curiosity.

For more than six decades, linguistics olympiads – contests for secondary-school students in solving linguistic problems – have served as a means of acquainting such students, but also the general public, with the science of language and its applications, thus filling a gap in school curricula from which this domain of human intellectual pursuit is typically absent. Accordingly, the idea is that the problems should be self-contained – primarily inductive rather than deductive, not testing previously obtained knowledge but inviting the discovery of rules on the basis of unseen data, using only one’s analytical skills and general culture in addition to such fundamental notions about language as are introduced in native and foreign language classes.

The contests were inspired by and modelled on mathematics olympiads, mathematicians cooperated with linguists in their design and implementation, and the word ‘mathematics’ or ‘mathematical’ was (and still is) often present in the title; but it was understood that what mathematics provided was the general method, the idea of abstract logical reasoning, not the content. The subfield of linguistics to which most linguistic problems can be said to be relevant is unquestionably typology.

Notwithstanding, since the earliest instalments of the first regular linguistic contest in history, the Moscow Traditional Olympiad in Linguistics and Mathematics (established in 1965), concepts and topics from computational linguistics have been present in it, in anticipation of the explosive development of language technologies that was destined to come shortly. Unsurprisingly, in recent years such problems have been appearing more frequently at ‘classical’ linguistic olympiads, and several contests have been launched which explicitly name computational linguistics as their primary or only subject, such as the North American Computational Linguistics Open Competition (NACLO, est. 2007) and the Australian Computational and Linguistics Olympiad (OzCLO, est. 2008), or the Contest in Computational Linguistics offered by the Institute for Bulgarian Language, Bulgarian Academy of Sciences. As the olympiads grow in numbers and size, the number of problem authors grows as well – but so does the need for problems.

Tutorial type: Introductory

The proposed tutorial will present the art – that is, the major techniques – of composing problems and problem sets for linguistic contests, both classical and computational, but with a special focus on the latter.

Target audience: The target audience are primarily those who are engaged or interested in organising a contest in linguistics or computational linguistics, but also linguists, and especially computational linguists, both professionals and (postgraduate) students, who are involved (or considering being involved) in teaching at any level and would like to broaden the range of methods that they use.

Prerequisites: None.

Lecturer: Ivan Derzhanski (Institute of Mathematics and Informatics, Bulgarian Academy of Sciences)

Short bio: Ivan Derzhanski is an associate professor at the Institute of Mathematics and Informatics, Bulgarian Academy of Sciences. His research interests are in (mostly computer- and corpus-aided) contrastive linguistics and typology, corpus linguistics, translation theory and practice and the methodology of linguistics. He teaches courses in Python programming for linguists, computational linguistics and natural language processing in undergraduate and postgraduate programmes at the University of Sofia and the New Bulgarian University. In Bulgaria he has been playing the central part in providing scientific support to the Olympiad and National Contest in Linguistics and to linguistic seminars for secondary-school students, putting together problem sets for the former and lecture programmes for the latter, continuously since 1996. He is a co-founder of the International Linguistics Olympiad (IOL, est. 2003) and has served on the problem committee and jury of all its instalments and chaired them thrice. Also he is a member of the team organising the national Contest in Computational Linguistics, centred at the Institute for Bulgarian Language, Bulgarian Academy of Sciences. He has authored and co-authored more than 120 linguistic problems.