Monthly Archives: November 2019

Assoc. Prof. Svetla Boytcheva (Institute of Information and Communication Technologies)

Short bio

Dr. Svetla Boytcheva has a PhD in Computer Science and an MSc in Mathematics from Sofia University St. Kliment Ohridski, Bulgaria. Her PhD thesis is in the field of Machine Learning and NLP. She has more than 25 years of experience teaching computer science courses at top universities in Bulgaria: Sofia University, American University in Bulgaria, University of Library Studies and Information Technologies, and New Bulgarian University. She gained leadership and management skills as Vice Dean of Academic Affairs and head of the graduate program in Artificial Intelligence at Sofia University, as well as by supervising successfully defended PhD work and more than 30 undergraduate and graduate student thesis projects.

Her current research interests include different aspects of Artificial Intelligence and Biomedical Informatics: machine learning, data mining, big data analytics, natural language processing, health informatics, and e-learning. She has gained experience by participating in several EU-funded and national research projects, including EC FP7: PSIP+, AcomIn, SISTER; EC FP6: TENCompetence, KALEIDOSCOPE; INCO-Copernicus: LarFlast, ILPnet2; SOCRATES/ERASMUS: ETN-DEC; BMBF Germany: BIMDANUBE; Bulgarian National Science Fund: EVTIMA, IZIDA, DemoSem. She is the person responsible for the Bulgarian team in the H2020 projects InnoRate, ExaMode, and theFMS. Currently, she also participates in several governmental projects and operational programme projects co-financed by the EU: the eHealth National Scientific Programme; Information and Communication Technologies for a Single Digital Market in Science, Education and Security; the Centre for Advanced Computing and Data Processing; and two Ministry of Health projects, Development of a National Electronic Health Register for Diabetes Mellitus Diseases and Analysis of the Morbidity, Prevalence, and Treatment Assessment of Diabetes Mellitus and Cardiovascular Diseases. In 2011 she received the Rolf Hansen Memorial Award of the European Federation for Medical Informatics.

Currently, she is also an associate professor of computer science in the Department of Linguistic Modelling and Knowledge Processing at the Institute of Information and Communication Technologies of the Bulgarian Academy of Sciences. Her current position at Sirma AI (trading as Ontotext) is Senior Research Lead, which includes responsibility for conducting research and developing prototypes for the company's scientific projects. She has authored 10 books and more than 90 scientific papers. She is also an author of several textbooks in Computer Science and Information Technologies for middle and secondary schools in Bulgaria.


Talk abstract

Clinical Natural Language Processing in Bulgarian

Healthcare is a data-intensive domain. A large amount of patient data is generated daily. However, more than 80% of this information is stored in an unstructured format, as clinical texts. Clinical narratives usually consist of telegraph-style sentences with ambiguous abbreviations, many typographical errors, missing punctuation, concatenated words, and so on. In the Bulgarian context in particular, medical texts contain terminology in Bulgarian, in Latin, and in Latin transliterated into Cyrillic, which makes text analytics more challenging. Recently, as the quality of natural language processing (NLP) has improved, it has been increasingly recognized as the most useful tool for extracting clinical information from free text in scientific medical publications and clinical records. NLP of non-English clinical text is quite a challenge because of the lack of resources and NLP tools. International medical ontologies such as SNOMED, MeSH (Medical Subject Headings), and the UMLS (Unified Medical Language System) are not yet available in most languages. This necessitates the development of new methods for processing clinical information and for semi-automatically generating medical language resources. This is not an easy task because sufficiently accessible repositories of medical records are lacking: the content contains a lot of personal data, and access to it is subject to specific regulations.

This talk will discuss the multilingual aspects of automated text extraction from clinical narratives in the Bulgarian language. This is a very important task for medical informatics, because it allows patient information to be structured automatically and databases to be generated that can be further investigated by retrieving data to search for complex relationships. The results can help improve clinical decision support, diagnosis, and treatment support systems.


Dr. Preslav Nakov (Qatar Computing Research Institute)

Short bio

Dr. Preslav Nakov is a Principal Scientist at the Qatar Computing Research Institute (QCRI), HBKU. His research interests include computational linguistics, “fake news” detection, fact-checking, machine translation, question answering, sentiment analysis, lexical semantics, Web as a corpus, and biomedical text processing. He received his PhD degree from the University of California at Berkeley (supported by a Fulbright grant), and he was a Research Fellow at the National University of Singapore, an honorary lecturer at Sofia University, and research staff at the Bulgarian Academy of Sciences.

At QCRI, he leads the Tanbih project, developed in collaboration with MIT, which aims to limit the effect of “fake news”, propaganda and media bias by making users aware of what they are reading. Dr. Nakov is the Secretary of ACL SIGLEX and of ACL SIGSLAV, and a member of the EACL advisory board. He is a member of the editorial board of TACL, C&SL, NLE, AI Communications, and Frontiers in AI. He is also on the Editorial Board of the Language Science Press Book Series on Phraseology and Multiword Expressions. He co-authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and many research papers in top-tier conferences and journals.

Dr. Nakov received the Young Researcher Award at RANLP’2011. He was also the first to receive the Bulgarian President’s John Atanasoff award, named after the inventor of the first automatic electronic digital computer. Dr. Nakov’s research has been featured by over 100 news outlets, including Forbes, Boston Globe, Al Jazeera, DefenseOne, Business Insider, MIT Technology Review, Science Daily, Popular Science, Fast Company, The Register, WIRED, and Engadget.


Talk abstract

Detecting the Fake News at Its Source, Media Literacy, and Regulatory Compliance

Given the recent proliferation of disinformation online, there has also been growing research interest in automatically debunking rumors, false claims, and “fake news”. A number of fact-checking initiatives have been launched so far, both manual and automatic, but the whole enterprise remains in a state of crisis: by the time a claim is finally fact-checked, it could have reached millions of users, and the harm caused could hardly be undone. An arguably more promising direction is to focus on fact-checking entire news outlets, which can be done in advance. Then, we could fact-check the news before it is even written, by checking how trustworthy the outlets that publish it are.

We will show how we do this in the Tanbih news aggregator (//www.tanbih.org/), which aims to limit the effect of “fake news”, propaganda and media bias by making users aware of what they are reading. The project’s primary aim is to promote media literacy and critical thinking, which are arguably the best way to address disinformation and “fake news” in the long run. In particular, we develop media profiles that show the general factuality of reporting, the degree of propagandistic content, hyper-partisanship, leading political ideology, general frame of reporting, stance with respect to various claims and topics, as well as audience reach and audience bias in social media. We further offer explainability by automatically detecting and highlighting the instances of use of specific propaganda techniques in the news (https://www.tanbih.org/propaganda).

Finally, we will show how this research can support broadcasters and content owners with their regulatory measures and compliance processes. This is a direction we recently explored as part of our TM Forum & IBC 2019 award-winning Media-Telecom Catalyst project on AI Indexing for Regulatory Compliance, which QCRI developed in partnership with Al Jazeera, Associated Press, RTE Ireland, Tech Mahindra, V-Nova, and Metaliquid.

