The Bulgarian wordnet (BulNet) is a lexical-semantic network of
Bulgarian that was launched within BalkaNet, the project for the
development of a lexical-semantic network of the Balkan languages.
The Bulgarian database is integrated into BalkaNet and into
EuroWordNet, the network of the European languages,
through unique interlingual indexes (ILIs) that unambiguously mark the
counterparts in the different languages. After the completion of the
BalkaNet project, the construction of the Bulgarian wordnet has
continued within the nationally funded projects "BulNet - a
Lexical-semantic Network of Bulgarian" and "Electronic resources and
processing tools", co-funded alongside the project CESAR: Central and
South-East European Resources, which is funded under the Information
and Communication Technologies Policy Support Programme (call CIP
ICT-PSP-2010-4).
BulNet is developed following the Princeton WordNet (PWN)
framework, a subtype of the traditional semantic networks whose
structure consists of nodes and relations between the nodes. The nodes
are synonym sets (synsets) that contain words or compounds (literals).
Arcs connecting the nodes express semantic, derivational and
extralinguistic relations between the objects in the nodes. Literals
(senses) and synsets (meanings) encode language-independent concepts.
The semantics of lexical units in the wordnet is expressed implicitly
through the synonymy relations between the literals in a synset and
the relations to other nodes in the network, and explicitly through
the explanatory definition and usage examples.
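
The node-and-arc structure described above can be illustrated with a minimal sketch. It is not the BulNet storage format: the class name Synset, its field names, the helper related(), the ILI strings and the two toy entries are all assumptions made up for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """One node of the network: a set of synonymous literals."""
    ili: str                 # interlingual index (hypothetical format)
    pos: str                 # part of speech, e.g. "n" for noun
    literals: list[str]      # words or compounds sharing one meaning
    definition: str          # explanatory definition (explicit semantics)
    examples: list[str] = field(default_factory=list)
    # Arcs to other nodes: relation name -> list of target ILIs
    relations: dict[str, list[str]] = field(default_factory=dict)


def related(network: dict[str, Synset], ili: str, relation: str) -> list[Synset]:
    """Follow one relation type from a synset and return the target synsets."""
    targets = network[ili].relations.get(relation, [])
    return [network[t] for t in targets if t in network]


# Toy entries invented for illustration; they are not taken from BulNet.
dog = Synset("ili-0001", "n", ["куче", "пес"], "domesticated canine",
             relations={"hypernym": ["ili-0002"]})
animal = Synset("ili-0002", "n", ["животно"], "a living organism that can move")
network = {s.ili: s for s in (dog, animal)}

for target in related(network, "ili-0001", "hypernym"):
    print(target.literals, "-", target.definition)
```

Keying the network by ILI reflects the idea that the same index identifies the counterpart synsets in the other languages, so a lookup in another wordnet could reuse the same key.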
As of January 21, 2013, the Bulgarian wordnet consists of 49,189 synonym
sets distributed across nine parts of speech: nouns, verbs, adjectives
and adverbs (open-class words), and pronouns, prepositions, conjunctions,
particles and interjections (closed-class words). Each synonym set is
supplied with an explanatory definition that represents the common
referential meaning of all its members.
Synonym sets are linked to each other by means of the semantic,
morpho-semantic and extralinguistic relations that hold between the
words of a language. A sense-annotated corpus of Bulgarian (BulSemCor)
has also been developed.
The ongoing tasks of the project are:
Further expansion of the Bulgarian WordNet with new synsets.
Editing of the existing data.
Development and application of tests for completeness and consistency of the database.
Design and implementation of a system for automatic word sense disambiguation (WSD).
Dissemination of the project's results.
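
As an illustration of what tests for completeness and consistency of the database might look like, here is a minimal sketch. The checks themselves (every synset has at least one literal, every synset has a definition, every relation points to an existing synset) are assumptions chosen for the example, as are the function name check_network and the dictionary layout; the actual tests applied to BulNet are not described here.

```python
def check_network(network):
    """Return a list of (synset_id, problem) pairs for a toy network.

    `network` maps a synset id to a dict with the keys "literals",
    "definition" and "relations" (relation name -> list of target ids).
    """
    problems = []
    for sid, synset in network.items():
        # Completeness: a synset needs at least one literal ...
        if not synset.get("literals"):
            problems.append((sid, "no literals"))
        # ... and a non-empty explanatory definition.
        if not synset.get("definition", "").strip():
            problems.append((sid, "missing definition"))
        # Consistency: every arc must end at a synset that exists.
        for relation, targets in synset.get("relations", {}).items():
            for target in targets:
                if target not in network:
                    problems.append((sid, f"dangling {relation} -> {target}"))
    return problems


# Toy data invented for illustration only.
toy = {
    "s1": {"literals": ["a"], "definition": "first",
           "relations": {"hypernym": ["s2"]}},
    "s2": {"literals": [], "definition": " ",
           "relations": {"hypernym": ["s9"]}},
}

for sid, problem in check_network(toy):
    print(sid, problem)
```

Returning the problems as data rather than raising on the first one lets an editor review all flagged synsets in a single pass.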