
Software

Hydra

Hydra is an OS-independent system designed for wordnet development, validation and exploration. The program enables users to browse and edit any number of monolingual wordnets at a time. The individual wordnets are synchronised, so that equivalent synonym sets, or synsets, may be viewed and explored in parallel. Fig. 1. Hydra’s Synset view with the Bulgarian WordNet and the…

WinEst

General description: WinEst is a Bulgarian spelling checker for Microsoft Office. The program detects and highlights misspelled words in a text and suggests the most probable replacement candidates, ordered by their probability. WinEst makes use of an expertly compiled dictionary of over one and a half million words. It is based on the Electronic Grammar…
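The idea of ranking replacement candidates by probability can be illustrated with a minimal sketch. This is not WinEst's method, which the description above does not specify: it assumes candidates are dictionary words one edit away from the misspelling and that probability is approximated by relative word frequency. The names `edits1`, `suggest` and `TOY_FREQ`, and the frequency counts, are hypothetical.

```python
def edits1(word, alphabet):
    """Return all strings one edit (delete, swap, replace, insert) away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in alphabet]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + swaps + replaces + inserts)


def suggest(word, freq, limit=3):
    """Return up to `limit` (candidate, probability) pairs, most probable first.

    A word found in the dictionary is treated as correct and gets no suggestions.
    """
    if word in freq:
        return []
    alphabet = set("".join(freq))  # letters that occur in the dictionary
    total = sum(freq.values())
    candidates = [w for w in edits1(word, alphabet) if w in freq]
    ranked = sorted(candidates, key=lambda w: (-freq[w], w))
    return [(w, freq[w] / total) for w in ranked[:limit]]


# Hypothetical toy dictionary with made-up frequency counts.
TOY_FREQ = {"книга": 60, "книги": 30, "кога": 10}

suggestions = suggest("кнга", TOY_FREQ)
```

With this toy data, the misspelling "кнга" has two dictionary words within one edit, and the more frequent one is listed first. A production checker would rely on a full morphological dictionary and a richer error model.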

bgMWE – a tool for MWE recognition

bgMWE is a tool for corpus processing and for the recognition and tagging of multiword expressions (MWEs). It is developed in Java and is thus platform-independent. bgMWE comprises a set of modules which can be applied to particular NLP tasks. It is largely language-independent and can work in a resource-light mode, or its performance can be boosted by employing lexical resources. The system…

Corpus collocation service

General description: The Corpus Collocations Service is a web service for collocation search and for the extraction of different types of statistics from the Bulgarian National Corpus, including its parallel corpora, the Bulgarian-X Language Parallel Corpus. It employs NoSketchEngine, a corpus processing system that combines Manatee and Bonito. Access: The Collocation service is a RESTful web service…

Frequency Dictionaries of Bulgarian

General overview: The Frequency Dictionaries are derived from the Bulgarian National Corpus (BulNC), which is the largest systematically created and representative corpus of Bulgarian. The Frequency Dictionaries reflect the frequency of occurrence of lexical items in the corpus (BulNC version: December 2011). The classification of the BulNC samples is based on their style, domain and genre. Texts are divided…
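As a rough illustration of what "frequency of occurrence of lexical items" means in practice, the sketch below counts word forms in a tiny sample and reports absolute and relative frequencies. It is an assumption-laden toy, not the procedure used for the BulNC dictionaries: the tokenisation is a plain regular expression, no lemmatisation is applied, and the function name `frequency_dictionary` and the sample sentences are hypothetical.

```python
import re
from collections import Counter


def frequency_dictionary(texts):
    """Return (word, count, relative frequency) rows, most frequent first."""
    counts = Counter()
    for text in texts:
        # Lowercase and split into runs of word characters (Unicode-aware).
        counts.update(re.findall(r"\w+", text.lower()))
    total = sum(counts.values())
    # Sort by descending count, breaking ties alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(word, count, count / total) for word, count in ordered]


# Hypothetical two-sentence sample standing in for corpus texts.
sample = ["Котката спи.", "Котката яде, а кучето спи."]
rows = frequency_dictionary(sample)
```

Separate dictionaries per style, domain or genre could be obtained by calling the same function on each subset of texts.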

Chooser

General description: Chooser is an OS-independent, multi-functional system for linguistic annotation that can be adapted to different linguistic levels and annotation schemata. Chooser’s features are discussed below in relation to semantic annotation. The basic annotation functionalities implemented in Chooser are: fast and easy-to-perform annotation; run-time access to detailed information for the annotation candidates through the associated wordnet senses with…

A Web-Based Infrastructure for Bulgarian Data Processing

General description: The Bulgarian Language Processing Chain includes the following types of text processing and linguistic annotation: sentence segmentation; tokenisation; POS tagging and grammatical annotation; lemmatisation. BgTagger: The Bulgarian POS tagger (BgTagger) marks up each word with the most probable Part of Speech and unambiguous morphosyntactic information among the set of tags associated with…

MacEst

General description: MacEst, a spelling checker for Mac OS, detects and highlights incorrectly written words in a text and suggests the most probable candidates for correcting the errors. MacEst offers the full potential of contemporary spelling correction: an expertly compiled large dictionary and replacement suggestions ordered by their probability. The software is accessible to all…

Copyright © 2015 Department of computational linguistics. All rights reserved.