This project (2020-1-PT01-KA226-HE-094809) has been funded with support from the European Commission.
This web site reflects the views only of the author, and the Commission cannot be held responsible for any use which may be made of the information contained therein.


Database of Teaching Sources

A database of selected, reviewed, tested, assessed and validated e-learning based language teaching sources addressed to Higher education students for the learning of 18 different European languages.


Bulgarian National Corpus

Date of Publication

2001-2009 ; 2009-2013 and continued

Target Group


Domain Area

Arts & Music
Business & Communication
International Relations
Journalism & multimedia
Medicine & Nursing
Teacher Education

Learning Scenario

Autonomous learning
Classroom Context

Target Language


Language of Instruction

Any language

CEFR level


Type of Material

Reference resources (online Dictionaries/ grammar guides/phrasebooks)

Linguistic Features



Critical Thinking


The corpus includes valuable material on contemporary Bulgarian and on Bulgarian in diachrony.
Institute for Bulgarian Language "Prof. L. Andreychin"
Constantly enlarged. Consists of a Bulgarian part (1.2 billion words and over 240,400 text samples) and 47 parallel corpora. Reflects the state of Bulgarian from 1945 to the present.
Like all corpora, the BNC includes sub-corpora by subject, author, year, source, etc. These can be used as training corpora for terminology, idioms, grammar, etc.
The resource includes 4 lexicographic tools, among them BulNet; the Bulgarian Semantically Annotated Corpus, which is part of the Bulgarian Brown Corpus and consists of 95,119 lexical units, each annotated with the most appropriate synonym set from the Bulgarian wordnet; and the Bulgarian PoS Annotated Corpus.

Case study

The source was tested by all Bulgarian lecturers, who provided comments on this and other sources.
Level: B1 to C1. No. of students: varying.
The Bulgarian National Corpus (BNC) is the largest systematically created and representative corpus for the Bulgarian language. It contains more than 1.2 billion words and includes over 240,000 texts. The materials in the Corpus reflect the state of the Bulgarian language (mostly in its written form) from the middle of the 20th century to the present day.
The BNC is a complex language resource used for both research and linguistic training purposes. It can also be used in learning Bulgarian as a foreign language, although it has not been specifically developed for this purpose.
Most widely used during classes with students studying Bulgarian is the search system of the corpus. Through it, one can:
- illustrate the use of a random word in context, with the examples generated being from texts differing in genre and style. The teacher can preselect parts of the results obtained in view of the topic and the specific objectives of the lesson. Students can use the search system on their own to work on tasks set by the teacher.
- obtain information on the meaning and grammatical characteristics of the word.
The BNC is easy and convenient to work with, and it is mainly used as an additional resource by both beginners and advanced students. What makes the Corpus innovative for learning Bulgarian as a foreign language is the possibility of extracting linguistic information at several levels simultaneously (semantic, grammatical, etc.).
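The idea of illustrating a word in context can be tried offline with a small keyword-in-context (KWIC) listing. The sketch below is only an illustration and is not the BNC search system: the function name `kwic`, the `width` parameter and the two sample sentences are made up for this example and are not extracts from the corpus.

```python
import re


def kwic(text, keyword, width=3):
    """Return keyword-in-context lines for every occurrence of `keyword`.

    Each line shows up to `width` tokens on either side of the match.
    Tokenisation is a simple word split, so context may cross sentence
    boundaries and punctuation is dropped.
    """
    tokens = re.findall(r"\w+", text)  # \w matches Cyrillic letters in Python 3
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text written for this sketch (not taken from the BNC).
sample = "Малкият бизнес расте. Държавата подкрепя бизнес средата в страната."

for line in kwic(sample, "бизнес", width=2):
    print(line)
```

A teacher could paste a few excerpts collected for a lesson into `sample` and vary `width` to show shorter or longer contexts.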


1. Warm up activity. Topic announced by teacher: Doing business in Bulgaria(n). Basic vocabulary:
2. Teacher will elicit from students the basic definition of the term and other terms related to the topic. List the terms, add others if necessary.
• Announce the activity related to the quasi-synonyms бизнес, предприемачество (entrepreneurship); стопанство, икономика (economics)
3. Give definitions of terms; present students with excerpts extracted from BNC;
4. Ask students to identify/underline the keywords that helped them establish the field (related terms); discuss the relation to the terms.
5. Ask students to use the corpus to identify another example where the terms are used in Bulgarian and European context;
6. Handout with basic and more detailed terminology and sample text on specific topic provided by teacher.
7. Teacher presents resource and makes a demonstration.
8. Each student is asked to search for some (different) items in glossary in handout, extract text samples.
9. Class discussion of vocabulary (matching terms with their definitions).
10. Class reading of related article and discussion of main points.
Assignment: further search on related terminological vocabulary on a topic of choice; drafting an article abstract.


Comprehensive approach
Capacity to match the needs of lecturers and students


Added value
The provided tangible improvements


Motivation enhancement
The capacity to motivate students to improve their language skills


Effectiveness in introducing innovative, creative and previously unknown approaches to LSP learning


Measurement of the transferable potential and possibility to be a source of further capitalisation/application for other language projects in different countries


Skills assessment and validation
Availability of appropriate tools for lecturers to monitor students’ progress and for students to assess own progress and to reflect on learning


Flexibility of the contents and possibilities for the LSP lecturers to adapt the contents to their and to students’ needs


Assess the technical usability from the point of view of the lecturer and the student


Assess the accessibility from the point of view of the lecturer and the student


The BNC is a useful tool for teaching and learning all LSP skills (with initial training), as it can be used in terminology, grammar and translation classes and in building reading and writing skills. Working with the BNC (as with any LC) can bring added value by giving students a means to check the forms and the lexical and grammatical meanings of words, and their behaviour in various contexts, syntagms and short texts; this helps extend the students' view of the language being learnt beyond bilingual dictionaries. It can motivate students through examples from specialized texts in any field. While a constant practice with language students, working with dictionaries might be innovative in, say, the medical B class. It can be adapted to the needs of any learners. It is quite easy to use. Some of it is not accessible. The web page needs some additions to the English version.
Of a varying level of difficulty, the corpus offers a user-friendly interface.
Students need introduction to how to make queries; this area is partially restricted, but access can be gained.
Administrative corpus of official EU documents – parallel, in 23 languages with largest corpora in English, German, Romanian, Polish and Greek.
Most of the corpora are downloadable, e.g. the Journalistic corpus from, which exists in parallel, in 9 Balkan languages (Bulgarian, Romanian, Macedonian, Serbian, Albanian, Greek, Turkish, Croatian, Bosnian) and English.
Website of the Teaching Source: