This project (2020-1-PT01-KA226-HE-094809) has been funded with support from the European Commission.
This web site reflects the views only of the author, and the Commission cannot be held responsible for any use which may be made of the information contained therein.


Database of Teaching Sources

A database of selected, reviewed, tested, assessed and validated e-learning based language teaching sources addressed to Higher education students for the learning of 18 different European languages.


National Corpus of Polish

Date of Publication

National Corpus of Polish, 2008-2012. Research funded in 2007-2012 by a research and development grant from the Polish Ministry of Science and Higher Education.

Target Group


Domain Area

Arts & Music
Business & Communication
International Relations
Journalism & multimedia
Medicine & Nursing
Teacher Education

Learning Scenario

Autonomous learning
Classroom Context

Target Language


Language of Instruction

Any language

CEFR level


Type of Material

Reference resources (online Dictionaries/ grammar guides/phrasebooks)

Linguistic Features



Critical Thinking


The National Corpus of Polish (Narodowy Korpus Języka Polskiego, NKJP) is an initiative of the Institute of Computer Science of the Polish Academy of Sciences, in collaboration with the Institute of Polish Language, Polskie Wydawnictwo Naukowe (PWN) and the Department of Computational and Corpus Linguistics (University of Łódź).
The corpora contain classic literature, newspapers, specialist periodicals, transcripts of conversations and internet texts.
The University of Reading has used the NKJP PELCRA search engine for anthropological research. The University of Utrecht has obtained a license to use the NKJP spoken-language corpus for research on speech modelling. The University of Barcelona has used a corpus of five hundred thousand words for research.

Case study

Narodowy Korpus Języka Polskiego (the National Corpus of Polish) is a very useful tool for foreign learners of Polish. It is another example of how materials not created specifically for language learning or teaching can be put to that use. Students can observe how various grammatical forms are used in authentic texts that are thematically and stylistically very diverse, rather than in texts specially prepared by teachers or for textbooks. For example, a student may search for a word, term or phrase of interest and write down the phrases in which it occurs. The Corpus is easy to use and transparent. Teachers can also use it to build resources for an entire lesson by developing appropriate exercises. The description below gives an example of a class in the field of medicine; classes in other fields can be structured in a similar way.


1. Announce the title: Problemy ze wzrokiem. U okulisty / Sight problems. At the ophthalmologist
2. Present 3 excerpts (adapted, if necessary) taken from Narodowy Korpus Języka Polskiego in which the term okulista (ophthalmologist) appears. List the terms and search for collocations. Make sure the students know the inflected forms of the term.
3. Give the students related terms:
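- wzrok (sight) / badanie wzroku (eye examination) / wady wzroku (vision defects) / kłopoty ze wzrokiem (sight problems)
- upośledzenie widzenia (visual impairment)
- gałka oczna (eyeball)
- krótkowzroczność (short-sightedness)
- dalekowzroczność (long-sightedness)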
4. Create tests (fill-in, multiple choice, reading comprehension) with all these terms.
5. Use the vocabulary for a group conversation: Dbaj o wzrok! Bo twoje oczy są najważniejsze / Take care of your sight! Because your eyes matter most

The results achieved
– Enhanced specialized vocabulary, learned in authentic contexts
The risks to be taken into account while using the resource
– Users need some instruction before they can use the interface at an acceptable speed
– Some examples may belong to (sub)domains unfamiliar to students, or to general rather than specialized language


Comprehensive approach
Capacity to match the needs of lecturers and students


Added value
The tangible improvements provided


Motivation enhancement
The capacity to motivate students to improve their language skills


Effectiveness in introducing innovative, creative and previously unknown approaches to LSP learning


Measurement of the transferable potential and possibility to be a source of further capitalisation/application for other language projects in different countries


Skills assessment and validation
Availability of appropriate tools for lecturers to monitor students’ progress and for students to assess their own progress and to reflect on learning


Flexibility of the contents and possibilities for the LSP lecturers to adapt the contents to their own and their students’ needs


Assess the technical usability from the point of view of the lecturer and the student


Assess the accessibility from the point of view of the lecturer and the student


The corpora contain volumes of Polish classics, publications and recordings. Advanced search tools allow parsing, collocation search, etc. The collocation extraction module allows analysis of phraseological structures and of the frequency of terms. The corpus provides examples of real language use by the community of speakers and can be used (in class or autonomously) to raise students’ awareness of native usage. Students in Economics, Law, Art, Business, Journalism and Tourism can use the corpora to check their use of various specialized terms in context. The advanced search tools are relatively easy to use.
Lecturers can use examples extracted from the corpus to check vocabulary by creating various types of exercises (e.g. fill-in exercises) and tests (fill-in, multiple choice, reading comprehension).
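As a rough illustration of the fill-in exercises mentioned above, the following minimal Python sketch blanks out every inflected form of a target term in a list of sentences and keeps the removed words as an answer key. It is only a sketch under stated assumptions: the function name `make_cloze`, the list `FORMS` and the two sample sentences are hypothetical and invented for illustration. The sentences are not excerpts from the corpus, and the script does not query the NKJP search engines; a lecturer would paste in sentences copied manually from the search results.

```python
import re

# Singular inflected forms of "okulista" (ophthalmologist); extend as needed.
FORMS = ["okulista", "okulisty", "okuliście", "okulistę", "okulistą"]

# Invented illustrative sentences, NOT actual corpus excerpts.
SENTENCES = [
    "Jutro idę do okulisty.",
    "Okulista zbadał mój wzrok.",
]


def make_cloze(sentence, forms, blank="_____"):
    """Replace whole-word occurrences of the given forms with a blank.

    Returns the gapped sentence and the list of removed words (answer key).
    """
    alternatives = "|".join(re.escape(form) for form in forms)
    # \b keeps the match to whole words; IGNORECASE also catches
    # sentence-initial capitalised forms.
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", flags=re.IGNORECASE)
    answers = pattern.findall(sentence)
    gapped = pattern.sub(blank, sentence)
    return gapped, answers


if __name__ == "__main__":
    for number, sentence in enumerate(SENTENCES, start=1):
        gapped, answers = make_cloze(sentence, FORMS)
        print(f"{number}. {gapped}   [answer: {', '.join(answers)}]")
```

Listing the inflected forms explicitly, rather than matching on a stem, keeps the sketch simple and also gives students a visible reminder of the forms they are expected to know.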
Website of the Teaching Source: