Central and Eastern European Survey
The Institute of the Czech National Corpus
Faculty of Arts, Charles University
General Info
Type of Organisation: public
Number of Employees: about 10
Activities developed: Research, Education,
build-up of the Czech National Corpus and related
matters.
The basic goal of the CNC Institute was to build up
a representative Czech National Corpus of 100 million words
in its first stage, i.e. by 2000. Activities directed primarily
at this goal include collecting data in various ways, cleaning it
up, conversion, standardisation, mark-up, recording of evidence,
research on related problems, and first outputs. Among these
outputs the ultimate goal, though not the only one, is to form
a basis for a new dictionary of the Czech language.
The Institute of the Czech National Corpus now aims at
a considerable expansion of the written corpus, moving
towards a multiple of the current amount of data. A diachronic
corpus covering all past stages of Czech and a corpus of spoken
language are also being built. See
http://ucnk.ff.cuni.cz
for information about access. Tools for this work are being
developed as well, the basic one being a superstructure to the
Stuttgart University cqp/xquick programme. The Institute has taken
part in several recently finished projects, such as TELRI and
Multext-East, and has recently become a member of the TELRI
Association.