Central and Eastern European Survey
Resources
Dept. of Information Technologies
Faculty of Informatics, Masaryk University
NL and Speech Resources available: Textual, Speech, Software, Lexical resources (including terminology)
Name: ESO Corpus, FIT Corpus, Spoken Czech Corpus
Nature: newspaper texts, computer journals, machine stem dictionaries for Czech, Slovak, Russian, English, German, French (size between 120 000 and 200 000 entries)
Language: Czech
Size: approximately 50 million word forms; spoken: currently about 100 hours
Format: ASCII and SGML, WAV
Coverage: newspapers, computer journals, spoken Czech (interviews and dialogues)
Medium: hard disk, CD-ROM
Availability: free for research purposes; partially commercial products
Software description: Czech, Slovak, and Russian spell checker; Czech, Slovak, and Russian
lemmatizer and tagger; Czech and Slovak hyphenation programs,
available through personal contact; Czech Electronic Thesaurus;
Czech-English and Czech-German Electronic Dictionary