Central and Eastern European Survey
Recognition Processes Department
Institute of Mathematics and Informatics Resources
NL and Speech Resources available at the organisation: Software created by our Department, NL resources: lexicon of Lithuanian
word-roots, multi-level model of Lithuanian morphology.
Name: Inventory of contemporary Lithuanian word-forms, excluding proper nouns
Nature: lexical
Language: Lithuanian
Size: 132,000 word-forms
Format: ASCII, FoxPro dbf, our internal format
Coverage: word-forms from well-balanced text corpora of newspaper, prose, legal, etc. texts
Medium: diskette
Availability: free for research purposes
Name: Inventory of contemporary Lithuanian proper word-forms
Nature: lexical
Language: Lithuanian
Size: 7,400 word-forms
Format: ASCII, FoxPro dbf, our internal format
Coverage: proper word-forms (personal names, geographical names, names of firms, etc.) from well-balanced corpora of newspaper, prose, legal, etc. texts
Medium: diskette
Availability: free for research purposes
Name: Inventory of contemporary Lithuanian abbreviations
Nature: lexical
Language: Lithuanian
Size: 225
Format: ASCII, FoxPro dbf, our internal format
Coverage: abbreviations from well-balanced text corpora of newspaper, prose, legal, etc. texts
Medium: diskette
Availability: free for research purposes
Name: Inventory of Lithuanian word stems and roots, excluding those of proper nouns
Nature: lexical and morphological
Language: Lithuanian
Size: 62,000 entries
Format: ASCII, FoxPro dbf, our internal format
Coverage: word roots and stems derived from the dictionaries "Dictionary of Contemporary Lithuanian" (65,000 words) and "Dictionary of International Words" (25,000 words); each root or stem has a pointer to its morphological properties
Medium: diskette
Availability: free for research purposes
Name: Inventory of stems and roots of proper words
Nature: lexical and morphological
Language: Lithuanian
Size: 11,000 entries
Format: ASCII, FoxPro dbf, our internal format
Coverage: roots and stems derived from proper nouns from well-balanced text corpora of newspaper, prose, legal, etc. texts; each root or stem has a pointer to its morphological properties
Medium: diskette
Availability: free for research purposes
Software description:
1) Lithuanian spelling checker,
2) morphological parser for Lithuanian word-forms,
3) morphological analyser-synthesiser for Lithuanian word-forms; free for
research purposes.
4) Software for modelling isolated word recognition, based on DTW
(dynamic time warping), written in C++.
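As an illustration of the DTW technique named in item 4, here is a minimal Python sketch of template-based isolated word recognition. It is not the Department's C++ software: the function names dtw_distance and recognise, the template dictionary, and the use of plain Euclidean distance between feature vectors are all assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Illustrative only: uses Euclidean local distance and the basic
    (match, insertion, deletion) step pattern, with no path constraints.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch the template
                cost[i][j - 1],      # stretch the utterance
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def recognise(utterance, templates):
    """Return the label of the stored template closest to the utterance.

    `templates` is a hypothetical dict: word label -> feature sequence.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

In practice each sequence element would be a frame-level feature vector (for example cepstral coefficients), and the recogniser would typically also reject utterances whose best distance exceeds some threshold.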
Programs in C for speaker identification and verification:
based on the average distance between speakers' vocal tracts,
based on the vector quantization method,
based on the average distance between speakers' vocal tracts and residual signals;
programs for feature extraction from pseudostationary segments of voiced
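The vector quantization approach to speaker identification and verification mentioned above can be sketched as follows. This is a hedged Python illustration, not the Department's C programs: the function names, the simple k-means codebook training, the deterministic initialisation, and the threshold-based verification rule are all assumptions made for the example.

```python
import math


def train_codebook(vectors, size, iterations=10):
    """Train a small VQ codebook with plain k-means (illustrative).

    Initialised deterministically from evenly spaced training vectors.
    """
    step = max(1, len(vectors) // size)
    codebook = [tuple(v) for v in vectors[::step][:size]]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for v in vectors:
            idx = min(range(len(codebook)), key=lambda k: math.dist(v, codebook[k]))
            clusters[idx].append(v)
        # Move each codeword to the mean of its cluster; keep it if the cluster is empty.
        codebook = [
            tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else codebook[k]
            for k, cl in enumerate(clusters)
        ]
    return codebook


def distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword."""
    return sum(min(math.dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Identification: pick the speaker whose codebook gives least distortion."""
    return min(codebooks, key=lambda speaker: distortion(vectors, codebooks[speaker]))


def verify(vectors, codebook, threshold):
    """Verification: accept the claimed speaker if distortion is small enough."""
    return distortion(vectors, codebook) <= threshold
```

One codebook is trained per enrolled speaker; the value of the verification threshold is application-dependent and would have to be tuned on held-out data.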
Other, that is: computational model of Lithuanian morphology. Digital
data in a special internal format, several versions of 40-100 kilobytes each.
Used for automatic inflexion and word-building processes. Capable of
recognition/generation of several billion theoretically possible
different Lithuanian word-forms when aggregated with the word stems/roots
inventory files. Needs special software modules for manipulation.
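To illustrate how a stem inventory whose entries point to morphological properties can drive both generation and recognition of word-forms, here is a toy Python sketch. It does not reflect the internal format or the actual model: the paradigm name, the tag labels, the two sample stems, and the four singular endings are assumptions chosen only for the example.

```python
# Hypothetical paradigm table: paradigm id -> {grammatical tag: ending}.
# Only four singular cases of one masculine noun pattern are listed.
PARADIGMS = {
    "noun_as": {"nom.sg": "as", "gen.sg": "o", "dat.sg": "ui", "acc.sg": "ą"},
}

# Hypothetical stem inventory: each stem points to its paradigm.
STEMS = {"nam": "noun_as", "vyr": "noun_as"}


def generate(stem):
    """Synthesis: produce all word-forms licensed by the stem's paradigm."""
    endings = PARADIGMS[STEMS[stem]]
    return {tag: stem + ending for tag, ending in endings.items()}


def analyse(word_form):
    """Analysis: find every (stem, tag) pair that yields the word-form."""
    results = []
    for stem, paradigm in STEMS.items():
        if word_form.startswith(stem):
            rest = word_form[len(stem):]
            for tag, ending in PARADIGMS[paradigm].items():
                if ending == rest:
                    results.append((stem, tag))
    return results
```

Because the number of generated forms is the product of stems and paradigm endings (and, in a full model, derivational affixes), a compact model of this kind can cover a very large space of word-forms when combined with large stem files.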