Central and Eastern European Survey
Recognition Processes Department
Institute of Mathematics and Informatics
Resources
NL and speech resources available at the organisation: software created by our Department; NL resources: a lexicon of Lithuanian
word roots and a multi-level model of Lithuanian morphology.
Name: Inventory of Lithuanian contemporary word-forms, except proper nouns
Nature: lexical
Language: Lithuanian
Size: 132,000 word-forms
Format: ASCII, FoxPro DBF, our internal format
Coverage: word-forms from well-balanced text corpora of newspaper, prose, law, etc.
Medium: diskette
Availability: free for research purposes
Name: Inventory of Lithuanian contemporary proper word-forms
Nature: lexical
Language: Lithuanian
Size: 7,400 word-forms
Format: ASCII, FoxPro DBF, our internal format
Coverage: proper word-forms (personal names, geographical names, names of firms, etc.) from well-balanced corpora of newspaper, prose, law and other texts
Medium: diskette
Availability: free for research purposes
Name: Inventory of Lithuanian contemporary abbreviations
Nature: lexical
Language: Lithuanian
Size: 225
Format: ASCII, FoxPro DBF, our internal format
Coverage: abbreviations from well-balanced text corpora of newspaper, prose, law, etc.
Medium: diskette
Availability: free for research purposes
Name: Inventory of Lithuanian word stems and roots, except those of proper nouns
Nature: lexical and morphological
Language: Lithuanian
Size: 62,000 entries
Format: ASCII, FoxPro DBF, our internal format
Coverage: word roots and stems derived from the dictionaries "Dictionary of Contemporary Lithuanian" (65,000 words) and "Dictionary of International Words" (25,000 words); each root or stem has a pointer to its morphological properties
Medium: diskette
Availability: free for research purposes
Name: Inventory of stems and roots of proper words
Nature: lexical and morphological
Language: Lithuanian
Size: 11,000 entries
Format: ASCII, FoxPro DBF, our internal format
Coverage: roots and stems derived from proper nouns from well-balanced text corpora of newspaper, prose, law, etc.; each root or stem has a pointer to its morphological properties
Medium: diskette
Availability: free for research purposes
Software description:
1) Lithuanian spelling checker,
2) morphological parser for Lithuanian word-forms,
3) morphological analyser-synthesiser for Lithuanian word-forms; free for
research purposes.
4) software for modelling isolated word recognition, based on DTW (dynamic time warping), written in C++.
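As an illustration of item 4, the following is a minimal Python sketch of DTW-based isolated word recognition. It is not the Department's C++ software. It assumes each utterance has already been converted into a sequence of feature vectors; the function names, the toy one-dimensional features and the template labels are all hypothetical.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of a with the first j frames of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a repeated against the same frame of b
                cost[i][j - 1],      # frame of b repeated against the same frame of a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def recognise(utterance, templates):
    """Return the label of the reference template closest to the utterance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Toy reference templates (hypothetical one-dimensional features).
templates = {
    "taip": [[0.0], [1.0], [2.0]],
    "ne": [[5.0], [5.0], [4.0]],
}
# A slower rendition of the first template: same shape, stretched in time.
utterance = [[0.0], [0.0], [1.0], [2.0], [2.0]]
best = recognise(utterance, templates)
```

Because DTW lets frames repeat, the stretched utterance aligns with the first template despite the difference in length.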
Programs in C for speaker identification and verification:
a) based on the average distance between speakers' vocal tracts,
b) based on the vector quantization method,
c) based on the average distance between speakers' vocal tracts and residual signals;
programs for feature extraction from pseudostationary segments of voiced
sounds.
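The vector quantization approach to speaker identification and verification mentioned above can be sketched roughly as follows. This is an illustrative Python outline, not the Department's C programs. It assumes a codebook is trained per speaker with plain k-means (the source does not say which training procedure is used) and that a speaker is scored by average quantization distortion. All names, the toy two-dimensional features and the threshold are hypothetical.

```python
import math


def nearest(codebook, v):
    """Index of the codeword closest to vector v."""
    return min(range(len(codebook)), key=lambda c: math.dist(codebook[c], v))


def train_codebook(vectors, k, iterations=10):
    """Train a k-entry codebook with plain k-means (assumed training method)."""
    step = max(1, len(vectors) // k)
    # Deterministic initialisation from evenly spaced training vectors.
    codebook = [list(vectors[i * step]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                codebook[c] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def distortion(codebook, vectors):
    """Average distance from each vector to its nearest codeword."""
    return sum(min(math.dist(c, v) for c in codebook) for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Identification: pick the speaker whose codebook gives the least distortion."""
    return min(codebooks, key=lambda s: distortion(codebooks[s], vectors))


def verify(vectors, codebook, threshold):
    """Verification: accept the claimed speaker if distortion is below a threshold."""
    return distortion(codebook, vectors) <= threshold


# Toy training data for two hypothetical speakers.
training = {
    "A": [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9]],
    "B": [[5.0, 5.0], [5.1, 4.9], [6.0, 6.0], [6.2, 6.1]],
}
codebooks = {name: train_codebook(data, k=2) for name, data in training.items()}
sample = [[0.1, 0.0], [1.0, 1.05]]
who = identify(sample, codebooks)
```

Identification compares the sample against every enrolled codebook, whereas verification tests a single claimed identity against a threshold that would have to be tuned on real data.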
Other: a computational model of Lithuanian morphology. Digital
data in a special internal format; several versions of 40-100 kilobytes each.
Used for automatic inflection and word formation. When aggregated with the
word stem/root inventory files, it can recognise or generate several billion
theoretically possible distinct Lithuanian word-forms. Special software
modules are needed to manipulate it.
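To illustrate how a morphology model combined with a stem inventory can both generate and recognise word-forms, here is a toy Python sketch. It does not reflect the internal format or the actual model. It assumes each stem carries a pointer to a paradigm and each paradigm maps grammatical tags to endings. The table names, the tags and the tiny three-ending paradigm are hypothetical, chosen only for illustration.

```python
# Hypothetical paradigm table: paradigm id -> {grammatical tag: ending}.
PARADIGMS = {
    "noun_as": {"nom.sg": "as", "gen.sg": "o", "dat.sg": "ui"},
}

# Hypothetical stem inventory: stem -> paradigm id (the "pointer" to its properties).
STEMS = {
    "nam": "noun_as",
    "vyr": "noun_as",
}


def generate(stem):
    """Synthesis: attach every ending of the stem's paradigm."""
    paradigm = PARADIGMS[STEMS[stem]]
    return {tag: stem + ending for tag, ending in paradigm.items()}


def analyse(word_form):
    """Analysis: try every stem/ending split and keep those licensed by the tables."""
    results = []
    for cut in range(1, len(word_form)):
        stem, ending = word_form[:cut], word_form[cut:]
        paradigm_id = STEMS.get(stem)
        if paradigm_id is None:
            continue
        for tag, candidate in PARADIGMS[paradigm_id].items():
            if candidate == ending:
                results.append((stem, tag))
    return results


forms = generate("nam")
parsed = analyse("vyrui")
```

The number of forms covered grows as the product of stems and applicable endings, so a compact model file paired with a large stem inventory can cover a very large space of word-forms.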