Elsnet
 


Central and Eastern European Survey

Department of Telecommunications and Telematics, Budapest University of Technology and Economics

General Info

Type of Organisation: University
Number of Employees: 15
Activities developed: teaching and research.

Speech research activities are currently organized within the following three laboratories:

Speech Technology Laboratory (contact person: Dr. Géza Németh, nemeth@alpha.ttt.bme.hu, Scientific Secretary of Eurospeech'99)

Multilingual Text-to-Speech Synthesis
* MULTIVOX text-to-speech system (multilingual grapheme-to-sound conversion, prosody modelling, formant synthesis, supporting 10 languages, 1986-1996)
* PROFIVOX waveform-based TTS development environment (implemented only for Hungarian so far, with client-server architecture and the high-level TTS control markup language MVML, 1994-). A detailed description of the data and rule system for Hungarian TTS and automatic prosody generation (timing rules, amplitude structures and multilevel F0 rules) is available in a document of approx. 200 pages.
* High-quality number-to-speech generation (implemented in Hungarian and German, 1996-)

Computer Telephony Integration (CTI), Dialogue Systems
* Development of the first multi-line Hungarian e-mail reading system with automatic diacritic regeneration (1998)
* Development of commercially used audiotext/voice response applications, including the first Hungarian Speaking Bill over the Telephone (1995) and the first Hungarian residential voice-mail system (1996)
* Development of the speech interface, based on formant synthesis, of an automatic announcement system for the Hungarian Telecommunications Company, used on approx. 600,000 lines (1992)
* Human factors, usability issues (COST219, EU project Mobile Rescue Phone, 1990-)

Applications for the Disabled and the Elderly
* Speaking systems for the blind and the speech impaired (1984-)

Laboratory of Speech Acoustics (contact person: Klára Vicsi, vicsi@ttt-202.ttt.bme.hu)

Main areas: basic acoustic-phonetic research, speech perception, speech analysis, database construction, speech recognition, enhancement of noisy speech, development of sound tools for the handicapped, and linguistic processing of speech. The work follows an interdisciplinary approach.
Construction of instruments and tools in the field of speech technology for different industries.

Database collection
* SpeechDat(E), a European project (1999-): speech database collection over telephone lines. It provides a realistic basis both for training and testing present-day teleservices and for training truly speaker-independent recognizers.
* BABEL, a multilingual speech database collection (1995-99), INCO-COPERNICUS project: clear, read speech for general speech processing purposes.
* CHILDREN SPEECH database collection (1999-), part of the SPECO Copernicus program.

Annotation
* Automatic segmentation of continuous speech at the phonetic and sub-phonetic level, automatic labelling (1997-)

Speech recognition
* Speaker-independent isolated-word speech recogniser (1989-): recognises numbers and short instructions over telephone lines and in different sound fields
* Dialogue systems (1998-): adaptation of well-known systems to the Hungarian language and to Hungarian habits
* Text-independent continuous speech recogniser (1995-): neural-network-based solutions; phoneme-, diphone- and half-syllable-based recognition at the phonetic and phonological level

Speech and articulation teaching
* "SPECO - A Multilingual pronunciation teaching and training method and a software system for hearing and speech-handicapped children" (1998-), Inco-Copernicus project (coordinator)
Telecommunications & Signal Processing Lab. (contact person: Péter Tatai, tatai@bme-tel.ttt.bme.hu)

Speech quality measurement
A speech quality measurement system has been developed that includes
* subjective testing tools: absolute and comparison tests
* objective testing, calibrated to the subjective results
* detection of signalling and other special signals in the channel
* on-line observation of the transmission for calibration

Front-end development
* enhanced line spectrum estimation, cepstral trajectory approximation with FFT for automatically segmented sub-word units; standard front-ends are also available
* automatic sub-word (demi-syllable) segmentation

Small vocabulary recognition
* DTW-based user-dependent recognition for isolated and connected words (e.g. voice dialling)

Language modelling
* FSA grammar descriptions for specific applications
* conversion of written text into phoneme sentences
* triphone set creation taking coarticulation effects into account (manual and automatic)

Finite state grammar constrained connected-word recognition using HMM technology
The entire system, including HMM training and the decoder, has already been developed. The system is capable of speaker-dependent, multi-speaker and speaker-independent recognition, depending on the training data. (Speaker-independent models are under construction because a sufficient amount of training data has become available only recently.) Tested with vocabularies of up to 30 words.

Current research efforts
* Development of a 1000-word-vocabulary speaker-independent recognition system (the task is the recognition of the names of Hungarian cities).
* Development of a system for automatic prediction of all possible pronunciations of Hungarian words.
* Development of a morpheme-based grammar model for the recognition of all inflected forms of Hungarian words. (Hungarian morphology is quite complex: a single root can have literally thousands of inflected forms, not to mention derived roots.)
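
For readers unfamiliar with the DTW-based recognition listed under small vocabulary recognition, the following is a minimal illustrative sketch of template matching with dynamic time warping. It is not the laboratory's implementation: the function names (dtw_distance, recognise), the Euclidean frame distance and the toy one-dimensional features are all assumptions made for the example. A real front-end would supply cepstral or similar feature vectors per frame.

```python
import math


def dtw_distance(a, b):
    """Return the cumulative DTW alignment cost between two sequences
    of feature vectors (each vector is a tuple of floats)."""
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i frames of a
    # with the first j frames of b
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local frame distance (Euclidean is an assumption here)
            d = math.dist(a[i - 1], b[j - 1])
            # Allow a diagonal match or a stretch of either sequence
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def recognise(utterance, templates):
    """Return the label of the stored template closest to the utterance.
    templates maps a word label to one recorded feature sequence."""
    return min(templates,
               key=lambda label: dtw_distance(utterance, templates[label]))


# Hypothetical user-recorded templates with toy 1-D features
templates = {
    "home": [(0.0,), (1.0,), (2.0,), (1.0,)],
    "office": [(2.0,), (2.0,), (0.0,), (0.0,)],
}
# A slower rendition of "home": same shape, different timing
utterance = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,), (1.0,)]
print(recognise(utterance, templates))
```

Because the warping path may repeat frames of either sequence, a word spoken more slowly than its template can still align closely, which is why the approach suits user-dependent tasks such as voice dialling where each user records a template per name.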


This page is no longer maintained. Please visit http://www.elsnet.org/survey/quests to find out how to update your organisation profile or to find information about this organisation

Page generated 04-01-1998 by Steven Krauwer