Central and Eastern European Survey
General Info
Electronics and Telecommunications Department
Applied Electronics Group
Type of Organisation: public
Number of Employees: Less than 10
Activities developed: Research, Education.
My research team works on written natural language processing. It is
a small team: some of my graduate students, who do their diploma
project work under my supervision, and former students, now teaching
assistants, doctoral candidates, and researchers at other institutions, who are my
collaborators.
This research is not the main activity of our university; it is
rather a hobby, a matter of scientific interest that results in
published works and studies. Over the last two years, our research has been
supported by a grant from the National Council for University Scientific
Research. Our study leads toward a standardization, for the Romanian language,
of the statistical structure of digrams, trigrams, and m-grams.
(We have developed a new method for random data sampling
from NL in order to obtain the probabilities conditional on a single preceding
letter. Based on this data sample we could determine the conditional
probabilities and the digram probabilities with statistical error control.
We further determined the entropies for the first and second approximations
to Romanian, including the entropies conditional on a single preceding
letter. For details, see 26 below.)
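As an illustration of the quantities mentioned above, the following is a minimal sketch in Python. It is not our sampling method and it has no error control: it simply counts letters and digrams in a given text and derives the digram probabilities, the probabilities conditional on the preceding letter, and the corresponding entropies in bits. The function name letter_statistics and the choice to keep only alphabetic characters are assumptions made for this example.

```python
import math
from collections import Counter


def letter_statistics(text):
    """Estimate letter and digram statistics from plain counts (illustrative only)."""
    # Assumption: keep alphabetic characters only, case-folded.
    letters = [c for c in text.lower() if c.isalpha()]
    if len(letters) < 2:
        raise ValueError("need at least two letters")

    unigrams = Counter(letters)
    digrams = Counter(zip(letters, letters[1:]))
    n1 = sum(unigrams.values())
    n2 = sum(digrams.values())

    # Letter and digram probabilities as relative frequencies.
    p_letter = {a: k / n1 for a, k in unigrams.items()}
    p_digram = {ab: k / n2 for ab, k in digrams.items()}

    # How often each letter occurs as the first letter of a digram.
    first = Counter()
    for (a, _b), k in digrams.items():
        first[a] += k

    # P(b | a): probability of letter b given the single preceding letter a.
    p_cond = {(a, b): k / first[a] for (a, b), k in digrams.items()}

    # Entropy of the single-letter distribution.
    h_letter = -sum(p * math.log2(p) for p in p_letter.values())
    # Entropy conditional on one preceding letter.
    h_cond = -sum(p_digram[ab] * math.log2(p_cond[ab]) for ab in p_digram)
    return p_letter, p_digram, p_cond, h_letter, h_cond


if __name__ == "__main__":
    sample = "aceasta este o propozitie scurta de test"
    _, _, _, h_letter, h_cond = letter_statistics(sample)
    print(f"H(letter) = {h_letter:.3f} bits, H(letter | previous) = {h_cond:.3f} bits")
```

On a corpus of realistic size the conditional entropy should come out lower than the single-letter entropy, since knowing the preceding letter reduces the uncertainty about the next one; on a very short sample such as the one above the estimates are not reliable.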
We carried out this statistical analysis of printed Romanian using
texts (about 21 books, totaling more than 15 million characters)
printed by Metropol Publishing House (Bucharest).
The numerical results we obtained could be of interest for encoding large
dictionaries, intelligent OCR, cryptography, etc.
As for my own background, my interest in NL began in 1979, when I completed
my PhD thesis, which included a chapter developing new enciphering methods for
multiple Markov chains, with applications to written Romanian.
I have resumed this work over the last five years; some results are presented
in papers (see 26 below).
The university also has a group working on Speech (Research
and Education). I have completed this questionnaire only with respect to our
(written) Natural Language work.