
On the automatic correction of incorrect pronunciation of foreign words

Dmitry Babin, Ivan Mazurenko, Pavel Aliseichick
Faculty of Mechanics and Mathematics,
Department of Mathematical Theory of Intelligent Systems (MaTIS).

The problem of automatically correcting the wrong pronunciation of foreign words is formulated. A system in which exercises are constructed with the help of a teacher is described. Algorithms are proposed for segmenting sound data into phonemes and for analysing the correctness of a student's pronunciation. The need for voice tuning is explained. Some characteristics of a training system are given.

A large number of systems for automated foreign-language training have appeared recently; they can present video and audio material and accept the student's speech as input. The weakest point of such systems is that they neither assess the correctness of pronunciation nor localize pronunciation errors. The problem is hard for three reasons: different speakers produce a wide variety of equally correct pronunciations, recording conditions differ, and there are many types of pronunciation error, ranging from incorrect stress and intonation to wrong pronunciation of individual sounds. Errors in individual sounds are usually caused by the absence of most sounds of the foreign language from the student's native language. Solving the problem therefore requires working with speech at a level that does not depend on the speaker and that tolerates inexact pronunciation of phrases.

The authors present a working version of a system that lets a student objectively assess how correct his pronunciation is, classify the errors, and interactively listen to the difference between pronunciations of the incorrect sounds.

The system works as follows.

To make the algorithm independent of the voices of the teacher and the student, the system first conducts a natural spoken dialogue with the user in the user's native language and determines objective parameters of his speech. These parameters are later used to transform the characteristics of the sound signal into a special voice-invariant form.
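The text does not say which speech parameters are measured or how the voice-invariant form is computed. As a minimal sketch of the general idea only, the example below assumes two hypothetical per-speaker parameters (median pitch and loudness statistics) estimated from calibration speech, and rescales pitch to semitones relative to the speaker's median and loudness to z-scores. The function names, the choice of parameters and the convention that a pitch of 0 marks an unvoiced frame are all assumptions made for illustration.

```python
import math
from statistics import mean, median, pstdev


def estimate_voice_profile(pitch_hz, loudness_db):
    """Estimate per-speaker reference values from calibration speech.

    Assumption: pitch_hz holds one value per frame, with 0 for unvoiced frames.
    """
    voiced = [f for f in pitch_hz if f > 0]
    return {
        "median_pitch": median(voiced),
        "mean_loudness": mean(loudness_db),
        # Fall back to 1.0 so a constant loudness track does not divide by zero.
        "std_loudness": pstdev(loudness_db) or 1.0,
    }


def normalize(pitch_hz, loudness_db, profile):
    """Map raw contours to speaker-relative units."""
    # Pitch in semitones relative to the speaker's median; None if unvoiced.
    pitch_st = [
        12 * math.log2(f / profile["median_pitch"]) if f > 0 else None
        for f in pitch_hz
    ]
    # Loudness as a z-score against the speaker's own statistics.
    loud_z = [
        (v - profile["mean_loudness"]) / profile["std_loudness"]
        for v in loudness_db
    ]
    return pitch_st, loud_z
```

Under these assumptions, two speakers who produce the same melodic shape an octave apart, at different overall volume, end up with the same normalized contours, which is the property the comparison of teacher and student relies on.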

To prepare an exercise, the teacher specially processes the sound images of words. The teacher divides the sound signal into fragments corresponding to the phonemes of the language. The system then calculates special parameters of these fragments (loudness, intonation, rate and correctness of pronunciation), and the teacher defines the allowable deviation of each parameter. For example, setting the allowable deviations of all parameters to infinity for every fragment except one makes the system check only one phoneme in the phrase.
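The role of the allowable deviations can be illustrated with a small sketch. It assumes, purely for illustration, that each parameter of a fragment is summarized by a single number and that a deviation is the absolute difference between the student's value and the teacher's; the Fragment class and the function names are hypothetical. A tolerance that is not set is treated as infinite, so the corresponding parameter is never reported.

```python
import math
from dataclasses import dataclass, field

PARAMS = ("loudness", "intonation", "rate", "correctness")


@dataclass
class Fragment:
    phoneme: str
    reference: dict                                 # teacher's value per parameter
    tolerance: dict = field(default_factory=dict)   # missing entry means math.inf


def check_fragment(fragment, student):
    """Return the parameters whose deviation exceeds the allowed tolerance."""
    errors = []
    for p in PARAMS:
        deviation = abs(student[p] - fragment.reference[p])
        # No finite deviation exceeds math.inf, so untuned parameters pass.
        if deviation > fragment.tolerance.get(p, math.inf):
            errors.append(p)
    return errors


def check_phrase(fragments, student_values):
    """Map fragment index to its list of out-of-tolerance parameters."""
    report = {}
    for i, (frag, values) in enumerate(zip(fragments, student_values)):
        errors = check_fragment(frag, values)
        if errors:
            report[i] = errors
    return report
```

With a finite tolerance on the correctness of a single fragment and no other tolerances set, check_phrase reports at most that one fragment, however far the student deviates elsewhere, which matches the example of checking only one phoneme in a phrase.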

The dialogue between the system and the student is automatic. After listening to a phrase recorded by the teacher, the student repeats it. The system calculates the special voice-invariant characteristics of the student's recorded phrase and automatically divides it into sound fragments corresponding to the phonemes of the foreign language, using a well-known dynamic programming method. The student can then assess the correctness of pronunciation of each fragment both by ear and visually. An inadmissible deviation of any of the student's curves (loudness, intonation, rate or correctness of pronunciation of a fragment) from the corresponding curve of the teacher means that the fragment was pronounced with an error.
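The text does not name the dynamic programming method. One common choice for this kind of task is dynamic time warping (DTW), and the sketch below assumes it: the student's frame sequence is aligned with the teacher's, and the fragment boundaries marked by the teacher are carried over along the alignment path. Scalar per-frame features, the absolute-difference distance and the function names are simplifications made for illustration, not details taken from the system described here.

```python
def dtw_path(ref, obs, dist=lambda a, b: abs(a - b)):
    """Align two frame sequences; return the warping path as (ref_i, obs_j) pairs."""
    n, m = len(ref), len(obs)
    inf = float("inf")
    # cost[i][j] = minimal cumulative distance aligning ref[:i] with obs[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(ref[i - 1], obs[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def transfer_boundaries(path, ref_starts):
    """Map the teacher's fragment start frames to the student's frames."""
    first_obs = {}
    for i, j in path:
        # The path is ordered, so the first j seen for each i is the earliest.
        first_obs.setdefault(i, j)
    return [first_obs[s] for s in ref_starts]
```

Given the teacher's frames, the student's frames and the teacher's fragment start indices, transfer_boundaries(dtw_path(teacher, student), starts) returns the corresponding start indices in the student's recording, even when the student holds some sounds longer or shorter than the teacher does.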

The system was tested by teaching English to more than 10 Russian "students" on words and phrases containing the following English sounds, which have no close analogues in Russian: th (as in this), th (as in think), ng (as in ring), r (as in rest), h (as in hello), t (as in ten), d (as in desk), w (as in wall). The probability of an error in dividing a student's words into fragments averaged 0.05, the probability of reporting a false error on a correct pronunciation averaged 0.04, and the probability of failing to detect an obvious pronunciation error averaged 0.15.

The work was carried out at the Department of Mathematical Theory of Intelligent Systems, Faculty of Mechanics and Mathematics, Lomonosov Moscow State University.