Abstract:
We present a study of Russian language models built with recurrent artificial neural networks for automatic continuous speech recognition systems. We construct neural network models with different numbers of elements in the hidden layer and linearly interpolate them with the baseline trigram language model. The resulting models were used to rescore the N-best list. In our experiments on continuous Russian speech recognition with an extra-large vocabulary (150 thousand word forms), rescoring the 50-best list with the neural network language models interpolated with the trigram model gave a relative word error rate reduction of 14%.
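The two steps named in the abstract, linear interpolation of language models and N-best rescoring, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the interpolation weight `lam`, the language model weight `lm_weight`, and the hypothesis format are all assumptions made for illustration. The per-word probabilities are taken as given, whereas a real system would obtain them from the trained recurrent network and trigram models.

```python
import math

def interpolate(p_rnn, p_ngram, lam=0.5):
    """Linearly interpolate two probabilities of the same word.

    lam is the (assumed) weight of the neural network model;
    the trigram model receives the remaining 1 - lam.
    """
    return lam * p_rnn + (1.0 - lam) * p_ngram

def sentence_logprob(rnn_probs, ngram_probs, lam=0.5):
    """Log-probability of a sentence under the interpolated model."""
    return sum(
        math.log(interpolate(p_rnn, p_ngram, lam))
        for p_rnn, p_ngram in zip(rnn_probs, ngram_probs)
    )

def rescore_nbest(hypotheses, lam=0.5, lm_weight=1.0):
    """Return the hypothesis with the best combined score.

    Each hypothesis is a dict (hypothetical format) with keys:
      'text'             - the recognized word sequence
      'acoustic_logprob' - acoustic model log-score
      'rnn_probs'        - per-word probabilities from the neural model
      'ngram_probs'      - per-word probabilities from the trigram model
    """
    def total_score(hyp):
        lm_score = sentence_logprob(hyp["rnn_probs"], hyp["ngram_probs"], lam)
        return hyp["acoustic_logprob"] + lm_weight * lm_score

    return max(hypotheses, key=total_score)

# Toy 2-best list with made-up scores, for illustration only.
nbest = [
    {"text": "hypothesis a", "acoustic_logprob": -10.0,
     "rnn_probs": [0.5, 0.5], "ngram_probs": [0.1, 0.1]},
    {"text": "hypothesis b", "acoustic_logprob": -9.5,
     "rnn_probs": [0.1, 0.1], "ngram_probs": [0.2, 0.2]},
]

best_interpolated = rescore_nbest(nbest, lam=0.5)
best_trigram_only = rescore_nbest(nbest, lam=0.0)
```

With `lam=0.0` the score reduces to the trigram model alone, so comparing the two calls shows how the interpolated neural model can change which hypothesis is selected from the list.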
Keywords:
language models, neural networks, automatic speech recognition, Russian speech.
This work was supported by the Russian Foundation for Basic Research, projects nos. 15-07-04322, 15-07-04415, and 16-37-60100, Russian President grants nos. MK-1000.2017.8 and MD-254.2017.8, and the budget topic 0073-2014-0005.
Presented by the member of the Editorial Board: V. I. Vasil'ev
Citation:
I. S. Kipyatkova, A. A. Karpov, “A study of neural network Russian language models for automatic continuous speech recognition systems”, Avtomat. i Telemekh., 2017, no. 5, 110–122; Autom. Remote Control, 78:5 (2017), 858–867