A. A. Karpov, “An automatic multimodal speech recognition system with audio and video information”, Avtomat. i Telemekh., 2014, no. 12, 125–138; Autom. Remote Control, 75:12 (2014), 2190–2200

Avtomatika i Telemekhanika


Avtomatika i Telemekhanika, 2014, Issue 12, Pages 125–138 (Mi at14166)

This article is cited in 15 scientific papers.

Intellectual Control Systems

An automatic multimodal speech recognition system with audio and video information

A. A. Karpov^ab

^a St. Petersburg Institute of Informatics and Automation, Russian Academy of Sciences, St. Petersburg, Russia
^b ITMO University, St. Petersburg, Russia


Abstract: This paper presents the mathematical model and software implementation of an automatic Russian speech recognition system that applies digital processing and analysis of audiovisual signals from a microphone and a video camera. It describes probabilistic modeling of audiovisual speech based on coupled hidden Markov models, information fusion methods with weight coefficients for the audio and video speech modalities, and the parametric representation of signals. Quantitative results for multimodal recognition of continuous Russian speech indicate high accuracy and reliability of the automatic system.
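
The weighted fusion of audio and video modalities mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear combination of per-hypothesis log-likelihoods with weights that sum to one, which is a common formulation for multi-stream models and is not taken from the paper itself. The function names, the example words, and the scores are hypothetical.

```python
def fuse_log_likelihoods(audio_ll, video_ll, audio_weight):
    """Combine per-hypothesis log-likelihoods from two modalities.

    Assumption: the video weight is 1 - audio_weight, so the two
    weights sum to one. The paper may use a different scheme.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    video_weight = 1.0 - audio_weight
    return {
        label: audio_weight * audio_ll[label] + video_weight * video_ll[label]
        for label in audio_ll
    }


def recognize(audio_ll, video_ll, audio_weight):
    """Return the hypothesis with the highest fused log-likelihood."""
    fused = fuse_log_likelihoods(audio_ll, video_ll, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical scores for two candidate words.
audio_scores = {"da": -12.0, "net": -10.0}
video_scores = {"da": -4.0, "net": -9.0}

# Trusting the audio stream (e.g. clean acoustic conditions).
print(recognize(audio_scores, video_scores, audio_weight=0.9))
# Trusting the video stream (e.g. noisy acoustic conditions).
print(recognize(audio_scores, video_scores, audio_weight=0.3))
```

In this sketch a lower audio weight shifts the decision toward the visual evidence, which is the usual motivation for tuning modality weights to the acoustic conditions.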

Presented by Editorial Board member A. V. Bernshtein

Received: 28.03.2012

English version:
Automation and Remote Control, 2014, Volume 75, Issue 12, Pages 2190–2200
DOI: https://doi.org/10.1134/S000511791412008X

Document Type: Article

Language: Russian

Citation: A. A. Karpov, “An automatic multimodal speech recognition system with audio and video information”, Avtomat. i Telemekh., 2014, no. 12, 125–138; Autom. Remote Control, 75:12 (2014), 2190–2200

Citation in format AMSBIB

\Bibitem{Kar14}
\by A.~A.~Karpov
\paper An automatic multimodal speech recognition system with audio and video information
\jour Avtomat. i Telemekh.
\yr 2014
\issue 12
\pages 125--138
\mathnet{http://mi.mathnet.ru/at14166}
\transl
\jour Autom. Remote Control
\yr 2014
\vol 75
\issue 12
\pages 2190--2200
\crossref{https://doi.org/10.1134/S000511791412008X}
\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=000346402900008}
\scopus{https://www.scopus.com/record/display.url?origin=inward&eid=2-s2.0-84919360128}

Linking options:

https://www.mathnet.ru/eng/at14166

https://www.mathnet.ru/eng/at/y2014/i12/p125

This publication is cited in the following 15 articles:

Astha Gupta, Rakesh Kumar, Yogesh Kumar, 2022 4th International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), 2022, 1492
Fang Y., Yu L., Fei Sh., “Contactless Interactive Control Technology Based on Switching Filtering Algorithm”, Trans. Inst. Meas. Control, 43:2 (2021), 484–494
Malakar M., Keskar R.B., “Progress of Machine Learning Based Automatic Phoneme Recognition and Its Prospect”, Speech Commun., 135 (2021), 37–53
Denis Ivanko, Dmitry Ryumin, Irina Kipyatkova, Alexandr Axyonov, Alexey Karpov, Smart Innovation, Systems and Technologies, 154, Proceedings of 14th International Conference on Electromechanics and Robotics “Zavalishin's Readings”, 2020, 477
M. P. Farkhadov, N. V. Petukhova, S. V. Vaskovskii, M. E. Farkhadova, “Povyshenie effektivnosti rechevogo interfeisa s primeneniem kognitivnykh i lingvisticheskikh znanii”, UBS, 81 (2019), 90–112
S. Pekarskikh, E. Kostyuchenko, L. Balatskaya, “Evaluation of speech quality through recognition and classification of phonemes”, Symmetry-Basel, 11:12 (2019), 1447
Evgeny Kostuchenko, Dariya Novokhrestova, Marina Tirskaya, Alexander Shelupanov, Mikhail Nemirovich-Danchenko, Evgeny Choynzonov, Lidiya Balatskaya, Lecture Notes in Computer Science, 11658, Speech and Computer, 2019, 237
Evgeny Kostuchenko, Dariya Novokhrestova, Svetlana Pekarskikh, Alexander Shelupanov, Mikhail Nemirovich-Danchenko, Evgeny Choynzonov, Lidiya Balatskaya, Lecture Notes in Computer Science, 11658, Speech and Computer, 2019, 359
D. Ivanko, A. Karpov, D. Fedotov, I. Kipyatkova, D. Ryumin, D. Ivanko, W. Minker, M. Zelezny, “Multimodal speech recognition: increasing accuracy using high speed video data”, J. Multimodal User Interfaces, 12:4, SI (2018), 319–328
N. Radha, A. Shahina, P. Prabha, P. B. T. Sri, N. A. Khan, “An analysis of the effect of combining standard and alternate sensor signals on recognition of syllabic units for multimodal speech recognition”, Pattern Recognit. Lett., 115, SI (2018), 39–49
A. A. Karpov, R. M. Yusupov, “Multimodal Interfaces of Human–Computer Interaction”, Her. Russ. Acad. Sci., 88:1 (2018), 67
I. S. Kipyatkova, A. A. Karpov, “A study of neural network Russian language models for automatic continuous speech recognition systems”, Autom. Remote Control, 78:5 (2017), 858–867
Denis Ivanko, Alexey Karpov, Dmitry Ryumin, Irina Kipyatkova, Anton Saveliev, Victor Budkov, Dmitriy Ivanko, Miloš Železný, Lecture Notes in Computer Science, 10458, Speech and Computer, 2017, 757
Alexey Karpov, Alexander Ronzhin, Irina Kipyatkova, Andrey Ronzhin, Vasilisa Verkhodanova, Anton Saveliev, Milos Zelezny, Lecture Notes in Computer Science, 9732, Human-Computer Interaction. Interaction Platforms and Techniques, 2016, 170
A. Karpov, A. Ronzhin, I. Kipyatkova, “Automatic analysis of speech and acoustic events for ambient assisted living”, Universal Access in Human-Computer Interaction: Access To Interaction, Pt II, Lecture Notes in Computer Science, 9176, eds. M. Antona, C. Stephanidis, Springer-Verlag Berlin, 2015, 455–463
