A Review on Deep Learning Architectures for Speech Recognition

Dokuz, Yeşim; Tüfekçi, Zekeriya

dc.contributor.author	Dokuz, Yeşim
dc.contributor.author	Tüfekçi, Zekeriya
dc.date.accessioned	2024-11-07T13:19:01Z
dc.date.available	2024-11-07T13:19:01Z
dc.date.issued	2020
dc.department	Niğde Ömer Halisdemir Üniversitesi
dc.description.abstract	Deep learning is a branch of machine learning that uses algorithms which model datasets through deep architectures with many processing layers. With the popularity and successful applications of deep learning architectures, they are also being used in speech recognition. Researchers have utilized these architectures for speech recognition and its applications, such as speech emotion recognition, voice activity detection, and speaker recognition and verification, to better model the mapping from speech inputs to outputs and to reduce the error rates of speech recognition systems. Many studies in the literature use deep learning architectures for speech recognition systems. These studies show that using deep learning architectures for speech recognition and its applications provides benefits in many speech recognition areas and can reduce error rates and improve performance. In this study, we first explained the speech recognition problem and the steps of speech recognition. Then, we analyzed the studies related to deep learning based speech recognition. In particular, the deep learning architectures of Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs), and hybrid approaches that use these architectures, are evaluated, and the literature studies related to these architectures for speech recognition and its application areas are investigated. As a result, we observed that RNNs are the most utilized and most powerful deep learning architecture among those reviewed in terms of error rates and speech recognition performance. CNNs are another successful deep learning architecture and achieve results close to those of RNNs in terms of error rates and speech recognition performance. Also, we observed that new deep architectures that use either hybrids of DNNs, CNNs, and RNNs or other deep learning architectures are gaining attention, show increasing performance, and could reduce error rates in speech recognition.
dc.identifier.doi	10.31590/ejosat.araconf22
dc.identifier.endpage	176
dc.identifier.issn	2148-2683
dc.identifier.issue	Ejosat Special Issue 2020 (ARACONF)
dc.identifier.startpage	169
dc.identifier.trdizinid	364765
dc.identifier.uri	https://doi.org/10.31590/ejosat.araconf22
dc.identifier.uri	https://search.trdizin.gov.tr/tr/yayin/detay/364765
dc.identifier.uri	https://hdl.handle.net/11480/12829
dc.identifier.volume	0
dc.indekslendigikaynak	TR-Dizin
dc.language.iso	en
dc.relation.ispartof	Avrupa Bilim ve Teknoloji Dergisi
dc.relation.publicationcategory	Article - National Peer-Reviewed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/openAccess
dc.snmz	KA_20241107
dc.subject	Computer Science, Software Engineering
dc.subject	Computer Science, Cybernetics
dc.subject	Computer Science, Information Systems
dc.subject	Computer Science, Hardware and Architecture
dc.subject	Computer Science, Theory and Methods
dc.subject	Acoustics
dc.subject	Computer Science, Artificial Intelligence
dc.title	A Review on Deep Learning Architectures for Speech Recognition
dc.type	Review Article

Collection

TR-Dizin Indexed Publications Collection