
Abstract

  1. INTRODUCTION
  2. RELEVANCE OF THE TOPIC
  3. THE PURPOSE AND OBJECTIVES OF THE STUDY
  4. CONCLUSION
  5. SOURCES

1. INTRODUCTION

Today, many people handle everyday tasks on the go, from their phone. With it, you can check mail, send documents and photos, find the nearest ATM, or plan a driving route. A keyboard is not convenient for all of these tasks, so voice control is now one of the most relevant areas of mobile development.

Voice control is built on speech recognition technology, which draws on advances in fields ranging from computational linguistics to digital signal processing.

2. RELEVANCE OF THE TOPIC

Many methods and algorithms currently exist for processing human speech and extracting information about the characteristics of an audio signal. To select the optimal solution for a particular sound recognition problem, several candidate solutions must be considered.

Speech recognition is a task that a person performs without special effort several times a day. It is also one of the key biometric technologies, with several advantages over the others: it is natural, accessible, and easy to use. For these reasons, improving and upgrading recognition systems of this type remains a relevant problem.

3. THE PURPOSE AND OBJECTIVES OF THE STUDY

The purpose of this work is to optimize an existing speech recognition method based on neural networks.

To do this, it is necessary to explore the subject area, analyze existing methods for solving such problems, identify their strengths and weaknesses, choose the most promising methods for this problem, analyze the results of applying them, and select the best one.

A number of tasks were formulated to achieve this goal:

  1. Review of existing sound recognition methods.
  2. Overview of neural network implementations on FPGAs.
  3. Analysis of the architecture of sound recognition systems.
  4. Implementation of sound recognition on an FPGA.

4. CONCLUSION

Based on the above, it can be concluded that speech recognition systems have taken a very big step forward, but these systems are not perfect.

The problems of noise filtering, speech clarity, and recognition of large amounts of information remain unsolved. The task of creating a quality system that can adapt to different conditions and different speakers has therefore not lost its relevance.

NOTES

At the time of writing this abstract, the master's thesis is not yet complete. Estimated completion date: May 2019. The full text of the work, as well as materials on the topic, can be obtained from the author or his supervisor after that date.

SOURCES

  1. Yandex Blog. How does it work? Speech recognition [Electronic resource]. Available at: https://yandex.ru/blog/company/72171
  2. Makovkin K.A. Hybrid models: hidden Markov models and neural networks, and their application in speech recognition systems // Models, methods, algorithms and architectures of speech recognition systems. Moscow: Dorodnicyn Computing Centre of the Russian Academy of Sciences, 2006.
  3. Gefke D.A., Zatsepin P.M. Application of hidden Markov models for recognition of sound sequences [Electronic resource]. Available at: http://docplayer.ru/34318860-Udk-d-a-gefke-p-m-zacepin-primenenie-skrytyh-markovskih-modeley-dlya-raspoznavaniya-zvukovyh-posledovatelnostey-a-n-1-n-s-1-s-2-s-2-s.html
  4. Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury. Deep Neural Networks for Acoustic Modeling in Speech Recognition [Electronic resource]. Available at: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/38131.pdf
  5. Azarov A.B., Konstantinov V.S., Zinchenko Yu.E., Zinchenko T.A. Overview of the existing concept and implementation possibilities of neural networks // Proceedings of the student section of the IX International Scientific and Technical Conference "Informatics, Control Systems, Mathematical and Computer Modeling" (IUSMKM-2018). Donetsk: DonNTU, 2018. pp. 390-394.
  6. Preeti Saini, Parneet Kaur. Automatic Speech Recognition: A Review // International Journal of Engineering Trends and Technology [Electronic resource]. Available at: http://ijettjournal.org/volume-4/issue-2/IJETT-V4I2P210.pdf
  7. Volkov A.V. Analysis of existing recognition methods for invariance to background noise and speaker diction [Electronic resource]. Available at: https://cyberleninka.ru/article/v/analiz-suschestvuyuschih-metodov-raspoznavaniya-na-invariantnost-k-fonovym-pomeham-i-diktsii-diktora
  8. Kostenko A.V. New approaches to the problems of the end of the speech signal. Personal site on the DonNTU masters portal, 2010. Available at: http://masters.donntu.ru/2012/iii/kostenko/diss/index.htm
  9. Che V. The past, present and future of speech recognition technologies [Electronic resource]. Available at: https://habr.com/ru/company/infopulse/blog/346928/
  10. Radchenko G. Speech recognition. Part 1. Classification of speech recognition systems [Electronic resource]. Available at: https://habr.com/ru/post/64572/