We live in a world where people speak many different languages, and this becomes a barrier to communication and relationships. Many animals, such as dolphins, communicate using sounds, while human beings communicate through their different languages. Even on mobile phones, we have the technology to type or search in our own language. But sometimes typing is difficult, perhaps because we are busy with other work or our hands are not free. In such cases, we need a device that can work just by listening to our spoken commands and then carry out the task.
Then, in 2011, Siri came out, transforming speech-recognition technology and becoming a major digital trend that few had anticipated. After that, many voice-activated technologies came onto the market, such as Google Assistant, Cortana, and Amazon's Alexa. With the introduction of AI, this dream came true, bringing many benefits: it helps users multitask, speeds up interaction, and generally makes things easier for the user.
There are four main ways by which a computer turns the spoken word into the written word. When human beings listen to speech, their brains turn the sounds into words, and this happens so quickly that it looks like magic. Computers and other appliances do something similar by analyzing and processing phonemes and phones.
First, the spoken word is matched and compared against words with similar sounds stored in memory. For this, the entire word is considered and recognized as a single unit.
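The whole-word matching idea can be sketched in a few lines of code. This is a toy illustration, not a real recognizer: the word names, feature values, and distance measure below are all invented for the example, standing in for the acoustic templates a real system would store.

```python
# Toy sketch of whole-word matching (all names and numbers here are
# hypothetical). Each stored word is represented by a short sequence of
# numeric "acoustic features"; an incoming sound is recognized as
# whichever stored word it is closest to.

WORD_TEMPLATES = {
    "yes": [0.9, 0.1, 0.8],
    "no":  [0.2, 0.7, 0.3],
}

def distance(a, b):
    """Sum of absolute differences between two feature sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def recognize(features):
    """Return the stored word whose template is closest to the input."""
    return min(WORD_TEMPLATES, key=lambda w: distance(features, WORD_TEMPLATES[w]))

print(recognize([0.85, 0.15, 0.75]))  # closest to the "yes" template
```

Note that the entire word is compared as one unit, exactly as the step above describes; nothing inside the word is analyzed separately.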
Second, pattern and feature analysis is done. Here, when we say a complete sentence, the words in the sentence are separated and taken one by one. Each word is then broken into smaller pieces, which are recognized from key features such as the occurrence of consonants or vowels.
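This splitting-and-feature step can be illustrated with a deliberately simplified sketch. Real systems extract features from audio frames; here, purely for illustration, we work on the written word and use the vowel/consonant distinction mentioned above as the "key feature".

```python
# Toy sketch of feature analysis (a hypothetical simplification:
# real recognizers analyze audio frames, not letters).
VOWELS = set("aeiou")

def feature_pattern(word):
    """Map each letter to 'V' (vowel) or 'C' (consonant)."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())

sentence = "speech recognition works"
for word in sentence.split():           # separate the words one by one
    print(word, feature_pattern(word))  # key features for each word
```

The sentence is first separated into words, and each word is then reduced to a pattern of features, mirroring the two stages described above.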
Third, language modeling and statistical analysis are done. Here, knowledge of grammar and of how likely words are to follow one another work together, which improves the accuracy and increases the speed of the voice-recognition process.
Finally, artificial neural networks are employed. These are loosely modeled on the brain and are trained to recognize patterns of sounds and words. In 2012, the introduction of Deep Neural Networks (DNNs) greatly improved the accuracy of speech recognition, since such networks can keep learning from the sounds users produce.
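The idea of training on sound patterns can be sketched with a single artificial neuron. This is only a conceptual illustration: the two-number "sound patterns" and labels below are made up, and real DNN-based recognizers use many layers and millions of learned parameters.

```python
# Toy sketch: one artificial neuron learns to tell two (made-up)
# sound patterns apart using the classic perceptron learning rule.
examples = [
    ([0.9, 0.1], 1),  # pattern for word A -> label 1
    ([0.8, 0.2], 1),
    ([0.1, 0.9], 0),  # pattern for word B -> label 0
    ([0.2, 0.8], 0),
]

w = [0.0, 0.0]  # weights, adjusted during training
b = 0.0         # bias

for _ in range(20):                      # a few passes over the examples
    for x, target in examples:
        out = 1 if w[0]*x[0] + w[1]*x[1] + b > 0 else 0
        err = target - out               # perceptron learning rule:
        w = [w[0] + 0.1*err*x[0],        # nudge weights toward correct
             w[1] + 0.1*err*x[1]]        # answers on each mistake
        b += 0.1 * err

test = [0.85, 0.15]                      # unseen pattern, close to word A
print(1 if w[0]*test[0] + w[1]*test[1] + b > 0 else 0)  # classified as A
```

After training, the neuron classifies a pattern it has never seen, which is the essence of what the deep networks in modern recognizers do at vastly larger scale.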