Speech Recognition – ByteGuide

Speech recognition has a wide range of applications. Automated phone systems use it to let callers make choices and navigate menus without pressing any keys. In businesses and homes, dictation software transcribes spoken words directly into a word processor, and it is widely used for legal, medical, and other transcription work.

Speech recognition also provides a way to issue commands to a computer. By voice, users can browse folders, search for a specific file, or open and run a program. People with disabilities also use speech recognition software to operate their computers.

Multiple Users and Limited Vocabulary Speech Recognition Systems

Some speech recognition systems are designed to serve many users but recognize only a limited vocabulary. Because they must understand voice commands from many different speakers, this is the type used by telephone answering services. The trade-off is that the number of commands is small: voice systems built on this technology accept only a small set of predetermined inputs and commands.
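The idea of a small, predetermined command set can be sketched in a few lines of Python. This is only an illustration, not how any particular phone system works: the menu entries in `COMMANDS` and the function name `match_command` are hypothetical, and the sketch assumes the recognizer has already produced a (possibly misheard) word as text. It uses the standard library's `difflib` to pick the closest known command.

```python
import difflib

# Hypothetical menu for a phone system; these command names are assumptions.
COMMANDS = ["billing", "support", "sales", "operator"]

def match_command(heard, commands=COMMANDS, cutoff=0.6):
    """Return the known command closest to the recognized word, or None.

    `cutoff` is the minimum similarity (0 to 1) required to accept a match.
    """
    word = heard.lower().strip()
    matches = difflib.get_close_matches(word, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None

if __name__ == "__main__":
    print(match_command("biling"))   # a slightly misheard "billing"
    print(match_command("weather"))  # not part of the menu
```

Because the vocabulary is tiny, anything that does not resemble a known command can simply be rejected, which is part of why such systems cope with many different speakers.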

Limited Users and Large Vocabulary Speech Recognition Systems

The other category is the system with a large vocabulary but a limited number of users. It can recognize and act on a large share of what is dictated to it, but only a few users can use it effectively, because this type of system requires extensive training.

Speech recognition systems are much more accurate when users speak each word clearly and pause between words. Most people, however, speak continuously, and until recently the technology had no good way to handle this.
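To see why pauses help, consider a minimal sketch of pause-based word segmentation. It assumes the audio has already been reduced to a list of loudness values, one per short time slice; the function name `find_word_segments` and the parameters `threshold` and `min_pause_frames` are hypothetical choices for illustration, not part of any real product.

```python
def find_word_segments(frame_energies, threshold, min_pause_frames=2):
    """Return (start, end) index pairs for loud runs separated by pauses.

    A run ends once `min_pause_frames` consecutive values fall below
    `threshold`. `end` is exclusive.
    """
    segments = []
    start = None      # index where the current word began, if any
    quiet_run = 0     # number of consecutive quiet frames seen so far
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_pause_frames:
                segments.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # Audio ended while a word was still open; trim trailing quiet frames.
        segments.append((start, len(frame_energies) - quiet_run))
    return segments

if __name__ == "__main__":
    paused = [0, 0, 5, 6, 0, 0, 7, 8, 9, 0]   # two words with a clear pause
    continuous = [0, 5, 6, 0, 7, 8, 0]        # pause too short to count
    print(find_word_segments(paused, threshold=1))
    print(find_word_segments(continuous, threshold=1))
```

With a clear pause the two words come out as separate segments. When the gap is shorter than `min_pause_frames`, they merge into one, which is the difficulty continuous speech creates for this simple approach.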

The Transcription Process

Converting speech to text involves a series of complex steps. First, an analog-to-digital converter (ADC) converts the recorded sound into digital form so that the computer can process it. The quality of this conversion depends heavily on the system's sampling rate.
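The role of the sampling rate can be illustrated with a small Python sketch. It stands in for an ADC by sampling a pure tone at regular intervals and rounding each sample to a 16-bit integer. The function names, the 440 Hz test tone, and the 8000 Hz rate are assumptions chosen for the example; a real ADC is hardware and works on the microphone signal, not on a formula.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, duration_s, bits=16):
    """Sample a pure tone and round each sample to a signed `bits`-bit integer."""
    max_level = 2 ** (bits - 1) - 1              # 32767 for 16-bit audio
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                   # time of the n-th sample, in seconds
        analog = math.sin(2 * math.pi * freq_hz * t)   # "analog" value in [-1, 1]
        samples.append(round(analog * max_level))      # quantization step
    return samples

def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sampling rate can represent (half the rate)."""
    return sample_rate_hz / 2

if __name__ == "__main__":
    tone = sample_and_quantize(440, 8000, 0.5)
    print(len(tone), "samples; frequencies above",
          nyquist_limit(8000), "Hz cannot be captured at this rate")
```

A higher sampling rate produces more samples per second and raises the highest frequency that survives the conversion, which is why conversion quality depends on it.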

During conversion, the system filters out the noise that the microphone picked up along with the voice and adjusts the volume to a constant level. The speed of the recording is sometimes adjusted as well, to compensate for differences in users' speaking rates so that the computer can compare the audio with the samples stored in its memory.
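A rough sketch of the noise and level adjustments might look like the following. It deliberately substitutes two very simple techniques, a noise gate and peak normalization, for whatever filtering a real system uses (real systems typically rely on more sophisticated methods). The function names and the threshold value are assumptions, and samples are taken to be floats between -1 and 1.

```python
def noise_gate(samples, threshold):
    """Silence any sample quieter than `threshold` (a crude noise filter)."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]

def normalize_peak(samples, target_peak=1.0):
    """Scale the samples so the loudest one reaches `target_peak`."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)      # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

if __name__ == "__main__":
    recording = [0.01, -0.2, 0.005, 0.1]     # made-up sample values
    cleaned = normalize_peak(noise_gate(recording, threshold=0.02))
    print(cleaned)
```

After both steps, quiet background samples are zeroed and a soft speaker ends up at the same peak level as a loud one, which makes later comparison more consistent.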

The digital data is then split into short segments, statistically modeled, and compared against the speech recognition software's database in order to transcribe the audio accurately into text.
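The segmenting and statistical comparison can be sketched in miniature as follows. This is a toy under stated assumptions, not the method of any actual product: the audio is cut into overlapping frames, each frame is reduced to a single number (its average energy), and that number is scored against stored models using a Gaussian log-likelihood. The model names and their mean and variance values are invented for the example; real systems use far richer features and models.

```python
import math

def split_into_frames(samples, frame_len, hop):
    """Cut a sample list into overlapping frames of `frame_len`, stepping by `hop`."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def frame_energy(frame):
    """Average squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def gaussian_log_likelihood(x, mean, variance):
    """Log-probability density of x under a normal distribution."""
    return -0.5 * (math.log(2 * math.pi * variance) + (x - mean) ** 2 / variance)

def best_matching_unit(feature, models):
    """Pick the stored model (name -> (mean, variance)) that best explains `feature`."""
    return max(models,
               key=lambda name: gaussian_log_likelihood(feature, *models[name]))

if __name__ == "__main__":
    # Hypothetical "database" with two entries; the numbers are made up.
    models = {"silence": (0.0, 0.01), "speech": (0.5, 0.05)}
    audio = [0.0, 0.01, 0.0, -0.01, 0.7, -0.7, 0.7, -0.7]
    for frame in split_into_frames(audio, frame_len=4, hop=4):
        print(best_matching_unit(frame_energy(frame), models))
```

Choosing the most likely model for each frame, rather than demanding an exact match, is the basic idea behind the statistical approach.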

The most important advance of current speech recognition systems over older ones is their ability to recognize continuous speech and to use statistical computation to transcribe audio.
