The idea of a thinking machine, an alter ego of humanity, took root much earlier, perhaps with Mary Shelley writing “Frankenstein” in 1818. Artificial intelligence, as we see it today, is the culmination of nearly seventy years of grinding effort by physicists and engineers. The first working AI was a checkers-playing program created in 1951. It was, of course, a program based on statistical computing. Today, almost the whole body of work involving AI revolves around three distinct capabilities: speech recognition, natural language processing, and computer vision. We will try to understand what lies beneath these technologies and what makes them so important today.
Speech Recognition
Voice recognition, or speech recognition, refers to computers’ ability to turn words or commands spoken aloud into text, or to take some action in response to those commands. The Audrey system, built in 1952 by Bell Labs, was the first voice recognition device. It could understand the 10 digits when spoken by a single voice. IBM’s Shoebox machine, created in 1962, was a little more advanced. It could recognize 16 spoken English words, including the 10 digits, and perform simple arithmetic in response to spoken commands.
That humble beginning led to applications like Google Assistant, Amazon Alexa, and Siri. These voice recognition systems can respond to complex commands given in multiple languages and different accents. Speech recognition has clearly come a long way.
How does it work?
The speech recognition program breaks the spoken words down into small units of sound. These sounds are run through algorithms that work out which words in the language most probably match them. The words are then transcribed and passed through natural language processing to prompt the computer into action. All of this happens in near real time.
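The matching step described above can be illustrated with a toy sketch. It assumes the audio has already been converted into sound units (phonemes). The small pronunciation table, the phoneme spellings, and the word probabilities are all made up for illustration; real recognizers rely on learned acoustic and language models rather than a lookup table.

```python
# Toy illustration: map a sequence of sound units to the most probable word.
# PRONUNCIATIONS and WORD_PRIOR are hypothetical values, not from a real model.

PRONUNCIATIONS = {
    "two": ("T", "UW"),
    "too": ("T", "UW"),
    "ten": ("T", "EH", "N"),
    "play": ("P", "L", "EY"),
}

# Assumed relative frequency of each word in the language.
WORD_PRIOR = {"two": 0.5, "too": 0.3, "ten": 0.15, "play": 0.05}


def sound_match(heard, expected):
    """Fraction of positions where the heard sounds agree with a word's sounds."""
    longest = max(len(heard), len(expected))
    agree = sum(1 for h, e in zip(heard, expected) if h == e)
    return agree / longest


def probable_words(heard):
    """Rank candidate words by (sound match x word prior), best first."""
    scores = {
        word: sound_match(heard, sounds) * WORD_PRIOR[word]
        for word, sounds in PRONUNCIATIONS.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Drop words that share no sounds with what was heard.
    return [(word, score) for word, score in ranked if score > 0]


def transcribe(sound_groups):
    """Pick the top-ranked word for each group of sounds."""
    return " ".join(probable_words(tuple(group))[0][0] for group in sound_groups)
```

Calling `transcribe([("P", "L", "EY"), ("T", "UW")])` picks the best candidate for each group of sounds. "two" and "too" sound identical here, so the assumed word prior is what breaks the tie, which is roughly the role a language model plays in a real system.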
How good is it really?
With substantial work being put in by Google, voice recognition applications have overcome many of the problems that used to frustrate users. For instance, Google’s VoiceFilter-Lite can filter out irrelevant noise and even pick out the user’s voice from overlapping speech. This sort of noise filtering is being used to improve meetings on Google Meet.
The key applications
Speech recognition, so far, works like a personal assistant that acts on your commands. It can book a flight, schedule a meeting, request an appointment, set a reminder, play a movie, or find a document or file on the internet. This technology has been heavily integrated into mobile devices.
Natural Language Processing
Natural language processing, or NLP, is the key technology that drives speech recognition. Computers normally receive commands as code that is translated to binary. NLP enables a computer to understand and act upon commands given in human language, or to recognize human language in general. With the advent of deep learning and neural networks, NLP has taken on a whole new dimension. A neural network can even be used to recognize handwritten letters and digits.
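As a rough illustration of acting on a command phrased in human language, the sketch below maps a sentence to an action using hand-written keyword rules. The intent names and keyword lists are hypothetical, and modern NLP systems learn such mappings with neural networks instead of fixed rules.

```python
import re

# Hypothetical intents and trigger keywords, chosen only for illustration.
INTENT_KEYWORDS = {
    "set_reminder": {"remind", "reminder"},
    "play_media": {"play", "watch"},
    "schedule_meeting": {"schedule", "meeting"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def detect_intent(command):
    """Return the intent whose keywords overlap most with the command, or None."""
    tokens = set(tokenize(command))
    best_intent, best_overlap = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        overlap = len(tokens & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent
```

For example, `detect_intent("Please remind me to call Sam")` returns `"set_reminder"`, while a sentence that matches no keyword returns `None`. The rules break as soon as the wording changes ("don't let me forget to call Sam"), which is the gap that learned models are meant to close.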
The key applications
We have already seen that NLP is the driving force behind speech recognition. However, it also has a strong business side. The most frequent and significant use of NLP is in the chatbots used on various e-commerce platforms. Textual analysis, which plays a huge part in business intelligence and data analytics, is also based on natural language processing.
Computer Vision
Computer vision is another marvel of deep learning and an inseparable part of the AI dream. It refers to computers’ ability to recognize visual inputs in the form of photos and videos. In its early years, computer vision progressed slowly and was plagued by many problems. But things have changed now. The ambitious project of the self-driving car is based on computer vision.
The key applications
Computer vision plays a huge part in remotely operated security systems. Some systems can recognize suspicious movement and send alerts to security personnel. AI-supported traffic cameras can measure cars’ speed, recognize traffic violations, and initiate prosecution. The image recognition program on Facebook needs no introduction.
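The traffic camera case can be sketched in a few lines. This is a simplified illustration, not how any particular camera system works: it assumes a vision model has already located the car in each video frame and that those positions have been converted to metres along the road. The function names, frame rate, and speed limit are hypothetical.

```python
# Sketch: estimate a car's speed from positions tracked across video frames.
# Assumes detection and pixel-to-metre conversion happened upstream.


def speed_kmh(positions_m, fps):
    """Average speed in km/h over the tracked frames."""
    if len(positions_m) < 2:
        raise ValueError("need at least two tracked positions")
    distance_m = abs(positions_m[-1] - positions_m[0])
    elapsed_s = (len(positions_m) - 1) / fps  # time between first and last frame
    return distance_m / elapsed_s * 3.6  # convert m/s to km/h


def is_violation(positions_m, fps, limit_kmh):
    """True when the estimated speed exceeds the posted limit."""
    return speed_kmh(positions_m, fps) > limit_kmh
```

A car that moves 3 metres over three frame intervals at 10 frames per second is travelling at about 10 m/s, or 36 km/h, so `is_violation([0.0, 1.0, 2.0, 3.0], fps=10, limit_kmh=30)` flags it. The hard part in practice is the step assumed away here: finding and tracking the car reliably in the image.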
These technologies are driving our generation towards the future. They are rendering some jobs obsolete while creating others. They are shaping the world for us. You can be a part of the change by taking artificial intelligence and machine learning courses. We are witnessing the future in the making; it is time to participate.