In this tutorial, we will learn how to convert speech to text in Python using Google’s Web Speech API. Speech-to-text has a growing range of practical and experimental uses, from AI-driven chatbots to voice-controlled applications.
Python is a favorite among developers for AI-related tasks because of its simple syntax and extensive libraries. One of those libraries, SpeechRecognition, lets us transcribe spoken language into written text.
Prerequisites
Before we get started, make sure that Python is installed on your system. If you haven’t installed it yet, you can download it from the official Python website.
The other requirement is the SpeechRecognition library. If it’s not installed, you can install it with pip, as shown below. Because this tutorial reads from a microphone, you will also need the PyAudio package, which can be installed the same way (pip install PyAudio).
pip install SpeechRecognition
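As a quick sanity check after installing, the sketch below reports whether the packages can be imported. It uses only the standard library. The helper name check_packages is made up for this illustration, and pyaudio is included on the assumption that you plan to use a microphone.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return a dict mapping each top-level module name to True if it is importable."""
    return {name: find_spec(name) is not None for name in names}


# speech_recognition is the import name of the SpeechRecognition package;
# pyaudio is only needed for microphone input.
status = check_packages(["speech_recognition", "pyaudio"])
for name, installed in status.items():
    print("{}: {}".format(name, "installed" if installed else "missing"))
```

If either line reports "missing", install that package with pip before moving on.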
Step 1: Import the Required Library
The first step is to import the SpeechRecognition library, which is available under the module name speech_recognition. The alias sr keeps the later calls short.
import speech_recognition as sr
Step 2: Initialize Recognizer
The Recognizer class is the main entry point of the SpeechRecognition library. An instance holds the listening settings and provides the methods that capture audio and run the conversion.
r = sr.Recognizer()
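The recognizer exposes a few attributes that influence how it listens. The following is a minimal sketch of adjusting them, assuming the defaults do not suit your environment. tune_recognizer is a hypothetical helper, and the numbers are illustrative starting points rather than recommendations.

```python
def tune_recognizer(recognizer, energy_threshold=300, pause_threshold=0.8):
    """Set a few listening attributes on a Recognizer-like object and return it."""
    # Audio quieter than this energy level is treated as silence.
    recognizer.energy_threshold = energy_threshold
    # Let the library keep adapting the threshold to background noise.
    recognizer.dynamic_energy_threshold = True
    # Seconds of silence that mark the end of a phrase.
    recognizer.pause_threshold = pause_threshold
    return recognizer
```

With the library imported as sr, it could be used as r = tune_recognizer(sr.Recognizer(), energy_threshold=400). A higher energy threshold can help in a noisy room, while a longer pause threshold gives slow speakers more time between words.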
Step 3: Convert Speech into Text
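Say something, and the code inside the with statement will take care of the conversion: r.listen() records from the microphone until you pause, and r.recognize_google() sends the recording to Google for transcription.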
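with sr.Microphone() as source:
    print("Speak Anything :")
    # Record from the microphone until a pause is detected
    audio = r.listen(source)
    try:
        # Send the recording to Google's recognizer and get back the text
        text = r.recognize_google(audio)
        print("You said : {}".format(text))
    except sr.UnknownValueError:
        # The speech was unintelligible
        print("Sorry could not recognize what you said")
    except sr.RequestError as e:
        # The recognition service could not be reached
        print("Could not request results; {}".format(e))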
Once you run the code and speak, the recognizer transcribes your speech and prints the text. Because recognize_google() calls an online service, an internet connection is required.
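In a noisy room, or when nobody speaks, the basic call can pick up background sound or wait indefinitely. One possible refinement is to calibrate for ambient noise first and put time limits on listening. This is only a sketch: listen_once and main are hypothetical names, and the durations are arbitrary example values.

```python
def listen_once(recognizer, source, calibrate_seconds=1, timeout=5, phrase_time_limit=10):
    """Calibrate for background noise, then capture a single phrase from source."""
    # Sample the background noise briefly so the energy threshold fits the room.
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)
    # timeout: maximum seconds to wait for speech to start.
    # phrase_time_limit: maximum seconds to record once speech has started.
    return recognizer.listen(source, timeout=timeout, phrase_time_limit=phrase_time_limit)


def main():
    # Imported here so that defining the helper above does not require
    # the library or a microphone to be present.
    import speech_recognition as sr

    r = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            print("Speak Anything :")
            audio = listen_once(r, source)
        print("You said : {}".format(r.recognize_google(audio)))
    except sr.WaitTimeoutError:
        print("No speech detected before the timeout")
    except sr.UnknownValueError:
        print("Sorry could not recognize what you said")
    except sr.RequestError as e:
        print("Could not request results; {}".format(e))
```

Call main() to try it with a microphone. The recognition call sits outside the with block here so that the microphone is released as soon as recording ends.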
Full Code:
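import speech_recognition as sr

r = sr.Recognizer()

with sr.Microphone() as source:
    print("Speak Anything :")
    # Record from the microphone until a pause is detected
    audio = r.listen(source)
    try:
        # Send the recording to Google's recognizer and get back the text
        text = r.recognize_google(audio)
        print("You said : {}".format(text))
    except sr.UnknownValueError:
        # The speech was unintelligible
        print("Sorry could not recognize what you said")
    except sr.RequestError as e:
        # The recognition service could not be reached
        print("Could not request results; {}".format(e))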
Output:
Speak Anything :
You said : I love python
Conclusion:
Now you have learned how to convert speech into text using Python. Speech-to-text conversion has many applications, including transcribing audio files, assisting people with disabilities, and powering voice-driven features such as chatbots and voice assistants like Siri and Google Assistant.
There are many parameters and settings you can adjust to fine-tune the recognition process. Feel free to explore the possibilities, build your own speech-to-text converter, and share your experiences in the comments.
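Transcribing an audio file follows the same pattern as the microphone example, with sr.AudioFile as the source instead of sr.Microphone. The sketch below assumes a WAV, AIFF, or FLAC file. transcribe_file is a hypothetical helper, and the language argument is passed through to recognize_google() as a language tag such as "en-US".

```python
def transcribe_file(path, language="en-US"):
    """Transcribe a WAV, AIFF, or FLAC file and return the recognized text."""
    # Imported inside the function so that defining this helper does not
    # require the library; normally the import sits at the top of the file.
    import speech_recognition as sr

    r = sr.Recognizer()
    # AudioFile is used as a source in the same way as Microphone.
    with sr.AudioFile(path) as source:
        # Read the entire file into memory as audio data.
        audio = r.record(source)
    # Raises sr.UnknownValueError or sr.RequestError on failure,
    # just like the microphone example.
    return r.recognize_google(audio, language=language)
```

For example, print(transcribe_file("recording.wav")) would print the transcript of a file named recording.wav, a placeholder name used here for illustration.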