Typing out long prompts in ChatGPT can be tedious and time-consuming. Voice control makes the conversation much easier.
Instead of typing out your requests and reading the AI’s responses, you can just use your voice to ask questions or give commands and then hear its responses in audio form. Today, we will look at all the different ways you can get voice control for ChatGPT. Let’s find out!
Why Is Voice Control for ChatGPT Important?
Using voice control on ChatGPT has several benefits, including:
It saves time and effort
You don’t have to type long sentences or paragraphs to interact with ChatGPT. You can just speak your mind and get instant responses.
It improves accessibility
You don’t need a keyboard to use ChatGPT. You can just use your voice and listen to the answers. This is especially helpful for people with visual or motor impairments that make reading or typing difficult.
It enhances engagement
By using voice control on ChatGPT, you can liven up the conversation and make it more natural and engaging. You can change up your vocal tones, express different emotions, and even imitate accents to make things more interesting.
It improves learning outcomes
You can use voice control for ChatGPT to hone your speaking and listening skills in different languages. ChatGPT’s responses can give you interesting new facts, concepts, and opinions.
3 Ways to Use Voice Control for ChatGPT
Use Voice Control for ChatGPT Extension
One of the simplest ways to use voice commands with ChatGPT is to add a Chrome extension. Theis Frøhlich’s Voice Control for ChatGPT does just that – all you need to do is install the extension and follow these steps:
- Install the extension from the Chrome Web Store.
- Go to the ChatGPT website. Make sure you open the website from Google Chrome.
- Click on the microphone icon below the input field to start recording your voice.
- Speak your question or command to ChatGPT clearly and wait for the extension to transcribe it into text.
- Click on the microphone icon again to send your input to ChatGPT.
- Wait for ChatGPT’s response and listen to it in audio form.
- You can also adjust some settings of the extension, such as the language, the speech recognition engine, the text-to-speech engine, and the voice speed.
Use Python Code for Voice Control on ChatGPT
To enable voice control for ChatGPT yourself, you can write some Python code that uses speech recognition and text-to-speech libraries to talk to the OpenAI API. The script relies chiefly on the SpeechRecognition, pyttsx3, wikipedia, wolframalpha, and openai packages, plus PyAudio for microphone access. If you are unsure how to install them on your operating system, just ask ChatGPT for instructions. Once the libraries are installed, you can run the script that follows.
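Before the full script, here is a minimal sketch of the loop it is built around: listen, ask, speak. The function name `voice_round_trip` and its three parameters are hypothetical, introduced only for illustration. Passing the stages in as plain functions means the flow can be tried without a microphone, a speaker, or an API key.

```python
from typing import Callable, Optional


def voice_round_trip(
    listen: Callable[[], Optional[str]],
    ask: Callable[[str], str],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run one listen -> ask -> speak cycle and return the spoken reply."""
    heard = listen()
    if not heard or not heard.strip():
        # Nothing was recognised, so there is nothing to send.
        return None
    reply = ask(heard.strip())
    speak(reply)
    return reply


if __name__ == "__main__":
    # Stand-ins so the sketch runs without a microphone or an API key.
    spoken = []
    result = voice_round_trip(
        listen=lambda: "what is the capital of France",
        ask=lambda prompt: f"You asked: {prompt}",
        speak=spoken.append,
    )
    print(result)
```

In a real script, `listen` would wrap the speech recognition call, `ask` the request to the OpenAI API, and `speak` the text-to-speech engine.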
from datetime import datetime
import webbrowser

import openai
import pyttsx3
import speech_recognition as sr
import wikipedia
import wolframalpha

# Speech engine initialisation
engine = pyttsx3.init()
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)  # 0 = male, 1 = female
activationWord = 'execute'  # Single word

# Configure browser: set the path to your Chrome executable
chrome_path = r"C:\Program Files\Google\Chrome\Application\chrome.exe"
webbrowser.register('chrome', None, webbrowser.BackgroundBrowser(chrome_path))

# Wolfram Alpha client
appId = 'YOUR-WOLFRAM-APP-ID'  # Replace with your own app ID
wolframClient = wolframalpha.Client(appId)

# OpenAI client (requires version 1.0 or later of the openai package)
openai_client = openai.OpenAI(api_key="sk-xxxxxxxxxxxxxxxxx")  # Replace with your own API key


def speak(text, rate=120):
    engine.setProperty('rate', rate)
    engine.say(text)
    engine.runAndWait()


def parseCommand():
    # Listen on the microphone and return the recognised text, or '' on failure
    listener = sr.Recognizer()
    print('Listening for a command')
    with sr.Microphone() as source:
        listener.pause_threshold = 2
        input_speech = listener.listen(source)
    try:
        print('Recognizing speech...')
        query = listener.recognize_google(input_speech, language='en_gb')
        print(f'The input speech was: {query}')
    except Exception as exception:
        print('No command received, repeat your command')
        speak('No command received, repeat your command')
        print(exception)
        return ''
    return query


def search_wikipedia(query=''):
    searchResults = wikipedia.search(query)
    if not searchResults:
        print('No wikipedia result')
        return 'No result received'
    try:
        wikiPage = wikipedia.page(searchResults[0])
    except wikipedia.DisambiguationError as error:
        # Fall back to the first suggested page
        wikiPage = wikipedia.page(error.options[0])
    print(wikiPage.title)
    return str(wikiPage.summary)


def listOrDict(var):
    # Subpods arrive either as a list of dicts or as a single dict
    if isinstance(var, list):
        return var[0]['plaintext']
    return var['plaintext']


def search_wolframAlpha(query=''):
    response = wolframClient.query(query)
    # @success: Wolfram Alpha was able to resolve the query
    # @numpods: number of results returned
    # pod: list of results, which can also contain subpods
    if str(response['@success']).lower() == 'false':
        return 'Could not compute'
    # Query resolved
    pod0 = response['pod'][0]  # The question as Wolfram Alpha understood it
    pod1 = response['pod'][1]  # May contain the answer, has the highest confidence value
    title = pod1['@title'].lower()
    isPrimary = str(pod1.get('@primary', 'false')).lower() == 'true'
    # If it's primary, or titled "result" or "definition", it's the official result
    if 'result' in title or 'definition' in title or isPrimary:
        result = listOrDict(pod1['subpod'])
        # Remove the bracketed section
        return result.split('(')[0]
    question = listOrDict(pod0['subpod']).split('(')[0]
    # Search wikipedia instead
    speak('Computation failed. Querying universal databank.')
    return search_wikipedia(question)


def search_chatgpt(prompt):
    # The text-davinci-002 completion model used originally has been retired.
    # The chat model named here is an assumption; substitute one your account can use.
    response = openai_client.chat.completions.create(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content': prompt}],
        max_tokens=2048,
        temperature=0.5,
    )
    return response.choices[0].message.content


# Main loop
if __name__ == '__main__':
    speak('All systems are working correctly and are up to date.')
    while True:
        # Parse as a list of lowercase words
        query = parseCommand().lower().split()
        if not query or query[0] != activationWord:
            continue
        query.pop(0)  # Remove the activation word
        if not query:
            continue

        # Set commands
        if query[0] == 'say':
            if 'hello' in query:
                speak('Greetings, all!')
            else:
                speak(' '.join(query[1:]))
        elif query[0] == 'search':
            terms = ' '.join(word for word in query[1:] if word != 'wikipedia')
            if 'wikipedia' in query:
                speak(search_wikipedia(terms))
            else:
                speak(search_chatgpt(terms))
        # Navigation
        elif query[:2] == ['go', 'to']:
            speak('Opening...')
            webbrowser.get('chrome').open_new(' '.join(query[2:]))
        # Wikipedia
        elif query[0] == 'wikipedia':
            speak('Querying the universal databank.')
            speak(search_wikipedia(' '.join(query[1:])))
        # Wolfram Alpha
        elif query[0] in ('compute', 'computer'):
            speak('Computing')
            try:
                speak(search_wolframAlpha(' '.join(query[1:])))
            except Exception:
                speak('Unable to compute.')
        # Note taking
        elif query[0] == 'log':
            speak('Ready to record your note')
            newNote = parseCommand().lower()
            now = datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
            with open('note_%s.txt' % now, 'w') as newFile:
                newFile.write(newNote)
            speak('Note written')
        elif query[0] == 'exit':
            speak('Goodbye')
            break
        else:
            # Anything else goes straight to ChatGPT
            speak(search_chatgpt(' '.join(query)))
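The script only reacts to speech that begins with the activation word, and then looks at the next word or two to pick a command. The sketch below isolates that parsing step so it can be tried on plain strings, with no microphone involved. The name `parse_utterance` and the `(command, arguments)` return shape are assumptions made for illustration, not part of the script itself.

```python
from typing import List, Optional, Tuple

ACTIVATION_WORD = "execute"  # same activation word as the script above


def parse_utterance(
    text: str, activation_word: str = ACTIVATION_WORD
) -> Optional[Tuple[str, List[str]]]:
    """Split recognised speech into (command, arguments).

    Returns None when the activation word is missing or nothing follows it.
    """
    words = text.lower().split()
    if not words or words[0] != activation_word:
        return None
    words = words[1:]
    if not words:
        return None
    # "go to" is the only two-word command, so check it first.
    if words[:2] == ["go", "to"]:
        return "go to", words[2:]
    return words[0], words[1:]


if __name__ == "__main__":
    print(parse_utterance("Execute say hello there"))
    print(parse_utterance("say hello there"))
```

Keeping the parsing separate from the speech and API calls makes it easy to add a new command and check it with ordinary strings first.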