Build your own Voice Assistant in Python

Thejas Kiran
4 min read · Apr 25, 2021

Voice bots are AI-powered software programs that converse with humans. They understand natural language and synthesize speech for this interaction. Without getting into the technicalities of how this works, let me walk you through my Python code, which acts as a personal Voice Assistant.

First, let’s walk through the libraries used to code our Voice Assistant.

  1. SpeechRecognition Library
    This library is used to recognize human speech and convert it into text so that our assistant can reply accordingly. The link to download this library is given below.
    https://pypi.org/project/SpeechRecognition/
  2. pyttsx3 Library
    Once the assistant has understood the user, this library speaks the reply out loud. The main advantage of this library is that it works offline. The link to download this library is given below.
    https://pypi.org/project/pyttsx3/
  3. neuralintents Library
    This is a new library released on March 10, 2021 (version 0.0.3). It provides all the functionality we need and is very easy to use. This library is used to build simple chatbots and assistant interfaces. The link to download this library is given below.
    https://pypi.org/project/neuralintents/
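All three libraries can be installed from PyPI with pip. Note that capturing microphone input through SpeechRecognition usually also requires PyAudio; treating that as an assumption about your setup, a typical install looks like:

```shell
# Install the three libraries used in this article
pip install SpeechRecognition pyttsx3 neuralintents

# Microphone access via speech_recognition typically needs PyAudio as well
pip install PyAudio
```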

Even before we begin to code, we need an “intents.json” file that stores the replies for a set of user inputs. The Voice Assistant accesses this JSON file and responds accordingly.

{"intents": [
  {
    "tag": "greeting",
    "patterns": ["Hey", "Hello", "Hi", "What's up?", "Good Day"],
    "responses": ["Hello there!", "Hello, what can I do for you?"]
  },
  {
    "tag": "create_note",
    "patterns": ["New note", "Create a note"],
    "responses": [""]
  },
  {
    "tag": "add_todo",
    "patterns": ["New item", "Add an item"],
    "responses": [""]
  },
  {
    "tag": "show_todos",
    "patterns": ["Show my todos", "What is on my list"],
    "responses": [""]
  },
  {
    "tag": "exit",
    "patterns": ["Bye", "See you", "Quit", "Exit"],
    "responses": ["Thank you for spending time with me."]
  }
]}
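Since a single stray comma makes the whole file unreadable to the assistant, it is worth sanity-checking the JSON before training. A minimal sketch, which embeds a trimmed copy of the intents inline so it runs stand-alone (in practice you would `json.load` the real intents.json file):

```python
import json

# A trimmed inline copy of the intents file, so this snippet runs on its own
intents_json = """
{"intents": [
    {"tag": "greeting",
     "patterns": ["Hey", "Hello", "Hi"],
     "responses": ["Hello there!", "Hello, what can I do for you?"]},
    {"tag": "exit",
     "patterns": ["Bye", "See you", "Quit", "Exit"],
     "responses": ["Thank you for spending time with me."]}
]}
"""

# json.loads raises an error here if the JSON is malformed (e.g. a trailing comma)
data = json.loads(intents_json)

# Every intent must carry a tag, some patterns, and a responses list
for intent in data["intents"]:
    assert {"tag", "patterns", "responses"} <= intent.keys()

print([intent["tag"] for intent in data["intents"]])  # → ['greeting', 'exit']
```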

Let’s start coding by importing all the required libraries.

import speech_recognition
import pyttsx3 as tts
from neuralintents import GenericAssistant
import sys

After importing all the required modules, we create an instance of the speaker and the recognizer so that the assistant can capture what we say and convert it into text; the remaining code is explained by comments within the program. A list named “todo_list” is created to hold the items that the assistant maintains for us.

recognizer = speech_recognition.Recognizer()
speaker = tts.init()
speaker.setProperty('rate', 150) #rate is the property, 150 is the value
#Creating a list to hold the todo items
todo_list = ['Go Shopping', 'Clean Room']

Now, let’s begin coding functions for each of the required tasks. The below code snippet shows how the assistant reacts to a greeting.

#Greeting the user
def greeting():
    speaker.say("Hello, What can I do for you?")
    speaker.runAndWait()

We code the below function so the program can create notes based on user input and save them to a file.

#Function to create and add a new note
def create_note():
    global recognizer #Making the variable global
    speaker.say("What do you want to write as a note?")
    speaker.runAndWait() #Asking for user input
    done = True
    #The try block is used in case the microphone fails
    while done:
        try:
            with speech_recognition.Microphone() as mic:
                recognizer.adjust_for_ambient_noise(mic, duration = 0.2)
                #Accepting user voice input
                audio = recognizer.listen(mic)
                note = recognizer.recognize_google(audio)
                note = note.lower()
                speaker.say("Choose a filename!")
                speaker.runAndWait()
                recognizer.adjust_for_ambient_noise(mic, duration = 0.2)
                #Accepting the filename from the user
                audio = recognizer.listen(mic)
                filename = recognizer.recognize_google(audio)
                filename = filename.lower()
            with open(filename + '.txt', 'w') as f:
                f.write(note)
            done = False
            #Terminating the while loop once the input is understood
            speaker.say("New note successfully created")
            speaker.runAndWait()
        except speech_recognition.UnknownValueError:
            recognizer = speech_recognition.Recognizer()
            speaker.say("I did not understand you. Please try again!")
            speaker.runAndWait()

The to-do list is a must in every Voice Assistant, as it helps us remember the tasks or activities that we have to complete. The below code shows how such a to-do list can be read out to the user. A function for adding new items to the list has also been created.

#Speaking out the list
def show_todo():
    speaker.say("Your list contains the following elements")
    for item in todo_list:
        speaker.say(item)
    speaker.runAndWait()

#Adding elements to the todo list
def add_todo():
    global recognizer
    speaker.say("What item do you want to add?")
    speaker.runAndWait()
    done = True
    while done:
        try:
            with speech_recognition.Microphone() as mic:
                recognizer.adjust_for_ambient_noise(mic, duration = 0.3)
                audio = recognizer.listen(mic)
                item = recognizer.recognize_google(audio)
                item = item.lower()
                todo_list.append(item)
                done = False
                speaker.say(item + " was added to the list!")
                speaker.runAndWait()
        except speech_recognition.UnknownValueError:
            recognizer = speech_recognition.Recognizer()
            speaker.say("I'm sorry, can you repeat that again!")
            speaker.runAndWait()

We now create an exit function to give a send-off to the user :P

#Exiting from your assistant
def close():
    speaker.say("Bye. Coming back soon!")
    speaker.runAndWait()
    sys.exit(0)

After defining all the functions, we have to map each of the functions to one of the tags in the JSON file. The syntax for that is given below.

mappings = {
    "greeting": greeting,
    "create_note": create_note,
    "add_todo": add_todo,
    "show_todos": show_todo,
    "exit": close
}

Note: In the above code, remember to map the function names and not call the functions. The program automatically calls the mapped function based on the recognized intent.
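The distinction matters in plain Python too: storing `greeting` (a reference) lets the dispatcher decide when to call it, whereas `greeting()` would run immediately and store only its return value. A minimal, library-free sketch of the same dispatch pattern, using hypothetical stand-in handlers:

```python
# Hypothetical stand-ins for the assistant's intent handlers
def greet():
    return "Hello there!"

def farewell():
    return "Thank you for spending time with me."

# Map intent tags to the functions themselves -- no parentheses, no call yet
handlers = {"greeting": greet, "exit": farewell}

def dispatch(tag):
    # Look the function up by its tag and only now call it
    return handlers[tag]()

print(dispatch("greeting"))  # → Hello there!
print(dispatch("exit"))      # → Thank you for spending time with me.
```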

The last part of the program is to train a model to recognize the intents. This is done using the GenericAssistant class from the neuralintents module.

#Training a model to recognize the intents
assistant = GenericAssistant('intents.json', intent_methods=mappings)
assistant.train_model()

We have now built a Voice Assistant that works on user intents. We just need a code snippet that listens to the user continuously. The below code shows how our assistant keeps listening for user input.

while True:
    try:
        with speech_recognition.Microphone() as mic:
            recognizer.adjust_for_ambient_noise(mic, duration = 0.2)
            audio = recognizer.listen(mic)
            message = recognizer.recognize_google(audio)
            message = message.lower()
        assistant.request(message)
    except speech_recognition.UnknownValueError:
        recognizer = speech_recognition.Recognizer()

The whole code is present in my GitHub repository. Please check the file if required.
https://github.com/ThejasBK/Python-Projects/blob/master/virtualAssistant.py
