Build a Sentiment Analysis system in 3 lines of code

Thejas Kiran
2 min read · Jun 11, 2021

A sentiment analysis system is used to understand the nature of a phrase, i.e., the emotion behind it. We humans can do this easily, but it is much harder for machines. The main challenges machines face can be found here. Before writing our sentiment analysis model, let us look at the NLTK library, which I will be using for the rest of this post. NLTK is short for Natural Language Toolkit and is mainly used for statistical natural language processing. I use its sentence tokenizer and its built-in sentiment analyzer for prediction. The 3 lines of code below import all the required modules and methods.

from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.tokenize import sent_tokenize
import nltk

We need to download the ‘punkt’ data from NLTK because it contains the tokenizer that we are going to use. The tokenizer divides the text into a list of sentences, using a model trained with an unsupervised learning method. On recent NLTK releases, sent_tokenize may ask for the ‘punkt_tab’ resource instead; if you see a LookupError, download that one as well. We also need to download the ‘vader_lexicon’ data, which contains the list of words and the sentiment intensity (valence) scores that the analyzer relies on. VADER stands for ‘Valence Aware Dictionary and sEntiment Reasoner’.

nltk.download('punkt')
nltk.download('vader_lexicon')
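To get a feel for what a valence lexicon is, here is a toy sketch. The words and scores in TOY_LEXICON are made up for illustration and are not taken from the real vader_lexicon file, and toy_valence is a hypothetical helper, not part of NLTK. VADER itself does much more than this (it also handles negation, intensifiers, punctuation and so on), so treat this only as the basic idea of looking words up in a scored dictionary.

```python
# Toy valence lexicon: invented words and scores, NOT the real VADER data.
TOY_LEXICON = {"good": 1.9, "great": 3.1, "boring": -1.3, "terrible": -2.1}


def toy_valence(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word; unknown words contribute 0."""
    # Lowercase, split on whitespace and strip common punctuation.
    words = [w.strip(".,!?;:") for w in text.lower().split()]
    return sum(lexicon.get(w, 0.0) for w in words)


print(toy_valence("It was boring in the beginning."))  # negative total
print(toy_valence("It turns out to be very good."))    # positive total
```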

The lines of code below implement the sentiment analyzer. As promised, the sentiment analysis itself takes only three lines, but 10–12 lines are written in total for easier reading and to print our predictions in a readable manner. The comments in the code mark the three lines used for analysis. First, the sentence tokenizer breaks the text into sentences.

string = '''I have recently read a book. It was boring in the beginning. Later on it catches the user attention as it turns out to be very good. I am going to recommend this book to all readers.'''
sent = sent_tokenize(string)           #Line 1
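To see why a trained tokenizer is worth downloading, compare it with a naive splitter. The sketch below is not how punkt works; naive_sent_split is a hypothetical helper that simply splits after '.', '!' or '?' when whitespace follows. It handles simple text like ours but trips over abbreviations.

```python
import re


def naive_sent_split(text):
    """Split after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


# Works for simple text:
print(naive_sent_split("I have recently read a book. It was boring in the beginning."))
# ['I have recently read a book.', 'It was boring in the beginning.']

# Breaks on abbreviations, which is the kind of case punkt is trained to handle:
print(naive_sent_split("Dr. Smith liked the book."))
# ['Dr.', 'Smith liked the book.']
```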

We create an instance of SentimentIntensityAnalyzer and pass each sentence to it inside a for loop. For each sentence, polarity_scores returns a dictionary of scores: the proportions of the sentence rated negative (‘neg’), neutral (‘neu’), and positive (‘pos’), plus a normalized ‘compound’ score between -1 and 1 that summarizes the overall polarity.

sa = SentimentIntensityAnalyzer()      #Line 2
count = 1
for sentence in sent:
    print(sentence)
    ps = sa.polarity_scores(sentence)  #Line 3
    print('Sentence Number: ', count)
    for n in ps:
        print(n, ps[n])
    count += 1
    print('-' * 20)
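If you want a single label per sentence instead of four numbers, a common convention from the VADER documentation is to threshold the compound score at ±0.05. The helper below is a small sketch of that idea; label_from_compound is a hypothetical name, the threshold is a default you may want to tune, and the example dictionary uses made-up values rather than real analyzer output.

```python
def label_from_compound(scores, threshold=0.05):
    """Map a polarity_scores-style dict to 'positive', 'negative' or 'neutral'."""
    compound = scores["compound"]
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Made-up scores in the same shape that polarity_scores returns.
example = {"neg": 0.0, "neu": 0.6, "pos": 0.4, "compound": 0.49}
print(label_from_compound(example))  # positive
```

Inside the loop above, you could call label_from_compound(ps) right after computing ps.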

That’s it, guys. We have successfully built a sentiment analysis model in 3 lines of code. You can find the Python file containing the whole code here. Keep learning :)
