3 idiots’ Project

Preface :

We are the 3 idiots team (Sakthi Dasan, Santosh Kumar, Sumit Kumar), who came together for our final-year project. In this post I document the project at a high level: an abstract view of the whole work and a description of our approach to the problems.

Quick look at the project :

We tried to build an intelligent system, titled Emotelligence on Text, that recognizes human emotion from textual content, i.e. if you give it an input string, the system tries to tell you the emotion behind that text.

Emotelligence on Text

How we approached the problem

  • Step 1 : We first determined which emotions we are interested in.
  • Step 2 : What type of input we are going to give.
  • Step 3 : How we are going to find the emotion.

Step 1 : We first determined which emotions we are interested in.

It is better to have a clear understanding of what we are going to find. Our investigation and research suggested that a human can express 16 types of emotion with the help of body gesture and speech. Since we intended to find emotion from textual content, we reduced our scope to the 8 basic emotions that are commonly seen in written language. (8 basic emotions: Joy, Trust, Fear, Surprise, Sadness, Disgust, Anger, Anticipation)

Another strong reason for selecting these 8 basic emotions is that, if we wish to find all 16 emotions in the future, these 8 will act as the base for finding the remaining 8 advanced emotions, because each advanced emotion is composed of 2 basic emotions.

 Eg :    Basic Emotion + Basic Emotion = Advanced Emotion

          Joy   +    Trust   =    Love

So selecting these 8 basic emotions was the optimal choice for us.
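The "basic + basic = advanced" idea can be sketched as a simple lookup table. Only Joy + Trust = Love comes from this post; the other seven pairs below are the primary dyads of Plutchik's wheel of emotions, which is an assumption about the model we followed, and the names `ADVANCED_EMOTIONS` and `combine` are made up for illustration.

```python
# The 8 basic emotions listed in the post.
BASIC_EMOTIONS = ["joy", "trust", "fear", "surprise",
                  "sadness", "disgust", "anger", "anticipation"]

# Pair of basic emotions -> advanced emotion.
# Only joy + trust = love is stated in the post; the remaining entries are
# Plutchik's primary dyads and are an ASSUMPTION about the intended model.
ADVANCED_EMOTIONS = {
    frozenset(["joy", "trust"]): "love",
    frozenset(["trust", "fear"]): "submission",
    frozenset(["fear", "surprise"]): "awe",
    frozenset(["surprise", "sadness"]): "disapproval",
    frozenset(["sadness", "disgust"]): "remorse",
    frozenset(["disgust", "anger"]): "contempt",
    frozenset(["anger", "anticipation"]): "aggressiveness",
    frozenset(["anticipation", "joy"]): "optimism",
}


def combine(first, second):
    """Return the advanced emotion for two basic emotions, or None."""
    return ADVANCED_EMOTIONS.get(frozenset([first.lower(), second.lower()]))


print(combine("Joy", "Trust"))  # love
```

Using a `frozenset` as the key makes the lookup independent of the order in which the two basic emotions are given.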

Step 2 : What type of input we are going to give.

It is clear that our input is going to be text, but text could be in any language. We decided to go with English; the only reason is that we had to finish our project in a short span of time. Considering other languages would have consumed more time in understanding the language structure, and it is difficult to apply NLP techniques to an unfamiliar language. (Other details are covered in step 3.)

Step 3 : How we are going to find the emotion.

We knew what to find and we knew what the input was. While thinking about how to find it, an idea bulb glowed over my head: use a classifier. After reading some of the literature, we learned that most experts use a classifier for similar kinds of problems. The idea is to treat the 8 emotions as 8 different classes for the classifier, train the classifier with good training sets and then move on to testing. The result of the classifier will point to a class, which is nothing but the expected emotion.
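To make the "emotions as classes" idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not our actual implementation (which this post does not describe); the class name `NaiveBayes`, the toy sentences and the two-class setup are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        self.vocabulary = set()
        for words, label in zip(documents, labels):
            self.word_counts[label].update(words)
            self.vocabulary.update(words)
        return self

    def predict(self, words):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(doc_count / total_docs)
            for word in words:
                if word not in self.vocabulary:
                    continue  # ignore words never seen in training
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example (two classes instead of 8).
train_docs = [
    ["brother", "was", "happy", "passing", "examination"],
    ["won", "prize", "happy", "celebrate"],
    ["lost", "wallet", "cried", "alone"],
    ["friend", "left", "cried", "miss"],
]
train_labels = ["joy", "joy", "sadness", "sadness"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["sister", "was", "happy"]))  # joy
print(model.predict(["cried", "alone"]))          # sadness
```

With all 8 emotions the code stays the same; only the set of labels in the training data grows.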

Training phase : A proper data set should be collected, and the inputs have to be sent to the training phase of the classifier. The training phase includes two modules: (I) Keyword extraction (II) Keyword conversion.

Keyword extraction : Unlike simpler classification problems, using the data set directly is not useful for us. We need to identify the key terms in the input data set that are useful for the classifier. Nouns, verbs, adverbs and adjectives are the useful key terms for finding emotion in text. To find them we applied a POS tagger (part-of-speech tagging is the process of assigning a part of speech, such as noun, verb, pronoun, preposition, adverb, adjective or another lexical class marker, to each word in a sentence), and the words extracted this way are the key terms we want.

Eg : 

Data Set : My brother was happy after passing the examination.

POS Tagging : My/PRP$ brother/NN was/VBD happy/JJ after/IN passing/VBG the/DT examination/NN ./.

Keywords extracted : brother was happy passing examination
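The filtering step in this example can be sketched as follows. The sketch assumes the sentence has already been tagged in the `word/TAG` format shown above with Penn Treebank style tags; the function name `extract_keywords` is hypothetical and the tagger itself is not included.

```python
# Tag prefixes kept as keywords: noun, verb, adverb, adjective.
KEYWORD_TAG_PREFIXES = ("NN", "VB", "RB", "JJ")


def extract_keywords(tagged_sentence):
    """Keep only nouns, verbs, adverbs and adjectives from 'word/TAG' text."""
    keywords = []
    for token in tagged_sentence.split():
        # Split on the last '/', so a token such as './.' is handled correctly.
        word, separator, tag = token.rpartition("/")
        if not separator:
            continue  # token without a tag; skip it
        if tag.startswith(KEYWORD_TAG_PREFIXES):
            keywords.append(word.lower())
    return keywords


tagged = ("My/PRP$ brother/NN was/VBD happy/JJ after/IN "
          "passing/VBG the/DT examination/NN ./.")
print(" ".join(extract_keywords(tagged)))
# brother was happy passing examination
```

Matching on tag prefixes means variants such as VBD, VBG or NNS are kept without listing every tag.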

Keyword conversion : We implemented our own keyword conversion logic, which converts the extracted keywords into a numeric format accepted by our own implementation of the classifier (NB classifier).

Eg :

 Keywords extracted : brother was happy passing examination

( will be converted into something like )

Keyword Conversion : 3#  2:1   4:1   5:2   7:1 ……..

(As this post covers only an abstract view of the project and our approach to the problem, I have explained neither the details of our NB classifier implementation nor the reasoning behind the keyword conversion format.)
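For readers who want something concrete, one plausible reading of such a format is a class id followed by sparse `index:count` pairs. The sketch below is only that reading, an assumption and not our actual conversion logic; the function `encode`, the vocabulary numbering and the class id are all hypothetical, so the indices it produces do not match the example above.

```python
def encode(keywords, class_id, vocabulary):
    """Encode keywords as '<class_id># <index>:<count> ...' (assumed format)."""
    counts = {}
    for word in keywords:
        # Give each unseen word the next free index, starting at 1.
        index = vocabulary.setdefault(word, len(vocabulary) + 1)
        counts[index] = counts.get(index, 0) + 1
    features = " ".join(f"{index}:{count}" for index, count in sorted(counts.items()))
    return f"{class_id}# {features}"


vocabulary = {}  # shared across sentences so that indices stay consistent
print(encode(["brother", "was", "happy", "passing", "examination"], 1, vocabulary))
# 1# 1:1 2:1 3:1 4:1 5:1
print(encode(["happy", "happy", "brother"], 1, vocabulary))
# 1# 1:1 3:2
```

Sharing one vocabulary between training and testing is what keeps a given index pointing at the same word in both phases.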

Training phase - Classifier

Data set collection :

We were inventive in data set collection too. A good, proper data set had to be collected. The first questions that came to mind were how to find data related to emotion and where to find it. We focused on statements (English sentences that talk about emotion) and started hunting on different blog sites, searching for English quotes, short poems and so on. Then we moved our search to social sites like Twitter and Facebook to hunt for emotional messages shared among friends. We also collected news headlines and SMS messages, as they too stir emotion in us when we read them. In short, data set collection was tough, and we enjoyed it as well.

Testing phase : In the testing phase, keyword extraction and keyword conversion also take place; the testing set is then passed to the prediction part of the classifier to predict the class. We tested a small data set to measure the accuracy of the system, and the list below shows our accuracy results.

Testing Phase - Classifier

Accuracy results of our model

  • Number of corpus entries we used for training : 1800
  • Number of corpus entries we used for testing : 200
  • Overall accuracy of the model : 71 %
  • Highest individual class accuracy : 96 %  for joy
  • Lowest individual class accuracy :  2 %  for  surprise
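Overall and per-class accuracy figures like these can be computed as in the sketch below. The labels are a tiny invented sample, not our test set, and `accuracy_report` is a hypothetical helper; per-class accuracy here means the share of test items of a class that were predicted correctly.

```python
from collections import Counter


def accuracy_report(true_labels, predicted_labels):
    """Return (overall accuracy, {class: accuracy on items of that class})."""
    total = Counter(true_labels)
    correct = Counter(
        true for true, predicted in zip(true_labels, predicted_labels)
        if true == predicted
    )
    overall = sum(correct.values()) / len(true_labels)
    per_class = {label: correct[label] / count for label, count in total.items()}
    return overall, per_class


# Invented toy labels, for illustration only.
true_labels = ["joy", "joy", "joy", "surprise", "surprise", "anger"]
predicted_labels = ["joy", "joy", "joy", "joy", "surprise", "anger"]

overall, per_class = accuracy_report(true_labels, predicted_labels)
print(f"overall: {overall:.0%}")  # overall: 83%
for label, value in per_class.items():
    print(f"{label}: {value:.0%}")
```

A large gap between classes, such as the one between joy and surprise above, usually shows up only in the per-class numbers, which is why both are worth reporting.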

My Acknowledgements

There are two teachers to whom I would like to express my sincere gratitude. The first is our senior professor, Arivazhagan N, who as project guide was passionate and a source of encouragement throughout the work.

The second is Sudarsun Santhiappan (visiting professor at SRM University), who taught me mining and NLP techniques; his class lectures and assignments really helped a lot in carrying out this project.

I also wish to thank my project mates Santosh Kumar and Sumit Kumar, who really coordinated their activities throughout the project. Their contribution to the work cannot be expressed in words alone.


1 Comment

  1. Shivy wrote
    at 1:51 PM - 23rd October 2015 Permalink

    Sir ,can u plss help us by providing the dataset for this project .
