Arunava Kar | The Benchmarking Conference | LDC-IL

Developing a Benchmarking Tool for Indian Sentiment Analysis Models

Arunava Kar

Research Scholar
Centre for Applied Linguistics and Translation Studies
University of Hyderabad


Abstract

Sentiment analysis has emerged as one of the most important tasks and use-cases of machine learning. It is used by businesses to gauge the prospects and acceptance of their products, as well as by researchers and strategists who try to fathom the direction and depth of online interactions. It is often a core part of statistical survey projects, business decisions, and many upstream tasks such as hate-speech detection (Santosh and Aravind, 2019). By leveraging the power of state-of-the-art architectures like the transformer, sentiment analysis models have become both efficient and accurate. It is now possible to build multilingual sentiment analysis systems using pre-trained multilingual transformer models such as XLM-RoBERTa (Liu et al., 2019).

However, Indian internet users are predominantly multilingual, as is evident from their online interactions. Hence, sentiment analysis models built for Indian languages should be able to understand and analyse code-mixed content (Choudhary et al., 2023). Furthermore, Indian languages are written in many different scripts, which poses an additional challenge, as models must recognize the script in order to understand the content (Bhargava et al., 2016). Therefore, an evaluation and benchmarking tool for sentiment analysis models developed specifically for Indian languages needs to incorporate the very high linguistic diversity that is characteristic of the online interactions of Indian users.

Here, I propose a novel benchmarking tool that evaluates the accuracy of sentiment analysis models developed for Indian languages. I have developed benchmarking datasets comprising users' online comments in different Indian languages. At present, datasets have been created for Bangla and Hindi only. Each dataset is divided into three sentiment classes: positive, negative, and neutral. I have specifically included challenging cases such as code-mixed sentences, sentences that mix two different scripts, incomplete sentences, emojis, and sarcasm. An API has been developed on top of these datasets that can evaluate a sentiment analysis model and report its accuracy. I have evaluated four different sentiment analysis models created for Bangla and Hindi. This paper describes the development of the datasets as well as the API and presents the results. I hope it will be a valuable tool for the evaluation and benchmarking of sentiment analysis models.
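To make the evaluation procedure concrete, the following is a minimal sketch of how such an accuracy-reporting API could work: it iterates over labelled examples and compares each model prediction with the gold label. All names here (`evaluate`, `toy_model`, the example sentences) are illustrative assumptions, not the actual API described in the paper.

```python
from typing import Callable, List, Tuple

# The three sentiment classes used by the benchmark datasets.
LABELS = ("positive", "negative", "neutral")

def evaluate(predict: Callable[[str], str],
             dataset: List[Tuple[str, str]]) -> float:
    """Return the accuracy of `predict` on (text, gold_label) pairs."""
    correct = sum(1 for text, gold in dataset if predict(text) == gold)
    return correct / len(dataset) if dataset else 0.0

# Hypothetical mini-dataset illustrating the kinds of challenging,
# code-mixed inputs the benchmark includes.
toy_dataset = [
    ("This phone is khub bhalo!", "positive"),   # Bangla-English code-mixing
    ("Service ekdum bakwas tha", "negative"),    # Hindi-English code-mixing
    ("Parcel kal deliver hoga", "neutral"),
]

def toy_model(text: str) -> str:
    """A trivial keyword-based stand-in for a real sentiment model."""
    lowered = text.lower()
    if "bhalo" in lowered or "good" in lowered:
        return "positive"
    if "bakwas" in lowered or "bad" in lowered:
        return "negative"
    return "neutral"

print(evaluate(toy_model, toy_dataset))  # → 1.0 on this toy set
```

In practice, the `predict` callable would wrap a trained transformer model, and per-class metrics (e.g. accuracy on code-mixed versus single-script subsets) could be reported alongside overall accuracy.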