Alvyn Abranches | The Benchmarking Conference | LDC-IL

KonkABS: Evaluating Low-Resource Benchmark for Aspect-based Summarization in Konkani using Translated OASum Dataset with LLMs

Alvyn Abranches

University of Goa


Authors: Alvyn Abranches, Pradnya Bhagat, Pratik D. Korkankar, Jyoti D. Pawar

Abstract

Aspect-based Summarization (ABS) aims to generate concise summaries of textual content by focusing on specific aspects or topics. While extensive research has been conducted on widely used datasets in high-resource languages, low-resource languages like Konkani remain underexplored. This paper introduces a benchmark for aspect-based summarization using a Konkani-translated version of the OASum dataset, a prominent dataset designed for ABS tasks. We describe the methodology for translating and adapting the dataset to ensure linguistic and contextual alignment with the characteristics of the Konkani language. Further, we evaluate the performance of various state-of-the-art ABS models on this dataset, analysing their effectiveness in handling the unique linguistic nuances of Konkani. Our experiments provide insights into the challenges of ABS in low-resource languages, highlighting areas for improvement and laying a foundation for future research. This benchmark aims to bridge the gap in ABS research for under-represented languages and promote advancements in multilingual natural language processing.

Keywords: Aspect-based Summarization, Konkani, Low-Resource NLP, OASum Dataset, Benchmarking, Computational Linguistics, Large Language Models (LLMs)