Konkani NLP Workshop Report
(Department of Konkani, Goa University, Taleigao, Goa)
10th - 20th January, 2012

The Linguistic Data Consortium for Indian Languages (LDC-IL), Central Institute of Indian Languages (CIIL), Mysore, in collaboration with the Department of Konkani, Goa University, organized a ten-day Orientation cum Training Programme on Natural Language Processing from 10th to 20th January 2012 at Goa University, Goa. The orientation covered the Konkani and Gujarati languages. The inaugural function was held in the Senate Hall of Goa University on 10th January 2012. Prof. Priyadarshini Tadkodkar, Head of the Konkani Department, gave the welcome speech, which was followed by the inaugural address by the faculty dean, Prof. R. N. Mishra, and the presidential speech by the Registrar of Goa University, Mr. Rajendra Kamat. The gist of these dignitaries' speeches was the status of Indian languages in the current scenario. In response to the questions they raised, Dr. L. Ramamoorthy, Head, LDC-IL, explained how the LDC-IL project contributes to the upliftment of Indian languages by digitizing them and creating Artificial Intelligence tools from the resulting data. The lamp-lighting ceremony was performed by the dignitaries Dr. L. Ramamoorthy, Prof. Priyadarshini Tadkodkar, Mr. Rajendra Kamat, Prof. R. N. Mishra, and Prof. Nilotpala Gandhi (Gujarati language expert). Mr. Prakash Pariekar gave the vote of thanks on behalf of Goa University.

Dr. L. Ramamoorthy, Head, LDC-IL, gave a brief sketch of the activities of LDC-IL, CIIL, Ministry of HRD, Mysore. He emphasized how much work still needs to be done on Konkani as a scheduled language. He pointed out the critical situation of the Indian languages that have yet to be brought to the next level. He stressed the importance of Machine Translation and the use of language technology in preserving endangered languages. He further stressed the role of linguistics in the development of language technology and pointed out that developing computer-analyzed, computer-mediated grammars is the objective of LDC-IL.

The programme ran for ten days, from 10th to 20th January 2012. Many illuminating and thought-provoking lectures were given by the internal faculty of LDC-IL and by language experts from Goa and Gujarat University. The topics included Phonetics, Morphology, Syntax, Semantics, Approaches to NLP, Corpus Linguistics, Encoding and Balanced Corpus, Linguistic Knowledge and Corpus Extraction, Linguistic Analysis, Local Word Grouping, POS Tagging, Speech Annotation, Spell Checker, WordNet, Syntactic Parsing, and Sign Language.

After the inaugural function, which was followed by lunch, we started the orientation at the Academic Staff College, Goa University. Shahid Bhat gave a presentation on Language and Linguistics.

On the second day, Purva Dholakia talked about the basics of morphology, followed by Shahid Bhat's presentation on Artificial Intelligence. After tea, Prof. Nilotpala Gandhi gave an extensive lecture on Gujarati morphology. After the lunch break, Prof. Nilotpala Gandhi talked about phonetics, and after the tea break that followed, Shahid Bhat gave a lecture on syntax.

The third day started with Prof. Madhavi Sardesai's lecture on Konkani morphology. After the tea break, Purva Dholakia explained what POS tagging is, which was followed by the lunch break. In the post-lunch session, Purva Dholakia presented her paper on the comparison and mapping of two tagsets for Gujarati. After the tea break, we gave the students exercises on POS tagging and morphology.

The fourth day began with an extensive lecture by Atreyee Sharma on corpus linguistics and its relevance to the field of NLP. After the tea break, Atreyee Sharma gave a lecture on corpus annotation (concepts, types and application) and threw light on the sign language work being carried out at LDC-IL. After lunch, Shahid Bhat gave a presentation on approaches to NLP. In the post-tea session, Saurabh Varik talked about the speech data collection done for Konkani.

The fifth day was scheduled for POS (tagsets and issues). Associate Prof. Sushant Naik handled the first session, in which he discussed the importance and uniqueness of the Konkani language, described the POS tagset with a Konkani example for each tag, and then discussed the issues his team had come across while tagging. After lunch, Vishal, a computer science student, gave a demo of a POS tagging tool that they had built in house. After the tea break, Purva Dholakia discussed the Morphological Analyzer, with a demonstration of Apertium and suffix-stripping tools.

On the sixth day, Shahid Bhat took charge of the first session with a lecture on chunking. After tea, Shilpa Desai, a lecturer in the computer science department who is also working on building a Morphological Analyzer for Konkani, gave a lecture on finite state automata for morphological analysis. After lunch, Shri Damodar Ghanekar delivered a lecture on lexicography. In the post-tea session, students from Gujarat and Goa University who work on various POS and morphology projects talked about their NLP work experience. We then gave them a paradigm-making exercise, which the representatives from Gujarat and Goa carried out together, drawing up a paradigm chart.

The seventh day started with a presentation by Aju Thomas on TTS: concept, development and challenges. After the tea break, Aju Thomas gave a lecture on ASR: concept, development and challenges. In the post-lunch session, Aju Thomas talked about sign language: data collection and challenges.

The eighth day was scheduled for technical sessions. Bharathraju took charge of the first session, in which he discussed OCR: concept, application, development and challenges. After the tea break, he talked about Machine Translation: concept, application, development and challenges. After lunch, Rashmi Shet gave a demonstration of corpus analysis tools. In the post-tea session, Shahid Bhat talked about syntactic parsing.

On the ninth day, Bharathraju discussed encoding (the concept of font and encoding; ASCII, ISCII and Unicode). He continued after the tea break with a discussion of spell checkers. After the lunch break, Shahid Bhat gave an extensive explanation of treebanking and then gave the students a treebanking exercise.

The tenth day was devoted to the speech section. Saurabh Varik, who had talked about speech data collection earlier in the programme, took charge of the orientation that day and started with speech segmentation. In between, Rashmi Shet gave a demo of a segmentation tool. Saurabh Varik then explained speech annotation, its process and its issues, at length. In the post-lunch session, we held an examination on the material taught, to check how much the students had grasped. The students performed well in the exam. The workshop concluded with a feedback session by the students. Later we had the valedictory talk and certificate distribution by Prof. Priyadarshini Tadkodkar. She congratulated everyone on the success of the workshop and proposed a refresher course on NLP in the future. Finally, Purva, Shahid and Saurabh gave the vote of thanks on behalf of LDC-IL. The students took part with enthusiasm and appreciated the aim behind the orientation, which marked its success.

