Central Institute of Indian Languages [CIIL]
MISSION STATEMENT: Annotated, quality language data (both text & speech) and tools in Indian Languages for Individuals, Institutions and Industry for Research & Development, created in-house, through outsourcing and through acquisition.
Comparable Text Corpora

Status of Comparable Text Corpora:

Sl. No.   Language Pair        No. of Words (in the order of the pair)
1.        English - Bengali    126828 - 93952
2.        English - Dogri      88025 - 93293
3.        English - Hindi      1814273 - 1802435
4.        English - Kannada    779258 - 476855
5.        English - Maithili   159419 - 136421
6.        English - Nepali     263256 - 202157
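
For readers who want to work with these figures, the following is a minimal sketch that totals the word counts and computes, for each pair, the size of the second corpus relative to the English one. It assumes that the first figure in each row is the English word count and the second is the count for the paired language; the page does not state this explicitly. The names `CORPORA`, `summarize` and `totals` are hypothetical and are not part of any LDC-IL tool.

```python
# Word counts copied from the status table on this page.
# Assumption: each tuple is (English words, paired-language words).
CORPORA = {
    "Bengali": (126828, 93952),
    "Dogri": (88025, 93293),
    "Hindi": (1814273, 1802435),
    "Kannada": (779258, 476855),
    "Maithili": (159419, 136421),
    "Nepali": (263256, 202157),
}


def summarize(corpora):
    """Return (language, english_words, other_words, ratio) for each pair.

    ratio is other_words / english_words, so a value above 1.0 means the
    paired-language side is larger than the English side.
    """
    rows = []
    for language, (english_words, other_words) in corpora.items():
        ratio = other_words / english_words
        rows.append((language, english_words, other_words, ratio))
    return rows


def totals(corpora):
    """Return the summed word counts as (english_total, other_total)."""
    english_total = sum(english for english, _ in corpora.values())
    other_total = sum(other for _, other in corpora.values())
    return english_total, other_total


if __name__ == "__main__":
    for language, english_words, other_words, ratio in summarize(CORPORA):
        print(f"English - {language}: {english_words} / {other_words} "
              f"(ratio {ratio:.2f})")
    english_total, other_total = totals(CORPORA)
    print(f"Total: {english_total} English words, {other_total} other words")
```

Under the stated assumption, Dogri is the only pair in which the non-English side is larger than the English side.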



Developed & Maintained by:
LDC-IL, CIIL
Copyright © LDC-IL,
Central Institute of Indian Languages
Central Institute of Indian Languages
Department of Higher Education
Ministry of Education
Government of India
Manasagangothri, Hunsur Road, Mysore-570006, Karnataka, India.
Tel: (0821) 2515820 (Director)
Reception/PABX : (0821) 2345000
Fax: (0821) 2515032 (Off)