Comparable Text Corpus | Official Website of Linguistic Data Consortium for Indian Languages

Comparable Text Corpus

Status of Comparable Text Corpora:

Sl. No.   Language pair         No. of words (English - paired language)
1.        English - Bengali     126828 - 93952
2.        English - Dogri       88025 - 93293
3.        English - Hindi       1814273 - 1802435
4.        English - Kannada     779258 - 476855
5.        English - Maithili    159419 - 136421
6.        English - Nepali      263256 - 202157

Sample files