Speech corpora | Official Website of Linguistic Data Consortium for Indian Languages (LDC-IL)

Speech corpora

Size of Speech Corpora (as of July 2014)

Sl. No.  Language                          Duration (hh:mm:ss)  Speakers  Size (GB)
1        ASSAMESE                          54:21:12             304       32.5
2        BENGALI                           128:46:59            476       81.2
3        BODO                              176:53:28            456       113
4        CHHATTISGARHI                     138:09:27            140       88.9
5        DOGRI                             17:10:26             61        11
6        GUJARATI                          57:17:08             204       37
7        GUJARATI (Mono recordings)        64:44:02             233       7.1
8        HINDI                             121:00:06            488       76.6
9        KANNADA                           179:32:52            656       115
10       KASHMIRI                          28:10:07             150       18
11       KONKANI                           156:37:51            504       100
12       MAITHILI                          78:45:33             306       49.2
13       MALAYALAM                         164:01:02            458       105
14       MANIPURI                          156:28:32            620       100
15       MARATHI                           89:17:25             307       58
16       NEPALI                            87:14:44             350       56.5
17       ODIA                              138:06:18            474       89
18       PUNJABI                           101:09:28            467       65.5
19       TAMIL                             139:11:41            452       86
20       TELUGU                            22:43:59             80        15
21       URDU                              99:18:21             499       64.2
22       Indian English - Bengali Variant  25:47:11             53        15.5
23       Indian English - Kannada Variant  23:43:04             56        15.3
24       Multilingual                      97:43:54             1916      62.2
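
The durations in the table are given as hh:mm:ss with hour counts well above 24, so generic clock-time parsers may reject them. The following is a minimal sketch, not part of the LDC-IL site, of one way to convert such a value to seconds and derive a storage rate in GB per hour. The helper name `duration_to_seconds` and the choice of three sample rows are assumptions made for illustration; the figures themselves are copied from the table.

```python
def duration_to_seconds(text):
    """Convert an hh:mm:ss string to seconds; the hour field may exceed 24."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Three rows copied from the table: (language, duration, speakers, size in GB).
# The selection is arbitrary and only illustrates the calculation.
rows = [
    ("DOGRI", "17:10:26", 61, 11.0),
    ("TELUGU", "22:43:59", 80, 15.0),
    ("KASHMIRI", "28:10:07", 150, 18.0),
]

for language, duration, speakers, size_gb in rows:
    hours = duration_to_seconds(duration) / 3600
    # GB per recorded hour gives a rough view of the recording format's bitrate.
    print(f"{language}: {hours:.2f} h, {size_gb / hours:.2f} GB per hour")
```

Splitting on the colon and doing the arithmetic by hand avoids relying on time-of-day parsers, which typically expect the hour field to stay below 24.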