Bengali

Corpora

To discover the available Bengali corpora, use CorpusImporter() or browse the CLTK GitHub organization for repositories whose names begin with bengali_.

In [1]: from cltk.corpus.utils.importer import CorpusImporter

In [2]: c = CorpusImporter('bengali')

In [3]: c.list_corpora
Out[3]:
['bengali_text_wikisource']
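
Once a corpus name appears in the list, it can be downloaded with the importer's import_corpus() method. The following is a minimal sketch, not part of the original page: the helper name import_if_listed is hypothetical, and it assumes the importer object exposes list_corpora and import_corpus() as in the session above. In this CLTK API the download normally lands under ~/cltk_data in the user's home directory, though that location should be checked against your installed version.

```python
def import_if_listed(importer, corpus_name):
    """Download corpus_name only if the importer advertises it.

    `importer` is assumed to behave like cltk's CorpusImporter:
    a `list_corpora` attribute and an `import_corpus(name)` method.
    Returns True if a download was requested, False otherwise.
    """
    if corpus_name not in importer.list_corpora:
        # Unknown name: skip the download instead of raising.
        return False
    importer.import_corpus(corpus_name)
    return True


# Example usage (requires cltk to be installed and network access):
#
#   from cltk.corpus.utils.importer import CorpusImporter
#   c = CorpusImporter('bengali')
#   import_if_listed(c, 'bengali_text_wikisource')
```

Checking the list first keeps a mistyped corpus name from triggering an error in the middle of a longer setup script.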