8. cltk package

Init module for importing the CLTK class.

8.1. Subpackages

8.2. Submodules

8.3. cltk.nlp module

Primary module for CLTK pipeline.

class cltk.nlp.NLP(language, custom_pipeline=None, suppress_banner=False)[source]

Bases: object

NLP class for default processing.

process_objects: dict[Type[cltk.core.data_types.Process], cltk.core.data_types.Process] = {}

process_lock = <unlocked _thread.lock object>

_print_cltk_info()[source]

Print a message to the screen about citing CLTK.

Return type:

None

_print_pipelines_for_current_lang()[source]

Print to screen the ``Process`` objects that are run when ``NLP()`` is invoked.

Return type:

None

_print_special_authorship_messages_for_current_lang()[source]

Print to screen the authors of particular algorithms.

Return type:

None

_print_suppress_reminder()[source]

Tell users how to suppress printed messages.

Return type:

None

_get_process_object(process_object)[source]

Return an instance of a process from a memoized cache. If the process has not been instantiated yet, it is created and stored in the cache before being returned.

TODO: Figure out typing in this.

Return type:

Process
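The memoization described above can be illustrated with a minimal, self-contained sketch. It does not reproduce CLTK's code: ``Process``, ``TokenizeProcess``, and ``ProcessCache`` are hypothetical stand-ins, and the use of the lock around the cache lookup is an assumption based only on the ``process_objects`` and ``process_lock`` attributes listed for the class.

```python
import threading
from typing import Dict, Type


class Process:
    """Hypothetical, simplified stand-in for cltk.core.data_types.Process."""


class TokenizeProcess(Process):
    """Hypothetical process subclass, used only for this illustration."""


class ProcessCache:
    # Class-level cache and lock, mirroring the attribute names listed for NLP.
    process_objects: Dict[Type[Process], Process] = {}
    process_lock = threading.Lock()

    def get_process_object(self, process_class: Type[Process]) -> Process:
        """Return the cached instance for ``process_class``, creating it once."""
        # Assumption: the lock guards both the lookup and the insertion.
        with self.process_lock:
            instance = self.process_objects.get(process_class)
            if instance is None:
                instance = process_class()
                self.process_objects[process_class] = instance
            return instance


cache = ProcessCache()
first = cache.get_process_object(TokenizeProcess)
second = cache.get_process_object(TokenizeProcess)
print(first is second)  # True
```

Because the cache is a class attribute in this sketch, every ``ProcessCache`` instance shares the same stored process objects.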

analyze(text)[source]

The primary method for the NLP object, to which raw text strings are passed.

Parameters:

text (str) – Input text string.

Return type:

Doc

Returns:

CLTK Doc containing all processed information.

>>> from cltk.languages.example_texts import get_example_text
>>> from cltk.core.data_types import Doc
>>> cltk_nlp = NLP(language="lat", suppress_banner=True)
>>> cltk_doc = cltk_nlp.analyze(text=get_example_text("lat"))
>>> isinstance(cltk_doc, Doc)
True
>>> cltk_doc.words[0].string
'Gallia'
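As a rough illustration of the idea of passing raw text through a series of processes that each enrich one document object, here is a minimal sketch. It is not CLTK's implementation: ``SketchDoc``, ``tokenize``, ``lowercase``, and this ``analyze`` function are hypothetical, and the assumption that processes run one after another over a shared document is made only for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SketchDoc:
    """Hypothetical, simplified stand-in for a CLTK Doc."""

    raw: str
    tokens: List[str] = field(default_factory=list)
    lowered: List[str] = field(default_factory=list)


def tokenize(doc: SketchDoc) -> SketchDoc:
    # Naive whitespace tokenization, for illustration only.
    doc.tokens = doc.raw.split()
    return doc


def lowercase(doc: SketchDoc) -> SketchDoc:
    # Depends on the tokens produced by the previous step.
    doc.lowered = [token.lower() for token in doc.tokens]
    return doc


def analyze(text: str, processes: List[Callable[[SketchDoc], SketchDoc]]) -> SketchDoc:
    """Wrap the raw text in a document and apply each process in order."""
    doc = SketchDoc(raw=text)
    for process in processes:
        doc = process(doc)
    return doc


doc = analyze("Gallia est omnis divisa", [tokenize, lowercase])
print(doc.tokens[0])   # Gallia
print(doc.lowered[0])  # gallia
```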

_get_pipeline()[source]

Select the appropriate pipeline for the given language. If custom processing is requested, ensure that the user-selected choices are valid, both individually and in combination.

>>> from cltk.core.data_types import Pipeline
>>> cltk_nlp = NLP(language="lat", suppress_banner=True)
>>> lat_pipeline = cltk_nlp._get_pipeline()
>>> isinstance(cltk_nlp.pipeline, Pipeline)
True
>>> isinstance(lat_pipeline, Pipeline)
True
>>> cltk_nlp = NLP(language="axm", suppress_banner=True)
Traceback (most recent call last):
  ...
cltk.core.exceptions.UnimplementedAlgorithmError: Valid ISO language code, however this algorithm is not available for ``axm``.

Return type:

Pipeline
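The selection logic can be sketched in a self-contained way as follows. This is an assumption-laden illustration, not the library's code: ``SketchPipeline``, ``DEFAULT_PIPELINES``, ``UnavailableLanguageError``, the process names, and the two validation checks are all hypothetical and stand in for whatever checks CLTK actually performs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class UnavailableLanguageError(Exception):
    """Hypothetical stand-in for the library's own exception type."""


@dataclass
class SketchPipeline:
    """Hypothetical, simplified stand-in for a CLTK Pipeline."""

    language: str
    processes: List[str]


# Hypothetical default pipelines keyed by language code; names are made up.
DEFAULT_PIPELINES: Dict[str, SketchPipeline] = {
    "lat": SketchPipeline("lat", ["normalize", "tokenize", "lemmatize"]),
}


def get_pipeline(
    language: str, custom_pipeline: Optional[SketchPipeline] = None
) -> SketchPipeline:
    """Return the custom pipeline if it is valid, else the language default."""
    if custom_pipeline is not None:
        # Assumed checks: the pipeline must match the language and be non-empty.
        if custom_pipeline.language != language:
            raise ValueError("Custom pipeline language does not match.")
        if not custom_pipeline.processes:
            raise ValueError("Custom pipeline contains no processes.")
        return custom_pipeline
    try:
        return DEFAULT_PIPELINES[language]
    except KeyError:
        raise UnavailableLanguageError(
            f"No default pipeline is available for '{language}'."
        ) from None


print(get_pipeline("lat").processes)  # ['normalize', 'tokenize', 'lemmatize']
```

Calling ``get_pipeline("axm")`` in this sketch raises ``UnavailableLanguageError``, which parallels the error shown in the doctest for a language without an available pipeline.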