8. cltk package

Init module for importing the CLTK class.

8.1. Subpackages

8.2. Submodules

8.3. cltk.nlp module

Primary module for CLTK pipeline.

class cltk.nlp.NLP(language, custom_pipeline=None, suppress_banner=False)[source]

Bases: object

NLP class for default processing.

process_objects = {}

process_lock = <unlocked _thread.lock object>

_print_pipelines_for_current_lang()[source]

Print to screen the ``Process``es invoked upon invocation of ``NLP()``.

_get_process_object(process_object)[source]

Returns an instance of a process from a memoized cache. If the process has not been instantiated yet, it is created and stored in the cache before being returned.

Return type

Process
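The memoization described above can be sketched as follows. This is a minimal illustration, not the CLTK implementation: ``Process``, ``TokenizationProcess``, and ``ProcessCache`` are hypothetical stand-ins defined here, and guarding the cache with a lock is an assumption suggested by the ``process_objects`` and ``process_lock`` attributes.

```python
import threading
from typing import Dict, Type


class Process:
    """Hypothetical stand-in for a CLTK process; not the real class."""


class TokenizationProcess(Process):
    """Hypothetical process subclass used only for this illustration."""


class ProcessCache:
    """Keeps one instance per process class, guarded by a lock."""

    def __init__(self) -> None:
        # Maps an un-instantiated process class to its single cached instance.
        self.process_objects: Dict[Type[Process], Process] = {}
        self.process_lock = threading.Lock()

    def get_process_object(self, process_class: Type[Process]) -> Process:
        # Hold the lock so two threads cannot both create the same instance.
        with self.process_lock:
            instance = self.process_objects.get(process_class)
            if instance is None:
                # First request: instantiate and store in the cache.
                instance = process_class()
                self.process_objects[process_class] = instance
            return instance


cache = ProcessCache()
first = cache.get_process_object(TokenizationProcess)
second = cache.get_process_object(TokenizationProcess)
```

Here ``first`` and ``second`` refer to the same object, so any expensive setup in a process (for example, loading a model) would happen only once per class.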

analyze(text)[source]

The primary method for the NLP object, to which raw text strings are passed.

Parameters

text (str) – Input text string.

Return type

Doc

Returns

CLTK Doc containing all processed information.

>>> from cltk.languages.example_texts import get_example_text
>>> from cltk.core.data_types import Doc
>>> cltk_nlp = NLP(language="lat", suppress_banner=True)
>>> cltk_doc = cltk_nlp.analyze(text=get_example_text("lat"))
>>> isinstance(cltk_doc, Doc)
True
>>> cltk_doc.words[0] 
Word(index_char_start=None, index_char_stop=None, index_token=0, index_sentence=0, string='Gallia', pos=noun, lemma='mallis', stem=None, scansion=None, xpos='A1|grn1|casA|gen2', upos='NOUN', dependency_relation='nsubj', governor=3, features={Case: [nominative], Degree: [positive], Gender: [feminine], Number: [singular]}, category={F: [neg], N: [pos], V: [neg]}, stop=False, named_entity='LOCATION', syllables=None, phonetic_transcription=None, definition='')

_get_pipeline()[source]

Select the appropriate pipeline for the given language. If a custom pipeline is requested, ensure that the user-selected processes are valid, both individually and in combination.

>>> from cltk.core.data_types import Pipeline
>>> cltk_nlp = NLP(language="lat", suppress_banner=True)
>>> lat_pipeline = cltk_nlp._get_pipeline()
>>> isinstance(cltk_nlp.pipeline, Pipeline)
True
>>> isinstance(lat_pipeline, Pipeline)
True
>>> cltk_nlp = NLP(language="axm", suppress_banner=True)
Traceback (most recent call last):
  ...
cltk.core.exceptions.UnimplementedAlgorithmError: Valid ISO language code, however this algorithm is not available for ``axm``.

Return type

Pipeline
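The selection logic described for ``_get_pipeline()`` might look roughly like the sketch below. Everything in it is an assumption made for illustration: the simplified ``Pipeline`` dataclass, the ``DEFAULT_PIPELINES`` registry, the process names, and the ``get_pipeline`` function are hypothetical, and the exception is a local stand-in for ``cltk.core.exceptions.UnimplementedAlgorithmError``.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class UnimplementedAlgorithmError(Exception):
    """Local stand-in for the CLTK exception of the same name."""


@dataclass
class Pipeline:
    """Hypothetical, simplified pipeline: a language code plus process names."""

    language: str
    processes: List[str] = field(default_factory=list)


# Hypothetical registry of default pipelines, keyed by ISO 639-3 code.
DEFAULT_PIPELINES: Dict[str, Pipeline] = {
    "lat": Pipeline(language="lat", processes=["tokenize", "lemmatize"]),
}


def get_pipeline(language: str, custom_pipeline: Optional[Pipeline] = None) -> Pipeline:
    """Return the custom pipeline if it is valid, else the language default."""
    if custom_pipeline is not None:
        # Assumed validity check: the custom pipeline must target this language.
        if custom_pipeline.language != language:
            raise ValueError(
                f"Custom pipeline is for '{custom_pipeline.language}', not '{language}'."
            )
        return custom_pipeline
    if language not in DEFAULT_PIPELINES:
        raise UnimplementedAlgorithmError(
            f"No default pipeline is available for '{language}'."
        )
    return DEFAULT_PIPELINES[language]


lat_pipeline = get_pipeline("lat")
```

A language without a registered default, such as ``"axm"`` in this sketch, raises ``UnimplementedAlgorithmError``, mirroring the behavior shown in the doctest above.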