8. cltk package
Init module for importing the CLTK class.
8.1. Subpackages
- 8.1.1. cltk.alphabet package
- 8.1.1.1. Subpackages
- 8.1.1.2. Submodules
- 8.1.1.3. cltk.alphabet.ang module
- 8.1.1.4. cltk.alphabet.arb module
- 8.1.1.5. cltk.alphabet.arc module
- 8.1.1.6. cltk.alphabet.ave module
- 8.1.1.7. cltk.alphabet.ben module
- 8.1.1.8. cltk.alphabet.egy module
- 8.1.1.9. cltk.alphabet.enm module
- 8.1.1.10. cltk.alphabet.fro module
- 8.1.1.11. cltk.alphabet.gmh module
- 8.1.1.12. cltk.alphabet.guj module
- 8.1.1.13. cltk.alphabet.hin module
- 8.1.1.14. cltk.alphabet.kan module
- 8.1.1.15. cltk.alphabet.lat module
JVReplacer
LigatureReplacer
dehyphenate()
swallow()
swallow_braces()
drop_latin_punctuation()
remove_accents()
remove_macrons()
swallow_angle_brackets()
disappear_angle_brackets()
swallow_square_brackets()
swallow_obelized_words()
disappear_round_brackets()
swallow_editorial()
accept_editorial()
truecase()
normalize_lat()
- 8.1.1.16. cltk.alphabet.non module
- 8.1.1.17. cltk.alphabet.omr module
- 8.1.1.18. cltk.alphabet.ory module
- 8.1.1.19. cltk.alphabet.osc module
- 8.1.1.20. cltk.alphabet.ota module
- 8.1.1.21. cltk.alphabet.oty module
- 8.1.1.22. cltk.alphabet.peo module
- 8.1.1.23. cltk.alphabet.pes module
- 8.1.1.24. cltk.alphabet.pli module
- 8.1.1.25. cltk.alphabet.processes module
- 8.1.1.26. cltk.alphabet.san module
- 8.1.1.27. cltk.alphabet.tel module
- 8.1.1.28. cltk.alphabet.text_normalization module
- 8.1.1.29. cltk.alphabet.urd module
- 8.1.1.30. cltk.alphabet.xlc module
- 8.1.1.31. cltk.alphabet.xld module
- 8.1.2. cltk.core package
- 8.1.2.1. Submodules
- 8.1.2.2. cltk.core.cltk_logger module
- 8.1.2.3. cltk.core.data_types module
Language
Word
Word.index_char_start
Word.index_char_stop
Word.index_token
Word.index_sentence
Word.string
Word.pos
Word.lemma
Word.stem
Word.scansion
Word.xpos
Word.upos
Word.dependency_relation
Word.governor
Word.features
Word.category
Word.embedding
Word.stop
Word.named_entity
Word.syllables
Word.phonetic_transcription
Word.definition
Sentence
Doc
Process
Pipeline
- 8.1.2.4. cltk.core.exceptions module
- 8.1.3. cltk.corpora package
- 8.1.4. cltk.data package
- 8.1.5. cltk.dependency package
- 8.1.5.1. Submodules
- 8.1.5.2. cltk.dependency.processes module
- 8.1.5.3. cltk.dependency.spacy_wrapper module
- 8.1.5.4. cltk.dependency.stanza_wrapper module
StanzaWrapper
StanzaWrapper.nlps
StanzaWrapper.parse()
StanzaWrapper._load_pipeline()
StanzaWrapper._is_model_present()
StanzaWrapper._download_model()
StanzaWrapper._get_default_treebank()
StanzaWrapper._is_valid_treebank()
StanzaWrapper.is_wrapper_available()
StanzaWrapper._get_stanza_code()
StanzaWrapper.get_nlp()
- 8.1.5.5. cltk.dependency.tree module
- 8.1.5.6. cltk.dependency.utils module
- 8.1.6. cltk.embeddings package
- 8.1.6.1. Submodules
- 8.1.6.2. cltk.embeddings.embeddings module
CLTKWord2VecEmbeddings
CLTKWord2VecEmbeddings._build_filepath()
CLTKWord2VecEmbeddings.get_word_vector()
CLTKWord2VecEmbeddings.get_embedding_length()
CLTKWord2VecEmbeddings.get_sims()
CLTKWord2VecEmbeddings._check_input_params()
CLTKWord2VecEmbeddings._download_cltk_self_hosted_models()
CLTKWord2VecEmbeddings._is_model_present()
CLTKWord2VecEmbeddings._load_model()
Word2VecEmbeddings
Word2VecEmbeddings.get_word_vector()
Word2VecEmbeddings.get_embedding_length()
Word2VecEmbeddings.get_sims()
Word2VecEmbeddings._check_input_params()
Word2VecEmbeddings._build_zip_filepath()
Word2VecEmbeddings._build_nlpl_filepath()
Word2VecEmbeddings._is_nlpl_model_present()
Word2VecEmbeddings._download_nlpl_models()
Word2VecEmbeddings._unzip_nlpl_model()
Word2VecEmbeddings._load_model()
FastTextEmbeddings
FastTextEmbeddings.get_word_vector()
FastTextEmbeddings.get_embedding_length()
FastTextEmbeddings.get_sims()
FastTextEmbeddings.download_fasttext_models()
FastTextEmbeddings._is_model_present()
FastTextEmbeddings._check_input_params()
FastTextEmbeddings._load_model()
FastTextEmbeddings._is_fasttext_lang_available()
FastTextEmbeddings._build_fasttext_filepath()
FastTextEmbeddings._build_fasttext_url()
- 8.1.6.3. cltk.embeddings.processes module
- 8.1.6.4. cltk.embeddings.sentence module
- 8.1.7. cltk.languages package
- 8.1.7.1. Submodules
- 8.1.7.2. cltk.languages.example_texts module
- 8.1.7.3. cltk.languages.glottolog module
- 8.1.7.4. cltk.languages.pipelines module
AkkadianPipeline
ArabicPipeline
AramaicPipeline
ChinesePipeline
CopticPipeline
GothicPipeline
GreekPipeline
HindiPipeline
LatinPipeline
MiddleHighGermanPipeline
MiddleEnglishPipeline
MiddleFrenchPipeline
OCSPipeline
OldEnglishPipeline
OldFrenchPipeline
OldNorsePipeline
PaliPipeline
PanjabiPipeline
SanskritPipeline
- 8.1.7.5. cltk.languages.utils module
- 8.1.8. cltk.lemmatize package
- 8.1.8.1. Submodules
- 8.1.8.2. cltk.lemmatize.ang module
- 8.1.8.3. cltk.lemmatize.backoff module
- 8.1.8.4. cltk.lemmatize.fro module
- 8.1.8.5. cltk.lemmatize.grc module
- 8.1.8.6. cltk.lemmatize.lat module
- 8.1.8.7. cltk.lemmatize.naive_lemmatizer module
- 8.1.8.8. cltk.lemmatize.processes module
- 8.1.9. cltk.lexicon package
- 8.1.10. cltk.morphology package
- 8.1.10.1. Submodules
- 8.1.10.2. cltk.morphology.akk module
- 8.1.10.3. cltk.morphology.lat module
- 8.1.10.4. cltk.morphology.morphosyntax module
- 8.1.10.5. cltk.morphology.universal_dependencies_features module
MorphosyntacticFeature
N
V
F
POS
VerbForm
Mood
Tense
Aspect
Voice
Evidentiality
Polarity
Person
Politeness
Clusivity
Strength
Case
Case.nominative
Case.accusative
Case.ergative
Case.absolutive
Case.abessive
Case.befefactive
Case.causative
Case.comparative
Case.considerative
Case.comitative
Case.dative
Case.distributive
Case.equative
Case.genitive
Case.instrumental
Case.partitive
Case.vocative
Case.ablative
Case.additive
Case.adessive
Case.allative
Case.delative
Case.elative
Case.essive
Case.illative
Case.inessive
Case.lative
Case.locative
Case.perlative
Case.sublative
Case.superessive
Case.terminative
Case.temporal
Case.translative
Gender
Animacy
Number
NumForm
Definiteness
Degree
NameType
PronominalType
PronominalType.article
PronominalType.contrastive
PronominalType.demonstrative
PronominalType.emphatic
PronominalType.exclamative
PronominalType.indefinite
PronominalType.interrogative
PronominalType.negative
PronominalType.personal
PronominalType.reciprocal
PronominalType.relative
PronominalType.total
AdpositionalType
AdverbialType
VerbType
Possessive
Numeral
Reflexive
Foreign
Abbreviation
Typo
InflClass
NumValue
Proper
Form
- 8.1.10.6. cltk.morphology.utils module
- 8.1.11. cltk.ner package
- 8.1.12. cltk.phonology package
- 8.1.12.1. Subpackages
- 8.1.12.1.1. cltk.phonology.ang package
- 8.1.12.1.2. cltk.phonology.arb package
- 8.1.12.1.3. cltk.phonology.enm package
- 8.1.12.1.4. cltk.phonology.gmh package
- 8.1.12.1.5. cltk.phonology.got package
- 8.1.12.1.6. cltk.phonology.grc package
- 8.1.12.1.7. cltk.phonology.lat package
- 8.1.12.1.8. cltk.phonology.non package
- 8.1.12.1.8.1. Subpackages
- 8.1.12.1.8.2. Submodules
- 8.1.12.1.8.3. cltk.phonology.non.orthophonology module
- 8.1.12.1.8.4. cltk.phonology.non.phonology module
- 8.1.12.1.8.5. cltk.phonology.non.syllabifier module
- 8.1.12.1.8.6. cltk.phonology.non.transcription module
- 8.1.12.1.8.7. cltk.phonology.non.utils module
- 8.1.12.2. Submodules
- 8.1.12.3. cltk.phonology.akk module
- 8.1.12.4. cltk.phonology.orthophonology module
PhonologicalFeature
Consonantal
Voiced
Aspirated
Geminate
Roundedness
Length
Height
Backness
Manner
Place
AbstractPhoneme
make_phoneme()
PositionedPhoneme()
AlwaysMatchingPseudoPhoneme
WordBoundaryPseudoPhoneme
SyllableBoundaryPseudoPhoneme
PhonemeDisjunction
Consonant
Vowel
BasePhonologicalRule
PhonologicalRule
PhonemeNotFound
LetterNotFound
Orthophonology
Orthophonology.add_rule()
Orthophonology.is_syllable_initial()
Orthophonology.is_syllable_final()
Orthophonology._position_phonemes()
Orthophonology.transcribe_word()
Orthophonology.transcribe()
Orthophonology.transcribe_to_modern()
Orthophonology.voice()
Orthophonology.aspirate()
Orthophonology.geminate()
Orthophonology.lengthen()
- 8.1.12.5. cltk.phonology.processes module
- 8.1.12.6. cltk.phonology.syllabifier_processes module
- 8.1.12.7. cltk.phonology.syllabify module
get_onsets()
Syllabifier
Syllabifier.set_invalid_onsets()
Syllabifier.set_invalid_ultima()
Syllabifier.set_hierarchy()
Syllabifier.set_vowels()
Syllabifier.syllabify()
Syllabifier.syllabify_ssp()
Syllabifier.onset_maximization()
Syllabifier.legal_onsets()
Syllabifier.syllabify_mop()
Syllabifier.set_short_vowels()
Syllabifier.set_diphthongs()
Syllabifier.set_triphthongs()
Syllabifier.set_consonants()
Syllabifier.syllabify_ipa()
Syllabifier.syllabify_phonemes()
Syllable
- 8.1.12.8. cltk.phonology.transcription_processes module
PhonologicalTranscriptionProcess
GothicPhonologicalTranscriberProcess
GreekPhonologicalTranscriberProcess
LatinPhonologicalTranscriberProcess
MiddleHighGermanPhonologicalTranscriberProcess
OldEnglishPhonologicalTranscriberProcess
OldNorsePhonologicalTranscriberProcess
OldSwedishPhonologicalTranscriberProcess
- 8.1.13. cltk.prosody package
- 8.1.13.1. Subpackages
- 8.1.13.1.1. cltk.prosody.lat package
- 8.1.13.1.1.1. Submodules
- 8.1.13.1.1.2. cltk.prosody.lat.clausulae_analysis module
- 8.1.13.1.1.3. cltk.prosody.lat.hendecasyllable_scanner module
- 8.1.13.1.1.4. cltk.prosody.lat.hexameter_scanner module
- 8.1.13.1.1.5. cltk.prosody.lat.macronizer module
- 8.1.13.1.1.6. cltk.prosody.lat.metrical_validator module
- 8.1.13.1.1.7. cltk.prosody.lat.pentameter_scanner module
- 8.1.13.1.1.8. cltk.prosody.lat.scanner module
- 8.1.13.1.1.9. cltk.prosody.lat.scansion_constants module
- 8.1.13.1.1.10. cltk.prosody.lat.scansion_formatter module
- 8.1.13.1.1.11. cltk.prosody.lat.string_utils module
- 8.1.13.1.1.12. cltk.prosody.lat.syllabifier module
- 8.1.13.1.1.13. cltk.prosody.lat.verse module
- 8.1.13.1.1.14. cltk.prosody.lat.verse_scanner module
- 8.1.13.2. Submodules
- 8.1.13.3. cltk.prosody.gmh module
- 8.1.13.4. cltk.prosody.grc module
- 8.1.13.5. cltk.prosody.non module
- 8.1.14. cltk.sentence package
- 8.1.15. cltk.stem package
- 8.1.15.1. Submodules
- 8.1.15.2. cltk.stem.akk module
- 8.1.15.3. cltk.stem.enm module
- 8.1.15.4. cltk.stem.fro module
- 8.1.15.5. cltk.stem.gmh module
- 8.1.15.6. cltk.stem.lat module
- 8.1.15.7. cltk.stem.processes module
- 8.1.16. cltk.stops package
- 8.1.16.1. Submodules
- 8.1.16.2. cltk.stops.akk module
- 8.1.16.3. cltk.stops.ang module
- 8.1.16.4. cltk.stops.arb module
- 8.1.16.5. cltk.stops.cop module
- 8.1.16.6. cltk.stops.enm module
- 8.1.16.7. cltk.stops.fro module
- 8.1.16.8. cltk.stops.gmh module
- 8.1.16.9. cltk.stops.grc module
- 8.1.16.10. cltk.stops.hin module
- 8.1.16.11. cltk.stops.lat module
- 8.1.16.12. cltk.stops.non module
- 8.1.16.13. cltk.stops.omr module
- 8.1.16.14. cltk.stops.pan module
- 8.1.16.15. cltk.stops.processes module
- 8.1.16.16. cltk.stops.san module
- 8.1.16.17. cltk.stops.words module
- 8.1.17. cltk.tag package
- 8.1.18. cltk.text package
- 8.1.19. cltk.tokenizers package
- 8.1.19.1. Subpackages
- 8.1.19.2. Submodules
- 8.1.19.3. cltk.tokenizers.akk module
- 8.1.19.4. cltk.tokenizers.arb module
- 8.1.19.5. cltk.tokenizers.enm module
- 8.1.19.6. cltk.tokenizers.fro module
- 8.1.19.7. cltk.tokenizers.gmh module
- 8.1.19.8. cltk.tokenizers.line module
- 8.1.19.9. cltk.tokenizers.non module
- 8.1.19.10. cltk.tokenizers.processes module
TokenizationProcess
MultilingualTokenizationProcess
AkkadianTokenizationProcess
ArabicTokenizationProcess
GreekTokenizationProcess
LatinTokenizationProcess
MiddleHighGermanTokenizationProcess
MiddleEnglishTokenizationProcess
OldFrenchTokenizationProcess
MiddleFrenchTokenizationProcess
OldNorseTokenizationProcess
- 8.1.19.11. cltk.tokenizers.utils module
- 8.1.19.12. cltk.tokenizers.word module
- 8.1.20. cltk.utils package
- 8.1.21. cltk.wordnet package
- 8.1.21.1. Submodules
- 8.1.21.2. cltk.wordnet.processes module
- 8.1.21.3. cltk.wordnet.wordnet module
nesteddict()
_INF
WordNetError
_WordNetObject
_WordNetObject.antonyms()
_WordNetObject.hypernyms()
_WordNetObject.hyponyms()
_WordNetObject.member_holonyms()
_WordNetObject.substance_holonyms()
_WordNetObject.part_holonyms()
_WordNetObject.member_meronyms()
_WordNetObject.substance_meronyms()
_WordNetObject.part_meronyms()
_WordNetObject.attributes()
_WordNetObject.entailments()
_WordNetObject.causes()
_WordNetObject.also_sees()
_WordNetObject.verb_groups()
_WordNetObject.similar_tos()
_WordNetObject.nearest()
Lemma
Semfield
Synset
Synset.id()
Synset.semfields()
Synset.sentiment()
Synset.positivity()
Synset.negativity()
Synset.objectivity()
Synset.language()
Synset.pos()
Synset.offset()
Synset.gloss()
Synset.examples()
Synset.lemmas()
Synset.root_hypernyms()
Synset.max_depth()
Synset.min_depth()
Synset.closure()
Synset.hypernym_paths()
Synset.common_hypernyms()
Synset.lowest_common_hypernyms()
Synset.hypernym_distances()
Synset.shortest_path_distance()
Synset.tree()
Synset.path_similarity()
Synset._lcs_ic()
Synset.lch_similarity()
Synset.wup_similarity()
Synset.res_similarity()
Synset.jcn_similarity()
Synset.lin_similarity()
Synset._iter_hypernym_lists()
Synset.related()
WordNetCorpusReader
WordNetCorpusReader.host()
WordNetCorpusReader._compute_max_depth()
WordNetCorpusReader.get_status()
WordNetCorpusReader.lemma()
WordNetCorpusReader.lemma_from_uri()
WordNetCorpusReader.semfield()
WordNetCorpusReader.synset()
WordNetCorpusReader.synset_from_pos_and_offset()
WordNetCorpusReader.lemmas()
WordNetCorpusReader.lemmas_from_uri()
WordNetCorpusReader.synsets()
WordNetCorpusReader.semfields()
WordNetCorpusReader.lemmatize()
WordNetCorpusReader.translate()
WordNetICCorpusReader
8.2. Submodules
8.3. cltk.nlp module
Primary module for CLTK pipeline.
- class cltk.nlp.NLP(language, custom_pipeline=None, suppress_banner=False)
Bases: object
NLP class for default processing.
- process_objects: dict[Type[cltk.core.data_types.Process], cltk.core.data_types.Process] = {}
- process_lock = <unlocked _thread.lock object>
- _print_pipelines_for_current_lang()
Print to screen the Process objects invoked upon invocation of NLP().
- Return type:
None
- _print_special_authorship_messages_for_current_lang()
Print to screen the authors of particular algorithms.
- Return type:
None
- _get_process_object(process_object)
Return an instance of a process from a memoized cache. If the process has not yet been instantiated, an instance is created and stored in the cache.
TODO: Figure out typing in this.
- Return type:
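The memoization described above can be sketched roughly as follows. This is a minimal illustration, not the CLTK implementation: Process, TokenizeProcess, and ProcessCache are hypothetical stand-ins, and the sketch assumes only what the attribute listing suggests, namely a dictionary keyed by process class and guarded by a lock.

```python
import threading
from typing import Dict, Type


class Process:
    """Hypothetical stand-in for cltk.core.data_types.Process."""


class TokenizeProcess(Process):
    """Hypothetical example subclass used only for this sketch."""


class ProcessCache:
    """Cache one instance per process class (assumed design, not CLTK's code)."""

    def __init__(self) -> None:
        self._objects: Dict[Type[Process], Process] = {}
        self._lock = threading.Lock()

    def get(self, process_class: Type[Process]) -> Process:
        # Hold the lock so two threads cannot both create the same instance.
        with self._lock:
            instance = self._objects.get(process_class)
            if instance is None:
                # First request for this class: instantiate and store it.
                instance = process_class()
                self._objects[process_class] = instance
            return instance


cache = ProcessCache()
first = cache.get(TokenizeProcess)
second = cache.get(TokenizeProcess)
print(first is second)
```

Keying the cache by class rather than by name means that a second request for the same process type returns the object created on the first request, so any expensive setup is paid only once.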
- analyze(text)
The primary method for the NLP object, to which raw text strings are passed.
- Parameters:
text (str) – Input text string.
- Return type:
Doc
- Returns:
CLTK Doc containing all processed information.
>>> from cltk.nlp import NLP
>>> from cltk.languages.example_texts import get_example_text
>>> from cltk.core.data_types import Doc
>>> cltk_nlp = NLP(language="lat", suppress_banner=True)
>>> cltk_doc = cltk_nlp.analyze(text=get_example_text("lat"))
>>> isinstance(cltk_doc, Doc)
True
>>> cltk_doc.words[0].string
'Gallia'
- _get_pipeline()
Select appropriate pipeline for given language. If custom processing is requested, ensure that user-selected choices are valid, both in themselves and in unison.
>>> from cltk.core.data_types import Pipeline
>>> cltk_nlp = NLP(language="lat", suppress_banner=True)
>>> lat_pipeline = cltk_nlp._get_pipeline()
>>> isinstance(cltk_nlp.pipeline, Pipeline)
True
>>> isinstance(lat_pipeline, Pipeline)
True
>>> cltk_nlp = NLP(language="axm", suppress_banner=True)
Traceback (most recent call last):
  ...
cltk.core.exceptions.UnimplementedAlgorithmError: Valid ISO language code, however this algorithm is not available for ``axm``.
- Return type:
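The selection logic described for _get_pipeline() might look something like the sketch below. Everything here is an assumption made for illustration: the Pipeline dataclass, the DEFAULT_PIPELINES table, the process names, and the get_pipeline() function are hypothetical, and the exception is a local stand-in that only borrows the name shown in the traceback above. The validation of a custom pipeline is reduced to a type check.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class UnimplementedAlgorithmError(Exception):
    """Local stand-in named after the exception in the doctest above."""


@dataclass
class Pipeline:
    """Hypothetical, simplified pipeline: a label plus ordered process names."""

    description: str
    processes: List[str] = field(default_factory=list)


# Hypothetical table of defaults; the real mapping lives inside CLTK.
DEFAULT_PIPELINES: Dict[str, Pipeline] = {
    "lat": Pipeline("Latin", ["normalize", "tokenize", "lemmatize"]),
    "grc": Pipeline("Greek", ["normalize", "tokenize", "lemmatize"]),
}


def get_pipeline(language: str, custom_pipeline: Optional[Pipeline] = None) -> Pipeline:
    """Return the custom pipeline if one is given, else the language default."""
    if custom_pipeline is not None:
        # Simplified validity check; the real method is described as
        # validating the user-selected choices more thoroughly.
        if not isinstance(custom_pipeline, Pipeline):
            raise TypeError("custom_pipeline must be a Pipeline")
        return custom_pipeline
    try:
        return DEFAULT_PIPELINES[language]
    except KeyError:
        raise UnimplementedAlgorithmError(
            f"No default pipeline is available for ``{language}``."
        ) from None


print(get_pipeline("lat").processes)
```

A lookup table keeps the language-to-pipeline mapping in one place, and raising a dedicated exception for a missing entry mirrors the behaviour shown in the ``axm`` doctest.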