8.1.13.1.1. cltk.prosody.lat package

8.1.13.1.1.1. Submodules

8.1.13.1.1.2. cltk.prosody.lat.clausulae_analysis module

Return dictionary of clausulae found in the prosody of Latin prose.

The clausulae analysis function reports, for each type of clausula, the number of times it occurs in the text. The list of clausulae used in the method is derived from the 2019 Journal of Roman Studies paper “Auceps syllabarum: A Digital Analysis of Latin Prose Rhythm”. The clausulae are mutually exclusive, so no one rhythm is counted in multiple categories.
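The counting logic described above can be sketched in a few lines. This is a hypothetical standalone version with a shortened rhythm list; the real class uses the full set of Clausula tuples shown in the Clausulae defaults below:

```python
# Hypothetical standalone sketch of mutually exclusive clausula counting.
# Only three rhythms are listed here; the real class uses the full set.
CLAUSULAE = [
    ("cretic_trochee", "-u--x"),
    ("spondaic", "---x"),
    ("heroic", "-uu-x"),
]

def count_clausulae(prosody_lines):
    counts = {name: 0 for name, _ in CLAUSULAE}
    for line in prosody_lines:
        for name, rhythm in CLAUSULAE:
            if line.endswith(rhythm):
                counts[name] += 1
                break  # first match wins, keeping the categories exclusive
    return counts
```

Because each line contributes to at most one category, the totals across categories never exceed the number of lines analyzed.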

class cltk.prosody.lat.clausulae_analysis.Clausula(rhythm_name, rhythm)

Bases: tuple

rhythm

Alias for field number 1

rhythm_name

Alias for field number 0

class cltk.prosody.lat.clausulae_analysis.Clausulae(rhythms=[Clausula(rhythm_name='cretic_trochee', rhythm='-u--x'), Clausula(rhythm_name='cretic_trochee_resolved_a', rhythm='uuu--x'), Clausula(rhythm_name='cretic_trochee_resolved_b', rhythm='-uuu-x'), Clausula(rhythm_name='cretic_trochee_resolved_c', rhythm='-u-uux'), Clausula(rhythm_name='double_cretic', rhythm='-u--ux'), Clausula(rhythm_name='molossus_cretic', rhythm='----ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_a', rhythm='uuu--ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_b', rhythm='-uuu-ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_c', rhythm='-u-uux'), Clausula(rhythm_name='double_molossus_cretic_resolved_d', rhythm='uu---ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_e', rhythm='-uu--ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_f', rhythm='--uu-ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_g', rhythm='---uuux'), Clausula(rhythm_name='double_molossus_cretic_resolved_h', rhythm='-u---ux'), Clausula(rhythm_name='double_trochee', rhythm='-u-x'), Clausula(rhythm_name='double_trochee_resolved_a', rhythm='uuu-x'), Clausula(rhythm_name='double_trochee_resolved_b', rhythm='-uuux'), Clausula(rhythm_name='hypodochmiac', rhythm='-u-ux'), Clausula(rhythm_name='hypodochmiac_resolved_a', rhythm='uuu-ux'), Clausula(rhythm_name='hypodochmiac_resolved_b', rhythm='-uuuux'), Clausula(rhythm_name='spondaic', rhythm='---x'), Clausula(rhythm_name='heroic', rhythm='-uu-x')])[source]

Bases: object

clausulae_analysis(prosody)[source]

Return the frequency of each type of clausula found in the prosody.

Parameters:

prosody (list[str]) – the prosody of a prose text (must be in the format of the scansion produced by the scanner classes)

Return type:

list[dict[str, int]]

Returns:

a list of single-entry dictionaries, one per clausula type

>>> Clausulae().clausulae_analysis(['-uuu-uuu-u--x', 'uu-uu-uu----x'])
[{'cretic_trochee': 1}, {'cretic_trochee_resolved_a': 0}, {'cretic_trochee_resolved_b': 0}, {'cretic_trochee_resolved_c': 0}, {'double_cretic': 0}, {'molossus_cretic': 0}, {'double_molossus_cretic_resolved_a': 0}, {'double_molossus_cretic_resolved_b': 0}, {'double_molossus_cretic_resolved_c': 0}, {'double_molossus_cretic_resolved_d': 0}, {'double_molossus_cretic_resolved_e': 0}, {'double_molossus_cretic_resolved_f': 0}, {'double_molossus_cretic_resolved_g': 0}, {'double_molossus_cretic_resolved_h': 0}, {'double_trochee': 0}, {'double_trochee_resolved_a': 0}, {'double_trochee_resolved_b': 0}, {'hypodochmiac': 0}, {'hypodochmiac_resolved_a': 0}, {'hypodochmiac_resolved_b': 0}, {'spondaic': 1}, {'heroic': 0}]

8.1.13.1.1.3. cltk.prosody.lat.hendecasyllable_scanner module

Utility class for producing a scansion pattern for Latin hendecasyllables.

Given a line of hendecasyllables, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.

class cltk.prosody.lat.hendecasyllable_scanner.HendecasyllableScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_tranform=False, *args, **kwargs)[source]

Bases: VerseScanner

The scansion symbols used can be configured by passing a suitable constants class to the constructor.

scan(original_line, optional_transform=False)[source]

Scan a line of Latin hendecasyllables and produce a scansion pattern, and other data.

Parameters:
  • original_line (str) – the original line of Latin verse

  • optional_transform (bool) – whether or not to perform i to j transform for syllabification

Return type:

Verse

Returns:

a Verse object

>>> scanner = HendecasyllableScanner()
>>> print(scanner.scan("Cui dono lepidum novum libellum"))
Verse(original='Cui dono lepidum novum libellum', scansion='  -  U -  U U -   U -   U -  U ', meter='hendecasyllable', valid=True, syllable_count=11, accented='Cui donō lepidūm novūm libēllum', scansion_notes=['Corrected invalid start.'], syllables = ['Cui', 'do', 'no', 'le', 'pi', 'dūm', 'no', 'vūm', 'li', 'bēl', 'lum'])
>>> print(scanner.scan(
... "ārida modo pumice expolitum?").scansion)  
- U -  U U  - U   -  U - U
correct_invalid_start(scansion)[source]

The third syllable of a hendecasyllabic line is long, so we coerce it to a stressed position.

Parameters:

scansion (str) – scansion string

Return type:

str

Returns:

scansion string with corrected start

>>> print(HendecasyllableScanner().correct_invalid_start(
... "- U U  U U  - U   -  U - U").strip())
- U -  U U  - U   -  U - U
correct_antepenult_chain(scansion)[source]

For hendecasyllables the last three feet of the verse are predictable and do not regularly allow substitutions.

Parameters:

scansion (str) – scansion line thus far

Return type:

str

Returns:

corrected line of scansion

>>> print(HendecasyllableScanner().correct_antepenult_chain(
... "-U -UU UU UU UX").strip())
-U -UU -U -U -X

8.1.13.1.1.4. cltk.prosody.lat.hexameter_scanner module

Utility class for producing a scansion pattern for a Latin hexameter.

Given a line of hexameter, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.

Because hexameters have strict rules on the position and quantity of stressed and unstressed syllables, we can often infer many of the stress qualities of the syllables, given a valid hexameter. If the Latin hexameter provided is not accented with macrons, then a best guess is made. For the scansion produced, the stress of a diphthong is indicated in the second of the two vowel positions; for the accented line produced, the diphthong stress is not indicated with any macronized vowels.
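The constrained structure that makes this inference possible can be illustrated by enumerating the valid patterns. This is a simplified sketch: the first four feet vary freely between dactyl and spondee, while the real validator also admits rare spondaic fifth feet and handles the anceps final syllable in more detail:

```python
from itertools import product

# Illustrative sketch of the hexameter search space: the first four feet
# are each a dactyl (-UU) or a spondee (--), the fifth is normally a
# dactyl, and the line closes with a long syllable plus an anceps (X).
def hexameter_patterns():
    return ["".join(feet) + "-UU" + "-X"
            for feet in product(["-UU", "--"], repeat=4)]
```

With only sixteen candidate patterns of varying lengths, a scanned line's syllable count alone already rules out most of the search space.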

class cltk.prosody.lat.hexameter_scanner.HexameterScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]

Bases: VerseScanner

The scansion symbols used can be configured by passing a suitable constants class to the constructor.

scan(original_line, optional_transform=False, dactyl_smoothing=False)[source]

Scan a line of Latin hexameter and produce a scansion pattern, and other data.

Parameters:
  • original_line (str) – the original line of Latin verse

  • optional_transform (bool) – whether or not to perform i to j transform for syllabification

  • dactyl_smoothing (bool) – whether or not to perform dactyl smoothing

Return type:

Verse

Returns:

a Verse object

>>> scanner = HexameterScanner()
>>> print(HexameterScanner().scan(
... "ēxiguām sedēm pariturae tērra negavit").scansion) 
- -  -   - -   U U -  -  -  U  U - U
>>> print(scanner.scan("impulerit. Tantaene animis caelestibus irae?"))
Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='-  U U -    -   -   U U -    - -  U U  -  - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
>>> print(scanner.scan(
... "Arma virumque cano, Troiae qui prīmus ab ōrīs").scansion) 
-  U  U -   U  U -    -  -   -   - U  U  - -
>>> # some hexameters need the optional transformations:
>>> optional_transform_scanner = HexameterScanner(optional_transform=True)
>>> print(optional_transform_scanner.scan(
... "Ītaliam, fāto profugus, Lāvīniaque vēnit").scansion) 
- -  -    - -   U U -    - -  U  U  - U
>>> print(HexameterScanner().scan(
... "lītora, multum ille et terrīs iactātus et alto").scansion) 
- U U   -     -    -   -  -   -  - U  U  -  U
>>> print(HexameterScanner().scan(
... "vī superum saevae memorem Iūnōnis ob īram;").scansion) 
-  U U -    -  -  U U -   - - U  U  - U
>>> # handle multiple elisions
>>> print(scanner.scan("monstrum horrendum, informe, ingens, cui lumen ademptum").scansion) 
-        -  -      -  -     -  -      -  - U  U -   U
>>> # if we have 17 syllables, create a chain of all dactyls
>>> print(scanner.scan("quadrupedante putrem sonitu quatit ungula campum"
... ).scansion) 
-  U U -  U  U  -   U U -   U U  -  U U  -  U
>>> # if we have 13 syllables exactly, we'll create a spondaic hexameter
>>> print(HexameterScanner().scan(
... "illi inter sese multa vi bracchia tollunt").scansion)  
-    -  -   - -  -  -  -   -   UU  -  -
>>> print(HexameterScanner().scan(
... "dat latus; insequitur cumulo praeruptus aquae mons").scansion) 
-   U U   -  U  U -   U U -    - -  U  U   -  -
>>> print(optional_transform_scanner.scan(
... "Non quivis videt inmodulata poëmata iudex").scansion) 
-    - -   U U  -  U U - U  U- U U  - -
>>> print(HexameterScanner().scan(
... "certabant urbem Romam Remoramne vocarent").scansion) 
-  - -   -  -   - -   U U -  U  U - -
>>> # advanced smoothing is available via keyword flags: dactyl_smoothing
>>> # print(HexameterScanner().scan(
#... "his verbis: 'o gnata, tibi sunt ante ferendae",
#... dactyl_smoothing=True).scansion) 
#-   -  -    -   - U   U -  -   -  U  U -   -
correct_invalid_fifth_foot(scansion)[source]

The ‘inverted amphibrach’ (a stressed-unstressed-stressed syllable pattern) is invalid in hexameters, so here we coerce it to stressed when it occurs at the end of a line.

Parameters:

scansion (str) – the scansion pattern

Return type:

str

Returns:

the corrected scansion pattern

>>> print(HexameterScanner().correct_invalid_fifth_foot(
... " -   - -   U U  -  U U U -  - U U U  - x")) 
-   - -   U U  -  U U U -  - - U U  - x

invalid_foot_to_spondee(feet, foot, idx)[source]

In hexameters, a single foot with an unstressed_stressed syllable pattern is often just a spondee, so here we coerce it to stressed.

Parameters:
  • feet (list) – list of string representations of metrical feet

  • foot (str) – the bad foot to correct

  • idx (int) – the index of the foot to correct

Return type:

str

Returns:

corrected scansion

>>> print(HexameterScanner().invalid_foot_to_spondee(
... ['-UU', '--', '-U', 'U-', '--', '-UU'],'-U', 2))  
-UU----U----UU
correct_dactyl_chain(scansion)[source]

Three or more unstressed syllables in a row indicate a broken dactyl chain, which is best detected and processed backwards.

Since this method takes a Procrustean approach to modifying the scansion pattern, it is not used by default in the scan method; however, it is available as an optional keyword parameter, and users looking to further automate the generation of scansion candidates should consider using this as a fall back.

Parameters:

scansion (str) – scansion with broken dactyl chain; inverted amphibrachs not allowed

Return type:

str

Returns:

corrected line of scansion

>>> print(HexameterScanner().correct_dactyl_chain(
... "-   U U  -  - U U -  - - U U  - x"))
-   - -  -  - U U -  - - U U  - x
>>> print(HexameterScanner().correct_dactyl_chain(
... "-   U  U U  U -     -   -   -  -   U  U -   U")) 
-   -  - U  U -     -   -   -  -   U  U -   U
correct_inverted_amphibrachs(scansion)[source]

The ‘inverted amphibrach’: stressed_unstressed_stressed syllable pattern is invalid in hexameters, so here we coerce it to stressed: - U - -> - - -

Parameters:

scansion (str) – the scansion stress pattern

Return type:

str

Returns:

a string with the corrected scansion pattern

>>> print(HexameterScanner().correct_inverted_amphibrachs(
... " -   U -   - U  -  U U U U  - U  - x")) 
-   - -   - -  -  U U U U  - -  - x
>>> print(HexameterScanner().correct_inverted_amphibrachs(
... " -   - -   U -  -  U U U U  U- - U  - x")) 
-   - -   - -  -  U U U U  U- - -  - x
>>> print(HexameterScanner().correct_inverted_amphibrachs(
... "-  - -   -  -   U -   U U -  U  U - -")) 
-  - -   -  -   - -   U U -  U  U - -
>>> print(HexameterScanner().correct_inverted_amphibrachs(
... "- UU-   U -   U -  -   U   U U   U-   U")) 
- UU-   - -   - -  -   U   U U   U-   U
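The coercion shown in these examples can be sketched on a compact (unspaced) pattern; scanning right to left lets overlapping matches resolve correctly. This is a hypothetical simplification: the real method also preserves the spacing that keeps the pattern aligned with the verse text:

```python
def coerce_inverted_amphibrachs(pattern: str) -> str:
    """Coerce every '-U-' run to '---', scanning right to left."""
    chars = list(pattern)
    for i in range(len(chars) - 3, -1, -1):
        if chars[i:i + 3] == ["-", "U", "-"]:
            chars[i + 1] = "-"  # the trapped short becomes long
    return "".join(chars)
```

Runs of shorts such as 'UUUU' are left untouched, since only a short flanked by two longs forms an inverted amphibrach.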

8.1.13.1.1.5. cltk.prosody.lat.macronizer module

Delineate the length of Latin vowels.

The Macronizer class places a macron over naturally long Latin vowels. To discern whether a vowel is long, a word is first matched with its Morpheus entry by way of its POS tag. The Morpheus entry includes the macronized form of the matched word.

Since the accuracy of the macronizer largely derives from the accuracy of the POS tagger used to match words to their Morpheus entries, the Macronizer class allows multiple POS taggers to be used.
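The matching step described above amounts to a tag-keyed lookup. A minimal sketch, with the Morpheus database mocked as an in-memory dict (the entry shown is illustrative, not a real database row):

```python
# Mock of the Morpheus database: unmacronized word -> candidate entries,
# each a (head word, POS tag, macronized form) tuple. Illustrative data.
MORPHEUS = {
    "partes": [("pars", "n-p---fa-", "partēs")],
}

def macronize_word(word: str, tag: str) -> tuple:
    """Pick the candidate whose tag matches the POS tagger's output."""
    for head, entry_tag, macronized in MORPHEUS.get(word.lower(), []):
        if entry_tag == tag:
            return (word, tag, macronized)
    return (word, tag, word)  # no match: fall back to the plain form
```

The fallback branch shows why tagger accuracy matters: a wrong tag simply fails to match any Morpheus entry, and the word passes through unmacronized.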

Todo

Determine how to disambiguate tags (see logger)

class cltk.prosody.lat.macronizer.Macronizer(tagger)[source]

Bases: object

Macronize Latin words.

Macronize text by using the POS tag to find the macronized form within the Morpheus database.

_retrieve_tag(text)[source]

Tag text with chosen tagger and clean tags.

Tag format: [('word', 'tag')]

Parameters:

text (str) – string

Return type:

list[tuple[str, str]]

Returns:

list of tuples, with each tuple containing the word and its pos tag

_retrieve_morpheus_entry(word)[source]

Return the Morpheus entry for a word.

Entry format: [(head word, tag, macronized form)]

Parameters:

word (str) – unmacronized, lowercased word

Return type:

tuple[str, str, str]

Returns:

Morpheus entry in tuples

_macronize_word(word)[source]

Return macronized word.

Parameters:

word (tuple[str, str]) – (word, tag)

Return type:

tuple[str, str, str]

Returns:

(word, tag, macronized_form)

macronize_tags(text)[source]

Return macronized form along with POS tags.

E.g. "Gallia est omnis divisa in partes tres," -> [('gallia', 'n-s---fb-', 'galliā'), ('est', 'v3spia---', 'est'), ('omnis', 'a-s---mn-', 'omnis'), ('divisa', 't-prppnn-', 'dīvīsa'), ('in', 'r--------', 'in'), ('partes', 'n-p---fa-', 'partēs'), ('tres', 'm--------', 'trēs')]

Parameters:

text (str) – raw text

Return type:

list[tuple[str, str, str]]

Returns:

tuples of head word, tag, macronized form

macronize_text(text)[source]

Return macronized form of text.

E.g. "Gallia est omnis divisa in partes tres," -> "galliā est omnis dīvīsa in partēs trēs ,"

Parameters:

text (str) – raw text

Return type:

str

Returns:

macronized text

8.1.13.1.1.6. cltk.prosody.lat.metrical_validator module

Utility class for validating scansion patterns: hexameter, hendecasyllables, pentameter. Allows users to configure the scansion symbols internally via a constructor argument; a suitable default is provided.

class cltk.prosody.lat.metrical_validator.MetricalValidator(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]

Bases: object

Currently supports validation for: hexameter, hendecasyllables, pentameter.

is_valid_hexameter(scanned_line)[source]

Determine whether a scansion pattern is one of the valid hexameter metrical patterns.

Parameters:

scanned_line (str) – a line containing a sequence of stressed and unstressed syllables

>>> print(MetricalValidator().is_valid_hexameter("-UU---UU---UU-U"))
True
Return type:

bool

is_valid_hendecasyllables(scanned_line)[source]

Determine whether a scansion pattern is one of the valid hendecasyllable metrical patterns.

Parameters:

scanned_line (str) – a line containing a sequence of stressed and unstressed syllables

>>> print(MetricalValidator().is_valid_hendecasyllables("-U-UU-U-U-U"))
True
Return type:

bool

is_valid_pentameter(scanned_line)[source]

Determine whether a scansion pattern is one of the valid pentameter metrical patterns.

Parameters:

scanned_line (str) – a line containing a sequence of stressed and unstressed syllables

Return bool:

whether or not the scansion is a valid pentameter

>>> print(MetricalValidator().is_valid_pentameter('-UU-UU--UU-UUX'))
True
Return type:

bool

hexameter_feet(scansion)[source]

Produces a list of hexameter feet, stressed and unstressed syllables with spaces intact. If the scansion line is not entirely correct, it will attempt to corral one or more improper patterns into one or more feet.

Parameters:

scansion (str) – the scanned line

Returns:

a list of strings representing the feet of the hexameter; if the scansion is wildly incorrect, an empty list

>>> print("|".join(MetricalValidator().hexameter_feet(
... "- U U   -     -  - -   -  -     - U  U  -  U")).strip() )
- U U   |-     -  |- -   |-  -     |- U  U  |-  U
>>> print("|".join(MetricalValidator().hexameter_feet(
... "- U U   -     -  U -   -  -     - U  U  -  U")).strip())
- U U   |-     -  |U -   |-  -     |- U  U  |-  U
Return type:

list[str]

static hexameter_known_stresses()[source]

Provide a list of known stress positions for a hexameter.

Return type:

list[int]

Returns:

a zero-based list enumerating which syllables are known to be stressed.

static hexameter_possible_unstresses()[source]

Provide a list of possible positions which may be unstressed syllables in a hexameter.

Return type:

list[int]

Returns:

a zero-based list enumerating which syllables may be unstressed.

closest_hexameter_patterns(scansion)[source]

Find the closest group of matching valid hexameter patterns.

Return type:

list[str]

Returns:

list of the closest valid hexameter patterns; only candidates with a matching length/number of syllables are considered.

>>> print(MetricalValidator().closest_hexameter_patterns('-UUUUU-----UU--'))
['-UU-UU-----UU--']
static pentameter_possible_stresses()[source]

Provide a list of possible stress positions for a pentameter.

Return type:

list[int]

Returns:

a zero-based list enumerating which syllables may be stressed.

closest_pentameter_patterns(scansion)[source]

Find the closest group of matching valid pentameter patterns.

Return type:

list[str]

Returns:

list of the closest valid pentameter patterns; only candidates with a matching length/number of syllables are considered.

>>> print(MetricalValidator().closest_pentameter_patterns('--UUU--UU-UUX'))
['---UU--UU-UUX']
closest_hendecasyllable_patterns(scansion)[source]

Find the closest group of matching valid hendecasyllable patterns.

Return type:

list[str]

Returns:

list of the closest valid hendecasyllable patterns; only candidates with a matching length/number of syllables are considered.

>>> print(MetricalValidator().closest_hendecasyllable_patterns('UU-UU-U-U-X'))
['-U-UU-U-U-X', 'U--UU-U-U-X']
_closest_patterns(patterns, scansion)[source]

Find the closest group of matching valid patterns.

Parameters:
  • patterns – a list of patterns

  • scansion – the scansion pattern thus far

Return type:

list[str]

Returns:

list of the closest valid patterns; only candidates with a matching length/number of syllables are considered.
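The selection rule — keep only candidates of matching length, then return those with the smallest difference — can be sketched with a Hamming distance. This is an illustrative re-implementation, not the class's actual code:

```python
def closest_patterns(patterns, scansion):
    """Return the valid patterns nearest to the scansion by Hamming distance."""
    target = "".join(scansion.split())  # drop alignment spacing
    candidates = [p for p in patterns if len(p) == len(target)]
    if not candidates:
        return []
    dist = lambda p: sum(a != b for a, b in zip(p, target))
    best = min(dist(p) for p in candidates)
    return [p for p in candidates if dist(p) == best]
```

Several patterns can tie for the minimum distance, which is why the hendecasyllable example above returns two candidates.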

_build_hexameter_template(stress_positions)[source]

Build a hexameter scansion template from a string of 5 binary numbers. NOTE: Traditionally the fifth foot is a dactyl and spondee substitution is rare; however, since it is a possible combination, we include it here.

Parameters:

stress_positions (str) – 5 binary digits, indicating whether each foot is a dactyl or a spondee

Return type:

str

Returns:

a valid hexameter scansion template: a string representing stressed and unstressed syllables, with the optional terminal ending.

>>> print(MetricalValidator()._build_hexameter_template("01010"))
-UU---UU---UU-X
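The template construction can be sketched directly, assuming (per the doctest above) that '0' encodes a dactyl and '1' a spondee:

```python
def build_hexameter_template(stress_positions: str) -> str:
    """Expand five binary digits into a hexameter scansion template."""
    feet = {"0": "-UU", "1": "--"}  # assumed encoding: 0 = dactyl, 1 = spondee
    return "".join(feet[d] for d in stress_positions) + "-X"
```

Iterating this over all 32 five-digit strings would generate the full set of templates the validator checks against.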
_build_pentameter_templates()[source]

Create pentameter templates.

Return type:

list[str]

8.1.13.1.1.7. cltk.prosody.lat.pentameter_scanner module

Utility class for producing a scansion pattern for a Latin pentameter.

Given a line of pentameter, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.

class cltk.prosody.lat.pentameter_scanner.PentameterScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]

Bases: VerseScanner

The scansion symbols used can be configured by passing a suitable constants class to the constructor.

scan(original_line, optional_transform=False)[source]

Scan a line of Latin pentameter and produce a scansion pattern, and other data.

Parameters:
  • original_line (str) – the original line of Latin verse

  • optional_transform (bool) – whether or not to perform i to j transform for syllabification

Return type:

Verse

Returns:

a Verse object

>>> scanner = PentameterScanner()
>>> print(scanner.scan('ex hoc ingrato gaudia amore tibi.'))
Verse(original='ex hoc ingrato gaudia amore tibi.', scansion='-   -  -   - -   - U  U - U  U U ', meter='pentameter', valid=True, syllable_count=12, accented='ēx hōc īngrātō gaudia amōre tibi.', scansion_notes=['Spondaic pentameter'], syllables = ['ēx', 'hoc', 'īn', 'gra', 'to', 'gau', 'di', 'a', 'mo', 're', 'ti', 'bi'])
>>> print(scanner.scan(
... "in vento et rapida scribere oportet aqua.").scansion) 
-   -    -   U U -    - U   U -  U  U  U
make_spondaic(scansion)[source]

If a pentameter line has 12 syllables, then it must start with double spondees.

Parameters:

scansion (str) – a string of scansion patterns

Return type:

str

Returns:

a scansion pattern string starting with two spondees

>>> print(PentameterScanner().make_spondaic("U  U  U  U  U  U  U  U  U  U  U  U"))
-  -  -  -  -  -  U  U  -  U  U  U
make_dactyls(scansion)[source]

If a pentameter line has 14 syllables, it starts and ends with double dactyls.

Parameters:

scansion (str) – a string of scansion patterns

Return type:

str

Returns:

a scansion pattern string starting and ending with double dactyls

>>> print(PentameterScanner().make_dactyls("U  U  U  U  U  U  U  U  U  U  U  U  U  U"))
-  U  U  -  U  U  -  -  U  U  -  U  U  U
correct_penultimate_dactyl_chain(scansion)[source]

For pentameter the last two feet of the verse are predictable dactyls, and do not regularly allow substitutions.

Parameters:

scansion (str) – scansion line thus far

Return type:

str

Returns:

corrected line of scansion

>>> print(PentameterScanner().correct_penultimate_dactyl_chain(
... "U  U  U  U  U  U  U  U  U  U  U  U  U  U"))
U  U  U  U  U  U  U  -  U  U  -  U  U  U

8.1.13.1.1.8. cltk.prosody.lat.scanner module

Scansion module for scanning Latin prose rhythms.

class cltk.prosody.lat.scanner.Scansion(punctuation=None, clausula_length=13, elide=True)[source]

Bases: object

Preprocesses Latin text for prose rhythm analysis.

SHORT_VOWELS = ['a', 'e', 'i', 'o', 'u', 'y']
LONG_VOWELS = ['ā', 'ē', 'ī', 'ō', 'ū']
VOWELS = ['a', 'e', 'i', 'o', 'u', 'y', 'ā', 'ē', 'ī', 'ō', 'ū']
DIPHTHONGS = ['ae', 'au', 'ei', 'oe', 'ui']
SINGLE_CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'f', 'j']
DOUBLE_CONSONANTS = ['x', 'z']
CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'f', 'j', 'x', 'z']
DIGRAPHS = ['ch', 'ph', 'th', 'qu']
LIQUIDS = ['r', 'l']
MUTES = ['b', 'p', 'd', 't', 'c', 'g']
MUTE_LIQUID_EXCEPTIONS = ['gl', 'bl']
NASALS = ['m', 'n']
SESTS = ['sc', 'sm', 'sp', 'st', 'z']
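The "long by position" and mute-plus-liquid logic these constants support can be sketched as follows. This is a hypothetical simplification of what the tokenizer actually does: it inspects only the two letters that follow a syllable's vowel, whereas the real tokenizer also works across word boundaries:

```python
# Hypothetical simplified sketch of the "long by position" rule.
MUTES = set("bpdtcg")
LIQUIDS = set("rl")
MUTE_LIQUID_EXCEPTIONS = {"gl", "bl"}  # clusters that do make position
VOWELS = set("aeiouyāēīōū")

def long_by_position(next_two: str):
    """Return (is_long, note) for the two letters after a syllable's vowel.

    Two consonants make the syllable long by position, unless they form
    a mute + liquid cluster, which leaves the syllable's length ambiguous.
    """
    if len(next_two) < 2 or any(ch in VOWELS for ch in next_two):
        return (False, None)
    if (next_two[0] in MUTES and next_two[1] in LIQUIDS
            and next_two not in MUTE_LIQUID_EXCEPTIONS):
        return (False, "mute+liquid")  # ambiguous: not automatically long
    return (True, None)
```

The (is_long, note) pair mirrors the 'long_by_position' tuples visible in the _tokenize_syllables doctests below.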
_tokenize_syllables(word)[source]

Tokenize the syllables of a word. "mihi" -> [{"syllable": "mi", index: 0, ... }, ... ]

Syllable properties:
  • syllable: string – the syllable
  • index: int – position in the word
  • long_by_nature: bool – whether the syllable is long by nature
  • accented: bool – whether the syllable receives the accent
  • long_by_position: bool – whether the syllable is long by position

Parameters:

word (str) – the word whose syllables are tokenized

Return type:

list[Any]

Returns:

list of syllable dictionaries

>>> Scansion()._tokenize_syllables("mihi")
[{'syllable': 'mi', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'hi', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("ivi")
[{'syllable': 'i', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'vi', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("audītū")
[{'syllable': 'au', 'index': 0, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'dī', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'tū', 'index': 2, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("ā")
[{'syllable': 'ā', 'index': 0, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}]
>>> Scansion()._tokenize_syllables("conjiciō")
[{'syllable': 'con', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}, {'syllable': 'ji', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'ci', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'ō', 'index': 3, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("lingua")
[{'syllable': 'lin', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'gua', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("abrante")
[{'syllable': 'ab', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': False}, {'syllable': 'ran', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'te', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("redemptor")
[{'syllable': 'red', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'em', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'ptor', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("nagrante")
[{'syllable': 'na', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': False}, {'syllable': 'gran', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'te', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
_tokenize_words(sentence)[source]

Tokenize the words of a sentence. "Puella bona est" -> [{"word": "puella", "index": 0, ... }, ... ]

Word properties:
  • word: string – the word
  • index: int – position in the sentence
  • syllables: list – list of syllable objects
  • syllables_count: int – number of syllables in the word

Parameters:

sentence (str) – the sentence whose words are tokenized

Return type:

list[Any]

Returns:

list of word dictionaries

>>> Scansion()._tokenize_words('dedērunt te miror antōnī quorum.')
[{'word': 'dedērunt', 'index': 0, 'syllables': [{'syllable': 'de', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'dē', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'runt', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}], 'syllables_count': 3}, {'word': 'te', 'index': 1, 'syllables': [{'syllable': 'te', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'miror', 'index': 2, 'syllables': [{'syllable': 'mi', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'ror', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'antōnī', 'index': 3, 'syllables': [{'syllable': 'an', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}, {'syllable': 'tō', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'nī', 'index': 2, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 3}, {'word': 'quorum.', 'index': 4, 'syllables': [{'syllable': 'quo', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'rum', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}]
>>> Scansion()._tokenize_words('a spes co i no xe cta.')
[{'word': 'a', 'index': 0, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'sest'), 'accented': True}], 'syllables_count': 1}, {'word': 'spes', 'index': 1, 'syllables': [{'syllable': 'spes', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'co', 'index': 2, 'syllables': [{'syllable': 'co', 'index': 0, 'elide': (True, 'weak'), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'i', 'index': 3, 'syllables': [{'syllable': 'i', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'no', 'index': 4, 'syllables': [{'syllable': 'no', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'xe', 'index': 5, 'syllables': [{'syllable': 'xe', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'cta.', 'index': 6, 'syllables': [{'syllable': 'cta', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
>>> Scansion()._tokenize_words('x')
[]
>>> Scansion()._tokenize_words('atae amo.')
[{'word': 'atae', 'index': 0, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'tae', 'index': 1, 'elide': (True, 'strong'), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'amo.', 'index': 1, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'mo', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}]
>>> Scansion()._tokenize_words('bar rid.')
[{'word': 'bar', 'index': 0, 'syllables': [{'syllable': 'bar', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'rid.', 'index': 1, 'syllables': [{'syllable': 'rid', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
>>> Scansion()._tokenize_words('ba brid.')
[{'word': 'ba', 'index': 0, 'syllables': [{'syllable': 'ba', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': True}], 'syllables_count': 1}, {'word': 'brid.', 'index': 1, 'syllables': [{'syllable': 'brid', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
tokenize(text)[source]

Tokenize text on sentence boundaries into structured word and syllable data, e.g. “Puella bona est. Puer malus est.” -> [ [{word: puella, syllables: […], index: 0}, … ], … ]

Return type:

list[Any]

>>> Scansion().tokenize('puella bona est. puer malus est.')
[{'plain_text_sentence': 'puella bona est', 'structured_sentence': [{'word': 'puella', 'index': 0, 'syllables': [{'syllable': 'pu', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'el', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'la', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 3}, {'word': 'bona', 'index': 1, 'syllables': [{'syllable': 'bo', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'na', 'index': 1, 'elide': (True, 'weak'), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'est', 'index': 2, 'syllables': [{'syllable': 'est', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}]}, {'plain_text_sentence': ' puer malus est', 'structured_sentence': [{'word': 'puer', 'index': 0, 'syllables': [{'syllable': 'pu', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'er', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}], 'syllables_count': 2}, {'word': 'malus', 'index': 1, 'syllables': [{'syllable': 'ma', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'lus', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'est', 'index': 2, 'syllables': [{'syllable': 'est', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 
'syllables_count': 1}]}, {'plain_text_sentence': '', 'structured_sentence': []}]
scan_text(text)[source]

Return a flat list of rhythms. The desired clausula length is passed as a parameter; clausulae shorter than the specified length can be excluded.

Return type:

list[str]

>>> Scansion().scan_text('dedērunt te miror antōnī quorum. sī quid est in mē ingenī jūdicēs quod sentiō.')
['u--uuu---ux', 'u---u--u---ux']

8.1.13.1.1.9. cltk.prosody.lat.scansion_constants module

Configuration class for specifying scansion constants.

class cltk.prosody.lat.scansion_constants.ScansionConstants(unstressed='U', stressed='-', optional_terminal_ending='X', separator='|')[source]

Bases: object

Constants containing strings have characters in upper and lower case, since they will often be used in regular expressions and to preserve a verse’s original case.

This class also allows users to customize scansion constants and scanner behavior.

>>> constants = ScansionConstants(unstressed="U",stressed= "-", optional_terminal_ending="X")
>>> print(constants.DACTYL)
-UU
>>> smaller_constants = ScansionConstants(
... unstressed="˘",stressed= "¯", optional_terminal_ending="x")
>>> print(smaller_constants.DACTYL)
¯˘˘
HEXAMETER_ENDING

The following two constants are not official scansion terms, but are invalid in hexameters

DOUBLED_CONSONANTS

The prefix order is not arbitrary: one will want to match on “extra” before “ex”

8.1.13.1.1.10. cltk.prosody.lat.scansion_formatter module

Utility class for formatting scansion patterns

class cltk.prosody.lat.scansion_formatter.ScansionFormatter(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]

Bases: object

Users can specify which scansion symbols to use in the formatting.

>>> print(ScansionFormatter().hexameter( "-UU-UU-UU---UU--"))
-UU|-UU|-UU|--|-UU|--
>>> constants = ScansionConstants(
... unstressed="˘", stressed="¯", optional_terminal_ending="x")
>>> formatter = ScansionFormatter(constants)
>>> print(formatter.hexameter( "¯˘˘¯˘˘¯˘˘¯¯¯˘˘¯¯"))
¯˘˘|¯˘˘|¯˘˘|¯¯|¯˘˘|¯¯
hexameter(line)[source]

Format a string of hexameter metrical stress patterns into foot divisions

Parameters:

line (str) – the scansion pattern

Return type:

str

Returns:

the scansion string formatted with foot breaks

>>> print(ScansionFormatter().hexameter( "-UU-UU-UU---UU--"))
-UU|-UU|-UU|--|-UU|--
merge_line_scansion(line, scansion)[source]

Merge a line of verse with its scansion string. Do not accent diphthongs.

Parameters:
  • line (str) – the original Latin verse line

  • scansion (str) – the scansion pattern

Return type:

str

Returns:

the original line with the scansion pattern applied via macrons

>>> print(ScansionFormatter().merge_line_scansion(
... "Arma virumque cano, Troiae qui prīmus ab ōrīs",
... "-  U  U -  U  U  -     UU-   -   - U  U  - -"))
Ārma virūmque canō, Troiae quī prīmus ab ōrīs
>>> print(ScansionFormatter().merge_line_scansion(
... "lītora, multum ille et terrīs iactātus et alto",
... " - U U   -     -    -   -  -   -  - U  U  -  U"))
lītora, mūltum īlle ēt tērrīs iāctātus et ālto
>>> print(ScansionFormatter().merge_line_scansion(
... 'aut facere, haec a te dictaque factaque sunt',
... ' -   U U      -  -  -  -  U  U  -  U  U  -  '))
aut facere, haec ā tē dīctaque fāctaque sūnt

8.1.13.1.1.11. cltk.prosody.lat.string_utils module

Utility class for processing scansion and text.

cltk.prosody.lat.string_utils.remove_punctuation_dict()[source]

Provide a dictionary for removing punctuation outright, without substituting spaces.

Returns:

a dict keyed by punctuation code points from the Unicode table

>>> print("I'm ok! Oh #%&*()[]{}!? Fine!".translate(
... remove_punctuation_dict()).lstrip())
Im ok Oh  Fine
Return type:

dict[int, None]

cltk.prosody.lat.string_utils.punctuation_for_spaces_dict()[source]

Provide a dictionary for removing punctuation, keeping spaces. Essential for scansion to keep stress patterns in alignment with original vowel positions in the verse.

Returns:

a dict keyed by punctuation code points from the Unicode table, each mapped to a space

>>> print("I'm ok! Oh #%&*()[]{}!? Fine!".translate(
... punctuation_for_spaces_dict()).strip())
I m ok  Oh              Fine
Return type:

dict[int, str]
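Both helpers presumably build `str.translate` tables keyed by Unicode code point. A minimal sketch of the space-preserving variant, under the assumption that it maps every code point in a Unicode punctuation category (categories starting with "P") to a space; the name `punctuation_to_spaces_dict` below is hypothetical, not the cltk implementation:

```python
import sys
import unicodedata

def punctuation_to_spaces_dict():
    """Map every Unicode punctuation code point to a single space.

    A sketch of the documented behavior; the real cltk code may differ.
    """
    return {
        codepoint: " "
        for codepoint in range(sys.maxunicode + 1)
        if unicodedata.category(chr(codepoint)).startswith("P")
    }

# Spaces replace punctuation one-for-one, so character positions survive.
table = punctuation_to_spaces_dict()
print("I'm ok!?".translate(table))  # I m ok (with trailing spaces)
```

Because each punctuation mark becomes exactly one space, stress marks computed against the cleaned line still align with vowel positions in the original verse.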

cltk.prosody.lat.string_utils.differences(scansion, candidate)[source]

Given two strings, return a list of index positions where the contents differ.

Parameters:
  • scansion (str) –

  • candidate (str) –

Return type:

list[int]

Returns:

>>> differences("abc", "abz")
[2]
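The behavior amounts to an index-by-index comparison up to the shorter length; a sketch (an illustrative reimplementation, not the library source):

```python
def differences(scansion, candidate):
    # Collect every index where the two strings disagree.
    return [
        i
        for i in range(min(len(scansion), len(candidate)))
        if scansion[i] != candidate[i]
    ]

print(differences("abc", "abz"))  # [2]
```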
cltk.prosody.lat.string_utils.mark_list(line)[source]

Given a string, return a list of index positions where a non-whitespace character exists.

Parameters:

line (str) –

Return type:

list[int]

Returns:

>>> mark_list(" a b c")
[1, 3, 5]
cltk.prosody.lat.string_utils.space_list(line)[source]

Given a string, return a list of index positions where a blank space occurs.

Parameters:

line (str) –

Return type:

list[int]

Returns:

>>> space_list("    abc ")
[0, 1, 2, 3, 7]
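Both mark_list and space_list reduce to simple index scans; a sketch of equivalent behavior (assumed, not the library source):

```python
def mark_list(line):
    # Index positions of non-whitespace characters.
    return [i for i, ch in enumerate(line) if not ch.isspace()]

def space_list(line):
    # Index positions of blank spaces.
    return [i for i, ch in enumerate(line) if ch == " "]

print(mark_list(" a b c"))     # [1, 3, 5]
print(space_list("    abc "))  # [0, 1, 2, 3, 7]
```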
cltk.prosody.lat.string_utils.flatten(list_of_lists)[source]

Given a list of lists, flatten all the items into one list.

Parameters:

list_of_lists

Returns:

>>> flatten([ [1, 2, 3], [4, 5, 6]])
[1, 2, 3, 4, 5, 6]
cltk.prosody.lat.string_utils.to_syllables_with_trailing_spaces(line, syllables)[source]

Given a line of syllables and spaces, and a list of syllables, produce a list of the syllables with trailing spaces attached as appropriate.

Parameters:
  • line (str) –

  • syllables (list[str]) –

Return type:

list[str]

Returns:

>>> to_syllables_with_trailing_spaces(' arma virumque cano ',
... ['ar', 'ma', 'vi', 'rum', 'que', 'ca', 'no' ])
[' ar', 'ma ', 'vi', 'rum', 'que ', 'ca', 'no ']
cltk.prosody.lat.string_utils.join_syllables_spaces(syllables, spaces)[source]

Given a list of syllables, and a list of integers indicating the position of spaces, return a string that has a space inserted at the designated points.

Parameters:
  • syllables (list[str]) –

  • spaces (list[int]) –

Return type:

str

Returns:

>>> join_syllables_spaces(["won", "to", "tree", "dun"], [3, 6, 11])
'won to tree dun'
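The documented behavior can be reproduced by emitting characters and inserting a space whenever the running output length hits a designated position. A sketch under that assumption (and assuming space positions are non-adjacent):

```python
def join_syllables_spaces(syllables, spaces):
    out = []
    for syllable in syllables:
        for ch in syllable:
            if len(out) in spaces:  # a space belongs at this output index
                out.append(" ")
            out.append(ch)
    return "".join(out)

print(join_syllables_spaces(["won", "to", "tree", "dun"], [3, 6, 11]))
# won to tree dun
```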
cltk.prosody.lat.string_utils.starts_with_qu(word)[source]

Determine whether or not a word starts with the letters Q and U.

Parameters:

word

Return type:

bool

Returns:

>>> starts_with_qu("qui")
True
>>> starts_with_qu("Quirites")
True
cltk.prosody.lat.string_utils.stress_positions(stress, scansion)[source]

Given a stress value and a scansion line, return the index positions of the stresses.

Parameters:
  • stress (str) –

  • scansion (str) –

Return type:

list[int]

Returns:

>>> stress_positions("-", "    -  U   U - UU    - U U")
[0, 3, 6]
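The index positions count non-space characters only, which suggests a sketch like the following (an assumption about the implementation, not the library source):

```python
def stress_positions(stress, scansion):
    # Ignore spacing; positions index into the sequence of scansion marks.
    marks = [ch for ch in scansion if not ch.isspace()]
    return [i for i, ch in enumerate(marks) if ch == stress]

print(stress_positions("-", "    -  U   U - UU    - U U"))  # [0, 3, 6]
```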
cltk.prosody.lat.string_utils.merge_elisions(elided)[source]

Given a list of strings with different space-swapping elisions applied, merge the elisions, taking as many as possible without compounding the omissions.

Parameters:

elided (list[str]) –

Return type:

str

Returns:

>>> merge_elisions([
... "ignavae agua multum hiatus", "ignav   agua multum hiatus" ,"ignavae agua mult   hiatus"])
'ignav   agua mult   hiatus'
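The merge amounts to a positionwise union of the blanked-out regions: wherever any variant has erased a character to a space, the merged line has a space; otherwise the character from the first string survives. A sketch under the assumption that all variants have equal length:

```python
def merge_elisions(elided):
    reference = elided[0]
    return "".join(
        " " if any(variant[i] == " " for variant in elided) else reference[i]
        for i in range(len(reference))
    )

merged = merge_elisions([
    "ignavae agua multum hiatus",
    "ignav   agua multum hiatus",
    "ignavae agua mult   hiatus",
])
print(merged)  # 'ignav   agua mult   hiatus'
```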
cltk.prosody.lat.string_utils.move_consonant_right(letters, positions)[source]

Given a list of letters, and a list of consonant positions, move the consonant positions to the right, merging strings as necessary.

Parameters:
  • letters (list[str]) –

  • positions (list[int]) –

Return type:

list[str]

Returns:

>>> move_consonant_right(list("abbra"), [ 2, 3])
['a', 'b', '', '', 'bra']
cltk.prosody.lat.string_utils.move_consonant_left(letters, positions)[source]

Given a list of letters, and a list of consonant positions, move the consonant positions to the left, merging strings as necessary.

Parameters:
  • letters (list[str]) –

  • positions (list[int]) –

Return type:

list[str]

Returns:

>>> move_consonant_left(['a', 'b', '', '', 'bra'], [1])
['ab', '', '', '', 'bra']
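Both movers shift a letter into a neighboring slot and leave an empty string behind; the doctest results above are reproduced by this sketch (assumed behavior, not the library source):

```python
def move_consonant_right(letters, positions):
    letters = list(letters)
    for pos in positions:
        # Prepend the consonant to its right-hand neighbor, vacate the slot.
        letters[pos + 1] = letters[pos] + letters[pos + 1]
        letters[pos] = ""
    return letters

def move_consonant_left(letters, positions):
    letters = list(letters)
    for pos in positions:
        # Append the consonant to its left-hand neighbor, vacate the slot.
        letters[pos - 1] = letters[pos - 1] + letters[pos]
        letters[pos] = ""
    return letters

print(move_consonant_right(list("abbra"), [2, 3]))        # ['a', 'b', '', '', 'bra']
print(move_consonant_left(['a', 'b', '', '', 'bra'], [1]))  # ['ab', '', '', '', 'bra']
```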
cltk.prosody.lat.string_utils.merge_next(letters, positions)[source]

Given a list of letters and a list of positions, merge the letter at each position with its next neighbor.

Parameters:
  • letters (list[str]) –

  • positions (list[int]) –

Return type:

list[str]

Returns:

>>> merge_next(['a', 'b', 'o', 'v', 'o' ], [0, 2])
['ab', '', 'ov', '', 'o']
>>> # Note: because it operates on the original list passed in, the effect is not cumulative:
>>> merge_next(['a', 'b', 'o', 'v', 'o' ], [0, 2, 3])
['ab', '', 'ov', 'o', '']
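A sketch consistent with both doctests above (an assumed reimplementation): each listed position absorbs its right-hand neighbor, and because later positions see the already-emptied slots, overlapping positions do not compound:

```python
def merge_next(letters, positions):
    letters = list(letters)
    for pos in positions:
        letters[pos] = letters[pos] + letters[pos + 1]
        letters[pos + 1] = ""
    return letters

print(merge_next(["a", "b", "o", "v", "o"], [0, 2]))     # ['ab', '', 'ov', '', 'o']
print(merge_next(["a", "b", "o", "v", "o"], [0, 2, 3]))  # ['ab', '', 'ov', 'o', '']
```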
cltk.prosody.lat.string_utils.remove_blanks(letters)[source]

Given a list of letters, remove any empty strings.

Parameters:

letters (list[str]) –

Returns:

>>> remove_blanks(['a', '', 'b', '', 'c'])
['a', 'b', 'c']
cltk.prosody.lat.string_utils.split_on(word, section)[source]

Given a string, split on a section, and return the two sections as a tuple.

Parameters:
  • word (str) –

  • section (str) –

Return type:

tuple[str, str]

Returns:

>>> split_on('hamrye', 'ham')
('ham', 'rye')
cltk.prosody.lat.string_utils.remove_blank_spaces(syllables)[source]

Given a list of letters, remove any blank spaces or empty strings.

Parameters:

syllables (list[str]) –

Return type:

list[str]

Returns:

>>> remove_blank_spaces(['', 'a', ' ', 'b', ' ', 'c', ''])
['a', 'b', 'c']
cltk.prosody.lat.string_utils.overwrite(char_list, regexp, quality, offset=0)[source]

Given a list of characters and spaces, a matching regular expression, and a quality or character, replace the matched character with the given quality, applying an offset if provided.

Parameters:
  • char_list (list[str]) –

  • regexp (str) –

  • quality (str) –

  • offset (int) –

Return type:

list[str]

Returns:

>>> overwrite(list("multe igne"), r"e\s[aeiou]", " ")
['m', 'u', 'l', 't', ' ', ' ', 'i', 'g', 'n', 'e']
cltk.prosody.lat.string_utils.overwrite_dipthong(char_list, regexp, quality)[source]

Given a list of characters and spaces, a matching regular expression, and a quality or character, overwrite the matched diphthong with the given quality, applying an offset if provided.

Parameters:
  • char_list (list[str]) – a list of characters

  • regexp (str) – a matching regular expression

  • quality (str) – a quality or character to replace

Return type:

list[str]

Returns:

a list of characters with the diphthong overwritten

>>> overwrite_dipthong(list("multae aguae"), r"ae\s[aeou]", " ")
['m', 'u', 'l', 't', ' ', ' ', ' ', 'a', 'g', 'u', 'a', 'e']
cltk.prosody.lat.string_utils.get_unstresses(stresses, count)[source]

Given a list of stressed positions, and a count of possible positions, return a list of the unstressed positions.

Parameters:
  • stresses (list[int]) – a list of stressed positions

  • count (int) – the number of possible positions

Return type:

list[int]

Returns:

a list of unstressed positions

>>> get_unstresses([0, 3, 6, 9, 12, 15], 17)
[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
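A minimal sketch: the unstressed positions are simply the complement of the stressed ones over range(count).

```python
def get_unstresses(stresses, count):
    stressed = set(stresses)
    return [i for i in range(count) if i not in stressed]

print(get_unstresses([0, 3, 6, 9, 12, 15], 17))
# [1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
```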

8.1.13.1.1.12. cltk.prosody.lat.syllabifier module

Latin language syllabifier. Parses a Latin word or a space-separated list of words into a list of syllables. Consonantal i is transformed into a j at the start of a word as necessary. Tuned for poetry and verse, this class is tolerant of isolated single-character consonants that may appear due to elision.

class cltk.prosody.lat.syllabifier.Syllabifier(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]

Bases: object

Scansion constants can be modified and passed into the constructor if desired.

syllabify(words)[source]

Parse a Latin word into a list of syllable strings.

Parameters:

words (str) – a string containing one Latin word or many words separated by spaces.

Return type:

list[str]

Returns:

list of string, each representing a syllable.

>>> syllabifier = Syllabifier()
>>> print(syllabifier.syllabify("fuit"))
['fu', 'it']
>>> print(syllabifier.syllabify("libri"))
['li', 'bri']
>>> print(syllabifier.syllabify("contra"))
['con', 'tra']
>>> print(syllabifier.syllabify("iaculum"))
['ja', 'cu', 'lum']
>>> print(syllabifier.syllabify("amo"))
['a', 'mo']
>>> print(syllabifier.syllabify("bracchia"))
['brac', 'chi', 'a']
>>> print(syllabifier.syllabify("deinde"))
['dein', 'de']
>>> print(syllabifier.syllabify("certabant"))
['cer', 'ta', 'bant']
>>> print(syllabifier.syllabify("aere"))
['ae', 're']
>>> print(syllabifier.syllabify("adiungere"))
['ad', 'jun', 'ge', 're']
>>> print(syllabifier.syllabify("mōns"))
['mōns']
>>> print(syllabifier.syllabify("domus"))
['do', 'mus']
>>> print(syllabifier.syllabify("lixa"))
['li', 'xa']
>>> print(syllabifier.syllabify("asper"))
['as', 'per']
>>> #  handle doubles
>>> print(syllabifier.syllabify("siccus"))
['sic', 'cus']
>>> # handle liquid + liquid
>>> print(syllabifier.syllabify("almus"))
['al', 'mus']
>>> # handle liquid + mute
>>> print(syllabifier.syllabify("ambo"))
['am', 'bo']
>>> print(syllabifier.syllabify("anguis"))
['an', 'guis']
>>> print(syllabifier.syllabify("arbor"))
['ar', 'bor']
>>> print(syllabifier.syllabify("pulcher"))
['pul', 'cher']
>>> print(syllabifier.syllabify("ruptus"))
['ru', 'ptus']
>>> print(syllabifier.syllabify("Bīthÿnus"))
['Bī', 'thÿ', 'nus']
>>> print(syllabifier.syllabify("sanguen"))
['san', 'guen']
>>> print(syllabifier.syllabify("unguentum"))
['un', 'guen', 'tum']
>>> print(syllabifier.syllabify("lingua"))
['lin', 'gua']
>>> print(syllabifier.syllabify("linguā"))
['lin', 'guā']
>>> print(syllabifier.syllabify("languidus"))
['lan', 'gui', 'dus']
>>> print(syllabifier.syllabify("suis"))
['su', 'is']
>>> print(syllabifier.syllabify("habui"))
['ha', 'bu', 'i']
>>> print(syllabifier.syllabify("habuit"))
['ha', 'bu', 'it']
>>> print(syllabifier.syllabify("qui"))
['qui']
>>> print(syllabifier.syllabify("quibus"))
['qui', 'bus']
>>> print(syllabifier.syllabify("hui"))
['hui']
>>> print(syllabifier.syllabify("cui"))
['cui']
>>> print(syllabifier.syllabify("huic"))
['huic']
_setup(word)[source]

Prepares a word for syllable processing.

If the word starts with a prefix, process it separately.

Parameters:

word –

Return type:

list[str]

convert_consonantal_i(word)[source]

Convert i to j when at the start of a word.

Return type:

str

_process(word)[source]

Process a word into a list of strings representing the syllables of the word. This method describes rules for consonant grouping behaviors and then iteratively applies those rules to the list of letters that comprise the word, until all the letters are grouped into appropriate syllable groups.

Parameters:

word (str) –

Return type:

list[str]

Returns:

_contains_consonants(letter_group)[source]

Check if a string contains consonants.

Return type:

bool

_contains_vowels(letter_group)[source]

Check if a string contains vowels.

Return type:

bool

_ends_with_vowel(letter_group)[source]

Check if a string ends with a vowel.

Return type:

bool

_starts_with_vowel(letter_group)[source]

Check if a string starts with a vowel.

Return type:

bool

_starting_consonants_only(letters)[source]

Return a list of starting consonant positions.

Return type:

list

_ending_consonants_only(letters)[source]

Return a list of positions for ending consonants.

Return type:

list[int]

_find_solo_consonant(letters)[source]

Find the positions of any solo consonants that are not yet paired with a vowel.

Return type:

list[int]

_find_consonant_cluster(letters)[source]

Find clusters of consonants that do not contain a vowel.

Parameters:

letters (list[str]) –

Return type:

list[int]

_move_consonant(letters, positions)[source]

Given a list of consonant positions, move the consonants according to certain consonant syllable behavioral rules for gathering and grouping.

Parameters:
  • letters (list) –

  • positions (list[int]) –

Return type:

list[str]

Returns:

get_syllable_count(syllables)[source]

Counts the number of syllable groups that would occur after elision.

Often we will want to preserve the position and separation of syllables so that they can be used to reconstitute a line and apply stresses to the original word positions. However, we also want to be able to count the number of syllables accurately.

Parameters:

syllables (list[str]) –

Return type:

int

Returns:

>>> syllabifier = Syllabifier()
>>> print(syllabifier.get_syllable_count([
... 'Jām', 'tūm', 'c', 'au', 'sus', 'es', 'u', 'nus', 'I', 'ta', 'lo', 'rum']))
11
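The 12 tokens above collapse to a count of 11 because the isolated consonant 'c' (left over from elision) attaches to a neighboring syllable. One way to sketch this count, under the assumption that any vowel-less token merges with a neighbor rather than counting on its own (this is an illustrative heuristic, not the cltk implementation):

```python
VOWELS = set("aeiouyāēīōūȳ")

def count_syllables(syllables):
    # Count only tokens that contain a vowel; stray consonants produced by
    # elision attach to neighboring syllables and add no count of their own.
    return sum(
        1 for syllable in syllables
        if any(ch in VOWELS for ch in syllable.lower())
    )

tokens = ['Jām', 'tūm', 'c', 'au', 'sus', 'es', 'u', 'nus', 'I', 'ta', 'lo', 'rum']
print(count_syllables(tokens))  # 11
```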

8.1.13.1.1.13. cltk.prosody.lat.verse module

Data structure class for a line of metrical verse.

class cltk.prosody.lat.verse.Verse(original, scansion='', meter=None, valid=False, syllable_count=0, accented='', scansion_notes=None, syllables=None)[source]

Bases: object

Class representing a line of metrical verse.

This class is round-trippable; the __repr__ call can be used for construction.

>>> positional_hex = Verse(original='impulerit. Tantaene animis caelestibus irae?',
... scansion='-  U U -    -   -   U U -    - -  U U  -  - ', meter='hexameter',
... valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?',
... scansion_notes=['Valid by positional stresses.'],
... syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
>>> dupe = eval(positional_hex.__repr__())
>>> dupe
Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='-  U U -    -   -   U U -    - -  U U  -  - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
>>> positional_hex
Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='-  U U -    -   -   U U -    - -  U U  -  - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
working_line

placeholder for data transformations

8.1.13.1.1.14. cltk.prosody.lat.verse_scanner module

Parent class and utility class for producing a scansion pattern for a line of Latin verse.

Some useful methods:

  • Performs a conservative i to j transformation

  • Performs elisions

  • Accents vowels by position

  • Breaks the line into a list of syllables by calling a Syllabifier class, which may be injected into this class’s constructor

class cltk.prosody.lat.verse_scanner.VerseScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, **kwargs)[source]

Bases: object

The scansion symbols used can be configured by passing a suitable constants class to the constructor.

transform_i_to_j(line)[source]

Transform instances of consonantal i to j.

Parameters:

line (str) –

Return type:

str

>>> print(VerseScanner().transform_i_to_j("iactātus"))
jactātus
>>> print(VerseScanner().transform_i_to_j("bracchia"))
bracchia
transform_i_to_j_optional(line)[source]

Sometimes for the demands of meter a more permissive i to j transformation is warranted.

Parameters:

line (str) –

Return type:

str

Returns:

>>> print(VerseScanner().transform_i_to_j_optional("Italiam"))
Italjam
>>> print(VerseScanner().transform_i_to_j_optional("Lāvīniaque"))
Lāvīnjaque
>>> print(VerseScanner().transform_i_to_j_optional("omnium"))
omnjum
accent_by_position(verse_line)[source]

Accent vowels according to the rules of scansion.

Parameters:

verse_line (str) – a line of unaccented verse

Return type:

str

Returns:

the same line with vowels accented by position

>>> print(VerseScanner().accent_by_position(
... "Arma virumque cano, Troiae qui primus ab oris").lstrip())
Ārma virūmque canō  Trojae qui primus ab oris
elide_all(line)[source]

Given a string of space separated syllables, erase with spaces the syllable portions that would disappear according to the rules of elision.

Parameters:

line (str) –

Return type:

str

Returns:

calc_offset(syllables_spaces)[source]

Calculate a dictionary of accent positions from a list of syllables with spaces.

Parameters:

syllables_spaces (list[str]) –

Return type:

dict[int, int]

Returns:

produce_scansion(stresses, syllables_wspaces, offset_map)[source]

Create a scansion string that has stressed and unstressed syllable positions in locations that correspond with the original text’s syllable vowels.

Parameters:
  • stresses – list of syllable positions

  • syllables_wspaces – list of syllables with spaces escaped for punctuation or elision

  • offset_map – dictionary mapping each syllable position to an offset amount, the number of spaces to skip in the original line before inserting the accent

Return type:

str

flag_dipthongs(syllables)[source]

Return the index positions of syllables that contain a diphthong

Parameters:

syllables (list[str]) –

Return type:

list[int]

Returns:

elide(line, regexp, quantity=1, offset=0)[source]

Erase a section of a line, matching on a regex, pushing in a quantity of blank spaces, and jumping forward with an offset if necessary. If the elided vowel was strong, the vowel it merges with takes on the stress.

Parameters:
  • line (str) –

  • regexp (str) –

  • quantity (int) –

  • offset (int) –

Return type:

str

Returns:

>>> print(VerseScanner().elide("uvae avaritia", r"[e]\s*[a]"))
uv   āvaritia
>>> print(VerseScanner().elide("mare avaritia", r"[e]\s*[a]"))
mar  avaritia
correct_invalid_start(scansion)[source]

If a hexameter, hendecasyllabic, or pentameter scansion starts with a spondee, an unstressed syllable in the third position must actually be stressed, so we will convert it: - - | U -> - - | -

Parameters:

scansion (str) –

Return type:

str

Returns:

>>> print(VerseScanner().correct_invalid_start(
... " -   - U   U -  -  U U U U  U U  - -").strip())
-   - -   - -  -  U U U U  U U  - -
correct_first_two_dactyls(scansion)[source]

If a hexameter or pentameter starts with a spondee, an unstressed syllable in the third position must actually be stressed, so we will convert it: - - | U -> - - | -. And/or if the starting pattern is spondee + trochee + stressed syllable, then the unstressed trochee can be corrected: - - | - U | - -> - - | - - | -

Parameters:

scansion (str) –

Return type:

str

Returns:

>>> print(VerseScanner().correct_first_two_dactyls(
... " -   - U   U -  -  U U U U  U U  - -")) 
 -   - -   - -  -  U U U U  U U  - -
assign_candidate(verse, candidate)[source]

Helper method; make sure that the verse object is properly packaged.

Parameters:
  • verse (Verse) –

  • candidate (str) –

Return type:

Verse

Returns: