cltk.phonology.enm package

Middle English phonology

Submodules

cltk.phonology.enm.phonology module

Middle English phonology tools

class cltk.phonology.enm.phonology.MiddleEnglishSyllabifier

Bases: object

Middle English syllabifier

Return type

List[str]

cltk.phonology.enm.stress module

Middle English stress module

class cltk.phonology.enm.stress.MiddleEnglishStresser(syllabifier=None)

Bases: object

Middle English stresser

stress(word, stress_rule='FSR')
  • word – word to stress

  • stress_rule

    Stress Rule, valid options:

    'FSR': French Stress Rule, stress falls on the ultima, unless it contains schwa (ends with e), in which case the penult is stressed.

    'GSR': Germanic Stress Rule, stress falls on the first syllable of the stem. Note that the accuracy of the function directly depends on that of the stemmer.

    'LSR': Latin Stress Rule, stress falls on the penult if it is heavy; otherwise on the antepenult if the word has more than two syllables; otherwise on the ultima.

Returns

A list containing the separate syllables, where the stressed syllable is prefixed by an apostrophe ('). Monosyllabic words are left unchanged, since stress indicates relative emphasis.


>>> from cltk.phonology.syllabify import Syllabifier
>>> from cltk.phonology.enm.syllabifier import DIPHTHONGS, TRIPHTHONGS, SHORT_VOWELS, LONG_VOWELS
>>> enm_syllabifier = Syllabifier()
>>> enm_syllabifier.set_short_vowels(SHORT_VOWELS)
>>> enm_syllabifier.set_vowels(SHORT_VOWELS+LONG_VOWELS)
>>> enm_syllabifier.set_diphthongs(DIPHTHONGS)
>>> enm_syllabifier.set_triphthongs(TRIPHTHONGS)
>>> stresser = MiddleEnglishStresser(enm_syllabifier)
>>> stresser.stress('beren', stress_rule="FSR")
['ber', "'en"]
>>> stresser.stress('prendre', stress_rule="FSR")
["'pren", 'dre']
>>> stresser.stress('yisterday', stress_rule="GSR")
['yi', 'ster', "'day"]
>>> stresser.stress('day', stress_rule="GSR")
>>> stresser.stress('mervelus', stress_rule="LSR")
["'mer", 'vel', 'us']
>>> stresser.stress('verbum', stress_rule="LSR")
['ver', "'bum"]
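
The French Stress Rule described above can be sketched on an already syllabified word. This is a minimal illustration, not the CLTK implementation: the helper name stress_fsr_sketch is hypothetical, and it assumes that an ultima "contains schwa" exactly when it ends in the letter e, as the rule description suggests.

```python
def stress_fsr_sketch(syllables):
    """Hypothetical helper: mark the stressed syllable under the FSR.

    `syllables` is a list of syllable strings; the result is a new list
    in which the stressed syllable is prefixed by an apostrophe.
    """
    # Monosyllabic (or empty) input is returned unchanged.
    if len(syllables) < 2:
        return list(syllables)

    # Assumption: an ultima ending in "e" is treated as containing schwa,
    # so stress moves to the penult; otherwise the ultima is stressed.
    if syllables[-1].endswith("e"):
        target = len(syllables) - 2
    else:
        target = len(syllables) - 1

    return ["'" + syl if i == target else syl
            for i, syl in enumerate(syllables)]


print(stress_fsr_sketch(["ber", "en"]))    # ['ber', "'en"]
print(stress_fsr_sketch(["pren", "dre"]))  # ["'pren", 'dre']
print(stress_fsr_sketch(["day"]))          # ['day']
```

Given the same syllabification, the first two calls reproduce the FSR results shown in the doctest above. The GSR and LSR rules are not sketched here because they depend on a stemmer and on a definition of syllable weight that this page does not spell out.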
phonetic_indexing(word, p='SE')
  • word – word

  • p – specifies the phonetic indexing method. 'SE': Soundex variant for ME

Returns

Encoded string corresponding to the word's phonetic representation


The Soundex phonetic indexing algorithm adapted to ME phonology.


Let w be the original word and W the resulting one

  1. Capitalize the first letter of w and append it to W

  2. Apply the following replacement rules

    p, b, f, v, gh (non-nasal fricatives) -> 1

    t, d, s, sh, z, r, k, g, w (non-nasal alveolars and velars) -> 2

    l (alveolar lateral) -> 3

    m, n (nasals) -> 4

    r (alveolar approximant) -> 5

  3. Collapse consecutive occurrences of the same number into one

  4. Remove non-numerical characters
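
The four steps above can be sketched as follows. This is an illustration of the listed rules only, not the library's code: the name soundex_me_sketch is hypothetical, the replacements are assumed to apply to the letters after the first one, and because r appears in both the 2 and the 5 class the sketch arbitrarily maps it to 5.

```python
import re

# Digit classes copied from the replacement rules listed above.
# Assumption: "r" is listed under both 2 and 5; this sketch maps it to 5.
SINGLE_LETTER_CODES = {}
for letters, digit in (("pbfv", "1"), ("tdszkgw", "2"),
                       ("l", "3"), ("mn", "4"), ("r", "5")):
    for letter in letters:
        SINGLE_LETTER_CODES[letter] = digit
TRANSLATION = str.maketrans(SINGLE_LETTER_CODES)


def soundex_me_sketch(word):
    """Hypothetical helper: encode `word` following steps 1-4."""
    word = word.lower()
    if not word:
        return ""
    # Step 1: keep the capitalized first letter.
    head, rest = word[0].upper(), word[1:]
    # Step 2: replace the digraphs first, then the single letters.
    rest = rest.replace("gh", "1").replace("sh", "2")
    rest = rest.translate(TRANSLATION)
    # Step 3: collapse runs of the same digit into one.
    rest = re.sub(r"(\d)\1+", r"\1", rest)
    # Step 4: drop everything that is not a digit.
    rest = re.sub(r"\D", "", rest)
    return head + rest


print(soundex_me_sketch("myghtely"))  # M123
print(soundex_me_sketch("midel") == soundex_me_sketch("myddle"))  # True
```

Under these assumptions the spelling variants midel and myddle receive the same code, which is the purpose of a phonetic index. The actual method may differ in details such as padding or the treatment of r.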


/h/ is thought to have been either a voiceless or velar fricative when occurring in the coda, its most common grapheme being <gh>. Those phonemes either disappeared, resulting in the lengthening of preceding vowel clusters, or developed into /f/, as is evident from modern spelling and pronunciation (e.g. 'enough': /ɪˈnʌf/ and 'though': /ðəʊ/).


>>> MiddleEnglishStresser().phonetic_indexing("midel", "SE")
>>> MiddleEnglishStresser().phonetic_indexing("myddle", "SE")
>>> MiddleEnglishStresser().phonetic_indexing("might", "SE")
>>> MiddleEnglishStresser().phonetic_indexing("myghtely", "SE")
'M123'

cltk.phonology.enm.syllabifier module

The hyphenation/syllabification algorithm is based on the typical syllable structure model of onset/nucleus/coda. One difficulty is the distinction between long and short vowels, since the same graphemes are often used for both. The Great Vowel Shift, which dates back to the early stages of ME, poses an additional problem.
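
To make the onset/nucleus/coda model concrete, the following sketch splits a single, already isolated syllable into its three parts. It is an illustration only: the function split_syllable and the vowel inventory VOWELS are hypothetical and are not the constants or logic used by this module.

```python
import re

# Hypothetical vowel inventory, for illustration only.
VOWELS = "aeiouy"

# onset = leading consonants, nucleus = vowel run, coda = trailing consonants
SYLLABLE_PATTERN = re.compile(
    rf"([^{VOWELS}]*)([{VOWELS}]+)([^{VOWELS}]*)"
)


def split_syllable(syllable):
    """Return (onset, nucleus, coda), or None if there is no single nucleus."""
    match = SYLLABLE_PATTERN.fullmatch(syllable.lower())
    if match is None:
        return None
    return match.groups()


print(split_syllable("pren"))  # ('pr', 'e', 'n')
print(split_syllable("dre"))   # ('dr', 'e', '')
```

A grapheme-based split like this cannot tell a long vowel from a short one, which is exactly the difficulty the paragraph above describes.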