The Weekly Newsletter of MIT Linguistics

Phonology Circle 12/5 - Adam Albright

Speaker: Adam Albright (MIT)
Title: Why do speakers try to predict the unpredictable?
Time: Monday, December 5th, 5:00–6:30pm
Location: 32-D831

(Joint work with Michelle Fullwood (MIT) and Jongho Jun (Seoul National University))

Generative phonology traditionally distinguishes two types of feature values: (1) unpredictable, or contrastive values, and (2) contextually predictable values. Unpredictable values are listed in the lexicon as arbitrary properties of morphemes, whereas predictable values are assigned or enforced by grammar. However, statistical studies of lexicons have revealed that contrastive feature values are often surprisingly predictable. For example, Ernestus and Baayen (2003) observed that although stem-final obstruent voicing is nominally contrastive in Dutch, it is actually fairly predictable based on the obstruent’s place and continuancy, and the preceding vowel’s quality. Furthermore, speakers are aware of this predictability, and can use it to judge likely voicing values for stem-final obstruents in nonce words. Similar results have been found for contrasts in numerous other languages, including Korean stem-final continuancy and laryngeal features (Jun 2010), Spanish mid vowel vs. diphthong contrasts (Albright et al. 2001), and others. These results support a model in which phonological grammars attempt to predict at least some contrastive feature values.

In this study, we ask why there is this redundancy between the grammar and the lexicon. One possibility is data compression (Rasin and Katzir 2015, and others); if the grammar can exploit statistical asymmetries to predict certain feature values, they need not be listed in the lexicon. Maximal compression is achieved if the grammar supplies all predictable feature values. An alternative possibility is that values must be predicted when there is neutralization. In Dutch, stem-final obstruents undergo final devoicing, so speakers must sometimes guess the voicing of a stem-final obstruent, based on the neutralized singular form. Under this account, the grammar must supply only those feature values that are neutralized in the singular. We test the predictions of these accounts by comparing the predictability of feature values that are subject to neutralization in different languages. We compare place, continuancy, and laryngeal contrasts in Korean, Dutch, and English. In English, all three features contrast word-finally (with numerous specific restrictions), whereas in Dutch, voicing is neutralized, and in Korean, continuancy and laryngeal features are both neutralized in this position.

In order to test predictability, we extracted the most frequent items in each language (5018 Korean nouns; 5151 Dutch nouns; 5085 English words). We trained the Minimal Generalization Learner (Albright and Hayes 2002) to predict the values of various features based on the remaining features of the segment in question and the preceding context. We then wug-tested the resulting grammars to determine whether feature values get more predictable at lower frequencies. The reasoning is that, as with morphological regularity, low-frequency words should be less able to sustain exceptionality, and should therefore reflect grammatical preferences. The results show that although overall predictability does tend to be higher for neutralizing features, neutralizing and non-neutralizing features both get more predictable at lower frequencies, as predicted by the data compression model. Neutralization may increase the likelihood that a speaker will need to use their grammar to predict an ‘unpredictable’ feature, but it is not a prerequisite to learning and enforcing such generalizations.