Chapter 9 Mispronunciations and referent selection

In the earlier chapters, I studied word recognition by examining how young listeners recognized familiar words. But children do not know all the words they encounter, and another avenue for studying word recognition is to examine how listeners respond to unfamiliar or novel stimuli. This study looks at how children responded to mispronunciations and novel words in a two-image word-recognition experiment.

9.1 How phonetically detailed are children’s words?

There has been a long, productive line of research examining how children respond to mispronunciations of familiar words. The motivation for this research was to determine how detailed children’s phonological representations are. One hypothesis held that infants and toddlers do not need to store words in much phonetic or phonological detail because they know so few words. In other words, their lexicons contain underspecified or holistic representations.

A common argument for underspecified or holistic representations poses the building of a lexicon as a design or engineering problem. It considers the question: What would be a smart way to build up a lexicon? One solution is that the word-learning system should be lazy, encoding words in just enough phonetic detail to differentiate among them and adding more details as the need arises. Early words should be underspecified and pick up details on demand. Why fully encode the phonetic form of “doggie” when there is no competition from similar words like “toggie” or “dokkie” or “tokkie”? An efficient solution would be to encode a minimal amount of phonetic detail. In fact, this strategy could be developmentally advantageous: “Perhaps children benefit from the sparseness of their lexicon by encoding only the detail necessary to distinguish words” (Swingley & Aslin, 2000, p. 148).

Charles-Luce and Luce (1990, 1995) are commonly cited touchstones for this argument. Charles-Luce and Luce compared the expressive (1990) and receptive (1995) lexicons of 5- and 7-year-olds against those of adults. They observed that adults had much denser phonological neighborhoods than children, and they suggested that children may only have holistic representations of words given these sparser neighborhoods. Dollaghan (1994) rebutted the 1990 study, observing that children have sparser neighborhoods because they have sparser lexicons. Dollaghan (1994) also showed that young children do indeed have dense neighborhoods in their lexicons. Coady and Aslin (2003) elaborated on this claim, observing that children’s lexicons would be comparatively denser early on in development if a child’s first words are made up of more common sounds and word shapes. That is, words are more likely to be neighbors if the early lexicon favors more frequent sounds and word shapes.
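To make the notion of neighborhood density concrete, the following is a minimal sketch, assuming the conventional one-phoneme rule in which two words are neighbors if they differ by a single phoneme substitution, addition, or deletion. The toy lexicons, the ARPAbet-like transcriptions, and the function names are hypothetical illustrations and are not taken from any of the studies cited above.

```python
def is_neighbor(a, b):
    """True if phoneme tuples a and b differ by exactly one
    substitution, addition, or deletion (assumed one-phoneme rule)."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: neighbors if exactly one position differs.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: try deleting each phoneme of the longer word.
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def neighborhood_density(lexicon):
    """Map each word to the number of its neighbors within the lexicon."""
    return {
        word: sum(is_neighbor(phones, other_phones)
                  for other, other_phones in lexicon.items()
                  if other != word)
        for word, phones in lexicon.items()
    }


# Hypothetical toy lexicons with rough, made-up transcriptions.
small_lexicon = {
    "bear": ("b", "eh", "r"),
    "pear": ("p", "eh", "r"),
    "dog": ("d", "ao", "g"),
    "cat": ("k", "ae", "t"),
}
larger_lexicon = dict(small_lexicon, **{
    "hair": ("h", "eh", "r"),
    "air": ("eh", "r"),
    "bat": ("b", "ae", "t"),
    "cap": ("k", "ae", "p"),
    "dig": ("d", "ih", "g"),
})

small_density = neighborhood_density(small_lexicon)
larger_density = neighborhood_density(larger_lexicon)
print(small_density)
print(larger_density)
```

In this toy case, the same word gains neighbors simply because the lexicon grows, which is the confound Dollaghan (1994) raised: sparse neighborhoods can follow from lexicon size alone and need not say anything about how each word is represented.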

Structural studies of lexicons of this sort are rather limited. They describe the knowledge to be learned instead of the content of the representations throughout development. More direct evidence comes from studies where children have difficulty learning minimal pairs. For example, Barton (1976) found that 27–35-month-olds who already knew a pair of words such as bear and pear could differentiate them successfully (at approximately 90% accuracy), but when the children had to learn one of the words, they were less successful at differentiating them (50–60%). The discrepancy invites the conclusion that newly learned words are underspecified.

The Switch Task, studied extensively by Werker and colleagues, also yielded evidence that young children were unable to learn a minimal pair in the lab. In the classic switch paradigm, a child is habituated to the ostensive naming of two novel-object/novel-word associations. In other words, they see a novel object and hear a paired novel word (liff) and see a different novel object with a different paired nonword (neem). Once the child is habituated, there is a critical switch trial where the liff-object is displayed but labeled with the other nonword neem. If the child looks longer on the switch trial, then we infer that they detected the change and paid more attention because their expectations were violated. This paradigm fits into the debate about early phonological representations because 14-month-olds could detect the liff-neem switch (Werker, Cohen, Lloyd, Casasola, & Stager, 1998) but not a minimal pair bih-dih switch (Stager & Werker, 1997).

One interpretation of these results is that children may have underspecified representations for recently learned words. That reading raises the question, however, of whether a child has actually “learned” the word or just has an inchoate, less well-learned representation from a few laboratory exposures to a nonword. In Fennell and Waxman (2010), 14-month-olds were able to detect a bin/din switch when the nonwords were treated as words. That is, during familiarization the words were embedded in sentence prompts like Look. It’s the din. or Do you see the din? By changing the task in a way that makes word-learning easier, the children encoded more phonetic detail about the words. This result suggests that the challenge of learning minimal-pair words has more to do with the difficulty of word learning than with how known words are stored.

Against that backdrop of lexicon design strategies and minimal-pair training studies, mispronunciation studies provide a rather direct way to study the phonetic representations of the words that children know. Swingley and Aslin (2000) presented 18–23-month-olds with two familiar images onscreen, like baby and dog, and the children heard a correct production (where’s the baby) or a mispronunciation (where’s the vaby). Children looked to the target image about 73% of the time for correct productions and 61% of the time for mispronunciations. For mispronunciations, they looked to the target more than chance but less than they did for correct productions, indicating they were sensitive to the mispronunciation. Children had encoded baby in enough phonetic detail that a small phonetic change made them less certain during word recognition. (Swingley and Aslin (2002) found the same pattern of results for 14–15-month-olds, with 60% looks to the target for correct productions and around 53.5% looks for mispronunciations.)
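The percentages in studies of this kind are proportions of looking time. As a rough illustration of how such a proportion could be computed for one trial, here is a minimal sketch that assumes frame-by-frame gaze codes and counts only frames on one of the two images. The gaze labels, the function name, and the example trial are hypothetical, and the cited studies’ actual analysis windows and exclusion rules are not reproduced here.

```python
def proportion_to_target(gaze_frames):
    """Proportion of on-image frames spent on the target.

    gaze_frames: list of hypothetical per-frame codes, each one of
    "target", "distractor", or "away" (off-screen or between images).
    Returns None when the child never looked at either image.
    """
    target = gaze_frames.count("target")
    distractor = gaze_frames.count("distractor")
    on_image = target + distractor
    if on_image == 0:
        return None
    return target / on_image


# Made-up trial: 11 target frames, 4 distractor frames, 5 frames away.
trial = ["target"] * 11 + ["distractor"] * 4 + ["away"] * 5
trial_proportion = proportion_to_target(trial)
print(round(trial_proportion, 3))
```

With two images onscreen, chance under this measure is 0.5, which is the baseline against which looking proportions are compared.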

A similar study by Bailey and Plunkett (2002) tackled the representations of recently learned words. They created custom word-lists for children and included mispronunciations of words that the child had purportedly learned long before testing and of words that the child had learned only recently. They did not find a difference between the two types of words, suggesting that recently learned words were as well specified as earlier learned words.

One limitation of the Swingley and Aslin (2000) design is that the child has no way to reject vaby. It could be that children treat vaby as a completely novel word, but they have to choose either baby or dog, so they look at the image whose name rhymes with vaby. White and Morgan (2008) updated the paradigm to allow for this kind of rejection. They presented toddlers with images of a familiar object and a novel object, and children heard a correct production of the familiar object’s name, mispronunciations of that name of varying severity, or an unrelated nonword. Toddlers looked less to the familiar object when the first segment of its name was mispronounced, but they did not treat the mispronunciations as outright nonwords. The children demonstrated graded sensitivity such that a 1-feature mispronunciation yielded more looks to the familiar image than a 2-feature mispronunciation, and a 2-feature mispronunciation yielded more looks than a 3-feature one. Finally, in the nonword condition, the children looked more to the novel object than the familiar one, demonstrating fast referent selection as they associated novel words with novel objects in the moment. In short, mispronunciations can vary in severity and children’s responses to them will vary in turn.
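The severity of a mispronunciation can be counted as the number of phonetic features that separate the original segment from the substituted one. The following is a minimal sketch of that idea, assuming a coarse three-way coding of consonants by place, manner, and voicing. The feature table, the segment labels, and the function name are illustrative assumptions and do not reproduce the stimuli or coding scheme of White and Morgan (2008).

```python
# Hypothetical coarse coding: (place, manner, voicing) for a few consonants.
FEATURES = {
    "p": ("bilabial", "stop", "voiceless"),
    "b": ("bilabial", "stop", "voiced"),
    "t": ("alveolar", "stop", "voiceless"),
    "d": ("alveolar", "stop", "voiced"),
    "k": ("velar", "stop", "voiceless"),
    "g": ("velar", "stop", "voiced"),
    "s": ("alveolar", "fricative", "voiceless"),
    "z": ("alveolar", "fricative", "voiced"),
    "sh": ("postalveolar", "fricative", "voiceless"),
}


def feature_distance(original, substitute):
    """Count how many of place, manner, and voicing differ."""
    return sum(x != y
               for x, y in zip(FEATURES[original], FEATURES[substitute]))


# Onset substitutions for a word beginning with "sh", ordered by severity.
for substitute in ["sh", "s", "z", "d"]:
    print("sh ->", substitute, feature_distance("sh", substitute))
```

Under this coding, replacing sh with s changes only place (one feature), replacing it with z changes place and voicing (two features), and replacing it with d changes all three. A graded-sensitivity account predicts fewer looks to the familiar image at each step.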

Law and Edwards (2015) applied this approach to preschool-age children, observing a similar pattern of effects: Preschoolers mapped real words to familiar objects, mapped nonwords to novel objects, and equivocated about mispronunciations of familiar words. They also found that the child’s vocabulary size was related to these looking behaviors such that children with larger vocabularies looked more to the target in the real word and nonword trials and looked less to the familiar object in the mispronunciation trials.

9.2 How to handle nonwords

Mispronunciations are not the only nonwords a young child might hear. In fact, if a child knows very few words, we expect them to be bombarded by novel words. There has been a great deal of research on how children handle nonwords, especially when a nonword is paired with a novel object as its referent (mutual exclusivity principle, Markman & Wachtel, 1988; Novel Name–Nameless Category principle, Mervis & Bertrand, 1994). The nonword trials in studies like Law and Edwards (2015) and White and Morgan (2008) can shed light on other aspects of word recognition.

Children have a strong novelty bias when they hear nonwords. In Horst, Samuelson, Kucker, and McMurray (2011), two-year-olds were familiarized to novel objects. They were later tested with a prompt to select a novel object (Can you get the fode?) from three choices: two familiarized novel objects and one new, unfamiliarized super-novel object. They demonstrated a clear preference for the super-novel object. Mather and Plunkett (2012) replicated the preference for a super-novel object during a word recognition eyetracking task. In this case, 22-month-old English-learning toddlers were pre-exposed to images of novel objects. Later, during a test trial, a familiar object, a familiarized novel object, and a new, unseen super-novel object appeared onscreen with a prompt to view a nonword (Look at the gub! Look! Gub!). Children looked to the novel object more than the other two objects. In a second experiment, they removed the familiar object, leaving just the familiarized and super-novel objects. The advantage of the super-novel object was replicated, but it only emerged after the third repetition of the trial.

A robust novelty bias raises the question of whether listeners’ comprehension of familiar words and interpretation of nonwords reflect different processes. McMurray et al. (2012) propose that the same basic process is at play in both recognition of familiar words and fast association of nonwords. After all, in the lab, the observed behaviors are the same: Children hear a word (be it a real word or nonword) and direct their attention to an appropriate referent. “To the extent that a word links sound and meaning, any time that link is used to guide behavior, a word is being used. Thus, word use also includes processes such as comprehending known words, and even determining referents for new words” (McMurray et al., 2012, p. 832).

Bion, Borovsky, and Fernald (2013) tested referent selection of nonwords and real words in 18-, 24-, and 30-month-olds. In the article’s second experiment, toddlers were trained on two novel words on disambiguation trials. They saw a familiar object and an unfamiliar object and heard a prompt with a nonword (Where’s the dofa?). During later retention trials, the two unfamiliar objects were presented with a prompt (Where’s the dofa?). Mixed in with these trials were familiar-object trials in which a familiar object was labeled (Where’s the car?). In that experiment, children looked more to the target on the familiar word trials than on the nonword disambiguation trials (82% versus 68% looks to the target for the 30-month-olds).

In Bion et al. (2013), toddlers performed better on the familiar-word trials than the novel-word trials. But if we think of the nonwords as just much less familiar words, then this result is wholly consistent with the idea that the same process operates in both familiar word recognition and nonword referent selection. The authors, interestingly, make a point of noting that fast referent selection is not necessary for word-learning: “Those 18-month-olds whose accuracy scores on Disambiguation trials were lower than [chance] were reported to produce as many as 389 words …. those 24-month-olds who failed to show a disambiguation bias produced as many as 417 words”. In other words, a toddler may purportedly know hundreds of words but still not reliably look to a novel object given a novel label. This finding raises the possibility that nonword referent selection is not a guaranteed behavior in young children.

9.3 The current study

As with lexical competition, it is unclear how children’s responses to mispronunciations and novel words change over time. For example, do children become more forgiving of mispronunciations as they mature and learn more words? Do familiar word recognition and nonword referent selection ever dissociate? Moreover, is one of these behaviors more related to future word learning?

Here, I report the results of a longitudinal study of word recognition in preschoolers at age 3, age 4, and age 5. The particular experiment here was a mispronunciation study following the paradigms of White and Morgan (2008) and Law and Edwards (2015). Children saw a familiar object and an unfamiliar object and heard either a real word (shoes), a one-feature mispronunciation of the word (suze), or a nonword (geeve). The study is described in detail in Chapter 10.

In Chapter 11, I examine children’s development of referent selection in unambiguous contexts by comparing their performance in the real word and nonword conditions. Of interest is whether real word and nonword processing follow similar developmental trajectories. I expect the two to be highly related, but if they ever dissociate, it should happen with younger children. At face value, one might expect a child’s ability to associate new words with unfamiliar objects to be a more direct measure of word-learning capacity than a child’s ability to process known words. Under this assumption, I predict that nonword referent selection will be a better measure of later vocabulary growth than familiar word recognition.

In Chapter 12, I study how children’s responses to mispronunciations changed with age. From the literature review above, I expect preschoolers to treat the mispronunciations as passable but still flawed productions of known words. As for development, I expect children to become more tolerant of mispronunciations, based on the assumption that they become more experienced at listening to noisy, degraded, or misspoken speech. I also report data from age 5 where we tested children’s retention of the novel images paired with the nonwords and mispronunciations.

Finally, in Chapter 13, I discuss both sets of analyses together, and Chapter 14 reviews the results of my pre-analysis research hypotheses. In Appendix E, I briefly present the results for specific mispronunciations, although item effects are not formally modeled.

References

Swingley, D., & Aslin, R. N. (2000). Spoken word recognition and lexical representation in very young children. Cognition, 76(2), 147–166. doi:10.1016/S0010-0277(00)00081-0

Charles-Luce, J., & Luce, P. A. (1990). Similarity neighbourhoods of words in young children’s lexicons. Journal of Child Language, 17(1), 205–215. doi:10.1017/S0305000900013180

Charles-Luce, J., & Luce, P. A. (1995). An examination of similarity neighbourhoods in young children’s receptive vocabularies. Journal of Child Language, 22(3), 727–735. doi:10.1017/S0305000900010023

Dollaghan, C. A. (1994). Children’s phonological neighbourhoods: Half empty or half full? Journal of Child Language, 21(2), 257–271. doi:10.1017/S0305000900009260

Coady, J. A., & Aslin, R. N. (2003). Phonological neighbourhoods in the developing lexicon. Journal of Child Language, 30(2), 441–469. doi:10.1017/S0305000903005579

Barton, D. (1976). Phonemic discrimination and the knowledge of words in children under three years. Papers and Reports on Child Language Development, 11, 61–68.

Werker, J. F., Cohen, L. B., Lloyd, V. L., Casasola, M., & Stager, C. L. (1998). Acquisition of word–object associations by 14-month-old infants. Developmental Psychology, 34(6), 1289. doi:10.1037/0012-1649.34.6.1289

Stager, C. L., & Werker, J. F. (1997). Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature, 388(6640), 381. doi:10.1038/41102

Fennell, C. T., & Waxman, S. R. (2010). What paradox? Referential cues allow for infant use of phonetic detail in word learning. Child Development, 81(5), 1376–1383. doi:10.1111/j.1467-8624.2010.01479.x

Swingley, D., & Aslin, R. N. (2002). Lexical neighborhoods and the word-form representations of 14-month-olds. Psychological Science, 13(5), 480–484. doi:10.1111/1467-9280.00485

Bailey, T. M., & Plunkett, K. (2002). Phonological specificity in early words. Cognitive Development, 17(2), 1265–1282. doi:10.1016/S0885-2014(02)00116-8

White, K. S., & Morgan, J. L. (2008). Sub-segmental detail in early lexical representations. Journal of Memory and Language, 59(1), 114–132. doi:10.1016/j.jml.2008.03.001

Law, F., II, & Edwards, J. R. (2015). Effects of vocabulary size on online lexical processing by preschoolers. Language Learning and Development, 11(4), 331–355. doi:10.1080/15475441.2014.961066

Markman, E. M., & Wachtel, G. F. (1988). Children’s use of mutual exclusivity to constrain the meanings of words. Cognitive Psychology, 20(2), 121–157. doi:10.1016/0010-0285(88)90017-5

Mervis, C. B., & Bertrand, J. (1994). Acquisition of the novel name–nameless category (N3C) principle. Child Development, 65(6), 1646–1662. doi:10.1111/j.1467-8624.1994.tb00840.x

Horst, J. S., Samuelson, L. K., Kucker, S. C., & McMurray, B. (2011). What’s new? Children prefer novelty in referent selection. Cognition, 118(2), 234–244. doi:10.1016/j.cognition.2010.10.015

Mather, E., & Plunkett, K. (2012). The role of novelty in early word learning. Cognitive Science, 36(7), 1157–1177. doi:10.1111/j.1551-6709.2012.01239.x

McMurray, B., Horst, J. S., & Samuelson, L. K. (2012). Word learning emerges from the interaction of online referent selection and slow associative learning. Psychological Review, 119(4), 831–877. doi:10.1037/a0029872

Bion, R. A. H., Borovsky, A., & Fernald, A. (2013). Fast mapping, slow learning: Disambiguation of novel word–object mappings in relation to vocabulary learning at 18, 24, and 30 months. Cognition, 126(1), 39–53. doi:10.1016/j.cognition.2012.08.008