Prosody outweighs statistics in 6-month-old German-learning infants' speech segmentation

Key Concepts

  • Speech segmentation in infancy relies on multiple cues: prosody (e.g., lexical stress) and statistical information (transitional probabilities, TPs).

  • German-language properties: strong trochaic (initial-stress) bias; prosodic structure may shape early segmentation differently than in English.

  • TP cue: high TP within a word, low TP between words; prosody cue in this study was an iambic pattern (stress on the second syllable).

  • Method: Headturn Preference Procedure to measure listening times as an index of familiarity/novelty.

  • Research aim: compare weighting of prosody vs TP cues in 6‑month‑old German‑learning infants; test if English-like TP dominance appears early or if prosody dominates in German.
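A minimal sketch of how forward transitional probabilities could be estimated from a syllable stream, using the standard definition TP(x → y) = count(x followed by y) / count(x). The function name `transitional_probabilities` and the short toy stream are illustrative assumptions; they are not the study's materials or analysis code.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate forward TPs: TP(x -> y) = count(x followed by y) / count(x).

    The final syllable is excluded from the denominator because nothing follows it.
    """
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


# Hypothetical toy stream built from the four words named in the notes;
# the word order here is invented for illustration only.
toy_words = ["gobu", "tade", "bido", "puda", "tade", "gobu", "puda", "bido", "gobu"]
toy_stream = [w[i:i + 2] for w in toy_words for i in (0, 2)]  # split into CV syllables

tps = transitional_probabilities(toy_stream)
print("within-word  go -> bu:", tps[("go", "bu")])
print("between-word bu -> ta:", tps.get(("bu", "ta"), 0.0))
print("unattested   bi -> de:", tps.get(("bi", "de"), 0.0))
```

Within a word the second syllable always follows the first, so the estimate is 1.0; across a word boundary it depends on which words happen to follow each other in the stream.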

Experimental Overview

  • Three experiments test cue weighting in 6–7-month-old German-learning infants.

  • Test phase uses three word types: statistical words (high TP), prosodic words (prosody-based), and non-words (syllables that never co-occurred during familiarization).

  • Experiment 1: familiarization with a string containing both high TP and a consistent iambic stress; 24 infants.

  • Experiment 2: control without familiarization; 27 infants.

  • Experiment 3: familiarization with only statistical cues using synthesized speech (no prosody); 31 infants.

  • Across experiments, sample sizes and ages are similar (roughly 6–7 months).

Experiment 1: Prosody vs Statistics

  • Stimuli (familiarization): four disyllabic statistical words gobu, tade, bido, puda with internal TP = 1.0 and between-word TPs of 0.2 to 0.4; second syllable consistently stressed (iambic prosody).

  • Test words (disyllabic):

    • Statistical words: puda, bido (TP = 1.0) – stress-final

    • Prosodic words: buta, dego (TP ≈ 0.51, 0.49) – stress-initial

    • Non-words: dabi, bide (TP = 0.0) – stress pattern as in the familiarization string

  • Procedure: Headturn Preference; infants exposed to string then tested with 12 repetitions per trial across three conditions; measurements are looking times.

  • Analysis: non-parametric Wilcoxon Signed-Rank tests; effect size via Cliff’s delta (δ).
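A rough sketch of the two quantities named in the analysis, written in plain Python. The looking times below are invented placeholder values in milliseconds, not data from the study, and the function names are hypothetical. The signed-rank function returns only the rank sums W+ and W-; a library routine such as `scipy.stats.wilcoxon` would normally be used to obtain the p-value.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all (x, y) pairs; ranges from -1 to 1."""
    greater = sum(x > y for x in xs for y in ys)
    less = sum(x < y for x in xs for y in ys)
    return (greater - less) / (len(xs) * len(ys))


def signed_rank_sums(xs, ys):
    """Return (W_plus, W_minus) for paired samples.

    Zero differences are dropped and tied absolute differences share an average rank.
    """
    diffs = sorted((x - y for x, y in zip(xs, ys) if x != y), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        i = j
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Invented per-infant mean looking times (ms) for two test conditions.
prosodic_ms = [8100, 7400, 9000, 6200, 7700, 8500]
statistical_ms = [6900, 7000, 7200, 6500, 6100, 7900]

print("W+, W- =", signed_rank_sums(prosodic_ms, statistical_ms))
print("Cliff's delta =", round(cliffs_delta(prosodic_ms, statistical_ms), 3))
```

A positive delta here would mean the first condition tends to attract longer looking times than the second; the sign flips if the arguments are swapped.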

Key Concepts

  • Speech segmentation in infancy, the process by which infants break down a continuous stream of speech into individual words, relies on a combination of different cues. These include prosody (the rhythm, stress, and intonation of speech, such as lexical stress) and statistical information (the probability of certain syllables occurring together, known as transitional probabilities, or TPs).

  • German-language properties introduce specific biases. German has a strong trochaic (initial-stress) bias, meaning that most multisyllabic words carry stress on the first syllable. This predominant prosodic structure may shape early segmentation differently in German-learning infants than in English-learning infants, for whom statistical cues have been reported to dominate early on.

  • The TP cue used in such studies is based on the principle that the probability of one syllable following another is high within a word (e.g., in "guitar," 'gui' is almost always followed by 'tar'), but low between words (e.g., 'the' followed by 'cat'). The prosody cue specifically investigated in this study involved an iambic pattern (stress on the second syllable), which was systematically introduced during the familiarization phase.

  • The Method employed was the Headturn Preference Procedure (HPP). This behavioral technique measures infants' listening times to different stimuli as an index of their familiarity or novelty. Longer looking times often indicate greater interest or recognition, allowing researchers to infer what speech units infants have successfully segmented or learned.

  • The Research Aim was to directly compare the relative weighting of prosody versus transitional probability cues in 6-month-old German-learning infants. The study sought to determine if an English-like dominance of statistical cues appears early in German infants, or if the strong prosodic bias (trochaic) inherent in German leads to prosody being the more dominant segmentation cue for them.

Experimental Overview

  • Three distinct experiments were conducted to systematically test cue weighting in a total of 82 German-learning infants, all aged approximately 6 to 7 months (with a mean age of around 6 months and 25 days across experiments). These experiments were designed to isolate and combine different cues.

  • The Test Phase in each experiment used three primary types of disyllabic word-like stimuli to assess infant learning:

    • Statistical words: These items maintained the high transitional probabilities learned during familiarization.

    • Prosodic words: These items were defined by a specific prosodic structure (e.g., stress pattern) that might not align with the familiarized statistical cues, specifically testing the influence of prosody.

    • Non-words: These serve as a baseline, consisting of syllable sequences that had never co-occurred in the familiarization phase, thus having a transitional probability of 0.0.

  • Experiment 1: Involved familiarization where infants were exposed to a continuous speech stream containing both high transitional probabilities (marking word boundaries) and a consistent iambic stress pattern (stress on the second syllable) within the 'words'. This experiment aimed to see how infants combined or prioritized these conflicting cues. It included 24 infants.

  • Experiment 2: Served as a control condition and did not involve any prior familiarization phase. This experiment with 27 infants provided a baseline measure of infants' preference for the test stimuli without any learning of specific statistical or prosodic patterns.

  • Experiment 3: Utilized synthesized speech for familiarization, specifically designed to present only statistical cues (high TPs) without any clear or natural prosodic information. This allowed researchers to test the sole effect of statistical learning. It involved 31 infants.

  • Across all experiments, the sample sizes and the age ranges of the infants were kept similar (roughly 6–7 months), ensuring consistency for comparison.

Experiment 1: Prosody vs Statistics

  • Stimuli (Familiarization Phase): Infants were familiarized with a continuous string composed of four disyllabic 'statistical words': gobu, tade, bido, and puda. These words were constructed such that their internal transitional probability (TP) was 1.0 (meaning the second syllable always followed the first), while the transitional probabilities between these words were significantly lower (ranging from 0.2 to 0.4). Critically, all these 'words' (gobu, tade, bido, puda) consistently exhibited an iambic prosody, meaning the stress fell on the second syllable (e.g., go-BU). This setup presented infants with both strong statistical and consistent iambic prosodic cues for segmentation.

  • Test Words (Disyllabic):

    • Statistical words: For the test phase, two of the familiarized words, puda and bido, were presented. These retained their internal TP of 1.0 and also matched the stress-final (iambic) prosody encountered during familiarization. These words were expected to be highly familiar based on both statistical and prosodic cues.

    • Prosodic words: Novel disyllabic sequences, buta and dego, were specifically constructed. Their internal transitional probabilities were near chance (approximately 0.51 and 0.49), meaning they lacked a strong statistical coherence. However, they were presented with a stress-initial (trochaic) prosody (e.g., BU-ta). These words were designed to test if infants would segment based on this trochaic prosodic pattern, which is dominant in German, even when statistical cues were weak and the prosody conflicted with the iambic pattern of the familiarized statistical words.

    • Non-words: Examples like dabi and bide were used. These were syllable pairs that never co-occurred in the familiarization string, thus having a TP of 0.0. Their stress pattern was also controlled to be consistent with the surrounding string, primarily serving as a baseline of completely novel sequences.

  • Procedure: The Headturn Preference Procedure was employed. After the familiarization to the continuous speech string, infants were presented with the test words. Each trial consisted of 12 repetitions of a test word type (statistical, prosodic, or non-word). Researchers measured the infants' looking times towards a speaker when the stimuli were played. Longer looking times are typically interpreted as indicating a preference or greater recognition/processing of the stimulus.

  • Analysis: The data from looking times were analyzed using non-parametric Wilcoxon Signed-Rank tests, which are appropriate for comparing two related samples when the data are not normally distributed. Effect size was reported using Cliff’s delta (δ), which indicates the magnitude and direction of the difference between two groups, providing a robust measure of the strength of the observed effect.

The study investigated speech segmentation in 6–7-month-old German-learning infants across three experiments.

Experiment 1 aimed to assess how infants weigh prosodic (iambic stress) and statistical (high transitional probabilities, TP) cues for word segmentation. Infants were familiarized with a string where disyllabic 'words' had both high internal TPs (1.0) and consistent iambic stress. In the test phase, the Headturn Preference Procedure measured looking times for statistical words (high TP, iambic stress), prosodic words (weak TP, trochaic stress), and non-words (zero TP).
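To make the Experiment 1 familiarization design concrete, here is a small sketch that concatenates the four words into a continuous stream and marks the stressed second syllable in capitals (as in go-BU). The stream length, the random ordering, the rule against immediate repetition, and the name `make_stream` are all assumptions made for illustration; the notes do not describe how the actual string was ordered, so this will not reproduce the exact between-word TPs.

```python
import random

WORDS = ["gobu", "tade", "bido", "puda"]  # the four statistical words from the notes


def make_stream(n_tokens, seed=0):
    """Build a word sequence with no immediate repetition (an assumed constraint).

    Each token is written with its second syllable in capitals to mark iambic stress.
    """
    rng = random.Random(seed)
    stream, previous = [], None
    for _ in range(n_tokens):
        word = rng.choice([w for w in WORDS if w != previous])
        stream.append(word[:2] + word[2:].upper())
        previous = word
    return stream


stream = make_stream(200)
print("".join(stream[:8]))  # continuous string, no pauses between words
```

Because each word is followed by one of the three others with equal chance in this sketch, between-word TPs come out near 1/3, which falls inside the 0.2 to 0.4 range given for the familiarization string, while within-word TPs stay at 1.0.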

Experiment 2 served as a control, with no familiarization phase, to provide a baseline for infants' preferences.

Experiment 3 focused solely on statistical cues by familiarizing infants with synthesized speech that contained only high TPs, without any clear prosodic information. The purpose was to observe infants' segmentation performance when only statistical cues were available. The provided text details the experimental setups and analysis methods, but it does not include the specific results or conclusions for any of the three experiments.