Method

Participants: 16 Australian English-speaking adults and 16 Chinese Mandarin-speaking adults.
- Age range: 18-35 years.
- None of the English speakers had knowledge of Mandarin or tone languages.
- Recruited online, with some participants receiving course credit.
- All participants reported normal hearing abilities.
- Study approved by the Western Sydney University Ethics Committee (H11383) and conducted in accordance with the ethical standards of the Declaration of Helsinki.

Recording: Stimuli recorded by a native male Mandarin speaker.
Syllables: 12 CV, CVV, and CVVC syllables such as /tou/, /bou/, /ɕye/, /pye/, /pian/, /fian/, /jy/, /ty/, /bi/, /gi/, /gua/, /lua/.
Tones: Two Mandarin tones (T2, T3).
Lengths: Different stimulus lengths (500 ms, 1000 ms, 1500 ms, and 2000 ms).
Final Stimulus Set: 24 different stimuli formulated as:
- 2 tones (T2, T3)
- 3 numbers of syllables (1, 2, 4)
- 4 length sets (500 ms, 1000 ms, 1500 ms, 2000 ms)
Phonotactic Structures: All stimuli had legal phonotactic structures and were non-meaningful words.
Variability: At least five productions per stimulus to create acoustic variability, selected based on representativeness.
Acoustic Analysis: Conducted in Praat to analyze pitch patterns and other acoustic parameters.
All stimuli normalized in intensity (70 dB) and duration (500 ms).
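
The 2 x 3 x 4 design of the final stimulus set can be enumerated directly. The Python sketch below crosses the three factors and checks that they yield 24 stimulus types. The dictionary field names are illustrative, and the shortest length is taken as 500 ms, following the Lengths line; neither comes from the original materials.

```python
from itertools import product

TONES = ("T2", "T3")
SYLLABLE_COUNTS = (1, 2, 4)
LENGTHS_MS = (500, 1000, 1500, 2000)  # assumption: shortest length is 500 ms


def build_stimulus_set():
    """Cross tone x syllable count x length, one dict per stimulus type."""
    return [
        {"tone": tone, "syllables": n_syllables, "length_ms": length_ms}
        for tone, n_syllables, length_ms in product(
            TONES, SYLLABLE_COUNTS, LENGTHS_MS
        )
    ]


stimuli = build_stimulus_set()
print(len(stimuli))  # 24
```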

Both language groups performed an AXB discrimination task programmed in Labvanced.
Environment: Online, in a reported quiet place; participants wore headphones.
Familiarization Trials: Included two practice trials.
Response Method: Participants indicated by keypress whether the middle sound (X) matched the first sound (key 1) or the third sound (key 3), and were asked to respond as quickly and accurately as possible.
Test Phase: Followed practice. Within each trial, the A and B sounds differed in tone but matched in syllable count and duration; the X sound shared its tone with either A or B and had the same syllable count and duration.
Order Counterbalancing: Four order types included: AAB, ABB, BAA, and BBA.
Timing:
- Interstimulus interval: 1000 ms
- Intertrial interval: 3000 ms
- Response time-out: 2500 ms (measured after third syllable).
Trial order was randomized, with pauses after 25%, 50%, and 75% of the trials.
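
The trial structure can be sketched in Python as follows. This is not the Labvanced implementation: the mapping of A and B to T2 and T3, the use of one trial per order and condition (48 trials), and all function names are assumptions made for illustration, since the total number of trials is not stated here.

```python
import random
from itertools import product

ORDERS = ("AAB", "ABB", "BAA", "BBA")   # first, X, third
SYLLABLE_COUNTS = (1, 2, 4)
LENGTHS_MS = (500, 1000, 1500, 2000)
TONE_OF = {"A": "T2", "B": "T3"}        # assumption: labeling is arbitrary


def correct_key(order):
    """Return '1' if X (middle) matches the first sound, else '3'."""
    first, x, _third = order
    return "1" if x == first else "3"


def build_trials(seed=0):
    """One trial per order x syllable count x length (hypothetical design)."""
    trials = [
        {
            "order": order,
            "tones": tuple(TONE_OF[label] for label in order),
            "syllables": n_syllables,
            "length_ms": length_ms,
            "correct_key": correct_key(order),
        }
        for order, n_syllables, length_ms in product(
            ORDERS, SYLLABLE_COUNTS, LENGTHS_MS
        )
    ]
    random.Random(seed).shuffle(trials)  # randomized presentation order
    return trials


def pause_points(n_trials):
    """Number of completed trials after which each pause is shown."""
    return [n_trials * k // 4 for k in (1, 2, 3)]


trials = build_trials()
print(len(trials), pause_points(len(trials)))  # 48 [12, 24, 36]
```

With the four order types, the correct answer is key 1 on half of the trials and key 3 on the other half, so a constant response strategy yields 50% accuracy.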

The analysis used a one-predictor logistic regression model to test whether response accuracy depended on language background.
Data coding: 0 (incorrect) or 1 (correct) for trial outcomes; 0 (Chinese) or 1 (English) for language background.
Statistical Analysis Tools: SPSS (version 24).
Findings:
- Model: Predicted logit of accuracy = 0.379 + (−0.421) * language background.
- Significant effect of language background: the odds of a correct response were lower for Australian English speakers (p = .041).
- Australian English speakers had about two-thirds the odds of correctly perceiving tone contrasts compared to Chinese Mandarin speakers (e^β = 0.656).
- One-sample t-tests against chance confirmed that:
  - Australian group performance did not exceed chance (t = -0.288, p = .774).
  - Chinese group performance was significantly above chance (t = 2.638, p = .009).
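
As a worked check of the reported model, the coefficients can be converted into an odds ratio and into predicted accuracies for each group. The Python sketch below uses only the two reported coefficients and the 0/1 coding of language background; the function and variable names are illustrative.

```python
import math

INTERCEPT = 0.379   # logit of accuracy for Chinese listeners (coded 0)
SLOPE = -0.421      # change in logit for English listeners (coded 1)


def predicted_probability(language_background):
    """Inverse-logit of the fitted model for a 0/1 language code."""
    logit = INTERCEPT + SLOPE * language_background
    return 1.0 / (1.0 + math.exp(-logit))


odds_ratio = math.exp(SLOPE)
p_chinese = predicted_probability(0)
p_english = predicted_probability(1)

print(f"odds ratio = {odds_ratio:.3f}")   # odds ratio = 0.656
print(f"P(correct | Chinese) = {p_chinese:.2f}")
print(f"P(correct | English) = {p_english:.2f}")
```

The predicted accuracy is roughly 0.59 for Chinese listeners and roughly 0.49 for English listeners, which agrees with the t-test pattern: above chance for the first group and at chance for the second.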
Results Visualization:
- Figure 1 shows the proportion of correct responses for English and Mandarin listeners.
- A proportion of 0.5 indicates chance-level performance; error bars represent one standard error of performance across participants.