[Auto-generated transcript. Edits may have been applied for clarity.]
Okay, everyone, let's get started. So first of all, well done for getting here.
It's a very rainy Tuesday, so thank you for that. Uh, this week is about speech perception.
First lecture today is, uh, well, I hope, a gentle introduction.
And then the next session will move on to some more of the theory.
But for today's session, which is an introduction, I'd like to begin by playing you this fun video, which I think will help us get thinking about speech perception.
So have a watch of this, please.
[Video plays: a comedy sketch about two men in a voice-activated lift in Scotland.]
"Ever tried voice recognition technology?" "No. They don't do Scottish accents."
"Eleven." "Could you please repeat that?"
"Eleven. Eleven. Eleven."
"Eleven." "Could you please repeat that?"
"Eleven. [INAUDIBLE]"
"We need to try an American accent. Eleven."
"Eleven." "I'm sure it is, then."
"Eleven." "Wait a minute. I'm sorry. Could you please repeat that?" "English accent."
"Eleven." "That's from the same part of England as Dick Van Dyke."
"Please speak slowly and clearly." "Smart arse."
You laugh, and so it goes on; they get more and more angry.
Um, so I'm playing this to you for a reason. Let me just get back to the slides. Good. Okay.
Just to make the point that understanding speech is harder than it might seem. Our brains do it automatically,
having evolved very smart mechanisms to process speech, but it's harder than it seems.
There's lots of complexities to it. So on this slide, I'm going to walk you through some of the challenges of speech perception.
So these are the challenges that your brain is somehow solving because you understand speech.
Most of us can understand speech. Um, but then you start to realise that these are real challenges,
complex things to do, once you try to get a computer to do it. If you've ever used Siri or Alexa, they work pretty well, but not all the time.
And it can be quite frustrating, because it is a hard thing to understand speech.
So these are the reasons why it's hard to make sense of a speech signal.
So what I'm showing you here is a picture of a sentence.
So this is, um, a depiction showing you the sound waveform.
What you're seeing here is changes in sound pressure over time.
So this is a visual depiction of the sound of a sentence.
And specifically, it's the sentence 'That's the answer to the question in the exam'. Let me turn it up a little bit.
[Plays audio: 'That's the answer to the question in the exam.'] Yes, it chopped off the beginning a little,
but yes: 'the answer to the question in the exam'.
So I'm just showing you where the words are aligned in time: the corresponding bits in the acoustic signal.
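Aside: if you want to try this yourself, here's a minimal Python sketch for plotting a waveform like this one. The filename "sentence.wav" is a hypothetical stand-in for any mono recording of a sentence.

```python
# Minimal sketch: plot the sound-pressure waveform of a recorded sentence.
# "sentence.wav" is a hypothetical filename; any mono WAV file would do.
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

rate, samples = wavfile.read("sentence.wav")  # sample rate (Hz) and amplitudes
t = np.arange(len(samples)) / rate            # time axis in seconds

plt.plot(t, samples)
plt.xlabel("Time (s)")
plt.ylabel("Sound pressure (arbitrary units)")
plt.title("Changes in sound pressure over time")
plt.show()
```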
So one of the things that makes speech hard to process and understand, especially for a computer,
is that unlike written language, there are no clear gaps between words.
So, for example, if you look at this stretch of the audio, there's no gap where I'm pointing to.
And yet there are two distinct words there: 'in' and 'the'. Conversely, here,
if you look at this part of the audio waveform, it looks like there's a bit of a gap, a bit of silence.
So maybe this is one word and this is another word. But no,
it's a single word: the word 'answer'. Okay. So that's one of the things that's quite unique to speech.
Another thing that's tricky is acoustic variability.
So if you look at the two instances of 'the' I'm highlighting here, look at how it appears in the acoustic signal here.
It looks quite different to here. Okay.
So this is acoustic variability expressing itself even just within a single sentence from the same speaker.
And of course you'll have even more variability if you're considering different speakers different genders with different accents.
And even speaking rate as well: how quickly you speak will also add acoustic variability.
And just to explain this word here: coarticulation.
This refers to the fact that whatever speech you're currently producing, you're already getting ready to say the next speech sound.
So there's an influence of what you're next going to say on what you're currently saying.
This is called coarticulation, and it creates acoustic variability. This might be why 'the' is different here:
you've got different sounds that follow each 'the'. You've
got the 'the' in 'the question' versus this one, which has 'answer' following it.
Okay. So if you see that in textbooks, that's what coarticulation means.
I think you've had lectures on vision already, and maybe you've seen that in visual experiments
they like to use static images that don't change at all.
They just present them on the screen and measure behavioural responses, or maybe brain responses.
But there's no such thing as a static image for hearing and for speech perception, because sound is inherently something that changes over time.
So it's a dynamic signal. So you've got all these kind of, you know, issues to do with time.
First of all, speech is coming in really quickly.
So in conversational speech you might hear up to 200 words per minute.
So it's really fast. And also sound is fleeting. So it's not a static image.
Once you hear it it doesn't stick around. It quickly goes away.
So that means that you have to really process what you're hearing very quickly,
before the next thing you hear overwrites in memory what you're currently hearing,
well, what you've just heard. This has been termed the 'now-or-never bottleneck'.
Okay, so that was introducing you to speech perception and all the things that go with it.
I think all these things I've explained to you make it very scientifically interesting as a cognitive psychologist to study how this works.
As a cognitive psychologist, you have to understand how we achieve perception of this rapidly changing sensory signal.
And then you've also got high-level cognition coming into it,
because you have to match that rapidly changing sensory signal onto higher-level language representations,
so memories of words and things like that. Of course, it's not just scientifically interesting.
There are other reasons for studying speech perception. First of all, most obviously, it's the primary means by which we communicate,
for most of us. It's a central part of the human experience.
If we want a full account of how the mind works and how the brain works,
we want to understand and explain how speech perception happens.
It also extends to something you might not immediately recognise as being linked to speech perception: reading.
So in England, children start school at the age of four, and they're already taught to read in reception,
the first year of school. And the method by which children learn to read is something called phonics.
So phonics is about learning the relationship between letters, which are visual symbols, and speech sounds.
And indeed there are many theories of reading, and of developmental conditions like dyslexia,
that explain dyslexia as arising not from a visual difference,
but from something to do with speech representations in the brain.
Okay, so if you want to understand reading development in children, you want to understand speech perception.
Another reason for studying speech perception is if you think of someone who has some form of hearing loss.
So here is a picture of a child with a cochlear implant.
So cochlear implants are devices that restore hearing to profoundly deaf individuals.
Now, the way they work is they restore hearing, but not fully.
So the sound that a cochlear implant listener gets is quite degraded.
And so what the brain needs to do is to adapt to that,
to learn to use the implant and make sense of these quite distorted and unnatural-sounding sounds.
And so we want to understand what's going on in cochlear implant listeners, and understand why there's a lot of variability in outcomes,
why some listeners seem to be able to learn with their implant better than others.
Well, we want to understand how the brain understands speech, how the brain processes speech.
And maybe through that we'll be able to come up with better diagnostic tests and maybe interventions as well,
to help improve outcomes in cochlear implant listeners.
Another clinical connection is developmental language disorder, which affects about two children in every classroom.
And so this is a brain difference that makes talking and listening difficult.
So again this is defined by how the brain's processing speech.
So if we want to understand this and help children with this condition, we want to understand speech perception.
Okay, so that was all an introduction to the introduction.
Um, so this slide now gives you an outline and the learning outcomes for the rest of the lecture today.
So let's start off by telling you about how speech is produced,
and then we'll move on to how it's perceived. And I'll introduce you to an important theory called source-filter theory.
Okay, so what are speech sounds in the first place? Let's just consider that.
I don't want to assume too much prior knowledge here. Um, so speech sounds reflect changes in sound pressure resulting from vocal movements.
And I'll tell you a bit more about that on the later slides.
And again, I've shown you this already. This is the changes in sound pressure for a sentence.
You'll often see speech sounds referred to as phonemes.
What are phonemes? Well, phonemes are the smallest units of speech that convey meaning.
So if you change one phoneme in a word, you change the meaning.
So for example, 'pin' versus 'bin'.
The first phoneme is different in these two words.
Phonemes are often denoted with slashes on either side.
That's just to signify that we're not talking about letters here;
we're talking about phonemes, a very specific thing, according to this definition.
Just to elaborate on that connection with letters:
we're all familiar with letters, we know the alphabet, but phonemes are not the same thing as letters.
Okay. Letters refer to written symbols, whereas phonemes refer to speech sounds.
So for example, the first letter in 'cats' and 'kites' is different,
but the first sound of these two words is the same phoneme, denoted by this symbol here: /k/.
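Aside: to make the letters-versus-phonemes distinction concrete in code, here's a toy sketch. The phoneme transcriptions are simplified illustrations, not proper IPA.

```python
# Toy sketch: words as phoneme sequences rather than letter strings.
# Transcriptions are simplified illustrations, not full IPA.
def is_minimal_pair(a, b):
    """True if two phoneme sequences differ in exactly one position."""
    return len(a) == len(b) and sum(p != q for p, q in zip(a, b)) == 1

pin = ["p", "I", "n"]
bin_ = ["b", "I", "n"]
cats = ["k", "a", "t", "s"]    # spelled with "c", but starts with the phoneme /k/
kites = ["k", "aI", "t", "s"]  # spelled with "k": different letter, same phoneme

print(is_minimal_pair(pin, bin_))  # True: one phoneme changed, meaning changed
print(cats[0] == kites[0])         # True: different letters, same first phoneme
```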
You'll learn more about phonemes in the language lectures coming up next week and the week after.
Um, some of the things we touch upon today will come up again in those language lectures,
which move on to language more broadly: not just speech, but reading and other aspects of language.
Okay, so how do we produce speech? Well, all sounds require a basic source of energy to be produced.
Okay, so in the case of speech, that source of energy is coming from the lungs.
So the lungs are pushing air up the trachea, or windpipe.
So that's this bit here. What happens then is the air reaches the larynx, and specifically the vocal cords in the larynx.
The air causes the vocal cords to vibrate. Now this is where source-filter theory comes in.
So in source-filter theory, the vocal cords and the vibration of the vocal cords are termed the source.
And this is because, together with the air coming from the lungs,
that's the source of the energy behind speech: that's the initial sound
comprising speech that's generated.
And the vibration of the vocal cords is really important for pitch and intonation.
So to explain this a little bit more, I'm going to play you a sound that's been artificially generated.
What's happened is that a computer algorithm has taken speech and applied some fancy processing
to leave only the vocal cords, taking away everything else that comprises the speech.
So it's as if you're listening to a passage from someone without a head.
Okay. So imagine you haven't got all these components here: you haven't got these cavities,
the lips, the teeth, the tongue, which, as you'll see, are also important for shaping speech.
You're taking all this away, and so you're just listening to the raw sound from someone's voice box,
someone's larynx. This is what it might sound like.
So it sounds really funny, but hopefully you can hear the pitch and intonation.
You could probably make out whether that person was asking a question versus making a statement,
or subtle changes in their intonation.
That information is still there, even though you've removed the head and you're just listening to the voice box.
Okay, so that's the source. And sorry, this is a bit gross,
but it's just to give you an idea of what the vocal cords look like, because we talk about vocal cords without knowing what they look like.
So this is as observed with a camera down someone's throat,
which can't have been very comfortable. But you can see these are the vocal cords vibrating. Okay.
So that's one part of speech. But then there's another really important part of speech,
and that's the filter. So what's the filter? The filter is the result of what the supralaryngeal vocal tract is doing.
'Supra' means above, and 'laryngeal' here is referring to the larynx.
So the supralaryngeal vocal tract refers to all the structures above the larynx.
That includes the cavities: the nasal cavity, the oral cavity, your lips, your teeth, your tongue.
When you're talking, all these things are moving.
And that movement has the effect of shaping, or filtering, the sounds that have been generated by the vocal cords.
This filter is really important for speech: it's important for producing different speech sounds,
for conveying words and intelligible speech.
So if we take that sound that I just played, that funny sound where
you could hear the pitch but you couldn't get the words, and we now bring back the filter,
so we give the person their head back, it sounds like this: '...surprised his parents by his lack of concern.'
So now the speech becomes intelligible, because now you've got a tongue,
you've got lips and you've got teeth, and you can use them to shape the sounds of the vocal cords to produce different speech sounds.
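Aside: source-filter theory lends itself to a simple simulation. The sketch below is a toy under stated assumptions, not a real synthesizer: an impulse train stands in for the vocal-cord source, and resonators at rough, illustrative formant values stand in for the vocal-tract filter.

```python
# Minimal source-filter sketch (a toy, not a production synthesizer).
# Source: an impulse train at the fundamental frequency, a crude stand-in
# for glottal pulses. Filter: resonators at rough formant values for /a/.
import numpy as np
from scipy.signal import lfilter

fs = 16000                      # sample rate (Hz)
f0 = 120                        # vocal-cord vibration rate: sets the pitch
n = int(fs * 0.5)               # half a second of sound

source = np.zeros(n)            # the SOURCE: pulses at f0
source[::fs // f0] = 1.0

def resonator(x, freq, bw):
    """Two-pole resonance at freq (Hz) with bandwidth bw (Hz)."""
    r = np.exp(-np.pi * bw / fs)
    theta = 2 * np.pi * freq / fs
    return lfilter([1.0 - r], [1.0, -2 * r * np.cos(theta), r * r], x)

out = source                    # the FILTER: cascade of formant resonances
for freq, bw in [(700, 80), (1200, 90), (2600, 120)]:  # rough F1, F2, F3
    out = resonator(out, freq, bw)

out /= np.max(np.abs(out))      # normalize before saving or playback
```

Changing f0 changes the pitch without touching the formants; changing the formant frequencies changes the vowel without touching the pitch. That independence is exactly the separation source-filter theory describes.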
Here's a video now, so you can see under the hood what's going on with the supralaryngeal vocal tract,
and see how all the various structures like the lips, the teeth, etc. are moving.
And you can see how rapidly they're moving. This is stuff you're doing all the time without thinking about it,
but there's real sophistication in what you're doing.
And so this is going to be a video of someone speaking in an MRI scanner.
And so the MRI scanner allows us to look, um, more clearly at all these different structures.
[Video plays:] 'When it comes to singing, I love to sing French art songs.
It's probably my favourite type of song to sing.
Um, I'm a big fan of [INAUDIBLE]. I mean, I also love opera; I love singing Donizetti and Mozart and Strauss.
Um, but when I listen to music, I tend to listen to hard rock or classic rock musicians.
One of my favourite bands is AC/DC, um, and my favourite...'
Okay, so that's the supralaryngeal vocal tract.
Now on this slide, I want to focus a bit more on the acoustic aspects.
One of the useful tools for visualising speech, analysing it and making sense of it, is what's termed a spectrogram.
Now this is a way to analyse the different frequencies of speech. I've shown you this already:
you know that sentence from one of the first slides? I termed that a sound waveform.
So a sound waveform shows changes in sound pressure over time.
But this doesn't really allow us very easily to look at the frequency content of speech.
But a spectrogram allows us to do this more easily by splitting up this signal into different frequency bands or different frequency channels.
So when you compute a spectrogram, what you see is this.
So you've still got time on the x axis, but now you've got frequency on the y axis.
And so if you're looking here, this is the low-frequency part of speech.
Here is the middle part, the middle frequencies, and here would be the high frequencies.
And then the colour indicates the amplitude.
So if you've got a dark colour, that means there's some amount of acoustic energy at that point in time
and in that particular frequency. [Plays audio.] Anyone hear that?
What I'm doing there is playing just this narrow bit of the spectrogram here,
this low-frequency bit. Now I'm going to move up to higher frequencies.
So those are the individual frequencies.
And if you add them all up together, if you look at all the frequencies together,
then you get back the original speech: 'That's the answer to the question in the exam.'
Okay, so a spectrogram shows how sound amplitude varies as a function of time on the x axis, frequency on the y axis.
And you can see how it allows you to see the structure of speech in much more detail.
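Aside: here's a short sketch of how a spectrogram like this could be computed with scipy; "sentence.wav" is again a hypothetical filename standing in for any mono recording.

```python
# Sketch: compute a spectrogram (time on x, frequency on y, colour = amplitude).
# "sentence.wav" is a hypothetical mono recording, as before.
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

rate, samples = wavfile.read("sentence.wav")
freqs, times, power = spectrogram(samples, fs=rate, nperseg=512, noverlap=384)

# Plot on a dB scale: darker/hotter colours mean more acoustic energy
plt.pcolormesh(times, freqs, 10 * np.log10(power + 1e-12))
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.show()
```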
One of the things it allows you to more easily do is to see what the supralaryngeal vocal tract is doing to the speech.
The filtering of the vocal tract appears as bands of energy at certain frequencies, called formants.
And so this is a formant here. Before the supralaryngeal vocal tract has an effect,
you would see just uniform black everywhere:
just energy everywhere.
But with the filtering, the shaping of the sounds by the supralaryngeal vocal tract, you filter out some frequencies but leave certain bands in.
And these are the formants, these bands of energy, and they're, as you'll see on the subsequent slides,
really important for conveying different speech sounds.
It turns out that the lowest three formants are most important for speech intelligibility:
formants one, two and three, or F1, F2 and F3. Okay, so let's apply these concepts to vowels.
Okay. So what I'm showing you here are spectrograms for different vowels.
So in the words 'heed', 'hid', 'head', 'hard', etc., and then also for 'herd', 'hoard', 'hood' and 'who'd'.
Now, I'm pointing here: this is formant one,
this is formant two, this is formant three, if you can see.
Now, if you go from a so-called front vowel to a back vowel,
comparing 'heed' versus 'hard',
can you see that the tongue is more frontal here in 'heed', versus in 'hard' where
it's more at the back? And you can see that that has a corresponding acoustic effect.
Specifically, F2, the frequency of formant two, goes down.
Okay. And then F1 also seems to be linked with the height of a vowel.
So if you go from a high to a low vowel, comparing 'heed', where the tongue is high, versus 'hard', where the tongue is quite low,
you can see that, sorry, I've confused myself.
You can see that F1 increases as you go from a high vowel to a low vowel.
Can you see here that this F1, in the high vowel,
is lower in frequency than the F1 here?
So this is basically making the point that these are acoustic correlates that go with
the individual speech sounds that you're producing, specifically vowels.
And so maybe what the brain is doing when it's hearing speech is matching up those formants to internal memories, to know: okay,
I've heard this acoustic pattern, so that means I'm hearing 'heed'.
If instead the brain is presented with this other acoustic pattern, then it's maybe going to think what's being said is 'hard'.
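Aside: one toy way to make that 'matching formants to internal memories' idea concrete is a nearest-prototype classifier over (F1, F2). The prototype frequencies below are rough illustrative values, not measurements from these slides.

```python
# Toy sketch of vowel recognition as nearest-prototype matching on (F1, F2).
# Prototype frequencies (Hz) are rough illustrative values, not measurements.
import math

prototypes = {
    "heed /i/": (270, 2290),   # high front vowel: low F1, high F2
    "hard /a/": (730, 1090),   # low back vowel: high F1, lower F2
    "who'd /u/": (300, 870),   # high back vowel: low F1, low F2
}

def classify(f1, f2):
    """Return the stored vowel whose (F1, F2) memory is nearest."""
    return min(prototypes, key=lambda v: math.dist((f1, f2), prototypes[v]))

print(classify(280, 2200))  # -> heed /i/
print(classify(700, 1150))  # -> hard /a/
```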
So that's vowels. Now for consonants.
Here I'm showing you the full spectrograms for three words:
this nonsense word 'pem', then 'ten', and then this nonsense word 'kem'.
So the vowel is always the same in these words, but if you look at the start of the words, the consonant is different.
And if you look at the beginning of the spectrogram, you can see that the trajectories of the formants
F2 and F3 are different for these three consonants.
So you can see here with 'ten' you've got this kind of curve here for F3,
and then the F2 goes down,
whereas you can see for 'pem' you haven't got that curve:
it starts low, then builds up in the F3. So F2 and F3 are doing different things for these different consonants.
Okay, let's quiz you now a little bit.
Um. Okay.
If you go on your phones or your laptops to this link, you should be able to give a response for this question.
So the question is: when saying different words with the same intonation, which part of your vocal tract changes the most?
Is it the larynx? Is it the vocal cords?
Is it the supralaryngeal vocal tract? Or is it the lungs?
I always wish these polls would tell me how many responses there are.
I'm not sure if it's just 2 or 3 of you, but
it seems like the clear favourite is this one, and that is the correct option.
Yeah. So that's what I was saying earlier.
So the supralaryngeal vocal tract is really important for shaping or filtering the sounds of the vocal cords,
which you need to convey different speech sounds.
If instead the question had been 'when saying the same words with different intonation,
which part of your vocal tract would change the most?', the answer would be the larynx.
That's because the larynx is important for producing the speech, but also the rate of the vocal cord vibration determines the pitch.
Okay. All right.
Should we do attendance very quickly?
Okay. The PIN is 0648.
And we're going to continue now. All right. So we've done the first learning outcome.
It's the one that takes the longest. Let's move on to explaining categorical perception.
Okay. So the way you demonstrate categorical perception is with a paradigm like this.
Okay. So you first need to generate a continuum of sounds.
In this case, on one end of the continuum you've got 'ba',
and this is the spectrogram showing the formants for 'ba'.
And then on the other end of the continuum you've got another sound, in this case 'da'. Now, around the middle of the continuum,
the listener will hear a sound that's ambiguous, or intermediate, between 'ba' and 'da'.
Okay, so they're hearing sounds from across this continuum,
in random order. And on each trial in this experiment, what the listener has to do is decide whether what they've just heard is a 'ba' or a 'da'.
[Plays continuum:] 'Ba, ba, ba... da, da, da.'
So that's supposed to sound like 'ba' at the beginning, morphing to 'da', in order,
but it's important to note that it wouldn't be in order in the actual experiment, right?
You'd randomise the order of these sounds; that was just to show you what it might sound like.
And there are better, more naturalistic ways to generate these stimuli. These are quite old stimuli, but it gives you the idea.
And yeah, so I mentioned this. One of the tasks would be to identify what sound they're hearing:
is it a 'ba' or is it a 'da'? And then you plot their responses on the y axis,
not how accurate they are, but the percentage of times they say it's a 'ba', for example.
And if you do this for each point of the continuum, then you'll see that when they're hearing a very clear 'ba',
most of the time they respond 'ba'. When they're hearing a very clear 'da' instead,
they never say 'ba', which means that in fact they're responding 'da'.
Okay, hopefully that makes sense. Now, if you look at this graph that we've just generated, this point here marks the phoneme boundary.
So the phoneme boundary is the point in the graph where listeners are equally likely to respond 'ba' as 'da'.
In this case it's around here: this is where 50% is reached on this graph.
And so, sorry, before I move on: the thing with categorical perception that's revealed here in this graph
is that this sound is changing from 'ba' to 'da' in a graded, equal-step fashion.
But can you see that the change in perception isn't linear?
You haven't got a constant, nice, smooth change in behaviour.
Instead, you have this: it remains 'ba' at this end, and then suddenly it switches around the phoneme boundary.
And so it seems like what's happening with perception is that even though the
acoustics are graded, morphing between 'ba' and 'da' in a continuous fashion,
the perception is categorical. It just switches between 'ba' and 'da'.
You don't really have much of a smooth transition between the two percepts.
And so that's one of the key characteristics of categorical perception:
we seem to be perceiving things as discrete categories rather than as a continuous variable.
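Aside: in an analysis, you would typically fit a sigmoid to the identification responses and read the phoneme boundary off as the 50% point. A minimal sketch, with made-up response percentages:

```python
# Sketch: fit a logistic function to identification data; the phoneme boundary
# is the continuum step where "ba" responses cross 50%. Data are made up.
import numpy as np
from scipy.optimize import curve_fit

steps = np.arange(1, 8)                              # 7-step ba-da continuum
p_ba = np.array([0.98, 0.97, 0.95, 0.55, 0.08, 0.03, 0.02])  # P(respond "ba")

def sigmoid(x, boundary, slope):
    return 1 / (1 + np.exp(slope * (x - boundary)))  # falls from 1 to 0

(boundary, slope), _ = curve_fit(sigmoid, steps, p_ba, p0=[4.0, 1.0])
print(f"Phoneme boundary at continuum step {boundary:.2f}")
```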
So there's another way you can demonstrate categorical perception, and that's by asking participants to do a slightly different task:
a discrimination task.
What happens is that they hear pairs of sounds,
for example this sound and this sound, and then they have to respond and say whether the pair is different or not.
And so for a pair on this side of the phoneme boundary, you might get this
response here, which is not very good, because chance level is 50%.
So really this means that they're not doing very well when hearing a pair like this.
But if you present a pair that straddles the phoneme boundary, then performance shoots up.
So we can plot this, and this is another characteristic feature of categorical perception:
when you're doing a discrimination experiment, you see a peak in the discrimination function around the phoneme boundary.
And so this is demonstrating that your discrimination between speech sounds
is really tied to the phoneme categories that you know.
You can hear a difference when that difference spans different categories.
And so that's another characteristic of categorical perception.
So in summary, categorical perception is the tendency to perceive gradual sensory changes in a discrete fashion.
There are three characteristics, or hallmarks, of categorical perception.
You've got the abrupt change in identification at the phoneme boundary.
You've got the discrimination peak at the phoneme boundary. And also, you can predict discrimination from identification:
a sound pair will only be labelled as different
if the sounds in the pair correspond to different phonemes.
Okay. So our perception seems to really revolve around segmenting into discrete categories.
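Aside: that third hallmark, predicting discrimination from identification, can be sketched with a simple labelling model: assume listeners compare only category labels, so a pair sounds different only when its two tokens get different labels. Using the made-up identification data from the earlier sketch:

```python
# Sketch: predict discrimination from identification, assuming listeners
# compare category labels only. A pair is "different" only if its two tokens
# are labelled differently; this predicts a peak at the phoneme boundary.
import numpy as np

p_ba = np.array([0.98, 0.97, 0.95, 0.55, 0.08, 0.03, 0.02])  # as above

# For adjacent pairs (i, j): P(different labels) = p_i(1-p_j) + (1-p_i)p_j
p_i, p_j = p_ba[:-1], p_ba[1:]
p_different = p_i * (1 - p_j) + (1 - p_i) * p_j
print(np.round(p_different, 2))  # largest for the pair straddling the boundary
```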
And just to apply this a little bit to the real world:
you may remember this 'Yanny or Laurel' phenomenon that went viral on social media.
And so this is actually a demonstration of categorical perception.
So this sound was ambiguous between 'Yanny' and 'Laurel'.
But if you played it to listeners, some people heard it as a clear 'Laurel' and never heard it as 'Yanny',
while other people heard it as 'Yanny' and never as 'Laurel'.
They never heard a blend between Yanny and Laurel;
it was either Yanny or Laurel. So this is really categorical perception, right?
Even though the sensory signal might be ambiguous and changing continuously,
your brain can't help but categorise stuff and assign a category, in this case, Yanny or Laurel.
So let's listen to this. Have a listen; I'm going to play this clip.
[Plays clip.] One more time.
[Plays clip.] Hands up if you heard 'Laurel'.
That's about eight hands. I'm hearing 'Laurel' very clearly.
And the rest of you, just to confirm, heard 'Yanny', is that right? Hands up. Okay.
And none of you heard a blend between Yanny and Laurel, right?
Yeah. So that's the point. As for why some people, or most people, are biased towards Yanny:
well, I think if you were to play this to older people who've lost a bit of their high-frequency hearing, they're going to hear more
'Laurel'. Now, that's not to say that people who respond 'Laurel' have any sort of hearing loss,
I'm not saying that, but it seems like the formants that favour 'Yanny' are a little bit higher in frequency compared with 'Laurel'.
So if you're young and you've still got good hearing,
then you're going to be more prone to hearing 'Yanny'.
But for someone whose hearing is starting to go, or if you have a little bit of a
blockage in your ear or something like that, then you'll hear 'Laurel'. That's the explanation.
So again, it's to do with the formants that convey the different speech sounds in these words.
I've just started doing that as a Poll Everywhere poll, where I ask you the question whether you heard Yanny or Laurel.
Okay. So we've done the first two learning outcomes.
And we're going to finish off by discussing how speech perception is influenced by context,
an issue that's quite close to my heart, because this is the research that I do.
Okay, so this is another thing that went viral on social media.
So have a listen to this. The first audio clip is of the words 'green needle'. [Plays audio.]
The next audio clip is of the word 'brainstorm'. [Plays audio.]
So I'm hoping that for the first sound I played, you heard 'green needle',
and for the second sound I played, you heard 'brainstorm'. But in fact, I played the same sound throughout.
It was exactly the same sound; I just played it twice. But I gave you the context, or expectation, of hearing 'green needle' the first time,
and hopefully you heard it as 'green needle'. The second time around, with the exact same sound,
I gave you the expectation of hearing 'brainstorm', and hopefully you heard it as 'brainstorm'.
So this is a demonstration of how speech perception is very much influenced by prior knowledge, context and expectations.
So people saw this on social media, but
this is where it came from: apparently from a YouTube review of this toy.
I think if you press the button, it says 'brainstorm', or is it 'green needle'?
But, you know, it's also been demonstrated more formally in the scientific literature.
So a very well-known effect of context influencing speech perception is the so-called McGurk effect.
So the McGurk effect is the result of visual context.
When you're seeing someone speak, you see the lip movements a little bit before you actually hear the speech;
there's a very short lag.
And so the lip movements can act as a sort of predictive cue, or context, for what you're about to hear.
And this video clip explains more specifically what the McGurk effect is.
So have a watch. [Video plays:] 'Ba, ba, ba.
Ba, ba, ba, ba. Ba, ba, ba.
Da, da, da.' It's very cheesy, I know, but I think it explains it nicely.
So what's going on here is that the brain is receiving a visual signal indicating that it's about to hear 'ga'.
Now, 'ga' is produced at the back of the mouth: the tongue is moving high at the back of the mouth.
But the ear is hearing 'ba', and 'ba' is articulated at the front, with your lips. And what you perceive in the end is 'da',
because 'da' is, in articulation terms, somewhat intermediate between 'ba' and 'ga':
somewhere in the middle, in terms of articulation.
And so, faced with these conflicting signals, the brain is trying to come up with its best guess,
coming up with an interpretation that's in between the mismatching visual and auditory signals.
That's one well-known example of how context, in this case visual context, influences speech perception.
Another well-known example is the Ganong effect. This is an example of a lexical context effect.
You demonstrate the effect again with this sort of categorical-perception-type paradigm, where you hear a continuum of sounds,
in this case between 'g' and 'k'. And then we can ask listeners, for each trial: what are you hearing?
Is it a 'g' or is it a 'k'? We can place the sounds from the 'g'-to-'k' continuum at the start of 'iss'
in one condition.
And we might get something like this.
As before, at one end of the continuum, you get responses that favour the clear
auditory signal: if you're hearing a clear 'g', you respond 'g' most of the time.
And then, conversely, if you hear a very clear 'k', you won't be responding 'g';
you'll be responding 'k'. In the middle, you've got this ambiguous point.
Now, that's what you might get for this condition.
But then if you take the same continuum of sounds, the same ambiguous sound between 'g' and 'k', but now place it in front of 'ift' instead,
you might get this instead. You can see there's a clear difference when the sound is ambiguous.
In the case where you're hearing the sound in front of 'iss', your responses are biased towards 'k',
and that's because 'kiss' is a real word and 'giss' isn't.
So your responses are very much biased by your prior knowledge of English and the words that you know in English.
Conversely, in this condition, when the sound is placed in front of 'ift',
your responses are biased very much towards 'g', because 'gift' is a real word and 'kift' isn't.
So I think I'm about to play the two examples there.
Just bear in mind that in these two conditions, the start of the sound is exactly the same:
it's ambiguous, and it's only the sounds that follow
that are different, so 'iss' versus 'ift'. It's not playing properly;
it's a crappy sound card in the sound system, so it's dropping off the beginning of the sound.
But it doesn't matter, as long as you get the idea. Right.
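Aside: analytically, the Ganong effect shows up as a shift in the fitted phoneme boundary between the two lexical contexts. A sketch with made-up response proportions:

```python
# Sketch: the Ganong effect as a boundary shift. Fit identification curves for
# the same g-k continuum in two contexts; data below are made up to illustrate
# "_iss" biasing listeners toward "k" and "_ift" biasing them toward "g".
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, boundary, slope):
    return 1 / (1 + np.exp(slope * (x - boundary)))

steps = np.arange(1, 8)
p_g_iss = np.array([0.95, 0.90, 0.70, 0.35, 0.10, 0.05, 0.02])  # before "iss"
p_g_ift = np.array([0.98, 0.96, 0.90, 0.70, 0.35, 0.10, 0.04])  # before "ift"

fit_iss, _ = curve_fit(sigmoid, steps, p_g_iss, p0=[4.0, 1.0])
fit_ift, _ = curve_fit(sigmoid, steps, p_g_ift, p0=[4.0, 1.0])
print(f"Boundary shift: {fit_ift[0] - fit_iss[0]:.2f} steps, "
      "with more 'g' responses before 'ift' (where 'gift' is a word)")
```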
So that is it, actually.
So, to wrap up the lecture:
These are the key points, okay, that I want you to take away.
Very importantly, I told you about source-filter theory.
Source-filter theory describes speech production in terms of two separate components.
You've got the source, the vocal cords vibrating: that's important for generating the speech and for conveying pitch and intonation.
And then you've got the filter, the supralaryngeal vocal tract, which shapes the sounds from the vocal cords,
and that's important for conveying different speech sounds.
I told you about how speech sounds are perceived in a categorical fashion, even if they are acoustically graded or ambiguous;
the Yanny/Laurel clip was an example of that.
I've also told you about how perception of speech sounds is very strongly influenced by context.
Okay. So we've actually got time, just
five minutes, for questions.
And of course, I'm really happy to take questions over email.
There's the discussion board on Canvas. Um, I've got drop-in hours as well.
But you know, I'm happy to take questions now if you have any.
And if you want to come up to me now, I'll stick around, and you can ask me directly.
So thanks again. I'll see you later in the week for the second lecture of the speech perception
content. Thank you. [Applause.]
Very vigorous clapping. Thank you.