Computational Linguistics and Psycholinguistics


59 Terms

1. Understanding a language

Grounded in experience and perception; uses meaning, context, and emotion. Learned from the world.

2. Simulating a language

Models the patterns it sees, learning from text and user input. Based on patterns of word co-occurrence; grounded in data and probability.

3. Computational linguistics

Using computational methods to model how human language works.

4. Natural language processing

Builds applications that let computers use human languages.

5. Distributional hypothesis

Words that appear in similar contexts tend to have similar meanings; e.g., "cat" and "dog" both appear near words like "pet," "fur," and "feed."

6. Vector representations

Computers build vector representations from the contexts words appear in, capturing meaning mathematically.

7. Vector semantics

Words become vectors whose dimensions represent co-occurrences with other words; similarity between words can then be calculated (see the sketch below).
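
A minimal sketch of that calculation in Python, assuming invented co-occurrence counts (the words, context words, and numbers are all made up for illustration):

    import math

    # Hypothetical co-occurrence counts with the context words
    # ("pet", "fur", "feed", "loan") -- invented for illustration.
    vectors = {
        "cat":  [10, 8, 7, 0],
        "dog":  [9, 6, 8, 1],
        "bank": [0, 0, 1, 9],
    }

    def cosine_similarity(u, v):
        # cos(u, v) = dot(u, v) / (|u| * |v|)
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    print(cosine_similarity(vectors["cat"], vectors["dog"]))   # high (~0.99)
    print(cosine_similarity(vectors["cat"], vectors["bank"]))  # low (~0.05)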

8. Embeddings

Take words as input and produce vectors as output.

9. Static embeddings

The model learns which vector dimensions are useful; the system then predicts the next word based on the current word.

10. Advantages of static embeddings

Vectors have fewer dimensions (~300 vs. ~50,000): dense representations with fewer zeros.

11. Disadvantages of static embeddings

Each word gets one fixed vector: "bank" gets the same vector regardless of whether the context is a river or a financial institution.

12. Computing analogies with static embeddings

"Man is to king as woman is to ---"

1. Get the relationship between "man" and "king."

2. Apply that relationship to "woman."

3. Find the nearest word (see the sketch below).
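
A minimal sketch of those three steps, assuming tiny made-up 3-dimensional vectors (real static embeddings have ~300 dimensions learned from data):

    import math

    # Hypothetical toy embeddings, chosen so the man->king offset
    # (a "royalty" direction) also maps "woman" to "queen".
    emb = {
        "man":   [1.0, 0.0, 0.2],
        "king":  [1.0, 1.0, 0.2],
        "woman": [0.0, 0.0, 0.3],
        "queen": [0.0, 1.0, 0.3],
        "apple": [0.5, 0.1, 0.9],
    }

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))

    # Steps 1 and 2: target = king - man + woman.
    target = [k - m + w for k, m, w in
              zip(emb["king"], emb["man"], emb["woman"])]

    # Step 3: nearest remaining word by cosine similarity.
    candidates = {w: v for w, v in emb.items()
                  if w not in {"man", "king", "woman"}}
    print(max(candidates, key=lambda w: cosine(candidates[w], target)))  # queen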

13. Contextual embeddings

The same word gets different vectors depending on context; models look at surrounding words when creating representations (e.g., ELMo, BERT).

14. Attention mechanisms

Use attention to weigh which context words matter. E.g., in "the river bank was flooded," the model pays attention to words like "river" and "flooded": contextual clues shape the representation (see the sketch below).
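
A minimal sketch of the weighting idea, assuming made-up 2-dimensional word vectors. Real transformers use learned query/key/value projections; only the softmax-over-dot-products pattern is illustrated here:

    import math

    sentence = ["the", "river", "bank", "was", "flooded"]

    # Hypothetical vectors: "river" and "flooded" deliberately point in
    # a similar direction to "bank" so they get large attention weights.
    vecs = {
        "the":     [0.1, 0.1],
        "river":   [0.9, 0.8],
        "bank":    [0.8, 0.7],
        "was":     [0.2, 0.1],
        "flooded": [0.7, 0.9],
    }

    def softmax(xs):
        exps = [math.exp(x) for x in xs]
        return [e / sum(exps) for e in exps]

    # Score each word by its dot product with the query ("bank"), then
    # normalize the scores into attention weights that sum to 1.
    query = vecs["bank"]
    scores = [sum(q * k for q, k in zip(query, vecs[w])) for w in sentence]
    for word, weight in zip(sentence, softmax(scores)):
        print(f"{word:>8}: {weight:.2f}")  # "river"/"flooded" outweigh "the"/"was"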

15. N-grams

Sequences of n words used to approximate which word should come next in a sequence.

16. N-grams example

"The water of Walden Pond is so beautifully..."

2-grams: "water of," "the water"

3-grams: "the water of," "pond is so"

4-grams: "walden pond is so"

The ∞-gram (full history), 4-gram, and 2-gram models all predict the next word: "blue." (See the sketch below.)
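
A minimal sketch of n-gram extraction and count-based next-word prediction, using a tiny invented corpus (real n-gram models need vastly more text):

    from collections import Counter, defaultdict

    corpus = ("the water of walden pond is so beautifully blue "
              "the sky is so beautifully blue "
              "the water of the lake is cold").split()

    def ngrams(tokens, n):
        # Every consecutive run of n tokens.
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    print(ngrams(corpus[:4], 2))  # [('the', 'water'), ('water', 'of'), ('of', 'walden')]

    # Bigram prediction: count which word follows each word, then pick
    # the most frequent follower.
    following = defaultdict(Counter)
    for w1, w2 in ngrams(corpus, 2):
        following[w1][w2] += 1

    print(following["beautifully"].most_common(1))  # [('blue', 2)]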

17. N-grams disadvantage

They require a lot of data to identify word relationships, diverse datasets to generate novel text, and datasets specific to the model's use case.

18. Transformers

A neural network architecture proposed by Google in 2017. LLMs like ChatGPT are possible because of transformers. Their attention mechanism makes them good at generating human language: they encode context sensitivity, keeping track of word meanings in the context of the sequence.

19. Homonym

The same word with different meanings.

20. Leka & Shah (2025)

Improve how we guide people to brainstorm and innovate solutions to challenges in industry.

21. Ignoring linguistic diversity

Most research focuses on English, so billions of speakers are excluded from AI's benefits. Digital colonialism: language hierarchies are reinforced digitally, accelerating the loss of endangered languages. Technical and economic barriers lead to linguistic injustice, cultural loss, and reinforced language power dynamics.

22. Reproducing implicit bias

Models encode the patterns in their language data: Black names have greater cosine similarity to unpleasant words than white names, and analogies encode stereotypes ("father is to doctor as mother is to....").

23. Benefits of LLMs

Healthcare, accessibility, education, and documentation of endangered languages.

24. Psycholinguistics

The study of how we understand, produce, and learn language.

25. Speech perception

How we decode acoustic signals into meaningful sounds (e.g., speaker normalization, the McGurk effect).

26. Lexical access

How we retrieve word meanings from mental storage.

27. Sentence processing

How we parse grammatical structure and build meaning.

28. Language production

How we plan and articulate our thoughts as speech.

29. Speaker normalization

Part of speech perception: we modify our expectations about linguistic input to account for what we know about the speaker (e.g., gender, physical size).

30. McGurk effect

An error in perception that occurs when we misperceive sounds because the audio and visual parts of the speech are mismatched; we also rely on visual information to perceive sounds.

31. Warren and Warren (1970)

Found that participants reported hearing sentence-relevant restored phonemes: "*eel" heard as "heel" in the context of a shoe and as "peel" in the context of an orange.

32. Temporary ambiguity

Present during the processing of a sentence but resolved by its end ("the rock band (banned?) played all night").

33. Garden path effect

Phenomenon in which people are fooled into thinking a sentence has a different structure because of a temporary ambiguity.

34. Global ambiguity

Not resolved by the end of a sentence; context is required to determine the intended structure and meaning ("the cop saw the man with the binoculars"). Prosody can help.

35. Prosody

Intonation and pausing that help resolve ambiguity.

36. Speech production

Conceptualization, formulation, articulation.

37. Speech errors

Anticipations, perseverations, metathesis, spoonerisms, shifts, blends, substitutions.

38. Anticipations

A later unit is substituted for (or added to) an earlier unit.

39. "Splicing from one tape" → "splacing from one tape"

Anticipation.

40. Perseverations

An earlier unit is substituted for a later unit.

41. "Splicing from one tape" → "splicing from one type"

Perseveration.

42. Metathesis

Two units switch places, each taking the place of the other.

43. "Fill the pool" → "fool the pill"

Metathesis.

44. Spoonerism

A metathesis involving the first sounds of two separate words.

45. "Dear old queen" → "queer old dean"

Spoonerism.

46. Shift

A unit is moved from one location to another.

47. "She decides to hit it" → "she decide to hits it"

Shift.

48. Blends

Two words fuse into one.

49. "Grizzly"/"ghastly" → "grastly"

Blend.

50. Substitutions

One unit is replaced with another.

51. "It's hot in here" → "it's cold in here"

Substitution.

52. What speech errors reveal

Speech is planned in advance, and there are distinct levels of planning (meaning, words, sounds). Slips of the hand also occur in sign language.

53. Non-literal language

Metaphor, idioms, irony/sarcasm.

54. Metaphor

Understanding one thing in terms of another ("time is money").

55. Idioms

Fixed expressions with non-compositional meaning ("she spilled the beans"). Familiar idioms are processed as single lexical units; unfamiliar idioms require more time.

56. Irony/sarcasm

Saying the opposite of what you mean ("nice parking job"). Requires recognizing the mismatch between the statement and the context.

57. Sequential brain processing

1. The brain computes the literal meaning.

2. It detects that the literal meaning doesn't fit the context.

3. It searches for an alternative meaning.

(Figurative language takes longer to process.)

58. Direct access theory

The brain uses context from the start and accesses figurative meaning directly; familiar metaphors/idioms are processed just as fast as literal language.

59. Lack of invariance in speech perception

The fundamental problem that the same sound (phoneme) is realized as different, inconsistent acoustic signals due to speaker differences, speaking rate, and context (coarticulation), yet listeners consistently perceive the same sound.