Lecture 14: the role of negative information in distributional semantic learning

Last updated 6:49 PM on 3/13/26
36 Terms

1

why do we try to integrate other types of information in distributional models?

  • it reflects how humans use language and how language is applied to the perceptual world

  • language isn’t only cognitive: we also use it to navigate the social environment

2

define the “user” extra-linguistic information

who produced the reddit comment

3

define “discourse” extra-linguistic information

the subreddit in which the comment was produced

4

what extra-linguistic information can be extracted from reddit? (2)

  • user: who produced the comment

  • discourse: the subreddit in which it was produced

5

define “usage based theory”

  • kids don’t start out with sophisticated language-processing abilities built on abstract rules

  • they are sensitive to social information and pay attention to people around them

  • kids will mimic how others use language

6

why is reddit a popular choice when trying to get textual information from someone?

it provides every comment made by a user along with the social context in which each was made (the subreddit) → helps us understand the role communication plays at a large social scale

7

true or false: all models use sentences as a linguistic context

true: usually due to the model’s requirements

8

what is the problem with using sentences as a linguistic context when training a model?

it works well for models, but it is not how humans process words: we don’t store information as sentences

9

define “ecological validity”

the extent to which the findings of a study can be generalized to real-life, naturalistic settings

10

define “word frequency”

the number of times each word occurs in a corpus
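
As a minimal sketch of this definition, word frequency can be counted with Python's `collections.Counter`. The two-sentence corpus and whitespace tokenization are assumptions made for illustration only.

```python
from collections import Counter

# Toy corpus (hypothetical example sentences, not from the lecture).
corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

# Word frequency: how many times each word occurs across the whole corpus.
# Tokenization here is a plain whitespace split, which is an assumption.
word_frequency = Counter(
    word for sentence in corpus for word in sentence.split()
)

print(word_frequency.most_common(3))
```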

11

how is our language system organized?

  • the more often we process something, the more strongly it is represented and the easier it becomes to access

  • meaning, the cognitive system adapts to the environment based on the frequency of word usage

  • we organize language usage based on how often we need to communicate with certain people

12

define “user contextual diversity” (UCD)

number of people who used a word across reddit

13

define “discourse contextual diversity”

number of subreddits a word was used in

14

what’s the difference between user contextual diversity and discourse contextual diversity?

  • user: number of people who used a word

  • discourse: number of subreddits a word was used in

*based on the idea that communication and language organization are social
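
The two counts above can be sketched as follows, assuming each comment is available as a (user, subreddit, text) record. The records, user names and subreddit names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy comments: (user, subreddit, text).
comments = [
    ("alice", "r/cooking", "bake the bread"),
    ("alice", "r/science", "the yeast makes gas"),
    ("bob", "r/cooking", "knead the bread"),
    ("carol", "r/science", "the gas expands"),
]

users_per_word = defaultdict(set)       # word -> users who produced it
subreddits_per_word = defaultdict(set)  # word -> subreddits it appeared in

for user, subreddit, text in comments:
    for word in text.split():
        users_per_word[word].add(user)
        subreddits_per_word[word].add(subreddit)

# User contextual diversity: number of distinct users per word.
ucd = {word: len(users) for word, users in users_per_word.items()}
# Discourse contextual diversity: number of distinct subreddits per word.
dcd = {word: len(subs) for word, subs in subreddits_per_word.items()}

print(ucd["the"], dcd["the"])
```

Note that a word can be high on one measure and low on the other: in this toy data "bread" is used by two users but in only one subreddit.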

15

in user contextual diversity, what does a word commonly used across many users signify?

when you start interacting with someone new, you will likely need to use or process that word

*user contextual diversity: number of people who used a word

16

according to UCD, how do we organize linguistic information?

not by language itself, but by the environment in which it was used

17

what are the different types of model? (3)

  • WW: word x word model

  • UD: user x discourse model

    • column = user

    • word count for a word = number of subreddits a user produces the word in

  • DU: discourse x user model

    • column = discourse

    • one element = number of users who produced a word in that subreddit
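
A minimal sketch of how the UD and DU counts described above could be built from the same comments, organized two ways. The toy records are hypothetical, and storing each word's row as a plain list is an illustrative choice rather than the lecture's implementation.

```python
from collections import defaultdict

# Hypothetical toy comments: (user, subreddit, text).
comments = [
    ("alice", "r/cooking", "bake the bread"),
    ("alice", "r/science", "the yeast makes gas"),
    ("bob", "r/cooking", "knead the bread"),
    ("carol", "r/science", "the gas expands"),
]

subs_by_word_user = defaultdict(set)  # (word, user) -> subreddits
users_by_word_sub = defaultdict(set)  # (word, subreddit) -> users

for user, subreddit, text in comments:
    for word in set(text.split()):
        subs_by_word_user[(word, user)].add(subreddit)
        users_by_word_sub[(word, subreddit)].add(user)

users = sorted({u for u, _, _ in comments})       # UD columns
subreddits = sorted({s for _, s, _ in comments})  # DU columns
vocab = sorted({w for _, _, t in comments for w in t.split()})

# UD: one column per user; entry = number of subreddits in which
# that user produced the word.
ud = {w: [len(subs_by_word_user.get((w, u), ())) for u in users] for w in vocab}

# DU: one column per subreddit; entry = number of users who
# produced the word in that subreddit.
du = {w: [len(users_by_word_sub.get((w, s), ())) for s in subreddits] for w in vocab}

print(users, ud["the"])
print(subreddits, du["the"])
```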

18

define “WW model”

word x word model

19

define “UD model”

  • UD: user x discourse model

  • column = user

  • word count for a word = number of subreddits a user produces the word in

20

define the “DU model”

  • DU: discourse x user model

  • column = discourse

  • one element = number of users who produced a word in that subreddit

21

true or false: the WW, UD and DU models were all trained on different corpora

false: it was the same corpus, but organized differently

  • WW: trained on sentences

  • UD: corpus organized by users

  • DU: corpus organized by discourses

22

[UD/DU] worked better than [UD/DU]

UD better than DU

23

true or false: when we build models, social information is negligible

false: it’s important

24

why do we need optimization?

when we set a model’s parameters, we need the values that make the model fit the data best → finding them is optimization

25

explain how the signal detection theory works

  • there are two distributions of sensation/memory strength:

  • green: there is a stimulus

    • if the sensation/memory is above the criterion, the person reports having felt/remembered the stimulus

    • hits: there was a stimulus and the person detected it

    • misses: there was a stimulus and the person did not detect it

  • blue: there is no stimulus

    • if the sensation/memory is below the criterion, the person reports not having felt/remembered the stimulus

    • false alarms: there was no stimulus but the person “detected” it

    • correct rejections: there was no stimulus and the person did not detect it

  • criterion = beta
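
The four outcomes can be illustrated with a small sketch that assumes both distributions are normal with equal variance. The means, the standard deviation and the criterion value are made-up numbers, not values from the lecture.

```python
from statistics import NormalDist

# Assumed distributions of sensation/memory strength (illustrative values).
noise = NormalDist(mu=0.0, sigma=1.0)   # "blue": no stimulus
signal = NormalDist(mu=1.5, sigma=1.0)  # "green": stimulus present
criterion = 0.75                        # respond "yes" above this value

# Stimulus present: above the criterion is a hit, below is a miss.
hit_rate = 1 - signal.cdf(criterion)
miss_rate = signal.cdf(criterion)

# No stimulus: above the criterion is a false alarm,
# below is a correct rejection.
false_alarm_rate = 1 - noise.cdf(criterion)
correct_rejection_rate = noise.cdf(criterion)

# Sensitivity (d'): distance between the two distributions, recovered
# from the hit and false-alarm rates alone.
z = NormalDist().inv_cdf
d_prime = z(hit_rate) - z(false_alarm_rate)

print(round(hit_rate, 3), round(false_alarm_rate, 3), round(d_prime, 3))
```

Moving `criterion` to the right lowers both the hit rate and the false-alarm rate, while `d_prime` stays the same because the distributions have not moved.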

26

explain the grid search optimization algorithm

  • test every possible combination of the parameter values you have

  • measure the fit of the model for each combination: the red area is a better fit than the blue

  • the most thorough way to optimize, because you find the best setting among all those tested
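
A minimal grid search sketch, assuming a model with two parameters and a hypothetical error function `fit_error` standing in for "how badly the model fits the data" (lower is better).

```python
from itertools import product

def fit_error(a, b):
    """Hypothetical error surface with its minimum at a=2, b=-1."""
    return (a - 2) ** 2 + (b + 1) ** 2

# Candidate values for each parameter (assumed for illustration).
a_values = [0, 1, 2, 3, 4]
b_values = [-2, -1, 0, 1]

# Evaluate every combination and keep the one with the lowest error.
best = min(product(a_values, b_values), key=lambda params: fit_error(*params))

# The number of evaluations is the product of the grid sizes.
n_evaluations = len(a_values) * len(b_values)

print(best, n_evaluations)
```

With k candidate values for each of n parameters the grid has k^n combinations, which is why the approach only scales to a handful of parameters.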

27

when can you not do a grid search optimization?

it only works with a small number of parameters: with too many, the number of combinations becomes too large to compute

28

explain how drift diffusion models work

  • you have a starting point and you drift towards a yes or a no response

  • non-decision time: processing that happens outside the decision itself (e.g. encoding the stimulus, executing the response)

  • drift rate/evidence accumulation: the process by which you drift towards a yes or no decision based on environmental information

  • there is a threshold/boundary: how far you need to drift before a decision is made
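
The components above can be sketched as a simple simulation. This is a discretized random walk under assumed settings: symmetric boundaries at `+boundary` ("yes") and `-boundary` ("no"), Gaussian noise, and made-up parameter values. The function name `simulate_trial` is hypothetical.

```python
import random

def simulate_trial(drift_rate, boundary, start=0.0, non_decision_time=0.3,
                   noise_sd=1.0, dt=0.001, max_time=10.0, rng=None):
    """Simulate one decision; returns (choice, reaction_time_in_seconds)."""
    rng = rng or random.Random()
    evidence, elapsed = start, 0.0
    # Accumulate noisy evidence until a boundary is crossed.
    while abs(evidence) < boundary and elapsed < max_time:
        evidence += drift_rate * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        elapsed += dt
    if evidence >= boundary:
        choice = "yes"
    elif evidence <= -boundary:
        choice = "no"
    else:
        choice = "timeout"  # no boundary reached within max_time
    # Reaction time = decision time + non-decision time.
    return choice, elapsed + non_decision_time

rng = random.Random(1)  # seeded so the run is repeatable
trials = [simulate_trial(drift_rate=2.0, boundary=1.0, rng=rng) for _ in range(200)]
proportion_yes = sum(choice == "yes" for choice, _ in trials) / len(trials)
mean_rt = sum(rt for _, rt in trials) / len(trials)
print(proportion_yes, mean_rt)
```

Because each run follows a different noisy path, the same parameters give a distribution of reaction times rather than a single value. A non-zero `start` would model a bias towards one response.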

29

what do drift diffusion models measure?

reaction time when making a decision

*pretty powerful, can explain a lot of data

30

what’s the problem with drift diffusion models? (2)

  • it requires a lot of computations (parameters)

  • bias factors: in some tasks, you are more likely to say yes or no

31

how does the simplex algorithm work?

  • it samples the parameter space instead of testing every combination

  • at first, it will test a few parameter settings and find the area that seems to work best

  • it will then sample around that position

  • it will then go down to smaller spaces until it finds the best fitting setting
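
One concrete version of this idea is the Nelder-Mead simplex method. The block below is a compact, simplified sketch of it (reflection, expansion, contraction and shrink steps with common default coefficients) applied to a hypothetical error function. It minimizes error, which is equivalent to maximizing fit. The names `nelder_mead` and `fit_error` are illustrative.

```python
def fit_error(params):
    """Hypothetical error surface with its minimum at (2, -1)."""
    a, b = params
    return (a - 2) ** 2 + (b + 1) ** 2

def nelder_mead(f, start, step=1.0, max_iter=500, tol=1e-10):
    n = len(start)
    # Initial simplex: the start point plus one offset point per dimension.
    simplex = [list(start)]
    for i in range(n):
        point = list(start)
        point[i] += step
        simplex.append(point)

    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of every point except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        # Reflect the worst point through the centroid.
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            # Promising direction: try going further (expansion).
            expanded = [c + 2 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            # Pull the worst point halfway towards the centroid (contraction).
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink the whole simplex towards the best point.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)

solution = nelder_mead(fit_error, start=[0.0, 0.0])
print(solution)
```

The simplex shrinks around whichever region looks best from where it started, so on a surface with several dips it can settle in a local one.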

32

what’s a problem encountered with the simplex algorithm?

when there are local maxima: it can keep sampling around a local maximum and miss the global one

33

true or false: the simplex optimizing function can be applied to all problems

false: some of them don’t have a deterministic space

*ex: drift diffusion models are not deterministic because they have no predictable path

34

what’s a more efficient alternative to a grid search? explain it

  • random search optimization/genetic algorithms

  • you start with different models with different settings

  • you select the most fit and they will breed with each other

  • the selected or produced models are the next generation

  • you evaluate their fit and select the best performing

  • (there can be some slight randomness)

  • you will find a solution without needing to test every parameter combination
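
The steps above can be sketched as a small genetic algorithm. The error function, population size, number of generations and mutation size are all assumptions chosen for illustration, and `evolve` is a hypothetical name.

```python
import random

def fit_error(params):
    """Hypothetical error surface with its minimum at (2, -1)."""
    a, b = params
    return (a - 2) ** 2 + (b + 1) ** 2

def evolve(rng, population_size=30, generations=60, n_parents=10,
           mutation_sd=0.1):
    # Start with models that have different random settings.
    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Select the fittest models (lowest error) as parents.
        parents = sorted(population, key=fit_error)[:n_parents]
        children = []
        while len(children) < population_size - n_parents:
            mother, father = rng.sample(parents, 2)
            # Breed: each parameter comes from one parent, plus a
            # small random mutation.
            child = tuple(rng.choice(genes) + rng.gauss(0.0, mutation_sd)
                          for genes in zip(mother, father))
            children.append(child)
        # Parents and their children form the next generation.
        population = parents + children
    return min(population, key=fit_error)

best = evolve(random.Random(0))  # seeded so the run is repeatable
print(best, fit_error(best))
```

This run evaluates 30 models per generation rather than every point on a grid, and keeping the parents in each generation means the best error never gets worse.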

35

true or false: backpropagation is an optimization algorithm

true: this means that LLMs (large language models) and word2vec are optimized (to some extent)

36

what parameters make up the parameter space of BEAGLE? (3)

  • vector dimensionality

  • n-gram window size

  • vocabulary size
