what does it mean for a rule to be productive (2 things)
it can be applied to new instances
makes language infinite
what does it mean for a rule to be unproductive (2 things)
only applies to a fixed list of words
doesn’t generalize
from both the Brown and Spanish CHILDES corpora, what is the most common type of error
over-regularization errors
briefly describe the 3 steps of U-shaped learning in regard to the English past tense
1 - children memorize past-tense forms
2 - learn the +ed rule and overapply it
3 - learn restrictions on the rule
what is the past tense debate really asking?
how is inflectional morphology represented in the mind?
what are 4 main points about the past tense debate? (both general info and field-related stuff)
a lot of the early debate was on English past-tense inflection
are regulars and irregulars represented/processed differently?
what does this tell us about the language faculty in a narrow sense? (how defined does the cognitive process have to be)
can we do all of this without symbolic representations? (huge implications)
what are the 2 camps in the past tense debate and what kind of model are they
regulars and irregulars are represented/processed differently (dual route model)
regulars and irregulars are represented/processed the same way (single route model)
describe 3 points of the dual route model
regulars are productive rules
exceptions are minor rules or memorized
generally associated with people who argue for a large faculty of language
describe 3 points of the single route model
no fundamental difference
it’s just a matter of frequency
generally associated with people who argue for small/general faculty of language
what is Pinker’s symbolic approach?
minds work like computer programs, math, and logic with symbols and rules
name 2 things about connectionism
a recurring trend in computational cognitive science
based on artificial neural networks (ANNs)
(in regard to connectionism) what if we could model behaviors by modeling neurons directly?
we could get away with very little or no mental abstraction
what are the 2 background assumptions of connectionism, and what point (+implications) goes with each?
it is informative to study the mind this way: is trying to model a brain directly more scientifically sound than reasoning about the mind in terms of processes and representations?
ANNs are reasonably accurate models of the brain: they are necessarily simpler, which leads to two implications: 1) if an ANN can learn something, so can a more complex brain; 2) if an ANN can’t learn something, we can’t draw cognitive conclusions
what are classic computational theories of mind?
CCTMs conceive of discrete symbolic representations and abstract processes that act on these representations
most things you’ve seen in linguistics classes (phonological rules, syntax trees)
do connectionists reject the CCTM?
yes
what do connectionists argue against the CCTM?
distributed representations: pieces of the representation are spread across neurons and can be represented with a vector of numbers (not human-interpretable)
briefly describe how CCTM (symbolic) and connectionist (distributed) models store information
CCTM/symbolic: each piece of representation is a discrete thing, and it’s clear how it contributes to the overall picture.
Connectionist/Distibuted: the strength of each neuron doesn’t represent anything in particular, but together they make up the picture
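A minimal sketch of the contrast (a toy example of mine, not from the readings): a Python dict plays the symbolic store, and a linear associator plays the distributed one, with all three mappings superimposed in a single weight matrix.

```python
import numpy as np

# Symbolic (CCTM) storage: discrete, inspectable entries; each
# key-value pair is one fact with a clear contribution to the whole.
lexicon = {"sing": "sang", "ring": "rang", "walk": "walked"}
print(lexicon["sing"])  # "sing -> sang" is stored as a unit

# Distributed storage: the same three mappings superimposed in ONE
# weight matrix (a classic linear associator). No individual weight
# stands for any verb; each fact is smeared across all of them.
rng = np.random.default_rng(0)
ins = {v: rng.normal(size=64) for v in lexicon}            # input codes
outs = {p: rng.normal(size=64) for p in lexicon.values()}  # output codes
W = sum(np.outer(outs[p], ins[v]) for v, p in lexicon.items())

recalled = W @ ins["sing"]  # retrieval is just matrix multiplication
best = max(outs, key=lambda p: recalled @ outs[p])
print(best)  # almost surely "sang", recovered from the superposition
```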
what 4 questions come up in regard to connectionism posing a challenge to classic models?
what if general ANNs can learn to form the past tense?
what if we don’t need explicit rules like +ed?
what if regulars and irregulars are represented the same way?
what if we don’t need any FLN (faculty of language in the narrow sense) to do this?
what did most people assume about a connectionist past-tense learner?
it would be impossible
What did Rumelhart & McClelland do in 1986?
made a connectionist past-tense learning system
name 3 basic features of the past-tense learner
a bunch of features are fed in
some features pop out the other end
no rules. all about token frequency
what 3 steps would be taken to prove the model learned the past tense (3 things plus a name for the process)
computational wug test!
train it on some (present, past) pairs
give it new present forms and see what it comes up with
if it’s correct, it learned the past-tense
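A toy version of that train-then-test loop (my sketch, with made-up verbs and crude spelling features; R&M’s actual model used Wickelfeature phonological encodings, not this):

```python
import numpy as np

# Toy stand-in for a connectionist learner: a one-layer softmax
# classifier mapping spelling features of a present form to a
# past-tense pattern. No rules anywhere, just weights.
VERBS = [("walk", "ed"), ("jump", "ed"), ("play", "ed"),
         ("sing", "vc"), ("ring", "vc"), ("hit", "same"), ("cut", "same")]
CLASSES = ["ed", "vc", "same"]

def features(verb):
    # one-hot encode the last two letters (26 slots each)
    x = np.zeros(52)
    x[ord(verb[-1]) - ord("a")] = 1
    x[26 + ord(verb[-2]) - ord("a")] = 1
    return x

# step 1: train on (present, past-pattern) pairs
X = np.array([features(v) for v, _ in VERBS])
Y = np.eye(3)[[CLASSES.index(c) for _, c in VERBS]]
W = np.zeros((52, 3))
for _ in range(500):
    p = np.exp(X @ W)
    p /= p.sum(axis=1, keepdims=True)
    W -= X.T @ (p - Y) / len(X)  # gradient step on cross-entropy

# step 2: give it novel present forms (the wug test)
for wug in ["talk", "bling", "sit"]:
    print(wug, "->", CLASSES[int(np.argmax(features(wug) @ W))])

# step 3: if these outputs match what speakers produce, the model has
# arguably learned the past tense without any explicit rule
```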
did Rumelhart & McClelland report U-shaped learning? what was interesting about it?
yes
they achieved the over-regularization by feeding it a bunch of irregulars and then flooding it with regulars
what was Pinker & Prince’s response to R&M’s model (title + 3 things)
too much over-irregularization
the model overproduced many irregularizations (and related issues)
failure to produce a strong asymmetry despite favorable training
outputted few base forms (which would be a more plausible failure)
give an example of each type of R&M error
over-irregularization: shape - shipt
gibberish output: mail - membled
doubled output: type - typeded
past tense of the novel form smilge: smilge - leafloag
what makes the English past tense “easy”?
+ed is the default and most frequent pattern
what is the problem with using connectionist models on the English past tense
connectionist models are sensitive to frequency, so we can’t tell whether a learner learned +ed because it’s the most frequent pattern or because it’s the most meaningful one
what is a better type of rule to test connectionist models with (give example)
patterns where the default isn’t the most frequent, such as German noun plurals
what do single-route models struggle to achieve
asymmetry between over-reg and over-irreg
what is deep learning and why does it not prove connectionism
it’s based on artificial neural networks, but it’s not meant to be a realistic cognitive network
what beats out deep learning?
well-planned algorithmic models
what are 2 things about where deep learning currently stands
deep learning is more accurate than old connectionism
it still has all the classic problems (too sensitive to frequency, too much over-regularization)
what are 3 types of flawed outputs deep learners make
unnatural metathesis - own —> won
over-irregularization - snow —> snew
doubled outputs - bleed —> blededed
what is token-frequency
how many items in running text
of patterns: how many times it shows up
what is type-frequency
how many items in the dictionary
of patterns: how many unique items it shows up with
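A quick sketch of the two counts over an invented mini corpus (all data made up for illustration):

```python
from collections import Counter

# invented (past form, pattern) tokens standing in for running text
corpus = [("sang", "i>a"), ("sang", "i>a"), ("sang", "i>a"), ("rang", "i>a"),
          ("walked", "+ed"), ("jumped", "+ed"), ("played", "+ed")]

# token frequency: how many times each pattern occurs in running text
token_freq = Counter(pat for _, pat in corpus)

# type frequency: how many distinct words each pattern occurs with
type_freq = {pat: len({w for w, p in corpus if p == pat}) for pat in token_freq}

print(token_freq)  # i>a: 4 tokens vs +ed: 3 tokens
print(type_freq)   # i>a: 2 types  vs +ed: 3 types
```

Note that i>a wins on tokens but loses on types, which is exactly the sing~sang profile in the next card.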
what kind of pattern is sing~sang
high token frequency but low type frequency: it keeps showing up, but with just a few unique items
between token and type, which frequency works better and why does the other one work worse
type frequency works better, because we’re asking how often we should extend to a new type
token frequency is misleading because of high-frequency items
exceptions are…
ubiquitous
define N, e, and θ
N = number of types that should obey the generalization
e = number of types that do not obey the generalization
θ = max number of exceptions that can be tolerated
how many items does a productive rule apply to?
N - e (the rule-following types; the e exceptions are listed)
no productive rule:
look through all N items as a list of exceptions (i.e., no rule)
which is faster to process with big e vs small e?
big e: listing all N as exceptions is faster
small e: rule+exception is faster
what is the tolerance principle (3 things)
a concrete model for the acquisition of linguistic generalizations
an evaluation metric over linguistic hypotheses
developed in the context of the past-tense debate
define the concept of the tolerance principle
given a hypothesized generalization operating over some class, quantitatively define the number of exceptions below which the generalization is tenable
exceptions are tolerable if
e < θ
θ =
N / ln N
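Worked numbers, with a hypothetical vocabulary of N = 100 verbs and varying e:

```python
import math

def theta(N):
    # Tolerance Principle threshold: max tolerable exceptions
    return N / math.log(N)

def productive(N, e):
    # the generalization is tenable iff e < theta(N)
    return e < theta(N)

print(round(theta(100), 1))  # 21.7
print(productive(100, 20))   # True:  20 < 21.7 -> acquire the rule
print(productive(100, 30))   # False: 30 > 21.7 -> no productive rule
```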
N and e vary over individual development (4 points)
N & e are properties of each individual
N is the # of class members a child has learned so far
N & e grow as the learner’s vocab grows
generalizations can be learned over a small N that are not possible over a large N
if e is below θ
acquire pattern as rule
if e is above θ
do not form rule
describe how N and θ grow
N grows over an individual’s development
θ grows more slowly
if θ grows faster than e
a pattern may fall into productivity
if e grows faster than θ
a pattern may fall out of productivity
θ grows more slowly than…
N
θ is larger relative to N when… (+2 info)
N is small
1) proportionately more exceptions are tolerable with small N
2) it’s easier to learn rules when N is small
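A quick table of θ against N makes both points concrete:

```python
import math

# theta grows, but far more slowly than N, so the tolerable
# *proportion* of exceptions shrinks as vocabulary grows
for N in (10, 50, 100, 500, 1000):
    t = N / math.log(N)
    print(f"N={N:4d}  theta={t:6.1f}  theta/N={t/N:.2f}")
# N=  10  theta=   4.3  theta/N=0.43
# ...
# N=1000  theta= 144.8  theta/N=0.14
```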