Syntax: Formalism, Theory, and Phrase Structure Rules
Introduction to Linguistic Formalism: Syntax Trees
Purpose of Syntax Trees:
To explain issues previously raised with syntax.
To clarify the facts of constituency: how languages group words into phrases, and then phrases into sentences.
To account for the infinite use of finite means: the unbounded capacity inherent in human language knowledge.
Understanding through Mechanical Application:
Learning syntax rules is analogous to executing algorithms in mathematics (e.g., long division, multiplication).
Accurate execution of these algorithms facilitates a deeper understanding of what the rules accomplish and why specific linguistic structures arise.
Mastering basic concepts (like previous work on IPA symbols) is a crucial prerequisite for understanding higher-level material.
Constituency and Hierarchical Structure
Constituency Tests: Linguists have developed various constituency tests to discover which words group together into constituents. These tests are vital for comprehension and exam preparation.
Hierarchical Structure:
Languages do not merely operate as a linear sequence of words. Instead, they possess a hierarchical structure, where words are grouped into "chunks" (constituents).
Syntax trees provide a formal method for representing these hierarchical chunks.
Trees as Formal Hypotheses:
Constituents are considered "brute facts" about language, inherent in how languages work.
Our task is to explain why certain constituents arise and, importantly, why others do not.
Syntax trees serve as a theory about constituents, designed to generate constituents in the same manner that the human mind represents them. The ultimate goal is to create a theory that accurately mirrors the structure known in the mind of a competent speaker.
Rules of Syntax and Linguistic Diversity
Functions of Syntactic Rules:
Organize words into phrases and phrases into sentences.
Determine the correct word order for a given language (e.g., English as Subject-Verb-Object (SVO) vs. Japanese as Subject-Object-Verb (SOV)).
Determine the grammatical relations within a sentence.
Tension: Universal Mind vs. Language Diversity:
There is a fundamental tension between the idea of a dedicated language faculty in the human mind and the vast diversity observed across the world's languages (all six logically possible orderings of subject, verb, and object are attested, with varying frequencies).
A rule system must be flexible enough to accommodate the diversity of natural languages while remaining inherently constrained.
The strategy for this unit is to focus initially on English; the overall format of the rule system, however, is intended to generalize to all languages, with language-specific modifications as needed.
Importance of Grammatical Relations (Subject, Object):
Concepts like subject, direct object, and indirect object are crucial. For example, the notion of a subject is essential for forming yes/no questions in English through operations like subject-auxiliary inversion.
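As a rough illustration of why the notion of subject matters, the sketch below forms a yes/no question by placing the auxiliary in front of the subject. The function name `invert_subject_aux` and the pre-segmented input (subject, auxiliary, remainder) are assumptions made for this example; a fuller account would locate the subject from the tree rather than being handed it.

```python
def invert_subject_aux(subject, aux, rest):
    """Form a yes/no question by placing the auxiliary before the subject.

    The sentence is assumed to be already split into its subject NP,
    its auxiliary, and the remainder of the verb phrase.
    """
    words = [aux.capitalize()] + subject + rest
    return " ".join(words) + "?"


# Declarative: "the student will watch the teacher"
question = invert_subject_aux(["the", "student"], "will", ["watch", "the", "teacher"])
print(question)  # Will the student watch the teacher?
```

The auxiliary moves past the whole subject, however many words it contains, which is why the operation has to refer to the subject as a unit rather than to "the first word".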
Semantic Roles and Configurational Languages:
The structural configuration of a sentence (as represented by the syntactic tree) has significant consequences for interpreting the semantic roles of elements (i.e., who is doing what to whom).
Example: "The student watched the teacher" vs. "The teacher watched the student" convey completely different meanings, even with the same words, because their underlying syntactic configuration differs.
English is a configurational language, meaning the tree structure heavily influences the assignment of semantic roles.
The consistency people exhibit in interpreting these roles (e.g., in active vs. passive constructions like "The teacher was watched by the student") underscores the deeply rooted nature of these structural insights.
Beyond Surface-Level Analysis:
Sentences possess an internal organization that extends far beyond a simple linear sequence of parts of speech.
While syntactic categories (parts of speech) are important, they are insufficient on their own to explain grammatical judgments.
Example: Compare the following sentences.
1a. "Jack and Jill ran up the hill" (good)
1b. "Jack and Jill ran the hill up" (bad)
2a. "Jack and Jill ran up the bill" (good)
2b. "Jack and Jill ran the bill up" (good)
In (1), "up the hill" is a prepositional phrase, so "the hill" is not a direct object and "up" cannot be placed after it. In (2), "the bill" is the direct object of "ran up", and "up" may appear on either side of it. The differing grammaticality of sequences that appear similar on the surface requires an explanation rooted in underlying constituent structure, not just word order or part-of-speech sequence.
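The contrast between "ran up the hill" and "ran up the bill" can be made concrete by writing the two verb phrases as nested structures and listing their constituents. This is only a sketch: the tuple encoding, the helper names, and the label `Prt` (particle) for "up" in the second verb phrase are assumptions introduced here, not part of the notes.

```python
# Each constituent is (label, children...); a lexical node is (category, word).
# The label "Prt" (particle) is an assumption introduced for this sketch.
vp_hill = ("VP",
           ("V", "ran"),
           ("PP", ("P", "up"), ("NP", ("Det", "the"), ("N", "hill"))))

vp_bill = ("VP",
           ("V", "ran"),
           ("Prt", "up"),
           ("NP", ("Det", "the"), ("N", "bill")))


def words(node):
    """Return the words of a constituent in left-to-right order."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    return [w for child in children for w in words(child)]


def constituents(node):
    """Return the word string of every phrasal constituent in the tree."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return []  # a single word: not listed as a phrase here
    found = [" ".join(words(node))]
    for child in children:
        found.extend(constituents(child))
    return found


print(constituents(vp_hill))  # ['ran up the hill', 'up the hill', 'the hill']
print(constituents(vp_bill))  # ['ran up the bill', 'the bill']
```

Both verb phrases have the same surface sequence of categories, yet "up the hill" is a constituent while "up the bill" is not, which is the kind of difference a part-of-speech sequence alone cannot express.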
Syntactic Categories and Phrase Structure
Lexical Categories (Word Levels):
Noun (N): puppy, happiness
Verb (V): to find, to run, to throw
Adjective (A): red, big, fair
Adverb (Adv): carefully, never (note: not all adverbs end in -ly)
Preposition (P): up, across
Phrasal Categories (Constituents): These are the larger "chunks" of language that combine to form sentences.
Noun Phrase (NP)
Verb Phrase (VP)
Adjective Phrase (AP)
Adverb Phrase (AdvP)
Prepositional Phrase (PP)
Complementizer Phrase (CP)
Sentence (S)
Head of the Phrase:
The most semantically and syntactically important word within a phrase.
The lexical category of the head determines the label of the phrasal constituent (e.g., puppy is the head of the NP "a puppy"; happy is the head of the AP "very happy").
A single word can constitute an entire phrase (e.g., Puppies is an NP; Happy is an AP; Brightly is an AdvP).
Function Words:
Determiners (Det): Combine with nouns (e.g., articles like the, a; possessive pronouns like my; quantifiers like every, some; number terms like two; demonstratives like this, that).
Auxiliaries (Aux): "Helping verbs" (e.g., have, be, do). A reliable test for auxiliaries is their ability to move to the front of a sentence to form a yes/no question.
Modals: A sub-category of auxiliaries (e.g., can, may, might, should, will, shall).
Complementizers (C):
Function: To introduce an embedded clause (a clause, or sentence, that is contained within another clause).
This function is central to explaining the unbounded capacity of language; complementizers signal that a following clause should be interpreted as nested within another (e.g., that in "I know that you are here").
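The link between embedding and unbounded capacity can be sketched with a short recursive function. The function name `embed` and the fixed frame "I know that" are assumptions chosen for illustration; the point is only that one nesting step, applied repeatedly, yields ever longer sentences.

```python
def embed(clause, depth, frame="I know that"):
    """Nest `clause` inside `depth` copies of a complementizer frame."""
    if depth == 0:
        return clause
    return f"{frame} {embed(clause, depth - 1, frame)}"


for depth in range(3):
    print(embed("you are here", depth))
# you are here
# I know that you are here
# I know that I know that you are here
```

Nothing in the function limits `depth`, just as nothing in the grammar limits how many clauses can be nested.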
Tree Terminology: Nodes and Relationships
Node: Any point in a syntax tree that has a label (e.g., S, VP, V, Det).
Root Node: The topmost node from which the entire tree structure emanates (typically labeled S for Sentence).
Leaves: The bottommost nodes in the tree, typically representing individual words or lexical categories.
Familial Relationships (for navigating trees):
Mother Node: A node directly above and connected to another node.
Daughter Node: A node directly below and connected to its mother node.
Sister Nodes: Two or more daughter nodes that share the same mother node.
These local relationships (mother, daughter, sister) provide an intuitive framework for understanding and reading tree structures in syntactic exercises.
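The terminology above maps naturally onto a small data structure. The following is a minimal sketch, assuming a hypothetical `Node` class and a tiny made-up tree (S over NP and VP); it is meant only to show how mother, daughter, sister, root, and leaf relate to one another.

```python
class Node:
    """A labelled node in a syntax tree."""

    def __init__(self, label, daughters=None):
        self.label = label
        self.mother = None
        self.daughters = list(daughters or [])
        for daughter in self.daughters:
            daughter.mother = self  # each daughter records its mother

    def sisters(self):
        """Other nodes that share this node's mother."""
        if self.mother is None:
            return []
        return [d for d in self.mother.daughters if d is not self]

    def is_root(self):
        return self.mother is None

    def is_leaf(self):
        return not self.daughters


# A tiny illustrative tree: S over NP and VP, NP over Det and N, VP over V.
det, noun = Node("Det"), Node("N")
np = Node("NP", [det, noun])
vp = Node("VP", [Node("V")])
s = Node("S", [np, vp])

print(np.mother.label)                  # S
print([d.label for d in np.daughters])  # ['Det', 'N']
print([n.label for n in np.sisters()])  # ['VP']
print(s.is_root(), det.is_leaf())       # True True
```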
Phrase Structure Rules (PS-Rules)
Definition: Formal rules that explicitly specify the internal constituency of a given syntactic category (e.g., what an NP or a VP consists of).
Purpose:
To systematically generate syntax trees, rather than merely drawing them from intuition or memory.
A finite, restricted set of PS-rules can account for and generate an infinite number of grammatical sentences in a language.
Rule Format and Interpretation:
A general rule format is $XP \rightarrow YP\ (Z)\ WP^{+}$. (Note: the superscript $^{+}$ conventionally means "one or more", but here it is used more loosely to mean "any number of".)
Left Hand Side (LHS): XP. The name of the constituent being built (e.g., S, NP).
Arrow ($\rightarrow$): Pronounced "consists of".
Right Hand Side (RHS): Specifies the elements that form the constituent, read from left to right, indicating their order.
X, Y, Z, W are variables representing any syntactic category (e.g., N, V, A, P).
Parentheses (): Indicate an optional constituent. If an element is enclosed in parentheses (e.g., (Det)), it means that constituent may or may not be present.
No Parentheses: Indicate an obligatory (or compulsory) constituent. It must be present in the phrase.
Plus Sign ($^{+}$): Indicates "any number of" or "one or more" of that constituent (e.g., $AP^{+}$ means one or many adjective phrases).
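One way to see how the notation is read is to split a rule into its parts mechanically. The sketch below assumes a plain-text spelling of the rule, with `->` for the arrow and a trailing `+` for the plus sign; the names `Element` and `parse_rule` are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Element:
    category: str      # e.g. "Det", "AP", "N"
    optional: bool     # written in parentheses
    repeatable: bool   # marked with "+"


def parse_rule(text):
    """Split a rule such as 'NP -> (Det) (AP+) N (PP+)' into LHS and RHS."""
    lhs, rhs = (part.strip() for part in text.split("->"))
    elements = []
    for token in rhs.split():
        optional = token.startswith("(") and token.endswith(")")
        if optional:
            token = token[1:-1]   # drop the parentheses
        repeatable = token.endswith("+")
        if repeatable:
            token = token[:-1]    # drop the plus sign
        elements.append(Element(token, optional, repeatable))
    return lhs, elements


lhs, rhs = parse_rule("XP -> YP (Z) WP+")
print(lhs)  # XP
for element in rhs:
    print(element)
# Element(category='YP', optional=False, repeatable=False)
# Element(category='Z', optional=True, repeatable=False)
# Element(category='WP', optional=False, repeatable=True)
```

The order of the list mirrors the left-to-right order of the right-hand side, so the parsed rule keeps the word-order information the rule encodes.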
Examples of PS-Rule Application for Noun Phrases (NP):
Basic NP: $NP \rightarrow N$ (Generates John, Mary).
NP with Optional Determiner: $NP \rightarrow (Det)\ N$ (Generates slippers or the slippers).
Comprehensive NP Rule: $NP \rightarrow (Det)\ (AP^{+})\ N\ (PP^{+})$
This rule can generate a vast range of noun phrases, including:
slippers
the slippers
pink slippers
the pink slippers
the big pink comfortable slippers (due to $AP^{+}$)
the book of poems (due to PP)
the book of poems with a red cover (due to $PP^{+}$)
the book of poems with a red cover from New York
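The comprehensive NP rule can be tested against sequences of category labels by translating it into a regular expression. This is a sketch under stated assumptions: the rule is spelled in plain text with a trailing `+`, each AP and PP is treated as a single already-labelled unit (its internal structure is not checked), and the names `rule_to_regex` and `is_np` are hypothetical.

```python
import re


def rule_to_regex(rhs):
    """Translate a rule's right-hand side into a regex over category names.

    '(X)' becomes optional; 'X+' becomes one or more occurrences.
    """
    parts = []
    for token in rhs.split():
        optional = token.startswith("(")
        name = token.strip("()")
        repeat = name.endswith("+")
        name = name.rstrip("+")
        piece = f"(?:{re.escape(name)} )"  # each label is followed by a space
        if repeat:
            piece += "+"
        if optional:
            piece = f"(?:{piece})?"
        parts.append(piece)
    return re.compile("".join(parts))


NP_RULE = rule_to_regex("(Det) (AP+) N (PP+)")


def is_np(categories):
    """True if the sequence of category labels is licensed by the NP rule."""
    return NP_RULE.fullmatch("".join(c + " " for c in categories)) is not None


print(is_np(["N"]))                           # True  (slippers)
print(is_np(["Det", "AP", "AP", "AP", "N"]))  # True  (the big pink comfortable slippers)
print(is_np(["Det", "N", "PP", "PP"]))        # True  (the book of poems with a red cover)
print(is_np(["Det", "AP"]))                   # False (no head noun)
print(is_np(["N", "Det"]))                    # False (wrong order)
```

The last two cases show the rule doing its other job: besides generating the attested phrases, it excludes sequences that lack the obligatory noun or put the elements in the wrong order.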
Rule-Tree Correspondence:
A drawn syntax tree must directly correspond to one or more phrase structure rules. If a tree cannot be generated by the provided rules, it is considered incorrect.
Rules are conceptually rotated by 90 degrees when drawing a tree: the LHS becomes the mother node, branching downwards to its RHS constituents as daughters.
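The requirement that every tree correspond to the rules can be checked mechanically, one mother-daughter group at a time. The sketch below is illustrative only: the mini-grammar in `RULES` is hypothetical (of its entries, only the NP rule appears in these notes; the S, VP, PP, and AP rules are assumptions added so the example runs), and the tuple encoding of trees and the function names are likewise invented here.

```python
import re

# Hypothetical mini-grammar; only the NP rule is taken from the notes.
RULES = {
    "S": "NP VP",
    "NP": "(Det) (AP+) N (PP+)",
    "VP": "V (NP)",
    "PP": "P NP",
    "AP": "A",
}


def licensed(mother, daughters):
    """True if the rule for `mother` generates exactly this daughter sequence."""
    rhs = RULES.get(mother)
    if rhs is None:
        return False
    pattern = ""
    for token in rhs.split():
        core = token.strip("()")
        piece = f"(?:{core.rstrip('+')} )" + ("+" if core.endswith("+") else "")
        pattern += f"(?:{piece})?" if token.startswith("(") else piece
    return re.fullmatch(pattern, "".join(d + " " for d in daughters)) is not None


def check_tree(node):
    """Return the local trees (mother, daughters) that no rule generates."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return []  # lexical node dominating a word
    daughters = [child[0] for child in children]
    errors = [] if licensed(label, daughters) else [(label, daughters)]
    for child in children:
        errors.extend(check_tree(child))
    return errors


good = ("S",
        ("NP", ("Det", "the"), ("N", "student")),
        ("VP", ("V", "watched"), ("NP", ("Det", "the"), ("N", "teacher"))))
bad = ("S",
       ("NP", ("N", "student"), ("Det", "the")),
       ("VP", ("V", "watched")))

print(check_tree(good))  # []
print(check_tree(bad))   # [('NP', ['N', 'Det'])]
```

An empty result means every mother node and its daughters match some rule; any entry in the list points to the exact spot where the drawn tree departs from the grammar.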
Unbounded Capacity Explained by Rules: The inclusion of optional elements (marked with parentheses) and iterative elements (marked with $^{+}$) in phrase structure rules allows a finite set of rules to generate an infinite number of grammatical phrases and sentences. This property is essential for a theory that aims to explain the unbounded nature of human language knowledge and its capacity.
Assessment Note: Students are not required to memorize the specific set of rules provided but must be able to understand and apply them correctly.