Statistiek 3 voor Pedagogen – Week 1 Lecture

Definition Statistics

  • "Statistics is the science of collecting, organizing and interpreting numerical facts, which we call data."

    • Source: Statistiek in de Praktijk, David S. Moore / George P. McCabe, 1994

  • Important matters for the application of statistics (“Applied Statistics”):

    • Selecting a sample from a population

    • Deciding whether a sample is representative

    • Descriptive or inferential statistics

    • Measurement levels (NOIR) and types of variables (categorical/quantitative)

    • Selecting the correct statistical analysis

    • Experimental versus non-experimental research design

Methods (Design) & Statistics (Toolkit)

  • Important for the application of statistics ("Applied Statistics"):

    • Selecting the correct statistical analysis

Statistiek 3 Programme

  • 6 lectures

    • Monday: theory

  • 6 interactive lectures

    • Wednesday: exam preparation

  • 5 tutorial groups (mandatory)

    • Monday or Wednesday: working on assignments

  • Week 7: Q&A session (on Wednesday)

  • Literature:

    • Warner (2020) – Applied Statistics II, International Student Edition, 3rd Edition (exam)

    • Warner (2013) – Applied Statistics: From Bivariate Through Multivariate Techniques (exam), and

    • Agresti & Finlay (2012/2018) – Statistical Methods for the Social Sciences (exam)

Lectures

  • Monday (theory)

    • On campus

    • Introduction of new chapter(s)

    • Embedding the new theory in practice and in the weekly schedule

  • Wednesday (exam preparation)

    • On campus

    • Interactive, with short quizzes

    • Clarification, examples and repetition

Tutorial groups

  • Weeks 2 through 6

    • On campus

    • Attendance mandatory

    • Practising with SPSS, asking questions

    • Content questions → tutor → discussion forum → Q&A

Assessment

  • Exam: 30 multiple-choice questions:

    • Wednesday 28 May 2025

    • 10 questions on Statistiek 1 & 2 (A&F 2009/2012/2018)

    • 20 questions on the statistical analyses from Statistiek 3 (Warner, 2013/2020)

  • Final grade = exam grade

Materials and learning objectives (1)

  • Lectures:

    • Theory, coherence and repetition: weeks 1 through 6

    • Q&A in week 7

  • Tutorial groups:

    • Monday and Wednesday: practising and the opportunity to ask the tutor questions

    • Attendance mandatory, at most one absence allowed

  • Book:

    • Theory: Agresti (2018) Ch. 9 + 12, Warner (2020-II): Ch. 5, 7, 8, 9, 11, 14 or Warner (2013): Ch. 6, 9, 12, 13, 15, 16, 17, 19, 22

    • Practice: comprehension questions at the end of every chapter

  • Review of Statistiek 1 and 2:

    • Lecture in week 1.

    • StatTalk: “knowledge clips” (4–5 min), divided per topic.

    • Grasple

  • Canvas tests/quizzes:

    • Weekly formative test; one (adapted) question from each weekly test is used in the exam.

    • Practice exams Statistiek 1 & 2

Materials and learning objectives (2)

  • Objectives of Statistiek 3:

    • Review of Statistiek 1+2 (in particular the methods and assumptions), plus new additions and the application of these methods in practice (SPSS).

    • Discover the coherence between the different methods within the framework of the Generalized Linear Model (GLM), and thereby…

    • Statistiek 3 provides a solid foundation for the Bachelor's thesis.

Statistics in practice: Pedagogische Wetenschappen

  • The importance of sound empirical research (for which statistics is essential):

    • “Regression to the mean. It is a statistical fact of life that extreme scores tend to become less extreme upon re-testing, a phenomenon known as regression toward the mean (Kruger, Savitsky, & Gilovich, 1999). Regression to the mean can fool therapists and patients alike into believing that a useless treatment is effective (Gilovich, 1991)."

Overview Lectures: by week

  • Recap(itulation) lecture

  • ANOVA / Regression

  • Factorial ANOVA

  • ANCOVA

  • Mediation / Moderation

  • MANOVA / Repeated Measures

Overview Lectures: by topic

  • Recap lecture

  • ANOVA

  • Fact. ANOVA

  • ANCOVA

  • MANOVA

  • Rep. Measures

  • Q&A lecture

  • F-test

  • Moderation

  • TSS

  • MSE

  • Tukey Contrasts

  • Bonferroni

  • Sphericity

  • DF

  • Mediation

  • Type II error

  • Type I error

  • Dummy

Lecture Week 1: Recapitulation

  • Overview of the most important concepts in statistics:

    • Descriptive vs inferential statistics

    • Data, population and sample

    • Reliability and validity

    • Variables, measurement levels and range

    • Measures of central tendency, dispersion and position

    • Population distribution, sample distribution and sampling distribution

    • Central Limit Theorem and hypothesis testing

  • Focus on empirical analyses:

    • Comparison of 2 groups on one quantitative outcome variable (t-test)

    • Comparison of 2 or more groups on one quantitative outcome variable (ANOVA)

    • Determine relation between 2 quantitative variables (regression analysis)

Definition Statistics

  • "Statistics is the science of collecting, organizing and interpreting numerical facts, which we call data."

    • Source: Statistiek in de Praktijk, David S. Moore / George P. McCabe, 1994

  • A&F: Statistics consists of a body of methods for obtaining and analyzing data, to:

    • Design [research studies that]

    • Describe [the data to]

    • Make inferences based on these data.

    • Descriptive Statistics:

      • Descriptive statistics summarize sample or population data with numbers, tables, and graphs

    • Inferential Statistics:

      • Inferential statistics make predictions about population parameters, based on a (random) sample of data.

Data, population, sample, reliability, validity

  • Doing research by means of data: observation of characteristics

    • Population: the total set of participants, relevant for the research question

      • E.g. Population parameter: average hours of self-study per week of all students.

    • Sample: a subset of the population about which the data are collected

      • E.g. Sample statistic: average hours of self-study per week in a randomly selected sample of 800 students

  • Good data is necessary to answer the research question:

    • Reliability (Precision)

    • Validity (Bias)

Variables, measurement levels and range

  • Variable: a measured characteristic that can differ between subjects

    • Types: behavior-, stimulus-, subject-, physiological variables

    • Measurement scales (NOIR):

      • Categorical/qualitative

        • Nominal: unordered categories (eye color, biological sex)

        • Ordinal: ordered categories (disagree/neutral/agree)

      • Quantitative/numerical

        • Interval: equal distance between consecutive values (°C)

        • Ratio: equal distance and true zero point (K)

    • Range:

      • Discrete: measurement unit that is indivisible (# brothers/sisters)

      • Continuous: infinitely divisible measurement unit (body height)

Measurement scales summarized

  Scale      Ordered   Interpretable differences   Absolute zero point
  Nominal    no        no                          no
  Ordinal    yes       no                          no
  Interval   yes       yes                         no
  Ratio      yes       yes                         yes

  • Absolute zero point means that the theoretically lowest possible value indicates an absence (value 0)

Descriptive statistics

  • In descriptive statistics, three aspects are of importance (see the sketch after this list):

    • Central tendency - “typical observation”

      • Central tendency measures: mean, mode, median …

    • Dispersion - “variability in observations”

      • Dispersion measures: standard deviation, variance, interquartile range

    • Position - “relative position of the observation(s)”

      • Gives information about relative positions of observations: percentile, quartile, …
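
A minimal sketch in Python (the course itself uses SPSS) of these three kinds of measures on a small made-up set of scores; the data and variable names are purely illustrative:

```python
# Sketch: central tendency, dispersion and position measures for a small
# made-up sample of scores (illustrative only).
import numpy as np
from scipy import stats

scores = np.array([4, 5, 5, 6, 6, 6, 7, 8, 9, 10])

# Central tendency
print("mean  :", scores.mean())
print("median:", np.median(scores))
values, counts = np.unique(scores, return_counts=True)
print("mode  :", values[counts.argmax()])

# Dispersion (ddof=1 gives the sample variance / standard deviation)
print("variance:", scores.var(ddof=1))
print("std dev :", scores.std(ddof=1))
print("IQR     :", np.percentile(scores, 75) - np.percentile(scores, 25))

# Position: percentile rank of a single score
print("percentile rank of 8:", stats.percentileofscore(scores, 8))
```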

Sample problems with inferential statistics

  • Goal: reliable and valid statements about the population based on a sample:

    • The sample statistic should not differ systematically from the population parameter

  • Problems:

    • Sampling error - “natural (random) sampling variation”

    • Sampling bias - “selective sampling”

    • Response bias - “incorrect answer”

    • Non-Response bias - “selective participation”

  • Important difference between problems concerning reliability (error) and validity (bias).

  • Solution:

    • “A random (or other probability) sampling approach of sufficient size that generates data for everyone approached, with correct responses on all items for all subjects.”

Dimensions of distributions

  • Population distribution

    • Proportion of students indicating the need for extra support in mathematics.

  • Sample data distribution

    • Proportion of students in the sample (here n = 1000) indicating the need for extra support in mathematics.

  • Sampling distribution

    • The probability distribution of the sample statistic (proportion/mean/regression coefficient). To be interpreted as the result of repeatedly drawing samples of size n (here n = 1000).

    • Standard deviation of the sampling distribution of the proportion: \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.38 (1-0.38)}{1000}} = 0.015 (a short check of this number follows below)

    • Standard error (σ_M), estimated by SE_M
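
As a small check of the slide's numbers (p = 0.38, n = 1000), a sketch that computes the analytic standard error of the proportion and approximates it by simulation; the random seed and number of replications are arbitrary:

```python
# Sketch: standard error of a sample proportion, analytic vs. simulated,
# using the slide's example values p = 0.38 and n = 1000.
import numpy as np

p, n = 0.38, 1000

# Analytic standard deviation of the sampling distribution of the proportion
se_analytic = np.sqrt(p * (1 - p) / n)

# Simulation: draw many samples of size n and look at the spread of p-hat
rng = np.random.default_rng(1)
p_hats = rng.binomial(n, p, size=100_000) / n
se_simulated = p_hats.std()

print(f"analytic  SE: {se_analytic:.4f}")   # ~0.0153
print(f"simulated SE: {se_simulated:.4f}")  # close to the analytic value
```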

Central Limit Theorem for sampling distribution

  • Empirical rule for normal distribution

    • 68% within ± 1 standard deviation of the mean

    • 95% within ± 2 standard deviations of the mean

    • almost 100% within ± 3 standard deviations of the mean

  • Jaccard and Becker (2002):

    • Given a population [of individual X scores] with a mean of μ and a standard deviation of σ, the sampling distribution of the mean [M] has a mean of μ and a standard deviation [generally called the “[population] standard error,” σM] of \frac{σ}{\sqrt{N}} and approaches a normal distribution as the sample size on which it is based, N, approaches infinity. (p. 189)
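
A sketch (not from the slides) that mimics this statement: means of samples drawn from a clearly skewed population still have mean ≈ μ, standard deviation ≈ σ/√N, and an approximately normal shape. The exponential population and N = 50 are assumptions for illustration:

```python
# Sketch: Central Limit Theorem for the sampling distribution of the mean.
# Population is deliberately skewed (exponential); N is the sample size.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 1.0, 1.0        # mean and SD of an exponential(1) population
N = 50                      # sample size

# Draw many samples of size N and compute their means
means = rng.exponential(scale=1.0, size=(100_000, N)).mean(axis=1)

print("mean of sample means:", means.mean())        # ~ mu
print("SD of sample means  :", means.std())         # ~ sigma / sqrt(N)
print("sigma / sqrt(N)     :", sigma / np.sqrt(N))
# A histogram of `means` would look approximately normal even though the
# population itself is strongly skewed.
```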

Types of probability distributions - I

  • (Standard) normal distribution → z-statistic

    • Sampling distribution for proportion(s) when H0 holds.

    • (Sampling distribution for mean when H0 holds and when the population standard deviation is known)

  • Student’s t distribution(s) → t-statistic

    • Sampling distribution for mean when H0 holds and when the population standard deviation is unknown.

    • Sampling distribution for regression coefficient(s) when H0 holds.

Types of probability distributions - II

  • Chi-square distribution(s) → χ2-statistic

    • Sampling distribution for squared deviations (in frequencies) of categorical variables when H0 holds.

  • Fisher’s F distribution(s) → F-statistic

    • Sampling distribution for ANOVA omnibus test of means when H0 holds.

Sampling distribution and hypothesis testing

  • Significance test or hypothesis test:

    • A method through which you determine, based on the sample, how strong the evidence against a certain hypothesis is, and subsequently decide whether or not to reject this hypothesis.

  • The 5 steps of a hypothesis test (worked through in the sketch after this list):

    • Defining assumptions

    • Set up hypothesis

    • Calculate test-statistic (e.g. t-value)

    • Determine p-value

    • Draw conclusion
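
A minimal sketch of these five steps as a one-sample t-test in Python; the data, the hypothesised mean, and α = 0.05 are made-up assumptions:

```python
# Sketch: the five steps of a hypothesis test, as a one-sample t-test.
# Data and hypothesised value are made up for illustration.
import numpy as np
from scipy import stats

# 1. Assumptions: quantitative variable, (approximately) normal population,
#    independent observations.
grades = np.array([6.1, 7.3, 5.8, 6.9, 7.5, 6.4, 7.0, 6.6, 7.8, 6.2])

# 2. Hypotheses: H0: mu = 6.0 versus Ha: mu != 6.0
mu_0, alpha = 6.0, 0.05

# 3. Test statistic
t_stat, p_value = stats.ttest_1samp(grades, popmean=mu_0)

# 4. p-value
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# 5. Conclusion
print("reject H0" if p_value < alpha else "do not reject H0")
```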

Type I & Type II errors

  • Probability of a Type I error (false positive) is determined by:

    • The chosen significance level (α).

  • Probability of a Type II error (false negative) is determined by:

    • Effect size

    • Sample size

    • Variance (dispersion) in the sample

  • The smaller the chosen Type I error probability, the larger the resulting Type II error probability, given a certain sample (see the sketch below).
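
A sketch of this trade-off using statsmodels' power calculator for an independent-samples t-test; the effect size (0.5) and group size (50) are assumed for illustration:

```python
# Sketch: lowering alpha (Type I error) increases beta (Type II error)
# for a fixed effect size and sample size. Values are illustrative.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
effect_size, n_per_group = 0.5, 50   # assumed medium effect, 50 per group

for alpha in (0.10, 0.05, 0.01):
    power = analysis.solve_power(effect_size=effect_size,
                                 nobs1=n_per_group,
                                 alpha=alpha,
                                 ratio=1.0,
                                 alternative='two-sided')
    print(f"alpha = {alpha:.2f} -> power = {power:.2f}, beta = {1 - power:.2f}")
```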

Hypothesis-Testing Examples

  • Comparison of 2 groups with one quantitative outcome variable (t-test)

  • Comparison of 2 or more groups with one quantitative outcome variable (ANOVA)

  • Determine the relation between 2 quantitative variables (regression analysis)

Comparison of 2 groups: t-test

  • Comparisons between 2 samples:

    • Dependent samples

      • Husbands and wives (e.g. time spent on household activities)

      • Repeated measures: the same person on two different points in time (e.g. extent of depression symptoms before and after therapy)

    • Independent samples:

      • Men and women in randomly selected samples

      • Democrats and Republicans

  • Null hypothesis: H0: μ1 = μ2

  • Assumptions of an independent-samples t-test (see the sketch after this list):

    • Dependent variable is quantitative and normally distributed (interval/ratio level)

    • Equal variances for both groups: σ1² = σ2²

    • Independent observations (within and between groups)
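
A sketch of this test with made-up scores for two groups; Levene's test is shown as one common way to check the equal-variance assumption:

```python
# Sketch: independent-samples t-test comparing two groups on one
# quantitative outcome. Data are made up for illustration.
import numpy as np
from scipy import stats

group_1 = np.array([5.1, 6.2, 5.8, 6.6, 7.0, 5.5, 6.1, 6.4])
group_2 = np.array([6.9, 7.4, 6.8, 7.9, 7.2, 6.5, 7.7, 7.1])

# Check of the equal-variance assumption (Levene's test)
lev_stat, lev_p = stats.levene(group_1, group_2)
print(f"Levene: W = {lev_stat:.2f}, p = {lev_p:.3f}")

# H0: mu1 = mu2, assuming equal variances (Student's t-test)
t_stat, p_value = stats.ttest_ind(group_1, group_2, equal_var=True)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```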

Comparison of 2 or more groups: ANOVA

  • ANOVA: ANalysis Of VAriance

    • One-way between subjects ANOVA

      • Each participant falls into only one group (e.g. 4 types of stress situations)

      • For each participant there is one observation (e.g. self-reported anxiety)

    • Groups are determined by the levels (categories) of the factor:

      • In this case the number of different stress situations

  • Null hypothesis: H0: μ1 = μ2 = … = μk

  • Assumptions for an ANOVA omnibus test (see the sketch after this list):

    • Dependent variable is quantitative and normally distributed (interval/ratio level)

    • Equal variances for all k groups: σ1² = σ2² = … = σk²

    • Independent observations (within and between groups)
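
A sketch of the omnibus test for three made-up groups (the "stress situations" example with invented anxiety scores):

```python
# Sketch: one-way between-subjects ANOVA, H0: mu1 = mu2 = mu3.
# Data (self-reported anxiety in three stress conditions) are made up.
import numpy as np
from scipy import stats

condition_a = np.array([3.1, 2.8, 3.5, 3.0, 2.9, 3.3])
condition_b = np.array([3.9, 4.2, 3.7, 4.5, 4.0, 3.8])
condition_c = np.array([4.8, 5.1, 4.6, 5.3, 4.9, 5.0])

f_stat, p_value = stats.f_oneway(condition_a, condition_b, condition_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```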

ANOVA test-statistic: F-ratio

  • ANOVA:

    • F = MS_bg / MS_wg (MS = mean square, bg = between groups, wg = within groups)

    • Numerator (MS_bg): information about the variability of the group means (M1, M2, …, Mk)

    • Denominator (MS_wg): information about the variability of scores within the groups

  • The F-test is an omnibus test (‘global test’): is there a difference between one or more of the means?

    • An F-test does not show which groups differ!

  • F-test significant? Two ways to test for differences between specific groups (see the sketch after this list):

    • Post hoc (after the fact, after data collection, explorative) → Tukey’s test

    • A priori (planned beforehand, confirmatory) → contrasts, regression analysis
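
A sketch of a post hoc Tukey test with statsmodels, reusing the made-up groups from the ANOVA sketch above:

```python
# Sketch: Tukey's HSD post hoc test after a significant omnibus F-test.
# Groups and scores are made up for illustration.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = np.array([3.1, 2.8, 3.5, 3.0, 2.9, 3.3,
                   3.9, 4.2, 3.7, 4.5, 4.0, 3.8,
                   4.8, 5.1, 4.6, 5.3, 4.9, 5.0])
groups = np.repeat(["A", "B", "C"], 6)

# Pairwise comparisons of all group means, controlling the family-wise error
result = pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05)
print(result)
```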

Variance analysis: ANOVA Sums of Squares

  • Group-indicator = i (i = 1, …, k)

  • Participant-indicator = j (j = 1, …, l )

  • First partition each total deviation (Y_{ij} – M_Y) into an unexplained within-group component (Y_{ij} – M_i) and an explained between-group component (M_i – M_Y):

    • (Y_{ij} – M_Y) = (Y_{ij} – M_i) + (M_i – M_Y)

  • Square the deviations and sum them across all scores in the entire dataset; the cross-product term sums to zero, so:

    • \sum (Y_{ij} – M_Y)^2 = \sum (Y_{ij} – M_i)^2 + \sum (M_i – M_Y)^2

  • SS_{total} = SS_{wg} + SS_{bg} (verified numerically in the sketch below)
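
A sketch that carries out this decomposition by hand and checks that the two components add up to SS_total; the three groups of scores are made up:

```python
# Sketch: ANOVA sums-of-squares decomposition, SS_total = SS_wg + SS_bg.
# Three made-up groups of scores.
import numpy as np

groups = [np.array([3.0, 3.4, 2.8, 3.2]),
          np.array([4.1, 4.5, 3.9, 4.3]),
          np.array([5.0, 4.8, 5.4, 5.2])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()                                       # M_Y

ss_total = ((all_scores - grand_mean) ** 2).sum()
ss_wg = sum(((g - g.mean()) ** 2).sum() for g in groups)             # within
ss_bg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # between

print(f"SS_total      = {ss_total:.3f}")
print(f"SS_wg + SS_bg = {ss_wg + ss_bg:.3f}")   # equals SS_total
```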

One-way ANOVA Table

  • k = Number of groups

  • N = Total number of observations

  • df = Degrees of Freedom
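
For reference, the standard layout of a one-way ANOVA table expressed in k and N:

  Source           SS         df      MS                 F
  Between groups   SS_bg      k − 1   SS_bg / (k − 1)    MS_bg / MS_wg
  Within groups    SS_wg      N − k   SS_wg / (N − k)
  Total            SS_total   N − 1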

Relation between variables: to bivariate statistics

  • Univariate (“one variable”) statistics:

    • Measures of central tendency

    • Measures of dispersion

    • Confidence interval mean/proportion

    • Significance test mean/proportion

    • Significance test difference between groups

  • Bivariate (“two variables”) statistics is about investigating a possible association between two different variables:

    • Predictor variable or independent variable

    • Outcome variable or dependent variable

OLS Bivariate Regression

  • The other methods used in Statistiek 3 ([M]AN[C]OVA) can be related to OLS regression (together they form the GLM)

  • Association between 2 variables

    • E.g.: exam grade (Y) and hours of self study (X)

  • Null hypothesis: H0: ρ = 0; H0: b = 0; H0: R = 0

  • Assumptions of bivariate regression (simple linear regression) (see the sketch after this list):

    • Dependent variable (Y) is quantitative and independent variable (X) is quantitative or dichotomous.

    • There is a linear relationship between Y and X.

    • Independent observations.

    • Equal variance of errors.

    • Errors are normally distributed with a mean of 0 for all values of X.
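
A sketch of a simple linear regression for the slide's example (exam grade Y predicted from hours of self-study X); the data points are made up:

```python
# Sketch: bivariate OLS regression of exam grade (Y) on hours of
# self-study (X). The data points are made up for illustration.
import numpy as np
from scipy import stats

hours = np.array([2, 4, 5, 6, 8, 9, 11, 12, 14, 15])                  # X
grade = np.array([4.8, 5.4, 5.9, 6.1, 6.8, 6.9, 7.6, 7.4, 8.2, 8.5])  # Y

res = stats.linregress(hours, grade)
print(f"intercept b0 = {res.intercept:.2f}")
print(f"slope     b  = {res.slope:.2f}")
print(f"R^2          = {res.rvalue**2:.3f}")
print(f"p-value (H0: b = 0) = {res.pvalue:.4f}")
```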

Regression analysis: components

  • Assumed functional form for the population:

    • Y_i = β_0 + β X_i + ε_i

  • Regression function: Y_i' = b_0 + b X_i

    • Y_i': predicted value of Y_i

    • X_i: observed X for person i

    • b_0: intercept

    • b: slope

    • Y_i – M_Y: total deviation from the mean

    • Y_i' – M_Y: part predicted by X_i

    • Y_i – Y_i': error/residual of the prediction

  • SS_{total} = SS_{residual} + SS_{regression}

  • R^2 = \frac{SS_{regression}}{SS_{total}}

  • SE_{est} = \sqrt{\frac{\sum (Y - Y')^2}{N-2}} = \sqrt{\frac{SS_{residual}}{N-2}} = \sqrt{(1-R^2) \cdot \frac{SS_{total}}{N-2}}
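
A sketch that computes these components by hand (predicted values, sums of squares, R² and SE_est) on the same made-up grades as the regression sketch above, so the results can be compared with the linregress output:

```python
# Sketch: regression components by hand: SS_total = SS_residual + SS_regression,
# R^2 and the standard error of the estimate. Data are made up.
import numpy as np

x = np.array([2, 4, 5, 6, 8, 9, 11, 12, 14, 15], dtype=float)
y = np.array([4.8, 5.4, 5.9, 6.1, 6.8, 6.9, 7.6, 7.4, 8.2, 8.5])
n = len(y)

# OLS estimates of slope b and intercept b0
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b * x.mean()
y_pred = b0 + b * x                        # Y'

ss_total = np.sum((y - y.mean()) ** 2)     # sum of (Y - M_Y)^2
ss_reg = np.sum((y_pred - y.mean()) ** 2)  # sum of (Y' - M_Y)^2
ss_res = np.sum((y - y_pred) ** 2)         # sum of (Y - Y')^2

r_squared = ss_reg / ss_total
se_est = np.sqrt(ss_res / (n - 2))

print(f"SS_total = {ss_total:.3f} = {ss_res:.3f} + {ss_reg:.3f}")
print(f"R^2 = {r_squared:.3f}, SE_est = {se_est:.3f}")
```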

1 example tested in 3 different ways

  • Difference between 2 groups: Independent samples t-test

  • Difference between 2 groups: OLS bivariate regression

  • Difference between 2 groups: one-factor ANOVA (between-subjects)
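
A sketch showing the equivalence for two made-up groups: the independent-samples t-test, the one-factor ANOVA, and an OLS regression on a 0/1 dummy for group membership give the same p-value, with F = t²:

```python
# Sketch: one two-group comparison analysed in three equivalent ways.
# Scores are made up; group membership is dummy coded 0/1 for the regression.
import numpy as np
from scipy import stats

group_0 = np.array([5.1, 6.2, 5.8, 6.6, 7.0, 5.5, 6.1, 6.4])
group_1 = np.array([6.9, 7.4, 6.8, 7.9, 7.2, 6.5, 7.7, 7.1])

# 1. Independent-samples t-test
t_stat, p_t = stats.ttest_ind(group_0, group_1, equal_var=True)

# 2. One-factor between-subjects ANOVA
f_stat, p_f = stats.f_oneway(group_0, group_1)

# 3. OLS regression with a 0/1 dummy for group membership
y = np.concatenate([group_0, group_1])
dummy = np.concatenate([np.zeros(len(group_0)), np.ones(len(group_1))])
reg = stats.linregress(dummy, y)

print(f"t-test    : t = {t_stat:.3f}, p = {p_t:.4f}")
print(f"ANOVA     : F = {f_stat:.3f}, p = {p_f:.4f}  (t^2 = {t_stat**2:.3f})")
print(f"regression: b = {reg.slope:.3f}, p = {reg.pvalue:.4f}")
```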

Further in week 1:

  • Self-study:

    • Agresti (2018) Ch. 9 or Warner (2013) Ch. 9

    • Canvas quiz week 1