Statistics

137 Terms

1
New cards

Alternative hypothesis H1

States that there is an effect or relationship, i.e. that the observed effect is not due to chance alone. It is supported when the result is statistically significant.

2
New cards

Randomisation

Assigning participants to different experimental groups by chance to prevent bias.

3
New cards

True value

Actual correct value of the quantity being measured

4
New cards

Measured value

Results you get from taking a measurement with an instrument

5
New cards

Generalisation

Refers to the extent that the sample data can be accurately applied to the entire population.

6
New cards

Study design

A plan that outlines the methodology for conducting research, detailing how the data are collected, measured and analysed.

7
New cards

Probability distribution [new]

A statistical function [rule] that describes all the possible outcomes of a random variable and measures the likelihood of observing each outcome.

8
New cards

Central Limit theorem 

States that for large samples (n > 30), the sampling distribution of the mean will be approximately normally distributed, even if the population itself is not normally distributed, and the mean of this distribution equals the population mean.
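A minimal simulation sketch of this idea, using only Python's standard library. The exponential population, the sample size of 40 and the 2000 repetitions are illustrative assumptions, not part of the definition.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 40      # a "large" sample (n > 30)
NUM_SAMPLES = 2000    # how many samples are drawn

# Exponential population with rate 1: strongly right-skewed, population mean = 1.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)  # should sit close to 1
sd_of_means = statistics.stdev(sample_means)    # should sit close to 1 / sqrt(40)

print(round(mean_of_means, 2), round(sd_of_means, 2))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population is skewed.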

9
New cards

Box Plot

Visually displays the distribution of a dataset by showing its five-number summary: the minimum, lower quartile (Q1), median (Q2), upper quartile (Q3), and maximum.

It uses a box to represent the middle 50% of the data (the interquartile range) and "whiskers" to show the rest of the data's range
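The numbers behind a box plot can be sketched as follows. The scores are made up, and the "inclusive" quartile method is only one of several conventions, so other software may report slightly different quartiles.

```python
import statistics

scores = [2, 4, 4, 5, 7, 9, 10, 12, 15]  # made-up data

# Quartiles cut the ordered data into four equal parts.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

five_number_summary = (min(scores), q1, median, q3, max(scores))
iqr = q3 - q1  # width of the box: the middle 50% of the data

print(five_number_summary)
print(iqr)
```

The box runs from Q1 to Q3 and the whiskers extend towards the minimum and maximum.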

10
New cards

Z scores/ Standard score

A statistical measure that indicates how many standard deviations a data point is from the mean of a dataset
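As a small worked sketch, a z score is the raw score minus the mean, divided by the standard deviation. The mean of 70 and standard deviation of 10 below are made-up values.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / sd

z_above = z_score(85, mean=70, sd=10)  # score above the mean
z_below = z_score(55, mean=70, sd=10)  # score below the mean

print(z_above, z_below)
```

A positive z score means the data point is above the mean and a negative one means it is below.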

11
New cards

Probability Distribution

A probability distribution is a smooth curve that represents the likelihood of different outcomes. It is an idealised version of real-world data, and the area under the curve shows the probability of values occurring.

If a histogram follows the probability distribution, the pattern of your sample data is similar to the expected population pattern.

It serves as an idealised model of the population and lets you see how well your sample matches the expected pattern (the probability distribution curve).

12
New cards

Measure of dispersion/spread

Describes how spread out a set of data is from its central value, indicating the data's variability. This includes the range, interquartile range and standard deviation

13
New cards

Interquartile Range (IQR)

A measure of statistical dispersion, showing the spread of the middle 50% of your data/scores

14
New cards

Bootstrap

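A statistical method that involves resampling a dataset with replacement to create many simulated samples, which are then used to estimate the sampling distribution of a statistic (for example its standard error or confidence interval). It is an alternative to transforming the data, not a transformation itself.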
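A minimal sketch of the resampling idea, assuming made-up scores and a simple percentile interval for the mean. The 5000 resamples and the helper name `bootstrap_means` are illustrative choices.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

sample = [12, 15, 9, 20, 14, 11, 18, 13, 16, 10]  # made-up scores

def bootstrap_means(data, num_resamples=5000):
    """Mean of each resample drawn with replacement, same size as the data."""
    n = len(data)
    return [statistics.fmean(random.choices(data, k=n)) for _ in range(num_resamples)]

boot_means = sorted(bootstrap_means(sample))

# Percentile interval: cut off the lowest and highest 2.5% of resampled means.
lower = boot_means[int(0.025 * len(boot_means))]
upper = boot_means[int(0.975 * len(boot_means))]

print(statistics.fmean(sample), lower, upper)
```

The spread of `boot_means` stands in for the sampling distribution that could not be observed directly.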

15
New cards

Transforming data

The process of applying the same mathematical function (e.g. a log or square root) to every score to make the data more suitable for statistical analysis by helping it meet the assumptions of tests.

16
New cards

Skewness 

Refers to the asymmetry of a data distribution, with main types:

positive skew (right-skewed) - tail on the right

negative skew (left-skewed) - tail on the left 

17
New cards

Data Trimming

The process of removing a specific percentage of the most extreme values from the top and bottom of a dataset to reduce the influence of outliers.

However, outliers can be meaningful, and removing them could force the data into a false distribution.
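A short sketch of a trimmed mean, assuming made-up scores and a 10% trim from each end. The helper `trimmed_mean` is a hypothetical name for illustration.

```python
import statistics

def trimmed_mean(data, proportion):
    """Mean after dropping `proportion` of the scores from each end."""
    ordered = sorted(data)
    k = int(len(ordered) * proportion)  # number of scores cut from each tail
    trimmed = ordered[k:len(ordered) - k]
    return statistics.fmean(trimmed)

scores = [1, 12, 13, 14, 15, 16, 17, 18, 19, 95]  # two extreme values
raw_mean = statistics.fmean(scores)
mean_10_trimmed = trimmed_mean(scores, 0.10)

print(raw_mean, mean_10_trimmed)
```

The trimmed mean is much less affected by the two extreme scores than the raw mean.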

18
New cards

Frequency Distribution

A table or graph that shows how often values or events occur within a dataset. It is often visualised using a histogram.

19
New cards

Kurtosis

Describes how much a distribution deviates from normal in the heaviness of its tails and the pointedness of its peak.

Informally, it refers to how tall and pointy or short and flat the curve is on the graph.

20
New cards

Bimodal 

A distribution that has two distinct peaks or modes, indicating two high-frequency values or clusters within the data

21
New cards

Median 

The middle value of an ordered data set. It is mostly used with ordinal data, skewed data or non-parametric tests.

22
New cards

Measure of Central Tendency 

Numerical values that describe the centre of a dataset, with the three most common being the mean, median, and mode. They are single-value statistical models of the data.

23
New cards

Reliability

Refers to the consistency and repeatability of a measurement or assessment under the same conditions

24
New cards

Test-retest reliability

The ability to measure and produce consistent results when the same entities are tested at two different points in time

25
New cards

Content Validity 

Is a measure of how well a test or instrument covers all relevant aspects of the concept it is designed to measure.

The instrument/methodology actually measures the effect the researcher is interested in.

Example: does Simon Says really test short-term memory well?

26
New cards

Validity

The degree to which a measurement/tool or test accurately measures what it is intended to measure.

Accuracy of a method.

27
New cards

Measurement Error 

The difference between the true value and the measured value.

This can occur due to human error or limitations of the measurement tool.

28
New cards

Correlation Analysis 

A statistical method used to evaluate the relationship between two variables, determining the strength and direction of their linear association.

29
New cards

Correlation Coefficient

A statistical measure that quantifies the strength and direction of a relationship between two variables, ranging from -1 to +1

30
New cards

Predicted Value

An estimate of the dependent variable based on the values of independent variables in a regression model

31
New cards

Expected value 

Represents the average value you would expect if an experiment were repeated many times and is a weighted average of all possible values. 
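As a worked sketch, the expected value is each possible value multiplied by its probability, summed over all values. The fair six-sided die is an illustrative assumption.

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6.
outcomes = {face: Fraction(1, 6) for face in range(1, 7)}

# Weighted average: sum of value * probability.
expected_value = sum(value * prob for value, prob in outcomes.items())

print(expected_value)  # 7/2, i.e. 3.5
```

No single roll gives 3.5; it is the long-run average over many rolls.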

32
New cards

Research Question

A clear, focused, and concise question that serves as the foundation for a research project, guiding the study's direction and defining its scope.

33
New cards

Level of data

Refers to the four scales of measurement—nominal, ordinal, interval, and ratio—that determine how precisely data is recorded

34
New cards

Independent Variable

What the researcher manipulates and controls to see if it has an effect on the dependent variable (DV)

35
New cards

Cons of Within Subjects (Repeated measures)

Longer experiments

36
New cards

Pros of Within Subjects (Repeated measures)

  • Participant characteristics are not a problem (the same participants take part in all conditions)

  • Requires fewer participants 

37
New cards

Pros of Between subjects (Independent Sample)

  • Performance is not influenced by boredom, fatigue or practice effects

  • Shorter experiments

38
New cards

Cons of Between subjects (Independent sample)

  • Hard to match participants

  • More participants required

39
New cards

Qualitative

Non-numerical data, focusing on meaning, experience, and in-depth understanding rather than measurement

40
New cards

Post hoc test 

A statistical analysis performed after a primary test, such as an ANOVA, to determine which specific group means are statistically different from one another.

It tells us where the difference lies.

41
New cards

Participant variables

Differences in participants' backgrounds that could affect the outcome, such as age, intelligence, or prior experience.


42
New cards

Demand characteristics

Cues from the environment or experiment that might lead participants to act in a way they believe is expected. 

43
New cards

Experimenter effects

Unintentional actions or behaviors by the researcher that could influence participant responses.

44
New cards

Left Skew

Negative skew. Describes a data distribution with a long tail extending to the left, indicating that most values are clustered at the right side.

A few low scores (below the mean) are skewing the data.

45
New cards

Pearson correlation

measures the strength and direction of a linear relationship between two continuous variables, resulting in a coefficient that ranges from -1 to +1. A value

It is a parametric statistical test
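A minimal sketch of how the coefficient is computed from deviations around each mean. The data and the variable names `hours_studied` and `exam_score` are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]  # deviations from the mean of x
    dy = [y - mean_y for y in ys]  # deviations from the mean of y
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

hours_studied = [1, 2, 3, 4, 5]
exam_score = [2, 4, 5, 4, 5]
r = pearson_r(hours_studied, exam_score)

print(round(r, 2))
```

Here r is positive, so higher values of one variable tend to go with higher values of the other.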

46
New cards

Right Skew 

Describes a distribution where most data points are clustered at the low end, with a long tail extending to the right due to a few high-value outliers.

A few high scores (greater than the mean) are skewing the data.

  • Since the frequency of scores is highest at the low end and then drops off towards the tail, the distribution is asymmetrical (skewed).

47
New cards

Bias

A systematic error that causes a result to be inaccurate or skewed, meaning it does not accurately represent the population it is meant to study.

48
New cards

Situational variables

Environmental factors that are not part of the experiment design, like the temperature, time of day, or noise levels.

49
New cards

Order effects

An order effect is a change in the results of a psychology experiment that occurs because of the specific sequence in which participants are exposed to different conditions or treatments

50
New cards

Representative sample

A smaller group of a larger population that accurately reflects the key characteristics of that group, such as demographics and behaviors.

51
New cards

Representative sample

A smaller group selected from a larger population that accurately reflects the characteristics of the whole population

52
New cards

Practice Effect

An improvement in performance on a task that results from repeated exposure or practice, rather than from a specific intervention

53
New cards

Extraneous Variable 

Is anything other than the independent variable that could potentially affect the results of an experiment. If not controlled, these variables can lead to inaccurate conclusions about the relationship between the independent and dependent variables.

54
New cards

Qualitative

Non-numerical information that describes qualities, characteristics or attributes.

55
New cards

Confounding variables

An extraneous (external) factor that influences both the independent and dependent variables in a study, leading to a misleading association between them.

We want to control this third variable.

56
New cards

Null hypothesis significance testing

Null hypothesis significance testing (NHST) is a statistical method used to determine if a relationship or effect observed in a sample is likely to be real in the population or simply due to chance

57
New cards

Research study

Systematic and detailed investigation into a specific problem using scientific methods

58
New cards

Causal statement

A statement that describes the cause and effect relationship between variables or events

59
New cards

Inferential statistics

Focuses on using data from a representative sample to make generalisations about the population and to test hypotheses about relationships and effects.

60
New cards

Descriptive statistics

Used to summarise and describe the characteristics of a data set. This includes measures of central tendency and measures of dispersion.

Only tells us about the sample not the population.

61
New cards

Ecological Validity

The extent to which the findings of a research study can be generalized to real-life settings.

It reflects whether the results of the study or experiment can be applied to, and allow inferences about, real-world conditions.

62
New cards

Sphericity

It is the assumption that the variances of the differences between all possible pairs of within-subject conditions are equal

63
New cards

Predictor variables

Another name for the independent variable (IV). It is what the experimenter manipulates or controls.

64
New cards

Goodness of fit

A statistical assessment that determines how well a sample of data fits a specific distribution or model.

It assesses the discrepancy between the observed values and the values expected under the model, showing how well the model represents the data.

65
New cards

Sample

A subset of a larger population that is studied to make inferences about the entire group.

66
New cards

Standard Deviation (SD)

Measures how spread out the data points are from the average (mean) of the data set. It indicates how much individual data points typically deviate from the mean value.

  • Low Standard Deviation: Data points are clustered tightly around the mean, showing consistency.

  • High Standard Deviation: Data points are more spread out, showing greater variability.
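The two cases above can be sketched with two made-up data sets that share the same mean of 50. `statistics.stdev` computes the sample standard deviation.

```python
import statistics

consistent = [49, 50, 50, 51, 50]  # scores clustered tightly around 50
variable = [10, 30, 50, 70, 90]    # scores spread widely around 50

sd_consistent = statistics.stdev(consistent)
sd_variable = statistics.stdev(variable)

print(round(sd_consistent, 2), round(sd_variable, 2))
```

Both sets have the same mean, so only the standard deviation reveals how differently they are spread.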

67
New cards

R-squared (R²)

The proportion of variance in the dependent variable that is explained by the independent variable(s) in the model. Used for simple linear regression (one IV and one DV).

68
New cards

Omega squared (ω²)

An effect size measure used primarily in analysis of variance (ANOVA) that quantifies the magnitude of the relationship between an independent variable and a dependent variable.

69
New cards

Adjusted R-squared (Adj. R²)

The proportion of variance in the dependent variable that is explained by the independent variable(s) in the model, adjusted for the sample size and the number of predictors. Used for multiple regression (multiple IVs).
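A small sketch of the usual adjustment, Adj. R² = 1 - (1 - R²)(n - 1) / (n - k - 1), where n is the sample size and k the number of predictors. The input values below are made up.

```python
def adjusted_r_squared(r_squared, n, num_predictors):
    """Adjust R-squared for sample size n and the number of predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - num_predictors - 1)

# Made-up example: R-squared of .50 from 21 participants and 2 predictors.
adj = adjusted_r_squared(0.50, n=21, num_predictors=2)

print(round(adj, 3))
```

The adjusted value is lower than the raw R-squared, and the penalty grows as predictors are added or the sample shrinks.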

70
New cards

Effect size

A statistical measure that quantifies the strength of the relationship between two variables, or the magnitude of the difference between groups.

71
New cards

Magnitude

Refers to the strength (size) of the relationship or effect between variables, regardless of its direction. It is a general descriptive term, linked to standardized beta (β) because β expresses the strength of the effect in standard deviation units.

  • Small: r or β ≈ 0.1

  • Medium: r or β ≈ 0.3

  • Large: r or β ≈ 0.5 or higher.

72
New cards

Degrees of freedom (df)

The number of independent values in a calculation that are free to vary after constraints are applied.

73
New cards

Hypothesis

A testable statement predicting whether there is a relationship or effect between variables and whether it is likely due to chance.

74
New cards

Test statistic

A numerical value calculated from data which is used to determine whether a result is statistically significant and to help decide whether to reject a null hypothesis.

75
New cards

P-value

The probability of obtaining the observed result, or a more extreme result, assuming the null hypothesis is true. If P > 0.05, we typically fail to reject the null hypothesis.
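One way to make this concrete is an exact binomial test on coin flips, sketched below. The scenario (9 heads in 10 flips, null hypothesis of a fair coin) and the helper name `binomial_two_tailed_p` are illustrative assumptions.

```python
from math import comb

def binomial_two_tailed_p(successes, trials):
    """Exact two-tailed p-value for H0: probability of success = 0.5."""
    # Distance of the observed count from the count expected under H0.
    observed_distance = abs(successes - trials / 2)
    # Count every outcome at least that far from the expected count.
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed_distance
    )
    return extreme / 2 ** trials

p_value = binomial_two_tailed_p(successes=9, trials=10)
reject_null = p_value < 0.05

print(round(p_value, 4), reject_null)
```

Results of 0, 1, 9 or 10 heads are the "as extreme or more extreme" outcomes here, and together they are unlikely enough under a fair coin that the null hypothesis is rejected at the .05 level.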

76
New cards

Bar charts

A graph that uses rectangular bars to show the frequency, counts or proportion of different categorical data.

Type of data - categorical [nominal/ordinal].

The bars do not touch each other, and each bar represents its own category. It is a descriptive tool for visually comparing categories.

77
New cards

One-tailed hypothesis

Predicts a specific direction of the effect of the IV on the DV.

78
New cards

Two-tailed hypothesis

Predicts an effect exists but does not specify the direction.

79
New cards

Type I error

Occurs when you incorrectly reject the null hypothesis, concluding there is an effect when in reality there is none.

Also known as a false positive (seeing an effect that isn't there).

80
New cards

Type II error

Occurs when you fail to reject the null hypothesis when it is actually false, concluding there is no effect when there is one.

Also known as a false negative (not seeing an effect when there is one).

81
New cards

Regression coefficient

The slope in a regression model that tells the expected change in the dependent variable for a one-unit increase in the independent variable.
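A minimal least-squares sketch for one predictor, assuming made-up data. The names `fit_line`, `sessions` and `score` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope

sessions = [1, 2, 3, 4, 5]  # independent variable
score = [2, 4, 5, 4, 5]     # dependent variable
intercept, slope = fit_line(sessions, score)

# Predicted value of the DV for a new value of the IV.
predicted = intercept + slope * 6

print(round(intercept, 2), round(slope, 2), round(predicted, 2))
```

The slope says that each extra session is associated with an expected increase of about 0.6 points in the score.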

82
New cards

Unstandardized B

Represents the expected change in the DV for a one-unit increase in the IV using the original units of the DV. Used to interpret actual impact.

83
New cards

Standardized beta (β)

The expected change in the DV for a one standard deviation increase in the IV, which can be used to signify the strength and direction between the variables.

Allows comparison of the direction and strength of effects across different variables.

84
New cards

Statistics

A statistic is a numerical summary calculated from sample data (e.g., the sample mean or sample mode). It represents a property of the sample.

85
New cards

Between-Subjects Design

Different participants are assigned to different conditions, allowing comparison of the effects of the IV across groups.

This is also known as an independent groups design.

86
New cards

Bias

Occurs when systematic errors or researcher influence affect how data are collected, analysed, or interpreted, making the results less representative of the true population.

This can come from the researcher, participants or the sampling process.

87
New cards

Central Measures of Tendency

Descriptive statistics that depict the overall 'central' trend of a set of data. There are three key measures: mean, median and mode.

88
New cards

CI (Confidence interval)

A range of values, calculated from sample data, that is likely to contain the true population parameter. For a 95% CI: if we repeated the sampling many times, about 95% of the confidence intervals generated would contain the true population parameter.
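A rough sketch of a 95% interval for a mean, assuming made-up scores and the normal (z) approximation. With a sample this small a t-based interval would normally be wider, so treat the numbers as illustrative only.

```python
import math
import statistics

sample = [4, 8, 6, 5, 3, 7, 9, 5, 6, 7]  # made-up scores

mean = statistics.fmean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Critical z value for 95% confidence (about 1.96).
z_critical = statistics.NormalDist().inv_cdf(0.975)

lower = mean - z_critical * standard_error
upper = mean + z_critical * standard_error

print(round(lower, 2), round(upper, 2))
```

The interval is centred on the sample mean and gets narrower as the standard error shrinks.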

89
New cards

Cohen's d

A measure of effect size that assesses the strength of the difference between two means in terms of standard deviation.

  • Small effect: d ≈ 0.2

  • Medium effect: d ≈ 0.5

  • Large effect: d ≈ 0.8
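A minimal sketch of Cohen's d using the pooled standard deviation of two independent groups. The scores and the group names are made up.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Difference between two means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
d = cohens_d(treatment, control)

print(round(d, 2))
```

By the guideline above, this made-up difference would count as a large effect.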

90
New cards

Spearman rho

A Spearman's correlation was conducted to evaluate the relationship between students' French written and oral exam grades. There was a significant positive relationship, r_s(38) = .65, p < .001.

Non-parametric.
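Spearman's rho works on ranks rather than raw scores. The sketch below uses the shortcut formula 1 - 6Σd² / (n(n² - 1)), which is only valid when there are no tied scores. The five grades are made up and are not the data behind the report above.

```python
def ranks(values):
    """Rank from 1 (lowest) to n (highest); assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rho from squared rank differences (no ties)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

written = [55, 62, 70, 48, 81]  # made-up written exam grades
oral = [58, 60, 75, 50, 72]     # made-up oral exam grades
rho = spearman_rho(written, oral)

print(round(rho, 2))
```

Because only the rank order matters, the measure suits ordinal data and data that violate the assumptions of Pearson's r.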

91
New cards

Pearson r correlation

A Pearson correlation was calculated to assess the relationship between self-efficacy and exclusive breastfeeding. There was a significant positive correlation between self-efficacy and exclusive breastfeeding, r(df) = .45, p < .001.

92
New cards

Control group

The group in an experiment that does not receive the manipulation and serves as a baseline against which the effect of the IV on the experimental group is compared.

93
New cards

Correlation coefficient

Measures the strength and direction of a linear relationship between two variables; does not imply causation.

  • Guideline:
    • 0 - no relationship
    • 0.3 - low relationship
    • 0.8 - high relationship
    • 1 - perfect relationship
94
New cards

Descriptive Statistics

Statistical methods used to summarise, organise, and describe the main features of a dataset or sample. Includes measures like mean, median, mode, standard deviation, and visual tools like graphs and tables to help us understand the data.

95
New cards

DV

The variable that is measured or observed to see if an effect has occurred due to the IV. Also known as the outcome variable.

96
New cards

Parameter

A numerical value that describes a characteristic of the whole population (e.g., population mean).

97
New cards

Standard error

Measures how much a sample mean is likely to vary from the population mean.

It acts as an indicator of the sample mean's accuracy as an estimate of the population mean.
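For a mean, the standard error is commonly estimated as the standard deviation divided by the square root of the sample size. A short sketch with made-up values:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean from a standard deviation and sample size."""
    return sd / math.sqrt(n)

se_small_sample = standard_error(sd=10, n=25)
se_large_sample = standard_error(sd=10, n=100)

print(se_small_sample, se_large_sample)
```

Quadrupling the sample size halves the standard error, so larger samples give more precise estimates of the population mean.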

98
New cards

Nominal data

Categorical data that categorizes variables into distinct groups or labels without any natural order or quantitative value.

E.g. gender, hair colour (brown, white), nationality (English, Chinese)

99
New cards

Histogram

A graphical representation of the frequency distribution of continuous data, showing how often values occur within certain ranges (bins). Bars touch each other, indicating continuous/scale data [interval/ratio].

100
New cards

Ordinal data

Categorical data that has a natural, ordered ranking but the intervals between ranks are unknown.

E.g. Likert scales, educational levels, 1-5 rating scales