mathematical content, reliability and validity


42 Terms

1
New cards

sign test definition

  • involves comparing S, the number of pluses or minuses (whichever is smaller), against the total number of pluses and minuses (N) in a table of critical values

  • the table tells you whether the results you have obtained are statistically significant at the 5% level

  • the sign test is used when you have pairs of scores of related samples

    • used in matched pairs or repeated measures designs

2
New cards

sign test procedure

  1. give each pair of scores a plus if the score in the left column is bigger than the score in the right column

  2. give each pair of scores a minus if the score in the left column is smaller than the score in the right column

  3. give each pair of scores a zero if there is no difference between the left and right columns

  4. make a note of the number of times the less frequent sign (S) occurs and the total number of pluses and minuses (N) (don’t include any zeroes in N)

  5. look up in the given statistical table the highest critical value of S which is significant at the 5% level for your value of N

3
New cards

final steps of sign test

  • if the value of S you have found is equal to or lower than the value in the statistical table, the IV has had an effect on the DV

    • thus the results are significant at the 5% level

  • if the value of S you have found is more than the table value, then it is concluded the IV has had no effect on the DV

    • thus the results are not significant at the 5% level (see the sketch below)
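
A minimal sketch of the whole sign test in Python, assuming hypothetical paired scores and a hypothetical critical value (in practice the critical value is read from the statistical table for your N at the 5% level):

```python
# Sign test sketch: paired scores from a repeated measures design.
# The scores and the critical value below are hypothetical examples.

left  = [12, 15, 9, 14, 10, 11, 16, 13]   # e.g. condition A
right = [10, 15, 11, 12, 9, 14, 13, 10]   # e.g. condition B

pluses  = sum(1 for a, b in zip(left, right) if a > b)
minuses = sum(1 for a, b in zip(left, right) if a < b)
# pairs with no difference score a zero and are excluded from N

S = min(pluses, minuses)   # the less frequent sign
N = pluses + minuses       # total pluses and minuses (zeroes excluded)

CRITICAL_VALUE = 0  # hypothetical: read from a sign test table for this N

if S <= CRITICAL_VALUE:
    print(f"S = {S}, N = {N}: significant at the 5% level (IV affected the DV)")
else:
    print(f"S = {S}, N = {N}: not significant at the 5% level")
```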

4
New cards

example of model answer for sign test

  • the calculated value of S (7) is greater than the critical value (5) for N = 20 at the 5% level (p = 0.05) for a one-tailed test such as this

  • therefore the results are not significant

5
New cards

purpose of statistical testing

  • statistical tests determine if a difference/correlation is statistically significant

6
New cards

factors that determine the choice of statistical tests

  • has the researcher conducted a test of difference or correlation?

  • if a test of difference is conducted, which experimental design was used?

    • unrelated - independent groups design

    • related - repeated measures and matched pairs designs

  • has the researcher collected nominal, ordinal or interval data?

7
New cards

when to use which statistical test

  • chi-squared is a test of both difference and association

  • Spearman's rho and Pearson's r are the only tests of correlation (both are sketched in code below)
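
A brief sketch of both correlation tests using scipy.stats, on hypothetical data:

```python
from scipy import stats

# hypothetical paired measurements from one group of participants
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

rho, p_rho = stats.spearmanr(x, y)  # Spearman's rho: rank-based, suits ordinal data
r, p_r = stats.pearsonr(x, y)       # Pearson's r: assumes interval data

print(f"Spearman's rho = {rho:.2f} (p = {p_rho:.3f})")
print(f"Pearson's r    = {r:.2f} (p = {p_r:.3f})")
```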

8
New cards

parametric tests definition

  • parametric tests assume:

    • a normal distribution

    • use of interval data (as it’s the most sensitive and precise)

    • homogeneity of variance

9
New cards

homogeneity of variance definition

  • if the sets of scores per condition are similar in terms of dispersion, then this means they have homogeneity of variance

  • if both conditions have a similar standard deviation, then this indicates that there was not a large amount of variability in each condition
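
A quick sketch of checking this on hypothetical scores: compare the standard deviations directly, or run Levene's test from scipy (a non-significant result gives no evidence against equal variances):

```python
import statistics
from scipy import stats

# hypothetical scores for the two conditions
condition_a = [10, 12, 11, 13, 9, 12, 14, 10]
condition_b = [15, 17, 16, 18, 14, 17, 19, 15]

# similar standard deviations suggest homogeneity of variance
print(statistics.stdev(condition_a), statistics.stdev(condition_b))

# Levene's test: p > 0.05 is consistent with equal variances
stat, p = stats.levene(condition_a, condition_b)
print(f"Levene W = {stat:.2f}, p = {p:.3f}")
```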

10
New cards

strengths of parametric tests

  • more powerful and precise than non-parametric tests, as they have more statistical power and are more likely to lead to the detection of a significant difference or correlation

11
New cards

non-parametric tests definition

  • do not follow the same criteria as parametric tests

    • there is no assumption of normal distribution as what is being measured may not fall within defined parameters

    • they use nominal or ordinal data

    • they do not depend on homogeneity of variance

12
New cards

reliability definition

  • reliability is consistency (if results are reliable, they will be consistent every time the experiment is repeated)

13
New cards

internal reliability definition

  • internal reliability means the test is consistent within itself

    • e.g. 2 parts of the same test need to measure the same thing in the same way

14
New cards

external reliability definition

  • external reliability means the test is consistent over a period of time

    • e.g. an IQ test should produce the same results for the same person the next year as it did last year

  • external reliability can be determined using the test-retest method

15
New cards

how to test consistency

  • test-retest reliability

    • if a test is repeated, a reliable test would yield similar results

  • split half reliability

    • if 2 halves of the same test yield the same results
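
A minimal sketch of the split-half check, assuming hypothetical per-item scores: total each participant's odd and even items separately, then correlate the two half-scores.

```python
from scipy import stats

# hypothetical item scores: one row per participant, one column per item
scores = [
    [4, 5, 3, 4, 5, 4, 3, 4],
    [2, 1, 2, 3, 1, 2, 2, 1],
    [5, 4, 5, 5, 4, 5, 4, 5],
    [3, 3, 2, 3, 3, 2, 3, 3],
    [1, 2, 1, 1, 2, 1, 2, 2],
]

odd_half  = [sum(row[0::2]) for row in scores]  # items 1, 3, 5, 7
even_half = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, 8

# a strong positive correlation between the halves indicates internal reliability
r, p = stats.pearsonr(odd_half, even_half)
print(f"split-half r = {r:.2f}")
```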

16
New cards

threats to reliability

  • task interest

    • if a participant finds a task interesting, they are more likely to do well in it

    • if a participant is bored by a task, there may be a decline in performance

    • if a task interest issue is suspected, it is necessary to make the control task equally interesting

17
New cards

how to combat task interest

  • counterbalancing

    • e.g. half of participants would do task A then task B

    • the other half would do task B then task A
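
A small sketch of counterbalancing a hypothetical participant list:

```python
# hypothetical participants, alternately assigned the two task orders
participants = ["P1", "P2", "P3", "P4", "P5", "P6"]

orders = {
    p: ["A", "B"] if i % 2 == 0 else ["B", "A"]
    for i, p in enumerate(participants)
}
print(orders)  # half run A then B, the other half B then A
```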

18
New cards

inter-rater reliability definition

  • how similarly different raters/judges score the same event

  • it is unreliable to have the results of a test depend upon who is doing the observation

  • it is necessary that all people making judgements need to be making the same judgements using the same criteria

19
New cards

MODEL ANS: how to check inter-rater reliability

  • inter-rater reliability is when you have 2 or more people making the same judgements using the same criteria

  • the numerical scores of the observers are compared using a scattergram

  • a positive correlation of 0.8 or more demonstrates good inter-rater reliability

  • the correlation can be worked out by calculating a correlational coefficient

  • correlational coefficient can be calculated using Spearman’s Rho and Pearson’s R
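
A short sketch of the check above, assuming hypothetical ratings of the same ten events by two observers:

```python
from scipy import stats

observer_1 = [3, 5, 4, 2, 5, 1, 4, 3, 2, 5]
observer_2 = [3, 4, 4, 2, 5, 2, 4, 3, 1, 5]

# Spearman's rho suits ordinal rating scales (Pearson's r would suit interval data)
rho, p = stats.spearmanr(observer_1, observer_2)

if rho >= 0.8:
    print(f"rho = {rho:.2f}: good inter-rater reliability")
else:
    print(f"rho = {rho:.2f}: poor inter-rater reliability; revisit the categories")
```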

20
New cards

MODEL ANS: how to train inter-rater reliability

  • clear and mutually exclusive categories are made for observation

  • joint observations are completed (preferably using a video) whilst recording categories to standardise observations and judgements

  • the reliability of observational scores is compared using a scattergram

  • this is done by calculating the correlational coefficient of observational scores

  • correlational coefficient can be calculated using Spearman’s Rho and Pearson’s R

  • a correlational coefficient of 0.8 or more demonstrates good inter-rater reliability

21
New cards

validity definition

a test or measure is valid if it measures what it's supposed to measure

22
New cards

face validity definition

  • weakest form of testing validity

  • process of looking at a measure and making a quick judgement as to whether it does measure what it’s supposed to

23
New cards

population validity definition

  • concerns the population chosen for a sample

  • questions which are asked:

    • is the sample size large enough?

    • is the sample too narrow culturally?

    • can the findings be generalised to a wider population?

24
New cards

external validity definition

  • another way of questioning whether the results can be extrapolated or generalised across a wider population

  • ideas to look at and whether these make the data generalisable or not:

    • location

    • time of day

    • era within history

25
New cards

internal validity definition

  • examines whether the IV caused the DV

  • the results of a study are internally valid if they are the results of the manipulation of the IV on the DV and if they have not been affected by extraneous variables

26
New cards

ecological validity definition

  • how true to life the experimental situation is and if the results would be replicated in a real-life scenario

27
New cards

content validity definition

  • does the content of the experiment/test measure what it is supposed to

28
New cards

criterion validity definition

  • examines whether the criteria being used measure what they're meant to

29
New cards

concurrent validity definition

  • examines whether other tests done at the same time support the results found

30
New cards

predictive validity definition

  • examines whether the experiment or theory predicts what will happen to other people or how other people will behave, based on what happened to the participants or how they behaved in the study

31
New cards

experimental validity definition

  • examines whether the conclusions drawn from a piece of research are true

  • examines whether the experiment worked

  • examines whether there was a genuine effect of the IV on the DV

    • type I and type II errors can be referred to here

  • experimental validity requires internal validity

32
New cards

construct validity definition

  • examines whether the correct assumptions about psychological constructs have been made

    • e.g. is measuring sadness through crying correct? are there any other occasions when people cry? are there other ways of expressing sadness?

33
New cards

temporal validity definition

  • measures the extent to which research findings are still relevant in the current age

34
New cards

threats to validity

  • investigator effects

  • demand characteristics

  • Hawthorne Effect

  • observational unreliability

  • unreliable self-report

35
New cards

improving the validity of lab experiments

  • using controlled conditions and standardised procedure to be able to establish causality

  • using single-blind or double-blind procedures to ensure no bias from researchers

36
New cards

avoiding investigator effects

  • use double-blind procedure to reduce impact of the investigator on participant performance

37
New cards

avoiding demand characteristics

  • disguise the aim of the research as much as ethical conduct allows

  • using single-blind procedures

38
New cards

improving observational reliability

  • using covert methods in naturalistic observation to reduce participant behaviour being contrived

  • ensuring behavioural categories are unambiguous and mutually exclusive

39
New cards

improving reliability of self-report

  • using a lie scale to show inconsistencies in responses

  • using reverse scoring to ensure the participants answer all questions with the same direction of response, rather than the same option (e.g. 10/10 for all answers) — see the sketch below
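
A minimal sketch of reverse scoring, assuming a hypothetical 1-10 response scale where items 2 and 4 are reverse-keyed:

```python
# hypothetical responses on a 1-10 scale; items 2 and 4 are reverse-keyed
responses = {1: 9, 2: 3, 3: 8, 4: 2, 5: 9}
REVERSED_ITEMS = {2, 4}
SCALE_MAX = 10

scored = {
    item: (SCALE_MAX + 1 - value) if item in REVERSED_ITEMS else value
    for item, value in responses.items()
}
print(scored)  # ticking 10/10 for everything no longer yields a uniformly high score
```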

40
New cards

type I error definition

  • occurs when the null hypothesis is rejected when it should have been accepted (researcher claims the results are significant when they are not)

  • more likely to occur when the researcher uses a probability value that is too high (e.g. 0.1 rather than 0.05)

41
New cards

type II error definition

  • occurs when the null hypothesis is accepted when it should have been rejected (researcher claims the results are not significant when they are)

  • more likely to happen when the researcher uses a probability value that is too low (e.g. 0.01 instead of 0.05)

42
New cards

how to combat type I and II errors

using a significance level of 0.05
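
A small simulation sketch of why 0.05 is the conventional compromise, using a hypothetical independent-samples t-test as a stand-in difference test: both samples come from the same population, so the null hypothesis is true and every "significant" result is a Type I error.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
trials = 2000
false_positives = {0.05: 0, 0.1: 0}

for _ in range(trials):
    # both samples come from the SAME population: the null hypothesis is true
    a = rng.normal(0, 1, 30)
    b = rng.normal(0, 1, 30)
    _, p = stats.ttest_ind(a, b)
    for alpha in false_positives:
        if p < alpha:
            false_positives[alpha] += 1  # a Type I error

for alpha, count in false_positives.items():
    print(f"alpha = {alpha}: Type I error rate ~ {count / trials:.3f}")
```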