sign test definition
involves looking up the number of pluses/minuses (S) (whichever is smaller) against the total number of pluses and minuses (N) in a table
the table tells you whether the results you have obtained are statistically significant at the 5% level
the sign test is used when you have pairs of scores of related samples
used in matched pairs or repeated measures designs
sign test procedure
give each pair of scores a plus if the score in the left column is bigger than the score in the right column
give each pair of scores a minus if the score in the left column is smaller than the score in the right column
give each pair of scores a zero if there is no difference between the left and right columns
make a note of the number of times the less frequent sign (S) occurs and the total number of pluses and minuses (N) (don’t include any zeroes in N)
look up in the statistical table the critical value of S for your value of N at the 5% significance level (the counting of S and N is sketched below)
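A minimal Python sketch of the counting step, using hypothetical before/after scores (the variable names and data are illustrative, not from the notes):

```python
# Sketch of the sign-test counting step (hypothetical paired scores).
before = [12, 15, 9, 14, 10, 11, 13]   # left column
after  = [14, 13, 9, 17, 12, 10, 16]   # right column

pluses = sum(1 for b, a in zip(before, after) if b > a)   # left score bigger: +
minuses = sum(1 for b, a in zip(before, after) if b < a)  # left score smaller: -
# pairs with no difference (zeroes) are simply not counted

S = min(pluses, minuses)   # the less frequent sign
N = pluses + minuses       # total pluses and minuses, zeroes excluded

print(f"pluses={pluses}, minuses={minuses}, S={S}, N={N}")
# S is then looked up against the critical value for N in the statistical table
```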
final steps of sign test
if the value of S you have found is equal to or lower than the value in the statistical table, the IV has had an effect on the DV
thus the results are significant at the 5% level
if the value of S you have found is more than the table value, then it is concluded that the IV has had no effect on the DV
thus the results are not significant at the 5% level
example of model answer for sign test
the calculated value of S (7) is greater than the critical value (5) for N = 20 at the 5% level (p = 0.05)
for a one-tailed test such as this, so the results are not significant
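As a rough sketch, the same decision rule in Python using the numbers from the model answer (S = 7, critical value = 5, N = 20, one-tailed):

```python
# Decision rule applied to the model-answer figures.
S = 7
critical_value = 5  # from the table for N = 20 at the 5% level, one-tailed

if S <= critical_value:
    print("S is equal to or lower than the critical value: significant at the 5% level")
else:
    print("S is greater than the critical value: not significant at the 5% level")
# here 7 > 5, so the result is not significant
```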
purpose of statistical testing
statistical tests determine if a difference/correlation is statistically significant
factors that determine the choice of statistical tests
has the researcher conducted a test of difference or correlation?
if a test of difference is conducted, which experimental design was used?
unrelated - independent groups design
related - repeated measures and matched pairs designs
has the researcher collected nominal, ordinal or interval data?
when to use which statistical test
chi-squared is a test of both difference and association
Spearman's rho and Pearson's r are the only tests of correlation
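The choice of test can be written out as a lookup, sketched below in Python; the notes above only name some of these tests, so the full mapping follows the standard choice table and should be treated as an assumption:

```python
# Sketch of the usual test-choice logic: purpose of the test, experimental
# design (for tests of difference) and level of data.
def choose_test(purpose, design, data):
    # purpose: "difference" or "correlation"
    # design: "related", "unrelated", or None for correlations
    # data: "nominal", "ordinal" or "interval"
    table = {
        ("difference", "unrelated", "nominal"): "Chi-squared",
        ("difference", "related", "nominal"): "Sign test",
        ("difference", "unrelated", "ordinal"): "Mann-Whitney",
        ("difference", "related", "ordinal"): "Wilcoxon",
        ("difference", "unrelated", "interval"): "Unrelated (independent) t-test",
        ("difference", "related", "interval"): "Related (paired) t-test",
        ("correlation", None, "nominal"): "Chi-squared (test of association)",
        ("correlation", None, "ordinal"): "Spearman's rho",
        ("correlation", None, "interval"): "Pearson's r",
    }
    return table[(purpose, design, data)]

print(choose_test("difference", "related", "nominal"))   # Sign test
print(choose_test("correlation", None, "interval"))      # Pearson's r
```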
parametric tests definition
parametric tests assume:
a normal distribution
use of interval data (as it’s the most sensitive and precise)
homogeneity of variance
homogeneity of variance definition
if the set of scores per condition are similar in terms of dispersion, then this means they have homogeneity of variance
if both conditions have a similar standard deviation, then this indicates that there was not a large amount of variability in each condition
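A small illustrative sketch (hypothetical scores) of checking homogeneity of variance by comparing the standard deviation of each condition:

```python
# Compare the spread of scores in two conditions.
from statistics import stdev

condition_a = [12, 14, 15, 13, 16, 14]
condition_b = [11, 13, 14, 12, 15, 13]

sd_a, sd_b = stdev(condition_a), stdev(condition_b)
print(f"SD condition A = {sd_a:.2f}, SD condition B = {sd_b:.2f}")
# similar standard deviations suggest similar dispersion,
# i.e. homogeneity of variance
```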
strengths of parametric tests
more powerful and precise than non-parametric tests, as they have more statistical power and are more likely to detect a significant difference or correlation
non-parametric tests definition
do not follow the same criteria as parametric tests
there is no assumption of normal distribution as what is being measured may not fall within defined parameters
they use nominal or ordinal data
they do not depend on homogeneity of variance
reliability definition
reliability is consistency (if results are reliable, they will be consistent every time the experiment is repeated)
internal reliability definition
internal reliability means the test is consistent within itself
e.g. 2 parts of the same test need to measure the same thing in the same way
external reliability definition
external reliability means the test is consistent over a period of time
e.g. an IQ test should produce the same results for the same person the next year as it did last year
external reliability can be determined using the test-retest method
how to test consistency
test-retest reliability
if a test is repeated, a reliable test would yield similar results
split half reliability
if 2 halves of the same test yield the same results
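A rough Python sketch of split-half reliability with hypothetical item scores: split each participant's items into two halves and correlate the half-totals (the data and the half-split are illustrative):

```python
# Split-half reliability: do the two halves of the same test agree?
from statistics import correlation  # Pearson's r, available from Python 3.10

# each inner list = one participant's scores on 6 items
scores = [
    [4, 5, 3, 4, 5, 4],
    [2, 3, 2, 3, 2, 3],
    [5, 5, 4, 5, 4, 5],
    [3, 2, 3, 3, 3, 2],
]

first_half = [sum(p[:3]) for p in scores]   # items 1-3
second_half = [sum(p[3:]) for p in scores]  # items 4-6

print(f"split-half correlation = {correlation(first_half, second_half):.2f}")
# a strong positive correlation suggests both halves measure the same thing
```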
threats to reliability
task interest
if a participant finds a task interesting, they are more likely to do well in it
if a participant is bored by a task, there may be a decline in performance
if a task-interest issue is suspected, it is necessary to make the control task equally interesting
how to combat task interest
counterbalancing
e.g half of participants would do task A then task B
the other half would do task B then task A
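A minimal sketch of counterbalancing task order, assuming a hypothetical list of participant IDs:

```python
# Counterbalancing: half do A then B, the other half do B then A.
participants = ["P1", "P2", "P3", "P4", "P5", "P6"]

half = len(participants) // 2
group_ab = participants[:half]   # task A then task B
group_ba = participants[half:]   # task B then task A

for p in group_ab:
    print(f"{p}: task A -> task B")
for p in group_ba:
    print(f"{p}: task B -> task A")
```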
inter-rater reliability definition
how similarly different raters/judges score the same event
it is unreliable to have the results of a test depend upon who is doing the observation
all people making judgements need to be making the same judgements using the same criteria
MODEL ANS: how to check inter-rater reliability
inter-rater reliability is when you have 2 or more people making the same judgements using the same criteria
the numerical scores of the observers are compared using a scattergram
a positive correlation of 0.8 or more demonstrates good inter-rater reliability
the correlation can be worked out by calculating a correlational coefficient
correlational coefficient can be calculated using Spearman’s Rho and Pearson’s R
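A hedged sketch of the check described above, using hypothetical observer scores; Spearman's rho is computed here via scipy, assuming scipy is available:

```python
# Compare two observers' numerical scores with Spearman's rho
# (hypothetical ratings; a scattergram of the two lists could also be plotted).
from scipy.stats import spearmanr

rater_1 = [3, 5, 2, 4, 4, 1, 5, 3]
rater_2 = [3, 4, 2, 4, 5, 1, 5, 2]

rho, p_value = spearmanr(rater_1, rater_2)
print(f"correlation coefficient = {rho:.2f}")
# a positive correlation of 0.8 or more demonstrates good inter-rater reliability
print("good inter-rater reliability" if rho >= 0.8 else "reliability needs improving")
```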
MODEL ANS: how to train inter-rater reliability
clear and mutually exclusive categories are made for observation
joint observations are completed (preferably using a video) whilst recording categories, to standardise observations and judgements
the reliability of observational scores is compared using a scattergram
this is done by calculating the correlational coefficient of observational scores
correlational coefficient can be calculated using Spearman’s Rho and Pearson’s R
a correlational coefficient of 0.8 or more demonstrates good inter-rater reliability
validity definition
a test or measure is valid if it measures what it's supposed to measure
face validity definition
weakest form of testing validity
process of looking at a measure and making a quick judgement as to whether it does measure what it’s supposed to
population validity definition
concerns the population chosen for a sample
questions which are asked:
is the sample size large enough?
is the sample size too narrow culturally?
can the findings be generalised to a wider population?
external validity definition
another way of questioning whether the results can be extrapolated or generalised across a wider population
ideas to look at and whether these make the data generalisable or not:
location
time of day
era within history
internal validity definition
examines whether the IV caused the DV
the results of a study are internally valid if they result from the manipulation of the IV acting on the DV and have not been affected by extraneous variables
ecological validity definition
how true to life the experimental situation is and if the results would be replicated in a real-life scenario
content validity definition
does the content of the experiment/test measure what it is supposed to
criterion validity definition
examines whether the criteria being used measure what they are meant to
concurrent validity definition
examines whether other tests done at the same time support the results found
predictive validity definition
examines whether the experiment or theory predicts what will happen to other people or how other people will behave, based on what happened to the participants or how they behaved in the study
experimental validity definition
examines whether the conclusions drawn from a piece of research are true
examines whether the experiment worked
examines whether there was a genuine effect of the IV on the DV
type I and type II errors can be referred to here
experimental validity requires internal validity
construct validity definition
examines whether the correct assumptions about psychological constructs have been made
e.g. is measuring sadness through crying correct? are there any other occasions when people cry? are there other ways of expressing sadness?
temporal validity definition
measures the extent to which research findings are still relevant in the current age
threats to validity
investigator effects
demand characteristics
Hawthorne Effect
observational unreliability
unreliable self-report
improving the validity of lab experiments
using controlled conditions and standardised procedures to establish causality
using single-blind or double-blind procedures to ensure no bias from researchers
avoiding investigator effects
use double-blind procedure to reduce impact of the investigator on participant performance
avoiding demand characteristics
disguise the aim of the research as much as ethical conduct allows
using single-blind procedures
improving observational reliability
using covert methods in naturalistic observation to reduce participant behaviour being contrived
ensuring behavioural categories are unambiguous and mutually exclusive
improving reliability of self-report
using a lie scale to show inconsistencies in responses
using reverse scoring to ensure participants answer all questions with the same direction of response, rather than just picking the same option every time (e.g. 10/10 for all answers)
type I error definition
occurs when the null hypothesis is rejected when it should have been accepted (researcher claims the results are significant when they are not)
more likely to occur when the researcher uses a probability value that is too high (e.g. 0.1 rather than 0.05)
type II error definition
occurs when the null hypothesis is accepted when it should have been rejected (researcher claims the results are not significant when they are)
more likely to happen when the researcher uses a probability value that is too low (e.g. 0.01 instead of 0.05)
how to combat type I and II errors
using a significance level of 0.05
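A small illustrative sketch (hypothetical p-values) of how the chosen significance level drives the decision and the two error risks:

```python
# Decision rule: reject the null hypothesis when p is at or below the chosen level.
def decide(p_value, alpha=0.05):
    return "reject null (significant)" if p_value <= alpha else "accept null (not significant)"

p = 0.07
print(decide(p, alpha=0.05))    # not significant at the conventional 5% level
print(decide(p, alpha=0.1))     # a lenient level calls this significant -> more Type I risk
print(decide(0.03, alpha=0.01)) # a strict level misses an effect at p = 0.03 -> more Type II risk
```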