research methods 2

psychology as a science

Science
- Kuhn
  - Scientific methods — Skinner’s rats
  - Scientific terminology — ‘synaptic transmission’, ‘variables’
- Popper
  - Falsifiable — Skinner’s rats, brain scans
  - Replicable — strange situation
  - Objective — Skinner’s rats, brain scans
- Not science
  - Kuhn
    - No paradigms/paradigm shifts — bio vs cog., not agreed assumptions
    - Doesn’t have scientific methods — Genie and other case studies
    - Doesn’t have scientific terminology — schema (unscientific)
  - Popper
    - Not falsifiable — Freud’s unconscious
    - Not reliable — Genie and other case studies
    - Not objective — case studies, schema

kuhn

Criteria of a science:
- ✓
  - Scientific methods, e.g lab experiments
  - Scientific terminology, e.g independent/dependent variable or hypothesis, synaptic transmission
- X
  - Shared (agreed) assumptions about behaviour, e.g. the different approaches do not agree with each other

Paradigm shifts
- A paradigm is a shared belief about how something works; a paradigm shift is the change in paradigm when an older theory is disproven and a newer theory is introduced
- Psychology is different because it has several approaches for testing ideas rather than one unifying theory. Paradigm shifts are therefore less likely: the subject is broad, with many researchers working on different things, so the chance of showing a theory wrong and replacing it with a new one is reduced, meaning it is harder to get to the truth

karl popper

Falsifiability
- Trying to prove a theory (verification) is biased and easy; trying to disprove it (falsification) is more scientific
- Scientific topic — empirically testable, so you have the ability to prove it false. Impossible to ‘prove’ a scientific theory is 100% true because no amount of evidence assures no contradicting evidence can ever be found
- Science only progresses by proving theories wrong, which enables evolution/development of the subject, because a single observation that does not fit the theory means the theory is debunked and a new theory will replace it
- Example: Freud’s psychodynamic theory is unfalsifiable, as the ‘unconscious’ cannot be empirically tested to see if it exists or not
Replicable — ‘reliability’, researcher replicates the study to see if it produces the same results. Standardised procedures = easier to replicate = easy to test reliability (external)
- E.g reliable: strange situation, Harlow’s monkeys. Not reliable: Genie
Objective — ‘unbiased’, not affected by researcher’s own opinions and feelings, must be objective to be truly scientific
- E.g objective: experiments like Skinner, brain scans. Subjective: case studies or ppt observations, schema

analysing qualitative data

content analysis

Quantifying qualitative data by indirectly observing the presence of certain words, images or concepts (coding units) within the media (advertisements, books, films etc.) and counting the number of times they occur. Inferences about the messages within the data are then made. It is usually carried out on secondary data, e.g. data already published
Waynforth and Dunbar carried out a content analysis of lonely hearts columns, discovered distinct gender differences in the way men and women advertised themselves (men advertised resources like job, house etc. and sought attractive, youthful partners, while women advertised attractiveness and sought resources)
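
The counting step of a content analysis can be sketched in Python. This is only an illustration: the coding units and the two adverts below are invented for the example and are not taken from Waynforth and Dunbar's study.

```python
from collections import Counter
import re

# Hypothetical coding units: category -> words that count as an instance.
CODING_UNITS = {
    "resources": {"job", "house", "car", "salary"},
    "attractiveness": {"attractive", "pretty", "handsome", "slim"},
}

def code_text(text):
    """Tally how often each coding category appears in one piece of text."""
    words = re.findall(r"[a-z]+", text.lower())
    tally = Counter()
    for category, unit_words in CODING_UNITS.items():
        tally[category] = sum(1 for w in words if w in unit_words)
    return tally

# Invented adverts, for illustration only (not real data).
adverts = [
    "Professional man with own house and car seeks attractive partner",
    "Attractive, slim woman seeks man with steady job",
]

# Add the tallies from every advert together.
totals = Counter()
for ad in adverts:
    totals.update(code_text(ad))

print(dict(totals))
```

Because the categories are fixed before any advert is read, a second researcher running the same coding units over the same source should get the same tallies.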

Evaluations:

Advantage — relatively reliable. The method is easy to replicate by others using the same materials and examining the same source, and because the categories to look out for are created before the investigation begins, they are objective. This gives a stronger chance of being reliable
Disadvantage — only collects numerical data. Does not reveal the underlying reasons for a behaviour, only that it occurs. For example, it records the number of times men look for beauty but not what type of beauty. May make it difficult to apply these findings to society as the data has limited uses

thematic analysis

Method for identifying and reporting themes within data, using coding. It organises, describes and interprets data; the identified themes become categories for analysis, and the process of coding involves 6 stages:
- Familiarisation with the data — read the content
- Coding — decide what codes you will use to record (i.e numbers, letters, columns)
- Searching for themes — read back through after identifying themes you want to find and use your code to annotate themes
- Reviewing themes — ensure themes are relevant/not too exclusive or inclusive, after searching
- Defining and naming themes — identifying clear and concise names for themes so they make sense at the write-up
- Writing up — write up the report of what this is stating or inferring
Preferred over content analysis as it involves comparison of themes, identification of co-occurrences and graphs to display differences in themes

Evaluations:

Advantage — looks at meanings and preserves qualitative data. Because it finds qualitative themes in the data, it can provide rich insight into why a behaviour occurs. More sensitive
Disadvantage — inter-observer reliability is often low: qualitative data relies on subjective interpretation, so researchers may disagree about which theme the information fits into. This suggests the method has reasonably low reliability, as its outcomes are not consistent

reliability

Consistency of a test or procedure. One aspect is consistency between different observers (inter-rater reliability); another is having consistent measuring instruments when developing tests:
- Internal reliability — the extent to which something is consistent within itself, e.g. scales should measure weight in the same way whether the object is 50g or 100g; a ruler is another consistent measure
- External reliability — the extent to which a test produces consistent results over time
Ways of assessing and improving reliability
- Split half method — measures the internal reliability by splitting a test into 2 halves and having the same ppt do both halves. If the 2 halves of the test provide similar results, this indicates the test has internal reliability
- Test-retest method — measures the external reliability by giving the same test to the same ppt on 2 occasions. If the same result is obtained then the test has external reliability
- Inter-observer reliability — this measures whether different observers are viewing or rating a behaviour in the same way. If it is low, it could lead to biased observation
  - Can be improved by:
    - Develop clearly defined and separate categories for observational criteria
    - Increase the number of observers as data will be less subject to bias
    - Record the test/experiment so it can be reviewed at any time
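
Each of the methods above comes down to correlating two sets of scores (two halves, two occasions, or two observers). A minimal sketch for the test-retest case follows. The scores are invented, and the +0.8 cut-off is a commonly quoted rule of thumb assumed here, not something stated in these notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented scores for 5 ppts who took the same test on two occasions.
first_occasion = [12, 15, 9, 20, 17]
second_occasion = [13, 14, 10, 19, 18]

r = pearson_r(first_occasion, second_occasion)

# Assumed rule of thumb: a correlation of +0.8 or above suggests reliability.
RELIABILITY_THRESHOLD = 0.8
is_reliable = r >= RELIABILITY_THRESHOLD

print(round(r, 2), is_reliable)
```

For the split-half method the two lists would be each ppt's score on the two halves of the test; for inter-observer reliability they would be the two observers' tallies for each behaviour category.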

validity

Concerns accuracy: the extent to which something measures what it claims to measure, and the extent to which findings can be generalised beyond the research setting:
- Internal validity — this concerns whether results are due to the manipulation of the IV rather than confounding variables
- External validity — this concerns the extent to which an experiment’s results can be generalised to other settings (ecological validity), other people (population validity) and over time (temporal validity)
  - Improved by a more natural or realistic setting
Ways of assessing validity
- Face validity — simple way of assessing validity. Extent to which a test looks like it measures what it claims to measure; not scientific and usually only used for a pilot study
- Concurrent validity — correlating scores on a test with another test that is already known to be valid, i.e a new IQ test would check the score with another IQ test that is already known to be valid to see if they’re the same
- Predictive validity — involves testing a group of subjects and then comparing scores to the results obtained at some point in the future to see if they match, i.e a school entrance test predicting later exam results
- Temporal validity — use of the test-retest method in another time context to see if the same results are found, e.g. conduct an investigation on how oppressed females feel in 1901 and 2020 — different results = low temporal validity

choosing an appropriate statistical test

Statistical tests reveal whether results are significant and whether the experimental hypothesis can be accepted or rejected. The choice of test depends on:
- Difference (1 dependent variable) or correlation (2 co-variables)
- Level of data/measurement
- Experimental design

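| Nature of hypothesis | Level of measurement | Independent groups | Repeated measures |
|----------------------|----------------------|--------------------|-------------------|
| Difference           | Nominal              | Chi-squared        | Sign test         |
| Difference           | Ordinal              | Mann-Whitney U     | Wilcoxon          |
| Difference           | Interval             | Independent t-test | Related t-test    |
| Correlation          | Ordinal              | Spearman’s rho     |                   |
| Correlation          | Interval             | Pearson product moment |               |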
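
The three decisions above can be written as a simple lookup. This is a sketch only: the function name `choose_test` and the lowercase labels are made up for the example, and it assumes that "Ind." and "Rep." stand for independent groups and repeated measures.

```python
# Tests of difference: (level of measurement, design) -> test.
DIFFERENCE_TESTS = {
    ("nominal", "independent"): "Chi-squared",
    ("nominal", "repeated"): "Sign test",
    ("ordinal", "independent"): "Mann-Whitney U",
    ("ordinal", "repeated"): "Wilcoxon",
    ("interval", "independent"): "Independent t-test",
    ("interval", "repeated"): "Related t-test",
}

# Tests of correlation: level of measurement -> test (design is not used).
CORRELATION_TESTS = {
    "ordinal": "Spearman's rho",
    "interval": "Pearson product moment",
}

def choose_test(hypothesis, level, design=None):
    """Return the test name for a hypothesis type, data level and design."""
    if hypothesis == "correlation":
        return CORRELATION_TESTS[level]
    if hypothesis == "difference":
        return DIFFERENCE_TESTS[(level, design)]
    raise ValueError("hypothesis must be 'difference' or 'correlation'")

print(choose_test("difference", "ordinal", "repeated"))
print(choose_test("correlation", "interval"))
```

A combination that is not in the grid (for example a correlation with nominal data) raises a `KeyError` rather than guessing a test.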

analysing quantitative data

Interval data — difference between measurements but no true zero. Distance between units is internally reliable, usually standardised unit of measurement, e.g kg, mm
Ordinal data — ordered categories (rankings, order). Data which can be ranked, ordered or scored, but what is being measured is not internally reliable/standardised, e.g. self-report techniques, scores
Nominal data — categories (no ordering or direction). No individual score; the number represents the frequency of a behaviour/category

calculating significance

To calculate the significance of results, the observed/calculated value is compared with the critical value, which is provided in a critical value table. This requires:
- Observed value from statistical test
- Directional or non-directional hypothesis
- Number of participants
- Level of significance (5% or 1%)
Mann whitney, wilcoxon and sign tests require

Chi-squared — uses ppt/d.f
Mann Whitney U test — number of ppts in n1 and n2
Spearman’s rho — number of ppts
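Sign test — the observed value is not always given. To calculate it, discard any ‘same’/‘no difference’ scores, add up each remaining category (+ and -), and take the less frequent one as the observed value. One-tailed and two-tailed hypotheses have different critical values at the same significance level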
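
The sign test calculation described above can be sketched as follows. The before/after scores are invented for illustration, and the function name `sign_test` is made up for this example.

```python
def sign_test(before, after):
    """Return (S, N) for a sign test on paired scores.

    S is the count of the less frequent sign; N is the number of
    ppts left after pairs with no difference are discarded.
    """
    plus = sum(1 for b, a in zip(before, after) if a > b)
    minus = sum(1 for b, a in zip(before, after) if a < b)
    n = plus + minus  # tied pairs are counted in neither group
    return min(plus, minus), n

# Invented scores for 8 ppts in two conditions.
before = [5, 7, 6, 4, 8, 6, 5, 7]
after = [7, 8, 6, 6, 7, 9, 6, 9]

s, n = sign_test(before, after)
print(s, n)  # prints "1 7": six pluses, one minus, one tie discarded
```

S is then compared with the critical value for N (not the original number of ppts) in the sign test table.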

probability

Probability 0 = WON’T happen, 1 = WILL happen
A significant result means the experimental hypothesis can be accepted, while a non-significant result means the null hypothesis is accepted
p<0.05 means there is a less than 5% risk that the results gained are due to chance, and therefore that we have accepted and rejected the wrong hypotheses

After conducting the [] test, the results of this test are significant/not significant at the significance level of p<0.05, with [] ppts using a one/two tailed hypothesis. This is because the observed value is MORE/LESS THAN the critical value. In this case, the experimental/null hypothesis can be rejected and the null/experimental hypothesis can be accepted
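
The comparison behind the template above can be sketched in code. It relies on the standard rule, not spelled out in these notes, that for the sign test, Wilcoxon and Mann-Whitney U the observed value must be equal to or less than the critical value, while for the other tests it must be equal to or greater than it. The critical values in the example calls are placeholders; real ones come from the critical value table.

```python
# Tests where the observed value must be equal to or LESS than the critical value.
LOWER_IS_SIGNIFICANT = {"sign test", "wilcoxon", "mann-whitney u"}

def is_significant(test, observed, critical):
    """Compare an observed value with a critical value for the named test."""
    if test.lower() in LOWER_IS_SIGNIFICANT:
        return observed <= critical
    # Chi-squared, t-tests, Spearman's rho, Pearson: equal to or greater than.
    # The sign of the observed value is ignored here; for a one-tailed
    # hypothesis the direction of the effect must be checked separately.
    return abs(observed) >= critical

def conclusion(test, observed, critical):
    """Fill in the accept/reject part of the write-up template."""
    if is_significant(test, observed, critical):
        return "significant: reject the null, accept the experimental hypothesis"
    return "not significant: reject the experimental hypothesis, accept the null"

# Placeholder critical values, for illustration only.
print(conclusion("Sign test", observed=1, critical=1))
print(conclusion("Chi-squared", observed=2.5, critical=3.84))
```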

type 1 and 2 errors

Type 1 error (possible if a significant result is obtained) — detecting a significant difference between your IV conditions that is not actually present. This is more likely when your significance level is too lenient, which is dangerous
Type 2 error (possible if a non-significant result is obtained) — failing to detect a significant difference that is actually present. This is more likely when the probability level is too strict, making it less likely that a product is approved or results are published
Anything more lenient than 0.07 is too lenient, anything less than 0.03 is too strict; 0.05 is used as it balances the chances of Type 1 and Type 2 errors.