
Lecture 3

Data Transformation and Cleaning in SPSS

Introduction

This lecture focuses on transforming raw data into usable knowledge for analysis in SPSS.
The goal is to clean and manipulate data obtained from sources like Qualtrics to ensure accurate and reliable results.

Data Cleaning: Garbage In, Garbage Out

Principle: Clean data is crucial for generating meaningful insights; otherwise, the analysis will produce unreliable results.
Raw Data: Keep the original, untouched data set separate from the working data.
- One raw data set that you never edit.
- One working copy that you can manipulate and clean up.

Step 1: Removing Unnecessary Data

Excluding Participants Who Declined Participation

Remove responses from individuals who indicated they did not want to participate.
Process:
- Go to Data -> Select Cases.
- Choose "If condition is satisfied" and enter a condition that describes the cases to keep, so that participants who answered "no" to the participation question are left out.
- Syntax example: SELECT IF (participant <> 2). (where 2 represents "no").
- Run the syntax, followed by EXECUTE., to remove these cases.
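The keep-versus-drop logic of this filter can be sketched outside SPSS. The Python below is only an illustration under assumptions: the responses are made up, and the variable name participant with code 2 for "no" follows the example above.

```python
# Hypothetical responses; participant: 1 = agreed, 2 = declined ("no").
responses = [
    {"id": 1, "participant": 1},
    {"id": 2, "participant": 2},
    {"id": 3, "participant": 1},
]

# Keep only the cases that did NOT answer "no".
kept = [row for row in responses if row["participant"] != 2]

print([row["id"] for row in kept])
```

The condition names the cases that stay in the data set, which is the same direction a select-cases condition works in.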

Excluding Invariant Responses (Straight-liners)

Identify and remove responses where a participant gave the same answer to every question, which indicates a lack of attention.
In the lecture example, respondents two and four are invariant: their answers show no variation across items.
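One way to spot invariant responses is to check whether a respondent's answers contain only a single distinct value. This is a minimal Python sketch, not the lecture's procedure; the ratings are invented, and the helper name is_straight_liner is hypothetical.

```python
def is_straight_liner(answers):
    """Return True when all answers in the list share one value."""
    return len(set(answers)) == 1

# Hypothetical ratings on a 1-7 scale, one list per respondent.
ratings = {
    1: [5, 3, 6, 2, 4],
    2: [4, 4, 4, 4, 4],
    3: [7, 6, 7, 5, 6],
    4: [1, 1, 1, 1, 1],
}

invariant_ids = [rid for rid, answers in ratings.items() if is_straight_liner(answers)]
print(invariant_ids)
```

An equivalent check is that the standard deviation across a respondent's items is zero.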

Excluding Rushers

Identify and remove participants who completed the survey too quickly, which suggests they did not engage thoughtfully with the questions.
Rushers are the second group of inattentive respondents to remove, alongside straight-liners (invariants).
Process:
- Create a new variable called "duration" that holds the time taken to complete the survey, in minutes.
- Convert the recorded duration from seconds to minutes by dividing by 60: Duration(minutes) = \frac{Duration(seconds)}{60}.
- Establish a reasonable minimum time based on the survey's intended length (e.g., at least 5 minutes for a 15-20 minute survey).
- Remove responses with durations below this threshold.
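The conversion and threshold steps above can be sketched as follows. The completion times are made up, and the 5-minute cutoff is the example value from the notes rather than a general rule.

```python
MIN_MINUTES = 5  # example threshold from the notes; adjust per survey

# Hypothetical completion times in seconds, keyed by respondent.
duration_seconds = {1: 960, 2: 180, 3: 1140, 4: 299, 5: 300}

# minutes = seconds / 60
duration_minutes = {rid: secs / 60 for rid, secs in duration_seconds.items()}

# Anyone strictly below the threshold counts as a rusher.
rushers = [rid for rid, mins in duration_minutes.items() if mins < MIN_MINUTES]
kept = [rid for rid, mins in duration_minutes.items() if mins >= MIN_MINUTES]

print(rushers, kept)
```

Respondent 5, at exactly 5 minutes, is kept because the sketch only drops durations below the threshold.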

Step 2: Handling Missing Data

Types of Missing Data

There are two types of missing data:
- Skipped questions: Participants saw the question but chose not to answer (indicated by a specific code, e.g., -99).
- Unseen questions: Participants did not reach the question due to early dropout or branching (indicated by a blank cell).

Coding Missing Values

Define missing value codes in SPSS to ensure they are not included in calculations.
Process:
- Go to Variable View.
- For each variable with missing data, specify the missing value code (e.g., -99) under the "Missing" column.
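To see why the missing value code has to be declared, compare a mean that treats -99 as a real answer with one that excludes it. This is an illustrative Python sketch with invented answers; None stands in for a blank cell, and valid_values is a hypothetical helper.

```python
SKIPPED = -99  # code for "saw the question but skipped it" (from the notes)

def valid_values(values):
    """Drop skipped (-99) and unseen (None) answers before any calculation."""
    return [v for v in values if v is not None and v != SKIPPED]

# Hypothetical answers to one item; None represents a blank cell.
q1 = [5, -99, 3, None, 7]

seen = [v for v in q1 if v is not None]
naive_mean = sum(seen) / len(seen)      # wrongly treats -99 as an answer

valid = valid_values(q1)
clean_mean = sum(valid) / len(valid)    # uses only real answers

print(naive_mean, clean_mean)
```

A single undeclared -99 drags the mean of a 1-7 item far below the scale, which is the "garbage in, garbage out" problem in miniature.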

Step 3: Creating a Respondent ID Variable

Generate a unique identification number for each respondent.
Process:
- Compute a new variable named respondent_ID (SPSS variable names cannot contain spaces) using the $CASENUM system variable, which holds the case number.
- Syntax: COMPUTE respondent_ID = $CASENUM. followed by EXECUTE.
- To place the new variable at the beginning of the data set, go to Variable View, insert a new variable where you want it, name it respondent_ID, and then run the compute command.
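The idea of numbering cases in file order and putting the ID first can be sketched in Python. The rows are hypothetical, and the numbering starts at 1 to mirror case numbers.

```python
# Hypothetical rows in file order.
rows = [{"Q1": 4}, {"Q1": 6}, {"Q1": 2}]

# Number cases 1, 2, 3, ... and place the ID as the first field of each row.
with_ids = [{"respondent_ID": n, **row} for n, row in enumerate(rows, start=1)]

print(with_ids)
```

Because the ID depends on the current row order, it is safest to assign it before sorting the data.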

Step 4: Reverse Coding

Some items are worded in the opposite direction, partly to check that participants are paying attention.
Reverse code these items so that every item on the scale points the same way.
Process:
- Use the Recode into Different Variables function.
- For each reverse-coded item, assign new values such that:
  - 1 becomes 7
  - 2 becomes 6
  - 3 becomes 5
  - 4 stays 4
  - 5 becomes 3
  - 6 becomes 2
  - 7 becomes 1
- Syntax example: RECODE Q9_1 (1=7) (2=6) (3=5) (4=4) (5=3) (6=2) (7=1) INTO Q9_1_reversed.
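The recode table above follows one rule: new value = (scale minimum + scale maximum) - old value, which is 8 - old value on a 1-7 scale. A minimal Python sketch, with a hypothetical helper name and made-up answers:

```python
SCALE_MIN, SCALE_MAX = 1, 7

def reverse_code(value):
    """Mirror a rating around the scale midpoint: 1<->7, 2<->6, 3<->5, 4 stays 4."""
    return SCALE_MIN + SCALE_MAX - value

q9_1 = [1, 2, 3, 4, 5, 6, 7]  # hypothetical answers
q9_1_reversed = [reverse_code(v) for v in q9_1]

print(q9_1_reversed)
```

Applying the rule twice returns the original value, which is a quick way to check a recode.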

Step 5: Standardizing Variables / Mean Centering

Mean Centering

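Adjust the scale by subtracting a center value from each response. The lecture example uses the scale midpoint (e.g., 4 for a 7-point scale); mean centering in the strict sense subtracts the variable's sample mean instead.
Helps balance responses around a neutral point.
Process:
- Compute a new variable by subtracting the center value from the original variable.
- Syntax example: COMPUTE Q9_2_centered = Q9_2 - 4. (for a scale of 1-7).
- NewValue = OldValue - Center (the midpoint in this example, or the mean for true mean centering)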
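Subtracting the scale midpoint and subtracting the sample mean give different results, so it helps to see both side by side. This Python sketch uses invented ratings and is only an illustration of the arithmetic.

```python
from statistics import mean

q9_2 = [2, 4, 5, 7, 7]  # hypothetical 1-7 ratings

# Centering on the scale midpoint, as in the syntax example above.
MIDPOINT = 4
midpoint_centered = [v - MIDPOINT for v in q9_2]

# Centering on the sample mean (mean centering in the strict sense).
sample_mean = mean(q9_2)
mean_centered = [v - sample_mean for v in q9_2]

print(midpoint_centered, mean_centered)
```

Only the mean-centered values are guaranteed to sum to zero; midpoint-centered values show how far each answer is from the neutral scale point.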

Standardizing (Z-Scores)

Transform variables to a standard scale with a mean of 0 and a standard deviation of 1 using the formula: z = \frac{x - \mu}{\sigma}, where x is the observed value, \mu is the mean, and \sigma is the standard deviation.
Enables direct comparison between variables with different scales (e.g., age and income).
If some items use a 1-5 scale and others a 1-7 scale, standardize them before comparing.
The same applies when comparing age with income: income is a large number with a wide range, while age is a smaller number with a tighter range, and standardizing puts both on the same scale.
If a 65-year-old is one standard deviation above the mean age and a person earning 120,000 is one standard deviation above the mean income, the two can be compared directly.
Process:
- Use descriptive statistics to get the mean and the standard deviation.
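The z-score formula can be worked through on a small example. The ages and incomes below are invented so that 65 and 120,000 each land one standard deviation above their means; using the sample standard deviation (n - 1) is an assumption of this sketch.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: subtract the mean, divide by the standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1), an assumption here
    return [(v - m) / s for v in values]

ages = [25, 45, 65]                   # hypothetical
incomes = [40_000, 80_000, 120_000]   # hypothetical

age_z = z_scores(ages)
income_z = z_scores(incomes)

print(age_z, income_z)
```

Both lists come out on the same scale, so the last age and the last income can be compared directly even though the raw numbers differ by orders of magnitude.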
