Stats Notes 1_3.pdf

Three Things to Understand Data

  • Variables have distinct types.

  • One variable is often used to explain another.

  • Data helps answer specific questions.

Cases and Variables

Definition of Cases

  • Cases are individual subjects or entities in a dataset (e.g., participants in a marathon).

Definition of Variables

  • Variables are characteristics measured for each case, usually represented as columns in a data table.

Types of Variables

Explanatory Variables

  • Variables used to explain variation in another variable (the response variable).

Response Variables

  • Variables that are explained or predicted by explanatory variables.

Collecting Data Example

Involvement in Truckee Marathon

  • Information to be collected:

    • Finish Time

    • Event (Marathon/Half-Marathon)

    • Hometown

    • Age

    • Gender

    • Participant's Name

Organizing Data

Data Organization Techniques

  • Data can be organized on index cards or structured in a spreadsheet.

CSV File Example

  • Analyzing TruckeeMarathons2017.csv:

    • Rows represent cases (participants).

    • Columns represent variables.

    • Explore potential questions from the dataset.
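As a rough sketch of the rows-as-cases, columns-as-variables layout, the snippet below reads a tiny table with Python's csv module. The contents of TruckeeMarathons2017.csv are not shown in these notes, so the column names and the three rows here are made up for illustration only.

```python
import csv
import io

# Hypothetical rows; the real TruckeeMarathons2017.csv is not reproduced here.
raw = """name,event,hometown,age,gender,finish_minutes
Runner A,Marathon,Truckee,34,F,251.5
Runner B,Half-Marathon,Reno,41,M,118.0
Runner C,Marathon,Sacramento,29,M,233.2
"""

rows = list(csv.DictReader(io.StringIO(raw)))  # one dict per case (participant)
variables = list(rows[0].keys())               # column names are the variables

print(len(rows), "cases")
print(variables)

# One question the table can answer: what is the mean finish time per event?
by_event = {}
for row in rows:
    by_event.setdefault(row["event"], []).append(float(row["finish_minutes"]))
mean_by_event = {event: sum(times) / len(times) for event, times in by_event.items()}
print(mean_by_event)
```

With a real file, io.StringIO(raw) would be replaced by an opened file handle; the rest of the logic stays the same.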

Types of Variables in Detail

Categorical Variables

  • Divide cases into distinct groups; each case belongs to exactly one category (e.g., gender recorded as M/F).

Quantitative Variables

  • Measure a numerical quantity for each case; arithmetic such as adding or averaging is meaningful.

Ordinal Variables

  • Categorical variables whose categories have a natural order, but the distances between categories are not necessarily equal (e.g., star ratings).

Practical Considerations with Variables

Examples of Variables

  • Categorical but not ordinal: Gender

  • Ordinal: Rating scales (1-5 stars)

Problems to Consider

  • Averaging ordinal data, like Amazon ratings, may not be appropriate due to lack of meaningful distance between categories.
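To make the concern concrete, here is a small sketch with invented star ratings. It compares a frequency table and the median, which rely only on the ordering of categories, with the mean, which treats the gap between 1 and 2 stars as equal to the gap between 4 and 5 stars.

```python
from collections import Counter
from statistics import mean, median_low

# Hypothetical star ratings for one product (1 = worst, 5 = best).
ratings = [5, 5, 5, 1, 1, 5, 4, 1, 5, 5]

counts = Counter(ratings)            # how many cases fall in each category
middle = median_low(ratings)         # uses only the order; always an actual category
average = mean(ratings)              # assumes the stars are equally spaced

print(sorted(counts.items()))
print("median:", middle)
print("mean:", average)
```

In this made-up sample the ratings are split between very high and very low, so the mean lands on a value that almost no reviewer actually chose. The frequency table describes the data more faithfully.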

Case Study Example: Doctor's Office

Variables Collected

  • Insurance Company: Categorical

  • Weight: Quantitative

  • Height: Quantitative

  • Temperature: Quantitative

  • Pain Level: Ordinal

  • Age: Quantitative

Examining Correlations

Sleep and Grades Example

  • Observation: More sleep often correlates with better grades.

Definition of Explanatory and Response Variables

  • Explanatory Variable: Sleep (independent variable)

  • Response Variable: Grades (dependent variable)
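One common way to quantify such a relationship is the Pearson correlation coefficient. The notes do not say which measure is intended, so treat that choice as an assumption. The sleep and GPA values below are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: one pair of values per student (case).
sleep_hours = [5.0, 6.0, 6.5, 7.0, 8.0, 8.5]   # explanatory variable
gpa = [2.4, 2.9, 3.0, 3.2, 3.6, 3.7]           # response variable

r = pearson_r(sleep_hours, gpa)
print(round(r, 3))  # a value near +1 indicates a strong positive association
```

A high r describes association only; it does not show that sleeping more causes better grades.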

Building Towards Analysis

Identifying Variables in Questions

  1. Predict party affiliation from religious affiliation.

  2. Compare depreciation rates between domestic and foreign cars.

  3. Assess the effect of Tylenol on fever.

  4. Determine how heart rate affects systolic blood pressure.

Introduction to Sampling

Importance of Samples

  • Samples provide insights into populations.

Bias in Sampling

  • Bias can occur during sample collection, distorting results.

Random Sampling

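  • Random sampling gives every unit an equal chance of selection, which guards against selection bias; it does not remove other sources of bias such as non-response or poorly worded questions.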

Population Definitions

Population: A Complete Set

  • A population includes all individuals or objects relevant to a study.

Sampling: Choosing a Subset

  • A sample is a subset selected to represent the entire population.

Statistical Inference Understanding

Definition of Statistical Inference

  • It involves using sample data to draw conclusions about a population.
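A minimal sketch of the idea, using an invented population of 10,000 voters in which 42% support a measure: a random sample of 400 is used to estimate the population proportion. In practice the true proportion is unknown; it is fixed here only so the estimate can be checked.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = supports the measure, 0 = does not.
population = [1] * 4200 + [0] * 5800

sample = rng.sample(population, 400)       # random sample without replacement

p_true = sum(population) / len(population)  # population parameter (normally unknown)
p_hat = sum(sample) / len(sample)           # sample statistic used as the estimate

print("population proportion:", p_true)
print("sample estimate:", p_hat)
```

Changing the seed gives a different sample and a slightly different estimate, which is the sample-to-sample variation that inference has to account for.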

Assessing Pros and Cons of Sampling Strategies

Type      Pros                    Cons
Census    Complete Information    Difficulties in collection
Sample    Easier to collect       Validity of population inference may vary

Identifying Bias

Definition of Sampling Bias

  • Occurs when selection methods distort population representation.

Selection Bias

  • Introduced when randomization fails, leading to non-representative samples.
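The sketch below contrasts a non-random selection rule with a random one on an invented population of marathon finish times. Surveying only the first runners across the line is a hypothetical example of a selection method that distorts the picture.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: finish times in minutes for 2,000 runners.
population = [rng.gauss(240, 30) for _ in range(2000)]

# Biased selection: survey only the first 100 runners to cross the line.
first_finishers = sorted(population)[:100]

# Random selection: every runner has the same chance of being picked.
random_sample = rng.sample(population, 100)

print("population mean:     ", round(mean(population), 1))
print("first-finishers mean:", round(mean(first_finishers), 1))
print("random-sample mean:  ", round(mean(random_sample), 1))
```

The first-finishers sample is systematically too fast no matter how large it is, while the random sample misses the population mean only by chance variation.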

Preservation Bias Example

Understanding Survivorship Bias

  • Focusing only on the subjects that made it through a selection process overlooks those that did not, which can skew conclusions.

Examples

  1. Dinosaur size estimates.

  2. Cavemen lifestyle.

  3. Armor placement on planes.
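The plane example can be illustrated with a small simulation. All numbers here are invented: each plane takes one hit in a random section, and the chance of being lost depends on where it was hit. Only planes that return can be inspected.

```python
import random
from collections import Counter

rng = random.Random(1)  # fixed seed so the sketch is repeatable

sections = ["engine", "fuselage", "wings", "tail"]
# Hypothetical chance that a plane is lost when hit in each section.
loss_chance = {"engine": 0.8, "fuselage": 0.1, "wings": 0.1, "tail": 0.2}

all_hits = Counter()        # hits on every plane (not observable in practice)
returned_hits = Counter()   # hits seen on planes that made it back

for _ in range(10_000):
    hit = rng.choice(sections)
    all_hits[hit] += 1
    if rng.random() >= loss_chance[hit]:   # the plane survives and is inspected
        returned_hits[hit] += 1

print("all planes:     ", dict(all_hits))
print("returned planes:", dict(returned_hits))
```

Hits are spread evenly across sections among all planes, yet engine hits look rare among the returning planes. Reading only the survivors would suggest the engine needs the least armor, which is the opposite of what the full data shows.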

Participation and Non-Response Biases

  • Bias occurs if study participants disproportionately possess specific traits affecting outcomes.

  • Examples: Test preparation effects, mail surveys.

Designing Samples and Surveys Effectively

Biased Survey Questions

  • Questions framed to skew responses.

Example Questions

  1. Views on school board funding.

  2. Pet vaccination stance.

Random Sampling Techniques

Definition of Simple Random Sample

  • A simple random sample of size n is chosen so that every group of n units from the population has the same chance of being selected.

Random Selection Problem

  • Strategies for random selection (e.g., drawing slips from a hat).
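Drawing slips from a hat can be mimicked in software. The sketch below uses Python's random.sample, which draws without replacement; the roster of 30 students is a made-up population.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: a class roster of 30 students.
roster = [f"student_{i:02d}" for i in range(1, 31)]

n = 5
srs = rng.sample(roster, n)  # every group of 5 students is equally likely

print(srs)
```

Because the draw is without replacement, no student can appear twice, just as a slip cannot be drawn from the hat twice.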

Understanding Experimental Designs

Experimental vs. Observational Studies

  • Controlled Experiments: Researcher actively assigns the value of the explanatory variable (the treatment).

  • Observational Studies: Researcher observes without manipulation.

Association vs. Causation

Definitions

  • Association: The values of one variable tend to be related to the values of another.

  • Causation: Changing the value of one variable changes the value of another.

Investigating Confounding Variables

Definition

  • A confounding variable is a third variable associated with both the explanatory and response variables, which can create or distort an apparent relationship between them.
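A simulated sketch of confounding, using invented data: a third variable z drives both x and y, while x has no effect on y at all. The Pearson correlation is used as the measure of association, which is an assumption rather than something the notes specify.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

z = [rng.gauss(0, 1) for _ in range(5000)]   # confounding variable
x = [zi + rng.gauss(0, 1) for zi in z]       # explanatory: depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]       # response: depends on z only, not on x

r_all = pearson_r(x, y)

# Hold the confounder roughly fixed and look at x and y again.
stratum = [(xi, yi) for xi, yi, zi in zip(x, y, z) if abs(zi) < 0.1]
r_stratum = pearson_r([p[0] for p in stratum], [p[1] for p in stratum])

print("overall correlation:      ", round(r_all, 2))
print("correlation with z fixed: ", round(r_stratum, 2))
```

Overall, x and y look clearly associated, but the association largely disappears among cases with similar z. That pattern is what a confounder produces.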

Designing Experiments Responsibly

Example Questions

  1. Designing sleep impact experiments while considering ethics.

Placebos and Blinding in Experiments

Definitions

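  • Placebo Effect: A benefit that arises because participants believe they are receiving an effective treatment, even when the treatment is inert.

  • Blinding: Keeping participants (single-blind) or both participants and researchers (double-blind) unaware of treatment allocation.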

Types of Randomized Experiments

Randomized Comparative Experiment

  • Random assignment to treatment groups for comparison.
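Random assignment can be sketched as a shuffle followed by a split. The 20 participant IDs and the two-group design below are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical participants in the experiment.
participants = [f"P{i:02d}" for i in range(1, 21)]

shuffled = participants[:]   # copy so the original roster is left untouched
rng.shuffle(shuffled)        # chance alone decides the order

half = len(shuffled) // 2
treatment = shuffled[:half]  # first half receives the treatment
control = shuffled[half:]    # second half serves as the comparison group

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because chance decides group membership, confounding variables tend to balance out between the groups instead of lining up with the treatment.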

Matched Pairs Experiment

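  • Each case receives both treatments (in random order or placement), or cases are paired by similarity and one member of each pair gets each treatment; the differences within pairs are analyzed.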

Example Problem

  • Design experiments for comparing poison ivy lotions.
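One possible matched pairs design for the lotion problem, sketched with invented numbers: each person applies lotion A to one arm and lotion B to the other, with the arm chosen at random, and the analysis looks at the difference within each person. The participant labels and healing times are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(3)  # fixed seed so the sketch is repeatable

people = ["P1", "P2", "P3", "P4", "P5", "P6"]

# Randomize which arm gets lotion A; the other arm gets lotion B.
arm_for_a = {person: rng.choice(["left", "right"]) for person in people}

# Hypothetical days until the rash clears, listed in the same order as people.
days_a = [6, 7, 5, 8, 6, 7]
days_b = [8, 7, 6, 9, 8, 8]

# Analyze the within-person differences (B minus A).
diffs = [b - a for a, b in zip(days_a, days_b)]
mean_diff = mean(diffs)

print("arm assignment:", arm_for_a)
print("differences (B - A):", diffs)
print("mean difference:", round(mean_diff, 2))
```

Comparing each person with themselves removes person-to-person variation such as skin sensitivity. A randomized comparative design would instead assign each person to a single lotion and compare the two groups.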