PUB561 Module 1, Part 2

Basic Terminology

  • Variable: Characteristic of interest that varies across individuals (e.g., gender, disease status, BMI, age).

  • Data: Values measured for a variable. A single value for one participant is called a datum.

Types of Variables

  • Independent Variable:

    • Also known as explanatory variable, exposure, or predictor.

    • Potentially influences the dependent variable (e.g., age may influence BMI).

  • Dependent Variable:

    • Also called outcome, endpoint, or response variable.

    • Potentially influenced by independent variables (e.g., BMI may depend on factors such as age).

Relationships

  • A relationship is an association between two or more variables (e.g., evidence of a relationship between age and BMI).

    • Older individuals tend to have higher BMI.

Forms of Data

  • Continuous Data:

    • Represent numerical amounts measured with high precision.

    • Examples: Height (cm), Weight (kg), Time (seconds), Blood Pressure (mmHg).

  • Categorical Data:

    • Organized into categories rather than measured as numerical amounts.

    • Limited, finite number of categories (e.g., gender).

Scales of Measurement in Categorical Data

  • Dichotomous/Binary Data:

    • Only two possible values (e.g., male/female, alive/dead).

  • Nominal Data:

    • More than two groups, with no inherent order (e.g., eye color, blood type).

  • Ordinal Data:

    • More than two groups with a natural ordering (e.g., a satisfaction survey item rated strongly disagree, disagree, agree, strongly agree).

  • Interval Data:

    • Ordered groups separated by equal, meaningful intervals; rarely used in practice.

    • Example: Groups labeled zero to nine, where each group represents a number.
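
The four scales above can be told apart by asking three questions: how many categories are there, do they have a natural order, and are they equally spaced? The Python sketch below encodes that decision rule. The function name categorical_scale and its parameters are hypothetical, introduced only for illustration; in practice the order and spacing of the categories are judgments the analyst makes from the meaning of the variable.

```python
def categorical_scale(categories, ordered=False, equal_spacing=False):
    """Name the measurement scale of a categorical variable.

    categories    : the possible values of the variable
    ordered       : True if the categories have a natural ordering
    equal_spacing : True if adjacent categories are separated by equal,
                    meaningful intervals (only relevant when ordered)
    """
    distinct = list(dict.fromkeys(categories))  # drop duplicates, keep order
    if len(distinct) < 2:
        raise ValueError("a variable needs at least two possible values")
    if len(distinct) == 2:
        return "binary"
    if ordered and equal_spacing:
        return "interval"
    if ordered:
        return "ordinal"
    return "nominal"


print(categorical_scale(["alive", "dead"]))                    # binary
print(categorical_scale(["A", "B", "AB", "O"]))                # nominal
print(categorical_scale(["strongly agree", "agree", "disagree"],
                        ordered=True))                         # ordinal
print(categorical_scale(range(10), ordered=True,
                        equal_spacing=True))                   # interval
```

Following the notes, the sketch treats a two-category variable as binary regardless of order.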

Data Reduction

  • Collect data in its most precise form before categorizing for analysis.

  • Example: Collect age in years, then group into categories (20s, 30s, etc.).

    • Easier to collapse than to expand data later in analysis.
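
As a minimal sketch of the age example above, the Python snippet below collapses exact ages into decade groups. The function name age_to_decade and the list of ages are hypothetical, made up for illustration only. Note that the step is one-way: the label "20s" cannot be turned back into 23 or 29, which is why the precise value should be collected first.

```python
def age_to_decade(age_years):
    """Collapse an exact age in years into a decade label such as '20s'."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    decade = (age_years // 10) * 10  # integer division drops the last digit
    return f"{decade}s"


# Hypothetical ages, recorded in their most precise form (whole years).
ages = [23, 37, 31, 45, 29]

# Categorize only at analysis time; the original ages remain available.
groups = [age_to_decade(a) for a in ages]
print(groups)  # ['20s', '30s', '30s', '40s', '20s']
```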

Key Takeaways

  • Variables can be independent (predictors) or dependent (responses).

  • Data can be continuous or categorical.

  • Categorical data can be further divided into binary, nominal, ordinal, and interval scales.

  • Aim to collect data at the highest level of precision before categorization.

Practice Exercise Examples

  • Example 1:

    • Statement: "The time from diagnosis to death was 24.6 months."

    • Variable: Time from diagnosis to death.

    • Value: 24.6 months.

    • Type: Continuous data (time is a measured numerical amount; the decimal value reflects the precision of the measurement).

  • Example 2:

    • Statement: "The survival of the patient was noted 24 months after diagnosis."

    • Variable: Patient survival (alive/dead).

    • Type: Binary data.
