Variables and Data Types Notes

Definitions:

  • Variable: A characteristic or attribute that can take on different values.
    • Examples: Age, weight, height, blood pressure, number of patients.
  • Random Variable: A variable whose values are determined by chance.
  • Data: The values (measurements or observations) that variables assume. A collection of data values is called a data set.

Types of Variables:

  1. Categorical/Qualitative Variables: Classify observations into distinct groups; their values are labels or categories rather than numerical measurements.

    • Binary/Dichotomous Variables: Exactly two categories (e.g., gender, pregnancy status).
    • Nominal Variables: More than two categories without natural order (e.g., blood type, eye color).
    • Ordinal Variables: Categories with a natural order (e.g., cancer stages, socio-economic status).
  2. Quantitative/Numerical Variables: Measurable or countable.

    • Discrete Variables: Countable values, typically whole numbers (e.g., number of visits).
    • Continuous Variables: Can take any value in a range (e.g., height, temperature).
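
A minimal Python sketch of how the variable types above limit which summaries make sense. All observations are made up for illustration, and the names used (STAGE_RANK, median_stage, and so on) are hypothetical, not part of any standard method.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations, invented for illustration only.
blood_types = ["A", "O", "B", "O", "AB", "O"]    # nominal
cancer_stages = ["II", "I", "IV", "II", "III"]   # ordinal
visit_counts = [2, 0, 5, 1, 2]                   # discrete
heights_cm = [162.5, 175.0, 168.3]               # continuous

# Nominal: categories can be counted, but not ordered or averaged.
blood_type_counts = Counter(blood_types)
most_common_type = blood_type_counts.most_common(1)[0][0]

# Ordinal: an explicit rank allows sorting and picking a middle category,
# but the gaps between ranks are not assumed to be equal.
STAGE_RANK = {"I": 1, "II": 2, "III": 3, "IV": 4}
sorted_stages = sorted(cancer_stages, key=STAGE_RANK.__getitem__)
median_stage = sorted_stages[len(sorted_stages) // 2]

# Quantitative (discrete and continuous): arithmetic such as a mean is meaningful.
mean_visits = mean(visit_counts)
mean_height_cm = mean(heights_cm)

print(most_common_type, median_stage, mean_visits, round(mean_height_cm, 1))
```

The point of the sketch is the choice of summary: a mode for nominal data, a median category for ordinal data, and a mean for quantitative data.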

Relationships Between Variables:

  • Response Variable (Dependent/Outcome Variable): The outcome that is measured; it may be affected by other variables.
  • Explanatory Variable (Independent/Predictor Variable): Used to explain or predict changes in the response variable.

Scales of Measurement:

  • Nominal Scale: Classifies data into non-overlapping categories; no order (e.g., gender, blood type).
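  • Ordinal Scale: Categories can be ranked, but the differences between ranks are not precisely defined (e.g., pain level).
  • Interval Scale: Values can be ranked and differences between values are meaningful, but there is no true zero, so ratios are not meaningful (e.g., temperature in Celsius).
  • Ratio Scale: Has all the properties of the interval scale plus a true zero point, so ratios are meaningful (e.g., weight).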

Examples of Measurement Scales:

  • Nominal: Zip code, gender, race, political affiliation.
  • Ordinal: Letter grades, judging ranks, socioeconomic status.
  • Interval: Temperature (Celsius or Fahrenheit), calendar time, IQ scores.
  • Ratio: Weight, height, salary, age.
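
A short Python sketch of why ratios are meaningful on a ratio scale but not on an interval scale. The temperatures and weights are made-up values, and the helper name celsius_to_kelvin is introduced here for illustration. Kelvin is not mentioned in the notes; it is used as an example of a temperature scale with a true zero.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius (interval scale) to Kelvin (ratio scale, true zero)."""
    return celsius + 273.15


warm_c, cool_c = 20.0, 10.0  # hypothetical readings in degrees Celsius

# Interval scale: differences are meaningful.
difference_c = warm_c - cool_c  # 10 degrees

# Interval scale: the ratio depends on an arbitrary zero point.
naive_ratio = warm_c / cool_c  # 2.0, yet 20 C is not "twice as hot" as 10 C

# Ratio scale: the same temperatures in Kelvin give a physically meaningful ratio.
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Ratio scale: a weight ratio is unchanged by a change of units.
KG_PER_LB = 0.45359237
weight_ratio_kg = 80.0 / 40.0
weight_ratio_lb = (80.0 / KG_PER_LB) / (40.0 / KG_PER_LB)

print(difference_c, naive_ratio, round(kelvin_ratio, 4))
print(weight_ratio_kg, round(weight_ratio_lb, 4))
```

The Celsius ratio (2.0) and the Kelvin ratio (about 1.035) disagree because the Celsius zero is arbitrary, while the weight ratio is 2 in either unit because zero weight means no weight.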

Types of Data:

  • Categorical Data: Described by categories or labels rather than numerical measurements.
    • Nominal Data: Blood type, gender.
    • Ordinal Data: Cancer stages, grading.
  • Quantitative Data: Expressed as numerical counts or measurements.
    • Discrete Data: Countable whole numbers, such as the number of patients.
    • Continuous Data: Measured quantities that can take any value in a range, such as height or weight.

Example - Classification:

  • For a data set with the variables Patient ID, Age, Blood Pressure, BMI, and Diabetes Status:
    • Identifier: Patient ID (labels each record; it is not a measurement).
    • Categorical Variables: Diabetes Status (Binary: Yes/No).
    • Quantitative Variables: Age (Continuous), Blood Pressure (Continuous), BMI (Continuous).
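
The classification above can be sketched in Python as a simple lookup table. This is only an illustration: the record values are invented, the names VARIABLE_TYPES and group_by_type are hypothetical, and treating Patient ID as an identifier rather than a variable to analyze is an assumption of this sketch.

```python
from collections import defaultdict

# Variable types as classified in the notes. Treating Patient ID as an
# identifier (a label, not a measurement) is an assumption of this sketch.
VARIABLE_TYPES = {
    "Patient ID": "identifier",
    "Age": "quantitative (continuous)",
    "Blood Pressure": "quantitative (continuous)",
    "BMI": "quantitative (continuous)",
    "Diabetes Status": "categorical (binary)",
}


def group_by_type(variable_types):
    """Group variable names under their type label, preserving input order."""
    groups = defaultdict(list)
    for name, kind in variable_types.items():
        groups[kind].append(name)
    return dict(groups)


groups = group_by_type(VARIABLE_TYPES)

# One hypothetical patient record, invented for illustration.
record = {
    "Patient ID": "P001",
    "Age": 54,
    "Blood Pressure": 128.0,
    "BMI": 27.4,
    "Diabetes Status": "Yes",
}

# Keep only the values for which arithmetic summaries make sense.
quantitative_values = {
    name: value
    for name, value in record.items()
    if VARIABLE_TYPES[name].startswith("quantitative")
}

for kind, names in groups.items():
    print(f"{kind}: {', '.join(names)}")
```

Recording the type explicitly, rather than inferring it from how a value is stored, avoids mistakes such as averaging ID numbers or treating a Yes/No field as a measurement.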