Variables and Data Types Notes
Definitions:
- Variable: A characteristic or attribute that can take on different values.
- Examples: Age, weight, height, blood pressure, number of patients.
- Random Variables: Variables whose values are determined by chance.
- Data: The values that a variable can assume; a collection of data values is called a data set.
Types of Variables:
Categorical/Qualitative Variables: Classify observations into distinct groups; they describe a quality or label and are not measured on a numerical scale.
- Binary/Dichotomous Variables: Exactly two categories (e.g., gender, pregnancy status).
- Nominal Variables: More than two categories without natural order (e.g., blood type, eye color).
- Ordinal Variables: Categories with a natural order (e.g., cancer stages, socio-economic status).
Quantitative/Numerical Variables: Measurable or countable.
- Discrete Variables: Countable values, usually whole numbers (e.g., number of visits).
- Continuous Variables: Can take any value in a range (e.g., height, temperature).
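The classification above can be sketched as a small heuristic. This is an illustration only, not a standard routine: the function name `guess_variable_type` and the `ordered` flag are hypothetical, and the flag is needed because a natural order among categories cannot be inferred from the labels themselves.

```python
def guess_variable_type(values, ordered=False):
    """Guess the variable type from observed values (heuristic sketch).

    `ordered` is supplied by the analyst: it states whether the
    categories have a natural order, which the data alone cannot show.
    """
    values = list(values)
    # Treat ints and floats as numeric; booleans count as labels here.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if not numeric:
        # Categorical: two categories -> binary, otherwise nominal/ordinal.
        if len(set(values)) == 2:
            return "binary"
        return "ordinal" if ordered else "nominal"
    # Quantitative: all whole numbers -> discrete, otherwise continuous.
    if all(isinstance(v, int) for v in values):
        return "discrete"
    return "continuous"


print(guess_variable_type(["A", "B", "AB", "O"]))               # nominal
print(guess_variable_type(["I", "II", "III"], ordered=True))   # ordinal
print(guess_variable_type([2, 0, 5]))                           # discrete
print(guess_variable_type([170.2, 165.0]))                      # continuous
```

Caution: numeric codes such as zip codes would be reported as discrete by this heuristic even though they are nominal labels, so the result should always be checked against what the variable means.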
Relationships Between Variables:
- Response Variable (Dependent/Outcome Variable): The outcome that is measured; its value may be affected by other variables.
- Explanatory Variable (Independent/Predictor Variable): Used to explain or predict changes in the response variable.
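One common way to use an explanatory variable to predict a response variable is a least-squares line. The sketch below is only an illustration: the helper name `least_squares_line` is hypothetical, and the age and blood pressure values are made up for the example, not real measurements.

```python
from statistics import mean


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up illustrative values (not real data).
age = [30, 40, 50, 60]              # explanatory variable
systolic_bp = [118, 124, 130, 136]  # response variable

slope, intercept = least_squares_line(age, systolic_bp)
predicted_bp_at_45 = intercept + slope * 45
```

Here age plays the explanatory role and blood pressure the response role; swapping the two lists would answer a different question.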
Scales of Measurement:
- Nominal Scale: Classifies data into non-overlapping categories; no order (e.g., gender, blood type).
- Ordinal Scale: Categories can be ranked, but the differences between ranks are not precisely defined (e.g., pain level).
- Interval Scale: Ranked, with equal and meaningful differences between values; no true zero, so ratios are not meaningful (e.g., temperature in Celsius).
- Ratio Scale: Has all the properties of the interval scale plus a true zero point, so ratios are meaningful (e.g., weight).
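The four scales form a hierarchy: each one allows every comparison of the scale before it plus one more. A minimal sketch of that idea follows; the names `Scale`, `MINIMUM_SCALE`, and `supports` are hypothetical and chosen only for this illustration.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Measurement scales, ordered from least to most informative."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Lowest scale at which each kind of comparison is meaningful.
MINIMUM_SCALE = {
    "equality": Scale.NOMINAL,     # same category or not
    "ranking": Scale.ORDINAL,      # greater than / less than
    "difference": Scale.INTERVAL,  # how many units apart
    "ratio": Scale.RATIO,          # how many times as large
}


def supports(scale, comparison):
    """Return True if `comparison` is meaningful for data on `scale`."""
    return scale >= MINIMUM_SCALE[comparison]


# Temperature in Celsius is interval: differences are fine, ratios are not.
print(supports(Scale.INTERVAL, "difference"))  # True
print(supports(Scale.INTERVAL, "ratio"))       # False
```

For example, 20 degrees Celsius is 10 degrees warmer than 10 degrees Celsius, but it is not "twice as hot", because 0 degrees Celsius does not mean the absence of temperature.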
Examples of Measurement Scales:
- Nominal: Zip code, gender, race, political affiliation.
- Ordinal: Letter grade, judging ranks, socioeconomic status.
- Interval: Temperature (Celsius or Fahrenheit), calendar time or time of day, IQ scores.
- Ratio: Weight, height, salary, age.
Types of Data:
- Categorical Data: Labels or categories rather than numeric measurements; either nominal or ordinal.
- Nominal Data: Blood type, gender.
- Ordinal Data: Cancer stages, grading.
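- Quantitative Data: Numeric values obtained by counting or measuring; either discrete or continuous.
- Discrete Data: Counts in whole numbers, such as the number of patients.
- Continuous Data: Measurements that can take any value within a range, such as height or weight.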
Example - Classification:
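- For a data set with the variables Patient ID, Age, Blood Pressure, BMI, and Diabetes Status:
- Categorical Variables: Diabetes Status (Yes/No; binary). Patient ID is an identifying label, so it is nominal even when written as a number.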
- Quantitative Variables: Age (Continuous), Blood Pressure (Continuous), BMI (Continuous).
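The same classification can be written down as a lookup table, which makes it easy to pull out all variables of one kind. This is a minimal sketch: the names `variable_types` and `group_by_kind` are hypothetical, and treating Patient ID as a nominal identifier is an assumption made for the illustration.

```python
# Variable -> (kind, subtype), following the example classification.
# Assumption: Patient ID is treated as a nominal identifier.
variable_types = {
    "Patient ID": ("categorical", "nominal"),
    "Age": ("quantitative", "continuous"),
    "Blood Pressure": ("quantitative", "continuous"),
    "BMI": ("quantitative", "continuous"),
    "Diabetes Status": ("categorical", "binary"),
}


def group_by_kind(types):
    """Group variable names by their kind (categorical or quantitative)."""
    groups = {}
    for name, (kind, _subtype) in types.items():
        groups.setdefault(kind, []).append(name)
    return groups


groups = group_by_kind(variable_types)
print(groups["categorical"])   # ['Patient ID', 'Diabetes Status']
print(groups["quantitative"])  # ['Age', 'Blood Pressure', 'BMI']
```

Note that Age is continuous in principle even though it is often recorded in whole years.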