Distinguishing between variables and data
Overview
This material distinguishes between individuals, variables, and data using a Seattle on-street parking meters dataset.
Data come from the City of Seattle data portal.
The table shows 11 randomly selected cars (the individuals) and several attributes (the variables) collected for each car.
Individuals, Variables, and Data
Individuals: the 11 different cars highlighted in blue in the table; each row corresponds to one individual car.
Variables: the column headers in the table represent the variables:
Payment method
Amount paid
Duration in minutes
Side of street
Parking space number
Data: the actual values recorded under the column headings for each variable, i.e., the values in each row for those five variables.
Example row (first individual, as described)
Payment method: credit card
Amount paid: 3.75 dollars
Duration in minutes: 30
Side of street: west
Parking space number: 458
The first individual is one of the 11 cars highlighted in blue; all other rows follow similarly for the other cars.
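As a rough illustration of the row/column/value distinction, the first individual can be written as a Python dictionary. This is only a sketch: the key names (such as payment_method) are illustrative choices, not names taken from the Seattle data portal, and the values are the ones quoted in the example row above.

```python
# One individual (one row of the table): keys are the variables,
# values are the data recorded for that car.
# Key names are illustrative; they are not the portal's column names.
first_car = {
    "payment_method": "credit card",
    "amount_paid": 3.75,             # dollars
    "duration_minutes": 30,
    "side_of_street": "west",
    "parking_space_number": "458",   # kept as text: a label, not a quantity
}

variables = list(first_car.keys())   # the column headers
data = list(first_car.values())      # the cell values for this individual

print(variables)
print(data)
```

A full table would then be a list of 11 such dictionaries, one per car.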
Variables: what they are and how they’re categorized
The five variables are the column headers listed above.
These variables are the measurable aspects recorded for each individual car.
Qualitative vs Quantitative Variables
Qualitative (categorical) variables:
Payment method
Side of street
Parking space number
Characteristics:
They are categories or labels (not numeric measurements).
Side of street consists of categories like north, south, east, west, or combinations of those.
Parking space number is a category representing location, not a numeric quantity used in arithmetic.
Quantitative variables:
Amount paid
Duration in minutes
Characteristics:
They are numeric measurements.
Units: Amount paid is in dollars and cents; Duration is in minutes.
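The classification above can be recorded in a small lookup table. The sketch below is one possible way to do it; the dictionary name VARIABLE_TYPES and the helper variables_of_kind are hypothetical names introduced for illustration only.

```python
# Variable types as classified in these notes.
# VARIABLE_TYPES and variables_of_kind are hypothetical, illustrative names.
VARIABLE_TYPES = {
    "payment_method": "qualitative",
    "side_of_street": "qualitative",
    "parking_space_number": "qualitative",
    "amount_paid": "quantitative",
    "duration_minutes": "quantitative",
}

def variables_of_kind(kind):
    """Return the variable names whose recorded type matches `kind`."""
    return [name for name, k in VARIABLE_TYPES.items() if k == kind]

qualitative = variables_of_kind("qualitative")
quantitative = variables_of_kind("quantitative")

# A space number looks numeric, but arithmetic on it is meaningless,
# so storing it as text keeps it from being averaged by accident.
space_number = "458"

print(qualitative)
print(quantitative)
```

Storing the parking space number as text is a design choice that makes its role as a label explicit.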
Quantitative variables: continuous vs discrete
Duration in minutes (continuous):
Time is treated as a continuous variable, even though the data are reported to the nearest minute.
Conceptually, time can take on any value in an interval, not just integer minutes.
Amount paid (discrete in theory, continuous in practice):
Amount paid is recorded in cents, so strictly it is a discrete variable.
There are gaps between possible values: amounts move in steps of one cent, so no payment can fall between, say, 3.75 and 3.76 dollars.
In practice, dollar amounts rounded to the nearest penny are often treated as continuous for analysis.
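The discreteness of amount paid can be seen by expressing it in whole cents. The sketch below assumes the amount arrives as text such as "3.75"; the helper to_cents is a hypothetical name, and Decimal is used so that the conversion avoids floating-point rounding.

```python
from decimal import Decimal

def to_cents(dollars: str) -> int:
    """Convert a dollar amount written as text (e.g. "3.75") to whole cents."""
    return int(Decimal(dollars) * 100)

amount_cents = to_cents("3.75")
next_possible = amount_cents + 1   # the next payable amount is one cent more

print(amount_cents, next_possible)  # 375 376
```

Because the count of cents is an integer, nothing lies between 375 and 376, which is the gap that makes the variable discrete. Duration has no such smallest step in principle, even though it is reported to the nearest minute.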
Data values and their organization
Data are the values recorded under the variable headings for each row (each individual car).
All five variables have data values: payment method, side of street, and parking space number hold qualitative data, while amount paid and duration in minutes hold quantitative data.
Notes on units and interpretation
Amount paid: units are dollars (and cents). Example value: 3.75 dollars.
Duration: units are minutes. Example value: 30 minutes.
Spatial/categorical identifiers:
Side of street: qualitative category (north, south, east, west, or combinations).
Parking space number: qualitative identifier indicating location.
Relationships and potential analyses (conceptual, based on the data type)
Since duration is continuous and amount paid is treated as continuous in practice, analyses like averages, ranges, and distributions can be computed for these two variables.
For qualitative variables, analyses focus on frequencies, proportions, and cross-tabulations (e.g., distribution of payment methods by side of street).
When comparing groups (e.g., average duration by side of street), match the method to the variable types: here the grouping variable (side of street) is categorical and the outcome (duration) is quantitative.
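The kinds of summaries described above can be sketched with the Python standard library. Only the first row below comes from the example in these notes; the other two rows are made-up placeholders so the summaries have something to work on, and they are not values from the Seattle dataset. The key names are also illustrative.

```python
from collections import Counter
from statistics import mean

# Row 1 is the example row from the notes.
# Rows 2 and 3 are HYPOTHETICAL placeholders, not real data.
cars = [
    {"payment_method": "credit card", "amount_paid": 3.75,
     "duration_minutes": 30, "side_of_street": "west"},
    {"payment_method": "coins", "amount_paid": 2.00,
     "duration_minutes": 60, "side_of_street": "east"},   # hypothetical
    {"payment_method": "credit card", "amount_paid": 1.50,
     "duration_minutes": 45, "side_of_street": "west"},   # hypothetical
]

# Quantitative variable: average and range.
durations = [car["duration_minutes"] for car in cars]
mean_duration = mean(durations)
duration_range = max(durations) - min(durations)

# Qualitative variables: frequencies and a cross-tabulation.
payment_counts = Counter(car["payment_method"] for car in cars)
crosstab = Counter(
    (car["side_of_street"], car["payment_method"]) for car in cars
)

# Group comparison: quantitative outcome split by a categorical variable.
durations_by_side = {}
for car in cars:
    durations_by_side.setdefault(car["side_of_street"], []).append(
        car["duration_minutes"]
    )
mean_by_side = {side: mean(vals) for side, vals in durations_by_side.items()}

print(mean_duration, duration_range)
print(payment_counts)
print(crosstab)
print(mean_by_side)
```

Averages are taken only over the quantitative variable, while the qualitative variables are only counted or used to form groups.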
Source and context
Data originate from the City of Seattle data portal and pertain to on-street parking meters.
The example emphasizes understanding the basic data-science distinction between individuals (rows), variables (columns), and data (cell values).
Key takeaways
Individuals correspond to rows (each car in the sample).
Variables correspond to columns (the attributes measured).
Data are the actual measurements/values in the table for each variable and each individual.
Qualitative variables are categories; quantitative variables are numerical measurements.
Among quantitative variables, duration is continuous; amount paid is discrete in theory but often treated as continuous in practice when using monetary values with pennies.