Section 1.2 - Lecture Notes (Intro to Data, Variables, Graphs, and Descriptive Statistics)
Welcome and class routine: start each class with an open Q&A; a time to raise homework questions for the whole group; share campus news if relevant.
Walker Hall logistics: location of math department, grad student offices (LC on 2nd floor, Eddie on 3rd floor); office hours start Monday; you can seek help from them there.
Syllabus announcements:
- Reading the syllabus was assigned; key points summarized here.
- Daily Wiley assignments: small in size, due next Monday (this week’s deadlines). Each problem can be attempted four times.
- Checkpoints: larger, on paper; due on Wednesdays.
- Tests: three tests (in person) + a finance project + final exam (in person).
- Final exam date: Monday, December 8, starts at 8:00 AM and ends by 10:30 AM (for all sections 8–10). Not a trick exam; focus on understanding the material.
- Final project: finance-related (salaries, cars, houses, loans) in lieu of a finance test.
Support and wellbeing:
- Everyone should seek support when needed (math, health, mental health, housing, food, etc.).
- GTA office hours will be solidified next week; instructor hours are after class Mon–Thu in Ann Belk; Tuesday/Thursday afternoons have a 30-minute window in the College of Education.
- General math tutoring lab funded by the college; free tutoring available.
- AppState support network: AppFairs (mental/physical health, wellbeing, housing/food), Disability Services (ODR), academic supports, tutoring, life coaching, and the Office of Student Success.
- Policies mentioned in syllabus: integrity, attendance/engagement, mandatory reporting, emergency preparedness, and other college policies.
Important policies:
- Integrity: work should be your own; adhere to academic honesty.
- Attendance and engagement: you are paying for the class; participate actively.
- Mandatory reporting: instructors/GTA staff are mandatory reporters; disclosures about self-harm, harm to others, violence, or abuse must be reported.
- Emergency preparedness: stay informed via university alerts; be aware of weather-related cancellations and procedures.
Course focus and pacing:
- Plan to move at a steady pace; foundational topics first to build confidence.
- Tools: calculators and Desmos (free calculator) will be used for computations and multiple representations.
Core concepts introduced today:
- Data and context: data are numbers or labels with context; context is crucial to interpretation.
- Raw data vs summarized data: raw data show individual observations; summarized data present a distribution or summary (e.g., tables, graphs).
- Variables can be categorical or quantitative; examples:
  - Categorical: class year (Freshman, Sophomore, Junior, Senior), a category.
  - Quantitative: GPA (grade point average), a measurement.
  - Quantitative: pulse rate, a numerical measurement.
  - Categorical: area code, which looks numeric but is a location-based label, not a measurement.
  - Categorical (ordinal): reported homework times, when they are summarized as ranges.
- Ordinal variables: categories with a meaningful order (e.g., Likert scale: strongly agree, agree, neutral, disagree, strongly disagree).
- Context matters: scope (e.g., Appalachian State students vs. broader populations) affects interpretation.
Graphs and data visualization:
- Good graphs have: a clear title, labeled axes with units, and a data source.
- Data bias and data sources: know where data come from and who collected them; potential biases may influence conclusions.
- Graph types:
  - Bar charts: for categorical data; bars do not touch (categories are discrete).
  - Histograms: for quantitative data; bars touch (continuous data; data binned into numeric ranges).
  - Side-by-side bar charts: compare groups (e.g., late excuses by gender); they need careful labeling and attention to denominators (total N for each group).
- Interpretation practice: identify what the graph represents, critique improvements, and determine what information is missing (e.g., year, total sample, data source).
- Special note on histograms vs. bar charts: a histogram needs a rule for values that fall exactly on a bin edge (left-inclusive vs. right-inclusive bins). This is a modeling choice and should be guided by context.
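The bin-edge choice noted above can be sketched in a few lines of Python. This is only an illustration: the function name `bin_counts`, the pulse-rate values, and the bin edges are all invented here and do not come from the lecture.

```python
def bin_counts(values, edges, closed="left"):
    """Count how many values fall in each bin between consecutive edges.

    closed="left"  -> bins are [a, b): a value on an edge goes to the bin on its right.
    closed="right" -> bins are (a, b]: a value on an edge goes to the bin on its left.
    Values that fall outside every bin are ignored in this sketch.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            lo, hi = edges[i], edges[i + 1]
            if closed == "left":
                inside = lo <= v < hi
            else:
                inside = lo < v <= hi
            if inside:
                counts[i] += 1
                break
    return counts


# Hypothetical pulse rates (beats per minute), invented for illustration.
pulse = [58, 60, 64, 70, 70, 75, 80, 80, 88]
edges = [50, 60, 70, 80, 90]

left_counts = bin_counts(pulse, edges, closed="left")
right_counts = bin_counts(pulse, edges, closed="right")
print("left-inclusive: ", left_counts)   # [1, 2, 3, 3]
print("right-inclusive:", right_counts)  # [2, 3, 3, 1]
```

The same nine values produce two different histograms because five of them sit exactly on an edge, which is why the convention should be stated alongside the graph.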
Descriptive statistics vocabulary:
- Frequency (f): the count of occurrences in a category.
- Relative frequency (rf): the proportion of occurrences, rf = f / N, where N is the total number of observations.
- Percentage (p): rf × 100 = (f / N) × 100.
- Fractions, decimals, and percentages are interchangeable representations of the same quantity; conversion is key.
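As a minimal sketch of the vocabulary above, the following Python turns a set of counts into frequency, relative frequency, and percentage. The function name `frequency_table` and the class-year counts are hypothetical, chosen only so the total is N = 24; rounding rf to four decimal places follows the convention used in these notes.

```python
def frequency_table(counts):
    """Return N and, per category, (f, rf rounded to 4 places, p rounded to 2 places)."""
    n = sum(counts.values())  # N: total number of observations
    table = {}
    for category, f in counts.items():
        rf = f / n            # relative frequency: rf = f / N
        p = rf * 100          # percentage: p = rf x 100
        table[category] = (f, round(rf, 4), round(p, 2))
    return n, table


# Hypothetical class-year counts, invented for illustration only.
class_year_counts = {"Freshman": 10, "Sophomore": 6, "Junior": 5, "Senior": 3}

n, table = frequency_table(class_year_counts)
print("N =", n)
for category, (f, rf, p) in table.items():
    print(f"{category:<10} f={f:>2}  rf={rf:.4f}  p={p:.2f}%")
```

A quick sanity check on any such table: the frequencies should add up to N, and the relative frequencies should add up to 1 (apart from rounding).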
Practical calculation overview (with a fake data example):
- Fake data: 24 people; preference for local coffee shops (e.g., Blue Deer on King Street).
- Given: one-sixth of the people prefer Blue Deer.
- Relative frequency: $rf = \frac{1}{6} \approx 0.1667$ (rounded to four decimal places)
- Percentage: $p = rf \times 100 \approx 16.67\%$
- Count: $f = \frac{1}{6} \times 24 = 4$
- Desmos calculator can be used to compute these values and to reduce fractions when needed.
- If given a percentage or a relative frequency, you can back-calculate the count by multiplying by the total N: e.g., if rf = 0.375 (3/8) and N = 24, then f = rf × N = 0.375 × 24 = 9.
- For a given frequency, convert to a relative frequency. Example: 9 people out of 24 gives f = 9, rf = f / N = 9/24 = 0.375, p = 37.5%.
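The coffee-shop arithmetic above can be checked with Python's standard `fractions` module, which keeps exact fractions and reduces them automatically. This is an illustrative sketch rather than the Desmos workflow used in class; the variable names are made up here.

```python
from fractions import Fraction

N = 24  # total number of people in the notes' example

# One-sixth of the group prefers Blue Deer.
rf_blue_deer = Fraction(1, 6)
f_blue_deer = rf_blue_deer * N              # count: 1/6 x 24 = 4
p_blue_deer = float(rf_blue_deer) * 100     # percentage, about 16.67

# Back-calculating a count from a relative frequency of 0.375.
rf_other = Fraction("0.375")                # stored exactly as 3/8
f_other = rf_other * N                      # count: 3/8 x 24 = 9

# Going the other way: 9 people out of 24.
rf_from_count = Fraction(9, N)              # 9/24 reduces to 3/8

print(f_blue_deer, round(float(rf_blue_deer), 4), round(p_blue_deer, 2))
print(rf_other, f_other)
print(rf_from_count, float(rf_from_count), float(rf_from_count) * 100)
```

Working with `Fraction` first and converting to a decimal only at the end avoids rounding too early.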
Summary of how to handle a data table in class practice:
- You will practice moving between f, rf, and p, using Desmos (or a calculator).
- You will sometimes be given f or rf or p and asked to fill in the rest, using the total N = 24 (in the example).
- Emphasis on exact fractions vs. decimals and rounding conventions (e.g., four decimal places for rf).
Introduction to means and medians (foundational measures of center):
- Median: the exact middle value in an ordered (ascending) list. If there are n data values, the median is the central value; half of the data lie below and half above.
- Mean (arithmetic average): the balancing point of the data; computed as $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, where the sum runs over all n observations.
- Practical method: to find the mean, sum all values and divide by the number of values.
- Rationale for two measures: the mean uses every value, so it reflects the total and can be pulled by extreme values; the median depends only on the middle position, so it is robust to outliers.
- Mode: the value that occurs most frequently.
- Quick reminder: calculations can be done with a calculator; but the instructor also emphasizes writing out the process (sum and count for the mean; ordering for the median).
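The written-out process (sum and count for the mean, ordering for the median) can be sketched in Python as follows. The helper names and the two small datasets are hypothetical, and `median_odd` handles only an odd number of values, since the even case is covered later in the course.

```python
from collections import Counter


def mean(values):
    """Sum all values and divide by how many there are."""
    return sum(values) / len(values)


def median_odd(values):
    """Middle value of the ordered list; this sketch only handles odd n."""
    ordered = sorted(values)
    if len(ordered) % 2 == 0:
        raise ValueError("this sketch only handles an odd number of values")
    return ordered[len(ordered) // 2]


def modes(values):
    """All values that occur most frequently (there may be more than one)."""
    counts = Counter(values)
    highest = max(counts.values())
    return [v for v, c in counts.items() if c == highest]


# Hypothetical data, invented for illustration.
typical = [4, 2, 5, 4, 3]
with_outlier = [4, 2, 50, 4, 3]   # same list with 5 replaced by an extreme value

print(mean(typical), median_odd(typical), modes(typical))
print(mean(with_outlier), median_odd(with_outlier), modes(with_outlier))
```

Replacing one value with an extreme one moves the mean from 3.6 to 12.6, while the median stays at 4, which is the sense in which the median is robust to outliers.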
Simple worked example (mean and median with a small dataset):
- Given data (example): 25, 28, 30, 35, 35, 45, 47, and two more values to make nine data points (the middle value is the 5th when ordered).
- To compute the mean: add all nine values, then divide by 9: $\bar{x} = \frac{\text{sum of nine values}}{9}$
- For the median with nine values, the 5th value in the ordered list is the median.
- Concrete arithmetic example from the lecture shows how to identify the middle term when n is odd; the instructor notes how even-n cases will be handled later.
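The notes list only seven of the nine values, so the sketch below uses just those seven. It is not the lecture's nine-value dataset, and the results apply to this shortened list only. It uses Python's standard `statistics` module and shows the middle-position rule for odd n.

```python
import statistics

# The seven values listed in the notes; the two unlisted values are left out.
listed = [25, 28, 30, 35, 35, 45, 47]

n = len(listed)
total = sum(listed)                   # 245
mean_value = total / n                # 245 / 7 = 35.0

ordered = sorted(listed)
middle_position = (n + 1) // 2        # 4th value when n = 7 (5th when n = 9)
median_value = ordered[middle_position - 1]

print("sum =", total, "n =", n, "mean =", mean_value)
print("middle position =", middle_position, "median =", median_value)
print("mode =", statistics.mode(listed))

# Cross-check the hand method against the library functions.
assert median_value == statistics.median(listed)
assert mean_value == statistics.mean(listed)
```

For these seven values the mean, median, and mode all equal 35. With the full nine values the middle position would be the 5th, as stated above.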
Preview of next topics:
- Tomorrow will cover how to handle even numbers of data values for the median and how to solve problems where a value is missing given a mean.