Comprehensive practice flashcards covering tidy data, bias, visualization, linear regression, probability distributions, hypothesis testing, and common R language statistical functions.
What is Tidy Data? 🤔
A neat structure where each column is a variable (like price or weight) and each row is one observation (like one listing). 🗂️
How do we define Qualitative (Categorical) Variables? 🧐
These are variables whose values are categories rather than meaningful numbers (e.g., room type), and we summarize them using counts or proportions. They’re often shown using cute bar charts! 📊
What are Quantitative (Numerical) Variables? 📈
These are meaningful numbers (like price and age), summarized using the mean, SD, or median, and visualized with histograms or boxplots. Think of it as solid data you can count on! 😄
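A tiny sketch of the three cards above, in Python rather than R and using only the standard library. The listings, room types, and prices are made up for illustration.

```python
from collections import Counter
from statistics import mean, median, pstdev

# Hypothetical tidy data: each dict is one observation (a listing),
# each key is one variable. All values are made up.
listings = [
    {"room_type": "Entire home", "price": 120},
    {"room_type": "Private room", "price": 60},
    {"room_type": "Entire home", "price": 150},
    {"room_type": "Shared room", "price": 30},
    {"room_type": "Private room", "price": 90},
]

# Qualitative variable: summarize with counts and proportions.
counts = Counter(row["room_type"] for row in listings)
proportions = {k: v / len(listings) for k, v in counts.items()}

# Quantitative variable: summarize with mean, median, and SD.
prices = [row["price"] for row in listings]
print(counts)
print(proportions)
print(mean(prices), median(prices), pstdev(prices))
```

Because every row is one listing and every key is one variable, each summary is a one-liner over a single column.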
What does Bias (Systematic Error) mean? 🕵️‍♂️
An error consistently leaning in one direction, like a pirate always steering left! It includes sampling bias, response bias, and non-response bias. 🏴‍☠️
What is Sampling Bias? 🎯
It happens when your sample doesn’t reflect the whole population, like surveying cats about their favorite dogs! 🐱➡️🐶
What is Response Bias? 🗳️
Here, a poorly-worded question leads people to say something they don’t mean, like asking if they love broccoli when they really don’t! 🥦❌
What is Non-response Bias? ❓
Occurs when certain types of people don’t respond, like missing out on grandmas in online surveys! 🧓💻
How can we write the Measurement Error Formula? 🔍
It’s Individual measurement = exact value + chance error + bias. Think of it as detective work on your measurements! 🕵️‍♀️
What is Chance Error? 🤷‍♂️
These are random fluctuations whose typical size you can estimate by repeating the measurement and calculating the Standard Deviation (SD). Consider it the oops factor! 🤪
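A rough simulation of measurement = exact value + chance error + bias, in Python. The exact value, the bias, and the chance-error SD are all invented, and modelling the chance error as Normal is an assumption of this sketch.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

exact_value = 100.0  # hypothetical true weight in grams
bias = 0.3           # hypothetical: the scale always reads 0.3 g too high
chance_sd = 0.5      # hypothetical typical size of the chance error

# Each measurement = exact value + chance error + bias.
measurements = [
    exact_value + random.gauss(0, chance_sd) + bias
    for _ in range(10_000)
]

# Averaging cancels most of the chance error but leaves the bias.
print(mean(measurements) - exact_value)
# The SD of repeated measurements estimates the chance error size.
print(pstdev(measurements))
```

Note that the SD only reveals the chance error. The bias shows up here only because the simulation knows the exact value; repeating a biased measurement does not expose the bias.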
What do we know about Standard Deviation (SD)? 🎢
It measures the typical distance of values from their mean, so it is always ≥ 0; like a distance on a hike, it can’t be negative! ⛰️
What is the Mean Transformation formula? 📊
It goes like this: New mean = a + b × (old mean). Imagine adjusting your favorite recipe! 🍰
How does SD Transformation work? 🎉
It’s New SD = |b| × (old SD); adding a just shifts the data without changing the spread. Like changing the pizza topping without altering the slice size! 🍕
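A quick check of both transformation rules in Python, using Celsius to Fahrenheit (a = 32, b = 1.8). The temperatures are made up for illustration.

```python
from statistics import mean, pstdev

# Hypothetical temperatures in Celsius (made-up numbers).
celsius = [10.0, 15.0, 20.0, 25.0, 30.0]
a, b = 32.0, 1.8  # Fahrenheit = 32 + 1.8 * Celsius

fahrenheit = [a + b * c for c in celsius]

# New mean = a + b * (old mean)
print(mean(fahrenheit), a + b * mean(celsius))
# New SD = |b| * (old SD); the shift a has no effect on the spread
print(pstdev(fahrenheit), abs(b) * pstdev(celsius))
```

Each printed pair should agree up to floating-point rounding.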
What does the Correlation Coefficient (r) tell us? 📏
It’s a value from −1 to +1 showing the strength/direction of a linear association. Think of it like the friendship meter! 👯
What is the Regression Line equation? 📉
It’s ŷ = a + bx, where a is the intercept and b is the slope; the least-squares line makes the sum of squared residuals as small as possible. Think of it as finding the best fit for your outfit! 👗👖
What is the Slope (b) of the Regression Line? 🤔
It shows how much the predicted y changes, on average, for every 1-unit increase in x; b = r × SDy / SDx. Think of it as how much taller you’ll grow each year! 📏
What does R² (Coefficient of Determination) express? 📊
It tells us the proportion of the variation in y explained by the regression on x; in simple linear regression, R² = r². Think of it like your study time explaining your grade outcome! 🎓
What is a Residual? 📉
It’s Residual = Actual − Predicted; fun fact: for a least-squares line with an intercept, the mean of the residuals is always zero! The surprises balance out! 🎲
What is RMS Error? 📏
It’s the Standard Deviation (SD) of the residuals, so it summarizes the typical size of a prediction error in the units of y! 🙌
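The regression cards above can be tied together in one short Python sketch. The x and y values are invented, and SDs are computed as population SDs (dividing by n), which is an assumption about the course convention.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical paired data (made-up numbers).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

mx, my = mean(x), mean(y)
sdx, sdy = pstdev(x), pstdev(y)

# Correlation coefficient r: average product of deviations over SDx * SDy.
r = mean([(xi - mx) * (yi - my) for xi, yi in zip(x, y)]) / (sdx * sdy)

# Slope b = r * SDy / SDx; the line passes through the point of averages.
b = r * sdy / sdx
a = my - b * mx

predicted = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]  # actual - predicted

# RMS error: root of the mean squared residual.
rms_error = sqrt(mean([e ** 2 for e in residuals]))

print(r, r ** 2)          # r and R-squared
print(mean(residuals))    # zero up to rounding
print(rms_error, sqrt(1 - r ** 2) * sdy)  # two routes to the RMS error
```

Since the residuals average zero, their SD and their root-mean-square are the same number, which is why the RMS error card calls it the SD of the residuals.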
What does the Normal Distribution N(μ, σ²) Empirical Rule say? 📐
About 68% of the data is within ±1 SD of the mean, 95% within ±2 SD, and 99.7% within ±3 SD. It’s like the data party where most show up early! 🎉
How do we use the R function pnorm(x, mean, sd)? 📊
This function finds P(X ≤ x) for a normal distribution; remember, the 3rd argument is the SD, not the variance, so for N(μ, σ²) pass σ. Think of it as your calculator buddy! 🖥️
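For anyone practicing outside R, here is a minimal Python stand-in. The helper below is named pnorm only to mirror the R call; it is a hypothetical wrapper around the standard library's `statistics.NormalDist`, not an existing Python function.

```python
from statistics import NormalDist

def pnorm(x, mean=0.0, sd=1.0):
    """Hypothetical stand-in for R's pnorm: returns P(X <= x).

    The third argument is the SD, not the variance.
    """
    return NormalDist(mu=mean, sigma=sd).cdf(x)

# Empirical rule: share of a Normal curve within k SDs of the mean.
for k in (1, 2, 3):
    within = pnorm(k) - pnorm(-k)
    print(k, round(within, 4))

# For N(100, 15^2), pass sd=15 (not the variance 225).
print(pnorm(130, mean=100, sd=15))
```

The loop reproduces the 68 / 95 / 99.7 figures from the Empirical Rule card, and the last line equals `pnorm(2)` because 130 is two SDs above 100.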
What does the Binomial Formula P(X=k) provide us? 📈
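It’s C(n, k) × p^k × (1 − p)^(n − k), where C(n, k) = n! / (k! (n − k)!): the probability of exactly k successes in n independent trials, each with success probability p. Picture it as breaking down choices like your snack options! 🍿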
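A small Python sketch of the formula, using `math.comb` for the "n choose k" part. The function name binom_pmf is made up for this example; in R the same quantity comes from `dbinom(k, n, p)`.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); hypothetical helper name."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Chance of exactly 3 heads in 10 fair coin flips: 120 / 1024.
print(binom_pmf(3, 10, 0.5))

# The probabilities over k = 0..n add up to 1.
print(sum(binom_pmf(k, 10, 0.5) for k in range(11)))
```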
What is the Central Limit Theorem (CLT)? 🎇
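It says the distribution of the sample sum or mean gets closer and closer to Normal as the sample size grows, whatever the shape of the population (as long as the draws are independent and the SD is finite), like gathering friends for a party! 🎈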
What are the Sample Sum Expected Value (EV) and Standard Error (SE) formulas? 🧮
EV = n × μ and SE = √n × σ, where μ and σ are the population mean and SD and n is the sample size. The sum drifts up with n, and its wobble grows with √n! 📚
What are the Sample Mean Expected Value (EV) and Standard Error (SE) formulas? 📏
EV = μ and SE = σ / √n. These are like your average scores to keep you on track: the bigger the sample, the smaller the wobble! 🎯
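A rough simulation of the sum and mean cards, assuming a fair six-sided die as the population and n = 100 rolls per sample. The die, the sample size, and the 2000 repetitions are all choices made for this sketch.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(1)  # fixed seed so the run is repeatable

faces = [1, 2, 3, 4, 5, 6]  # assumed population: one fair die
mu = mean(faces)            # 3.5
sigma = pstdev(faces)       # about 1.71
n = 100                     # rolls per sample

# Formulas from the cards.
ev_sum, se_sum = n * mu, sqrt(n) * sigma
ev_mean, se_mean = mu, sigma / sqrt(n)

# Simulate many sample sums and compare with the formulas.
sums = [sum(random.choices(faces, k=n)) for _ in range(2000)]
print(ev_sum, mean(sums))     # expected value vs simulated average
print(se_sum, pstdev(sums))   # standard error vs simulated SD
print(ev_mean, se_mean)
```

The simulated average and SD of the sums should land near 350 and 17.1. A histogram of `sums` would also look roughly bell-shaped, which is the Central Limit Theorem at work.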
What is Prosecutor's Fallacy? ⚖️
The error of confusing P(evidence∣innocent) with P(innocent∣evidence). It’s like mixing up your friends' stories! 🗣️
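To see how far apart the two probabilities can be, here is a Bayes' rule sketch in Python. Every number is hypothetical: a 1-in-1000 chance that an innocent person matches the evidence, a guilty person who always matches, and a prior of 1 guilty person in 10,000.

```python
# All inputs are hypothetical, chosen only to illustrate the fallacy.
p_evidence_given_innocent = 0.001  # false match rate
p_evidence_given_guilty = 1.0      # the guilty person always matches
p_guilty = 1 / 10_000              # prior: 1 suspect in a pool of 10,000
p_innocent = 1 - p_guilty

# Bayes' rule: P(innocent | evidence)
p_evidence = (p_evidence_given_innocent * p_innocent
              + p_evidence_given_guilty * p_guilty)
p_innocent_given_evidence = (p_evidence_given_innocent * p_innocent
                             / p_evidence)

print(p_evidence_given_innocent)   # 0.001
print(p_innocent_given_evidence)   # roughly 0.91
```

Under these assumptions P(evidence | innocent) is tiny, yet P(innocent | evidence) is about 91%, because innocent people vastly outnumber the guilty one.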
What does a P-value represent? 🧙‍♂️
It’s the probability of seeing data at least as extreme as what was observed, assuming the null hypothesis (H0) is true. It is not the probability that H0 is true. Think of it as your luck factor in magic tricks! 🎩
What is a Chi-squared Test of Independence? ❓
A test where H0 states that two categorical variables are independent. It’s like checking if two games can be played without affecting each other! 🎮🎲
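A hand-rolled sketch of the test for a 2×2 table, in Python with made-up counts. For a 2×2 table the statistic has 1 degree of freedom, so its p-value can be taken from the standard Normal curve; this shortcut does not extend to larger tables, and no continuity correction is applied here (R's `chisq.test` applies one to 2×2 tables unless `correct = FALSE` is set).

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical 2x2 table of counts (rows: group A/B, columns: yes/no).
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(2)
)

# With 1 degree of freedom, chi-squared is the square of a standard Normal.
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi_sq)))

print(expected)
print(chi_sq, p_value)
```

A small p-value like the one here is evidence against H0, that is, against independence of the two variables.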
What is the Confidence Interval (CI) Hypothesis Testing rule? 📏
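If the H0 value is inside the CI, do not reject H0; if it is outside, reject H0 at the matching significance level (a 95% CI pairs with a two-sided test at the 5% level). It’s like checking if your favorite spot is still open! 🏞️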
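A sketch of the rule for a population mean, in Python. The sample summary (n = 100, mean 52, SD 10) and the H0 value of 50 are invented, and the interval uses the Normal approximation.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample summary and null value.
n, sample_mean, sample_sd = 100, 52.0, 10.0
h0_value = 50.0

se = sample_sd / sqrt(n)              # standard error of the mean
z_star = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% CI
ci_low = sample_mean - z_star * se
ci_high = sample_mean + z_star * se

# Decision via the CI: reject H0 if its value falls outside the interval.
reject_by_ci = not (ci_low <= h0_value <= ci_high)

# Decision via the two-sided p-value at the 5% level.
z = (sample_mean - h0_value) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_by_p = p_value < 0.05

print(ci_low, ci_high)
print(p_value, reject_by_ci, reject_by_p)
```

Both routes give the same decision, which is the point of the rule: the 95% CI contains exactly the H0 values that a two-sided 5% test would not reject.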
What is Homoscedasticity? 🎢
The condition that the residuals have roughly constant spread across all values of x; in a residual plot, the points scatter randomly around zero in an even band with no fanning out. Think of it as a fun fair ride that stays steady! 🎠
What is Extrapolation? ⚠️
Using a regression model to predict for x values outside the range of the data it was fitted on; it is risky because the linear pattern may not continue there. It’s like guessing how popular a new snack might be without trying it! 🍿📉