[T1] Why must a scientific hypothesis be falsifiable?
Because science requires hypotheses that could be refuted by evidence; otherwise they can’t be tested meaningfully.
[T1] What’s the difference between descriptive and inferential statistics?
Descriptive statistics summarize the observed data; inferential statistics draw conclusions beyond the sample (about the population, with uncertainty).
[T1] In plain language, what does a p-value tell you?
How likely data as extreme as yours (or more extreme) would be if the null hypothesis were true.
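A minimal simulation sketch of this idea (the coin-flip scenario and all numbers are hypothetical, not from the slides): estimate how often a fair coin would produce a result at least as extreme as the one observed.

```python
import random

# Hypothetical: we saw 60 heads in 100 flips and ask how surprising
# that is if the coin is actually fair (the null hypothesis).
observed_heads = 60
n_flips = 100
n_sims = 100_000

# Count simulated fair-coin experiments at least as extreme as the
# observation (two-sided: 60+ heads or 40- heads).
extreme = 0
for _ in range(n_sims):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    if heads >= observed_heads or heads <= n_flips - observed_heads:
        extreme += 1

p_value = extreme / n_sims
print(f"estimated p-value: {p_value:.3f}")             # around 0.057 here
print("significant at alpha = 0.05?", p_value < 0.05)  # ties into the next card
```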
[T1] What role does α = 0.05 play in hypothesis testing?
It’s the significance threshold: a result with p < α is declared “statistically significant,” and the null hypothesis is rejected.
[T2] If a network has N nodes and is fully connected, how many links does it have—and why?
N(N−1)/2 because every pair of distinct nodes forms one undirected link.
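A quick Python sanity check of the formula (N = 6 is an arbitrary choice): enumerate every unordered pair of distinct nodes and compare the count.

```python
from itertools import combinations

N = 6
# Every unordered pair of distinct nodes is exactly one undirected link.
links = list(combinations(range(N), 2))
print(len(links))        # 15
print(N * (N - 1) // 2)  # 15, matching N(N-1)/2
```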
[T2] Degree distribution vs frequency distribution: what’s the difference?
Degree distribution is a probability (chance a node has degree k); frequency distribution is counts (# nodes with degree k).
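A small sketch of the distinction on a made-up edge list: tally node degrees, count how many nodes have each degree (frequency), then normalize into probabilities (degree distribution).

```python
from collections import Counter

# Hypothetical undirected edge list.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]

# Degree of each node = number of incident links.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

N = len(degree)
freq = Counter(degree.values())                # counts: # nodes with degree k
p_k = {k: n_k / N for k, n_k in freq.items()}  # probabilities: sum to 1

print(freq)  # Counter({2: 3, 3: 1, 1: 1})
print(p_k)   # {3: 0.2, 2: 0.6, 1: 0.2}
```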
[T2] What does a high clustering coefficient imply about a node’s neighborhood?
The node’s neighbors tend to also be connected to each other (more “clique-like”).
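A minimal sketch of the local clustering coefficient on an invented 4-node graph: the fraction of a node's neighbor pairs that are themselves linked.

```python
from itertools import combinations

# Hypothetical adjacency sets for a small undirected graph.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}

def local_clustering(node):
    """Fraction of the node's neighbor pairs that are linked to each other."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # no neighbor pairs to check
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)

print(local_clustering(0))  # 1/3: of pairs (1,2), (1,3), (2,3), only (1,2) is linked
```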
[T2] How do you generate a random network in the slides’ model?
Use N nodes and connect each pair with probability p.
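A short sketch of that generator (this is the classic Erdős–Rényi G(N, p) construction, though the slides may not use that name):

```python
import random
from itertools import combinations

def random_network(N, p, seed=None):
    """N nodes; each distinct pair is linked independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(N), 2) if rng.random() < p]

edges = random_network(N=100, p=0.05, seed=42)
print(len(edges))  # expected about p * N(N-1)/2 = 0.05 * 4950 ≈ 247
```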
[T2] What is homophily and why does it matter in social networks?
Similar people connect more often; it shapes communities and patterns like shared behaviors/opinions.
[T2] What is the main idea behind “six degrees of separation”?
Social networks can have short path lengths, so most people are connected by only a few steps.
[T3] What is culturomics trying to measure, and what dataset enables it?
Cultural change/usage patterns at scale using digitized text; Google Books corpus is central.
[T3] Why does the dataset impose a “min 40 occurrences” rule for n-grams?
To filter out extremely rare sequences and focus on more stable/meaningful patterns (see the sketch after the next card).
[T3] If the dataset supports up to 5-grams, what’s an example of a 5-gram?
Any sequence of 5 consecutive words/tokens, e.g., “to be or not to”.
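A toy sketch covering both n-gram cards (the text is invented, and a threshold of 2 stands in for the dataset's 40-occurrence rule):

```python
from collections import Counter

def ngrams(tokens, n):
    """All consecutive n-token sequences in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "to be or not to be that is the question to be or not to be".split()
counts = Counter(ngrams(tokens, 5))

# The real dataset keeps only n-grams occurring at least 40 times;
# on this toy text, a threshold of 2 plays the same role.
MIN_OCCURRENCES = 2
kept = {g: c for g, c in counts.items() if c >= MIN_OCCURRENCES}
print(kept)  # the two 5-grams that repeat, e.g. ('to', 'be', 'or', 'not', 'to'): 2
```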
[T4] Why is testing a model on its training data misleading?
It can overfit—performing well by memorizing training examples rather than generalizing.
[T4] In holdout evaluation, why separate training/validation/test?
Train builds the model; validation tunes settings; test estimates real-world performance.
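A minimal sketch of a three-way holdout split (the 70/15/15 fractions are common defaults, not taken from the slides):

```python
import random

def three_way_split(data, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle, then carve off test and validation sets; the rest is training."""
    data = data[:]  # avoid mutating the caller's list
    random.Random(seed).shuffle(data)
    n_test = int(len(data) * test_frac)
    n_val = int(len(data) * val_frac)
    test = data[:n_test]
    val = data[n_test:n_test + n_val]
    train = data[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(list(range(100)))
print(len(train), len(val), len(test))  # 70 15 15
```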
[T4] How does k-fold CV reduce the risk of a “lucky/unlucky” split?
Every data point appears in exactly one test fold, and performance is averaged across the k runs.
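A pure-Python sketch of the fold bookkeeping, showing that every index lands in exactly one test fold:

```python
def k_fold_indices(n, k):
    """Partition indices 0..n-1 into k near-equal test folds
    (in practice, shuffle the data first)."""
    fold_size, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        end = start + fold_size + (1 if i < extra else 0)
        folds.append(list(range(start, end)))
        start = end
    return folds

n, k = 10, 5
for fold in k_fold_indices(n, k):
    train_idx = [i for i in range(n) if i not in set(fold)]
    # train on train_idx, evaluate on fold, then average the k scores
    print("test:", fold, "train size:", len(train_idx))
```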
[T4] Give an example of when accuracy is a bad metric and explain why.
Imbalanced classes: predicting the majority class can yield high accuracy but fail to detect the rare (important) class.
[T4] How do precision and recall differ in what they “care about”?
Precision: how reliable positive predictions are; Recall: how many actual positives are caught.
[T4] When would you prioritize recall over precision?
When missing positives is costly (e.g., disease detection), so you want to catch as many positives as possible.
[T4] Why use F1 instead of accuracy?
F1 balances precision and recall, which is useful especially with class imbalance.
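One sketch covering the last four cards, on hypothetical 95/5 imbalanced labels: a majority-class predictor gets high accuracy but zero precision/recall, while F1 summarizes the precision/recall trade-off.

```python
# Hypothetical labels: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
y_majority = [0] * 100                 # always predict the majority class
y_model = [0] * 95 + [1, 1, 1, 0, 0]   # catches 3 of the 5 positives

def metrics(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

print(metrics(y_true, y_majority))  # (0.95, 0.0, 0.0, 0.0): "accurate" but useless
print(metrics(y_true, y_model))     # (0.98, 1.0, 0.6, 0.75)
```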
[T4] Walk through how k-NN classifies a new point.
Choose k, find k nearest labeled points, take majority vote, assign that class.
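A compact sketch of exactly those steps (the labeled points and query are invented):

```python
import math
from collections import Counter

def knn_classify(query, examples, k=3):
    """examples: list of (point, label) pairs. Majority vote among
    the k nearest points by Euclidean distance."""
    nearest = sorted(examples, key=lambda e: math.dist(query, e[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

examples = [((1, 1), "A"), ((2, 1), "A"), ((2, 2), "A"),
            ((8, 8), "B"), ((9, 9), "B")]
print(knn_classify((3, 2), examples))  # "A": the 3 nearest neighbors are all A
```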
[T4] What’s a major weakness of k-NN mentioned in the slides?
It stores all training data (memory-heavy), prediction is slow because distances are computed at query time, and it’s sensitive to irrelevant features.
[T4] How does a decision tree decide splits, at a high level?
It repeatedly picks the split (often after discretizing continuous attributes) that best separates the classes, recursing until it reaches leaf nodes.
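A sketch of one split decision, using Gini impurity as the purity measure (a common criterion; the slides may frame the choice in terms of accuracy instead):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 means a pure, single-class group."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Try each threshold on one continuous feature; return the one that
    minimizes the weighted impurity of the two resulting groups."""
    best = None
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["A", "A", "A", "B", "B", "B"]
print(best_split(values, labels))  # (3.0, 0.0): a perfectly pure split at 3.0
```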
[T4] Why can decision trees overfit?
Too many branches (especially with outliers) can fit noise rather than general patterns.