Exam1MathML


87 Terms

1

Population

The entire set of all individuals, items, or events of interest

2

Parameter

A numerical characteristic of the entire population

3

Sample

A small subset of observational units chosen from the population

4

Statistic

Numerical characteristics computed directly from the sample data

5

Random Sampling

Units are selected entirely at random from the population, so every subset of a given size is equally likely to be chosen

6

Stratified Sampling

The population is divided into meaningful groups (strata) based on an important attribute. Random representatives are then selected from each stratum, often in proportion to each stratum's size

7

Cluster Sampling

The population is divided into groups based on a non-important attribute. Then, the entire content of some randomly selected clusters is sampled

8

Systematic Sampling

Selecting every k-th observational unit. The value k is calculated as population size / sample size

9

Convenience Sampling

Selecting units that are the easiest to access

10

Selection Bias

Bias caused by a bad selection methodology

11

Leading Question (Response) Bias

Questions are structured to elicit a particular response

12

Social Desirability Bias

Respondents answer in a way that is favorable or acceptable to others

13

Self-selection (Volunteer) Bias

The sample consists of a self-selected group of respondents who chose to participate

14

Nonresponse Bias

Occurs when certain groups prefer not to share their opinions, leading to a skewed sample

15

Observational Study

Data is collected by recording responses and measuring features as they naturally occur, without the researcher exerting any direct influence on the observed data

16

Statistical Experiment

Treatments are first assigned to the observational units, and then the responses are recorded

17

What is Data?

A collection of things known or assumed as facts. It is empirical, referring to something that is observed. When visualizing data, think of tables where each column is a different attribute and each row represents the measurements for a specific entity

18

What is an Observational Unit?

An individual, item, or event of the population for which a single data record is created

19

What is a Census?

A complete enumeration of every member of a population. It collects data from the entire population to get an accurate and complete picture

20

Define Population.

The entire set of all individuals, items, or events of interest. Examples include "all five-card poker hands" or "all car models made by Chevrolet"

21

Define Sample.

A small subset of observational units chosen from the population. It is the subset for which the data of interest is actually collected

22

Define Parameter.

A numerical characteristic of the entire population. Examples include the population mean (μ). The goal of statistics is to learn these values. True population parameters typically cannot be measured directly

23

Define Statistic.

Numerical characteristics computed directly from the sample data. The mean of each sample (x̄) is a statistic. Statistics are used as tools for making inferences about the population parameters

24

What is the goal of a good Sampling Method?

To produce a sample that is a good representative of the population, ensuring that the sample statistics align with the population parameters

25

What is Random Sampling?

Units are selected at random from the entire population, so that every subset of a given size is equally likely to be chosen

26

What is Stratified Sampling?

The population is first divided into meaningful groups (strata) based on an important attribute, and then random representatives are selected from each stratum (often proportional to the total size)

27

What is Cluster Sampling?

The population is divided into groups based on a non-important attribute, and then the entire content of some randomly selected clusters is sampled

28

What is Systematic Sampling?

Selecting every k-th observational unit, where k is calculated as population size divided by sample size
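
As a minimal sketch, the random, stratified, cluster, and systematic sampling methods can be written with Python's standard `random` module. The helper names, the toy population of 100 numbered units, the grouping functions, and the random starting point in the systematic sampler are illustrative assumptions, not part of the card definitions.

```python
import random

def simple_random_sample(units, n, rng):
    # Every subset of size n is equally likely to be drawn.
    return rng.sample(units, n)

def stratified_sample(units, stratum_of, fraction, rng):
    # Group units by stratum, then draw the same fraction from each stratum.
    strata = {}
    for u in units:
        strata.setdefault(stratum_of(u), []).append(u)
    chosen = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        chosen.extend(rng.sample(members, k))
    return chosen

def cluster_sample(units, cluster_of, n_clusters, rng):
    # Pick whole clusters at random and keep every unit inside them.
    clusters = {}
    for u in units:
        clusters.setdefault(cluster_of(u), []).append(u)
    picked = rng.sample(sorted(clusters), n_clusters)
    return [u for c in picked for u in clusters[c]]

def systematic_sample(units, n, rng):
    # k = population size / sample size; take every k-th unit.
    # The random start is an assumption made for this sketch.
    k = len(units) // n
    start = rng.randrange(k)
    return units[start::k][:n]

rng = random.Random(0)                 # fixed seed so the run is repeatable
population = list(range(100))          # hypothetical units numbered 0..99

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda u: u % 2, 0.1, rng)   # strata: even / odd
clus = cluster_sample(population, lambda u: u // 10, 2, rng)       # clusters: blocks of 10
syst = systematic_sample(population, 10, rng)
```

In this toy setup the stratified sample always contains five even and five odd units, while the cluster sample contains all twenty units from two blocks.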

29

What is Convenience Sampling?

Selecting observational units that are the easiest to access. This method is quick but results in a sample that is often not representative of the full population

30

Define Sampling Bias.

A difference between the parameter inferred from a sample and the true value of the parameter in the population. The way data is collected impacts the reliability of results

31

What is Selection Bias?

Bias caused by a bad selection methodology. Example: surveying economic policies only in an affluent neighborhood

32

What is Self-selection (Volunteer) Bias?

The sample consists of a self-selected group of respondents who chose to participate. Example: conducting a poll on X (Twitter)

33

What is Social Desirability Bias?

Respondents answer in a way that is favorable or designed to please others. Example: inflating how much one donates to charity, or under-reporting how often one drinks alcohol in a survey

34

What is Leading Question (Response) Bias?

Occurs when questions are structured to elicit a particular response. Example: asking "How enjoyable was your recent shopping experience?" which assumes enjoyment

35

What is Nonresponse Bias?

Occurs when certain groups prefer not to share their opinions, leading to a skewed sample

36

Define Observational Study.

Data is collected by recording responses and measuring features as they naturally occur, without the researcher exerting any direct influence on the observed data. Example: Recording data about birds in a backyard

37

Define Statistical Experiment.

Treatments are first assigned to the observational units, and then the responses are recorded. Example: An A/B test where participants are randomly assigned one of two web page layouts

38

What is the key benefit of a Statistical Experiment?

With random assignment of treatments, researchers can investigate whether the treatment is the cause of the observed response

39

What is the purpose of Measures of Centrality?

They provide a single, representative value of the entire dataset. Knowing the central tendency helps understand what is typical and facilitates comparisons across different datasets

40

Define Mean and provide its notation for a sample and a population.

The mean is the average value of the data. Sample Mean: x̄ = ∑x_i / n, where n is the sample size. Population Mean: μ = ∑x_i / N, where N is the population size. The larger the sample, the closer the sample mean tends to get to the population mean.

41

Define Median.

The median is the middle value in a sorted list of data. If the number of data points (n) is odd, the median is the value in the middle position. If n is even, the median is the average of the two central points

42

Define Mode.

The mode is the value in a dataset that occurs most frequently. Alternatively, the mode is represented by a peak (or bump) in a frequency distribution (histogram)

43

How do the mean, median, and mode relate in different types of data distribution (Skewness)?

1. Symmetrical (single-peaked) distribution: The mean, median, and mode are equal. 2. Right-skewed (positively skewed): The mean is typically greater than the median, which is greater than the mode. 3. Left-skewed (negatively skewed): The mode is typically greater than the median, which is greater than the mean
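
As a small illustration of the three measures of centrality, the sketch below uses Python's standard `statistics` module on a made-up, right-skewed dataset (the values are an assumption chosen so the arithmetic is easy to follow).

```python
import statistics

# Hypothetical right-skewed data: most values are small, one is large.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum 30 divided by 6 values
median = statistics.median(data)  # n is even: average of the two middle values, 3 and 5
mode = statistics.mode(data)      # 3 is the only value that occurs twice

print(mean, median, mode)
```

For this dataset the mean exceeds the median, which exceeds the mode, matching the ordering usually seen in right-skewed data.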

44

What is the purpose of Measures of Dispersion (Variance/SD)?

They quantify the variability or spread of data around a central point. They reveal how consistent or diverse a dataset is, and help in identifying outliers

45

Define Variance.

Variance is the average of the squared deviations from the mean. It is computed differently for populations and samples

46

Contrast the formulas for Population Variance (σ²) vs. Sample Variance (s²).

Population Variance (σ²): Divides the sum of squared deviations from μ by the total population size (N). Sample Variance (s²): Divides the sum of squared deviations from x̄ by (n−1). Dividing by (n−1) instead of n gives an unbiased estimator of the population variance, preventing the sample variance from systematically underestimating the actual variance

47

Define Standard Deviation (SD) and its purpose.

The standard deviation is the square root of the variance. Taking the square root returns the measure to the original units in which the measurements were made. It is often used to summarize a sample as x̄ ± s
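
The difference between dividing by the number of values and dividing by one less can be checked numerically. The following is a minimal sketch on made-up data; the variable names are assumptions, and the standard `statistics` functions `pvariance` and `variance` are used only to cross-check the hand computation.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in data)

pop_var = ss / n           # treat the data as the whole population
samp_var = ss / (n - 1)    # treat the data as a sample from a larger population

pop_sd = math.sqrt(pop_var)    # back in the original units
samp_sd = math.sqrt(samp_var)

# Cross-check against the standard library.
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(samp_var, statistics.variance(data))

print(pop_var, samp_var, pop_sd, samp_sd)
```

The sample variance is always the larger of the two, and the gap shrinks as the number of values grows.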

48

What is the Empirical Rule?

The Empirical Rule applies to bell-shaped (normal) distributions (where mean=median=mode). It states that about 68% of all data falls within 1 standard deviation (σ) of the mean (μ); about 95% falls within 2σ; and about 99.7% falls within 3σ. Data points falling more than 3σ from the mean are often treated as outliers

49

What is Chebyshev's Theorem and what kind of distributions does it apply to?

Chebyshev's Theorem applies to any dataset and any distribution. It states that the proportion of data within k standard deviations of the mean is at least 1 − 1/k², for k > 1
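
To see how the Empirical Rule and Chebyshev's bound compare, the sketch below measures the fraction of values within k standard deviations of the mean for two simulated datasets. The sample size, seed, and choice of a standard normal and an exponential distribution are assumptions made purely for illustration.

```python
import random
import statistics

def fraction_within(data, k):
    """Fraction of values lying within k population standard deviations of the mean."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    inside = sum(1 for x in data if abs(x - mu) <= k * sigma)
    return inside / len(data)

rng = random.Random(0)                                   # fixed seed for repeatability
bell = [rng.gauss(0, 1) for _ in range(10_000)]          # roughly bell-shaped
skewed = [rng.expovariate(1.0) for _ in range(10_000)]   # strongly right-skewed

for k in (1, 2, 3):
    chebyshev = 1 - 1 / k**2    # only informative for k > 1
    print(k, round(fraction_within(bell, k), 3),
          round(fraction_within(skewed, k), 3), round(chebyshev, 3))
```

The bell-shaped data should land near 68%, 95%, and 99.7%, while both datasets stay at or above the Chebyshev bound for k = 2 and k = 3.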

50

What is the purpose of Measures of Position (e.g., Z-score, Quartiles)?

They indicate where a given value is located with respect to other data points. They divide the data into segments, revealing how the data is spread out, and help identify outliers

51

Define Percentile.

The p-th percentile is a data value such that p% of the distribution falls at or below that value. The median, for example, is the 50th percentile.

52

Define the three Quartiles (Q1, Q2, Q3).

Q1 (First Quartile): The 25th percentile; one-quarter of the data falls at or below Q1. Q2 (Second Quartile): The median of the dataset (the 50th percentile). Q3 (Third Quartile): The 75th percentile; three-quarters of the data fall at or below Q3.

53

What are the Range and the Inter-Quartile Range (IQR)?

Range: The difference between the maximum (max) and minimum (min) values (Max − Min). Inter-Quartile Range (IQR): The difference between the third quartile and the first quartile (IQR = Q3 − Q1)

54

List the components of the Five-Number Summary.

The five values used to summarize data are: Minimum (Min), First Quartile (Q1), Median (Q2), Third Quartile (Q3), and Maximum (Max). Quartiles divide the dataset into four parts, each covering 25% of the data

55

How are Quartiles used to detect outliers (Outlier Thresholds)?

Outliers are values that fall below the minimum threshold or above the maximum threshold defined using the IQR. Min Threshold: Q1 − 1.5 × IQR. Max Threshold: Q3 + 1.5 × IQR
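
A short sketch of the five-number summary and the IQR outlier thresholds follows. Textbooks and software use slightly different quartile conventions; this example assumes the `inclusive` method of Python's `statistics.quantiles`, and the data values are made up.

```python
import statistics

data = sorted([7, 1, 30, 5, 3, 9, 4, 8, 6])   # hypothetical data with one extreme value

# Quartiles under the "inclusive" convention (an assumption; other conventions exist).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number = (min(data), q1, q2, q3, max(data))
iqr = q3 - q1

min_threshold = q1 - 1.5 * iqr
max_threshold = q3 + 1.5 * iqr

# Values outside the thresholds are flagged as outliers.
outliers = [x for x in data if x < min_threshold or x > max_threshold]

print(five_number, iqr, min_threshold, max_threshold, outliers)
```

With these nine values the quartiles are 4, 6, and 8, so the thresholds are -2 and 14 and only the value 30 is flagged.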

56

What is a Box Plot (Box-and-Whisker Plot)?

A box plot is a visual representation created by plotting the five-number summary along a numeric axis. It provides insights into central tendency, spread (IQR), and outliers

57

What is a Z-score?

A standardized score that describes how many standard deviations from the mean a given value lies, and in which direction. It is computed as z = (x − μ)/σ, where x is the value, μ is the mean, and σ is the standard deviation
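
As a quick worked example, the sketch below standardizes two values from a made-up dataset whose mean is 5 and whose population standard deviation is 2. The function name `z_score` and the data are assumptions for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical measurements
mu = statistics.fmean(data)        # mean of the data
sigma = statistics.pstdev(data)    # population standard deviation

def z_score(x, mu, sigma):
    # Number of standard deviations x lies above (+) or below (-) the mean.
    return (x - mu) / sigma

z_high = z_score(9, mu, sigma)   # 9 is above the mean, so z is positive
z_low = z_score(2, mu, sigma)    # 2 is below the mean, so z is negative
print(z_high, z_low)
```

The sign of the score carries the direction: positive for values above the mean, negative for values below it.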

58

Define a Random Process (or Random Experiment).

An action or process where the outcome is determined by chance. One spin of a prize wheel is an example of a random process

59

Define Sample Space.

The set of all possible outcomes of a random process.

60

Define Outcome.

One possible result of a random process.

61

Define a (Discrete) Event.

A subset of a sample space; it is a single outcome or a collection of outcomes. Events are typically denoted by A, B, or C.

62

What is the Complement of Event A (¬A)?

The event consisting of all outcomes in the sample space that are not in event A.

63

What is the Union of Events A and B (A ∪ B)?

The event consisting of all outcomes in A or B, including outcomes that are in both A and B.

64

What is the Intersection of Events A and B (A ∩ B)?

The event consisting of only the outcomes in both A and B.
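
Events map naturally onto Python sets. The sketch below uses one roll of a six-sided die as the random process; the particular events A ("even") and B ("at least 4") are assumptions chosen for illustration.

```python
# Sample space for one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}   # event: the roll is even
B = {4, 5, 6}   # event: the roll is at least 4

complement_A = sample_space - A   # outcomes not in A
union = A | B                     # outcomes in A or B (or both)
intersection = A & B              # outcomes in both A and B

print(sorted(complement_A), sorted(union), sorted(intersection))
```

Here the complement of A is {1, 3, 5}, the union is {2, 4, 5, 6}, and the intersection is {4, 6}.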

65

What defines Disjoint (Mutually Exclusive) Events?

Two events A and B are disjoint if they cannot occur together: they share no outcomes, so A ∩ B = ∅ and P(A ∩ B) = 0

66

What are the three Axioms of Probability?

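1. Non-negativity: P(A) ≥ 0 for every event A. 2. Normalization: the probability of the whole sample space is 1. 3. Additivity: if A and B are disjoint, then P(A ∪ B) = P(A) + P(B)
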
67

Define a Discrete Random Variable (X).

A variable whose possible values are numerical outcomes of a random experiment, taking on only a countable number of distinct values (typically integers). Discrete random variables are usually counts

68

What is the general method for computing the probability P(A) of a discrete event A?

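When all outcomes are equally likely, P(A) = T/N: the number of outcomes in A (T) divided by the total number of outcomes in the sample space (N)
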
69

How is P(A) visualized in relation to the sample space?

P(A) is visualized as the proportion of the total sample space where A is true. The total area of the sample space equals 1.0

70

How does the Complement Rule relate ¬A to A?

P(¬A) = 1 − P(A)

71

How is the probability of the Union of two disjoint events (A ∪ B) calculated?

P(A ∪ B) = P(A) + P(B)

72

How is the probability of the Intersection of two independent events (A ∩ B) calculated?

P(A ∩ B) = P(A) × P(B)

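
The complement rule, the sum rule for disjoint events, and the product rule for independent events can all be checked by counting outcomes. This is a minimal sketch assuming two fair six-sided dice with equally likely outcomes; the events A, B, C, and D and the helper `prob` are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
space = set(product(range(1, 7), repeat=2))

def prob(event):
    # Probability as (outcomes in the event) / (outcomes in the sample space).
    return Fraction(len(event), len(space))

A = {o for o in space if o[0] % 2 == 0}   # first die is even
B = {o for o in space if o[1] == 6}       # second die shows 6 (independent of A)
C = {o for o in space if sum(o) == 2}     # total is 2
D = {o for o in space if sum(o) == 12}    # total is 12 (disjoint from C)

# Complement rule.
assert prob(space - A) == 1 - prob(A)
# Union of disjoint events: probabilities add.
assert prob(C | D) == prob(C) + prob(D)
# Intersection of independent events: probabilities multiply.
assert prob(A & B) == prob(A) * prob(B)

print(prob(A), prob(C | D), prob(A & B))
```

Using `Fraction` keeps the probabilities exact, so the equalities hold without floating-point tolerance.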
73

What is the purpose of Combinatorics in probability?

To count the total number of possible outcomes (N) and the number of successful outcomes (T) without having to enumerate them

74

State the Rule of Sum.

If you need to count the total number of elements in two sets (P and R) that share no elements, you sum their cardinalities: |P| + |R|

75

State the Rule of Product.

To count all possible pairs of elements from two sets (F and D), multiply the cardinalities of the sets: |F| × |D|
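
Both counting rules can be confirmed by brute-force enumeration on small sets. The set names P, R, F, and D follow the cards; their contents below are made up for illustration, and P and R are assumed to share no elements.

```python
from itertools import product

# Hypothetical disjoint sets for the Rule of Sum.
P = {"pizza", "pasta"}
R = {"ramen", "rice", "risotto"}

# Hypothetical sets for the Rule of Product.
F = {"fries", "fruit"}
D = {"cola", "water", "tea"}

# Rule of Sum: choosing one element from P or R.
total_choices = len(P | R)
assert total_choices == len(P) + len(R)

# Rule of Product: choosing one element from F and one from D.
pairs = list(product(F, D))
assert len(pairs) == len(F) * len(D)

print(total_choices, len(pairs))
```

If P and R overlapped, the plain sum would count the shared elements twice, which is why the disjointness assumption matters here.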

76

Formula for Selection Type 1: Order Matters, With Repetitions?

The total number of ordered sequences of size k built from n items, allowing repetitions, is n^k

77

Formula for Selection Type 2: Order Matters, Without Repetitions (k-Permutations)?

The number of ordered selections of k distinct items from n is n!/(n − k)!

78

Formula for Selection Type 3: Order Doesn't Matter, Without Repetitions (Combinations)?

The number of unordered selections of k distinct items from n is C(n, k) = n!/(k!(n − k)!)

79

Formula for Selection Type 4: Order Doesn't Matter, With Repetitions (Stars and Bars)?

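The number of unordered selections of k items from n, allowing repetitions, is C(n + k − 1, k) = (n + k − 1)!/(k!(n − 1)!)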
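
The four selection types can be compared side by side by checking each closed-form count against a brute-force enumeration. This sketch assumes n = 5 items and groups of size k = 3 (arbitrary small values) and uses `math.perm` and `math.comb`, which require Python 3.8 or later.

```python
import math
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)

n, k = 5, 3            # hypothetical sizes, small enough to enumerate
items = range(n)

# Type 1: order matters, with repetitions -> n^k
type1 = n ** k
assert type1 == len(list(product(items, repeat=k)))

# Type 2: order matters, without repetitions -> n! / (n - k)!
type2 = math.perm(n, k)
assert type2 == len(list(permutations(items, k)))

# Type 3: order doesn't matter, without repetitions -> n! / (k! (n - k)!)
type3 = math.comb(n, k)
assert type3 == len(list(combinations(items, k)))

# Type 4: order doesn't matter, with repetitions (stars and bars) -> C(n + k - 1, k)
type4 = math.comb(n + k - 1, k)
assert type4 == len(list(combinations_with_replacement(items, k)))

print(type1, type2, type3, type4)
```

For these sizes the counts are 125, 60, 10, and 35, which shows how much ignoring order and forbidding repetition each shrink the total.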