Prob & Stats Unit 1 Review Flashcards

70 Terms

Statistics

The science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions.

Population

The entire group we wish to study or make conclusions about

Parameter

A numerical summary calculated based on data from all individuals in the group of interest; it describes a characteristic of the population.

Sample

A subgroup of the entire group of interest.

Statistic

A numerical summary calculated based on data taken from a sample of the population; it describes a characteristic of the sample.

Qualitative Data

Non-numerical information that describes characteristics or qualities of a population or sample.

Quantitative Data

Numerical information that represents measurements or counts, allowing for mathematical calculations and statistical analysis.

Discrete Data

Quantitative data that can take on only a countable number of values (finite or countably infinite), usually whole-number counts; Ex. the number of students in a class or the number of cars in a parking lot.

Continuous Data

Quantitative data that can take on any value within a given range; infinitely many possible values, limited only by the precision of the measurement; Ex. height, weight, or temperature.

Nominal

a type of categorical data that represents distinct categories without any intrinsic ordering; examples include gender, race, or hair color.

Ordinal

a type of categorical data that represents distinct categories with a meaningful order or ranking, but without consistent intervals between the categories; examples include education level, customer satisfaction ratings, or ranks in a competition.

Ratio

a type of quantitative data that possesses a true zero point, so ratios of values are meaningful (e.g., 10 kg is twice as heavy as 5 kg); examples include weight, height, and duration.

Interval

a type of quantitative data that has consistent intervals between values but no true zero point; examples include temperature in Celsius or Fahrenheit and dates.

Observational Study

A research method in which the researcher observes and records behavior without manipulating any variables. It aims to gather data in a natural setting to understand relationships between variables.

Experimental Study

A research method where the researcher manipulates one or more independent variables to observe their effect on a dependent variable, often using control and experimental groups to establish cause-and-effect relationships.

Cross-Sectional Study

A research method that examines a population at a single point in time, providing a snapshot of data for analysis of relationships and characteristics.

Case-Control Study

A research method comparing subjects with a particular condition to those without, looking back in time to identify potential risk factors.

Cohort Study

A research method that follows a group of individuals over time to assess the development of outcomes, comparing those exposed to a certain factor with those not exposed.

Confounding Variable

A variable that influences both the independent and dependent variables, potentially creating a false association between the two; it is considered in the study, but its effect cannot be distinguished from that of another variable.

Lurking Variable

A variable that is not measured or considered in a study but can affect the outcome, potentially leading to misleading conclusions.

Placebo

A substance with no therapeutic effect used as a control in experiments to test the effectiveness of another substance.

Blinding

The practice of keeping study participants unaware of whether they are receiving the treatment or placebo to reduce bias in research results.

Frame

a list of all members of the population; it serves as the basis for selecting samples.

Convenience Sampling

A non-probability sampling technique where samples are selected based on their easy availability and proximity to the researcher, potentially leading to biased results.

Simple Random Sampling

A sampling method where every individual, and every possible sample of a given size, has an equal chance of being chosen from the population; usually achieved through random selection from the frame.
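
A minimal sketch of drawing a simple random sample in Python. The frame of 20 ID numbers, the sample size of 5, and the seed are all made-up values for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers for 20 students.
frame = list(range(1, 21))

rng = random.Random(42)          # fixed seed so the draw is repeatable
sample = rng.sample(frame, k=5)  # 5 distinct IDs; every 5-ID subset is equally likely

print(sample)
```

`random.sample` draws without replacement, so no individual can appear in the sample twice.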

Stratified Random Sampling

A sampling method where the population is divided into distinct subgroups, and samples are randomly selected from EACH stratum to ensure representation.
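
One way to sketch stratified random sampling in Python. The strata (grade levels), the member labels, and the choice of 2 per stratum are assumptions made up for this example.

```python
import random

# Hypothetical population grouped into strata by grade level.
strata = {
    "freshman": ["F1", "F2", "F3", "F4", "F5", "F6"],
    "sophomore": ["S1", "S2", "S3", "S4"],
    "junior": ["J1", "J2", "J3", "J4", "J5"],
}

rng = random.Random(0)
per_stratum = 2  # assumed sample size taken from each stratum

# Draw a simple random sample from EACH stratum, then combine.
sample = []
for name, members in strata.items():
    sample.extend(rng.sample(members, k=per_stratum))

print(sample)
```

Every stratum contributes to the sample, which is the point of the method.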

Systematic Random Sampling

A sampling method where individuals are selected from a larger population at regular intervals, often using a RANDOM starting point.
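
A small sketch of systematic sampling, assuming a hypothetical frame of 100 individuals and an interval of 10 (both values are illustrative only).

```python
import random

frame = list(range(1, 101))  # hypothetical frame of 100 individuals
k = 10                       # assumed interval: select every 10th individual

rng = random.Random(1)
start = rng.randrange(k)     # random starting index from 0 to k - 1

sample = frame[start::k]     # start, start + k, start + 2k, ...
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the interval.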

Cluster Random Sampling

A sampling method where the population is divided into groups, usually geographically, and entire groups are randomly selected to be included in the sample; an entire group may be left out of this sampling method.
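
A rough sketch of cluster sampling. The city blocks, their residents, and the choice of 2 clusters are invented for illustration.

```python
import random

# Hypothetical clusters: residents grouped by city block.
clusters = {
    "Block A": ["A1", "A2", "A3"],
    "Block B": ["B1", "B2"],
    "Block C": ["C1", "C2", "C3", "C4"],
    "Block D": ["D1", "D2", "D3"],
}

rng = random.Random(7)
chosen = rng.sample(sorted(clusters), k=2)  # randomly pick 2 whole clusters

# Everyone in a chosen cluster is sampled; unchosen clusters are left out entirely.
sample = [person for block in chosen for person in clusters[block]]

print(chosen, sample)
```

Compare with stratified sampling: there, some individuals come from every group; here, all individuals come from some groups.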

Multistage Sampling

A sampling method that combines multiple sampling techniques, such as cluster sampling and stratified sampling, to select samples from different stages or levels of the population. This method allows for flexibility and efficiency in obtaining a representative sample from large populations; teacher says most common is cluster → SRS

Bias

Occurs when results are not representative of reality, potentially due to poor sampling procedures, low response rates, problematic survey questions, faulty analysis, or bad luck that skews the findings and leads to incorrect conclusions.

Non-Response Bias

A type of bias that occurs when individuals selected for a survey or study do not respond, leading to a potential distortion in the overall results if the non-respondents differ significantly from respondents; can be mitigated through incentives

Response Bias

Occurs when answers provided on a survey do not reflect the reality of the respondent’s feelings or beliefs, often due to leading questions, social desirability, or misunderstanding.

Frequency Distribution

A summary of how often each value occurs in a data set, typically displayed in a table or graph to show the distribution of values. It helps to visualize the data's shape, central tendency, and variability.

Relative Frequency

The proportion (or percentage) of observations within a category; calculated by dividing the frequency of the category by the total number of observations.
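
As a sketch, a frequency table and its relative frequencies can be computed like this. The eye-color responses are made-up data.

```python
from collections import Counter

# Hypothetical qualitative data: eye color of 8 people.
responses = ["blue", "brown", "brown", "green", "brown", "blue", "brown", "blue"]

counts = Counter(responses)  # frequency of each category
total = len(responses)

# Relative frequency = frequency of the category / total number of observations.
relative = {category: count / total for category, count in counts.items()}

for category, count in counts.most_common():
    print(f"{category:<6} {count:>2} {relative[category]:.3f}")
```

Here brown has frequency 4 out of 8, so its relative frequency is 0.5; the relative frequencies across all categories add up to 1.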

Bar Graph

A graphical representation of data using rectangular bars, where the length of each bar is proportional to the value it represents; commonly used to compare different categories; bars don’t touch

Pareto Chart

A type of bar graph where the values are represented in descending order, highlighting the most significant factors in a dataset; often used for quality control and decision making.

Pie Charts

A circular graph divided into slices, each representing a proportion of the whole; commonly used to show percentage or proportional data. Compares relative frequencies for qualitative data

Histogram

similar to a bar graph, only it is for quantitative data; bars should touch; It visually displays the distribution of numerical data by grouping values into intervals or bins, allowing for identification of patterns such as skewness or modality.

Classes

Intervals that group quantitative data in a histogram, allowing for analysis of the frequency of data points within those ranges.

Lower Class Limit

smallest possible value within a class

Upper Class Limit

largest possible value within a class

Class Width

the difference between consecutive classes’ lower limits; NOT upper limit - lower limit
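
A quick check of the class-width rule, using hypothetical classes 10-19, 20-29, and 30-39.

```python
# Hypothetical classes given as (lower limit, upper limit).
classes = [(10, 19), (20, 29), (30, 39)]

lower_limits = [low for low, high in classes]

# Class width: difference between consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# The common mistake: upper limit minus lower limit of the same class.
upper_minus_lower = classes[0][1] - classes[0][0]

print(class_width)        # 10
print(upper_minus_lower)  # 9
```

The class 10-19 holds ten whole-number values (10 through 19), which matches the width of 10, not 9.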

Stem-and-Leaf Plot

a method of displaying quantitative data that separates the digits of each value into a "stem" and a "leaf."

Dot-Plot

a statistical chart that uses dots to represent the frequency of data points along a number line.

Skew

A type of distribution in which data points are not symmetrically distributed around the mean, resulting in a longer tail on one side.

Skewed Left

A distribution where most values are concentrated on the right side, causing the tail to extend to the left. The low values in the tail typically pull the mean below the median, so a majority of data points lie above the mean.

Skewed Right

A distribution where most values are concentrated on the left side, causing the tail to extend to the right. The high values in the tail typically pull the mean above the median, so a majority of data points lie below the mean.

Time Series Graph

A graphical representation that displays data points over time, showing trends and patterns in the data across chronological intervals.

Central Tendency

A statistical measure that identifies a single score as representative of an entire dataset, commonly using mean, median, or mode to summarize the data; center is one of the three features used to describe a distribution: shape, center, and spread

Mean

the average value of a dataset, calculated by dividing the sum of all values by the number of values. It is a measure of central tendency that provides a central point around which the data is distributed; true ‘average,’ non-resistant statistic

Population Mean

the average of a population data set, calculated similarly to the sample mean but includes all members of the population.

Sample Mean

the average calculated from a sample of a population, used to estimate the population mean; obtained by dividing the sum of the sample data by the number of observations in the sample.

Median

the middle value of a dataset when arranged in ascending or descending order; if there is an even number of observations, the median is the average of the two middle numbers. It is another measure of central tendency; resistant statistic

Resistant Statistic

A statistic that is not strongly affected by extreme values or outliers, making it a robust measure of central tendency, such as the median.
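
To see resistance in action, compare the mean and median before and after adding one extreme value. The salary figures (in thousands) are made up.

```python
import statistics

# Hypothetical salaries in thousands of dollars.
salaries = [40, 42, 45, 47, 50]
with_outlier = salaries + [400]  # add one extreme value

print(statistics.mean(salaries), statistics.median(salaries))          # 44.8 45
print(statistics.mean(with_outlier), statistics.median(with_outlier))  # 104 46.0
```

One extreme value more than doubles the mean (non-resistant) but moves the median by only 1 (resistant).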

Outlier

A data point that significantly differs from other observations in a dataset, which can skew results and affect statistical calculations.

Mode

The value that appears most frequently in a dataset, which can be used as a measure of central tendency.

Dispersion

The extent to which values in a dataset differ from each other and the average; common measures include range, variance, and standard deviation.

Range

The difference between the largest and smallest values in a dataset, providing a measure of how spread out the values are; non-resistant

Standard-Deviation

A measure of the amount of variation or dispersion in a set of values; it quantifies how much the individual data points differ from the mean; non-resistant

Population Standard Deviation

The standard deviation calculated using the entire population data rather than a sample. It provides a precise measure of variability for a whole population. (unadj. std. var.)

Sample Standard Deviation

The standard deviation calculated from a sample of data, used to estimate the variability of the larger population. It adjusts for sample size by applying Bessel's correction. (std. var.)
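
A short sketch of how the population and sample standard deviations differ: both use the same squared deviations from the mean, but the population version divides by n and the sample version divides by n - 1 (Bessel's correction). The data values are made up.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
n = len(data)
mean = sum(data) / n
squared_deviations = sum((x - mean) ** 2 for x in data)

population_sd = math.sqrt(squared_deviations / n)    # divide by n
sample_sd = math.sqrt(squared_deviations / (n - 1))  # divide by n - 1 (Bessel's correction)

# The standard library agrees with the hand computation.
assert math.isclose(population_sd, statistics.pstdev(data))
assert math.isclose(sample_sd, statistics.stdev(data))

print(population_sd)        # 2.0
print(round(sample_sd, 3))  # 2.138
```

The sample standard deviation is always a little larger than the population formula applied to the same numbers, and the gap shrinks as n grows.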

Variance

the square of the standard deviation; a measure of the dispersion of a set of values, equal to the average squared distance of the values from the mean.

Empirical Rule (68-95-99.7 Rule)

A statistical rule that states for a normal distribution, approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations; non-resistant; ABOUT
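
A small sketch that turns the rule into intervals, assuming hypothetical bell-shaped test scores with mean 100 and standard deviation 15.

```python
mean, sd = 100, 15  # hypothetical bell-shaped data, e.g. test scores

# Approximate share of data within k standard deviations of the mean.
empirical_rule = {1: 68.0, 2: 95.0, 3: 99.7}

intervals = {}
for k, percent in empirical_rule.items():
    intervals[k] = (mean - k * sd, mean + k * sd)
    low, high = intervals[k]
    print(f"about {percent}% of values fall between {low} and {high}")
```

For these assumed values the intervals are 85 to 115, 70 to 130, and 55 to 145.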

Chebyshev’s inequality

The minimum percentage of observations within k standard deviations of the mean is (1 - 1/k²) · 100%; A statistical theorem that states for any data distribution, at least 1 - (1/k²) of the data lies within k standard deviations from the mean, where k > 1; resistant; AT LEAST
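
A minimal sketch of the bound as a function. The name `chebyshev_lower_bound` is made up for this example.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is only informative for k > 1")
    return 1 - 1 / k**2


print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 4))  # 0.8889
```

So for any distribution at least 75% of the data lies within 2 standard deviations of the mean, compared with about 95% under the Empirical Rule for a normal distribution.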

Z-Score

A statistical measurement that describes a value's relationship to the mean of a group of values, expressed in terms of standard deviations. It indicates how many standard deviations a data point is from the mean: z = (value - mean) / standard deviation; the z-scores of a dataset have a mean of 0 and a standard deviation of 1
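
A sketch of computing z-scores for a small made-up dataset. It treats the data as a whole population and uses the population standard deviation, which is an assumption of this example.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
mu = statistics.mean(data)       # 5
sigma = statistics.pstdev(data)  # 2.0 (population SD; an assumption for this sketch)

# z = (value - mean) / standard deviation
z_scores = [(x - mu) / sigma for x in data]

print(z_scores)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The value 9 has a z-score of 2.0, meaning it sits two standard deviations above the mean.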

Percentiles

measures that indicate the value below which a given percentage of observations fall in a dataset; two parts: above and below the value

Quartiles

values that divide a dataset into four equal parts, with each representing 25% of the data; (Q1, Q2 = Median, Q3)

Interquartile Range (IQR)

A measure of statistical dispersion that represents the range between the first quartile (Q1) and the third quartile (Q3) in a dataset. It effectively shows the middle 50% of data points, helping to identify outliers; resistant
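
A sketch of computing the IQR and using it to flag outliers. Two assumptions to note: quartiles are computed with Python's `statistics.quantiles` using `method="inclusive"`, and textbooks or calculators may use a slightly different quartile method and get different values; and the 1.5 × IQR fences are a common convention that is not stated on the card. The data are made up.

```python
import statistics

data = [2, 4, 5, 7, 8, 9, 11, 13, 30]  # hypothetical data

# Quartile method is an assumption; other methods can give different Q1 and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Common convention (assumed here): flag values beyond 1.5 * IQR from the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)  # 5.0 11.0 6.0
print(outliers)     # [30]
```

With this method the fences are -4 and 20, so only 30 is flagged as a potential outlier.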

5-Number Summary

A summary that provides five key values of a dataset: the minimum, first quartile (Q1), median, third quartile (Q3), and maximum. It offers a quick overview of the dataset's distribution.

Box-Plots

A graphical representation of the 5-number summary that displays the distribution of a dataset; shows the median, quartiles, and any potential outliers.