Unit 4: Exploring Data

Description and Tags

Statistics

Collecting Data

AP Statistics

Unit 1: Exploring One-Variable Data

Exploring Data

Statistics

Descriptive Methods

Types of Variables

Graphical Methods

Qualitative Data

Quantitative Data

Summarizing Distribution

Measures of Central Tendency, Variation, and Position

Univariate Data

Bivariate Data

Least Squares Regression Line

Outliers and Influential Points

11th

Last updated 1:59 AM on 3/31/23

101 Terms

New cards

Mosaic Plots

________: Stacked bar chart that shows percentages of data in groups.

Box plots

________: A graph that gives a quick picture of the middle 50% of the data.

Outliers

________: An observation that is surprisingly different from the rest of the data.

Bivariate data

________: Taking two measurements on each object (Ex.

Dotplot

________: Best for small data sets, similar to histograms and bar plots.

Numerical

________ or Quantitative: Outcomes can be measured arithmetically.

Sample

________: The part of the population that is actually studied.

Quartiles

________: Divide a set of values into four equal parts by using the 25th, 50th, and 75th percentiles.

First quartile (Q1)

________: 25% of values are below and 75% of values are above.

Correlation Coefficient

________: A numerical measure used to judge the strength of the linear relation between two variables.

standard deviation

The spread of a distribution can be quantified through the range, ________, or variance.

Second quartile (Q2)

________: 50% of the values are below and 50% of the values are above.

Spread

________: Describes how far the data points are from the center.

Univariate data

________: Taking only one measurement on each object (Ex.

Histogram

________: A graphical representation, in x-y form, of the distribution of data in a data set; x represents the data and y represents the frequency or relative frequency.

Shape

________: The form of a distribution, which can tell us where most of the data is.

Categorical

________ or Qualitative: Places the individual being studied into one of several groups.

Error

________ or residual: e = y - ŷ = (observed value of Y for a given value of X) - (predicted value of Y for that value of X).
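
As a rough illustration of the formula e = y - ŷ, the sketch below computes one residual per data point. The data values and the fitted line ŷ = 1 + 2x are hypothetical, chosen only so the arithmetic is easy to follow, and the function name `residuals` is an assumption for this example.

```python
def residuals(xs, ys, slope, intercept):
    """Return e = y - y_hat for each observed (x, y) pair."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical observations and a hypothetical fitted line y_hat = 1 + 2x.
xs = [1, 2, 3]
ys = [3.5, 4.5, 7.0]

errors = residuals(xs, ys, slope=2, intercept=1)
print(errors)  # [0.5, -0.5, 0.0]
```

A positive residual means the observed value lies above the predicted value, and a negative residual means it lies below.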

Population

________: The entire group of individuals or things that we are interested in.

Range

________: The difference between the largest and the smallest measurement in a data set.

graph

In a histogram, the ________ consists of contiguous rectangles.

Scatterplot

________: Graphical summary of the relationship between two quantitative variables.

Linear regression model

________: An equation that gives a straight-line relationship between two variables.

Direction

________: The scatterplot will show whether the y-value increases or decreases as x increases, or that it changes ________.

Positive z-score

________: Indicates that the measurement is larger than the mean.

Linear Regression

________: If two different quantitative variables have a linear relation, then we can measure the strength of that relationship using this.

Statistics

________: The science of data.

Stem

________-and-leaf graph or stemplot: Makes it easy to compute the median and other quantiles.

Positive relation

________: Increasing or upward trend between two variables.

Tabular Methods

________: Frequency distribution table (it facilitates the analysis of patterns of variation among observed data)

regression line

Predicted value: Computed using the estimated ________ and is also known as "y hat."

Coefficient of determination

________: Measures the percent of the variation in Y-values explained by the linear relation between X-values and Y-values.
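
One way to see "percent of variation explained" is to compare the leftover variation around the fitted line with the total variation around the mean of Y. The sketch below does that for a small hypothetical data set; the line ŷ = 2.2 + 0.6x is assumed to be the least squares line for that data, and the function name is an assumption for this example.

```python
def coefficient_of_determination(ys, predicted):
    """Return r^2 = 1 - SSE / SST for observed and predicted Y-values."""
    mean_y = sum(ys) / len(ys)
    # SSE: variation left unexplained by the line (sum of squared residuals).
    sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    # SST: total variation of the Y-values around their mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data; 2.2 + 0.6x is assumed to be its least squares line.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
predicted = [2.2 + 0.6 * x for x in xs]

r_squared = coefficient_of_determination(ys, predicted)
print(round(r_squared, 2))  # 0.6
```

Here about 60% of the variation in the Y-values is explained by the linear relation with X.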

Descriptive methods

________: The different methods used to summarize and describe collected data.

Population mean

________: Adding up all the values in the entire population and dividing by the number of values.

Frequency

________ (f): Number of times that observation has occurred.

Bar Charts

________: The length of the bar for each category is proportional to the number or percent of individuals in each category.

Cumulative Frequency Charts

________: Frequency for that group plus the frequencies of all groups of smaller observations.

Statistics

The science of data

Descriptive methods

The different methods used to summarize and describe collected data

Categorical or Qualitative

Places the individual being studied into one of several groups

Numerical or Quantitative

Outcomes can be measured arithmetically

Univariate data

Taking only one measurement on each object (Ex

Bivariate data

Taking two measurements on each object (Ex

Tabular Methods

Frequency distribution table (it facilitates the analysis of patterns of variation among observed data)

n

Denotes the number of observations

Frequency (f)

Number of times that observation has occurred

Relative frequency

Ratio of the frequency to the total number of observations

Cumulative frequency

Gives the number of observations less than or equal to a specific value

Frequency distribution table

A table giving all possible values of a variable and their frequencies
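
The frequency, relative frequency, and cumulative frequency cards above can be tied together in one small table. This is a minimal sketch, assuming a hypothetical list of observations; the function name `frequency_table` and the row layout are choices made for this example only.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (value, frequency, relative frequency, cumulative frequency)."""
    n = len(values)            # number of observations
    counts = Counter(values)   # frequency f of each distinct value
    rows = []
    cumulative = 0
    for value in sorted(counts):
        f = counts[value]
        cumulative += f        # observations less than or equal to this value
        rows.append((value, f, f / n, cumulative))
    return rows


# Hypothetical observations.
scores = [1, 2, 2, 3, 3, 3, 4, 4]
table = frequency_table(scores)
for row in table:
    print(row)
```

The last cumulative frequency always equals n, and the relative frequencies add up to 1.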

Bar Charts

The length of the bar for each category is proportional to the number or percent of individuals in each category

Pie Chart

Categories of data are represented by wedges in a circle and are proportional in size to the percentage of individuals in each category

Segmented Bar Chart

Shows the distribution within each group as one bar per group, arranged along either the horizontal or vertical axis, with each bar divided into segments by relative frequency

Mosaic Plots

Stacked bar chart that shows percentages of data in groups

Center

Describes the "typical" or central data points

Spread

Describes how far the data points are from the center

Shape

The shape of a distribution can tell us where most of the data is

Symmetrical Distribution

The data is spread out in the same way on both sides and there is the same amount of data on each side of the center

Skewed Distribution

A distribution in which extreme values in only one direction give one side a longer tail

Cluster sample

A sample in which the researcher first divides the population into sections (or clusters), and then randomly selects all members from some of those clusters

Outliers

An observation that is surprisingly different from the rest of the data

Stem-and-leaf graph or stemplot

A display that makes it easy to compute the median and other quantiles

Dotplot

Best for small data sets, similar to histograms and bar plots

Histogram

A graphical representation, in x-y form, of the distribution of data in a data set; x represents the data and y represents the frequency or relative frequency

Cumulative Frequency Charts

Frequency for that group plus the frequencies of all groups of smaller observations

Population

The entire group of individuals or things that we are interested in

Sample

The part of the population that is actually studied

Mean

The arithmetic mean, also known as the average

Population mean

Adding up all the values in the entire population and dividing by the number of values

Median

Point that divides the measurements in half
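
A quick way to check the mean and median cards is with Python's standard `statistics` module. The data values below are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical data set.
data = [2, 4, 4, 5, 10]

mean_value = statistics.mean(data)      # sum of the values divided by their count
median_value = statistics.median(data)  # middle value of the sorted data

# With an even number of values, the median is the average of the two middle values.
even_median = statistics.median([1, 2, 3, 4])

print(mean_value, median_value, even_median)
```

In this data set the single large value 10 pulls the mean (5) above the median (4).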

Range

The difference between the largest and the smallest measurement in a data set

Interquartile range

The range of the middle 50% of the data, the difference between the third quartile and the first quartile

Standard deviation

A number that is equal to the square root of the variance and measures how far data values are from their mean

Variance

Average of the squared deviations from the mean
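
The variance and standard deviation cards can be sketched directly from their definitions. The version below divides by the number of values, which matches the "average of the squared deviations" wording and corresponds to the population variance; a sample variance would divide by n - 1 instead. The data and function names are assumptions for this example.

```python
import math


def population_variance(values):
    """Average of the squared deviations from the mean (divides by N)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = population_variance(data)
std_dev = population_std_dev(data)
print(variance, std_dev)  # 4.0 2.0
```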

Percentiles

Divide a set of values into 100 equal parts

Quartiles

Divide a set of values into four equal parts by using the 25th, 50th, and 75th percentiles

First quartile (Q1)

25% of values are below and 75% of values are above

Second quartile (Q2)

50% of the values are below and 50% of the values are above

Third quartile (Q3)

75% of values are below and 25% of values are above
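
The quartile and interquartile range cards can be tried out with `statistics.quantiles` from the Python standard library. Textbooks and software use slightly different rules for locating quartiles, so the `method="inclusive"` choice below is only one convention and may not match a given course's hand method on every data set. The data values are hypothetical.

```python
import statistics

# Hypothetical data set, already sorted for readability.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Interquartile range: third quartile minus first quartile.
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 3.0 5.0 7.0 4.0
```

The second cut point is the median, and the IQR is the width of the box in a box plot.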

Standardized scores or z-scores

Give the distance between a measurement and the mean in terms of the number of standard deviations

Negative z-score

Indicates that the measurement is smaller than the mean

Positive z-score

Indicates that the measurement is larger than the mean
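
The z-score cards describe the standard formula z = (x - mean) / standard deviation. The sketch below applies it to two hypothetical measurements from a distribution assumed to have mean 5 and standard deviation 2; the function name is an assumption for this example.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


# Hypothetical distribution: mean 5, standard deviation 2.
above = z_score(9, mean=5, std_dev=2)  # measurement larger than the mean
below = z_score(2, mean=5, std_dev=2)  # measurement smaller than the mean

print(above, below)  # 2.0 -1.5
```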

Box plots

A graph that gives a quick picture of the middle 50% of the data

Bivariate data

Data on two different variables collected from each item in a study

Linear Regression

If two different quantitative variables have a linear relation, then we can measure the strength of that relationship using this

Scatterplot

Graphical summary of the relationship between two quantitative variables

Shape

A scatterplot tells us whether the nature of the relation between the two variables is linear or nonlinear

Direction

The scatterplot will show whether the y-value increases or decreases as x increases, or that it changes direction

Positive relation

Increasing or upward trend between two variables

Negative relation

Decreasing or downward trend between the two variables

Strength of relationship

If the trend of the data can be described with a line or a curve, then the spread of the data values around the line or curve describes the degree of the relation between the two variables

Correlation Coefficient

A numerical measure used to judge the strength of the linear relation between two variables
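
As a sketch of how such a measure can be computed, the function below implements the Pearson correlation coefficient r, which is the version normally meant in an introductory course. The card itself does not name a formula, so treat this choice, the function name, and the data as assumptions for illustration.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data with a moderately strong positive relation.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(round(r, 4))  # 0.7746
```

Values of r near +1 or -1 indicate a strong linear relation, and values near 0 indicate a weak one.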

Linear regression model

An equation that gives a straight-line relationship between two variables

Independent variable

Dependent variable

Slope

y-intercept

Predicted value

Computed using the estimated regression line and is also known as "y hat"

Least squares regression line

Line that minimizes the sum of the squares of the residuals
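
The line that minimizes the sum of squared residuals has a closed-form slope and y-intercept. The sketch below uses the standard formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x); the data and the function name `least_squares_line` are assumptions for this example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept


# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
predicted_at_6 = intercept + slope * 6  # "y hat" for x = 6
print(round(slope, 2), round(intercept, 2), round(predicted_at_6, 2))  # 0.6 2.2 5.8
```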

Outliers

Observed data points that are far from the least squares line

Influential points

Observed data points that are far from the other observed data points in the horizontal direction