Unit 4: Exploring Data


101 Terms

1

Mosaic Plots

________: Stacked bar chart that shows percentages of data in groups.

2

Box plots

________: a graph that gives a quick picture of the middle 50% of the data.

3

Outliers

________: An observation that is surprisingly different from the rest of the data.

4

Bivariate data

________: Taking two measurements on each object.

5

Dotplot

________: Best for small data sets, similar to histograms and bar plots.

6

Numerical

________ or Quantitative: Outcomes can be measured arithmetically.

7

Sample

________: The part of the population that is actually studied.

8

Quartiles

________: Divide a set of values into four equal parts by using the 25th, 50th, and 75th percentiles.

9

Q1

________: 25% of values are below and 75% of values are above.

10

Correlation Coefficient

________: Numerical measures used to judge the relation between two variables.

11

standard deviation

Spread can be quantified through the range, ________, or variance of a distribution.

12

Q2

________: 50% of the values are below and 50% of the values are above.

13

Spread

________: Describes how far the data points are from the center.

14

Univariate data

________: Taking only one measurement on each object.

15

Histogram

________: a graphical representation in the x-y form of the distribution of data in a data set; x represents the data and y represents the frequency or relative frequency.

16

Shape

________: This feature of a distribution tells us where most of the data is.

17

Categorical

________ or Qualitative: Places the individual being studied into one of several groups.

18

Error

________ or residual: e = y − ŷ = (observed value of Y for a given value of X) − (predicted value of Y for that value of X).

19

Population

________: The entire group of individuals or things that we are interested in.

20

Range

________: The difference between the largest and the smallest measurement in a data set.

21

graph

The ________ of a histogram consists of contiguous rectangles.

22

Scatterplot

________: Graphical summary of bivariate data in which each pair of measurements is plotted as a point.

23

Linear regression model

________: An equation that gives a straight-line relationship between two variables.

24

Direction

________: The scatterplot will show whether the y-value increases or decreases as the x-value increases, or that it changes ________.

25

Positive z score

________: Indicates that the measurement is larger than the mean.

26

Linear Regression

________: If two different quantitative variables have a linear relation, then we can measure the strength of that relationship using this.

27

Statistics

________: The science of data.

28

Stem

________-and-leaf graph or stemplot: A display that splits each value into a stem and a leaf; it makes it easy to compute the median and other quantiles.

29

Positive relation

________: Increasing or upward trend between two variables.

30

Tabular Methods

________: Frequency distribution table (it facilitates the analysis of patterns of variation among observed data)

31

regression line

Predicted value: computed using the estimated ________ and is also known as "y hat."

32

Coefficient of determination

________: measures the percent of the variation in Y-values explained by the linear relation between X- and Y-values.
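
One way to see this percentage concretely is to compare the variation left over around the fitted line (SSE) with the total variation in the Y-values (SST). The sketch below assumes the least squares line ŷ = a + bx and uses made-up data; the function name `r_squared` is only illustrative.

```python
def r_squared(xs, ys):
    """Share of the variation in ys explained by the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b and intercept a of the least squares line y_hat = a + b*x.
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    # SSE: variation left unexplained; SST: total variation in y.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 80]
print(f"{r_squared(hours, scores):.1%} of the variation in scores is explained")
```

For simple linear regression this value equals the square of the correlation coefficient.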

33

Descriptive methods

________: The different methods used to summarize and describe data.

34

Population mean

________: Adding up all the values in the entire population and dividing by the number of values.

35

Frequency

________ (f): Number of times that observation has occurred.

36

Bar Charts

________: The length of the bar for each category is proportional to the number or percent of individuals in each category.

37

Cumulative Frequency Charts

________: Frequency for that group plus the frequencies of all groups of smaller observations.

38

Statistics

The science of data

39

Descriptive methods

The different methods used to summarize and describe data

40

Categorical or Qualitative

Places the individual being studied into one of several groups

41

Numerical or Quantitative

Outcomes can be measured arithmetically

42

Univariate data

Taking only one measurement on each object

43

Bivariate data

Taking two measurements on each object

44

Tabular Methods

Frequency distribution table (it facilitates the analysis of patterns of variation among observed data)

45

n

Denotes the number of observations

46

Frequency (f)

Number of times that observation has occurred

47

Relative frequency

Ratio of the frequency to the total number of observations

48

Cumulative frequency

Gives the number of observations less than or equal to a specific value

49

Frequency distribution table

A table giving all possible values of a variable and their frequencies
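
The frequency, relative frequency, and cumulative frequency cards can be tied together in one small table. The following is a minimal sketch on made-up data; the function name `frequency_table` and the sample values are assumptions for illustration only.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(observations):
    """Return rows of (value, frequency, relative frequency, cumulative frequency)."""
    n = len(observations)                  # n: number of observations
    counts = Counter(observations)         # f: times each value occurred
    values = sorted(counts)
    freqs = [counts[v] for v in values]
    cumulative = list(accumulate(freqs))   # observations <= each value
    return [(v, f, f / n, c) for v, f, c in zip(values, freqs, cumulative)]

# Hypothetical data: number of siblings reported by 10 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]
table = frequency_table(siblings)
for value, f, rel, cum in table:
    print(value, f, rel, cum)
```

The last cumulative frequency always equals n, and the relative frequencies add up to 1.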

50

Bar Charts

The length of the bar for each category is proportional to the number or percent of individuals in each category

51

Pie Chart

Categories of data are represented by wedges in a circle and are proportional in size to the percentage of individuals in each category

52

Segmented Bar Chart

Takes the distribution from each group and arranges them along either the horizontal or vertical axis and shows the relative frequency of each group represented in one bar for each group

53

Mosaic Plots

Stacked bar chart that shows percentages of data in groups

54

Center

Describes the "typical" or central data points

55

Spread

Describes how far the data points are from the center

56

Shape

Distribution can tell us where most of the data is

57

Symmetrical Distribution

The data is spread out in the same way on both sides and there is the same amount of data on each side of the center

58

Skewed Distribution

A distribution in which extreme values in only one direction cause one side to have a longer tail

59

Cluster sample

A sample in which the researcher first divides the population into sections (or clusters), then randomly selects some of those clusters and includes all members of the selected clusters

60

Outliers

An observation that is surprisingly different from the rest of the data

61

Stem-and-leaf graph or stemplot

A display that splits each value into a stem and a leaf; it makes it easy to compute the median and other quantiles
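
To show why a stemplot makes quantiles easy to read off, here is a small sketch that builds one. It assumes non-negative whole-number data with the tens digit (and above) as the stem and the ones digit as the leaf; the function name `stemplot` and the scores are made up for illustration.

```python
from collections import defaultdict

def stemplot(values):
    """Map each stem (tens and above) to its sorted leaves (ones digits)."""
    stems = defaultdict(list)
    for v in sorted(values):          # sorting keeps the leaves in order
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return dict(stems)

# Hypothetical test scores.
scores = [72, 85, 91, 68, 77, 85, 93, 79, 88, 64]
plot = stemplot(scores)
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

Because the leaves appear in sorted order, the median can be found by counting in from either end of the display.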

62

Dotplot

Best for small data sets, similar to histograms and bar plots

63

Histogram

a graphical representation in the x-y form of the distribution of data in a data set; x represents the data and y represents the frequency or relative frequency
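
The y-values of a histogram are just counts of how many data values fall in each interval on the x-axis. The sketch below counts them by hand, assuming equal-width bins where each bin includes its left edge but not its right edge; the function name, bin choices, and data are all illustrative assumptions.

```python
def histogram_counts(data, bin_start, bin_width, num_bins):
    """Count how many values fall in each equal-width bin [left, right)."""
    counts = [0] * num_bins
    for x in data:
        index = int((x - bin_start) // bin_width)
        if 0 <= index < num_bins:     # values outside the bins are ignored
            counts[index] += 1
    return counts

# Hypothetical test scores, binned as 60-70, 70-80, 80-90, 90-100.
scores = [64, 68, 72, 77, 79, 85, 85, 88, 91, 93]
counts = histogram_counts(scores, bin_start=60, bin_width=10, num_bins=4)
relative = [c / len(scores) for c in counts]   # relative frequencies
print(counts, relative)
```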

64

Cumulative Frequency Charts

Frequency for that group plus the frequencies of all groups of smaller observations

65

Population

The entire group of individuals or things that we are interested in

66

Sample

The part of the population that is actually studied

67

Mean

The arithmetic mean, also known as the average

68

Population mean

Adding up all the values in the entire population and dividing by the number of values

69

Median

Point that divides the measurements in half
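
A quick comparison shows how the two measures of center behave when one value is extreme. This sketch uses Python's standard `statistics` module on made-up incomes.

```python
import statistics

# Hypothetical incomes in thousands; the last value is an outlier.
incomes = [32, 35, 38, 41, 44, 47, 250]

mean = statistics.mean(incomes)       # sum of values divided by their count
median = statistics.median(incomes)   # middle value of the sorted data

print(f"mean = {mean:.1f}, median = {median}")
```

The single large value pulls the mean well above most of the data, while the median stays at the middle observation.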

70

Range

The difference between the largest and the smallest measurement in a data set

71

Interquartile range

The range of the middle 50% of the data, the difference between the third quartile and the first quartile

72

Standard deviation

A number that is equal to the square root of the variance and measures how far data values are from their mean

73

Variance

Average of the squared deviations from the mean
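
The three measures of spread can be computed side by side. The sketch below follows the card's wording literally, so it divides by the number of values (the population form); the data are made up.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements

data_range = max(data) - min(data)                         # largest minus smallest
mean = statistics.mean(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)  # average squared deviation
std_dev = variance ** 0.5                                  # square root of the variance

print(data_range, variance, std_dev)
```

The standard library's `statistics.pvariance` and `statistics.pstdev` give the same population values, while `statistics.variance` and `statistics.stdev` divide by n − 1 for sample data.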

74

Percentiles

Percentiles divide a set of values into 100 equal parts

75

Quartiles

Divide a set of values into four equal parts by using the 25th, 50th, and 75th percentiles

76

Q1

25% of values are below and 75% of values are above

77

Q2

50% of the values are below and 50% of the values are above

78

Q3

75% of values are below and 25% of values are above
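
Q1, Q2, and Q3 and the interquartile range can be computed with the standard library. Textbooks and software use several slightly different conventions for locating quartiles, so the `method="inclusive"` choice below is an assumption and other methods may give slightly different values on the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical measurements

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                         # range of the middle 50% of the data

print(q1, q2, q3, iqr)
```

Q2 is the median, so it matches `statistics.median(data)`.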

79

Standardized scores or z-scores

Gives the distance between the measurements and the mean in terms of the number of standard deviations

80

Negative z-score

Indicates that the measurement is smaller than the mean

81

Positive z-score

Indicates that the measurement is larger than the mean
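
The z-score cards reduce to one formula: subtract the mean and divide by the standard deviation. A minimal sketch, with a made-up exam whose mean and standard deviation are assumed values:

```python
def z_score(x, mean, std_dev):
    """Distance of x from the mean, in units of standard deviations."""
    return (x - mean) / std_dev

# Hypothetical exam with mean 70 and standard deviation 8.
above = z_score(86, mean=70, std_dev=8)   # positive: larger than the mean
below = z_score(62, mean=70, std_dev=8)   # negative: smaller than the mean

print(above, below)
```

A score of 86 sits two standard deviations above the mean, and a score of 62 sits one standard deviation below it.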

82

Box plots

a graph that gives a quick picture of the middle 50% of the data

83

Bivariate data

Data on two different variables collected from each item in a study

84

Linear Regression

If two different quantitative variables have a linear relation, then we can measure the strength of that relationship using this

85

Scatterplot

Graphical summary of bivariate data in which each pair of measurements is plotted as a point

86

Shape

A scatterplot tells us whether the relation between the two variables is linear or nonlinear

87

Direction

The scatterplot will show whether the y-value increases or decreases as the x increases, or that it changes direction

88

Positive relation

Increasing or upward trend between two variables

89

Negative relation

Decreasing or downward trend between the two variables

90

Strength of relationship

If the trend of the data can be described with a line or a curve, then the spread of the data values around that line or curve describes the strength of the relation between the two variables

91

Correlation Coefficient

Numerical measures used to judge the relation between two variables
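
The most common such measure for a linear relation is Pearson's r, which ranges from −1 to 1. The sketch below computes it from its definition; choosing Pearson's r is an assumption here, and the function name and data are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 80]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

A value near 1 indicates a strong positive relation, a value near −1 a strong negative relation, and a value near 0 little or no linear relation.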

92

Linear regression model

An equation that gives a straight-line relationship between two variables

93

Independent variable

x

94

Dependent variable

y

95

Slope

b

96

y-intercept

a

97

Predicted value

computed using the estimated regression line and is also known as "y hat"

98

Least square regression line

line that minimizes the sum of the squares of the residuals
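
Using the notation from the surrounding cards (slope b, y-intercept a, predicted value ŷ = a + bx), the least squares line has a closed-form solution. The sketch below computes it on made-up data and then the residuals y − ŷ; the function name is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y_hat = a + b*x that minimizes squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))   # slope
    a = mean_y - b * mean_x                       # y-intercept
    return a, b

# Hypothetical data: hours studied (x) and exam score (y).
xs = [1, 2, 3, 4, 5]
ys = [52, 58, 65, 70, 80]

a, b = least_squares_line(xs, ys)
predicted = [a + b * x for x in xs]                       # "y hat" values
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print(f"y_hat = {a:.1f} + {b:.1f}x")
print([round(e, 1) for e in residuals])
```

The residuals of a least squares line with an intercept always sum to zero, which is a handy check on the arithmetic.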

99

Outliers

are observed data points that are far from the least squares line

100

Influential points

observed data points that are far from the other observed data points in the horizontal direction
