Stats: Unit 1

64 Terms

1

Data

information about a specified collection

2

Statistics

the science of planning studies and experiments; obtaining data; organizing and summarizing those data; and then drawing conclusions.

3

Population

the collection of ALL potential persons/objects/etc under study.

4

Sample

the portion of persons/objects/etc. under study for which data have been gathered.

5

For best results, a sample should have…

the same characteristics as the population it is representing.

6

Variable

a specific type of measurement

7

Parameter

is a numerical summary for a variable of a population

8

Statistic

is a numerical summary for a variable of a sample
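
The difference between a parameter and a statistic can be sketched in Python. The population below (1,000 simulated heights) and the sample size of 50 are made-up assumptions for illustration, not data from the course.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated heights (cm) of 1,000 individuals.
population = [random.gauss(170, 10) for _ in range(1000)]

# Parameter: a numerical summary computed from the WHOLE population.
parameter = statistics.mean(population)

# Statistic: the same summary computed from a sample only.
sample = random.sample(population, 50)
statistic = statistics.mean(sample)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two numbers are usually close but not identical, because the statistic only sees part of the population.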

9

Simple Random

each individual in the population has an equal chance of being chosen. Hard to obtain in real life; even a process like picking a ball from a bin blindfolded may deviate if not done carefully. This type of sample is the most likely to imitate the characteristics of its host population.
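
As a minimal sketch, a simple random sample can be drawn in Python with `random.sample`, which picks without replacement. The population of ID numbers 1 to 100 and the sample size of 10 are assumptions made up for the example.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: individuals identified by the numbers 1..100.
population = list(range(1, 101))

# Draw 10 individuals without replacement; each individual has the
# same chance of being chosen.
sample = random.sample(population, 10)

print(sample)
```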

10

Systematic

taking items from a sorted list at regular intervals. For example, checking every 100th item made along a production line for quality control.
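
The production-line example can be sketched with list slicing. The list of 1,000 numbered items is a hypothetical stand-in for the items coming off the line.

```python
# Hypothetical production run: items numbered 1..1000 in the order made.
items = list(range(1, 1001))

step = 100  # check every 100th item

# Start at index step - 1 (the 100th item) and then jump by `step`.
systematic_sample = items[step - 1::step]

print(systematic_sample)
```

In practice the starting point is often chosen at random within the first interval; the fixed start here just keeps the sketch simple.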

11

Stratified

First, divide the whole population into categories, then sample proportionally from each category. Note that the categories must not leave anyone out and must not overlap.
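
A rough sketch of proportional stratified sampling follows. The class-year categories, their sizes, and the helper name `stratified_sample` are all hypothetical choices for illustration.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population of (id, category) pairs. Every individual
# belongs to exactly one category: no one left out, no overlap.
population = (
    [(i, "freshman") for i in range(500)]
    + [(i, "sophomore") for i in range(500, 800)]
    + [(i, "junior") for i in range(800, 1000)]
)

def stratified_sample(pop, fraction):
    """Sample the same fraction from every category (stratum)."""
    strata = {}
    for person in pop:
        strata.setdefault(person[1], []).append(person)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # proportional share
        sample.extend(random.sample(members, k))
    return sample

sample = stratified_sample(population, 0.10)
print(len(sample))
```

With a 10% fraction, the sample keeps the same 50/30/20 split between categories as the population.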

12

Cluster

First, identify clusters within a population; these may overlap and/or leave some individuals out. Then, choose a sample of clusters and gather samples from within each chosen cluster. For example, choosing 4 coffee shops around town and then asking 10 people from each a survey question.
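
The coffee-shop example can be sketched as a two-stage draw. The 12 shops with 40 customers each are invented numbers; only the "4 shops, 10 people each" part comes from the card.

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical clusters: 12 coffee shops, each with 40 customers.
shops = {
    f"shop_{s}": [f"shop_{s}_customer_{c}" for c in range(40)]
    for s in range(12)
}

# Stage 1: choose a sample of clusters (4 shops).
chosen_shops = random.sample(sorted(shops), 4)

# Stage 2: gather a sample from within each chosen cluster (10 people).
sample = []
for shop in chosen_shops:
    sample.extend(random.sample(shops[shop], 10))

print(chosen_shops)
print(len(sample))
```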

13

Convenience

A sample which is taken quickly without any particular coordination. For example, a marketer may ask people leaving a nearby store for their opinions on a new product. The goal is usually not high-quality information.

14

Categorical

data consist of names or labels. These are also called qualitative data.

15

Numerical

data are numbers which actually represent the amount of measurement of something. These are also called quantitative data.

16

Discrete

data have gaps between possible values. The number of possible values within a window is finite.

17

Continuous

data have no gaps between possible values. The number of possible values within a window is infinite.

18

Nominal

a variable whose values are only names, with no structure.

19

Ordinal

a variable whose values are labels with a natural order.

20

A variable is called interval if its values…

have differences with a consistent meaning.

21

A variable is called ratio if its values…

have quotients (results of division) with a consistent meaning.

22

Ratio values have a…

natural zero, which indicates an absence of the quantity being measured.

23

A measure of center…

is a value (or values) toward the middle of a data set, which tends to be close to the other data values.

24

Mean

sum of all the data values divided by the number of values

25

Median

of a data set is a value where the count of smaller data values and the count of larger data values are equal

26

Mode

of a data set is/are the value(s) which occur most often.
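
The three measures of center (mean, median, mode) can be computed with Python's standard `statistics` module. The small data set is made up for illustration.

```python
import statistics

# Hypothetical data set (already sorted for readability).
data = [2, 3, 3, 5, 7, 10]

# Mean: sum of the values divided by how many there are (30 / 6).
mean = statistics.mean(data)

# Median: with an even count, the average of the two middle values (3 and 5).
median = statistics.median(data)

# Mode(s): the value(s) occurring most often; returned as a list
# because a data set can have more than one mode.
modes = statistics.multimode(data)

print(mean, median, modes)
```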

27

A summary is consistent…

if its value varies little between samples from the same population.

28

A summary is accurate…

if its value tends to be close to the associated population parameter.

29

A summary is robust…

if its value changes little with the addition or removal of extreme (large or small) values from a data set.
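
Robustness can be illustrated by adding one extreme value to a small data set and watching how much each summary moves. The numbers are hypothetical; in this sketch the median barely changes while the mean shifts a lot.

```python
import statistics

# Hypothetical data set, then the same data with one extreme value added.
data = [10, 12, 13, 15, 16]
with_extreme = data + [200]

# How far each summary moves when the extreme value is added.
mean_shift = statistics.mean(with_extreme) - statistics.mean(data)
median_shift = statistics.median(with_extreme) - statistics.median(data)

print(f"mean moved by   {mean_shift:.2f}")
print(f"median moved by {median_shift:.2f}")
```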

30

Measure of Spread

is a value which indicates how far data points tend to lie either from each other, or from a measure of center

31

Range

difference between the maximum and minimum values

32

Inter-quartile range (IQR)

difference between the third and first quartiles: IQR = Q3 - Q1
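
A short sketch of the range and the inter-quartile range follows. The data are made up, and `statistics.quantiles` uses one particular quartile convention (its default "exclusive" method), so the quartiles may differ slightly from the method taught in a given course.

```python
import statistics

# Hypothetical data set.
data = [1, 3, 4, 6, 7, 9, 12, 15]

# Range: maximum minus minimum.
data_range = max(data) - min(data)

# Quartiles: n=4 splits the data into four parts and returns Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Inter-quartile range: IQR = Q3 - Q1.
iqr = q3 - q1

print(data_range, q1, q3, iqr)
```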

33

Variance

of a data set is the average squared deviation from the mean
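
The definition can be followed step by step in Python. This sketch divides by the number of values, as the card describes (the population variance); note that many courses divide by one less than the number of values when working with a sample. The data set is made up.

```python
import statistics

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

# Squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]

# Variance: the average of the squared deviations.
variance = sum(squared_deviations) / len(data)

# Standard deviation: the square root of the variance.
std_dev = variance ** 0.5

print(variance, std_dev)
print(statistics.pvariance(data))  # library equivalent of `variance`
```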

34

Skew

of a data set measures how far data values tend to lean in either direction, relative to the center.

35

Skewed down/left…

if data values smaller than the center are farther from the center than data values larger than the center (negative skew).

36

Skewed up/right

if data values smaller than the center are closer to the center than data values larger than the center (positive skew).
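
One common way to put a number on skew is the moment-based skewness coefficient, sketched below. This is an assumption about how skew might be measured, not necessarily the formula used in the course; the helper name `skewness` and the data sets are made up.

```python
def skewness(data):
    """Moment-based skewness: average cubed deviation from the mean,
    scaled by the variance raised to the power 1.5."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical data with one value far ABOVE the rest: skewed up/right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

# Mirror image, with one value far BELOW the rest: skewed down/left.
left_skewed = [-x for x in right_skewed]

print(skewness(right_skewed))  # positive
print(skewness(left_skewed))   # negative
```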

37

z-score

(or standard score or standardized value) is the number of standard deviations that a given value x is above or below the mean.

38

If a z-score is less than -2

the data value is considered extremely small

39

If a z-score is more than 2

the data value is considered very large
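
A z-score calculation can be sketched as follows, assuming the cutoffs are z below -2 for extremely small values and z above 2 for very large ones. The data set, the helper names `z_score` and `classify`, and the "typical" label are hypothetical.

```python
import statistics

# Hypothetical data set with mean 5 and standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(data)
std_dev = statistics.pstdev(data)  # divides by the number of values

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

def classify(z):
    """Label a value by its z-score using the assumed cutoffs of -2 and 2."""
    if z < -2:
        return "extremely small"
    if z > 2:
        return "very large"
    return "typical"  # hypothetical label for everything in between

for x in (0, 5, 10):
    z = z_score(x, mean, std_dev)
    print(x, z, classify(z))
```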

40

A frequency table

shows how data are partitioned among the different categories, called classes

41

Frequency

of a class is the count of data values that lie in the class
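
A frequency table can be sketched with `collections.Counter`. The test scores, the class width of 10, and the helper name `class_label` are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical test scores.
scores = [52, 67, 71, 78, 83, 85, 88, 91, 94, 99]

def class_label(x, width=10):
    """Return the class a value falls into, e.g. 83 -> '80-89'."""
    low = (x // width) * width
    return f"{low}-{low + width - 1}"

# Frequency of a class: the count of data values that lie in the class.
frequency_table = Counter(class_label(s) for s in scores)

for label in sorted(frequency_table):
    print(label, frequency_table[label])
```

The frequencies add up to the number of data values, since every value lands in exactly one class.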

42

Bar graph

has gaps between the bars

43

Histogram

has NO gaps between the bars

44

Statistical Study Approach

  1. Ask questions

  2. Determine how to gather the sample and which variables to measure

  3. Analyze the data and create appropriate summaries

  4. Create a picture of the results, drawing conclusions where appropriate

45

Sample error

arises directly from the sampling methodology and varies between samples of the same population

46

Sample bias

occurs when some individuals from the population are more/less likely to be chosen than others.

47

Measurement error

arises from the data collection process as a result of how information is measured from an individual. Two types: random and systematic.

48

Random errors

occur fresh (independently) from one observation to the next

49

Systematic errors

occur in a related fashion between observations

50

Outlier

an observation that lies very far from the other values in a sample; the exact definition depends on context.
