[MMW] M4 - Data Management

37 Terms

1
statistics
deals with numerical and categorical data
2
numerical

categorical
statistics deals with ____ and ____ data
3
numerical data
data representing counts or measurements
4
categorical data
data that can be classified into categories
5
observation
any recording of information
6
collection

presentation

analysis

interpretation of data
4 statistical methods
7
descriptive statistics
using numerical and graphical tools to get a better understanding of the information contained in a set of data.
8
statistical inference
* Comprises those methods concerned with the analysis of a subset of data leading to predictions or inferences about the entire set of data
* *Infer the expected amount of rain for July next year based on the average precipitation data for July in the past 30 years*
* *Estimate the average depth of a lake from measurements taken from a set of random areas*
9
population
N

* Consists of the totality of the observations with which we are concerned
* May be finite or infinite
10
sample
n

subset of population

for the inference from the sample to be valid, the sample must be representative of the population
11
simple random sample
* A sample of n observations chosen in such a way that every subset of n observations of the population has the same probability of being selected
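
The definition above can be tried out with a minimal sketch. It assumes the population is small enough to list in full; the function name `simple_random_sample`, the population of 100 labelled units, and the seed are all hypothetical choices made for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct observations so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # the seed only makes the draw reproducible
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population of N = 100 labelled units
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```

Sampling without replacement is used here because each unit should appear at most once in the sample.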
12
tables
* Summarizes raw data into an organized list or distribution of counts and may include sums and proportions
13
bar graphs
often used to compare quantities in different categories
14
pie graph
used to show the distribution of proportions of parts to a whole
15
line graph
shows information that is connected in some way, such as changes over time
16
frequency distribution table
* Organization of raw data in table form, using **classes** and **frequencies**
* Each class is defined by its **class limits**, which are the smallest and largest data values that can be included in the class
* **Class boundaries** are numbers used to separate the classes so that there are no gaps in the frequency distribution
* *There must be 5 to 20 classes that are* ***mutually exclusive and exhaustive***
* ***Classes must be continuous and of equal width***
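
A minimal sketch of building such a table follows. It assumes whole-number data, so each class boundary sits half a unit beyond the class limits. The scores, the function name `frequency_table`, and the choice of 5 classes of width 10 are made up for illustration.

```python
def frequency_table(data, lowest_limit, width, num_classes):
    """Return rows of (lower limit, upper limit, lower boundary, upper boundary, frequency)."""
    rows = []
    for k in range(num_classes):
        lower = lowest_limit + k * width
        upper = lower + width - 1                 # class limits for whole-number data
        freq = sum(lower <= x <= upper for x in data)
        # Boundaries sit halfway between adjacent limits, so classes leave no gaps
        rows.append((lower, upper, lower - 0.5, upper + 0.5, freq))
    return rows

# Hypothetical raw scores
scores = [12, 15, 21, 22, 25, 28, 31, 33, 34, 37, 38, 41, 45, 47, 52]
table = frequency_table(scores, lowest_limit=10, width=10, num_classes=5)

for lower, upper, lb, ub, freq in table:
    print(f"limits {lower}-{upper}  boundaries {lb}-{ub}  frequency {freq}")
```

Because the classes are mutually exclusive and exhaustive, the frequencies add up to the number of observations.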
17
class limits
* Each class is defined by its **_____**, which are the smallest and largest data values that can be included in the class
18
class boundaries
**______** are numbers used to separate the classes so that there are no gaps in the frequency distribution

classes must be **mutually exclusive and exhaustive**

continuous and of equal width
19
histogram
graph of **frequencies** against **class boundaries**
20
frequency polygon
line graph of **frequencies** against **class marks**

must be closed at the lowest and highest boundaries
21
ogive
line graph of **cumulative frequency** against the **upper class boundaries**
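
The points plotted on an ogive can be computed with a running total, as in this sketch. The upper boundaries and frequencies below are hypothetical values chosen for illustration.

```python
from itertools import accumulate

# Hypothetical frequency distribution: upper class boundaries and class frequencies
upper_boundaries = [19.5, 29.5, 39.5, 49.5, 59.5]
frequencies = [2, 4, 5, 3, 1]

# Cumulative frequency = running total of the class frequencies
cumulative = list(accumulate(frequencies))

# Each ogive point pairs an upper boundary with the cumulative frequency up to it
ogive_points = list(zip(upper_boundaries, cumulative))
print(ogive_points)
```

The last cumulative frequency equals the total number of observations, which is a quick check on the arithmetic.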
22
**MEASURE OF CENTRAL TENDENCY**
* Gives a single value that acts as a ***representative*** or ***average*** of all the values (outcomes) in your data set
23
central tendency
goal is to identify a *single value* that is the best representative of the entire set of data
24
mean
arithmetic average: the sum of all values divided by the number of values
25
median
midpoint of the data set when arranged from smallest to largest
26
mode
most frequently occurring category or score
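
The three measures of central tendency can be compared on one small data set. This is a sketch using Python's standard `statistics` module; the scores are made up for illustration.

```python
import statistics

# Hypothetical scores, already arranged from smallest to largest
scores = [3, 5, 5, 6, 8, 9, 13]

mean = statistics.mean(scores)      # sum of values / number of values = 49 / 7
median = statistics.median(scores)  # middle value of the 7 ordered scores
mode = statistics.mode(scores)      # the score that occurs most often

print(mean, median, mode)
```

Here the three measures differ (7, 6 and 5), which shows that each summarizes the "center" of the data in a different way.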
27
measures of dispersion
* Measures of the **average distance** of each observation from the center of the distribution
* Summarize and describe the extent to which scores in a distribution ***differ from each other***

Tells us ***how spread out*** the scores are
28
range
highest - lowest value

simplest but most unreliable measure of dispersion since it only uses 2 values
29
variance
measure of how much the data points in a dataset vary or deviate from the average or mean value.

average of the squared differences between each data point and the mean

square of the standard deviation
30
standard deviation
square root of the variance

measure of how spread out the data is; easier to interpret than the variance because it is expressed in the same units as the original data.
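
A worked sketch of the range, variance, and standard deviation follows. It assumes the data are treated as a whole population, so the squared differences are divided by the number of values; a sample variance would divide by n - 1 instead. The data values are made up for illustration.

```python
import statistics

# Hypothetical data set, treated as a whole population
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # highest - lowest value

mean = sum(data) / len(data)
# Variance: average of the squared differences between each data point and the mean
variance = sum((x - mean) ** 2 for x in data) / len(data)
# Standard deviation: square root of the variance, in the same units as the data
std_dev = variance ** 0.5

print(data_range, variance, std_dev)

# Cross-check against the standard library's population formulas
assert statistics.pvariance(data) == variance
assert statistics.pstdev(data) == std_dev
```

For this data set the mean is 5, the variance is 4, and the standard deviation is 2, while the range is 7 because it only looks at the two extreme values.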
31
percentile
* Measure indicating the value below which a given percentage of observations in a group of observations falls
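
One simple way to connect a value to its percentile is to count how many observations fall below it. The sketch below assumes the "strictly below" convention; textbooks and software use several slightly different conventions, so results may differ by method. The function name `percentile_rank` and the scores are hypothetical.

```python
def percentile_rank(data, value):
    """Percentage of observations that fall strictly below `value`."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

# Hypothetical test scores
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# 6 of the 10 scores are below 85, so 85 sits at the 60th percentile
rank_of_85 = percentile_rank(scores, 85)
print(rank_of_85)
```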
32
**PROBABILITY EXPERIMENT**
* Chance process that leads to well-defined results called outcomes
33
**OUTCOME**
* Result of a single trial of a probability experiment
34
**SAMPLE SPACE**
* Set of all possible outcomes of a probability experiment
35
event
* Consists of a set of outcomes of a probability experiment
36
**EQUALLY LIKELY EVENTS**
* Events that have the same probability of occurring
37
**CLASSICAL PROBABILITY**
* Assumes that all outcomes in the sample space are equally likely to occur
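
When all outcomes are equally likely, the probability of an event is the number of outcomes in the event divided by the number of outcomes in the sample space. The sketch below illustrates this with one roll of a fair six-sided die; the function name `classical_probability` and the die example are assumptions made for illustration.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(E) = n(E) / n(S), assuming every outcome in S is equally likely."""
    sample_space = set(sample_space)
    event = set(event) & sample_space      # ignore outcomes outside the sample space
    return Fraction(len(event), len(sample_space))

# Probability experiment: roll one fair die
sample_space = {1, 2, 3, 4, 5, 6}   # all possible outcomes
even = {2, 4, 6}                    # event: the roll is an even number

p_even = classical_probability(even, sample_space)
print(p_even)
```

Using `Fraction` keeps the answer exact (1/2 here) instead of a rounded decimal.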