What is a categorical variable?
Consists of observations that represent labels or names.
What are the types of categorical variables?
nominal and ordinal
What are ways to visualize and summarize categorical data?
Frequency tables, bar charts, and pie charts.
How do you make a frequency table?
Group the data into categories and record the number of observations in each, find the relative frequency for each category, and multiply the proportions by 100 to get percentages.
How do you find relative frequency?
Divide the frequency by the sample size.
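The frequency-table steps in the two cards above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `responses` data and the `frequency_table` helper are hypothetical and exist only for this example.

```python
from collections import Counter

# Hypothetical sample: favorite flavor reported by 8 people (made-up data).
responses = ["vanilla", "chocolate", "vanilla", "mint",
             "chocolate", "vanilla", "mint", "vanilla"]

def frequency_table(observations):
    """Return {category: (frequency, relative frequency, percentage)}."""
    counts = Counter(observations)   # step 1: count observations per category
    n = len(observations)            # sample size
    table = {}
    for category, freq in counts.items():
        rel = freq / n               # step 2: frequency divided by sample size
        table[category] = (freq, rel, rel * 100)  # step 3: proportion times 100
    return table

table = frequency_table(responses)
for category, (freq, rel, pct) in table.items():
    print(f"{category:<10} {freq:>3} {rel:>6.3f} {pct:>6.1f}%")
```

The relative frequencies always sum to 1, which is why the slices of a pie chart built from them account for the whole.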
In a pie chart, the slices must add up to ______.
100%
How do you choose how to represent your categorical data?
The simplest graph should be used.
What is statistics?
The science that deals with the collection, preparation, analysis, interpretation, and presentation of data.
What are the two branches of statistics?
descriptive and inferential
What are descriptive stats?
Refers to the summary of important aspects of a data set. Includes collecting, organizing, and presenting the data in the form of charts and tables. Often calculates numerical measures.
What are inferential stats?
Refers to drawing conclusions about a larger set of data based on a smaller set of data.
What is a population?
Consists of all items/members of interest.
What is a sample?
A subset of the population.
What is a statistic?
A number that represents a property of the sample.
What is a parameter?
A numerical characteristic of the whole population that can be estimated by a statistic.
All students who graduated from the college last year. This is an example of?
a population
A group of students who graduated from the college last year, randomly selected. This is an example of?
a sample
The average cumulative GPA of students who graduated from the college last year. This is an example of?
a parameter
The average cumulative GPA of students in the study who graduated from the college last year. This is an example of?
a statistic
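The four GPA cards above can be mirrored in a short sketch. The GPA values below are invented for illustration, and the fixed seed is only there so the random draw is repeatable.

```python
import random
from statistics import mean

# Hypothetical population: cumulative GPA of every graduate last year.
population_gpas = [2.8, 3.1, 3.5, 3.9, 2.5, 3.3, 3.7, 3.0, 3.6, 2.9]

# Parameter: a numerical characteristic of the whole population.
parameter = mean(population_gpas)

# Sample: a randomly selected subset of the population.
rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatability
sample_gpas = rng.sample(population_gpas, k=4)

# Statistic: the same measure computed on the sample only;
# it serves as an estimate of the parameter.
statistic = mean(sample_gpas)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

In practice the population is usually too large to measure, so only the statistic is observed and the parameter stays unknown.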
What are the types of data?
Cross-sectional, time series, structured, unstructured, and big.
What is cross-sectional data?
Refers to data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.
What is structured data?
Reside in a pre-defined, row-column format.
What is unstructured data?
Does not conform to a pre-defined, row-column format or database structure.
What is big data?
A massive amount of both structured and unstructured data.
What are the five Vs?
Volume, velocity, variety, veracity, and value.
What is a variable?
A general characteristic being observed on a set of people, objects, or events, where each observation varies in kind or degree.
What are the two types of variables?
categorical and numeric
What is categorical data?
Also called qualitative, represents categories. Things with labels, names, and distinguishing characteristics.
How can variables be identified?
By their scales of measure.
What are the four major scales of measure?
Nominal, ordinal, interval, and ratio.
What scales of measure do categorical variables use?
nominal and ordinal
What is nominal data?
Categories with no numeric meaning and no inherent order or hierarchy.
What is ordinal data?
Non-numeric categories with an established ranking order. The gaps between ranks are not necessarily equal, so the values cannot be meaningfully manipulated with mathematical operators.
Numeric data is also called ________.
quantitative
What is a discrete variable?
Assumes a countable number of values.
What is a continuous variable?
Assumes an uncountable number of values within an interval.
What scales of measure do numerical variables use?
interval and ratio scales
What are intervals?
Categorized and ranked, zero value is arbitrary, ratios are not meaningful. ex. temperature
What are ratios?
Strongest level of measurement, has all characteristics of an interval-scaled variable with a true zero point, ratios are meaningful. ex. profits, salary.
A kindergarten teacher marks whether each student is a boy or girl. What type of measurement is this?
nominal
A ski resort records the daily temperature during the month of January. What type of measurement is this?
interval
A restaurant surveys its customers about the quality of its waiting staff on a scale of 1 to 4, where 1 is poor and 4 is excellent. What type of measurement is this?
ordinal
________ variable is a variable that uses labels or names to identify the distinguishing characteristics of observations.
categorical
What is a frequency distribution used for?
Grouping observations into categories and recording how many fall into each.
What are bar charts used for?
Visualization of the frequency or relative frequency.
What are pie charts used for?
Visualization of the relative frequency.
What are ways to summarize/visualize two categorical variables?
Two-way contingency table, stacked column chart, and clustered bar chart.
What does a contingency table show?
The frequencies for two categorical variables.
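A two-way contingency table can be built by counting how often each pair of categories occurs together. The sketch below uses made-up observations of two hypothetical categorical variables (class year and study mode).

```python
from collections import Counter

# Hypothetical paired observations: (class year, preferred study mode).
observations = [
    ("freshman", "online"), ("freshman", "in person"),
    ("senior", "online"), ("senior", "online"),
    ("freshman", "in person"), ("senior", "in person"),
]

# Each cell of the table is the frequency of one (row, column) pair.
cell_counts = Counter(observations)
rows = sorted({r for r, _ in observations})
cols = sorted({c for _, c in observations})

# Print a header row of column categories, then one line per row category.
print(f"{'':<10}" + "".join(f"{c:>12}" for c in cols))
for r in rows:
    print(f"{r:<10}" + "".join(f"{cell_counts[(r, c)]:>12}" for c in cols))
```

The same cell counts are what a stacked column chart or clustered bar chart would plot.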
What are ways to visualize two numerical variables?
scatterplot and line chart
What does a scatterplot do?
It can determine if there is a relationship between the two variables.
What is the purpose of a line chart?
To analyze trends over time.
What is the null hypothesis?
Status quo, specified with =, ≥, or ≤.
What is the alternative hypothesis?
Contests the status quo, specified with <, >, or ≠.
How do you formulate a hypothesis test?
1. Identify the relevant population parameter of interest.
2. Determine whether it is a one- or two-tailed test.
3. Include some form of the equality sign in the null hypothesis and use the alternative hypothesis to establish a claim.
A ________ is used when we are looking to disprove the null and believe the true value is specifically greater than, or specifically less than, the value stated in the null.
one-tailed test
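The pairing of signs between the two hypotheses, and the resulting number of tails, can be written as a small lookup. The names `ALTERNATIVE_TO_NULL` and `describe_test` are hypothetical helpers for this sketch only.

```python
# Each alternative-hypothesis sign is paired with the null-hypothesis sign
# that contains the equality and covers every remaining case.
ALTERNATIVE_TO_NULL = {"≠": "=", "<": "≥", ">": "≤"}

def describe_test(alternative_sign):
    """Return (null sign, test type) for a given alternative-hypothesis sign."""
    null_sign = ALTERNATIVE_TO_NULL[alternative_sign]
    # "≠" looks for a difference in either direction: two tails.
    # "<" or ">" looks in one direction only: one tail.
    tails = "two-tailed" if alternative_sign == "≠" else "one-tailed"
    return null_sign, tails

for sign in ("≠", "<", ">"):
    null_sign, tails = describe_test(sign)
    print(f"H0 uses {null_sign}, HA uses {sign}: {tails} test")
```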
What are the two types of errors?
Type I and Type II
What is a type I error?
Rejecting the null hypothesis when it is true.
What is a type II error?
Not rejecting the null hypothesis when it is false.
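The two error types are the two ways a test decision can disagree with the truth. The hypothetical helper below enumerates all four combinations; in a real test the truth of the null is unknown, so this is purely a study aid.

```python
def classify_decision(null_is_true: bool, reject_null: bool) -> str:
    """Label the outcome of a hypothesis test decision."""
    if null_is_true and reject_null:
        return "Type I error"        # rejected a true null
    if not null_is_true and not reject_null:
        return "Type II error"       # failed to reject a false null
    return "correct decision"        # decision matches reality

for null_is_true in (True, False):
    for reject_null in (True, False):
        outcome = classify_decision(null_is_true, reject_null)
        print(f"null true={null_is_true!s:<5} reject={reject_null!s:<5} -> {outcome}")
```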
How do you find df?
For a test on a single sample, subtract 1 from the sample size (df = n − 1).
What is df?
The number of independent pieces of information used to calculate a statistic.
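As an illustration of where df appears, the sketch below computes df and a one-sample t statistic. The sample values and the hypothesized mean `mu_0` are made up, and the formula t = (x̄ − μ0) / (s / √n) is the standard one-sample form, which the cards themselves do not state.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and hypothesized population mean (illustrative values).
sample = [5.1, 4.9, 5.6, 5.2, 4.8, 5.4]
mu_0 = 5.0

n = len(sample)
df = n - 1                               # degrees of freedom for one sample

x_bar = mean(sample)                     # sample mean
s = stdev(sample)                        # sample standard deviation (divides by n - 1)
t_stat = (x_bar - mu_0) / (s / sqrt(n))  # standard one-sample t statistic

print(f"n = {n}, df = {df}, t = {t_stat:.3f}")
```

Note that `stdev` itself divides by n − 1: once the sample mean is fixed, only n − 1 of the deviations are free to vary.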
The test stat should always be ________.
positive (use its absolute value when looking it up; the calculated statistic itself can come out negative)
_______________ are samples that are completely unrelated to one another.
independent random samples
σ_1^2 and σ_2^2 are _________: use z for statistical inference.
known
σ_1^2 and σ_2^2 are __________ but assumed _________ (σ_1^2 = σ_2^2): use t_df.
unknown; equal
σ_1^2 and σ_2^2 are _________ but assumed ___________ (σ_1^2 ≠ σ_2^2): use t_df.
unknown; unequal
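The three cases above can be summarized as a decision rule. The function names below are hypothetical, and the two df formulas (n1 + n2 − 2 for the pooled case, the Welch-Satterthwaite approximation for the unequal case) are the standard textbook ones rather than something the cards state, so treat this as a sketch under those assumptions.

```python
def choose_distribution(variances_known: bool, assume_equal: bool = False) -> str:
    """Pick the reference distribution for comparing two population means."""
    if variances_known:
        return "z"                                # variances known
    if assume_equal:
        return "t (pooled variance)"              # unknown, assumed equal
    return "t (separate variances)"               # unknown, assumed unequal

def pooled_df(n1: int, n2: int) -> int:
    """df when the unknown variances are assumed equal."""
    return n1 + n2 - 2

def welch_df(s1_sq: float, n1: int, s2_sq: float, n2: int) -> float:
    """Welch-Satterthwaite df when the unknown variances are assumed unequal."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

print(choose_distribution(variances_known=True))
print(choose_distribution(variances_known=False, assume_equal=True), pooled_df(10, 12))
print(choose_distribution(variances_known=False), welch_df(4.0, 10, 9.0, 12))
```

The Welch df is generally not a whole number; many textbooks round it down before using a t table.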