Census
Collection of data from every element in the population of interest.
Sampled population
The population from which the sample is drawn.
Frame
A listing of the elements from which the sample will be selected.
Parameter
A measurable factor that defines a characteristic of a population, process, or system, such as a population mean μ, a population standard deviation σ, or a population proportion p.
Simple random sample
A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.
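A minimal sketch of drawing a simple random sample, assuming the frame is available as a Python list. The frame contents, the sample size, and the helper name simple_random_sample are hypothetical and chosen only for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement.

    random.sample gives every possible subset of size n the same
    probability of being selected, which matches the definition above.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical frame: element IDs 1..20, so N = 20.
frame = list(range(1, 21))
sample = simple_random_sample(frame, n=5, seed=42)
print(sample)
```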
Random sample
A random sample from an infinite population is a sample selected such that the following conditions are satisfied: (1) each element selected comes from the same population and (2) each element is selected independently.
Sample statistic
A characteristic of sample data, such as a sample mean x̄, a sample standard deviation s, or a sample proportion p̄. The value of the sample statistic is used to estimate the value of the corresponding population parameter.
Point estimator
The sample statistic, such as x̄, s, or p̄, that provides the point estimate of the population parameter. (chapter 6)
A single value used as an estimate of the corresponding population parameter, for example the mean value of y for a given value of x. (7.2)
Point estimate
The value of a point estimator used in a particular instance as an estimate of a population parameter. (chapter 6)
Target population
The population for which statistical inferences such as point estimates are made. It is important for the target population to correspond as closely as possible to the sampled population.
Random variable
A quantity whose values are not known with certainty.
Sampling distribution
A probability distribution consisting of all possible values of a sample statistic.
Unbiased
A property of a point estimator that is present when the expected value of the point estimator is equal to the population parameter it estimates.
Standard error
The standard deviation of a point estimator.
Finite population correction factor
The term √((N − n)/(N − 1)) that is used in the formulas for computing the (estimated) standard error for the sample mean and sample proportion whenever a finite population, rather than an infinite population, is being sampled. The generally accepted rule of thumb is to ignore the finite population correction factor whenever n/N ≤ 0.05.
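A small sketch of how the correction might be applied to the standard error of the sample mean, σ/√n. The standard error is multiplied by the square root of (N − n)/(N − 1), and the n/N ≤ 0.05 rule of thumb decides whether to skip it. The function name and the numbers are assumptions for illustration.

```python
import math


def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size, or None for an infinite population
    """
    se = sigma / math.sqrt(n)
    # Rule of thumb: ignore the correction whenever n/N <= 0.05.
    if N is not None and n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se


# Hypothetical values: sigma = 10, n = 100.
print(standard_error_mean(10, 100))           # infinite population
print(standard_error_mean(10, 100, N=1000))   # n/N = 0.10, correction applied
print(standard_error_mean(10, 100, N=10000))  # n/N = 0.01, correction ignored
```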
Sampling error
The difference between the value of a sample statistic (such as the sample mean, sample standard deviation, or sample proportion) and the value of the corresponding population parameter (population mean, population standard deviation, or population proportion) that occurs because a random sample is used to estimate the population parameter.
Interval estimation
The process of using sample data to calculate a range of values that is believed to include the unknown value of a population parameter. (chapter 6)
Interval estimate
An estimate of a population parameter that provides an interval believed to contain the value of the parameter. For the interval estimates in this chapter, it has the form: point estimate ± margin of error.
Margin of error
The ± value added to and subtracted from a point estimate in order to develop an interval estimate of a population parameter.
T distribution
A family of probability distributions that can be used to develop an interval estimate of a population mean whenever the population standard deviation σ is unknown and is estimated by the sample standard deviation s.
Degrees of freedom
A parameter of the t distribution. When the t distribution is used in the computation of an interval estimate of a population mean, the appropriate t distribution has n − 1 degrees of freedom, where n is the size of the sample.
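A sketch of an interval estimate of the form point estimate ± margin of error, using the t distribution with n − 1 degrees of freedom. The data are hypothetical, and the critical value 2.262 (95% confidence, 9 degrees of freedom) is taken from a standard t table rather than computed, since the Python standard library has no t distribution.

```python
import math
import statistics


def t_interval(data, t_crit):
    """Interval estimate of a population mean when sigma is unknown.

    t_crit is the t value for the chosen confidence level with
    len(data) - 1 degrees of freedom (looked up in a t table).
    """
    n = len(data)
    x_bar = statistics.mean(data)      # point estimate of the mean
    s = statistics.stdev(data)         # sample standard deviation
    margin = t_crit * s / math.sqrt(n) # margin of error
    return x_bar - margin, x_bar + margin


# Hypothetical sample of n = 10 observations, so df = 9.
data = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
T_CRIT_95_DF9 = 2.262  # from a t table: 95% confidence, 9 degrees of freedom
low, high = t_interval(data, T_CRIT_95_DF9)
print(low, high)
```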
Confidence level
The confidence associated with an interval estimate. For example, if an interval estimation procedure provides intervals such that 95% of the intervals formed using the procedure will include the population parameter, the interval estimate is said to be constructed at the 95% confidence level.
Confidence coefficient
The confidence level expressed as a decimal value. For example, 0.95 is the confidence coefficient for a 95% confidence level.
Confidence interval
Another name for an interval estimate.
Level of significance
The probability that the interval estimation procedure will generate an interval that does not contain the value of the parameter being estimated; also the probability of making a Type I error when the null hypothesis is true as an equality.
Null hypothesis
The hypothesis tentatively assumed to be true in the hypothesis testing procedure.
Alternative hypothesis
The hypothesis concluded to be true if the null hypothesis is rejected.
Type I error
The error of rejecting H0 when it is true.
Test statistic
A statistic whose value helps determine whether a null hypothesis should be rejected.
P value
The probability, assuming that H0 is true, of obtaining a random sample of size n that results in a test statistic at least as extreme as the one observed in the current sample. For a lower-tail test, the p value is the probability of obtaining a value for the test statistic as small as or smaller than that provided by the sample. For an upper-tail test, the p value is the probability of obtaining a value for the test statistic as large as or larger than that provided by the sample. For a two-tailed test, the p value is the probability of obtaining a value for the test statistic at least as unlikely as or more unlikely than that provided by the sample.
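The three cases can be sketched as follows, under the assumption that the test statistic follows a standard normal distribution when H0 is true (a z test). The function name p_value and the example value z = 1.96 are illustrative only; a t-based test would need a t distribution instead.

```python
from statistics import NormalDist


def p_value(z, tail):
    """p value for an observed z test statistic.

    tail is "lower", "upper", or "two".
    Assumes the statistic is standard normal under H0.
    """
    cdf = NormalDist().cdf(z)  # P(Z <= z)
    if tail == "lower":
        return cdf                      # as small as or smaller
    if tail == "upper":
        return 1 - cdf                  # as large as or larger
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)    # both tails
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


print(p_value(1.96, "two"))    # close to 0.05
print(p_value(1.96, "upper"))  # close to 0.025
```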
Two-tailed test
A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in either tail of its sampling distribution.
Nonsampling error
Any difference between the value of a sample statistic (such as the sample mean, sample standard deviation, or sample proportion) and the value of the corresponding population parameter (population mean, population standard deviation, or population proportion) that is not the result of sampling error. Sources include but are not limited to coverage error, nonresponse error, measurement error, interviewer error, and processing error.
Coverage error
Nonsampling error that results when the research objective and the population from which the sample is to be drawn are not aligned.
Nonresponse error
Nonsampling error that results when some segments of the population are more likely or less likely to respond to the survey mechanism.
Big data
Any set of data that is too large or too complex to be handled by standard data processing techniques and typical desktop software.
Volume
The amount of data generated.
Variety
The diversity in types and structures of data generated.
Veracity
The reliability of the data generated.
Velocity
The speed at which the data are generated.
Tall data
A data set that has so many observations that traditional statistical inference has little meaning.
Wide data
A data set that has so many variables that simultaneous consideration of all variables is infeasible.
Practical significance
The real-world impact the result of statistical inference will have on business decisions.
Central limit theorem
A theorem stating that when enough independent random variables are added, the resulting sum is approximately a normally distributed random variable. This result allows one to use the normal probability distribution to approximate the sampling distributions of the sample mean and sample proportion for sufficiently large sample sizes.
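A quick simulation sketch of this idea, assuming draws from a uniform(0, 1) population, which has mean 0.5 and standard deviation √(1/12). The sample size, number of repetitions, and seed are arbitrary choices. The sample means should center near 0.5 with spread close to the standard error σ/√n.

```python
import math
import random
import statistics


def sample_means(n, reps, seed=0):
    """Simulate the sampling distribution of the mean for uniform(0, 1) data."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        draws = [rng.random() for _ in range(n)]  # one sample of size n
        means.append(sum(draws) / n)              # its sample mean
    return means


n = 30
means = sample_means(n=n, reps=5000)
theoretical_se = math.sqrt(1 / 12) / math.sqrt(n)  # sigma / sqrt(n)

print(statistics.mean(means))   # should be near 0.5
print(statistics.stdev(means))  # should be near theoretical_se
print(theoretical_se)
```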
Hypothesis testing
The process of making a conjecture about the value of a population parameter, collecting sample data that can be used to assess this conjecture, measuring the strength of the evidence against the conjecture that is provided by the sample, and using these results to draw a conclusion about the conjecture.
One-tailed test
A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in one tail of its sampling distribution.
Regression analysis
A statistical procedure used to develop an equation showing how the variables are related.
Dependent variable
The variable that is being predicted or explained. It is denoted by y and is often referred to as the response.
Independent variables
The variable(s) used for predicting or explaining values of the dependent variable. They are denoted by x and are often referred to as predictor variables.
Multiple linear regression
Regression analysis involving one dependent variable and more than one independent variable.
Estimated regression
The estimate of the regression equation developed from sample data by using the least squares method. The estimated multiple linear regression equation is ŷ = b_0 + b_1 x_1 + b_2 x_2 + ⋯ + b_q x_q. (chapter 7)
Least squares method
A procedure for using sample data to find the estimated regression equation. (chapter 7)
Residual
The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation; for the ith observation, the ith residual is y_i − ŷ_i.
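A minimal sketch of the least squares method and the resulting residuals for the special case of a single independent variable (the cards above describe the general case with q variables). The data and the helper name least_squares_fit are hypothetical.

```python
def least_squares_fit(x, y):
    """Return (b0, b1) for the estimated equation y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # The slope minimizes the sum of squared residuals.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_fit(x, y)
y_hat = [b0 + b1 * xi for xi in x]                 # predicted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]  # observed minus predicted

print(b0, b1)
print(residuals)
```

With an intercept in the model, the residuals from a least squares fit sum to zero (up to floating-point rounding), which is a handy sanity check.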
Experimental region
The range of values for the independent variables x_1, x_2, …, x_q for the data that are used to estimate the regression model.
Extrapolation
Prediction of the mean value of the dependent variable y for values of the independent variables x_1, x_2, …, x_q that are outside the experimental region.
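One simple way to flag possible extrapolation is to compare each new value against the minimum and maximum observed for that variable. This is only a rough sketch with hypothetical data and a hypothetical helper name: a point can pass this per-variable check and still lie outside the joint region covered by the data.

```python
def outside_experimental_region(x_new, observed_rows):
    """Return the indices of variables whose new value lies outside
    the min..max range seen in the data used to fit the model."""
    flagged = []
    for j, value in enumerate(x_new):
        column = [row[j] for row in observed_rows]
        if value < min(column) or value > max(column):
            flagged.append(j)
    return flagged


# Hypothetical observations of (x1, x2) used to estimate the model.
observed = [(50, 1), (60, 2), (75, 3), (90, 4)]

print(outside_experimental_region((70, 2), observed))   # []  : within range
print(outside_experimental_region((120, 2), observed))  # [0] : x1 is extrapolated
```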
Statistical inference
The process of making estimates and drawing conclusions about one or more characteristics of a population (the value of one or more parameters) through the analysis of sample data drawn from the population.
Standard normal distribution
A normal distribution with a mean of zero and standard deviation of one.
Type II error
The error of accepting H0 when it is false.