Comprehensive Notes on Statistics for Business and Economics (Bullet-Point Edition)
Statistics
Outline of topics covered in the material: Statistics, Applications in Business and Economics, Data, Data Sources, Descriptive Statistics, and Statistical Inference.
Applications in Business and Economics
Accounting
Public accounting firms use statistical sampling procedures when conducting audits for their clients.
Economics
Economists use statistical information to make forecasts about the future of the economy or aspects of it.
Finance
Financial advisors use price-earnings ratios and dividend yields to guide investment advice.
Marketing
Electronic point‑of‑sale scanners at retail counters collect data for various marketing research applications.
Production
Statistical quality control charts monitor the output of a production process.
Information Systems
Statistical information helps administrators assess the performance of computer networks.
Data and Data Sets
Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.
All data collected in a particular study are referred to as the data set for the study.
Data Set example (illustrative dataset):
Companies and measurements:
Dataram: Annual Sales $73.1 M; Earn/Share $0.86
Energy South: Annual Sales $74.0 M; Earn/Share $1.67
Keystone: Annual Sales $365.7 M; Earn/Share $0.86
Land Care: Annual Sales $111.4 M; Earn/Share $0.33
Psychemedics: Annual Sales $17.6 M; Earn/Share $0.13
Elements, Variables, and Observations
Elements: entities on which data are collected.
Variable: a characteristic of interest for the elements.
Observation: the set of measurements obtained for a particular element.
Relationship: each element yields one observation, and each observation contains one measurement for every variable.
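As a minimal sketch, the illustrative data set above can be held as a list of records to make the three terms concrete. The names data_set, elements, variables, and the field names are illustrative choices, not part of the original material.

```python
# Each dictionary is one observation: the measurements for a single element.
# Field names (annual_sales_musd, earn_per_share) are hypothetical labels.
data_set = [
    {"company": "Dataram", "annual_sales_musd": 73.1, "earn_per_share": 0.86},
    {"company": "Energy South", "annual_sales_musd": 74.0, "earn_per_share": 1.67},
    {"company": "Keystone", "annual_sales_musd": 365.7, "earn_per_share": 0.86},
    {"company": "Land Care", "annual_sales_musd": 111.4, "earn_per_share": 0.33},
    {"company": "Psychemedics", "annual_sales_musd": 17.6, "earn_per_share": 0.13},
]

# Elements: the entities on which data are collected (the companies).
elements = [row["company"] for row in data_set]

# Variables: the characteristics measured for every element.
variables = [key for key in data_set[0] if key != "company"]

# One observation per element; one data value per (element, variable) pair.
n_observations = len(data_set)
n_data_values = n_observations * len(variables)

print(elements)
print(variables)
print(n_observations, n_data_values)  # 5 10
```

With 5 elements and 2 variables, the data set holds 5 observations and 10 data values.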
Scales of Measurement
Scales (from least to most informative): Nominal, Ordinal, Interval, Ratio.
Purpose: Assign a value to a variable for each element; determine information content and appropriate analyses.
Nominal scale
Data are labels or names identifying an attribute of the element; can be nonnumeric labels or numeric codes.
Example: Students classified by school using labels (Business, Humanities, Education) or codes (e.g., 1–3).
Ordinal scale
Data have nominal properties plus meaningful order/rank.
Example: Class standing (Freshman, Sophomore, Junior, Senior) or coded similarly (1 = Freshman, 2 = Sophomore, etc.).
Interval scale (e.g., temperature, SAT scores)
Data have ordinal properties and the interval between observations is fixed; data are numeric.
Example: SAT scores (e.g., Melissa 1985, Kevin 1880) with a fixed unit difference: Melissa − Kevin = 105.
Ratio scale
Properties of interval data with a meaningful zero; the ratio of two values is meaningful.
Variables with ratio scale: distance, height, weight, time; zero indicates absence.
Example: Credit hours earned: Melissa 36, Kevin 72; Kevin has twice as many hours as Melissa.
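The interval and ratio examples above come down to two different arithmetic operations. The short sketch below reproduces them; the dictionary names are illustrative only.

```python
# Interval scale: differences are meaningful (SAT scores from the example).
sat = {"Melissa": 1985, "Kevin": 1880}
sat_difference = sat["Melissa"] - sat["Kevin"]

# Ratio scale: ratios are meaningful because zero means "none"
# (credit hours earned, from the example).
credit_hours = {"Melissa": 36, "Kevin": 72}
hours_ratio = credit_hours["Kevin"] / credit_hours["Melissa"]

print(sat_difference)  # 105
print(hours_ratio)     # 2.0
```

The same division applied to the SAT scores would run, but the result would not be interpretable, because the interval scale has no meaningful zero.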
Quick reference (types of data on scales):
Nominal, Ordinal, Interval, Ratio
Categorical vs Quantitative data categories map onto these scales.
Categorical and Quantitative Data
Data can be classified as categorical (qualitative) or quantitative.
Categorical data: grouped by categories or labels; use Nominal or Ordinal scales.
Quantitative data: numeric values indicating amount or count; use Interval or Ratio scales.
Appropriate statistical analyses depend on whether the data are categorical or quantitative.
Note: Quantitative data generally allow a wider range of statistical analyses than categorical data.
Categorical vs Quantitative (summary):
Categorical: Nominal, Ordinal
Quantitative: Interval, Ratio
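The summary above is a simple lookup from scale to data category. A minimal sketch follows; SCALE_TO_CATEGORY and data_category are hypothetical names introduced only for illustration, and the example variables are the ones used earlier in these notes.

```python
# Lookup table mirroring the summary: scale of measurement -> data category.
SCALE_TO_CATEGORY = {
    "nominal": "categorical",
    "ordinal": "categorical",
    "interval": "quantitative",
    "ratio": "quantitative",
}


def data_category(scale: str) -> str:
    """Return 'categorical' or 'quantitative' for a scale name."""
    return SCALE_TO_CATEGORY[scale.strip().lower()]


# Example variables taken from the scale examples in these notes.
examples = {
    "school": "nominal",
    "class standing": "ordinal",
    "SAT score": "interval",
    "credit hours": "ratio",
}

for variable, scale in examples.items():
    print(f"{variable}: {scale} -> {data_category(scale)}")
```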
Cross-Sectional Data
Cross-sectional data are collected at the same or approximately the same point in time.
Example: Number of building permits issued in June 2017 in each county of Georgia.
Time Series Data
Time series data are collected over several time periods.
Example: Number of building permits issued in Cobb County, Georgia in each of the last 36 months.
Graphs of time series help analysts:
understand what happened in the past,
identify trends over time,
project future levels for the time series.
Time series example: Average Basic Cable Rate (illustrative values from 1995–2005):
1995: $23.07
1996: $24.41
1997: $26.48
1998: $27.81
1999: $28.92
2000: $30.37
2001: $32.87
2002: $34.71
2003: $36.59
2004: $38.14
2005: $39.63
Source: Kagan Research, LLC; Broadband Cable Financial Databook 2004/2005 and related publications.
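To see the trend in the cable rate series without a graph, one option is to compute the change from each year to the next. The sketch below uses the values listed above; the variable names are illustrative, and the average yearly change is only a rough descriptive summary, not a forecasting method from the source.

```python
# Average basic cable rate in dollars, by year (values from the list above).
cable_rate = {
    1995: 23.07, 1996: 24.41, 1997: 26.48, 1998: 27.81,
    1999: 28.92, 2000: 30.37, 2001: 32.87, 2002: 34.71,
    2003: 36.59, 2004: 38.14, 2005: 39.63,
}

years = sorted(cable_rate)

# Change from the previous year, rounded to cents.
yearly_change = {
    year: round(cable_rate[year] - cable_rate[year - 1], 2)
    for year in years[1:]
}

# Overall change across the period and the average change per year.
total_change = round(cable_rate[years[-1]] - cable_rate[years[0]], 2)
average_change = total_change / (len(years) - 1)

for year, change in yearly_change.items():
    print(year, f"{change:+.2f}")
print("Total change 1995-2005:", total_change)
print("Average yearly change:", round(average_change, 3))
```

Every yearly change is positive, which is the upward trend a time series graph of these values would show.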
Data Sources
Existing sources (secondary data): data already gathered by public or private sources.
Examples: the internet, libraries, U.S. government agencies, and data collection agencies.
Experimental and observational studies (primary data): data collected for a specific purpose.
Response variable: the variable of interest.
Factors: other variables related to the response variable.
Data Sources – Experimental Data
Experimental data: identify the variable of interest first; then control one or more other variables to study their influence on the variable of interest.
Example: The 1954 Public Health Service polio vaccine experiment, a large study involving nearly two million U.S. children in grades 1–3.
Data Sources – Observational Data
Observational data: no attempt to control or influence the variable of interest.
Example: Surveys; studies of smokers vs nonsmokers where researchers do not assign smoking status.
Descriptive Statistics
Most statistical information in publications consists of data summaries that are easy to understand.
Descriptive statistics are summaries of data, which may be tabular, graphical, or numerical.
Example: Hudson Auto cost analysis (invoices from 50 tune-ups) to understand parts costs.
Descriptive statistics – Tabular summaries
Frequency and percent frequency (distribution of parts costs):
Example table (Parts Cost ($), Frequency, Percent Frequency (%)):
50-59: 2 (4%)
60-69: 13 (26%)
70-79: 16 (32%)
80-89: 7 (14%)
90-99: 7 (14%)
100-109: 5 (10%)
Total: 50 (100%)
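The percent frequency column follows directly from the frequency column: each class count is divided by the total and multiplied by 100. A minimal sketch using the table above (the variable names are illustrative):

```python
# Frequency of tune-ups in each parts-cost class (from the table above).
frequency = {
    "50-59": 2,
    "60-69": 13,
    "70-79": 16,
    "80-89": 7,
    "90-99": 7,
    "100-109": 5,
}

total = sum(frequency.values())

# Percent frequency = 100 * class frequency / total number of observations.
percent_frequency = {
    cost_class: 100 * count / total
    for cost_class, count in frequency.items()
}

for cost_class, count in frequency.items():
    print(f"{cost_class:>7}: {count:2d} ({percent_frequency[cost_class]:.0f}%)")
print(f"  Total: {total} ({sum(percent_frequency.values()):.0f}%)")
```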
Descriptive statistics – Graphical summaries
Histogram example shows distribution of Parts Cost ($) across the same classes.
Descriptive statistics – Numerical summaries
The most common numerical descriptive statistic is the average (mean).
Example: Average cost of parts (based on 50 tune-ups) = $79. (Note: This is the sample mean, not the population mean.)
Common notation: the mean of a sample x_1, x_2, …, x_n is \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.
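The 50 individual tune-up invoices are not listed in these notes, so the sketch below applies the sample mean formula to the five Earn/Share values from the earlier data set example instead. The function name sample_mean is an illustrative choice.

```python
def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by n."""
    if not values:
        raise ValueError("sample_mean requires at least one value")
    return sum(values) / len(values)


# Earn/Share for Dataram, Energy South, Keystone, Land Care, Psychemedics.
earn_per_share = [0.86, 1.67, 0.86, 0.33, 0.13]

x_bar = sample_mean(earn_per_share)
print(round(x_bar, 2))  # 0.77
```

Here n = 5 and the values sum to 3.85, so the sample mean is 3.85 / 5 = 0.77 dollars per share.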
Statistical Inference
Population: the set of all elements of interest in a study.
Finite population: population with limited size.
Infinite population: population with unlimited size.
Sample: a subset of the population.
Statistical inference: using data from a sample to make estimates and test hypotheses about population characteristics.
Census: collecting data for the entire population.
Sample survey: collecting data for a sample.
Any characteristic of a population unit is a variable.
Process of Statistical Inference
1) Population consists of all tune-ups. The average cost of parts is unknown.
2) A sample of 50 engine tune-ups is examined.
3) The sample data provide a sample average parts cost of $79 per tune-up.
4) The sample average is used to estimate the population average.
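The four steps above can be imitated in a small simulation. Everything in the sketch below is an assumption made for illustration: the population is synthetic (10,000 randomly generated parts costs between $50 and $110), not the Hudson Auto data, and the population mean is computed only so it can be compared with the estimate. In a real study it would be unknown.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Step 1: a hypothetical population of parts costs (synthetic values).
population = [random.uniform(50, 110) for _ in range(10_000)]
population_mean = statistics.mean(population)  # unknown in practice

# Step 2: examine a simple random sample of 50 tune-ups.
sample = random.sample(population, 50)

# Step 3: compute the sample average parts cost.
sample_mean = statistics.mean(sample)

# Step 4: use the sample average as the estimate of the population average.
estimation_error = sample_mean - population_mean

print(f"Sample mean (estimate): {sample_mean:.2f}")
print(f"Population mean:        {population_mean:.2f}")
print(f"Estimation error:       {estimation_error:+.2f}")
```

A different seed gives a different sample and a slightly different estimate, which is why an estimate from a sample is not expected to match the population value exactly.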