Comprehensive Notes on Statistics for Business and Economics (Bullet-Point Edition)

Statistics

  • Outline of topics covered in the material: Statistics, Applications in Business and Economics, Data, Data Sources, Descriptive Statistics, and Statistical Inference.

Applications in Business and Economics

  • Accounting

    • Public accounting firms use statistical sampling procedures when conducting audits for their clients.

  • Economics

    • Economists use statistical information to make forecasts about the future of the economy or aspects of it.

  • Finance

    • Financial advisors use price-earnings ratios and dividend yields to guide investment advice.

  • Marketing

    • Electronic point‑of‑sale scanners at retail counters collect data for various marketing research applications.

  • Production

    • Statistical quality control charts monitor the output of a production process.

  • Information Systems

    • Statistical information helps administrators assess the performance of computer networks.

Data and Data Sets

  • Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.

  • All data collected in a particular study are referred to as the data set for the study.

  • Data Set example (illustrative dataset):

    • Companies and measurements:

    • Dataram: Annual Sales $73.1 M; Earn/Share $0.86

    • Energy South: Annual Sales $74.0 M; Earn/Share $1.67

    • Keystone: Annual Sales $365.7 M; Earn/Share $0.86

    • Land Care: Annual Sales $111.4 M; Earn/Share $0.33

    • Psychemedics: Annual Sales $17.6 M; Earn/Share $0.13

  • Elements, Variables, and Observations

    • Elements: entities on which data are collected.

    • Variable: a characteristic of interest for the elements.

    • Observation: the set of measurements obtained for a particular element.

    • Relationship: each element yields one observation, made up of one measurement per variable, so a data set with n elements contains n observations.
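
As a minimal sketch, the illustrative data set above can be laid out in Python so that the three terms are visible in the structure itself. The key names `annual_sales_musd` and `earn_per_share` are assumptions chosen for illustration; only the company names and figures come from the notes.

```python
# Each key is an element (a company); each inner dict is that element's
# observation, holding one measurement per variable.
data_set = {
    "Dataram":      {"annual_sales_musd": 73.1,  "earn_per_share": 0.86},
    "Energy South": {"annual_sales_musd": 74.0,  "earn_per_share": 1.67},
    "Keystone":     {"annual_sales_musd": 365.7, "earn_per_share": 0.86},
    "Land Care":    {"annual_sales_musd": 111.4, "earn_per_share": 0.33},
    "Psychemedics": {"annual_sales_musd": 17.6,  "earn_per_share": 0.13},
}

elements = list(data_set)
variables = sorted({name for obs in data_set.values() for name in obs})

# Number of individual data values = elements x variables.
n_data_values = len(elements) * len(variables)

print("elements:", elements)
print("variables:", variables)
print("observation for Keystone:", data_set["Keystone"])
print("data values:", n_data_values)
```

Here five elements measured on two variables give five observations and ten data values in total.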

Scales of Measurement

  • Scales, from least to most informative: Nominal, Ordinal, Interval, Ratio.

  • Purpose: the scale of measurement determines how much information the data contain and which summaries and statistical analyses are appropriate.

  • Nominal scale

    • Data are labels or names identifying an attribute of the element; can be nonnumeric labels or numeric codes.

    • Example: Students classified by school using labels (Business, Humanities, Education) or codes (e.g., 1–3).

  • Ordinal scale

    • Data have nominal properties plus meaningful order/rank.

    • Example: Class standing (Freshman, Sophomore, Junior, Senior) or coded similarly (1 = Freshman, 2 = Sophomore, etc.).

  • Interval scale (e.g., temperature, SAT scores)

    • Data have ordinal properties and the interval between observations is fixed; data are numeric.

    • Example: SAT scores of 1985 for Melissa and 1880 for Kevin; Melissa scored 1985 − 1880 = 105 points more than Kevin.

  • Ratio scale

    • Data have all the properties of interval data, and the ratio of two values is meaningful; the scale has a true zero.

    • Variables measured on a ratio scale: distance, height, weight, time; a value of zero means that none of the quantity is present.

    • Example: Credit hours earned: Melissa 36, Kevin 72; Kevin has twice as many hours as Melissa.

  • Quick reference (types of data on scales):

    • Nominal, Ordinal, Interval, Ratio

    • Categorical vs Quantitative data categories map onto these scales.

Categorical and Quantitative Data

  • Data can be classified as categorical (qualitative) or quantitative.

    • Categorical data: grouped by categories or labels; use Nominal or Ordinal scales.

    • Quantitative data: numeric values indicating amount or count; use Interval or Ratio scales.

  • Appropriate statistical analyses depend on whether the data are categorical or quantitative.

  • Note: Quantitative data generally allow more alternatives for statistical analysis than categorical data.

  • Categorical vs Quantitative (summary):

    • Categorical: Nominal, Ordinal

    • Quantitative: Interval, Ratio
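
The mapping between scales and data types can be written down as a small lookup table. This is only a sketch: the dictionary name `SCALES`, its field names, and the helper `data_type` are hypothetical, while the two numeric examples reuse the SAT scores and credit hours given earlier in these notes.

```python
# Properties each scale of measurement supports, as described in these notes.
SCALES = {
    "nominal":  {"data_type": "categorical",  "ordered": False,
                 "differences": False, "ratios": False},
    "ordinal":  {"data_type": "categorical",  "ordered": True,
                 "differences": False, "ratios": False},
    "interval": {"data_type": "quantitative", "ordered": True,
                 "differences": True,  "ratios": False},
    "ratio":    {"data_type": "quantitative", "ordered": True,
                 "differences": True,  "ratios": True},
}


def data_type(scale: str) -> str:
    """Return 'categorical' or 'quantitative' for a scale name."""
    return SCALES[scale.lower()]["data_type"]


# Interval example: differences between SAT scores are meaningful.
sat_difference = 1985 - 1880   # Melissa minus Kevin

# Ratio example: ratios of credit hours are meaningful.
credit_hours_ratio = 72 / 36   # Kevin relative to Melissa

print(data_type("Ordinal"))
print(sat_difference)
print(credit_hours_ratio)
```

Reading across a row shows why the choice of analysis depends on the scale: each scale supports everything the previous one does, plus one more kind of comparison.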

Cross-Sectional Data

  • Cross-sectional data are collected at the same or approximately the same point in time.

  • Example: Number of building permits issued in June 2017 in each county of Georgia.

Time Series Data

  • Time series data are collected over several time periods.

  • Example: Number of building permits issued in Cobb County, Georgia in each of the last 36 months.

  • Graphs of time series help analysts:

    • understand what happened in the past,

    • identify trends over time,

    • project future levels for the time series.

  • Time series example: Average Basic Cable Rate (illustrative values from 1995–2005):

    • 1995: $23.07

    • 1996: $24.41

    • 1997: $26.48

    • 1998: $27.81

    • 1999: $28.92

    • 2000: $30.37

    • 2001: $32.87

    • 2002: $34.71

    • 2003: $36.59

    • 2004: $38.14

    • 2005: $39.63

    • Source: Kagan Research, LLC; Broadband Cable Financial Databook 2004/2005 and related publications.
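
The three uses of a time series listed above can be sketched with the cable-rate values. The year-to-year changes and the average yearly increase follow directly from the data; the final projection is a naive straight-line extension added purely for illustration and is an assumption, not a forecasting method taken from these notes.

```python
# Average basic cable rate by year, copied from the notes (dollars).
rates = {
    1995: 23.07, 1996: 24.41, 1997: 26.48, 1998: 27.81, 1999: 28.92,
    2000: 30.37, 2001: 32.87, 2002: 34.71, 2003: 36.59, 2004: 38.14,
    2005: 39.63,
}
years = sorted(rates)

# What happened in the past: change from each year to the next.
changes = {y: round(rates[y] - rates[y - 1], 2) for y in years[1:]}

# Trend: average yearly increase over the whole period.
total_increase = rates[years[-1]] - rates[years[0]]
avg_increase = total_increase / (len(years) - 1)

# Naive projection (an assumption, not a method from the notes):
# extend the average yearly increase by one more year.
projected_next = rates[years[-1]] + avg_increase

for y in years[1:]:
    print(f"{y}: {changes[y]:+.2f}")
print(f"average yearly increase: {avg_increase:.2f}")
print(f"naive projection for {years[-1] + 1}: {projected_next:.2f}")
```

Every yearly change is positive, which is the upward trend a graph of this series would show.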

Data Sources

  • Existing sources (secondary data): data already gathered by public or private sources.

    • Examples: the Internet, libraries, U.S. government agencies, and data collection agencies.

  • Experimental and observational studies (primary data): data collected for a specific purpose.

    • Response variable: the variable of interest.

    • Factors: other variables related to the response variable.

Data Sources – Experimental Data

  • Experimental data: identify the variable of interest first; then control one or more other variables to study their influence on the variable of interest.

  • Example: The 1954 Public Health Service polio vaccine experiment, a large study involving nearly two million U.S. children in grades 1–3.

Data Sources – Observational Data

  • Observational data: no attempt to control or influence the variable of interest.

  • Example: Surveys; studies of smokers vs nonsmokers where researchers do not assign smoking status.

Descriptive Statistics

  • Most statistical information in publications consists of data summaries that are easy to understand.

  • Descriptive statistics are summaries of data, which may be tabular, graphical, or numerical.

  • Example: Hudson Auto examines the invoices from 50 engine tune-ups to understand the cost of the parts used.

  • Descriptive statistics – Tabular summaries

    • Frequency and percent frequency (distribution of parts costs):

    • Example table (Parts Cost ($), Frequency, Percent Frequency (%)):

    • 50-59: 2 (4%)

    • 60-69: 13 (26%)

    • 70-79: 16 (32%)

    • 80-89: 7 (14%)

    • 90-99: 7 (14%)

    • 100-109: 5 (10%)

    • Total: 50 (100%)
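
The percent frequency column is each class frequency divided by the total number of observations, times 100. A minimal sketch that rebuilds the table from the frequencies alone (the variable names are illustrative):

```python
# Class frequencies for parts cost ($), copied from the table in the notes.
frequencies = {
    "50-59": 2,
    "60-69": 13,
    "70-79": 16,
    "80-89": 7,
    "90-99": 7,
    "100-109": 5,
}

total = sum(frequencies.values())

# Percent frequency = 100 * (class frequency / total observations).
percent = {cls: 100 * f / total for cls, f in frequencies.items()}

print(f"{'Parts Cost ($)':<15}{'Frequency':>10}{'Percent':>9}")
for cls, f in frequencies.items():
    print(f"{cls:<15}{f:>10}{percent[cls]:>8.0f}%")
print(f"{'Total':<15}{total:>10}{sum(percent.values()):>8.0f}%")
```

Checking that the frequencies sum to the number of invoices and the percentages sum to 100 is a quick way to catch transcription errors in a table like this.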

  • Descriptive statistics – Graphical summaries

    • Histogram example shows distribution of Parts Cost ($) across the same classes.

  • Descriptive statistics – Numerical summaries

    • The most common numerical descriptive statistic is the average (mean).

    • Example: Average cost of parts (based on 50 tune-ups) = $79. (Note: This is the sample mean, not the population mean.)

    • Common notation: the mean of a sample x_1, x_2, …, x_n is \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.
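
The individual invoices for the 50 tune-ups are not reproduced in these notes, so the sketch below applies the same formula to the five earnings-per-share values from the illustrative data set instead. The function name `sample_mean` is a hypothetical helper, not part of any library.

```python
def sample_mean(values):
    """Return the sample mean: the sum of the values divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("sample must contain at least one value")
    return sum(values) / n


# Earnings per share ($) for the five companies in the illustrative data set.
eps = [0.86, 1.67, 0.86, 0.33, 0.13]

print(f"n = {len(eps)}")
print(f"sample mean = {sample_mean(eps):.2f}")
```

The values sum to 3.85, so the sample mean is 3.85 / 5 = 0.77 dollars per share.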

Statistical Inference

  • Population: the set of all elements of interest in a study.

    • Finite population: population with limited size.

    • Infinite population: population with unlimited size.

  • Sample: a subset of the population.

  • Statistical inference: using data from a sample to make estimates and test hypotheses about population characteristics.

  • Census: collecting data for the entire population.

  • Sample survey: collecting data for a sample.

  • Any characteristic of a population unit is a variable.

Process of Statistical Inference

1) The population consists of all tune-ups. The average cost of parts for the population is unknown.
2) A sample of 50 engine tune-ups is examined.
3) The sample data provide a sample average parts cost of $79 per tune-up.
4) The sample average is used to estimate the population average.
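
The four steps can be mimicked with a small simulation. Everything about the population here is made up for illustration: the size of 10,000 tune-ups, the normal shape, and the spread of 12 dollars are assumptions, and in a real study the population average would not be available for comparison.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Step 1: a hypothetical population of parts costs (made-up values).
population = [random.gauss(79, 12) for _ in range(10_000)]
population_mean = statistics.mean(population)  # unknown in practice

# Step 2: examine a sample of 50 tune-ups drawn without replacement.
sample = random.sample(population, 50)

# Step 3: summarize the sample with its average.
sample_avg = statistics.mean(sample)

# Step 4: use the sample average as the estimate of the population average.
estimate = sample_avg

print(f"sample average (estimate): {estimate:.2f}")
print(f"population average:        {population_mean:.2f}")
print(f"estimation error:          {estimate - population_mean:+.2f}")
```

Rerunning with a different seed gives a different sample and a slightly different estimate, which is why an estimate from a sample always carries some sampling error.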