BUS105: Statistics - Unit 1 Comprehensive Study Notes
Singapore University of Social Sciences - BUS105: Statistics
Unit 1 Overview
Topics Covered:
1.1 Describing Data: Graphic Presentation
1.2 Describing Data: Numerical Measures
2.1 Basic Probability Concepts
2.2 Discrete Probability Distributions
2.3 Continuous Probability Distributions
3.1 Sampling Methods
3.2 Central Limit Theorem
Statistics: A Comprehensive Introduction
Definition of Statistics:
Statistics is the discipline concerned with the collection, organization, presentation, analysis, and interpretation of data. The main goal of statistics is to make informed and effective decisions.
Key Processes in Statistics
Collection: Gathering data relevant to the study.
Organization: Structuring data in a meaningful way.
Presentation: Visually displaying data through graphs or tables.
Analysis: Examining data to discover patterns or draw conclusions.
Interpretation: Making sense of analyzed data to inform decision-making.
Population and Sample
Population:
Refers to the entire collection of individuals, items, or measurements that are the focus of the statistical study.
Sample:
A subset or portion of the population selected for analysis.
Types of Statistics
Descriptive Statistics:
Involves organizing, summarizing, and presenting data in an informative way.
Inferential Statistics:
Involves drawing conclusions about a population based on sample data.
Variables
Qualitative Variable:
Non-numeric characteristics or categories.
Quantitative Variable:
Measured on a numerical scale and represents a quantity.
Further classified into:
Discrete Variable: Can only assume specific, distinct values, with identifiable gaps between these values.
Continuous Variable: Can assume any value within a given range.
Graphic Presentation of Data
For Qualitative Variables:
Tables: Frequency table, relative frequency table.
Graphs: Bar chart, pie chart.
For Quantitative Variables:
Tables: Frequency distribution, relative frequency distribution.
Graphs: Histogram, frequency polygon, box plot.
Numerical Measures of Data
Measures of Location
Mean ($\bar{X}$):
Arithmetic average calculated as the sum of all observations divided by the total number of observations.
Population Mean ($\mu$):
$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
Sample Mean ($\bar{X}$):
$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$
Median: Middle value when data is arranged in order.
Mode: Most frequently occurring value in a dataset.
Measures of Variation
Range: Difference between the largest and smallest values.
Variance ($\sigma^2$): Arithmetic mean of squared deviations from the mean.
Population Variance:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Sample Variance:
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1}$
Standard Deviation ($\sigma$): Positive square root of the variance.
Population Standard Deviation:
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
Sample Standard Deviation:
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1}}$
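The only difference between the population and sample versions is the divisor ($N$ versus $n-1$). The Python sketch below implements both variance formulas directly and takes the square root for the standard deviation. The eight data values and the function names are hypothetical, chosen only so that the arithmetic is easy to follow by hand.

```python
import math


def population_variance(values):
    """Mean of squared deviations from the mean (divisor N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Hypothetical data, not taken from the course material.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)
pop_sd = math.sqrt(pop_var)      # standard deviation = square root of variance
samp_var = sample_variance(data)
samp_sd = math.sqrt(samp_var)

print("population variance, sd:", pop_var, pop_sd)
print("sample variance, sd:", round(samp_var, 4), round(samp_sd, 4))
```

For these values the mean is 5 and the squared deviations sum to 32, so the population variance is 32/8 = 4 and the sample variance is 32/7, which is slightly larger because of the smaller divisor.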
Skewness
Symmetrical Distribution:
Zero skewness, where Mean = Median.
Positively Skewed Distribution:
Positive skewness, where Mean > Median.
Negatively Skewed Distribution:
Negative skewness, where Mean < Median.
Excel Analysis ToolPak
Steps to Enable Analysis ToolPak:
Go to File > Options.
Click Add-Ins, then select Excel Add-ins from the Manage box.
Click Go.
Select the Analysis ToolPak checkbox and click OK.
Once loaded, the Data Analysis command is available in the Analysis group on the Data tab.
Self-Practice Activities in Excel
Histogram Construction
Activity Example: Gomminger Realty Company studied the selling prices of 30 selected homes. Construct a histogram to display the following data:
$125, 167, 179, 207, 229, 270, 135, 169, 180, 211, 240, 273, 140, 170, 182, 213, 242, 282, 151, 172, 190, 215, 252, 295, 163, 175, 193, 226, 257, 315.$
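Before drawing the histogram in Excel, it can help to build the underlying frequency distribution. The following Python sketch does this for the 30 prices above. The class design is an assumption, since the activity does not specify one: the number of classes is the smallest $k$ with $2^k \ge n$, the class width is rounded up to the next multiple of 10, and each class includes its lower limit but not its upper limit.

```python
import math

prices = [125, 167, 179, 207, 229, 270, 135, 169, 180, 211,
          240, 273, 140, 170, 182, 213, 242, 282, 151, 172,
          190, 215, 252, 295, 163, 175, 193, 226, 257, 315]

n = len(prices)

# Assumed rule: smallest k such that 2**k >= n.
k = 1
while 2 ** k < n:
    k += 1

# Assumed rounding: raw width rounded up to a multiple of 10.
raw_width = (max(prices) - min(prices)) / k
width = math.ceil(raw_width / 10) * 10

# Start the first class at a multiple of the width at or below the minimum.
lower = (min(prices) // width) * width

counts = [0] * k
for p in prices:
    counts[(p - lower) // width] += 1   # class index for this price

# Text histogram: one asterisk per home in the class.
for i, c in enumerate(counts):
    lo = lower + i * width
    print(f"{lo} up to {lo + width}: {'*' * c} ({c})")
```

With these assumptions the data fall into five classes of width 40 starting at 120, with frequencies 4, 11, 6, 6, and 3. A different choice of width or starting point would give a different, equally valid histogram.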
Numerical Measures Calculation
Activity Example: Calculations based on amounts spent on heating gases.
Data: $241, 262, 226, 179, 156, 142, 158, 158, 153, 151, 225, 244.$
Compute the arithmetic mean, median, mode.
Compute variance, standard deviation, and range.
Determine the quartiles.
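The Excel results for this activity can be cross-checked with Python's standard `statistics` module, as in the sketch below. Two assumptions are made: the 12 values are treated as a sample (divisor $n-1$), and quartiles use the module's default "exclusive" method, which locates each quartile at position $(n+1)p$ in the sorted data. Other quartile conventions give slightly different values. `statistics.quantiles` requires Python 3.8 or later.

```python
import statistics

heating = [241, 262, 226, 179, 156, 142, 158, 158, 153, 151, 225, 244]

mean = statistics.mean(heating)
median = statistics.median(heating)
mode = statistics.mode(heating)

variance = statistics.variance(heating)   # sample variance, divisor n - 1
std_dev = statistics.stdev(heating)       # sample standard deviation
data_range = max(heating) - min(heating)

# Three cut points dividing the sorted data into four equal parts.
q1, q2, q3 = statistics.quantiles(heating, n=4)

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "std dev:", round(std_dev, 2), "range:", data_range)
print("Q1:", q1, "Q2:", q2, "Q3:", q3)
```

Under these assumptions the mean is 191.25, the median 168.5, the mode 158, the range 120, and the sample variance 1974.75 (standard deviation about 44.44), with quartiles 153.75, 168.5, and 237.25. The mean exceeding the median is consistent with a positively skewed distribution.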
Activity Reporting
Group Work Example: Analyze the prices of 80 vehicles sold and present a statistical report using Excel.
Statistical Outputs:
Mean: $23,218.16
Median: $22,831
Standard Deviation: $4,354.44
Count: 80
Probability Fundamentals
Definitions
Probability: A measure between 0 and 1 that indicates the likelihood of an event occurring.
Random Variable: A quantity resulting from a random experiment that can assume different values based on chance.
Outcome: A possible result of a probability experiment.
Types of Events
Event: A collection of one or more outcomes from an experiment.
Joint Event: When two or more events occur at the same time.
Complement Event ($\tilde{A}$): When event A does not occur.
Union of Events: At least one of multiple events occurs.
Relationships Between Events
Mutually Exclusive Events: Two events cannot occur simultaneously.
Independent Events: The occurrence of one does not influence the other.
Joint Probability
Joint Probability ($P(A \text{ and } B)$): Likelihood of two or more events occurring together.
Addition Rules of Probability
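If events are mutually exclusive: $P(A \text{ or } B) = P(A) + P(B)$
If events are not mutually exclusive: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
Complement Rule: $P(A) = 1 - P(\tilde{A})$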
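The three rules can be checked by counting outcomes for a single roll of a fair die. The sketch below is illustrative only: the events A, B, and C and the helper `prob` are assumptions introduced for this example. It compares each rule against a direct count of the union or the event itself.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Classical probability: favourable outcomes over all outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}   # even number of spots
B = {4, 5, 6}   # more than three spots (overlaps A)
C = {1}         # exactly one spot (shares no outcome with A)

# General rule: subtract the joint probability so it is not counted twice.
general = prob(A) + prob(B) - prob(A & B)

# Special rule: A and C are mutually exclusive, so nothing is subtracted.
special = prob(A) + prob(C)

# Complement rule: P(A) = 1 - P(not A).
complement = 1 - prob(sample_space - A)

print("P(A or B):", general, "direct count:", prob(A | B))
print("P(A or C):", special, "direct count:", prob(A | C))
print("P(A) via complement:", complement, "direct count:", prob(A))
```

Using `Fraction` keeps the probabilities exact, so each rule can be compared to the direct count with equality rather than a rounding tolerance.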
Probability Distribution
Definitions
Probability Distribution: A function that describes the likelihood of all possible outcomes in a random experiment.
Types of Random Variables
Discrete Random Variable: Takes distinct values.
Example: Number of online orders by a bakery.
Continuous Random Variable: Can take any value in a given range.
Example: Time taken for customer checkout online.
Discrete Probability Distribution Characteristics
Outcomes: Mutually exclusive and exhaustive.
Example table structure for a probability distribution:
Number of Spots, $X$ | Probability, $P(X)$
1 | $\frac{1}{6}$
2 | $\frac{1}{6}$
3 | $\frac{1}{6}$
4 | $\frac{1}{6}$
5 | $\frac{1}{6}$
6 | $\frac{1}{6}$
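Because the outcomes are mutually exclusive and exhaustive, each probability must lie between 0 and 1 and the probabilities must sum to 1. A minimal Python sketch of that check, applied to the die table above, follows. The function name `is_valid_distribution` is hypothetical.

```python
from fractions import Fraction

# Probability distribution for the number of spots on a fair die.
die = {x: Fraction(1, 6) for x in range(1, 7)}


def is_valid_distribution(dist):
    """Check that every probability is in [0, 1] and that they sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in dist.values())
    sums_to_one = sum(dist.values()) == 1
    return each_in_range and sums_to_one


print(is_valid_distribution(die))
```

Exact fractions are used here so the sum-to-one test is not affected by floating-point rounding; with decimal probabilities a small tolerance would be needed instead.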
Mean and Variance of Discrete Distribution
Mean ($\mu$):
$\mu = \sum_{x} x P(x)$
Variance ($\sigma^2$):
$\sigma^2 = \sum_{x} (x - \mu)^2 P(x)$
Example Calculation - Activity 1.4
Distribution of Dishes Ordered:
X: 2, 3, 4
P(X): 0.2, 0.35, 0.45
Calculate mean, variance, and standard deviation of the number of dishes ordered.
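One way to check a hand calculation for this activity is the short Python sketch below, which applies the discrete mean and variance definitions (probability-weighted sums) to the distribution of dishes ordered. The variable names are illustrative.

```python
import math

# Number of dishes ordered (x) mapped to its probability P(x).
dishes = {2: 0.20, 3: 0.35, 4: 0.45}

# Mean: sum of x * P(x) over all values of x.
mean = sum(x * p for x, p in dishes.items())

# Variance: sum of (x - mean)^2 * P(x) over all values of x.
variance = sum((x - mean) ** 2 * p for x, p in dishes.items())

# Standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```

By hand, the mean is 2(0.20) + 3(0.35) + 4(0.45) = 3.25 dishes, the variance is 0.5875, and the standard deviation is about 0.77.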
Binomial Distribution
Characteristics of Binomial Distribution
Binomial Experiment: Each trial results in “success” or “failure”.
Fixed Trials: Number of trials is set in advance.
Probability of Success ($p$): Remains constant across trials.
Independence: Each trial is independent of others.
Excel Function for Binomial Distribution
Function: BINOM.DIST(number_s, trials, probability_s, cumulative)
Parameters:
number_s: Number of successes.
trials: Total number of trials.
probability_s: Probability of success.
cumulative: If TRUE, cumulative probability; FALSE for exact probability.
Example of Binomial Distribution
Case Study: Tossing a fair coin three times:
Probabilities for the number of heads, X:
P(X=0) = P(X=3) = 0.125
P(X=1) = P(X=2) = 0.375
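These values can be reproduced from the standard binomial probability formula, $P(x) = \binom{n}{x} p^x (1-p)^{n-x}$, which these notes do not state explicitly. The Python sketch below is one way to do so; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Three tosses of a fair coin: n = 3, p = 0.5, X = number of heads.
coin = [binomial_pmf(x, 3, 0.5) for x in range(4)]

for x, prob in enumerate(coin):
    print(f"P(X={x}) = {prob}")
```

The four probabilities sum to 1, as required of any discrete probability distribution.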
Self-Practice - Binomial Distribution
Activity:
Given 4 clients and a probability of sale $p=0.20$ for each, calculate the probability of making a sale to exactly 3 of the 4 clients.
Calculate the probability of selling more than 1 policy.
Calculate the probability of selling fewer than 4 policies.
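A sketch of how the three answers could be checked in Python follows. The helper `binom_dist` is hypothetical: it mirrors the parameter names listed for the Excel function above (with `cumulative=True` returning the probability of at most `number_s` successes) but is not the Excel implementation. It assumes $n = 4$ independent clients, each with $p = 0.20$.

```python
from math import comb


def binom_dist(number_s, trials, probability_s, cumulative):
    """Binomial probability: exact if cumulative is False, else P(X <= number_s)."""
    def pmf(x):
        q = 1 - probability_s
        return comb(trials, x) * probability_s ** x * q ** (trials - x)

    if cumulative:
        return sum(pmf(x) for x in range(number_s + 1))
    return pmf(number_s)


n, p = 4, 0.20

exactly_3 = binom_dist(3, n, p, False)        # P(X = 3)
more_than_1 = 1 - binom_dist(1, n, p, True)   # P(X > 1) = 1 - P(X <= 1)
fewer_than_4 = binom_dist(3, n, p, True)      # P(X < 4) = P(X <= 3)

print(round(exactly_3, 4), round(more_than_1, 4), round(fewer_than_4, 4))
```

The "more than" and "fewer than" questions are both rewritten in terms of a cumulative probability, which is usually the quickest route when using a cumulative function. The results are 0.0256, 0.1808, and 0.9984 respectively.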
Conclusion
Statistics supports informed, effective decision-making by providing systematic methods for collecting, summarizing, and analyzing data.