Average: A central value of a statistical series that describes characteristics of a distribution.
Measures of Central Tendency: Main measures include:
Arithmetic Mean
Geometric Mean
Harmonic Mean
Median
Mode
Definition: The arithmetic mean is the sum of all values divided by the number of items.
Calculation Methods:
Simple Arithmetic Mean for Ungrouped Data:
Direct Method: ( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} )
Grouped Data:
Direct Method: ( \bar{x} = \frac{\sum{f_ix_i}}{\sum f_i} )
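The two direct-method formulas above can be sketched in a few lines of Python. The function names and sample numbers are illustrative assumptions, not part of the original notes.

```python
def mean_ungrouped(values):
    """Direct method: sum of the values divided by their count."""
    return sum(values) / len(values)


def mean_grouped(midpoints, frequencies):
    """Direct method for grouped data: sum(f_i * x_i) / sum(f_i)."""
    weighted_total = sum(f * x for x, f in zip(midpoints, frequencies))
    return weighted_total / sum(frequencies)


marks = [4, 8, 15, 16, 23, 42]       # illustrative ungrouped values
print(mean_ungrouped(marks))         # 108 / 6 = 18.0

midpoints = [5, 15, 25]              # illustrative class midpoints x_i
frequencies = [2, 3, 5]              # illustrative frequencies f_i
print(mean_grouped(midpoints, frequencies))  # 180 / 10 = 18.0
```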
Formula: ( G.M. = (x_1 \times x_2 \times \cdots \times x_n)^{1/n} ), defined for positive values.
Frequency Distribution: For a frequency distribution, ( G.M. = (x_1^{f_1} \times x_2^{f_2} \times \cdots \times x_n^{f_n})^{1/N} ), where ( N = \sum f_i ).
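As a minimal sketch, the geometric mean can be computed through logarithms, which avoids overflow when many values are multiplied. The function name and the sample data are assumptions chosen for illustration.

```python
import math


def geometric_mean(values, frequencies=None):
    """G.M. = (x_1^f_1 * ... * x_n^f_n)^(1/N), with N = sum of frequencies.

    With no frequencies given, every value counts once (the simple formula).
    """
    if frequencies is None:
        frequencies = [1] * len(values)
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    n_total = sum(frequencies)
    # log(G.M.) = sum(f_i * log(x_i)) / N
    log_sum = sum(f * math.log(x) for x, f in zip(values, frequencies))
    return math.exp(log_sum / n_total)


print(geometric_mean([2, 8]))          # close to 4.0, since sqrt(2 * 8) = 4
print(geometric_mean([2, 8], [3, 1]))  # (2^3 * 8)^(1/4) = 64^(1/4), about 2.83
```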
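Definition: The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the values: ( H.M. = \frac{n}{\sum \frac{1}{x_i}} )
For Frequency Distribution: ( H.M. = \frac{N}{\sum \frac{f_i}{x_i}} ), where ( N = \sum f_i )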
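A short sketch of both harmonic mean formulas follows. The average-speed scenario and the function name are illustrative assumptions.

```python
def harmonic_mean(values, frequencies=None):
    """H.M. = N / sum(f_i / x_i); values must be non-zero.

    With no frequencies given, every value counts once, so N = n.
    """
    if frequencies is None:
        frequencies = [1] * len(values)
    n_total = sum(frequencies)
    return n_total / sum(f / x for x, f in zip(values, frequencies))


# Average speed over two equal distances driven at 40 and 60 (any speed unit):
print(harmonic_mean([40, 60]))          # about 48, not the arithmetic mean 50

# Same speeds, but the second distance is covered twice as often:
print(harmonic_mean([40, 60], [1, 2]))  # 360 / 7, about 51.43
```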
Definition: The median is the middle value of the ordered data, separating the higher half from the lower half.
Calculation:
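Individual Series (values arranged in ascending order):
If ( n ) is odd: ( Median = value_{(n+1)/2} )
If ( n ) is even: ( Median = \frac{value_{n/2} + value_{(n/2)+1}}{2} )
Continuous Series: ( Median = L + \frac{N/2 - C}{f} \cdot i ), where ( L ) is the lower boundary of the median class, ( N ) the total frequency, ( C ) the cumulative frequency of the classes before the median class, ( f ) the frequency of the median class, and ( i ) the class width.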
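The median rules above might be sketched as follows. The sketch assumes contiguous classes of equal width listed in ascending order; the function names and the sample distribution are hypothetical.

```python
def median_individual(values):
    """Median of an individual series (sorts the data first)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the (n+1)/2-th value, 1-based
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of the two middle values


def median_continuous(lower_bounds, width, frequencies):
    """Median = L + ((N/2 - C) / f) * i for equal-width classes."""
    n_total = sum(frequencies)
    if n_total == 0:
        raise ValueError("distribution has no observations")
    cumulative = 0  # C: cumulative frequency before the current class
    for lower, f in zip(lower_bounds, frequencies):
        if cumulative + f >= n_total / 2:         # this is the median class
            return lower + (n_total / 2 - cumulative) / f * width
        cumulative += f


print(median_individual([7, 1, 3]))      # 3
print(median_individual([7, 1, 3, 5]))   # (3 + 5) / 2 = 4.0

# Classes 0-10, 10-20, 20-30, 30-40 with frequencies 5, 10, 20, 5:
# N/2 = 20, median class 20-30, C = 15, f = 20, i = 10
print(median_continuous([0, 10, 20, 30], 10, [5, 10, 20, 5]))  # 22.5
```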
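Definition: The mode is the value that appears most frequently in a data set.
Formula for Continuous Series: ( Mode = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \cdot i ), where ( L ) is the lower boundary of the modal class, ( f_1 ) its frequency, ( f_0 ) and ( f_2 ) the frequencies of the preceding and following classes, and ( i ) the class width.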
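A minimal sketch of the continuous-series mode formula is given below. It assumes equal-width classes and a single modal class, and it treats a missing neighbouring class as having frequency 0; those choices and the names are assumptions.

```python
def mode_continuous(lower_bounds, width, frequencies):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * i for equal-width classes."""
    # Index of the modal class (first class with the highest frequency).
    k = max(range(len(frequencies)), key=lambda j: frequencies[j])
    f1 = frequencies[k]
    f0 = frequencies[k - 1] if k > 0 else 0                     # class before
    f2 = frequencies[k + 1] if k < len(frequencies) - 1 else 0  # class after
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: neighbouring classes tie with the modal class")
    return lower_bounds[k] + (f1 - f0) / denominator * width


# Classes 0-10, 10-20, 20-30, 30-40 with frequencies 5, 10, 20, 5:
# modal class 20-30, f1 = 20, f0 = 10, f2 = 5, so 20 + (10 / 25) * 10
print(mode_continuous([0, 10, 20, 30], 10, [5, 10, 20, 5]))  # about 24
```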
Description: A pie chart is a circular graph divided into slices, each representing a component's share of the total.
Central Angle Calculation: For a component of value ( n ) out of a total ( N ), the central angle is ( \frac{n}{N} \times 360^\circ )
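The central-angle rule can be sketched as below. The budget figures are made up for illustration.

```python
def central_angles(components):
    """Map each component to its slice angle in degrees: value / total * 360."""
    total = sum(components.values())
    return {name: value / total * 360 for name, value in components.items()}


budget = {"rent": 500, "food": 300, "other": 200}   # illustrative data
for name, angle in central_angles(budget).items():
    print(f"{name}: {angle:.1f} degrees")
```

The angles always add up to 360 degrees, which is a quick check on any hand calculation.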
Definition: Dispersion indicates how spread out the values are around the mean.
Measures Include:
Range: Difference between highest and lowest values, ( Range = X_{max} - X_{min} )
Variance: Average of squared deviations from the mean.
Standard Deviation (S.D.): Square root of variance.
Definition: Variance measures the dispersion of a set of values around their mean.
Formula:
For grouped data: ( \text{Variance} = \frac{\sum f_i (x_i - \bar{x})^2}{N} ), where ( N = \sum f_i ); the standard deviation is its square root.
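The grouped-data variance, together with the range and standard deviation, might be computed as in this sketch. It uses the divide-by-N form given above; names and data are illustrative assumptions.

```python
import math


def value_range(values):
    """Range = X_max - X_min."""
    return max(values) - min(values)


def grouped_variance(midpoints, frequencies):
    """Variance = sum(f_i * (x_i - mean)^2) / N, with N = sum(f_i)."""
    n_total = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n_total
    return sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies)) / n_total


midpoints = [5, 15, 25]      # illustrative class midpoints
frequencies = [2, 3, 5]      # illustrative frequencies

variance = grouped_variance(midpoints, frequencies)
print(value_range(midpoints))   # 25 - 5 = 20
print(variance)                 # mean is 18, so (2*169 + 3*9 + 5*49) / 10 = 61.0
print(math.sqrt(variance))      # standard deviation, about 7.81
```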
Definition: Skewness measures the asymmetry of a distribution around its mean.
Types:
Negative Skew: Mean < Median < Mode
Positive Skew: Mean > Median > Mode
Formula: ( Skewness = \frac{E[(X - \mu)^3]}{\sigma^3} )
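A small sketch of the moment-based skewness formula, applied to a finite data set with expectations replaced by averages over the data (a population-style estimate, which is an assumption of this sketch):

```python
def skewness(values):
    """Third central moment divided by the cube of the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # variance (second moment)
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5


print(skewness([1, 2, 3, 4, 5]))    # symmetric data: 0.0
print(skewness([1, 2, 3, 4, 10]))   # long right tail: positive, about 1.14
```

In the second data set the mean (4) exceeds the median (3), matching the positive-skew ordering listed above.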
Definition: Correlation measures the degree to which two variables are related.
Types of Distributions:
Univariate: One variable.
Bivariate: Two related variables.
Definition: Covariance measures how two random variables change together.
Formula: [ Cov(X,Y) = \frac{\sum{(X_i - \bar{X})(Y_i - \bar{Y})}}{n} ]
Types of Correlation:
Perfect, positive, negative.
Karl Pearson’s Correlation Coefficient: [ r = \frac{Cov(X,Y)}{\sigma_X \sigma_Y} ]
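The covariance and Karl Pearson formulas can be sketched together. Both use the divide-by-n form shown above; the function names and the paired data are illustrative assumptions.

```python
import math


def covariance(xs, ys):
    """Cov(X, Y) = sum((x_i - mean_x) * (y_i - mean_y)) / n."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def pearson_r(xs, ys):
    """r = Cov(X, Y) / (sigma_X * sigma_Y); both series must vary."""
    sigma_x = math.sqrt(covariance(xs, xs))   # Cov(X, X) is the variance of X
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)


xs = [1, 2, 3, 4, 5]          # illustrative paired observations
ys = [2, 4, 5, 4, 5]
print(covariance(xs, ys))     # 6 / 5 = 1.2
print(pearson_r(xs, ys))      # 1.2 / sqrt(2 * 1.2), about 0.775
```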
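Definition: Rank correlation measures the correlation between the rankings of two variables.
Spearman's Rank Correlation Formula: [ r = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} ] where ( d_i ) is the difference between the two ranks of the ( i )-th item and ( n ) is the number of pairs; the formula assumes no tied ranks.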
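A minimal sketch of Spearman's formula follows. It assumes there are no tied values, since the simple formula does not handle ties; the helper names and scores are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rank(xs, ys):
    """r = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), d_i = difference of ranks."""
    if len(set(xs)) != len(xs) or len(set(ys)) != len(ys):
        raise ValueError("tied values need a tie-corrected formula")
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


judge_a = [10, 20, 30, 40, 50]   # illustrative scores from two judges
judge_b = [1, 3, 2, 5, 4]
print(spearman_rank(judge_a, judge_b))   # sum of d^2 is 4, so 1 - 24/120, about 0.8
```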
Definition: Regression is a predictive modeling technique that models the relationship between two variables.
Regression Equations:
For ( y ) on ( x ): [ y - \bar{y} = b_{yx}(x - \bar{x}) ]
For ( x ) on ( y ): [ x - \bar{x} = b_{xy}(y - \bar{y}) ]
A regression coefficient gives the average change in the dependent variable per unit change in the independent variable.
The product of the two regression coefficients equals the square of the correlation coefficient: ( b_{yx} \cdot b_{xy} = r^2 ).
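The relationship between the two regression coefficients and the correlation coefficient can be checked numerically. In this sketch each coefficient is the covariance divided by the variance of the independent variable; the variable names and data are illustrative assumptions.

```python
import math


def regression_coefficients(xs, ys):
    """Return (coefficient of y on x, coefficient of x on y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    slope_y_on_x = cov / var_x    # change in y per unit change in x
    slope_x_on_y = cov / var_y    # change in x per unit change in y
    return slope_y_on_x, slope_x_on_y


xs = [1, 2, 3, 4, 5]              # illustrative paired observations
ys = [2, 4, 5, 4, 5]
slope_y_on_x, slope_x_on_y = regression_coefficients(xs, ys)
print(slope_y_on_x)               # 1.2 / 2, about 0.6
print(slope_x_on_y)               # 1.2 / 1.2, about 1.0

# The product of the two coefficients is r^2, so r is its square root,
# carrying the common sign of the coefficients (positive here).
r = math.sqrt(slope_y_on_x * slope_x_on_y)
print(r)                          # about 0.775
```

For this data the line of ( y ) on ( x ) passes through the means (3, 4) with slope about 0.6.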
Standard Error of Prediction (standard error of estimate): The typical deviation of the observed values from the values predicted by the regression line.
Relation to Correlation Coefficient:
[ S_{yx} = \sigma_y \sqrt{1 - r^2} ]
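Probable Error: Indicates the reliability of the correlation coefficient: ( P.E. = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} ), where ( \frac{1 - r^2}{\sqrt{n}} ) is the standard error of ( r ).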
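As a hedged sketch, the probable error can be computed with the conventional textbook formula 0.6745 × (1 − r²) / √n. That formula, the function name, and the sample values are assumptions of this example.

```python
import math


def probable_error(r, n):
    """P.E. = 0.6745 * (1 - r^2) / sqrt(n), for correlation r from n pairs."""
    standard_error = (1 - r ** 2) / math.sqrt(n)   # standard error of r
    return 0.6745 * standard_error


r, n = 0.8, 25                     # illustrative correlation and sample size
pe = probable_error(r, n)
print(pe)                          # 0.6745 * 0.36 / 5, about 0.0486
print(abs(r) > 6 * pe)             # common rule of thumb: True suggests r is significant
```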