Part 2
Descriptive statistics
As the name suggests, _____ help describe and understand a dataset by providing a short summary of it. The most common types of _____ include measures of central tendency and measures of dispersion, among others.
Statistics
is a branch of mathematics that deals with collecting, organizing, and interpreting data. Hence, by using statistical concepts, we can understand the nature of the data, a summary of the dataset, and the type of distribution that the data has.
1. Measures of central tendency
2. Measures of variability (spread)
There are two types of descriptive statistics
Measure of central tendency
describes the center or typical value of a dataset, giving a single number that summarizes the entire set of measurements.
Mean, or average
is a number around which the observed continuous values are distributed. This number serves as a single estimate that represents the entire dataset. Mathematically, it is the sum of the values divided by the number of values in the dataset.
Median
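Given a dataset that is sorted in either ascending or descending order, the _____ is the middle value that divides the data into two equal halves. If the dataset has an even number of values, it is the average of the two middle values.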
Mode
The _____ is the value that appears the maximum number of times in the dataset; that is, the value with the highest frequency. In the x dataset in the median example, the _____ is 2 because it occurs twice in the set.
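The three measures of central tendency above can be sketched with Python's standard `statistics` module. The `data` list is a hypothetical sample chosen only for illustration; it is not the x dataset mentioned in the card.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [1, 2, 2, 3, 4, 7, 9]

# Mean: sum of the values divided by the number of values.
mean_value = statistics.mean(data)

# Median: middle value of the sorted data.
median_value = statistics.median(data)

# Mode: value with the highest frequency.
mode_value = statistics.mode(data)

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

For this sample the three measures differ from one another, which is typical when a few large values pull the mean upward.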
Measures of dispersion
Also known as measures of variability. They describe the variability in a dataset, which can be a sample or a population. They are usually used together with a measure of central tendency to give an overall description of a set of data. A _____ gives us an idea of how well the central tendency represents the data.
Standard deviation
In simple terms, the _____ is the square root of the average of the squared differences between each value in the dataset and the mean; that is, it measures how far the data is spread out from the mean.
Variance
_____ is the average of the squared differences between each value in the dataset and the mean; that is, it is the square of the standard deviation.
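A minimal sketch of how variance and standard deviation relate, assuming the population form (dividing by the number of values rather than by one less). The `data` list is hypothetical; `statistics.pvariance` and `statistics.pstdev` are used only as a cross-check.

```python
import math
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = sum(data) / len(data)

# Squared difference between each value and the mean.
squared_diffs = [(x - mean_value) ** 2 for x in data]

# Population variance: average of the squared differences.
variance = sum(squared_diffs) / len(data)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print("variance:", variance, "library:", statistics.pvariance(data))
print("std dev:", std_dev, "library:", statistics.pstdev(data))
```

When the data is a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by one less than the number of values instead.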
Skewness
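In probability theory and statistics, _____ is a measure of the asymmetry of a variable's distribution about its mean. The _____ value can be positive, negative, zero, or undefined.
The _____ value tells us whether the data is skewed or symmetric: a positive value indicates a longer right tail, a negative value a longer left tail, and a value near zero a roughly symmetric distribution.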
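As a rough illustration, skewness can be computed as the third central moment divided by the standard deviation cubed. This is a sketch of the population (moment-based) form; the helper names `central_moment` and `skewness` and both datasets are hypothetical.

```python
def central_moment(data, k):
    """Average of (x - mean) ** k over the dataset."""
    mean_value = sum(data) / len(data)
    return sum((x - mean_value) ** k for x in data) / len(data)


def skewness(data):
    """Third central moment divided by the standard deviation cubed."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


# Hypothetical datasets, chosen only for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 10]

print("symmetric:", skewness(symmetric))
print("right-skewed:", skewness(right_skewed))
```

The symmetric list gives a skewness of zero, while the list with one large value on the right gives a positive skewness.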
Kurtosis
_____ is a statistical measure of how heavily the tails of a distribution differ from those of a normal distribution. It can indicate whether a given distribution contains extreme values.
Mesokurtic, Leptokurtic, Platykurtic
Types of Kurtosis
Mesokurtic
If a dataset follows a normal distribution, it follows a _____ distribution. It has kurtosis around 3, which is an excess kurtosis around 0.
Leptokurtic
In this case, the distribution has kurtosis greater than 3 (positive excess kurtosis), and its fat tails indicate that the distribution produces more outliers than a normal distribution.
Platykurtic
In this case, the distribution has kurtosis less than 3 (negative excess kurtosis), and its tails are very thin compared to the normal distribution.
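The three types of kurtosis can be sketched by computing the fourth central moment divided by the squared variance and comparing the result with 3, the value for a normal distribution. The helper names, the `tolerance` threshold, and both datasets are hypothetical choices made for illustration.

```python
def central_moment(data, k):
    """Average of (x - mean) ** k over the dataset."""
    mean_value = sum(data) / len(data)
    return sum((x - mean_value) ** k for x in data) / len(data)


def kurtosis(data):
    """Fourth central moment divided by the squared variance."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


def kurtosis_type(data, tolerance=0.1):
    """Classify by excess kurtosis (kurtosis minus 3).

    The tolerance is an arbitrary illustrative threshold for
    treating a value as "around 0".
    """
    excess = kurtosis(data) - 3
    if abs(excess) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if excess > 0 else "platykurtic"


# Hypothetical datasets, chosen only for illustration.
flat = [1, 2, 3, 4, 5]                        # evenly spread, thin tails
heavy_tail = [0, 0, 0, 0, 0, 0, 0, 0, 0, 10]  # one extreme value

print(kurtosis(flat), kurtosis_type(flat))
print(kurtosis(heavy_tail), kurtosis_type(heavy_tail))
```

Note that some libraries report excess kurtosis by default, so a normal distribution shows up as 0 rather than 3.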