raw scores
data that have not yet been transformed or analyzed
frequency distribution
a summary of data that shows the number of times each value or range of values occurs in a dataset. It can be presented as a table or as a graph, and it helps identify patterns or trends in the data.
frequency table
a statistical table that displays how often each category or value occurs in a dataset. The table typically consists of two columns: one for the categories or values and another for their corresponding frequencies, which can be given as counts or percentages. Frequency tables are commonly used in data analysis to summarize and organize categorical or discrete data, and they provide a clear overview of the distribution and patterns within the dataset.
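The two-column layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard routine: the function name `frequency_table` and the quiz scores are made up for the example, and a third column of percentages is added alongside the counts.

```python
from collections import Counter

def frequency_table(scores):
    """Return (value, count, percent) rows, sorted by value."""
    counts = Counter(scores)
    n = len(scores)
    return [(value, count, 100 * count / n)
            for value, count in sorted(counts.items())]

# Made-up raw scores, for illustration only.
quiz_scores = [3, 5, 3, 4, 5, 5, 2, 4, 5, 3]

print("value  count  percent")
for value, count, percent in frequency_table(quiz_scores):
    print(f"{value:>5}  {count:>5}  {percent:>6.1f}%")
```

The counts always sum to the number of raw scores, which is a quick check that no value was dropped.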
outlier
a data point that deviates markedly from the other data points in a dataset; an observation that lies an abnormal distance from the other values. Outliers can arise from measurement errors, experimental errors, or genuine anomalies in the data. Identifying and examining outliers is important in statistical analysis because they can distort summary statistics and change the conclusions drawn from the data.
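The definition above does not say how far is "abnormal". One common convention, assumed here rather than taken from this glossary, flags any value more than 1.5 interquartile ranges below the first quartile or above the third quartile. The function name `find_outliers` and the data are hypothetical.

```python
from statistics import quantiles

def find_outliers(data, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is a widely used convention, not a universal rule.
    """
    q1, _, q3 = quantiles(data, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# Made-up measurements with one suspicious value.
reaction_times = [12, 14, 15, 15, 16, 17, 18, 19, 48]
print(find_outliers(reaction_times))
```

A flagged value is only a candidate: whether it is an error or a genuine anomaly still has to be judged from how the data were collected.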
histogram
a graphical representation of the distribution of a dataset. It consists of a series of bars, where the height of each bar represents the frequency or count of data falling within a specific interval or bin. Histograms are commonly used to visualize the shape, center, and spread of continuous or discrete data. They are particularly useful for displaying large datasets and identifying patterns or trends in the data distribution.
grouped frequency table
a way to organize and summarize data by grouping it into intervals or classes. It shows the frequency or count of data values that fall within each interval. This table is useful when dealing with large data sets or continuous data, as it provides a clearer overview of the distribution of values.
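As a rough sketch of grouping into intervals, the function below counts integer scores in classes of equal width. The name `grouped_frequency_table`, the starting point of 50, the width of 10, and the exam scores are all assumptions chosen for illustration.

```python
def grouped_frequency_table(scores, start, width):
    """Count integer scores in intervals of equal width.

    Assumes every score is an integer greater than or equal to `start`.
    Returns (lower, upper, count) rows, including empty intervals.
    """
    n_intervals = (max(scores) - start) // width + 1
    counts = [0] * n_intervals
    for score in scores:
        counts[(score - start) // width] += 1
    return [(start + i * width, start + (i + 1) * width - 1, count)
            for i, count in enumerate(counts)]

# Made-up exam scores, for illustration only.
exam_scores = [52, 67, 71, 75, 78, 81, 84, 85, 88, 93, 97]

for lower, upper, count in grouped_frequency_table(exam_scores, start=50, width=10):
    # One '#' per score gives a rough text picture of the distribution.
    print(f"{lower}-{upper}  {count:>2}  {'#' * count}")
```

The interval width is a judgment call: wider intervals give a smoother picture but hide detail, narrower ones do the opposite.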
normal distribution
a symmetric, bell-shaped probability distribution, also known as a Gaussian distribution. It is fully determined by its mean (average), which sets the center, and its standard deviation, which sets the spread. In a normal distribution, the majority of the data falls near the mean, with fewer data points further away from the mean. It is widely used in statistics and probability theory to model various phenomena in the natural and social sciences.
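To make "the majority of the data falls near the mean" concrete, the sketch below uses `statistics.NormalDist` from the Python standard library to compute the proportion of a normal distribution lying within 1, 2, and 3 standard deviations of the mean. The helper name `proportion_within` and the example mean of 100 and standard deviation of 15 are illustrative assumptions.

```python
from statistics import NormalDist

def proportion_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Hypothetical scale with mean 100 and standard deviation 15.
for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k, mu=100, sigma=15):.4f}")
```

The proportions come out at roughly 68%, 95%, and 99.7%, and they are the same for any choice of mean and standard deviation.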
skewed distributions
probability distributions that are asymmetrical, meaning they are not symmetrical around their mean. There are two types: positively skewed (right-skewed) and negatively skewed (left-skewed). In a positively skewed distribution, the tail extends toward the right, while in a negatively skewed distribution, the tail extends toward the left. Skewness is a measure of the extent and direction of this asymmetry: it is positive for right-skewed data, negative for left-skewed data, and zero for a perfectly symmetric distribution. Because the mean is pulled toward the tail, skew affects which summary statistics describe the data well.
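One way to put a number on skewness is the moment coefficient, the third central moment divided by the standard deviation cubed. This is a sketch of that one formula (other definitions, including sample-corrected versions, exist); the function name `skewness` and the datasets are made up for illustration.

```python
from statistics import fmean, median

def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(data)
    mean = fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

# Made-up data: a long right tail, and its mirror image with a long left tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]
left_skewed = [11 - x for x in right_skewed]

for name, data in (("right", right_skewed), ("left", left_skewed)):
    print(f"{name}: skewness={skewness(data):+.2f}, "
          f"mean={fmean(data):.2f}, median={median(data):.2f}")
```

In the right-skewed data the mean (3.5) lies above the median (3), and in the mirrored data the mean (7.5) lies below the median (8), which matches the direction of each tail.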
positively skewed
a type of distribution where the tail extends toward the right side. The majority of the data is concentrated on the left side, while a few extreme values on the right pull the mean toward the right. As a result, the median is typically smaller than the mean. It is also known as a right-skewed or right-tailed distribution.
floor effect
a situation where a variable or measure is unable to accurately capture lower values due to a lower limit or floor. This can occur when a measurement instrument or assessment is not sensitive enough to detect or differentiate scores below a certain threshold. It can lead to a clustering of scores at the lower end of the scale and a lack of variability in the data. A floor effect usually leads to a positively skewed distribution.
negatively skewed
a type of distribution where the tail extends toward the left side. The majority of the data is concentrated on the right side, while a few extreme values on the left pull the mean toward the left. As a result, the median is typically larger than the mean. It is also known as a left-skewed or left-tailed distribution.
ceiling effect
a phenomenon in statistical analysis where many scores reach the maximum possible value of a measure, resulting in a clustering of scores at the upper limit of the measurement scale. This can occur when a measurement instrument or test is not sensitive enough to accurately differentiate between high-performing individuals. A ceiling effect can limit the ability to detect true differences among participants and may lead to an underestimation of their abilities or characteristics. It typically leads to a negatively skewed distribution.
dot plot
a simple data visualization tool that uses circles to represent data points on a number line. It is used to display the distribution and frequency of a dataset. Each circle represents a single data point, and circles are stacked vertically to show the frequency or count of each data value. Dot plots are useful for quickly identifying patterns, outliers, and the overall shape of a dataset. They are especially effective for small to moderate-sized datasets.
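The stacking idea can be sketched as a plain-text dot plot. This is only an illustration: the function name `dot_plot` and the data are made up, and the simple alignment assumes single-digit integer values.

```python
from collections import Counter

def dot_plot(data):
    """Build a text dot plot: one 'o' per data point, stacked above its value.

    Assumes single-digit integer values so that the columns line up.
    """
    counts = Counter(data)
    values = range(min(data), max(data) + 1)
    rows = []
    # Draw from the tallest stack down to the number line.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("o" if counts[v] >= level else " " for v in values)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in values))
    return "\n".join(rows)

# Made-up data: the value 4 never occurs and 5 sits apart from the rest.
print(dot_plot([1, 2, 2, 3, 3, 3, 5]))
```

The gap above 4 and the lone dot above 5 show how a dot plot makes gaps and isolated values visible at a glance.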
kurtosis
a statistical measure that describes the shape of a probability distribution. It quantifies the heaviness of the tails, and is often also read as the peakedness, of the distribution compared to a normal distribution. It is usually reported as excess kurtosis (kurtosis minus 3), so that a normal distribution scores zero. Positive excess kurtosis indicates heavier tails and a sharper peak, while negative excess kurtosis indicates lighter tails and a flatter peak.
platykurtic
a statistical distribution that has a flatter peak and lighter tails compared to a normal distribution, giving it negative excess kurtosis. It produces fewer extreme values than a normal distribution, with the data spread more evenly across its range. The uniform distribution is an example.
mesokurtic
a statistical distribution with an excess kurtosis of zero (a kurtosis of 3). In such a distribution, the peak and the tails are similar to those of a normal distribution, so extreme values occur about as often as they would in a normal distribution. The normal distribution itself is mesokurtic.
leptokurtic
a statistical distribution that has a higher peak and heavier tails compared to a normal distribution, giving it positive excess kurtosis. It indicates that the data has more extreme values than a normal distribution and is also more concentrated around the mean.
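The three kurtosis categories can be illustrated with a small calculation of excess kurtosis, taken here as the fourth central moment divided by the squared variance, minus 3. This is a sketch under stated assumptions: the function names, the datasets, and especially the `tolerance` cut-off for calling a sample mesokurtic are hypothetical choices, not established thresholds.

```python
from statistics import fmean

def excess_kurtosis(data):
    """Population excess kurtosis: m4 / m2**2 - 3 (zero for a normal distribution)."""
    n = len(data)
    mean = fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0

def classify(excess, tolerance=0.5):
    """Label a distribution by its excess kurtosis (tolerance is an arbitrary choice)."""
    if excess > tolerance:
        return "leptokurtic"
    if excess < -tolerance:
        return "platykurtic"
    return "mesokurtic"

# Made-up data: evenly spread values versus a tight cluster with two extreme values.
flat = list(range(1, 101))
heavy_tailed = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]

for name, data in (("flat", flat), ("heavy-tailed", heavy_tailed)):
    k = excess_kurtosis(data)
    print(f"{name}: excess kurtosis={k:+.2f} ({classify(k)})")
```

The evenly spread data come out near -1.2, the value for a uniform distribution, while the two extreme values push the second dataset well above zero. With real samples the estimate is noisy, so small departures from zero should not be over-interpreted.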