Descriptive Analytics
is a statistical technique used to search and summarize historical data to identify patterns or meanings.
Descriptive Analytics
It is an initial stage in data processing that creates a summary of historical data to obtain useful information and possibly prepare the data for further analysis.
Descriptive Analytics
Examination of data or content, usually performed manually, to answer the question "What happened?" (or "What is happening?")
Descriptive Analytics
characterized by traditional business intelligence (BI) and visualizations such as pie charts, bar charts, line graphs, tables, or generated narratives
Discrete Attribute
Types of Data
has a finite or countably infinite set of values, which may or may not be represented as integers.
Continuous Attribute
Types of Data
Are typically represented as floating-point variables. The terms numeric attribute and continuous attribute are often used interchangeably in the literature.
Discrete Attribute
Types of Data
Countable values, typically whole numbers (including zero). Mainly used for counting.
Continuous Attribute
Types of Data
Measurable values that can take any number in an interval (any number from A to B), usually written in decimal form. Mainly used for measuring.
Measures of Central Tendency
BASIC STATISTICAL DESCRIPTION OF DATA
can be defined as a descriptive statistical method which describes or shows the center value in a dataset.
Measures of Central Tendency
BASIC STATISTICAL DESCRIPTION OF DATA
It can be referred to as the measure of central location where most values in a distribution fall.
Mean (Average)
BASIC STATISTICAL DESCRIPTION OF DATA
The sum of all values in a dataset divided by the total number of values.
Median
BASIC STATISTICAL DESCRIPTION OF DATA
simply the middle value in a sorted dataset. In the case where the dataset has an even number of values, the ___________ of that dataset is the average or mean of the two middle values.
Mode
BASIC STATISTICAL DESCRIPTION OF DATA
the most frequently occurring value in a dataset. Some datasets may contain multiple ________ while others may not have any _______ at all. It is the measure of central tendency with the largest frequency in a table.
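The three measures defined above can be computed with Python's standard `statistics` module. This is a minimal sketch; the `scores` list is a made-up sample, not data from the cards.

```python
import statistics

# Hypothetical sample; already sorted so the middle values are easy to see.
scores = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(scores)      # 30 / 6 = 5
median_value = statistics.median(scores)  # even count: (3 + 5) / 2 = 4.0
modes = statistics.multimode(scores)      # [3], the only value that repeats

# A dataset can have more than one mode.
two_modes = statistics.multimode([1, 1, 2, 2, 3])  # [1, 2]

print(mean_value, median_value, modes, two_modes)
```

Note that `multimode` returns every value when nothing repeats, so a "no mode" case has to be detected separately (for example, by checking whether all values occur exactly once).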
Long Left Tail
Rules for Skewness
In a negative skew, what do you call its tail?
Long Right Tail
Rules for Skewness
In a positive skew, what do you call its tail?
Normal Distribution
It is a distribution in which the measures of central tendency sit in the middle, with symmetrical sides and asymptotic tails.
Left-Skewed (Negative Skewness)
In a skewed data set, what type of skewness contains the following order:
Mean (lowest), Median, Mode (highest) where:
Mean < Median < Mode
Right-Skewed (Positive Skewness)
In a skewed data set, what type of skewness contains the following order:
Mode, Median, Mean where:
Mean > Median > Mode
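The ordering rules for skewed data can be checked on small examples. The sketch below uses two made-up datasets, one with a long right tail and its mirror image with a long left tail; `central_measures` is a hypothetical helper name. The ordering is a common rule of thumb rather than a guarantee for every dataset.

```python
import statistics

def central_measures(values):
    """Return (mean, median, mode) for a dataset with a single mode."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.mode(values))

# Hypothetical data: one large value (10) stretches the right tail.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 10]
# Mirror image: one small value (1) stretches the left tail.
left_skewed = [1, 7, 8, 8, 9, 9, 9, 10]

r_mean, r_median, r_mode = central_measures(right_skewed)
l_mean, l_median, l_mode = central_measures(left_skewed)

print(r_mean, r_median, r_mode)  # mean > median > mode
print(l_mean, l_median, l_mode)  # mean < median < mode
```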
Mode
What is the best measure of central tendency of a Nominal Variable?
Median
What is the best measure of central tendency of an Ordinal Variable?
Mean
What is the best measure of central tendency of an Interval/Ratio variable that is not skewed?
Median
What is the best measure of central tendency of an Interval/Ratio variable that is skewed?
Measures of Dispersion or Variability
Refers to how scattered a group of data is. Shows how much the data differ or vary from the average of the distribution.
Variability and dispersion
Are some of the terms that describe how spread out a certain distribution is.
Absolute measure of dispersion
Categories of Measures of Variability or Dispersion
A measure that expresses the scattering of observations in terms of distances, in the same units as the data.
Relative measure of dispersion
Categories of Measures of Variability or Dispersion
is used for comparing the distributions of two or more datasets and for unit-free comparison.
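One commonly used relative measure is the coefficient of variation (standard deviation divided by the mean). The cards do not name it, so it is brought in here only as an illustration. The sketch below uses made-up heights recorded in two units: the absolute measure changes with the unit, the relative one does not.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (unit-free)."""
    return statistics.pstdev(values) / statistics.mean(values)

# Hypothetical heights, the same people measured in centimetres and metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

sd_cm = statistics.pstdev(heights_cm)  # absolute measure, in cm
sd_m = statistics.pstdev(heights_m)    # absolute measure, in m
cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(sd_cm, sd_m)   # differ by a factor of 100
print(cv_cm, cv_m)   # equal up to floating-point rounding
```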
Range
The simplest measure of variability and dispersion to calculate is the _________. It is easy to calculate and easy to understand. It is simply the difference between the highest and the lowest score in a dataset.
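As a quick illustration of the definition above, with a made-up list of scores:

```python
# Hypothetical scores; order does not matter for this measure.
scores = [23, 4, 42, 8, 16, 15]

# Highest score minus lowest score: 42 - 4 = 38
score_range = max(scores) - min(scores)

print(score_range)
```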
Interquartile Range
Measure of dispersion or variation based on dividing a data set into quartiles.
Interquartile Range (Quartiles)
_________ means dividing the dataset into four equal parts. The values that separate the parts are called the first, second, and third quartiles; they are denoted by Q1, Q2, and Q3, respectively.
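A minimal sketch of the quartile cut points and the interquartile range (Q3 minus Q1), using `statistics.quantiles` with its default method on made-up data. Other tools and textbooks use slightly different interpolation conventions, so their Q1 and Q3 may differ for the same data.

```python
import statistics

# Hypothetical data set with seven values.
data = [10, 20, 30, 40, 50, 60, 70]

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1  # spread of the middle half of the data

print(q1, q2, q3, iqr)
```

Here Q2 coincides with the median of the data set.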
Variance
An average of the squared differences of the scores from the computed mean.
Variance
is a measurement of how far each number in the dataset is from the mean and from every other number in the dataset.
Standard Deviation
is a measure of dispersion or variation that summarizes how far the data points typically lie from the mean; it is the square root of the variance.
Smaller
Standard Deviation
When the values in a dataset are closely distributed, the standard deviation is _________.
Larger
Standard Deviation
When the values in a dataset are widely scattered, the standard deviation is __________ because the distances from the mean are greater.
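The variance and standard deviation cards can be illustrated with a short sketch. It assumes the population form (dividing by the number of values), which matches "an average of the squared differences"; the two datasets are made up and `population_variance` is a hypothetical helper name.

```python
import math
import statistics

def population_variance(values):
    """Average of the squared differences from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

# Hypothetical datasets with the same mean (5) but different spread.
clustered = [4, 5, 5, 5, 5, 5, 5, 6]
scattered = [2, 4, 4, 4, 5, 5, 7, 9]

var_clustered = population_variance(clustered)  # 2 / 8 = 0.25
var_scattered = population_variance(scattered)  # 32 / 8 = 4.0

# Standard deviation is the square root of the variance.
sd_clustered = math.sqrt(var_clustered)  # 0.5
sd_scattered = math.sqrt(var_scattered)  # 2.0

print(sd_clustered, sd_scattered)
print(statistics.pstdev(scattered))  # library equivalent of sd_scattered
```

The sample versions (`statistics.variance` and `statistics.stdev`) divide by one less than the number of values and give slightly larger results.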
Percentiles
is a value below which a certain percentage of scores fall.
For example, a 90th ___________ marks the spot where 90% of values fall below that cut-off point.
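The 90% example can be checked with a short sketch. It uses made-up scores from 1 to 100 and `statistics.quantiles` with its default interpolation method; other conventions give slightly different cut-off values.

```python
import statistics

# Hypothetical scores: the whole numbers 1 through 100.
scores = list(range(1, 101))

# n=100 returns 99 cut points; index 89 is the 90th one.
p90 = statistics.quantiles(scores, n=100)[89]

# Share of scores that fall below the cut-off, as a percentage.
share_below = 100 * sum(s < p90 for s in scores) / len(scores)

print(p90, share_below)
```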
Quartiles
divide a rank-ordered data set into four equal parts. The values that divide each part are called the first, second, and third quartiles; and they are denoted by Q1, Q2, and Q3, respectively.