Data
the facts & figures collected, analyzed, and summarized for presentation and interpretation.
Dataset
all the data collected for a particular analysis.
Element
the entity on which data is collected.
Variable
a characteristic of interest of an element.
Observation
the set of measurements obtained for an individual element (one value for each variable).
Categorical
data that use labels or names to identify categories; measured on a nominal or ordinal scale.
Quantitative
data that use numeric values to indicate how much or how many.
Cross-sectional
data collected at the same, or approximately the same, point in time.
Time Series
data collected over several time periods.
Panel
combination of cross-sectional and time series data.
Descriptive Statistics
tabular, graphical, and numerical summaries that describe data or variables.
Population
the set of all elements of interest in a statistical analysis.
Sample
is a subset of the population.
Statistical Inference
uses data from a sample to make estimates and test hypotheses about the characteristics of a population.
Analytics
the scientific process of transforming data into insight for decision making.
Descriptive Analytics
analytical techniques that describe what has happened in the past.
Predictive Analytics
uses statistical models from past data to predict the future [forecasting] or assess the impact of one variable on another [inference].
Prescriptive Analytics
uses models seeking to find a best (optimal) solution. Often these are some type of optimization model.
Volume
the number of observations
Velocity
the speed at which data is collected
Variety
the different forms and types of data collected.
Veracity
the reliability of the data generated
Data Mining
focuses on extracting predictive information from big data.
Frequency Distribution
a tabular summary of data showing the number (i.e., the frequency) of observations in each of several nonoverlapping categories.
Relative Frequency
frequency of a class / n, where n is the total number of observations.
Percent Frequency
relative frequency * 100.
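
As a rough illustration of the three definitions above, the sketch below builds frequency, relative frequency, and percent frequency for a small categorical dataset. The purchase data and the helper name `frequency_table` are made up for this example.

```python
from collections import Counter

# Hypothetical data: ten soft-drink purchases (made up for illustration).
purchases = ["Coke", "Pepsi", "Coke", "Sprite", "Coke",
             "Pepsi", "Coke", "Sprite", "Pepsi", "Coke"]

def frequency_table(values):
    """Map each category to (frequency, relative frequency, percent frequency)."""
    n = len(values)              # total number of observations
    counts = Counter(values)     # frequency of each category
    return {cat: (f, f / n, 100 * f / n) for cat, f in counts.items()}

table = frequency_table(purchases)
for cat, (freq, rel, pct) in sorted(table.items()):
    print(f"{cat:<8}{freq:>4}{rel:>8.2f}{pct:>8.1f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.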
Bar Chart
a visual display of a frequency, relative frequency, or percent frequency distribution, with one bar per category.
Pie Chart
a visual display of a frequency, relative frequency, or percent frequency distribution, with each category shown as a slice of a circle.
Number of Classes
Typically between 5 and 20. Smaller datasets need fewer classes; larger datasets need more.
Width of the Class
Generally, it should be the same for each class. Approximate class width = (largest data value - smallest data value)/number of classes.
Class Limits
chosen so that each observation belongs to exactly one class.
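
The three cards above (number of classes, class width, class limits) can be combined into a small sketch. It assumes integer data and starts the first class at the smallest value; the data values and the function names `build_classes` and `class_frequencies` are illustrative only.

```python
# Hypothetical integer data (for example, days needed to finish 20 tasks).
data = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
        22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

def build_classes(values, num_classes):
    """Return (lower, upper) integer class limits that cover every value."""
    lo, hi = min(values), max(values)
    # Approximate width = (largest - smallest) / number of classes,
    # rounded up so the last class always includes the largest value.
    width = (hi - lo) // num_classes + 1
    return [(lo + i * width, lo + (i + 1) * width - 1)
            for i in range(num_classes)]

def class_frequencies(values, classes):
    """Count the observations that fall inside each class."""
    return [sum(lower <= x <= upper for x in values)
            for lower, upper in classes]

classes = build_classes(data, 5)
freqs = class_frequencies(data, classes)
for (lower, upper), f in zip(classes, freqs):
    print(f"{lower}-{upper}: {f}")
```

Because the limits do not overlap and together span the whole range, the class frequencies add up to the number of observations.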
Relative Frequency Distributions
a tabular summary showing the relative frequency (class frequency / n) of each class.
Histogram
A visual display of a frequency, relative frequency or percent frequency distribution, where the variable of interest is on the horizontal axis and the frequency, relative frequency or percent frequency is on the vertical axis.
Cumulative Distributions
Presents the number of data items with values less than or equal to the upper class limit for each class.
Cumulative relative frequency distribution
the proportion of data items with values less than or equal to the upper limit of each class.
Cumulative percent frequency distribution
the percentage of data items with values less than or equal to the upper limit of each class.
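
A minimal sketch of the three cumulative distributions, assuming the class frequencies are already known. The upper limits and frequencies below are hypothetical.

```python
from itertools import accumulate

# Hypothetical upper class limits and class frequencies.
upper_limits = [16, 21, 26, 31, 36]
class_freqs = [7, 7, 3, 2, 1]

n = sum(class_freqs)                                # total observations
cumulative = list(accumulate(class_freqs))          # running total of frequencies
cumulative_rel = [c / n for c in cumulative]        # proportion <= upper limit
cumulative_pct = [100 * c / n for c in cumulative]  # percentage <= upper limit

for limit, c, rel, pct in zip(upper_limits, cumulative,
                              cumulative_rel, cumulative_pct):
    print(f"<= {limit}: {c:>3} {rel:>6.2f} {pct:>6.1f}%")
```

The last row always equals n, 1, and 100 respectively, since every observation is less than or equal to the final upper class limit.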
Crosstabulation
a tabular summary of data for two variables (either categorical or quantitative)
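
One way to sketch a crosstabulation is to count how often each pair of values occurs. The records below (a quality rating paired with a price level) and the helper name `crosstab` are invented for illustration.

```python
from collections import Counter

# Hypothetical (quality rating, price level) pairs, one per element.
records = [
    ("Good", "Low"), ("Very Good", "Medium"), ("Good", "Low"),
    ("Excellent", "High"), ("Very Good", "Low"), ("Good", "Medium"),
    ("Excellent", "Medium"), ("Very Good", "Medium"),
]

def crosstab(pairs):
    """Return {row value: {column value: count}} for two variables."""
    cells = Counter(pairs)                  # count of each (row, column) pair
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: cells[(r, c)] for c in cols} for r in rows}

table = crosstab(records)
for row, counts in table.items():
    print(row, counts)
```

Row totals give the frequency distribution of the first variable, and column totals give that of the second.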
Scatter Diagram
a graphical display of the relationship between two quantitative variables
Trendline
a line that provides an approximation (i.e., an estimate) of the relationship between two variables; the relationship can be positive, negative, or absent.
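
The card does not say how a trendline is computed. The sketch below assumes an ordinary least-squares line, which is one common choice, and uses made-up data. The names `fit_trendline` and `direction` and the tolerance value are hypothetical.

```python
# Hypothetical paired observations: x = number of ads, y = sales.
ads = [2, 5, 1, 3, 4]
sales = [50, 57, 41, 54, 54]

def fit_trendline(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def direction(slope, tol=1e-9):
    """Label the relationship by the sign of the slope (tol is an assumption)."""
    if slope > tol:
        return "positive"
    if slope < -tol:
        return "negative"
    return "none"

slope, intercept = fit_trendline(ads, sales)
print(f"sales = {intercept:.1f} + {slope:.1f} * ads ({direction(slope)})")
```

A slope near zero only indicates no linear relationship; the points may still follow a curved pattern that a straight trendline cannot capture.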
Side-by-Side Bar Chart
depicts multiple bar charts on the same display
Stacked Bar Chart
has each bar broken into segments of different colors, showing the relative frequency of each class.