Business analytics
combines qualitative reasoning with quantitative tools to identify key business problems and translate data analysis into actions that improve business performance.
descriptive analytics
“what has happened?”
predictive analytics
“what could happen in the future?”
prescriptive analytics
“what should we do?”
Data
Compilations of facts, figures, or other contents, both numerical and nonnumerical.
Population
All items of interest in a statistical problem.
Sample
a subset of the population that is used for the analysis. We rely on this because
(1) it is impossible to examine every member of the population and
(2) obtaining information on the entire population is expensive.
Statistic
a number that describes a characteristic of a sample drawn from a population.
Parameter
a number describing a characteristic of a whole population
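A minimal sketch of the statistic vs. parameter distinction, assuming a made-up population of customer ages (the values, the seed, and the sample size of 50 are illustrative choices, not from the source):

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 customers (made-up values).
random.seed(42)
population = [random.randint(18, 80) for _ in range(1000)]

# Parameter: a number describing the whole population.
population_mean = statistics.mean(population)

# Statistic: a number describing a sample drawn from that population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The sample mean will usually land near the population mean without matching it exactly, which is why a statistic is treated as an estimate of the parameter.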
Cross-sectional data
Data is collected by recording a characteristic of many subjects at the same point in time.
example: Go to www.zillow.com and find the sale price of 20 single-family homes sold in Las Vegas in the last 30 days. Structure the data including the sale price, the number of bedrooms, the square footage, and the age of the house.
Time series data
Data is collected over several time periods, focusing on certain groups of people, events or objects.
ex: Find Under Armour’s annual revenue from the past 10 years.
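One way to picture the difference between the two data types is by their shape: cross-sectional data has many subjects at one point in time, while time series data follows one subject over many periods. The sketch below uses made-up placeholder values only; the field names are hypothetical, and none of the numbers are real listings or real company results.

```python
# Cross-sectional data: many subjects, one point in time.
# All values are made-up placeholders, not real listings.
homes = [
    {"home_id": "A", "sale_price": 350_000, "bedrooms": 3, "sqft": 1_800, "age_years": 12},
    {"home_id": "B", "sale_price": 420_000, "bedrooms": 4, "sqft": 2_300, "age_years": 5},
    {"home_id": "C", "sale_price": 295_000, "bedrooms": 2, "sqft": 1_200, "age_years": 30},
]

# Time series data: one subject, several time periods.
# Revenue figures are made-up placeholders, not real company results.
annual_revenue = [
    {"year": 2021, "revenue_millions": 100.0},
    {"year": 2022, "revenue_millions": 110.0},
    {"year": 2023, "revenue_millions": 125.0},
]

n_subjects = len({row["home_id"] for row in homes})
n_periods = len({row["year"] for row in annual_revenue})

print(f"cross-sectional: {n_subjects} subjects at one point in time")
print(f"time series: 1 subject over {n_periods} periods")
```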
Structured data
data that conforms to a predefined row-column format, such as the format used by spreadsheet or database applications.
Unstructured data
data that does not conform to a predefined row-column format or database structure, such as textual and multimedia content. It can be human-generated (like email) or machine-generated (like camera images).
Big data
Extremely large datasets that can be analyzed for insights, commonly characterized by three Vs:
volume: immense amount of data
velocity: generated at rapid speed
variety: all types of data, including structured and unstructured
Variable
a characteristic of interest that differs in kind or degree among various observations (records).
Categorical
qualitative; uses labels or names to identify characteristics.
example: marital status.
Numerical
quantitative; represents meaningful numbers on which arithmetic operations make sense.
example: number of children in a family
Discrete
numerical variable; assumes a countable number of values.
for example, we may observe values such as 3 children in a family but we will not observe fractions such as 1.31 children.
Continuous
numerical variable; characterized by uncountable values within an interval.
for example, an unlimited number of values occur between the weights of 100 and 101 pounds (like 100.03, 100.4, 100.05, etc).
more examples: Weight, height, time, and investment return
Nominal
categorical measurement; the observations differ merely by name or label.
example: marital status, gender
Ordinal
categorical measurement; the observations can be both categorized and ranked with respect to some characteristic or trait.
example: ratings on a scale (excellent to poor, ranking 1 to 5)
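A small sketch of what "ranked" means in practice for an ordinal scale, assuming a hypothetical four-level rating scale and made-up survey answers. The order of the labels carries information, so sorting has to follow the scale rather than the alphabet:

```python
# Ordinal scale: labels with a meaningful order (hypothetical rating scale).
RATING_ORDER = ["poor", "fair", "good", "excellent"]
rank = {label: position for position, label in enumerate(RATING_ORDER)}

responses = ["good", "poor", "excellent", "fair", "good"]  # made-up survey answers

# Sorting by rank respects the scale; plain alphabetical sorting does not.
by_rank = sorted(responses, key=rank.__getitem__)
alphabetical = sorted(responses)

print(by_rank)       # ['poor', 'fair', 'good', 'good', 'excellent']
print(alphabetical)  # ['excellent', 'fair', 'good', 'good', 'poor']
```

For a nominal variable such as marital status, no such ordering exists, so only counting and grouping by label make sense.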
Interval
numerical measurement; the observations can be categorized and ranked, and the differences between them are meaningful. ex: temperature
zero value does not reflect the absence of characteristic
ratios are not meaningful
for example: temperature where 0 degrees does not mean no temperature or time on a clock where 0 seconds does not mean no time
Ratio
the strongest level of measurement; has all the characteristics of the interval scale plus a true zero point. ex: profits
a true zero point that reflects the absence of characteristic
ratios are meaningful
for example: income where $0 means no income or age where 0 years old means no age
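The claim that ratios are meaningful on a ratio scale but not on an interval scale can be checked with a unit change. A rough sketch, with temperatures and incomes chosen only for illustration: if a ratio is meaningful, it should survive a change of units.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

def dollars_to_cents(dollars):
    """Convert whole dollars to cents."""
    return dollars * 100

# Interval scale: 80°F looks like "twice" 40°F, but the ratio changes in Celsius.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)
print(f"temperature ratio: {ratio_f:.2f} in °F vs {ratio_c:.2f} in °C")

# Ratio scale: a $60,000 income is twice a $30,000 income in any unit.
ratio_dollars = 60_000 / 30_000
ratio_cents = dollars_to_cents(60_000) / dollars_to_cents(30_000)
print(f"income ratio: {ratio_dollars:.2f} in dollars vs {ratio_cents:.2f} in cents")
```

The temperature ratio moves from 2 to 6 because the zero point of each temperature scale is arbitrary, while the income ratio stays at 2 because $0 is a true zero.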
Fixed width format
each column starts and ends at the same place in every row. Every observation or record has exactly the same column widths.
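A minimal sketch of reading a fixed-width record by character position. The layout (10 characters for name, 2 for state, 3 for age) and the records are hypothetical, chosen only to illustrate the idea:

```python
# Hypothetical layout: field name -> (start, end) character positions.
LAYOUT = {"name": (0, 10), "state": (10, 12), "age": (12, 15)}

# Build two made-up records, padded so every column has the same width.
rows = [("Alice", "NV", 34), ("Bob", "CA", 7)]
lines = [f"{name:<10}{state:<2}{age:>3}" for name, state, age in rows]

def parse_fixed_width(line):
    """Slice each field out by position and trim the padding."""
    return {field: line[start:end].strip() for field, (start, end) in LAYOUT.items()}

parsed = [parse_fixed_width(line) for line in lines]
print(parsed[0])  # {'name': 'Alice', 'state': 'NV', 'age': '34'}
```

Because fields are located by position alone, a value longer than its column width would have to be truncated or would shift every field after it.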
Delimited format
each piece of data is separated by a delimiter character, most commonly a comma, as in a comma-separated values (CSV) file. Unlike a fixed-width file, a CSV file does not limit text to a set number of characters (such as eight).
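A short sketch of reading delimited data with Python's standard `csv` module. The column names and records are made up for illustration:

```python
import csv
import io

# Made-up comma-delimited text; the first line holds the column names.
# Quoting lets a field contain the delimiter itself ("Smith, Alice").
text = 'name,city,age\n"Smith, Alice",Las Vegas,34\nBob,Reno,7\n'

records = list(csv.DictReader(io.StringIO(text)))

for record in records:
    print(record["name"], "|", record["city"], "|", record["age"])
```

Field lengths vary freely from row to row because the delimiter, not the character position, marks where each value ends.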
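eXtensible Markup Language (XML)
HyperText Markup Language (HTML)
JavaScript Object Notation (JSON)
Three widely used formats for giving structure to data. XML and HTML are markup languages; JSON is a lightweight data-interchange format rather than a markup language.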
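To make "structure" concrete, the sketch below stores the same made-up record in XML and in JSON and reads both with Python's standard library. The tag and key names (`home`, `price`, `bedrooms`) are hypothetical:

```python
import json
import xml.etree.ElementTree as ET

# The same made-up record expressed in two formats.
xml_text = "<home><price>350000</price><bedrooms>3</bedrooms></home>"
json_text = '{"home": {"price": 350000, "bedrooms": 3}}'

# XML: structure comes from nested tags; element text arrives as strings.
root = ET.fromstring(xml_text)
from_xml = {child.tag: int(child.text) for child in root}

# JSON: structure comes from nested key-value pairs; numbers keep their type.
from_json = json.loads(json_text)["home"]

print(from_xml)   # {'price': 350000, 'bedrooms': 3}
print(from_json)  # {'price': 350000, 'bedrooms': 3}
```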