Business Analytics - Descriptive Statistics

pg - 24 - 68

76 Terms

1
New cards

Data

The facts and figures collected, analyzed, and summarized for representation and interpretation.

2
New cards

Variable

A characteristic or a quantity of interest that can take on different values

3
New cards

Observation

Set of values corresponding to a set of variables

4
New cards

Variation

Difference in a variable measured over observations

5
New cards

Descriptive analytics

Collect and analyze data to gain a better understanding of variation and its impact on the business setting

6
New cards

Decision variables

Variables whose values are under the direct control of the decision maker

7
New cards

Random/Uncertain Variable

A quantity whose values are not known with certainty

8
New cards

Population

The set of all elements of interest in a particular study

9
New cards

Sample

Subset of the population

10
New cards

Random sampling

Collecting a sample that ensures that (1) each element selected comes from the same population and (2) each element is selected independently.

11
New cards

Quantitative Data

Data that are numeric and on which arithmetic operations, such as addition, subtraction, and division, can be performed.

12
New cards

Categorical Data

Data on which arithmetic operations cannot be performed.

13
New cards

Categorical Data

Summarized by counting the number of observations or computing the proportion of observations in each category
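
As a minimal sketch of that counting approach, the following Python snippet tallies a made-up list of categorical observations and converts the counts to proportions. The `purchases` data and variable names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical categorical observations (one soft drink purchase per observation).
purchases = ["Cola", "Diet", "Cola", "Lemon", "Cola",
             "Diet", "Cola", "Lemon", "Diet", "Cola"]

counts = Counter(purchases)          # number of observations in each category
total = len(purchases)
proportions = {category: n / total for category, n in counts.items()}

for category, n in counts.most_common():
    print(f"{category:6s} count={n}  proportion={proportions[category]:.2f}")
```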

14
New cards

Cross-sectional data

Type of data that is collected from several entities at the same, or approximately the same, point in time.

15
New cards

Time series data

Collected over several periods of time.

16
New cards

Experimental study

A variable of interest is identified first; then one or more other variables are controlled or manipulated to obtain data on how they influence the variable of interest

17
New cards

Non-Experimental/Observational study

Make no attempt to control the variables of interest

18
New cards

Survey

Most common type of observational study.

19
New cards

Distributions

Help summarize many characteristics of a data set by describing how often certain values for a variable appear in that data set.

20
New cards

Distributions

Can be created for both categorical and quantitative data; they assist the analyst in gauging variation.

21
New cards

Classes

Bins for categorical data

22
New cards

Frequency distribution

Summary of data showing the number (frequency) of observations in each of several nonoverlapping bins.

23
New cards

Bin

The nonoverlapping groupings of data

24
New cards

Relative Frequency Distribution

Tabular summary of data showing the relative frequency for each bin.

25
New cards

Percent frequency distribution

Tabular summary of data showing the percent frequency for each bin.

26
New cards

Percent frequency distribution

Can be used to provide estimates of the relative likelihoods of different values for a random variable.
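
The three tabular summaries above (frequency, relative frequency, and percent frequency) can be sketched in a few lines of Python. The data values and the bin limits below are assumptions made for illustration, not part of the original cards.

```python
# Hypothetical quantitative data (for example, task durations in days).
data = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
        22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

# Nonoverlapping bins as inclusive (lower, upper) limits; an assumed choice.
bins = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]

n = len(data)
frequency = [sum(lower <= x <= upper for x in data) for lower, upper in bins]
relative = [f / n for f in frequency]     # relative frequency = frequency / n
percent = [100 * r for r in relative]     # percent frequency = relative * 100

for (lower, upper), f, r, p in zip(bins, frequency, relative, percent):
    print(f"{lower}-{upper}: frequency={f}  relative={r:.2f}  percent={p:.0f}%")
```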

27
New cards

Bin Width

Approximate bin width = (Largest data value - Smallest data value) / Number of bins
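
A small sketch of the bin width calculation follows. The data and the choice of five bins are hypothetical, and rounding the result up to a convenient whole number is a common convention rather than something stated on the card.

```python
import math

# Hypothetical data set and an assumed number of bins.
data = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
        22, 23, 22, 21, 33, 28, 14, 18, 16, 13]
number_of_bins = 5

# Subtract first, then divide: the parentheses matter.
raw_width = (max(data) - min(data)) / number_of_bins

# Round up so the bins cover the whole range of the data.
bin_width = math.ceil(raw_width)

print(f"raw width = {raw_width:.1f}, rounded bin width = {bin_width}")
```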

28
New cards

Histogram

Graphical presentation of a frequency distribution, relative frequency distribution, or percent frequency distribution of quantitative data.

29
New cards

Frequency polygon

A chart used to display a distribution by using lines to connect the frequency values of each bin.

30
New cards

Frequency polygon

Useful for comparing distributions, particularly for quantitative variables.

31
New cards

Histogram with Clustered Columns

knowt flashcard image
32
New cards

Frequency Polygon

knowt flashcard image
33
New cards

Histogram

knowt flashcard image
34
New cards

Cumulative frequency distribution

Shows the number of data items with values less than or equal to the upper limit of each bin
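
As an illustration, a cumulative frequency distribution can be built as a running total of the bin frequencies. The bin limits and frequencies below are made-up values for the sketch.

```python
from itertools import accumulate

# Hypothetical bins (inclusive limits) and their frequencies.
bins = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
frequency = [4, 8, 5, 2, 1]

# Running total: items with values less than or equal to each bin's upper limit.
cumulative = list(accumulate(frequency))

for (_, upper), c in zip(bins, cumulative):
    print(f"less than or equal to {upper}: {c}")
```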

35
New cards

Arithmetic Mean

The most commonly used measure of location

36
New cards

Measures of central location for the data

Mean
Median

37
New cards

Median

Value in the middle when the data are arranged in ascending order

38
New cards

Mode

Value that occurs most frequently in a data set

39
New cards

Multimodal

At least two modes
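
The measures of location above can be checked with Python's standard `statistics` module. The data set is hypothetical and was chosen so that it has two modes.

```python
import statistics

# Hypothetical data set with two values tied for most frequent.
data = [2, 3, 3, 5, 7, 7, 8]

mean = statistics.mean(data)         # sum of the values divided by their count
median = statistics.median(data)     # middle value of the sorted data
modes = statistics.multimode(data)   # every value that occurs most frequently

print(f"mean={mean}, median={median}, modes={modes}")
print("multimodal" if len(modes) >= 2 else "single mode")
```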

40
New cards

Geometric mean

Measure of location that is calculated by finding the nth root of the product of n values.
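
A minimal sketch of the nth-root calculation, using hypothetical yearly growth factors, with the standard library's `statistics.geometric_mean` as a cross-check:

```python
import math
import statistics

# Hypothetical growth factors for three periods (1.10 means +10%, 0.95 means -5%).
growth_factors = [1.10, 0.95, 1.20]

n = len(growth_factors)
product = math.prod(growth_factors)
geometric_mean = product ** (1 / n)   # nth root of the product of n values

print(f"manual:  {geometric_mean:.4f}")
print(f"library: {statistics.geometric_mean(growth_factors):.4f}")
```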

41
New cards

Geometric mean

knowt flashcard image
42
New cards

Range

Simplest measure of variability

43
New cards

Range

Can be found by subtracting the smallest value from the largest value in a data set.

44
New cards

Variance

Measure of variability that utilizes all the data

45
New cards

Variance

Based on the squared deviations of the observations about the mean

46
New cards

Deviation

Difference between an observation and the mean, written x_i - x̄ for a sample (x_i - 𝜇 for a population)

47
New cards

Population Variance

For a population of N observations, the variance can be computed directly rather than estimated with the sample variance

48
New cards

Population Variance

knowt flashcard image
49
New cards

Population Variance

knowt flashcard image
50
New cards

Sample Variance

knowt flashcard image
51
New cards

𝜇

Denotes the population mean

52
New cards

Standard Deviation

positive square root of the variance

53
New cards

s

Denotes the sample standard deviation

54
New cards

𝜎

Denotes the population standard deviation.

55
New cards

Coefficient of Variation

(Standard deviation / Mean) × 100%
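
A quick sketch of the calculation, using hypothetical data and the sample standard deviation from Python's `statistics` module:

```python
import statistics

# Hypothetical sample of five observations.
data = [46, 54, 42, 46, 32]

mean = statistics.mean(data)
std_dev = statistics.stdev(data)   # sample standard deviation

# Standard deviation expressed as a percentage of the mean.
coefficient_of_variation = std_dev / mean * 100

print(f"coefficient of variation = {coefficient_of_variation:.1f}%")
```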

56
New cards

Percentiles

Value of a variable at which a specified (approximate) percentage of observations are below that value.

57
New cards

Percentile

knowt flashcard image
58
New cards
1. Compute the location of the percentile

2. Find that position in the sorted data set

3. Compute the percentile value

Steps of Percentile
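
One way to sketch these steps in Python is shown below. It assumes the location rule L = (p / 100)(n + 1) with linear interpolation between neighboring values, which is one common textbook convention; other software may use slightly different rules. The function name `percentile` and the data are hypothetical.

```python
def percentile(values, p):
    """Return the p-th percentile using the location rule L = (p / 100) * (n + 1)."""
    ordered = sorted(values)                 # arrange the data in ascending order
    n = len(ordered)
    location = (p / 100) * (n + 1)           # step 1: 1-based position, may be fractional
    lower = int(location)                    # step 2: whole-number part of the position
    fraction = location - lower
    if lower < 1:                            # position falls before the first value
        return ordered[0]
    if lower >= n:                           # position falls at or after the last value
        return ordered[-1]
    # Step 3: interpolate between the two neighboring values.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


# Hypothetical data set of seven observations.
data = [40, 10, 70, 20, 60, 30, 50]
for p in (25, 50, 75):
    print(f"{p}th percentile = {percentile(data, p)}")
```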

59
New cards

Q1

First quartile, 25th percentile

60
New cards

Q2

Second quartile, 50th percentile (the median)

61
New cards

Q3

Third quartile, 75th percentile

62
New cards

Quartile

It is often desirable to divide data into four parts, with each part containing approximately one-fourth, or 25 percent, of the observations.

63
New cards

Z-score

allows us to measure the relative location of a value in the data set.

64
New cards

Z-score

How far a particular value is from the mean relative to the data set’s standard deviation.

65
New cards

Z-score

standardized value.
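
As a small illustration, a z-score is computed by subtracting the mean from a value and dividing by the standard deviation. The data below are hypothetical, and the sample standard deviation is used.

```python
import statistics

# Hypothetical sample of five observations.
data = [46, 54, 42, 46, 32]

mean = statistics.mean(data)
std_dev = statistics.stdev(data)   # sample standard deviation

# z-score: distance from the mean, measured in standard deviations.
z_scores = [(x - mean) / std_dev for x in data]

for x, z in zip(data, z_scores):
    print(f"value={x}  z-score={z:+.2f}")
```

A negative z-score means the value lies below the mean; a positive one means it lies above.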

66
New cards

Empirical rule

Used to determine the percentage of data values that are within a specified number of standard deviations of the mean when the data have a symmetric, bell-shaped distribution.

67
New cards

Approximately 68% of the data

values will be within 1 standard deviation of the mean.

68
New cards

Approximately 95% of the data

values will be within 2 standard deviations of the mean.

69
New cards

Almost all

data values will be within 3 standard deviations of the mean.
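
The three percentages can be checked roughly by simulation. The sketch below draws hypothetical bell-shaped data with `random.gauss` (the mean of 100 and standard deviation of 15 are arbitrary assumptions) and counts the share of values within 1, 2, and 3 standard deviations of the mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable
data = [random.gauss(100, 15) for _ in range(10_000)]

mean = statistics.mean(data)
std_dev = statistics.stdev(data)


def share_within(k):
    """Fraction of data values within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * std_dev for x in data) / len(data)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The printed shares should land close to 68%, 95%, and nearly 100%, though simulated results vary slightly from the rule.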

70
New cards

Outliers

An unusually large or unusually small data value; an extreme value

71
New cards

Outliers

Data values with a z-score less than -3 or greater than +3

72
New cards

Boxplots

Box-and-whisker plots; a graphical summary of the distribution of data

73
New cards

Outliers

in a boxplot these are extreme values that should be investigated to ensure data accuracy.

74
New cards

Interquartile range

Q3-Q1

75
New cards

Box plot

Upper limit = Q3 + 1.5(IQR)

76
New cards

Box plot

Lower limit = Q1 - 1.5(IQR)
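
The interquartile range and the two box plot limits can be sketched together. This example uses `statistics.quantiles` with its default "exclusive" method to get the quartiles, which is an assumption about how the quartiles are located; the data are hypothetical and include one deliberately extreme value.

```python
import statistics

# Hypothetical data set with one unusually large value.
data = [10, 20, 30, 40, 50, 60, 70, 200]

# Quartiles: Q1 (25th), Q2 (50th), and Q3 (75th percentile).
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1                     # interquartile range
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Values outside the limits are flagged as outliers to investigate.
outliers = [x for x in data if x < lower_limit or x > upper_limit]

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"limits: [{lower_limit}, {upper_limit}], outliers: {outliers}")
```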