Data Analytics and Excel Lecture Notes

Flashcards from Lecture Notes

35 Terms

1

Data Literacy

The ability to analyze, interpret, and question data, which is an increasingly valuable skill for evidence-based decision-making.

2

Data

Raw facts, like a single supermarket transaction.

3

Information

The result of processing data to create meaning and enable decision-making.

4

Target Population

All subjects of interest in a study.

5

Sample

A manageable subset of the target population used to make studies feasible.

6

Observational Study

A study where the researcher collects data without intervention.

7

Experimental Study

A study where the researcher intervenes to influence outcomes.

8

Bias

A systematic error in how data are collected or analyzed that leads to an incorrect estimate of a parameter or of an association.

9

Mode

The most frequently occurring value or category in a data set.

10

Mean

The arithmetic average of a data set.

11

Median

The middle value of an ordered data set, or the average of the two middle values when the number of observations is even.
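
The three measures of centre defined above (mode, mean, median) can be compared on one small data set. This is a minimal sketch using Python's standard `statistics` module; the list `values` is made up purely for illustration.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
values = [3, 5, 5, 6, 8, 9, 13]

mode_value = statistics.mode(values)      # most frequently occurring value
mean_value = statistics.mean(values)      # sum of the values divided by their count
median_value = statistics.median(values)  # middle value after sorting

print("mode:", mode_value)
print("mean:", mean_value)
print("median:", median_value)
```

Because the largest value sits far from the rest, the mean is pulled above the median here, which is the usual sign of a right-skewed data set.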

12

Variance

The average of the squared deviations from the mean (dividing by n for a population, or by n - 1 for a sample).

13

Standard Deviation

The square root of the variance, indicating how spread out the data is.
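
A short sketch of variance and standard deviation, assuming Python's standard `statistics` module. The data are hypothetical. Both the population form (divide by n) and the sample form (divide by n - 1) are shown, since the cards do not say which one is intended.

```python
import statistics

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population versions: squared deviations averaged over n.
population_variance = statistics.pvariance(data)
population_sd = statistics.pstdev(data)

# Sample versions: squared deviations divided by n - 1.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print("population variance and sd:", population_variance, population_sd)
print("sample variance and sd:", sample_variance, sample_sd)
```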

14

Inter-Quartile Range (IQR)

The difference between the upper and lower quartiles.

15

Range

The difference between the maximum and minimum values.

16

Outliers

Values that fall below the lower fence (LQ - 1.5 × IQR) or above the upper fence (UQ + 1.5 × IQR).
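
The range, IQR, and outlier fences can be worked through together. This is a sketch on hypothetical data. Quartile conventions differ between textbooks and tools; the `method="inclusive"` option of `statistics.quantiles` is just one choice and is an assumption here, so other conventions may give slightly different fences.

```python
import statistics

# Hypothetical data set with one suspiciously large value.
data = [2, 4, 5, 7, 8, 9, 11, 12, 40]

# Quartiles under the "inclusive" convention (an assumption; conventions differ).
lq, median, uq = statistics.quantiles(data, n=4, method="inclusive")

iqr = uq - lq                       # inter-quartile range
data_range = max(data) - min(data)  # range

lower_fence = lq - 1.5 * iqr
upper_fence = uq + 1.5 * iqr

# Any value beyond either fence is flagged as an outlier.
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print("LQ, UQ, IQR:", lq, uq, iqr)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Note how the single extreme value inflates the range far more than it affects the IQR.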

17

Histogram

A visual representation of the distribution of a single numerical variable.
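
Behind a histogram is a simple count of how many values fall into each bin. The sketch below builds those counts and prints a rough text histogram; the data and the bin width of 2 are arbitrary choices for illustration.

```python
from collections import Counter

# Hypothetical integer data and an arbitrary bin width.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
bin_width = 2

# Map each value to the start of its bin, [start, start + bin_width), and count.
bin_counts = Counter((v // bin_width) * bin_width for v in data)

# One row per non-empty bin, one '#' per observation.
for start in sorted(bin_counts):
    print(f"[{start}, {start + bin_width}): {'#' * bin_counts[start]}")
```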

18

Contingency Table

A table of counts that cross-classifies two categorical variables, used to examine the relationship between them.
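
A contingency table is just a count of observations for every combination of the two categories. A minimal sketch, with invented survey responses and hypothetical category names:

```python
from collections import Counter

# Hypothetical responses: (age group, preferred payment method).
responses = [
    ("under 30", "card"), ("under 30", "card"), ("under 30", "cash"),
    ("30 and over", "cash"), ("30 and over", "cash"), ("30 and over", "card"),
    ("under 30", "card"),
]

# Count each (row category, column category) pair.
cell_counts = Counter(responses)

row_labels = sorted({row for row, _ in responses})
col_labels = sorted({col for _, col in responses})

# Nested dict: table[row][col] is the count in that cell (0 if never seen).
table = {r: {c: cell_counts[(r, c)] for c in col_labels} for r in row_labels}

for r in row_labels:
    print(r, table[r])
```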

19

Scatterplot

A plot of paired observations as points, used to examine the relationship between two numerical variables.

20

Normal Distribution

A symmetrical distribution where most observations cluster around the central peak.

21

Uniform Distribution

A distribution where all outcomes are equally likely.

22

Skewed Distribution

A distribution that is not symmetrical; can be skewed left (long tail on the left) or skewed right (long tail on the right).

23

One Sample t-Test

Used to compare a sample mean with a known or hypothesized value.
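
The one-sample t statistic is the distance between the sample mean and the hypothesized value, measured in standard errors. A sketch on hypothetical data; `hypothesized_mean` is an invented value, and only the test statistic is computed (the p-value would come from a t distribution with n - 1 degrees of freedom).

```python
import math
import statistics

# Hypothetical sample and hypothesized population mean.
sample = [12, 14, 15, 16, 18]
hypothesized_mean = 13

n = len(sample)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)  # s / sqrt(n)

t_statistic = (sample_mean - hypothesized_mean) / standard_error
degrees_of_freedom = n - 1

print("t =", t_statistic, "with", degrees_of_freedom, "degrees of freedom")
```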

24

Two Sample t-Test

Compares means from two independent samples.
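
For two independent samples, the test statistic divides the difference in means by a combined standard error. The sketch below uses the unequal-variance (Welch) form, which is an assumption: the card does not say whether a pooled-variance version is intended. The data are hypothetical.

```python
import math
import statistics

# Hypothetical independent samples.
group_a = [12, 14, 15, 16, 18]
group_b = [8, 10, 11, 12, 14]

mean_difference = statistics.mean(group_a) - statistics.mean(group_b)

# Welch standard error: each group contributes its own variance / size.
standard_error = math.sqrt(
    statistics.variance(group_a) / len(group_a)
    + statistics.variance(group_b) / len(group_b)
)

t_statistic = mean_difference / standard_error
print("t =", t_statistic)
```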

25

Paired Sample t-Test

Compares means from two related (paired) groups by analyzing the difference within each pair.
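
A paired t-test reduces to a one-sample t-test on the within-pair differences, tested against zero. A sketch with hypothetical before/after measurements on the same five subjects:

```python
import math
import statistics

# Hypothetical paired measurements (same subjects, same order).
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 15]

# Step 1: compute the difference within each pair.
differences = [a - b for a, b in zip(after, before)]

# Step 2: one-sample t statistic on the differences, against a mean of 0.
n = len(differences)
mean_difference = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(n)
t_statistic = mean_difference / standard_error

print("differences:", differences)
print("t =", t_statistic, "with", n - 1, "degrees of freedom")
```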

26

Chi-Square Goodness of Fit Test

Used for a categorical variable to compare observed frequencies with expected proportions.
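
The goodness-of-fit statistic sums (observed - expected)² / expected over the categories. A minimal sketch; the observed counts and expected proportions are invented, and only the statistic and its degrees of freedom are computed.

```python
# Hypothetical observed counts for three categories.
observed = [50, 30, 20]

# Hypothetical expected proportions under the null hypothesis (sum to 1).
expected_proportions = [0.4, 0.4, 0.2]

total = sum(observed)
expected = [p * total for p in expected_proportions]

# Chi-square statistic: sum of (O - E)^2 / E across categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

print("expected counts:", expected)
print("chi-square =", chi_square, "with", degrees_of_freedom, "degrees of freedom")
```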

27

Central Limit Theorem (CLT)

States that the mean of a random sample has a sampling distribution whose shape can be approximated by a Normal distribution; the larger the sample, the better the approximation.
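
A small simulation can make the CLT concrete. The sketch below repeatedly samples from a strongly right-skewed population (exponential with mean 1 and standard deviation 1, an arbitrary choice) and looks at the resulting sample means. The sample size, number of samples, and seed are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
sample_size = 50
num_samples = 2000

# Each entry is the mean of one random sample from the skewed population.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Theory: centre near the population mean (1), spread near 1 / sqrt(sample_size).
print("mean of sample means:", mean_of_means)
print("sd of sample means:", sd_of_means)
print("theoretical sd:", 1 / sample_size ** 0.5)
```

Plotting `sample_means` as a histogram should show a roughly bell-shaped curve even though the underlying population is far from Normal.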

28

Confidence Interval (CI)

An interval that is expected to contain the population parameter being estimated with a certain level of confidence.
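
One common form of confidence interval for a mean is: sample mean ± critical value × standard error. The sketch below uses a Normal critical value, which is a large-sample approximation and an assumption here (for small samples a t critical value is normally used instead). The function name `mean_confidence_interval` and the data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(data, confidence=0.95):
    """Approximate CI for the mean using a Normal critical value."""
    n = len(data)
    centre = statistics.fmean(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return centre - margin, centre + margin


# Hypothetical sample of 40 observations with mean 50.
data = [48, 52, 50, 47, 53, 51, 49, 50, 54, 46] * 4

low_95, high_95 = mean_confidence_interval(data, 0.95)
low_99, high_99 = mean_confidence_interval(data, 0.99)

print("95% CI:", (low_95, high_95))
print("99% CI:", (low_99, high_99))
```

Asking for more confidence widens the interval, as the 99% result shows.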

29

Correlation

Measures the strength and direction of the linear association between two numeric variables; it is a number between -1 and 1.
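
The usual correlation coefficient (Pearson's r) can be computed directly from deviations about the means. A sketch with a hypothetical helper `pearson_r` and made-up data:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print("r =", r)
```

A perfectly linear increasing relationship gives r = 1, a perfectly linear decreasing one gives r = -1.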

30

Regression Line

The line of best fit through the data, used to predict a value of y from a given value of x.

31

Residuals

Vertical distances between data points and the regression line.
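
The regression line and its residuals can be computed by hand on a small data set. This sketch assumes an ordinary least-squares fit (the cards do not name the fitting method), and the data are hypothetical.

```python
# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept (assumed fitting method).
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Predicted y for each x, and residual = observed y - predicted y.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print("y =", intercept, "+", slope, "* x")
print("residuals:", residuals)
```

Points above the line have positive residuals and points below it have negative ones; with a least-squares fit that includes an intercept, the residuals sum to zero.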

32

Linearity

Regression assumption that the relationship between x and y is linear.

33

Independence

Regression assumption that the residuals are independent of one another.

34

Normality

Regression assumption that the residuals are normally distributed.

35

Equality of Variance (Homoscedasticity)

Regression assumption that the residuals have equal variance across all values of x.