Statistics ch1


24 Terms

1

Data

Compilations of facts, figures, or other contents, both numerical and non-numerical.

2

Statistics

the science that deals with the collection, preparation, analysis, interpretation, and presentation of data

3

Descriptive statistics

refers to the summary of important aspects of a data set.
• Includes collecting, organizing, and presenting the data in the form of charts and tables.
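
As a minimal sketch of what such a summary can look like, the Python snippet below condenses a small data set into a few descriptive measures and prints them as a simple table. The scores are hypothetical values made up for illustration; only the standard-library `statistics` module is used.

```python
import statistics

# Hypothetical exam scores, made up for illustration only.
scores = [72, 85, 90, 66, 85, 78]

# Summarize important aspects of the data set.
summary = {
    "count": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "min": min(scores),
    "max": max(scores),
}

# Present the summary as a simple two-column table.
for name, value in summary.items():
    print(f"{name:<8}{value}")
```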

4

Inferential statistics

refers to drawing conclusions about a larger set of data (population) based on a smaller set of data (sample).
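
The idea can be sketched in Python under stated assumptions: the population below is simulated (in practice it is usually too large to observe in full), and the sample mean stands in as an estimate of the population mean. The seed, sizes, and distribution are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated values.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Draw a much smaller random sample without replacement.
sample = random.sample(population, 100)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # estimate based on the sample

print(f"sample mean:     {sample_mean:.2f}")
print(f"population mean: {population_mean:.2f}")
```

The two printed values will generally be close but not identical; that gap is the sampling error that inferential methods try to quantify.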

5

Cross-sectional data

refers to data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.

6

Time-series data

refers to data collected over several time periods focusing on certain groups of people, specific events, or objects.

7

Structured data

Reside in a pre-defined, row-column format.

8

Unstructured data

Do not conform to a pre-defined, row-column format.

9

Big data

  • A massive volume of structured and unstructured data.

  • Extremely difficult to manage, process, and analyze using traditional data processing tools.

10

Volume

immense amount of data compiled from a single source or multiple sources

11

Velocity

data is generated at a rapid speed

12

Variety

data come in all types, forms, and levels of granularity, both structured and unstructured.

13

Veracity

credibility and quality of the data

14

Value

useful insights or measurable improvements due to the use of data

15

Variable

a characteristic of interest that differs in kind or degree among various observations (records)

16

Categorical data

Also called qualitative.
• Represent categories.
• We use labels or names to identify the distinguishing characteristic of each observation.
• Can be defined by two or more categories.
• Can be coded into numbers for data processing.
• Example: marital status, grade in a course
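
Coding categories into numbers can be sketched as follows. The observations and the alphabetical code assignment are assumptions made for illustration; any one-to-one assignment would serve equally well, since the codes act only as labels.

```python
# Hypothetical observations of a categorical variable.
marital_status = ["single", "married", "married", "divorced", "single"]

# Assign each distinct label an integer code (alphabetical order here).
codes = {label: i for i, label in enumerate(sorted(set(marital_status)))}

# Replace every label with its code for data processing.
coded = [codes[label] for label in marital_status]

print(codes)
print(coded)
```

Arithmetic on these codes (an "average marital status", say) has no meaning, even though the values are now numeric.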

17

Numerical data

Also called quantitative.
• Represent meaningful numbers.
• We use numbers to identify the distinguishing characteristic of each observation.
• Either discrete or continuous.

18

Nominal scale

Least sophisticated level of measurement.
• Data can only be categorized or grouped.
• Observations differ by label or name.
• Example: marital status

19

Ordinal scale

Stronger level of measurement than nominal.
• We can categorize and rank data with respect to some characteristic.
• Differences between the ranked observations cannot be interpreted; the numbers assigned are arbitrary.
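
A small Python sketch of why ranking is meaningful while the numbers themselves are arbitrary. The rating labels and both sets of codes are hypothetical; the point is that any increasing set of codes yields the same ordering.

```python
# Hypothetical ordinal categories, listed from lowest to highest.
levels = ["poor", "fair", "good", "excellent"]
rank = {label: position for position, label in enumerate(levels, start=1)}

ratings = ["good", "poor", "excellent", "fair", "good"]

# Ranking is meaningful: sort the observations by their position.
ordered = sorted(ratings, key=rank.get)
print(ordered)

# The codes 1..4 are arbitrary; any increasing codes give the same order.
other_codes = {"poor": 10, "fair": 20, "good": 70, "excellent": 100}
assert sorted(ratings, key=other_codes.get) == ordered
```

Because the gap between codes changes from one coding to the other, the distance between "fair" and "good" cannot be interpreted.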

20

Interval data

Can be categorized and ranked.
• Differences between the observations are meaningful.
• Zero value is arbitrary and does not reflect absence of characteristic.
• Ratios are not meaningful for interval data.
• Example: Fahrenheit scale for temperatures.
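
The Fahrenheit example can be checked numerically. The sketch below uses three arbitrary temperatures and the standard Fahrenheit-to-Celsius conversion: equal differences remain equal on both scales, but the ratio of two temperatures changes with the scale, which is why ratios carry no meaning here.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

# Arbitrary temperatures chosen for illustration.
low_f, mid_f, high_f = 40.0, 60.0, 80.0
low_c, mid_c, high_c = (fahrenheit_to_celsius(t) for t in (low_f, mid_f, high_f))

# Differences are meaningful: equal gaps stay equal on both scales.
print(high_f - mid_f == mid_f - low_f)
print(abs((high_c - mid_c) - (mid_c - low_c)) < 1e-9)

# Ratios are not: 80 F is "twice" 40 F only because of where zero sits.
print(high_f / low_f)
print(round(high_c / low_c, 6))
```

The same pair of temperatures gives a ratio of 2 in Fahrenheit and about 6 in Celsius.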

21

Ratio data

Strongest level of measurement.
• Has all the characteristics of the interval scale as well as a true zero point.
• Zero reflects absence of characteristic.
• Ratios are meaningful.
• Example: profits.

22

The omission strategy

observations with missing values are excluded from subsequent analysis
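
A minimal sketch of omission in Python, assuming missing values are marked with `None`. The records and field names are hypothetical.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"name": "A", "age": 34, "income": 52000},
    {"name": "B", "age": None, "income": 61000},
    {"name": "C", "age": 45, "income": None},
    {"name": "D", "age": 29, "income": 48000},
]

# Keep only observations with no missing values.
complete = [r for r in records if all(v is not None for v in r.values())]

print([r["name"] for r in complete])
```

Only two of the four observations survive, which shows the cost of this strategy when missing values are common.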

23

The imputation strategy

missing values are replaced with some reasonable imputed values
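
One common choice of "reasonable" value for a numerical variable is the mean of the observed values. That choice is an assumption here, not something the definition prescribes; the ages below are hypothetical and `None` marks a missing value.

```python
import statistics

# Hypothetical ages; None marks a missing value.
ages = [34, None, 45, 29]

# Compute the mean from the observed (non-missing) values only.
observed = [a for a in ages if a is not None]
mean_age = statistics.mean(observed)

# Replace each missing value with the mean of the observed values.
imputed = [a if a is not None else mean_age for a in ages]

print(imputed)
```

Unlike omission, every observation is retained, at the price of introducing values that were never actually recorded.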

24

Subsetting

the process of extracting a portion of a data set that is relevant for subsequent statistical analysis; it is also used when the objective of the analysis is to compare two subsets of the data
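
A short sketch of subsetting for comparison, using hypothetical sales records split by a made-up `region` field. Each subset is extracted with a filter and then summarized separately.

```python
import statistics

# Hypothetical sales records.
records = [
    {"region": "East", "sales": 120},
    {"region": "West", "sales": 95},
    {"region": "East", "sales": 140},
    {"region": "West", "sales": 105},
]

# Extract the two subsets of interest.
east = [r for r in records if r["region"] == "East"]
west = [r for r in records if r["region"] == "West"]

# Compare the subsets on the same measure.
east_mean = statistics.mean(r["sales"] for r in east)
west_mean = statistics.mean(r["sales"] for r in west)

print(f"East mean sales: {east_mean}")
print(f"West mean sales: {west_mean}")
```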