Statistics ch1

24 Terms

Data

Compilations of facts, figures, or other contents, both numerical and non-numerical.

Statistics

the science that deals with the collection, preparation, analysis, interpretation, and presentation of data

Descriptive statistics

refers to the summary of important aspects of a data set.
• Includes collecting, organizing, and presenting the data in the form of charts and tables.
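
A minimal sketch of summarizing a data set, using only Python's standard `statistics` module. The `scores` list is invented for illustration and is not from the course material.

```python
import statistics

# Hypothetical exam scores, invented for illustration.
scores = [72, 85, 90, 66, 85, 78]

# Summarize the important aspects of the data set in a small table.
summary = {
    "count": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "min": min(scores),
    "max": max(scores),
}

for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

Each number describes only the observed data; nothing here is a claim about a larger group.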

Inferential statistics

refers to drawing conclusions about a larger set of data (population) based on a smaller set of data (sample).
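
One way to picture the population/sample relationship is a small simulation. This is a sketch under stated assumptions: the population is simulated (mean 50, standard deviation 10, chosen arbitrarily), and the sample size of 100 is also an arbitrary choice.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 simulated values (assumption for illustration).
population = [random.gauss(50, 10) for _ in range(10_000)]

# Draw a simple random sample and use it to estimate the population mean.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

# The standard error indicates how much the sample mean tends to vary
# from one sample to another.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

# In practice the population mean is unknown; it is computed here only
# to compare against the estimate.
population_mean = statistics.mean(population)
print(f"sample mean {sample_mean:.2f}, population mean {population_mean:.2f}")
```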

Cross-sectional data

refers to data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.

Time-series data

refers to data collected over several time periods focusing on certain groups of people, specific events, or objects.
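
The contrast between cross-sectional and time-series data can be sketched by slicing one set of records two ways. The cities, years, and rates below are invented for illustration.

```python
# Hypothetical records: (year, city, unemployment rate in percent).
records = [
    (2021, "Austin", 4.1), (2021, "Boston", 4.8), (2021, "Denver", 4.5),
    (2022, "Austin", 3.2), (2022, "Boston", 3.6), (2022, "Denver", 3.1),
    (2023, "Austin", 3.4), (2023, "Boston", 3.3), (2023, "Denver", 3.0),
]

# Cross-sectional: many subjects observed at the same point in time.
cross_section = {city: rate for year, city, rate in records if year == 2022}

# Time series: one subject observed over several time periods.
time_series = [(year, rate) for year, city, rate in records if city == "Boston"]

print(cross_section)
print(time_series)
```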

Structured data

Reside in a pre-defined, row-column format.

Unstructured data

Do not conform to a pre-defined, row-column format.

Big data

A massive volume of structured and unstructured data.
Extremely difficult to manage, process, and analyze using traditional data processing tools.

Volume

immense amount of data compiled from a single source or multiple sources

Velocity

data is generated at a rapid speed

Variety

data come in all types, forms, and levels of granularity, both structured and unstructured

Veracity

credibility and quality of the data

Value

useful insights or measurable improvements due to the use of data

Variable

a characteristic of interest that differs in kind or degree among various observations (records)

Categorical data

Also called qualitative.
• Represent categories.
• We use labels or names to identify the distinguishing characteristic of each observation.
• Can be defined by two or more categories.
• Coded into numbers for data processing.
• Example: marital status, grade in a course
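
The "coded into numbers" step might look like the sketch below. The labels and the alphabetical coding scheme are assumptions for illustration; any consistent label-to-number mapping would serve.

```python
# Hypothetical marital-status observations, invented for illustration.
statuses = ["single", "married", "divorced", "married", "single"]

# Assign each distinct category an integer code (alphabetical order here).
categories = sorted(set(statuses))
codes = {label: number for number, label in enumerate(categories)}

coded = [codes[status] for status in statuses]
print(codes)
print(coded)
```

The codes are still labels: averaging them or comparing their sizes carries no meaning.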

Numerical data

Also called quantitative.
• Represent meaningful numbers.
• We use numbers to identify the distinguishing characteristic of each observation.
• Either discrete or continuous.

Nominal scale

Least sophisticated.
• Data can only be categorized or grouped.
• Observations differ by label or name.
• Example: marital status

Ordinal scale

Stronger level of measurement.
• We can categorize and rank data with respect to some characteristic.
• Differences between the ranked observations cannot be interpreted; the numbers are arbitrary.
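
A short sketch of ranking ordinal data. The rating labels and the rank numbers 1 to 4 are assumptions for illustration; any increasing numbers would give the same ordering, which is why the gaps between them cannot be interpreted.

```python
# Hypothetical ordinal scale: only the order of the numbers matters.
rank = {"poor": 1, "fair": 2, "good": 3, "excellent": 4}

# Hypothetical survey responses, invented for illustration.
ratings = ["good", "poor", "excellent", "fair", "good"]

# Ranking is meaningful, so sorting by rank is a valid operation.
ordered = sorted(ratings, key=rank.__getitem__)

# The middle response is also meaningful for ranked data.
middle = ordered[len(ordered) // 2]

print(ordered)
print(middle)
```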

Interval data

Can be categorized and ranked.
• Differences between the observations are meaningful.
• Zero value is arbitrary and does not reflect absence of characteristic.
• Ratios are not meaningful for interval data.
• Example: Fahrenheit scale for temperatures.
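
A small numeric check shows why ratios are not meaningful on an interval scale. The temperatures are arbitrary examples; the conversion formula is the standard Fahrenheit-to-Celsius one.

```python
def f_to_c(fahrenheit: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9

a_f, b_f, c_f = 80.0, 40.0, 60.0  # arbitrary example temperatures

# The ratio of two readings depends on the unit, so it says nothing
# about one temperature being "twice as hot" as another.
ratio_f = a_f / b_f                  # 2.0 in Fahrenheit
ratio_c = f_to_c(a_f) / f_to_c(b_f)  # about 6.0 in Celsius

# Comparisons of differences survive the change of unit.
diff_ratio_f = (a_f - b_f) / (c_f - b_f)
diff_ratio_c = (f_to_c(a_f) - f_to_c(b_f)) / (f_to_c(c_f) - f_to_c(b_f))

print(ratio_f, ratio_c)
print(diff_ratio_f, diff_ratio_c)
```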

Ratio data

Strongest level of measurement.
• Has all the characteristics of the interval scale as well as a true zero point.
• Zero reflects absence of characteristic.
• Ratios are meaningful.
• Example: profits.

The omission strategy

observations with missing values are excluded from subsequent analysis
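
A minimal sketch of the omission strategy, assuming missing values are recorded as `None`. The rows and field names are invented for illustration.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": None, "income": 61000},
    {"id": 3, "age": 45, "income": None},
    {"id": 4, "age": 29, "income": 48000},
]

# Keep only the observations with no missing values.
complete = [row for row in rows if all(v is not None for v in row.values())]

print([row["id"] for row in complete])
```

Half of the records are lost in this small example, which is the usual cost of omission.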

The imputation strategy

missing values are replaced with some reasonable imputed values
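
A minimal sketch of the imputation strategy, assuming missing values are recorded as `None` and that the mean of the observed values is a reasonable replacement. Mean imputation is one common choice, not the only one; the `ages` list is invented for illustration.

```python
import statistics

# Hypothetical ages; None marks a missing value.
ages = [34, None, 45, 29, None, 40]

# Compute the replacement value from the observed entries only.
observed = [age for age in ages if age is not None]
fill_value = statistics.mean(observed)

# Replace each missing entry and leave observed entries unchanged.
imputed = [fill_value if age is None else age for age in ages]

print(imputed)
```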

Subsetting

the process of extracting a portion of a data set that is relevant for subsequent statistical analysis; it is also used when the objective of the analysis is to compare two subsets of the data
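
A short sketch of subsetting in order to compare two groups. The customer records, region names, and spending figures are invented for illustration.

```python
import statistics

# Hypothetical customer records.
customers = [
    {"region": "east", "spend": 120},
    {"region": "west", "spend": 95},
    {"region": "east", "spend": 80},
    {"region": "west", "spend": 150},
    {"region": "east", "spend": 100},
]

# Extract the two subsets relevant to the comparison.
east = [c["spend"] for c in customers if c["region"] == "east"]
west = [c["spend"] for c in customers if c["region"] == "west"]

mean_east = statistics.mean(east)
mean_west = statistics.mean(west)

print(f"east mean: {mean_east}, west mean: {mean_west}")
```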