Data Analysis

Chapter 1: What is Statistics


Why Study Statistics

  • Data is collected everywhere, and statistical knowledge is needed to turn it into useful information.

  • Statistics is used to make valid comparisons and predict outcomes.

Definition of Statistics

  • Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more effective decisions.

Branches of Statistics

  • There are two main branches of statistics:

    • Descriptive Statistics: Summarizes and describes the features of a data set.

    • Inferential Statistics: Uses data from a sample to estimate properties of a population and draw inferences about it.
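
The two branches can be contrasted with a minimal sketch. The exam scores below are hypothetical values made up for illustration, and the inferential step shown is only the simplest case: using the sample mean as a point estimate of the unknown population mean.

```python
import statistics

# Hypothetical sample: exam scores from 8 students drawn from a larger class.
sample_scores = [72, 85, 90, 66, 78, 85, 94, 70]

# Descriptive statistics: summarize the data we actually have.
sample_mean = statistics.mean(sample_scores)
lowest_score = min(sample_scores)
highest_score = max(sample_scores)

# Inferential statistics: use the sample to say something about the population.
# Here the sample mean serves as a point estimate of the population mean,
# which we cannot observe directly.
estimated_population_mean = sample_mean

print("Sample mean:", sample_mean)
print("Range of scores:", lowest_score, "to", highest_score)
print("Estimated population mean:", estimated_population_mean)
```

The descriptive lines only describe the eight scores in hand; the final assignment is where a claim about the wider class is made.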

Key Concepts

  • Population: The entire set of individuals or objects of interest, or the measurements obtained from all individuals or objects of interest.

  • Sample: A portion or part of the population of interest.

  • Variable: A characteristic or attribute that can take different values across different observations in a data set.

  • Observation: A single data point or record in a data set that holds the values of all the variables for a particular instance.

  • Time Series Data: A data set that tracks the same variables over a period of time at regular intervals.
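
One way to picture these terms is as a small data set in code. The sketch below is illustrative only: the student records, the variable names, and the sample size are all assumptions chosen for the example.

```python
import random

# Hypothetical population: every student enrolled in one small course.
# Each dictionary is one observation; each key is a variable.
population = [
    {"name": "Ana", "class_standing": "freshman", "credits": 15, "height_cm": 162.5},
    {"name": "Ben", "class_standing": "sophomore", "credits": 12, "height_cm": 175.2},
    {"name": "Chloe", "class_standing": "junior", "credits": 16, "height_cm": 168.9},
    {"name": "Dev", "class_standing": "senior", "credits": 9, "height_cm": 180.1},
    {"name": "Eli", "class_standing": "freshman", "credits": 14, "height_cm": 171.0},
]

# The variables are the characteristics recorded for every observation.
variables = list(population[0].keys())

# A sample is a portion of the population. random.sample picks distinct
# observations; the seed is fixed only so the example is reproducible.
random.seed(1)
sample = random.sample(population, k=3)

print("Variables:", variables)
print("Sample size:", len(sample), "of", len(population))
```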

Types of Variables

  • Qualitative Variables: Non-numeric characteristics or attributes recorded through observation.

    • Nominal Variables: Categories with no inherent order (can only be classified or counted).

    • Ordinal Variables: Categories with a meaningful order, but the differences between categories are not necessarily equal (e.g., class standing: freshman, sophomore, junior, senior).

  • Quantitative Variables: Numeric characteristics that can be measured.

    • Discrete Variables: Result from counting; values have gaps between them (e.g., number of students).

    • Continuous Variables: Usually result from measuring something; can assume any value within a specific range (e.g., height or weight).
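
The variable type determines which operations make sense. The following sketch uses hypothetical data (eye colors, class standings, section sizes, heights) to show one typical operation for each type; it is an illustration, not an exhaustive rule.

```python
from collections import Counter

# Nominal: no inherent order, so the data can only be classified or counted.
eye_colors = ["brown", "blue", "brown", "green"]
color_counts = Counter(eye_colors)

# Ordinal: categories have a meaningful order, so they can be ranked and sorted.
STANDING_ORDER = ["freshman", "sophomore", "junior", "senior"]
standings = ["junior", "freshman", "senior", "sophomore"]
sorted_standings = sorted(standings, key=STANDING_ORDER.index)

# Discrete: results from counting, so the values are whole numbers with gaps.
students_per_section = [28, 31, 25]
total_students = sum(students_per_section)

# Continuous: results from measuring, so any value in a range is possible.
heights_cm = [162.5, 175.2, 168.9]
tallest_cm = max(heights_cm)

print(color_counts)
print(sorted_standings)
print(total_students)
print(tallest_cm)
```

Note that sorting the eye colors alphabetically would be possible in code but would carry no statistical meaning, which is what distinguishes nominal from ordinal data.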

Types of Variables (Summary)

Discrete Variables

  • Discrete: Numerical; can take only specific, separated values, typically resulting from counting.

Continuous Variables

  • Continuous: Numerical; can take any value within a given range, typically resulting from measurement.

Categorical Variables

  • Nominal: Unordered categories that are mutually exclusive (e.g., colors).

  • Ordinal: Ordered categories that are mutually exclusive.

Measures of Location

  • A measure of location is a value used to describe the central tendency of a set of data.

  • Common measures of location:

    • Mean

    • Median

    • Mode
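
As a minimal sketch, the three measures can be computed with Python's standard statistics module. The scores are hypothetical values chosen so that the arithmetic is easy to check by hand.

```python
import statistics

# Hypothetical data set: five quiz scores, already in ascending order.
scores = [70, 85, 85, 90, 100]

# Mean: the sum of the values divided by how many there are (430 / 5).
mean_score = statistics.mean(scores)

# Median: the middle value once the data is sorted.
median_score = statistics.median(scores)

# Mode: the value that occurs most often.
mode_score = statistics.mode(scores)

print("Mean:", mean_score)      # 86
print("Median:", median_score)  # 85
print("Mode:", mode_score)      # 85
```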
