Introduction to Descriptive and Inferential Statistics

Core Definitions of Populations and Samples

  • Population: In statistical studies, a population consists of the entire group to be studied. It is often either impossible or impractical to gain access to an entire population for data collection.
  • Sample: A sample is defined as a subset of the population that is being studied. It is the group from which data is actually collected to draw conclusions.
  • Individual: An individual refers to a person or an object that is a member of the population being studied.

Descriptive Statistics

  • Definition: Descriptive statistics consist of the processes of organizing and summarizing data. The primary purpose is to describe the results of a specific group or sample without extending those results to a larger group or making general conclusions about the population.
  • Methods of Description: Descriptive statistics describe data through three primary mediums:
      - Numerical summaries.
      - Tables.
      - Graphs.
  • Statistic: A statistic is defined as a numerical summary of a sample.
  • Example Case Study of Descriptive Statistics:
      - Context: In a survey of 40 students, 32 stated they would return found money to the owner.
      - Calculation: The result is presented as the percent of students in the survey who would return the money, calculated as 80%.
      - Designation: The value 80% is a statistic because it is a numerical summary derived specifically from a sample result. It describes the sample but does not make a claim about all students beyond that sample.
  • Utility: Descriptive statistics make it significantly easier for researchers to gain a clear overview of what the collected data are communicating.
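
The survey calculation above (32 of 40 students) can be sketched in a few lines of Python. The function name `sample_percent` and the variable names are illustrative assumptions, not part of the original notes.

```python
# Survey figures from the example above.
surveyed = 40        # students in the sample
would_return = 32    # students who said they would return the money


def sample_percent(count: int, sample_size: int) -> float:
    """Return the percentage of the sample that has the trait of interest."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 100 * count / sample_size


# This value summarizes the sample only, so it is a statistic.
statistic = sample_percent(would_return, surveyed)
print(f"{statistic:.0f}% of surveyed students would return the money")
# prints "80% of surveyed students would return the money"
```

The result describes only the 40 surveyed students; nothing in this calculation extends it to students outside the sample.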

Inferential Statistics and Reliability

  • Definition: Inferential statistics involves methods that take a result from a sample, extend it to the population, and measure the reliability of that result.
  • The Nature of Uncertainty: Any generalization from a sample to a population inherently contains uncertainty. This is because a sample, being only a subset, cannot provide complete information about the entire population.
  • Level of Confidence: Because of the uncertainty involved, inferential statistics must always include a level of confidence in the results, which serves as a measure of reliability.
  • Example of Inferential Level of Confidence and Range:
      - Rather than stating a point estimate (e.g., 80% of all students), an inferential statement would be: "We are 95% confident that between 76-84% of all students would return the money."
      - Components of the Statement:
          - Confidence Level: The 95% confidence is the measure of reliability.
          - Range of Values: The interval of 76-84% accounts for the variability in the results.
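
One common way to attach a range and a confidence level to a sample proportion is the normal-approximation (Wald) interval. The sketch below applies it to the survey example (32 of 40 students). This is an assumption made for illustration: the notes do not say how the quoted interval was obtained, so the sketch is not expected to reproduce it, and for a sample as small as 40 this method gives a noticeably wider range. The function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(count: int, sample_size: int, z: float = 1.96):
    """Normal-approximation interval for a proportion.

    z = 1.96 corresponds to a 95% confidence level.
    Assumption: this is one standard textbook method, not necessarily
    the one behind the interval quoted in the notes.
    """
    p_hat = count / sample_size                      # sample proportion
    margin = z * math.sqrt(p_hat * (1 - p_hat) / sample_size)
    # Clamp to [0, 1] because a proportion cannot leave that range.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


low, high = wald_interval(32, 40)
print(f"95% interval: {100 * low:.1f}% to {100 * high:.1f}%")
# prints "95% interval: 67.6% to 92.4%"
```

The width of the interval shrinks as the sample grows, which is one way the reliability of an inferential statement is made explicit.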

Parameters vs. Statistics

  • Parameter: A parameter is defined as a numerical summary of a population.
  • Statistic: A statistic is defined as a numerical summary of a sample.
  • Comparative Examples of Parameters and Statistics:
      - Scenario A (Parameter): Suppose the percentage of all students on a specific campus who own a car is determined to be 48.2%. This value is a parameter because it summarizes the entire population (all students on campus).
      - Scenario B (Statistic): Suppose a sample of 100 students is obtained from that campus, and it is found that 46% of them own a car. This value is a statistic because it represents a numerical summary of that specific sample rather than the whole population.
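
The distinction between the two scenarios can be illustrated with a small simulation. The campus size of 1,000 students is a hypothetical figure chosen so that 48.2% corresponds to a whole number of car owners; the notes do not give the real population size, and the sample drawn here will generally not equal the 46% quoted in Scenario B.

```python
import random

random.seed(0)  # fixed seed so repeated runs draw the same sample

# Hypothetical campus: 1,000 students, 482 of whom own a car (48.2%).
population = [True] * 482 + [False] * 518

# Parameter: numerical summary of the entire population.
parameter = 100 * sum(population) / len(population)

# Statistic: numerical summary of a random sample of 100 students.
sample = random.sample(population, 100)
statistic = 100 * sum(sample) / len(sample)

print(f"Parameter (all students): {parameter:.1f}%")
print(f"Statistic (sample of {len(sample)}): {statistic:.1f}%")
```

Changing the seed changes the statistic but never the parameter, which is fixed by the population itself.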