Introduction to Descriptive and Inferential Statistics
Core Definitions of Populations and Samples
- Population: In statistical studies, a population consists of the entire group to be studied. It is often either impossible or impractical to gain access to an entire population for data collection.
- Sample: A sample is defined as a subset of the population that is being studied. It is the group from which data is actually collected to draw conclusions.
- Individual: An individual refers to a person or an object that is a member of the population being studied.
Descriptive Statistics
- Definition: Descriptive statistics consist of the processes of organizing and summarizing data. The primary purpose is to describe the results of a specific group or sample without extending those results to a larger group or making general conclusions about the population.
- Methods of Description: Descriptive statistics describe data in three primary forms:
- Numerical summaries.
- Tables.
- Graphs.
- Statistic: A statistic is defined as a numerical summary of a sample.
- Example Case Study of Descriptive Statistics:
- Context: In a survey of 40 students, 32 stated they would return found money to the owner.
- Calculation: The result is presented as the percent of students in the survey who would return the money: 32 out of 40, or 32/40 = 80%.
- Designation: The value 80% is a statistic because it is a numerical summary derived specifically from a sample result. It describes the sample but does not make a claim about all students beyond that sample.
- Utility: Descriptive statistics make it significantly easier for researchers to gain a clear overview of what the collected data are communicating.
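The calculation in the survey example can be reproduced in a few lines. This is a minimal sketch, not part of the original notes; the helper name `sample_percent` is an assumption introduced only for illustration.

```python
# Survey figures taken from the example above.
surveyed = 40        # students in the sample
would_return = 32    # students who said they would return the money


def sample_percent(count, total):
    """Return count as a percentage of total (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total


# The statistic: a numerical summary of the sample only.
statistic = sample_percent(would_return, surveyed)
print(f"{statistic:.0f}% of surveyed students would return the money")
```

The value describes only the 40 surveyed students; nothing in the computation says anything about students outside the sample.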
Inferential Statistics and Reliability
- Definition: Inferential statistics involves methods that take a result from a sample, extend it to the population, and measure the reliability of that result.
- The Nature of Uncertainty: Any generalization from a sample to a population inherently contains uncertainty. This is because a sample, being only a subset, cannot provide complete information about the entire population.
- Level of Confidence: Because of the uncertainty involved, inferential statistics must always include a level of confidence in the results, which serves as a measure of reliability.
- Example of an Inferential Statement with a Level of Confidence and a Range:
- Rather than stating only a point estimate (e.g., "80% of all students would return the money"), an inferential statement would be: "We are 95% confident that between 76% and 84% of all students would return the money."
- Components of the Statement:
- Confidence Level: The 95% confidence is the measure of reliability.
- Range of Values: The interval from 76% to 84% accounts for the variability in the results.
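One common way to produce such a range is a normal-approximation (Wald) interval for a proportion. The notes do not say which method lies behind the 76% to 84% figure, so the sketch below is an assumption: the function name `wald_interval` and the critical value 1.96 (the usual choice for roughly 95% confidence) are not from the original.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a population proportion.

    Assumption: z = 1.96 is the standard normal critical value
    commonly used for about 95% confidence.
    """
    p_hat = successes / n                          # sample proportion (the statistic)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clamp to [0, 1] because a proportion cannot fall outside that range.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Survey figures from the descriptive example: 32 of 40 students.
low, high = wald_interval(32, 40)
print(f"95% interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

Under this approximation, 32 of 40 gives an interval of roughly 68% to 92%, which is wider than the illustrative 76% to 84% quoted above. A range that narrow would require a larger sample, on the order of a few hundred students, so the quoted interval is best read as an illustration of the form of an inferential statement rather than a result computed from this survey.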
Parameters vs. Statistics
- Parameter: A parameter is defined as a numerical summary of a population.
- Statistic: A statistic is defined as a numerical summary of a sample.
- Comparative Examples of Parameters and Statistics:
- Scenario A (Parameter): Suppose the percentage of all students on a specific campus who own a car is determined to be 48.2%. This value is a parameter because it summarizes the entire population (all students on campus).
- Scenario B (Statistic): Suppose a sample of 100 students is obtained from that campus, and it is found that 46% of them own a car. This value is a statistic because it represents a numerical summary of that specific sample rather than the whole population.
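The distinction between the two scenarios can be illustrated with a small simulation. This is a sketch under stated assumptions: the campus size of 1,000 students and the random seed are hypothetical and chosen only so that the population percentage comes out to 48.2%. The notes do not give a campus size.

```python
import random

# Hypothetical population: 1,000 students, 482 of whom own a car.
# True means "owns a car". The campus size is an assumption for illustration.
population = [True] * 482 + [False] * 518


def percent_true(values):
    """Percentage of entries in values that are True."""
    return 100 * sum(values) / len(values)


# Parameter: summarizes the entire population.
parameter = percent_true(population)

# Statistic: summarizes one random sample of 100 students.
rng = random.Random(0)                  # fixed seed so the run is repeatable
sample = rng.sample(population, 100)    # sampling without replacement
statistic = percent_true(sample)

print(f"Parameter (all students): {parameter:.1f}%")
print(f"Statistic (sample of 100): {statistic:.1f}%")
```

The parameter is fixed at 48.2% for this population, while the statistic changes from one sample to the next; changing the seed draws a different sample and will usually give a slightly different percentage.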