Course Modules in ISMAT3000: Quantitative Methods Lecture

MODULE 2: Topic: Describing Data

Course Modules Overview

Course Modules: 16 modules organized under ISMAT3000 focusing on Quantitative Methods.

Course Information

Week Number: 3 to 4
Chapter/Unit Number: 2
Topic Title: Describing Data

Course Outcomes

CO1: Describe, summarize, and organize datasets using appropriate descriptive statistical measures and graphical representations to support information systems analysis.

Intended Learning Outcomes

LO1: Describe datasets by organizing data into frequency distributions, tables, and graphical forms.
LO2: Compute and describe data using measures of central tendency (mean, median, mode).
LO3: Describe data variability using measures of dispersion (range, variance, standard deviation).
LO4: Describe patterns, trends, and outliers in datasets using graphs (histograms, bar charts, line graphs, box plots).
LO5: Interpret descriptive statistical measures and visual summaries to explain data behavior and support basic information systems decisions.

Sustainable Development Goals (SDGs) and Responsibility

REGAL100 Integration:
- SDG 3: Good Health
- SDG 4: Quality Education
- SDG 9: Industry, Innovation, and Infrastructure
- SDG 11: Sustainable Cities
- SDG 16: Peace, Justice, and Strong Institutions

Discussion: Organizing and Displaying Data

Importance of gathering data relevant to the variable under study in a statistical analysis.
Importance of organizing data meaningfully for drawing conclusions or making inferences.
Illustration through the example of studying snake bites through data collection from medical centers.
Emphasis on using visual aids to effectively communicate findings: building graphs for presentation.
Illustrated with an example of studying commuting distances of employees for a retail store.

Raw Data

Definition: Data collected in its original form.
Example of Raw Data: 1, 2, 6, 7, 12, 13, 2, 6, 9, 5, and so forth.

Rules of Data Analysis

Make a picture: Visual displays reveal hidden patterns.
Make a picture: Well-designed displays show important features and unexpected patterns.
Make a picture: A good picture communicates findings effectively.

Frequency Distribution: Organization of raw data in table form with classes and frequencies.
- Categorical Frequency Distribution: For nominal/ordinal data (e.g., blood types).
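
As an illustration, a categorical frequency distribution can be tallied in a few lines of Python. This is a minimal sketch: the sample of blood types and the function name categorical_frequency are hypothetical and are not taken from the course data.

```python
from collections import Counter

# Hypothetical sample of blood types (illustrative values, not course data).
blood_types = ["A", "B", "O", "O", "AB", "A", "O", "B", "O", "A"]

def categorical_frequency(values):
    """Return {category: (count, percent)} for nominal or ordinal data."""
    counts = Counter(values)
    n = len(values)
    return {cat: (freq, 100 * freq / n) for cat, freq in sorted(counts.items())}

table = categorical_frequency(blood_types)
for cat, (freq, pct) in table.items():
    # One row per class: category, frequency, percent of the sample.
    print(f"{cat:<3} {freq:>3} {pct:5.1f}%")
```

The frequencies should always add up to the number of observations, which is a quick check that no value was skipped or counted twice.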

Bar and Pie Charts

Bar Chart: Displays counts of each category for easy comparison.
Pie Chart: Represents whole groups as a circle, divided into slices that are proportional to their fraction of the whole.
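
The arithmetic behind both charts can be sketched without a plotting library. In the example below, each pie slice gets the category's share of 360 degrees, and a bar's length is simply the category count. The counts and the helper names pie_slices and text_bar_chart are hypothetical, chosen only for illustration.

```python
# Hypothetical category counts (illustrative only, not course data).
counts = {"A": 3, "B": 2, "O": 4, "AB": 1}

def pie_slices(category_counts):
    """Return {category: (percent, degrees)} for each slice of a pie chart."""
    total = sum(category_counts.values())
    return {
        cat: (100 * freq / total, 360 * freq / total)
        for cat, freq in category_counts.items()
    }

def text_bar_chart(category_counts):
    """Return one text line per category, with one '#' per observation."""
    return [f"{cat:<3}| {'#' * freq} {freq}" for cat, freq in category_counts.items()]

slices = pie_slices(counts)
for line in text_bar_chart(counts):
    print(line)
for cat, (pct, deg) in slices.items():
    print(f"{cat}: {pct:.0f}% of the circle, {deg:.0f} degrees")
```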

Frequency Distribution Rules

Use between 5 and 20 classes.
Classes must be mutually exclusive, continuous, and exhaustive; they must also have equal width.

Example for Grouped Frequency Distribution

Record High Temperatures:
- Data: 112, 100, 127, 120, etc.
- Step 1: Determine classes (highest = 134, lowest = 100).
- Range Calculated: R = 134 - 100 = 34
- Number of Classes = 7, Class Width = \frac{R}{\text{Number of Classes}} = \frac{34}{7} \approx 4.86, rounded up to 5

Constructing Classes

Choose a starting point (e.g., 100), establish the limits, and calculate boundaries.
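
The class construction for the temperature example (lowest = 100, highest = 134, 7 classes) can be sketched as follows. This is one possible implementation under stated assumptions: the data are whole numbers, class boundaries sit 0.5 below and above the class limits, and the function name build_classes is hypothetical.

```python
def build_classes(lowest, highest, num_classes):
    """Return (width, classes) for whole-number data.

    The width is range / num_classes rounded up. When the quotient is
    already a whole number, the next integer is used so that the highest
    value still falls inside the last class.
    """
    data_range = highest - lowest
    width = data_range // num_classes + 1
    classes = []
    for i in range(num_classes):
        lower = lowest + i * width
        upper = lower + width - 1
        classes.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
        })
    return width, classes

width, classes = build_classes(100, 134, 7)
for c in classes:
    print(c["limits"], c["boundaries"])
```

With these inputs the width is 5, the first class is 100 to 104 with boundaries 99.5 to 104.5, and the last class is 130 to 134.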

Tallying and Numerical Frequencies

Example of Data Analysis: Miles per gallon data for SUVs (values provided).
- Construct frequency distributions and analyze.
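
The SUV values are not reproduced in these notes, so the sketch below uses hypothetical miles-per-gallon figures. It assumes single-value classes (an ungrouped frequency distribution), which is one reasonable choice when the range of the data is small; the function name ungrouped_frequency is hypothetical.

```python
from collections import Counter

# Hypothetical miles-per-gallon values (not the course data set).
mpg = [12, 17, 12, 14, 16, 18, 16, 18, 12, 16, 17, 15, 15, 16, 12]

def ungrouped_frequency(values):
    """Return (value, frequency) for every whole number from min to max."""
    counts = Counter(values)
    return [(v, counts.get(v, 0)) for v in range(min(values), max(values) + 1)]

rows = ungrouped_frequency(mpg)
for value, freq in rows:
    # Class, tally marks, numerical frequency.
    print(f"{value:>3} | {'/' * freq:<6} | {freq}")
```

Classes with a frequency of zero (13 in this sample) are kept in the table so that gaps in the data stay visible.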

Stem and Leaf Plots

Definition: Organizes data with part of the data as the stem and the remainder as leaves.
Usage: Retains actual data while providing a graphical representation.
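
A minimal sketch of a stem and leaf plot, assuming non-negative whole numbers with the tens part as the stem and the units digit as the leaf. The data values and the function name stem_and_leaf are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens part) to its sorted leaves (units digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

# Hypothetical data (illustrative only).
data = [13, 25, 31, 20, 32, 14, 43, 2, 36, 32]
plot = stem_and_leaf(data)
for stem in range(min(plot), max(plot) + 1):
    leaves = plot.get(stem, [])
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Because every leaf is an actual digit from the data, the original values can be read back from the plot, unlike a grouped frequency distribution.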

Back-to-Back Stem and Leaf Plot

Usage: For comparing two datasets side-by-side.
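
A back-to-back plot shares one column of stems, with the leaves of one data set on the left and the other on the right. The sketch below assumes non-negative whole numbers and tens-digit stems; the two samples and the function name back_to_back are hypothetical.

```python
def back_to_back(left, right):
    """Return rows of (left_leaves, stem, right_leaves) with shared stems."""
    stems = sorted({v // 10 for v in left + right})
    rows = []
    for s in stems:
        # Left leaves are listed in descending order so they read outward from the stem.
        left_leaves = sorted((v % 10 for v in left if v // 10 == s), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == s)
        rows.append((left_leaves, s, right_leaves))
    return rows

# Hypothetical samples (illustrative only).
group_a = [12, 15, 21, 27, 33]
group_b = [14, 22, 25, 31, 38]
rows = back_to_back(group_a, group_b)
for left_leaves, stem, right_leaves in rows:
    left_text = " ".join(str(leaf) for leaf in left_leaves)
    right_text = " ".join(str(leaf) for leaf in right_leaves)
    print(f"{left_text:>6} | {stem} | {right_text}")
```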

Measures of Central Tendency

Mean:
- Formula: \bar{X} = \frac{\sum X}{n}, where \bar{X} is the sample mean and n is the number of values.
- Example for calculation given days off.
Median:
- The midpoint of a data set when ordered.
- With an odd number of values, the median is the middle value; with an even number of values, it is the mean of the two middle values.
Mode:
- The most frequently occurring value.
- Example with signing bonuses of NFL players.
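
The three measures can be checked with Python's standard statistics module. The sketch below uses only the ten values listed in the Raw Data example of these notes (the original list continues beyond them), so the results are illustrative.

```python
import statistics

# The ten raw values listed in the Raw Data example of these notes.
data = [1, 2, 6, 7, 12, 13, 2, 6, 9, 5]

mean = sum(data) / len(data)        # sum of X divided by n
median = statistics.median(data)    # middle of the ordered values
modes = statistics.multimode(data)  # every value tied for most frequent

print(mean, median, modes)
```

Here n is even, so the median is the mean of the two middle values (6 and 6). The values 2 and 6 each occur twice, so this sample is bimodal; statistics.multimode reports both modes, whereas statistics.mode would return only one.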

Measures of Variation

Population Variance: Formula: \sigma^2 = \frac{\sum (X - \mu)^2}{N}
Sample Variance: Formula: s^2 = \frac{\sum(X - \bar{X})^2}{n - 1}
Standard Deviation: The square root of variance.
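
The two variance formulas differ only in the divisor (N for a population, n - 1 for a sample). The sketch below applies both to the ten values listed in the Raw Data example of these notes; the function names are hypothetical.

```python
import math

# The ten raw values listed in the Raw Data example of these notes.
data = [1, 2, 6, 7, 12, 13, 2, 6, 9, 5]

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

pop_var = population_variance(data)
samp_var = sample_variance(data)
pop_sd = math.sqrt(pop_var)    # population standard deviation
samp_sd = math.sqrt(samp_var)  # sample standard deviation

print(round(pop_var, 2), round(pop_sd, 2))    # 15.21 3.9
print(round(samp_var, 2), round(samp_sd, 2))  # 16.9 4.11
```

The sum of squared deviations is 152.1 in both cases; dividing by 10 gives 15.21 and dividing by 9 gives 16.9, so the sample formula always yields the larger value.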

Exploratory Data Analysis (EDA)

Overview of tools used for analyzing data:
- Standard scores (z-scores), percentiles, deciles, quartiles.
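
A standard score is computed as z = (value - mean) / standard deviation. The notes do not state this formula, so it is given here as the usual textbook definition. The exam figures in the sketch below are hypothetical.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical exam with mean 50 and standard deviation 10.
print(z_score(65, 50, 10))  # 1.5
print(z_score(38, 50, 10))  # -1.2
```

A positive z-score means the value is above the mean and a negative one means it is below, which lets scores from differently scaled tests be compared.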

Percentiles

Definition: Percentiles divide the data set into 100 equal groups and indicate the position of a value relative to the rest of the data.
Example calculation provided for finding percentile for test scores.
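
The notes do not reproduce the calculation, so the sketch below assumes one common textbook convention: percentile = (number of values below X + 0.5) / n × 100. Other conventions exist and give slightly different results. The test scores and the function name percentile_rank are hypothetical.

```python
def percentile_rank(values, x):
    """Percentile of x, assuming the (below + 0.5) / n * 100 convention."""
    below = sum(1 for v in values if v < x)
    return 100 * (below + 0.5) / len(values)

# Hypothetical test scores (illustrative only).
scores = [55, 62, 68, 70, 74, 78, 81, 85, 90, 96]
print(percentile_rank(scores, 78))  # 55.0
```

Five of the ten scores fall below 78, so a score of 78 sits at the 55th percentile under this convention. Textbooks typically round the result to the nearest whole number.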

Quartiles and Deciles

Quartiles: Divide data into four groups, separated by Q1, Q2, and Q3, where Q2 is the median.
Deciles: Divide data into ten groups.
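
Several conventions exist for locating quartiles. The sketch below assumes the median-of-halves approach: Q2 is the median, Q1 is the median of the values below it, and Q3 is the median of the values above it. It uses the ten values listed in the Raw Data example of these notes; the function name quartiles is hypothetical. The decile cut points come from statistics.quantiles, which uses its own interpolation method and therefore may not match hand calculations exactly.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # values below the median position
    upper = data[-half:]  # values above it (the middle value is skipped when n is odd)
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

# The ten raw values listed in the Raw Data example of these notes.
data = [1, 2, 6, 7, 12, 13, 2, 6, 9, 5]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)

# Nine cut points split the data into ten groups.
decile_cuts = statistics.quantiles(data, n=10)
print(len(decile_cuts))
```

For this sample, Q1 = 2, Q2 = 6, and Q3 = 9.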

Outliers Identification

Definition: Extreme data values that significantly differ from others.
Procedure for identifying outliers documented with examples.
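
The procedure itself is not reproduced in these notes. A widely used rule, assumed here, flags any value below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR, where IQR = Q3 - Q1. The sketch below applies that rule with median-of-halves quartiles; the sample data and the function name find_outliers are hypothetical.

```python
import statistics

def find_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed 1.5 x IQR rule)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])   # median of the lower half
    q3 = statistics.median(data[-half:])  # median of the upper half
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Hypothetical sample (illustrative only).
sample = [3, 7, 8, 9, 10, 11, 12, 30]
print(find_outliers(sample))  # [30]
```

Here Q1 = 7.5 and Q3 = 11.5, so IQR = 4 and the fences are 1.5 and 17.5. Only 30 falls outside them.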

Exploratory Data Analysis Techniques

Use of stem and leaf, box plots, and five-number summary for data representation.
Boxplots: Graphical representation involving minimum, Q1, median, Q3, maximum.
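
The five values that define a boxplot can be computed directly. This sketch assumes median-of-halves quartiles and uses the ten values listed in the Raw Data example of these notes; the function name five_number_summary is hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])   # median of the lower half
    q3 = statistics.median(data[-half:])  # median of the upper half
    return data[0], q1, statistics.median(data), q3, data[-1]

# The ten raw values listed in the Raw Data example of these notes.
data = [1, 2, 6, 7, 12, 13, 2, 6, 9, 5]
summary = five_number_summary(data)
print(summary)
```

In a boxplot, the box spans Q1 to Q3 (2 to 9 here), a line inside the box marks the median (6), and the whiskers extend to the minimum (1) and maximum (13).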

Outcomes-Based Assessment

Requirements include spotting misleading graphs, computations of various measures, explanations of outliers, and comparison of distributions.

References and Resources

Suggested readings and valuable papers for further knowledge in Quantitative Methods.
List of books and papers pertaining to qualitative and quantitative research methodologies.