Course Modules in ISMAT3000: Quantitative Methods Lecture

MODULE 2: Topic: Describing Data

Course Modules Overview

  • Course Modules: 16 modules organized under ISMAT3000 focusing on Quantitative Methods.

Course Information

  • Week Number: 3 to 4

  • Chapter/Unit Number: 2

  • Topic Title: Describing Data

Course Outcomes

  • CO1: Describe, summarize, and organize datasets using appropriate descriptive statistical measures and graphical representations to support information systems analysis.

Intended Learning Outcomes

  • LO1: Describe datasets by organizing data into frequency distributions, tables, and graphical forms.

  • LO2: Compute and describe data using measures of central tendency (mean, median, mode).

  • LO3: Describe data variability using measures of dispersion (range, variance, standard deviation).

  • LO4: Describe patterns, trends, and outliers in datasets using graphs (histograms, bar charts, line graphs, box plots).

  • LO5: Interpret descriptive statistical measures and visual summaries to explain data behavior and support basic information systems decisions.

Sustainable Development Goals (SDGs) and Responsibility

  • REGAL100 Integration:

    • SDG 3: Good Health and Well-being

    • SDG 4: Quality Education

    • SDG 9: Industry, Innovation, and Infrastructure

    • SDG 11: Sustainable Cities and Communities

    • SDG 16: Peace, Justice, and Strong Institutions

Discussion: Organizing and Displaying Data

  • Importance of gathering data relevant to the variable under study in a statistical analysis.

  • Importance of organizing data meaningfully for drawing conclusions or making inferences.

  • Example: studying snake bites using data collected from medical centers.

  • Emphasis on using visual aids to effectively communicate findings: building graphs for presentation.

  • Example: studying the commuting distances of a retail store's employees.

Raw Data

  • Definition: Data collected in its original form.

  • Example of Raw Data: 1, 2, 6, 7, 12, 13, 2, 6, 9, 5, and so forth.

Rules of Data Analysis

  1. Make a picture: Visual displays reveal hidden patterns.

  2. Make a picture: Well-designed displays show important features and unexpected patterns.

  3. Make a picture: A good picture communicates findings effectively.

  • Frequency Distribution: Organization of raw data in table form with classes and frequencies.

    • Categorical Frequency Distribution: For nominal/ordinal data (e.g., blood types).
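A categorical frequency distribution can be sketched in a few lines of Python. The blood-type sample and the helper name `categorical_frequency` below are illustrative assumptions, not data or code from the lecture.

```python
from collections import Counter

# Hypothetical sample of 25 blood types (illustrative values only).
blood_types = ["A", "B", "B", "AB", "O", "O", "O", "B", "AB", "B",
               "B", "B", "O", "A", "O", "A", "O", "O", "O", "AB",
               "AB", "A", "O", "B", "A"]

def categorical_frequency(values):
    """Return {category: (frequency, percent)} for nominal or ordinal data."""
    counts = Counter(values)
    n = len(values)
    return {cat: (f, 100 * f / n) for cat, f in sorted(counts.items())}

table = categorical_frequency(blood_types)
for category, (freq, pct) in table.items():
    print(f"{category:<3} {freq:>3} {pct:5.1f}%")
```

Each class (category) gets one row with its frequency and its share of the total, which is the table form described above.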

Bar and Pie Charts

  • Bar Chart: Displays counts of each category for easy comparison.

  • Pie Chart: Represents whole groups as a circle, divided into slices that are proportional to their fraction of the whole.
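As a rough illustration of how pie slices stay proportional to their fraction of the whole, the sketch below converts category counts into percentages and central angles (fraction × 360°). The counts and the function name `pie_slices` are hypothetical.

```python
def pie_slices(frequencies):
    """Map each category to (percent of whole, central angle in degrees)."""
    total = sum(frequencies.values())
    return {cat: (100 * f / total, 360 * f / total)
            for cat, f in frequencies.items()}

# Hypothetical category counts, for illustration only.
counts = {"A": 5, "B": 7, "O": 9, "AB": 4}
slices = pie_slices(counts)

for cat, (pct, angle) in slices.items():
    print(f"{cat:<3} {pct:5.1f}%  {angle:6.1f} degrees")
```

The same counts could feed a bar chart directly, since bar heights are just the frequencies.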

Frequency Distribution Rules

  • Between 5 and 20 classes.

  • Classes must be mutually exclusive, continuous, and exhaustive; they must also have equal width.

Example for Grouped Frequency Distribution

  1. Record High Temperatures:

    • Data: 112, 100, 127, 120, etc.

    • Step 1: Determine classes (highest = 134, lowest = 100).

    • Range Calculated: R = 134 - 100 = 34

    • Number of Classes = 7, Class Width = \frac{R}{\text{Number of Classes}} = \frac{34}{7} \approx 4.86, rounded up to 5

Constructing Classes

  • Choose a starting point (e.g., 100), establish the limits, and calculate boundaries.
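The steps above can be sketched in Python, assuming integer data so that each class boundary sits 0.5 below the lower limit and 0.5 above the upper limit. The function names are hypothetical, and only the four temperatures actually listed in the notes are tallied, since the full data set is not reproduced here.

```python
import math

def build_classes(lowest, highest, num_classes):
    """Return (width, classes); each class is
    (lower_limit, upper_limit, lower_boundary, upper_boundary)."""
    data_range = highest - lowest
    # Round the width up. Caveat: if the range divides evenly, the last
    # class may stop short of the highest value and a wider width is needed.
    width = math.ceil(data_range / num_classes)
    classes = []
    lower = lowest
    for _ in range(num_classes):
        upper = lower + width - 1
        classes.append((lower, upper, lower - 0.5, upper + 0.5))
        lower += width
    return width, classes

def tally(values, classes):
    """Count how many values fall inside each class's limits."""
    freqs = [0] * len(classes)
    for v in values:
        for i, (lo, hi, _, _) in enumerate(classes):
            if lo <= v <= hi:
                freqs[i] += 1
                break
    return freqs

width, classes = build_classes(lowest=100, highest=134, num_classes=7)
freqs = tally([112, 100, 127, 120], classes)  # only the values shown in the notes

for (lo, hi, lb, ub), f in zip(classes, freqs):
    print(f"{lo}-{hi}  ({lb}-{ub})  frequency={f}")
```

Starting at 100 with a width of 5 gives the limits 100-104, 105-109, and so on up to 130-134, so the classes are mutually exclusive, continuous, exhaustive, and equal in width.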

Tallying and Numerical Frequencies

  • Example of Data Analysis: Miles per gallon data for SUVs (values provided).

    • Construct frequency distributions and analyze.

Stem and Leaf Plots

  • Definition: Organizes data with part of the data as the stem and the remainder as leaves.

  • Usage: Retains actual data while providing a graphical representation.
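A minimal sketch of a stem and leaf plot, assuming non-negative integers with the tens digits as the stem and the units digit as the leaf. The data values and the function name `stem_and_leaf` are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative integers by tens (stem) with sorted units digits (leaves)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical data, for illustration only.
data = [13, 14, 20, 23, 25, 31, 32, 32, 36, 43, 44, 2]
plot = stem_and_leaf(data)

# Print every stem in order, including stems that have no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Because every leaf is an actual digit from the data, the original values can be read back from the plot, unlike a grouped frequency distribution.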

Back-to-Back Stem and Leaf Plot

  • Usage: For comparing two datasets side-by-side.

Measures of Central Tendency

  1. Mean:

    • Formula: \bar{X} = \frac{\sum X}{n}, where \bar{X} is the sample mean and n is the number of values.

    • Example: calculating the mean from data on days off.

  2. Median:

    • The midpoint of a data set when ordered.

    • With an odd number of values, the median is the middle value; with an even number, it is the mean of the two middle values.

  3. Mode:

    • The most frequently occurring value.

    • Example with signing bonuses of NFL players.
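The three measures can be sketched as follows. The small data sets are hypothetical (they are not the days-off or signing-bonus examples from the lecture), and `statistics.multimode` is used so that data with more than one mode is handled.

```python
import statistics

def mean(values):
    """Sample mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical data sets, for illustration only.
odd_sized = [3, 5, 5, 8, 9]
even_sized = [3, 5, 5, 8, 9, 12]

mean_odd = mean(odd_sized)
median_odd = median(odd_sized)
median_even = median(even_sized)
modes = statistics.multimode(odd_sized)            # most frequent value(s)
two_modes = statistics.multimode([1, 1, 2, 2, 3])  # bimodal example

print(mean_odd, median_odd, median_even, modes, two_modes)
```

The hand-written `mean` and `median` mirror the definitions above; in practice `statistics.mean` and `statistics.median` do the same job.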

Measures of Variation

  • Population Variance: Formula: \sigma^2 = \frac{\sum (X - \mu)^2}{N}

  • Sample Variance: Formula: s^2 = \frac{\sum(X - \bar{X})^2}{n - 1}

  • Standard Deviation: The square root of variance.
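The two variance formulas differ only in the divisor (N for a population, n - 1 for a sample). A short sketch with hypothetical data makes the difference visible; the function names are illustrative.

```python
import math
import statistics

def population_variance(values):
    """sigma^2 = sum((X - mu)^2) / N"""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """s^2 = sum((X - Xbar)^2) / (n - 1)"""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

# Hypothetical data, for illustration only (mean is 5, squared deviations sum to 32).
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)
samp_var = sample_variance(data)
pop_sd = math.sqrt(pop_var)     # standard deviation = square root of variance
samp_sd = math.sqrt(samp_var)

print(pop_var, pop_sd, samp_var, samp_sd)
```

The standard library offers the same calculations as `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev`.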

Exploratory Data Analysis (EDA)

  • Overview of tools used for analyzing data:

    • Standard scores (z-scores), percentiles, deciles, quartiles.
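A standard score expresses how many standard deviations a value lies above or below the mean, z = (X - mean) / standard deviation. The sketch below uses made-up test results to show how z-scores let values from different distributions be compared.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Hypothetical scores on two tests with different means and spreads.
z_test_a = z_score(65, mean=50, std_dev=10)
z_test_b = z_score(30, mean=25, std_dev=5)

print(z_test_a, z_test_b)
```

Here the first score has the higher relative position because its z-score is larger, even though the two tests use different scales.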

Percentiles

  • Definition: Divide data into 100 groups to indicate score positions relative to others.

  • Example calculation provided for finding percentile for test scores.
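One common textbook convention for the percentile rank of a value X is (number of values below X + 0.5) / n × 100. The notes do not state which formula the lecture uses, so the sketch below is an assumption, and the test scores are hypothetical.

```python
def percentile_rank(values, x):
    """Percentile rank of x: (count of values below x + 0.5) / n * 100."""
    below = sum(1 for v in values if v < x)
    return 100 * (below + 0.5) / len(values)

# Hypothetical test scores, for illustration only.
scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
rank_of_12 = percentile_rank(scores, 12)

print(rank_of_12)
```

Six of the ten scores fall below 12, so under this convention a score of 12 sits at about the 65th percentile.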

Quartiles and Deciles

  • Quartiles: Divide data into four groups (Q1, Q2, Q3; where Q2 is the median).

  • Deciles: Divide data into ten groups.
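Quartiles and deciles can be sketched with `statistics.quantiles` (Python 3.8+), which returns the cut points that split the data into n groups. Its default "exclusive" method may give slightly different values from a textbook's hand method, so treat the results as one convention among several; the data are hypothetical.

```python
import statistics

# Hypothetical data, for illustration only.
data = [2, 4, 5, 7, 8, 10, 12, 13, 15, 18, 20]

quartiles = statistics.quantiles(data, n=4)   # three cut points: Q1, Q2, Q3
deciles = statistics.quantiles(data, n=10)    # nine cut points: D1 ... D9

q1, q2, q3 = quartiles
print("Q1, Q2, Q3:", q1, q2, q3)
print("Deciles:", deciles)
```

Four groups need three cut points and ten groups need nine, and Q2 coincides with the median.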

Outliers Identification

  • Definition: Extreme data values that significantly differ from others.

  • Procedure for identifying outliers documented with examples.
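The notes do not spell out the procedure, so the sketch below assumes the widely used interquartile-range rule: a value is flagged if it falls below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. Quartiles are taken as the medians of the lower and upper halves of the ordered data, and the data values are hypothetical.

```python
import statistics

def quartiles_by_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return statistics.median(lower), statistics.median(upper)

def find_outliers(values):
    """Return values outside the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, q3 = quartiles_by_halves(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

# Hypothetical data, for illustration only.
data = [5, 6, 12, 13, 15, 18, 22, 50]
q1, q3 = quartiles_by_halves(data)
outliers = find_outliers(data)

print(q1, q3, outliers)
```

With Q1 = 9 and Q3 = 20, the IQR is 11 and the fences are -7.5 and 36.5, so only 50 is flagged as an outlier.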

Exploratory Data Analysis Techniques

  • Use of stem and leaf, box plots, and five-number summary for data representation.

  • Boxplots: Graphical representation involving minimum, Q1, median, Q3, maximum.

Outcomes-Based Assessment

  • Requirements include spotting misleading graphs, computations of various measures, explanations of outliers, and comparison of distributions.

References and Resources

  • Suggested readings and valuable papers for further knowledge in Quantitative Methods.

  • List of books and papers pertaining to qualitative and quantitative research methodologies.