Notes for Test 1: StatCrunch Workflow, Summary Stats, and Visualization

Overview of the exam workflow

This session covers preparing for a statistics test using StatCrunch. You will be given a dataset of about 20 values to enter into StatCrunch, and from it you will generate summary statistics, a frequency table, a histogram with six bins, a five-number summary, and a box-and-whisker plot. You have the option of doing some problems by hand on the first test, but the instructor intends for you to use StatCrunch. A dataset will be posted on Brightspace as notes for the day.

Typing the data correctly is crucial: one wrong number can throw off every answer. The instructor will compare your summary statistics to the expected values to detect typos. Small deviations suggest a typing error, which carries a single five-point penalty overall; each value that is wrong for a reason other than a typo costs one point. Wildly wrong results (e.g., a mean of 4 when 75 is expected) will be treated as not understanding the material.

Data entry and initial checks

Enter the data into StatCrunch. Expect about 20 data items. When you type, be mindful of typos because they impact the whole test. After entering the data, verify the following summary statistics:

  • n: the number of data values (a count that confirms you typed the correct number of observations)

  • x̄: the mean

  • s: the standard deviation

  • the median (Q2)

  • Q1 and Q3 (the first and third quartiles)

  • min and max

  • the interquartile range (IQR)

  • the lower fence and upper fence (these are not provided directly by all tools and must be calculated by hand)

Note that StatCrunch provides the mean and standard deviation, but not necessarily the quartiles or fences by default, so be prepared to request Q1, Q2, and Q3 and to compute the fences manually when needed.

Key statistics to report and their significance

You will report the following from the dataset:

  • n: sample size

  • x̄: mean

  • s: standard deviation

  • Median (Q2): the middle value in the ordered data

  • Q1 and Q3: the 25th and 75th percentiles

  • Min and Max: the smallest and largest data values

  • IQR: Q3 − Q1, a measure of spread resistant to outliers

  • Lower Fence: Q1 − 1.5 × IQR

  • Upper Fence: Q3 + 1.5 × IQR

Formulas you should remember:
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
s = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 }
\text{IQR} = Q_3 - Q_1
\text{Lower Fence} = Q_1 - 1.5 \times \text{IQR}
\text{Upper Fence} = Q_3 + 1.5 \times \text{IQR}
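
As a way to double-check hand calculations, the formulas above can be sketched in Python. This is a minimal sketch, not StatCrunch's method: the function name `summary_with_fences` and the sample scores are made up for illustration, and the quartiles come from Python's `statistics.quantiles` with the "inclusive" method. That is only one of several quartile conventions, so Q1 and Q3 (and therefore the fences) may differ slightly from what StatCrunch reports.

```python
import statistics


def summary_with_fences(data):
    """Return the summary statistics listed above, plus the two fences."""
    # Assumption: "inclusive" quartiles; other tools may use a different rule.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "s": statistics.stdev(data),  # sample standard deviation (divides by n - 1)
        "min": min(data),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(data),
        "iqr": iqr,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }


# Hypothetical data, not the test dataset.
scores = [62, 68, 71, 75, 75, 78, 80, 84, 88, 93]
for name, value in summary_with_fences(scores).items():
    print(f"{name}: {value}")
```

Comparing this output with your StatCrunch summary is mainly useful for n, the mean, s, min, and max, which do not depend on the quartile convention.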

If the summary statistics look exactly as expected, there were no typos. If each value is off by a small amount, it suggests a small typing error. If the values are wildly off, that signals a major misunderstanding or a data entry mistake.

Outliers and fences

The lower and upper fences are cutoffs for identifying potential outliers: any value below the lower fence or above the upper fence is flagged. In a box plot, the whiskers extend to the min and max when there are no outliers; when there are outliers, the whiskers stop at the most extreme values that lie within the fences, and each point outside the fences appears as an individual marker beyond the whiskers.

  • The five-number summary (min, Q1, Q2, Q3, max) helps locate the position of data within the range and is used to draw the box in the box-and-whisker plot.

  • The box plot visually represents the five-number summary and any outliers beyond the fences.
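
The fence check described above can be sketched as a short Python function. The name `split_by_fences` and the numbers are hypothetical, and the whisker rule used here (whiskers end at the most extreme values still inside the fences) is the usual box plot convention, assumed rather than taken from StatCrunch's documentation.

```python
def split_by_fences(data, lower_fence, upper_fence):
    """Separate potential outliers from the values inside the fences."""
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = sorted(x for x in data if x < lower_fence or x > upper_fence)
    # Whiskers end at the most extreme values that are still inside the fences.
    whiskers = (min(inside), max(inside))
    return outliers, whiskers


# Hypothetical values: assumed Q1 = 70 and Q3 = 80 give IQR = 10,
# so the fences are 70 - 15 = 55 and 80 + 15 = 95.
data = [40, 66, 70, 72, 75, 78, 80, 83, 99]
outliers, whiskers = split_by_fences(data, lower_fence=55, upper_fence=95)
print(outliers)  # [40, 99]
print(whiskers)  # (66, 83)
```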

Class width and histogram setup (six classes)

For the histogram, you’ll use six classes (bins). The first step is to calculate the class width. The class width w is determined by:
w = \left\lceil \frac{\max - \min}{k} \right\rceil
where k = 6 is the number of classes. Always round up, even if the fractional part is less than 0.5. The starting value for the first bin is min.

Once you have w, use StatCrunch to create a frequency table and a histogram with six classes, using the minimum data value as the starting point and the computed class width as the bin width. Then copy the resulting frequency table and histogram onto the test.
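
To check a class width and frequency table by hand, the rule above can be sketched in Python. The function names and the sample data are hypothetical, and the sketch assumes each class includes its lower edge but not its upper edge, which may not match how StatCrunch or your textbook treats values that fall exactly on a boundary.

```python
import math


def class_width(data, k=6):
    """Class width: (max - min) / k, always rounded up."""
    return math.ceil((max(data) - min(data)) / k)


def frequency_table(data, k=6):
    """Return (lower edge, upper edge, count) for k classes starting at min."""
    w = class_width(data, k)
    start = min(data)
    rows = []
    for i in range(k):
        lower = start + i * w
        upper = lower + w
        # Assumption: a value equal to the upper edge belongs to the next class.
        count = sum(1 for x in data if lower <= x < upper)
        rows.append((lower, upper, count))
    return rows


# Hypothetical data, not the test dataset.
scores = [52, 55, 61, 64, 70, 73, 75, 78, 80, 95]
print("class width:", class_width(scores))
for lower, upper, count in frequency_table(scores):
    print(f"{lower} to <{upper}: {count}")
```

If (max − min)/k happens to be a whole number, the maximum lands exactly on the last upper edge and is not counted under this convention, so confirm that the counts add up to n.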

Five-number summary and the box plot

The five-number summary consists of:

  • Min

  • Q1

  • Q2 (the median)

  • Q3

  • Max

These values define the interquartile range and the central box in the box-and-whisker plot. Do not forget to include the box plot on the test, as it is a required element.

Percentiles and interpretation

You will be asked to find a percentile, such as the 90th percentile (P_{90}), which is the value at or below which about 90% of the data fall. In summary stats, you enter the requested percentile and StatCrunch reports the corresponding value.
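
One way to see what a percentile means is to compute it and then count how much of the data sits at or below it. The sketch below is illustrative only: the function names and scores are made up, and `statistics.quantiles` with the "inclusive" method is just one percentile convention, so the value may not match StatCrunch exactly.

```python
import statistics


def percentile(data, p):
    """Return the p-th percentile (p is a whole number from 1 to 99)."""
    # quantiles(..., n=100) returns the 99 cut points P1..P99; P_p is at index p - 1.
    return statistics.quantiles(data, n=100, method="inclusive")[p - 1]


def share_at_or_below(data, value):
    """Fraction of the data that is less than or equal to value."""
    return sum(1 for x in data if x <= value) / len(data)


# Hypothetical data, not the test dataset.
scores = [62, 68, 71, 75, 75, 78, 80, 84, 88, 93]
p90 = percentile(scores, 90)
print("P90:", p90)
print("share at or below P90:", share_at_or_below(scores, p90))
```

In context, a score equal to P_{90} would be read as "about 90% of the scores are at or below this value."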

Step-by-step workflow to practice

  • Find a small dataset in the textbook (chapter 3 is suggested) with fewer than 30 values.

  • Enter the dataset into StatCrunch and verify you can obtain n, x̄, s, median, Q1, Q3, min, max, and IQR.

  • Compute the lower and upper fences by hand using the formulas above.

  • Create a six-bin histogram using the class width and min as your starting point.

  • Generate the frequency table for those six bins.

  • Record the five-number summary and construct the box plot.

  • Calculate the requested percentile (e.g., the 90th percentile) and interpret what it means in context.

Practice setup and study strategy

The instructor emphasizes using StatCrunch as the primary tool and keeping a notebook or notes for reference. There is a practice test and a dedicated test video that walks through StatCrunch material. The dataset for the actual test will be provided later, but practicing with a smaller, textbook dataset (<30 values) helps solidify the workflow.

Resources, timelines, and tips for success

  • A video labeled Test 1 (and a Test 2 video) is available; use the Test 1 video to walk through StatCrunch material before attempting the test.

  • If you need definitions or concepts, the textbook is a good reference.

  • The instructor will be available by email or text after class, even when not on campus.

  • You may use notes during the test, and you can use StatCrunch on a laptop, tablet, or phone. Please limit yourself to one seat per student to avoid confusion and ensure attendance tracking.

  • The instructor asks that you not write in red or purple, to avoid color mix-ups during grading.

  • The grading scheme highlights that a data-typing error costs 5 points total; other incorrect values cost 1 point each if not due to a typo.

Quick reminders and common questions

  • What is a fence? The fences are markers used to determine outliers. If data lie outside the fences, they are considered outliers; StatCrunch may display these as separate points beyond the whiskers on the box plot.

  • Always round up when computing class width, even if the fractional part is less than 0.5.

  • You can answer questions in a non-linear order on the test (e.g., you may compute fences after constructing the box plot), as long as you provide all required outputs.

  • Missing a step on the test (e.g., forgetting the box plot) is a common mistake; be sure to include both the five-number summary and the box plot.

Final note on expectations

The overall goal is for you to be comfortable with loading a dataset into StatCrunch, extracting key statistics, identifying potential outliers, constructing a histogram with six bins, and interpreting percentiles. Use the video resources, practice datasets, and your notes to reinforce the workflow, and don’t hesitate to reach out to the instructor for help during the available office hours or via email.