Final new material (lecture 15/16)

Last updated 10:05 PM on 6/6/26

28 Terms

New cards

variability (what is it?)

Quantitative measure of the degree to which scores in a distribution are spread out or clustered together

New cards

range (what is it? Xmax? Xmin?)

range = the difference between the largest and smallest score

Xmax: largest score

Xmin: smallest score

New cards

exclusive range (what is it? how is it measured? most common range variation? does "exclusive" need to be specified when discussing "range"? Be able to calculate it for a given distribution)

Most common version; it is the assumed version when referring to "range", so "exclusive" does not need to be specified

(Xmax - Xmin)

computes difference between highest and lowest values in distribution

New cards

inclusive range (what is it? how is it measured? when is it useful? does "inclusive" need to be specified when discussing "range"? Be able to calculate it for a given distribution)

Computes number of values included between highest and lowest values

Xmax - Xmin + 1

Useful for discrete values

If the data suggest we should emphasize the number of values, we often use the inclusive range; "inclusive" must be specified whenever it is used
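A minimal R sketch of both versions of the range. The scores are hypothetical, chosen only for illustration:

```r
# Hypothetical scores (not from the lecture)
scores <- c(3, 7, 4, 9, 5)

x_max <- max(scores)   # largest score (Xmax)
x_min <- min(scores)   # smallest score (Xmin)

exclusive_range <- x_max - x_min       # 9 - 3 = 6
inclusive_range <- x_max - x_min + 1   # 9 - 3 + 1 = 7 (counts the values 3 through 9)
```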

New cards

advantages / limitations of the range?

Advantage: simple to calculate and easy to understand

Limitations: determined by 2 extreme values and ignores every other score in distribution --> often fails to give accurate measure of variability

range generally considered crude and unreliable measure of variability.

New cards

quartiles (what are they? what measures of variability are they used to compute? how are they computed?)

Divides distributions into 4 equal parts

Used to compute interquartile range and semi-interquartile range

How to compute: First find median (Q2), first quartile (Q1) is median of lower half of distribution, and third quartile (Q3) is median of upper half

Lower half = everything at/below median (Q2)

Upper half = everything at/above median (Q2)
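The "median of each half" steps can be sketched in R as below. quartiles_by_halves is a hypothetical helper name (not a base R function), and the halves are taken by position, which is an assumption about how ties at the median should be handled:

```r
# Hypothetical helper: quartiles via the median of the lower and upper halves.
quartiles_by_halves <- function(x) {
  x    <- sort(x)
  n    <- length(x)
  half <- ceiling(n / 2)          # with an odd n, the median is in both halves
  lower <- x[1:half]              # everything at/below the median
  upper <- x[(n - half + 1):n]    # everything at/above the median
  c(Q1 = median(lower), Q2 = median(x), Q3 = median(upper))
}

ex_even <- quartiles_by_halves(c(11, 3, 4, 7, 5, 9, 13, 10))
ex_odd  <- quartiles_by_halves(c(15, 4, 7, 8, 21, 1, 5, 18, 11, 13, 10))
```

For these two data sets the helper gives 4.5, 8, 10.5 and 6, 10, 14.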

New cards

quartiles (what percentage of a distribution is below Q1? Below Q2? Below Q3? Q2 is the same as what measure of central tendency?)

Q1 separates the lowest 25% of the distribution from the rest (25% is below it)

Q2 has 50% of the distribution below it (same as the median)

Q3 has 75% of the distribution below it

New cards

Computing quartiles ex 1 (find Q1, Q2(median) and Q3):

11, 3, 4, 7, 5, 9, 13, 10

First order them: 3, 4, 5, 7, 9, 10, 11, 13

Q2 (median): 8

(bc there's an even number of scores, the median is halfway b/w the two middle scores: (7 + 9) / 2 = 8)

Lower half: 3, 4, 5, 7

Upper half: 9, 10, 11, 13

Q1: 4.5

(median of lower half)

Q3: 10.5

(median of upper half)

New cards

calculating quartiles ex 2 (find Q1, Q2(median) and Q3):

15, 4, 7, 8, 21, 1, 5, 18, 11, 13, 10

First order them: 1, 4, 5, 7, 8, 10, 11, 13, 15, 18, 21

Q2: 10 (median)

Lower half: 1, 4, 5, 7, 8, 10

Upper half: 10, 11, 13, 15, 18, 21

(odd number of scores, so the median, 10, is included in both halves)

Q1: 6 (median of lower half)

Q3: 14 (median of upper half)

New cards

interquartile range (what is it? how is it calculated? Be able to calculate it from a given distribution)

Distance between first and third quartile

Q3 - Q1

New cards

semi-interquartile range (what is it? how is it calculated? what does it provide a measure of? Be able to calculate it from a given distribution)

One half of the interquartile range

(Q3 - Q1) / 2

Provides descriptive measure of "typical" distance of scores from median (Q2)
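A short R sketch of both calculations, using a small example data set and taking Q1 and Q3 from fivenum() (the hinges). Note that base R's IQR() function uses a different quartile method by default, so it can return a slightly different number:

```r
scores <- c(11, 3, 4, 7, 5, 9, 13, 10)

five <- fivenum(scores)   # min, Q1 (lower hinge), median, Q3 (upper hinge), max
q1 <- five[2]             # 4.5
q3 <- five[4]             # 10.5

iqr      <- q3 - q1         # interquartile range: 10.5 - 4.5 = 6
semi_iqr <- (q3 - q1) / 2   # semi-interquartile range: 3
```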

New cards

semi-interquartile range (advantages/disadvantages?)

Advantages: focuses on the middle 50% of the distribution, so it's less likely to be influenced by extreme scores

- this makes it a better, more stable measure of variability than the range

Disadvantages: doesn't take into account actual distances b/w individual scores so it doesn't give complete pic of how scattered/clustered scores are

New cards

box plots (also known as? what statistical elements are represented on the plot? what are hinges? H-spread?)

also known as box-and-whisker plot

shows median, quartiles and range on plot

Hinges: the "box" of the plot determined by Q1 and Q3 (the left and right ends of the box!)

H-spread: interquartile range, distance between two "hinges" of the box plot

New cards

box plots (inner fence? adjacent values? nonadjacent values? Be able to calculate all of these values; Be able to identify all of these on a box plot)

inner fence: point that falls 1.5 times the H-spread (interquartile range) above or below the appropriate hinge

adjacent values: values in the data that fall inside the inner fences (no farther from the median than the fences)

anything outside inner fences is a nonadjacent value

New cards

inner fence ex: H-spread of 2, with hinges at 4 and 6

1.5 × H-spread --> (1.5 × 2 = 3)

lower hinge is at 4, so lower fence will be: 4 - 3 = 1

upper hinge is at 6, so upper fence will be: 6 + 3 = 9
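The same arithmetic as a small R sketch (variable names are just for illustration):

```r
lower_hinge <- 4
upper_hinge <- 6

h_spread <- upper_hinge - lower_hinge   # 2
step     <- 1.5 * h_spread              # 3

lower_fence <- lower_hinge - step       # 4 - 3 = 1
upper_fence <- upper_hinge + step       # 6 + 3 = 9
```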

New cards

box plot lines ("whiskers") (what values are they drawn through?)

Lines (whiskers) are drawn from the hinges out through the adjacent values, ending at the most extreme adjacent value on each side

New cards

box plot outliers (what are they? non-adjacent values? how are they plotted? Be able to identify them in a distribution or on a box plot)

any value more extreme than the end of the whiskers (more extreme than adjacent values)

non-adjacent values are outliers

plotted as just dots on the outside of inner fences
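One way to find adjacent values and outliers in R, sketched with hypothetical data that includes one extreme score. The hinges come from fivenum(); boxplot.stats() is a base R function that applies the same 1.5 × H-spread rule by default:

```r
# Hypothetical data with one extreme score (40)
scores <- c(1, 4, 5, 7, 8, 10, 11, 13, 15, 18, 40)

five        <- fivenum(scores)
lower_hinge <- five[2]                        # 6
upper_hinge <- five[4]                        # 14
h_spread    <- upper_hinge - lower_hinge      # 8

lower_fence <- lower_hinge - 1.5 * h_spread   # 6 - 12 = -6
upper_fence <- upper_hinge + 1.5 * h_spread   # 14 + 12 = 26

inside   <- scores >= lower_fence & scores <= upper_fence
adjacent <- scores[inside]    # values inside the inner fences
outliers <- scores[!inside]   # nonadjacent values: 40

whisker_ends <- range(adjacent)           # whiskers stop at 1 and 18
builtin_out  <- boxplot.stats(scores)$out # same outliers from base R
```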

New cards

two methods of calculating the range in R?

Method 1: range() function

Method 2: min() and max() functions

New cards

range() function (what does it do? What does it return?)

returns a vector with both the lowest and highest value in your data

(to get actual range, we subtract first element from the second element): range(VecRaw)[2] - range(VecRaw)[1]

New cards

max() and min() functions (how can they be used to compute the range?)

min() returns the lowest value and max() returns the highest value, then subtract the low value from the high value:

max(VecRaw) - min(VecRaw)
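A quick check that both methods agree, using a small example vector for VecRaw. The diff(range()) form at the end is a common shortcut that is not from the lecture notes:

```r
VecRaw <- c(11, 3, 4, 7, 5, 9, 13, 10)

both_ends <- range(VecRaw)                       # lowest and highest value: 3 13

method1 <- range(VecRaw)[2] - range(VecRaw)[1]   # 13 - 3 = 10
method2 <- max(VecRaw) - min(VecRaw)             # 13 - 3 = 10

shortcut <- diff(range(VecRaw))                  # also 10
```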

New cards

fivenum() function (what does it do? what values does it return and in what order? how can we access individual values from returned values?)

Computes the five-number summary (the quartiles plus the extremes) and returns 5 values, in this order:

(1) the minimum

(2) first quartile (lower hinge)

(3) the median

(4) third quartile (upper hinge)

(5) the maximum

we can access by using brackets, ex, Q3 - Q1:

fivenum(VecRaw)[4] - fivenum(VecRaw)[2]
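A small sketch of fivenum() on an 11-score example vector, storing the result once and then indexing it:

```r
VecRaw <- c(15, 4, 7, 8, 21, 1, 5, 18, 11, 13, 10)

five <- fivenum(VecRaw)   # 1 6 10 14 21

lower_hinge <- five[2]    # Q1 = 6
med         <- five[3]    # median = 10
upper_hinge <- five[4]    # Q3 = 14

iqr <- five[4] - five[2]  # 14 - 6 = 8
```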

New cards

quantile() function (what does it do? what does it do that the fivenum() function does not? what does the "type" parameter change? Know how to use the "probs" parameter to calculate a series of percentiles)

allows you to select any percentiles you want, rather than just returning quartiles

can specify exactly which method you want to use by using "type" (type 1-9)

use probs parameter to compute specific percentiles, ex 30th and 70th percentile:

quantile(VecRaw, probs = c(.3, .7))
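A sketch showing probs and type together, using a small example vector. The choice of type = 1 for comparison is arbitrary; quantile() uses type = 7 when no type is given:

```r
VecRaw <- c(11, 3, 4, 7, 5, 9, 13, 10)

# 30th and 70th percentiles with the default method (type = 7)
p_default <- quantile(VecRaw, probs = c(.3, .7))

# Same percentiles with a different method (type = 1)
p_type1 <- quantile(VecRaw, probs = c(.3, .7), type = 1)

# A series of percentiles: 10th, 20th, ..., 90th
p_series <- quantile(VecRaw, probs = seq(0.1, 0.9, by = 0.1))
```

With this vector the two methods give different answers (5.2 and 9.9 for type 7 versus 5 and 10 for type 1), which is why the type matters when comparing results.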

New cards

boxplot() function (default orientation? "horizontal" parameter? "outline" parameter?)

Default orientation of a boxplot is vertical

horizontal = TRUE orients boxplot horizontally

outline = FALSE suppresses display of outliers
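A minimal sketch of the three cases, using hypothetical data with one outlier (40). As far as I know, outline = FALSE only hides the outlier points; the value returned by boxplot() still lists them:

```r
# Hypothetical data with one extreme score
VecRaw <- c(1, 4, 5, 7, 8, 10, 11, 13, 15, 18, 40)

boxplot(VecRaw)                      # vertical (default), outlier drawn as a point
boxplot(VecRaw, horizontal = TRUE)   # same plot, oriented horizontally

# Horizontal plot with the outlier points suppressed
bp <- boxplot(VecRaw, horizontal = TRUE, outline = FALSE)
```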

New cards

boxplot() (What elements of plot can be adjusted and with what parameters? "medlwd" parameter? "medcol" parameter? "border" parameter? "col" parameter "lwd" parameter?)

medlwd: set width of the median

medcol: set color of the median

border: change color of the box border and whiskers

col: change color that fills the box

lwd: set the thickness of lines
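A sketch combining these parameters in one call. The data and the specific colors and widths are arbitrary choices for illustration:

```r
VecRaw <- c(3, 4, 5, 7, 9, 10, 11, 13)

bp <- boxplot(VecRaw,
              col    = "lightblue",  # fill color of the box
              border = "navy",       # color of the box border and whiskers
              lwd    = 2,            # thickness of the lines
              medlwd = 4,            # width of the median line
              medcol = "red")        # color of the median line
```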

New cards

boxplot() (Line types? "lty" parameter? What parameters adjust appearance of outliers? "pch" parameter? "cex" parameter? "outbg" parameter? "outcol" parameter?)

Line types: 0) Blank (no line), 1) solid line, 2) dashed line, 3) dotted line, 4) dot-dash line, 5) longdash line, 6) Twodash line

lty: sets all lines to same type

(can change individual lines using, boxlty, medlty, whisklty)

pch: adjusts plotting character for outliers (ex: pch = 15 makes them solid squares)

cex: changes size of plotting character

outbg: sets fill color of outlier (only shows for fillable plotting characters, pch 21-25)

outcol: sets border color of outlier
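A sketch of the line-type and outlier parameters together, using hypothetical data with one outlier. pch = 21 (a fillable circle) is chosen here so that the outbg fill color is visible; the colors and sizes are arbitrary:

```r
# Hypothetical data with one extreme score
VecRaw <- c(1, 4, 5, 7, 8, 10, 11, 13, 15, 18, 40)

bp <- boxplot(VecRaw,
              lty      = 1,         # all lines solid...
              whisklty = 2,         # ...except the whiskers, which are dashed
              pch      = 21,        # outliers drawn as fillable circles
              cex      = 1.5,       # outlier size
              outbg    = "orange",  # outlier fill color
              outcol   = "black")   # outlier border color
```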

New cards

multiple box plots on same graph (how do we get multiple box plots on the same graph? what data structure is sent to boxplot() function to create multiple plots on the same graph?)

Create a list containing both data sets, then pass it to the boxplot() function:

DataList <- list(DataSet1 = VecRaw, DataSet2 = VecRaw2)

boxplot(DataList, horizontal = TRUE)

New cards

vertical vs. horizontal box plot orientation (guidelines for when to choose vertical orientation? when to choose horizontal orientation?)

Vertical: when data groups differ by TIME

Horizontal: when putting a lot of plots on the same graph, and when the names of your groups (data sets) are fairly long; this lets you spell out each whole group name on the left side of the chart

New cards

box plot label orientation ("las" parameter and its values?)

To change label orientation, use las:

0) labels parallel to axis (the default), 1) labels always horizontal, 2) labels perpendicular to axis, 3) labels always vertical

Ex: if we want horizontal labels, we write las = 1
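A sketch of las = 1 on a horizontal plot with two groups. The group names and data are hypothetical, and the wider left margin set with par(mar = ...) is my own addition so the long names fit; the exact margin values are a guess:

```r
# Hypothetical groups with fairly long names
DataList <- list("Control group"   = c(3, 4, 5, 7, 9, 10, 11, 13),
                 "Treatment group" = c(1, 4, 5, 7, 8, 10, 11, 13, 15, 18, 21))

old_par <- par(mar = c(5, 9, 4, 2))   # widen the left margin (assumed values)

# las = 1 keeps the group names horizontal, so they read left to right
bp <- boxplot(DataList, horizontal = TRUE, las = 1)

par(old_par)                          # restore the previous margins
```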