Cogs 10A Midterm 3

56 Terms

New cards

Percentile rank

The percentage of individuals in the distribution with scores at or below a particular value

New cards

Percentile

When a score is defined by its percentile rank, that score is called a _____.

New cards

Cumulative frequencies (what are they? how to compute? identifying its column in a frequency distribution table?)

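They show the number of individuals located at or below each score. In a frequency distribution table they appear in the cf column.

How to compute? → Find the row for your X value, then add its frequency (f) to the frequencies of all the scores below it.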
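A minimal sketch of the running-total idea in R, assuming a made-up set of frequencies (the values of x and f are hypothetical, not from the course data):

```r
# Hypothetical frequencies for scores X = 1..5, listed from lowest to highest
x <- 1:5
f <- c(2, 3, 5, 4, 1)

# cumsum() returns a running total: each entry is the number of
# scores at or below the corresponding X value
cf <- cumsum(f)
cf
```

The last entry of cf should always equal N, the total number of scores.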

New cards

Cumulative percentages (what are they? how to compute? identifying its column in a frequency distribution table? relationship to real limits of frequency distribution class intervals?)

Cumulative frequencies converted into percentages. Divide the cf value by N and multiply by 100. Ex: c% = (cf/N)(100). Each cumulative percentage value is associated with the upper real limit of its interval.
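As a rough illustration of the c% formula, assuming hypothetical cumulative frequencies for five score values (the numbers are invented for the example):

```r
# Hypothetical cumulative frequencies for five score values (lowest first)
cf <- c(2, 5, 10, 14, 15)
N <- 15  # total number of scores; equals the last cumulative frequency

# c% = (cf / N) * 100
cpct <- cf / N * 100
round(cpct, 1)
```

The highest score always has a cumulative percentage of 100.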

New cards

Determining percentiles and percentile ranks from frequency distribution table (Be able to do it; how to do it if desired value does not appear directly in table?) cumsum() function (what does it do? Understand the vector of values returned by the function)

Some values you can determine directly from the table (ex: 3.5, 70%) Use interpolation if you cannot find the value directly from the table.

The cumsum() function calculates cumulative frequencies for a frequency distribution table. It returns a vector of running totals, one cumulative frequency per frequency passed in.

New cards

Adding cumulative frequencies to a table?

You can add cumulative frequencies to a table using data.frame(): CFTableName <- data.frame(rev(OldFDTableName), rev(CFs))

New cards

Adding cumulative percentages to frequency distribution tables?

You can add cumulative percentages using cbind(), after calculating N (calculating N using sum() of $Freq)
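One possible way to put these steps together, sketched with hypothetical raw scores. The names scores, fd, cf and cpct are assumptions for the example, and the rows here run from the lowest score to the highest, which may differ from the ordering used in class:

```r
# Hypothetical raw scores
scores <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 5)

# Frequency distribution table with columns "scores" and "Freq"
fd <- as.data.frame(table(scores))

# N is the sum of the Freq column
N <- sum(fd$Freq)

# Add a cumulative frequency column, then a cumulative percentage column
fd <- cbind(fd, cf = cumsum(fd$Freq))
fd <- cbind(fd, cpct = fd$cf / N * 100)
fd
```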

New cards

Interpolation, what is it and its purpose?

Gives us a method for finding values that are located between two specified numbers. Estimates intermediate values.

New cards

General process of Interpolation?

Single interval is measured on two separate scales. The endpoints of the interval are known for each scale.
You are given an intermediate value on one of the scales. The problem is you need to find the corresponding intermediate value on the other scale.

New cards

Perform simple interpolation from frequency distribution table to find percentile rank. Percentile rank corresponding to X = 7

X = 7 is bounded by real limits of 6.5 and 7.5.

The cumulative percentages at these real limits are 20% (at 6.5) and 44% (at 7.5).

The interval width is 1 point on the score scale (7.5 - 6.5) and 24 points on the percentage scale (44 - 20).

7 is located 0.5 below the upper real limit (7.5), and 0.5/1 = ½ of the interval.

Half of the percentage width (24) is 12.

Subtract 12 from the top percentage (44): 44 - 12 = 32%
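The same calculation can be sketched as a small R helper. The function name interpolate and its argument names are hypothetical, chosen only for this example:

```r
# Linear interpolation between two known points (x0, y0) and (x1, y1):
# find the y value that corresponds to an intermediate x
interpolate <- function(x, x0, x1, y0, y1) {
  y0 + (x - x0) / (x1 - x0) * (y1 - y0)
}

# Score scale: real limits 6.5 and 7.5
# Percentage scale: 20% at 6.5 and 44% at 7.5
rank_7 <- interpolate(7, x0 = 6.5, x1 = 7.5, y0 = 20, y1 = 44)
rank_7  # 32
```

Working up from the bottom of the interval (20 + 12) gives the same answer as working down from the top (44 - 12).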

New cards

Perform simple interpolation from frequency distribution table to find percentile. Find the 50th Percentile

A value of 50% is not found in the table, but it lies between the cumulative percentages 10% and 60%. Corresponding intervals: 10% → 0-4 (upper real limit: 4.5), 60% → 5-9 (upper real limit: 9.5)

Interval width for scores: 5 (9.5 - 4.5) | Interval width for %: 50 (60 - 10)

50% is located 10 points down from the top of the percentage interval (60%), which is a fraction of 10/50 → ⅕

Now, multiply the score interval width (5) by that fraction: 5 × ⅕ = 1

Subtract 1 from the upper real limit (9.5): 9.5 - 1 = 8.5
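As a cross-check, base R's approx() performs linear interpolation and can be pointed at the two known endpoints. This is only a sketch of one way to verify the hand calculation, not necessarily the method used in class:

```r
# Known endpoints: cumulative percentages and their upper real limits
cpct   <- c(10, 60)
limits <- c(4.5, 9.5)

# Interpolate the score that corresponds to 50%
p50 <- approx(x = cpct, y = limits, xout = 50)$y
p50  # 8.5
```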

New cards

Stem and leaf display, what is it?

Simple alternative to a frequency distribution table or graph.

New cards

What is a stem?

The first digit ( or digits )

New cards

What is a leaf?

The last digit ( or digits )

New cards

How to construct a stem and leaf display

List all of the stems in a column, then go through the data one score at a time and write the leaf for each score beside its stem.

New cards

Advantages of a stem and leaf display?

Easy to construct, Identifies every individual score in the data set, provides a picture of the distribution and a list of the scores. Easy to modify the display for a more detailed picture of the distribution.

New cards

splitting stems?

Each stem is listed twice: once for the lower leaves (0-4) and once for the higher leaves (5-9). This regroups the distribution using an interval width of 5 points instead of 10.

New cards

stem() function, what does it do?

Creates a stem and leaf display from the raw scores passed to it.

New cards

How does having “ordered” leaves help us?

It helps to pick out values of interest when reviewing data

New cards

Be able to pick out highest/lowest score of stem & leaf

The lowest score will be the first leaf on the first stem in our display.

The highest score will be the last leaf on the last stem in our display.

New cards

Know how to use the “scale” parameter

Scale parameter to 4 → two stems for each first digit value.

Scale parameter to 2 → doubling the default (1)
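A short sketch of calling stem() with and without the scale argument, using hypothetical scores. The exact stems R chooses depend on the data, so the display may not split exactly as described above for every data set:

```r
# Hypothetical raw scores
scores <- c(83, 82, 63, 62, 93, 78, 71, 68, 33, 76, 52, 97,
            85, 42, 46, 32, 57, 59, 56, 73, 74, 74, 81, 76)

# Default display (scale = 1); leaves are printed in order
stem(scores)

# Larger scale values stretch the display over more stem lines
stem(scores, scale = 2)
```

stem() prints the display to the console and does not return the table as a value.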

New cards

Measure of central tendency, what is it?,what does it identify?

Identifies a single score as the most representative or the most typical of an entire distribution. Usually a value in the middle of the distribution.

New cards

3 measures of central tendency?

Mean, median, and mode. No single measure works best for every distribution:

First distribution: Symmetrical, easy to identify the center.

Second distribution: Negatively skewed, scores pile up around one area but taper off.

Third distribution: Symmetrical but has two distinct “piles”

New cards

Mean, what is it? formulas? how to compute?

Computed by adding all the scores and dividing by the number of scores (N). Formula: mean = ΣX / N.

New cards

Notation for sample/population?

Sample = X-bar, Population = mu

New cards

Weighted mean, what is it? how to calculate?

Samples are not the same size so one group will make a larger contribution to the total group, hence the name “weighted” mean. Calculated by combining the sigma X values for each group, combining the N values for each group, and then dividing.

New cards

Calculate a weighted mean given group n values.

Section 1: n1 = 12 students, average score X1 = 6

Section 2: n2 = 8 students, average score X2 = 7

(6(12) + 7(8)) / (12 + 8) = 128 / 20 = 6.4
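The same arithmetic as a quick R sketch (the vector names n and m are assumptions for the example):

```r
n <- c(12, 8)  # group sizes
m <- c(6, 7)   # group means

# Each group's sigma X is n * mean; add them up and divide by the total N
wm <- sum(n * m) / sum(n)
wm  # 6.4
```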

New cards

Calculate a weighted mean given group weights.

Consider all of the groups that will contribute to the overall mean.

Let's say there are 20 students overall. Divide each group's n by 20 to find its proportion (weight).

12/20 = 0.6 and 8/20 = 0.4. Now, multiply each mean by its weight and add the results. Recall that X1 = 6 and X2 = 7, so 0.6(6) + 0.4(7) = 6.4

New cards

How does changing a score affect the mean?

Changing the value of a score changes the sigma X value, so the mean changes. Adding or removing a score changes both the sigma X and N values.

New cards

adding/subtracting a constant?

The same constant will be added to or subtracted from the mean. Ex: Subtracting 2 from each score will reduce the mean by 2. Mean = 4.33 - 2 = 2.33

New cards

multiplying/dividing by a constant?

The mean will be changed in the same way. Ex: Multiplying each score by 3 will also multiply the mean by 3.
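Both constant rules can be checked quickly in R. The scores below are hypothetical, chosen so the starting mean is about 4.33:

```r
scores <- c(3, 4, 6)   # hypothetical scores; mean = 13/3, about 4.33

mean(scores)           # about 4.33
mean(scores - 2)       # about 2.33: the mean drops by 2
mean(scores * 3)       # 13: the mean is multiplied by 3
```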

New cards

mean() function (what does it do? how to use it?)

Pass a set of scores to it and it will return the mean.

New cards

Calculating weighted mean in RStudio (how?)

Weighted mean is: sum of each group’s X values / sum of each group’s N values

Each group's sigma X value is = groupmean[1] * N[1]

So, calculate the weighted mean by summing all the groups' sigma X values and dividing by the sum of the groups' N values.

*with raw scores, no need to calculate each mean: use sum() for each group's sigma X and length() for each group's N, then divide the added sums by the added lengths

*with supplied weights and means, multiply each weight by its mean and add values together.
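A sketch of both starred cases, using two hypothetical groups of raw scores (g1, g2 and the other names are assumptions for the example):

```r
# Hypothetical raw scores for two groups of different sizes
g1 <- c(5, 6, 7, 6)
g2 <- c(8, 9)

# Case 1: raw scores, so add the sums and divide by the added lengths
wm_raw <- (sum(g1) + sum(g2)) / (length(g1) + length(g2))

# Case 2: supplied weights and means, so multiply each mean by its weight
w <- c(length(g1), length(g2)) / (length(g1) + length(g2))
wm_w <- sum(w * c(mean(g1), mean(g2)))

c(wm_raw, wm_w)  # both routes give the same value
```

Either route matches the plain mean of all the scores pooled together.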

New cards

median (what is it? equivalent to what percentile? are there special symbols/notation?)

Score that divides the distribution in half. Equal to the 50th percentile. No special symbols or notations.

New cards

Calculating the median when N is odd?

List scores in order from lowest to highest, the median is the middle score in the list.

New cards

Calculating the median when N is even?

List scores in order from lowest to highest, find the two middle numbers (Ex: 4 and 5), add them together, and divide by 2. (4+5) / 2 = 4.5

New cards

Median when there are several middle scores with the same value?

Use interpolation. Looking for the 50th percentile.

New cards

Using interpolation to calculate median (how? use interpolation to find median from frequency distribution table)

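Looking for the 50th percentile, which falls between the cumulative percentages 40% and 90%.

Interval width is 1 for X (real limits 3.5 to 4.5) and 50 for % (90 - 40)

50% is located 40 points down from the top (90%). 40/50 = ⅘

⅘ (1) = 0.8, so subtract from the upper real limit: 4.5 - 0.8 = 3.7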
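The steps above can be sketched in R so that the interval containing 50% is located automatically. Only the 40% and 90% rows come from the card; the other table values are hypothetical filler:

```r
# Upper real limits and cumulative percentages (lowest interval first)
# Only the 3.5 -> 40% and 4.5 -> 90% rows match the card; the rest are made up
upper_limits <- c(1.5, 2.5, 3.5, 4.5, 5.5)
cpct         <- c(5, 20, 40, 90, 100)

# First interval whose cumulative percentage reaches 50%
i <- which(cpct >= 50)[1]

# Fraction of the percentage interval between 50% and the top
frac <- (cpct[i] - 50) / (cpct[i] - cpct[i - 1])   # 40/50

# Move the same fraction down from the upper real limit
med <- upper_limits[i] - frac * (upper_limits[i] - upper_limits[i - 1])
med  # 3.7
```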

New cards

median() function (what does it do? what method of calculating the median does it use?)

Takes a vector of raw scores. For an odd number of scores it will return the middle score; for an even number it will return the average of the two middle scores. It does not use interpolation.
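A quick check of both cases with hypothetical scores (the vector names are assumptions for the example):

```r
odd_scores  <- c(10, 3, 8, 11, 5)   # N = 5; sorted: 3 5 8 10 11
even_scores <- c(8, 1, 5, 4)        # N = 4; sorted: 1 4 5 8

median(odd_scores)    # 8, the middle score
median(even_scores)   # 4.5, the average of 4 and 5
```

The scores do not need to be sorted before they are passed to median().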

New cards

interp.median() function (what does it do? package?)

Takes a vector of raw scores and finds the interpolated median. To use it, install and load the psych package.

New cards

Mode, what is it? how to determine?

Score or category that has the greatest frequency. In a graph, the mode will appear as the tallest part of the figure.

New cards

major mode? minor mode?

If the two mode values are not identical:

Tallest peak is called the major mode

The shorter peak is called the minor mode

New cards

Unimodal distribution? bimodal? multimodal? amodal?

Distributions with one mode are unimodal

Distributions with two modes are bimodal

Distributions with more than two modes are multimodal

Distributions with several equally high points can be described as amodal (no mode)

New cards

Finding the mode in RStudio (method/steps?)

Build a frequency table of the raw data, then pass it to the sort() function with decreasing = TRUE (note the lowercase d). Then use the names() function to report the name of the first entry in the sorted table → mode <- names(SortedTableName)[1]

*assuming distribution is not multimodal
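A minimal sketch of these steps with hypothetical scores. The names sorted_tab and mode_value are assumptions; mode_value is used instead of mode here only to avoid reusing the name of R's built-in mode() function:

```r
# Hypothetical raw scores; 4 occurs most often
scores <- c(2, 3, 3, 4, 4, 4, 5)

# Frequency table sorted from most to least frequent
sorted_tab <- sort(table(scores), decreasing = TRUE)

# The first name in the sorted table is the mode (returned as text)
mode_value <- names(sorted_tab)[1]
as.numeric(mode_value)  # 4
```

If two scores tie for the highest frequency, this approach reports only one of them.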

New cards

most preferred measure in general? reasons why it's the most generally preferred?

The mean is usually the preferred measure. The mean is affected by every score in the distribution. The mean is closely related to common measures of variability (variance and standard deviation).

New cards

When to use the mode (advantages of mode? nominal scale?)

The mode is the only measure that can be used on a nominal scale (Ex: academic major). The mean and median cannot be calculated for nominal data. Advantages: easy to compute, can be used with any scale of measurement (nominal, ordinal, interval, ratio), and reporting the mode along with the mean can help indicate the shape of a distribution.

New cards

When to use the median (four situations?)

When there are a few extreme scores in the distribution
When some scores have undetermined values
When there is an open-ended distribution
When the data is measured on an ordinal scale

New cards

Extreme scores, what are they? is median easily affected by extreme scores?

Scores that are very different in value from most of the others in the distribution.

Median is not easily affected by extreme scores.

New cards

Used for reporting central tendency for skewed distributions?

The median

New cards

Undetermined values? (can median be used when there are undetermined values? can the mean?)

Incomplete or missing data values. Prevents us from computing the mean but since the median only relies on order, we can compute the median.

New cards

Open-ended distributions? (can median be used when the distribution is open-ended? can the mean?)

When there is no upper or lower limit for one of the categories (Ex: 5 or more). Cannot compute the mean, however, you can find the median.

New cards

Ordinal scale (which measure of central tendency is preferred?)

The median is preferred.

New cards

How does distribution shape relate to measures of central tendency?

The relationship between the mean, median, and mode is determined by the shape of the distribution.

New cards

symmetrical distributions, Unimodal? bimodal? “rectangular” distribution?

The right-hand side of the graph will be a mirror image of the left-hand side. The mean and median will be exactly at the center. If the distribution is unimodal, the mode will be there too.

If the distribution is bimodal, mean and median will be together in the center with modes on each side.

If the distribution is “rectangular”, there will be no mode, but the mean and the median will still be together in the center.

New cards

Positively or Negatively skewed distributions?

Positively skewed peak is on the left (mode)

Mean will be located to the right of the median.

Negatively skewed peak is on the right (mode)

Mean will be located to the left of the median.

New cards

Be able to identify distribution shape from given mean, median and mode values

Positively Skewed (Right-Skewed): Mean > Median > Mode

Negatively Skewed (Left-Skewed): Mean < Median < Mode

Symmetrical & Unimodal: Mean = Median = Mode

Symmetrical & Bimodal: Mean = Median, but two different modes

Symmetrical & Rectangular: Mean = Median, no clear mode
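The first three rules above can be written as a small R helper. The function name describe_shape is hypothetical, and the sketch covers only the cases where a single mode is supplied (the bimodal and rectangular cases need more than three numbers to identify):

```r
# Classify distribution shape from the mean, median, and a single mode
# Note: exact equality (==) is only sensible for rounded summary values
describe_shape <- function(mean_val, median_val, mode_val) {
  if (mean_val > median_val && median_val > mode_val) {
    "positively skewed"
  } else if (mean_val < median_val && median_val < mode_val) {
    "negatively skewed"
  } else if (mean_val == median_val && median_val == mode_val) {
    "symmetrical and unimodal"
  } else {
    "undetermined from these three values alone"
  }
}

describe_shape(10, 8, 6)   # "positively skewed"
describe_shape(4, 6, 7)    # "negatively skewed"
describe_shape(5, 5, 5)    # "symmetrical and unimodal"
```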