Unit 3 Representations of data

Description and Tags

Terminology and formulae for AS stats Pearson Edexcel textbook CH3.

11 Terms

1

Common definition for any outlier

Any value that is:

  • greater than Q3 + k(Q3 - Q1)

  • less than Q1 - k(Q3 - Q1)

This definition (and the value of k) is not always used; the question states which method to apply.
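
The rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the sample data, and the default k = 1.5 are assumptions rather than part of the card, and the quartiles are taken as already known instead of being calculated.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits beyond which a value is an outlier.

    k = 1.5 is an assumed default; use the value given in the question.
    """
    iqr = q3 - q1  # interquartile range, Q3 - Q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """List every value below Q1 - k(Q3 - Q1) or above Q3 + k(Q3 - Q1)."""
    lower, upper = outlier_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical data set with quartiles assumed to be Q1 = 13 and Q3 = 19.
data = [3, 12, 14, 15, 17, 18, 20, 41]
print(outlier_fences(13, 19))
print(find_outliers(data, 13, 19))
```

With these assumed quartiles the limits are 4 and 28, so 3 and 41 are flagged as outliers.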

2

What is ‘cleaning data’?

The process of removing anomalies from a data set. You must justify why each value is being removed.

3

When should you use a histogram to represent data?

When the data are grouped and continuous.

4

Frequency density equation

frequency density = frequency/class width
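
As a quick worked example of this formula, the sketch below computes the frequency density for each class of a grouped table. The class boundaries and frequencies are invented for illustration, and the function name is hypothetical.

```python
def frequency_density(frequency, lower, upper):
    """frequency density = frequency / class width."""
    return frequency / (upper - lower)


# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 5), (10, 15, 10), (15, 20, 12), (20, 40, 8)]

densities = [frequency_density(f, lo, hi) for lo, hi, f in classes]
for (lo, hi, f), d in zip(classes, densities):
    print(f"{lo}-{hi}: frequency {f}, frequency density {d}")
```

The widest class (20 to 40) has the second-highest frequency here but the lowest frequency density, which is why a histogram with unequal class widths plots frequency density and not frequency.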

5

How do you form a frequency polygon?

By joining the midpoint of the top of each bar in a histogram with equal class widths, using straight lines.
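
The points to join can be listed with a short sketch like the one below. It assumes equal class widths, so the height of each bar is taken to be its frequency. The data and the function name are hypothetical.

```python
def polygon_points(classes):
    """Return (class midpoint, frequency) for each class.

    Assumes equal class widths, so bar height is taken as the frequency.
    """
    return [((lo + hi) / 2, f) for lo, hi, f in classes]


# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 6)]

points = polygon_points(classes)
print(points)  # join these points in order with straight lines
```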

6

What two things do you comment on when comparing data?

Measure of spread, measure of location.

7

Which two pairs of measures of location and spread can you comment on when comparing data?

  • mean and standard deviation

OR

  • median and interquartile range

8

Which pair for comparison is more suitable for a set of data with extreme values?

Median and interquartile range, because neither is strongly affected by extreme values.
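
A small sketch can show why this pair is preferred. It uses Python's standard `statistics` module on two invented data sets that differ only in one extreme value. The quartile method chosen here (`method="inclusive"`) is an assumption and may not match the method taught in the textbook, so the quartile values are illustrative only.

```python
import statistics


def summarise(values):
    """Return both pairs of measures of location and spread."""
    # Quartile method is an assumption; textbook conventions may differ.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.pstdev(values),  # population standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }


# Hypothetical data: identical apart from one extreme value.
typical = summarise([2, 3, 4, 5, 6, 7, 8])
with_extreme = summarise([2, 3, 4, 5, 6, 7, 80])

print(typical)
print(with_extreme)
```

In this example the median and interquartile range are the same for both data sets, while the mean and standard deviation both increase sharply once the extreme value is included.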

9

Advantages of a box plot?

  • It helps us to see the spread of the data easily.

  • The plot is clear and easy to understand.

  • It uses the range and the median values.

  • It makes stratified data easy to compare.

10

Disadvantages of a box plot?

  • Original data is not clearly shown in the box plot.

  • Mean and mode cannot be identified using the box plot.

  • It can be easily misinterpreted.

  • If large outliers are present, the box plot is more likely to give an incorrect representation.

11

Which pair of measures of location and spread should you use when comparing box plots?

Median and interquartile range