Statistics

Description and Tags

Topic 1 and 2

58 Terms

1

What is a population

All members of the set being studied

2

What is a sample

A smaller subset of a population that is used to draw conclusions about the population

3

What does a census do

Measures every member of a population

4

What is raw data

Unprocessed information

5

Advantages of using a population

Accurate result

6

Disadvantages of using a population

Time consuming

Expensive

Hard to process that much data

If testing destroys an item, testing the whole population would destroy every item

7

Advantages of using a sample

Faster

Cheaper

Less data to process

Fewer people have to respond

8

Disadvantages of using a sample

Not as accurate

Small sub-groups may not be properly represented

9

What is simple random sampling

Each member of the population has an equal probability of being included in the sample

10

What is a sampling frame

A list in which every member of the population is individually named or numbered
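
Worked example (a minimal sketch, not part of the original cards): simple random sampling from a numbered list of the population. The population of 50, the sample size of 5 and the function name `simple_random_sample` are all made up for illustration.

```python
import random

# Hypothetical numbered list: every member of a population of 50 gets a number.
sampling_frame = list(range(1, 51))

def simple_random_sample(frame, sample_size, seed=None):
    """Pick sample_size distinct members; each member is equally likely to be chosen."""
    rng = random.Random(seed)  # the seed only makes the illustration repeatable
    return rng.sample(frame, sample_size)

sample = simple_random_sample(sampling_frame, 5, seed=1)
print(sample)
```

`random.sample` draws without replacement, so no member can appear in the sample twice.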

11

What is systematic sampling

Sample members are selected from a larger population according to a random starting point and a fixed sampling interval

12

How is a sampling interval found

By dividing the population size by the desired sample size
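
Worked example (a sketch under stated assumptions): systematic sampling with a random start and a fixed interval. The population of 100, the sample size of 10 and the function name `systematic_sample` are hypothetical, and the interval is rounded down when the division is not exact.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Take every k-th member after a random start within the first k members."""
    interval = len(frame) // sample_size   # population size / sample size, rounded down
    rng = random.Random(seed)              # the seed only makes the illustration repeatable
    start = rng.randrange(interval)        # random starting index from 0 to interval - 1
    return [frame[start + i * interval] for i in range(sample_size)]

frame = list(range(1, 101))   # hypothetical numbered population of 100
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

With 100 members and a sample of 10 the interval is 10, so every 10th member after the random start is selected.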

13

What is stratified sampling

The population is sorted into mutually exclusive groups and a proportional sample is taken from each group

14

How to find the number sampled in a group

(number in group / number in population) × overall sample size
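
Worked example (a sketch with made-up numbers): applying the formula above to a hypothetical school of 300 students in three year groups, with an overall sample of 50. The group names, sizes and the function name `stratum_sample_size` are assumptions for illustration.

```python
def stratum_sample_size(group_size, population_size, overall_sample_size):
    """(number in group / number in population) x overall sample size."""
    # Multiplying before dividing gives the same value and avoids rounding noise.
    return group_size * overall_sample_size / population_size

# Hypothetical strata: year group -> number of students.
year_groups = {"Year 9": 120, "Year 10": 90, "Year 11": 90}
population = sum(year_groups.values())
overall_sample = 50

allocations = {
    name: round(stratum_sample_size(size, population, overall_sample))
    for name, size in year_groups.items()
}
print(allocations)
```

When the formula does not give whole numbers, the rounded group sizes may need a small adjustment so that they still add up to the overall sample size.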

15

What is opportunity sampling

Taking people from the population that are available at the time and willing to take part in the survey

16

Advantages of opportunity sampling

Easy to carry out

Cheap

17

Disadvantages of opportunity sampling

Unlikely to provide a representative sample

Dependent on the researcher

18

Advantages of simple random sampling

Free of bias

Simple and cheap for small samples and populations

Each sampling unit has an equal and known chance of being selected

19

Advantages of systematic sampling

Simple

Fast

Good for large samples and populations

20

Advantages of stratified sampling

Sample accurately represents population structure

Guarantees proportional representation of groups in a population

21

Disadvantages of stratified sampling

Population has to be clearly classified into distinct groups

Selection within each group requires a sampling frame and can be time consuming, disruptive and expensive if the groups are large

22

Disadvantages of systematic sampling

Sampling frame is needed

Can introduce bias if the sampling frame is not in a random order

23

Disadvantages of simple random sampling

Sampling frame is needed

Not suitable when the population is large, as sampling then becomes time consuming, disruptive and expensive

24

What is bias

When a sample is not fair or not representative of the population

25

What will an ideal sample do

Be large enough

Represent the population

Be unbiased

26

Things to consider when reviewing data from samples

Sample size

Where the sample is taken

Time of day the sample was taken

27

What is the IQR in a box plot diagram

The width of the box (upper quartile minus lower quartile)

28

What is the range in a box plot diagram

The distance from the end of one whisker to the end of the other (highest value minus lowest value)
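
Worked example (a sketch with made-up values): reading the IQR and the range off a box plot. The five values below are hypothetical, and the whiskers are assumed to run to the lowest and highest data values.

```python
# Hypothetical five-number summary read off a box plot.
lowest, lower_quartile, median, upper_quartile, highest = 12, 20, 26, 35, 48

iqr = upper_quartile - lower_quartile   # the width of the box
data_range = highest - lowest           # whisker end to whisker end
print(iqr, data_range)
```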

29

What does cumulative mean

A running total
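
Worked example (a minimal sketch): turning a list of class frequencies into cumulative frequencies. The frequencies are invented for illustration.

```python
from itertools import accumulate

# Hypothetical frequencies for four classes of a grouped frequency table.
frequencies = [3, 5, 9, 4]

# Each cumulative frequency is the running total up to and including that class.
cumulative_frequencies = list(accumulate(frequencies))
print(cumulative_frequencies)   # [3, 8, 17, 21]
```

The last cumulative frequency always equals the total frequency.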

30

How to find the total frequency (total population) using a histogram

Add together the areas of all the bars
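
Worked example (a sketch under stated assumptions): adding up bar areas on a histogram. It assumes the vertical axis shows frequency density, so that the area of each bar (class width × frequency density) equals its frequency. The bar values are made up.

```python
# Hypothetical histogram bars as (class width, frequency density) pairs.
bars = [(10, 1.2), (5, 3.0), (5, 2.0), (20, 0.4)]

# Area of a bar = class width x frequency density = frequency for that class.
areas = [width * density for width, density in bars]
total_frequency = sum(areas)
print(round(total_frequency))
```

If the areas are only proportional to frequency rather than equal to it, the total area has to be scaled by the same constant.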

31

What is bivariate data

Data that has pairs of values for two variables

32

What do scatter diagrams represent

A visual representation of any relationship between two variables

33

What is correlation

A measure of how strongly two variables are related to each other

34

How is strong correlation represented

The points lie close to the regression line

35

How is weak correlation represented

The points do not lie close to the regression line

36

What is a regression line and what does it show

A line of best fit that models the relationship between two variables

37

What is Interpolation

Using the regression line to estimate a value within the range of the known data

38

What is Extrapolation

Making an assumption that the regression line is true for all values to find an unknown value beyond the range of known data
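
Worked example (a sketch, not the cards' own method): fitting a regression line and using it for interpolation and extrapolation. The cards do not say how the line is found; this sketch assumes the usual least-squares line. The data, `fit_line` and `is_interpolation` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x (an assumed fitting method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical bivariate data: hours of revision and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept = fit_line(hours, scores)

def predict(x):
    return intercept + slope * x

def is_interpolation(x):
    """True when x lies inside the range of the known data."""
    return min(hours) <= x <= max(hours)

for x in (3.5, 10):
    kind = "interpolation" if is_interpolation(x) else "extrapolation"
    print(x, round(predict(x), 2), kind)
```

The estimate at 3.5 hours is an interpolation and is usually reliable; the estimate at 10 hours is an extrapolation and relies on the line still holding well outside the data.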

39

Does correlation imply causation?

No. Two variables can be correlated without one causing the other

40

What is a measure of central tendency

A single value in a list of values that describes the center of the data

41

Advantages of using the mean

Every data value is used in the calculation

42

Advantages of using the mode

Can be used with data that is not numeric

43

Advantages of using the median

Not affected by extreme values, so suitable for data with outliers

44

Disadvantages of using the mean

Affected by extreme values

Only useful with numeric data

45

Disadvantages of using the mode

May be no mode

May not represent the data well

There can be more than one mode

46

Disadvantages of using the median

Can take a long time to order all the data
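
Worked example (a minimal sketch with invented data): comparing the mean, median and mode on a data set that contains one extreme value, using Python's standard `statistics` module.

```python
import statistics

# Hypothetical data set with one extreme value (40).
data = [2, 3, 3, 4, 5, 40]

mean_value = statistics.mean(data)      # pulled upwards by the extreme value
median_value = statistics.median(data)  # middle of the ordered data, barely affected
mode_value = statistics.mode(data)      # most common value

# The mode also works for data that is not numeric.
favourite_colour = statistics.mode(["red", "blue", "red", "green"])

print(mean_value, median_value, mode_value, favourite_colour)
```

Here the mean is 9.5, well above most of the data, while the median is 3.5 and the mode is 3.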

47

What is a measure of variation

A measure of the spread of data

48

What is variance

A statistic that measures how far the values in a set of data are spread from the mean: the mean of the squared distances from the mean

49

What is standard deviation

The square root of the variance
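
Worked example (a sketch with invented data): calculating the variance and standard deviation step by step. This version divides by the number of values, which treats the data as a whole population; a sample variance would divide by one less than the number of values.

```python
import math
import statistics

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = sum(data) / len(data)
squared_distances = [(x - mean_value) ** 2 for x in data]
variance = sum(squared_distances) / len(data)   # mean of the squared distances
standard_deviation = math.sqrt(variance)        # square root of the variance

# The standard library gives the same population figures.
assert abs(variance - statistics.pvariance(data)) < 1e-9
assert abs(standard_deviation - statistics.pstdev(data)) < 1e-9

print(variance, standard_deviation)
```

For this data the mean is 5, the variance is 4 and the standard deviation is 2.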

50

What are summary statistics

Information that gives a brief description of the data

51

What does dealing with errors in data involve

Detecting and correcting or removing data with errors

52

What is an outlier

A value that does not follow the pattern of the data

53

How to spot an outlier

An outlier is any value that is smaller than LQ − 1.5 × IQR or larger than UQ + 1.5 × IQR
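
Worked example (a sketch with made-up values): applying the 1.5 × IQR rule. The quartiles are taken as given, because different methods for finding quartiles can give slightly different values. The data and the function name `find_outliers` are hypothetical.

```python
def find_outliers(values, lower_quartile, upper_quartile):
    """Return the values below LQ - 1.5 x IQR or above UQ + 1.5 x IQR."""
    iqr = upper_quartile - lower_quartile
    lower_fence = lower_quartile - 1.5 * iqr
    upper_fence = upper_quartile + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical data with LQ = 20 and UQ = 30, so IQR = 10 and the fences are 5 and 45.
values = [3, 18, 22, 25, 31, 47]
outliers = find_outliers(values, lower_quartile=20, upper_quartile=30)
print(outliers)   # [3, 47]
```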

54

What does missing data do

Makes a sample less representative

55

Ways to deal with missing data

Delete the samples with missing data elements 

Impute the value of missing data 

Remove a variable

56

What does deleting missing data do

Creates a smaller sample size and may end up not representing the whole population

57

How to impute missing data

Substitute in data values from a similar sample

Use the mean from all other values for the same statistic

Use regression techniques to predict values based on the relationship between the variable and other variables
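
Worked example (a minimal sketch of one of the methods above): imputing missing values with the mean of the known values for the same statistic. Missing entries are represented by `None`; the data and the function name `impute_with_mean` are hypothetical.

```python
def impute_with_mean(values):
    """Replace each missing value (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean_value = sum(known) / len(known)
    return [mean_value if v is None else v for v in values]

# Hypothetical survey responses with two missing entries.
responses = [12, None, 15, 18, None]
completed = impute_with_mean(responses)
print(completed)
```

The known values 12, 15 and 18 have a mean of 15, so both gaps are filled with 15. Mean imputation keeps the sample size but makes the data look less spread out than it really is.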

58

When would you remove a variable

If a particular variable (for example, one survey question) has a large amount of missing data