Chapter 1 - Data collection

Last updated 8:42 PM on 6/17/26
78 Terms

1
population

whole set of items that are of interest

2
raw data

unprocessed information obtained from a population

3
census

measures every member of population

4
sample

a selection of observations taken from a subset of the population, used to find out information about the population as a whole

5
sampling units

individual units of population that are numbered to form a sampling frame

6
sampling frame

a list of all sampling units in a population

7
why might the sampling frame differ from the population

it is not always possible to keep the list up to date

8
adv of census

completely accurate results

9
disadv of census

time consuming; expensive; can't be used when the testing process destroys the item; hard to process large amounts of data

10
how does size of sample affect results

larger samples are better for large populations as they are more accurate but more resources are required

11
disadv of sampling

not as accurate as census; sample may not be large enough to give us info about small subsets

12
adv of sampling vs census

less time consuming; cheaper; less data processed

13
when is a census used

population known, small and easily accessed

14
when is a sample used

population known but large, so too time consuming and expensive to interview all

15
what are the 3 types of random sampling

simple, systematic and stratified

16
describe simple random sampling

each item in the sampling frame is allocated a unique number and a random number generator picks the numbers to include, so every sample of size n has an equal chance of being selected
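A minimal sketch of simple random sampling, assuming the sampling frame is just a list of numbered units. The frame of 100 units, the sample size of 10 and the function name are made up for illustration.

```python
import random


def simple_random_sample(sampling_frame, n, seed=None):
    """Pick n distinct units so that every sample of size n is equally likely."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(sampling_frame, n)


# Hypothetical sampling frame: units numbered 1 to 100
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Sampling is without replacement, so no unit can be picked twice.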

17
adv of simple random sampling

free of bias; easy and cheap to implement

18
disadv of simple random sampling

time consuming if large population; sampling frame needed

19
describe systematic sampling

the first element is selected at random and then the required elements are chosen at regular intervals from an ordered list
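A rough sketch of systematic sampling, assuming an ordered list of 100 units and a required sample of 20, so the interval is 100 / 20 = 5. The function name and the numbers are illustrative only.

```python
import random


def systematic_sample(ordered_list, n, seed=None):
    """Random start within the first interval, then every k-th element."""
    if n < 1 or n > len(ordered_list):
        raise ValueError("sample size must be between 1 and the list length")
    k = len(ordered_list) // n      # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)        # random index among the first k elements
    return [ordered_list[start + i * k] for i in range(n)]


# Hypothetical ordered sampling frame: units numbered 1 to 100
frame = list(range(1, 101))
sample = systematic_sample(frame, 20, seed=3)
print(sample)
```

Only the starting point is random. After that the selection is fixed, which is why a pattern in the list can introduce bias.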

20
adv of systematic random sampling

simple; quick; suitable for large populations

21
disadv of systematic sampling

sampling frame needed; can introduce bias if sampling frame is not random

22
describe stratified sampling

the population is divided into mutually exclusive groups (strata) and a random sample is taken from each. Sample sizes are in strict proportion to the number in each stratum in the population

23
how to calculate the number sampled from each stratum

(number in stratum / number in population) x overall sample size
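A short worked example of the stratum formula, assuming a made-up population of 120 Year 12 and 80 Year 13 students and an overall sample of 50. The names and figures are hypothetical.

```python
def stratum_sample_sizes(stratum_sizes, sample_size):
    """Number to sample from each stratum, in proportion to its size."""
    population = sum(stratum_sizes.values())
    return {
        name: size * sample_size / population
        for name, size in stratum_sizes.items()
    }


# Hypothetical strata
sizes = stratum_sample_sizes({"Year 12": 120, "Year 13": 80}, 50)
print(sizes)  # {'Year 12': 30.0, 'Year 13': 20.0}
```

If the results are not whole numbers they need rounding, and the rounded values should be checked to make sure they still add up to the overall sample size.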

24
adv of stratified sampling

accurately reflects population structure and guarantees proportional representation of all groups; random sampling within strata reduces bias

25
disadv of stratified sampling

clearly classified strata needed; selection within each stratum has the same disadv as simple random sampling

26
2 types of non-random sampling

quota sampling and opportunity sampling

27
describe quota sampling

interviewer divides population into groups based on characteristics and selects a fixed number of individuals from each group to make up the sample

28
adv of quota sampling

allows a small sample to still be representative of the whole population; quick, easy and cheap; easy comparison between groups; no sampling frame needed

29
disadv of quota sampling

can introduce bias; population must be divided into groups, which may be costly and inaccurate; a more varied population means more groups, which adds time and expense; non-responses are not recorded

30
describe opportunity sampling

taking a sample from those who are available at the time of the study and who fit the criteria looked for

31
adv of opportunity sampling

easy to carry out and inexpensive

32
disadv of opportunity sampling

not representative; dependent on the researcher

33
quantitative data

data associated with numerical observations

34
qualitative

data associated with non-numerical observations

35
continuous data

can take any value within a given range

36
discrete data

can only take specific values within a given range

37
months covered by the large data set

May to October

38
UK locations named south to north

Camborne, Hurn, Heathrow, Leeming, Leuchars

39
which places are coastal + windy

Camborne and Leuchars

40
trend for max hours of sunshine

north has a higher max

41
range for mean temp

5-24

42
range for daily mean temp

0-20

43
what does tr mean

trace: the rainfall r satisfies 0 < r < 0.05 mm

44
range for daily total sunshine

0-14 hours

45
what is cloud cover measured in

oktas

46
range for cloud cover

0-8 (integers)

47
humidity range

70-100% (integers)

48
what is daily mean visibility measured in

decametres (1 Dm = 10 m)

49
range for daily mean visibility

200-4000 (rounded to nearest 100)

50
daily mean pressure units

hPa

51
lowest and highest daily mean pressure

900 to 1040 hPa (integers)

52
units for daily mean windspeed

knots

53
range for daily mean windspeed

3 - 10 kn (integers)

54
windspeed (Beaufort conversion)

light to moderate; most days are light

55
max gust

8-50 knots (integers)

56
wind direction

10-360 degrees (multiples of 10; the direction the wind is blowing from, not to)

57
features of Jacksonville, Florida

hot summers

58
features of Perth

flipped seasons (southern hemisphere)

59
features of Beijing

hotter and wetter summers, colder winters

60
equation for linear interpolation

lower class boundary + (class width / frequency of class) x (position of required value - cumulative frequency up to class)
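A small sketch of the interpolation formula. The class 20-30, its frequency of 10, the cumulative frequency of 15 before the class and the 20th value (the median of 40 values) are all invented for the example.

```python
def interpolate(lower_boundary, class_width, class_frequency,
                cum_freq_before, position):
    """Estimate the value at a given position inside a class,
    assuming the data is spread evenly across that class."""
    return lower_boundary + (class_width / class_frequency) * (position - cum_freq_before)


# Hypothetical grouped data: median is the 20th of 40 values,
# which falls in the class 20-30 (frequency 10, 15 values below it)
estimate = interpolate(lower_boundary=20, class_width=10,
                       class_frequency=10, cum_freq_before=15, position=20)
print(estimate)  # 25.0
```

The 20th value is 5 places into a class of 10 values, so the estimate sits halfway across the class.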

61
when should you use median and IQR instead of mean and standard deviation

When there are outliers as outliers affect the mean and standard deviation
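A quick illustration of why outliers matter, using made-up numbers: adding a single extreme value moves the mean a long way but barely moves the median.

```python
import statistics

data = [4, 5, 5, 6, 7]          # hypothetical data
with_outlier = data + [60]      # same data plus one extreme value

mean_before = statistics.mean(data)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(data)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)      # mean jumps from 5.4 to 14.5
print(median_before, median_after)  # median only moves from 5 to 5.5
```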

62
explain how Charlie would use quota sampling to obtain a sample of 40 workers

ask 20 men and 20 women how long their journey was

63
effect of a translation of the data on standard deviation

no effect: standard deviation is a measure of spread, so adding or subtracting a constant does not change it
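A small check with invented data: adding 10 to every value shifts the mean by 10 but leaves the standard deviation alone.

```python
import statistics

data = [3, 7, 8, 12]                 # hypothetical data
shifted = [x + 10 for x in data]     # translate every point by +10

sd_before = statistics.pstdev(data)      # population standard deviation
sd_after = statistics.pstdev(shifted)
mean_shift = statistics.mean(shifted) - statistics.mean(data)

print(sd_before, sd_after)  # the two values match
print(mean_shift)           # 10.0
```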

64
how to clean data before standard deviation and mean calculations

replace tr with a numerical value. Trace values are between 0 and 0.05
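One way this cleaning step might look, assuming the rainfall column has been read in as text. The replacement value 0.025 (the midpoint of 0 to 0.05) is an assumption chosen for the example; 0 would be another reasonable choice. The sample values and the function name are made up.

```python
def clean_rainfall(values, trace_value=0.025):
    """Replace 'tr' entries with a number so that the mean and
    standard deviation can be calculated.
    trace_value is an assumed stand-in for a trace amount."""
    cleaned = []
    for v in values:
        if isinstance(v, str) and v.strip().lower() == "tr":
            cleaned.append(trace_value)
        else:
            cleaned.append(float(v))
    return cleaned


# Hypothetical daily total rainfall readings
raw = ["0", "tr", "1.2", "tr", "3.4"]
cleaned = clean_rainfall(raw)
print(cleaned)  # [0.0, 0.025, 1.2, 0.025, 3.4]
```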

65
Explain why daily total rainfall data from the large data set would not be suitable to find the annual mean daily total rainfall

data only covers May to October so it is not representative of the whole year. Winter months are missing and we would expect more rain then, so an estimate from the large data set would be an underestimate

66
explain why a binomial distribution B(14,0.27) to model the number of days without rain for a 14 day summer event would not be suitable

p=0.27 is unlikely to be constant and the probability in a binomial distribution should be constant

67
median of daily mean pressure

around 1000 hPa

68
range of daily mean pressure

50 hPa

69
state the assumption involved with using the midpoint to calculate an estimate of a mean from a grouped frequency table

assumes that data is distributed uniformly throughout the class

70
why does it not matter about using the midpoint with the large data set to calculate an estimate for the mean total daily rainfall

most of the data in the first class is 0

71
why is the median appropriate for this data set

it is not affected by an extreme outlier

72
Sara is investigating the variation in daily maximum gust in Camborne in June and July. She selected the first value randomly and then selected every third value after that. Explain why this process may not generate a sample size of 20

in the LDS, some days have gaps because data was not recorded

73
A random sample of 20 customers is taken. How does a scout group affect the validity of the model

The model requires the 20 customers to be a random sample and the scout group may invalidate this, so a binomial distribution would not be valid

74
suggest two improvements to a pulley model

use a more accurate value of g; include the dimensions of the ball, so the distance it falls changes

75
suggest two limitations of the pulley model

the pulley may not be smooth, air resistance

76
describe something significant about rainfall in Perth

lots of zeros for rainfall

77
in the refined model, the effect of air resistance is included. How would the new value for the speed at which the ball hits the ground vary

the new value would be lower

78
for overseas locations, what is the only data recorded

daily mean temp, daily total rainfall, daily mean pressure, daily mean windspeed