Data collection (+ Large Data Set facts)

49 Terms

1

What is a population?

The entire set of items (sampling units) that are of interest

2

What is a census?

An observation or measurement from every member of a population

3

What are the advantages and disadvantages of taking a census?

ADV: Should give a completely accurate result.
DISADV: 1. Time-consuming for large populations, 2. Expensive for large populations, 3. Cannot be used when the testing process destroys the item (e.g. testing battery life), 4. Difficult to process a large quantity of data

4

What is a sample?

A selection of observations or measurements from a subset of the members of a population

5

What are the advantages and disadvantages of using a sample?

ADV: 1. Less time-consuming than a census, 2. Cheaper than a census, 3. Easier to process all of the data, 4. Practical for tests where the items are destroyed
DISADV: 1. Doesn't measure every member of the population, so results may be less accurate and not fully representative, 2. Sample size might not be large enough to represent the entire population

6

What does the size of a sample depend on?

The required accuracy and available resources

7

Generally, what creates a more accurate sample?

The larger the sample, the more accurate it is, but a larger sample needs greater resources

8

If the population is varied, what do you need?

You need a larger sample than if the population were uniform

9

What can different samples lead to?

Different conclusions due to the natural variation in a population

10

What can the size of a sample affect?

The validity of any conclusions drawn

11

What is a sampling unit?

Individual units of a population (e.g. one single avocado)

12

What is a sampling frame?

A named or numbered list of each sampling unit in a given population

13

What does random sampling help to remove?

Bias

14

What are the three methods of random sampling?

  1. Simple random sampling
  2. Systematic sampling
  3. Stratified sampling
15

What is a simple random sample and how is one taken?

Items from the sampling frame are selected at random, so each sampling unit has an equal chance of being selected. A random number generator is used alongside a sampling frame.

16

What is a simple random sample of size n?

One where every sample of size n has an equal chance of being selected
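A minimal sketch of taking a simple random sample in Python. The sampling frame of numbered avocados and the helper name `simple_random_sample` are made up for illustration; `random.Random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Pick n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed only makes the draw repeatable
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: a numbered list of 50 avocados.
frame = [f"avocado {i}" for i in range(1, 51)]
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```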

17

What are the advantages (2) and disadvantages (3) of simple random sampling?

ADV: 1. Bias free, 2. Simple and easy to do
DISADV: 1. Sampling frame required, 2. Time-consuming for large samples, 3. May not reflect the proportions of different groups within the population

18

What is systematic sampling and how is it carried out?

All items in the sampling frame are ordered, and items are selected from this ordered list at regular intervals. Every kth unit is chosen, starting from a randomly chosen position between 1 and k.
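As a rough illustration, systematic sampling can be sketched as follows. The function name `systematic_sample` and the frame of 100 numbered units are assumptions; k is taken as the frame size divided by the sample size, rounded down.

```python
import random

def systematic_sample(ordered_frame, sample_size, seed=None):
    """Take every kth unit, starting at a random position between 1 and k."""
    k = len(ordered_frame) // sample_size   # sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, k)               # 1-based starting position
    # Slice from the starting unit in steps of k, keeping sample_size units.
    return ordered_frame[start - 1::k][:sample_size]

# Hypothetical ordered sampling frame of 100 numbered units.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```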

19

What are the advantages (3) and disadvantages (2) of systematic sampling?

ADV: 1. If sampling frame is randomly ordered, there should be no bias, 2. Quick to use, 3. Suitable for large sample sizes
DISADV: 1. Sampling frame required, 2. Non-randomly ordered data may introduce bias

20

What is stratified sampling?

  • The population is first divided into mutually exclusive strata (groups/categories) and the number of members in each category is noted.
  • The sample is then made up of a proportionally representative number of members to reflect the population.
21

What are the advantages (3) and disadvantages (3) of stratified sampling?

ADV: 1. Represents population structure, 2. Guarantees representation of all groups in population, 3. Can observe relationships between subgroups
DISADV: 1. Sampling frame needed, 2. Need clear strata/categories/groups in the population, 3. Time-consuming to split a large population into categories

22

How do you calculate the number sampled in a stratum?

(Number in stratum ÷ number in population) × overall sample size
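A short worked example of this formula in Python. The year-group counts and the helper name `stratum_sample_sizes` are invented for illustration. Results are rounded to whole numbers, so with awkward proportions the rounded sizes may not add up exactly to the overall sample size.

```python
def stratum_sample_sizes(stratum_counts, sample_size):
    """Apply (number in stratum / number in population) x sample size."""
    population = sum(stratum_counts.values())
    return {
        name: round(count * sample_size / population)
        for name, count in stratum_counts.items()
    }

# Hypothetical population of 200 students split into two strata.
counts = {"Year 12": 120, "Year 13": 80}
sizes = stratum_sample_sizes(counts, 50)
print(sizes)  # {'Year 12': 30, 'Year 13': 20}
```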

23

What are two methods of selecting numbers for random sampling?

  1. Generating random numbers using a calculator, computer or random number table.
  2. Lottery sampling
24

What happens in lottery sampling?

The members of the sampling frame could be written on tickets and placed into a 'hat'. The required number of tickets would then be randomly pulled out

25

What are two types of non-random sampling? What is their advantage?

  1. Quota sampling
  2. Opportunity sampling
Advantage: no sampling frame is required for either method.
26

What is quota sampling?

  1. Researcher splits population into groups, and the approximate size of each group is noted
  2. A quota to be filled is noted down for each group (this may or may not be representative of the population).
  3. The quotas are then filled using opportunity sampling. If a person belongs to a group whose quota has already been filled, they are ignored and the researcher moves on.
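The steps above can be sketched roughly in Python. The list of passers-by, the group labels, and the function name `fill_quotas` are all hypothetical; the order of the list stands in for whoever happens to be available.

```python
def fill_quotas(people, quotas):
    """Fill each group's quota in the order people are encountered."""
    selected = {group: [] for group in quotas}
    for name, group in people:
        # Accept the person only if their group's quota is not yet full.
        if group in selected and len(selected[group]) < quotas[group]:
            selected[group].append(name)
        # Stop as soon as every quota has been filled.
        if all(len(selected[g]) == quotas[g] for g in quotas):
            break
    return selected

# Hypothetical people met by the researcher, in order.
passers_by = [("Ana", "adult"), ("Ben", "child"), ("Cy", "adult"),
              ("Di", "adult"), ("Ed", "child"), ("Flo", "child")]
result = fill_quotas(passers_by, {"adult": 2, "child": 2})
print(result)  # {'adult': ['Ana', 'Cy'], 'child': ['Ben', 'Ed']}
```

Here Di is ignored because the adult quota is already full, and Flo is never approached because both quotas are filled once Ed is selected.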
27

What are the advantages and disadvantages of quota sampling?

ADV: 1. No sampling frame needed, 2. Allows a small sample to still be representative of the population, 3. Easy, cheap
DISADV: 1. Non-random sampling can introduce bias, 2. Population must be divided into groups which can be difficult or costly or inaccurate, 3. Not possible to find sampling errors

28

How do you determine a suitable quota?

  • Use the electoral register to determine the size of each group as a proportion of the whole population
  • Assign the quotas in the same proportions within the whole sample
29

What is opportunity sampling?

Consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for

30

What are the advantages and disadvantages of opportunity sampling?

ADV: 1. Easy and cheap to carry out, 2. No sampling frame needed
DISADV: 1. Unlikely to provide a representative sample, 2. Results can vary depending on the individual researcher
31

What are quantitative variables/data?

Numerical data

32

What are qualitative variables/data?

Non-numerical data

33

What is a continuous variable?

A variable that can take any value in a given range

34

What is a discrete variable?

A variable that can take only specific values in a given range

35

What do class boundaries tell you?

The maximum and minimum values that belong in each class

36

What does the midpoint tell you?

The average of the class boundaries

37

What does the class width tell you?

The difference between the upper and lower class boundaries
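A small worked example of midpoint and class width, assuming a made-up class with boundaries 10 and 15. The helper name `class_summary` is illustrative only.

```python
def class_summary(lower_boundary, upper_boundary):
    """Return the midpoint and width of a class from its boundaries."""
    return {
        "midpoint": (lower_boundary + upper_boundary) / 2,  # average of boundaries
        "width": upper_boundary - lower_boundary,           # upper minus lower
    }

# Hypothetical class with boundaries 10 and 15.
summary = class_summary(10, 15)
print(summary)  # {'midpoint': 12.5, 'width': 5}
```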

38

What is the large data set?

Weather data samples provided by the Met Office for 5 UK weather stations and 3 overseas stations, over 2 set periods of time: May to October 1987 and May to October 2015. The UK stations are Leuchars, Leeming, Heathrow, Hurn and Camborne; the overseas stations are Jacksonville, Perth and Beijing

39

What is the daily mean temperature?

The average of the hourly temperature readings during a 24-hour period

40

What is the daily total rainfall?

The total rainfall for the day, including solid precipitation such as snow and hail, which is melted before being measured. Amounts less than 0.05 mm are recorded as 'tr' or 'trace'

41

What is daily total sunshine?

Recorded to the nearest tenth of an hour

42

What is daily mean wind direction and wind speed?

Wind speed is measured in knots, averaged over 24 hours from midnight to midnight. Mean wind directions are given as bearings and as cardinal directions. The data for mean wind speed is also categorised according to the Beaufort scale

43

What is daily maximum gust?

The highest instantaneous wind speed recorded, measured in knots

44

What is daily maximum relative humidity?

The percentage of air saturation with water vapour. A continuous variable that can take any value from 0 to 100.

45

What is daily mean cloud cover?

Measured in oktas or eighths of the sky covered by cloud. Cannot be higher than 8.

46

What is daily mean visibility?

Measured in decametres (Dm). This is the greatest horizontal distance at which an object can be seen in daylight

47

What is daily mean pressure?

Measured in hectopascals (hPa)

48

Why might a median be less than a mean?

  • The distribution is positively skewed
  • A few very large values (e.g. a few large distances) pull the mean up, while the median is barely affected
49

Why would you use the median and interquartile range rather than the mean and standard deviation?

When the data is skewed, because the mean and standard deviation are affected by extreme values whereas the median and interquartile range are not
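A quick illustration of how one extreme value pulls the mean above the median. The distances are invented for the example; the calculation uses Python's standard `statistics` module.

```python
import statistics

# Hypothetical distances (in miles) with one extreme value.
distances = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(distances)      # 63 / 7 = 9
median = statistics.median(distances)  # middle value of the sorted list = 4

# Removing the extreme value changes the mean a lot, the median only slightly.
trimmed = distances[:-1]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)

print(mean, median)
print(trimmed_mean, trimmed_median)
```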