population
whole set of items that are of interest
raw data
info from population
census
measures every member of population
sample
a selection of observations taken from a subset of the population, used to find out information about the population as a whole
sampling units
individual units of population that are numbered to form a sampling frame
sampling frame
a list of all sampling units in a population
why might the sampling frame differ from the population
not always possible to keep this list up to date
adv of census
completely accurate results
disadv of census
time consuming; expensive; can’t be used when the testing process destroys the item; hard to process large amounts of data
how does size of sample affect results
larger samples are better for large populations as they are more accurate but more resources are required
disadv of sampling
not as accurate as census; sample may not be large enough to give us info about small subsets
adv of sampling vs census
less time consuming; cheaper; less data processed
when is a census used
population known, small and easily accessed
when is sample used
population known but large, so too time consuming and expensive to interview all
what are the 3 types of random sampling
simple, systematic and stratified
describe simple random sampling
each sampling unit in the sampling frame is allocated a number, then n units are picked using a random number generator, so every sample of size n has an equal chance of being selected
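A minimal sketch of the idea on this card, assuming a hypothetical sampling frame of 50 numbered units. The function name `simple_random_sample` and the `seed` parameter are illustrative only; `random.sample` draws without replacement, so every sample of size n is equally likely.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Pick n distinct sampling units; every sample of size n is equally likely."""
    rng = random.Random(seed)  # seed is optional, only for repeatable demos
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: sampling units numbered 1 to 50.
frame = list(range(1, 51))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```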
adv of simple random sampling
free of bias; easy and cheap to implement
disadv of simple random sampling
time consuming if large population; sampling frame needed
describe systematic sampling
the first element is randomly selected and then the required elements are chosen at regular intervals from an ordered list
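One way this could look in code, assuming a hypothetical ordered list of 100 units and a sample of 10. The interval k is taken as list length divided by sample size (rounded down), and the random start is chosen among the first k elements; both choices are assumptions for illustration.

```python
import random

def systematic_sample(ordered_list, n, seed=None):
    """Random start within the first interval, then every k-th element."""
    k = len(ordered_list) // n      # sampling interval (assumes n <= len(ordered_list))
    rng = random.Random(seed)
    start = rng.randrange(k)        # random index from 0 to k - 1
    return [ordered_list[start + i * k] for i in range(n)]

# Hypothetical ordered sampling frame: units numbered 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=0)
print(sample)
```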
adv of systematic random sampling
simple; quick; suitable for large populations
disadv of systematic sampling
sampling frame needed; can introduce bias if sampling frame is not random
describe stratified sampling
random samples are taken from mutually exclusive groups (strata) of the population. Sample sizes within strata are in strict proportion to the size of each stratum in the population
how to calculate the number sampled from each stratum
(stratum size / population size) x overall sample size
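A short worked example of the formula, using made-up strata (120 and 80 people) and an overall sample of 40. The names and numbers are hypothetical. In general the rounded values may not add up exactly to the overall sample size, so they should be checked by hand.

```python
def stratum_sample_size(stratum_size, population_size, overall_sample_size):
    """(stratum size / population size) x overall sample size."""
    return stratum_size * overall_sample_size / population_size

# Hypothetical strata sizes.
strata = {"Year 12": 120, "Year 13": 80}
population = sum(strata.values())  # 200

allocation = {
    name: round(stratum_sample_size(size, population, 40))
    for name, size in strata.items()
}
print(allocation)  # {'Year 12': 24, 'Year 13': 16}
```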
adv of stratified sampling
accurately reflects population structure and guarantees proportional representation of all groups; random sampling within strata reduces bias
disadv of stratified sampling
clearly classified strata needed; selection within each stratum has the same disadvantages as simple random sampling
2 types of non random sampling
quota sampling and opportunity sampling
describe quota sampling
interviewer divides population into groups based on characteristics and selects a fixed number of individuals from each group to make up the sample
adv of quota sampling
small sample is representative of the whole population; quick easy and cheap; easy comparison between groups; no sampling frame needed
disadv of quota sampling
can introduce bias; population must be divided into groups which may be costly and inaccurate; varied population means more groups which adds time and expense; non-responses not recorded
describe opportunity sampling
taking a sample from those who are available at the time of the study and who fit the criteria being looked for
adv of opportunity sampling
easy to carry out and inexpensive
disadv of opportunity sampling
not representative; dependent on researcher
quantitative data
data associated with numerical observations
qualitative
data associated with non-numerical observations
continuous data
can take any value within a given range
discrete data
can only take specific values within a given range
months of large data set
May to oct
UK locations named south to north
Camborne, Hurn, Heathrow, Leeming, Leuchars
which places are coastal + windy
Camborne and Leuchars
trend for max number of sunshine hours
north has a higher max
range for mean temp
5-24
range for daily mean temp
0-20
what does tr mean
trace: the rainfall amount r is between 0 and 0.05 mm (0 < r < 0.05)
range for daily total sunshine
0-14 hours
what is cloud cover measured in
oktas
range for cloud cover
0-8 (integers)
humidity range
70-100% - integers
what is daily mean visibility measured in
decametres (1 Dm = 10 m)
range for daily mean visibility
200-4000 (rounded to nearest 100)
daily mean pressure units
hPa
lowest and highest daily mean pressure
900 to 1040 hPa (integers)
units for daily mean windspeed
knots
range for daily mean windspeed
3 - 10 kn (integers)
windspeed (beaufort conversion)
light to moderate; most days are light
max gust
8-50 knots (integers)
wind direction
10 - 360 degrees (multiples of 10; where wind is blowing from not to)
features of Jacksonville, Florida
hot summers
Perth features
flipped seasons
features of Beijing
hotter but wetter summers, colder winters
equation for linear interpolation
lower class boundary + ((position of value - cumulative frequency before the class) / frequency of class) x class width
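A quick sketch of the interpolation formula applied to a median, using a hypothetical grouped table (classes 0-10, 10-20, 20-30 with frequencies 5, 10, 5). The function and parameter names are illustrative only.

```python
def interpolate(lower_boundary, class_width, class_frequency, cum_freq_before, position):
    """Linear interpolation inside a class of a grouped frequency table."""
    return lower_boundary + (position - cum_freq_before) / class_frequency * class_width

# Hypothetical table, n = 20, so the median is at position n/2 = 10.
# Position 10 falls in the class 10-20 (frequency 10, cumulative frequency before it is 5).
median = interpolate(
    lower_boundary=10,
    class_width=10,
    class_frequency=10,
    cum_freq_before=5,
    position=10,
)
print(median)  # 15.0
```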
when should you use median and IQR instead of mean and standard deviation
when there are outliers, as outliers affect the mean and standard deviation but have little effect on the median and IQR
explain how Charlie would use quota sampling to obtain a sample of 40 workers
ask 20 men and 20 women how long their journey was
effect on standard deviation by a translation of points
no effect: adding or subtracting a constant shifts every value by the same amount, so the spread (and therefore the standard deviation) is unchanged
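This can be checked numerically. The sketch below uses made-up values and adds 10 to each one: the mean moves by 10 but the standard deviation stays the same.

```python
import math
import statistics

# Hypothetical data values.
data = [2, 4, 4, 6, 9]
shifted = [x + 10 for x in data]  # translate every point by +10

sd_original = statistics.pstdev(data)
sd_shifted = statistics.pstdev(shifted)

print(statistics.mean(data), statistics.mean(shifted))  # mean shifts by 10
print(math.isclose(sd_original, sd_shifted))            # spread is unchanged
```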
how to clean data before standard deviation and mean calculations
replace tr with a numerical value before calculating; trace values are between 0 and 0.05
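A possible way to do this cleaning step in code. The raw values are hypothetical, and the replacement value 0.025 (the midpoint of 0 and 0.05) is an assumption; any value between 0 and 0.05, or 0 itself, could be argued for.

```python
import statistics

def clean_rainfall(raw_values, trace_value=0.025):
    """Replace 'tr' entries with a number so mean and SD can be calculated.

    trace_value is an assumed stand-in for a trace amount (0 < r < 0.05).
    """
    cleaned = []
    for value in raw_values:
        if value == "tr":
            cleaned.append(trace_value)
        else:
            cleaned.append(float(value))
    return cleaned

# Hypothetical daily total rainfall readings as they might appear in a spreadsheet.
raw = ["0.0", "tr", "1.2", "tr", "3.4"]
cleaned = clean_rainfall(raw)
print(cleaned)
print(statistics.mean(cleaned), statistics.pstdev(cleaned))
```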
Explain why daily total rainfall data from large data set would not be suitable to find annual mean daily total rainfall
data only covers May to October so it is not representative of the whole year. Winter months are missing and we would expect more rain in this season, so an estimate from the large data set would be an underestimate
explain why a binomial distribution B(14,0.27) to model the number of days without rain for a 14 day summer event would not be suitable
p=0.27 is unlikely to be constant and the probability in a binomial distribution should be constant
median of daily mean pressure
around 1000 hPa
range of daily mean pressure
50 hPa
state the assumption involved with using the midpoint to calculate an estimate of a mean from a grouped frequency table
assumes that data is distributed uniformly throughout the class
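To see where the midpoint comes in, here is a small sketch of the estimate using a hypothetical grouped frequency table. Each class is represented by its midpoint, which is only reasonable if values are spread evenly across the class.

```python
def estimated_mean(classes):
    """Estimate the mean from (lower, upper, frequency) classes using midpoints."""
    total_frequency = sum(freq for _, _, freq in classes)
    total_fx = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    return total_fx / total_frequency

# Hypothetical table: classes 0-10, 10-20, 20-30 with frequencies 5, 10, 5.
table = [(0, 10, 5), (10, 20, 10), (20, 30, 5)]
mean_estimate = estimated_mean(table)
print(mean_estimate)  # (5*5 + 15*10 + 25*5) / 20 = 15.0
```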
why does it not matter about using the midpoint with the large data set to calculate an estimate for the mean total daily rainfall
most of the data in the first class is 0
why is the median appropriate for this data set
it is not affected by an extreme outlier
Sara is investigating the variation in daily maximum gust in Camborne in June and July. She selected the first value randomly and then selected every third value after that. Explain why this process may not generate a sample size of 20
in the LDS, some days have gaps because data was not recorded
A random sample of 20 customers is taken. How does a scout group affect the validity of the model
The model requires the 20 customers to be chosen at random; a scout group may invalidate this, so a binomial distribution would not be valid
suggest two improvements to a pulley model
include a more accurate value of g; include the dimensions of the ball, so the distance it falls changes
suggest two limitations of the pulley model
the pulley may not be smooth, air resistance
describe something significant about rainfall in Perth
lots of zeros for rainfall
in the refined model, the effect of air resistance is included. How would the new value for the speed at which the ball hits the ground vary
the new value would be lower
for overseas locations, what is the only data recorded
daily mean temp, daily total rainfall, daily mean pressure, daily mean windspeed