BSNS112 Master

101 Terms

1
New cards

Quantitative Data

Discrete and Continuous

2
New cards

Discrete Data

Can only take specific, countable values (usually whole numbers), such as the number of students in a class

3
New cards

Continuous Data

Can take any value within a range (measured rather than counted), such as age, height, time

4
New cards

Qualitative Data

Categorical data, which is either ordinal or nominal

5
New cards

Ordinal Data

Places categories in order and conveys a ranking, such as clothing sizes (small, medium, large)

6
New cards

Nominal Data

Does not convey ranking such as ethnicity, gender

7
New cards

What type of data is the number of cars a family owns?

Discrete

8
New cards

What type of data is the type of accommodation (such as budget, tourist, superior)?

Ordinal - conveys a ranking

9
New cards

What type of data is favourite fruit preference at the market?

Nominal, conveys no ranking

10
New cards

What type of data is time spent at the market?

Continuous - time is measured and can take any value within a range

11
New cards

Weekly household spending is divided into these groups: less than $50, $50-$100, $150-$200. What type of variable is this?

Categorical & Ordinal (defines categories and placed in order to convey a ranking)

12
New cards

Cross tabulation

Compares categorical with categorical

13
New cards

scatter plot

Compares numerical with numerical

14
New cards

frequency table

analyses 1 categorical variable. E.g. the fave stall of people at the market

15
New cards

Stacked / clustered bar chart

compares categorical with categorical e.g. proportion of M/F choosing fave stall

16
New cards

Relative frequency histogram

compares categorical with numerical (e.g. market spend of various occupational groups)

17
New cards

If the 2 variables are "favourite stall" and "whether visitors are regular or not", a cross tabulation should be used because

both variables are categorical and define a particular category
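A frequency table and a cross tabulation can both be built by counting. The sketch below uses made-up survey responses (the stall names and visitor types are hypothetical) and Python's `collections.Counter`; it is only one way to tally the data.

```python
from collections import Counter

# Hypothetical survey responses: (favourite stall, visitor type).
responses = [
    ("bakery", "regular"),
    ("bakery", "regular"),
    ("bakery", "occasional"),
    ("coffee", "regular"),
    ("coffee", "occasional"),
    ("coffee", "occasional"),
]

# Cross tabulation: counts for each combination of the 2 categorical variables.
crosstab = Counter(responses)

# Frequency table: counts for 1 categorical variable on its own.
stall_frequency = Counter(stall for stall, _ in responses)

for (stall, visitor), count in sorted(crosstab.items()):
    print(f"{stall:8} {visitor:12} {count}")
```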

18
New cards

Mean

simple average

19
New cards

median

middle value when the data are ranked in order

20
New cards

mode

most frequent

21
New cards

trimmed mean

mean calculated after removing the most extreme 5% of values

22
New cards

Range

maximum - minimum

23
New cards

interquartile range

75th percentile minus 25th percentile

24
New cards

variance

represents spread of data around the mean. Standard deviation squared

25
New cards

standard deviation

square root of the variance; a higher value means the data are more spread out

26
New cards

co-efficient of variation

standard deviation divided by the mean; compares variability between groups with different magnitudes
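The summary measures on the preceding cards can be computed with Python's standard `statistics` module. This is a minimal sketch on a made-up sample (the spending figures are hypothetical); note that `statistics.variance` and `statistics.stdev` use the sample formulas that divide by n - 1.

```python
import statistics

# Hypothetical sample: dollars spent at the market by 9 visitors.
spend = [12, 15, 15, 18, 20, 22, 25, 30, 95]

mean = statistics.mean(spend)            # simple average
median = statistics.median(spend)        # middle value of the ranked data
mode = statistics.mode(spend)            # most frequent value
data_range = max(spend) - min(spend)     # maximum - minimum
variance = statistics.variance(spend)    # sample variance (divides by n - 1)
std_dev = statistics.stdev(spend)        # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation

print(mean, median, mode, data_range)
print(round(variance, 2), round(std_dev, 2), round(cv, 2))
```

The single large value (95) pulls the mean above the median, which is why the median is often preferred for skewed data.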

27
New cards

skewness

positive = skewed right, negative = skewed left

28
New cards

significantly skewed

the skewness statistic is more than twice its standard error

29
New cards

30
New cards

mode

median

31
New cards

Kurtosis

measures the extent to which observations cluster around the central point

32
New cards

What is it called when the kurtosis statistic is zero?

normal distribution

33
New cards

data clusters close to centre: positive or negative kurtosis?

positive

34
New cards

data clusters further from centre: positive or negative kurtosis?

negative
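Skewness and kurtosis can be illustrated with the simple moment-based formulas. The sketch below is an assumption about which formula to use: statistical packages often apply small-sample adjustments, so their output may differ slightly. The function names and data are hypothetical. Kurtosis is reported as excess kurtosis, so a normal distribution scores about zero.

```python
import statistics

def skewness(values):
    """Moment-based skewness: average cubed deviation / sd cubed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    n = len(values)
    return sum((x - mean) ** 3 for x in values) / (n * sd ** 3)

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution gives about 0."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    return sum((x - mean) ** 4 for x in values) / (n * sd ** 4) - 3

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric))        # symmetric data: skewness of 0
print(skewness(right_skewed))     # long right tail: positive skewness
print(excess_kurtosis(symmetric)) # flat, spread-out data: negative kurtosis
```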

35
New cards

co-variance

measures co-movement between 2 variables

36
New cards

correlation coefficient

measures the strength and direction of the linear relationship between 2 variables, on a scale from -1 to +1
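Covariance and correlation are closely related: the correlation coefficient is the covariance divided by the product of the 2 standard deviations. A minimal sketch, assuming the sample (n - 1) formulas and using made-up data:

```python
import math

def covariance(xs, ys):
    """Sample covariance: average co-movement around the 2 means (n - 1 divisor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance rescaled by both standard deviations, so it lies in [-1, 1]."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: hours at the market and dollars spent.
hours = [1, 2, 3, 4]
dollars = [2, 4, 6, 8]

print(covariance(hours, dollars))   # positive: the variables move together
print(correlation(hours, dollars))  # perfectly linear, so very close to 1
```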

37
New cards

What graph would you use to compare time spent at the market with average income?

scatterplot, as it measures numerical by numerical

38
New cards

population

whole collection under analysis

39
New cards

sample

a portion of the population

40
New cards

parameter

summary measure describing a characteristic of a population

41
New cards

statistic

summary measure computed to describe a characteristic of a sample

42
New cards

primary data

collected yourself

43
New cards

secondary data

taken from another source

44
New cards

observational data

you observe and record

45
New cards

experimental data

data you've obtained through experiments

46
New cards

simple random sampling

everyone is equally likely to get chosen from the population. E.g. randomly picking a certain number of students

47
New cards

systematic random sampling

using a fixed system after a random start, e.g. randomly selecting the first item, then taking every k-th item thereafter

48
New cards

Stratified random sampling

dividing the population into homogeneous groups (similar characteristics), then taking a random sample from each group, e.g. dividing students by which degree they take, then taking a random sample from each degree

49
New cards

cluster sampling

dividing the population into several clusters that aren't homogeneous but are each representative of the population, then taking a random sample of clusters
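The 4 sampling methods can be sketched with Python's `random` module. Everything here is illustrative: the population of student IDs, the group names and the function names are all hypothetical, and the seed is fixed only to make the run repeatable.

```python
import random

def simple_random_sample(population, n, rng):
    # Every member is equally likely to be chosen.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Random starting point, then every k-th member thereafter.
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    # Random sample taken from EVERY homogeneous group.
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters, then keep everyone in them.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)
students = list(range(1, 101))  # hypothetical student IDs 1..100
strata = {"commerce": students[:60], "science": students[60:]}
clusters = {"hall_a": students[:25], "hall_b": students[25:50],
            "hall_c": students[50:75], "hall_d": students[75:]}

simple = simple_random_sample(students, 10, rng)
systematic = systematic_sample(students, 10, rng)
stratified = stratified_sample(strata, 3, rng)
clustered = cluster_sample(clusters, 2, rng)
```

Stratified sampling guarantees that small groups appear in the sample, which is why it suits the residential halls question on the next card.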

50
New cards

You want to sample residential halls but worry that a random sample won't include the small halls. Which sampling method should you use?

Stratified random sampling

51
New cards

Non sampling errors

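errors not caused by the sampling process itself, usually human errors (coverage, non-response and measurement errors)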

52
New cards

coverage errors

when the sample has targeted the wrong subjects

53
New cards

non-response error

when subject chooses to not respond, impacting the data

54
New cards

measurement error

caused by poorly worded questions or misunderstanding

55
New cards

margin of error

quantified measure of sampling error

56
New cards

probability

how likely an event is to occur

57
New cards

how is probability written

P(event)

58
New cards

What is ∪ in probability?

union - the probability that at least one of the events occurs (A or B, or both)

59
New cards

What is ∩ in probability?

intersection - the probability that both events occur together (A and B)

60
New cards

collectively exhaustive

when the outcomes given are the only possible outcomes

61
New cards

complement

2 events are complements if exactly one of them must occur (they cannot happen together and no other outcome is possible), so their probabilities add to 1. E.g. P(A) + P(not A) = 1

62
New cards

A Priori Classical

when the probability is known in advance from the nature of the process, e.g. a fair die has a 1/6 chance of each face

63
New cards

Empirical (relative frequency)

when the probability is worked out from observed data or experiments rather than from prior knowledge

64
New cards

Subjective

when the probability is based on your opinion

65
New cards

Conditional Probability

the probability of an event occurring given that another event has already occurred.

66
New cards

How is conditional probability written

P(A | B), e.g. P(Student | Female): "what is the probability that the person is a student, given that they are female?"

67
New cards

how is conditional probability calculated?

P(A | B) = P(A ∩ B) / P(B)

68
New cards

Marginal probability

total probability of a row or column
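Joint, marginal and conditional probabilities can all be read off a cross tabulation of counts. The sketch below uses a made-up table of 100 people (the counts and function names are hypothetical) to show how the conditional probability formula fits together.

```python
# Hypothetical cross tabulation of counts: (row category, column category).
counts = {
    ("student", "female"): 30,
    ("student", "male"): 20,
    ("non-student", "female"): 25,
    ("non-student", "male"): 25,
}
total = sum(counts.values())

def joint(row, col):
    # P(row ∩ col): one cell divided by the grand total.
    return counts[(row, col)] / total

def marginal_row(row):
    # Marginal probability: total of one row divided by the grand total.
    return sum(v for (r, _), v in counts.items() if r == row) / total

def marginal_col(col):
    # Marginal probability: total of one column divided by the grand total.
    return sum(v for (_, c), v in counts.items() if c == col) / total

def conditional(row, col):
    # P(row | col) = P(row ∩ col) / P(col)
    return joint(row, col) / marginal_col(col)

print(marginal_col("female"))             # P(Female)
print(conditional("student", "female"))   # P(Student | Female)
```

With these counts, P(Student | Female) is 30 out of the 55 females, which differs from the joint probability P(Student ∩ Female) of 30 out of all 100 people.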

69
New cards

Probability independence

when the probability of one event does not influence the probability of another event occurring

70
New cards

When does co-variance = 0?

when variables are independent (independence implies zero covariance, but zero covariance alone does not prove independence)

71
New cards

Random Variables

variables with multiple possible values and an associated probability of getting each value

72
New cards

Discrete Random Variables

can only take a countable number of values, e.g. the number of 6s rolled in 2 rolls of a die: there can only be 0 sixes, 1 six, or 2 sixes.

73
New cards

Expected Value defined

the value we expect based on the probabilities that exist.

74
New cards

Expected Value formula

E(X) = ∑ [x • P(x)]

75
New cards

Variance

measures data spread around the mean

76
New cards

Variance formula

V(X) = ∑ [P(xi) • (xi - μ)²]

77
New cards

Binomial Distribution

discrete probability distribution with 4 characteristics

78
New cards

what are the 4 binomial characteristics

  1. each trial has exactly 2 possible outcomes (success or failure)

  2. fixed number of trials

  3. probability of success remains the same for every trial

  4. trials are independent (the outcome of one trial doesn't affect the others)
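When the 4 characteristics hold, the probability of exactly k successes in n trials follows the binomial formula C(n, k) • p^k • (1 - p)^(n - k). A minimal sketch using `math.comb` (the function name `binomial_pmf` is made up for illustration):

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with the same success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Number of heads in 2 flips of a fair coin: n = 2, p = 0.5.
coin = [binomial_pmf(k, 2, 0.5) for k in range(3)]
print(coin)  # probabilities of 0, 1 and 2 heads

# Number of sixes in 2 rolls of a fair die: n = 2, p = 1/6.
dice = [binomial_pmf(k, 2, 1 / 6) for k in range(3)]
print(dice)  # probabilities of 0, 1 and 2 sixes
```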

79
New cards

Discrete Random Variables

Cannot be divided, whole numbers, e.g. number of phone calls in a day, number of visitors

80
New cards

Expected Value

the long-run average we expect from the probabilities. Example with P(0) = 0.25, P(1) = 0.5, P(2) = 0.25: E(X) = (0 × 0.25) + (1 × 0.5) + (2 × 0.25) = 1

81
New cards

Variance

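spread of the data around the expected value. The shortcut formula is similar to expected value, using squared values and then subtracting the squared mean: V(X) = ((0² × 0.25) + (1² × 0.5) + (2² × 0.25)) - 1² = 0.5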
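The expected value and variance of a discrete random variable can be checked in a few lines. The sketch below uses the same distribution as the worked cards (values 0, 1, 2 with probabilities 0.25, 0.5, 0.25); the variable names are made up. It computes the variance both from the definition and from the shortcut, which should agree.

```python
# Probability distribution: value -> probability.
dist = {0: 0.25, 1: 0.5, 2: 0.25}

# E(X) = sum of x * P(x)
expected = sum(x * p for x, p in dist.items())

# Definition: V(X) = sum of P(x) * (x - mean)^2
variance_definition = sum(p * (x - expected) ** 2 for x, p in dist.items())

# Shortcut: V(X) = sum of x^2 * P(x), minus mean^2
variance_shortcut = sum(x ** 2 * p for x, p in dist.items()) - expected ** 2

print(expected, variance_definition, variance_shortcut)
```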

82
New cards

Poisson Probability Distribution

A discrete probability distribution used to find probabilities of the number of times a certain event occurs in a specified time interval (no fixed number of trials)

83
New cards

4 characteristics of Poisson

  1. number of successes in an interval is independent of the number of successes in any other interval

  2. probability of success is the same for all equal-sized intervals

  3. probability of success in an interval is proportional to the size of the interval

  4. probability of more than one success in an interval approaches zero as the interval becomes smaller
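Under these conditions, the probability of seeing exactly k events in an interval with an average of λ events is e^(-λ) • λ^k / k!. A minimal sketch, assuming a hypothetical average of 2 phone calls per hour (the function name is made up):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events in an interval
    where lam events occur on average."""
    return exp(-lam) * lam ** k / factorial(k)

average_calls = 2.0  # hypothetical average number of calls per hour

for k in range(5):
    print(k, round(poisson_pmf(k, average_calls), 4))

# Probability of at least 1 call in the hour = 1 - P(0 calls).
at_least_one = 1 - poisson_pmf(0, average_calls)
```

Unlike the binomial distribution, there is no fixed number of trials, so k has no upper limit.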

84
New cards

Empirical Rule

about 68% of values lie within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% (almost all) within 3 standard deviations
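The percentages in the empirical rule can be checked against the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The shares grow as the band widens: roughly 68% within 1 standard deviation, 95% within 2, and 99.7% within 3.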

85
New cards

normal distribution

A function that represents the distribution of variables as a symmetrical bell-shaped graph.

86
New cards

Standardized Z-Distribution

mean = 0, standard deviation = 1
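Any normally distributed value can be converted to the standardized Z-distribution with z = (x - mean) / standard deviation. A minimal sketch with made-up exam figures (mean 70, standard deviation 5):

```python
from statistics import NormalDist

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical exam: grades are normal with mean 70 and standard deviation 5.
z = z_score(80, 70, 5)

# Proportion of grades below 80, read from the standard normal distribution.
proportion_below = NormalDist().cdf(z)

print(z, round(proportion_below, 4))
```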

87
New cards

How to recognise if data is normally distributed

  • graph is mound shaped and symmetrical

  • mean = median

  • empirical rule applies (about 68% within 1 standard deviation, 95% within 2, 99.7% within 3)

  • skewness & kurtosis close to 0

88
New cards

Graphs to show normally distributed data

  1. histogram

  2. box plot

  3. stem & leaf

  4. Q-Q / P-P plot

89
New cards

What does a sample statistic do

lets you make an inference about a population parameter when you can't measure the entire population.

90
New cards

A quantitative estimate involves

a mean, e.g. "what is the mean grade of the students?"

91
New cards

what are x̅ and μ

x̅ is the sample mean (a statistic); μ is the population mean (a parameter)

92
New cards

A qualitative estimate involves

a proportion, e.g. "what proportion of the population is from Christchurch?"

93
New cards

Interval Estimates

estimations of a range of values of a population parameter. E.g. we expect μ to fall within $75-$100, or, we expect P to fall within 0.25-0.50

94
New cards

Point Estimates

estimates an exact value of a parameter using a single value. Unlikely to estimate correctly so use interval estimate instead

95
New cards

how to calculate confidence intervals

point estimate plus or minus the margin of error (critical value for the confidence level × standard error)

96
New cards

standard error

the standard deviation of the sample mean/proportion across repeated samples; it represents how accurately the sample mean/proportion estimates the population value
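Putting the last 2 cards together for a mean: the standard error is the standard deviation divided by √n, and the interval is the point estimate plus or minus the critical value times the standard error. The sketch assumes a known population standard deviation (so a z value applies) and uses made-up figures.

```python
import math

def mean_confidence_interval(sample_mean, std_dev, n, z):
    """Confidence interval for a mean when the population
    standard deviation is known (z-based interval)."""
    standard_error = std_dev / math.sqrt(n)
    margin_of_error = z * standard_error
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Hypothetical survey: 100 visitors, mean spend $85, population sd $20,
# 95% confidence level (z = 1.96).
lower, upper = mean_confidence_interval(85, 20, 100, 1.96)
print(round(lower, 2), round(upper, 2))
```

Here the standard error is 20 / √100 = 2, so the margin of error is 1.96 × 2 = 3.92.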

97
New cards

when would you use the z distribution when trying to estimate a confidence interval

  • when the population standard deviation is known

  • the population is normally distributed, or the sample is large

98
New cards

When would you use the t distribution when trying to estimate a confidence interval

  • population standard deviation is unknown

  • the population is normally distributed, or the sample is large

99
New cards

when would you use the Z distribution when trying to estimate a confidence interval

for proportions (with a large sample), as the standard error is calculated directly from the proportion itself
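For a proportion, the standard error is √(p(1 - p) / n), so no separate standard deviation is needed. A minimal sketch with made-up figures, assuming the sample is large enough for the z-based interval to be reasonable:

```python
import math

def proportion_confidence_interval(p_hat, n, z):
    """z-based confidence interval for a population proportion."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error

# Hypothetical survey: 40% of 600 respondents are from Christchurch,
# 95% confidence level (z = 1.96).
low, high = proportion_confidence_interval(0.4, 600, 1.96)
print(round(low, 4), round(high, 4))
```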

100
New cards

What are the Z values

99% = 2.576, 95% = 1.96, 90% = 1.645
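These Z values are the points on the standard normal distribution that leave half of the remaining probability in each tail. They can be reproduced with `statistics.NormalDist.inv_cdf`; the helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value: leaves (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.99, 0.95, 0.90):
    print(level, round(z_critical(level), 3))
```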