BSNS112 Master

101 Terms

Quantitative Data

Discrete and Continuous

New cards

Discrete Data

Takes only specific, countable values (usually whole numbers), such as the number of students in a class

New cards

Continuous Data

Measured rather than counted; can take any value within a range, such as age, height, time

New cards

Qualitative Data

Categorical data, which is either ordinal or nominal

New cards

Ordinal Data

Places categories in order and conveys a ranking, such as clothing sizes (small, medium, large)

New cards

Nominal Data

Does not convey ranking such as ethnicity, gender

New cards

What type of data is the number of cars a family owns?

Discrete

New cards

What type of data is the type of accommodation (such as budget, tourist, superior)?

Ordinal - conveys a ranking

New cards

What type of data is favourite fruit preference at the market?

Nominal, conveys no ranking

New cards

What type of data is time spent at the market?

Continuous - time is measured and can take any value within a range

New cards

Weekly household spending is divided into these groups: less than $50, $50-$100, $150-$200. What type of variable is this?

Categorical & ordinal (the groups define categories and are placed in order, conveying a ranking)

New cards

Cross tabulation

Compares categorical with categorical

New cards

scatter plot

Compares numerical with numerical

New cards

frequency table

analyses 1 categorical variable. E.g. the fave stall of people at the market

New cards

Stacked / clustered bar chart

compares categorical with categorical e.g. proportion of M/F choosing fave stall

New cards

Relative frequency histogram

compares categorical with numerical (e.g. market spend of various occupational groups)

New cards

If the 2 variables are "favourite stall" and "if visitors are regular or not", a cross tabulation should be used because:

both variables are categorical and define a particular category

New cards
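
A frequency table and a cross tabulation can be sketched in Python with the standard library's `collections.Counter`. The survey responses below are hypothetical and only illustrate the idea of counting one categorical variable versus counting combinations of two.

```python
from collections import Counter

# Hypothetical survey responses: (favourite stall, type of visitor).
responses = [
    ("fruit", "regular"),
    ("fruit", "occasional"),
    ("coffee", "regular"),
    ("coffee", "regular"),
    ("bread", "occasional"),
    ("fruit", "regular"),
]

# Frequency table: counts for ONE categorical variable (favourite stall).
stall_counts = Counter(stall for stall, _ in responses)

# Cross tabulation: counts for each combination of TWO categorical variables.
cross_tab = Counter(responses)

for stall in sorted(stall_counts):
    row = {kind: cross_tab[(stall, kind)] for kind in ("regular", "occasional")}
    print(stall, "total:", stall_counts[stall], row)
```

Each printed row shows the stall's total from the frequency table followed by its breakdown by visitor type from the cross tabulation.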

Mean

simple average

New cards

median

middle value when the data are ranked in order

New cards

mode

most frequent

New cards

trimmed mean

mean calculated without the most extreme 5% of values

New cards

Range

maximum - minimum

New cards

interquartile range

75th percentile minus 25th percentile

New cards

variance

represents spread of data around the mean. Standard deviation squared

New cards

standard deviation

square root of variance; a higher value means more spread

New cards

co-efficient of variation

standard deviation divided by the mean; used to compare variability between groups with different magnitudes

New cards
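
The measures of centre and spread on these cards can be computed with Python's standard `statistics` module. This is a minimal sketch on a hypothetical data set; it assumes sample (n - 1) formulas for variance and standard deviation, and the module's default quartile method, which may differ slightly from the convention used in other software.

```python
import statistics

# Hypothetical data set, e.g. dollars spent by eight market visitors.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)        # simple average: 5
median = statistics.median(data)    # middle value of the ranked data: 4.5
mode = statistics.mode(data)        # most frequent value: 4
data_range = max(data) - min(data)  # maximum - minimum: 7

# Interquartile range: 75th percentile minus 25th percentile.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Sample variance and standard deviation (assumption: divide by n - 1).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

print(mean, median, mode, data_range, iqr, variance, std_dev, cv)
```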

skewness

positive = right skew; negative = left skew

New cards

significantly skewed

the skewness statistic is more than twice its standard error (in absolute value)

New cards

mode

median

New cards

Kurtosis

measures the extent to which observations cluster around the central point

New cards

What is it called when the kurtosis statistic is zero?

normal distribution

New cards

data clusters close to centre: positive or negative kurtosis?

positive

New cards

data clusters further from centre: positive or negative kurtosis?

negative

New cards

co-variance

measures co-movement between 2 variables

New cards

co-efficient of correlation

measures the linear relationship between 2 variables

New cards
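
Co-variance and the correlation co-efficient can be worked through by hand in a few lines of Python. The sketch below assumes the sample (n - 1) version of co-variance; the function names and the data are hypothetical.

```python
import math

def covariance(xs, ys):
    """Sample co-variance: average co-movement of xs and ys around their means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Correlation co-efficient: co-variance rescaled to lie between -1 and 1."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: hours at the market and dollars spent.
hours = [1, 2, 3, 4, 5]
spend = [2, 4, 6, 8, 10]

print(covariance(hours, spend))   # positive: the variables move together
print(correlation(hours, spend))  # close to 1: a perfect linear relationship
```

Co-variance depends on the units of the data, while the correlation co-efficient is unit-free, which is why the latter is easier to compare across data sets.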

What graph would measure the following: comparing time spent at the market with average income?

scatterplot, as it measures numerical by numerical

New cards

population

whole collection under analysis

New cards

sample

a portion of the population

New cards

parameter

summary measure describing a characteristic of a population

New cards

statistic

summary measure computed to describe a characteristic of a sample

New cards

primary data

collected yourself

New cards

secondary data

taken from another source

New cards

observational data

you observe and record

New cards

experimental data

data you've obtained through experiments

New cards

simple random sampling

everyone is equally likely to get chosen from the population. E.g. randomly picking a certain number of students

New cards

systematic random sampling

having a system when randomly selecting the sample. E.g. randomly selecting a starting item, then every k-th item thereafter

New cards

Stratified random sampling

dividing population into homogeneous groups (similar characteristics) then taking a random sample from each group, e.g. dividing students by which degree they take then taking a random sample from each degree

New cards

cluster sampling

dividing population into several clusters that aren't homogeneous but are each representative of the population, then taking a random sample of clusters

New cards

You want to sample residential halls but worry that a random sample won't include the small halls. Which sampling method should you use?

Stratified random sampling

New cards
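
The four sampling methods can be sketched with Python's `random` module. The function names, the student list and the hall groupings below are all hypothetical; the point is only to show how the selection step differs between methods.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member is equally likely to be chosen."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Random starting point, then every k-th member thereafter."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Random sample taken from EVERY homogeneous group (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and include all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
students = [f"student{i}" for i in range(20)]
halls = {"small hall": students[:3], "large hall": students[3:]}

print(simple_random_sample(students, 5, rng))
print(systematic_sample(students, 5, rng))
print(stratified_sample(halls, 2, rng))  # the small hall is always represented
print(cluster_sample(halls, 1, rng))
```

Because the stratified version samples from every group, it guarantees that a small group such as the small hall appears in the sample, which a simple random sample cannot promise.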

Non sampling errors

human errors

New cards

coverage errors

when the sample has targeted the wrong subjects

New cards

non-response error

when subject chooses to not respond, impacting the data

New cards

measurement error

caused by poorly worded questions or misunderstanding

New cards

margin of error

quantified measure of sampling error

New cards

probability

how likely an event is to occur

New cards

how is probability written

P(event)

New cards

What is ∪ (written 'U') in probability

union - probability that at least one of the events occurs (A or B, or both)

New cards

what is ∩ (written 'n') in probability

intersection - probability that both events occur together

New cards

collectively exhaustive

when the outcomes given are the only possible outcomes

New cards

complement

the complement of an event A is the event that A does not occur; their probabilities add to 1. E.g. P(A) + P(not A) = 1

New cards

A Priori Classical

when the probability is already known from prior information about the process (e.g. a fair die)

New cards

Empirical (relative frequency)

when the probability is worked out from experiments or observed data rather than prior information

New cards

Subjective

when the probability is based on your opinion

New cards

Conditional Probability

the probability of an event occurring given that another event has already occurred.

New cards

How is conditional probability written

P(A | B), e.g. P(Student | Female): "what is the probability that someone is a student, given that they are female?"

New cards

how is conditional probability calculated?

P(A | B) = P(A ∩ B) / P(B)

New cards

Marginal probability

total probability of a row or column

New cards

Probability independence

when the probability of one event does not influence the probability of another event occurring

New cards
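
Marginal, joint and conditional probabilities, and a check for independence, can be read off a cross tabulation of counts. The counts below are hypothetical, and the helper name `marginal` is an assumption made for this sketch.

```python
# Hypothetical counts from a cross tabulation of 100 people.
counts = {
    ("student", "female"): 30,
    ("student", "male"): 20,
    ("non-student", "female"): 30,
    ("non-student", "male"): 20,
}
total = sum(counts.values())

def marginal(value, position):
    """Total probability of one row (position 0) or column (position 1)."""
    return sum(c for key, c in counts.items() if key[position] == value) / total

p_student = marginal("student", 0)                # P(Student)
p_female = marginal("female", 1)                  # P(Female)
p_both = counts[("student", "female")] / total    # P(Student ∩ Female)

# Conditional probability: P(Student | Female) = P(Student ∩ Female) / P(Female)
p_student_given_female = p_both / p_female

# Independence: knowing B does not change the probability of A,
# which is equivalent to P(A ∩ B) = P(A) x P(B).
independent = abs(p_both - p_student * p_female) < 1e-9

print(p_student, p_female, p_both, p_student_given_female, independent)
```

In this made-up table P(Student | Female) equals P(Student), so the two events are independent.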

When does co-variance = 0?

when variables are independent

New cards

Random Variables

variables with multiple possible values and an associated probability of getting each value

New cards

Discrete Random Variables

can only take on a finite (countable) number of values, e.g. the number of 6's rolled on a die over 2 rolls: there can only be 0 sixes, 1 six, or 2 sixes.

New cards

Expected Value defined

the value we expect based on the probabilities that exist.

New cards

Expected Value formula

E(X) = ∑ [x • P(x)]

New cards

Variance

measures data spread around the mean

New cards

Variance formula

V(X) = ∑ [P(xi) • (xi − μ)²], where μ = E(X)

New cards

Binomial Distribution

discrete probability distribution with 4 characteristics

New cards

what are the 4 binomial characteristics

there are only 2 possible outcomes for every trial (success or failure)
fixed number of trials
probability of success remains the same for every trial
trials are independent (the outcomes don't affect each other)

New cards
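
As a rough illustration of a binomial setting, the sketch below computes the probabilities for the number of sixes in 2 rolls of a die (2 outcomes per trial, fixed number of trials, constant probability, independent trials). It uses the standard binomial formula, which these cards do not state; the function name `binomial_pmf` is an assumption.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n_rolls = 2
p_six = 1 / 6

# Probabilities of rolling 0, 1 or 2 sixes in two rolls.
probabilities = {k: binomial_pmf(k, n_rolls, p_six) for k in range(n_rolls + 1)}

for sixes, prob in probabilities.items():
    print(sixes, round(prob, 4))
```

The three probabilities cover every possible outcome, so they add up to 1.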

Discrete Random Variables

Cannot be divided, whole numbers, e.g. number of phone calls in a day, number of visitors

New cards

Expected Value

what we expect based on previous data. Worked example: E(X) = (0 × 0.25) + (1 × 0.5) + (2 × 0.25) = 1

New cards

Variance

spread of the data. Formula is similar to expected value: V(X) = ((0² × 0.25) + (1² × 0.5) + (2² × 0.25)) − 1² = 0.5

New cards
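
The expected value and variance of a discrete random variable can be checked with a short Python sketch. It uses the distribution from the worked examples on these cards (values 0, 1, 2 with probabilities 0.25, 0.5, 0.25); the function names are assumptions made for illustration.

```python
# Distribution from the worked example: value -> probability.
distribution = {0: 0.25, 1: 0.5, 2: 0.25}

def expected_value(dist):
    """E(X): sum of each value times its probability."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """V(X): probability-weighted squared distance from the mean."""
    mu = expected_value(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

def variance_shortcut(dist):
    """Equivalent form: E(X squared) minus the squared mean."""
    mu = expected_value(dist)
    return sum(p * x**2 for x, p in dist.items()) - mu**2

print(expected_value(distribution))     # 1.0
print(variance(distribution))           # 0.5
print(variance_shortcut(distribution))  # 0.5
```

Both variance functions give the same answer; the shortcut form is the one used in the worked example on the card.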

Poisson Probability Distribution

A discrete probability distribution used to find probabilities of the number of times a certain event occurs in a specified time interval (no fixed number of trials)

New cards

4 characteristics of Poisson

number of successes in an interval is independent of the number of successes in any other interval
probability of a success is the same for all equal-sized intervals
probability of a success in an interval is proportional to the size of the interval
probability of more than one success in an interval approaches zero as the interval becomes smaller

New cards
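
A Poisson probability can be sketched in Python using the standard Poisson formula, which these cards do not state. The function name `poisson_pmf` and the rate of 3 phone calls per hour are hypothetical.

```python
import math

def poisson_pmf(k, rate):
    """P(exactly k events in an interval with an average of `rate` events)."""
    return math.exp(-rate) * rate**k / math.factorial(k)

calls_per_hour = 3  # hypothetical average number of phone calls per hour

# Probability of 0, 1, 2, ... 5 calls in one hour.
for k in range(6):
    print(k, round(poisson_pmf(k, calls_per_hour), 4))

# Unlike the binomial, there is no fixed number of trials, so k has no upper
# limit; the probabilities over all k still add up to 1.
total_probability = sum(poisson_pmf(k, calls_per_hour) for k in range(60))
print(total_probability)
```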

Empirical Rule

68% of data within 1 standard deviation of the mean; 95% within 2; about 99.7% (nearly 100%) within 3

New cards

normal distribution

A function that represents the distribution of variables as a symmetrical bell-shaped graph.

New cards

Standardized Z-Distribution

mean = 0 standard deviation = 1

New cards
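
A minimal Python sketch, assuming the standard library's `statistics.NormalDist`, that checks how much of a normal distribution lies within 1, 2 and 3 standard deviations of the mean and converts a raw value to a z-score. The helper names and the example grade figures are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # standardized Z-distribution: mean 0, sd 1

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))

# Hypothetical example: a grade of 85 when the mean is 70 and the sd is 10.
print(z_score(85, 70, 10))  # 1.5
```

The three printed shares are roughly 68%, 95% and 99.7%, which is where the empirical rule comes from.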

How to recognise if data is normally distributed

graph is mound shaped and symmetrical
mean = median
empirical rule applies (68% within 1, 95% within 2, about 99.7% within 3 standard deviations)
skewness & kurtosis close to 0

New cards

Graphs to show normally distributed data

histogram
box plot
stem & leaf
Q-Q / P-P plot

New cards

What does a sample statistic do

makes an inference on a population parameter if you can't sample an entire population.

New cards

A quantitative estimate involves

a mean, e.g. "what is the mean grade of the students?"

New cards

what are x̅ and μ

x̅ is the mean of a sample (a statistic); μ is the mean of the whole population (a parameter)

New cards

A qualitative estimate involves

a proportion, e.g. "what proportion of the population is from Christchurch?"

New cards

Interval Estimates

estimations of a range of values of a population parameter. E.g. we expect μ to fall within $75-$100, or, we expect P to fall within 0.25-0.50

New cards

Point Estimates

estimates an exact value of a parameter using a single value. Unlikely to estimate correctly so use interval estimate instead

New cards

how to calculate confidence intervals

point estimate plus or minus margin of error (critical value for the confidence level × standard error)

New cards

standard error

the standard deviation of the sample mean/proportion across repeated samples; it represents the accuracy of the sample mean/proportion

New cards

when would you use the z distribution when trying to estimate a confidence interval

when the population standard deviation is known
the sample is normally distributed, or the sample is large

New cards

When would you use the t distribution when trying to estimate a confidence interval

population standard deviation is unknown
sample is normally distributed, or is large

New cards

when would you use the Z distribution when trying to estimate a confidence interval

for proportions, as the standard deviation is determined by the proportion itself (so it is treated as known)

New cards

What are the Z values

99% = 2.576; 95% = 1.96; 90% = 1.645

New cards
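
A confidence interval for a mean can be sketched in Python as point estimate plus or minus (z value × standard error). This sketch assumes the population standard deviation is known (the z case) and uses the usual standard error of the mean, σ / √n, which these cards do not state; the function names and the sample figures are hypothetical.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided z value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def mean_confidence_interval(sample_mean, population_sd, n, confidence=0.95):
    """Point estimate plus or minus margin of error (z value x standard error)."""
    standard_error = population_sd / math.sqrt(n)
    margin_of_error = z_critical(confidence) * standard_error
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# The z values for the common confidence levels.
for level in (0.99, 0.95, 0.90):
    print(level, round(z_critical(level), 3))

# Hypothetical sample: mean spend $80, population sd $10, 100 visitors.
low, high = mean_confidence_interval(80, 10, 100, confidence=0.95)
print(round(low, 2), round(high, 2))
```

A higher confidence level gives a larger z value and therefore a wider interval for the same sample.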
