Semester 1

New cards

REVIEW OF BASIC CONCEPTS

New cards

Describe the mean/median for:

Right skew distribution
Left skew distribution

*Remember: the skew is named after the direction of the long tail (negative = left, positive = right)*

Left skew (negative) = mean < median

Right skew (positive) = mean > median

The mode sits at the peak; the median and then the mean are pulled towards the tail
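
A quick check of the skew rule, as a minimal sketch. The data values are made up for illustration: one sample has a long right tail, and its mirror image has a long left tail.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are small, with one long tail to the right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirroring the data flips the tail, giving a left-skewed sample.
left_skewed = [-x for x in right_skewed]

# Right skew: the mean is dragged towards the tail, so mean > median.
print(mean(right_skewed), median(right_skewed))
# Left skew: the mean is dragged the other way, so mean < median.
print(mean(left_skewed), median(left_skewed))
```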

New cards

MEASURES OF VARIATION AND SET THEORY

New cards

go through symbols

New cards

INTRODUCTION TO PROBABILITY

New cards

What are the formulas for the 2 probability rules?

New cards

What does a probability of 0 or 1 indicate?

0 - event won’t occur

1 - event certain to occur

New cards

CONDITIONAL PROBABILITY

New cards

What is conditional probability?

Probability of one event occurring given that another event has already happened

New cards

Using conditional probabilities, we can have 3 rules of probability - what are these?

Conditional probability
Multiplication rule (conditional rearranged)
Total probability rule

New cards

What is the formula to work out conditional probability?

(working out something is something given something else has happened)

New cards

What is the formula to work out the multiplicative rule?

(something times something when we have conditional probabilities)

New cards

What is the formula to work out the total probability rule?

(total probability of one thing, using multiplied numbers or either conditional numbers)
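
The three rules are usually written as P(A|B) = P(A∩B) / P(B), P(A∩B) = P(A|B) x P(B), and P(A) = P(A|B1) x P(B1) + P(A|B2) x P(B2) + ... The sketch below works through them with made-up numbers; the events B1 and B2 and all probabilities are assumptions chosen only so the arithmetic is easy to follow.

```python
# Hypothetical numbers: B1 and B2 are mutually exclusive and cover every case.
p_b = {"B1": 0.6, "B2": 0.4}            # P(B1), P(B2)
p_a_given_b = {"B1": 0.5, "B2": 0.25}   # P(A|B1), P(A|B2)

# Multiplication rule: P(A and Bi) = P(A|Bi) * P(Bi)
p_a_and_b = {b: p_a_given_b[b] * p_b[b] for b in p_b}

# Total probability rule: P(A) = sum over i of P(A|Bi) * P(Bi)
p_a = sum(p_a_and_b.values())

# Conditional probability: P(B1|A) = P(A and B1) / P(A)
p_b1_given_a = p_a_and_b["B1"] / p_a

print(p_a_and_b, p_a, p_b1_given_a)
```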

New cards

What is Bayes' theorem and when do you use it?

Use it when the question asks you to ‘flip’ a conditional probability

E.g. if the question asks for:

P(S|L) - the probability that you are in a particular sector, given you are in London

But the only information you have is P(L|S) - the probability that you are in London, given you are in that sector

Use it if the condition in the question does NOT match the condition in the data

New cards

What is the formula for Bayes' theorem?
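
In the S and L notation used on the previous card, the standard statement is P(S|L) = P(L|S) x P(S) / P(L). A minimal sketch follows; the function name `bayes` and the numbers are hypothetical, picked only to show the 'flip'.

```python
def bayes(p_l_given_s, p_s, p_l):
    """Return P(S|L) from P(L|S), P(S) and P(L)."""
    if p_l <= 0:
        raise ValueError("P(L) must be positive to condition on L")
    return p_l_given_s * p_s / p_l

# Hypothetical data: P(L|S) = 0.3, P(S) = 0.2, P(L) = 0.12
p_s_given_l = bayes(p_l_given_s=0.3, p_s=0.2, p_l=0.12)
print(p_s_given_l)
```

If P(L) is not given, it can be built first with the total probability rule.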

New cards

How can we work out the relationship between 2 variables?

(Either dependent or independent)

Statistically independent if:

P(A∩B) = P(A) x P(B)

P(A∩B∩C∩D) = P(A) x P(B) x P(C) x P(D)

P(A|B) = P(A)
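
The first condition can be checked numerically, as in this small sketch. The helper name `is_independent` and the probabilities are assumptions for illustration.

```python
import math

def is_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Check P(A and B) = P(A) * P(B), allowing for rounding error."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)

# Hypothetical probabilities
independent_case = is_independent(p_a=0.5, p_b=0.4, p_a_and_b=0.2)   # 0.5 * 0.4 = 0.2
dependent_case = is_independent(p_a=0.5, p_b=0.4, p_a_and_b=0.3)     # 0.3 is not 0.2
print(independent_case, dependent_case)
```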

New cards

*REMEMBER - LOOK AT A Q AND SEE IF THE NUMBERS GIVEN ARE*:

Probabilities already (e.g. 0.7, 0.48)
Or just data/answers (60, 80, 100)

New cards

*REMEMBER - CAN’T JUST ADD UP CONDITIONAL PROBABILITIES TO GET A TOTAL PROBABILITY OF SOMETHING*:

e.g. to get P(A) can’t add up P(A|B) + P(A|C), etc.

have to multiply each conditional by the probability of its condition, then add: P(A) = P(A|B) x P(B) + P(A|C) x P(C) (where B and C are mutually exclusive and cover all cases)

New cards

DISCRETE RANDOM VARIABLES

(counting)

New cards

What are discrete random variables?

A variable whose outcomes from a random experiment are countable values, usually whole numbers

E.g. have a situation (tossing a coin), random variable = number of heads

Discrete = can list all possible values & can count them (countable)

New cards

Probability is the measurement about the variable

How is this written?

P (X = x) = p (x)

e.g.

P (X = 1) = 1/6

New cards

What are properties of discrete random variables?

Pr has to be between 0 and 1: 0 ≤ P(x) ≤ 1 (can’t be negative)
Individual Pr sum to 1
Outcomes are mutually exclusive (2 separate circles) & collectively exhaustive (cover all possible outcomes)

New cards

What is an E[X]?

(and how is it different to the mean)

New cards

What is the formula for the E[X]?

New cards

What are the properties of an expected value? E[X]

New cards

What is the formula for variance for a discrete random variable?

Var[X]
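
The usual definitions are E[X] = Σ x p(x) and Var[X] = Σ (x - E[X])² p(x). As a sketch, here they are applied to a fair six-sided die, which matches the P(X = 1) = 1/6 example above; the choice of a die is an assumption for illustration.

```python
from fractions import Fraction

# Probability function of a fair die: p(x) = 1/6 for x = 1..6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# The probabilities must sum to 1
total = sum(pmf.values())

# E[X] = sum of x * p(x)
expected = sum(x * p for x, p in pmf.items())

# Var[X] = sum of (x - E[X])^2 * p(x)
variance = sum((x - expected) ** 2 * p for x, p in pmf.items())

print(total, expected, variance)
```

Exact fractions are used so the results are not blurred by floating point rounding.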

New cards

What is the formula for permutations?

New cards

What is the formula for combinations?

New cards

What is the difference between permutations and combinations?

(when would you use both)

Permutations = order matters (ABC AND BCA are not the same thing so count as 2)

Combinations = order does not matter (ABC and BCA are the same thing so count as 1)

Pⁿ_x ≥ Cⁿ_x (equal only when x is 0 or 1)

(when answering a question, you have to work out whether the order matters or not)
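
The standard formulas are Pⁿ_x = n! / (n - x)! and Cⁿ_x = n! / (x! (n - x)!). The sketch below lists the arrangements for a small made-up case (choosing 2 of the letters A, B, C) and compares the counts with Python's built-in `math.perm` and `math.comb`.

```python
from itertools import combinations, permutations
from math import comb, perm

letters = "ABC"
x = 2  # hypothetical example: choose 2 of the 3 letters

ordered = list(permutations(letters, x))    # AB and BA count separately
unordered = list(combinations(letters, x))  # AB and BA count once

print(len(ordered), perm(len(letters), x))
print(len(unordered), comb(len(letters), x))
```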

New cards

Mean and variance for Bernoulli distribution?

mean = p

variance = p(1-p)

New cards

What is a binomial distribution?

The number of successes in a series of Bernoulli trials (the sum of the individual Bernoulli outcomes)

> Describes the outcome of a series of n independent Bernoulli trials

successes vs failures

New cards

What is the formula for a binomial distribution?

P(X = x) = Cⁿ_x pˣ (1-p)ⁿ⁻ˣ

Cⁿ_x = number of ways to get x successes out of n Bernoulli trials

New cards

What are the 2 properties of a binomial distribution?

mean
variance

μ = E[X] = np

σ² = Var[X] = np(1-p)
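
A small sketch that checks the mean and variance against the binomial probabilities directly. The values n = 10 and p = 0.3 are assumptions for illustration, and `binomial_pmf` is a helper name made up here.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # hypothetical number of trials and success probability
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed from the definition of E[X] and Var[X]
mean = sum(x * q for x, q in enumerate(probs))
variance = sum((x - mean) ** 2 * q for x, q in enumerate(probs))

print(mean, n * p)                 # both should be close to 3.0
print(variance, n * p * (1 - p))   # both should be close to 2.1
```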

New cards

What are the 4 things that make an experiment be distributed binomially?

(BINS)

B - binary outcomes (2 given outcomes - success or failure)

I - independent trials (success/failure of one event shouldn’t affect success/failure of another event)

N - have a defined N number of trials

S - same p per trial (all trials have same p each time)

New cards

How do you answer questions that ask ‘at least one is…’, ‘at least two are…’, ‘no more than 3 are…’?

ALWAYS USE THE COMPLEMENT

P(event) = 1 - P(complement of the event)

e.g. at least one late (8 flights):

P(X ≥ 1) = 1 - P(X = 0)
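
A sketch of the flights example, assuming the number of late flights is binomial. The lateness probability p = 0.1 is a made-up value, since the card does not give one.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 8, 0.1  # 8 flights; p = 0.1 is a hypothetical chance that a flight is late

# At least one late: 1 - P(X = 0)
at_least_one = 1 - binomial_pmf(0, n, p)

# At least two late: 1 - P(X = 0) - P(X = 1)
at_least_two = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)

# No more than 3 late: P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
no_more_than_three = sum(binomial_pmf(x, n, p) for x in range(4))

print(at_least_one, at_least_two, no_more_than_three)
```

The complement saves work when the event covers most of the outcomes; adding the cases directly gives the same answer.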

New cards

What is the Poisson distribution?

Poisson distribution formula?

Mean and variance?

Count of events within a time interval (used for rare events)

Mean = variance = λ (λ = np when approximating a binomial)
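
The standard Poisson formula is P(X = x) = e^(-λ) λˣ / x!. The sketch below uses a made-up rate λ = 2 and checks numerically that the mean and variance both come out as λ; the sum is cut off at 50 terms, which is an approximation.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 2.0  # hypothetical average number of events per interval
probs = [poisson_pmf(x, lam) for x in range(50)]  # terms beyond 50 are negligible here

mean = sum(x * q for x, q in enumerate(probs))
variance = sum((x - mean) ** 2 * q for x, q in enumerate(probs))

print(probs[0], mean, variance)
```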

New cards

0! = 1

1! = 1

New cards

CONTINUOUS RANDOM VARIABLES

(measuring)

New cards

What are continuous random variables?

A variable that you measure rather than count, so there are infinitely many possible values

Continuous = can’t list all possible values

E.g. height, weight, time, etc.

New cards

What is the expression you write when a variable has a:

Normal distribution
Standard normal distribution

And what is the difference?

Normal distribution: X ~ N(μ, σ²) = X is a normally distributed random variable, centred at the mean μ with variance σ² (symmetric, so the mean, mode and median are equal)

Standard normal distribution: Z ~ N(0, 1) = when X is standardised (transformed) into a z value

(the standardised variable Z follows a standard normal distribution)

New cards

When a random variable is distributed normally, what is its:

E [x]
Var [x]

E [x] = μ

Var [x] = σ²

New cards

What are the 2 different formulas to calculating a z score and when do you use them?

1) z = (x - μ) / σ

> this is for a single observation from a normal distribution - e.g. one person’s height

2) z = (x̄ - μ) / (σ / √n)

> this is for a sample mean (the denominator is the standard error of the mean) - e.g. mean height of 50 people

(look at whether we are given a sample mean as well as the population mean, or just the population mean)
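
The two z formulas side by side, as a minimal sketch. The function names and the height figures (μ = 170, σ = 10, n = 25) are hypothetical.

```python
from math import sqrt

def z_single(x, mu, sigma):
    """z score for one observation: (x - mu) / sigma."""
    return (x - mu) / sigma

def z_sample_mean(x_bar, mu, sigma, n):
    """z score for a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (x_bar - mu) / standard_error

# Hypothetical heights in cm: population mean 170, population SD 10
z_one_person = z_single(x=180, mu=170, sigma=10)
z_group = z_sample_mean(x_bar=172, mu=170, sigma=10, n=25)

print(z_one_person, z_group)
```

A sample mean only 2 cm above μ gives the same z as a single person 10 cm above μ, because the standard error shrinks with √n.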

New cards

What are the steps to finding a probability for a random variable x that is normally distributed?

Get all info (x value/sample mean, population mean, variance/SD, n if needed)
Sketch the normal distribution bell curve X ~ N(mean, variance)
Translate x values into Z values, Z ~ N(0, 1) (either for a single observation or for a sample mean) > goes from P(X > number) to P(Z > z score)
Use the probability table to compute the required probability

New cards

Which part of the tail do you have to work out for each different probability question:

P (x < b)
P (x > b)
P (a < x < b)

New cards

If X∼N (5,0.25) evaluate:

P (X > 5.2)

Find the z score and its table probability, then subtract it from 1 to get the probability that X is above 5.2 (right tail)

New cards

If X∼N (5,0.25) evaluate:

P (X < 5.2)

Find the z score and read off its table probability (left tail)

New cards

If X∼N (5,0.25) evaluate:

P (3.9 < x < 5.3)

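Find the z score for both

3.9 > negative z score, so subtract the table value for the positive z from 1 to get its left tail prob

5.3 > read the prob straight from the table

Answer = prob for 5.3 minus prob for 3.9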

New cards

If X∼N (5,0.25) evaluate:

P (X < 3.8 or X > 4.2)

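Both values are below the mean (negative z scores), so find each left tail by subtracting the table value from 1. P(X < 3.8) is the left tail for 3.8, and P(X > 4.2) is 1 minus the left tail for 4.2. Add the two, or equivalently take 1 minus the middle bit P(3.8 < X < 4.2)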
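
The four X ~ N(5, 0.25) questions above can be checked with Python's `statistics.NormalDist`, which plays the role of the z table. This is a sketch for checking answers, not the table method itself; note that 0.25 is the variance, so the standard deviation is 0.5.

```python
from statistics import NormalDist

# X ~ N(5, 0.25): the second number is the variance, so sigma = sqrt(0.25) = 0.5
X = NormalDist(mu=5, sigma=0.25 ** 0.5)

p_above = 1 - X.cdf(5.2)                   # P(X > 5.2), right tail
p_below = X.cdf(5.2)                       # P(X < 5.2), left tail
p_between = X.cdf(5.3) - X.cdf(3.9)        # P(3.9 < X < 5.3)
p_outside = X.cdf(3.8) + (1 - X.cdf(4.2))  # P(X < 3.8 or X > 4.2)

for name, value in [("above", p_above), ("below", p_below),
                    ("between", p_between), ("outside", p_outside)]:
    print(name, round(value, 4))
```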

New cards

OVERALL - when doing these types of questions, when do you subtract from 1?

If the z score is negative (so below the mean)
If the probability you are finding is the right tail probability

New cards

As n grows, the binomial distribution can be approximated by the normal distribution. How can a z score be written using:

E[X] = np
Var[X] = np(1-p)
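
Substituting these into the single-observation z formula gives z = (x - np) / √(np(1-p)). A minimal sketch with made-up values n = 100 and p = 0.5 follows; it does not apply a continuity correction, which some courses add.

```python
from math import sqrt
from statistics import NormalDist

def binomial_z(x, n, p):
    """z score for a binomial count x, using mean np and variance np(1-p)."""
    return (x - n * p) / sqrt(n * p * (1 - p))

n, p = 100, 0.5  # hypothetical number of trials and success probability
z = binomial_z(60, n, p)              # (60 - 50) / 5
p_more_than_60 = 1 - NormalDist().cdf(z)  # normal approximation to P(X > 60)

print(z, p_more_than_60)
```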

New cards

SAMPLES AND SAMPLING DISTRIBUTIONS

New cards

*SAMPLING DONE WITH REPLACEMENT* (n = 100, ask 1 person, then move on but keep person in)

New cards

For a (large) sample statistic what is its:

E[X]
Var[X]

New cards

What is the Law of Large Numbers?

(what happens as n gets bigger)

Law states that:

(given a random sample of size n from a population with mean μ)
The sample mean will approach the population mean as n increases - this is why, on average, E[X̄] = μ
This is because as n gets bigger, our estimate of the mean gets more precise (variance smaller) > as n goes to infinity, Var[X̄] = σ²/n goes towards 0
E.g. variance = 20: 20/12 ≈ 1.67, 20/32 = 0.625, 20/62 ≈ 0.32

*Regardless of underlying probability distribution*

New cards

What is the standard error?

What is its formula?

How does the standard error change with n?

Standard deviation of a sample statistic (e.g. sample mean)

Measures how much the mean is expected to vary from sample to sample
Tells us how precise the sample mean is as an estimator of the population mean

Formula for the sample mean: SE = σ/√n

Bigger n = smaller SE (more precise estimate)

New cards

What is the standard error for a sample variance?

New cards

What is a sample that is described as IID?

How can we denote this?

Independently and Identically distributed:

Independence = occurrence of one observation doesn’t affect Pr of another occurring
Identical distribution = each observation has the same Pr distribution as the others

Denoted as: X_i ~ iid N(μ, σ²)

(given each observation in a sample is a random variable)

New cards

What is the Central Limit Theorem (CLT)?

Theorem that states that:

the sample mean of a sample of n observations
(BOTH DISCRETE AND CONTINUOUS),
drawn from a population
with any probability distribution

> WILL BE APPROXIMATELY NORMALLY DISTRIBUTED IF N IS LARGE (n > 25)
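
A small simulation can make the theorem concrete. This is only a sketch: the exponential population (mean 1, SD 1), the sample size n = 50, the 2000 repeats and the seed are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 50              # observations per sample
num_samples = 2000  # number of repeated samples

# Population: exponential with mean 1 and SD 1, which is strongly right-skewed.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

centre = mean(sample_means)    # should sit near the population mean, 1
spread = stdev(sample_means)   # should sit near sigma / sqrt(n)
standard_error = 1 / sqrt(n)

# For a normal distribution about 95% of values lie within 1.96 SE of the mean.
share_within = sum(abs(m - 1) <= 1.96 * standard_error for m in sample_means) / num_samples

print(centre, spread, standard_error, share_within)
```

Even though single observations are skewed, the sample means cluster symmetrically around 1 with spread close to σ/√n.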

New cards

POINT ESTIMATION AND CONFIDENCE INTERVALS

New cards

What is a confidence interval?

Provides a range of values within which, with a stated degree of confidence, we expect the true population mean to lie (the confidence refers to what happens under repeated sampling)

So instead of saying a sample mean (e.g. 10) is the best approximation of an unknown population mean, we say, with a certain degree of confidence, that an interval (e.g. between 8 and 12) holds the true population mean
Allows for variability in the estimate (around the sample mean estimate)

ESTIMATING MEAN

New cards

In simple terms, what does a confidence interval tell you, e.g. if the confidence level was 95%?

If many repeat samples are drawn and an interval is built from each, 95% of those intervals will contain the true population mean

NOT that, for any 1 sample, there is a 95% probability that the pop mean is between those boundaries - it's either in the region or not

New cards

How do we get there

don’t need to remember just good to remind you

New cards

What are the 2 types of distribution that can be used for confidence intervals and how do we know when to use them?

Standard normal (cumulative) distribution (Z table):

Left tail probabilities
When sample is bigger than 25 (large sample) or if told normally distributed

t distribution (t table):

Right tail probabilities
When sample smaller than 25 and population variance/SD isn’t known so have to use SAMPLE variance/SD

New cards

Sample mean is normally distributed, but:

> when applying to a test statistic, could be

standard normal
t distribution with n - 1 degrees of freedom

depending on whether the variance is known or unknown

New cards

What is the difference in a graph for a t distribution and normal distribution?

The t distribution is flatter, with fatter tails

New cards

Write the confidence interval formula for:

normal distribution
t distribution

New cards

What are the steps to finding a confidence interval?

Write out all the info (CI level, LOS α, x̄, n, σ or s, SE = σ/√n)
Work out whether to use the normal or the t distribution
Draw the graph/tails
Write out the standardised formula for the normal or t distribution
IF NORMAL - work out z_{α/2} (look up the left-tail probability 1 - α/2 in the z table, e.g. if the CI is 95%, look up 0.975, which gives z = 1.96)
IF T - work out t_{α/2} with df = n - 1 and look it up in the t table
Input the values into the formula, and end up with CI = [ lower value, upper value ]
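
The steps above can be sketched in code. The usual interval formulas are x̄ ± z_{α/2} x σ/√n (normal) and x̄ ± t_{α/2, n-1} x s/√n (t). The function names and all the sample figures are hypothetical, and the t critical value is passed in by hand as if read from a t table, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """CI for the mean when the population SD sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 0.975 gives about 1.96
    se = sigma / sqrt(n)
    return (x_bar - z_crit * se, x_bar + z_crit * se)

def t_interval(x_bar, s, n, t_crit):
    """CI for the mean using the sample SD s; t_crit comes from a t table (df = n - 1)."""
    se = s / sqrt(n)
    return (x_bar - t_crit * se, x_bar + t_crit * se)

# Hypothetical sample: x_bar = 10, sigma = 4, n = 64, 95% confidence
z_low, z_high = z_interval(x_bar=10, sigma=4, n=64)

# Hypothetical small sample: x_bar = 10, s = 4, n = 16, table value t(0.025, df=15) = 2.131
t_low, t_high = t_interval(x_bar=10, s=4, n=16, t_crit=2.131)

print((round(z_low, 2), round(z_high, 2)), (round(t_low, 3), round(t_high, 3)))
```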

New cards

How will the confidence interval change depending on an increase in:

↑ LOS (alpha)
↑ sample size (n)
↑ population variance (sigma)

LOS ↑ = CI width ↓

n ↑ = CI width ↓

sigma ↑ = CI width ↑
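
These three directions follow from the width of the normal interval, 2 x z_{α/2} x σ/√n. A quick sketch with made-up baseline values (σ = 4, n = 64, α = 0.05), changing one input at a time:

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, alpha):
    """Width of the z confidence interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * z_crit * sigma / sqrt(n)

base = ci_width(sigma=4, n=64, alpha=0.05)          # hypothetical baseline
higher_alpha = ci_width(sigma=4, n=64, alpha=0.10)  # LOS up (90% CI)
bigger_n = ci_width(sigma=4, n=256, alpha=0.05)     # sample size up
bigger_sigma = ci_width(sigma=8, n=64, alpha=0.05)  # population SD up

print(base, higher_alpha, bigger_n, bigger_sigma)
```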

New cards

HYPOTHESIS TESTING

New cards

What is a hypothesis test?

Define the 2 hypotheses we have.

Test that allows us to evaluate claims made about the population & whether our sample provides enough evidence to reject the null in favour of the alternative (otherwise we fail to reject the null)

Null hypothesis is what we are testing against (no effect)
Alternative hypothesis states your prediction (has effect)

New cards

Under a normal distribution/z statistics (population variance known) for both a 2 and 1 tailed test, what is:

The hypothesis
The test statistic & its properties
The critical value
The rule

New cards

Under a t distribution/t statistics (population variance unknown) for both a 2 and 1 tailed test, what is:

The hypothesis
The test statistic & its properties
The critical value
The rule

New cards

What does the null hypothesis always have to contain?

REMEMBER: *everything always in terms of the null*

ALWAYS CONTAINS AN EQUALS (=)

H₀ : μ = μ₀
H₀ : μ ≤ μ₀
H₀ : μ ≥ μ₀

New cards

What are the 2 types of errors that can be made (in terms of the null hypothesis)

*Hint - 2 blind 2 see*

New cards

What does a level of significance represent (α) in terms of hypothesis testing?

Calculated risk of committing a type 1 error

Usually e.g. 5%, 1% or 0.1% (CI = 95%, 99% OR 99.9%)

e.g. if α = 5%:

The pr of making a T1 error = 5% (0.05)

(there is a 5% chance of rejecting the null hypothesis when the null hypothesis is actually true - so you accept the 5% risk of making a false positive error)

New cards

What are the 3 ways you can do hypothesis testing?

Using z statistics (standard normal distribution)
Using confidence interval
Using t statistics (Student's t distribution)

New cards

What are the steps to testing a hypothesis when using z statistics?

Get all info (n, σ, x̄, α, CI, SE)
Write your hypotheses (two tailed or one tailed)
Draw your graph (shade the areas for the confidence level/LOS) and work out the critical value cv_α = z_{α/2} for a two tailed test, or z_α for a one tailed test (use z table!!)
Work out your z value using the standardised formula (+ make sure to write out the properties)
Conclude - do you reject the null or fail to reject it, why, and AT WHAT LOS
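
The steps above can be sketched for a two tailed test of H₀ : μ = μ₀ with known σ. The function name `z_test` and the sample figures are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alpha=0.05):
    """Two tailed z test of H0: mu = mu_0 with known population SD."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))        # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    reject = abs(z) > z_crit                      # reject H0 if |z| is beyond the critical value
    return z, z_crit, reject

# Hypothetical data: x_bar = 11, mu_0 = 10, sigma = 4, n = 64
z, z_crit, reject_5pct = z_test(11, 10, 4, 64, alpha=0.05)
_, z_crit_1pct, reject_1pct = z_test(11, 10, 4, 64, alpha=0.01)

print(z, round(z_crit, 2), reject_5pct)
print(round(z_crit_1pct, 3), reject_1pct)
```

The same data reject the null at the 5% LOS but not at the 1% LOS, which is why the conclusion should always state the LOS.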

New cards

What are the steps to testing a hypothesis when using a confidence interval?

Get all info (n, σ/s, x̄, α, CI, SE)
Draw your graph and work out the confidence interval
Conclude - do you reject the null or fail to reject it, why, and AT WHAT LOS - e.g. if the hypothesised value is inside the CI then you cannot reject, but if it is outside the CI then you reject

New cards

What are the steps to testing a hypothesis when using t statistics?

Get all info (n, s, x̄, α, CI, SE)
Write your hypotheses (two tailed or one tailed)
Draw your graph (shade the areas for the confidence level/LOS) and work out the critical value cv_α = t_{α/2, n-1} for a two tailed test, or t_{α, n-1} for a one tailed test (use t table!!)
Work out your t value using the standardised formula (+ make sure to write out the properties)
Conclude - do you reject the null or fail to reject it, why, and AT WHAT LOS
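
A matching sketch for the t version, for a two tailed test of H₀ : μ = μ₀ with unknown σ. The function name and data are hypothetical, and the critical value is typed in as if read from a t table (t_{0.025, 15} = 2.131), since the standard library has no t distribution.

```python
from math import sqrt

def t_test(x_bar, mu_0, s, n, t_crit):
    """Two tailed t test of H0: mu = mu_0 using the sample SD s (df = n - 1)."""
    t = (x_bar - mu_0) / (s / sqrt(n))  # test statistic
    reject = abs(t) > t_crit            # compare with the table value
    return t, reject

# Hypothetical data: x_bar = 12, mu_0 = 10, s = 4, n = 16, alpha = 0.05
t, reject = t_test(x_bar=12, mu_0=10, s=4, n=16, t_crit=2.131)

print(t, reject)
```

Here t = 2.0 is inside the critical values, so the null is not rejected at the 5% LOS, even though the same statistic would exceed the z critical value of 1.96.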