Sampling Techniques and Statistical Analysis Review

Probability and Inclusion Probability

Inclusion probability is the probability that a particular unit of the population is selected into the sample.
Consider the probability of picking a female from the general human population.
Ideally, every unit has an equal chance of being sampled, so that the inclusion probability matches what you intended.

Critical Value

The qt function (R's quantile function for the t distribution) returns the t critical value for a given probability and degrees of freedom.
It's a fixed number for that degrees of freedom and confidence level.

Lecture Outline & Revision

Simple random sampling is not always the ideal technique for certain scenarios.
It works in theory, but real-world constraints exist.
Constraints include limited resources and cost, especially when sampling land for soil carbon content.

Monitoring Question

This section will talk about stratified random sampling and revisiting a site.
Revisiting a site influences your thinking about the sets.
It involves different variations of recomputing a confidence interval band.
The focus is on one particular thing called weight.

Confidence Interval Band

The lecture discusses why we would want to torture ourselves to calculate confidence intervals differently.

Simple Random Sampling

If you toss 10 random points onto a landscape, you might miss something or sample some areas too much.
The idea of sampling is that it has to be representative.
To mitigate these two problems, you could take more samples, to the point that the chance of either problem averages out.
Physical limitations can also hinder you, such as a tall rock that prevents you from ever throwing the sampling ring behind it.

Example of Grassland and Wetland

Analogy: If an area is 80% grassland and 20% wetland, an ideal simple random sample of 10 would place eight samples in the grassland and two in the wetland.
It is very possible that you get zero wetland samples.
Even more likely, you will not hit that exact ratio of eight to two.
The split is always variable.
The goal is to avoid this variable representation.
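
How variable the split is can be checked with a short calculation. This is a minimal sketch that assumes each of the 10 points lands independently, with a 0.2 chance of falling in wetland; the function name is made up for illustration.

```python
from math import comb

def prob_k_wetland(k, n=10, p_wetland=0.2):
    """Binomial probability that exactly k of n random points land in wetland."""
    return comb(n, k) * p_wetland**k * (1 - p_wetland) ** (n - k)

p_zero = prob_k_wetland(0)   # no wetland points at all
p_exact = prob_k_wetland(2)  # exactly the 8:2 split

print(f"P(zero wetland)    = {p_zero:.3f}")
print(f"P(exactly 8:2)     = {p_exact:.3f}")
print(f"P(any other split) = {1 - p_exact:.3f}")
```

Under these assumptions, roughly one sample in ten misses the wetland entirely, and the exact 8:2 split turns up in fewer than a third of samples.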

Using Prior Information

If you have more information about the population, using it does not require any additional sampling effort. You just have to apply a technique that uses it to get a better representation of your population.
Knowing this prior information, we can use it to our advantage to perform our sampling.

Stratified Random Sampling

The technique that uses this prior information is called stratified random sampling.
Divide the population into subgroups, which we call strata, and then sample from each stratum using simple random sampling.
This involves an extra step compared with simply picking data points with a random number generator.
It will be difficult at first because you then have to combine the per-stratum estimates to get one more accurate estimate.
You're stratifying and getting individual statistics from each stratum, but how do you then consolidate them into one value that still represents the whole population?
Algorithm:
- Divide the population into strata
- Take random samples within each stratum
- Combine the stratum estimates
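
The three steps can be sketched in a few lines. This is an illustration only: the population values are simulated, and the stratum names, sizes, and sample counts are assumptions chosen to match the 80/20 grassland and wetland example.

```python
import random
from statistics import mean

# Hypothetical population: soil carbon (%) for 80 grassland and 20 wetland plots.
rng = random.Random(42)
population = {
    "grassland": [rng.gauss(2.0, 0.3) for _ in range(80)],
    "wetland": [rng.gauss(6.0, 0.8) for _ in range(20)],
}

# 1. Divide: the strata are the dictionary keys; weights come from stratum sizes.
total = sum(len(units) for units in population.values())
weights = {name: len(units) / total for name, units in population.items()}

# 2. Take random samples within each stratum (simple random sampling, no replacement).
n_per_stratum = {"grassland": 8, "wetland": 2}
samples = {
    name: rng.sample(units, n_per_stratum[name])
    for name, units in population.items()
}

# 3. Combine: weight each stratum mean by its share of the population.
pooled_mean = sum(weights[name] * mean(vals) for name, vals in samples.items())
print(weights, round(pooled_mean, 2))
```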

Rules for Strata

When you start to think about stratified random sampling, you have to understand that there are rules. The first is extremely important: strata must be mutually exclusive and collectively exhaustive.
Every sample belongs to exactly one stratum.
Boundaries must be clear.
Within a stratum, all units should ideally have the same inclusion probability.
Each stratum must be sampled.
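
The first rule can be checked mechanically when each unit has an identifier. The sketch below is one possible check, not a standard routine; the function name and the unit ids are hypothetical.

```python
def check_strata(population_ids, strata):
    """Return a list of rule violations for a proposed stratification.

    strata maps a stratum name to the set of unit ids assigned to it.
    """
    problems = []
    seen = set()
    for name, ids in strata.items():
        if not ids:
            problems.append(f"stratum {name!r} is empty and cannot be sampled")
        overlap = seen & ids
        if overlap:  # not mutually exclusive
            problems.append(f"units {sorted(overlap)} are in more than one stratum")
        seen |= ids
    missing = set(population_ids) - seen
    if missing:  # not collectively exhaustive
        problems.append(f"units {sorted(missing)} belong to no stratum")
    return problems

good = {"grassland": {1, 2, 3, 4}, "wetland": {5}}
bad = {"grassland": {1, 2, 3}, "wetland": {3, 5}}  # unit 3 overlaps, unit 4 is missing
print(check_strata(range(1, 6), good))
print(check_strata(range(1, 6), bad))
```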

Good vs. Poor Stratification Choices

Good strata are ones where you can truly separate groups.
- Undergrad versus master's versus PhD.
- Deciduous forest versus coniferous versus mixed.
- Income levels: low, medium, high.
Poor strata choices are groups that overlap, such as sports fans versus music lovers versus foodies, because one person can belong to more than one group.
The number of samples taken from each stratum should be proportional to the size of the stratum, so that it is represented but not overrepresented.
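
Sizing each stratum's sample in proportion to the stratum can be sketched as below. The largest-remainder rounding is an assumption added here so the counts always sum to the requested total; the notes themselves do not specify a rounding rule.

```python
def proportional_allocation(stratum_sizes, n_total):
    """Split n_total samples across strata in proportion to stratum size.

    Uses largest-remainder rounding so the counts sum to n_total.
    """
    total = sum(stratum_sizes.values())
    exact = {name: n_total * size / total for name, size in stratum_sizes.items()}
    counts = {name: int(x) for name, x in exact.items()}
    leftover = n_total - sum(counts.values())
    # Hand the remaining samples to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts

print(proportional_allocation({"grassland": 80, "wetland": 20}, 10))
```

A very small stratum can round down to zero samples under this rule, which would break the requirement that each stratum be sampled, so in practice you would check the result and adjust.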

Advantages of Stratified Random Sampling

Addresses the problems that simple random sampling has with small sample sizes.
Given enough samples, simple random sampling will solve all of these problems anyway.
Stratification is therefore most advantageous for small sample sizes.

Simple Random Sampling

Not obsolete; still very reliable.
You're just gonna have a wider confidence interval margin.
People can make decisions based on any confidence interval.

Calculating Confidence Interval

Once you have your stratified sample, you go back to the same three things you need to calculate a confidence interval band that describes the sample.
You still have to calculate the mean, but now it's a pooled mean.
You still have to calculate the standard error, but now it's a pooled standard error.
Then calculate your confidence interval to make your decisions.
Both pooled values are based on stratum weight.

Equations

A pooled mean and a pooled standard error need to be calculated using the weighting approach in order to make statistical inferences about the true population parameters.
w = weight
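
The equations themselves did not survive in these notes. Based on the verbal description in the sections on the pooled mean and pooled standard error, they can be reconstructed as follows. The symbols other than w are assumptions: H is the number of strata, and for stratum h, \bar{y}_h is the sample mean, s_h^2 the sample variance, and n_h the number of samples, with n the total sample size.

```latex
\bar{y}_{\text{pooled}} = \sum_{h=1}^{H} w_h \, \bar{y}_h
\qquad
SE_{\text{pooled}} = \sqrt{\sum_{h=1}^{H} w_h^{2} \, \frac{s_h^{2}}{n_h}}
\qquad
df = \sum_{h=1}^{H} (n_h - 1) = n - H
```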

Understanding Pooling by Weight

Weighting goes back to inclusion probabilities.
For example, land that makes up 80% of the population should have a weight of 0.8, and land that makes up 20% should have a weight of 0.2.
The biggest stratum represents more of the population than the smallest stratum.
To calculate the pooled mean, it's simply 0.8 multiplied by the mean of the big stratum plus 0.2 multiplied by the mean of the small stratum.
The weight is based on the population and not on the samples that you take, because samples will vary but the weight is fixed.
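
A small numeric sketch shows why the weight must come from the population rather than the sample. The soil carbon values are invented, and the sample deliberately takes five points from each stratum even though the land is split 80/20.

```python
from statistics import mean

# Hypothetical soil carbon samples (%), five per stratum.
grassland = [2.1, 1.9, 2.4, 2.0, 2.1]
wetland = [5.8, 6.4, 6.0, 6.2, 6.1]

# Weights are fixed by the land area (80% grassland, 20% wetland).
pooled_mean = 0.8 * mean(grassland) + 0.2 * mean(wetland)

# Ignoring the strata treats the sample as if the land were split 50/50.
unweighted_mean = mean(grassland + wetland)

print(round(pooled_mean, 2), round(unweighted_mean, 2))
```

Here the unweighted mean is pulled toward the wetland values because wetland is overrepresented in the sample, while the pooled mean is not.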

Weight

Based on initial strata rules that you specify

Pooled Mean

The equation is relatively simple because you're just multiplying each stratum mean by its weight and adding the results.
Equation: 0.69% can be used as a basis of how you're going to compute.
Weighted mean is pooled mean.

Pooled Standard Error

Relates back to your variance measure.
Instead of a single variance term, we use the sum of weighted variances from each stratum.
You don't compute individual standard errors per stratum; instead, you start with s^2, which is your variance.
Multiply it by the weight squared.
The reason is that variance is in squared units, so the weight is squared as well. When you eventually take the square root to get the standard error, the units go back to the usual ones. You square it knowing that you will square root it later.
You also have to divide each stratum's variance by the number of samples in that stratum, as for any standard error.
The degrees of freedom are different for a stratified random sample.
Instead of just n minus one, we lose one degree of freedom for each stratum.
The mean of each stratum is estimated from the data, so you lose one degree of freedom per stratum mean.
The degrees of freedom therefore go down.
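
The pooled standard error and degrees of freedom described above can be sketched as follows. The function names and the sample values are assumptions for illustration; the weights follow the 80/20 example.

```python
from math import sqrt
from statistics import variance

def pooled_standard_error(samples, weights):
    """Square root of the sum over strata of w^2 * s^2 / n."""
    return sqrt(
        sum(weights[h] ** 2 * variance(vals) / len(vals) for h, vals in samples.items())
    )

def degrees_of_freedom(samples):
    """Total sample size minus one degree of freedom per stratum."""
    return sum(len(vals) for vals in samples.values()) - len(samples)

# Hypothetical soil carbon samples (%).
samples = {
    "grassland": [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1],
    "wetland": [5.8, 6.4],
}
weights = {"grassland": 0.8, "wetland": 0.2}

se = pooled_standard_error(samples, weights)
df = degrees_of_freedom(samples)
print(round(se, 4), df)
```

With ten samples in two strata, the degrees of freedom are 10 - 2 = 8 rather than the 9 that simple random sampling would give.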

Calculate Confidence Interval

After weighting, the 95% confidence interval is calculated the same way as before: pooled mean, plus or minus the pooled standard error multiplied by the t critical value.
The fundamental idea is still the same: central value plus or minus margin of error.
For each stratum, you calculate the variance divided by the number of samples. To get the weighted variance, you multiply that by the squared weight.
You are standardizing to the stratum weight.
Then you sum across strata and take the square root of the whole thing to convert from variance to standard error.
This is the longer way of calculating what we call the variance of the mean and the standard error of the mean.
To get the confidence interval, it's really just mean plus or minus t critical multiplied by SE.
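
Putting the pieces together, a stratified confidence interval might be computed as in this sketch. The data are invented, and the t critical value is passed in by hand rather than looked up: 2.306 is the two-sided 95% value for 8 degrees of freedom, which in R would come from qt(0.975, 8).

```python
from math import sqrt
from statistics import mean, variance

def stratified_ci(samples, weights, t_crit):
    """Pooled mean +/- t_crit * pooled SE for a stratified random sample."""
    pooled_mean = sum(weights[h] * mean(v) for h, v in samples.items())
    # Variance of the mean: weighted stratum variances, each divided by its n.
    var_of_mean = sum(weights[h] ** 2 * variance(v) / len(v) for h, v in samples.items())
    margin = t_crit * sqrt(var_of_mean)
    return pooled_mean - margin, pooled_mean + margin

# Hypothetical soil carbon samples (%); 10 samples in 2 strata gives df = 8.
samples = {
    "grassland": [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1],
    "wetland": [5.8, 6.4],
}
weights = {"grassland": 0.8, "wetland": 0.2}

low, high = stratified_ci(samples, weights, t_crit=2.306)
print(round(low, 3), round(high, 3))
```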

Comparison of Random Sampling and Stratified Random Sampling

Compare a simple random sample with a hypothetical stratified random sample in which the units were grouped into strata before sampling.
When a stratified random sample is computed,
- You could, in theory, achieve a much, much lower variance of the mean.
- Your confidence interval band is narrower.
  - You are more confident that the value lies within that range.

Efficiency

The sampling efficiency is written as Var(SRS) / Var(StRS), where SRS refers to simple random sampling and StRS refers to stratified random sampling.
If the number is bigger than 1, the stratified random sample is more efficient.
Sampling efficiency is a ratio of variances.
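
As a rough illustration, the ratio can be computed by treating the same data once as a simple random sample and once as a stratified one. This mirrors the hypothetical comparison in the notes; the data and function name are assumptions.

```python
from statistics import variance

def sampling_efficiency(samples, weights):
    """Var(SRS mean) / Var(StRS mean), using the same observations for both."""
    all_values = [x for vals in samples.values() for x in vals]
    # Simple random sampling: one variance term for the whole sample.
    var_srs = variance(all_values) / len(all_values)
    # Stratified: sum of weighted stratum variances, each divided by its n.
    var_strs = sum(weights[h] ** 2 * variance(v) / len(v) for h, v in samples.items())
    return var_srs / var_strs

# Hypothetical soil carbon samples (%).
samples = {
    "grassland": [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1],
    "wetland": [5.8, 6.4],
}
weights = {"grassland": 0.8, "wetland": 0.2}

ratio = sampling_efficiency(samples, weights)
print(round(ratio, 1))
```

The ratio is far above 1 here because the two strata have very different means, so most of the overall variance is between strata rather than within them.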

Stratification Tips

Your stratification can be spatial, temporal, or the way the land is managed.
Stratum size determines sample size: try to allocate samples to each stratum in proportion to its size.

Monitoring Studies

In monitoring studies, you have prior information of that site, and therefore, it is possible to use the information to make a better estimate of the site rather than assuming that you're coming to a new site each time.
It has an added statistical advantage that you have prior knowledge.
Coming back means there's a change in mean value before and after because things do change over time.

Measuring Soil Carbon Content

How do we select sites for that second measurement?
Do you return to the same sites or do you select new sites? The choice affects this thing called covariance in our measurements.

Confidence Interval

To calculate the confidence interval of a monitored site and represent it in one single number, the best way to incorporate both visits is to have a common measurement, and that common measurement is usually the difference.
The one value that links the two sides of a before-and-after experiment is the difference between them.
If you return to the same sites, you're assuming that there's covariance between the before and after measurements, and you include it as an advantage in your calculations.
If your second set of measurements comes from different sites in the same population, then the covariance term disappears.
The variance of the difference is then found by adding the two variances together.
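
The effect of covariance on the variance of the difference can be seen numerically. In this sketch the before and after values are invented, and the covariance helper is written out by hand so the example depends only on the standard library.

```python
from statistics import mean, variance

def covariance(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# Hypothetical soil carbon (%) at the same five sites, before and after.
before = [2.0, 2.5, 3.1, 1.8, 2.7]
after = [2.3, 2.9, 3.6, 2.0, 3.1]

# Different sites each time: no covariance, so the variances simply add.
var_independent = variance(before) + variance(after)

# Same sites revisited: subtract twice the covariance.
var_paired = var_independent - 2 * covariance(before, after)

# The paired result equals the variance of the site-by-site differences.
diffs = [a - b for a, b in zip(after, before)]
print(round(var_independent, 4), round(var_paired, 4), round(variance(diffs), 4))
```

Because sites that were high before tend to be high after, the covariance is positive and the paired variance is much smaller.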

Covariance

If paired, then you're thinking of covariance

Sampling Design

In general, returning to the same site works better, but you lose information of the overall general estimate.

Statistical Analysis

95% coverage is double for the change in mean.
Use the t.test function with the paired = TRUE option to do the whole calculation and get a confidence interval.
Adding covariance means you tell the t.test function that you're doing a paired t-test.
The confidence interval calculation is a bit more complex when covariance is included, which is why you let the paired t-test function perform it.
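
For reference, the confidence interval that a paired t-test reports can be sketched by hand as below. This is not the R function itself, only the same arithmetic on invented data. The t critical value is supplied manually: 2.776 is the two-sided 95% value for 4 degrees of freedom, which in R would come from qt(0.975, 4).

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(before, after, t_crit):
    """Confidence interval for the mean change at revisited sites."""
    diffs = [a - b for a, b in zip(after, before)]
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))  # standard error of the mean difference
    return d_bar - t_crit * se, d_bar + t_crit * se

# Hypothetical soil carbon (%) at the same five sites; 5 pairs gives df = 4.
before = [2.0, 2.5, 3.1, 1.8, 2.7]
after = [2.3, 2.9, 3.6, 2.0, 3.1]

low, high = paired_ci(before, after, t_crit=2.776)
print(round(low, 3), round(high, 3))
```

If the interval excludes zero, as it does for these made-up numbers, the data are consistent with a real change in the mean between visits.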
