
BIOSTATS week 2a part 1


Transcript

Alright, so for today and sort of the rest of this week, we're going to look at ways we can quantify how screening tests and diagnostic tests do versus a gold standard.

So a lot of the stuff we're going to talk about this week might show up on the MCAT if you go into medicine, and it will show up throughout your career, because it's good to know how well diagnostic tests do if you're going to be using them in the field.

So just to start out, a little bit of terminology.

So screening tests, diagnostic tests, they sound very similar, but technically slightly different.

We use screening tests in patients that are otherwise healthy, so we use those on people who don't have symptoms.

So for example, cancer screenings commonly start rolling out for people in their 40s, 50s, and 60s.

They might do these at certain ages, for specific cancers, in hopes of catching it early.

On the other hand, diagnostic tests, those are used with people with disease, or with symptoms of disease rather.

So, for example, a COVID test, a blood test, that type of thing, or an X-ray as well.

Like I said, these are sort of a nitpicky kind of distinction here, but that's how we differentiate them.

And today we're going to be talking about kind of how we quantify the accuracy of either a screening test or a diagnostic test.

So like I said, with most diseases, if diagnosed early, you're going to fare better, so that is something we strive to do in the medical field.

A gold standard test is going to be something we're going to talk about a lot today, and that is the test that is accepted and assumed to determine someone's true disease status.

Whereas like a screening test, you know it's not going to be perfect every time, right?

The reason we don't do the gold standard every time, instead of doing more of a simple screening test, is oftentimes the gold standard is going to be expensive, invasive, and uncomfortable, right?

To determine someone's disease status with 100% certainty, most of the time you have to go in and take tissue out, or do something else invasive.

And we try and get around that, get around those costs, and get around making the patient uncomfortable.

So we do screening tests and diagnostic tests. But like I said, tests are not gonna be perfect. Many patients, and I'll show you, I think Wednesday, even some physicians, assume that if you get a positive test, you must have the disease.

But as we all know, we have things such as false positives that can occur, right?

So an example of this would be, let's say, an athlete tests positive for some banned substance, even if they didn't take the substance.

On the other hand, we can have false negatives if someone tests negative for a banned substance, even if they did take it.

So these tests are not perfect, right? These types of errors probably sound familiar to you already. We can have false positives.

That will occur when a test is positive, but no disease is present.

And then we can have a false negative. And that occurs when a test is negative for a disease.

But the disease is actually present. Whatever screening test we employ, we're going to want to minimize both of those errors as much as possible.

We want it to be correctly identifying disease status most of the time.

The two error types have different negative consequences.

False positives can lead to unnecessary anxiety, or they can actually lead to some sort of stigma around having a disease. In certain cultures around the world, if you're diagnosed with HIV, or diagnosed with something else incorrectly through a screening test, it can sort of ruin your life if the community finds out about that.

So that can be bad in that way. False negatives obviously can be bad as well, because they delay any type of treatment you might be enrolled in and allow the disease to progress.

So we're gonna use this as sort of our example to guide us through a lot of these probabilities we're going to talk about.

We're going to introduce a bunch of different conditional probabilities which hopefully from Friday we all sort of remember what those are.

Everything we do with these screening tests and quantifying their accuracy has to do with these conditional probabilities.

So we're going to look at a screening test for thyroid cancer or a handful of screening tests.

So in the US, about 20,000 new cases of thyroid cancer are diagnosed each year.

Some of the symptoms are hoarseness, neck pain, and large lymph nodes.

Our gold standard is going to be conventional pathology.

So that's going to require surgery to take tissue out of the patient, which is going to be invasive.

It's going to be expensive and uncomfortable for the patient.

So we have screening tests. We have the fine needle aspiration biopsy. This one is pretty basic: they're going to take a little tiny needle and jab you in the throat with it.

It's not too much fun. Then we have the ultrasound. The ultrasound is kind of like what you see pregnant patients go through; they just take the probe, put it on your throat, and press it in there.

Ultrasounds are actually more comfortable than you would think.

I had an ultrasound in the last few years, and honestly the jabbing with that tool wasn't bad.

But then we have the serum thyroglobulin and that's just a blood test, right?

And then we have radioiodine imaging. Anybody know what radioiodine imaging is? Yeah, you have to drink the iodine stuff and then it illuminates what's happening.

Something along those lines, right? So, just looking at all these, What would you all want to get if you were having to go in to be tested for thyroid cancer?

An ultrasound. You want the ultrasound? Does anybody want the fine needle aspiration? You'd take the needle to the throat.

We'll see in a second. And what about serum thyroglobulin? People are okay with blood? Whenever I get blood drawn, I pass out, so I wouldn't do that one. I worked at a hospital a few years ago; I was trying to go to physical therapy school. And on day one, they send you to employee health, like, you've got to get your blood work done. And I tried to tell her, I was like, I don't want to pass out.

And she was kind of like, no, you're a healthcare professional.

And I was like, no, I'm going to pass out. And it was a very small rural hospital, so everybody kind of knew the new guy very quickly. I passed the hell out. That was around the last time I worked in the field, so that's why I'm here to teach you all about this instead.

I'd much prefer it. What about radioiodine imaging? Anybody want to drink iodine? All right. So it sounds like most of us want to do the ultrasound.

We've got one for the fine needle aspiration biopsy. I commend you for your bravery. And then we have some people that are OK with the vampires.

So we're basing all this off of our comfort level with the test.

But some good information for us to know would be how accurate is the test?

How well does it determine your disease? Or when you don't have the disease, how well does it determine that you don't have the disease?

So we're going to bring in a bunch of conditional probabilities that are calculated around these tests when they roll out versus the gold standard.

So the first one we're going to talk about is sensitivity.

Sensitivity of a screening test is a conditional probability.

So do you all remember when you see this line, what that means?

Among or given? Exactly. Both of those are correct. So sensitivity is the probability that you test positive given that you have the disease, right?

Given that you are disease positive. And when you see this "disease positive," that stands for the gold standard result, right?

And just to kind of step back here, we calculate these things.

When a new diagnostic test or a new screening test is rolled out through the FDA, we'll calculate these things by doing studies that compare patients who have had both the new screening test and the gold standard, whatever that gold standard is at the time.

So that's kind of where all of this data will come from once we start looking at it.

So that's sensitivity. That's how well the test is doing at actually identifying positive disease.

We have a complement; remember, the complement of something is just 1 minus the probability of it happening.

The complement of sensitivity is going to be a false negative rate.

So false negative rate is just one minus the sensitivity.

So if we have a really high sensitivity, really accurate test, the false negative rate is going to be low, right?

The false negative rate is something we want to minimize, right?

We don't want a bunch of false negatives happening.

So the false negative rate is just one minus the sensitivity: the probability of testing negative given disease positive.
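Written out in symbols, just restating the definitions above:

```latex
\text{Sensitivity} = P(\text{Test}+ \mid \text{Disease}+), \qquad
\text{FNR} = P(\text{Test}- \mid \text{Disease}+) = 1 - \text{Sensitivity}
```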

So the second kind of important metric we look at with a test is the specificity.

Well, that sounds very similar to sensitivity, but it's kind of the opposite, right?

The specificity, once again, we're comparing diagnostic tests to a gold standard, but with this, it's the probability of testing negative given that the person does not have the disease by that gold standard.

Test negative given no disease. And you guessed it, there's a complement to this as well.

And the complement to this is the false positive rate.

So just like the false negatives, we want to minimize those false positives.

So our false positive rate is just going to equal 1 minus the specificity; it's just the complement again.
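And the mirror image in symbols:

```latex
\text{Specificity} = P(\text{Test}- \mid \text{Disease}-), \qquad
\text{FPR} = P(\text{Test}+ \mid \text{Disease}-) = 1 - \text{Specificity}
```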

So a highly sensitive test will have very few false negatives.

And a highly specific test will have very few false positives.

As we'll see in a minute there's going to be sort of a trade off.

Most tests that are highly sensitive are going to be less specific, and vice versa.

So, here we go. We talked about thyroid cancer screening tests a few minutes ago.

Most people liked ultrasound. We had one for fine needle aspiration, a few for serum thyroglobulin.

So, what do we think now? We have our sensitivity, right? Probability of testing positive given that someone actually has a disease by the gold standard.

And we have the specificity up here. Probability of testing negative given that someone does not have the disease.

Once again, we want these, we want to find a test that sort of maximizes both of these.

So which one will we go with? Yep, we go with the ultrasound that seems to maximize both.

The fine needle aspiration does have the highest specificity, but its sensitivity is quite a bit lower, right?

So we're going to have more false negatives here.

So I would probably go with the ultrasound here. Also, the ultrasound seems to be the one that is least invasive.

So like I said, if we were to minimize the false negative rate, we want to select ultrasound.

So when we have a super high sensitivity, that minimizes the false negative rate, and it minimizes the chance that thyroid cancer is not detected when in fact it is present. The cost of this, though; there's always kind of a cost-benefit trade-off.

With this test, we will have a higher false positive rate, because we do have a slightly lower specificity. So more people who do not have cancer may test positive, causing them unnecessary alarm and possibly unnecessary surgery.

It's just something to kind of take into consideration.

If we want to minimize that false positive rate, we would choose fine needle aspiration.

If that was what we were most interested in, that minimizes the chance that thyroid cancer is diagnosed when in fact it is not present. But the cost of this is we have that higher false negative rate, so more people who do have cancer will test negative, leading to lower chances of successful treatment.

So a quick review before bringing in other stuff, since these sound very similar and might be confusing.

What was the conditional probability for sensitivity?

It's the probability of what, given what. A lot of people will mix these up. The probability that you test positive given that you actually have a disease, right?

And the specificity is the probability that you test negative given that you don't have the disease.

Once again, these are calculated from studies that aim to look at how accurate a test is.

When a new test rolls out, they collect data on maybe 1,000 or 100,000 patients, have them take both tests, make a 2x2 table, and look and see how accurate the test is.

So a lot of you in this room are going to be seeing patients in the future here.

When a patient ends up in your office and you give them a test and then the test result comes back, you don't have the luxury of having their gold standard.

What do you have to base decisions on? All you have is their test result, right? So, sensitivity and specificity are great; they can help us understand the properties of the test and how accurate it is.

For you all as clinicians or us, even as consumers in medicine, it might be more useful to have a conditional probability that conditions upon the actual test, as opposed to the gold standard.

So we're going to bring in some other things called predictive values.

So like I said, sensitivity and specificity are useful.

They don't tell us if someone gets a positive result, what's the probability they actually have the disease, right, which is something useful for us to know if we're seeing patients in the field.

And they also don't tell us, if someone gets a negative test result, what's the probability that they are actually disease free.

So this brings in our predictive values, and like I said, these are going to be more useful to us in the field. But sensitivity and specificity and both of these predictive values: I'm going to expect you to know the difference, what all of these mean, and how to calculate them.

So first one positive predictive value once again conditional probability and this is just a probability that someone has the disease given that they test positive.

So it just looks like sensitivity, but just flipped around.

And then on the other end, the negative predictive value, that's just the probability that the disease is absent given the test is negative.
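In symbols, the two predictive values just swap the conditioning around:

```latex
\text{PPV} = P(\text{Disease}+ \mid \text{Test}+), \qquad
\text{NPV} = P(\text{Disease}- \mid \text{Test}-)
```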

So just like that specificity flipped around. And we're not just throwing numbers at you, flipping things around.

It gives you, like I said, this is something that is more helpful to people actually in the field to see patients, because all we have in the field is that positive or negative test that our patient gets.

And then we can maybe look at the medical literature and think: should we enroll them in another test that's more invasive to make sure, or, based on the positive predictive value or negative predictive value of this test, can we go ahead and either say yes, they have the disease, or rule it out?

So as you'd expect, with both of these, we want tests with high predictive values, right?

We want tests that are identifying disease correctly most of the time.

So I'm just going to ground this in a 2 by 2 table here and kind of start talking through this.

I think this will help. So imagine this as, over here, is our new test, right?

And then imagine this as our gold standard, right?

And actually, I'm gonna back up even more and pull up a spreadsheet.

So let's say patients are enrolled in a study and we're trying to figure out how well the new test is doing.

So we have a bunch of people in our study. This first column is the gold standard, that invasive thing they had to do.

Then we have the new test. And maybe on the gold standard they got a positive, and on the new test they got a positive, so the new test did well on that one, right? Maybe then negative and negative, right? We've got some agreement happening.

Then for some of them, you know, the gold standard might be positive and the new test might get it wrong, right?

So we have a data set like this, right? And we can summarize those results by having our test result here positive and negative.

And then We have that gold standard up here, disease or no disease.

We can count how many people end up in each one of those.

We can count those who have a positive test result and have the disease. We can count those who have a negative test result and don't have the disease.

So, then we have our false negatives here, right?

Those who had negative and do have disease and our false positives up here, those who tested positive and don't have disease.

So, we're gonna want everybody to be in these two squares, right?

That's the goal there: we want everybody to be correctly identified. So this is the little 2 by 2 table; we'll be working with data like this in the next couple of days.

Yeah, once again, we're going to want people in these two cells, and we're going to minimize false negatives and false positives.

So calculating sensitivity, calculating specificity, it's pretty straightforward.

We can calculate it straight from the data. We can just calculate here as the number of true positives divided by the total people who actually have the disease.

Same with specificity, we can calculate the number of true negatives divided by the total people who don't have the disease.
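As a quick sketch of that calculation (the 2x2 counts here are made up purely for illustration):

```python
# Hypothetical validation-study counts (made-up numbers):
#                 disease +   disease -
# test positive    TP = 90     FP = 40
# test negative    FN = 10     TN = 860
TP, FP, FN, TN = 90, 40, 10, 860

sensitivity = TP / (TP + FN)  # true positives / everyone who has the disease
specificity = TN / (TN + FP)  # true negatives / everyone who does not

print(f"sensitivity = {sensitivity:.3f}")  # 0.900
print(f"specificity = {specificity:.3f}")  # 0.956
```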

Predictive values are a little different. They get a little more difficult to estimate. Predictive values are going to depend on what proportion of all positive tests can be expected to be true positives.

So predictive values have to do with the prevalence of the outcome in the population.

So, does anybody know what prevalence is? Prevalence is more of an epidemiology term, but I think it might sound familiar to some of you.

Prevalence is just a proportion. It's going to be a proportion of how many people in a population have the outcome in a certain time period.

So these predictive values are all going to be functions of sensitivity, specificity, and prevalence.

And so prevalence, like I said, the prevalence of a disease is the proportion of the population that has the disease in a given time period.

We can also think of this as the probability that a randomly selected member of the population has the disease.

So if we want to calculate like what's the prevalence of HIV in Georgia, we would just take all the cases and divide it by the total population of Georgia.

Then we'd get the prevalence. So it's fairly straightforward calculation. We can just calculate this as disease positives over N.
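So, with made-up numbers just to illustrate the shape of the calculation (these are not real Georgia figures):

```latex
\text{Prevalence} = P(D+) = \frac{\#\,\text{disease positives}}{N}
\;\approx\; \frac{60{,}000}{10{,}700{,}000} \approx 0.0056
```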

Prevalence is estimated from large scale prospective studies.

Anybody remember what prospective studies are?

We talked about this many weeks ago. Go ahead. Looking ahead. Looking ahead, exactly. So we can estimate prevalence of a disease from studies where we enroll a bunch of people who don't have the disease and watch it occur naturally over time.

With our case control studies, we can't estimate prevalence.

Anybody think about why we wouldn't want to estimate prevalence from a case control study?

So how do we, go ahead. Exactly, yeah, so you're getting at it there. So with a case control, remember, we would find people with the disease, and then we would match them to those who don't have the disease.

So we, the researchers in a case control study, are setting the prevalence ourselves in the matching.

So we might set the prevalence at 50-50 when that's not what's naturally occurring.

So that was a little weird aside, but that's going to come into play a lot in the next couple of days.

Our large prospective studies are going to be the kind that can render us this type of prevalence.

This is where it gets a little mathy. I'm not going to spend too much time on this, but we went over conditional probabilities a few days ago, right?

And we can use these conditional probabilities, conditioning on B given A, and sort of unpack all of this stuff and end up with the components of a formula for our positive predictive values and negative predictive values.

And both of these formulas are going to take into account the prevalence of the disease in the population.

They're also going to take into account the sensitivity and the specificity.
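For the curious, the unpacking is just Bayes' theorem with A = Test+ and B = Disease+ (as on slide 27); this is a sketch of the step the slides walk through:

```latex
\text{PPV} = P(D+ \mid T+)
= \frac{P(T+ \mid D+)\,P(D+)}{P(T+ \mid D+)\,P(D+) + P(T+ \mid D-)\,P(D-)}
```

where P(T+ | D+) is the sensitivity, P(T+ | D-) is the false positive rate (1 minus the specificity), and P(D+) is the prevalence.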

So we're going to go through this in great detail.

But I will say what we can get from this is we can get the sensitivity.

We can get the complement of the specificity and the prevalence of disease.

We can apply Bayes' theorem, these conditional probabilities based on the prior prevalence of disease, and end up with these formulas where we can calculate the positive predictive value, that thing we will use in the field to determine how sure we are about a person's positive or negative result.

So we end up with these formulas. So these few slides, I wouldn't worry too much about them, but this is the ticket here.

If you put a star by this slide, this is the one trick that you're going to have to use.

And all these are, they kind of look very similar, but they are different.

So if you're trying to calculate a positive predictive value, you want to use this formula.

That's just the sensitivity times the prevalence in the numerator and the sensitivity times the prevalence again.

Add that to the complement of the specificity times the complement of the prevalence.

And then the negative predictive value looks very similar but slightly different: the specificity times 1 minus the prevalence, divided by the specificity times 1 minus the prevalence in the denominator.

We add that to the complement of the sensitivity times the prevalence.
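Written out cleanly, the two formulas just described are:

```latex
\text{PPV} = \frac{\text{Sens} \times \text{Prev}}
                  {\text{Sens} \times \text{Prev} + (1-\text{Spec}) \times (1-\text{Prev})}
\qquad
\text{NPV} = \frac{\text{Spec} \times (1-\text{Prev})}
                  {\text{Spec} \times (1-\text{Prev}) + (1-\text{Sens}) \times \text{Prev}}
```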

So this formula can trip some people up, but the main thing is if you can get your sensitivity and specificity from the problem, unpack that.

And then pinpoint whatever your prevalence is; your prevalence will be given to you in the problem. Then you just plug all this in and solve, and also be able to interpret the findings.

And we're going to do a few examples here.

So here it is. Approximately 1% of 40-year-old women who are screened for breast cancer actually have the disease.

A mammogram can correctly identify 99% of breast cancer cases.

Also, suppose that 10% of women who are tested and don't have breast cancer get a positive result. So 10% that don't have cancer get a positive. We're going to go through all these conditional probabilities and unpack them from this.

So from this, what would our sensitivity be? It's sort of given to you in words and in number format up here.

So sensitivity is a probability that we test positive given gold standard disease, right?

Which one of these three numbers out there looks most like that?

Exactly, yeah. So a mammogram can correctly identify 99% of breast cancer cases.

So, that's just giving us our sensitivity in word form.

And then our specificity, probability, test negative given no cancer.

So how can we get that from this? Subtract 99 from 10? I think you're close. You'd actually just subtract the 10 from 100. So remember the false positive rate: that's going to be the complement of the specificity.

And here we're given a false positive rate. Suppose that 10 percent of women who are tested and don't have breast cancer get a positive result.

So we're just given the false positive rate there, so we're going to do 1 minus that.

And then we'll get that specificity. We flip things around and we get 0.9. Does everybody see that? Follow me there? And then our positive and negative predictive values are all based on the prevalence of the disease in the population.

So these predictive values are going to be functions of the prevalence.

And the prevalence is given to us here. So what would the prevalence be? So the number given to us up there, so it would be 0.01, right?

So approximately 1% of 40-year-old women who are screened actually have the disease.

So we have our sensitivity, we have our specificity, and we have our prevalence. So now we want to calculate a positive predictive value and a negative predictive value.

So just before we even calculate anything, what is the positive predictive value going to tell us and why is it going to be more advantageous to us?

It's also a conditional probability, right? It's the probability of something given something else.

Go ahead. What's the probability that they have the disease given that they had tested positive, right?

So it's sort of like the sensitivity flipped; we're conditioning upon that test. And the reason it might be more important, or more advantageous for us, is because in the field, what information do we actually have? All we have is their test result.

And how do we go about calculating it? Well, we have to use these two great formulas. So in this situation, we already have our sensitivity, we already have our specificity, We are given the prevalence of the disease in the wider population.

So we can just plug all this stuff in that formula to get our positive predictive value.

So we have our prevalence, right? We have the complement of our prevalence, 0.99. We have our sensitivity. We had the complement of our specificity. So for positive predictive value, this is what we'll use.

So we'll just plug in our sensitivity and our prevalence in the numerator, then in the denominator plug in the sensitivity and prevalence again plus the complement of the specificity times the complement of the prevalence, and we can just solve for this.

So we get this pretty low proportion, right? 0.091. Does anybody want to take a stab at interpreting this?

Keep in mind, go ahead. There's like a 9.1% chance that she actually has breast cancer when she tests positive. Exactly. So if a randomly selected 40-year-old woman gets a positive mammogram, there's a 9.1% chance she indeed has breast cancer.
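A quick sanity check of that arithmetic, just plugging the given numbers into the PPV formula:

```python
sens, spec, prev = 0.99, 0.90, 0.01  # from the mammogram example

# PPV = P(disease | test positive)
ppv = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
print(f"PPV = {ppv:.3f}")  # 0.091 -> about a 9.1% chance of cancer given a positive test
```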

So what do we think? Does this sound surprising to anybody? So sensitivity, sorry, Positive predictive values are going to be functions of the prevalence of the outcome in the actual population.

And among 40-year-old women, breast cancer is not that prevalent.

Forty-year-olds are going to be one of the populations where you see it less, whereas as women get older, it's going to be more prevalent.

And because of that, we have less of a positive predictive value, right?

We have less power in our positive predictive value.

So you're probably thinking, okay, well why are we even using mammograms then if they have such a low positive predictive value?

Well, now we're going to flip around, and this is where I was supposed to ask if someone should be overly alarmed.

But I'm just going to go straight to our negative predictive value here.

And if we plug all this in, we calculate the negative predictive value.

Same type thing. We just plug in that specificity prevalence, specificity prevalence, sensitivity prevalence.

And we end up with 0.9999. Anybody want to take a stab at interpreting this?

It's a 99.99% chance that if they test negative, they don't have it.

Exactly. So if a randomly selected 40-year-old woman gets a negative mammogram, there's a 99.99% chance she doesn't have breast cancer. So among this population, the mammogram does a better job of ruling out disease, right?
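Just to check that number with the NPV formula:

```latex
\text{NPV} = \frac{0.90 \times 0.99}{0.90 \times 0.99 + (1-0.99) \times 0.01}
= \frac{0.891}{0.8911} \approx 0.9999
```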

Because that prevalence is so low. So, you know, that question I posed a minute ago, Why do we even give mammograms if that's the case?

Well, in this population, it can do a good job ruling it out.

If someone gets a negative result, a 40-year-old gets a negative result, we can be pretty sure; 99.99% sure they don't have breast cancer. But these things change as the prevalence changes in the population.

So among 50-year-olds or 60-year-olds, this might start to flip a little bit.

It might have a better positive predictive value as that prevalence goes up.

So the sensitivity, once again sensitivity and specificity, are measures of overall test quality, test accuracy, And on the other hand, predictive values can be thought of as measures of how well the test works in the actual population.

And that's because the predictive values are dependent on the prevalence of the disease in the population itself.

We just kind of talked about that. So this will be a nice bridge to Wednesday for us.

But we just did everything using those big crazy formulas. In the rest of this week, you're not always going to have to use those formulas; sometimes you're going to be able to use a 2 by 2 table to calculate these things. You can just calculate the conditional probabilities by the row or by the column.

You can also kind of turn whatever data you're given in a problem into a 2 by 2 table.

So, like, we were given all these metrics. We were given, you know, sensitivity, specificity, and prevalence.

Well, we can actually, like, make our own little 2 by 2 table from that and just use some sort of theoretical total population.

So we can say, all right, we have this sensitivity.

We have this specificity. We have this prevalence. Let's say we use this test in a population of 100,000 people.

So 100,000 would be on the total bottom right here and we could take that 100,000 and multiply that by our prevalence.

So 0.01, and then we get 1,000 as our disease-positive total, those who we think would truly have the disease.

And then out of those 1,000, we can multiply by our sensitivity to get our true positives: those who test positive out of our total disease positives. So we can do 1,000 times 0.99 to get 990. Then we can also do 100,000 minus those 1,000 to get our 99,000 disease negatives, and multiply those by our false positive rate to get 9,900 false positives.

And then from this, we can just do some subtraction in addition to get all our row totals and column totals.

So we can take this minus this, get that here, this minus this, this here, add up our row totals.

And from this, we can just take our test positive, true positives, and we can divide that by our total test positives.

Because what are we conditioning upon again? It's the probability of having the disease given that you tested positive, so we'd be conditioning upon those test positives here.

So 990 over 10,890. And we get the same number, 9.1%. This is a way you could do it that gets you out of having to implement those formulas; that's essentially what the formula is doing behind the scenes, just in 2 by 2 table format.
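Here's that whole back-of-the-envelope table construction as a short sketch, using the same numbers:

```python
N = 100_000                    # theoretical total population
sens, spec, prev = 0.99, 0.90, 0.01

disease_pos = N * prev         # 1,000 truly have the disease
disease_neg = N - disease_pos  # 99,000 do not

tp = disease_pos * sens        # 990 true positives
fn = disease_pos - tp          # 10 false negatives
fp = disease_neg * (1 - spec)  # 9,900 false positives
tn = disease_neg - fp          # 89,100 true negatives

ppv = tp / (tp + fp)  # condition on the test-positive row: 990 / 10,890
npv = tn / (tn + fn)  # condition on the test-negative row
print(f"PPV = {ppv:.3f}, NPV = {npv:.4f}")  # PPV = 0.091, NPV = 0.9999
```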

It also, just talking through that, it seems a little confusing, so if you don't want to go that route, I totally understand.

Like I said, on Wednesday and Friday we'll look at types of data where we can actually just use the table rows to calculate stuff.

So for like a prospective study where the prevalence is naturally occurring, we can just use the table rows to calculate our predictive values.

Before we go, just a public service announcement: this is the part of the semester where things start to get a little more challenging, where we start getting outside the stats you might have taken in high school.

For some reason this is also the part of the semester where people stop coming.

It's like, you know what? People all come for the first unit, which is quite a bit easier.


biostats week 2a part 1

screening tests are used for those who are otherwise healthy ex cancer screening

DIAGNOSTIC TESTS are used in patients with symptoms of a disease

most diseases fare better when diagnosed early

GOLD STANDARD- an accepted test that is assumed to be able to determine the true disease status (expensive and invasive though)

TESTS ARE NOT PERFECT

false positives and negatives can occur in any test

Slide 5 - Test Errors • A false positive occurs when a...

false positives can lead to anxiety or unnecessary treatment and also societal stigmas

false negatives delay any treatment the person may need

Slide 7 - In the United States, about 20,000 new cases of...

Slide 8 - Available screening tests:

Slide 9 - Sensitivity To determine which screening test...

SENSITIVITY of a screening test is the conditional probability that the test is positive given that the person actually has the disease

the line in a probability is among or given, so it is based on dependent events

Slide 10 - Sensitivity: Sensitivity = P(Test + | Disease +) ...

complement of something is 1- it happening

FALSE NEGATIVE RATE IS EQUAL TO 1- SENSITIVITY

SENSITIVITY IS P(TEST +|DISEASE +)

SO FNR EQUALS P(TEST - | DISEASE +)

Slide 11 - Specificity • The specificity of a screening...

SPECIFICITY of a screening test is the conditional probability that the test is - given that the person does NOT have the disease

Slide 12 - Specificity: Specificity = P(Test − | Disease −) ...

FALSE POSITIVE RATE IS EQUAL TO 1- SPECIFICITY

SPECIFICITY IS P(TEST -|DISEASE -)

SO FPR EQUALS P(TEST + | DISEASE -)

Slide 13 - Sensitivity and Specificity Example: Thyroid...

you want high sensitivity and specificity for accuracy. however, you'd also want to choose a screening test that is less invasive

Slide 14 - Choosing a Screening Test Example: Thyroid Cancer

higher FNR means more false negatives... missed treatment

higher FPR means more false positives... unnecessary treatment

Slide 18 - Predictive Values • While very useful, the...

sensitivity and specificity does NOT tell us the probability of an actual disease case in a positive test or a lack of disease in a negative result

Slide 19 - Predictive Values • The positive predictive...

POSITIVE PREDICTIVE VALUE (PPV OR PV+) is the probability that disease is present given that the test is positive

PPV= P(DISEASE+ | TEST +)

NEGATIVE PREDICTIVE VALUE (NPV OR PV-) is the probability that the disease is absent given a negative test

NPV= P(DISEASE- | TEST -)

Slide 20 - Predictive Values • In general, we desire tests...

Slide 21 - Predictive Values

Slide 23 - The predictive values depend on what proportion...

predictive values are based on how prevalent the disease is in the population

(prevalence is the proportion of people in a population who have the disease in a certain time period)

PREDICTIVE VALUES ARE A FUNCTION OF PREVALENCE (ALONG WITH SENSITIVITY AND SPECIFICITY)

sensitivity and specificity CAN be estimated directly from study data, while predictive values also depend on prevalence and are harder to estimate

Slide 25 - P(D+) = D+/N = Prevalence

prevalence is estimated from large PROSPECTIVE studies (enroll those who don't have the disease and watch it occur naturally over time)

P(D+)= D+/N= PREVALENCE

Slide 26 - Bayes’ Theorem

Slide 27 - Applying Bayes’ Theorem Let 𝐴 = Test + , and 𝐵...

complement is 1- probability

Slide 29 - PPV and NPV Formulas!!

PPV = (SENSITIVITY × PREVALENCE) / [(SENSITIVITY × PREVALENCE) + (1 − SPECIFICITY) × (1 − PREVALENCE)]

NPV = (SPECIFICITY × (1 − PREVALENCE)) / [(SPECIFICITY × (1 − PREVALENCE)) + (1 − SENSITIVITY) × PREVALENCE]

Slide 30 - Breast Cancer Screening Example:

10% of women who don't have breast cancer get a positive result = the FPR is given

Slide 31 - Breast Cancer Screening Example:

SENSITIVITY WOULD BE THE PERCENT THE MAMMOGRAM CAN ACTUALLY DETECT

THE OTHER 90% (BASED ON THE 10% FPR) WOULD BE THE SPECIFICITY

THE PREVALENCE IS THE 1% OF WOMEN WHO HAVE THE DISEASE WHEN THEY ARE SCREENED (.01)

Slide 32 - Breast Cancer Screening Example:

Slide 33 - Breast Cancer Screening Example:

Slide 34 - Prevalence = 0.01, 1 − Prevalence = 1 − 0.01 = 0.99 ...

Prev: .01; Prev complement: .99 (1 − .01); Sensitivity: .99; Specificity complement: .1

Specificity: .9

Slide 35 - Breast Cancer Screening Example: Mammograms for...

doing the PPV calculation will show the percent chance a woman has the disease given a positive test
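As a worked check with the numbers above:

```latex
\text{PPV} = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.1 \times 0.99}
= \frac{0.0099}{0.1089} \approx 0.091
```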

Slide 37 - Example: Mammograms for 40 - year - old women

a high NPV shows there's a higher chance she doesn't have breast cancer given a negative test

PPV can change with age

Slide 38 - Sensitivity & Specificity Versus Predictive...

Slide 41 - Breast Cancer Screening Example: Mammograms for...



Made With Glean | Open Event

BIOSTATS week 2a part 1

biostats week 2a part 1

Transcript

I We're prepping that scroll for today and sort of the rest of this week, we're going to look at ways we can quantify how screening tests and diagnostic tests do versus a bold standard.

So there's a lot of stuff we're going to talk about this week might be things that will show up on NCAT if you go into medicine it will be things that will show up throughout your career because it's good to know how well diagnostic tests do if you're going to be using the field.

So just to start out, a little bit of terminology.

So screening tests, diagnostic tests, they sound very similar, but technically slightly different.

We use screening tests in patients that are otherwise healthy, so we use those on people who don't have symptoms.

So for example, cancer screen is commonly started rolling out people in their 40s, 50s, and 60s.

They might do these at certain ages, first specific cancers, in hopes to catch up early.

On the other hand, diagnostic tests, those are used with people with disease, so or with symptoms of disease rather.

So, for example, this with COVID test, food test, any type of thing or extra test as well.

Like I said, these are sort of an impiky kind of thing here, but that's how we differentiate them.

And today we're going to be talking about kind of how we quantify the accuracy of either a screening test or a diagnostic test.

So like I said, if diagnosed early, most diseases you're going to fare better, so that is something we strive to do in the medical field.

A gold standard test is going to be something we're going to talk about a lot today and that is the test that is accepted and assumed to determine some of the most true disease stats.

Whereas like a screening test, you know it's not going to be perfect every time, right?

The reason we don't do the gold standard every time, instead of doing more of a simple screening test, is oftentimes the gold standard is going to be both expensive, invasive, comfortable, right?

To determine something that's disease that is 100%, most of the time you just go in and take out an IOC and something more invasive.

And we try and get around that, get around those costs, and get around making the patient uncomfortable.

So we do screening tests to diagnostic tests. But like I said, tests are not gonna be perfect. Many patients, and I'll show you, I think Wednesday, that even some physicians assume that if you give positive tests, that you must have the disease.

But as we all know, we have things such as false positives that can occur, right?

So, example this would be like, let's say, an athlete test positive for some substance, even if they didn't take the substance.

On the other hand, we can have false negatives if someone tests negative for a banned substance, even if they did take it.

So these tests are not perfect, right? These types of errors probably sound familiar to you already, but we can have false positive.

That will occur when a test is positive, but no disease is present.

And then we can have a false negative. And that occurs when a test is negative for a disease.

But the disease is actually present. Whatever test screening test we employ, we're going to want to minimize those as much as possible.

We don't, we want it to be correctly identifying disease most of the time.

So They both kind of have different negative connotations.

False positives can lead to unnecessary anxiety or you can actually lead some sort of stigma around having a disease in certain cultures around the world if you're diagnosed with HIV or diagnosed with something that is, you know, incorrectly diagnosed through a screening test, it sort of like ruined your life if the community finds out about that.

So that can be bad in that way. False negatives, obviously can be bad as well because it delays any type of treatment you might be enrolled in and delays if you're a disease progression improving.

So we're gonna use this as sort of our example to guide us through a lot of these probabilities we're going to talk about.

We're going to introduce a bunch of different conditional probabilities which hopefully from Friday we all sort of remember what those are.

Everything we do with these screening tests and quantifying their accuracy has to do with these conditional probabilities.

So we're going to look at a screening test for thyroid cancer or a handful of screening tests.

So in the US, about 20,000 new cases of thyroid cancer are diagnosed each year.

Some of the symptoms are hoarseness, neck pain, and large lymph nodes.

Our gold standard is going to be conventional pathology.

So that's going to require surgery that will take something out of you for out of patient, which is going to be invasive.

It's going to be expensive and uncomfortable for the patient.

So we have screening tests. We have the fine needle aspiration biopsy. So this also is going to be pretty basic. They're going to take a little tiny needle and jab you in the throat with it.

It's not too much fun. Then we have the ultrasound. So the ultrasound is kind of like, you see pregnant persons go through, just take the thing, put it on your throat, jam it in there.

Ultrasounds are actually more comfortable than you would think.

I had an ultrasound in the last few years and I believe it was like really the jabbing me with that tool.

But then we have the serum thyroglobulin and that's just a blood test, right?

And then we have radio iodine energy. Anybody know what radio iodine energy is? Yeah, you have to drink the iodine stuff and then like illuminates what's happening.

Something along those lines, right? So, just looking at all these, What would you all want to get if you were having to go in to be tested for thyroid cancer?

An ultrasound. You want the ultrasound? If anybody wants to find aspirational, You take the needles and throw them.

We'll see in a second. And what about serum thyroid problems? People are okay with blood. Whenever I get blood drawn, I pass out. So I wouldn't do that one. I worked at a hospital a few years ago. I was trying to go to physical therapy school. And those are day one, go to an employee health. Like, you've got to get your blood work done. And I tried to tell her that I was like, I don't want to pass out.

And she was kind of like, no, you're a healthcare professional.

And I was like, no, I'm going to pass out. And then there's a very small rural hospital. Everybody kind of knew the new guy very quickly. Oh, really? I pressed the hell out. But then the last time that I went to the field. So that's why I'm here to teach you all about that.

I'd much prefer it. What about radio iodine imaging? Anybody want to drink iodine? All right. So it sounds like most of us want to do ultrasound.

You got one finding an aspiration biopsy. I commend you for your bravery. And then we have some people that are OK with the vampires.

So we're basing all this off of how our comfort level with the test.

But some good information for us to know would be how accurate is the test?

How well does it determine your disease? Or when you don't have the disease, how well does it determine that you don't have the disease?

So we're going to bring in a bunch of conditional probabilities that are calculated around these tests when they roll out versus the gold standard.

So the first one we're going to talk about is sensitivity.

Sensitivity of a screening test is a conditional probability.

So do you all remember when you see this line, what that means?

A long or given? Exactly. Both of those are correct. So sensitivity is the probability that you test positive given that you have the disease, right?

Given that you are disease positive. And when you see this disease positive, are you gonna write that, stands for that gold standard, right?

And just to kind of step back here, we calculate these things.

We, in a new diagnostic test, or a new screen test, is rolled out at the FDA.

We'll calculate these things and do studies that compare patients who have had both a new screen test and the gold standard, whatever that gold standard is at the time.

So that's kind of where all of this data will come from once we start looking at it.

So that's sensitivity. That's how well the test is doing at actually identifying positive disease.

We have a complement, a number of the complement of something is just 1 minus it happening.

The complement of sensitivity is going to be a false negative rate.

So false negative rate is just one minus the sensitivity.

So if we have a really high sensitivity, really accurate test, the false negative rate is going to be low, right?

The false negative rate is going to be something we don't minimize, right?

We don't want a bunch of false negatives happening.

So false negative rate is just one minus the sensitivity, test negative given the positive.

So the second kind of important metric we look at with a test is the specificity.

Well, that sounds very similar to sensitivity, but it's kind of the opposite, right?

The specificity, once again, we're comparing diagnostic tests to a gold standard, but with this, it's the probability of testing negative given that the person does not have the disease by that gold standard.

The test negative given the disease. And you guessed it, there's a compliment to this as well.

And the compliment to this is a false positive rate.

So just like the false negatives, we want to minimize those false positives.

So our false positive rate is just going to equal 1 minus the specificity, just to compliment your game.

So a highly sensitive test will have very few false negatives.

And a highly specific test will have very few false positives.

As we'll see in a minute there's going to be sort of a trade off.

Some most tests that are highly sensitive are going to make it less specific and vice versa.

So here we go. Talk about thyroid cancer screening tests So, here we go.

We talked about fiber cancer screening tests a few minutes ago.

Most people liked ultrasound. We had one finding aspiration, a few serum fibrogavillum.

So, what do we think now? We have our sensitivity, right? Probability of testing positive given that someone actually has a disease by the gold standard.

And we have the specificity of the year. Probability of testing negative given that someone that does not have the disease.

Once again, we want these, we want to find a test that sort of maximizes both of these.

So which one will we go with? Yep, we go with the ultrasound that seems to maximize both.

The fine needle aspiration does have the highest specificity, but its sensitivity is quite a bit lower, right?

So we're going to have more false negatives here.

So I would probably go with the ultrasound here. Also the ultrasound seems to be the one that is invasive.

So like I said, if we were to minimize the false negative rate, we want to select ultrasound.

So when we have a super high sensitivity that would minimize the false negative rate And it will minimize the chance that thyroid cancer is not detected.

When in fact it is present. The cost of this though, there's always kind of a cost benefit.

We'll reach the test. We will have a higher false positive rate because we do have you know slightly lower specificity so more people who do not have cancer may test positive causing them unnecessarily alarmed and may cause them to have unnecessary surgery.

It's just something to kind of take into consideration.

If we want to minimize that false positive rate, we would choose by needle aspiration.

That was what we were most interested in. That minimizes the chance that thyroid cancer is diagnosed when in fact it is not present but the cost of this is we have that higher false negative rate so more people do have cancer and the chest negative to lower chances of successful treatment.

So school review before bringing in other stuff sounds very similar and might be confusing.

What was the conditional probability for sensitivity?

It's the probability of what, given what. A lot of people will mix these up. The probability that you test positive given that you actually have a disease, right?

And the specificity is the probability. The test negative given that you don't have these seeds.

Once again, these are calculated from studies that aim to look at how accurate a test is.

When a new test rolls out, They collect data on maybe 100,000 or 1000 patients and have them take both tests and Make a 2x2 table and look and see how accurate the test is.

So a lot of you in this room are going to be seeing patients in the future here.

When a patient ends up in your office and you give them a test and then the test result comes back, you don't have the luxury of having their gold standard.

What do you have to decide information about? All you have is their test result, right? So, while a sensitivity and specificity are great, they can help us understand the properties of the test and how accurate it is.

For you all as clinicians or us, even as consumers in medicine, it might be more useful to have a conditional probability that conditions upon the actual test, as opposed to the gold standard.

So we're going to bring in some other things called predictive values.

So like I said, Well, sensitivity and specificity are useful.

They don't tell us if someone gets a positive result, what's the probability they actually have the disease, right, which is something useful for us to know if we're seeing patients in the field.

And also it doesn't tell us if someone gets a negative test result, what's the probability that they are actually Disney's free.

So this brings in our predictive values and like I said these are going to be more useful to us in the field but you know sensitivity and specificity and both of these predictive values I'm going to expect you to kind of know the difference and kind of what all of these mean and have calculated.

So first one positive predictive value once again conditional probability and this is just a probability that someone has the disease given that they test positive.

So it just looks like sensitivity, but just flipped around.

And then on the other end, negative victim value, that's just the probability that the disease is absent given the test is negative.

So just like that specificity flipped around. And we're not just throwing numbers at you, flipping things around.

It gives you, like I said, this is something that is more helpful to people actually in the field to see patients, because all we have in the field is that positive or negative test that our patient gets.

And then we can maybe look at the medical literature and think about a should we enroll them in another test that's more invasive to make sure or based on the positive predictive value or negative predictive value of this test, can we go ahead and either say yes, they have the disease or no, they don't rule it out.

So as you'd expect with both of these, we want tests of high-privileged values, right?

We want tests that are identifying disease correctly most of the time.

So I'm just going to round this in a 2 by 2 table here and kind of start talking through this.

I think this will help. So imagine this as, over here, is our new test, right?

And then imagine this as our gold standard, right?

And so actually I'm gonna back up even more and pull out the cell sheet.

So let's say patient. If the patient's maybe re-enrolled in a study and we're trying to figure out how well the new test is doing.

So, we had a bunch of people on our study and then if we have someone, they took, this is the gold standard, so this is that invasive thing they had to do.

Then we have the new test. And then maybe in the gold standard they got a positive and then the new test they got a positive so new tested did good on that one right maybe the negative negative right we got some agreement happening.

Then some of them, you know, you might have positive, new test might get it wrong, right?

So we have a data set like this, right? And we can summarize those results by having our test result here positive and negative.

And then We have that gold standard up here, disease or no disease.

We can count how many people end up in each one of those.

We can count those, have the positive, test results, and have disease.

We have a negative test result and don't have disease.

So, then we have our false negatives here, right?

Those who had negative and do have disease and our false positives up here, those who tested positive and don't have disease.

So, we're gonna want everybody to be in these two squares, right?

That's the goal we took there. We want everybody to be correctly identified. So this little 2 by 2 j will be kind of working with data like this in the next couple days.

Yeah, once again, we're going to want people in these, and we're going to minimize false negative, false positive.

So calculating sensitivity, calculating specificity, it's pretty straightforward.

We can calculate it straight from the data. We can just calculate here as the number of true positives divided by the total people who actually have the disease.

Same with specificity, we can calculate the number of true negatives divided by the total people who don't have the disease.

Predictive values can get a little different. They get a little more difficult to estimate. So predictive values are going to depend on what proportion of all positive tests can be expected to be true with positives.

So predictive values have to do with the prevalence of the outcome in the population.

So Does anybody know what prevalence is? Prevalence is kind of like more of an epidemiology term, but I think maybe you would sound familiar with some people.

Prevalence is just a proportion. It's going to be a proportion of how many people in a population have the outcome in a certain time period.

So sensitivity and specificity and prevalence are all going to be functions of these predictive values.

And so prevalence, like I said, is that the prevalence of a disease, proportion of the population, have the disease in a given time period.

We can also think of this as the probabilities that a randomly selected member has a disease.

So if we want to calculate like what's the prevalence of HIV in Georgia, we would just take all the cases and divide it by the total population of Georgia.

Then we'd get the prevalence. So it's fairly straightforward calculation. We can just calculate this as disease positives over N.

Prevalence is estimated from large scale prospective studies.

Anybody remember what prospective studies are?

We talked about this many weeks ago. Go ahead. Looking ahead. Looking ahead, exactly. So we can estimate prevalence of a disease from studies where we enroll a bunch of people who don't have the disease and watch it occur naturally over time.

With our case control studies, we can't estimate prevalence.

Anybody think about why we wouldn't want to estimate prevalence from a case control study?

So how do we, go ahead. Exactly, yeah, so you're getting at it there. So with a case control, remember, we would find people with the disease, and then we would match them to those who don't have the disease.

So we, the researcher in case control study, are setting the prevalence ourselves in matching.

So we might set the prevalence at 50-50 when that's not what's naturally occurring.

So that was a little weird aside, but that's going to come into play a lot in the next couple of days.

Our large perspective studies are going to be the kind that can render us this type of prevalence.

This is where it gets a little mathy. I'm going to spend too much time on this, but we went over conditional probabilities a few days ago, right?

And we can use these conditional probabilities, condition on B given A, and sort of untact all of this stuff here and end up with components of a formula for our positive predictive values and negative predictive values.

And both of these formulas are going to take into account the prevalence of the disease in the population.

They're also going to take into account the sensitivity and the specificity.

So we're going to go through this in great detail.

But I will say what we can get from this is we can get the sensitivity.

We can get the complement of the specificity and the prevalence of disease.

We can apply this theorem. We can apply these conditional probabilities that are based on prior prevalence of disease and end up with some of these formulas where we can calculate the positive predictive value, that thing we will use in the field to determine how well the tet, to determine how sure we are that the person has a positive or negative result.

So we end up with these formulas. So these few slides, I wouldn't worry too much about them, but this is the ticket here.

If you put a star by this slide, this is the one trick that you're going to have to use.

And all these are, they kind of look very similar, but they are different.

So if you're trying to calculate a positive predictive value, you want to use this formula.

That's just the sensitivity times the prevalence in the numerator, and then in the denominator the sensitivity times the prevalence again.

Add that to the complement of the specificity times the complement of the prevalence.

And then the negative predictive value looks very similar but slightly different: it's the specificity times 1 minus the prevalence, divided by the specificity times 1 minus the prevalence in the denominator.

We add that to the complement of the sensitivity times the prevalence.
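Written out cleanly, the two formulas just described are:

$$PV^+ = \frac{\text{sensitivity} \times \text{prevalence}}{\text{sensitivity} \times \text{prevalence} + (1 - \text{specificity}) \times (1 - \text{prevalence})}$$

$$PV^- = \frac{\text{specificity} \times (1 - \text{prevalence})}{\text{specificity} \times (1 - \text{prevalence}) + (1 - \text{sensitivity}) \times \text{prevalence}}$$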

So this formula can trip some people up, but the main thing is if you can get your sensitivity and specificity from the problem, unpack that.

And then pinpoint whatever your prevalence is; your prevalence will be given to you in the problem. Then you just plug all this in and solve, and you should also be able to interpret the findings.

And we're going to do a few examples here.

So here it is. Approximately 1% of 40-year-old women who are screened for breast cancer actually have the disease.

A mammogram can correctly identify 99% of breast cancer cases.

Also, suppose that 10% of women who don't have breast cancer get a positive result.

So we're going to go through all these conditional probabilities and see if we can unpack them from this.

So from this, what would our sensitivity be? It's sort of given to you in words and numbers up here.

So sensitivity is a probability that we test positive given gold standard disease, right?

Which one of these three numbers out there looks most like that?

Exactly, yeah. So a mammogram can correctly identify 99% of breast cancer cases.

So that's just giving us our sensitivity in word form.

And then our specificity: the probability of testing negative given no cancer.

So how can we get that from this? Subtract 99 from 10? Close. You'd actually subtract the 10 from 100. So remember the false positive rate: that's going to be the complement of the specificity.

And here we're given a false positive rate. Suppose that 10 percent of women who don't have breast cancer get a positive result.

So we're given a false positive rate there, so we're going to just do 1 minus that.

And then we'll get that specificity. We flip things around and we get 0.9. Has everybody seen that? Follow me there? And then our positive and negative predictive values are all based on the prevalence of the disease in the population.

So these predictive values are going to be functions of the prevalence.

And the prevalence is given to us here. So what would the prevalence be? So the number given to us up there, so it would be 0.01, right?

So approximately 1% of 40-year-old women who are screened actually have the disease.

So we have a sensitivity, we have a specificity, and we have a prevalence. So now we want to calculate a positive predictive value and a negative predictive value.

So just before we even calculate anything, what is the positive predictive value going to tell us and why is it going to be more advantageous to us?

It's also a conditional probability, right? It's the probability of something given something else.

Go ahead. What's the probability that they have the disease given that they had tested positive, right?

So it's sort of like the sensitivity flipped; we're conditioning upon the test. And the reason it might be more important, or more advantageous for us, is because in the field, what information do we actually have? We have the test result.

And how do we go about calculating it? Well, we have to use these two great formulas. So in this situation, we already have our sensitivity, we already have our specificity, We are given the prevalence of the disease in the wider population.

So we can just plug all this stuff in that formula to get our positive predictive value.

So we have our prevalence, right? We have the complement of our prevalence, 0.99. We have our sensitivity. We have the complement of our specificity. So for the positive predictive value, this is what we'll use.

So we'll just plug in our sensitivity and our prevalence in the numerator, then in the denominator the sensitivity times the prevalence plus the complement of the specificity times the complement of the prevalence, and we can just solve for this.
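As a quick check, here's a minimal Python sketch of that plug-and-solve, using the numbers from this example (the variable names are just for illustration):

```python
sensitivity = 0.99  # mammogram correctly identifies 99% of cancer cases
specificity = 0.90  # 1 minus the 10% false positive rate
prevalence = 0.01   # 1% of screened 40-year-old women have the disease

# Positive predictive value: P(disease + | test +)
ppv = (sensitivity * prevalence) / (
    sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
)
print(round(ppv, 3))  # 0.091
```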

So we get this pretty low, pretty low proportion, right?

0.091. Does anybody want to take a stab at interpreting this?

Keep in mind, go ahead. There's like a 9.1% chance that she actually has breast cancer if she tests positive.

Exactly. So if a randomly selected 40-year-old woman gets a positive mammogram, there's a 9.1% chance she indeed has breast cancer.

So what do we think? Does this sound surprising to anybody? Positive predictive values are going to be functions of the prevalence of the outcome in the actual population.

And among 40-year-old women, breast cancer is not that prevalent.

It's going to be one of the populations where you see it less, whereas in older women it's going to be more prevalent.

And because of that, we have less of a positive predictive value, right?

We have less power in our positive predictive value.

So you're probably thinking, okay, well why are we even using mammograms then if they have such a low positive predictive value?

Well, now we're going to flip around, and this is where I was supposed to ask if someone should be overly alarmed.

But I'm just going to go straight to our negative predictive value here.

And if we plug all this in, we calculate the negative predictive value.

Same type of thing. We just plug in the specificity times 1 minus the prevalence, then add the complement of the sensitivity times the prevalence.
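Worked out with the same numbers as before:

$$PV^- = \frac{0.90 \times 0.99}{0.90 \times 0.99 + (1 - 0.99) \times 0.01} = \frac{0.891}{0.8911} \approx 0.9999$$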

And we end up with about 0.9999. Anybody want to take a stab at interpreting this?

There's a 99.99% chance that if they test negative, they don't actually have it.

Exactly. So if a randomly selected 40-year-old woman gets a negative mammogram, there's a 99.99% chance she doesn't have breast cancer.

So among this population, the mammogram does a better job of ruling out disease, right?

Because that prevalence is so low. So, you know, that question I posed a minute ago, Why do we even give mammograms if that's the case?

Well, in this population, it can do a good job ruling it out.

If someone gets a negative result, a 40-year-old gets a negative result, we can be pretty sure they don't.

We can be 99.99% sure they don't have the disease. But these things change as the prevalence changes in the population.

So among 50-year-olds or 60-year-olds, this might start to flip a little bit.

It might have a better positive predictive value as that prevalence goes up.

So once again, sensitivity and specificity are measures of overall test quality and accuracy. On the other hand, predictive values can be thought of as measures of how well the test works in the actual population.

And that's because the predictive values are dependent on the prevalence of the disease in the population itself.

We just kind of talked about that. So this will be a nice bridge to Wednesday for us.

But we just did everything using those big formulas. For the rest of this week, you're not always going to have to use those formulas; sometimes you'll be able to use a two-by-two table to calculate these things. You can simply calculate the conditional probabilities by the rows or by the columns.

You can also kind of turn whatever data you're given in a problem into a 2 by 2 table.

So, like, we were given all these metrics. We were given, you know, sensitivity, specificity, and prevalence.

Well, we can actually, like, make our own little 2 by 2 table from that and just use some sort of theoretical total population.

So we can say, all right, we have this sensitivity.

We have this specificity. We have this prevalence. Let's say we use this test in a population of 100,000 people.

So 100,000 would be on the total bottom right here and we could take that 100,000 and multiply that by our prevalence.

So 0.01, and then we get 1,000 as our disease-positive total, those who would truly have the disease.

Then out of that 1,000, we can multiply by our sensitivity to get those who have the disease and test positive, our true positives out of our total disease positives.

So we can do 1,000 times 0.99 to get 990. And then we could also do 100,000 minus those 1,000 disease positives to get our disease negatives, 99,000, and multiply those by our false positive rate to get 9,900 false positives.

And then from this, we can just do some subtraction in addition to get all our row totals and column totals.

So we can take this minus this, get that here, this minus this, this here, add up our row totals.

And from this, we can just take our test positive, true positives, and we can divide that by our total test positives.

Because what are we conditioning upon again? It's our probability of having the disease given that you tested positive, so we would be conditioning upon the test positives here.

So 990 over 10,890. And we get the same number, 9.1%. It's a way you can do it that gets you out of having to implement those formulas, but that's essentially what the formula is doing behind the scenes; it just puts everything in two-by-two table format.
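Here's a minimal Python sketch of that two-by-two construction, assuming the same theoretical population of 100,000; it reproduces the 9.1% without touching the big formulas:

```python
n = 100_000                    # theoretical total population
prevalence = 0.01
sensitivity = 0.99
fpr = 0.10                     # false positive rate = 1 - specificity

disease_pos = round(n * prevalence)    # 1,000 truly have the disease
disease_neg = n - disease_pos          # 99,000 do not

tp = round(disease_pos * sensitivity)  # 990 true positives
fn = disease_pos - tp                  # 10 false negatives
fp = round(disease_neg * fpr)          # 9,900 false positives
tn = disease_neg - fp                  # 89,100 true negatives

# Condition on the test-positive column for PPV, test-negative for NPV
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
print(round(ppv, 3), round(npv, 4))    # 0.091 0.9999
```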

Also, just talking through that, I know it can seem a little confusing, so if you don't want to go that route, I totally understand.

Like I said, on Wednesday and Friday we'll look at types of data where we can just use the table rows to calculate things.

So for like a prospective study where the prevalence is naturally occurring, we can just use the table rows to calculate our predictive values.

Before we go, just a public service announcement: this is the part of the semester where things start to get a little more challenging, where we start getting outside the stats you may have seen in high school.

For some reason this is also the part of the semester where people stop coming.

It's like, you know what? Everybody comes for the first unit, which is quite a bit easier.


biostats week 2a part 1

screening tests are used for those who are otherwise healthy, ex: cancer screening

DIAGNOSTIC TESTS are used in patients with symptoms of a disease

most diseases fare better when diagnosed early

GOLD STANDARD- an accepted test that is assumed to be able to determine the true disease status (expensive and invasive though)

TESTS ARE NOT PERFECT

false positives and negatives can occur in any test

Slide 5 - Test Errors • A false positive occurs when a...

false positives can lead to anxiety or unnecessary treatment and also societal stigmas

false negatives delay any treatment the person may need

Slide 7 - In the United States, about 20,000 new cases of...

Slide 8 - Available screening tests:

Slide 9 - Sensitivity To determine which screening test...

SENSITIVITY of a screening test is the conditional probability that the test is positive given that the person actually has the disease

the vertical line in a conditional probability reads as "given" or "among," so it is based on dependent events

Slide 10 - Sensitivity: Sensitivity = P(Test + | Disease +) ...

the complement of an event is 1 minus the probability of it happening

FALSE NEGATIVE RATE IS EQUAL TO 1- SENSITIVITY

SENSITIVITY IS P(TEST +|DISEASE +)

SO FNR EQUALS P(TEST - | DISEASE +)

Slide 11 - Specificity • The specificity of a screening...

SPECIFICITY of a screening test is the conditional probability that the test is negative given that the person does NOT have the disease

Slide 12 - Specificity: Specificity = P(Test − | Disease −) ...

FALSE POSITIVE RATE IS EQUAL TO 1- SPECIFICITY

SPECIFICITY IS P(TEST -|DISEASE -)

SO FPR EQUALS P(TEST + | DISEASE -)

Slide 13 - Sensitivity and Specificity Example: Thyroid...

you want high sensitivity and specificity for accuracy. however, you'd also want to choose a screening test that is less invasive

Slide 14 - Choosing a Screening Test Example: Thyroid Cancer

higher FNR means a higher false negative rate... more missed cases, so treatment gets delayed

higher FPR means a higher false positive rate... unnecessary treatment

Slide 18 - Predictive Values • While very useful, the...

sensitivity and specificity do NOT tell us the probability that disease is actually present given a positive test, or absent given a negative result

Slide 19 - Predictive Values • The positive predictive...

POSITIVE PREDICTIVE VALUE (PPV OR PV+) is the probability that disease is present given that the test is positive

PPV= P(DISEASE+ | TEST +)

NEGATIVE PREDICTIVE VALUE (NPV OR PV-) is the probability that the disease is absent given a negative test

NPV= P(DISEASE- | TEST -)

Slide 20 - Predictive Values • In general, we desire tests...

Slide 21 - Predictive Values

Slide 23 - The predictive values depend on what proportion...

predictive values are based on how prevalent the disease is in the population

(prevalence is the proportion of people in a population who have the disease in a given time period)

PREDICTIVE VALUES ARE FUNCTIONS OF PREVALENCE (along with sensitivity and specificity)

sensitivity and specificity CAN be estimated directly from study data, while predictive values CANNOT unless the data reflect the true prevalence

Slide 25 - P(D+) = D+/N = Prevalence

prevalence estimates are based on large PROSPECTIVE studies (enroll those who don't have the disease and watch it develop over time)

P(D+)= D+/N= PREVALENCE

Slide 26 - Bayes’ Theorem

Slide 27 - Applying Bayes’ Theorem Let 𝐴 = Test + , and 𝐵...

complement is 1- probability

Slide 29 - PPV and NPV Formulas!!

PPV = (SENSITIVITY × PREVALENCE) / [(SENSITIVITY × PREVALENCE) + (1 − SPECIFICITY) × (1 − PREVALENCE)]

NPV = (SPECIFICITY × (1 − PREVALENCE)) / [(SPECIFICITY × (1 − PREVALENCE)) + (1 − SENSITIVITY) × PREVALENCE]

Slide 30 - Breast Cancer Screening Example:

10% of women who don't have breast cancer get a positive result = the false positive rate (FPR)

Slide 31 - Breast Cancer Screening Example:

SENSITIVITY WOULD BE THE PERCENT OF BREAST CANCER CASES THE MAMMOGRAM CAN ACTUALLY DETECT (99%)

THE OTHER 90% (1 − THE 10% FPR) WOULD BE THE SPECIFICITY

THE PREVALENCE IS THE 1% OF WOMEN WHO HAVE THE DISEASE WHEN THEY ARE SCREENED (.01)

Slide 32 - Breast Cancer Screening Example:

Slide 33 - Breast Cancer Screening Example:

Slide 34 - Prevalence = 0.01, 1 − Prevalence = 1 − 0.01 = 0.99 ...

Prev: .01; Prev complement: .99 (1 − .01); Sensitivity: .99; Specificity complement: .1; Specificity: .9

Slide 35 - Breast Cancer Screening Example: Mammograms for...

doing the PPV calculation will show the percent chance a woman has the disease given a positive test

Slide 37 - Example: Mammograms for 40 - year - old women

a high NPV shows there's a high chance she doesn't have breast cancer given a negative test

PPV can change with age

Slide 38 - Sensitivity & Specificity Versus Predictive...

Slide 41 - Breast Cancer Screening Example: Mammograms for...


