## Simple statistics problem - does this make sense to you?

• I'm doing a statistics problem for an entry-level stats class (actually it's a 300-level course, but it's dead easy). Anyway, I came across this problem and the answer doesn't seem right:

The proportion of people in a given community who have a certain disease is 0.005. A test is available to diagnose the disease. If a person has the disease, the probability that the test will produce a positive result is 0.99. If the person does not have the disease, the probability that the test will produce a positive result is 0.01. Suppose a person tests positive. What is the probability that the person actually has the disease?

So, I define event A as being "A person has the disease", and event B as being "The test comes up positive". Given the above information, we can infer:

P(A) = 0.005
P(!A) = 0.995
P(B|A) = 0.99
P(B|!A) = 0.01

The question is asking, what is P(A|B)?

This is a simple Bayes' theorem problem:

P(A|B) = (P(A) * P(B|A)) / (P(A) * P(B|A) + P(!A) * P(B|!A))

And the answer comes out to about 33%. So, if a test is done and it comes up positive, it's only 33% accurate? Does that make sense?

I'm going into the assistance center tomorrow to check. I think I have my conditional probabilities mixed up.
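For anyone who wants to check the arithmetic, here is a minimal Python sketch of the formula above using the numbers from the problem. The function and parameter names (`posterior`, `sensitivity`, `false_positive_rate`) are just illustrative labels, not anything from the course material.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(A|B) via Bayes' theorem.

    prior               = P(A)    chance of having the disease
    sensitivity         = P(B|A)  chance of a positive test if diseased
    false_positive_rate = P(B|!A) chance of a positive test if healthy
    """
    true_pos = prior * sensitivity                  # P(A) * P(B|A)
    false_pos = (1 - prior) * false_positive_rate   # P(!A) * P(B|!A)
    return true_pos / (true_pos + false_pos)


p = posterior(prior=0.005, sensitivity=0.99, false_positive_rate=0.01)
print(f"P(A|B) = {p:.4f}")  # about 0.33
```

The numerator is 0.00495 and the denominator is 0.00495 + 0.00995 = 0.0149, which gives roughly 0.332.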

• Yup, makes sense to me.

EDIT: IRL conditioning is harder. People don't randomly decide to take the test. Some people choose to take it because they know they've been exposed to the disease, or because they're exhibiting symptoms. That changes the math.
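To get a feel for how much the prior matters, here is a rough sketch that reruns the same Bayes calculation with a few different starting probabilities. Only 0.005 comes from the problem; the other priors are made-up values standing in for people who have symptoms or a known exposure.

```python
def ppv(prior, sensitivity=0.99, false_positive_rate=0.01):
    """P(disease | positive test) for a given prior P(disease)."""
    tp = prior * sensitivity
    fp = (1 - prior) * false_positive_rate
    return tp / (tp + fp)


# 0.005 is the community rate from the problem; the rest are hypothetical
# priors for people who had a reason to get tested.
priors = [0.005, 0.05, 0.2, 0.5]
results = {prior: ppv(prior) for prior in priors}

for prior, value in results.items():
    print(f"prior = {prior:<5}  P(disease | positive) = {value:.3f}")
```

With the same test, the answer climbs from about a third at the community rate to 0.99 when the prior is 50/50.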

• But isn't 33% kinda low? Is the test really that bad?

• jmacdonagh wrote:
I'm doing a statistics problem for an entry-level stats class (actually it's a 300-level course, but it's dead easy). Anyway, I came across this problem and the answer doesn't seem right:

The proportion of people in a given community who have a certain disease is 0.005. A test is available to diagnose the disease. If a person has the disease, the probability that the test will produce a positive result is 0.99. If the person does not have the disease, the probability that the test will produce a positive result is 0.01. Suppose a person tests positive. What is the probability that the person actually has the disease?

So, I define event A as being "A person has the disease", and event B as being "The test comes up positive". Given the above information, we can infer:

P(A) = 0.005
P(!A) = 0.995
P(B|A) = 0.99
P(B|!A) = 0.01

The question is asking, what is P(A|B)?

This is a simple Bayes' theorem problem:

P(A|B) = (P(A) * P(B|A)) / (P(A) * P(B|A) + P(!A) * P(B|!A))

And the answer comes out to about 33%. So, if a test is done and it comes up positive, it's only 33% accurate? Does that make sense?

I'm going into the assistance center tomorrow to check. I think I have my conditional probabilities mixed up.

Well, it means that there is a 33% chance that you have the disease given that the test came out positive... sounds reasonable. I'd have to check your math and your equation to be certain.

Yes, your math's right.

• jmacdonagh wrote:
But isn't 33% kinda low? Is the test really that bad?

It's not that the test is bad.  It's that the disease is so rare.
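One way to see this is to count people instead of multiplying probabilities. The sketch below assumes a community of 100,000 (a hypothetical size, picked only so the counts come out as whole numbers) and applies the rates from the problem.

```python
population = 100_000  # hypothetical community size, not from the problem

sick = round(population * 0.005)         # people who have the disease
healthy = population - sick              # people who do not

true_positives = round(sick * 0.99)      # sick people who test positive
false_positives = round(healthy * 0.01)  # healthy people who test positive

all_positives = true_positives + false_positives
share_sick = true_positives / all_positives

print(f"{sick} sick, {healthy} healthy")
print(f"{true_positives} true positives, {false_positives} false positives")
print(f"{share_sick:.1%} of positive results belong to sick people")
```

The healthy group is so much larger that even a 1% error rate on it produces about twice as many positives as the entire sick group does.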

• The quantity being calculated here is known in medical circles as the positive predictive value of a diagnostic test or procedure. Yes, it makes sense. Given a rare enough illness, this test comes close to being a good screening test for evaluating a high-risk group.

• Comment removed at user's request.

• Programous wrote:

The way I would interpret the problem the first piece of information is irrelevant to the problem. Then Boolean logic says there are only two possible outcomes of the test, so any outcome must be either true (positive) or false (negative). The information stipulates that of the set of positive values there is a .01 chance of a false positive. Thus, due to the law of no middle ground, there must be a .01 chance of a false negative. This would indicate that of a given test there is a .99 chance that the outcome is correct, for both the set of positive and negative results. I would then say the test is .99 accurate.

The general consensus here would state that I’m wrong, if so, could someone explain the flaw in my logic?

Your logic is fine for someone who hasn't taken a statistics class. However, this is one of the less intuitive areas of statistics. What he's calculating is the probability that you have the disease if the test comes up positive, and that's the measure being used for accuracy here.

• This is why I hate stats.

Programous wrote:

The way I would interpret the problem the first piece of information is irrelevant to the problem.

Wrong.  It is very important since you are attempting to determine if a positive result means that you actually have the disease.

Programous wrote:

Then Boolean logic says there are only two possible outcomes of the test, so any outcome must be either true (positive) or false (negative). The information stipulates that of the set of positive values there is a .01 chance of a false positive.

True.

Programous wrote:

Thus, due to the law of no middle ground, there must be a .01 chance of a false negative.

False. You can't say that at all.

Programous wrote:

This would indicate that of a given test there is a .99 chance that the outcome is correct, for both the set of positive and negative results.

Nope.

Programous wrote:

I would then say the test is .99 accurate.

The general consensus here would state that I’m wrong, if so, could someone explain the flaw in my logic?

The problem is that the chance of you actually having the disease affects the chance of the test being accurate.

I can't summarize an entire, hellish semester of Stats in one forum post, but 33% is accurate.
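For what it's worth, the 0.99 and the 33% can both be correct because they answer different questions. Here is a small sketch, using only the numbers from the problem, that splits the community into the four possible outcomes and computes both figures. The variable names are illustrative, and "accuracy" here is assumed to mean the overall fraction of test results that are correct.

```python
p_disease = 0.005
sensitivity = 0.99          # P(positive | disease)
false_positive_rate = 0.01  # P(positive | no disease)

# Joint probabilities of the four possible outcomes (they sum to 1).
tp = p_disease * sensitivity                      # sick, tests positive
fn = p_disease * (1 - sensitivity)                # sick, tests negative
fp = (1 - p_disease) * false_positive_rate        # healthy, tests positive
tn = (1 - p_disease) * (1 - false_positive_rate)  # healthy, tests negative

accuracy = tp + tn     # how often the test result is right overall
ppv = tp / (tp + fp)   # P(disease | positive), what the question asks
npv = tn / (tn + fn)   # P(no disease | negative)

print(f"overall accuracy:       {accuracy:.4f}")
print(f"P(disease | positive):  {ppv:.4f}")
print(f"P(healthy | negative):  {npv:.5f}")
```

The test is right about 99% of the time overall, mostly because it correctly clears the large healthy majority. Looking only at the positive results is what drops the figure to about a third.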
