Select your font size 
 
about us products & services consulting & support news & events contact us
Paul Meagher explains the meaning of a positive cancer test result, and in so doing he shows how to calculate conditional probability.

Learning from experience - SK

print this article 
 

To appreciate how the getConditionalProbabiltity function might be used in practice, consider a doctor confronted with the problem of determining whether a patient has cancer given that the patient tested positive on some cancer test. The test could be something as simple as a "yes" or "no" answer to a question (such as, were you ever exposed to high levels of radiation?) or it could be the result of a physical examination of the patient.

To compute the conditional probability of cancer given a positive test result, the doctor might tally the number of past cases where cancer and a positive test result occurred together and divide by the overall number of positive test results. The following code computes this probability based on a total of four past cases where this co-variation information was collected -- perhaps from the doctor's personal experiences with this particular cancer test.

Listing 2. Computing a conditional probability using getConditionalProbabiltity

<?php 
require "getConditionalProbability.php"; 

/** 
* The elements of the $Data array use this coding convention: 
* +cancer - patient has cancer 
* -cancer - patient does not have cancer 
* +test - patient tested positive on cancer test 
* -test - patient tested negative on cancer test 
**/ 

$Data[0] = array("+cancer", "+test"); 
$Data[1] = array("-cancer", "-test"); 
$Data[2] = array("+cancer", "+test"); 
$Data[3] = array("-cancer", "+test");

// specify query variable $A and conditioning variable $B 
$A = "+cancer"; $B = "+test"; 

// compute the conditional probability of having cancer given 1) 
// a positive test and 2) a sample of covariation data 
$probability = getConditionalProbabilty($A, $B, $Data); 
echo "P($A|$B) = $probability"; 
// P(+cancer|+test) = 0.66666666666667 

?>

As you can see, the probability of having cancer given:

  1. A positive test result
  2. The data collected to date

is estimated at 67 percent. In other words, in the next 100 cases where a patient tests positive, the best point estimate is that in 67 of those cases, the patient will actually have cancer. The doctor will need to weight this probability along with other information to arrive at a final diagnosis if one is warranted.

I can summarize what has been demonstrated here in more radical terms as follows:

An agent that derives a conditional probability estimate using the enumeration method appears to learn from experience and will provide an optimal estimate of the true conditional probability if it has enough representative data to draw upon.

If I replace the hypothetical doctor with a software agent implementing the enumeration algorithm above and being fed a steady diet of the case data, I might expect the agent's conditional probability estimates to become increasingly more reliable and accurate. I might say that such an agent is capable of "learning from experience."

If this is so, perhaps I want to ask what the relationship is between this simple enumeration technique for computing a conditional probability and more legitimate examples of "learning from experience," such as the semi-automated classification of spam using Bayes methods. In the next section, I will show a simple spam filter can be constructed using the enumerative power of a database.



Page:   1  2  3  4  5  6  7  8  9  10  11 Next Page: Conditional probability and SQL

The content shown in this page was first published by IBM developerWorks and is reprinted with permission from Paul Meagher (www.datavore.com)


Most Recent Website and Regional Updates

 Timing Upgrades - Factors Affecting Time Between Purchases for Tech Toys
It is possible to understand client purchase decisions by performing a regression analysis. By forming strategies based on the results, companies can optimize strategic programs to maximize profits.

 
 Personal Shopping Assistants - Turning the Table Against Merchant Databases
Consumers can use technology to watch the merchants who already have been watching them. But to do this, they need a champion.

 
 Operations Research
Links to pages related to Operations Research, which is the methodical study of how to do things better.

 

Google
 
Web transparen.com

Contact Information

Related Information

 
  Saskatoon
Regina
Prince Albert
Moose Jaw
Yorkton
Swift Current
North Battleford
Estevan
Weyburn
Corman Park
 
 
E C M | © 2003-2007 Transparen Corp.      

Standardized Services: Data Recovery Service / Creative Services / Premium Web Hosting Services / System Administration Tech Support Services
Recent Projects: Full-Service Mortgage and Financing Company / System to manage flights from Vancouver to Tofino / Photo exchange verification service
Our Vancouver BC Server Proudly Hosts: automated parking and revenue control systems, leafside lane at southlands, cost effective alternative power sources, the photo genie, pacific forage bag supply, sunburst medical, neosonic design, roger mahler photography - passionate, intriguing, desirable, the connection between east and west, affordable flights to victoria and tofino, low interest mortgage brokers in vancouver, richmond, surrey, toronto, mortgage brokers in calgary
Saskatoon, Regina, Prince Albert, Moose Jaw, Yorkton, Swift Current, North Battleford, Estevan, Weyburn, Corman Park