Elvira is a tool for building and evaluating graphical probabilistic models, more specifically Bayesian networks and influence diagrams. Elvira has its own format for the coding of models, a parser, a graphical interface for the construction of networks, with options for canonical models (OR-, AND-, and MAX-gates), exact and approximate algorithms for both discrete and continuous variables, explanation methods, decision-making algorithms, learning from databases, fusion of networks, etc.
Elvira is written and compiled in Java, which allows it to run on different architectures and operating systems (MS-DOS/Windows, Linux, Solaris, etc.).
Warning. Version 0.11 of this manual accompanies the January 2003 release of Elvira. You may obtain a more recent version of Elvira through the installation page.
Example 1. For a disease whose prevalence is 8% there exists a test having a 75% sensitivity and a 96% specificity. What are the predictive values of the test with respect to the disease?
In this problem there are two variables involved: Disease and Test. The disease can take two values, present and absent, and the test can give two results: positive and negative. Since the first variable causally influences the second, we will draw a link Disease->Test.
Then we click on the point where we wish to place the node Disease. By double-clicking on this node, we open the Node properties window. In the Title field of this window we write Disease. By clicking on the Values tab in the same window we can check that the values Elvira has assigned by default are "present" and "absent", which are just what we wanted. Then we click on the Accept button.
In the same way, we create the second node, double-click on it to open the Node properties window and write Test in the Title field. In the Values tab we open the pull-down menu Type of values and select Positive-negative. Then we click on the Accept button to close the window.
We complete the graph by adding a link from the first node to the second. First we have to switch from the Add random node tool to the Add directed link tool, which is also on the second toolbar. Then we drag the mouse from the origin node, Disease, to the destination node, Test. The result must be similar to the above figure. If you wish to move a node, change to the Selection tool by clicking on the corresponding icon on the second toolbar and drag the node with the mouse.
The CPT for the variable Disease is given by its prevalence, which, according to the statement of the example, is P(Disease=present) = 0.08. We introduce this parameter by double-clicking on the Disease node, selecting the Relation tab, writing 0.08 in the corresponding cell and pressing Enter. Please note that the bottom cell changes to 0.92, because the sum of the probabilities must be one.
In this figure we can see both probabilities because the All parameters radio button is selected. If we select the Independent parameters radio button we only see the value 0.08; the value 0.92 does not appear because it is not necessary to display it (it follows from the first value).
The CPT for the variable Test is built by remembering that the sensitivity (75%) is the probability of a positive test result when the Disease is present, and the specificity (96%) is the probability of a negative test result when the Disease is absent. The CPT introduced into Elvira will look like this:
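As a cross-check outside Elvira, the same table can be derived from the two parameters of the example. The following is only a minimal sketch: the function name `build_test_cpt` and the nested-dictionary layout are illustrative assumptions, not Elvira's file format or API.

```python
# Parameters taken from the statement of Example 1.
SENSITIVITY = 0.75  # P(Test=positive | Disease=present)
SPECIFICITY = 0.96  # P(Test=negative | Disease=absent)


def build_test_cpt(sensitivity, specificity):
    """Return P(Test | Disease) as cpt[disease_value][test_value].

    Hypothetical helper for illustration; not part of Elvira.
    """
    return {
        "present": {"positive": sensitivity, "negative": 1.0 - sensitivity},
        "absent": {"positive": 1.0 - specificity, "negative": specificity},
    }


test_cpt = build_test_cpt(SENSITIVITY, SPECIFICITY)

for disease_value, distribution in test_cpt.items():
    # For each value of the parent, the probabilities of Test must sum to one.
    assert abs(sum(distribution.values()) - 1.0) < 1e-12
    print(disease_value, {k: round(v, 2) for k, v in distribution.items()})
```

Only two of the four entries are independent parameters (the sensitivity and the specificity); the other two follow from normalization, as with the 0.08 and 0.92 of the Disease node.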
In principle, the graph of the network does not offer information about the conditional probabilities. However, when clicking on the Show link influences button, Elvira offers qualitative information about the numerical parameters by drawing links with different colors and widths. A red link represents a positive influence, which, roughly speaking, means that an increase in the value of the parent node implies an increase in the value of the child node. A blue link represents a negative influence. A purple link represents an ambiguous influence, i.e., an influence that is positive for some values and negative for others. The stronger the influence, the thicker the link. A black link means that the link does not transmit any influence. Before the CPTs are introduced all links are black, because Elvira initially assigns a uniform distribution to each configuration of the parent nodes.
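For a link between two binary nodes, the idea of a positive or negative influence can be illustrated with a very small sketch. This is a simplification written for this example, not the measure Elvira actually uses (the manual does not state it here): the function `link_influence`, its tolerance and the use of a probability difference as "strength" are all assumptions. It also assumes that "present" and "positive" are the higher values of their nodes.

```python
def link_influence(p_high_given_parent_high, p_high_given_parent_low, tol=1e-9):
    """Classify the influence of a binary parent on a binary child.

    Hypothetical helper: compares the probability of the child's higher
    value under the two parent values and returns (sign, strength).
    """
    diff = p_high_given_parent_high - p_high_given_parent_low
    if abs(diff) <= tol:
        return "none", 0.0
    sign = "positive" if diff > 0 else "negative"
    return sign, abs(diff)


# Link Disease->Test with the CPT of Example 1:
# P(positive | present) = 0.75 and P(positive | absent) = 0.04.
sign, strength = link_influence(0.75, 0.04)
print(sign, round(strength, 2))

# Before any CPT is introduced, the distributions are uniform,
# so the link transmits no influence.
print(link_influence(0.5, 0.5))
```

With only two values per node an ambiguous influence cannot arise in this sketch; it requires nodes with more than two values, where the direction of the change can differ from one value to another.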
We can see that the second toolbar has changed: the edit buttons, such as cut, copy and paste, have disappeared, and there are new buttons and fields: expansion threshold, purpose, save case in file, store case in memory, expand/contract node, inference options, etc.
A second change, which goes unnoticed by the user, is that Elvira creates an internal data structure, which will be used to compute the probability of each variable. Because of this process, known as compilation of the network, it may take a few seconds to change from edit mode to inference mode in the case of large networks.
We can also observe that some nodes have been expanded. In fact, in the example we are showing, Elvira shows the probability of each value in two ways: by means of a horizontal bar whose length is proportional to the probability, and by means of a number. Since we have not introduced any finding yet, these are the prior probabilities. For instance, the probability for Disease=present is 0.08, one of the parameters we introduced after the statement of the problem, while the prior probability for the values of Test has been computed by Elvira. In fact, we can check that
P(Test=positive) = P(Test=positive|Disease=present) x P(Disease=present) + P(Test=positive|Disease=absent) x P(Disease=absent) = 0.75 x 0.08 + 0.04 x 0.92 = 0.0968
In principle, Elvira displays the numerical value with two decimal digits. The number of decimals can be increased by selecting the View option in the bar menu and selecting the desired Precision.
PPV = P(Disease=present|Test=positive) = 0.62 = 62%
As expected, this posterior probability is much higher than the prior probability, because a positive result in the test tends to confirm the presence of the Disease.
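The value 0.62 can be verified by hand with Bayes' theorem. The sketch below is not Elvira's inference algorithm; it is a direct computation for this two-node network, and the function name `posterior_present` is an assumption made for the example.

```python
def posterior_present(prevalence, sensitivity, specificity, test_result):
    """P(Disease=present | Test=test_result) by Bayes' theorem."""
    if test_result == "positive":
        likelihood_present = sensitivity
        likelihood_absent = 1.0 - specificity
    elif test_result == "negative":
        likelihood_present = 1.0 - sensitivity
        likelihood_absent = specificity
    else:
        raise ValueError(f"unknown test result: {test_result!r}")

    joint_present = likelihood_present * prevalence
    joint_absent = likelihood_absent * (1.0 - prevalence)
    # The denominator is the prior probability of the observed test result.
    return joint_present / (joint_present + joint_absent)


ppv = posterior_present(0.08, 0.75, 0.96, "positive")
print(f"PPV = {ppv:.2f}")  # prints "PPV = 0.62"
```

The denominator, 0.75 x 0.08 + 0.04 x 0.92 = 0.0968, is the prior probability of a positive test result, which is why a relatively rare disease yields a predictive value well below the sensitivity of the test.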
Analogously, the negative predictive value (NPV) is the certainty with which we can rule out the Disease when the Test gives a negative result: NPV = P(Disease=absent|Test=negative). The way to compute it with Elvira is to introduce the finding "Test=negative" and to observe the posterior probability of the value "Disease=absent".
If we are also interested in seeing how the probability of the Disease varies as a function of the result of the Test, we have to store the current evidence case. An evidence case consists of a set of findings; in this example, the only finding we have is the result of the Test. The case can be stored by means of the Store case in memory button in the second toolbar. We see that the text field in this toolbar changes from Case number 1 on a red background to Case number 2 on a blue background.
Now that the second evidence case is the current one, we double-click on the value "negative" of node Test. The probability of "Test=negative" becomes 1 and the probability of "Test=positive" becomes 0. We also observe that P(Disease=present|Test=negative) = 0.02 and P(Disease=absent|Test=negative) = 0.98. Therefore,
NPV = P(Disease=absent|Test=negative) = 0.98 = 98%
which solves our problem.
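The comparison of several evidence cases can also be reproduced with a brute-force computation over the joint distribution. This is a sketch under stated assumptions: the dictionaries `PRIOR` and `TEST_CPT`, the function `posterior` and the list `evidence_cases` are illustrative names, and enumeration is used only because the network is tiny; it is not a description of the algorithms implemented in Elvira.

```python
from itertools import product

# Parameters of Example 1.
PRIOR = {"present": 0.08, "absent": 0.92}
TEST_CPT = {
    "present": {"positive": 0.75, "negative": 0.25},
    "absent": {"positive": 0.04, "negative": 0.96},
}


def posterior(variable, evidence):
    """P(variable | evidence) by enumerating the joint distribution.

    `variable` is "Disease" or "Test"; `evidence` maps variable names
    to observed values (an evidence case, i.e. a set of findings).
    """
    scores = {}
    for disease, test in product(PRIOR, ("positive", "negative")):
        world = {"Disease": disease, "Test": test}
        # Skip configurations that contradict a finding.
        if any(world[name] != value for name, value in evidence.items()):
            continue
        weight = PRIOR[disease] * TEST_CPT[disease][test]
        scores[world[variable]] = scores.get(world[variable], 0.0) + weight
    total = sum(scores.values())
    return {value: weight / total for value, weight in scores.items()}


# One case without findings (prior) and one case per test result.
evidence_cases = [{}, {"Test": "positive"}, {"Test": "negative"}]
for case in evidence_cases:
    result = posterior("Disease", case)
    print(case, {value: f"{p:.2f}" for value, p in result.items()})
```

Each line of output corresponds to one evidence case, so the three distributions of Disease can be read side by side, as Elvira does with its colored bars.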
In the inference bar there is a button for deleting all the findings of the current evidence case, and another button for deleting all the evidence cases (except the initial case, corresponding to the prior probability).
Please note that in this figure we have selected Independent parameters instead of All parameters. We have also selected Canonical parameters. If we select CPT, Elvira will show the conditional probability table, computed from the canonical parameters.
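To illustrate how a conditional probability table can be computed from canonical parameters, the sketch below expands a noisy OR-gate, one of the canonical models mentioned in the introduction. Everything specific in it is an assumption made for the example: the parent names CauseA and CauseB, the link probabilities 0.8 and 0.6, the leak probability 0.01 and the function `noisy_or_cpt` itself. It uses the standard noisy-OR formula and does not claim to reproduce Elvira's own code or the model shown in the figure.

```python
from itertools import product


def noisy_or_cpt(link_probs, leak=0.0):
    """Expand noisy-OR canonical parameters into a full CPT.

    link_probs maps each parent name to the probability that the parent
    alone produces the effect; leak is the probability of the effect
    when no explicit cause is present. Returns (parents, cpt), where cpt
    maps a tuple of parent states (True = present) to P(effect present).
    """
    parents = list(link_probs)
    cpt = {}
    for states in product((True, False), repeat=len(parents)):
        # The effect is absent only if the leak and every present
        # cause all fail to produce it, independently of each other.
        p_absent = 1.0 - leak
        for parent, is_present in zip(parents, states):
            if is_present:
                p_absent *= 1.0 - link_probs[parent]
        cpt[states] = 1.0 - p_absent
    return parents, cpt


# Hypothetical canonical parameters (not taken from the manual).
parents, cpt = noisy_or_cpt({"CauseA": 0.8, "CauseB": 0.6}, leak=0.01)
for states, p_effect in cpt.items():
    print(dict(zip(parents, states)), round(p_effect, 4))
```

In this sketch three canonical parameters determine the four columns of the table; the saving grows quickly with the number of parents, which is the main reason for using canonical models.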
After editing the graph, we can introduce the conditional probability tables for nodes Disease and Test, as explained in Section 2.2.3. The node Therapy does not have a probability table because it does not represent a random variable, but a decision, which will be made depending on the available information, as we will see in the next section. Now we only have to introduce the utility table for the node Utility.
Therefore, we double-click on the node Utility and type in the values shown in the following table:
We can save this network on disk.
From this table we can deduce that when the test gives a positive result the best decision is to apply the Therapy, because it entails a higher expected utility: 83.80 vs. 56.61. Conversely, when the test gives a negative result, it is better not to apply the Therapy, because the expected utility of this decision is 98.45, vs. 89.78 if the Therapy is applied.
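The reasoning above amounts to choosing, for each test result, the option with the maximum expected utility. A minimal sketch follows; it takes the four expected utilities quoted in the text as given (it does not recompute them from the utility table), and the dictionary layout and the labels "apply therapy" and "no therapy" are assumptions made for the example.

```python
# Expected utilities quoted in the text, indexed by test result
# and then by the decision on the node Therapy.
expected_utility = {
    "positive": {"apply therapy": 83.80, "no therapy": 56.61},
    "negative": {"apply therapy": 89.78, "no therapy": 98.45},
}

# The optimal policy picks, for each test result, the decision
# with the highest expected utility.
policy = {
    test_result: max(options, key=options.get)
    for test_result, options in expected_utility.items()
}

for test_result, decision in policy.items():
    print(f"Test={test_result}: {decision}")
```

The result is a policy, i.e. a decision for every possible value of the information available when the decision is made, rather than a single choice.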