1 Introduction
Probabilistic model checking of temporal logic formulae is a central problem in formal modelling, both from a theoretical and an applicative perspective [Hansson1994, Aziz1996, Aziz2000, Baier2000, Baier2002, Baier2003]. Classical algorithms based on matrix exponentiation and uniformisation are well understood, and form the core routines of mature software tools such as PRISM [kwiatkowska_prism_2011], MRMC [katoen_markov_2005] and UPPAAL [Behrmann2006]. Nevertheless, the need to explicitly represent the state space makes their application to large systems problematic, or, indeed, theoretically impossible in the case of systems with unbounded state spaces, which appear frequently in biological applications.
Statistical model checking (SMC) approaches [Younes2006, Zuliani2010]
have emerged in recent years as a powerful alternative to exact techniques. Such methods provide a Monte Carlo estimate of the desired probability by repeatedly sampling trajectories from the model. SMC can also provide probabilistic guarantees on the estimated probabilities, and, by choosing the number of simulations to be suitably large, one can reduce the uncertainty over the estimates arbitrarily.
While SMC offers a practical and flexible solution in many scenarios, its reliance on repeated simulation of the system makes it naturally computationally intensive. While SMC can be trivially parallelized, the approach can still be computationally onerous for systems which are intrinsically expensive to simulate, such as systems with large agent counts or exhibiting stiff dynamics.
In this paper, we propose an alternative approach to solving the probabilistic model checking problem which draws on a recently proposed technique from statistical physics [Schnoerr2017_arxiv]. We show that the model checking problem is equivalent to a sequential Bayesian computation of the marginal likelihood of an auxiliary observation process. This marginal likelihood yields the desired timebounded reachability probability, which is closely related to the eventually and globally temporal operators. We also expand the methodology to the case of the timebounded until operator, thus covering a wide range of properties for temporal logics such as CSL [Aziz1996, Aziz2000, Baier2000, Baier2002, Baier2003]. The formulation of the model checking problem as a Bayesian inference method allows us to utilise efficient and accurate approximation methodologies from the machine learning community. In particular, we combine Assumed Density Filtering (ADF) [Maybeck1982, Minka2001]
with a momentclosure approximation scheme, which enables us to approximate the entire cumulative distribution function (CDF) of the
firsttime that a timebounded until property is satisfied by solving a small set of closed ordinary differential equations and lowdimensional integrals.
The rest of the paper is organised as follows. We discuss the related work in Section 2 and we provide some background material on Markov chains and model checking in Section 3. We then describe our new approach, highlighting both the links and differences to the recently proposed statistical physics method of [Schnoerr2017_arxiv] in Section 4. To illustrate the performance of the method, we consider four nonlinear example systems of varying size and stiffness in Section 5, showing that the method is highly accurate and often considerably faster than SMC.
2 Related Work
In recent years, the computational challenges of probabilistic model checking have motivated the development of approaches that rely on stochastic approximations as an alternative to both classical methods and SMC. In one of the earliest attempts, passagetime distributions were approximated by means of fluid analysis [Hayden2012]. This framework was later extended to more general properties expressed as stochastic probes [Hayden2013]. Fluid approximation has also been used to verify CSL properties for individual agents for large population models [Bortolussi2012, Bortolussi2015]. In [Bortolussi2013], a Linear Noise Approximation (LNA) was employed to verify not only local properties of individuals, but also global ones, which are given as the fraction of agents that satisfy a certain local specification. The verification of such local and global properties has been recently generalised for a wider class of stochastic approximations, including moment closure [Bortolussi2017].
Regarding our work, one key difference with respect to these earlier approaches is that we consider global timebounded until properties that characterise the behaviour of the system at the population level. In that sense, our approach is mostly related to [Bortolussi2014, Bortolussi2016], which rely on the LNA to approximate the probability of global reachability properties. In particular, the LNA is used to obtain a Gaussian approximation for the distribution of the hitting time to the absorbing set [Bortolussi2014]. The methodology is different in [Bortolussi2016], where it is shown that the LNA can be abstracted as a timeinhomogeneous discretetime Markov chain which can be used to estimate timebounded reachability properties. However, this method approximates the unconstrained process, and needs to subsequently resort to space and time discretisation to approximate the desired probability.
3 Background
A ContinuousTime Markov Chain (CTMC) is a Markovian (i.e. memoryless) stochastic process that takes values on a countable state space and evolves in continuous time [Durrett2012]. More formally:
Definition 1
A stochastic process is a ContinuousTime Markov Chain if it satisfies the Markov property, i.e. for any :
(1) 
A CTMC is fully characterised by its generator matrix , whose entries denote the transition rate from state to state , for any [Norris1997]. The dynamics of a CTMC are fully described by the master equation, which is a system of coupled ordinary differential equations that describe how the probability mass changes over time for each of the states of the system. For a CTMC with generator matrix , the master equation will be:
(2) 
where is the transition probability matrix at time ; the quantity denotes the probability to transition from state at time to state at time . The master equation is solved subject to initial conditions .
Throughout this work, we shall consider CTMCs that admit a population
structure, so that we can represent the state of a CTMC as a vector of nonnegative integervalued variables
, that represent population counts for different interacting entities.3.1 Moment Closure Approximation
For most systems, no analytic solutions to the master equation in (2) are known. If the state space is finite, (2) constitutes a finite system of ordinary differential equations and can be solved by matrix exponentiation. For many systems of practical interest however, is either infinite, or so large that the computational costs of matrix exponentiation become prohibitive.
Moment closure methods constitute an efficient class of approximation methods for certain types of master equations, namely if the elements of the generator matrix are polynomials in the state . This is for example the case for population CTMC of mass action type which are frequently used to model chemical reaction networks [Gardiner2009]. In this case, one can derive ordinary differential equations for the moments of the distribution of the process. Unless the are all polynomials in of order one or smaller, the equation for a moment of a certain order will depend on higher order moments, which means one has to deal with an infinite system of coupled equations. Moment closure methods close this infinite hierarchy of equations by truncating to a certain order. A popular class of moment closure methods does so by assuming to have a certain parametric form [schnoerr2017approximation]. This then allows to express all moments above a certain order in terms of lower order moments and thus to close the equations for these lower order moments.
In this paper, we utilise the socalled normal moment closure
which approximates the solution of the master equation by a multivariate normal distribution by setting all cumulants of order greater than two to zero
[Goodman1953, schnoerr2014validity, schnoerr2015comparison]. This class of approximations was recently used within a formal modelling context in [Feng2016].3.2 Probabilistic Model Checking
The problem of probabilistic model checking of CTMCs is defined in the literature as the verification of a CTMC against Continuous Stochastic Logic (CSL) [Aziz1996, Aziz2000, Baier2000, Baier2002, Baier2003]. A CSL expression is evaluated over the states of a CTMC. In the original specification [Aziz1996], the syntax of a CSL formula is described by the grammar:
where is a stateformula, and is a pathformula, i.e. it is evaluated over a random trajectory of the Markov chain. An atomic proposition identifies a subset of the state space; in this paper, we consider atomic propositions to be linear inequalities on population variables. The probabilistic operator allows reasoning about the probabilities of a pathformula :
asserts whether the probability that is satisfied meets a certain bound expressed as , where and . In order to evaluate the probabilistic operator, we need to calculate the satisfaction probability for a pathformula , which involves one of three temporal operators: next , unbounded until , and timebounded until .
For a finite CTMC, it is wellknown that evaluating the probability of is reduced to matrix/vector multiplication, while evaluating the unbounded until requires solving a system of linear equations [Baier2000]. The timebounded until operator can also be evaluated numerically via an iterative method that relies on uniformisation [Baier2000]. This process may have a prohibitive computational cost if the size of the state space is too large. For systems with unbounded state space, the only option to estimate the timebounded until probabilities is by the means of stochastic simulation [Younes2006, Zuliani2010], which also has a high computational cost.
Other temporal operators can be expressed as special cases of the until operator. For the timebounded eventually operator we have: , while for the globally operator we have: . The latter two operators formally describe the problem of timebounded reachability.
4 Methodology
Assuming a property of the form , our goal is to approximate the cumulative probability of being satisfied for the first time at time , that is, the cumulative distribution function for the firstpassage time of .
4.1 Timebounded Reachability as Bayesian Inference
Before discussing the until operator, we shall consider the problem of reachability, which is closely related to the eventually temporal operator . The globally operator can also be formulated as the negation of a reachability problem, as shown in Section 3.2. If denotes the set of states that satisfy the formula , then we are interested in the probability that is reached for the first time; this quantity is also known in the literature as firstpassage time.
Building upon [Cseke2013, Cseke2016] Schnoerr et al [Schnoerr2017_arxiv] have recently formulated timebounded reachability as a Bayesian inference problem. Using this formulation, they proposed a method where the entire distribution of firstpassage times can be approximated by taking advantage of some wellestablished methodologies in the Bayesian inference and statistical physics literature. In the current section, we revise the approach of Schnoerr et al [Schnoerr2017_arxiv] for reachability, while in Section 4.2 we expand the method to the more general case of timebounded until.
In the Markov chain literature [Norris1997], the states in the set are often called the absorbing states. Let denote the set of nonabsorbing states. The cumulative probability for the system to reach an absorbing state at or before time is clearly equal to 1 minus the probability of the system having remained in until . Schnoerr et al’s insight was to formulate this probability in terms of a Bayesian computation problem. Consider an auxiliary binary observation process which evaluates to 1 whenever the system is in the nonabsorbing set . The pair constitutes a hidden Markov model (HMM) in continuous time; the required cumulative probability would then correspond to the marginal likelihood of observing a string of all 1s as output of the HMM. Computing this marginal likelihood is a central and well studied problem in machine learning and statistics.
Even in this novel formulation, the problem is generally still intractable. To make progress, we first discretise the time interval into time points with spacing . For the process at time being in we thus have the observation model if and zero otherwise. Note that is the distribution of the observation process , i.e. . The marginal likelihood of having remained in for all factorises as
(3) 
where we introduced the notation . The factors of the rhs in (3) can be computed iteratively as follows. Let be the initial condition of the process. Suppose that the system did not transition into the absorbing set until time (that is, the process remained in ), and that the state distribution conditioned on this observations is . We can solve the system forward in time up to time to obtain the predictive distribution , which will serve as a prior, and combine it with the likelihood term that the process has remained in at time .
We can then define a posterior over the state space by simply applying the Bayes rule as follows:
(4) 
The likelihood term represents the probability that the process does not leave at time . The prior denotes the state space probability considering that the process had remained in for time . The posterior then will be the state space distribution after observing that the Markov process has remained in at the current step.
Note that the evidence in (4) is just a factor in the rhs of (3). It can be easily obtained by marginalising the joint probability over :
(5) 
The process described above is a Bayesian formulation for the introduction of absorbing states. By multiplying by the likelihood, we essentially remove the probability mass of transitioning to a state in ; the remaining probability mass (the evidence) is simply the probability of remaining in . Therefore, the probability of transitioning to for the first time at time is the complement of the evidence:
(6) 
Thus, Equation (6) calculates the firstpassage time probability for any . Note that this approach neglects the possibility of the process leaving from and returning to region within on time step. The time spacing thus needs to be chosen small enough for this to be a good approximation.
Schnoerr et al [Schnoerr2017_arxiv] further approximated the binary observation likelihood
by a soft, continuous loss function. This allowed them to take the continuum limit of vanishing time steps which in turn allows to approximate the evidence
by solving a set of ODEs. In this work, we keep the binary, discontinuous observation process and keep time discrete, which allows us to extend the framework from [Schnoerr2017_arxiv] to the timebounded until operator.4.2 The Timebounded Until Operator
Consider the timebounded property which will be satisfied if a state in is reached up to time and the stochastic process has remained in until then. Assuming that is satisfied up to , there are three distinct possibilities regarding the satisfaction of the until property:

it evaluated as false if we have and simultaneously,

the property is evaluated as true if ,

otherwise the satisfaction of the property is undetermined up to time .
These possibilities correspond to three nonoverlapping sets of states: , and accordingly, as seen in Figure 1.
In order to calculate the firstpassage time probabilities for any time , we assume that the property has not been determined before . That means that the Markov process has remained in the set , which is marked as the grey area in Figure 1. The Bayesian formulation of reachability discussed in Section 4.1 can be naturally applied to the problem of reaching the union . The prior term denotes the state distribution given that the property remained undetermined before . The likelihood term indicates whether the Markov process has transitioned within the nonabsorbing set at . Finally, the posterior given by (4) will be the state space distribution after observing that the property has remained undetermined at the last step.
In contrast with the reachability problem however, once the absorbing set is reached, we only know that the formula has been determined, but we do not know whether it has been evaluated as true or false. More specifically, the evidence as given by Equation (5) represents the probability that the satisfaction has remained undetermined at time . Although the negation of the evidence was sufficient to resolve the reachability probability as in Equation (6), now we are interested only in a subset of the absorbing states. At a particular time we have to calculate the probability of reaching explicitly, which is given by the overlap mass of the prior process and probability of transitioning into :
(7) 
Considering the fact that the likelihood is actually a truncation of the state space, as it will be if and otherwise, the firstpassage probability at time is given as follows:
(8) 
Considering a Gaussian approximation for , as we discuss in the next section, and given that the state formula is a conjunction of linear inequalities, Equation (8) can be easily calculated by numerical routines.
The Bayesian formulation that we introduce has essentially the same effect as the traditional probabilistic model checking methods [Baier2002]. The probability of the until operator is usually evaluated by first introducing the set of absorbing states , and then calculating the probability of reaching the set , which is a subset of the absorbing states. The advantage of this sequential Bayesian inference formulation is that it allows us to leverage wellestablished machine learning methodologies, as we see in the section that follows.
4.3 Gaussian Approximation via Assumed Density Filtering
The Bayesian formulation as described in the previous section does not involve any approximation. In fact for a discretestate system, both the prior and the likelihood terms (i.e. and equivalently) will be discrete distributions in (4). Therefore, quantities such as the evidence in (5) and the probability of reaching in Equation (7) can be calculated exactly, as the integrals reduce to summations. However, if the size of the state space is too large or unbounded, this process can be computationally prohibitive. The formulation presented above allows us to derive an efficient approximation method that relies on approximating the discrete process by a continuous one.
We adopt a moment closure approximation scheme where all cumulants of order three or larger are set to zero, which corresponds to approximating the singletime distribution of the process by a Gaussian distribution. As described in Section
3.1, the moment closure method results in a system of ODEs that describe the evolution of the expected values and the covariances of the population variables in a given CTMC. At any time , the state distribution is approximated by a Gaussian with mean and covariance :The evidence is the probability mass of nonabsorbing states; i.e. it is observed that the process has remained within . Since is identified by linear inequalities on the population variables, both the evidence in Equation (5) and the probability mass in the target set in (7) can be estimated by numerically solving the integral in (8). There are many software routines readily available to calculate the CDF of multivariate Gaussian distributions by numerical means.
Nevertheless, the posterior in Equation (4) is not Gaussian, thus we have to introduce a Gaussian approximation. It is proven that ADF minimises the KL divergence between the true posterior and the approximating distribution, subject to the constraint that the approximating distribution is Gaussian [Maybeck1982, Minka2001]. Considering the prior , the ADF updates [Cseke2016] will be:
(9)  
(10) 
where the evidence is equal to the mass of the truncated Gaussian that corresponds to the nonabsorbing states . The dimensionality of the Gaussians is equal to the number of distinct populations in the system; this is generally small, meaning that computations of truncated Gaussian integrals can be carried out efficiently. A detailed exposition can be found in Appendix 0.A.
4.4 Algorithm
Algorithm 1 is an instantiation of model checking via sequential Bayesian inference (MCSBI). The algorithm evaluates the probability that a property is satisfied for a sequence of time points , thus approximating the CDF of the first time that is satisfied.
In the beginning of each iteration at line 5, we calculate the probability that is satisfied at . At lines 6–8, we calculate the posterior state distribution, assuming that has not been determined at the current step. Finally, the state distribution is propagated by the moment closure ODEs; the new state probabilities will serve as the prior in the next iteration.
It is useful at this stage to pause and consider the differences from the firstpassage time algorithm proposed in [Schnoerr2017_arxiv]: both papers share the same insight that reachability properties can be computed via Bayesian inference. However, the resulting algorithms are quite different. The crucial technical difficulty when considering formulae involving an until operator is the need to evaluate the probability of transitioning into the region identified by the second formula . It is unclear how to incorporate such a computation within the continuoustime differential equations approach of [Schnoerr2017_arxiv], which dictates the choice of pursuing a time discretisation approach here. The time discretisation however brings the additional benefit that we can evaluate exactly the moments of the Bayesian update in step 8 of Algorithm 1, thus removing one of the sources of error in [Schnoerr2017_arxiv] (at a modest computational cost, as the solution of ODEs is generally faster than the iterative approach proposed here).
5 Examples
In this section, we demonstrate the potential of our approach on a number of examples. More specifically, we report for each example the calculated CDF for the time that a formula is first satisfied. Additionally for each until property, we also report the CDF of the firstpassage time to the absorbing set; this corresponds to the eventually formula , following the discussion of Section 4.2.
As a baseline reference, we use the PRISM Model Checker [kwiatkowska_prism_2011], which is a wellestablished tool in the literature. For a timebounded until property , PRISM is capable of estimating its satisfaction probability by considering the following variation of the probabilistic operator . The result of denotes the probability that has been satisfied at any , thus it can be directly compared to our approach. In particular, PRISM offers numerical verification of timebounded until properties that relies on the uniformisation method [Baier2000]. We make use of numerical verification when possible, but for more complex models we resort to SMC, as an explicit representation of their state space is practically not possible.
5.1 An epidemiology model
We first consider a SIR model of spreading a contagious disease. The system state is described by a vector of three variables that represent the number of susceptible (), infected (), and recovered () individuals in a population of fixed size. The dynamics of the model are described by the following reactions:

, with rate function ;

, with rate function ;
Considering initial state , the reachable state space as reported by PRISM involves 1271 states and 2451 transitions, which is a number small enough to allow the use of numerical verification.
We consider two properties: the first property states whether the infected population remains under a certain threshold until the extinction of the epidemic:
(11) 
where . Also, we consider a property that involves more than one species:
(12) 
where
. It is well known that the random variables
will follow a joint Gaussian distribution.We have used Algorithm 1 to approximate the CDF of the time that and are first satisfied on a sequence of 200 timepoints. We have also used the hybrid engine of PRISM in order to produce accurate estimates of the satisfaction probabilities of and , for an respectively.
The calculated CDFs for are summarised in Figure (b)b, while in Figure (a)a we report the CDFs of the firstpassage time into its absorbing set. Similarly, the CDFs for are reported in (d)d, and the CDFs of the corresponding absorbing set can be found in Figure (c)c. In both cases the distribution functions calculated by our approach (MCSBI) is very close to the numerical solutions of PRISM.
5.2 LacZ  A model of prokaryotic gene expression
As a more complicated example, we consider the model of LacZ protein synthesis in E. coli that first appeared in [Kierzek2002] and has been used before as a model checking benchmark [bortolussi:smoothed16]. The full model specification can be found in the appendix.
We are interested in three variables: for the population of ribosomes, which represents the population of translated sequences, and representing the molecules of protein produced. The following property:
(13) 
monitors whether both and satisfy certain conditions until the LacZ protein produced reaches a specified threshold (i.e. ). A randomly sampled trajectory can be seen in Figure (a)a.
We have attempted to explore the reachable state space of the model using the hybrid engine of PRISM; that involved more than 26 trillions of states and 217 trillions of transitions. The state space exploration alone lasted nearly six hours and consumed more that 60 GB of memory in a computing cluster. It is fair to state that numerical methodologies can be ruled out for this example. Thus we compare our approach against SMC as implemented in PRISM. We note that 1000 samples were used by the SMC approach; the confidence interval for the results that follow
0.039, based on 99.0% confidence level.Figure 3 summarises the calculated firstpassage time CDFs evaluated on a sequence 200 timepoints. In Figure (b)b we see that the moment closure method resulted in a particularly accurate approximation of the firstpassage time distribution for the absorbing states. Regarding the distribution of , the results of MCSBI and PRISM’s SMC seem to be in agreement (Figure (c)c); however that our method overestimates the final probability of satisfying .
5.3 A stiff viral model
Stiffness is a common computational issue in many chemical reaction systems. The problem of stiffness arises when a small number reactions in the system occur much more frequently than others. This small group of fast reactions dominates the computational time, and thus renders simulation particularly expensive.
As an example of a stiff system, we consider the model of viral infection in [Haseltine2002]. The model state is described by four variables: the population of viral template , the viral genome , the viral structural protein , and that captures the number of viruses produced. For the initial state we have , and the rest of the variables are equal to zero. The reactions and the kinetic laws that determine the dynamics of the model can be found in the appendix. An interesting aspect of this model is that its state space is not bounded, therefore we resort to the statistical model checking capabilities of PRISM to evaluate our approach. The SMC used 1000 samples, resulting in confidence interval 0.038, based on 99.0% confidence level.
Figure (a)a depicts a random trajectory that shows the evolution of the viral genome and the virus population over time. We see that slowly increases until it apparently reaches a steadystate and fluctuates around the value , while continues to increase at a nonconstant rate. In this example, we shall monitor whether the viral genome remains under the value of until the virus population reaches a certain threshold:
(14) 
The results of Figure 4 show that our method did not capture the distributions functions as well as in the previous two examples. However, considering that our method is four orders of magnitude faster than statistical modelchecking (cf. Table 1), it still gives a reasonably good approximation, particularly in the case of the eventually value. Again, we have considered a sequence of length 200.
5.4 A genetic oscillator
As a final example, we consider the model of a genetic oscillator in [Azunre2011]. The original model is defined in terms of concentrations; in order to properly convert the model specification in terms of molecular populations, we consider a volume . The full model specification can be found in the appendix. We consider an initial state where , and the rest of the variables are equal to . As we can see in the random trajectory in Figure (a)a, the populations of , and approach or exceed the value of . Therefore, we have a system whose state space is simply too large to apply traditional model checking methods that rely on uniformisation.
We shall turn out interest on the variables and ; the following property monitors whether remains under until exceeds the value of :
(15) 
In this example, the CDF has been evaluated on a sequence of length 2000. As a comparison baseline for this example, we use the SMC algorithm in PRISM using 1000 samples, which resulted in confidence interval 0.030, based on 99.0% confidence level. The results in Figure 5 show an accurate approximation of the rather unusual firstpassage time distribution functions for both the absorbing states (Figure (b)b) and the property (Figure (c)c).
5.5 A note on the execution times
Table 1 summarises the execution times for our method (MCSBI) and statistical model checking (SMC). We have used numerical verification as implemented in PRISM for the SIR model only. For the other examples, the state space is simply too large to allow the use of a method that relies on explicit representation, so we report the simulation running times only. Of course, the numerical approach is much faster when this is applicable. However, the computational savings for MCSBI are obvious for the more complicated examples, in particular the viral model and the genetic oscillator.
In order to derive the moment closure approximations automatically, we have used StochDynTools [HespanhaDec06]. We note that the CDFs have been evaluated on a sequence of 200 timepoints for all models except from the genetic oscillator, where 2000 points were used instead.
Model  MCSBI  PRISM (Numerical)  PRISM (SMC) 

SIR  8 sec  1 sec  1 sec 
LacZ  38 sec  N/A  46 sec 
Viral  8 sec  N/A  24875 sec 
Genetic Oscillator  87 sec  N/A  20707 sec 
6 Conclusions
Probabilistic model checking remains one of the central problems in formal methods. As the applications of quantitative modelling extend to more complex systems, scalable techniques for accurate approximation will increasingly play a central role in the deployment of formal methods to practical systems.
Here we presented a novel approach to the classical model checking problem based on a reformulation as a sequential Bayesian inference problem. This reformulation is exact: it was originally suggested in [Schnoerr2017_arxiv] for reachability problems, and was extended in the present work to general CSL formulae including timebounded Until operators. Apart from its conceptual appeal, this reformulation is important because it enables us to obtain an approximate solution using efficient and highly accurate tools from machine learning. Our method leverages a class of analytical approximations to CTMCs known as moment closures, which enable an efficient computation of the process marginal statistics.
We have shown on a number of diverse case studies that our method achieves excellent accuracy with much reduced computational costs compared to SMC. Nevertheless, our algorithm requires some approximations to the underlying stochastic process. The first approximation is the adoption of a time discretisation; this is a controllable approximation and can be rendered arbitrarily precise by reducing the time step (at a computational cost that grows linearly with the number of steps). The second approximation consists in propagating forward the first two moments of the process via a moment closure approximation. The quality of the approximation in this case is system dependent. Several studies have examined the problem of convergence of moment closure approximations [schnoerr2014validity, schnoerr2015comparison], however, to the best of our knowledge, error bounds for such approximations are an open problem in the mathematics of stochastic processes. Despite such issues, we believe that the reformulation of model checking problems in terms of Bayesian inference has the potential to open the door to a new class of approximate algorithms to attack this classic problem in computer science.
References
Appendix 0.A Partial derivatives of Gaussian likelihood
In assumed density filtering, the mean and variance updates for a Gaussian approximate distribution are given as follows:
(16)  
(17) 
The posterior mean and variance are given as function of the prior and the derivative of the logarithm of the likelihood with respect to the prior mean . We show that the derivatives of can be expressed as a sum of partial derivatives of Gaussian distribution functions, which can be easily evaluated either analytically or numerically.
Without loss of generality, we consider the twodimensional case, i.e. , where we have:
(18) 
(19) 
Therefore, the quantities of interest will be , and , for which we have:
(20) 
(21) 
(22) 
Thus we have to calculate the likelihood partial derivatives: , and .
For a normal distribution , the evidence is always be a sum of multivariate CDFs. For example, consider a constraint of the form and for the twodimensional case:
Finally in order to calculate the ADF updates, we need to calculate the following partial derivatives for the Gaussian CDF: , and .
Firstorder derivatives
For firstorder derivatives of the form we have:
(23) 
Note that we need the derivative with respect to (rather than ). For a univariate normal distribution we have:
(24) 
Therefore for the bivariate case we have:
(25) 
Secondorder derivatives
In the case of a bivariate Gaussian distribution, the secondorder derivative with respect to both and can be evaluated analytically:
In the more general case of a multivariate Gaussian distribution, the derivative can also be evaluated analytically:
However, there may not always be an analytical form for the secondorder derivative with respect to .
(26)  
(27) 
If the random variable is univariate, then it is easy to show that the derivative of its CDF will be:
(28) 
In a different case, there is not analytical expression available. Nevertheless, it is reasonable to approximate by means of numerical differentiation.
Appendix 0.B The LacZ Model
Reaction  Rate Function  Kinetic Constant 

Appendix 0.C The Stiff Viral Model
Reaction  Rate Function  Kinetic Constant 

Appendix 0.D The Genetic Oscillator Model
Reaction  Rate Function  Kinetic Constant 

Comments
There are no comments yet.