# Bayesian hierarchical modelling with Rainier

## Introduction

In the previous post I gave a brief introduction to Rainier, a new HMC-based probabilistic programming library/DSL for Scala. In that post I assumed that people were using the latest source version of the library. Since then, version 0.1.1 of the library has been released, so in this post I will demonstrate use of the released version of the software (using the binaries published to Sonatype), and will walk through a slightly more interesting example – a dynamic linear state space model with unknown static parameters. This is similar to, but slightly different from, the DLM example in the Rainier library. So to follow along with this post, all that is required is SBT.

## An interactive session

First run SBT from an empty directory, and paste the following at the SBT prompt:

```set libraryDependencies  += "com.stripe" %% "rainier-plot" % "0.1.1"
set scalaVersion := "2.12.4"
console
```

This should give a Scala REPL with appropriate dependencies (`rainier-plot` has all of the relevant transitive dependencies). We’ll begin with some imports, and then simulating some synthetic data from a dynamic linear state space model with an AR(1) latent state and Gaussian noise on the observations.

```import com.stripe.rainier.compute._
import com.stripe.rainier.core._
import com.stripe.rainier.sampler._

implicit val rng = ScalaRNG(1)
val n = 60 // number of observations/time points
val mu = 3.0 // AR(1) mean
val a = 0.95 // auto-regressive parameter
val sig = 0.2 // AR(1) SD
val sigD = 3.0 // observational SD
val state = Stream.
iterate(0.0)(x => mu + (x - mu) * a + sig * rng.standardNormal).
take(n).toVector
val obs = state.map(_ + sigD * rng.standardNormal)
```

Now we have some synthetic data, let’s think about building a probabilistic program for this model. Start with a prior.

```case class Static(mu: Real, a: Real, sig: Real, sigD: Real)
val prior = for {
mu <- Normal(0, 10).param
a <- Normal(1, 0.1).param
sig <- Gamma(2,1).param
sigD <- Gamma(2,2).param
sp <- Normal(0, 50).param
} yield (Static(mu, a, sig, sigD), List(sp))
```

Note the use of a case class for wrapping the static parameters. Next, let’s define a function to add a state and associated observation to an existing model.

```def addTimePoint(current: RandomVariable[(Static, List[Real])],
datum: Double) = for {
tup <- current
static = tup._1
states = tup._2
ns <- Normal(((Real.one - static.a) * static.mu) + (static.a * os),
static.sig).param
_ <- Normal(ns, static.sigD).fit(datum)
} yield (static, ns :: states)
```

Given this, we can generate the probabilistic program for our model as a fold over the data initialised with the prior.

```val fullModel = obs.foldLeft(prior)(addTimePoint(_, _))
```

If we don’t want to keep samples for all of the variables, we can focus on the parameters of interest, wrapping the results in a `Map` for convenient sampling and plotting.

```val model = for {
tup <- fullModel
static = tup._1
states = tup._2
} yield
Map("mu" -> static.mu,
"a" -> static.a,
"sig" -> static.sig,
"sigD" -> static.sigD,
```

We can sample with

```val out = model.sample(HMC(3), 100000, 10000 * 500, 500)
```

(this will take several minutes) and plot some diagnostics with

```import com.cibo.evilplot.geometry.Extent
import com.stripe.rainier.plot.EvilTracePlot._

val truth = Map("mu" -> mu, "a" -> a, "sigD" -> sigD,
"sig" -> sig, "SP" -> state(0))
render(traces(out, truth), "traceplots.png",
Extent(1200, 1400))
render(pairs(out, truth), "pairs.png")
```

This generates the following diagnostic plots:  Everything looks good.

## Summary

Rainier is a monadic embedded DSL for probabilistic programming in Scala. We can use standard functional combinators and for-expressions for building models to sample, and then run an efficient HMC algorithm on the resulting probability monad in order to obtain samples from the posterior distribution of the model.

See the Rainier repo for further details. ### darrenjw

I am Professor of Statistics within the Department of Mathematical Sciences at Durham University, UK. I am an Bayesian statistician interested in computation and applications, especially to engineering and the life sciences.

## 3 thoughts on “Bayesian hierarchical modelling with Rainier”

1. Zhenhao Li (@data_fly) says:

Thanks for this example! It seems that Rainier doesn’t support multi-core parallelism. Correct me if I am wrong.

1. darrenjw says:

Not within a single chain, but the idea is to run multiple chains on different cores.

1. Zhenhao Li (@data_fly) says:

Got it. I might use it inside an Akka actor system to get distributed scalability and parallelism.

This site uses Akismet to reduce spam. Learn how your comment data is processed.