Introduction

The Systems Biology Markup Language (SBML) is an XML-based format for representation and exchange of biochemical network models. SBML is supported by most systems biology modelling tools, allowing the export of a model in SBML from one tool and then reading in another tool. Because it offers a standard way of representing biochemical networks in an unambiguous way, it can also be used as the standard way of representing models in databases of biochemical network models, such as BioModels. I haven’t talked about SBML much in this blog, so far, but I discuss it in detail in my book, Stochastic modelling for systems biology. SBML is a “good thing”, and everyone who works with (deterministic or stochastic) biochemical network models should know a bit about it.

The SBML format is fairly complex to parse and generate correctly, so it’s preferable to use a software library to take care of the details. libSBML is the community standard library developed for this purpose. It is a C++ library, but has interfaces for other languages, such as Python and Java. However, whilst it’s perfectly possible to use native libraries on the JVM, they aren’t so convenient to work with, especially in conjunction with modern automatic build and deployment tools. So when working on the JVM, a pure JVM library for working with SBML would be a lot more convenient. JSBML is exactly that – a pure Java library for working with SBML on the JVM. As of version 1.2, it is also available from Maven Central, making it super-convenient to use with modern build tools such as Maven and sbt. In this post I’ll walk through getting started with using Scala and sbt to build and run a trivial JSBML example, and highlight a couple of gotchas and provide pointers for further reading.

Using JSBML from Scala sbt projects

Dependencies in sbt

Since JSBML is now on Maven Central, adding a dependency on it should just be a matter of adding the line

libraryDependencies += "org.sbml.jsbml" % "jsbml" % "1.2"


to your sbt build.sbt file. However, for slightly mysterious reasons this doesn’t quite work. It works fine for compilation, but at runtime some dependencies are missing. I suspect this is a slight problem with the current JSBML build, but it could also be a bug/feature in sbt. Either way, the problem can be solved by explicitly including log4j dependencies in the build. So just adding:

libraryDependencies ++= Seq(
"org.sbml.jsbml" % "jsbml" % "1.2",
"org.apache.logging.log4j" % "log4j-1.2-api" % "2.3",
"org.apache.logging.log4j" % "log4j-api" % "2.3",
"org.apache.logging.log4j" % "log4j-core" % "2.3"
)


to the build file is sufficient to make everything work properly.

Example Scala program

Below is a complete Scala program to read an SBML file from disk and print to console some very basic information about the model.

object JsbmlApp {
import scala.collection.JavaConversions._

def main(args: Array[String]): Unit = {
val filename = if (args.length == 0)
"ch07-mm-stoch.xml" else args(0)
val model = document.getModel
println(model.getId + "\n" + model.getName)
val listOfSpecies = model.getListOfSpecies
val ns = model.getNumSpecies
println(s"$ns Species:") listOfSpecies.iterator.foreach(species => { println(" " + species.getId + "\t" + species.getName + "\t" + species.getCompartment + "\t" + species.getInitialAmount) }) val nr = model.getNumReactions println(s"$nr Reactions.")
}

}


There are just a few things worth noting about this simple example. The first gotcha is to try and resist the temptation to import all SBML classes into the namespace (with import org.sbml.jsbml._). This is poor programming practice at the best of times, but here it is especially problematic. Scala programmers will be aware that Unit is a very important type in the Scala language, which has nothing to do with the JSBML class Unit, which represents a physical unit of measurement. The clash can be avoided by using the fully qualified name, org.sbml.jsbml.Unit wherever the JSBML Unit class is intended, but that is rather cumbersome, so the typical Scala mechanism for dealing with this is to rename the class on import, using, for example:

import org.sbml.jsbml.{ Unit => JsbmlUnit }


Then in code it is clear that Unit refers to the Scala type and JsbmlUnit refers to the JSBML class.

Also note that JavaConversions has been imported. This provides an implicit conversion from a Java to a Scala iterator, and this simplifies iterating over SBML listOfs. Here it is used it to implicitly convert the listOfSpecies Java iterator into a Scala iterator so that I can call foreach on it.

This complete runnable example is available in my blog repo on github. This example will run on any system with a recent JVM installed. It does not require Scala, or libSBML, or JSBML, or any other dependency (sbt will take care of dependency resolution).

Once you are up and running with a simple example like this, the JSBML Documentation is fine. Start by reading the User guide and then use the API Documentation.

Conclusion

Working with SBML in Scala is quite convenient using JSBML. It is easy to include a dependence on JSBML in Scala sbt projects. JSBML has a typical Java Object-Oriented API that is somewhat unnatural in Scala, but isn’t too bad using a few tricks, such as implicit iterator conversion. It wouldn’t be very difficult to layer a more functional API on top of JSBML, but I don’t have the energy to do that. See my blog repo for the full runnable example.

Calling Scala code from R using rscala

Introduction

In a previous post I looked at how to call Scala code from R using a CRAN package called jvmr. This package now seems to have been replaced by a new package called rscala. Like the old package, it requires a pre-existing Java installation. Unlike the old package, however, it no longer depends on rJava, which may simplify some installations. The rscala package is well documented, with a reference manual and a draft paper. In this post I will concentrate on the issue of calling sbt-based projects with dependencies on external libraries (such as breeze).

On a system with Java installed, it should be possible to install the rscala package with a simple

install.packages("rscala")


from the R command prompt. Calling

library(rscala)


will check that it has worked. The package will do a sensible search for a Scala installation and use it if it can find one. If it can’t find one (or can only find an installation older than 2.10.x), it will fail. In this case you can download and install a Scala installation specifically for rscala using the command

rscala::scalaInstall()


This option is likely to be attractive to sbt (or IDE) users who don’t like to rely on a system-wide scala installation.

A Gibbs sampler in Scala using Breeze

For illustration I’m going to use a Scala implementation of a Gibbs sampler. The Scala code, gibbs.scala is given below:

package gibbs

object Gibbs {

import scala.annotation.tailrec
import scala.math.sqrt
import breeze.stats.distributions.{Gamma,Gaussian}

case class State(x: Double, y: Double) {
override def toString: String = x.toString + " , " + y + "\n"
}

def nextIter(s: State): State = {
val newX = Gamma(3.0, 1.0/((s.y)*(s.y)+4.0)).draw
State(newX, Gaussian(1.0/(newX+1), 1.0/sqrt(2*newX+2)).draw)
}

@tailrec def nextThinnedIter(s: State,left: Int): State =
if (left==0) s else nextThinnedIter(nextIter(s),left-1)

def genIters(s: State, stop: Int, thin: Int): List[State] = {
@tailrec def go(s: State, left: Int, acc: List[State]): List[State] =
if (left>0)
go(nextThinnedIter(s,thin), left-1, s::acc)
else acc
go(s,stop,Nil).reverse
}

def main(args: Array[String]) = {
if (args.length != 3) {
println("Usage: sbt \"run <outFile> <iters> <thin>\"")
sys.exit(1)
} else {
val outF=args(0)
val iters=args(1).toInt
val thin=args(2).toInt
val out = genIters(State(0.0,0.0),iters,thin)
val s = new java.io.FileWriter(outF)
s.write("x , y\n")
out map { it => s.write(it.toString) }
s.close
}
}

}


This code requires Scala and the Breeze scientific library in order to build. We can specify this in a sbt build file, which should be called build.sbt and placed in the same directory as the Scala code.

name := "gibbs"

version := "0.1"

scalacOptions ++= Seq("-unchecked", "-deprecation", "-feature")

libraryDependencies  ++= Seq(
"org.scalanlp" %% "breeze" % "0.10",
"org.scalanlp" %% "breeze-natives" % "0.10"
)

resolvers ++= Seq(
"Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/",
"Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
)

scalaVersion := "2.11.6"


Now, from a system command prompt in the directory where the files are situated, it should be possible to download all dependencies and compile and run the code with a simple

sbt "run output.csv 50000 1000"


sbt magically manages all of the dependencies for us so that we don’t have to worry about them. However, for calling from R, it may be desirable to run the code without running sbt. There are several ways to achieve this, but the simplest is to build an “assembly jar” or “fat jar”, which is a Java byte-code file containing all code and libraries required in order to run the code on any system with a Java installation.

To build an assembly jar first create a subdirectory called project (the name matters), an in it place two files. The first should be called assembly.sbt, and should contain the line

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")


Since the version of the assembly tool can depend on the version of sbt, it is also best to fix the version of sbt being used by creating another file in the project directory called build.properties, which should contain the line

sbt.version=0.13.7


sbt assembly


If this works, it should create a fat jar target/scala-2.11/gibbs-assembly-0.1.jar. You can check it works by running

java -jar target/scala-2.11/gibbs-assembly-0.1.jar output.csv 10000 10


Assuming that it does, you are now ready to try running the code from within R.

Calling via R system calls

Since this code takes a relatively long time to run, calling it from R via simple system calls isn’t a particularly terrible idea. For example, we can do this from the R command prompt with the following commands

system("java -jar target/scala-2.11/gibbs-assembly-0.1.jar output.csv 50000 1000")
library(smfsb)
mcmcSummary(out,rows=2)


This works fine, but is a bit clunky. Tighter integration between R and Scala would be useful, which is where rscala comes in.

Calling assembly Scala projects via rscala

rscala provides a very simple way to embed a Scala interpreter within an R session, to be able to execute Scala expressions from R and to have the results returned back to the R session for further processing. The main issue with using this in practice is managing dependencies on external libraries and setting the Scala classpath correctly. By using an assembly jar we can bypass most of these issues, and it becomes trivial to call our Scala code direct from the R interpreter, as the following code illustrates.

library(rscala)
sc=scalaInterpreter("target/scala-2.11/gibbs-assembly-0.1.jar")
sc%~%'import gibbs.Gibbs._'
out=sc%~%'genIters(State(0.0,0.0),50000,1000).toArray.map{s=>Array(s.x,s.y)}'
library(smfsb)
mcmcSummary(out,rows=2)


Here we call the getIters function directly, rather than via the main method. This function returns an immutable List of States. Since R doesn’t understand this, we map it to an Array of Arrays, which R then unpacks into an R matrix for us to store in the matrix out.

Summary

The CRAN package rscala makes it very easy to embed a Scala interpreter within an R session. However, for most non-trivial statistical computing problems, the Scala code will have dependence on external scientific libraries such as Breeze. The standard way to easily manage external dependencies in the Scala ecosystem is sbt. Given an sbt-based Scala project, it is easy to generate an assembly jar in order to initialise the rscala Scala interpreter with the classpath needed to call arbitrary Scala functions. This provides very convenient inter-operability between R and Scala for many statistical computing applications.

Calling Scala code from R using jvmr

[Update: the jvmr package has been replaced by a new package called rscala. I have a new post which explains it.]

Introduction

In previous posts I have explained why I think that Scala is a good language to use for statistical computing and data science. Despite this, R is very convenient for simple exploratory data analysis and visualisation – currently more convenient than Scala. I explained in my recent talk at the RSS what (relatively straightforward) things would need to be developed for Scala in order to make R completely redundant, but for the short term at least, it seems likely that I will need to use both R and Scala for my day-to-day work.

Since I use both Scala and R for statistical computing, it is very convenient to have a degree of interoperability between the two languages. I could call R from Scala code or Scala from R code, or both. Fortunately, some software tools have been developed recently which make this much simpler than it used to be. The software is jvmr, and as explained at the website, it enables calling Java and Scala from R and calling R from Java and Scala. I have previously discussed calling Java from R using the R CRAN package rJava. In this post I will focus on calling Scala from R using the CRAN package jvmr, which depends on rJava. I may examine calling R from Scala in a future post.

On a system with Java installed, it should be possible to install the jvmr R package with a simple

install.packages("jvmr")


from the R command prompt. The package has the usual documentation associated with it, but the draft paper describing the package is the best way to get an overview of its capabilities and a walk-through of simple usage.

A Gibbs sampler in Scala using Breeze

For illustration I’m going to use a Scala implementation of a Gibbs sampler which relies on the Breeze scientific library, and will be built using the simple build tool, sbt. Most non-trivial Scala projects depend on various versions of external libraries, and sbt is an easy way to build even very complex projects trivially on any system with Java installed. You don’t even need to have Scala installed in order to build and run projects using sbt. I give some simple complete worked examples of building and running Scala sbt projects in the github repo associated with my recent RSS talk. Installing sbt is trivial as explained in the repo READMEs.

For this post, the Scala code, gibbs.scala is given below:

package gibbs

object Gibbs {

import scala.annotation.tailrec
import scala.math.sqrt
import breeze.stats.distributions.{Gamma,Gaussian}

case class State(x: Double, y: Double) {
override def toString: String = x.toString + " , " + y + "\n"
}

def nextIter(s: State): State = {
val newX = Gamma(3.0, 1.0/((s.y)*(s.y)+4.0)).draw
State(newX, Gaussian(1.0/(newX+1), 1.0/sqrt(2*newX+2)).draw)
}

@tailrec def nextThinnedIter(s: State,left: Int): State =
if (left==0) s else nextThinnedIter(nextIter(s),left-1)

def genIters(s: State, stop: Int, thin: Int): List[State] = {
@tailrec def go(s: State, left: Int, acc: List[State]): List[State] =
if (left&gt;0)
go(nextThinnedIter(s,thin), left-1, s::acc)
else acc
go(s,stop,Nil).reverse
}

def main(args: Array[String]) = {
if (args.length != 3) {
println("Usage: sbt \"run &lt;outFile&gt; &lt;iters&gt; &lt;thin&gt;\"")
sys.exit(1)
} else {
val outF=args(0)
val iters=args(1).toInt
val thin=args(2).toInt
val out = genIters(State(0.0,0.0),iters,thin)
val s = new java.io.FileWriter(outF)
s.write("x , y\n")
out map { it =&gt; s.write(it.toString) }
s.close
}
}

}


This code requires Scala and the Breeze scientific library in order to build. We can specify this in a sbt build file, which should be called build.sbt and placed in the same directory as the Scala code.

name := "gibbs"

version := "0.1"

scalacOptions ++= Seq("-unchecked", "-deprecation", "-feature")

libraryDependencies  ++= Seq(
"org.scalanlp" %% "breeze" % "0.10",
"org.scalanlp" %% "breeze-natives" % "0.10"
)

resolvers ++= Seq(
"Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/",
"Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
)

scalaVersion := "2.11.2"


Now, from a system command prompt in the directory where the files are situated, it should be possible to download all dependencies and compile and run the code with a simple

sbt "run output.csv 50000 1000"


Calling via R system calls

Since this code takes a relatively long time to run, calling it from R via simple system calls isn’t a particularly terrible idea. For example, we can do this from the R command prompt with the following commands

system("sbt \"run output.csv 50000 1000\"")
library(smfsb)
mcmcSummary(out,rows=2)


This works fine, but won’t work so well for code which needs to be called repeatedly. For this, tighter integration between R and Scala would be useful, which is where jvmr comes in.

Calling sbt-based Scala projects via jvmr

jvmr provides a very simple way to embed a Scala interpreter within an R session, to be able to execute Scala expressions from R and to have the results returned back to the R session for further processing. The main issue with using this in practice is managing dependencies on external libraries and setting the Scala classpath correctly. For an sbt project such as we are considering here, it is relatively easy to get sbt to provide us with all of the information we need in a fully automated way.

First, we need to add a new task to our sbt build instructions, which will output the full classpath in a way that is easy to parse from R. Just add the following to the end of the file build.sbt:

lazy val printClasspath = taskKey[Unit]("Dump classpath")

printClasspath := {
(fullClasspath in Runtime value) foreach {
e =&gt; print(e.data+"!")
}
}


Be aware that blank lines are significant in sbt build files. Once we have this in our build file, we can write a small R function to get the classpath from sbt and then initialise a jvmr scalaInterpreter with the correct full classpath needed for the project. An R function which does this, sbtInit(), is given below

sbtInit&lt;-function()
{
library(jvmr)
system2("sbt","compile")
cpstr=system2("sbt","printClasspath",stdout=TRUE)
cpst=cpstr[length(cpstr)]
cpsp=strsplit(cpst,"!")[[1]]
cp=cpsp[1:(length(cpsp)-1)]
scalaInterpreter(cp,use.jvmr.class.path=FALSE)
}


With this function at our disposal, it becomes trivial to call our Scala code direct from the R interpreter, as the following code illustrates.

sc=sbtInit()
sc['import gibbs.Gibbs._']
out=sc['genIters(State(0.0,0.0),50000,1000).toArray.map{s=&gt;Array(s.x,s.y)}']
library(smfsb)
mcmcSummary(out,rows=2)


Here we call the getIters function directly, rather than via the main method. This function returns an immutable List of States. Since R doesn’t understand this, we map it to an Array of Arrays, which R then unpacks into an R matrix for us to store in the matrix out.

Summary

The CRAN package jvmr makes it very easy to embed a Scala interpreter within an R session. However, for most non-trivial statistical computing problems, the Scala code will have dependence on external scientific libraries such as Breeze. The standard way to easily manage external dependencies in the Scala ecosystem is sbt. Given an sbt-based Scala project, it is easy to add a task to the sbt build file and a function to R in order to initialise the jvmr Scala interpreter with the full classpath needed to call arbitrary Scala functions. This provides very convenient inter-operability between R and Scala for many statistical computing applications.

Parallel Monte Carlo using Scala

Introduction

In previous posts I have discussed general issues regarding parallel MCMC and examined in detail parallel Monte Carlo on a multicore laptop. In those posts I used the C programming language in conjunction with the MPI parallel library in order to illustrate the concepts. In this post I want to take the example from the second post and re-examine it using the Scala programming language.

The toy problem considered in the parallel Monte Carlo post used $10^9$ $U(0,1)$ random quantities to construct a Monte Carlo estimate of the integral

$\displaystyle I=\int_0^1\exp\{-u^2\}du$.

A very simple serial program to implement this algorithm is given below:

import java.util.concurrent.ThreadLocalRandom
import scala.math.exp
import scala.annotation.tailrec

object MonteCarlo {

@tailrec
def sum(its: Long,acc: Double): Double = {
if (its==0)
(acc)
else {
sum(its-1,acc+exp(-u*u))
}
}

def main(args: Array[String]) = {
println("Hello")
val iters=1000000000
val result=sum(iters,0.0)
println(result/iters)
println("Goodbye")
}

}


Note that ThreadLocalRandom is a parallel random number generator introduced into recent versions of the Java programming language, which can be easily utilised from Scala code. Assuming that Scala is installed, this can be compiled and run with commands like

scalac monte-carlo.scala
time scala MonteCarlo


This program works, and the timings (in seconds) for three runs are 57.79, 57.77 and 57.55 on the same laptop considered in the previous post. The first thing to note is that this Scala code is actually slightly faster than the corresponding C+MPI code in the single processor special case! Now that we have a good working implementation we should think how to parallelise it…

Parallel implementation

Before constructing a parallel implementation, we will first construct a slightly re-factored serial version that will be easier to parallelise. The simplest way to introduce parallelisation into Scala code is to parallelise a map over a collection. We therefore need a collection and a map to apply to it. Here we will just divide our $10^9$ iterations into $N=4$ separate computations, and use a map to compute the required Monte Carlo sums.

import java.util.concurrent.ThreadLocalRandom
import scala.math.exp
import scala.annotation.tailrec

object MonteCarlo {

@tailrec
def sum(its: Long,acc: Double): Double = {
if (its==0)
(acc)
else {
sum(its-1,acc+exp(-u*u))
}
}

def main(args: Array[String]) = {
println("Hello")
val N=4
val iters=1000000000
val its=iters/N
val sums=(1 to N).toList map {x => sum(its,0.0)}
val result=sums.reduce(_+_)
println(result/iters)
println("Goodbye")
}

}


Running this new code confirms that it works and gives similar estimates for the Monte Carlo integral as the previous version. The timings for 3 runs on my laptop were 57.57, 57.67 and 57.80, similar to the previous version of the code. So far so good. But how do we make it parallel? Like this:

import java.util.concurrent.ThreadLocalRandom
import scala.math.exp
import scala.annotation.tailrec

object MonteCarlo {

@tailrec
def sum(its: Long,acc: Double): Double = {
if (its==0)
(acc)
else {
sum(its-1,acc+exp(-u*u))
}
}

def main(args: Array[String]) = {
println("Hello")
val N=4
val iters=1000000000
val its=iters/N
val sums=(1 to N).toList.par map {x => sum(its,0.0)}
val result=sums.reduce(_+_)
println(result/iters)
println("Goodbye")
}

}


That’s it! It’s now parallel. Studying the above code reveals that the only difference from the previous version is the introduction of the 4 characters .par in line 22 of the code. R programmers will find this very much analagous to using lapply() versus mclapply() in R code. The function par converts the collection (here an immutable List) to a parallel collection (here an immutable parallel List), and then subsequent maps, filters, etc., can be computed in parallel on appropriate multicore architectures. Timings for 3 runs on my laptop were 20.74, 20.82 and 20.88. Note that these timings are faster than the timings for N=4 processors for the corresponding C+MPI code…

Varying the size of the parallel collection

We can trivially modify the previous code to make the size of the parallel collection, N, a command line argument:

import java.util.concurrent.ThreadLocalRandom
import scala.math.exp
import scala.annotation.tailrec

object MonteCarlo {

@tailrec
def sum(its: Long,acc: Double): Double = {
if (its==0)
(acc)
else {
sum(its-1,acc+exp(-u*u))
}
}

def main(args: Array[String]) = {
println("Hello")
val N=args(0).toInt
val iters=1000000000
val its=iters/N
val sums=(1 to N).toList.par map {x => sum(its,0.0)}
val result=sums.reduce(_+_)
println(result/iters)
println("Goodbye")
}

}


We can now run this code with varying sizes of N in order to see how the runtime of the code changes as the size of the parallel collection increases. Timings on my laptop are summarised in the table below.

 N     T1     T2     T3
1   57.67  57.62  57.83
2   32.20  33.24  32.76
3   26.63  26.60  26.63
4   20.99  20.92  20.75
5   20.13  18.70  18.76
6   16.57  16.52  16.59
7   15.72  14.92  15.27
8   13.56  13.51  13.32
9   18.30  18.13  18.12
10   17.25  17.33  17.22
11   17.04  16.99  17.09
12   15.95  15.85  15.91

16   16.62  16.68  16.74
32   15.41  15.54  15.42
64   15.03  15.03  15.28


So we see that the timings decrease steadily until the size of the parallel collection hits 8 (the number of processors my hyper-threaded quad-core presents via Linux), and then increases very slightly, but not much as the size of the collection increases. This is better than the case of C+MPI where performance degrades noticeably if too many processes are requested. Here, the Scala compiler and JVM runtime manage an appropriate number of threads for the collection irrespective of the actual size of the collection. Also note that all of the timings are faster than the corresponding C+MPI code discussed in the previous post.

However, the notion that the size of the collection is irrelevant is only true up to a point. Probably the most natural way to code this algorithm would be as:

import java.util.concurrent.ThreadLocalRandom
import scala.math.exp

object MonteCarlo {

def main(args: Array[String]) = {
println("Hello")
val iters=1000000000
val sums=(1 to iters).toList map {x => ThreadLocalRandom.current().nextDouble()} map {x => exp(-x*x)}
val result=sums.reduce(_+_)
println(result/iters)
println("Goodbye")
}

}


or as the parallel equivalent

import java.util.concurrent.ThreadLocalRandom
import scala.math.exp

object MonteCarlo {

def main(args: Array[String]) = {
println("Hello")
val iters=1000000000
val sums=(1 to iters).toList.par map {x => ThreadLocalRandom.current().nextDouble()} map {x => exp(-x*x)}
val result=sums.reduce(_+_)
println(result/iters)
println("Goodbye")
}

}


Although these algorithms are in many ways cleaner and more natural, they will bomb out with a lack of heap space unless you have a huge amount of RAM, as they rely on having all $10^9$ realisations in RAM simultaneously. The lesson here is that even though functional languages make it very easy to write clean, efficient parallel code, we must still be careful not to fill up the heap with gigantic (immutable) data structures…

A functional Gibbs sampler in Scala

For many years I’ve had a passing interest in functional programming and languages which support functional programming approaches. I’m also quite interested in MOOCs and their future role in higher education. So I recently signed up for my first on-line course, Functional Programming Principles in Scala, via Coursera. I’m around half way through the course at the time of writing, and I’m enjoying it very much. I knew that I didn’t know much about Scala before starting the course, but during the course I’ve also learned that I didn’t know as much about functional programming as I thought I did, either! 😉 The course itself is very interesting, the assignments are well designed and appropriately challenging, and the web infrastructure to support the course is working well. I suspect I’ll try other on-line courses in the future.

Functional programming emphasises immutability, and discourages imperative programming approaches that use variables that can be modified during run-time. There are many advantages to immutability, especially in the context of parallel and concurrent programming, which is becoming increasingly important as multi-core systems become the norm. I’ve always found functional programming to be intellectually appealing, but have often worried about the practicalities of using functional programming in the context of scientific computing where many algorithms are iterative in nature, and are typically encoded using imperative approaches. The Scala programming language is appealing to me as it supports both imperative and functional styles of programming, as well as object oriented approaches. However, as a result of taking this course I am now determined to pursue functional approaches further, and get more of a feel for how practical they are for scientific computing applications.

For my first experiment, I’m going back to my post describing a Gibbs sampler in various languages. See that post for further details of the algorithm. In that post I did have an example implementation in Scala, which looked like this:

object GibbsSc {

import cern.jet.random.tdouble.engine.DoubleMersenneTwister
import cern.jet.random.tdouble.Normal
import cern.jet.random.tdouble.Gamma
import Math.sqrt
import java.util.Date

def main(args: Array[String]) {
val N=50000
val thin=1000
val rngEngine=new DoubleMersenneTwister(new Date)
val rngN=new Normal(0.0,1.0,rngEngine)
val rngG=new Gamma(1.0,1.0,rngEngine)
var x=0.0
var y=0.0
println("Iter x y")
for (i <- 0 until N) {
for (j <- 0 until thin) {
x=rngG.nextDouble(3.0,y*y+4)
y=rngN.nextDouble(1.0/(x+1),1.0/sqrt(2*x+2))
}
println(i+" "+x+" "+y)
}
}

}


At the time I wrote that post I knew even less about Scala than I do now, so I created the code by starting from the Java version and removing all of the annoying clutter! 😉 Clearly this code has an imperative style, utilising variables (declared with var) x and y having mutable state that is updated by a nested for loop. This algorithm is typical of the kind I use every day, so if I can’t re-write this in a more functional style, removing all mutable variables from my code, then I’m not going to get very far with functional programming!

In fact it is very easy to re-write this in a more functional style without utilising mutable variables. One possible approach is presented below.

object FunGibbs {

import cern.jet.random.tdouble.engine.DoubleMersenneTwister
import cern.jet.random.tdouble.Normal
import cern.jet.random.tdouble.Gamma
import java.util.Date
import scala.math.sqrt

val rngEngine=new DoubleMersenneTwister(new Date)
val rngN=new Normal(0.0,1.0,rngEngine)
val rngG=new Gamma(1.0,1.0,rngEngine)

class State(val x: Double,val y: Double)

def nextIter(s: State): State = {
val newX=rngG.nextDouble(3.0,(s.y)*(s.y)+4.0)
new State(newX,
rngN.nextDouble(1.0/(newX+1),1.0/sqrt(2*newX+2)))
}

def nextThinnedIter(s: State,left: Int): State = {
if (left==0) s
else nextThinnedIter(nextIter(s),left-1)
}

def genIters(s: State,current: Int,stop: Int,thin: Int): State = {
if (!(current>stop)) {
println(current+" "+s.x+" "+s.y)
genIters(nextThinnedIter(s,thin),current+1,stop,thin)
}
else s
}

def main(args: Array[String]) {
println("Iter x y")
genIters(new State(0.0,0.0),1,50000,1000)
}

}


Although it is a few lines longer, it is a fairly clean implementation, and doesn’t look like a hack. Like many functional programs, this one makes extensive use of recursion. This is one of the things that has always concerned me about functional programming – many scientific computing applications involve lots of iteration, and that can potentially translate into very deep recursion. The above program has an apparent recursion depth of 50 million! However, it runs fine without crashing despite the fact that most programming languages will crash out with a stack overflow with recursion depths of more than a couple of thousand. So why doesn’t this crash? It runs fine because the recursion I used is a special form of recursion known as a tail call. Most functional (and some imperative) programming languages automatically perform tail call elimination which essentially turns the tail call into an iteration which runs very fast without creating new stack frames. In fact, this functional version of the code runs at roughly the same speed as the iterative version I presented first (perhaps just a few percent slower – I haven’t done careful timings), and runs well within a factor of 2 of imperative C code. So actually this seems perfectly practical so far, and I’m looking forward to experimenting more with functional programming approaches to statistical computation over the coming months…

Gibbs sampling a Gaussian Markov random field (GMRF) using Java

Introduction

As I’ve explained previously, I’m gradually coming around to the idea of using Java for the development of MCMC codes, and I’m starting to build up a collection of simple examples for getting started. One of the advantages of Java is that it includes a standard cross-platform GUI library. This might not seem like the most important requirement for MCMC, but can actually be very handy in several contexts, particularly for monitoring convergence. One obvious context is that of image analysis, where it can be useful to monitor image reconstructions as the sampler is running. In this post I’ll show three very small simple Java classes which together provide an application for running a Gibbs sampler on a (non-stationary, unconditioned) Gaussian Markov random field.

The model is essentially that the distribution of each pixel is defined intrinsically, dependent only on its four nearest neighbours on a rectangular lattice, and here the distribution will be Gaussian with mean equal to the sample mean of the four neighbouring pixels and a fixed (unit) variance. On its own this isn’t especially useful, but it is a key component of many image analysis applications.

A simple Java implementation

We will start with the class MrfApp containing the main method for the application:

MrfApp.java

import java.io.*;
class MrfApp {
public static void main(String[] arg)
throws IOException
{
Mrf mrf;
System.out.println("started program");
mrf=new Mrf(800,600);
System.out.println("created mrf object");
mrf.update(1000);
mrf.saveImage("mrf.png");
System.out.println("finished program");
mrf.frame.dispose();
System.exit(0);
}
}


Hopefully this code is largely self-explanatory, but relies on a class called Mrf which contains all of the logic associated with the GMRF.

Mrf.java

import java.io.*;
import java.util.*;
import java.awt.image.*;
import javax.swing.*;
import javax.imageio.ImageIO;

class Mrf
{
int n,m;
double[][] cells;
Random rng;
BufferedImage bi;
WritableRaster wr;
JFrame frame;
ImagePanel ip;

Mrf(int n_arg,int m_arg)
{
n=n_arg;
m=m_arg;
cells=new double[n][m];
rng=new Random();
bi=new BufferedImage(n,m,BufferedImage.TYPE_BYTE_GRAY);
wr=bi.getRaster();
frame=new JFrame("MRF");
frame.setSize(n,m);
frame.setVisible(true);
}

public void saveImage(String filename)
throws IOException
{
ImageIO.write(bi,"PNG",new File(filename));
}

public void updateImage()
{
double mx=-1e+100;
double mn=1e+100;
for (int i=0;i<n;i++) {
for (int j=0;j<m;j++) {
if (cells[i][j]>mx) { mx=cells[i][j]; }
if (cells[i][j]<mn) { mn=cells[i][j]; }
}
}
for (int i=0;i<n;i++) {
for (int j=0;j<m;j++) {
int level=(int) (255*(cells[i][j]-mn)/(mx-mn));
wr.setSample(i,j,0,level);
}
}
frame.repaint();
}

public void update(int num)
{
for (int i=0;i<num;i++) {
updateOnce();
}
}

private void updateOnce()
{
double mean;
for (int i=0;i<n;i++) {
for (int j=0;j<m;j++) {
if (i==0) {
if (j==0) {
mean=0.5*(cells[0][1]+cells[1][0]);
}
else if (j==m-1) {
mean=0.5*(cells[0][j-1]+cells[1][j]);
}
else {
mean=(cells[0][j-1]+cells[0][j+1]+cells[1][j])/3.0;
}
}
else if (i==n-1) {
if (j==0) {
mean=0.5*(cells[i][1]+cells[i-1][0]);
}
else if (j==m-1) {
mean=0.5*(cells[i][j-1]+cells[i-1][j]);
}
else {
mean=(cells[i][j-1]+cells[i][j+1]+cells[i-1][j])/3.0;
}
}
else if (j==0) {
mean=(cells[i-1][0]+cells[i+1][0]+cells[i][1])/3.0;
}
else if (j==m-1) {
mean=(cells[i-1][j]+cells[i+1][j]+cells[i][j-1])/3.0;
}
else {
mean=0.25*(cells[i][j-1]+cells[i][j+1]+cells[i+1][j]
+cells[i-1][j]);
}
cells[i][j]=mean+rng.nextGaussian();
}
}
updateImage();
}

}


This class contains a few simple methods for creating and updating the GMRF, and also for maintaining and updating a graphical view of the GMRF as the sampler is running. The Gibbs sampler update itself is encoded in the final method, updateOnce, and most of the code is to deal with edge and corner cases (in the literal rather than metaphorical sense!). This is called repeatedly by the method update for the required number of iterations. At the end of each iteration, the method updateOnce triggers updateImage which updates the image associated GMRF. The GMRF itself is stored in a 2-dimensional array of doubles, but an image pixel typically consists of a grayscale value represented by an unsigned byte – that is, an integer from 0 to 255. So updateImage scans through the GMRF to find the maximum and minimum values and then maps the GMRF values onto the 0 to 255 scale. The image itself is set up by the constructor method, Mrf. This class relies on an additional class called ImagePanel, which is a simple GUI panel for displaying images:

ImagePanel.java

import java.awt.*;
import java.awt.image.*;
import javax.swing.*;

class ImagePanel extends JPanel {

protected BufferedImage image;

public ImagePanel(BufferedImage image) {
this.image=image;
Dimension dim=new Dimension(image.getWidth(),image.getHeight());
setPreferredSize(dim);
setMinimumSize(dim);
revalidate();
repaint();
}

public void paintComponent(Graphics g) {
g.drawImage(image,0,0,this);
}

}


This completes the application, which can be compiled and run from the command line with

javac *.java
java MrfApp


This should compile the code and run the application, which will show a GMRF updating for 1000 iterations. When the 1000 iterations are complete, the application writes the final image to a file and then quits.

Using Parallel COLT

The above classes are very convenient, as they should work with any standard Java installation. However, in more complex scenarios, it is likely that a math library such as Parallel COLT will be required. In this case it will make sense to make use of features in the COLT library, such as random number generators and 2d matrix objects. We can adapt the above application by replacing the MrfApp and Mrf classes with the following versions (the ImagePanel class remains unchanged):

MrfApp.java

import java.io.*;
import cern.jet.random.tdouble.engine.*;

class MrfApp {

public static void main(String[] arg)
throws IOException
{
Mrf mrf;
int seed=1234;
System.out.println("started program");
DoubleRandomEngine rngEngine=new DoubleMersenneTwister(seed);
mrf=new Mrf(800,600,rngEngine);
System.out.println("created mrf object");
mrf.update(1000);
mrf.saveImage("mrf.png");
System.out.println("finished program");
mrf.frame.dispose();
System.exit(0);
}

}


Mrf.java

import java.io.*;
import java.util.*;
import java.awt.image.*;
import javax.swing.*;
import javax.imageio.ImageIO;
import cern.jet.random.tdouble.*;
import cern.jet.random.tdouble.engine.*;
import cern.colt.matrix.tdouble.impl.*;

class Mrf
{
int n,m;
DenseDoubleMatrix2D cells;
DoubleRandomEngine rng;
Normal rngN;
BufferedImage bi;
WritableRaster wr;
JFrame frame;
ImagePanel ip;

Mrf(int n_arg,int m_arg,DoubleRandomEngine rng)
{
n=n_arg;
m=m_arg;
cells=new DenseDoubleMatrix2D(n,m);
this.rng=rng;
rngN=new Normal(0.0,1.0,rng);
bi=new BufferedImage(n,m,BufferedImage.TYPE_BYTE_GRAY);
wr=bi.getRaster();
frame=new JFrame("MRF");
frame.setSize(n,m);
frame.setVisible(true);
}

public void saveImage(String filename)
throws IOException
{
ImageIO.write(bi,"PNG",new File(filename));
}

public void updateImage()
{
double mx=-1e+100;
double mn=1e+100;
for (int i=0;i<n;i++) {
for (int j=0;j<m;j++) {
if (cells.getQuick(i,j)>mx) { mx=cells.getQuick(i,j); }
if (cells.getQuick(i,j)<mn) { mn=cells.getQuick(i,j); }
}
}
for (int i=0;i<n;i++) {
for (int j=0;j<m;j++) {
int level=(int) (255*(cells.getQuick(i,j)-mn)/(mx-mn));
wr.setSample(i,j,0,level);
}
}
frame.repaint();
}

public void update(int num)
{
for (int i=0;i<num;i++) {
updateOnce();
}
}

private void updateOnce()
{
double mean;
for (int i=0;i<n;i++) {
for (int j=0;j<m;j++) {
if (i==0) {
if (j==0) {
mean=0.5*(cells.getQuick(0,1)+cells.getQuick(1,0));
}
else if (j==m-1) {
mean=0.5*(cells.getQuick(0,j-1)+cells.getQuick(1,j));
}
else {
mean=(cells.getQuick(0,j-1)+cells.getQuick(0,j+1)+cells.getQuick(1,j))/3.0;
}
}
else if (i==n-1) {
if (j==0) {
mean=0.5*(cells.getQuick(i,1)+cells.getQuick(i-1,0));
}
else if (j==m-1) {
mean=0.5*(cells.getQuick(i,j-1)+cells.getQuick(i-1,j));
}
else {
mean=(cells.getQuick(i,j-1)+cells.getQuick(i,j+1)+cells.getQuick(i-1,j))/3.0;
}
}
else if (j==0) {
mean=(cells.getQuick(i-1,0)+cells.getQuick(i+1,0)+cells.getQuick(i,1))/3.0;
}
else if (j==m-1) {
mean=(cells.getQuick(i-1,j)+cells.getQuick(i+1,j)+cells.getQuick(i,j-1))/3.0;
}
else {
mean=0.25*(cells.getQuick(i,j-1)+cells.getQuick(i,j+1)+cells.getQuick(i+1,j)
+cells.getQuick(i-1,j));
}
cells.setQuick(i,j,mean+rngN.nextDouble());
}
}
updateImage();
}

}


Again, the code should be reasonably self explanatory, and will compile and run in the same way provided that Parallel COLT is installed and in your classpath. This version runs approximately twice as fast as the previous version on all of the machines I’ve tried it on.

Reference

I have found the following book very useful for understanding how to work with images in Java:

Hunt, K.A. (2010) The Art of Image Processing with Java, A K Peters/CRC Press.

Faster Gibbs sampling MCMC from within R

Introduction

This post follows on from the previous post on Gibbs sampling in various languages. In that post a simple Gibbs sampler was implemented in various languages, and speeds were compared. It was seen that R is very slow for iterative simulation algorithms characteristic of MCMC methods such as the Gibbs sampler. Statically typed languages such as C/C++ and Java were seen to be fastest for this type of algorithm. Since many statisticians like to use R for most of their work, there is natural interest in the possibility of extending R by calling simulation algorithms written in other languages. It turns out to be straightforward to call C, C++ and Java from within R, so this post will look at how this can be done, and exactly how fast the different options turn out to be. The post draws heavily on my previous posts on calling C from R and calling Java from R, as well as Dirk Eddelbuettel’s post on calling C++ from R, and it may be helpful to consult these posts for further details.

Languages

R

We will start with the simple pure R version of the Gibbs sampler, and use this as our point of reference for understanding the benefits of re-coding in other languages. The background to the problem was given in the previous post and so won’t be repeated here. The code can be given as follows:

gibbs<-function(N=50000,thin=1000)
{
mat=matrix(0,ncol=2,nrow=N)
x=0
y=0
for (i in 1:N) {
for (j in 1:thin) {
x=rgamma(1,3,y*y+4)
y=rnorm(1,1/(x+1),1/sqrt(2*x+2))
}
mat[i,]=c(x,y)
}
names(mat)=c("x","y")
mat
}


This code works perfectly, but is very slow. It takes 458.9 seconds on my very fast laptop (details given in previous post).

C

Let us now see how we can introduce a new function, gibbsC into R, which works in exactly the same way as gibbs, but actually calls on compiled C code to do all of the work. First we need the C code in a file called gibbs.c:

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <R.h>
#include <Rmath.h>

void gibbs(int *Np,int *thinp,double *xvec,double *yvec)
{
int i,j;
int N=*Np,thin=*thinp;
GetRNGstate();
double x=0;
double y=0;
for (i=0;i<N;i++) {
for (j=0;j<thin;j++) {
x=rgamma(3.0,1.0/(y*y+4));
y=rnorm(1.0/(x+1),1.0/sqrt(2*x+2));
}
xvec[i]=x; yvec[i]=y;
}
PutRNGstate();
}


This can be compiled with R CMD SHLIB gibbs.c. We can load it into R and wrap it up so that it is easy to use with the following code:

dyn.load(file.path(".",paste("gibbs",.Platform$dynlib.ext,sep=""))) gibbsC<-function(n=50000,thin=1000) { tmp=.C("gibbs",as.integer(n),as.integer(thin), x=as.double(1:n),y=as.double(1:n)) mat=cbind(tmp$x,tmp$y) colnames(mat)=c("x","y") mat }  The new function gibbsC works just like gibbs, but takes just 12.1 seconds to run. This is roughly 40 times faster than the pure R version, which is a big deal. Note that using the R inline package, it is possible to directly inline the C code into the R source code. We can do this with the following R code: require(inline) code=' int i,j; int N=*Np,thin=*thinp; GetRNGstate(); double x=0; double y=0; for (i=0;i<N;i++) { for (j=0;j<thin;j++) { x=rgamma(3.0,1.0/(y*y+4)); y=rnorm(1.0/(x+1),1.0/sqrt(2*x+2)); } xvec[i]=x; yvec[i]=y; } PutRNGstate();' gibbsCin<-cfunction(sig=signature(Np="integer",thinp="integer",xvec="numeric",yvec="numeric"),body=code,includes="#include <Rmath.h>",language="C",convention=".C") gibbsCinline<-function(n=50000,thin=1000) { tmp=gibbsCin(n,thin,rep(0,n),rep(0,n)) mat=cbind(tmp$x,tmp$y) colnames(mat)=c("x","y") mat }  This runs at the same speed as the code compiled separately, and is arguably a bit cleaner in this case. Personally I’m not a big fan of inlining code unless it is something really very simple. If there is one thing that we have learned from the murky world of web development, it is that little good comes from mixing up different languages in the same source code file! C++ We can also inline C++ code into R using the inline and Rcpp packages. The code below originates from Sanjog Misra, and was discussed in the post by Dirk Eddelbuettel mentioned at the start of this post. require(Rcpp) require(inline) gibbscode = ' int N = as<int>(n); int thn = as<int>(thin); int i,j; RNGScope scope; NumericVector xs(N),ys(N); double x=0; double y=0; for (i=0;i<N;i++) { for (j=0;j<thn;j++) { x = ::Rf_rgamma(3.0,1.0/(y*y+4)); y= ::Rf_rnorm(1.0/(x+1),1.0/sqrt(2*x+2)); } xs(i) = x; ys(i) = y; } return Rcpp::DataFrame::create( Named("x")= xs, Named("y") = ys); ' RcppGibbsFn <- cxxfunction( signature(n="int", thin = "int"), gibbscode, plugin="Rcpp") RcppGibbs <- function(N=50000,thin=1000) { RcppGibbsFn(N,thin) }  This version of the sampler runs in 12.4 seconds, just a little bit slower than the C version. Java It is also quite straightforward to call Java code from within R using the rJava package. The following code import java.util.*; import cern.jet.random.tdouble.*; import cern.jet.random.tdouble.engine.*; class GibbsR { public static double[][] gibbs(int N,int thin,int seed) { DoubleRandomEngine rngEngine=new DoubleMersenneTwister(seed); Normal rngN=new Normal(0.0,1.0,rngEngine); Gamma rngG=new Gamma(1.0,1.0,rngEngine); double x=0,y=0; double[][] mat=new double[2][N]; for (int i=0;i<N;i++) { for (int j=0;j<thin;j++) { x=rngG.nextDouble(3.0,y*y+4); y=rngN.nextDouble(1.0/(x+1),1.0/Math.sqrt(2*x+2)); } mat[0][i]=x; mat[1][i]=y; } return mat; } }  can be compiled with javac GibbsR.java (assuming that Parallel COLT is in the classpath), and wrapped up from within an R session with library(rJava) .jinit() obj=.jnew("GibbsR") gibbsJ<-function(N=50000,thin=1000,seed=trunc(runif(1)*1e6)) { result=.jcall(obj,"[[D","gibbs",as.integer(N),as.integer(thin),as.integer(seed)) mat=sapply(result,.jevalArray) colnames(mat)=c("x","y") mat }  This code runs in 10.7 seconds. Yes, that’s correct. Yes, the Java code is faster than both the C and C++ code! This really goes to show that Java is now an excellent option for numerically intensive work such as this. However, before any C/C++ enthusiasts go apoplectic, I should explain why Java turns out to be faster here, as the comparison is not quite fair… In the C and C++ code, use was made of the internal R random number generation routines, which are relatively slow compared to many modern numerical library implementations. In the Java code, I used Parallel COLT for random number generation, as it isn’t straightforward to call the R generators from Java code. It turns out that the COLT generators are faster than the R generators, and that is why Java turns out to be faster here… C+GSL Of course we do not have to use the R random number generators within our C code. For example, we could instead call on the GSL generators, using the following code: #include <stdio.h> #include <math.h> #include <stdlib.h> #include <gsl/gsl_rng.h> #include <gsl/gsl_randist.h> #include <R.h> void gibbsGSL(int *Np,int *thinp,int *seedp,double *xvec,double *yvec) { int i,j; int N=*Np,thin=*thinp,seed=*seedp; gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937); gsl_rng_set(r,seed); double x=0; double y=0; for (i=0;i<N;i++) { for (j=0;j<thin;j++) { x=gsl_ran_gamma(r,3.0,1.0/(y*y+4)); y=1.0/(x+1)+gsl_ran_gaussian(r,1.0/sqrt(2*x+2)); } xvec[i]=x; yvec[i]=y; } }  It can be compiled with R CMD SHLIB -lgsl -lgslcblas gibbsGSL.c, and then called as for the regular C version. This runs in 8.0 seconds, which is noticeably faster than the Java code, but probably not “enough” faster to make it an important factor to consider in language choice. Summary In this post I’ve shown that it is relatively straightforward to call code written in C, C++ or Java from within R, and that this can give very significant performance gains relative to pure R code. All of the options give fairly similar performance gains. I showed that in the case of this particular example, the “obvious” Java code is actually slightly faster than the “obvious” C or C++ code, and explained why, and how to make the C version slightly faster by using the GSL. The post by Dirk shows how to call the GSL generators from the C++ version, which I haven’t replicated here. Gibbs sampler in various languages (revisited) Introduction Regular readers of this blog will know that in April 2010 I published a short post showing how a trivial bivariate Gibbs sampler could be implemented in the four languages that I use most often these days (R, python, C, Java), and I discussed relative timings, and how one might start to think about trading off development time against execution time for more complex MCMC algorithms. I actually wrote the post very quickly one night while I was stuck in a hotel room in Seattle – I didn’t give much thought to it, and the main purpose was to provide simple illustrative examples of simple Monte Carlo codes using non-uniform random number generators in the different languages, as a starting point for someone thinking of switching languages (say, from R to Java or C, for efficiency reasons). It wasn’t meant to be very deep or provocative, or to start any language wars. Suffice to say that this post has had many more hits than all of my other posts combined, is still my most popular post, and still attracts comments and spawns other posts to this day. Several people have requested that I re-do the post more carefully, to include actual timings, and to include a few additional optimisations. Hence this post. For reference, the original post is here. A post about it from the python community is here, and a recent post about using Rcpp and inlined C++ code to speed up the R version is here. The sampler So, the basic idea was to construct a Gibbs sampler for the bivariate distribution $f(x,y) = kx^2\exp\{-xy^2-y^2+2y-4x\},\qquad x>0,y\in\Bbb{R}$ with unknown normalising constant $k>0$ ensuring that the density integrates to one. Unfortunately, in the original post I dropped a factor of 2 constructing one of the full conditionals, which meant that none of the samplers actually had exactly the right target distribution (thanks to Sanjog Misra for bringing this to my attention). So actually, the correct full conditionals are $\displaystyle x|y \sim Ga(3,y^2+4)$ $\displaystyle y|x \sim N\left(\frac{1}{1+x},\frac{1}{2(1+x)}\right)$ Note the factor of two in the variance of the full conditional for $y$. Given the full conditionals, it is simple to alternately sample from them to construct a Gibbs sampler for the target distribution. We will run a Gibbs sampler with a thin of 1000 and obtain a final sample of 50000. Implementations R Let’s start with R again. The slightly modified version of the code from the old post is given below gibbs=function(N,thin) { mat=matrix(0,ncol=3,nrow=N) mat[,1]=1:N x=0 y=0 for (i in 1:N) { for (j in 1:thin) { x=rgamma(1,3,y*y+4) y=rnorm(1,1/(x+1),1/sqrt(2*x+2)) } mat[i,2:3]=c(x,y) } mat=data.frame(mat) names(mat)=c("Iter","x","y") mat } writegibbs=function(N=50000,thin=1000) { mat=gibbs(N,thin) write.table(mat,"data.tab",row.names=FALSE) } writegibbs()  I’ve just corrected the full conditional, and I’ve increased the sample size and thinning to 50k and 1k, respectively, to allow for more accurate timings (of the faster languages). This code can be run from the (Linux) command line with something like: time Rscript gibbs.R  I discuss timings in detail towards the end of the post, but this code is slow, taking over 7 minutes on my (very fast) laptop. Now, the above code is typical of the way code is often structured in R – doing as much as possible in memory, and writing to disk only if necessary. However, this can be a bad idea with large MCMC codes, and is less natural in other languages, anyway, so below is an alternative version of the code, written in more of a scripting language style. gibbs=function(N,thin) { x=0 y=0 cat(paste("Iter","x","y","\n")) for (i in 1:N) { for (j in 1:thin) { x=rgamma(1,3,y*y+4) y=rnorm(1,1/(x+1),1/sqrt(2*x+2)) } cat(paste(i,x,y,"\n")) } } gibbs(50000,1000)  This can be run with a command like time Rscript gibbs-script.R > data.tab  This code actually turns out to be a slightly slower than the in-memory version for this simple example, but for larger problems I would not expect that to be the case. I always analyse MCMC output using R, whatever language I use for running the algorithm, so for completeness, here is a bit of code to load up the data file, do some plots and compute summary statistics. fun=function(x,y) { x*x*exp(-x*y*y-y*y+2*y-4*x) } compare<-function(file="data.tab") { mat=read.table(file,header=TRUE) op=par(mfrow=c(2,1)) x=seq(0,3,0.1) y=seq(-1,3,0.1) z=outer(x,y,fun) contour(x,y,z,main="Contours of actual (unnormalised) distribution") require(KernSmooth) fit=bkde2D(as.matrix(mat[,2:3]),c(0.1,0.1)) contour(fit$x1,fit$x2,fit$fhat,main="Contours of empirical distribution")
par(op)
print(summary(mat[,2:3]))
}
compare()


Python

Another language I use a lot is Python. I don’t want to start any language wars, but I personally find python to be a better designed language than R, and generally much nicer for the development of large programs. A python script for this problem is given below

import random,math

def gibbs(N=50000,thin=1000):
x=0
y=0
print "Iter  x  y"
for i in range(N):
for j in range(thin):
x=random.gammavariate(3,1.0/(y*y+4))
y=random.gauss(1.0/(x+1),1.0/math.sqrt(2*x+2))
print i,x,y

gibbs()


It can be run with a command like

time python gibbs.py > data.tab


This code turns out to be noticeably faster than the R versions, taking around 4 minutes on my laptop (again, detailed timing information below). However, there is a project for python known as the PyPy project, which is concerned with compiling regular python code to very fast byte-code, giving significant speed-ups on certain problems. For this post, I downloaded and install version 1.5 of the 64-bit linux version of PyPy. Once installed, I can run the above code with the command

time pypy gibbs.py > data.tab


To my astonishment, this “just worked”, and gave very impressive speed-up over regular python, running in around 30 seconds. This actually makes python a much more realistic prospect for the development of MCMC codes than I imagined. However, I need to understand the limitations of PyPy better – for example, why doesn’t everyone always use PyPy for everything?! It certainly seems to make python look like a very good option for prototyping MCMC codes.

C

Traditionally, I have mainly written MCMC codes in C, using the GSL. C is a fast, efficient, statically typed language, which compiles to native code. In many ways it represents the “gold standard” for speed. So, here is the C code for this problem.

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>

void main()
{
int N=50000;
int thin=1000;
int i,j;
gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937);
double x=0;
double y=0;
printf("Iter x y\n");
for (i=0;i<N;i++) {
for (j=0;j<thin;j++) {
x=gsl_ran_gamma(r,3.0,1.0/(y*y+4));
y=1.0/(x+1)+gsl_ran_gaussian(r,1.0/sqrt(2*x+2));
}
printf("%d %f %f\n",i,x,y);
}
}


It can be compiled and run with command like

gcc -O4 gibbs.c -lgsl -lgslcblas -lm -o gibbs
time ./gibbs > datac.tab


This runs faster than anything else I consider in this post, taking around 8 seconds.

Java

I’ve recently been experimenting with Java for MCMC codes, in conjunction with Parallel COLT. Java is a statically typed object-oriented (O-O) language, but is usually compiled to byte-code to run on a virtual machine (known as the JVM). Java compilers and virtual machines are very fast these days, giving “close to C” performance, but with a nicer programming language, and advantages associated with virtual machines. Portability is a huge advantage of Java. For example, I can easily get my Java code to run on almost any University Condor pool, on both Windows and Linux clusters – they all have a recent JVM installed, and I can easily bundle any required libraries with my code. Suffice to say that getting GSL/C code to run on generic Condor pools is typically much less straightforward. Here is the Java code:

import java.util.*;
import cern.jet.random.tdouble.*;
import cern.jet.random.tdouble.engine.*;

class Gibbs
{

public static void main(String[] arg)
{
int N=50000;
int thin=1000;
DoubleRandomEngine rngEngine=new DoubleMersenneTwister(new Date());
Normal rngN=new Normal(0.0,1.0,rngEngine);
Gamma rngG=new Gamma(1.0,1.0,rngEngine);
double x=0;
double y=0;
System.out.println("Iter x y");
for (int i=0;i<N;i++) {
for (int j=0;j<thin;j++) {
x=rngG.nextDouble(3.0,y*y+4);
y=rngN.nextDouble(1.0/(x+1),1.0/Math.sqrt(2*x+2));
}
System.out.println(i+" "+x+" "+y);
}
}

}


It can be compiled and run with

javac Gibbs.java
time java Gibbs > data.tab


This takes around 11.6s seconds on my laptop. This is well within a factor of 2 of the C version, and around 3 times faster than even the PyPy python version. It is around 40 times faster than R. Java looks like a good choice for implementing MCMC codes that would be messy to implement in C, or that need to run places where it would be fiddly to get native codes to run.

Scala

Another language I’ve been taking some interest in recently is Scala. Scala is a statically typed O-O/functional language which compiles to byte-code that runs on the JVM. Since it uses Java technology, it can seamlessly integrate with Java libraries, and can run anywhere that Java code can run. It is a much nicer language to program in than Java, and feels more like a dynamic language such as python. In fact, it is almost as nice to program in as python (and in some ways nicer), and will run in a lot more places than PyPy python code. Here is the scala code (which calls Parallel COLT for random number generation):

object GibbsSc {

import cern.jet.random.tdouble.engine.DoubleMersenneTwister
import cern.jet.random.tdouble.Normal
import cern.jet.random.tdouble.Gamma
import Math.sqrt
import java.util.Date

def main(args: Array[String]) {
val N=50000
val thin=1000
val rngEngine=new DoubleMersenneTwister(new Date)
val rngN=new Normal(0.0,1.0,rngEngine)
val rngG=new Gamma(1.0,1.0,rngEngine)
var x=0.0
var y=0.0
println("Iter x y")
for (i <- 0 until N) {
for (j <- 0 until thin) {
x=rngG.nextDouble(3.0,y*y+4)
y=rngN.nextDouble(1.0/(x+1),1.0/sqrt(2*x+2))
}
println(i+" "+x+" "+y)
}
}

}


It can be compiled and run with

scalac GibbsSc.scala
time scala GibbsSc > data.tab


This code takes around 11.8s on my laptop – almost as fast as the Java code! So, on the basis of this very simple and superficial example, it looks like scala may offer the best of all worlds – a nice, elegant, terse programming language, functional and O-O programming styles, the safety of static typing, the ability to call on Java libraries, great speed and efficiency, and the portability of Java! Very interesting.

Groovy

James Durbin has kindly sent me a Groovy version of the code, which he has also discussed in his own blog post. Groovy is a dynamic O-O language for the JVM, which, like Scala, can integrate nicely with Java applications. It isn’t a language I have examined closely, but it seems quite nice. The code is given below:

import cern.jet.random.tdouble.engine.*;
import cern.jet.random.tdouble.*;

N=50000;
thin=1000;
rngEngine= new DoubleMersenneTwister(new Date());
rngN=new Normal(0.0,1.0,rngEngine);
rngG=new Gamma(1.0,1.0,rngEngine);
x=0.0;
y=0.0;
println("Iter x y");
for(i in 1..N){
for(j in 1..thin){
x=rngG.nextDouble(3.0,y*y+4);
y=rngN.nextDouble(1.0/(x+1),1.0/Math.sqrt(2*x+2));
}
println("$i$x \$y");
}


It can be run with a command like:

time groovy Gibbs.gv > data.tab


Again, rather amazingly, this code runs in around 35 seconds – very similar to the speed of PyPy. This makes Groovy also seem like a potential very attractive environment for prototyping MCMC codes, especially if I’m thinking about ultimately porting to Java.

Timings

The laptop I’m running everything on is a Dell Precision M4500 with an Intel i7 Quad core (x940@2.13Ghz) CPU, running the 64-bit version of Ubuntu 11.04. I’m running stuff from the Ubuntu (Unity) desktop, and running several terminals and applications, but the machine is not loaded at the time each job runs. I’m running each job 3 times and taking the arithmetic mean real elapsed time. All timings are in seconds.

 R 2.12.1 (in memory) 435 R 2.12.1 (script) 450.2 Python 2.7.1+ 233.5 PyPy 1.5 32.2 Groovy 1.7.4 35.4 Java 1.6.0 11.6 Scala 2.7.7 11.8 C (gcc 4.5.2) 8.1

If we look at speed-up relative to the R code (in-memory version), we get:

 R (in memory) 1 R (script) 0.97 Python 1.86 PyPy 13.51 Groovy 12.3 Java 37.5 Scala 36.86 C 53.7

Alternatively, we can look at slow-down relative to the C version, to get:

 R (in memory) 53.7 R (script) 55.6 Python 28.8 PyPy 4 Groovy 4.4 Java 1.4 Scala 1.5 C 1

Discussion

The findings here are generally consistent with those of the old post, but consideration of PyPy, Groovy and Scala does throw up some new issues. I was pretty stunned by PyPy. First, I didn’t expect that it would “just work” – I thought I would either have to spend time messing around with my configuration settings, or possibly even have to modify my code slightly. Nope. Running python code with pypy appears to be more than 10 times faster than R, and only 4 times slower than C. I find it quite amazing that it is possible to get python code to run just 4 times slower than C, and if that is indicative of more substantial examples, it really does open up the possibility of using python for “real” problems, although library coverage is currently a problem. It certainly solves my “prototyping problem”. I often like to prototype algorithms in very high level dynamic languages like R and python before porting to a more efficient language. However, I have found that this doesn’t always work well with complex MCMC codes, as they just run too slowly in the dynamic languages to develop, test and debug conveniently. But it looks now as though PyPy should be fast enough at least for prototyping purposes, and may even be fast enough for production code in some circumstances. But then again, exactly the same goes for Groovy, which runs on the JVM, and can access any existing Java library… I haven’t yet looked into Groovy in detail, but it appears that it could be a very nice language for prototyping algorithms that I intend to port to Java.

The results also confirm my previous findings that Java is now “fast enough” that one shouldn’t worry too much about the difference in speed between it and native code written in C (or C++). The Java language is much nicer than C or C++, and the JVM platform is very attractive in many situations. However, the Scala results were also very surprising for me. Scala is a really elegant language (certainly on a par with python), comes with all of the advantages of Java, and appears to be almost as fast as Java. I’m really struggling to come up with reasons not to use Scala for everything!

Speeding up R

MCMC codes are used by a range of different scientists for a range of different problems. However, they are very (most?) often used by Bayesian statisticians who use the algorithms to target a Bayesian posterior distribution. For various (good) reasons, many statisticians are heavily invested in R, like to use R as much as possible, and do as much as possible from within the R environment. These results show why R is not a good language in which to implement MCMC algorithms, so what is an R-dependent statistician supposed to do? One possibility would be to byte-code compile R code in an analogous way to python and pypy. The very latest versions of R support such functionality, but the post by Dirk Eddelbuettel suggests that the current version of cmpfun will only give a 40% speedup on this problem, which is still slower than regular python code. Short of a dramatic improvement in this technology, the only way forward seems to be to extend R using code from another language. It is easy to extend R using C, C++ and Java. I have shown in previous posts how to do this using Java and using C, and the recent post by Dirk shows how to extend using C++. Although interesting, this doesn’t really have much bearing on the current discussion. If you extend using Java you get Java-like speedups, and if you extend using C you get C-like speedups. However, in case people are interested, I intend to gather up these examples into one post and include detailed timing information in a subsequent post.

Java math libraries and Monte Carlo simulation codes

Java libraries for (non-uniform) random number simulation

Anyone writing serious Monte Carlo (and MCMC) codes relies on having a very good and fast (uniform) random number generator and associated functions for generation of non-uniform random quantities, such as Gaussian, Poisson, Gamma, etc. In a previous post I showed how to write a simple Gibbs sampler in four different languages. In C (and C++) random number generation is easy for most scientists, as the (excellent) GNU Scientific Library (GSL) provides exactly what most people need. But it wasn’t always that way… I remember the days before the GSL, when it was necessary to hunt around on the net for bits of C code to implement different algorithms. Worse, it was often necessary to hunt around for a bit of free FORTRAN code, and compile that with an F77 compiler and figure out how to call it from C. Even in the early Alpha days of the GSL, coverage was patchy, and the API changed often. Bad old days… But those days are long gone, and C programmers no longer have to worry about the problem of random variate generation – they can safely concentrate on developing their interesting new algorithm, and leave the rest to the GSL. Unfortunately for Java programmers, there isn’t yet anything quite comparable to the GSL in Java world.

I pretty much ignored Java until Java 5. Before then, the language was too limited, and the compilers and JVMs were too primitive to really take seriously for numerical work. But since the launch of Java 5 I’ve been starting to pay more interest. The language is now a perfectly reasonable O-O language, and the compilers and JVMs are pretty good. On a lot of benchmarks, Java is really quite comparable to C/C++, and Java is nicer to code, and has a lot of impressive associated technology. So if there was a math library comparable to the GSL, I’d be quite tempted to jump ship to the Java world and start writing all of my Monte Carlo codes in Java. But there isn’t. At least not yet.

When I first started to take Java seriously, the only good math library with good support for non-uniform random number generation was COLT. COLT was, and still is, pretty good. The code is generally well-written, and fast, and the documentation for it is reasonable. However, the structure of the library is very idiosyncratic, the coverage is a bit patchy, and there doesn’t ever seem to have been a proper development community behind it. It seems very much to have been a one-man project, which has long since stagnated. Unsurprisingly then, COLT has been forked. There is now a Parallel COLT project. This project is continuing the development of COLT, adding new features that were missing from COLT, and, as the name suggests, adding concurrency support. Parallel COLT is also good, and is the main library I currently use for random number generation in Java. However, it has obviously inherited all of the idiosyncrasies that COLT had, and still doesn’t seem to have a large and active development community associated with it. There is no doubt that it is an incredibly useful software library, but it still doesn’t really compare to the GSL.

I have watched the emergence of the Apache Commons Math project with great interest (not to be confused with Uncommons Math – another one-man project). I think this project probably has the greatest potential for providing the Java community with their own GSL equivalent. The Commons project has a lot of momentum, the Commons Math project seems to have an active development community, and the structure of the library is more intuitive than that of (Parallel) COLT. However, it is early days, and the library still has patchy coverage and is a bit rough around the edges. It reminds me a lot of the GSL back in its Alpha days. I’d not bothered to even download it until recently, as the random number generation component didn’t include the generation of gamma random quantities – an absolutely essential requirement for me. However, I noticed recently that the latest release (2.2) did include gamma generation, so I decided to download it and try it out. It works, but the generation of gamma random quantities is very slow (around 50 times slower than Parallel COLT). This isn’t a fundamental design flaw of the whole library – generating Gaussian random quantities is quite comparable with other libraries. It’s just that an inversion method has been used for gamma generation. All efficient gamma generators use a neat rejection scheme. In case anyone would like to investigate for themselves, here is a complete program for gamma generation designed to be linked against Parallel COLT:

import java.util.*;
import cern.jet.random.tdouble.*;
import cern.jet.random.tdouble.engine.*;

class GammaPC
{

public static void main(String[] arg)
{
DoubleRandomEngine rngEngine=new DoubleMersenneTwister();
Gamma rngG=new Gamma(1.0,1.0,rngEngine);
long N=10000;
double x=0.0;
for (int i=0;i<N;i++) {
for (int j=0;j<1000;j++) {
x=rngG.nextDouble(3.0,25.0);
}
System.out.println(x);
}
}

}


and here is a complete program designed to be linked against Commons Math:

import java.util.*;
import org.apache.commons.math.*;
import org.apache.commons.math.random.*;

class GammaACM
{

public static void main(String[] arg) throws MathException
{
RandomDataImpl rng=new RandomDataImpl();
long N=10000;
double x=0.0;
for (int i=0;i<N;i++) {
for (int j=0;j<1000;j++) {
x=rng.nextGamma(3.0,1.0/25.0);
}
System.out.println(x);
}
}

}


The two codes do the same thing (note that they parameterise the gamma distribution differently). Both programs work (they generate variates from the same, correct, distribution), and the Commons Math interface is slightly nicer, but the code is much slower to execute. I’m still optimistic that Commons Math will one day be Java’s GSL, but I’m not giving up on Parallel COLT (or C, for that matter!) just yet…

Calling Java code from R

Introduction

In the previous post I looked at some simple methods for calling C code from R using a simple Gibbs sampler as the motivating example. In this post we will look again at the same Gibbs sampler, but now implemented in Java, and look at a couple of options for calling that code from an R session.

Stand-alone Java code

Below is some Java code for implementing the bivariate Gibbs sampler discussed previously. It relies on Parallel COLT, which must be installed and in the Java CLASSPATH in order to follow the examples.

import java.util.*;
import cern.jet.random.tdouble.*;
import cern.jet.random.tdouble.engine.*;

class Gibbs
{

public static void main(String[] arg)
{
if (arg.length != 3) {
System.err.println("Usage: java Gibbs <Iters> <Thin> <Seed>");
System.exit(1);
}
int N=Integer.parseInt(arg[0]);
int thin=Integer.parseInt(arg[1]);
int seed=Integer.parseInt(arg[2]);
DoubleRandomEngine rngEngine=new DoubleMersenneTwister(seed);
Normal rngN=new Normal(0.0,1.0,rngEngine);
Gamma rngG=new Gamma(1.0,1.0,rngEngine);
double x=0,y=0;
System.out.println("Iter x y");
for (int i=0;i<N;i++) {
for (int j=0;j<thin;j++) {
x=rngG.nextDouble(3.0,y*y+4);
y=rngN.nextDouble(1.0/(x+1),1.0/Math.sqrt(x+1));
}
System.out.println(i+" "+x+" "+y);
}
}

}


It can be compiled and run stand-alone from an OS shell with the following commands:

javac Gibbs.java
java Gibbs 10 1000 1


As discussed in the previous post, it is possible to call any command-line program from inside an R session using the system() command. A small wrapper function for conveniently running this code from within R can be written as follows.

gibbs<-function(N=10000,thin=500,
seed=trunc(runif(1)*1e6),
exec="Gibbs",
tmpfile=tempfile())
{
command=paste("java",exec,N,thin,seed,">",tmpfile)
system(command)
}


This can then be run from within an R session with a simple call to gibbs(). Note that a random seed is being generated within R to be passed to the Java code to be used to seed the COLT random number generator used within the Java code. As previously discussed, for many long running codes, this approach can be quite effective, and is clearly very simple. However, there is an overhead associated with the system() call, and also with writing output to disk and then reading it back again.

Using rJava

It is possible to avoid the overheads associated with the above approach by directly calling the Java code from R and having the return values returned directly into the R session from memory. There isn’t really direct support for this within the core R language, but there are couple of different solutions provided by R packages. The simplest and most popular approach seems to be the rJava package. This package can be installed with a simple

install.packages("rJava")


This should “just work” on some OSs (eg. Windows), but may fail on other OSs if R is not aware of the local Java environment. If the installation fails, check the error message carefully for advice/instructions. On most Linux systems, the problem can be fixed by quitting R, then running the following command from the shell

sudo R CMD javareconf


before re-starting R and re-attempting the installation. rJava provides a mechanism for starting a JVM within the running R session, creating objects, calling methods and having method return values returned to R. It is actually much more flexible than the .C() function for C code discussed in the previous post.

In order to use this package for our example, we must first re-factor the code slightly in the following way.

import java.util.*;
import cern.jet.random.tdouble.*;
import cern.jet.random.tdouble.engine.*;

class GibbsR
{

public static void main(String[] arg)
{
if (arg.length != 3) {
System.err.println("Usage: java GibbsR <Iters> <Thin> <Seed>");
System.exit(1);
}
int N=Integer.parseInt(arg[0]);
int thin=Integer.parseInt(arg[1]);
int seed=Integer.parseInt(arg[2]);
double[][] mat=gibbs(N,thin,seed);
System.out.println("Iter x y");
for (int i=0;i<N;i++) {
System.out.println(""+i+" "+mat[0][i]+" "+mat[1][i]);
}
}

public static double[][] gibbs(int N,int thin,int seed)
{
DoubleRandomEngine rngEngine=new DoubleMersenneTwister(seed);
Normal rngN=new Normal(0.0,1.0,rngEngine);
Gamma rngG=new Gamma(1.0,1.0,rngEngine);
double x=0,y=0;
double[][] mat=new double[2][N];
for (int i=0;i<N;i++) {
for (int j=0;j<thin;j++) {
x=rngG.nextDouble(3.0,y*y+4);
y=rngN.nextDouble(1.0/(x+1),1.0/Math.sqrt(x+1));
}
mat[0][i]=x; mat[1][i]=y;
}
return mat;
}

}


This code can be compiled and run from the command-line just as the previous code could.

javac GibbsR.java
java GibbsR 10 1000 1


However, we have now separated out the code we want to be able to call from R into a static method called gibbs, which runs the Gibbs sampler and stores the result in a 2-dimensional array which is its return value. We can now see how to call this code from within a running R session. We first need to set up the R environment ready to call the code.

library(rJava)
.jinit()
obj=.jnew("GibbsR")


Line 1 loads the package, line 2 starts up the JVM, and line 3 creates a link to the the GibbsR class (in general this is used to create a new Java object of the given type, but here we are using static methods). Java methods are called on Java objects using .jcall(). We can write a simple R function to conveniently call the method as follows.

jgibbs<-function(N=10000,thin=500,seed=trunc(runif(1)*1e6))
{
result=.jcall(obj,"[[D","gibbs",as.integer(N),as.integer(thin),as.integer(seed))
mat=sapply(result,.jevalArray)
mat=cbind(1:N,mat)
colnames(mat)=c("Iter","x","y")
mat
}


This can now be called with a simple jgibbs(). The first line of the function body carries out the actual method call. The return type of the method must be explicitly declared – “[[D” means a 2-dimensional array of doubles, using JNI notation. Care must also be taken to coerce the method parameters into the correct type that the Java method expects to receive. .jcall() is generally quite good at unpacking basic Java types into corresponding R types. However, the two dimensional array is here returned as an R list consisting of one-dimensional Java array objects. The unpacking is completed using the subsequent call to jevalArray() using sapply(), before the resulting matrix is tidied up and returned to the R session.