The last Valencia meeting on Bayesian Statistics and the future of Bayesian computation

I’ve spent the last week in Benidorm, Spain, for the 9th and final Valencia meeting on Bayesian Statistics. Nine of us travelled from Newcastle University, making us one of the best-represented groups at the meeting. This was my fifth Valencia meeting – the first I attended was Valencia 5, which took place in Alicante back in 1994 when I was a PhD student at Durham University working with Michael Goldstein. Our contributed paper to the proceedings of that meeting was my first publication, and I’ve been to every meeting since. Michael is one of the few people to have attended all 9 Valencia meetings (in addition to the somewhat mythical “Valencia 0”). It was therefore very fitting that Michael opened the Valencia 9 invited programme (with a great talk on Bayesian analysis of complex computer models). Like many others, I’ll be a little sad to think that the Valencia meetings have come to an end.

The meeting itself was scientifically very interesting. I wish that I had the energy to give a summary of the scientific programme, but unfortunately I don’t! However, anyone who does want to get something of the flavour of the programme should take a look at the “Valencia Snapshots” on Christian Robert’s blog. My own talk gets a mention in Snapshot 4. I presented a paper entitled Parameter inference for stochastic kinetic models of bacterial gene regulation: a Bayesian approach to systems biology. Unfortunately my discussant, Sam Kou, was unable to travel to the meeting due to passport problems, but very kindly produced a pre-recorded video discussion to be played to me and the audience at the end of my talk. After a brief problem with the audio (a recurring theme of the meeting!), this actually worked quite well, though it felt slightly strange replying to his discussion knowing that he could not hear what I was saying!

There were several talks discussing Bayesian approaches to challenging problems in bioinformatics and molecular biology, and these were especially interesting to me. I was also particularly interested in the talks on Bayesian computation. Several talks mentioned the possibility of speeding up Bayesian computation using GPUs, and Chris Holmes gave a nice overview of the current technology and its potential, together with a link to a website providing further information.

Although there is no doubt that GPU technology can provide fairly impressive speedups for certain Bayesian computations, I’m actually a bit of a GPU-sceptic, so let me explain why. There are many reasons. First, I’m always a bit suspicious of a fairly closed, proprietary technology being pushed by a large and powerful company – I prefer my hardware to be open, and my software to be free and open. Second, there isn’t really anything you can do on a GPU that you can’t do on a decent multicore server or cluster using standard, well-established technologies such as MPI and OpenMP. Third, GPUs are relatively difficult to program, and time spent on software development is a very expensive cost which in many cases will dwarf any difference in hardware costs. Fourth, at a time when 64-bit chips and many gigabytes of RAM sit on everyone’s desktop, do I really want to go back to 1 GB of RAM, single-precision arithmetic and no math libraries?! That hardly seems like the future I’m envisaging. Fifth, other related products such as the Intel Knights Corner are on the horizon and are likely to offer similar performance gains while being much simpler to develop for. Sixth, it seems likely to me that future machines will feature massively multicore CPUs, rendering GPU computing obsolete. Finally, although GPUs offer one possible approach to the problem of speedup, they do little for the far more general and important problem of scalability of Bayesian computing and software.

From that perspective, I really enjoyed the talk by Andrew McCallum on Probabilistic programming with imperatively-defined factor graphs. Andrew was talking about factorie, a flexible machine learning library he is developing in the interesting new language Scala. Whilst that particular library is not exactly what I need, his talk was fundamentally about building frameworks for Bayesian computation which really scale. I think this is the real big issue facing Bayesian computation, and modern languages and software platforms, including so-called cloud computing approaches and technologies like Hadoop and MapReduce, probably represent some of the directions we should be looking in. There is an interesting project called CRdata which is a first step in that direction. Clearly these technologies are somewhat orthogonal to the GPU/speedup issue, but I don’t think they are completely unrelated.
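To make the multicore point above a little more concrete, here is a minimal, hypothetical sketch (not code from the meeting, and not tied to any real application) of the kind of double-precision Monte Carlo that parallelises trivially with OpenMP on an ordinary multicore server or cluster node: independent RNG state per thread, standard math-library calls, and a single reduction at the end. Porting even this simple loop to a GPU would mean rewriting both the RNG and the math calls for the device.

```c
/* Minimal illustrative sketch: OpenMP Monte Carlo estimate of
 * the integral of exp(-u^2) over (0,1), in double precision.
 * Compile with: gcc -O2 -fopenmp mc.c -lm */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <omp.h>

int main(void)
{
    const long n = 10000000;   /* total number of Monte Carlo samples */
    double sum = 0.0;

#pragma omp parallel reduction(+:sum)
    {
        /* each thread carries its own RNG state, seeded by its thread id */
        unsigned short state[3] = { 17, 29, (unsigned short) omp_get_thread_num() };

#pragma omp for
        for (long i = 0; i < n; i++) {
            double u = erand48(state);   /* U(0,1) draw, double precision */
            sum += exp(-u * u);          /* ordinary math-library call */
        }
    }

    printf("Estimate: %f\n", sum / n);   /* should be close to 0.7468 */
    return 0;
}
```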

Published by

darrenjw

I am Professor of Statistics within the Department of Mathematical Sciences at Durham University, UK. I am a Bayesian statistician interested in computation and applications, especially to engineering and the life sciences.

9 thoughts on “The last Valencia meeting on Bayesian Statistics and the future of Bayesian computation”

  1. Re GPU
    “I’m always a bit suspicious of a technology that is fairly closed and proprietary being pushed by a large powerful company”
    I don’t see what is closed about CUDA; in fact, they’ve opened the platform to programming via both CUDA and OpenCL. Yet in the same breath you mention Intel, whose compiler has been found to deliberately hinder the performance of non-Intel chips.

    “Next, there isn’t really anything that you can do on a GPU that you can’t do on a decent multicore server or cluster using standard well established technologies such as MPI and OpenMP.”
    Except that you need 10-25 CPU-based machines to equal 1-2 GPUs in gigaflops.
    CUDA is based on C (well established) and can be used in conjunction with MPI/SMP models, assuming you have the (much cheaper) hardware.

    “GPUs are relatively difficult to program, and time taken for software development is a very expensive cost which in many cases will dwarf differences in hardware costs.”
    Depends on the size of your problem. If you would otherwise need 300 nodes, then modifying your code to run on 4 GPUs will be well worth it compared with the cost of buying, and the power needed to run, those 300 nodes.

    “in the days when 64 bit chips and many GB of RAM are sitting on everyone’s desktops, do I really want to go back to 1 GB of RAM, single precision arithmetic and no math libraries?!”
    I’m not sure you understand where the technology is going. Many cores means less memory per core. Adding cores is cheap; adding RAM is not (due to power requirements). As of when you wrote this, GPUs already support double precision, and many problems do not require it anyway. Regardless, Fermi has 4GB of on-board memory…

    “Next, there are other related products like the Intel Knights Corner on the horizon that are likely to offer similar performance gains while being much simpler to develop for.”
    If you read that announcement you’ll notice that the 22nm chip “will use Moore’s Law to scale to more than 50 Intel processing cores on a single chip.”
    As I said before, more cores are cheap to add, and Fermi has some 500+ ALUs, while Intel is coming out with only 50. Sure, you have more RAM, but what is the cost of accessing that memory? Does the chip include embedded hardware functions? GPUs do.

    As you can see from the GPU and Intel announcements, whether you envisage it or not, many-core programming is the way of the future. If current algorithms don’t adapt, they will always be limited by the power wall ($$$) because of the lack of effort to port away from big-memory, effortless programming.

    1. I totally agree that many-core programming is the way of the future – I don’t think I suggested otherwise. I also agree that there is a lot of momentum behind GPU HPC, and that things are changing fast. But now, and I think for at least the near future, it is much simpler to parallelise a typical stochastic simulation code (using lots of RAM, lots of random numbers and double-precision math library functions) using MPI or OpenMP on an SMP machine or cluster than it is to port it to run on Nvidia cards. That may change, and when it does it will change things. But as I said, by then we might all have 96-core x86 chips by default, so will we even care?

      I also think it is important to make a distinction between the kind of codes that research statisticians write and the kinds of HPC codes that molecular dynamics and fluid dynamics people write. By and large, the latter groups have fairly stable algorithms at the core of their research, so the effort of porting them for speed-up is clearly worthwhile. But research statisticians change their algorithms in quite fundamental ways all the time, so there is a risk that by the time they have got an algorithm running quickly on many-core hardware, they are no longer interested in that algorithm! That does change the trade-offs somewhat.

  2. Pingback: VlcMad
  3. I understand your points, Darren, and I would summarise them by saying that the learning curve of GPU programming is so steep, and the GPU technology-renewal cycle so short, that it may indeed not be worth spending time on it. But in GPUs’ defence, I think the future of scientific computing will see both GPUs and CPUs in action. Don’t forget that the pipeline of a GPU is much deeper, which makes GPUs unbeatable for repetitive computations. Moreover, the memory bus is much wider, which allows much faster memory access than the traditional CPU-RAM system; it’s not a matter of size (a couple of GB versus 64+ GB). Since the CPU is usually equipped with much more memory, computations performed by the GPU in concert with the CPU (which builds up the solution from the results of basic blocks processed by the GPUs) seem to be the right way forward, at least from an efficiency point of view.

      1. Well, that’s a great choice, although the comparison is not fair 🙂 The Xeon Phi has the same instruction set as the Intel CPU it runs alongside. Therefore no instruction emulation is required for the Xeon, which results in far fewer cores 😉
