I just returned from an inspiring Spatial Statistics conference held in Lancaster. At this conference I gave a contributed talk on the fast computation of prediction variances using sparse linear algebraic methods, described in detail in this manuscript. Following the talk I had fruitful discussions with Doug Nychka on the methods’ potential use in LatticeKrig in the future.
I am going to cut to the chase and talk about one of the most inspiring talks I have ever heard, given by Peter Diggle at the end of the conference. There was nothing “new” in this talk, nor anything cutting edge. I would call it a “story” about a journey in data analysis, from data collection to policy implementation, containing little nuggets of wisdom that could only come from someone with decades of experience in the field. Peter talked about the problem of river blindness in Africa, and the side-effects of treatment on people co-infected with Loa loa. Important recurring messages in Peter’s talk included the following:
- Never forget the importance of covariates in improving predictive ability in spatial models. The spatial residual field simply captures what we cannot explain.
- Do not underestimate the difficulty of data collection, nor of thinking of ways to improve it (recall the image of a “data collector” on a motocross bike heading towards a remote town to talk to village elders!).
- Never give up on trying to convey uncertainty to policy makers, and on using this to motivate improved experimental design.
- Always try to include experts in the field who will likely tell you where your spatial predictions do not make sense!
The conference contained several interesting talks, too many to mention here. Some that struck me as particularly intriguing, because of their relevance to my own research, included those by:
- Doug Nychka, on local modelling of climate. When modelling with precision matrices, gluing together the local models leads to a ‘valid’ global model. The local models are able to capture the varying variances and length scales apparent in climate models and are hence ideal for emulating them.
- Samir Bhatt, who talked about the application of machine learning techniques to spatial modelling. In particular, Samir made the important point that spatial statisticians often model the covariance between only 2 or 3 space-time variables: why not model the covariances between 40 or so variables (including space and time), as is done in machine learning? He advocated the use of random Fourier features (essentially, fitting the covariance function in spectral space), dropout for regularisation, and the use of TensorFlow and stochastic gradient descent for optimisation (a small sketch of the random-Fourier-feature idea is given after this list). The latter points are also key themes in Goodfellow’s new book on Deep Learning, and I agree with Samir that we have a lot to learn in this regard. His final comment on learning to incorporate mechanistic models inside statistical models seemed to divide the audience, and Peter noted that it was interesting that this comment came after a talk that was exclusively on black-box models.
- Won Chang, who talked about emulating an ice sheet model (PSU3d-ICE) in order to calibrate its parameters against observations. What was interesting was that calibration was based on both (i) time-series observations of grounding-line position and (ii) spatial binary data denoting ice presence/absence (initially studied here). I asked Won whether it is possible to calibrate the basal friction coefficient, which is probably the (spatially distributed) parameter that most determines ice evolution. In his case a fingerprint method was used, whereby the (spatial) pattern of the friction coefficient was fixed and a single parameter was used to calibrate it. I have often wondered how one could effectively parameterise the highly spatially irregular friction coefficient for emulation, and I think this is still an open question.
- Jorge Mateu, who described an ANOVA method for point patterns that tests whether the K functions are the same or different between groups. The test statistic used was the integrated difference between the K functions, which of course has no known distribution under the null hypothesis; a sketch of how such a test might be carried out in practice is given after this list.
- James Faulkner, who talked about locally adaptive smoothing using Markov random fields and shrinkage priors. The key prior distribution here was the horseshoe prior, which I’m not too familiar with. James showed some impressive figures in which the MRF could be used to recover discontinuities in the signal.
- Denis Allard, who extended the parsimonious multivariate Matérn covariance functions of Gneiting et al. to the spatio-temporal case (based on Gneiting’s class of spatio-temporal covariance functions).
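
Since random Fourier features came up in Samir’s talk, here is a minimal NumPy sketch of the idea: approximate a stationary (here, squared-exponential) covariance by a finite set of random cosine features, so that fitting the spatial process reduces to linear regression in feature space. This is only an illustration of the general technique, not Samir’s implementation; the lengthscale, number of features, and toy data are arbitrary choices of mine.

```python
import numpy as np

def random_fourier_features(X, n_features=200, lengthscale=1.0, rng=None):
    """Map inputs X (n x d) to features Z (n x n_features) such that
    Z @ Z.T approximates a squared-exponential covariance matrix."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # The spectral density of the squared-exponential kernel is Gaussian,
    # so frequencies are drawn from N(0, 1/lengthscale^2) (Rahimi & Recht).
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Toy example: noisy observations over three "space-time" inputs.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 3))            # e.g. (lon, lat, time)
y = np.sin(6 * X[:, 0]) * np.cos(4 * X[:, 1]) + 0.1 * rng.normal(size=500)

Z = random_fourier_features(X, n_features=300, lengthscale=0.3, rng=1)

# Fitting is now just ridge regression in feature space.
lam = 1e-2
beta = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)
print("In-sample RMSE:", np.sqrt(np.mean((y - Z @ beta) ** 2)))
```

For large datasets the regularised linear solve at the end would typically be replaced by stochastic gradient descent (e.g. in TensorFlow), which is exactly where the tooling Samir mentioned becomes attractive.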
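
On the K-function ANOVA, I do not know exactly how Jorge obtains the null distribution of the integrated difference; a common workaround, and my assumption in the sketch below, is a Monte Carlo permutation test in which group labels are shuffled across the point patterns. The sketch uses a naive, non-edge-corrected K estimator and simulated patterns on the unit square, purely for illustration.

```python
import numpy as np

def ripley_k(points, r_grid, window_area=1.0):
    """Naive Ripley's K estimate (no edge correction) on a grid of distances."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d = d[~np.eye(n, dtype=bool)]                 # drop self-distances
    counts = np.array([(d <= r).sum() for r in r_grid])
    return window_area * counts / (n * (n - 1))

def anova_statistic(k_funcs, labels, dr):
    """Integrated squared difference between group-mean K functions."""
    groups = [k_funcs[labels == g].mean(axis=0) for g in np.unique(labels)]
    return sum(np.sum((a - b) ** 2) * dr
               for i, a in enumerate(groups) for b in groups[i + 1:])

rng = np.random.default_rng(0)
r_grid = np.linspace(0.01, 0.25, 25)
dr = r_grid[1] - r_grid[0]

# Two groups of patterns on the unit square: CSR vs. mildly clustered.
patterns = [rng.uniform(0, 1, (100, 2)) for _ in range(10)]
patterns += [np.clip(np.repeat(rng.uniform(0, 1, (20, 2)), 5, axis=0)
                     + 0.03 * rng.normal(size=(100, 2)), 0, 1)
             for _ in range(10)]
labels = np.array([0] * 10 + [1] * 10)

k_funcs = np.array([ripley_k(p, r_grid) for p in patterns])
t_obs = anova_statistic(k_funcs, labels, dr)

# Permutation test: shuffle group labels across patterns to build a null sample.
t_perm = [anova_statistic(k_funcs, rng.permutation(labels), dr)
          for _ in range(999)]
p_value = (1 + sum(t >= t_obs for t in t_perm)) / 1000
print("p-value:", p_value)
```

In practice one would use a proper edge-corrected K estimator (e.g. from spatstat) rather than the naive version above.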
This was my second Spatial Statistics conference, and I’m glad to have attended. It is becoming increasingly difficult to choose between the several top-notch conferences on offer in the sort of research I do, but Spatial Statistics remains one of my favourites.