Notes from Spatial Statistics 2017 (conference)

I just returned from an inspiring Spatial Statistics conference held in Lancaster. At the conference I gave a contributed talk on fast computation of prediction variances using sparse linear algebraic methods, described in detail in this manuscript. Following the talk I had fruitful discussions with Doug Nychka on the methods’ potential use in LatticeKrig in the future.

I am going to cut to the chase and talk about one of the most inspiring talks I have ever heard, given by Peter Diggle at the end of the conference. There was nothing “new” in this talk, nor anything cutting edge. I would call it a “story” about a journey in data analysis, from data collection to policy implementation, that contained little nuggets of wisdom that could only come from someone with decades of experience in the field. Peter talked about the problem of river blindness in Africa, and the problem of the side-effects of treatment on people co-infected with Loa loa. Important recurring messages in Peter’s talk included the following:

  • Never forget the importance of covariates in improving prediction ability in spatial models. The spatial residual field simply captures things we cannot explain.
  • Do not underestimate the difficulty in data collection, and in thinking of ways to improve this (recalling the image of a “data collector” on a motocross bike heading towards a remote town to talk to village elders!).
  • Never give up on trying to convey uncertainty to policy makers, and on using this to motivate improved experimental design.
  • Always try to include experts in the field who will likely tell you where your spatial predictions do not make sense!

The conference contained several interesting talks, too many to mention here. Some that struck me as particularly intriguing, because of their relevance to my own research, included those by

  • Doug Nychka on local modelling of climate. When modelling using precision matrices, gluing together the local models leads to a ‘valid’ global model. Local models are able to capture the variable variance and length scales apparent in climate models and are hence ideal for emulating them.
  • Samir Bhatt who talked about the application of machine learning techniques to spatial modelling. In particular, Samir made the important point that spatial statisticians often only model the covariance between 2 or 3 space-time variables: Why not model the covariances between 40 or so variables (including space and time) as is done in machine learning? He advocated the use of random Fourier features (essentially fitting the covariance function in spectral space), dropout for regularisation, and the use of TensorFlow and stochastic gradient descent for optimisation. I want to note here that the latter points are key in Goodfellow’s new book on Deep Learning and I agree here with Samir that we have a lot to learn in this regard. His final comment on learning to incorporate mechanistic models inside statistical models seemed to divide the audience and Peter noted that it was interesting that this comment came after a talk that was exclusively on black-box models.
  • Won Chang who talked about emulating an ice sheet model (PSU3d-ICE) in order to calibrate its parameters against observations. What was interesting is that calibration was based on both (i) time-series observations of grounding line position and (ii) spatial binary data denoting ice presence/absence (initially studied here). I queried Won as to whether it is possible to calibrate the basal friction coefficient, which is probably the (spatially-distributed) parameter that determines ice evolution the most. In his case a fingerprint method was used whereby the (spatial) pattern of the friction coefficient was fixed and one parameter was used to calibrate it. I have often wondered how one could effectively parameterise the highly spatially irregular friction coefficient for emulation and I think this is still an open question.
  • Jorge Mateu who described an ANOVA method for point patterns that tests whether the K functions are the same or different between groups. The test statistic used was the integrated difference between the K functions, which of course has no known distribution under the null hypothesis.
  • James Faulkner who talked about locally adaptive smoothing using Markov fields and shrinkage priors. The key prior distribution here was the horseshoe prior which I’m not too familiar with. James showed some impressive figures where the MRF could be used to recover discontinuities in the signal.
  • Denis Allard who extended the parsimonious multivariate Matern covariance functions of Gneiting et al. to the spatio-temporal case (based on Gneiting’s class of spatio-temporal covariance functions).
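To make the random Fourier features idea from Samir’s talk a little more concrete, here is a minimal sketch in Python. It is not Samir’s implementation: I assume a squared-exponential covariance function (whose spectral density is Gaussian), and the function names and parameter values are my own illustrative choices. The point is only that inner products of a modest number of random cosine features approximate the covariance matrix, whatever the dimension of the input space.

```python
import numpy as np


def random_fourier_features(x, n_features, length_scale, rng):
    """Map inputs x (n, d) to random cosine features whose inner products
    approximate a squared-exponential covariance (illustrative helper)."""
    d = x.shape[1]
    # The spectral density of the squared-exponential kernel is Gaussian
    # with standard deviation 1 / length_scale, so frequencies are drawn from it.
    omega = rng.normal(scale=1.0 / length_scale, size=(d, n_features))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(x @ omega + phase)


def squared_exponential(x, length_scale):
    """Exact squared-exponential covariance matrix, for comparison."""
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


rng = np.random.default_rng(0)
x = rng.uniform(size=(50, 3))  # 50 points in a 3-dimensional input space
phi = random_fourier_features(x, n_features=5000, length_scale=0.5, rng=rng)

# phi @ phi.T is a low-rank Monte Carlo approximation of the exact covariance.
max_abs_error = np.max(np.abs(phi @ phi.T - squared_exponential(x, 0.5)))
print(f"largest absolute error: {max_abs_error:.3f}")
```

Once the features are in hand, fitting reduces to a linear model in phi, which is what makes stochastic gradient descent and dropout natural tools here.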

This is my second Spatial Statistics conference, and I’m glad to have attended. It is becoming increasingly difficult to choose between the several top-notch conferences on offer in the sort of research I do, but Spatial Statistics remains one of my favourites.

 

 

Global Challenges Travel (to the University of Bristol)

In February 2017 I was fortunate enough to spend a month in Bristol on a travel award from the University of Wollongong Global Challenges Program, aimed at furthering an already existing collaboration between the University of Wollongong and the University of Bristol. The collaboration is based on work stemming from my previous postdoctoral position at Bristol, and it is in the field of climate science and the statistics that enable it. In this post I discuss two projects which were at the focus of this visit.

The first project is based on carbon dioxide flux inversion which, in a nutshell, is any technique that allows us to determine where the carbon dioxide is coming from, and where it is going, just from readings of carbon dioxide at the surface or from satellites.

The process by which sources and sinks, that is, the flux, generate carbon dioxide (the generative process) is well-understood. Recovering the flux from measurements of carbon dioxide (inversion) is a much harder problem.
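The contrast between the generative process and inversion can be sketched with a toy linear-Gaussian example in Python. Everything here is an assumption made for illustration: the “transport” matrix H is random rather than the output of a real transport model, the prior covariance and noise level are arbitrary, and the sizes are tiny. It is not the method used in the project, only the simplest setting in which both directions can be written down.

```python
import numpy as np

rng = np.random.default_rng(1)
n_flux, n_obs = 20, 15  # fewer observations than unknown flux values

# Hypothetical linear "transport" operator mapping flux to concentrations.
H = rng.uniform(size=(n_obs, n_flux)) / n_flux

# Assumed prior on the flux: zero mean with an exponential covariance.
idx = np.arange(n_flux)
prior_cov = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 3.0)
noise_sd = 0.01

# Generative direction (easy): flux -> noisy concentration readings.
true_flux = rng.multivariate_normal(np.zeros(n_flux), prior_cov)
obs = H @ true_flux + rng.normal(scale=noise_sd, size=n_obs)

# Inverse direction (hard): readings -> Gaussian posterior over the flux.
S = H @ prior_cov @ H.T + noise_sd**2 * np.eye(n_obs)
gain = np.linalg.solve(S, H @ prior_cov).T  # equals prior_cov @ H.T @ inv(S)
post_mean = gain @ obs
post_cov = prior_cov - gain @ H @ prior_cov

print("prior variance (mean):    ", np.mean(np.diag(prior_cov)))
print("posterior variance (mean):", np.mean(np.diag(post_cov)))
```

Even in this idealised setting the data only partly constrain the flux, and the posterior covariance reports how much uncertainty remains. Neglecting any of the uncertainty sources (here, the noise or the prior) would make the estimate look more certain than it is.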


Together with Bristol, the Centre for Environmental Informatics at the University of Wollongong is looking at ways in which sophisticated statistical techniques can be used to take into account uncertainty at each of the numerous stages in flux inversion. This means taking into account everything from uncertainty in the satellite retrieval right down to uncertainty in how sinks and sources of carbon dioxide evolve in time. We hope this work will reconcile several of the existing, more conventional approaches, which frequently do not account for uncertainty at every stage of the flux inversion process and may therefore easily result in over-confident estimates of flux. The approach should also allow us to objectively assess to what extent we can reliably estimate flux from, say, satellite data, when taking all uncertainties into account.

With Bristol we are specifically focusing on the OCO-2 satellite, launched by NASA in July 2014. The satellite takes readings of column-averaged carbon dioxide. Given a set of flux fields, we can reproduce the column-averaged carbon dioxide by passing the flux through a transport model (a model which simulates the movement of carbon dioxide in the atmosphere) driven by meteorology fields. These ‘model outputs’ were provided to us by the team at Bristol. From a statistical perspective, the problem of inversion is complex, as it involves the consideration of transport, as well as all the biases and uncertainties in OCO-2 retrievals, which can be significant. There is also a lot of data: OCO-2 generates on the order of millions of usable retrievals per month, and how to reliably assemble all these data within an inversion framework is still an open problem. My visit to Bristol proved invaluable in sorting out some of the teething issues that emerge when attempting to do the inversion from satellite data in this way. Our joint work will be presented at the European Geosciences Union General Assembly in Vienna in April 2017. This project is carried out in collaboration with Noel Cressie (Wollongong), Ann Stavert (Bristol), Matt Rigby (Bristol), Anita Ganesan (Bristol), and Peter Rayner (Melbourne).

Artist’s rendering of NASA’s Orbiting Carbon Observatory (OCO)-2 (Source: NASA)


The second project is based on global sea-level rise. Sea-level rise is both a symptom of, and a contributor to, climate change, and one of its most concerning aspects. For example, 85% of the Australian population lives within 50 km of the coast, and the problem is not so much the sea-level rise in itself as the increased risk and impact of the storm surges that accompany a higher sea level. Considerable infrastructure re-planning will be needed in the next few decades to counter the effects of what seems an unstoppable rise in sea level.

 

Bruce C. Douglas (1997). “Global Sea Rise: A Redetermination”. Surveys in Geophysics 18: 279–292. DOI:10.1023/A:1006544227856. Can be redistributed under the Creative Commons Attribution-Share Alike 3.0 Unported license.


The GlobalMass team based in Bristol is trying to answer a simple, yet unanswered, question: what are the contributors to global sea-level rise? Global sea-level rise can be due to a number of factors: (i) thermal expansion (as the ocean heats, it expands), (ii) change in salinity (salt increases the water density), (iii) hydrology (dams, rivers, etc.) and (iv) ice sheets and ice caps (as ice sheets melt, the sea level rises). Further, sea-level rise is highly spatially and temporally dependent, making it very difficult to assess what is contributing to it without a deep understanding of the physical processes driving it, and what their regional and temporal impacts are. In order to “source separate” the contributors, we incorporate a large variety of data sources, including Argo buoy data, gravimetry, altimetry and tide gauges, all at once. The uncertainties, as well as the footprints of the observations, are all taken into account when solving for the separate sea-level rise contributions. This work, funded by the European Union, is a follow-on from the UK-funded RATES project which I worked on for two years, and which resulted in a statistical framework for identifying the contributors to sea-level rise just from Antarctica. This visit to Bristol helped to flesh out some of the statistical problems related to the GlobalMass project objective, and to devise methods and a way forward to solve them.

The GlobalMass team consists of 8 researchers from the University of Bristol and project partners from the University of Wollongong, University of Tasmania, NASA, LEGOS, and the University of Colorado, Boulder. More information on the project can be found at globalmass.eu.

GlobalMass meeting, February 2017.