Programme & Book of abstracts
Programme overview
Day 1 (21/05/2025)
Location: talks in room 222 BO, poster session in M&S CM, coffee break in the Atrium
8:00-8:20 Self-registration
8:20-8:30 Opening session
8:30-9:30 Keynote talk 1 by Finn Lindgren (University of Edinburgh): Markovian potatoes - a collaborative effort
9:30-10:30 Invited talks
- 9:30-10:00 Lola Ugarte (Public University of Navarre): On spatial confounding in multivariate areal models
- 10:00-10:30 Lisa B Gaedke-Merzhäuser (KAUST): What R (and maybe you) don’t know about INLA
10:30-11:00 Coffee break
11:00-12:30 Contributed talks
- 11:00-11:18 Stephen Jun Villejo (University of Glasgow): Validating uncertainty propagation approaches for two-stage Bayesian spatial models using simulation-based calibration
- 11:18-11:36 Man Ho Suen (University of Edinburgh): Coherent Disaggregation and Uncertainty Quantification for Spatially Misaligned Data
- 11:36-11:54 Ingelin Steinsland (NTNU): INLA made Norwegian runoff maps better
- 11:54-12:12 Sigrunn Holbek Sørbye (UiT The Arctic University of Norway): Identifying early warning signals in climatic time series
- 12:12-12:30 Jafet Belmont (University of Glasgow): Combining Efforts: Integrating Citizen Science and Survey Data using inlabru
12:30-14:00 Lunch time
14:00-16:00 Tutorial: inlabru
16:15-18:00 Poster session, wine and cake
Day 2 (22/05/2025)
Location: talks in room 222 BO, coffee break in the Atrium
8:30-9:30 Keynote talk 2 by Sara Martino (NTNU): 20 years of INLA
9:30-10:30 Panel discussion: Implementing INLA - Past, Present and Future. Chair: Lisa B Gaedke-Merzhäuser
10:30-11:00 Coffee break
11:00-12:00 Invited talks
- 11:00-11:30 Andrea Riebler (NTNU): INLA has made Bayesian inference more accessible to applied scientists
- 11:30-12:00 Esmail Abdul Fattah (KAUST): From Single-Core to Many-Core to GPUs
12:00-13:30 Lunch time
13:30-15:30 Tutorial: MetricGraph
15:30-16:00 Coffee break & photo
16:00-17:00 Keynote talk 3 by Janet van Niekerk (KAUST): INLA 2.0
Day 3 (23/05/2025)
Location: talks in room 222 BO, coffee break in the Atrium
8:30-9:30 Keynote talk 4 by Elias Teixeira Krainski (KAUST): graphpcor: Models for correlation matrices based on graphs
9:30-11:00 Contributed talks
- 9:30-9:48 Lenin Rafael Riera Segura (KAUST): A new class of non-stationary Gaussian fields with general smoothness on metric graphs
- 9:48-10:06 Virgilio Gómez-Rubio (Universidad de Castilla-La Mancha): “INLA con cosas”: embedding INLA to fit a larger class of models
- 10:06-10:24 Karina Lilleborge (NTNU): Joint Modelling of Line and Point Data on Metric Graphs
- 10:24-10:42 Birgir Hrafnkelsson (University of Iceland): Max-and-Smooth: A Two-Step Approach for Approximate Bayesian Inference in Latent Gaussian Models
- 10:42-11:00 Brynjólfur Gauti Guðrúnar Jónsson (University of Iceland): Modeling spatial dependence through latent Gaussian models with spatial copulas
11:00-11:30 Coffee break
11:30-12:00 Invited talk by Dan Simpson (.txt)
12:00-13:00 Håvard’s speech, awards and closing
Programme at a glance
Book of abstracts
Day 1 (21/05/2025)
Finn Lindgren (University of Edinburgh)
Title: Markovian potatoes - a collaborative effort
Summary. In 2004, Haavard asked me the key question that spawned the current methods for representing continuous Gaussian random fields with local basis function expansions and blending that with techniques for Markov random field computations: “Can you construct a stationary Markov random field on a sphere?”. I will revisit some of the history pre- and post-dating that key moment, and how the research has evolved up to the present day. Also, there will be (numerical) potatoes.
Lola Ugarte (Public University of Navarre, Spain)
Title: On spatial confounding in multivariate areal models
Summary. Spatial areal models encounter the well-known and challenging problem of spatial confounding. This issue makes it arduous to distinguish between the impacts of observed covariates and spatial random effects. Despite previous research and various proposed methods to tackle this problem, finding a definitive solution remains elusive. In this paper, we propose a simplified version of the spatial+ approach that eliminates the need to separately fit spatial models for the covariates. We apply this method to analyse two forms of crimes against women in Uttar Pradesh, India. To evaluate the performance of the new approach, we present a simulation study under different spatial confounding scenarios.
Lisa B Gaedke-Merzhäuser (KAUST)
Title: What R (and maybe you) don’t know about INLA
Summary. We dive under the surface of the R-INLA package and take a look at how your latent Gaussian model gets transformed into linear algebra. What does performing inference with INLA mean from an algorithmic perspective and how is this realized in the implementation? Where is the time spent when you fit your model? We explore how different inference tasks pose different computational challenges and meanwhile shed a (small) beam of light on the jungle of C-code that makes the R-INLA package. We expose some of the tricks, cheat codes and strikes of genius that are used behind the scenes to provide users with fast and accurate inference results.
Stephen Jun Villejo (University of Glasgow)
Title: Validating uncertainty propagation approaches for two-stage Bayesian spatial models using simulation-based calibration
Summary. This work tackles the problem of uncertainty propagation in two-stage Bayesian models, with focus on spatial applications. A two-stage modeling framework has the advantage of being more computationally efficient than a fully Bayesian approach when the first-stage model is already complex in itself, and avoids the potential problem of unwanted feedback effects. Two ways of doing two-stage modeling are the crude plug-in method and the posterior sampling method. The former ignores the uncertainty in the first-stage model, while the latter can be computationally expensive. This paper validates the two aforementioned approaches and proposes a new approach to do uncertainty propagation, which we call the Q uncertainty method, implemented using the Integrated Nested Laplace Approximation (INLA). We validate the different approaches using the simulation-based calibration method, which tests the self-consistency property of Bayesian models. Results show that the crude plug-in method underestimates the true posterior uncertainty in the second-stage model parameters, while the resampling approach and the proposed method are correct. We illustrate the approaches in a real life data application which aims to link relative humidity and Dengue cases in the Philippines for August 2018.
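A minimal sketch of the simulation-based calibration idea mentioned in the summary above. It assumes a toy conjugate normal model (not the two-stage spatial models of the talk), and every name and setting is illustrative. If posterior draws are self-consistent with the prior and the data-generating model, the rank of the prior draw among the posterior draws should be uniformly distributed.

```python
import random

def sbc_ranks(n_sims=500, n_obs=10, n_post=19, sigma=1.0, seed=1):
    """Rank of each prior draw among exact posterior draws (toy model).

    Assumed model, for illustration only:
    mu ~ N(0, 1), y_i | mu ~ N(mu, sigma^2).
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        mu_true = rng.gauss(0.0, 1.0)  # draw the parameter from the prior
        y = [rng.gauss(mu_true, sigma) for _ in range(n_obs)]
        # Conjugate posterior: precision 1 + n/sigma^2, mean (sum(y)/sigma^2)/precision
        prec = 1.0 + n_obs / sigma**2
        post_mean = (sum(y) / sigma**2) / prec
        post_sd = prec ** -0.5
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_post)]
        ranks.append(sum(d < mu_true for d in draws))  # rank in 0..n_post
    return ranks

def rank_histogram(ranks, n_post):
    """Counts per rank value; roughly flat when the posterior is calibrated."""
    counts = [0] * (n_post + 1)
    for r in ranks:
        counts[r] += 1
    return counts

ranks = sbc_ranks()
counts = rank_histogram(ranks, 19)
```

A posterior that is too narrow, as the summary reports for the crude plug-in method, would typically pile the ranks up at the two ends of this histogram instead of spreading them evenly.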
Man Ho Suen (University of Edinburgh)
Title: Coherent Disaggregation and Uncertainty Quantification for Spatially Misaligned Data
Summary. Spatial misalignments arise from data aggregation or attempts to align misaligned data, leading to information loss. We propose a disaggregation framework that combines the finite element method (FEM) with a first-order Taylor approximation via integrated nested Laplace approximation (INLA).
In landslide studies, landslide occurrences are often aggregated into counts based on slope units, reducing spatial detail. Our framework examines point pattern and aggregated count models under four covariate field scenarios: . The first three involve aggregation, while the latter two have incomplete fields. For these, we estimate the full covariate field using methods, with the latter two accounting for uncertainty propagation and showing superior performance. Even under model misspecification (i.e. modelling a nonlinear field as linear), these methods remain more robust.
Whenever possible, point pattern observations and full-resolution covariate fields should be prioritized. For incomplete fields, methods incorporating uncertainty propagation are preferred. This framework supports landslide susceptibility and other spatial mapping, integrating seamlessly with INLA extension packages.
Ingelin Steinsland (NTNU)
Title: INLA made Norwegian runoff maps better
Summary. Runoff is a key variable in hydrology, but most areas in the world lack runoff measurements, and runoff must be estimated. The mean annual runoff is based on 30 years of data. However, in many catchments there are shorter records. In Roksvåg et al (2021) a Bayesian geostatistical model for annual runoff that models several years of runoff simultaneously using both point observations (precipitation) and areal observations (catchment runoff) with two spatial fields was developed. This model was further developed and validated for estimating annual mean runoff using both long and short runoff time series (Roksvåg et al 2020). In Roksvåg et al (2022) a two-step methodology for merging models enabling short records and process-based simulations was developed. Step 1 is preprocessing based on Roksvåg et al (2020). Step 2 uses a spatially varying coefficient approach such that the relationship between the covariate (simulations from the HBV model) and the response variable (observed mean annual runoff) can vary in space. The proposed approach outperformed the HBV model when predicting runoff for ungauged and partially gauged catchments, with a reduction in RMSE of 20 % for ungauged catchments and 58 % for partially gauged catchments. For ungauged catchments the proposed framework also outperformed a purely geostatistical method. In both papers the INLA-SPDE approach to Bayesian modelling and fast inference was essential. In 2022, the Norwegian Water Resources and Energy Directorate (NVE) launched a new, gridded mean annual runoff map for Norway using the methodology of Roksvåg et al (2022).
References
Roksvåg, T., Steinsland, I., and Engeland, K. (2020) Estimation of annual runoff by exploiting long-term spatial patterns and short records within a geostatistical framework, Hydrol. Earth Syst. Sci., 24, 4109–4133, https://doi.org/10.5194/hess-24-4109-2020
Roksvåg, T., Steinsland, I., and Engeland, K. (2021) A two-field geostatistical model combining point and areal observations—A case study of annual runoff predictions in the Voss area, Journal of the Royal Statistical Society Series C, 70(4), https://academic.oup.com/jrsssc/article-abstract/70/4/934/7034001
Roksvåg, T., Steinsland, I., and Engeland, K. (2022) A geostatistical spatially varying coefficient model for mean annual runoff that incorporates process-based simulations and short records, Hydrol. Earth Syst. Sci., 26, 5391–5410, https://doi.org/10.5194/hess-26-5391-2022
Sigrunn Holbek Sørbye (UiT The Arctic University of Norway)
Title: Identifying early warning signals in climatic time series
Summary. Detection of early warning signals in climatic time series is important for assessing the risk of critical transitions and anticipating potential tipping points. In addition to long-term trends, common statistical indicators of early warning signals include increased variance and rising autocorrelation, both of which may indicate critical slowing down before abrupt climate shifts. A widely used method to evaluate such changes is Kendall’s rank correlation coefficient, applied to estimates of variance and lag-1 autocorrelation in sliding windows. However, its conclusions depend on window size, sampling rate, and the detrending approach used. Additionally, changes in these indicators can be difficult to distinguish from long-range dependence, a common feature of climatic time series.
We propose a novel approach to model time-varying autocorrelation in long-range dependent time series by mixing two fractional Gaussian noise processes with a time-dependent weight function. Using INLA, the new model component is efficiently estimated simultaneously with other latent components such as mean trends and seasonal effects. Furthermore, Bayes factors can be used to test for increasing versus constant autocorrelation. The approach will be demonstrated on climatic time series, including evaluation of a reconstructed Atlantic multidecadal variability index as a potential early warning signal for shifts in North Atlantic climate dynamics.
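For orientation, the conventional indicator described in the first paragraph of this summary (Kendall's rank correlation applied to sliding-window estimates of lag-1 autocorrelation) might be sketched as follows. This is not the fractional Gaussian noise mixture model proposed in the talk. The toy AR(1) series, the window length and all function names are assumptions made for illustration.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    m = sum(x) / n
    den = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    return num / den

def sliding(x, window, stat):
    """Apply `stat` to every window of length `window`."""
    return [stat(x[i:i + window]) for i in range(len(x) - window + 1)]

def kendall_tau(y):
    """Kendall's tau-a between y and its time index (no tie correction)."""
    n = len(y)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (y[j] > y[i]) - (y[j] < y[i])
    return s / (n * (n - 1) / 2)

# Toy AR(1) series whose coefficient rises from 0.1 towards 0.9 (illustrative only).
rng = random.Random(0)
n = 600
x = [0.0]
for t in range(1, n):
    phi = 0.1 + 0.8 * t / n
    x.append(phi * x[-1] + rng.gauss(0.0, 1.0))

ac = sliding(x, 100, lag1_autocorr)  # indicator series, one value per window
tau = kendall_tau(ac)                # a positive tau suggests rising autocorrelation
```

Changing the window length or detrending the series first alters `ac` and therefore `tau`, which is the sensitivity the summary points out.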
Jafet Belmont (University of Glasgow)
Title: Combining Efforts: Integrating Citizen Science and Survey Data using inlabru
Summary. Over the last decade, opportunistic data collected through citizen science (CS) programs have proven to be a valuable and cost-effective resource for monitoring wildlife populations. However, the absence of standardized sampling protocols in these initiatives introduces analytical challenges, as data collection is often influenced by observer preferences, accessibility, and areas of public interest. In contrast, standardized surveys are carefully structured to minimize these biases but provide much lower spatiotemporal coverage. Integrating these two valuable data sources could enhance the accuracy and quality of species distribution predictions. Yet, reliably combining them remains challenging due to differences in study design, spatial and temporal coverage, and potential sampling biases. In this work, we introduce a new approach to integrating opportunistic CS observations with data from planned surveys while accounting for observational errors inherent in both sources. The proposed integrated model is fitted using inlabru to jointly analyse the geographical locations where a species has been reported by participating volunteers and the detection/non-detection data from robustly designed surveys. By integrating CS data with survey data using robust modelling techniques, we can enhance our understanding of species distributions.
Understanding the spatial distribution of urban fire occurrences is crucial for assessing fire risk and improving prevention strategies. In this study, we analyse the spatial patterns of urban fire events in Portugal using Bayesian hierarchical models, specifically Integrated Nested Laplace Approximation (INLA) and Stochastic Partial Differential Equation (SPDE) methods. We model fire intensity through Log-Gaussian Cox Processes (LGCP) with a Poisson likelihood. The model estimates the intensity of the point process using only an intercept and the SPDE latent effect, allowing us to capture spatial heterogeneity in fire occurrence. Additionally, we apply Voronoi tessellation to partition the study area into regions based on proximity to fire occurrences. This approach enables a better exploration of local variations in fire intensity and identification of spatial clustering patterns. To assess fire risk, we generate a spatial map displaying the posterior means of the estimated fire intensity at grid points. The map provides a visual representation of fire-prone areas, highlighting regions where fire incidents are more likely to occur.
There are currently limited options for spatial population dynamics models that can effectively utilize ecological survey data, with most mechanistic models requiring detailed demographic data. To address this gap, we propose integrating biological differential equations into the INLA framework using the SPDE approach. By extending Anderka (2024)’s iterative INLA method for estimating the parameters of an SPDE, we aim to incorporate the logistic growth equation as a spatio-temporal random effect in an inhomogeneous Poisson Process. This model could estimate and predict demographic processes from count data, and is a significant step towards combining mechanistic and statistical population models. Future extensions could include more complex and biologically realistic dynamics, such as multispecies systems using the Lotka-Volterra equations.
The COVID-19 pandemic underscored the importance of mathematical modelling in epidemiology. During this period, there was also a clear expansion in the adoption of non-traditional data sources, such as mobility data, in public health research. Mobility can be seen as a proxy for the change in contacts between individuals and, thus, may help to better capture how infectious diseases spread through the population. In this work we use mobility data taken from the Google Community Mobility Reports. These include daily data for the period between February 15th 2020 and October 15th 2022 on the percentage changes in visitor numbers or time spent at categorised locations compared to a defined baseline. For Portugal, the mobility data are disaggregated by district. Given the spatial and temporal nature of the data, the use of spatiotemporal modelling techniques allows for a more accurate representation of patterns, trends, and dependencies across both space and time. We used the INLA framework to estimate spatiotemporal models for several mobility categories, considering several predictors such as seasonality controls, day of the week, national and regional holiday indicators, stringency and temperature. We used a standard train-test split to choose the model specification and to assess forecasting performance. For the Workplace movement category, results show that space-time interactions of Type I, which assume an unstructured spatial and temporal interaction, performed better in the model specification stage. Forecast results indicate that the inclusion of spatiotemporal dependencies improves predictive accuracy, capturing both short-term fluctuations and longer-term mobility trends.
Environmental variables are one crucial piece in the development of statistical models in applied cases. The ever-evolving technology of remote sensing is increasingly able to provide high-resolution data, and researchers are always on the lookout for higher-resolution products that might explain their geographical processes of interest at a finer scale. This work showcases how higher resolutions of covariate data might not be appropriate to capture the large-scale environmental processes we are interested in, and the potential pitfalls of combining this high-resolution data with the stochastic partial differential equation (SPDE) approach. We demonstrate this by a marked log-Gaussian Cox process for landslide size and occurrence across specific geomorphological discretizations of Japan.
In colonial breeding seabirds, habitat use is typically inferred from time series of location estimates obtained at high resolution from animal-borne telemetry tags. However, these data often violate the assumptions of independence and unbiased sampling inherent to conventional resource selection function (RSF) models such as inhomogeneous Poisson point processes. Seabirds are central-place foragers in the breeding season, travelling between a nesting site and their foraging grounds. This repetitive spatial pattern can lead to further complexity in telemetry data, where observations near the breeding colonies may be more frequent or spatially clustered. Historically, data thinning has been the most common approach used for reducing autocorrelation in the inputs to RSFs. Although a pragmatic solution, thinning results in data loss that may impair parameter estimation. More advanced techniques also exist to account for non-independence, including log-Gaussian Cox Process (LGCP) and integrated RSF with autocorrelation-adjusted likelihood weights. In this work, we will review how different statistical approaches for quantifying resource selection can be applied to telemetry data. Using simulated telemetry data, we aim to illustrate the strengths and weaknesses of each approach, including considerations for central-place foraging behaviour, and discuss how these methods can be improved to account for the particularities of telemetry data.
During the COVID-19 pandemic, wastewater-based epidemiology (WBE) has emerged as an innovative and cost-effective approach for monitoring SARS-CoV-2 levels in communities. Despite its potential, few studies have attempted to predict viral load and disease prevalence with high spatial and temporal resolution. This study leverages WBE methodologies to forecast the spread of COVID-19 cases in the municipality of Bologna (Emilia-Romagna, Italy), starting from SARS-CoV-2 detection in the area’s sole wastewater treatment plant. We employed a deterministic model based on a mass balance approach, which equates the amount of viral RNA in the wastewater to the mass of virus produced in the study area. This mass was estimated by integrating hydraulic variables (wastewater effluent flow rate, residence time in the sewer system), biological variables (faecal excretion rate and viral degradation), and demographic factors (age, sex, household size, comorbidities, and population density). To generate daily, detailed spatial predictions for each of the 724 census tracts, we employed population-based coefficients derived from a Bayesian Flexible Besag (fBesag) model, implemented through the INLA framework. The fBesag model used a Gaussian kernel-weighted adjacency matrix based on the Euclidean distances between centroids derived from a land-use map. Incorporating these population-based coefficients led to precise estimates of the spatiotemporal distribution of COVID-19 cases, achieving a daily Mean Absolute Error (MAE) of 0.56 cases on a local scale. Furthermore, the deterministic model’s predictions showed a strong cross-correlation (>0.6) with the observed number of cases up to 4 days in advance, highlighting its effectiveness in short-term forecasting.
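As a rough illustration of two ingredients named in this abstract, the sketch below builds a Gaussian kernel-weighted adjacency matrix from Euclidean distances between centroids and computes a mean absolute error. The kernel form `exp(-d^2 / (2 * bandwidth^2))`, the bandwidth value, the toy centroids and the function names are all assumptions; the abstract does not specify them, and this is not the fBesag implementation itself.

```python
import math

def gaussian_kernel_weights(centroids, bandwidth, adjacency):
    """Weight each neighbouring pair by exp(-d^2 / (2 * bandwidth^2)).

    centroids: list of (x, y) coordinates.
    adjacency: set of (i, j) index pairs of neighbouring areas.
    Kernel form and bandwidth are illustrative assumptions.
    """
    n = len(centroids)
    w = [[0.0] * n for _ in range(n)]
    for i, j in adjacency:
        d = math.dist(centroids[i], centroids[j])  # Euclidean distance
        w[i][j] = w[j][i] = math.exp(-d**2 / (2 * bandwidth**2))
    return w

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Three toy areas on a line: 0 and 1 are close, 1 and 2 are further apart.
centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
adjacency = {(0, 1), (1, 2)}
W = gaussian_kernel_weights(centroids, bandwidth=1.0, adjacency=adjacency)
```

Non-adjacent areas keep a weight of zero, while adjacent areas with closer centroids receive larger weights than adjacent areas that are further apart.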
Latent Gaussian Models (LGMs) are a subset of Bayesian Hierarchical models popularly employed in many fields for their flexibility and computational efficiency. However, practitioners find prior elicitation on the variance parameters of LGMs challenging as they are not easily interpretable by users. Recently, several papers have tackled this issue by rethinking the model in terms of variance partitioning (VP) and assigning priors to parameters reflecting the relative contribution of each effect to the total variance. So far, the class of priors based on VP has been mainly deployed for random and fixed effects separately. This work presents a novel standardization procedure that expands the applicability of VP priors to a broader class of LGMs, including both fixed and random effects. The standardization procedure consists of a 0-mean constraint step, only required for fixed effects, and a scaling step to ensure homogeneity in the scales of the variance parameters. We describe the steps required for standardization through various examples, with a particular focus on the popular class of intrinsic Gaussian Markov random fields (IGMRFs). We also discuss the popular case of P-Splines, for which we propose an adjustment of the precision matrix for a neat separation between their linear and non-linear contributions. The practical advantages of standardization are demonstrated with simulated data and a real dataset on survival analysis. More details on this project can be found in the preprint.
Recursive inference and distributed inference can be implemented to address various scenarios involving data acquisition, model complexity, or Big Data challenges. In particular, Bayesian inference naturally provides a framework for recursively updating information. However, in practice, recursive implementation of the inferential process is often avoided due to technical challenges. Recursive inference offers a mechanism to update the available information about the inferential model as new data becomes available. On the other hand, distributed inference is typically employed when working with large datasets or complex models, allowing data and computation to be partitioned across different servers or machines to distribute the computational load. Moreover, distributed inference can also be used when data must be analyzed separately without sharing it, as in data privacy issues. In such cases, distributed inference enables the analysis of a complex model by integrating information from model components without the need to explicitly share the protected data.
This study presents a methodology for implementing recursive and distributed inference using the INLA framework. This approach leverages the computational efficiency and flexibility of INLA, providing procedures to recursively update model results as new information is acquired and to perform distributed inference for large datasets or highly complex models. In both recursive and distributed procedures, the model can be partitioned to simplify the inference process by dividing it into distinct blocks that can be solved in parallel on different machines or servers and subsequently integrated. Therefore, we present various partitioning methods, either automatic or user-defined, tailored to different models, such as spatial, temporal, or spatiotemporal models. These include applications in geostatistics, small area models, or log-Gaussian Cox processes for point processes.
Background. England has witnessed dramatic increases in the rate of depression, anxiety and other mental health problems in the last two decades, with potentially fatal consequences for population mental health given increased suicide risk amongst those living with a mental disorder. Despite this, it is unclear whether suicide rates have also risen during this period (10.7 deaths per 100,000 in 2022), or the extent to which social determinants in the local environment influence this risk.
Methods. In this ecological study, we used a hurdle-Poisson model to analyse suicide data from the Office for National Statistics, exploring spatial and temporal patterns in England (2002–2022) using a high-resolution Bayesian spatio-temporal framework. We assessed the effects of deprivation, ethnic density, population density, light pollution, railway and road network densities and greenspace on standardised mortality ratios for suicide, standardised for age and sex risk.
Findings. Over two decades, suicide risk showed no substantial change, -2.85% (95% CrI: -7.54% to 1.49%). After accounting for socio-environmental factors, 55.76% (95% CrI: 50.58% to 60.05%) of the total variation in suicides was explained by unmeasured local neighbourhood factors. Suicide risk was positively associated with deprivation and noise pollution, negatively associated with ethnic density and greenspace, while population density and light pollution were only positively associated with suicide rates in areas of low urbanicity.
Interpretations. Despite multiple policy interventions to prevent suicide, rates have not declined over the last 20 years, suggesting an urgent need to enhance prevention strategies. Our results suggest that, once we understand mechanisms, future interventions should target areas characterised by specific profiles of community-level characteristics.
Wildfires pose a major threat to Portugal, with an average of over 90,000 hectares burned annually in recent decades. Beyond high wildfire frequency, the country has experienced devastating mega-fires, such as those in 2017. Accurate forecasting of wildfire occurrence and burned areas is therefore essential for effective firefighting resource allocation and emergency preparedness. In this study, we introduce a novel two-stage ensemble model that combines XGBoost and a spatial-temporal latent Gaussian model to jointly estimate wildfire occurrence and burned area. Initially, XGBoost identifies wildfire patterns based on meteorological covariates, coordinates, and time indicators, and its predictions are subsequently enhanced by a spatial-temporal latent Gaussian model. Our results demonstrate that this ensemble approach outperforms either method individually. To effectively model both moderate and extreme wildfire events, we employ the extended Generalized Pareto distribution (eGPD), which features a gamma-like lower bound and a Pareto-like tail. A gradient descent algorithm is used to fit the XGBoost model, and Bayesian inference for the latent Gaussian model is conducted using the Integrated Nested Laplace Approximation (INLA). Additionally, we contribute to the INLA community by implementing the eGPD as a likelihood function in the R-INLA package and discussing its penalized complexity priors (PC priors).
Current spatial point process modeling of crime data primarily relies on Euclidean distances, while criminal incidents such as robbery or vehicle crime only occur on the streets of cities. This study utilizes the recently proposed Log-Gaussian Cox Processes (LGCPs) on metric graphs to analyze crime data from the UK. The purpose is to study the effect of explanatory variables such as population density, education levels, and socioeconomic factors and to find hotspots of crime. We also compare the LGCPs on the networks with LGCPs defined in Euclidean space to investigate the effect of taking the network structure into account.
We introduce Proper Random Walks of order 2 (P-RW2), a smoothing spline methodology based on stationary autoregressive models. P-RWs retain the Markovian structure of traditional Random Walks while offering enhanced robustness and adaptability. They are well-suited for both regular and irregular data and ideal for smoothing and prediction tasks, particularly in sparse data scenarios. P-RWs offer an efficient and reliable alternative for complex modeling challenges and will soon join the family of random effects implemented within the INLA package.
High ambient temperatures can cause unnecessary mortality, with the health effects of heat being non-linear. Previous studies have shown that certain regions are more vulnerable. This study investigates the non-linear spatial vulnerabilities of heat exposure on all-cause mortality across small areas in Switzerland. We retrieved daily all-cause mortality and annual population data (2011–2022) for 2,145 municipalities, disaggregated by age and sex, from the Swiss Federal Office for Public Health and the Swiss Office for National Statistics. Daily temperature at 1 km resolution was obtained from the Federal Office for Meteorology and Climatology and aggregated to the municipality level using population weights. We developed a Bayesian Poisson hierarchical model to account for holidays, day of the week, long-term trends, and spatial correlation, allowing the heat effect to be non-linear and spatially varying. We modelled spatiotemporal correlations using Gaussian priors with a structured covariance matrix. We considered a 3-day lagged temperature, and we focused on summer months. We further examined spatial inequalities using modifiers such as green space and deprivation. Inference was done with INLA. During summer 2011–2022, we observed 160,027 deaths among individuals aged 65 years and older in Switzerland. The overall temperature-mortality association was J-shaped, with significant spatial disparities. Heat-attributable deaths were highest in northern Switzerland. Key contributors to spatial vulnerabilities included older age and lower green space coverage. This study presents a computationally efficient modelling framework to describe the spatial variation of heat effects across small areas.
Italy is one of the countries affected the most by COVID-19 and had the highest number of heat-related deaths in Europe in 2022. Extreme events like these have been consistently linked to increased mortality, highlighting the urgent need to monitor and quantify the impacts. Accurately assessing mortality burdens is crucial for informing public health policies, and implementing interventions to address health challenges, such as heat-related deaths, infectious diseases, and air pollution. Understanding these burdens helps policy makers develop strategies to protect vulnerable populations and improve public health outcomes.
This poster summarises my PhD research to date, focusing on developing a Bayesian spatio-temporal framework to estimate excess mortality at a sub-national level in Italy. Using nationwide population and weekly mortality data (2011-2019) by age, sex, and province, I applied multiple Bayesian models with different long-term trends and developed a Bayesian ensemble for each of the age-sex groups in cross-validation. Expected mortality for 2020-2022 was predicted, and excess mortality was calculated by subtracting predicted deaths from observed deaths. Covariates such as ambient temperature, national holidays, and spatio-temporal random effects were incorporated to improve accuracy.
This framework provides robust predictions for 2020–2022, with a focus on extreme events such as COVID-19 and heatwaves. During the entire period, 2,172,190 all-cause deaths were observed, with 184,536 (95%CrI: 143,431, 216,258) excess deaths after accounting for covariates. The approach enables excess mortality estimation at any spatial, temporal, and demographic resolution, helping policy makers to identify vulnerable populations and regions most affected by extreme events.
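The excess mortality calculation described above (observed deaths minus predicted deaths, summarised with a credible interval) can be sketched as follows. This is a minimal illustration, not the author's code: the function name `excess_summary` is hypothetical, and the posterior draws of expected deaths are synthetic values generated only to make the example runnable.

```python
import random
import statistics


def excess_summary(observed, expected_draws):
    """Summarise excess deaths = observed - expected over posterior draws."""
    excess = sorted(observed - e for e in expected_draws)
    # n=40 gives cut points every 2.5%, so the first and last are the
    # 2.5% and 97.5% quantiles (a 95% credible interval).
    cuts = statistics.quantiles(excess, n=40, method="inclusive")
    return {
        "median": statistics.median(excess),
        "lower": cuts[0],
        "upper": cuts[-1],
    }


# Synthetic stand-in for posterior predictive draws of expected deaths
# (assumption: in practice these would come from the fitted model).
rng = random.Random(1)
expected_draws = [rng.gauss(1000.0, 30.0) for _ in range(4000)]
observed_deaths = 1100

summary = excess_summary(observed_deaths, expected_draws)
print(summary)
```

Working with posterior draws rather than a single point prediction is what lets the uncertainty in the expected deaths carry through to the excess estimate.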
One of the most difficult issues that arises in Bayesian inference is model selection (García-Donato and Martínez-Beneito, 2013). In 1995, Green proposed the Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm, which estimates parameters and selects among models simultaneously.
The key feature of the RJMCMC algorithm is that, at each step of the Markov chain, a different model (with a different number of covariates, a different dimension, and so on) can be selected for the next step of the chain.
In recent years, INLA (Rue et al., 2009) has provided a much faster way to approximate models than traditional MCMC methods. Our approach combines Reversible Jump MCMC with INLA to select the “best” model from a proposed family. Instead of using a traditional MCMC method to fit each candidate model, which incurs a higher computational cost and time, we approximate each model using INLA. At each step of the Markov chain, the marginal likelihood is used as the jump criterion. When the MCMC run is finished, the models have been estimated with INLA, while their posterior probabilities are estimated from the samples generated by the Markov chain. The approach has been tested on several R datasets, including the cement, airquality, Swiss and mtcars datasets.
Joint work with: Virgilio Gómez Rubio and Gonzalo García Donato
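The idea of moving between models with the marginal likelihood as the jump criterion can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: because the parameters are integrated out, the move reduces to a Metropolis-Hastings step on model space; the function `log_marginal_likelihood` and the values in `GAINS` are hypothetical stand-ins for the marginal likelihoods that an INLA fit of each candidate model would return; and a uniform prior over models is assumed.

```python
import math
import random
from collections import Counter

# Hypothetical contribution of each covariate to the log marginal likelihood.
# In practice each model's value would come from fitting it (e.g. with INLA).
GAINS = (2.0, 0.5, -1.0)


def log_marginal_likelihood(model):
    """Toy log marginal likelihood of a model given as a 0/1 inclusion tuple."""
    return sum(g for g, included in zip(GAINS, model) if included)


def sample_models(n_iter, seed=0):
    """Random walk over models; returns the visit frequency of each model."""
    rng = random.Random(seed)
    current = (0, 0, 0)
    current_lml = log_marginal_likelihood(current)
    visits = Counter()
    for _ in range(n_iter):
        # Propose adding or removing one covariate (a symmetric proposal).
        j = rng.randrange(len(current))
        proposal = tuple(1 - v if i == j else v for i, v in enumerate(current))
        proposal_lml = log_marginal_likelihood(proposal)
        # Uniform model prior: accept with the marginal likelihood ratio.
        accept_prob = math.exp(min(0.0, proposal_lml - current_lml))
        if rng.random() < accept_prob:
            current, current_lml = proposal, proposal_lml
        visits[current] += 1
    return {m: c / n_iter for m, c in visits.items()}


posterior = sample_models(20000)
best_model = max(posterior, key=posterior.get)
print(best_model, round(posterior[best_model], 3))
```

The visit frequencies estimate the posterior model probabilities, so the most visited model is the one the chain selects.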
Net income is a key variable in labor surveys that commonly has high rates of missing values due to attrition, particularly in rotating panel designs. Imputation techniques offer a viable solution to missing data on this variable. This presentation introduces a Bayesian spatial approach based on stochastic partial differential equations (SPDE) with the integrated nested Laplace approximation (INLA) for imputation across the waves of a real rotating panel survey. The methodology is applied to quarterly data from the Portuguese Labor Survey for 2023-2024. The imputation models incorporate common covariates related to net income, such as age, sex, academic degree and months of active employment. For comparison, imputation with classic INLA and multiple imputation were also applied. SPDE-INLA is a prominent method within the INLA framework, with the advantage of integrating spatial dependence structures and geographical information into an imputation process based on fitted values from the models. Exploring new imputation alternatives for a continuous variable with a high missing rate, such as net income, can provide the Portuguese national statistics institute (INE) with complete sample data in the short term, improve inference accuracy, and potentially save money by reducing the need for reinterviews or additional sampling units. This work contributes to advancing imputation methodologies for labor surveys with rotating panel designs using INLA, offering solutions for improving data quality and reliability in national statistics.
The multivariate analysis of different diseases in a spatial context remains a critical area of research. Different proposals have been made to handle several multivariate epidemiological scenarios (Botella-Rocamora et al., 2015; Palmí-Perales et al., 2023). However, these approaches are not well suited to comparing spatial risk patterns across different population subgroups. Here we present an approach able to identify differences in spatial risk patterns among diverse at-risk populations. Specifically, the population is stratified by two factors, leading to four different population groups. Different modelling scenarios are fitted, and the best one is chosen. The proposed methodology is applied to analyse suicide-related emergency call data in the Valencian Community, Spain. Given the computational complexity of the problem, Bayesian inference is performed using the Integrated Nested Laplace Approximation (INLA) method (Rue et al., 2009), yielding efficient and reliable results.
References
Botella-Rocamora P., Martinez-Beneito M., Banerjee S. (2015). A unifying modeling framework for highly multivariate disease mapping. Statistics in Medicine 34(9), 1548–1559. doi:10.1002/sim.6423.
Palmí-Perales F., Gómez-Rubio V., Bivand R. S., Cameletti M., Rue H. (2023). Bayesian inference for multivariate spatial models with INLA. R Journal 15(3), 172–190. doi:10.32614/RJ-2023-068.
Rue H., Martino S., Chopin N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society Series B: Statistical Methodology 71(2), 319–392. doi:10.1111/j.1467-9868.2008.00700.x.
Geostatistics is concerned with the estimation and prediction of spatially continuous phenomena using data obtained at a discrete set of locations. In geostatistics, preferential sampling occurs when these locations are not independent of the latent spatial field, and common modelling approaches that do not account for such a dependence structure might yield incorrect inferences. To overcome this issue, some methods have been proposed to model data collected under preferential sampling. However, while these methods assume a constant degree of preferentiality, real data may exhibit a degree of preferentiality that varies over space. For this reason, we propose a new model that accounts for preferential sampling by including a spatially varying coefficient that describes the strength of dependence between the process that models the sampling locations and the latent field. To do so, we approximate the preferentiality component using a set of basis functions, with the corresponding coefficients being estimated using the integrated nested Laplace approximation (INLA) method. By doing so, we allow the degree of preferentiality to vary over the domain with low computational burden. We assess our model’s performance by means of a simulation study and use it to analyse the PM2.5 levels in the USA. We conclude that, given enough observed events, our model, along with the implemented inference routine, effectively retrieves the latent field and the spatially varying preferentiality surface, even under misspecified scenarios.
Quick and efficient responses to emergencies requiring emergency medical vehicles (EMVs) are essential for improving patient survival rates and clinical outcomes. Key performance indicators, such as response time, play a crucial role in the decision-making process concerning the selection, deployment, and management of the EMV fleet. However, resource limitations and the complex nature of potential decisions complicate this process. This study aims to understand and predict the behaviour of EMV demand to enhance the quality of decision-making tools. This work utilizes real data from over 98,000 georeferenced emergency calls in Valencia during 2019. It examines the spatial dependency of these events by applying point process analysis techniques. The intensity function is estimated using both frequentist and Bayesian approaches with the statistical software R. An exploratory data analysis revealed a pattern that deviates from complete spatial randomness (CSR). A kernel estimator was used to create a heatmap, which helps identify potential hotspots in EMV demand. Using the inlabru R package, a spatial model was developed employing integrated nested Laplace approximation. After establishing the coordinate system and mesh, a log-Gaussian Cox process (LGCP) model was proposed, paving the way for further exploration of this modelling approach in future research.
Air pollution remains a critical environmental and public health challenge, demanding high-resolution spatial data to better understand its spatial distribution and impacts. This study addresses the challenges of conducting multivariate spatial analysis of air pollutants observed at aggregated levels, particularly when the goal is to model the underlying continuous processes and perform spatial predictions at varying resolutions. To address these issues, we propose a continuous multivariate spatial model based on Gaussian processes (GPs), naturally accommodating the support of aggregated sampling units. Computationally efficient inference is achieved using R-INLA, leveraging the connection between GPs and Gaussian Markov random fields (GMRFs). A custom projection matrix maps between the GMRFs defined on the triangulation of the study region and the aggregated GPs at the sampling units, ensuring accurate handling of changes in spatial support. The model integrates shared information among pollutants and incorporates covariates, enhancing interpretability and explanatory power. We use it to downscale PM2.5, PM10 and ozone levels in Portugal and Italy, improving spatial resolution from 0.1° (~10 km) to 0.02° (~2 km), and revealing dependencies among pollutants. Our framework provides a robust foundation for analyzing complex pollutant interactions, offering valuable insights for decision-makers seeking to address air pollution and its impacts.
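One common way to build a projection for aggregated support is to average the point-level projection rows of locations falling inside each areal unit. The sketch below illustrates that idea only; it is an assumption about the general construction, not the authors' custom matrix, and the function name `aggregate_projection`, the mesh weights and the node values are all hypothetical.

```python
def aggregate_projection(point_rows, weights=None):
    """Combine point-level projection rows into one row for an areal unit.

    Each point row holds the basis weights of one location on the mesh nodes;
    the areal row is their weighted average (equal weights by default).
    """
    n_points = len(point_rows)
    if weights is None:
        weights = [1.0] * n_points
    total = sum(weights)
    n_nodes = len(point_rows[0])
    return [
        sum(w * row[j] for w, row in zip(weights, point_rows)) / total
        for j in range(n_nodes)
    ]


# Hypothetical example: three integration points inside one areal unit,
# expressed on a mesh with four nodes.
point_rows = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.2, 0.8, 0.0],
    [0.0, 0.0, 0.3, 0.7],
]
node_values = [1.0, 2.0, 3.0, 4.0]  # field values at the mesh nodes

areal_row = aggregate_projection(point_rows)
areal_value = sum(a * x for a, x in zip(areal_row, node_values))
point_values = [sum(a * x for a, x in zip(row, node_values)) for row in point_rows]
print(areal_row, areal_value)
```

Because the map is linear, the areal value equals the average of the point-level values, which is what lets the aggregated observations inform the continuous field.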
Wetlands are globally significant ecosystems, providing critical ecological functions such as biodiversity support, water regulation, and climate mitigation. They serve as carbon sinks, storing a substantial portion of terrestrial soil organic carbon (SOC). A key factor in maintaining SOC levels is the anaerobic soil conditions sustained by high water tables. However, when wetlands are drained, these conditions are lost, leading to the gradual release of SOC into the atmosphere as carbon emissions. In Iceland, emissions from drained wetlands account for over one-third of the country’s annual carbon output. Despite this, a comprehensive map distinguishing drained from undrained regions is currently unavailable. To address this gap, we developed a logistic regression model to predict the probability of land drainage using predominantly satellite-derived GIS data. The model is inferred through INLA. Initially, a spatial model was considered, but due to the country’s size, the high variability between observational sites, and their limited number, the spatial component was not included. Some of the observational sites have multiple observations, each with different distances with respect to a draining ditch of interest. A mixed-effects INLA model, which takes into account the neighbor structure of the observational sites through a random walk component, was implemented. The final model adequately predicts the likelihood of land being drained, providing a valuable tool for environmental management and carbon accounting. However, further refinement and validation are needed to enhance its accuracy and applicability.
Grazing regimes prove a major avenue for anthropogenic influence on landscapes, and Glen Finglas in Scotland’s Loch Lomond and the Trossachs National Forest hosts a long-running, structured experiment into livestock grazing’s impacts on upland ecosystems. As grazing cascades through ecosystems predominantly via vegetation impacts, we look at floral health, diversity, and abundance to quantify these effects. We construct the largest-ever unified dataset on Glen Finglas vegetative diversity, landscape classifications, and topo-climatic covariates to fit linear additive regression models with integrated nested Laplace approximations and test the impact of explicit spatial patterns compared against past methods. Our analyses identify (1) increased floral diversity and potentially more landscape turnover from grazing, (2) lower floral diversity from warmer temperatures, (3) bryophyte-specific harms and vascular plant-specific diversity benefits from higher wind speeds, and (4) interplays between landscape classes, diversity, and plant heights. Uncertainties remain, but upland landscape diversity appears at risk from warming, with diverse responses to precipitation and wind speed. Thankfully, grazing may prove effective at forestalling more extreme landscape deterioration, and Scottish land managers ought to note its usefulness even under alternative climate regimes. Statistically, climate variables all show higher precision when spatial correlation is explicitly considered in at least one model, so more naive approaches may prove suboptimal. In contrast, grazing responds to complex models with wider intervals and lower precision. Given the importance of spatial structures and climate variables, we advise their inclusion in future work to improve precision when analysing observational data at Glen Finglas and elsewhere.
Advancements in computational power and methodologies have enabled research on massive datasets. However, tools for analyzing data with directional or periodic characteristics, such as wind directions and customers’ arrival times on a 24-hour clock, remain underdeveloped. While statisticians have proposed circular distributions for such analyses, significant challenges persist in constructing circular statistical models, particularly in the context of Bayesian methods. These challenges stem from limited theoretical development and a lack of historical studies on prior selection for circular distribution parameters. In this article, we propose a practical and systematic framework for selecting priors that effectively prevents overfitting in circular scenarios, especially when there is insufficient information to guide prior selection. We introduce well-examined Penalized Complexity (PC) priors for the most widely used circular distributions. Comprehensive comparisons with existing priors in the literature are conducted through simulation studies and a practical case study. Finally, we discuss the contributions and implications of our work, providing a foundation for further advancements in constructing Bayesian circular statistical models.
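To see why periodic data such as arrival times on a 24-hour clock need dedicated tools, the sketch below contrasts the arithmetic mean with the standard circular mean (the direction of the average unit vector). It is a background illustration only and does not implement the PC priors proposed in this work; the function name `circular_mean_hours` is hypothetical.

```python
import math


def circular_mean_hours(hours, period=24.0):
    """Circular mean of clock times, returned in [0, period)."""
    angles = [2.0 * math.pi * h / period for h in hours]
    sin_sum = sum(math.sin(a) for a in angles)
    cos_sum = sum(math.cos(a) for a in angles)
    # Direction of the resultant vector, mapped back to clock units.
    mean_angle = math.atan2(sin_sum, cos_sum) % (2.0 * math.pi)
    return (mean_angle * period / (2.0 * math.pi)) % period


arrivals = [23.0, 1.0]  # one hour before and one hour after midnight
linear_mean = sum(arrivals) / len(arrivals)
circ_mean = circular_mean_hours(arrivals)
print(linear_mean, circ_mean)
```

The arithmetic mean places the typical arrival at noon, whereas the circular mean places it at midnight (up to floating-point rounding), which is the sensible answer for these two times.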
Day 2 (22/05/2025)
Title: 20 years of INLA
Summary. This year marks not only Håvard’s 60th birthday, it is also the 20th anniversary of when Håvard (and I) began working on what would eventually become the INLA project. In this talk, I’ll reflect on those early days—how we took the first steps into developing INLA, and how the R-INLA library has evolved from a small, specialized tool for fitting GLMs into the flexible and powerful resource it is today.
Andrea Riebler
Title: INLA has made Bayesian inference more accessible to applied scientists
Summary. The introduction of sampling-based inference in the early 1990s marked a major breakthrough in Bayesian inference. However, the impact of this Bayesian revolution was less visible for applied users. That changed with the development of INLA and the release of its corresponding R package specifically designed for latent Gaussian models. Over the years, INLA has evolved into a versatile and efficient tool, enabling fast and reliable Bayesian inference. Several R packages have since been developed around or on top of INLA, tailoring its functionality to specific application areas and making it even more user-friendly for targeted audiences. In this talk, I will reflect on my experience with INLA since its early days and highlight its practical value in applied fields, particularly in Public Health and Epidemiology, where it is extremely useful today.
Esmail Abdul Fattah
Title: From Single-Core to Many-Core to GPUs
Summary. Computational efficiency has always been a key strength of INLA, but the computing landscape has evolved significantly. This talk explores the parallel strategies currently used in R-INLA and the need for a next-generation sparse matrix solver. sTiles, designed with modern HPC principles, enables INLA to scale beyond multi-core CPUs to many-core architectures, GPUs, and distributed-memory systems. I will discuss how sTiles unlocks new performance levels, making large-scale Bayesian inference feasible in next-generation computing environments.
Title: INLA 2.0
Summary. The INLA methodology has undergone some reformulations and new approaches. Most notably, the linear predictors have been removed from the latent field and the “N” part has essentially been replaced by a low-rank Variational Bayes procedure. This new approach, INLA 2.0, has been the default since late 2021 and has propelled many new applications that were infeasible before. I will give a conceptual overview of the changes, why these changes were explored and what lies ahead for INLA, in this direction.
Day 3 (23/05/2025)
Title: graphpcor: Models for correlation matrices based on graphs
Summary. Correlation matrices are fundamental in multivariate statistical analysis, and modeling them helps to understand complex dependencies between random variables. As prior distributions, the Wishart for covariance matrices and the LKJ for correlation matrices have been widely used. In this work, we introduce two graph-based approaches for modeling correlation matrices. The graph structure is leveraged to understand the relations between the variables. As a prior distribution, our proposal penalizes deviations from a base model, ensuring robustness and interpretability. The proposed methods are illustrated with practical examples, demonstrating their flexibility and applicability to a variety of problems.
Joint work with Anna Freni Sterrantino, Denis Rustand, Janet van Niekerk and Håvard Rue
Lenin Rafael Riera Segura (KAUST)
Title: A new class of non-stationary Gaussian fields with general smoothness on metric graphs
Summary. The increasing availability of network data has driven the development of advanced statistical models specifically designed for metric graphs, where Gaussian processes play a pivotal role. While models such as Whittle-Matérn fields have been introduced, there remains a lack of practically applicable options that accommodate flexible non-stationary covariance structures or general smoothness. To address this gap, we propose a novel class of generalized Whittle-Matérn fields, which are rigorously defined on general compact metric graphs and permit both non-stationarity and arbitrary smoothness. We establish new regularity results for these fields, which extend even to the standard Whittle-Matérn case. Furthermore, we introduce a method to approximate the covariance operator of these processes by combining the finite element method with a rational approximation of the operator’s fractional power, enabling computationally efficient Bayesian inference for large datasets. Theoretical guarantees are provided by deriving explicit convergence rates for the covariance approximation error, and the practical utility of our approach is demonstrated through simulation studies and an application to traffic speed data, highlighting the flexibility and effectiveness of the proposed model class.
Virgilio Gómez-Rubio (Universidad de Castilla-La Mancha)
Title: “INLA con cosas”: embedding INLA to fit a larger class of models
Summary. The integrated nested Laplace approximation makes Bayesian inference faster for a wide class of hierarchical models. However, the INLA methodology can only deal with models with a very particular structure. In order to fit a larger class of models, INLA can be embedded into more general model fitting algorithms such as importance sampling, Markov chain Monte Carlo and others. These approaches could be used to fit highly structured hierarchical models such as double-hierarchical models and change-point models, for example.
In my talk I will illustrate different strategies to embed INLA within other model fitting approaches to increase the classes of models that can be fit. In addition, I will illustrate this with different examples that will include computing the posterior probabilities for model selection and change-point models.
Karina Lilleborge (NTNU)
Title: Joint Modelling of Line and Point Data on Metric Graphs
Summary. Metric graphs are a useful tool for describing spatial domains like road and river networks, where spatial dependence should act along the network. We take advantage of recent developments for Gaussian random fields (GRFs) on metric graphs, and consider joint spatial modelling of observations with different spatial supports. Motivated by an application to traffic modelling in Trondheim, Norway, we consider line-referenced data, which can be described by an integral of the GRF along the metric graph, and point-referenced data. Through a simulation study inspired by the application, we investigate the number of replicates that are needed to estimate parameters and to predict unobserved locations. The former is assessed using bias and root mean square errors (RMSEs), and the latter is assessed through RMSE, continuous ranked probability scores, and coverage. Joint modelling is contrasted with a simplified approach that treats line-referenced observations as point-referenced observations. The results suggest that joint modelling leads to strong improvements. The application to Trondheim, Norway, combines point-referenced speed camera data and line-referenced public transportation data. To ensure positive speeds, we use a non-linear link function, which requires integrals of non-linear combinations of the linear predictor. We fit the model to two data sets where we expect different spatial dependency and compare the results. This is made computationally feasible by a combination of the R packages inlabru and MetricGraph, new code for processing geographical line data to work with existing graph representations, and fmesher methods for dealing with line support on objects from MetricGraph.
Birgir Hrafnkelsson (University of Iceland)
Title: Max-and-Smooth: A Two-Step Approach for Approximate Bayesian Inference in Latent Gaussian Models
Summary. With modern high-dimensional data, complex statistical models are necessary, requiring computationally feasible inference schemes. We introduce Max-and-Smooth, an approximate Bayesian inference scheme for a flexible class of latent Gaussian models (LGMs) where one or more of the likelihood parameters are modeled by latent additive Gaussian processes. Our proposed inference scheme is a two-step approach. In the first step (Max), the likelihood function is approximated by a Gaussian density with mean and covariance equal to the maximum likelihood estimate and the inverse observed information, respectively. In the second step (Smooth), the latent parameters and hyperparameters are inferred and smoothed with the approximated likelihood function. The proposed method ensures that the uncertainty from the first step is correctly propagated to the second step. The prior density for the latent parameters and the approximated likelihood function are Gaussian. Thus, the approximate conditional posterior density of the latent parameters is also Gaussian, which facilitates efficient posterior inference in high dimensions, especially when the Gaussian prior density is specified with a sparse precision matrix. The approximate marginal posterior distribution of the hyperparameters is tractable; thus, the hyperparameters can be sampled independently of the latent parameters. Due to this fact, approximations for the marginal densities of individual latent and hyperparameters can be computed as in the INLA approach. The proposed inference scheme is demonstrated on a spatially referenced real dataset and on simulated data mimicking a spatial inference problem. Our results show that Max-and-Smooth is accurate and fast.
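The two steps can be sketched in a deliberately reduced setting. The example below is an illustration under strong simplifying assumptions, not the authors' implementation: it uses a single site with a Poisson likelihood on the log-rate scale, a scalar Gaussian prior with fixed (hypothetical) mean and variance instead of a latent Gaussian process with hyperparameters, and the function names `max_step` and `smooth_step` are invented for this sketch.

```python
import math


def max_step(counts):
    """Max: Gaussian approximation to the Poisson likelihood of a log rate.

    Returns the maximum likelihood estimate and the inverse observed
    information, used as the mean and variance of the approximation.
    """
    n = len(counts)
    total = sum(counts)
    eta_hat = math.log(total / n)
    # Log-likelihood: total * eta - n * exp(eta); minus its second
    # derivative at the MLE is n * exp(eta_hat).
    observed_info = n * math.exp(eta_hat)
    return eta_hat, 1.0 / observed_info


def smooth_step(eta_hat, var_hat, prior_mean, prior_var):
    """Smooth: combine the Gaussian prior with the Gaussian pseudo-likelihood."""
    post_precision = 1.0 / prior_var + 1.0 / var_hat
    post_mean = (prior_mean / prior_var + eta_hat / var_hat) / post_precision
    return post_mean, 1.0 / post_precision


counts = [3, 5, 4, 6]  # made-up counts at one site
eta_hat, var_hat = max_step(counts)
post_mean, post_var = smooth_step(eta_hat, var_hat, prior_mean=1.0, prior_var=0.25)
print(eta_hat, var_hat, post_mean, post_var)
```

Because both the prior and the approximated likelihood are Gaussian, the update is available in closed form: the posterior mean is pulled from the maximum likelihood estimate towards the prior mean, and the posterior variance is smaller than the variance from the Max step.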
Brynjólfur Gauti Guðrúnar Jónsson (University of Iceland)
Title: Modeling spatial dependence through latent Gaussian models with spatial copulas
Summary. This research explores an approach to modeling spatial dependence in extreme precipitation data through the integration of latent Gaussian models and spatial copulas. We implement a Matérn-like copula structure within a stepwise inference framework to address the computational challenges inherent in analyzing large spatial datasets.
Our approach combines Generalized Extreme Value (GEV) marginal distributions with a Gaussian copula transform, employing a precision matrix structure with Kronecker products that facilitates efficient computation. The methodology follows a carefully designed sequence: site-wise GEV parameter estimation, copula parameter optimization, joint estimation incorporating data-level spatial dependence, and Bayesian spatial smoothing of parameters.
Initial applications to UKCP precipitation projections across the UK suggest this approach may offer practical advantages in certain contexts, including coherent spatial patterns in parameter estimates and computational feasibility for high-dimensional datasets. We present these preliminary findings as a contribution to the ongoing development of methods for modeling spatial extremes, with potential applications for improving projections of sub-daily extreme precipitation events.
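The combination of GEV margins with a Gaussian copula transform mentioned above rests on mapping each observation to the standard normal scale through its marginal distribution function. The sketch below shows that transform only; it is not the authors' code, the function names are hypothetical, and the parameter and data values are made up for illustration.

```python
import math
from statistics import NormalDist


def gev_cdf(y, mu, sigma, xi):
    """Distribution function of the GEV with location mu, scale sigma, shape xi."""
    t = (y - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-t))  # Gumbel limit as xi -> 0
    base = 1.0 + xi * t
    if base <= 0.0:
        # Outside the support: below it when xi > 0, above it when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-base ** (-1.0 / xi))


def to_gaussian_scale(y, mu, sigma, xi):
    """Map an observation to the standard normal scale via its GEV margin.

    inv_cdf requires a probability strictly between 0 and 1, so y must lie
    strictly inside the support of the GEV.
    """
    u = gev_cdf(y, mu, sigma, xi)
    return NormalDist().inv_cdf(u)


# Made-up marginal parameters and observations for one site.
mu, sigma, xi = 10.0, 2.0, 0.1
observations = [8.0, 10.0, 14.0]
z_values = [to_gaussian_scale(y, mu, sigma, xi) for y in observations]
print(z_values)
```

Once every site is on the standard normal scale, spatial dependence can be described by a Gaussian model for the transformed values, for example one specified through a sparse precision matrix.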