Model solves key challenge in combining mismatched geographic health data

Point observation at one of the vertices of a mesh triangle (a), point observation within one of the mesh triangles (b), and mesh vertices within an areal region used to compute the projection matrix A in the SPDE approach (c). Credit: Stochastic Environmental Research and Risk Assessment (2025). DOI: 10.1007/s00477-025-02927-z

Combining data across mismatched maps is a key challenge in global health and environmental research. A new modeling approach enables faster and more accurate integration of spatially misaligned datasets for applications such as air pollution prediction and disease mapping. The study is published in the journal Stochastic Environmental Research and Risk Assessment.

Datasets describing important socio-environmental factors, such as disease prevalence and pollution, are collected on a variety of spatial scales. These range from point data values for specific locations up to areal or lattice data, where values are aggregated over regions as large as countries.
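
As a rough illustration of how point and areal observations can be tied to the same underlying surface, the sketch below follows the triangle-mesh picture in the figure caption: a point is linked to the three vertices of the triangle that contains it, and a region is linked to the mesh vertices that fall inside it. The function names, the example coordinates, and the equal-weight rule for regions are assumptions made for illustration, not the authors' implementation.

```python
def barycentric_weights(p, tri):
    """Weights of point p on the three vertices of triangle tri (they sum to 1)."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    x, y = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return (w1, w2, 1.0 - w1 - w2)


def areal_weights(vertices, inside):
    """Equal weights on the mesh vertices inside a region (assumed rule)."""
    flags = [inside(v) for v in vertices]
    n = sum(flags)
    if n == 0:
        raise ValueError("no mesh vertices fall inside the region")
    return [1.0 / n if f else 0.0 for f in flags]


# Hypothetical mesh triangle and region, for illustration only.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
at_vertex = barycentric_weights((0.0, 0.0), tri)    # point sits on a vertex
interior = barycentric_weights((0.25, 0.25), tri)   # point inside the triangle
region = areal_weights(tri + [(1.0, 1.0)], lambda v: v[0] < 0.5)
```

Each set of weights forms one row of a projection matrix: multiplying it by the values of the surface at the mesh vertices gives the quantity that a point or areal observation is assumed to measure.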

Merging these geographically inconsistent datasets is a surprisingly difficult technical challenge, one taken up by biostatistician Paula Moraga and her Ph.D. student Hanan Alahmadi at KAUST.

“Our group develops innovative methods for analyzing the geographical and temporal patterns of diseases, quantifying risk factors, and enabling the early detection of disease outbreaks,” says Moraga.

“We need to combine spatial data that are available at different resolutions, such as pollutant concentrations measured at monitoring stations and by satellites, and health data reported at different administrative boundary levels.”

Alahmadi and Moraga developed their new model using a Bayesian approach, which is often used to integrate large spatial datasets. Bayesian inference is usually performed using Markov chain Monte Carlo (MCMC) algorithms, which explore the range of plausible parameter values through a “random walk.”

The algorithms choose each next step based on the previous one, and the accumulated steps gradually approximate a target (or “posterior”) distribution. However, MCMC can take a lot of computational time, so the researchers used a different framework called the Integrated Nested Laplace Approximation (INLA).
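
To make the “random walk” idea concrete, here is a minimal sketch of a random-walk Metropolis sampler, one common MCMC algorithm. It estimates a single unknown mean from a handful of made-up measurements; the data, the prior, and the tuning values are all assumptions chosen for illustration and have nothing to do with the study's models.

```python
import math
import random


def log_posterior(mu, data, prior_sd=10.0):
    """Unnormalised log posterior: Normal(0, prior_sd) prior, unit-variance Normal likelihood."""
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = -0.5 * sum((y - mu) ** 2 for y in data)
    return log_prior + log_lik


def random_walk_metropolis(data, n_steps=20000, step=0.5, seed=0):
    """Return a list of samples of mu drawn by a random-walk Metropolis chain."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_steps):
        # Propose a small random move away from the current position.
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio); otherwise stay put.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


# Made-up measurements, for illustration only.
data = [1.8, 2.1, 2.4, 1.9, 2.3]
samples = random_walk_metropolis(data)
kept = samples[1000:]  # discard early steps taken before the chain settles
posterior_mean = sum(kept) / len(kept)
```

Even this toy problem needs thousands of steps to give a stable answer, which hints at why sampling becomes expensive for large spatial models.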

Malaria prevalence as point (a) and areal (b) data, and altitude (c) in Madagascar. Credit: Stochastic Environmental Research and Risk Assessment (2025). DOI: 10.1007/s00477-025-02927-z

“Unlike MCMC, which relies on sampling, INLA uses deterministic approximations to estimate posterior distributions efficiently,” explains Alahmadi. “This makes INLA significantly faster while still providing accurate results.”
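
The basic building block behind that deterministic approach is the Laplace approximation: find the peak of the posterior and fit a Gaussian to its curvature there, with no random sampling involved. The sketch below applies it to a single parameter, the log rate of some made-up counts. INLA itself nests such approximations inside a much richer class of models, so this is only an illustration of the core idea; the data, the prior, and the function name are assumptions.

```python
import math


def laplace_approximation(counts, prior_sd=10.0, tol=1e-10, max_iter=50):
    """Gaussian approximation to the posterior of a Poisson log rate theta.

    Assumed model: counts ~ Poisson(exp(theta)), theta ~ Normal(0, prior_sd).
    Returns (mode, sd) of the approximating Gaussian.
    """
    total, n = sum(counts), len(counts)
    prior_prec = 1.0 / prior_sd ** 2
    theta = 0.0
    for _ in range(max_iter):
        # First and second derivatives of the log posterior at theta.
        grad = total - n * math.exp(theta) - prior_prec * theta
        hess = -n * math.exp(theta) - prior_prec
        step = grad / hess
        theta -= step  # Newton update toward the posterior mode
        if abs(step) < tol:
            break
    # Curvature at the mode sets the width of the Gaussian.
    hess = -n * math.exp(theta) - prior_prec
    return theta, math.sqrt(-1.0 / hess)


# Made-up counts, for illustration only.
counts = [3, 5, 4, 6, 2]
mode, sd = laplace_approximation(counts)
```

The whole calculation finishes in a handful of Newton steps, whereas a sampler would need thousands of draws to summarise the same posterior.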

The researchers demonstrated the power of their model by integrating point and areal data in three case studies: the prevalence of malaria in Madagascar, air pollution in the United Kingdom, and lung cancer risk in Alabama, U.S. In all three, the model improved the speed and accuracy of predictions while providing insight into the importance of different spatial scales.

“In general, our model gives more weight to point data because they offer higher spatial precision and are often more reliable for detailed predictions,” says Alahmadi.

“In all studies, point data played a dominant role. However, the influence of areal data was greater in the air pollution study. This is primarily because the air pollution areal data had a finer resolution, which made them more informative and complementary to the point data.”

Overall, the project addresses the increasing need for data analysis tools that support evidence-based decisions in health and environmental policy. For example, if public health officials can quickly assess disease prevalence, then they can work more effectively to allocate resources and intervene in high-risk areas.

The new model could be adapted to capture dynamic changes over space and time and to address biases that may arise due to preferential sampling in certain areas. The researchers plan several other applications of their model, such as using satellite pollution data to estimate disease risks.

“We hope to combine satellite and ground-based temperature data to detect thermal extremes in Mecca, particularly during the Hajj season, where heat stress is a serious public health concern,” says Moraga. “We also intend to monitor air pollutants and track emissions, supporting Saudi Arabia’s journey toward its net-zero goals.”

More information:
Hanan Alahmadi et al, Bayesian modelling for the integration of spatially misaligned health and environmental data, Stochastic Environmental Research and Risk Assessment (2025). DOI: 10.1007/s00477-025-02927-z

Provided by
King Abdullah University of Science and Technology

Citation:
Model solves key challenge in combining mismatched geographic health data (2025, May 27)
retrieved 28 May 2025
from https://medicalxpress.com/news/2025-05-key-combining-mismatched-geographic-health.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
