Gosoniu, Laura. Development of Bayesian geostatistical models with applications in malaria epidemiology. 2008, Doctoral Thesis, University of Basel, Faculty of Science.
Official URL: http://edoc.unibas.ch/diss/DissB_8628
Abstract
Plasmodium falciparum malaria is a leading infectious disease and a major cause of morbidity
and mortality in large areas of the developing world, especially Africa. Accurate
estimates of the burden of the disease are useful for planning and implementing malaria
control interventions and for monitoring the impact of prevention and control activities.
Information on the population at risk of malaria can be compared to existing levels of
service provision to identify underserved populations and to target interventions to high
priority areas. The currently available statistics on malaria burden are not reliable because
of the poor malaria case reporting systems in most African countries and the lack of nationally
representative malaria surveys. Accurate maps of malaria distribution together with
human population totals are valuable tools for generating valid estimates of population at
risk.
Empirical mapping of the geographical patterns of malaria transmission in Africa requires
field survey data on prevalence of infection. The Mapping Malaria Risk in Africa (MARA)
database is the most comprehensive collection of malariological survey data across sub-Saharan
Africa. Transmission of malaria is environmentally driven because it depends on the distribution
and abundance of mosquitoes, which are sensitive to environmental and climatic
conditions. By estimating the environment-disease relation, the burden of malaria can be
predicted at places where data on transmission are not available. Malaria data collected
at fixed locations over a continuous study area (geostatistical data) are correlated in space
because common exposures of the disease influence malaria transmission similarly in neighboring
areas. Geostatistical models take into account spatial correlation by introducing
location-specific random effects. Geographical dependence is considered as a function of
the distance between locations. These models are highly parametrized. State-of-the-art
Bayesian computation implemented via Markov chain Monte Carlo (MCMC) simulation
methods enables model fit. A common assumption in geostatistical modeling of malaria data is
stationarity, that is, the spatial correlation is a function of the distance between
locations and not of the locations themselves. This hypothesis does not always hold, especially
when modeling malaria over large areas, hence geostatistical models that take
into account non-stationarity need to be assessed. Fitting geostatistical models requires
repeated inversions of the variance-covariance matrix modeling geographical dependence.
For a very large number of data locations, matrix inversion is considered infeasible. Methods
for optimizing this computation are needed. In addition, the relation between environmental
factors and malaria risk is often not linear and parametric functions may not be able
to capture the shape of the relationship. Nonparametric geostatistical regression models
that allow the data to determine the form of the environment-malaria relation need to be
further developed and applied in malaria mapping.
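The stationarity assumption described above can be illustrated with a minimal sketch. It assumes an exponential covariance function, which is a common choice but is not stated in this abstract; the function name and the parameters `sigma2` (spatial variance) and `rho` (decay rate) are hypothetical.

```python
import numpy as np

def exponential_covariance(coords, sigma2, rho):
    """Stationary covariance: entries depend only on pairwise distance.

    The exponential form sigma2 * exp(-rho * d) is an assumption made
    for illustration, not a form confirmed by the abstract.
    """
    # Pairwise Euclidean distances between all locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return sigma2 * np.exp(-rho * dist)

# Three illustrative locations (made-up coordinates).
coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
cov = exponential_covariance(coords, sigma2=2.0, rho=0.1)

# Shifting every location by the same offset leaves the matrix unchanged,
# which is what stationarity means here.
cov_shifted = exponential_covariance(coords + 10.0, sigma2=2.0, rho=0.1)
```

Model fitting repeatedly inverts a matrix of this kind, with one row and column per data location, which is why the cost grows quickly with the number of locations.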
The aim of this thesis was to develop appropriate models for non-stationary and large geostatistical
data that can be applied in the field of malaria epidemiology to produce accurate
maps of malaria distribution. The main contributions of this thesis are the development
of methods for: (i) analyzing non-stationary malaria survey data; (ii) modeling the non-linear
relation between malaria risk and environmental/climatic conditions; (iii) modeling
geostatistical mortality data collected at a very large number of locations; and (iv) adjusting
for seasonality and age in mapping heterogeneous malaria survey data.
Chapter 2 assessed the spatial effect of bednets on all-cause child mortality by analyzing
data from a large follow-up study in an area of high perennial malaria transmission in
Kilombero Valley, southern Tanzania. The results indicated a lack of community effect
of bednet density, possibly because of the homogeneous net coverage
and the small proportion of re-treated nets in the study area. The mortality data of this
application were collected at 7,403 locations. To avoid inverting large matrices, a
Bayesian geostatistical model was developed. This model estimates the spatial process from
a subset of locations and approximates the location-specific random effects by a weighted
sum of the subset of location-specific random effects with the weights inversely proportional
to the separation distance.
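The approximation described above can be sketched as follows. This is only an illustration under stated assumptions: the weights are normalized to sum to one for each location, a small constant `eps` guards against division by zero, and the names `inverse_distance_weights`, `knots` and `knot_effects` are hypothetical. The thesis may define the weights differently.

```python
import numpy as np

def inverse_distance_weights(locations, knots, eps=1e-9):
    """Weights inversely proportional to the location-to-knot distance.

    Row normalization and the eps guard are assumptions for illustration.
    """
    diff = locations[:, None, :] - knots[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = 1.0 / (dist + eps)
    return w / w.sum(axis=1, keepdims=True)

# Made-up data locations and a smaller subset of locations ("knots").
locations = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
knots = np.array([[0.0, 0.0], [4.0, 0.0]])

# Random effects at the subset locations (fixed values for the example;
# in a fitted model these would be sampled, e.g. by MCMC).
knot_effects = np.array([1.0, -1.0])

weights = inverse_distance_weights(locations, knots)
approx_effects = weights @ knot_effects
```

With this construction only the covariance matrix of the subset needs to be inverted, so the cost depends on the number of subset locations rather than on all data locations.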
In Chapter 3 a Bayesian non-stationary model was developed by partitioning the study
region into fixed subregions, assuming a separate stationary spatial process in each tile and
taking into account between-tile correlation. This methodology was applied to malaria
survey data extracted from the MARA database and produced parasitaemia risk maps for
Mali. The predictive ability of the non-stationary model was compared with that of the stationary
analogue and the results showed that the stationarity assumption influenced the significance
of environmental predictors as well as the estimation of the spatial parameters. This
indicates that the assumptions about the spatial process play an important role in inference.
Model validation showed that the non-stationary model had better predictive ability. In
addition, expert opinion suggested that the parasitaemia risk map based on the non-stationary
model better reflects the malaria situation in Mali. This work revealed that
non-stationarity is an essential characteristic which should be considered when mapping
malaria.
Chapter 4 employed the above non-stationary model to produce maps of malaria risk in
West Africa, using the four agro-ecological zones that partition the region as fixed tiles.
Non-linearity in the relation between parasitaemia risk and environmental conditions
was assessed and addressed via P-splines within a Bayesian geostatistical model formulation.
The model allowed a separate malaria-environment relation in each zone.
Discontinuities at the borders between the zones were avoided because the spatial correlation
was modeled by a mixture of spatial processes over the entire study area, with the weights
chosen to be exponential functions of the distance between the locations and the centers
of the zones corresponding to each of the spatial processes.
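The mixture of spatial processes described above can be sketched as follows. The decaying form `exp(-decay * distance)`, the normalization of the weights, and the names `mixture_weights`, `centers`, `decay` and `zone_processes` are assumptions made for illustration; the abstract only states that the weights are exponential functions of the distance to the zone centers.

```python
import numpy as np

def mixture_weights(locations, centers, decay):
    """Weights that decay exponentially with distance to each zone center.

    The decay parameter and the row normalization are illustrative
    assumptions, not details taken from the thesis.
    """
    diff = locations[:, None, :] - centers[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = np.exp(-decay * dist)
    return w / w.sum(axis=1, keepdims=True)

# Two made-up zone centers and three locations along the line between them.
centers = np.array([[0.0, 0.0], [10.0, 0.0]])
locations = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])

# Values of each zone's spatial process at every location
# (fixed numbers here; rows are zones, columns are locations).
zone_processes = np.array([[1.0, 1.0, 1.0],
                           [-1.0, -1.0, -1.0]])

weights = mixture_weights(locations, centers, decay=0.5)
# Spatial effect at each location: weighted sum over the zone processes.
combined = (weights.T * zone_processes).sum(axis=0)
```

Because the weights change smoothly with location, the combined effect moves gradually from one zone's process to the other instead of jumping at the border.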
The above modeling approach is suitable for mapping malaria over areas with an obvious
fixed partitioning (e.g. ecological zones). For areas where this is not possible, a non-stationary
model was developed in Chapter 5 by allowing the data to decide on the number
and shape of the tiles and thus to determine the different spatial processes. The partitioning
of the study area was based on random Voronoi tessellations and model parameters were
estimated via reversible jump Markov chain Monte Carlo (RJMCMC) due to the variable
dimension of the parameter space.
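The partitioning step can be illustrated with a minimal sketch: in a Voronoi tessellation, each location belongs to the tile of its nearest generating point. The sketch covers only this assignment for one fixed set of generating points. It does not implement RJMCMC, in which the number and positions of the generating points would themselves be updated. The names `assign_tiles` and `generators` are hypothetical.

```python
import numpy as np

def assign_tiles(locations, generators):
    """Return, for each location, the index of the nearest generating point.

    This is the defining property of a Voronoi tessellation; how the
    generating points are proposed and accepted is not shown here.
    """
    diff = locations[:, None, :] - generators[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist.argmin(axis=1)

# Made-up generating points and survey locations.
generators = np.array([[0.0, 0.0], [10.0, 10.0]])
locations = np.array([[1.0, 1.0], [9.0, 8.0], [4.0, 4.0], [6.0, 7.0]])

tiles = assign_tiles(locations, generators)
```

Adding or removing a generating point changes the number of tiles, and with it the number of tile-specific parameters, which is why the dimension of the parameter space varies.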
In Chapter 6 the feasibility of using the recently developed mathematical malaria transmission
models to adjust for age and seasonality in mapping historical malaria survey data
was investigated. In particular, the transmission model was employed to translate
age-heterogeneous survey data from Mali into a common measure of transmission intensity. A
Bayesian geostatistical model was fitted to the transmission intensity estimates using
a number of environmental/climatic variables as covariates. Bayesian kriging was employed
to produce smooth maps of transmission intensity, which were further converted to
age-specific parasitaemia risk maps. Model validation on a number of test locations showed
that this transmission model-based approach gives better predictions than modeling the prevalence data directly.
This approach was further validated by analyzing the nationally representative malaria
survey data derived from the Malaria Indicator Surveys (MIS) in Zambia. Although
MIS data do not have the same limitations as the historical data, the purpose of the analysis
was to compare the maps obtained by modeling 1) the raw prevalence data directly
and 2) transmission intensity data derived via the transmission model. Both maps predicted
similar patterns of malaria risk; however, the map based on the transmission model
predicted a slightly higher level of endemicity. The use of transmission models in malaria
mapping enables adjusting for seasonality and age dependence of malaria prevalence and
makes it possible to include all available historical data collected from different age groups.
Advisors: | Tanner, Marcel |
---|---|
Committee Members: | Vounatsou, Penelope and Smith, Thomas A. |
Faculties and Departments: | 09 Associated Institutions > Swiss Tropical and Public Health Institute (Swiss TPH) > Former Units within Swiss TPH > Molecular Parasitology and Epidemiology (Beck) |
UniBasel Contributors: | Tanner, Marcel and Vounatsou, Penelope and Smith, Thomas A. |
Item Type: | Thesis |
Thesis Subtype: | Doctoral Thesis |
Thesis no: | 8628 |
Thesis status: | Complete |
Number of Pages: | 142 |
Language: | English |
Last Modified: | 02 Aug 2021 15:06 |
Deposited On: | 12 Jun 2009 09:53 |