Corresponding authors: Esther M. Sundermann (
Academic editor: Matthias Filter
To reduce the burden that zoonotic diseases place on human society, it is important to attribute human illnesses to their sources. One powerful approach in supporting any intervention decision is mathematical modelling. This paper presents a source attribution model for salmonellosis which considers five sources (broilers, laying hens, pigs, turkeys, and unknown) and uses two datasets from Germany collected over two time periods: one from 2004 to 2007 and one from 2010 to 2011. The model uses a Bayesian modelling approach derived from the so-called Hald model and is based on microbial subtyping. In this case,
EMS is funded by the JIP MATRIX within the One Health EJP. One Health EJP has received funding from the European Union Horizon 2020 research and innovation program under grant agreement No 773830. Gathering the data and analyzing it with source attribution models was initially done as a part of the project RESET which was financially supported by the German Federal Ministry of Education and Research (BMBF) through the German Aerospace Center grant number 01Kl1013A‐H.
Zoonotic diseases are a major burden for human society. The burden relates to two categories: 1) human health burden in the form of mortality and morbidity (
To reduce the human cases of zoonoses, it is important to understand the relationship between potential sources and human illness (
One modelling approach for source attribution that is based on microbial subtyping is the Bayesian model. In the context of food safety, the models developed by
The two datasets and the mathematical model by
The model metadata are part of the FSKX-file (see Suppl. material
The model attributes human cases of the zoonotic disease salmonellosis to a certain source (namely, broilers, laying hens, pigs, turkeys, and unknown). It is based on a Bayesian microbial subtyping approach described by
Datasets covering studies on
The first dataset on
The second dataset on
To summarize, the baseline and the monitoring data are comparable, i.e., the data were compiled in a similar way and the intervention measures in the years covered were the same; thus, no significant difference in the data is expected.
Data on human
The presented Bayes data-based (DB) model is a source attribution model that is based on microbial subtyping (
A note about terminology: the terms "subtype" and "type" are used interchangeably.
The so-called Hald model (
where
where
Some authors describe difficulties with the convergence of the Hald model (
This reparameterization can only be done if all serotypes are phage typed. As not all the data of serotypes Enteritidis and Typhimurium considered by
Following the idea of
1. Parameterization of the subtype-dependent parameter
For each source Parameterize If there are no unique types, parameterize all
2. Parameterization of the source-dependent parameter
For each source
This also applies to the case that no source has a unique type.
3. Parameterization of the consumption data
If no consumption data are available, all
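The role of unique types in the parameterization steps above can be illustrated with a short sketch. It is written in Python rather than in the R/OpenBUGS code of the fskx-model, and it rests on an assumption: a "unique type" is taken to be a subtype that was isolated from exactly one source. The function name `find_unique_types` and the example counts are hypothetical and are not taken from the datasets.

```python
from typing import Dict, List


def find_unique_types(counts: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Return, per source, the 1-based indices of subtypes found only in that source.

    `counts` maps a source name to its isolate counts per subtype. All lists
    must have the same length and use the same subtype order.
    Assumption: a unique type is a subtype with isolates in exactly one source.
    """
    sources = list(counts)
    n_types = len(counts[sources[0]])
    if any(len(counts[s]) != n_types for s in sources):
        raise ValueError("all sources need counts for the same subtypes")

    unique: Dict[str, List[int]] = {s: [] for s in sources}
    for i in range(n_types):
        positive = [s for s in sources if counts[s][i] > 0]
        if len(positive) == 1:
            # 1-based index, matching the indexing convention of qfix_ind
            unique[positive[0]].append(i + 1)
    return unique


# Made-up isolate counts for four subtypes, for illustration only
example_counts = {
    "Broilers": [12, 0, 3, 0],
    "Laying hens": [5, 0, 0, 0],
    "Pigs": [0, 7, 2, 0],
    "Turkeys": [0, 0, 0, 4],
}
unique_types = find_unique_types(example_counts)
```

In this made-up example, two sources have no unique type, which corresponds to the situation in which the parameterization falls back to the rule for sources without unique types.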
To estimate unknown parameters, uniform distributions are assumed as prior distributions for
In the model presented in this paper the following prior distributions were assumed:
The limits of the prior distributions were chosen such that they produce complete posterior distributions for both datasets (baseline and monitoring data). Depending on the data, one might have to adjust the limits of the distribution (see Section "The effect of prior distributions on completeness of posterior distributions" for details).
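One simple way to screen for an incomplete posterior distribution is to check how much posterior mass lies directly at the upper limit of the uniform prior. The following Python sketch is a heuristic under stated assumptions and not the procedure used in the fskx-model: the function names, the 5 % edge width, and the 1 % tolerance are hypothetical choices, and the samples are made up.

```python
from typing import Sequence


def mass_near_upper_limit(samples: Sequence[float], lower: float, upper: float,
                          edge_fraction: float = 0.05) -> float:
    """Share of samples that fall into the top `edge_fraction` of [lower, upper]."""
    if not lower < upper:
        raise ValueError("lower must be smaller than upper")
    if not samples:
        raise ValueError("samples must not be empty")
    threshold = upper - edge_fraction * (upper - lower)
    return sum(1 for x in samples if x >= threshold) / len(samples)


def looks_truncated(samples: Sequence[float], lower: float, upper: float,
                    edge_fraction: float = 0.05, max_share: float = 0.01) -> bool:
    """Flag a posterior whose mass piles up at the upper prior limit."""
    return mass_near_upper_limit(samples, lower, upper, edge_fraction) > max_share


# Made-up posterior samples under a uniform prior on [0, 0.2]
well_inside = [0.04 + 0.0001 * k for k in range(200)]   # stays far below 0.2
piled_up = [0.15 + 0.00025 * k for k in range(200)]     # runs into the limit

complete_ok = looks_truncated(well_inside, 0.0, 0.2)
complete_bad = looks_truncated(piled_up, 0.0, 0.2)
```

If such a check flags a parameter, widening the corresponding prior limit and re-running the model is the natural next step; a visual inspection of the posterior density remains advisable.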
In the next section, we describe how to parameterize the model and run model simulations using FSKX format.
All model parameters and their descriptions are presented in Table
The Bayes DB model is implemented in the programming language R (
The fskx-model can be executed, developed further, and easily adapted to new data on the local computer, e.g., using the KNIME extension FSK-Lab (see
In order to execute the model, please register at the
The default simulation runs for 2 minutes 11 seconds on the
The main result is that the existing source attribution model previously published in
To use the model successfully, it is important to know how to set up and run the model as well as how to assess the appropriateness of the results. Since this is a purely technical paper, we present these practical issues at this level of technical detail.
When running our Bayesian model using Markov Chain Monte Carlo (MCMC) methods, we studied three important aspects of model diagnostics. To ensure a high-quality estimation of the unknown parameters, we checked the convergence behaviour of the Markov chains, the completeness of the posterior distributions, and the consistency of the results.
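Convergence of several Markov chains is often summarised with the Gelman-Rubin statistic (potential scale reduction factor). The sketch below is a minimal Python illustration of that statistic, assuming equally long chains after burn-in. It is a commonly used diagnostic and is shown here as an assumption; it is not necessarily the diagnostic applied in this work. The function name `gelman_rubin` and the example chains are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def gelman_rubin(chains: Sequence[Sequence[float]]) -> float:
    """Potential scale reduction factor for one parameter.

    `chains` holds the post-burn-in samples of each Markov chain.
    Values close to 1 suggest that the chains have mixed.
    """
    m = len(chains)
    if m < 2:
        raise ValueError("at least two chains are required")
    n = len(chains[0])
    if n < 2 or any(len(c) != n for c in chains):
        raise ValueError("chains must have equal length of at least 2")

    within = mean(variance(c) for c in chains)            # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])     # B: between-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero")
    pooled = (n - 1) / n * within + between / n            # pooled variance estimate
    return sqrt(pooled / within)


# Made-up chains: two that agree and two that are stuck in different regions
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
separated = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

rhat_agreeing = gelman_rubin(agreeing)
rhat_separated = gelman_rubin(separated)
```

A common rule of thumb treats values above roughly 1.1 as a sign of non-convergence; trace plots should still be inspected alongside such a summary.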
The limits for the uniform distribution have a strong influence on the completeness of the posterior distributions. The limits are incorporated into the OpenBUGS code of the model (see file "BugsModel.txt" in the fskx-model). In the Bayes DB model, the lower limit is 0 and the upper limits are 0.2 for
The choice of the starting values of the Markov chains (also known as initial values) has an impact on the convergence and uncertainty estimates of the model calculation. The model runs with five Markov chains. The default starting values for the five chains are listed in Table
In Parameter scenario 1, the starting points are evenly spaced in the lower fifth of the plane of possible starting values (see Fig.
In Parameter scenario 2, the starting points are concentrated near two points: (0, 0) and (0.18, 0.18) (see Fig.
Finally, in Parameter scenario 3, the starting points cluster near two points: (0, 0.18) and (0.18, 0.18) (see Fig.
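The relation between the vectors aValue and qValue and the starting points of the five chains can be sketched as follows. The Python snippet assumes that the k-th entries of both vectors together form the starting point of chain k; this pairing is an assumption made for illustration, as are the function names and the limits passed to `outside_limits`. The numeric vectors are the default values from the simulation settings.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def pair_starting_points(a_values: Sequence[float],
                         q_values: Sequence[float]) -> List[Point]:
    """Combine the k-th entries of both vectors into the starting point of chain k."""
    if len(a_values) != len(q_values):
        raise ValueError("aValue and qValue must have the same length")
    return list(zip(a_values, q_values))


def outside_limits(points: Sequence[Point], a_upper: float, q_upper: float,
                   lower: float = 0.0) -> List[Point]:
    """Return the starting points that lie outside the given prior limits."""
    return [(a, q) for a, q in points
            if not (lower <= a <= a_upper and lower <= q <= q_upper)]


# Default starting values from the simulation settings
a_value = [0.002, 0.001, 0.19, 0.18, 0.178]
q_value = [0.001, 0.002, 0.199, 0.18, 0.175]

points = pair_starting_points(a_value, q_value)
# Illustrative limits only; the limits actually used are set in "BugsModel.txt"
violations = outside_limits(points, a_upper=0.2, q_upper=0.2)
```

Checking starting points against the prior limits before a run avoids chains that start outside the support of the prior.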
Some authors pointed out that the parameter for consumption data,
Simplifying the Bayes DB model for the baseline data by setting all
For the monitoring data, setting all
One way to interpret the need for enlarging the priors for
Source attribution methods aim to identify and quantify the contribution of different sources to disease burdens like salmonellosis (
The presented results make it possible to analyse the burden attributable to each source and provide a basis for comparing different datasets. Although the baseline and the monitoring data are comparable and no significant difference between the datasets is expected (see Section "Data" for details), the results still provide the basis for such a comparison. There is much more to say about the model and its results, but we focus here on the technical aspects of making the model FSKX compliant and on some of the model mechanics. For a more detailed discussion of the model and its results see
Modelling of source attribution is a powerful approach that can contribute to the reduction of human zoonotic cases, in particular salmonellosis. However, model results are highly sensitive to changes of multiple parameters that can differ for each model. In the presented model, these parameters include the initial values for observed
The implementation of a model in a standardized and annotated exchange format like the FSKX format focuses on long-term usability and understandability of the model. The community as well as the creators benefit from such an approach. One example where a creator developed a model with an FSKX-compliant end product in mind is the work of
Much time-consuming and/or error-prone work can be saved in the future if model development is done with a mind-set of long-term usability, reproducibility, and understandability. The FSKX format enables sharing model code reliably and reproducibly and thus paves the way for successful collaboration and further development of models.
In this work, we demonstrated that it is straightforward to take a Bayesian source attribution model running under R and OpenBUGS originally published in
We would like to acknowledge the original work of Hannah Jabin and Lars Valentin and furthermore thank the colleagues from the Robert Koch Institute (W. Rabsch) and German Federal Institute for Risk Assessment (C. Dorn, A. Schroeter, R. Helmuth, A. Friedrich and I. Szabo) for providing the data on
Esther M. Sundermann: Conceptualization, Data Curation, Project administration, Software, Visualization, Writing - Original Draft, Writing - Review & Editing. Guido Correia Carreira: Conceptualization, Formal analysis, Writing - Original Draft, Visualization, Writing - Review & Editing. Annemarie Käsbohrer: Data Curation, Writing - Review & Editing. The author contributions are taken from
The posterior distributions for the fifth entry in the list of
Starting points for the Markov chains of Parameter scenario 1 and their effects on the convergence behaviour and the model predictions. The starting points are evenly spaced in the lower fifth of the space of possible starting points (see the points in the scatter plot in the upper right corner). With this set of starting points, the Markov chains converge quickly, as can be seen in the four trace plots on the left-hand side, which show how the parameter values that the model estimates change through the iteration steps of the model calculations. Each of the four trace plots corresponds to one model parameter (
Starting points for the Markov chains of Parameter scenario 2 and their effects on the convergence behaviour and the model predictions. The starting points are concentrated near the points (0, 0) and (0.18, 0.18) (see the points in the scatter plot in the upper right corner). With this set of starting points, the Markov chains converge slowly, as can be seen in the four trace plots on the left-hand side, which show how the parameter values that the model estimates change through the iteration steps of the model calculations. Each of the four trace plots corresponds to one model parameter (
Starting points for the Markov chains of Parameter scenario 3 and their effect on the convergence behaviour and the model predictions. The starting points are concentrated near the points (0, 0.18) and (0.18, 0.18) (see the points in the scatter plot in the upper right corner). With this set of starting points the Markov chains do not converge within 30,000 iterations for the parameters
Model-data fit when setting
Bar plot of number of human cases of
Description of the model parameters of the source attribution model. In the row that specifies the source, article always refers to the reference description of
Id | list_sources |
Classification | INPUT |
Name | list_sources |
Description | List all possible sources |
Unit | [] |
Data Type | INTEGER |
Source | Article |
Value | c('Broilers', 'Laying hens', 'Pigs', 'Turkeys') |
Id | qfix_ind |
Classification | INPUT |
Name | qfix_ind |
Description | Indices of the subtype-dependent factor for subtype i (qi), which will be set to fixed values. These are the four values for the human cases concerning the "unique types": S. Virchow, S.E. PT 1, S.T. DT 193, and S. Saintpaul |
Unit | [] |
Data Type | VECTOROFNUMBERS |
Source | Data |
Value | c(63,64,65,66) |
Min Value | 1 |
Max Value | Number of considered subtypes |
Id | input_FileName |
Classification | INPUT |
Name | input_FileName |
Description | Name of the file that contains the analysed data |
Unit | [] |
Data Type | STRING |
Source | Article |
Value | "Table2.csv" |
Id | OpenBUGS_parameter |
Classification | INPUT |
Name | OpenBUGS_parameter |
Description | The values that should be logged while running the OpenBUGS-model |
Unit | [] |
Data Type | STRING |
Source | Article |
Value | c("source", "unknown", "a", "q", "lambdaexp") |
Id | OpenBUGS_niter |
Classification | INPUT |
Name | OpenBUGS_niter |
Description | Number of total iterations per chain used in the OpenBUGS-model |
Unit | [] |
Data Type | INTEGER |
Source | Article |
Value | 30000 |
Min Value | OpenBUGS_nburnin+1 |
Id | OpenBUGS_nburnin |
Classification | INPUT |
Name | OpenBUGS_nburnin |
Description | Length of burn-in, i.e. number of iterations to discard at the beginning. |
Unit | [] |
Data Type | INTEGER |
Source | Article |
Value | 10000 |
Min Value | 1 |
Id | aValue |
Classification | INPUT |
Name | aValue |
Description | Values for the source-dependent factors (ai) that are used to determine initial values for the OpenBUGS model |
Unit | [] |
Data Type | VECTOROFNUMBERS |
Source | Data |
Value | c(0.002,0.001,0.19, 0.18, 0.178) |
Min Value | 0 |
Id | qValue |
Classification | INPUT |
Name | qValue |
Description | Values for the subtype-dependent factors (qi) that are used to determine initial values for the OpenBUGS model |
Unit | [] |
Data Type | VECTOROFNUMBERS |
Source | Data |
Value | c(0.001,0.002, 0.199, 0.18, 0.175) |
Min Value | 0 |
Id | OpenBUGS_model |
Classification | INPUT |
Name | OpenBUGS_model |
Description | The filename of the txt-file that contains the OpenBUGS-model |
Unit | [] |
Data Type | STRING |
Source | The filename is freely chosen. The BUGS-model is described in the reference article. |
Value | "BugsModel.txt" |
Id | mean_res |
Classification | OUTPUT |
Name | mean_res |
Description | Mean number of estimated human salmonellosis cases attributed to the potential sources |
Unit | Cases |
Data Type | VECTOROFNUMBERS |
Min Value | 0 |
Max Value | 1 |
Id | quantil_95 |
Classification | OUTPUT |
Name | quantil_95 |
Description | 95%-quantile of estimated human salmonellosis cases attributed to the potential sources |
Unit | Cases |
Data Type | VECTOROFNUMBERS |
Min Value | 0 |
Max Value | 1 |
Id | quantil_05 |
Classification | OUTPUT |
Name | quantil_05 |
Description | 5%-quantile of estimated human salmonellosis cases attributed to the potential sources |
Unit | Cases |
Data Type | VECTOROFNUMBERS |
Min Value | 0 |
Max Value | 1 |
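The three output parameters can be understood as per-source summaries of the posterior samples. The following Python sketch shows how a mean, a 5 % quantile and a 95 % quantile could be derived from such samples. The function name `summarise` and the sample values are hypothetical, and the quantile method (linear interpolation, Python's "inclusive" method) is an assumption; the fskx-model itself computes these outputs in R.

```python
from statistics import mean, quantiles
from typing import Dict, Sequence


def summarise(samples_by_source: Dict[str, Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Compute mean, 5 % and 95 % quantiles of the posterior samples per source."""
    summary: Dict[str, Dict[str, float]] = {}
    for source, samples in samples_by_source.items():
        # n=20 yields cut points at 5 %, 10 %, ..., 95 %
        cuts = quantiles(samples, n=20, method="inclusive")
        summary[source] = {
            "mean_res": mean(samples),
            "quantil_05": cuts[0],
            "quantil_95": cuts[-1],
        }
    return summary


# Made-up posterior samples for one source, for illustration only
example_samples = {"Broilers": [float(x) for x in range(101)]}
result = summarise(example_samples)
```

Reporting the 5 % and 95 % quantiles together with the mean gives a 90 % credible interval for the number of cases attributed to each source.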
The simulation settings for the source attribution model. The settings specify the parameter names and the values (see Table
|
|
list_sources | c('Broilers', 'Laying hens', 'Pigs', 'Turkeys') |
qfix_ind | c(63,64,65,66) |
input_FileName | "Table2.csv" |
OpenBUGS_parameter | c("source", "unknown", "a", "q", "lambdaexp") |
OpenBUGS_niter | 30000 |
OpenBUGS_nburnin | 10000 |
aValue | c(0.002,0.001,0.19, 0.18, 0.178) |
qValue | c(0.001,0.002, 0.199, 0.18, 0.175) |
OpenBUGS_model | "BugsModel.txt" |
|
|
list_sources | c('Broilers', 'Laying hens', 'Pigs', 'Turkeys') |
qfix_ind | c(30,31,32,33) |
input_FileName | "Table3.csv" |
OpenBUGS_parameter | c("source", "unknown", "a", "q", "lambdaexp") |
OpenBUGS_niter | 30000 |
OpenBUGS_nburnin | 10000 |
aValue | c(0.01, 0.015, 0.099, 0.08,0.02) |
qValue | c(0.001, 0.002, 0.9,0.85, 0.99) |
OpenBUGS_model | "BugsModel.txt" |
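Before a simulation is started with new settings, the Min Value and Max Value constraints from the parameter table can be checked automatically. The Python sketch below illustrates such a check for the first simulation listed above. The function name `validate_settings` is hypothetical, and the number of subtypes (66) is an assumption chosen only so that the example runs; the actual number depends on the input file.

```python
from typing import Dict, List


def validate_settings(settings: Dict[str, object], n_subtypes: int) -> List[str]:
    """Collect violations of the Min/Max constraints listed in the parameter table."""
    problems: List[str] = []
    niter = settings["OpenBUGS_niter"]
    nburnin = settings["OpenBUGS_nburnin"]

    if nburnin < 1:
        problems.append("OpenBUGS_nburnin must be at least 1")
    if niter < nburnin + 1:
        problems.append("OpenBUGS_niter must be at least OpenBUGS_nburnin + 1")
    if any(not 1 <= i <= n_subtypes for i in settings["qfix_ind"]):
        problems.append("qfix_ind must lie between 1 and the number of subtypes")
    for name in ("aValue", "qValue"):
        if any(v < 0 for v in settings[name]):
            problems.append(f"{name} must not contain negative values")
    return problems


# Settings of the first simulation, copied from the table above
first_simulation = {
    "qfix_ind": [63, 64, 65, 66],
    "OpenBUGS_niter": 30000,
    "OpenBUGS_nburnin": 10000,
    "aValue": [0.002, 0.001, 0.19, 0.18, 0.178],
    "qValue": [0.001, 0.002, 0.199, 0.18, 0.175],
}
ASSUMED_N_SUBTYPES = 66  # hypothetical; take the real value from the input data

problems = validate_settings(first_simulation, ASSUMED_N_SUBTYPES)
# Iterations per chain that remain after burn-in, before any thinning
post_burnin_iterations = (first_simulation["OpenBUGS_niter"]
                          - first_simulation["OpenBUGS_nburnin"])
```

An empty list means that none of the listed constraints is violated; it does not guarantee that the chosen values lead to convergence.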
SourceAttributionModel.fskx
fskx-model
File: oo_545354.fskx