Corresponding authors: Esther M. Sundermann (
Academic editor: Matthias Filter
To reduce the burden that zoonotic diseases place on human society, it is important to attribute human illnesses to their sources. Mathematical modelling is a powerful approach for supporting intervention decisions. This paper presents a source attribution model for salmonellosis which considers five sources (broilers, laying hens, pigs, turkeys, and unknown) and uses two datasets from Germany collected over two time periods: one from 2004 to 2007 and one from 2010 to 2011. The model uses a Bayesian modelling approach derived from the so-called Hald model and is based on microbial subtyping. In this case,
EMS is funded by the JIP MATRIX within the One Health EJP. One Health EJP has received funding from the European Union Horizon 2020 research and innovation program under grant agreement No 773830. Gathering the data and analyzing it with source attribution models was initially done as a part of the project RESET which was financially supported by the German Federal Ministry of Education and Research (BMBF) through the German Aerospace Center grant number 01Kl1013A‐H.
Zoonotic diseases are a major burden for human society. The burden relates to two categories: 1) human health burden in the form of mortality and morbidity (
To reduce the human cases of zoonoses, it is important to understand the relationship of potential sources and human illness (
One modelling approach for source attribution that is based on microbial subtyping is the Bayesian model. In the context of food safety, the models developed by
The two datasets and the mathematical model by
The model metadata are part of the FSKX file (see Suppl. material
The model attributes human cases of the zoonotic disease salmonellosis to a certain source (namely, broilers, laying hens, pigs, turkeys, and unknown). It is based on a Bayesian microbial subtyping approach described by
Datasets covering studies on
The first dataset on
The second dataset on
To summarize, the baseline and the monitoring data are comparable: the data were compiled in a similar way and the intervention measures in place during these years were the same. Thus, no significant difference in the data is expected.
Data on human
The presented Bayes data-based (DB) model is a source attribution model that is based on microbial subtyping (
A note about terminology: the terms "subtype" and "type" are used interchangeably.
The so-called Hald model (
where
where
Some authors describe difficulties with the convergence of the Hald model (
This reparameterization can only be done if all serotypes are phage typed. As not all the data of serotypes Enteritidis and Typhimurium considered by
Following the idea of
1. Parameterization of the subtype-dependent parameter
For each source
Parameterize
If there are no unique types, parameterize all
2. Parameterization of the source-dependent parameter
For each source
This also applies to the case that no source has a unique type.
3. Parameterization of the consumption data
If no consumption data are available, all
To estimate unknown parameters, uniform distributions are assumed as prior distributions for
In the model presented in this paper the following prior distributions were assumed:
The limits of the prior distributions were chosen such that they produce complete posterior distributions for both datasets (baseline and monitoring data). Depending on the data, one might have to adjust the limits of the distributions (see Section "The effect of prior distributions on completeness of posterior distributions" for details).
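One way to screen for a posterior that is cut off by its prior is to measure how much posterior mass sits directly at a prior limit. The following Python sketch is an illustration only and is not part of the FSKX model: the function name `mass_near_upper_limit`, the 5% edge width, and the reading of an incomplete posterior as one that piles up at the prior limit are assumptions made here. The limits 0 and 0.2 are used as example values, and the samples are synthetic.

```python
import random


def mass_near_upper_limit(samples, lower, upper, edge_fraction=0.05):
    """Share of samples that fall in the top `edge_fraction` of the prior range."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not lower < upper:
        raise ValueError("lower must be smaller than upper")
    threshold = upper - edge_fraction * (upper - lower)
    return sum(1 for s in samples if s >= threshold) / len(samples)


# Synthetic draws standing in for MCMC output (not real model results).
# Values are clipped to [0, 0.2] to mimic a uniform prior on that range.
rng = random.Random(1)
well_inside = [min(max(rng.gauss(0.05, 0.01), 0.0), 0.2) for _ in range(5000)]
cut_off = [min(max(rng.gauss(0.21, 0.02), 0.0), 0.2) for _ in range(5000)]

print(mass_near_upper_limit(well_inside, 0.0, 0.2))  # close to zero
print(mass_near_upper_limit(cut_off, 0.0, 0.2))      # large share at the limit
```

A large share for a parameter would suggest widening the upper limit of its prior and rerunning the model; the threshold at which a share counts as large is a judgment call and is not specified by the paper.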
In the next section, we describe how to parameterize the model and run model simulations using FSKX format.
All model parameters and their descriptions are presented in Table
The Bayes DB model is implemented in the programming language R (
The FSKX model can be executed, developed further, and easily adapted to new data on the local computer, e.g., using the KNIME extension FSKLab (see
In order to execute the model, please register at the
The default simulation runs for 2 minutes 11 seconds on the
The main result is that the existing source attribution model previously published in
To use the model successfully, it is important to know how to set up and run the model and how to assess the appropriateness of the results. Since this is a purely technical paper, we present these practical issues at this level of technical detail.
When running our Bayesian model using Markov Chain Monte Carlo (MCMC) methods, we studied three important aspects of model diagnostics. To ensure a high-quality estimation of unknown parameters, we checked the following aspects of the MCMC method: the convergence behaviour of the Markov chains, the completeness of the posterior distributions, and the consistency of the results.
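Convergence of several chains is often summarized numerically in addition to inspecting trace plots. The sketch below computes the Gelman-Rubin potential scale reduction factor for one parameter. This diagnostic is swapped in here as a common choice; the text does not state which numerical diagnostic, if any, was used for the Bayes DB model, and the function name `gelman_rubin` and the synthetic chains are assumptions for illustration.

```python
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for equally long chains of one parameter."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    if within == 0:
        raise ValueError("chains are constant; the factor is undefined")
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return (pooled / within) ** 0.5


# Synthetic chains (not model output): five chains, as in the model setup.
rng = random.Random(42)
mixed = [[rng.gauss(0.05, 0.01) for _ in range(2000)] for _ in range(5)]
stuck = [[rng.gauss(0.02 * k, 0.01) for _ in range(2000)] for k in range(5)]

print(round(gelman_rubin(mixed), 3))  # near 1: chains agree
print(round(gelman_rubin(stuck), 3))  # well above 1: chains disagree
```

Values close to 1 indicate that the chains explore the same distribution; values clearly above 1 indicate that the chains have not yet mixed, which corresponds to trace plots that stay apart.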
The limits for the uniform distribution have a strong influence on the completeness of the posterior distributions. The limits are incorporated into the OpenBUGS code of the model (see file "BugsModel.txt" in the FSKX model). In the Bayes DB model, the lower limit is 0 and the upper limits are 0.2 for
The choice of the starting values of the Markov chains (also known as initial values) has an impact on the convergence and uncertainty estimates of the model calculation. The model runs with five Markov chains. The default starting values for the five chains are listed in Table
In Parameter scenario 1, the starting points are evenly spaced in the lower fifth of the plane of possible starting values (see Fig.
In Parameter scenario 2, the starting points are concentrated near two points: (0, 0) and (0.18, 0.18) (see Fig.
Finally, in Parameter scenario 3, the starting points cluster near two points: (0, 0.18) and (0.18, 0.18) (see Fig.
Some authors pointed out that the parameter for consumption data,
Simplifying the Bayes DB model for the baseline data by setting all
For the monitoring data, setting all
One way to interpret the need for enlarging the priors for
Source attribution methods aim to identify and quantify the contribution of different sources to disease burdens like salmonellosis (
The presented results make it possible to analyse the share of the burden attributable to each source and provide a basis for comparing different datasets. The baseline and the monitoring data are comparable, and no significant difference between the datasets is expected (see Section "Data" for details); nevertheless, the results provide a basis for such a comparison. There is much more to say about the model and its results, but we focus here on the technical aspects of making the model FSKX compliant and on some of the model mechanics. For a more detailed discussion of the model and its results see
Modelling of source attribution is a powerful approach that can contribute to the reduction of human zoonotic cases, in particular salmonellosis. However, model results are highly sensitive to changes of multiple parameters that can differ for each model. In the presented model, these parameters include the initial values for observed
The implementation of a model in a standardized and annotated exchange format like the FSKX format supports long-term usability and understandability of the model. The community as well as the creators would benefit from such an approach. One example where a creator developed a model with an FSKX-conformant end product in mind is the work of
Much time-consuming and/or error-prone work can be saved in the future if model development is done with long-term usability, reproducibility, and understandability in mind. The FSKX format enables sharing model code reliably and reproducibly and thus paves the way for successful collaboration and further development of models.
In this work, we demonstrated that it is straightforward to take a Bayesian source attribution model running under R and OpenBUGS originally published in
We would like to acknowledge the original work of Hannah Jabin and Lars Valentin and furthermore thank the colleagues from the Robert Koch Institute (W. Rabsch) and German Federal Institute for Risk Assessment (C. Dorn, A. Schroeter, R. Helmuth, A. Friedrich and I. Szabo) for providing the data on
Esther M. Sundermann: Conceptualization, Data Curation, Project administration, Software, Visualization, Writing - Original Draft, Writing - Review & Editing. Guido Correia Carreira: Conceptualization, Formal analysis, Writing - Original Draft, Visualization, Writing - Review & Editing. Annemarie Käsbohrer: Data Curation, Writing - Review & Editing. The author contributions are taken from
The posterior distributions for the fifth entry in the list of
Starting points for the Markov chains of Parameter scenario 1 and their effects on the convergence behaviour and the model predictions. The starting points are evenly spaced in the lower fifth of the space of possible starting points (see the points in the scatter plot in the upper right corner). With this set of starting points, Markov chains converge quickly, as can be seen in the four trace plots on the left-hand side, which show how the parameter values that the model estimates change through the iteration steps of the model calculations. Each of the four trace plots corresponds to one model parameter (
Starting points for the Markov chains of Parameter scenario 2 and their effects on the convergence behaviour and the model predictions. The starting points are concentrated near the points (0, 0) and (0.18, 0.18) (see the points in the scatter plot in the upper right corner). With this set of starting points, Markov chains converge slowly, as can be seen in the four trace plots on the left-hand side, which show how the parameter values that the model estimates change through the iteration steps of the model calculations. Each of the four trace plots corresponds to one model parameter (
Starting points for the Markov chains of Parameter scenario 3 and their effect on the convergence behaviour and the model predictions. The starting points are concentrated near the points (0, 0.18) and (0.18, 0.18) (see the points in the scatter plot in the upper right corner). With this set of starting points the Markov chains do not converge within 30,000 iterations for the parameters
Model-data fit when setting
Bar plot of number of human cases of
Description of the model parameters of the source attribution model. In the row that specifies the source, article always refers to the reference description of
Id  list_sources 
Classification  INPUT 
Name  list_sources 
Description  List all possible sources 
Unit  [] 
Data Type  INTEGER 
Source  Article 
Value  c('Broilers', 'Laying hens', 'Pigs', 'Turkeys') 
Id  qfix_ind 
Classification  INPUT 
Name  qfix_ind 
Description  Indices of the subtype-dependent factor for subtype i (q_{i}), which will be set to fixed values. These are the four values for the human cases concerning the "unique types": S. Virchow, S.E. PT 1, S.T. DT 193, and S. Saintpaul 
Unit  [] 
Data Type  VECTOROFNUMBERS 
Source  Data 
Value  c(63,64,65,66) 
Min Value  1 
Max Value  Number of considered subtypes 
Id  input_FileName 
Classification  INPUT 
Name  input_FileName 
Description  Name of the file that contains the analysed data 
Unit  [] 
Data Type  STRING 
Source  Article 
Value  "Table2.csv" 
Id  OpenBUGS_parameter 
Classification  INPUT 
Name  OpenBUGS_parameter 
Description  The values that should be logged while running the OpenBUGS model 
Unit  [] 
Data Type  STRING 
Source  Article 
Value  c("source", "unknown", "a", "q", "lambdaexp") 
Id  OpenBUGS_niter 
Classification  INPUT 
Name  OpenBUGS_niter 
Description  Number of total iterations per chain used in the OpenBUGS model 
Unit  [] 
Data Type  INTEGER 
Source  Article 
Value  30000 
Min Value  OpenBUGS_nburnin+1 
Id  OpenBUGS_nburnin 
Classification  INPUT 
Name  OpenBUGS_nburnin 
Description  Length of burn-in, i.e. number of iterations to discard at the beginning. 
Unit  [] 
Data Type  INTEGER 
Source  Article 
Value  10000 
Min Value  1 
Id  aValue 
Classification  INPUT 
Name  aValue 
Description  Values for the source-dependent factors (a_{i}) that are used to determine initial values for the OpenBUGS model 
Unit  [] 
Data Type  VECTOROFNUMBERS 
Source  Data 
Value  c(0.002,0.001,0.19, 0.18, 0.178) 
Min Value  0 
Id  qValue 
Classification  INPUT 
Name  qValue 
Description  Values for the subtype-dependent factors (q_{i}) that are used to determine initial values for the OpenBUGS model 
Unit  [] 
Data Type  VECTOROFNUMBERS 
Source  Data 
Value  c(0.001,0.002, 0.199, 0.18, 0.175) 
Min Value  0 
Id  OpenBUGS_model 
Classification  INPUT 
Name  OpenBUGS_model 
Description  The filename of the txt file that contains the OpenBUGS model 
Unit  [] 
Data Type  STRING 
Source  The filename is freely chosen. The BUGS model is described in the reference article. 
Value  "BugsModel.txt" 
Id  mean_res 
Classification  OUTPUT 
Name  mean_res 
Description  Mean number of estimated human salmonellosis cases attributed to potential sources 
Unit  Cases 
Data Type  VECTOROFNUMBERS 
Min Value  0 
Max Value  1 
Id  quantil_95 
Classification  OUTPUT 
Name  quantil_95 
Description  95% quantile of estimated human salmonellosis cases attributed to the potential sources 
Unit  Cases 
Data Type  VECTOROFNUMBERS 
Min Value  0 
Max Value  1 
Id  quantil_05 
Classification  OUTPUT 
Name  quantil_05 
Description  5% quantile of estimated human salmonellosis cases attributed to the potential sources 
Unit  Cases 
Data Type  VECTOROFNUMBERS 
Min Value  0 
Max Value  1 
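The three outputs listed above (mean_res, quantil_05, quantil_95) are summary statistics of posterior samples. The Python sketch below shows how such summaries could be computed per source. It is an illustration under assumptions, not the R code of the FSKX model: the helper names `quantile` and `summarise`, the linear interpolation rule, and the sample values are all made up for this example.

```python
from statistics import mean


def quantile(values, prob):
    """Quantile by linear interpolation between neighbouring order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    ordered = sorted(values)
    position = prob * (len(ordered) - 1)
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    weight = position - low
    return ordered[low] * (1 - weight) + ordered[high] * weight


def summarise(samples_by_source):
    """Mean, 5% and 95% quantile of the sampled case numbers for each source."""
    return {
        source: {
            "mean_res": mean(samples),
            "quantil_05": quantile(samples, 0.05),
            "quantil_95": quantile(samples, 0.95),
        }
        for source, samples in samples_by_source.items()
    }


# Made-up posterior samples of attributed cases (not results from the paper).
example = {
    "Broilers": [10, 12, 11, 13, 14],
    "Pigs": [40, 38, 45, 41, 39],
}
for source, stats in summarise(example).items():
    print(source, stats)
```

The interval between the 5% and the 95% quantile then serves as a 90% credible interval for the number of cases attributed to each source.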
The simulation settings for the source attribution model. The settings specify the parameter names and the values (see Table


list_sources  c('Broilers', 'Laying hens', 'Pigs', 'Turkeys') 
qfix_ind  c(63,64,65,66) 
input_FileName  "Table2.csv" 
OpenBUGS_parameter  c("source", "unknown", "a", "q", "lambdaexp") 
OpenBUGS_niter  30000 
OpenBUGS_nburnin  10000 
aValue  c(0.002,0.001,0.19, 0.18, 0.178) 
qValue  c(0.001,0.002, 0.199, 0.18, 0.175) 
OpenBUGS_model  "BugsModel.txt" 


list_sources  c('Broilers', 'Laying hens', 'Pigs', 'Turkeys') 
qfix_ind  c(30,31,32,33) 
input_FileName  "Table3.csv" 
OpenBUGS_parameter  c("source", "unknown", "a", "q", "lambdaexp") 
OpenBUGS_niter  30000 
OpenBUGS_nburnin  10000 
aValue  c(0.01, 0.015, 0.099, 0.08,0.02) 
qValue  c(0.001, 0.002, 0.9,0.85, 0.99) 
OpenBUGS_model  "BugsModel.txt" 
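Before a simulation is started, the settings can be checked against the constraints given in the parameter table (minimum values for OpenBUGS_niter, OpenBUGS_nburnin, qfix_ind, aValue and qValue). The Python sketch below is a hedged illustration and not part of the FSKX model. It assumes that entry k of aValue and entry k of qValue together form the starting point of chain k, and that OpenBUGS_niter includes the burn-in iterations; the function names are hypothetical. The upper bound of qfix_ind (the number of considered subtypes) is not checked because it depends on the data file.

```python
def check_settings(settings):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    niter = settings["OpenBUGS_niter"]
    nburnin = settings["OpenBUGS_nburnin"]
    if nburnin < 1:
        problems.append("OpenBUGS_nburnin must be at least 1")
    if niter < nburnin + 1:
        problems.append("OpenBUGS_niter must be at least OpenBUGS_nburnin + 1")
    if any(index < 1 for index in settings["qfix_ind"]):
        problems.append("qfix_ind entries must be at least 1")
    if any(value < 0 for value in settings["aValue"] + settings["qValue"]):
        problems.append("aValue and qValue entries must not be negative")
    if len(settings["aValue"]) != len(settings["qValue"]):
        problems.append("aValue and qValue must have the same length")
    return problems


def starting_points(settings):
    # Assumption: entry k of aValue and entry k of qValue start chain k.
    return list(zip(settings["aValue"], settings["qValue"]))


def retained_samples(settings):
    # Assumption: OpenBUGS_niter includes burn-in; one chain per starting point.
    per_chain = settings["OpenBUGS_niter"] - settings["OpenBUGS_nburnin"]
    return per_chain, per_chain * len(starting_points(settings))


# Values copied from the first block of simulation settings.
first_settings = {
    "qfix_ind": [63, 64, 65, 66],
    "OpenBUGS_niter": 30000,
    "OpenBUGS_nburnin": 10000,
    "aValue": [0.002, 0.001, 0.19, 0.18, 0.178],
    "qValue": [0.001, 0.002, 0.199, 0.18, 0.175],
}

print(check_settings(first_settings))
print(starting_points(first_settings))
print(retained_samples(first_settings))
```

Under these assumptions, 30000 iterations with a burn-in of 10000 leave 20000 retained iterations per chain, i.e. 100000 posterior samples across the five chains.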
SourceAttributionModel.fskx
fskxmodel
File: oo_545354.fskx