Data Model

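The Bayesian confidence propagation neural network (BCPNN) method uses the information component (IC) to measure the association between a vaccine and a symptom. The IC is a pointwise mutual information measure: it compares the observed joint reporting probability with the one expected if the vaccine and the symptom were reported independently.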

Let \(p_i\) be the probability of exposure to a target vaccine \(i\) being reported, \(p_j\) be the probability of the target symptom \(j\) being reported, and \(p_{ij}\) be the joint probability of a report mentioning both the target vaccine \(i\) and the target symptom \(j\). Bate et al. (1998) define the metric \(\text{IC}_{ij}\) as

\[ \text{IC}_{ij} = \log_2\frac{p_{ij}}{p_i p_j}. \]
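As a quick numeric illustration of this definition, the R sketch below evaluates the IC for a set of hypothetical probabilities (the values are made up for illustration and are not taken from VAERS). An IC of zero corresponds to independent reporting, and a positive IC means the pair is reported together more often than independence would predict.

```r
# Hypothetical probabilities, chosen only for illustration
p_i  <- 0.10   # probability that a report mentions vaccine i
p_j  <- 0.05   # probability that a report mentions symptom j
p_ij <- 0.02   # probability that a report mentions both

# Information component: log2 of the observed joint probability
# over the product of the marginal probabilities
ic <- log2(p_ij / (p_i * p_j))
ic  # the pair co-occurs four times as often as under independence

# Under independence p_ij equals p_i * p_j, so the IC is zero
ic_indep <- log2((p_i * p_j) / (p_i * p_j))
ic_indep
```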

Recall the contingency table for target vaccine \(i\) and target symptom \(j\):

Target vaccine	Target symptom	All other symptoms	Total
Yes	\(n_{ij}\)	\(n_i - n_{ij}\)	\(n_i\)
No	\(n_j - n_{ij}\)	\(n - n_i - n_j + n_{ij}\)	\(n - n_i\)
Total	\(n_j\)	\(n - n_j\)	\(n\)
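The table can be rebuilt from the four counts \(n_{ij}\), \(n_i\), \(n_j\), and \(n\). The following sketch does so in base R; `make_table` is a hypothetical helper written for this illustration, and the counts are invented.

```r
# Build the 2 x 2 contingency table from the four counts used in the text.
# `make_table` is a hypothetical helper, not part of any package.
make_table <- function(n_ij, n_i, n_j, n) {
  # The joint count cannot exceed either margin, and the union of
  # vaccine reports and symptom reports cannot exceed the total
  stopifnot(n_ij <= min(n_i, n_j), n_i + n_j - n_ij <= n)
  matrix(
    c(n_ij,       n_i - n_ij,
      n_j - n_ij, n - n_i - n_j + n_ij),
    nrow = 2, byrow = TRUE,
    dimnames = list(
      vaccine = c("Yes", "No"),
      symptom = c("Target", "Other")
    )
  )
}

# Invented counts for illustration
tab <- make_table(n_ij = 20, n_i = 100, n_j = 50, n = 1000)
tab
rowSums(tab)  # n_i and n - n_i
colSums(tab)  # n_j and n - n_j
sum(tab)      # n
```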

Let the cell counts for vaccine-symptom pairs \((i, j)\) be \(n_{ij}\). The BCPNN data model assumes

\[ n_{ij} | p_{ij} \sim \text{Binomial}(n, p_{ij}),\\ p_{ij} \sim \text{Beta}(\alpha_{ij}, \beta_{ij}) \]

where

\[ \alpha_{ij} = 1,\\ \beta_{ij} = \frac{1}{E(p_i | n_i) E(p_j | n_j)} - 1. \]

The row and column totals of the \(2 \times 2\) contingency table, \(n_i\) and \(n_j\), are modeled as:

\[ n_i | p_i \sim \text{Binomial}(n, p_i),\\ n_j | p_j \sim \text{Binomial}(n, p_j) \]

where

\[ p_i \sim \text{Beta}(1, 1),\\ p_j \sim \text{Beta}(1, 1). \]
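Because the Beta prior is conjugate to the binomial likelihood, the posterior distributions of the marginal probabilities follow directly. The step is spelled out here because the posterior means \(E(p_i | n_i)\) and \(E(p_j | n_j)\) enter the prior on \(p_{ij}\):

\[ p_i | n_i \sim \text{Beta}(n_i + 1, n - n_i + 1),\quad E(p_i | n_i) = \frac{n_i + 1}{n + 2}, \]

\[ p_j | n_j \sim \text{Beta}(n_j + 1, n - n_j + 1),\quad E(p_j | n_j) = \frac{n_j + 1}{n + 2}. \]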

The IC estimate is

\[ \widehat{\text{IC}}_{ij} = \log_2 \frac{(n_{ij} + 1) (n + 2)^2}{(n+2)^2 + n(n_i + 1) (n_j + 1)}. \]

The variance estimation is given by

\[ \hat{\sigma}_{ij}^2 = \frac{1}{(\log 2)^2} \left( \frac{n - n_{ij} + \gamma - 1}{(n_{ij} + 1)(n+\gamma+1)} + \frac{n-n_{i} + 1}{(n_i + 1) (n+3)} + \frac{n - n_j + 1}{(n_j + 1)(n+3)} \right) \]

where

\[ \gamma = \frac{(n+2)^2}{(n_i + 1)(n_j + 1)}. \]
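The quantities above can be computed directly from the four counts. The sketch below is one way to do it in base R, under two stated assumptions: the point estimate is taken as the ratio of posterior means, \(E(p_{ij} | n_{ij}) / (E(p_i | n_i) E(p_j | n_j))\), and the prior parameters satisfy \(\alpha_{ij} + \beta_{ij} = \gamma\), as the \(n + \gamma + 1\) term in the variance formula implies. `bcpnn_ic` is a hypothetical helper, not a PhViD function, and the counts are invented.

```r
# Sketch of the IC point estimate and its variance from the four counts.
# `bcpnn_ic` is a hypothetical helper, not a PhViD function.
bcpnn_ic <- function(n_ij, n_i, n_j, n) {
  # Posterior means of the marginal probabilities under Beta(1, 1) priors
  e_i <- (n_i + 1) / (n + 2)
  e_j <- (n_j + 1) / (n + 2)
  # gamma = (n + 2)^2 / ((n_i + 1) * (n_j + 1))
  gamma <- 1 / (e_i * e_j)
  # Posterior mean of p_ij, assuming alpha_ij + beta_ij = gamma
  e_ij <- (n_ij + 1) / (n + gamma)
  # Point estimate: log2 ratio of posterior means
  ic <- log2(e_ij / (e_i * e_j))
  # Variance, term by term as in the formula in the text
  v <- (1 / log(2)^2) * (
    (n - n_ij + gamma - 1) / ((n_ij + 1) * (n + gamma + 1)) +
      (n - n_i + 1) / ((n_i + 1) * (n + 3)) +
      (n - n_j + 1) / ((n_j + 1) * (n + 3))
  )
  c(ic = ic, var = v)
}

# Invented counts: 20 joint reports where about 5 would be expected
res <- bcpnn_ic(n_ij = 20, n_i = 100, n_j = 50, n = 1000)
res

# With no joint reports the estimate is negative
res0 <- bcpnn_ic(n_ij = 0, n_i = 100, n_j = 50, n = 1000)
res0
```

The variance grows as \(n_{ij}\) shrinks, which is why rare pairs with a high point estimate can still have a low posterior lower bound.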

Computation

Load the packages for BCPNN-based signal detection and ranking:


suppressMessages(library("PhViD"))
library("kableExtra")

Load the preprocessed VAERS data and transform it into the analyzable format:


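df_p <- readRDS("data-processed/df_p.rds")
# Keep the first three columns: vaccine label, symptom label, report count
df_p <- df_p[, 1:3]
# Convert to the PhViD input object, applying a marginal count threshold of 10
df_v <- as.PhViD(df_p, MARGIN.THRES = 10)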
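To make the input layout concrete, the sketch below builds a toy data frame in the three-column form (label, label, count) and derives the marginal counts \(n_i\), \(n_j\), and \(n\) in base R. All names and counts are invented, and the single-pass filter only mimics the role that `MARGIN.THRES` is assumed to play; the package may apply its threshold differently.

```r
# Toy input in the assumed three-column layout: label, label, count.
# All names and counts here are made up for illustration.
toy <- data.frame(
  vaccine = c("A", "A", "B", "B", "C"),
  symptom = c("Fever", "Rash", "Fever", "Rash", "Fever"),
  n_ij    = c(12, 3, 40, 25, 2),
  stringsAsFactors = FALSE
)

# Marginal counts per vaccine (n_i), per symptom (n_j), and overall (n)
n_i <- tapply(toy$n_ij, toy$vaccine, sum)
n_j <- tapply(toy$n_ij, toy$symptom, sum)
n   <- sum(toy$n_ij)

# Single-pass filter: keep rows whose vaccine margin and symptom margin
# both reach the threshold (an assumption about what MARGIN.THRES does)
thres <- 10
keep <- as.vector(n_i[toy$vaccine] >= thres) &
  as.vector(n_j[toy$symptom] >= thres)
toy_kept <- toy[keep, ]
toy_kept
```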

Calculate the information component (IC) from the Bayesian neural network model (Bate et al. 1998; Norén et al. 2006), using the 2.5% quantile of the posterior distribution of the IC as the ranking statistic:


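# MIN.n11 = 10: only consider pairs with at least 10 reports
# DECISION = 3: select signals by thresholding the ranking statistic
# RANKSTAT = 2: rank by the 2.5% quantile of the posterior distribution of IC
lst_bcpnn <- BCPNN(df_v, MIN.n11 = 10, DECISION = 3, RANKSTAT = 2)
# Sort signals by the ranking statistic, largest first, and keep the first five columns
df_bcpnn <- lst_bcpnn$SIGNALS[order(lst_bcpnn$SIGNALS$`Q_0.025(log(IC))`, decreasing = TRUE), 1:5]
row.names(df_bcpnn) <- NULL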
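Ranking by a lower posterior quantile rather than by the point estimate penalizes pairs whose IC is uncertain. The sketch below illustrates the idea on invented values, approximating the 2.5% quantile with a normal distribution centered on the IC estimate; this approximation is an assumption made for illustration and is not necessarily how the package computes the quantile.

```r
# Hypothetical IC estimates and standard deviations for three pairs
toy <- data.frame(
  pair = c("A-Fever", "B-Rash", "C-Fever"),
  ic   = c(2.0, 2.5, 0.4),
  sd   = c(0.2, 0.9, 0.1),
  stringsAsFactors = FALSE
)

# Approximate 2.5% posterior quantile, assuming a normal posterior for the IC
toy$q025 <- qnorm(0.025, mean = toy$ic, sd = toy$sd)

# Rank from the largest lower bound to the smallest
ranked <- toy[order(toy$q025, decreasing = TRUE), ]
row.names(ranked) <- NULL
ranked
```

In this toy example the pair with the largest point estimate ("B-Rash") is ranked second, because its wide posterior pulls its lower bound below that of "A-Fever".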

View the top ranked vaccine-adverse event pairs:


head(df_bcpnn) %>% kable() %>% kable_styling()

drug code	event effect	count	expected count	Q_0.025(log(IC))
SMALLPOX (ACAM2000)	Troponin I increased	161	0.9218906	6.122739
SMALLPOX (DRYVAX)	Cow pox	125	0.6461921	5.914686
INFLUENZA (SEASONAL) (FLUBLOK QUADRIVALENT)	Product administered to patient of inappropriate age	141	0.9689921	5.911459
INFLUENZA (SEASONAL) (FLUCELVAX)	Drug administered to patient of inappropriate age	350	4.4240284	5.848871
MENINGOCOCCAL CONJUGATE (MENVEO)	Incorrect product formulation administered	204	2.0229991	5.838318
ROTAVIRUS (ROTASHIELD)	Gastrointestinal haemorrhage	94	0.3269645	5.834576

Bate, Andrew, Marie Lindquist, I Ralph Edwards, Sten Olsson, Roland Orre, Anders Lansner, and R Melhado De Freitas. 1998. “A Bayesian Neural Network Method for Adverse Drug Reaction Signal Generation.” European Journal of Clinical Pharmacology 54 (4): 315–21.

Norén, G Niklas, Andrew Bate, Roland Orre, and I Ralph Edwards. 2006. “Extending the Methods Used to Screen the WHO Drug Safety Database Towards Analysis of Complex Associations and Improved Accuracy for Rare Events.” Statistics in Medicine 25 (21): 3740–57.
