
VPC Plots with BLQ Censoring
Source:vignettes/vpc-obs-quant-blq-cens.Rmd
vpc-obs-quant-blq-cens.Rmd
This vignette will review the excellent functionality in the
vpc()
function from the vpc
package for
appropriate handling of data censored below the lower limit of
quantification (LLOQ) using the lloq
argument.
vpc_plot_exactbins()
takes advantage of this
functionality through the argument loq
, which is also
passed to lloq
in vpc()
.
Let’s get started. First, we will load the required packages.
options(scipen = 999, rmarkdown.html_vignette.check_title = FALSE)
library(pmxhelpr)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2, warn.conflicts = FALSE)
library(Hmisc, warn.conflicts = FALSE)
library(vpc, warn.conflicts = FALSE)
library(mrgsolve, warn.conflicts = FALSE)
library(withr, warn.conflicts = FALSE)
Analysis Dataset
Next let’s explore our input dataset data_sad
. This
dataset was generated via simulation from the mrgsolve
model
internal to the pmxhelpr package. We can take a quick
look at the dataset using glimpse()
from the
dplyr
.
glimpse(data_sad)
#> Rows: 720
#> Columns: 23
#> $ LINE <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,…
#> $ ID <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,…
#> $ TIME <dbl> 0.00, 0.00, 0.48, 0.81, 1.49, 2.11, 3.05, 4.14, 5.14, 7.81, 12…
#> $ NTIME <dbl> 0.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0, 8.0, 12.0, 16.0, …
#> $ NDAY <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 5, 6, 7, 8, 1,…
#> $ DOSE <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10…
#> $ AMT <dbl> NA, 10, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ EVID <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
#> $ ODV <dbl> NA, NA, NA, 2.02, 4.02, 3.50, 7.18, 9.31, 12.46, 13.43, 12.11,…
#> $ LDV <dbl> NA, NA, NA, 0.7031, 1.3913, 1.2528, 1.9713, 2.2311, 2.5225, 2.…
#> $ CMT <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…
#> $ MDV <dbl> 1, NA, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1…
#> $ BLQ <dbl> -1, NA, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, …
#> $ LLOQ <dbl> 1, NA, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ FOOD <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
#> $ SEXF <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ RACE <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1,…
#> $ AGEBL <int> 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25…
#> $ WTBL <dbl> 82.1, 82.1, 82.1, 82.1, 82.1, 82.1, 82.1, 82.1, 82.1, 82.1, 82…
#> $ SCRBL <dbl> 0.87, 0.87, 0.87, 0.87, 0.87, 0.87, 0.87, 0.87, 0.87, 0.87, 0.…
#> $ CRCLBL <dbl> 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 12…
#> $ USUBJID <chr> "STUDYNUM-SITENUM-1", "STUDYNUM-SITENUM-1", "STUDYNUM-SITENUM-…
#> $ PART <chr> "Part 1-SAD", "Part 1-SAD", "Part 1-SAD", "Part 1-SAD", "Part …
Let’s visualize the data. For this visualization, we will leverage
the functionality of plot_dvtime
. First, we will derive a
factor variable from "DOSE"
to pass to the color
aesthetic.
Now let’s plot the data using ODV
colored by
Dose (mg)
and faceted by PART
.
plot_dvtime(plot_data, dv_var = "ODV", col_var = "Dose (mg)",
ylab = "Concentration (ng/mL)", log_y = TRUE) +
facet_wrap(~PART)
The concentration-time profiles increase with dose with some
potential impact of censoring at the LLOQ in the late terminal phase. To
further assess the potential impact of BLQ censoring, we will set the
loq_method = 2
in plot_dvtime
, which imputes
BLQ data to 1/2 x LLOQ. Imputing concentrations below the lower limit of
quantification as 1/2 x LLOQ normalizes the late terminal phase of the
concentration-time profile. This suggests that the artifact in the late
terminal phase is due to censoring of observations below the LLOQ, which
impacts lower doses more than higher doses.
plot_dvtime(plot_data, dv_var = "ODV", col_var = "Dose (mg)", loq_method = 2,
ylab = "Concentration (ng/mL)", log_y = TRUE) +
facet_wrap(~PART)
Indeed, while pharmacokinetic data are generally treated as continuous log-normally distributed data, in reality the distribution is truncated by the lower limit of quantification of the assays used in bioanalysis.
Therefore, whenever evaluating concentration-time plots, we hould
keep in mind the assay limitations and their potential impact on visual
trends. To examine this further, let’s visualize the two different
concentrations variables in the plot_data
object for a
single dose:
-
ODV
: Original dependent variable based on the source data. Data below the LLOQ are censored and set to missingNA
. -
DV
: Derived dependent variable with data below the LLOQ set to0.5 x LLOQ
.
We will focus in on the 100 mg dose, which is included in both Part1-SAD and Part2-FE, and appears to display some of the most visual artifact potentially due to censoring at the LLOQ.
plot_data100 <- plot_data %>%
filter(DOSE == 100) %>%
mutate(DV = ifelse(BLQ == 0, ODV, LLOQ/2))
ggplot(aes(x = TIME, y = ODV), data = plot_data100)+
geom_point(shape = 1, alpha = 0.5)+
stat_summary(fun = "mean", geom = "point",
aes(x = NTIME, y = ODV, col = "DV (>LLOQ)"), inherit.aes = FALSE) +
stat_summary(fun = "mean", geom = "line",
aes(x = NTIME, y = ODV, col = "DV (>LLOQ)"), inherit.aes = FALSE) +
stat_summary(fun = "mean", geom = "point",
aes(x = NTIME, y = DV, col = "DV (>0.5LLOQ)"), inherit.aes = FALSE) +
stat_summary(fun = "mean", geom = "line",
aes(x = NTIME, y = DV, col = "DV (>0.5LLOQ)"), inherit.aes = FALSE) +
geom_hline(yintercept = unique(plot_data$LLOQ), linetype = "dashed")+
scale_x_continuous(breaks = seq(0,168,24))+
scale_y_log10(guide = "axis_logticks")+
scale_color_manual(name = "Data Handling",
breaks = c("DV (>LLOQ)", "DV (>0.5LLOQ)"),
values = c("DV (>LLOQ)" = "#440154FF", "DV (>0.5LLOQ)" = "#21908CFF"))+
facet_wrap(~PART)+
labs(y = "Concentration (ng/mL)", x = "Time Since Dose Administration (hours)")+
theme_bw() +
theme(panel.grid.minor.y = element_blank(),
panel.grid.major.x = element_blank())
We can see in these plots that the central tendency of the late terminal phase is influenced by censoring at the LLOQ. This leads to visual artifact, which may sometimes appear like another log-linear slope. The exact pattern of impact will depend on the dose intensity, the assay sensitivity, and the study sampling schema.
The relative impact of the unobserved portion of the distribution of concentration can be further visualized as histograms faceted by nominal sampling time. These distributions are pooled across both Part 1-SAD and Part 2-FE for the 100 mg in these histograms.
ggplot(aes(x = DV), data = filter(plot_data100, NTIME %in% seq(24,144,24), DOSE == 100))+
geom_histogram()+
geom_vline(xintercept = unique(plot_data$LLOQ), linetype = "dashed", color = "red")+
scale_x_log10()+
facet_wrap(~NTIME)+
labs(y = "Density", x = "Concentration (ng/mL)")+
theme_bw()+
theme(panel.grid.minor = element_blank())
If the study protocol had only included sampling through 72 hours, then the impact of censoring at the assay LLOQ would have been negligible for a single 100 mg dose, regardless of food condition. However, a large portion of the distribution of concentrations is BLQ at later timepoints the one week duration of sampling.
It is expected that the frequency of censored data at the LLOQ will increase with increasing time from dose administration. Additionally, it is expect that the frequency of of censored data below the LLOQ will decrease with increasing dose intensity.
We can visualize this by summarizing the fraction of samples that are
missing due to assay sensitivity (MDV=1
) by timepoint,
dose, and study part.
plot_data_sumblq <- plot_data %>%
group_by(DOSE, `Dose (mg)`,NTIME, PART) %>%
dplyr::summarize(`Fraction BLQ`= mean(MDV)) %>%
ungroup()
ggplot(aes(x = NTIME, y = `Fraction BLQ`, col = `Dose (mg)`), data = plot_data_sumblq)+
geom_line()+
scale_x_continuous(breaks = seq(0,168,24))+
facet_wrap(~PART)+
labs(y = "Fraction of Samples BLQ", x = "Nominal Time Since Dose Administration (hours)")+
theme_bw() +
theme(panel.grid.minor = element_blank())
In summary, the observed patters of BLQ data were in the example dataset are anticipated for an early phase clinical study.
- increasing frequency of BLQ data with increasing time
- decreasing frequency of BLQ data with increasing dose
BLQ data are often ignored (i.e., set as missing; M1 method) in population PK model development. This is unlikely to introduce appreciable bias in model parameter estimation, provided the absolute frequency of BLQ data is low (e.g., <10-20%) and the pattern of BLQ censoring is in line with the expectations by dose and time.
However, just as BLQ censoring introduced some artifact to the exploratory plots of concentration over time, the handling of BLQ data in VPCs may impact the interpretation of these graphical model diagnostics.
VPC Plots Accounting for BLQ Censoring
Lets run a simulation and generate some VPC plots to demonstrate the impact of different approaches to BLQ handling to the visual assessment of the adequacy of model fit and the true underlying PK profile.
model <- model_mread_load("model")
#> Building model_cpp ... done.
simout <- df_mrgsim_replicate(data = data_sad, model = model,replicates = 100,
dv_var = "ODV",
time_vars = c(TIME = "TIME", NTIME = "NTIME"),
output_vars = c(PRED = "PRED", IPRED = "IPRED", DV = "DV"),
num_vars = c("CMT", "BLQ", "LLOQ", "EVID", "MDV", "DOSE", "FOOD"),
char_vars = c("PART"),
obsonly = TRUE)
Let’s start by focusing on evaluating the model fit of the 100 mg dose level we have been exploring across Part1-SAD and Part2-FE.
There are two potential approaches proccessing and plotting the VPC:
-
Exclude BLQ
: remove missing observations (MDV=1
) in both the observed and simulated data -
Censor Observed Quantiles
: set quantiles of the observed data toNA
, if the sample representing that quantile of the observed is BLQ (MDV=1
) in the bin.
Let’s plot using the default arguments to
plot_vpc_exactbins()
. The default behavior is
Exclude BLQ
, which filters out MDV=1
.
vpc_blq_drop <- plot_vpc_exactbins(
sim = sim100,
time_vars = c(TIME = "TIME", NTIME = "NTIME"),
output_vars = c(PRED = "PRED", IPRED = "IPRED", SIMDV = "SIMDV", OBSDV = "OBSDV"),
pi = c(0.05, 0.95),
ci = c(0.05, 0.95),
xlab = "Time (hours)",
ylab = "Concentration (ng/mL)",
strat_var = "PART",
min_bin_count = 4
) +
scale_y_log10(guide = "axis_logticks")
vpc_blq_drop
The observed median appears consistent with the exploratory
concentration-time profiles generated earlier excluding data below the
lower limit of quantification (i.e., plotting
"ODV"
).
Notice that the increasing trend in the observed quantiles at later
timelines coincides with completely overlapping simulated confidence
intervals for all quantiles. This is due to the decreasing sample size
with time due to exclusion of timepoints where MDV=1
.
Let’s instead try specifying a new argument loq
to
plot_vpc_exactbins()
.
loq
is the numeric value of the lower limit of
quantification in the units of the dependent variable. It primarily
serves as an alias for lloq
in vpc::vpc()
,
leveraging BLQ handling functionality already built into that
package.
vpc_blq_qobs_cens <- plot_vpc_exactbins(
sim = sim100,
time_vars = c(TIME = "TIME", NTIME = "NTIME"),
output_vars = c(PRED = "PRED", IPRED = "IPRED", SIMDV = "SIMDV", OBSDV = "OBSDV"),
xlab = "Time (hours)",
ylab = "Concentration (ng/mL)",
strat_var = "PART",
min_bin_count = 4,
loq = 1
) +
scale_y_log10(guide = "axis_logticks")
vpc_blq_qobs_cens
Now we see a red horizontal line depicting the LLOQ (1 ng/mL). Notice
also the increase in the y-axis range, the change in the shape of the
observed quantiles, and the greater resolution in separation between the
confidence intervals.
Specifying loq
assumes that MDV=1
represents only BLQ samples in the input dataset for the plot
(sim
) when EVID=0
. The function sets the
MDV
variable in sim
to zero (0) so that all
timepoints are included.
- In the calculation of the summary statistics of the observed data,
the quantiles are calculated across all data per bin and taken as
NA
when that quantile of observations is belowloq
. - In the calcluation of summary statistics of the simulated data, the simulated confidence intervals are based on all data without any censoring, which is representative of the model-predicted “true” underlying PK profile in the absence of real-world assay limitations.
Specifying loq
is the preferred method for plotting VPC
diagnostics with the pmxhelpr
package. This is not the
default method due to the all-too-common case where the assay LLOQ is
not known to the analyst. Additionally, this method of accounting for
the censoring of data below the LLOQ by specifying the loq
argument should only be applied to Visual Predictive Checks
(pcvpc = FALSE
) and should not be applied to
Prediction-corrected Visual Predictive Checks
(pcvpc = TRUE
).
Ifloq
is specified withpcvpc = TRUE
, an
error message will print from vpc::vpc()
.
vpc_pc <- plot_vpc_exactbins(
sim = simout,
time_vars = c(TIME = "TIME", NTIME = "NTIME"),
output_vars = c(PRED = "PRED", IPRED = "IPRED", SIMDV = "SIMDV", OBSDV = "OBSDV"),
pi = c(0.05, 0.95),
ci = c(0.05, 0.95),
xlab = "Time (hours)",
ylab = "Concentration (ng/mL)",
pcvpc = TRUE,
min_bin_count = 4,
loq = 1
) +
scale_y_log10(guide = "axis_logticks")
#> Joining with `by = join_by(NTIME, CMT)`
#> Joining with `by = join_by(NTIME, CMT)`
#> Prediction-correction cannot be used together with censored data (<LLOQ or
#> >ULOQ). VPC plot will be shown for non-censored data only!
vpc_pc