babelmixr2, nlmixr2 and NONMEM
By Matt Fidler and the nlmixr2 Development Team in babelmixr2
November 11, 2022
I remember attending a virtual ACoP where Tim Waterhouse said “This person is so convincing that the could sell NONMEM to a nlmixr developer”. I was in the wrong meeting so I laughed and connected to the correct meeting.
While he is correct, I don’t really want to purchase a NONMEM license, and I would think that individual pharmacometricians are the same: they don’t want to buy a personal license for the software they use at work (although CROs might be different here).
That being said, I have used NONMEM
long before I helped develop
nlmixr2
, and I’ve always appreciated all that NONMEM
brings to the
pharmacometrics community. I remember when I ran my first NONMEM
model
I was amazed and wondered how it could calculate both individual and
population effects of a complicated system.
I still think NONMEM
has an important role in our pharmacometrics
ecosystem today.
Still, our vision stands:
To develop an R-based open-source nonlinear mixed-effects modeling software that can compete with commercial tools and is suitable for regulatory submissions.
which means it would be really convenient to have an interface to
convert nlmixr2
models to NONMEM
, and other tools, to make everyone’s lives easier.
With this in mind, I am proud to announce the first nlmixr2
to
NONMEM
translator in babelmixr2
.
While this has been done before, the method whereby we are converting between the two is novel and has some surprising advantages.
How to use NONMEM
with nlmixr2
To use NONMEM
in nlmixr, you do not need to change your data or your
nlmixr2
dataset. babelmixr2
will do the heavy lifting here.
Lets take the classic warfarin example to start the comparison with…
The model we use in the nlmixr2
vignettes is:
library(babelmixr2)
# The nonmem translation requires the package pmxTools as well.
# You do not need to load it, simply have it available for use.
pk.turnover.emax3 <- function() {
ini({
tktr <- log(1)
tka <- log(1)
tcl <- log(0.1)
tv <- log(10)
##
eta.ktr ~ 1
eta.ka ~ 1
eta.cl ~ 2
eta.v ~ 1
prop.err <- 0.1
pkadd.err <- 0.1
##
temax <- logit(0.8)
tec50 <- log(0.5)
tkout <- log(0.05)
te0 <- log(100)
##
eta.emax ~ .5
eta.ec50 ~ .5
eta.kout ~ .5
eta.e0 ~ .5
##
pdadd.err <- 10
})
model({
ktr <- exp(tktr + eta.ktr)
ka <- exp(tka + eta.ka)
cl <- exp(tcl + eta.cl)
v <- exp(tv + eta.v)
emax = expit(temax+eta.emax)
ec50 = exp(tec50 + eta.ec50)
kout = exp(tkout + eta.kout)
e0 = exp(te0 + eta.e0)
##
DCP = center/v
PD=1-emax*DCP/(ec50+DCP)
##
effect(0) = e0
kin = e0*kout
##
d/dt(depot) = -ktr * depot
d/dt(gut) = ktr * depot -ka * gut
d/dt(center) = ka * gut - cl / v * center
d/dt(effect) = kin*PD -kout*effect
##
cp = center / v
cp ~ prop(prop.err) + add(pkadd.err)
effect ~ add(pdadd.err) | pca
})
}
Next you have to figure out the command to run NONMEM
(it is often
useful to use the full command path). You can set it in
options("babelmixr2.nonmem"="nmfe743")
or use
nonmemControl(runCommand="nmfe743")
. I prefer the options()
method since you only need to set it once. This could also be a
function if you prefer (but I will not cover using the function here).
Lets assume you have NONMEM
setup appropriately. To run the
nlmixr2
model using NONMEM
you simply can run it directly:
testthat::expect_error(nlmixr(pk.turnover.emax3, nlmixr2data::warfarin, "nonmem",
nonmemControl(readRounding=FALSE, modelName="pk.turnover.emax3")))
##
##
## WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1
##
## (WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
##
##
## 0MINIMIZATION TERMINATED
## DUE TO ROUNDING ERRORS (ERROR=134)
## NO. OF FUNCTION EVALUATIONS USED: 1088
## NO. OF SIG. DIGITS UNREPORTABLE
## 0PARAMETER ESTIMATE IS NEAR ITS BOUNDARY
##
## nonmem model: 'pk.turnover.emax3-nonmem/pk.turnover.emax3.nmctl'
## → terminated with rounding errors, can force nlmixr2/rxode2 to read with nonmemControl(readRounding=TRUE)
## Error : nonmem minimization not successful
Note that a few options you may note in the nonmemControl()
here is
modelName
which helps control the output directory of NONMEM
(if
not specified babelmixr2
tries to guess based on the model name based on the input).
Now if you wanted, you could do the standard approach of changing
sigdig
, sigl
, tol
etc, to get a successful NONMEM
model
convergence, of course that is supported.
One of the other approaches is to ignore the rounding errors that
have occurred and read into nlmixr2
anyway:
# Can still load the model to get information (possibly pipe) and create a new model
f <- nlmixr(pk.turnover.emax3, nlmixr2data::warfarin, "nonmem",
nonmemControl(readRounding=TRUE, modelName="pk.turnover.emax3"))
## → loading into symengine environment...
## → pruning branches (`if`/`else`) of full model...
## ✔ done
## → finding duplicate expressions in EBE model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → optimizing duplicate expressions in EBE model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → compiling EBE model...
## ✔ done
## → Calculating residuals/tables
## ✔ done
## → compress origData in nlmixr2 object, save 27560
## → compress parHist in nlmixr2 object, save 4760
You may see more work happening than you expected to need for an already
completed model. When reading in a NONMEM model, babelmixr2
grabs:
NONMEM
’s objective function valueNONMEM
’s covariance (if available)NONMEM
’s optimization historyNONMEM
’s final parameter estimates (including the ETAs)NONMEM
’sPRED
andIPRED
values (for validation purposes)
These are used to solve the ODEs as if they came from an nlmixr2 optimization procedure.
This means that you can compare the IPRED
and PRED
values of
nlmixr2
/rxode2
and know immediately if your model validates.
This is similar to the procedure Kyle Baron advocates for validating a
NONMEM model against a mrgsolve
model (see
https://mrgsolve.org/blog/posts/2022-05-validate-translation/).
The advantage of this method is that you need to simply write one model to
get a validated roxde2
/nlmixr2
model.
In this case you can see the validation when you print the fit object:
print(f)
## ── nlmixr² nonmem ver 7.4.3 ──
##
## OBJF AIC BIC Log-likelihood Condition Number
## nonmem focei 1326.91 2252.605 2332.025 -1107.302 NA
##
## ── Time (sec $time): ──
##
## setup optimize covariance table other
## elapsed 0.004 0.001 0.001 0.06 6.914
##
## ── Population Parameters ($parFixed or $parFixedDf): ──
##
## Est. Back-transformed BSV(CV% or SD) Shrink(SD)%
## tktr 6.24e-07 1 86.5 59.8%
## tka -3.01e-06 1 86.5 59.8%
## tcl -2 0.135 28.6 1.34%
## tv 2.05 7.78 22.8 6.44%
## prop.err 0.0986 0.0986
## pkadd.err 0.512 0.512
## temax 6.42 0.998 3.00 100.%
## tec50 0.141 1.15 45.0 6.06%
## tkout -2.95 0.0522 9.16 32.4%
## te0 4.57 96.6 5.24 18.1%
## pdadd.err 3.72 3.72
##
## No correlations in between subject variability (BSV) matrix
## Full BSV covariance ($omega) or correlation ($omegaR; diagonals=SDs)
## Distribution stats (mean/skewness/kurtosis/p-value) available in $shrink
## Information about run found ($runInfo):
## • NONMEM terminated due to rounding errors, but reading into nlmixr2/rxode2 anyway
## Censoring ($censInformation): No censoring
## Minimization message ($message):
##
##
## WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1
##
## (WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
##
##
## 0MINIMIZATION TERMINATED
## DUE TO ROUNDING ERRORS (ERROR=134)
## NO. OF FUNCTION EVALUATIONS USED: 1088
## NO. OF SIG. DIGITS UNREPORTABLE
## 0PARAMETER ESTIMATE IS NEAR ITS BOUNDARY
##
## IPRED relative difference compared to Nonmem IPRED: 0%; 95% percentile: (0%,0%); rtol=7.3e-06
## PRED relative difference compared to Nonmem PRED: 0%; 95% percentile: (0%,0%); rtol=6.57e-06
## IPRED absolute difference compared to Nonmem IPRED: atol=7.97e-05; 95% percentile: (2.18e-06, 0.00064)
## PRED absolute difference compared to Nonmem PRED: atol=6.57e-06; 95% percentile: (2.75e-07,0.00337)
## there are solving errors during optimization (see '$prderr')
## nonmem model: 'pk.turnover.emax3-nonmem/pk.turnover.emax3.nmctl'
##
## ── Fit Data (object is a modified tibble): ──
## # A tibble: 483 × 35
## ID TIME CMT DV PRED RES IPRED IRES IWRES eta.ktr eta.ka eta.cl
## <fct> <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 0.5 cp 0 1.16 -1.16 0.444 -0.444 -0.864 -0.506 -0.506 0.699
## 2 1 1 cp 1.9 3.37 -1.47 1.45 0.446 0.840 -0.506 -0.506 0.699
## 3 1 2 cp 3.3 7.51 -4.21 3.96 -0.660 -1.03 -0.506 -0.506 0.699
## # … with 480 more rows, and 23 more variables: eta.v <dbl>, eta.emax <dbl>,
## # eta.ec50 <dbl>, eta.kout <dbl>, eta.e0 <dbl>, cp <dbl>, depot <dbl>,
## # gut <dbl>, center <dbl>, effect <dbl>, ktr <dbl>, ka <dbl>, cl <dbl>,
## # v <dbl>, emax <dbl>, ec50 <dbl>, kout <dbl>, e0 <dbl>, DCP <dbl>, PD <dbl>,
## # kin <dbl>, tad <dbl>, dosenum <dbl>
That is in this case:
IPRED relative difference compared to Nonmem IPRED: 0%; 95% percentile: (0%,0%); rtol=7.3e-06
PRED relative difference compared to Nonmem PRED: 0%; 95% percentile: (0%,0%); rtol=6.57e-06
IPRED absolute difference compared to Nonmem IPRED: atol=7.97e-05; 95% percentile: (2.18e-06, 0.00064)
PRED absolute difference compared to Nonmem PRED: atol=6.57e-06; 95% percentile: (2.75e-07,0.00337)
Which means there are very few differences between the predictions
of rxode2
and NONMEM
, or this model is “validated”.
Since it is a nlmixr2
fit, you can do interesting things with this fit that you couldn’t do in NONMEM
or even in another translator. For example, if you wanted to add a covariance step you can with getVarCov()
:
getVarCov(f)
## → loading into symengine environment...
## → pruning branches (`if`/`else`) of full model...
## ✔ done
## → calculate jacobian
## → calculate sensitivities
## → calculate ∂(f)/∂(η)
## → finding duplicate expressions in inner model...
## → optimizing duplicate expressions in inner model...
## → finding duplicate expressions in EBE model...
## → optimizing duplicate expressions in EBE model...
## → compiling inner model...
## ✔ done
## → finding duplicate expressions in FD model...
## → optimizing duplicate expressions in FD model...
## → compiling EBE model...
## ✔ done
## → compiling events FD model...
## ✔ done
## calculating covariance matrix
## Warning in foceiFitCpp_(.ret): using R matrix to calculate covariance, can check
## sandwich or S matrix with $covRS and $covS
## Warning in foceiFitCpp_(.ret): gradient problems with covariance; see $scaleInfo
## → compress origData in nlmixr2 object, save 27560
## Updated original fit object f
## tktr tka tcl tv temax
## tktr 1.892598e-02 -1.582985e-02 -1.981657e-05 3.266277e-04 0.0015335469
## tka -1.582985e-02 1.888286e-02 -2.652577e-05 3.175901e-04 0.0011916368
## tcl -1.981657e-05 -2.652577e-05 2.505341e-04 1.152329e-05 -0.0008937098
## tv 3.266277e-04 3.175901e-04 1.152329e-05 3.202883e-04 0.0011777851
## temax 1.533547e-03 1.191637e-03 -8.937098e-04 1.177785e-03 7.6242618702
## tec50 1.333488e-04 1.435212e-04 -3.647821e-04 1.262144e-04 0.0490792404
## tkout 1.033562e-04 1.030440e-04 -9.918052e-05 1.201488e-04 -0.0189849996
## te0 1.506058e-05 1.176585e-05 -9.650248e-06 1.229662e-05 -0.0004769028
## tec50 tkout te0
## tktr 0.0001333488 1.033562e-04 1.506058e-05
## tka 0.0001435212 1.030440e-04 1.176585e-05
## tcl -0.0003647821 -9.918052e-05 -9.650248e-06
## tv 0.0001262144 1.201488e-04 1.229662e-05
## temax 0.0490792404 -1.898500e-02 -4.769028e-04
## tec50 0.0018652677 1.582355e-04 -1.380255e-04
## tkout 0.0001582355 6.353965e-04 5.249358e-05
## te0 -0.0001380255 5.249358e-05 8.894088e-05
nlmixr2
is more generous in what constitutes a covariance step. The
r,s
covariance matrix is the “most” successful covariance step for
focei
, but the system will fall back to other methods if necessary.
While this covariance matrix is not r,s
, and should be regarded with
caution, it can still give us some clues on why this things are not working in
NONMEM
.
When examining the fit, you can see the shrinkage is high for temax
, tktr
and tka
, so they could be dropped, makiing things more likely to converge in NONMEM
.
If we use model piping to remove the parameters, the new run will start at the last model’s best estimates (saving a bunch of model development time).
In this case, I specify the output directory pk.turnover.emax4
with
the control and get the following:
f2 <- f %>% model(ktr <- exp(tktr)) %>%
model(ka <- exp(tka)) %>%
model(emax = expit(temax)) %>%
nlmixr(data=nlmixr2data::warfarin, est="nonmem",
control=nonmemControl(readRounding=FALSE,
modelName="pk.turnover.emax4"))
## ! remove between subject variability `eta.ktr`
## ! remove between subject variability `eta.ka`
## ! remove between subject variability `eta.emax`
## → loading into symengine environment...
## → pruning branches (`if`/`else`) of full model...
## ✔ done
## → finding duplicate expressions in EBE model...
## → optimizing duplicate expressions in EBE model...
## → compiling EBE model...
## ✔ done
## → Calculating residuals/tables
## ✔ done
## → compress origData in nlmixr2 object, save 27560
## → compress parHist in nlmixr2 object, save 7448
You can see the NONMEM
run is now successful and validates against
the rxode2
model below:
f2
## ── nlmixr² nonmem ver 7.4.3 ──
##
## OBJF AIC BIC Log-likelihood Condition Number
## nonmem focei 1418.923 2338.618 2405.498 -1153.309 1.852796e+16
##
## ── Time (sec f2$time): ──
##
## setup table compress other
## elapsed 0.003 0.05 0.02 6.347
##
## ── Population Parameters (f2$parFixed or f2$parFixedDf): ──
##
## Est. SE %RSE Back-transformed(95%CI) BSV(CV%)
## tktr 6.24e-07 9.05e-05 1.45e+04 1 (1, 1)
## tka -3.57e-06 0.000153 4.29e+03 1 (1, 1)
## tcl -1.99 0.0639 3.2 0.136 (0.12, 0.154) 27.6
## tv 2.05 2.66 130 7.76 (0.042, 1.44e+03) 23.6
## prop.err 0.161 0.161
## pkadd.err 0.571 0.571
## temax 9.98 4.96 49.7 1 (0.565, 1)
## tec50 0.131 1.61 1.23e+03 1.14 (0.0489, 26.6) 43.6
## tkout -2.96 28.3 954 0.0517 (4.63e-26, 5.77e+22) 8.63
## te0 4.57 0.411 9 96.7 (43.2, 217) 5.19
## pdadd.err 3.59 3.59
## Shrink(SD)%
## tktr
## tka
## tcl 3.19%
## tv 10.7%
## prop.err
## pkadd.err
## temax
## tec50 7.12%
## tkout 33.8%
## te0 17.2%
## pdadd.err
##
## Covariance Type (f2$covMethod): nonmem.r,s
## No correlations in between subject variability (BSV) matrix
## Full BSV covariance (f2$omega) or correlation (f2$omegaR; diagonals=SDs)
## Distribution stats (mean/skewness/kurtosis/p-value) available in f2$shrink
## Censoring (f2$censInformation): No censoring
## Minimization message (f2$message):
##
##
## WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1
##
## (WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
##
##
## 0MINIMIZATION SUCCESSFUL
## HOWEVER, PROBLEMS OCCURRED WITH THE MINIMIZATION.
## REGARD THE RESULTS OF THE ESTIMATION STEP CAREFULLY, AND ACCEPT THEM ONLY
## AFTER CHECKING THAT THE COVARIANCE STEP PRODUCES REASONABLE OUTPUT.
## NO. OF FUNCTION EVALUATIONS USED: 2391
## NO. OF SIG. DIGITS IN FINAL EST.: 4.1
##
## IPRED relative difference compared to Nonmem IPRED: 0%; 95% percentile: (0%,0%); rtol=7.82e-06
## PRED relative difference compared to Nonmem PRED: 0%; 95% percentile: (0%,0%); rtol=7.17e-06
## IPRED absolute difference compared to Nonmem IPRED: atol=7.42e-05; 95% percentile: (2.13e-06, 0.000645)
## PRED absolute difference compared to Nonmem PRED: atol=7.17e-06; 95% percentile: (3.11e-07,0.00342)
## nonmem model: 'pk.turnover.emax4-nonmem/pk.turnover.emax4.nmctl'
##
## ── Fit Data (object f2 is a modified tibble): ──
## # A tibble: 483 × 32
## ID TIME CMT DV PRED RES IPRED IRES IWRES eta.cl eta.v eta.ec50
## <fct> <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 0.5 cp 0 1.16 -1.16 0.920 -0.920 -1.56 0.689 0.228 0.160
## 2 1 1 cp 1.9 3.38 -1.48 2.68 -0.780 -1.09 0.689 0.228 0.160
## 3 1 2 cp 3.3 7.53 -4.23 5.94 -2.64 -2.36 0.689 0.228 0.160
## # … with 480 more rows, and 20 more variables: eta.kout <dbl>, eta.e0 <dbl>,
## # cp <dbl>, depot <dbl>, gut <dbl>, center <dbl>, effect <dbl>, ktr <dbl>,
## # ka <dbl>, cl <dbl>, v <dbl>, emax <dbl>, ec50 <dbl>, kout <dbl>, e0 <dbl>,
## # DCP <dbl>, PD <dbl>, kin <dbl>, tad <dbl>, dosenum <dbl>
One thing to emphasize: unlike other translators, you will know immediately if the translation is off because the model will not validate. Hence you can start this process with confidence - you will know immediately if something is wrong.
Conclusion
The first release of babelmixr2
includes a NONMEM
translation function.
The advantages of this are:
For
nlmixr2
development, we can easily compare toNONMEM
to see how we’re doing with respect to the current gold standard.For people who are using
rxode2
andNONMEM
, writing a model withnlmixr2
syntax and using it to runNONMEM
will let you only write one model, and save you time debugging and coding it yourself.For pharmacometricians using
NONMEM
, you can take an unsuccessfulNONMEM
fit, get information (covariance shrinkage, etc) about the model and you will be able to make informed decisions on how to proceed.
Many of these advantages come from the fact that babelmixr2
leans into supporting nlmixr2
development for those fluent in NONMEM
and
having nlmixr2
available can help pharmacometricans in daily tasks, even when they need to use another tool.
The astute reader will also notice that the full model
runs in nlmixr2
’s focei
without adjustment. I would like to caution
that this doesn’t mean that nlmixr2
’s focei is better: rather, it is
different (as mentioned in a previous blog post). I have seen cases
in the past where something runs better in NONMEM
than nlmixr2
so
comparisons based on a single model should be regarded with caution (I
no longer have these examples available, though, soyou’ll have to take my word for it).
Thanks for reading!
- Posted on:
- November 11, 2022
- Length:
- 14 minute read, 2783 words
- Categories:
- babelmixr2
- Tags:
- new-version NONMEM