nlmixr2 2.0.8 Objectively Surprising
By Matt Fidler and the nlmixr2 Development Team in nlmixr2
October 25, 2022
Last time I blogged promised to talk about a few other things, including:
Likelihood based on each observation (and how to get it)
Standard Errors / Hessians, etc for between subject variabilities or
eta
s (and how to get them)
Hessians for the individual between subject variability is also used for
the focei
calculation. So, if you are impatient, I will give you brief
instructions on where to get each component of the likelihood:
The individual observation’s likelihood contribution is contained in the datasets where the original (left) merged with the fit (right) in any of the following accessor methods:
fit$dataMergeLeft
,fit$dataMergeRight
,fit$dataMergeInner
, orfit$dataMergeFull
. In these dataset an new column is added$nlmixrLlikObs
The individual -Hessian
etas
can be accessed byfit$etaH
orfit$phiH
Other components may be accessed as well:
Syntax | Values returned |
---|---|
$phiH , $etaH |
Hessian matrix |
$phiC , $etaC |
Covariance matrix, standard deviations on diagonals, correlations on off-diagonals |
$phiR , $etaR |
Correlation matrix, standard deviations on diagonals, correlations on off-diagonals |
$phiSE ,
$etaSE |
Standard error of each individual’s eta |
$phiRSE ,
$etaRSE |
Relative standard error of each individual’s eta |
These all require that the cwres
are in the fit because they come from
the focei
calculations (and are also under the focei
assumption).
Objective function Motivating example
I was working with Bill Denney to prepare a recent course that features
babelmixr2
. In this course, you can perform a NCA
analysis (using
PKNCA
), then use these values (and possibly calculate a unit
conversion) to create a initial nlmixr2
PK model. This model has with
NCA
derived initial estimates and ranges (and needed unit conversions
too)!
This is exciting to me, as someone who has been wanting this feature in
nonlinear mixed effects modeling packages like nlmixr2
for quite
awhile.
Two nearly identical models
Still, when testing this we came across the following (possibly) surprising situation:
one.compartment <- function() {
ini({
tka <- 0.45 # Log Ka
tcl <- 1 # Log Cl
tv <- 3.45 # Log V
eta.ka ~ 0.6
eta.cl ~ 0.3
eta.v ~ 0.1
add.sd <- 0.7
})
model({
ka <- exp(tka + eta.ka)
cl <- exp(tcl + eta.cl)
v <- exp(tv + eta.v)
d/dt(depot) = -ka * depot
d/dt(center) = ka * depot - cl / v * center
cp = center / v
cp ~ add(add.sd)
})
}
fit1 <- nlmixr(one.compartment, nlmixr2data::theo_sd,
est="focei", control=list(print=0))
## ℹ parameter labels from comments will be replaced by 'label()'
## → loading into symengine environment...
## → pruning branches (`if`/`else`) of full model...
## ✔ done
## → calculate jacobian
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → calculate sensitivities
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → calculate ∂(f)/∂(η)
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → calculate ∂(R²)/∂(η)
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → finding duplicate expressions in inner model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → optimizing duplicate expressions in inner model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → finding duplicate expressions in EBE model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → optimizing duplicate expressions in EBE model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → compiling inner model...
## → creating rxode2 include directory
## → getting R compile options
## → precompiling headers
## ✔ done
## ✔ done
## → finding duplicate expressions in FD model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → optimizing duplicate expressions in FD model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → compiling EBE model...
## ✔ done
## → compiling events FD model...
## ✔ done
## calculating covariance matrix
## [====|====|====|====|====|====|====|====|====|====] 0:00:01
## done
## → Calculating residuals/tables
## ✔ done
## → compress origData in nlmixr2 object, save 5952
## → compress parHist in nlmixr2 object, save 2400
Now multiply the cp
by 1000 and the observations by 1000 for a nearly
identical model (ie, change the scale for different units)
d2 <- nlmixr2data::theo_sd %>%
mutate(DV=ifelse(AMT==0, DV*1000, DV))
# Use model piping to scale `cp`:
one.compartment %>%
model(cp = 1000*center/v) %>%
ini(add.sd=700)->
m2
## ℹ parameter labels from comments will be replaced by 'label()'
## ℹ change initial estimate of `add.sd` to `700`
# Verify the new model
print(m2)
## ── rxode2-based free-form 2-cmt ODE model ───────────────────────────────────────────────────────────────────────
## ── Initalization: ──
## Fixed Effects ($theta):
## tka tcl tv add.sd
## 0.45 1.00 3.45 700.00
##
## Omega ($omega):
## eta.ka eta.cl eta.v
## eta.ka 0.6 0.0 0.0
## eta.cl 0.0 0.3 0.0
## eta.v 0.0 0.0 0.1
##
## States ($state or $stateDf):
## Compartment Number Compartment Name
## 1 1 depot
## 2 2 center
## ── μ-referencing ($muRefTable): ──
## theta eta level
## 1 tka eta.ka id
## 2 tcl eta.cl id
## 3 tv eta.v id
##
## ── Model (Normalized Syntax): ──
## function() {
## ini({
## tka <- 0.45
## label("Log Ka")
## tcl <- 1
## label("Log Cl")
## tv <- 3.45
## label("Log V")
## add.sd <- c(0, 700)
## eta.ka ~ 0.6
## eta.cl ~ 0.3
## eta.v ~ 0.1
## })
## model({
## ka <- exp(tka + eta.ka)
## cl <- exp(tcl + eta.cl)
## v <- exp(tv + eta.v)
## d/dt(depot) = -ka * depot
## d/dt(center) = ka * depot - cl/v * center
## cp <- 1000 * center/v
## cp ~ add(add.sd)
## })
## }
fit2 <- nlmixr(m2, d2, est="focei", control=list(print=0))
## → loading into symengine environment...
## → pruning branches (`if`/`else`) of full model...
## ✔ done
## → calculate jacobian
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → calculate sensitivities
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → calculate ∂(f)/∂(η)
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → calculate ∂(R²)/∂(η)
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → finding duplicate expressions in inner model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → optimizing duplicate expressions in inner model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → finding duplicate expressions in EBE model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → optimizing duplicate expressions in EBE model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → compiling inner model...
## ✔ done
## → finding duplicate expressions in FD model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → optimizing duplicate expressions in FD model...
## [====|====|====|====|====|====|====|====|====|====] 0:00:00
## → compiling EBE model...
## ✔ done
## → compiling events FD model...
## ✔ done
## calculating covariance matrix
## [====|====|====|====|====|====|====|====|====|====] 0:00:01
## done
## → Calculating residuals/tables
## ✔ done
## → compress origData in nlmixr2 object, save 6400
## → compress parHist in nlmixr2 object, save 1856
Comparing Estimates
As expected the population estimates are similar:
#fit1
print(fixef(fit1))
## tka tcl tv add.sd
## 0.4755982 1.0152032 3.4605765 0.6964329
#fit2
print(fixef(fit2))
## tka tcl tv add.sd
## 0.4632747 1.0116245 3.4602015 695.3004086
Note that the additive error is (unsurprisingly) larger by a factor of about 1000.
Still, the Omega matrices are similar too:
# fit 1
print(fit1$omega)
## eta.ka eta.cl eta.v
## eta.ka 0.3965371 0.00000000 0.0000000
## eta.cl 0.0000000 0.06609421 0.0000000
## eta.v 0.0000000 0.00000000 0.0189034
# fit 2
print(fit2$omega)
## eta.ka eta.cl eta.v
## eta.ka 0.3892106 0.00000000 0.00000000
## eta.cl 0.0000000 0.06839672 0.00000000
## eta.v 0.0000000 0.00000000 0.01885031
And the etas:
# fit 1
print(fit1$etaObf)
## ID eta.ka eta.cl eta.v OBJI
## 1 1 0.07380512 -0.47377507 -0.092726943 12.731100
## 2 2 0.18745323 0.14029818 0.004629602 17.761561
## 3 3 0.36028190 0.02522593 0.052062437 0.241700
## 4 4 -0.29241554 -0.02115808 -0.013641003 10.966127
## 5 5 -0.05365737 -0.15263117 -0.144045845 29.068040
## 6 6 -0.39525871 0.36564639 0.193421905 7.797285
## 7 7 -0.78074888 0.14163632 0.055298747 2.226522
## 8 8 -0.17817254 0.16047222 0.093002288 6.936265
## 9 9 1.34938977 0.04245204 -0.001210923 8.356287
## 10 10 -0.74338444 -0.38309248 -0.172511056 8.096207
## 11 11 0.74253294 0.28298630 0.135626472 3.353714
## 12 12 -0.52921042 -0.12846924 -0.201645041 9.287018
# fit 2
print(fit2$etaObf)
## ID eta.ka eta.cl eta.v OBJI
## 1 1 0.08676139 -0.47321299 -0.091753369 164.5500
## 2 2 0.19859995 0.14435666 0.004612295 169.7950
## 3 3 0.37118976 0.02871478 0.052154278 152.2332
## 4 4 -0.28032986 -0.01807234 -0.013271402 162.9404
## 5 5 -0.04139780 -0.14997041 -0.143548796 181.0793
## 6 6 -0.38511801 0.37180668 0.192607687 159.7349
## 7 7 -0.76890394 0.14568098 0.055362585 154.1730
## 8 8 -0.16701882 0.16475592 0.092908448 158.9137
## 9 9 1.35698124 0.04608685 -0.001213913 160.5138
## 10 10 -0.72976734 -0.38187406 -0.171329853 159.9207
## 11 11 0.75188271 0.28813929 0.135213191 155.3850
## 12 12 -0.51666992 -0.12569554 -0.201079720 161.2189
The ETAs are similar too; You can also see the individual contribution
to the objective functions are quite different (OBJI
). So it should be
no surprise that the objective functions are different:
# fit 1
print(fit1$objf)
## [1] 116.8218
# fit 2
print(fit2$objf)
## [1] 1940.458
What about NONMEM?
You might say, well are these objective functions off? maybe nlmixr2
is broken? (If you see anything surprising of course submit a bug report
if you can).
Well, with the coming babelmixr2
you can run the same models in NONMEM
(with certain caveats we will discuss later), and these objective
functions also are similar NONMEM between nlmixr2
and NONMEM
(which
is unsurprising since we use the NONMEM objective functions in Wang 2007
(1) to validate our likelihood)
This means that nlmixr2
is constitent with NONMEM
and these
objective function differences are due to other factors.
Exploring more with individual observation contribution
One of the new features is the ability to see individual observations
contribution to the likelihood in focei
related methods.
This can help us explore the differences.
In nlmixr2
, you can use the fit$dataMergeInner
to merge the original
data and the fit data. During this merge process it will also add the
column $nlmixrLlikObs
:
dm1 <- fit1$dataMergeInner
dm1ll <- dm1 %>%
select(ID, nlmixrLlikObs) %>%
group_by(ID) %>%
summarize(sllik=sum(nlmixrLlikObs))
dm2 <- fit2$dataMergeInner
dm2ll <- dm2 %>%
group_by(ID) %>%
summarize(sllik=sum(nlmixrLlikObs))
print(dm1ll)
## # A tibble: 12 × 2
## ID sllik
## <fct> <dbl>
## 1 1 0.648
## 2 2 3.24
## 3 3 2.29
## 4 4 0.916
## 5 5 0.126
## 6 6 2.87
## 7 7 1.10
## 8 8 -9.99
## 9 9 -1.94
## 10 10 3.47
## 11 11 -5.27
## 12 12 -0.691
print(dm2ll)
## # A tibble: 12 × 2
## ID sllik
## <fct> <dbl>
## 1 1 -75.3
## 2 2 -72.7
## 3 3 -73.7
## 4 4 -75.1
## 5 5 -75.9
## 6 6 -73.1
## 7 7 -74.9
## 8 8 -86.0
## 9 9 -77.9
## 10 10 -72.5
## 11 11 -81.3
## 12 12 -76.7
# It is clear that there are individual differences in log-likelihood
In the normal (non generalized likelihood) the observation likelihoods are given by \(l_{i, obs}\):
\[l_{i, obs} = -0.5\times\left(\frac{\textsf{IPRED}-\textsf{DV}}{v}\right)^2-0.5*\log(v)\]
Where \(v=\) variance of the estimate at that point. In this case it is \(\textsf{add.sd}^2\)
You can see part of the difference is the relative differences of this term for subjects. Most of this is likely driven by the large (and unsurprising) differences in the variance component.
If you want, you can see which observations give the biggest difference by comparing point by point.
Finishing up the likelihood calculation
A part of the individual Hessians are the other component that is used in the likelihood calculation. With the new tools you can also see what this contribution to each individual’s likelihood is:
hess1 <- merge(fit1$etaObf, dm1ll) %>%
mutate(hessLlik=OBJI-sllik)
hess2 <- merge(fit2$etaObf, dm2ll) %>%
mutate(hessLlik=OBJI-sllik)
print(hess1)
## ID eta.ka eta.cl eta.v OBJI sllik hessLlik
## 1 1 0.07380512 -0.47377507 -0.092726943 12.731100 0.6483445 12.082755
## 2 10 -0.74338444 -0.38309248 -0.172511056 8.096207 3.4738096 4.622398
## 3 11 0.74253294 0.28298630 0.135626472 3.353714 -5.2668018 8.620516
## 4 12 -0.52921042 -0.12846924 -0.201645041 9.287018 -0.6914396 9.978458
## 5 2 0.18745323 0.14029818 0.004629602 17.761561 3.2441449 14.517416
## 6 3 0.36028190 0.02522593 0.052062437 0.241700 2.2925777 -2.050878
## 7 4 -0.29241554 -0.02115808 -0.013641003 10.966127 0.9159529 10.050174
## 8 5 -0.05365737 -0.15263117 -0.144045845 29.068040 0.1263595 28.941680
## 9 6 -0.39525871 0.36564639 0.193421905 7.797285 2.8691600 4.928125
## 10 7 -0.78074888 0.14163632 0.055298747 2.226522 1.1031798 1.123342
## 11 8 -0.17817254 0.16047222 0.093002288 6.936265 -9.9852521 16.921517
## 12 9 1.34938977 0.04245204 -0.001210923 8.356287 -1.9405163 10.296803
print(hess2)
## ID eta.ka eta.cl eta.v OBJI sllik hessLlik
## 1 1 0.08676139 -0.47321299 -0.091753369 164.5500 -75.32152 239.8715
## 2 10 -0.72976734 -0.38187406 -0.171329853 159.9207 -72.50307 232.4237
## 3 11 0.75188271 0.28813929 0.135213191 155.3850 -81.26471 236.6497
## 4 12 -0.51666992 -0.12569554 -0.201079720 161.2189 -76.65736 237.8763
## 5 2 0.19859995 0.14435666 0.004612295 169.7950 -72.74190 242.5369
## 6 3 0.37118976 0.02871478 0.052154278 152.2332 -73.66908 225.9023
## 7 4 -0.28032986 -0.01807234 -0.013271402 162.9404 -75.10030 238.0407
## 8 5 -0.04139780 -0.14997041 -0.143548796 181.0793 -75.85543 256.9347
## 9 6 -0.38511801 0.37180668 0.192607687 159.7349 -73.09561 232.8305
## 10 7 -0.76890394 0.14568098 0.055362585 154.1730 -74.87371 229.0467
## 11 8 -0.16701882 0.16475592 0.092908448 158.9137 -85.99411 244.9078
## 12 9 1.35698124 0.04608685 -0.001213913 160.5138 -77.92497 238.4387
You can see the individual Hessian contribution is actually large in
this particular likelihood as well. (You can explore their difference
more using $etaH
if you wish)
Conclusion
Well that is everything for now. This illustrates a few things:
How to get individual likelihoods
How to split apart the likelihood contribution from the Normal assumption of the observations and the contribution from the hessian. (Note this works with generalized likelihood too)
Where to get standard errors of
etas
I wish I had known where these came from earlier, but I seem to want to know how things work. For a more in-depth reference you could use the paper by Almquist (2) to dig into the full likelihood math.
References
1. Wang Y. Derivation of various NONMEM estimation methods. Journal of Pharmacokinetics and Pharmacodynamics. 2007;34(5):575–93.
2. Almquist J, Leander J, Jirstrand M. Using sensitivity equations for computing gradients of the FOCE and FOCEI approximations to the population likelihood. Journal of Pharmacokinetics and Pharmacodynamics. 2015 Jun;42(3):191–209.
- Posted on:
- October 25, 2022
- Length:
- 11 minute read, 2210 words
- Categories:
- nlmixr2
- Tags:
- new-version