--- title: "Is OwenQ reliable?" author: "Stéphane Laurent" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true number_sections: true vignette: > %\VignetteIndexEntry{Is OwenQ reliable?} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(collapse=TRUE) library(OwenQ) ``` # Purpose The purpose of this vignette is to assess the correctness of the functions of the `OwenQ` package. As said in the main vignette, the fourth Owen cumulative function $O_4$, implemented under the name `powen4` allows to evaluate the power of equivalence tests. This is perhaps the main application of the `OwenQ` package and we will particularly focus on this situation. Our strategy runs as follows: - To test the `OwenT` function, we will compare some values it returns to those returned by other implementations. Note that the other functions of the `OwenQ` package rely on `OwenT` when the number of degrees of freedom is odd. Therefore, by testing these other functions for an odd number of degrees of freedom, we implicitely test the `OwenT` function. - We will compare a couple of values returned by `OwenQ1` and `OwenQ2` to the values obtained by numerical integration in Wolfram|Alpha. - We will test `powen4` by performing some power calculations relying on this function and comparing the results with the ones provided by SAS. We will also perform these power calculations with the help of the `ipowen4` function, which evaluates the $O_4$ function by a powerful numerical integration, and we will compare the results. - These power calculations will be performed for $100$ different scenarios, and we will conclude that `powen4` is reliable for these scenarios. Then, to test the other functions of the `OwenQ` package, we will check that some mathematical relations hold for these $100$ scenarios. The folder `tests/testthat` of the `OwenQ` package contains many comparisons with Wolfram|Alpha. The results we obtain with `OwenQ` are always the same as the ones obtained with Wolfram|Alpha up to a tolerance of at least $10$ decimal digits. # Owen $T$-function (`OwenT`) The Owen $T$-function is implemented under the name `OwenT`. This is a port of the function `owens_t` of the C++ `special_functions` library included in the `boost` set of libraries. Some details about the C++ implementation are [available here](https://www.boost.org/doc/libs/1_77_0/libs/math/doc/html/math_toolkit/owens_t.html). The libraries provided by `boost` are peer-reviewed, and this is enough to trust the reliability of `OwenT`. Nevertheless, below we compare the results of `OwenT` to the six values given in Patefield's article, up to $14$ decimal digits. We have also checked that Wolfram|Alpha gives these values. ```{r} testData <- data.frame( h = c(0.0625, 6.5, 7, 4.78125, 2, 1), a = c(0.25, 0.4375, 0.96875, 0.0625, 0.5, 0.9999975), Patefield = c(3.8911930234701e-02, 2.0005773048508e-11, 6.3990627193899e-13, 1.0632974804687e-07, 8.6250779855215e-03, 6.6741808978229e-02), OwenT = numeric(6) ) for(i in 1:nrow(testData)){ testData$OwenT[i] <- OwenT(testData$h[i], testData$a[i]) } print(testData, digits=14) ``` We get the same results with `OwenT`. # Owen $Q$-functions: comparing a couple of values The two Owen $Q$-functions $Q_1$ and $Q_2$ are defined by: $$ Q_1(\nu, t, \delta, R) = \frac{1}{\Gamma\left(\frac{\nu}{2}\right)2^{\frac12(\nu-2)}} \int_0^R \Phi\left(\frac{tx}{\sqrt{\nu}}-\delta\right) x^{\nu-1} e^{-\frac{x^2}{2}} \mathrm{d}x $$ and $$ Q_2(\nu, t, \delta, R) = \frac{1}{\Gamma\left(\frac{\nu}{2}\right)2^{\frac12(\nu-2)}} \int_R^\infty \Phi\left(\frac{tx}{\sqrt{\nu}}-\delta\right) x^{\nu-1} e^{-\frac{x^2}{2}} \mathrm{d}x. $$ They are implemented in the `OwenQ` package under the respective names `OwenQ1` and `OwenQ2`, for integer values of $\nu$, following Owen's algorithm (1965). In Wolfram|Alpha, these functions are not available, but we can evaluate them by numerical integration. Below we compare two values of `OwenQ1` to the ones returned by Wolfram|Alpha. - $\nu=3$, $t=3$, $\delta=2$, $R=5$ ([link to Wolfram](http://www.wolframalpha.com/input/?i=NIntegrate%5B(1%2BErf%5B(3*x%2FSqrt%5B3%5D-2)%2FSqrt%5B2%5D%5D)*x%5E(3-1)*Exp%5B-x%5E2%2F2%5D,%7Bx,0,5%7D%5D%2F2%2FGamma%5B3%2F2%5D%2F2%5E((3-2)%2F2))) ```{r} # wolfram: Integrate[(1+Erf[(3*x/Sqrt[3]-2)/Sqrt[2]])*x^(3-1)*Exp[-x^2/2],{x,0,5}]/2/Gamma[3/2]/2^((3-2)/2) OwenQ1(3, 3, 2, 5) ``` ![](wolfram_OwenQ1.png) Our value rounded to $6$ digits is the same as the one given by Wolfram. - $\nu=1000$, $t=3$, $\delta=2$, $R=30$ ([link to Wolfram](http://www.wolframalpha.com/input/?i=NIntegrate%5B(1%2BErf%5B(3*x%2FSqrt%5B1000%5D-2)%2FSqrt%5B2%5D%5D)*x%5E(1000-1)*Exp%5B-x%5E2%2F2%5D,%7Bx,0,30%7D%5D%2F2%2FGamma%5B1000%2F2%5D%2F2%5E((1000-2)%2F2))) ```{r} # wolfram: Integrate[(1+Erf[(3*x/Sqrt[1000]-2)/Sqrt[2]])*x^(1000-1)*Exp[-x^2/2],{x,0,30}]/2/Gamma[1000/2]/2^((1000-2)/2) print(OwenQ1(1000, 3, 2, 30), digits=16) ``` ![](wolfram_OwenQ1_1000.png) The two values are the same up to $13$ digits.
Now we compare two values of `OwenQ2` to the ones returned by Wolfram|Alpha. - $\nu=3$, $t=3$, $\delta=2$, $R=5$ ([link to Wolfram](http://www.wolframalpha.com/input/?i=NIntegrate%5B(1%2BErf%5B(3*x%2FSqrt%5B3%5D-2)%2FSqrt%5B2%5D%5D)*x%5E(3-1)*Exp%5B-x%5E2%2F2%5D,%7Bx,5,Infinity%7D%5D%2F2%2FGamma%5B3%2F2%5D%2F2%5E((3-2)%2F2))) ```{r} # wolfram: Integrate[(1+Erf[(3*x/Sqrt[3]-2)/Sqrt[2]])*x^(3-1)*Exp[-x^2/2],{x,5,Infinity}]/2/Gamma[3/2]/2^((3-2)/2) OwenQ2(3, 3, 2, 5) ``` ![](wolfram_OwenQ2.png) The two values are identical. - $\nu=1000$, $t=3$, $\delta=2$, $R=5$ ([link To Wolfram](http://www.wolframalpha.com/input/?i=NIntegrate%5B(1%2BErf%5B(3*x%2FSqrt%5B1000%5D-2)%2FSqrt%5B2%5D%5D)*x%5E(1000-1)*Exp%5B-x%5E2%2F2%5D,%7Bx,5,Infinity%7D%5D%2F2%2FGamma%5B1000%2F2%5D%2F2%5E((1000-2)%2F2))) ```{r} # wolfram: Integrate[(1+Erf[(3*x/Sqrt[1000]-2)/Sqrt[2]])*x^(1000-1)*Exp[-x^2/2],{x,5,Infinity}]/2/Gamma[1000/2]/2^((1000-2)/2) print(OwenQ2(1000, 3, 2, 5), digits=16) ``` ![](wolfram_OwenQ2_1000.png) The two values are identical up to $13$ digits. # Fourth Owen cumulative function $O_4$ (`powen4`) As seen in the other vignette, the fourth Owen cumulative function $O_4$, implemented under the name `powen4`, can be used to evaluate the power of equivalence tests. The `powerTOST` function below returns the power of the equivalence test for a so-called parallel design with equal variances, when considering the alternative hypothesis $H_1\colon\{-\Delta < \delta_0 < \Delta \}$, where $\delta_0$ denotes the difference between the two means. This function takes as arguments the significance level $\alpha$, the difference $\delta_0$ between the two means, the threshold $\Delta$, the common standard deviation $\sigma$ of the two samples, and the two sample sizes $n_1$ and $n_2$. ```{r} powerTOST <- function(alpha, delta0, Delta, sigma, n1, n2, algo=2) { se <- sqrt(1/n1 + 1/n2) * sigma delta1 <- (delta0 + Delta) / se delta2 <- (delta0 - Delta) / se dof <- n1 + n2 - 2 q <- qt(1 - alpha, dof) powen4(dof, q, -q, delta1, delta2, algo=algo) } ``` As you can see, the `powen4` function has an argument `algo`. This is also the case for the other Owen functions, except for `ptOwen`. The `algo` argument can take two values, `1` or `2`. Th default value is `algo=2`, and as we will see later, the evaluation is more reliable with this option. The value of `algo` corresponds to a small difference in the algorithm. With `algo=1`, the evaluation is a bit faster. And we will see that `powerTOST` is reliable for `algo=1` when the value of `n1+n2` is not too large. ## Comparisons with SAS ```{r echo=FALSE} SAS <- structure(list(alpha = c(0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.01, 0.01, 0.01, 0.01, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.05, 0.05, 0.01, 0.01, 0.05, 0.05), delta0 = c(0, 0, 0, 0, 0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0, 0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0, 1, 2, 2.5, 0, 1, 2, 2.5, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 4, 0, 4, 0, 4, 0, 4), Delta = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5), sigma = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 7, 7, 7, 7, 7, 7, 7, 7, 4, 4, 4, 4, 8, 8, 8, 8, 10, 10, 10, 10, 14, 14, 14, 14, 35, 35, 30, 50, 6, 6, 9, 9), n1 = c(10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 50, 50, 50, 50, 10, 10, 10, 10, 100, 100, 100, 100, 100, 100, 100, 100, 185, 185, 185, 185, 185, 185, 185, 185, 250, 250, 250, 250, 500, 500, 500, 500, 600, 600, 600, 600, 1190, 1190, 1190, 1190), n2 = c(10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 10, 15, 20, 25, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 50, 50, 50, 50, 90, 90, 90, 90, 100, 100, 100, 100, 100, 100, 100, 100, 10, 10, 10, 10, 100, 100, 100, 100, 250, 250, 250, 250, 500, 500, 500, 500, 600, 600, 600, 600, 10, 10, 10, 10), powerPASS = c(0.3909, 0.6954, 0.8558, 0.9343, 0.9709, 0.3827, 0.6784, 0.8366, 0.9178, 0.9589, 0.3591, 0.6295, 0.7807, 0.868, 0.9202, 0.3229, 0.5554, 0.6932, 0.7847, 0.8491, 0.2782, 0.4653, 0.5831, 0.6717, 0.7426, 0.2297, 0.3697, 0.4621, 0.5388, 0.606, 0.7037, 0.8576, 0.9232, 0.9545, 0.9709, 0.6865, 0.8385, 0.906, 0.94, 0.9589, 0.637, 0.7826, 0.8544, 0.895, 0.9202, 0.5619, 0.695, 0.7694, 0.8168, 0.8491, 0.4706, 0.5847, 0.6561, 0.7059, 0.7426, 0.3736, 0.4634, 0.5247, 0.5705, 0.606, 0.9624, 0.7985, 0.3433, 0.1529, 0.8179, 0.7035, 0.436, 0.2982, 0.9828, 0.9151, 0.6438, 0.2617, 0.9083, 0.7492, 0.3744, 0.0929, 0.8457, 0.7303, 0.4546, 0.19, 0.9824, 0.9142, 0.6423, 0.261, 0.9671, 0.8452, 0.4616, 0.1129, 0.9714, 0.8552, 0.4733, 0.1161, 0.1179, 0.0169, 0.7856, 0.0268, 0.2343, 0.0277, 0.0836, 0.0315), powerSAS = c(0.39094, 0.69541, 0.8558, 0.93426, 0.97092, 0.38272, 0.67836, 0.83661, 0.91781, 0.95889, 0.35908, 0.62954, 0.78069, 0.86796, 0.92017, 0.32287, 0.5554, 0.69322, 0.78471, 0.84909, 0.2782, 0.46531, 0.58306, 0.67174, 0.7426, 0.2297, 0.36967, 0.46208, 0.53883, 0.606, 0.70373, 0.85764, 0.92324, 0.95452, 0.97092, 0.68648, 0.83846, 0.90604, 0.94003, 0.95889, 0.63703, 0.78255, 0.85439, 0.89501, 0.92017, 0.56192, 0.69503, 0.76944, 0.81676, 0.84909, 0.47061, 0.5847, 0.6561, 0.70594, 0.7426, 0.37365, 0.46343, 0.52475, 0.57052, 0.606, 0.96239, 0.79849, 0.34329, 0.15288, 0.81791, 0.70346, 0.43595, 0.29819, 0.98278, 0.91512, 0.64376, 0.26169, 0.90831, 0.74924, 0.37443, 0.0929, 0.84568, 0.73025, 0.45459, 0.19001, 0.9824, 0.91417, 0.64226, 0.26103, 0.96713, 0.84523, 0.46161, 0.11288, 0.97112, 0.85433, 0.47184, 0.11536, 0.11547, 0.01658, 0.78512, 0.02642, 0.23194, 0.02739, 0.08256, 0.03115 )), .Names = c("alpha", "delta0", "Delta", "sigma", "n1", "n2", "powerPASS", "powerSAS"), class = "data.frame", row.names = c(NA, -100L)) SAS <- SAS[,-7] ``` The table shown below contains $100$ results of power calculations with SAS version 9.4, for a total sample size $n_1+n_2$ going from $20$ to $1200$. The results are rounded to $5$ decimal digits. ```{r, echo=FALSE} knitr::kable(SAS, row.names = TRUE) ``` This table is stored in a dataframe named `SAS`. We compare the SAS values to the ones obtained by our `powerTOST` function. ```{r} power <- numeric(nrow(SAS)) for (i in 1:nrow(SAS)) { power[i] <- powerTOST( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i] ) } ``` Rounding our values to $5$ decimal digits, we find that our results are identical to the SAS results: ```{r} identical(round(power,5), SAS$powerSAS) ``` With the option `algo=1`, the results are identical as well: ```{r} power <- numeric(nrow(SAS)) for (i in 1:nrow(SAS)) { power[i] <- powerTOST( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i], algo = 1 ) } identical(round(power,5), SAS$powerSAS) ``` ## Comparison with numerical integration The `ipowen4` internal function of the `OwenQ` package evaluates the fourth Owen cumulative function $O_4$ by a powerful numerical integration implemented in C++ (with the package `RcppNumerical`). Thus we have two completely different implementations of $O_4$. The `ipowerTOST` function below is obtained by replacing `powen4` with `ipowen4` in `powerTOST`. We will compare the results of `powerTOST` with the ones of `ipowerTOST`. ```{r} ipowerTOST <- function(alpha, delta0, Delta, sigma, n1, n2) { se <- sqrt(1/n1 + 1/n2) * sigma delta1 <- (delta0 + Delta) / se delta2 <- (delta0 - Delta) / se dof <- n1 + n2 - 2 q <- qt(1 - alpha, dof) OwenQ:::ipowen4(dof, q, -q, delta1, delta2) } ``` ### Failures and successes of `powen4` - For $n_1=n_2=1000$, $\alpha=0.05$, $\delta=0$, $\Delta=5$, the `powen4` function with the option `algo=1` abnormally takes some negative values: ```{r, echo=FALSE, fig.width=8, fig.height=4} oldpar <- par(mar=c(4,4,0.4,0.4)) layout(t(c(1,2))) sigma <- seq(65,69,len=100) n1 <- 1000; n2 <- 1000 plot(sigma, powerTOST(0.05, 0, 5, sigma, n1, n2), type="l", lwd=2, xlab=expression(sigma), ylab="power") y <- sapply(sigma, function(sigma) ipowerTOST(0.05, 0, 5, sigma, n1, n2)) lines(sigma, y, col="blue", lwd=2) y <- sapply(sigma, function(sigma) powerTOST(0.05, 0, 5, sigma, n1, n2, algo=1)) lines(sigma, y, col="red", lwd=2) abline(h=0, col="green", lty=2) legend("topright", c("powen4 - 1", "powen4 - 2", "ipowen4"), lty=c(1,1,1), col=c("red", "black", "blue")) sigma <- seq(15,69,len=100) plot(sigma, powerTOST(0.05, 0, 5, sigma, n1, n2), type="l", lwd=2, xlab=expression(sigma), ylab="power") y <- sapply(sigma, function(sigma) ipowerTOST(0.05, 0, 5, sigma, n1, n2)) lines(sigma, y, col="blue", lwd=2) y <- sapply(sigma, function(sigma) powerTOST(0.05, 0, 5, sigma, n1, n2, algo=1)) lines(sigma, y, col="red", lwd=2) abline(h=0, col="green", lty=2) legend("topright", c("powen4 - 1", "powen4 - 2", "ipowen4"), lty=c(1,1,1), col=c("red", "black", "blue")) par(oldpar) ``` Note that the discrepancy between `powen4` and `ipowen4` occurs only for $\sigma > 65$. - For $n_1=n_2=720$, $\alpha=0.05$, $\delta=0$, $\Delta=5$, we observe a discrepancy between `powen4` with the option `algo=1` and `ipowen4`: ```{r, echo=FALSE, fig.width=4, fig.height=4} oldpar <- par(mar=c(4,4,0.4,0.4)) n1 <- n2 <- 720 sigma <- seq(56,57,len=100) plot(sigma, powerTOST(0.05, 0, 5, sigma, n1, n2), type="l", lwd=2, xlab=expression(sigma), ylab="power") y <- sapply(sigma, function(sigma) ipowerTOST(0.05, 0, 5, sigma, n1, n2)) lines(sigma, y, col="blue", lwd=2) y <- sapply(sigma, function(sigma) powerTOST(0.05, 0, 5, sigma, n1, n2, algo=1)) lines(sigma, y, col="red", lwd=2) legend("topright", c("powen4 - 1", "powen4 - 2", "ipowen4"), lty=c(1,1,1), col=c("red", "black", "blue")) par(oldpar) ``` There is no discrepancy between `powen4` with the option `algo=2` and `ipowen4`. The irregularity of `powen4` with the option `algo=1` suggests that it returns wrong values. - For $n_1=n_2=700$, $\alpha=0.01$, $\delta=0$, $\Delta=5$, we still observe a small discrepancy between `powen4` with the option `algo=1` and `ipowen4`, and we still observe an irregularity of `powen4` (on the second figure below): ```{r, echo=FALSE, fig.width=8, fig.height=4} oldpar <- par(mar=c(4, 4, 0.2, 0.2)) layout(t(c(1,2))) n1 <- n2 <- 700 sigma <- seq(35,45,len=100) plot(sigma, powerTOST(0.01, 1, 5, sigma, n1, n2), type="l", lwd=2, xlab=expression(sigma), ylab="power") y <- sapply(sigma, function(sigma) ipowerTOST(0.01, 1, 5, sigma, n1, n2)) lines(sigma, y, col="blue", lwd=2) y <- sapply(sigma, function(sigma) powerTOST(0.05, 0, 5, sigma, n1, n2, algo=1)) lines(sigma, y, col="red", lwd=2) legend("topright", c("powen4 - 1", "powen4 - 2", "ipowen4"), lty=c(1,1,1), col=c("red", "black", "blue")) n1 <- n2 <- 700 sigma <- seq(38.5,39,len=100) plot(sigma, powerTOST(0.01, 1, 5, sigma, n1, n2), type="l", lwd=2, xlab=expression(sigma), ylab="power") y <- sapply(sigma, function(sigma) ipowerTOST(0.01, 1, 5, sigma, n1, n2)) lines(sigma, y, col="blue", lwd=2) y <- sapply(sigma, function(sigma) powerTOST(0.05, 0, 5, sigma, n1, n2, algo=1)) lines(sigma, y, col="red", lwd=2) legend("topright", c("powen4 - 1", "powen4 - 2", "ipowen4"), lty=c(1,1,1), col=c("red", "black", "blue")) par(oldpar) ``` With the option `algo=2`, the results of `powen4` coincide with the ones of `ipowen4`. - For $n_1 = n_2 = 600$, $\alpha=0.005$, $\delta=0$, $\Delta=5$, we do not observe any discrepancy between `powen4` and `ipowen4`: ```{r, echo=FALSE, fig.width=4, fig.height=4} oldpar <- par(mar=c(4, 4, 0.7, 0.2)) n1 <- n2 <- 600 sigma <- seq(30,36,len=100) plot(sigma, powerTOST(0.005, 0, 5, sigma, n1, n2), type="l", lwd=2, xlab=expression(sigma), ylab="power") y <- sapply(sigma, function(sigma) ipowerTOST(0.005, 0, 5, sigma, n1, n2)) lines(sigma, y, pch=19, col="blue", lwd=2) y <- sapply(sigma, function(sigma) powerTOST(0.05, 0, 5, sigma, n1, n2, algo=1)) lines(sigma, y, col="red", lwd=2) legend("topright", c("powen4 - 1", "powen4 - 2", "ipowen4"), lty=c(1,1,1), col=c("red", "black", "blue")) par(oldpar) ``` We conclude that `powen4` with the option `algo=1` is not reliable when the number of degrees of freedom is too large. As said before, the interest of the option `algo=1` is that the evaluation is faster. For $n_1=n_2=2500$, the results of `powerTOST` with `algo=2` still coincide with the result of `ipowerTOST`: ```{r} alpha <- 0.05; delta0 <- 0; Delta <- 5 sigma <- 110 n1 <- n2 <- 2500 powerTOST(alpha, delta0, Delta, sigma, n1, n2) ipowerTOST(alpha, delta0, Delta, sigma, n1, n2) ``` And even for $n_1=n_2=5000$: ```{r} sigma <- 152 n1 <- n2 <- 5000 powerTOST(alpha, delta0, Delta, sigma, n1, n2) ipowerTOST(alpha, delta0, Delta, sigma, n1, n2) ``` ### Comparisons for $n_1+n_2 \leq 1200$ Below we compare the results returned by `powerTOST` to the ones returned by `ipowerTOST` for the parameters given in the `SAS` dataset. ```{r} power <- ipower <- numeric(nrow(SAS)) for (i in 1:nrow(SAS)) { power[i] <- powerTOST( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i] ) ipower[i] <- ipowerTOST( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i] ) } identical(round(power, 10), round(ipower, 10)) ``` Results are the same up to $10$ digits. And also with the option `algo=1`: ```{r} power <- numeric(nrow(SAS)) for (i in 1:nrow(SAS)) { power[i] <- powerTOST( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i], algo = 1 ) } identical(round(power, 10), round(ipower, 10)) ``` ## Conclusion - The `powen4` function with option `algo=1` seems to return correct values for $\nu \leq 1200$. - The `powen4` function with option `algo=2` allows higher values of $\nu$. - In any case, we recommend to compare the result of `powen4` with the one of `ipowen4`. # Other Owen cumulative functions We previously validated the `powen4` function for the $100$ scenarios of the dataset `SAS`. Now we test the functions `powen1`, `powen2` and `powen3` by checking the equality $$ O_1(\nu, t_1, t_2, \delta_1, \delta_2) + O_2(\nu, t_1, t_2, \delta_1, \delta_2) + O_3(\nu, t_1, t_2, \delta_1, \delta_2) + O_4(\nu, t_1, t_2, \delta_1, \delta_2) = 1. $$ We restrict our attention to default setting `algo=2`. We check the above equality for the $100$ scenarios of the dataset `SAS`. ```{r} f <- function(alpha, delta0, Delta, sigma, n1, n2) { se <- sqrt(1/n1 + 1/n2) * sigma delta1 <- (delta0 + Delta) / se delta2 <- (delta0 - Delta) / se dof <- n1 + n2 - 2 q <- qt(1 - alpha, dof) powen1(dof, q,-q, delta1, delta2) + powen2(dof, q,-q, delta1, delta2) + powen3(dof, q,-q, delta1, delta2) + powen4(dof, q,-q, delta1, delta2) } test <- numeric(nrow(SAS)) for (i in 1:nrow(SAS)) { test[i] <- f( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i] ) } all(abs(test-1) < 1e-14) ``` We find that each of the $100$ sums is equal to $1$ up to a tolerance of $14$ digits. # First Owen $Q$-function $Q_1$ (`OwenQ1`) We previously validated the `powen4` function (implementation of the fourth Owen cumulative function $O_4$) for the $100$ scenarios of the dataset `SAS`. Now, to test `OwenQ1`, we will use the following equality: $$ O_4(\nu, t_1, t_2, \delta_1, \delta_2) = Q_1\left(\nu, t_2, \delta_2, \frac{\sqrt{\nu}(\delta_1-\delta_2)}{t_1-t_2}\right) - Q_1\left(\nu, t_1, \delta_1, \frac{\sqrt{\nu}(\delta_1-\delta_2)}{t_1-t_2}\right). $$ We use this formula to perform the power calculations with `OwenQ1` instead of `powen4`. ```{r} powerTOST2 <- function(alpha, delta0, Delta, sigma, n1, n2) { se <- sqrt(1/n1 + 1/n2) * sigma delta1 <- (delta0 + Delta) / se delta2 <- (delta0 - Delta) / se dof <- n1 + n2 - 2 q <- qt(1 - alpha, dof) R <- sqrt(dof)*(delta1 - delta2)/q/2 OwenQ1(dof, -q, delta2, R) - OwenQ1(dof, q, delta1, R) } power2 <- numeric(nrow(SAS)) for (i in 1:nrow(SAS)) { power2[i] <- powerTOST2( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i] ) } all(abs(power - power2) < 1e-9) ``` The values rounded to $9$ digits are the same as the ones we previously obtained with `powen4`. # Second Owen $Q$-function $Q_2$ (`OwenQ2`) We previously validated `powen2` for the $100$ scenarios of the dataset `SAS`. Now we test the `OwenQ2` function by checking the equality $$ O_2(\nu, t_1, t_2, \delta_1, \delta_2) = Q_2\left(\nu, t_1, \delta_1, \frac{\sqrt{\nu}(\delta_1-\delta_2)}{t_1-t_2}\right) - Q_2\left(\nu, t_2, \delta_2, \frac{\sqrt{\nu}(\delta_1-\delta_2)}{t_1-t_2}\right). $$ We check this equality for the $100$ scenarios of the dataset `SAS`. ```{r} g <- function(alpha, delta0, Delta, sigma, n1, n2) { se <- sqrt(1/n1 + 1/n2) * sigma delta1 <- (delta0 + Delta) / se delta2 <- (delta0 - Delta) / se dof <- n1 + n2 - 2 q <- qt(1 - alpha, dof) R <- sqrt(dof)*(delta1 - delta2)/q/2 x <- OwenQ2(dof, q, delta1, R) - OwenQ2(dof, -q, delta2, R) y <- powen2(dof, q, -q, delta1, delta2) x - y } test <- numeric(nrow(SAS)) for (i in 1:nrow(SAS)) { test[i] <- g( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i] ) } all(abs(test) < 1e-15) ``` We find that each of the $100$ equalities hold true up to a tolerance of $15$ digits. # Student cumulative distribution function (`ptOwen`) We finally test `ptOwen`, the implementation of the cumulative distribution function of the noncentral Student distribution. If $T_1$ follows the noncentral Student distribution with number of degrees of freedom $\nu$ and noncentrality parameter $\delta_1$, then for any $t_1$ the equality $$ \Pr(T_1 \leq t_1) = O_1(\nu, t_1, t_2, \delta_1, \delta_2) + O_2(\nu, t_1, t_2, \delta_1, \delta_2) $$ holds for any $t_2$ and $\delta_2$. We check this equality for the $100$ scenarios of the dataset `SAS`. ```{r} h <- function(alpha, delta0, Delta, sigma, n1, n2) { se <- sqrt(1/n1 + 1/n2) * sigma delta1 <- (delta0 + Delta) / se delta2 <- (delta0 - Delta) / se dof <- n1 + n2 - 2 q <- qt(1 - alpha, dof) x <- ptOwen(q, dof, delta1) y <- powen1(dof, q, -q, delta1, delta2) + powen2(dof, q, -q, delta1, delta2) x - y } test <- numeric(nrow(SAS)) for (i in 1:nrow(SAS)) { test[i] <- h( alpha = SAS$alpha[i], delta0 = SAS$delta0[i], Delta = SAS$Delta[i], sigma = SAS$sigma[i], n1 = SAS$n1[i], n2 = SAS$n2[i] ) } all(abs(test) < 1e-15) ``` For each scenario, the equality holds up to a tolerance of $15$ digits. # References - Patefield, M. (2000). *Fast and Accurate Calculation of Owen's T Function.* Journal of Statistical Software 5 (5), 1-25. - Owen, D. B. (1965). *A special case of a bivariate noncentral t-distribution.* Biometrika 52, 437-446. - Wolfram Alpha LLC. 2017. Wolfram|Alpha. (access July 17, 2017).