Below, the essence of `POSSA`

’s power (and type 1 error
rate, T1ER) calculation for sequential designs is explained using the
simple example of a t-test. Of course, for t-tests specifically, you
could use other and potentially more user-friendly software – however,
the point here is to understand how the `POSSA`

works, and,
subsequently, it should be possible for you to extend this example to
virtually any statistical null hypothesis test. Cases of multiple
hypotheses (and recording descriptives) are explained on a separate
page, but these too are straightforward extensions of the present
examples.

This tutorial presupposes the basic conceptual understanding of sequential analyses (and the basics or statistical power and T1ER). There are several related free tutorials online (see e.g. Lakens, 2014).

All parameters mentioned below are described in detail in the full
documentation (see `?sim`

and `?pow`

).

So let’s say you want to compare the means of two independent
continuous variables with a simple t-test. Your smallest effect size of
interest (SESOI) is `5`

units (arbitrary units that could
represent anything – e.g., the height of a plant in cm, or the time of
performing a task in seconds, etc.).

First, write a function that simulates the samples (i.e., observed values) for a single instance of the hypothetical experiment in case the null hypothesis is true (no actual difference between the two population means) as well as in case the alternative hypothesis is true (there is in fact difference).

If the null hypothesis “H0” is true, the two samples have the same
mean, let’s say `0`

units; and let’s assume normally
distributed values an SD of `10`

units for each sample. For
this, two samples can each be simulated via the function
`rnorm(sampleSize, mean = 0, sd = 10)`

(where
`sampleSize`

is a variable to be provided via another
function, as explained later). You can assign this to
`sample1`

, which represents a baseline variable (e.g., “group
A”), and, separately, to `sample2_h0`

, which represents a
sample that does not differ from the baseline (“H0” true). If the null
hypothesis “H1” is true, the baseline sample may again have a mean of
`0`

, but the other sample should have the SESOI,
`5`

units (again assuming an SD of `10`

) – which
can be simulated as `rnorm(sampleSize, mean = 5, sd = 10)`

.
This can be assigned to `sample2_h1`

, which represents a
sample that does differ from the baseline (“H1” true). (The baseline
sample has identical properties in case of H0 and H1, so it’s enough to
simulate it only once, and use that same resulting sample for both cases
– for the sake of brevity as well as of computational speed.)

All this is to be put into a function, which takes a
`sampleSize`

argument and returns the simulated samples as a
named list, as follows.

```
= function(sampleSize) {
customSample list(
sample1 = rnorm(sampleSize, mean = 0, sd = 10),
sample2_h0 = rnorm(sampleSize, mean = 0, sd = 10),
sample2_h1 = rnorm(sampleSize, mean = 5, sd = 10)
) }
```

Note that the `customSample`

and `sampleSize`

variables were written with camel case (likeThis), while the
`_h0`

and `_h1`

endings used underscores
(like_this). This alternate notation is used in the present tutorial to
help distinguish what names can be freely chosen, and what is
necessarily named so for the `POSSA`

package: custom names
are all camel case, while underscores indicate notation necessary for
`POSSA`

. Here specifically, the returned list’s element names
for the sample that varies depending on whether the null (H0) and
alternative (H1) hypothesis is true must be indicated with
`_h0`

and `_h1`

endings, respectively, with a
common root (here: `sample2_h`

). The root name or the other
variable names (`customSample`

and `sampleSize`

)
could be named differently, and everything would work just as well.

Now, write a function that performs the test; here, a one-sided t-test (with equal variances assumed – this is just for the example’s outcome’s comparability for other software, but normally Welch’s test should be preferred).

The function’s parameters must be identical to the names of the list
elements returned by the sample function (here: `sample1`

,
`sample2_h0`

, and `sample2_h1`

). The returned
value must be a named vector that contains the derived p values. The
name of each p value must start with `p_`

, and end with
`_h0`

for the “H0” outcome and with `_h1`

for the
“H1” outcome. (Below, it’s simply `p_h0`

/`p_h1`

,
but it could just as well be, e.g.,
`p_my_t.test_h0`

/`p_my_t.test_h1`

).

```
= function(sample1, sample2_h0, sample2_h1) {
customTest c(
p_h0 = t.test(sample1, sample2_h0, 'less', var.equal = TRUE)$p.value,
p_h1 = t.test(sample1, sample2_h1, 'less', var.equal = TRUE)$p.value
) }
```

(Other information, in particular descriptives such as means or SDs, can also be returned via this function – but this is described in a separate vignette.)

To make sure the two functions are working correctly, as a quick test you can run the following line (with any applicable number argument).

```
do.call(customTest, customSample(55))
#> p_h0 p_h1
#> 0.67172109 0.03385126
```

This passes `55`

as (arbitrary) sample size to
`customSample`

, which passes the created samples to
`customTest`

, which finally returns the named vector with the
`p_h0`

and `p_h1`

p values. (Given the
randomization, if you run this line repeatedly, the outcome should
differ each time – however, with large enough numbers, `p_h1`

is likely to be noticeably smaller than `p_h0`

.)

Time to load `POSSA`

.

`library('POSSA')`

Now, the `POSSA::sim`

function can be used with
`customSample`

for `fun_obs`

(function for
observation values) and `customTest`

for
`fun_test`

(function for statistical testing), and any
desired numbers of observation. (This runs the simulation
`45000`

times by default, so it takes a while. For quick
testing, set the number to a lower value via the `n_iter`

parameter, e.g., `n_iter = 500`

.)

```
= sim(fun_obs = customSample,
dfPvals n_obs = 80,
fun_test = customTest)
```

The returned `dfPvals`

contains all the simulated p
values, and can be directly passed to `POSSA::pow`

to get the
power for any specified alpha level.

`pow(dfPvals)`

This prints to the console the following.

```
#> # POSSA pow() results #
#> N(average-total) = 160.0 (if H0 true) or 160.0 (if H1 true)
#> (p) Type I error: .05140; Power: .93327
#> Local alphas: (1) .05000
```

This is a fixed sample design with 80 observations per each of the
two groups. The function uses an alpha of `.05`

by default,
this may be modified e.g. to `.01`

below.

```
pow(dfPvals, alpha_global = 0.01)
#> # POSSA pow() results #
#> N(average-total) = 160.0 (if H0 true) or 160.0 (if H1 true)
#> (p) Type I error: .00991; Power: .79069
#> Local alphas: (1) .01000
```

So far, we can easily get the same information from other software,
e.g. `pwr::pwr.t.test(n = 80, d = 0.5, alternative = 'greater')`

(returns `power = 0.93368`

) and
`pwr::pwr.t.test(n = 80, d = 0.5, sig.level = 0.01, alternative = 'greater')`

(returns `power = 0.79068`

).

However, sequential designs are extremely easy to be expanded from
the above `sim`

and `pow`

examples. One just has
to specify multiple instances of observation numbers for the
`n_obs`

in `sim`

and specify a local alphas in
`pow`

, e.g. as below.

```
= sim(fun_obs = customSample,
dfPvalsSeq n_obs = c(27, 54, 81),
fun_test = customTest)
pow(dfPvalsSeq, alpha_locals = NA)
#> # POSSA pow() results #
#> N(average-total) = 158.6 (if H0 true) or 98.8 (if H1 true)
#> (p) Type I error: .05000; Power: .89922
#> Local alphas: (1) .02288; (2) .02288; (3) .02288
#> Likelihoods of significance if H0 true: (1) .02296; (2) .01631; (3) .01073
#> Likelihoods of significance if H1 true: (1) .42324; (2) .32298; (3) .15300
```

The `alpha_locals = NA`

input specifies that all local
alphas are initially `NA`

, in which case they will all be
calculated to have a single identical number (corresponding to Pocock’s
correction in case of equally spaced looks). Rounding to the
conventional 3 fractional digits, the adjusted constant local alpha
number corresponds to the exact statistics (`.023`

) provided
in other software,
e.g. `gsDesign::gsDesign(k = 3, test.type = 1, alpha = 0.05, sfu = 'Pocock')`

.

The console output contains only the most important information, but,
when assigned, the `pow`

function returns a data frame with
detailed information per each look (and their totals), including the
specified sample sizes, the numbers of iterations, etc. The printed
information includes

- the average number of subjects in H0 and in H1 scenarios (i.e., out of all simulations, on average at how many participants the experiment was stopped),
- the T1ER (here, it’s for a single p value, but, in case of multiple p values, it would be a combined T1ER as specified)
- the power (again, would be combined in case of multiple p values),
- the local alphas per look (and per each p value; but in the present
case there is only one, named simply
`p`

, the root derived from`p_h0`

/`p_h1`

), - the ratios of significant findings per each look (corresponding to the local alphas).

The look numbers are always in parenthesis, such that
“`(1)`

” designates the first look, “`(2)`

” the
second look, etc.

The console output can also be called as
`print(dfPvalsSeq)`

. The number of fractional digits to which
the values are to be rounded can be changed via a `round_to`

argument, such as e.g. `print(dfPvalsSeq, round_to = 2)`

.

`POSSA`

helps obtain the local alphas, in sequential
designs, that result in a specified overall (global) T1ER, which may
also be called “global alpha”. This procedure is automatized for most
practical cases, but still users should ideally have a basic idea of how
this works. However, impatient readers (who do not want to have
customized adjustments) can skip to the **Specifying initial local
alphas** section below.

The essential mechanism is a staircase procedure, where the local
alphas are continually increased when the T1ER is smaller than specified
and decreased when the T1ER is larger then specified, and each direction
change (from decrease to increase or vice versa) moves onto a smaller
step size in the adjustment. For example, the simplest case is when all
local alphas are provided as `NA`

, as above: in this case,
the initial replacement value (which may be modified via the
`adj_init`

parameter) is by default the global alpha divided
by the maximum number of looks, as a rough initial estimation.
Subsequently, the given p values are tested with this setting, and a
global T1ER is calculated. If it is larger than specified, the
replacement value is decreased, or vice versa. The amount of decrease is
the first step, by default `0.01`

. Then the testing is
repeated, and the change is repeated in the same direction until the
obtained T1ER passes the specified T1ER (e.g., becomes smaller, when
initially larger). Then the replacement value is increased with a
smaller second step size, by default `0.05`

. And so on, until
either there are no more steps (as specified) or (more typically) the
obtained T1ER is close enough to the specified one (e.g., by default,
matches it up to 5 fractional digits; may be specified via
`alpha_precision`

), at which point the procedure stops and
the obtained local alphas (each having the same value) and the
corresponding results are returned and printed.

The adjustment actually happens via a function that can be specified
via the `adjust`

parameter, but, by default, whenever there
is at least one `NA`

value among the given
`alpha_locals`

argument, the function implements the
above-described procedure, and looks as follows.

```
= function(adj, prev, orig) {
adjust is.na(orig)] = adj
prev[return(prev)
}
```

In this function, `adj`

means the adjustment value (in
this case a replacement value), `prev`

the previous vector of
local alphas (obtained in the last adjustment), and `orig`

the original vector of local alphas as provided for
`alpha_locals`

. (For the first adjustment, `prev`

is the same as `orig`

). Any sort of custom function may be
provided as `adjust`

, but it cannot contain any parameter
other than these three (`adj`

, `prev`

, and
`orig`

).

By default, if no `NA`

value is found among the given
`alpha_locals`

, the `adjust`

function uses
multiplication to adjust the given values (with `adj_init`

set to `1`

), as follows.

```
= function(orig, adj) {
adjust return(orig * adj)
}
```

By default, there are altogether 11 step sizes, chosen in a way that
typically results in 5-fractional-digit T1ER match before running
getting to the last step, and the calculation typically takes only a few
seconds. (Step sizes can be modified via the
`staircase_steps`

parameter.)

Given the above-described default mechanisms, there are, without
modifying the adjustment procedure, two easy ways to provide the
argument for `alpha_locals`

(for any given p value): (a) a
vector that includes (or consist entirely of) `NA`

values
that are to be replaced with a constant number that results in the
desired global T1ER, or (b) a vector with all numeric values that are
each to be multiplied with a number in order to obtain a vector of
alphas that again result in the desired global T1ER.

However, you can also simply check the results, without any
adjustment, of any given set of local alphas, by setting
`adjust = FALSE`

.

Here are some examples, all using the data frame
“`dfPvalsSeq`

” created above with
`POSSA::sim()`

.

To use Haybittle–Peto boundary (all interim local alphas
`.001`

), you can specify `.001`

for each local
alpha except for the last, which could be `NA`

to be adjusted
for the given global alpha (here: `.025`

), as below.

```
pow(dfPvalsSeq, alpha_locals = c(0.001, 0.001, NA), alpha_global = 0.025)
#> # POSSA pow() results #
#> N(average-total) = 161.8 (if H0 true) or 140.4 (if H1 true)
#> (p) Type I error: .02500; Power: .88411
#> Local alphas: (1) .00100; (2) .00100; (3) .02431
#> Likelihoods of significance if H0 true: (1) .00111; (2) .00067; (3) .02322
#> Likelihoods of significance if H1 true: (1) .09224; (2) .21596; (3) .57591
```

In this specific scenario, despite noticeably decreased average
sample in case of H1 (`140.4`

instead of the fixed
`162`

), the adverse effect of interim stops on T1ER is so
minimal (see, in case of H0, the tiny ratio significant [false positive]
findings at the first and second looks), that the last look’s alpha
(`.02431`

) hardly differs from the specified global alpha
(`.025`

).

You can also check what happens without adjustment, using
`adjust = FALSE`

.

```
pow(dfPvalsSeq, alpha_locals = c(0.001, 0.001, 0.025), alpha_global = 0.025, adjust = FALSE)
#> # POSSA pow() results #
#> N(average-total) = 161.8 (if H0 true) or 140.4 (if H1 true)
#> (p) Type I error: .02551; Power: .88618
#> Local alphas: (1) .00100; (2) .00100; (3) .02500
#> Likelihoods of significance if H0 true: (1) .00111; (2) .00067; (3) .02373
#> Likelihoods of significance if H1 true: (1) .09224; (2) .21596; (3) .57798
```

Which leads to the same conclusion from the reverse perspective: the
T1ER hardly exceeds `.25`

.

Nonetheless, more substantial sample reduction can be achieved with
somewhat more liberal interim local alphas; besides, normally it makes
more sense to increase the local alphas gradually (because, in brief,
given the greater uncertainty of the initial smaller samples, beginning
with lower local alphas and gradually increasing them provides the most
efficient “information gain” for the same eventual T1ER). For this, one
could base the adjustment on one of the popular boundary methods used
for parametric tests, such as the O’Brien-Fleming bounds. These bounds
for parametric tests can be easily calculated via other packages. They
may not apply to nonparametric tests in exactly the same way, but the
principle of exponentially increasing local alphas remains the same. (To
decide what fits your specific scenario best, you could also check the
outcomes of various versions of the Wang-Tsiatis bounds by changing its
*Delta* parameter; e.g., in `gsDesign::gsDesign()`

,
set `sfu = 'WT'`

for Wang-Tsiatis, and vary its Delta via
`sfupar`

.) Most importantly in any case, the adjustment of
alphas in `POSSA`

will ensure that the T1ER is at the
specified level. This will only make a difference for other custom tests
and multiple hypotheses, but not for the present t-test example.

For our example of three equally spaced interim looks,
O’Brien-Fleming bounds can be calculated e.g. via
`gsDesign::gsDesign(k = 3, test.type = 1, alpha = 0.05, sfu = 'OF')`

,
giving the local alphas of `0.0015`

, `0.0181`

, and
`0.0437`

. (The same is returned by
`rpact::getDesignGroupSequential(typeOfDesign = "OF", informationRates = c(1/3, 2/3, 1), alpha = 0.05)`

.)

To merely calculate the T1ER and power for any specific tests, again
use `adjust = FALSE`

.

```
pow(dfPvalsSeq, alpha_locals = c(0.0015, 0.0181, 0.0437), alpha_global = 0.05, adjust = FALSE)
#> # POSSA pow() results #
#> N(average-total) = 160.8 (if H0 true) or 118.7 (if H1 true)
#> (p) Type I error: .04951; Power: .93087
#> Local alphas: (1) .00150; (2) .01810; (3) .04370
#> Likelihoods of significance if H0 true: (1) .00162; (2) .01809; (3) .02980
#> Likelihoods of significance if H1 true: (1) .11531; (2) .57211; (3) .24344
```

(To verify the power via other software, see
e.g. `summary(rpact::getSampleSizeMeans(rpact::getDesignGroupSequential(typeOfDesign = "OF", informationRates = c(1/3, 2/3, 1), alpha = 0.05, beta = 1-0.93087), alternative = 0.5))`

.)

Since these local alphas were specifically chosen to have a T1ER of
`.05`

for the t-test comparison used here, they will hardly
change even with `adjust = TRUE`

. However, to test the
adjustment function, we can set a different global alpha,
e.g. `.01`

, so that the adjustment will reduce each local
alpha value by multiplication.

```
pow(dfPvalsSeq, alpha_locals = c(0.0015, 0.0181, 0.0437), alpha_global = 0.025)
#> # POSSA pow() results #
#> N(average-total) = 161.4 (if H0 true) or 126.9 (if H1 true)
#> (p) Type I error: .02498; Power: .87487
#> Local alphas: (1) .00071; (2) .00862; (3) .02080
#> Likelihoods of significance if H0 true: (1) .00089; (2) .00876; (3) .01533
#> Likelihoods of significance if H1 true: (1) .07629; (2) .49747; (3) .30111
```

(Cf.
`summary(rpact::getSampleSizeMeans(rpact::getDesignGroupSequential(typeOfDesign = "asUser", informationRates = c(1/3, 2/3, 1), userAlphaSpending = c(.00071, .0089, .0244), beta = 1-.87487), alternative = 0.5))`

.)

Finally, just for quick demonstration, for the same scenario, here is
a function that adjusts all values via additions (constant
increase/decrease of all local alpha values). Also, here the global
alpha is set to be larger (`alpha_global = 0.1`

), just so as
to make the local alphas increase too in this case.

```
pow(
dfPvalsSeq,alpha_locals = c(0.0015, 0.0181, 0.0437),
alpha_global = 0.1,
adjust = function(adj, prev, orig) {
return(orig + adj)
}
)
#> # POSSA pow() results #
#> N(average-total) = 161.4 (if H0 true) or 126.9 (if H1 true)
#> (p) Type I error: .02498; Power: .87487
#> Local alphas: (1) .00071; (2) .00862; (3) .02080
#> Likelihoods of significance if H0 true: (1) .00089; (2) .00876; (3) .01533
#> Likelihoods of significance if H1 true: (1) .07629; (2) .49747; (3) .30111
```

By default, the interim local alphas will all be zero (i.e., no stopping for significance), and the final local alpha will equal the given global alpha – meaning a simple fixed design (with the maximum specified observation number).

```
pow(dfPvalsSeq)
#> # POSSA pow() results #
#> N(average-total) = 162.0 (if H0 true) or 162.0 (if H1 true)
#> (p) Type I error: .04860; Power: .93636
#> Local alphas: (1) none; (2) none; (3) .05000
#> Likelihoods of significance if H0 true: (1) 0; (2) 0; (3) .04860
#> Likelihoods of significance if H1 true: (1) 0; (2) 0; (3) .93636
```

This is to make the setting of *futility bounds* alone easy
and straightforward. For instance, to stop collection for futility
whenever the p value exceeds `.5`

(during interim looks), one
can simply add `fut_locals = 0.5`

.

```
pow(dfPvalsSeq, fut_locals = 0.5)
#> # POSSA pow() results #
#> N(average-total) = 101.1 (if H0 true) or 158.3 (if H1 true)
#> (p) Type I error: .04516; Power: .91622
#> Local alphas: (1) none; (2) none; (3) .05000
#> Likelihoods of significance if H0 true: (1) 0; (2) 0; (3) .04516
#> Likelihoods of significance if H1 true: (1) 0; (2) 0; (3) .91622
#> Futility bounds: (1) .50000; (2) .50000
#> Likelihoods of exceeding futility alpha if H0 true: (1) .03011; (2) .00176
#> Likelihoods of exceeding futility alpha if H1 true: (1) .03324; (2) .00180
```

This is equivalent to specifying each futility bound. Since futility
bound is meaningless for the final look, and the present example
(`dfPvalsSeq`

) contains two interim looks, two values are
required. Hence the
`pow(dfPvalsSeq, fut_locals = c(0.5, 0.5))`

gives results
equivalent to `pow(dfPvalsSeq, fut_locals = 0.5)`

(as
above).

Of course, once again, a constant value is not ideal; the futility bound is more effective with values decreasing by each look. The specific choices depends on how much power one is willing to sacrifice in exchange for reduced sample sizes (which, as seen in the outputs, will help primarily in case H0 is true).

```
pow(dfPvalsSeq, fut_locals = c(0.6, 0.3))
#> # POSSA pow() results #
#> N(average-total) = 100.9 (if H0 true) or 159.3 (if H1 true)
#> (p) Type I error: .04587; Power: .92331
#> Local alphas: (1) none; (2) none; (3) .05000
#> Likelihoods of significance if H0 true: (1) 0; (2) 0; (3) .04587
#> Likelihoods of significance if H1 true: (1) 0; (2) 0; (3) .92331
#> Futility bounds: (1) .60000; (2) .30000
#> Likelihoods of exceeding futility alpha if H0 true: (1) .01549; (2) .01307
#> Likelihoods of exceeding futility alpha if H1 true: (1) .01789; (2) .01336
```

Importantly, alphas stopping for significance and bounds stopping for futility can be combined in any desired way.

For instance:

```
pow(
dfPvalsSeq,alpha_locals = c(0.002, 0.018, 0.044),
fut_locals = c(0.6, 0.3)
)
#> # POSSA pow() results #
#> N(average-total) = 99.7 (if H0 true) or 114.3 (if H1 true)
#> (p) Type I error: .05000; Power: .92229
#> Local alphas: (1) .00211; (2) .01897; (3) .04636
#> Likelihoods of significance if H0 true: (1) .00216; (2) .01864; (3) .02920
#> Likelihoods of significance if H1 true: (1) .13907; (2) .55542; (3) .22780
#> Futility bounds: (1) .60000; (2) .30000
#> Likelihoods of exceeding futility alpha if H0 true: (1) .03304; (2) .37542
#> Likelihoods of exceeding futility alpha if H1 true: (1) .01789; (2) .01336
```

(Note the robust sample size reduction in case of both H0 and H1, with power hardly changed as compared to the fixed design.)

The `alpha_locals`

functions same as without futility
bounds; e.g., in this last example it was adjusted via multiplication,
but it could also be again given `NA`

values to be adjusted
via replacements.

To disable, at any given look, stopping either for significance or for futility, just set the local alpha/bound to zero or to one, respectively. For example, the above example can be modified as below to have no stopping for futility at the first look and no stopping for significance at the second look.

```
pow(
dfPvalsSeq,alpha_locals = c(0.002, 0, 0.044),
fut_locals = c(1, 0.3)
)
#> # POSSA pow() results #
#> N(average-total) = 123.8 (if H0 true) or 145.2 (if H1 true)
#> (p) Type I error: .05000; Power: .93416
#> Local alphas: (1) .00235; (2) none; (3) .05167
#> Likelihoods of significance if H0 true: (1) .00247; (2) 0; (3) .04753
#> Likelihoods of significance if H1 true: (1) .14633; (2) 0; (3) .78782
#> Futility bounds: (1) none; (2) .30000
#> Likelihoods of exceeding futility alpha if H0 true: (1) 0; (2) .01867
#> Likelihoods of exceeding futility alpha if H1 true: (1) 0; (2) .01916
```

(Of course, once again, this example hardly makes sense in a real case; it serves only as a demonstration here.)

Consider the sample creation given in the beginning:

```
= function(sampleSize) {
customSample list(
sample1 = rnorm(sampleSize, mean = 0, sd = 10),
sample2_h0 = rnorm(sampleSize, mean = 0, sd = 10),
sample2_h1 = rnorm(sampleSize, mean = 5, sd = 10)
) }
```

Here, we have two different groups (denoted as “sample1” and
“sample2”), but, since their sample sizes (numbers of observations) are
the same, for convenience the sample size can be passed via a single
parameter (here: `sampleSize`

). It could also be specified
separately, passing via two parameters. In such a case, the parameter
names must exactly correspond to the sample root names (i.e., the names
without any final `0`

/`1`

that specify “H0” or
“H1”); here, `sample1`

and `sample2_h`

.

```
= function(sample1, sample2_h) {
customSampleTwoParam list(
sample1 = rnorm(sample1, mean = 0, sd = 10),
sample2_h0 = rnorm(sample2_h, mean = 0, sd = 10),
sample2_h1 = rnorm(sample2_h, mean = 5, sd = 10)
) }
```

Now, the previous `sim`

and `pow`

procedures
can be run just the same, and the results will be identical as in the
first example (with single parameter passed).

```
= sim(fun_obs = customSampleTwoParam,
dfPvalsSeqTwoParam n_obs = c(27, 54, 81),
fun_test = customTest)
pow(dfPvalsSeqTwoParam, alpha_locals = NA)
```

Here, given the `n_obs = c(27, 54, 81)`

with a single
vector, this is automatically assigned to each of the two given sample
parameters (`sample1`

and `sample2_h`

). These can
also be assigned separately, and the results will once again be
identical.

```
= sim(
dfPvalsSeqTwoParamSeparate fun_obs = customSampleTwoParam,
n_obs = list(
sample1 = c(27, 54, 81),
sample2_h = c(27, 54, 81)
),fun_test = customTest
)
pow(dfPvalsSeqTwoParamSeparate, alpha_locals = NA)
```

Importantly, this also allows specifying different observation numbers for each of the two samples.

```
= sim(
dfPvalsSeqTwoParamSeparate fun_obs = customSampleTwoParam,
n_obs = list(
sample1 = c(17, 44, 71),
sample2_h = c(37, 64, 91)
),fun_test = customTest
)
pow(dfPvalsSeqTwoParamSeparate, alpha_locals = NA)
```

Now, the outcome is of course different (but would be especially affected in case of unequal variances).

Incidentally: all sequential design examples so far used two interim looks – but there could of course be any other number. For instance, here is a design with a senseless amount of six interim looks (for two differently sized samples), just for the fun of it.

```
= sim(
dfPvalsManyLooks fun_obs = customSampleTwoParam,
n_obs = list(
sample1 = c(8, 16, 24, 32, 40, 48, 56),
sample2_h = c(10, 20, 30, 40, 50, 60, 70)
),fun_test = customTest
)
pow(dfPvalsManyLooks, alpha_locals = NA)
#> # POSSA pow() results #
#> N(average-total) = 122.4 (if H0 true) or 78.8 (if H1 true)
#> (p) Type I error: .04991; Power: .78164
#> Local alphas: (1) .01381; (2) .01381; (3) .01381; (4) .01381; (5) .01381; (6) .01381; (7) .01381
#> Likelihoods of significance if H0 true: (1) .01478; (2) .01036; (3) .00682; (4) .00600; (5) .00449; (6) .00378; (7) .00369
#> Likelihoods of significance if H1 true: (1) .10896; (2) .14676; (3) .14107; (4) .12364; (5) .10562; (6) .08716; (7) .06844
```

In a real experiment, you may not know in advance the exact number of observations at the stopping point. For example, you plan (and preregister) to open an initial 30 slots for participation, after which you would perform the first testing (first interim “look”) – however, some participants will probably have to be excluded (e.g. because of failing attention checks), expected to be only a few, but cannot be known exactly in advance.

Assuming that minor variations in power do not matter, you can simply
calculate with the approximate observation numbers (at the given planned
looks) in advance, and, when the actual observation numbers are known,
the calculation is repeated with these latter numbers to ensure that the
T1ER remains as desired. For example, taking into account some necessary
exclusions, the expected observation numbers with two interim looks are
`27`

(first look), `54`

, and `81`

.
This may be calculated as before, using O’Brien-Fleming bounds (as
provided by, e.g.,
`gsDesign::gsDesign(test.type = 1, alpha = 0.05, sfu = 'OF', timing = c(27, 54, 81)/81)`

,
then adjusted by `POSSA`

).

```
= sim(fun_obs = customSample,
dfPvals_planned n_obs = c(27, 54, 81),
fun_test = customTest)
pow(dfPvals_planned, alpha_locals = c(0.0015, 0.0181, 0.0437), alpha_global = 0.05)
#> # POSSA pow() results #
#> N(average-total) = 160.8 (if H0 true) or 118.5 (if H1 true)
#> (p) Type I error: .05000; Power: .93164
#> Local alphas: (1) .00152; (2) .01830; (3) .04419
#> Likelihoods of significance if H0 true: (1) .00164; (2) .01831; (3) .03004
#> Likelihoods of significance if H1 true: (1) .11604; (2) .57309; (3) .24251
```

Now, let’s say that, due to somewhat more exclusions than expected,
the initial sample contains only 24 valid observations. You can rerun
the simulation with the modified first observation number, and run the
power calculation with initial alphas corrected with
`gsDesign::gsDesign(test.type = 1, alpha = 0.05, sfu = 'OF', timing = c(24, 54, 81)/81)`

.

```
= sim(fun_obs = customSample,
dfPvals_first n_obs = c(24, 54, 81),
fun_test = customTest)
pow(dfPvals_first, alpha_locals = c(0.0009, 0.0182, 0.0438), alpha_global = 0.05)
#> # POSSA pow() results #
#> N(average-total) = 161.0 (if H0 true) or 120.6 (if H1 true)
#> (p) Type I error: .05000; Power: .93484
#> Local alphas: (1) .00091; (2) .01850; (3) .04453
#> Likelihoods of significance if H0 true: (1) .00087; (2) .01760; (3) .03153
#> Likelihoods of significance if H1 true: (1) .07056; (2) .61742; (3) .24687
```

Let’s say that the p value was not below the local alpha
(`.02295`

), so you move open another 30 slots, and once again
end up with more exclusions than expected, resulting in only
`49`

valid observations. The adjusted calculation is then as
follows.

```
::gsDesign(test.type = 1, alpha = 0.05, sfu = 'OF', timing = c(24, 49, 81)/81)
gsDesign# gives new initial local alphas
= sim(fun_obs = customSample,
dfPvals_second n_obs = c(24, 49, 81),
fun_test = customTest)
pow(dfPvals_second, alpha_locals = c(0.0009, 0.0145, 0.0447), alpha_global = 0.05)
#> # POSSA pow() results #
#> N(average-total) = 161.0 (if H0 true) or 119.6 (if H1 true)
#> (p) Type I error: .04996; Power: .93153
#> Local alphas: (1) .00090; (2) .01453; (3) .04479
#> Likelihoods of significance if H0 true: (1) .00091; (2) .01360; (3) .03544
#> Likelihoods of significance if H1 true: (1) .07009; (2) .53733; (3) .32411
```

(And, if needed, a similar modified recalculation could be done for the last look.)

As can be seen, the power differences are very small. Nonetheless, if it is important to keep the same level of power with high precision (e.g., up to 3 fractional digits), you can plan for (and preregister) to conditionally adjust (increase/decrease) the subsequent interim (if any) and final sample sizes to achieve the same power. (Technically, another option is to conditionally adjust the alpha level, i.e., the T1ER, but that’s rather questionable.)

Go to the next
(and the only other important) `POSSA`

vignette see how to
record descriptives, easily check varying factors, and test multiple
hypotheses.