\documentclass{article}
\usepackage{indentfirst}
\usepackage{amsmath,amsthm}
\usepackage{natbib}
\usepackage{url}
\newcommand{\BinomialDis}{\text{\upshape Bin}}
\newcommand{\UniformDis}{\text{\upshape Unif}}
\newcommand{\MannWhitneyDis}{\text{\upshape MannWhit}}
\newcommand{\SignRankDis}{\text{\upshape SignRank}}
\newcommand{\opor}{\mathbin{\rm or}}
\newcommand{\opand}{\mathbin{\rm and}}
\newcommand{\abs}[1]{\lvert #1 \rvert}
\newtheorem{theorem}{Theorem}
\newtheorem{hypothesis}{Hypothesis}
%\VignetteEngine{knitr::knitr}
%\VignetteIndexEntry{Design of R Package fuzzyRankTests}
\begin{document}

\title{Design of the Fuzzy Rank Tests Package}
\author{Charles J. Geyer}
\maketitle

\section{Introduction}

We do fuzzy $P$-values and confidence intervals following \citet{geyer-meeden}
and \citet{thompson-geyer} for three classical nonparametric procedures, the
sign test and the two Wilcoxon tests, and for their associated confidence
intervals.

\subsection{Classical Randomized Tests}

Following \citet{geyer-meeden}, let
$$ x \mapsto \phi(x, \alpha, \theta) $$
denote the critical function of a randomized test having significance level
$\alpha$ and point null hypothesis $\theta$, that is, the randomized test
rejects the null hypothesis $\theta = \theta_0$ at level $\alpha$ when the
observed data are $x$ with probability $\phi(x, \alpha, \theta_0)$.
The requirement that $\phi(x, \alpha, \theta)$ be a probability restricts it
to being between zero and one (inclusive).
The requirement that the test have its nominal level is
\begin{equation} \label{eq:level}
E_\theta \{ \phi(X, \alpha, \theta) \} = \alpha,
\qquad \text{for all $\theta$ and $\alpha$}.
\end{equation}

\subsection{Fuzzy $P$-values}

A \emph{fuzzy $P$-value} for this test with null hypothesis
$\theta = \theta_0$ and observed data $x$ is a random variable having
distribution function
\begin{equation} \label{eq:fpv-df}
\alpha \mapsto \phi(x, \alpha, \theta_0)
\end{equation}
\citep[Section~1.4]{geyer-meeden}.
In order for this to be a distribution function, it must be nondecreasing
\begin{equation} \label{eq:nested}
\alpha_1 < \alpha_2 \mathrel{\rm implies}
\phi(x, \alpha_1, \theta) \le \phi(x, \alpha_2, \theta),
\qquad \text{for all $x$ and $\theta$}.
\end{equation}
\citet{geyer-meeden} call this property of the family of tests (each $\alpha$
specifying a different test) being \emph{nested}.

For all the tests considered by \citet{geyer-meeden}, \citet{thompson-geyer},
or in this document, the fuzzy $P$-value is a continuous random variable, so
its probability density function is
$$ \frac{\partial \phi(x, \alpha, \theta_0)}{\partial \alpha} $$
considered as a function of $\alpha$ for fixed $x$ and $\theta_0$.

If we write the fuzzy $P$-value as a random variable $P$, then the critical
function has the interpretation
$$ \phi(X, \alpha, \theta) = \Pr\nolimits_\theta(P \le \alpha \mid X) $$
and \eqref{eq:level} applied to this gives
$$ \Pr\nolimits_\theta(P \le \alpha)
= E_\theta\{ \Pr\nolimits_\theta(P \le \alpha \mid X) \}
= E_\theta\{ \phi(X, \alpha, \theta) \} = \alpha $$
so $P$ is unconditionally $\UniformDis(0, 1)$ distributed for all $\theta$.
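The uniformity of $P$ can be checked by simulation.  Here is a minimal sketch
in plain R (illustrative only, not the interface of the package being
designed), taking the test statistic to be $\BinomialDis(n, 1 / 2)$ under the
null, anticipating the sign test below, and using the
$\UniformDis\bigl(\Pr(X > x), \Pr(X \ge x)\bigr)$ form of the upper-tailed
fuzzy $P$-value derived below.

<<fuzzy-pvalue-uniform>>=
## simulate the upper-tailed fuzzy P-value for a Bin(n, 1/2) test
## statistic and check that it is unconditionally Unif(0, 1)
n <- 10
nsim <- 1e5
set.seed(42)
x <- rbinom(nsim, n, 1 / 2)
a <- pbinom(x, n, 1 / 2, lower.tail = FALSE)      # Pr(X > x)
b <- pbinom(x - 1, n, 1 / 2, lower.tail = FALSE)  # Pr(X >= x)
p <- runif(nsim, a, b)   # fuzzy P-value, conditionally Unif(a, b)
round(quantile(p, probs = c(0.1, 0.25, 0.5, 0.75, 0.9)), 3)
@

The empirical quantiles should agree with those of the $\UniformDis(0, 1)$
distribution up to simulation error.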
\subsection{Fuzzy Confidence Intervals}

A \emph{fuzzy confidence interval} associated with this test having coverage
probability $1 - \alpha$ is for observed data $x$ a fuzzy set having
membership function
\begin{equation} \label{eq:fci}
\theta \mapsto 1 - \phi(x, \alpha, \theta)
\end{equation}
\citep[Section~1.4]{geyer-meeden}.
The interpretation is that the right-hand side of \eqref{eq:fci} is the degree
to which $\theta$ is considered to be ``in'' the fuzzy confidence interval.
The coverage probability is
$$ E_\theta\{ 1 - \phi(X, \alpha, \theta) \} = 1 - \alpha, $$
which is just \eqref{eq:level} rewritten.

\subsection{Latent Variable Randomized Tests}

One way to make a randomized test is the following \citep{thompson-geyer}.
Suppose we have observed data $x$ and latent data $y$ (also called random
effects; Bayesians would call them parameters).
And suppose we have a possibly randomized test for the complete data with
critical function
$$ (x, y) \mapsto \psi(x, y, \alpha, \theta). $$
Then $\phi$ defined by
$$ \phi(x, \alpha, \theta) = E \{ \psi(X, Y, \alpha, \theta) \mid X = x \} $$
is a critical function for a randomized test for the observed data.

\section{Sign Test}

Suppose we observe data $X_1$, $\ldots$, $X_n$ that are IID from some
distribution with median $\mu$.
The conventional sign test of a hypothesized value of $\mu$ is based on the
test statistic $W$, which is the number of $X_i$ strictly greater than $\mu$.
If the distribution of the $X_i$ is continuous, so no ties are possible, then
under the following null hypothesis the distribution of $W$ is
$\BinomialDis(n, 1 / 2)$.
\begin{hypothesis} \label{null-sign}
The hypothesized value of $\mu$ is the median of the distribution of the
$X_i$.
\end{hypothesis}

If the distribution of the $X_i$ is not continuous, so ties are possible, we
break the ties by ``infinitesimal jittering'' of the data.
If $l$, $t$, and $u$ data points are below $\mu$, tied with $\mu$, and above
$\mu$, respectively, then after infinitesimal jittering we have $W = u + T$
of the jittered data points strictly greater than $\mu$, where
$T \sim \BinomialDis(t, 1 / 2)$.
Again we have $W \sim \BinomialDis(n, 1 / 2)$, although, strictly speaking,
this requires that we change the null hypothesis to the following.
\begin{hypothesis} \label{null-sign-jittered}
The hypothesized value of $\mu$ is the median of the distribution of the
infinitesimally jittered $X_i$.
\end{hypothesis}
Since we cannot in practice distinguish infinitesimally separated points, we
consider these two hypotheses the same in practice.

Following \citet{thompson-geyer}, we call $W$ the \emph{latent test statistic}
and base the fuzzy test on it.
Table~\ref{tab:endpoint} shows how a fuzzy test based on such a test statistic
works.
\begin{table}
\caption{Endpoints of Supports of Fuzzy $P$-values.  $T$ is a random variable
having the null distribution of the test statistic, $t$ is the observed value
of the test statistic, and $\tau$ is the center of symmetry of the null
distribution of the test statistic.  The fuzzy $P$-value is uniformly
distributed on the interval with the endpoints indicated in the table.}
\label{tab:endpoint}
\begin{center}
\begin{tabular}{lcc}
& lower & upper \\
& endpoint & endpoint \\
\hline
upper-tailed & $\Pr(T > t)$ & $\Pr(T \ge t)$ \\
lower-tailed & $\Pr(T < t)$ & $\Pr(T \le t)$ \\
two-tailed & $\Pr(\abs{T - \tau} > \abs{t - \tau})$
& $\Pr(\abs{T - \tau} \ge \abs{t - \tau})$
\end{tabular}
\end{center}
\end{table}
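Table~\ref{tab:endpoint} translates directly into R.  The sketch below (the
function name \texttt{fuzzy.endpoints} is ours, for illustration only; it is
not proposed as the package interface) computes the endpoints for a test
statistic that is $\BinomialDis(n, 1 / 2)$ under the null, as for the sign
test with no ties.

<<endpoints>>=
## endpoints of the support of the fuzzy P-value, Table 1, for a
## test statistic that is Bin(n, 1/2) under the null, observed
## value w, center of symmetry tau = n / 2
fuzzy.endpoints <- function(w, n,
    alternative = c("greater", "less", "two.sided"))
{
    alternative <- match.arg(alternative)
    tau <- n / 2
    switch(alternative,
        greater = c(lower = pbinom(w, n, 1 / 2, lower.tail = FALSE),
            upper = pbinom(w - 1, n, 1 / 2, lower.tail = FALSE)),
        less = c(lower = pbinom(w - 1, n, 1 / 2),
            upper = pbinom(w, n, 1 / 2)),
        two.sided = {
            x <- 0:n
            px <- dbinom(x, n, 1 / 2)
            c(lower = sum(px[abs(x - tau) > abs(w - tau)]),
                upper = sum(px[abs(x - tau) >= abs(w - tau)]))
        })
}
fuzzy.endpoints(8, 10, "two.sided")
@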
\subsection{One-Tailed Tests}

\subsubsection{General}

Suppose we have a test with univariate discrete test statistic $X$ taking
values in a discrete set $S$, and we want to do an upper-tailed test.
We claim that the random variable $P$ defined as follows is a fuzzy
$P$-value: conditional on $X = x$ the random variable $P$ has the
$\UniformDis\bigl(a(x, \theta), b(x, \theta)\bigr)$ distribution, where
\begin{align*}
a(x, \theta) & = \Pr\nolimits_\theta\{ X > x \} \\
b(x, \theta) & = \Pr\nolimits_\theta\{ X \ge x \}
\end{align*}
To prove this we must show that the critical function
$$ \phi(x, \alpha, \theta) = \begin{cases}
0, & \alpha \le a(x, \theta) \\
\frac{\alpha - a(x, \theta)}{b(x, \theta) - a(x, \theta)},
& a(x, \theta) < \alpha < b(x, \theta) \\
1, & \alpha \ge b(x, \theta)
\end{cases} $$
satisfies \eqref{eq:level} (it clearly satisfies \eqref{eq:nested}).
Let $x^*$ denote the unique element of $S$ such that
$a(x^*, \theta) < \alpha < b(x^*, \theta)$.
Note that $x^*$ is a function of $\alpha$ even though the notation does not
indicate this.  Then
\begin{align*}
E_\theta\{ \phi(X, \alpha, \theta) \}
& = \sum_{x \in S} \Pr\nolimits_\theta(X = x) \cdot \phi(x, \alpha, \theta) \\
& = \Pr\nolimits_\theta\bigl(X > x^*\bigr)
+ \Pr\nolimits_\theta\bigl(X = x^*\bigr)
\cdot \frac{\alpha - a(x^*, \theta)}{b(x^*, \theta) - a(x^*, \theta)}
\end{align*}
and this is equal to $\alpha$, so \eqref{eq:level} holds.

\subsubsection{Latent Test for Sign Test}

The latent fuzzy $P$-value for latent test statistic $u + T$ is
$\UniformDis(a, b)$ where
\begin{align*}
a & = \Pr\{ W > u + T \} \\
b & = \Pr\{ W \ge u + T \}
\end{align*}
Note that $a$ and $b$ depend on $u + T$ although the notation does not
indicate this.
The (actual, non-latent) fuzzy $P$-value is a mixture of these uniform
distributions (mixing over the latent variable $T$).
This means the (actual, non-latent) fuzzy $P$-value has continuous, piecewise
linear CDF with knots $\Pr\{ W > u + t - k \}$ and corresponding values
$\Pr\{ T < k \}$, where $k = 0$, \ldots, $t + 1$.
Recall that the distribution of $W$ is $\BinomialDis(n, 1 / 2)$ and the
distribution of $T$ is $\BinomialDis(t, 1 / 2)$.

For a lower-tailed test, swap $l$ and $u$ and proceed as above.
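The knots and values just described for the upper-tailed test translate into
the following sketch (again plain R for illustration; \texttt{sign.knots} is
our name, not the package interface).

<<sign-knots>>=
## knots and corresponding values of the continuous, piecewise linear
## CDF of the upper-tailed fuzzy P-value for the sign test, with
## l, t, u counts below, tied with, and above the hypothesized mu
sign.knots <- function(l, t, u) {
    n <- l + t + u
    k <- 0:(t + 1)
    data.frame(k = k,
        knot = pbinom(u + t - k, n, 1 / 2, lower.tail = FALSE), # Pr(W > u+t-k)
        value = pbinom(k - 1, t, 1 / 2))                        # Pr(T < k)
}
sign.knots(l = 3, t = 2, u = 5)
@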
\subsection{Two-Tailed Tests}

For a two-tailed test, we use the same distributions of $T$ and $W$ as in the
one-tailed test, but now the latent test statistic is
$$ g(T) = \max(u + T, l + t - T). $$
The latent fuzzy $P$-value for latent test statistic $g(T)$ is
$\UniformDis(a, b)$ where
\begin{align*}
a & = \Pr\{ \abs{W - n / 2} > g(T) - n / 2 \} \\
b & = \Pr\{ \abs{W - n / 2} \ge g(T) - n / 2 \}
\end{align*}
(note that $g(T) \ge n / 2$ always, because the two quantities being maximized
sum to $n$).
Because of the symmetry of the distribution of $W$ we always have
\begin{align*}
a & = 2 \Pr\{ W > g(T) \} \\
b & = \min\bigl( 1, 2 \Pr\{ W \ge g(T) \} \bigr)
\end{align*}
Note that in all of these $a$ and $b$ depend on $T$ even though the notation
does not indicate this.

Thus we can get the fuzzy $P$-value for a two-tailed test by simply
calculating the fuzzy $P$-value for the one-tailed test for the tail the data
favor and multiplying the knots by two \emph{if} none of the resulting knots
exceeds one.
If some of the resulting knots exceed one, then we are in an uninteresting
case from a practical standpoint (the data give essentially no evidence
against the null hypothesis), but we should do the calculation correctly
anyway.

One way to think of the relation between the one-tailed and two-tailed fuzzy
$P$-values is that if $P$ is the one-tailed fuzzy $P$-value, then
$\min\bigl(2 P, 2 (1 - P)\bigr)$ is the two-tailed fuzzy $P$-value.
One easy case discussed above is when $P$ is concentrated below $1 / 2$, so
the two-tailed fuzzy $P$-value is $2 P$.
The other easy case discussed above is when $P$ is concentrated above $1 / 2$,
so the two-tailed fuzzy $P$-value is $2 (1 - P)$, which is twice the
one-tailed fuzzy $P$-value for the other-tailed test.
The complications arise when the support of the distribution of $P$ contains
$1 / 2$, in which case the upper bound of the support of the two-tailed fuzzy
$P$-value will be one.

Suppose we have $l \le u$ and $P$ is the fuzzy $P$-value for the upper-tailed
test (if $l > u$, swap).
Then $P$ is the mixture of $\UniformDis(a_k, b_k)$ random variables, where
\begin{equation} \label{endpoints-one}
\begin{split}
a_k & = \Pr\{ W < l + k \} \\
b_k & = \Pr\{ W \le l + k \}
\end{split}
\end{equation}
for $k = 0$, $\ldots$, $t$, and the mixing probabilities are
$\Pr\{ T = k \}$.

When $l + k \ge n / 2$, the endpoints \eqref{endpoints-one}, when doubled, are
wrong for the two-tailed $P$-value (because one or both exceeds one).
There are two cases to consider.

When $l + k = n / 2$, which means $n$ must be even, we then have
$$ 1 - 2 a_k = 2 b_k - 1 $$
and the entire mixing probability $\Pr\{ T = k \}$ should be assigned to the
interval $(2 a_k, 1)$.

When $l + k > n / 2$, which can happen whether $n$ is odd or even, we have
$$ 1 \le 2 a_k < 2 b_k $$
but also
\begin{align*}
2 - 2 a_k & = 2 b_{k^*} \\
2 - 2 b_k & = 2 a_{k^*}
\end{align*}
for some $k^* < k$.  In fact, we have
$$ l + k^* = n - (l + k) $$
hence
$$ k^* = n - 2 l - k $$
and now the mixing probability assigned to the interval
$(2 a_{k^*}, 2 b_{k^*})$ should be $\Pr\{ T = k^* \} + \Pr\{ T = k \}$.
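The folding just described is easy to carry out numerically.  Here is a
sketch (illustrative only, assuming $l \le u$; \texttt{two.tailed.mixture} is
our name, not the package interface).

<<two-tailed-mixture>>=
## mixture representation of the two-tailed fuzzy P-value for the
## sign test with ties: components Unif(2 a_k, min(1, 2 b_k)) with
## mixing probabilities Pr(T = k), folding components with
## l + k > n / 2 onto k* = n - 2 l - k as described above
two.tailed.mixture <- function(l, t, u) {
    stopifnot(l <= u)
    n <- l + t + u
    k <- 0:t
    a <- 2 * pbinom(l + k - 1, n, 1 / 2)  # 2 Pr(W < l + k)
    b <- 2 * pbinom(l + k, n, 1 / 2)      # 2 Pr(W <= l + k)
    prob <- dbinom(k, t, 1 / 2)           # Pr(T = k)
    for (kk in k[l + k > n / 2]) {
        kstar <- n - 2 * l - kk
        prob[kstar + 1] <- prob[kstar + 1] + prob[kk + 1]
        prob[kk + 1] <- 0
    }
    ## a component with l + k = n / 2 keeps its weight on (2 a_k, 1)
    data.frame(k = k, lower = a, upper = pmin(b, 1), prob = prob)[prob > 0, ]
}
two.tailed.mixture(l = 4, t = 5, u = 5)
@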
\subsection{Two-Sided Confidence Intervals}

In principle we get a two-tailed fuzzy confidence interval by ``inverting''
the two-tailed fuzzy test.  So there is nothing left to do.
In practice, we want to apply some additional theory.
Let $X_{(i)}$, $i = 0$, $\ldots$, $n + 1$ be the order statistics, where
$X_{(0)} = - \infty$ and $X_{(n + 1)} = + \infty$.
The result of the sign test does not change as $\mu$ changes within an
interval between order statistics.
Thus we only need to calculate for each order statistic and for each interval
between order statistics.
Moreover, we can save some work using the following theorems.
\begin{theorem} \label{th:zero}
If
$$ 2 \Pr \{ W < m \} < \alpha, $$
where $W \sim \BinomialDis(n, 1 / 2)$, then the membership function of the
fuzzy confidence interval with coverage $1 - \alpha$ is zero for
$$ \mu < X_{(m)} \opor X_{(n - m + 1)} < \mu. $$
\end{theorem}
If the only $m$ satisfying the condition is $m = 0$, then the assertion of
the theorem is vacuously true.
\begin{proof}
For $\mu < X_{(m)}$ we have at least $n - m + 1$ data values above $\mu$.
Since $m \le n / 2$ the latent test statistic is at least $n - m + 1$.
Hence the fuzzy $P$-value has support bounded above by
$2 \Pr(W \ge n - m + 1) = 2 \Pr(W < m) < \alpha$.
Hence we accept this $\mu$ with probability zero, and the membership function
of the fuzzy confidence interval is zero at this $\mu$.
The case $\mu > X_{(n - m + 1)}$ follows by symmetry.
\end{proof}
\begin{theorem} \label{th:one}
If
$$ \alpha \le 2 \Pr \{ W \le m \}, $$
where $W \sim \BinomialDis(n, 1 / 2)$, then the membership function of the
fuzzy confidence interval with coverage $1 - \alpha$ is one for
\begin{equation} \label{eq:one}
X_{(m + 1)} < \mu < X_{(n - m)}.
\end{equation}
\end{theorem}
If the only values of $m$ satisfying the condition have $m + 1 \ge n - m$,
then the assertion of the theorem is vacuously true.
\begin{proof}
If \eqref{eq:one} holds, then we have at least $m + 1$ data values below
$\mu$ and at least $m + 1$ above.
Hence we also have at most $n - m - 1$ data values below $\mu$ and at most
$n - m - 1$ above, and the latent test statistic is at most $n - m - 1$.
Hence the fuzzy $P$-value has support bounded below by
$2 \Pr(W > n - m - 1) = 2 \Pr(W \le m) \ge \alpha$.
Hence we accept this $\mu$ with probability one, and the membership function
of the fuzzy confidence interval is one at this $\mu$.
\end{proof}

Thus we see that if we choose $m$ to satisfy
\begin{equation} \label{quantile}
2 \Pr \{ W < m \} < \alpha \le 2 \Pr \{ W \le m \},
\end{equation}
where $W \sim \BinomialDis(n, 1 / 2)$, then we only need to calculate the
membership function of the fuzzy confidence interval at the points $X_{(m)}$,
$X_{(m + 1)}$, $X_{(n - m)}$, and $X_{(n - m + 1)}$, which need not be
distinct, and on the intervals $(X_{(m)}, X_{(m + 1)})$ and
$(X_{(n - m)}, X_{(n - m + 1)})$, if nonempty, on which the membership
function is constant.
Thus there are at most six numbers to calculate.
Theorems~\ref{th:zero} and~\ref{th:one} give the membership function
everywhere else.
These six numbers are calculated by carrying out the fuzzy two-tailed test
for the relevant hypothesized value of $\mu$ and then calculating
$\Pr \{ P > \alpha \}$, where $P$ is the fuzzy $P$-value.

\subsubsection{No Ties}

Conventional theory says, when the distribution of the $X_i$ is continuous,
that the interval $(X_{(k)}, X_{(n + 1 - k)})$ is a $1 - 2 \Pr\{ W < k \}$
confidence interval for the true unknown population median, where
$W \sim \BinomialDis(n, 1 / 2)$.
Suppose we want coverage $1 - \alpha$.
Then either one of the intervals already discussed has coverage $1 - \alpha$
or there is a unique $\alpha / 2$ quantile of the $\BinomialDis(n, 1 / 2)$
distribution.  Call it $m$.  Then
$$ 1 - 2 \Pr \{ W < m \} > 1 - \alpha > 1 - 2 \Pr \{ W \le m \} $$
and the left-hand side is the coverage of $(X_{(m)}, X_{(n + 1 - m)})$ and
the right-hand side is the coverage of $(X_{(m + 1)}, X_{(n - m)})$.
Thus a mixture of these two confidence intervals is a randomized confidence
interval with coverage $1 - \alpha$.

The corresponding fuzzy confidence interval has membership function that is
one on $(X_{(m + 1)}, X_{(n - m)})$, is $\gamma$ on the part of
$(X_{(m)}, X_{(n + 1 - m)})$ not in the narrower interval, and zero
elsewhere, where $\gamma$ is determined as follows.
The coverage of this interval is
$$ \gamma \bigl[ 1 - 2 \Pr \{ W < m \} \bigr]
+ (1 - \gamma) \bigl[ 1 - 2 \Pr \{ W \le m \} \bigr]. $$
Setting this equal to $1 - \alpha$ and solving for $\gamma$ gives
\begin{equation} \label{gamma}
\gamma = \frac{2 \Pr \{ W \le m \} - \alpha}{2 \Pr \{ W = m \}}
\end{equation}
and the condition that $m$ is the unique $\alpha / 2$ quantile of $W$
guarantees $0 < \gamma < 1$.

Note that this discussion agrees with Theorems~\ref{th:zero}
and~\ref{th:one}.  We have simply obtained more information.
Now we know that the membership function of the fuzzy confidence interval is
$\gamma$ on the intervals $(X_{(m)}, X_{(m + 1)})$ and
$(X_{(n - m)}, X_{(n - m + 1)})$.
Under the assumption that the distribution of the $X_i$ is continuous, the
values at the jumps of the membership function do not matter because single
points occur with probability zero (assuming ``no ties'' results from having
a continuous data distribution).
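For example, here is a sketch of the computation of $m$ and $\gamma$ in plain
R (\texttt{qbinom} returns the smallest $m$ satisfying the right-hand
inequality of \eqref{quantile}).

<<gamma-no-ties>>=
## choose m to satisfy (quantile) and compute gamma via (gamma)
n <- 10
alpha <- 0.05
m <- qbinom(alpha / 2, n, 1 / 2)
stopifnot(2 * pbinom(m - 1, n, 1 / 2) < alpha,
    alpha <= 2 * pbinom(m, n, 1 / 2))
gamma <- (2 * pbinom(m, n, 1 / 2) - alpha) / (2 * dbinom(m, n, 1 / 2))
c(m = m, gamma = gamma)
@

For $n = 10$ and $\alpha = 0.05$ this gives $m = 2$, so the membership
function is one on $(X_{(3)}, X_{(8)})$ and $\gamma$ on
$(X_{(2)}, X_{(3)})$ and $(X_{(8)}, X_{(9)})$.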
\paragraph{Caveat}
The above discussion is predicated on $m + 1 < n - m$, which is the same as
$m < (n - 1) / 2$.
This can only fail for ridiculously small coverage probabilities.
If $m \ge (n - 1) / 2$, then either $m = (n - 1) / 2$ when $n$ is odd or
$m = n / 2$ when $n$ is even, and
$$ 1 - \alpha < 1 - 2 \Pr\{ W < m \} = \begin{cases}
2 \Pr\{ W = (n - 1) / 2 \}, & \text{$n$ odd} \\
\Pr\{ W = n / 2 \}, & \text{$n$ even}
\end{cases} $$
and either $1 - \alpha$ is very small or $n$ is very small or both.

In this case our fuzzy confidence interval is zero outside the closure of the
interval $(X_{(m)}, X_{(n - m + 1)})$ and is $\gamma$ on this interval, where
the coverage is now
$$ 1 - \alpha = \gamma \bigl[ 1 - 2 \Pr \{ W < m \} \bigr] $$
so $\gamma$ is now given by
\begin{equation} \label{gamma-foo}
\gamma = \frac{1 - \alpha}{1 - 2 \Pr \{ W < m \}}
\end{equation}
and the ``core'' of the fuzzy confidence interval (the set on which its
membership function is one) is empty.

There is a somewhat more complicated form that manages to be either
\eqref{gamma} or \eqref{gamma-foo}, whichever is valid:
\begin{equation} \label{gamma-cool}
\gamma = \frac{\Pr \{ W \le m \opor W \ge n - m \} - \alpha}
{\Pr \{ W = m \opor W = n - m \}}
\end{equation}

\subsubsection{With Ties}

We claim that the same solution works with ties, except that when ties are
possible we must be careful about how the membership function of the fuzzy
interval is defined at jumps.
We claim the fuzzy interval still has the same form with membership function
\begin{equation} \label{non-degenerate}
I(\mu) = \begin{cases}
0, & \mu < X_{(m)} \\
\beta_1, & \mu = X_{(m)} \\
\gamma, & X_{(m)} < \mu < X_{(m + 1)} \\
\beta_2, & \mu = X_{(m + 1)} \\
1, & X_{(m + 1)} < \mu < X_{(n - m)} \\
\beta_3, & \mu = X_{(n - m)} \\
\gamma, & X_{(n - m)} < \mu < X_{(n - m + 1)} \\
\beta_4, & \mu = X_{(n - m + 1)} \\
0, & \mu > X_{(n - m + 1)}
\end{cases}
\end{equation}
where $m$ is chosen so that \eqref{quantile} holds.
Any or all of the intervals on which $I(\mu)$ is nonzero may be empty either
because of ties or, as mentioned in the ``caveat'' in the preceding section,
because $m + 1 \ge n - m$.
Any or all of the betas may also be forced to be equal because the order
statistics they go with are tied.
We may have $X_{(m)} = - \infty$ and $X_{(n - m + 1)} = + \infty$, in which
case the corresponding betas are irrelevant.
\begin{theorem} \label{th:gamma}
When \eqref{non-degenerate} is as described above, $\gamma$ in
\eqref{non-degenerate} is given by \eqref{gamma-cool}.
\end{theorem}
If both of the intervals $(X_{(m)}, X_{(m + 1)})$ and
$(X_{(n - m)}, X_{(n - m + 1)})$ are empty, then the theorem is vacuous.
\begin{proof}
For $X_{(m)} < \mu < X_{(m + 1)}$ we have exactly $m$ data values below $\mu$
and exactly $n - m$ above, and the latent test statistic is $n - m$.
The fuzzy $P$-value is uniform on the interval with endpoints
$2 \Pr\{ W < m \}$ and $2 \Pr \{ W \le m \}$ in case $m + 1 < n - m$ and
otherwise uniform on the interval with endpoints $2 \Pr\{ W < m \}$ and one.
The lower endpoint is the same in both cases, and we can write the upper
endpoint as $\Pr\{ W \le m \opor W \ge n - m \}$ in both cases.
Hence the probability the fuzzy $P$-value is greater than $\alpha$ is given
by \eqref{gamma-cool}, and this is the value of $I(\mu)$ for this $\mu$.
The case $X_{(n - m)} < \mu < X_{(n - m + 1)}$ follows by symmetry.
\end{proof}
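In R, \eqref{gamma-cool} is a one-liner.  The sketch below (illustrative
only, not the package interface) also checks agreement with \eqref{gamma} in
a case where $m + 1 < n - m$.

<<gamma-cool>>=
## gamma per (gamma-cool), valid whether or not m + 1 < n - m
gamma.cool <- function(m, n, alpha) {
    w <- 0:n
    pw <- dbinom(w, n, 1 / 2)
    (sum(pw[w <= m | w >= n - m]) - alpha) / sum(pw[w == m | w == n - m])
}
gamma.cool(2, 10, 0.05)
## agrees with (gamma) because m + 1 < n - m here
(2 * pbinom(2, 10, 1 / 2) - 0.05) / (2 * dbinom(2, 10, 1 / 2))
@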
Now we know everything about the fuzzy confidence interval except for the
betas, which can be determined by inverting the fuzzy test (as can the value
at all points), so now we are down to inverting the test at no more than four
points $X_{(m)}$, $X_{(m + 1)}$, $X_{(n - m)}$, and $X_{(n - m + 1)}$, any or
all of which may be tied.
When they are tied, there is no simple formula for the corresponding beta
(the simplest description is the one just given: invert the fuzzy test; the
value is one minus the fuzzy decision).
Thus we make no attempt at providing such a formula for the general case.
However, we can say a few things about the betas.
\begin{theorem} \label{th:convex}
The fuzzy confidence interval given by \eqref{non-degenerate} is convex.
\end{theorem}
Convexity of a fuzzy set with membership function $I$ is the property that
all of the level sets $\{\, \mu : I(\mu) \ge \lambda \,\}$,
$0 \le \lambda \le 1$, are convex.
\begin{proof}
First consider the case when $X_{(m + 1)} < X_{(n - m)}$ so there are points
$\mu$ where $I(\mu) = 1$.
Then, since all of the betas are probabilities, convexity only requires
$\beta_1 \le \gamma \le \beta_2$ if $X_{(m)} < X_{(m + 1)}$ and
$\beta_4 \le \gamma \le \beta_3$ if $X_{(n - m)} < X_{(n - m + 1)}$, and the
latter follows from the former by symmetry.

So consider $X_{(m)} = \mu < X_{(m + 1)}$.
Say we have $l$, $t$, and $u$ data values below, tied with, and above $\mu$,
respectively.  Then we have $l + t = m$ and $u = n - m$ and $t \ge 1$.
The latent test statistic is $n - m + T$, where
$T \sim \BinomialDis(t, 1 / 2)$.
The CDF of the fuzzy $P$-value has knots $2 \Pr\{ W > n - m + t - k \}$ and
values $\Pr\{ T < k \}$, where $k = 0$, \ldots, $t + 1$.
We can rewrite the knots as $2 \Pr\{ W < m - t + k \}$.
The two largest are $2 \Pr \{ W < m \}$ and $2 \Pr \{ W \le m \}$, which
bracket $\alpha$.
The probability of accepting $\mu$ is thus $\Pr \{ T = t \}$ times the
probability that a uniform random variable on this interval is greater than
$\alpha$, that is, $\gamma$ given by \eqref{gamma} times $\Pr \{ T = t \}$.
Hence we do have $\beta_1 \le \gamma$.

Then consider $X_{(m)} < \mu = X_{(m + 1)}$.
With $l$, $t$, and $u$ as above, we have $l = m$ and $u + t = n - m$ and
$t \ge 1$.
Now the two smallest knots are $2 \Pr \{ W < m \}$ and $2 \Pr \{ W \le m \}$.
The probability of accepting $\mu$ is now
$$ \beta_2 = \Pr\{ T > 0 \} + \gamma \cdot \Pr \{ T = 0 \} $$
so $\beta_2 > \gamma$.
That finishes the case $X_{(m + 1)} < X_{(n - m)}$.

We turn now to the case $X_{(m + 1)} = X_{(n - m)}$.
We conjecture (still to be proved) that $\beta_2 = \beta_3$ is the maximum of
the membership function, in which case convexity requires
$\beta_1 \le \gamma \le \beta_2$ if $X_{(m)} < X_{(m + 1)}$ and
$\beta_4 \le \gamma \le \beta_3$ if $X_{(n - m)} < X_{(n - m + 1)}$.
But we have already proved these, because the proofs above did not assume
anything about whether $X_{(m + 1)}$ was equal or unequal to $X_{(n - m)}$.
Thus the only issue is whether $\beta_2 = \beta_3$ is the maximum.
If $X_{(m)} < X_{(m + 1)}$, then we have proved
$\beta_1 \le \gamma \le \beta_2$, which implies that the maximum does not
occur to the left of $\beta_2 = \beta_3$.
If $X_{(m)} = X_{(m + 1)}$, then we have $\beta_1 = \beta_2 = \beta_3$, which
also implies that the maximum does not occur to the left of
$\beta_2 = \beta_3$.
By symmetry, the maximum also does not occur to the right of
$\beta_2 = \beta_3$.
That finishes the case $X_{(m + 1)} = X_{(n - m)}$.

We turn now to the case $X_{(m + 1)} > X_{(n - m)}$, in which case we must
have $m = n - m$, $\beta_1 = \beta_3$, and $\beta_2 = \beta_4$.
There are two cases to consider.
If $X_{(m)} < X_{(m + 1)}$, then we conjecture that $\gamma$ is the maximum
of the membership function, in which case convexity requires
$\beta_1 \le \gamma$ and $\beta_4 \le \gamma$, but these have already been
proved.
If $X_{(m)} = X_{(m + 1)}$, then $\beta_1 = \beta_2 = \beta_3 = \beta_4$ is
the only nonzero value of $I(\mu)$ and convexity holds trivially.
\end{proof}
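The two beta formulas derived in the proof are easy to compute.  Here is a
sketch using \texttt{gamma.cool} from above (\texttt{beta1} and \texttt{beta2}
are our illustrative names, not the package interface).

<<betas>>=
## beta1 for mu = X_(m) < X_(m+1) with t values tied at mu, and
## beta2 for X_(m) < mu = X_(m+1) with t values tied at mu,
## per the proof of the convexity theorem
beta1 <- function(gamma, t) gamma * dbinom(t, t, 1 / 2)
beta2 <- function(gamma, t)
    pbinom(0, t, 1 / 2, lower.tail = FALSE) + gamma * dbinom(0, t, 1 / 2)
g <- gamma.cool(2, 10, 0.05)
c(beta1 = beta1(g, 2), gamma = g, beta2 = beta2(g, 2))  # beta1 <= gamma <= beta2
@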
\subsubsection{No Ties, Continued}

We can refine the calculations above in the most common case.
\begin{theorem} \label{th:half}
When there are no ties among the data, the membership function
\eqref{non-degenerate} is equal to the average of its left and right limits.
\end{theorem}
This means
\begin{subequations}
\begin{align}
\beta_1 = \beta_4 & = \gamma / 2 \label{fran} \\
\beta_2 = \beta_3 & = \gamma / 2 + 1 / 2 \label{fred}
\end{align}
\end{subequations}
in the case $m + 1 < n - m$.
When $m + 1 = n - m$, we still have \eqref{fran}, but \eqref{fred} is
replaced by $\beta_2 = \beta_3 = \gamma$.
When $m = n - m$, we still have \eqref{fran}, but \eqref{fred} is replaced by
$\beta_1 = \beta_2 = \beta_3 = \beta_4$.
\begin{proof}
When there are no ties and $X_{(m)} = \mu$, then we have $m - 1$ data values
below, one tied with, and $n - m$ above $\mu$, respectively.
The latent test statistic is $n - m + T$, where
$T \sim \BinomialDis(1, 1 / 2)$.
The distribution function of the fuzzy $P$-value is continuous and piecewise
linear with knots $2 \Pr \{ W < m - 1 \}$, $2 \Pr \{ W < m \}$, and
$\Pr \{ W \le m \opor W \ge n - m \}$ and values at these knots $0$, $1 / 2$,
and $1$.
As always, the probability of accepting $\mu$ is $\Pr \{ P > \alpha \}$, and
by \eqref{quantile} we have
$2 \Pr \{ W < m \} < \alpha < \Pr \{ W \le m \opor W \ge n - m \}$, so the
probability of accepting $\mu$ is $\gamma / 2$, where $\gamma$ is given by
\eqref{gamma-cool}.

When there are no ties and $X_{(m + 1)} = \mu$ and $m + 1 < n - m$, the
distribution function of the fuzzy $P$-value is continuous and piecewise
linear with knots $2 \Pr \{ W < m \}$, $2 \Pr \{ W \le m \}$, and
$\Pr \{ W \le m + 1 \opor W \ge n - m - 1 \}$ and values at these knots $0$,
$1 / 2$, and $1$.
Again, $2 \Pr \{ W < m \} < \alpha < 2 \Pr \{ W \le m \}$, so
$\Pr \{ P > \alpha \} = \gamma / 2 + 1 / 2$, where $\gamma$ is given by
\eqref{gamma}.

When there are no ties and $X_{(m + 1)} = \mu$ and $m + 1 = n - m$, we have
$\Pr \{ W \le m \} = 1 / 2$, the fuzzy $P$-value is uniformly distributed on
the interval with endpoints $2 \Pr \{ W < m \}$ and one, and the probability
of accepting $\mu$ is $(1 - \alpha) / [1 - 2 \Pr\{W < m\}]$, which is
\eqref{gamma-foo}.

When there are no ties and $X_{(m + 1)} = \mu$ and $m = n - m$, there is
nothing to prove because $\beta_1 = \beta_3$ and $\beta_2 = \beta_4$ follow
from $m = n - m$.
\end{proof}
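The no-ties relations check out numerically: with $t = 1$ the beta formulas
from the proof of the convexity theorem reduce to \eqref{fran} and
\eqref{fred} (using the illustrative helpers defined above).

<<check-half>>=
## with no ties (t = 1): beta1 = gamma / 2 and beta2 = gamma / 2 + 1 / 2
g <- gamma.cool(2, 10, 0.05)
all.equal(beta1(g, 1), g / 2)
all.equal(beta2(g, 1), g / 2 + 1 / 2)
@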
\subsection{One-Sided Confidence Intervals}

From the preceding, one-sided intervals should now be obvious.
We merely record a few specific details.
A lower bound interval has the form
\begin{equation} \label{non-degenerate-low}
I(\mu) = \begin{cases}
0, & \mu < X_{(m)} \\
\beta_1, & \mu = X_{(m)} \\
\gamma, & X_{(m)} < \mu < X_{(m + 1)} \\
\beta_2, & \mu = X_{(m + 1)} \\
1, & X_{(m + 1)} < \mu < \infty
\end{cases}
\end{equation}
where $m$ is chosen so that
\begin{equation} \label{quantile-low}
\Pr \{ W < m \} < \alpha \le \Pr \{ W \le m \}
\end{equation}
and
\begin{equation} \label{gamma-low}
\gamma = \frac{\Pr \{ W \le m \} - \alpha}{\Pr \{ W = m \}}
\end{equation}
Theorem~\ref{th:convex} and Theorem~\ref{th:half} still apply and the betas
can still be determined by inverting the fuzzy test.

\section{The Rank Sum Test}

Now we have two samples $X_1$, $\ldots$, $X_m$ and $Y_1$, $\ldots$, $Y_n$.
The test is based on the differences
$$ Z_{i j} = X_i - Y_j, \qquad i = 1, \ldots, m, \ j = 1, \ldots, n. $$
Let $Z_{(i)}$, $i = 1$, $\ldots$, $m n$ be the order statistics of the
$Z_{i j}$.
If we assume (1) the two samples are independent, (2) each sample is IID,
(3) the distribution of the $X_i$ is continuous, and (4) the distribution of
the $Y_j$ is the same as the distribution of the $X_i$ except for a shift,
then (i) the marginal distribution of the $Z_{i j}$ is symmetric about the
shift $\mu$ and (ii) the number of $Z_{i j}$ less than $\mu$ has the
distribution of the Mann-Whitney form of the test statistic for the Wilcoxon
rank sum test.

Thus all of the theory is nearly the same as in the preceding section but
with the following differences.
\begin{itemize}
\item The $Z_{(i)}$ now play the role formerly played by the $X_{(i)}$.
\item The Mann-Whitney distribution now plays the role formerly played by the
binomial distribution.
\item We have $Z_{i j}$ tied with $\mu$ when $X_i$ is tied with $Y_j + \mu$.
Infinitesimal jittering only breaks ties within one class of tied $X_i$ and
$Y_j + \mu$ values.
The number of $Z_{i j} < \mu$ coming from such a class with $m_k$ of the
$X_i$ tied with $n_k$ of the $Y_j + \mu$ has the Mann-Whitney distribution
for sample sizes $m_k$ and $n_k$.
The total number is the sum over each tied class.
\end{itemize}

\subsection{One-Tailed Tests}

Let $\MannWhitneyDis(m, n)$ denote the Mann-Whitney distribution, the
distribution of the number of negative $Z_{i j}$ when there are no ties.
The range of this distribution is zero to $N = m n$.
It is symmetric, with center of symmetry $N / 2$.
Let $W \sim \MannWhitneyDis(m, n)$.

If there are no ties and the observed value of the Mann-Whitney statistic is
$w$, then the fuzzy $P$-value for an upper-tailed test is uniformly
distributed on the interval with endpoints $\Pr\{ W < w \}$ and
$\Pr \{ W \le w \}$.

If there are ties, then let $m_k$ and $n_k$ be the number of $X_i$ and the
number of $Y_j + \mu$ tied at the $k$-th tie value, let
$T_k \sim \MannWhitneyDis(m_k, n_k)$, and let
$$ T = T_1 + \cdots + T_K, $$
where $K$ is the number of distinct tie values.
There is no ``brand name'' for the distribution of $T$.
The Mann-Whitney distribution obviously does not have an ``addition rule''
like the binomial: $\MannWhitneyDis(m_1 + \cdots + m_K, n_1 + \cdots + n_K)$
is \emph{clearly not} the distribution of $T$.
We know the distribution of the $T_k$.
We must simply calculate the distribution of $T$ by brute force and ignorance
(BFI) using the convolution of probability vectors.
$T$ ranges from zero to
$$ t = \sum_{k = 1}^K m_k n_k. $$
Denote its distribution function by $G$.
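The BFI convolution is a few lines of R (an illustrative sketch;
\texttt{dwilcox} is the Mann-Whitney density in base R, and \texttt{convolve}
with reversed second argument gives the ordinary convolution).

<<bfi-convolve>>=
## distribution of T = T_1 + ... + T_K, T_k ~ MannWhit(m_k, n_k),
## by convolving probability vectors
mw.convolve <- function(m, n) {
    p <- 1
    for (k in seq_along(m)) {
        pk <- dwilcox(0:(m[k] * n[k]), m[k], n[k])
        p <- convolve(p, rev(pk), type = "open")
    }
    p   # probabilities of T = 0, ..., sum(m * n)
}
p <- mw.convolve(m = c(2, 1), n = c(1, 3))
sum(p)           # should be one (up to floating point fuzz)
G <- cumsum(p)   # distribution function of T
round(G, 4)
@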
Now let $l$ be the number of $Z_{i j}$ less than $\mu$.
The latent test statistic is $l + T$.
The fuzzy $P$-value has a continuous, piecewise linear CDF with knots at
$\Pr\{ W < l + k \}$ and corresponding values $\Pr\{ T < k \} = G(k - 1)$ for
$k = 0$, $\ldots$, $t + 1$ (with the convention $G(-1) = 0$).

The lower-tailed test is the same (merely swap $X$'s and $Y$'s or use the
symmetry of the Mann-Whitney distribution and replace $l$ by $N - l - t$).

\subsection{Two-Tailed Tests}

Two-tailed tests present much the same issues as we saw with the sign test.
If $P$ is the one-tailed fuzzy $P$-value, then the two-tailed fuzzy $P$-value
is $\min\bigl(2 P, 2 (1 - P)\bigr)$.

\subsection{Confidence Intervals}

Everything is the same as with the sign test.
Merely replace the $X_{(i)}$ there with the $Z_{(i)}$ here and replace the
binomial distribution with the Mann-Whitney distribution.

\section{The Signed Rank Test}

Now we have one sample $X_1$, $\ldots$, $X_n$.
The test is based on the $N = n (n + 1) / 2$ Walsh averages
\begin{equation} \label{eq:walsh}
\frac{X_i + X_j}{2}, \qquad i \le j.
\end{equation}
Let $Z_{(k)}$, $k = 1$, $\ldots$, $N$ be the order statistics of the Walsh
averages.
If we assume (1) the sample is IID, (2) the distribution of the $X_i$ is
continuous, and (3) the distribution of the $X_i$ is symmetric about $\mu$,
then the number of Walsh averages greater than $\mu$ has the distribution of
the test statistic for the Wilcoxon signed rank test, which is symmetric
about $N / 2$.

Thus all of the theory is nearly the same as for the sign test but with the
following differences.
\begin{itemize}
\item The $Z_{(k)}$ now play the role formerly played by the $X_{(i)}$.
\item The signed rank distribution now plays the role formerly played by the
binomial distribution.
\item We have $Z_{(k)}$ tied with $\mu$ in one of two cases.
\begin{itemize}
\item Some number $m$ of the $X_i$ are tied with $\mu$.
(Each $X_i$ is a Walsh average, the case $i = j$ in \eqref{eq:walsh}.)
Infinitesimal jittering only breaks ties within one class of tied values.
In this case the number of these $m (m + 1) / 2$ Walsh averages that are
greater than $\mu$ after jittering has the signed rank distribution with
sample size $m$.
\item Some number $k$ of the $X_i$ are tied with each other and less than
$\mu$, another number $m$ of the $X_i$ are tied with each other and greater
than $\mu$, and $(X_i + X_j) / 2 = \mu$ for $X_i$ in the former class and
$X_j$ in the latter.
Again, infinitesimal jittering only breaks ties within these classes of tied
values.
In this case, the number of these $k m$ Walsh averages that are greater than
$\mu$ after jittering has the Mann-Whitney distribution with sample sizes $k$
and $m$.
\end{itemize}
\end{itemize}
Thus we see that both the signed rank and Mann-Whitney distributions are
involved in calculating fuzzy $P$-values and fuzzy confidence intervals
associated with the signed rank test.

\subsection{One-Tailed Tests}

Let $\MannWhitneyDis(m, n)$ denote the Mann-Whitney distribution, with sample
sizes $m$ and $n$, and let $\SignRankDis(n)$ denote the Wilcoxon signed rank
distribution for sample size $n$.
If there are no ties, if $W$ is a random variable having the
$\SignRankDis(n)$ distribution, and if the observed value of the signed rank
statistic is $w$, here taken to be the number of Walsh averages less than
$\mu$ (which by symmetry has the same null distribution as the number
greater, and which matches the treatment of ties below), then the fuzzy
$P$-value for an upper-tailed test is uniformly distributed on the interval
with endpoints $\Pr\{ W < w \}$ and $\Pr \{ W \le w \}$.
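A minimal no-ties sketch in plain R (\texttt{psignrank} is the signed rank
CDF in base R; the data here are simulated purely for illustration):

<<signrank-endpoints>>=
## endpoints of the fuzzy P-value for the upper-tailed signed rank
## test with no ties, w = number of Walsh averages less than mu
set.seed(42)
x <- rnorm(9, mean = 1)
mu <- 0
n <- length(x)
s <- outer(x, x, "+") / 2
walsh <- s[lower.tri(s, diag = TRUE)]   # the n (n + 1) / 2 Walsh averages
w <- sum(walsh < mu)
c(lower = psignrank(w - 1, n), upper = psignrank(w, n))  # Pr(W < w), Pr(W <= w)
@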
If there are ties, then let $T$ denote a random variable having the
distribution of the number of Walsh averages that were tied with $\mu$ before
infinitesimal jittering and are greater than $\mu$ after jittering, and let
$t$ be the number of Walsh averages tied with $\mu$.
Let $G$ denote the cumulative distribution function of the distribution of
$T$.
As in the preceding section, the distribution of $T$ is, in general, not a
brand name distribution.
Rather it is the distribution of a sum of random variables having Wilcoxon
signed rank or Mann-Whitney distributions.

Now let $l$ be the number of observed Walsh averages less than $\mu$.
The latent test statistic is $l + T$ (by symmetry $T$ and $t - T$ have the
same distribution, so it does not matter that $T$ counts the tied Walsh
averages going above $\mu$ rather than below).
The fuzzy $P$-value has a continuous, piecewise linear CDF with knots at
$\Pr\{ W < l + k \}$ and corresponding values $\Pr\{ T < k \} = G(k - 1)$ for
$k = 0$, $\ldots$, $t + 1$ (again with $G(-1) = 0$).

The lower-tailed test is the same except let $l$ be the number of observed
Walsh averages greater than $\mu$.

\subsection{Two-Tailed Tests and Confidence Intervals}

The issues here are similar to those for the sign test and rank sum test.

\begin{thebibliography}{}

\bibitem[Geyer and Meeden(2005)]{geyer-meeden}
Geyer, C.~J. and Meeden, G.~D. (2005).
\newblock Fuzzy and randomized confidence intervals and P-values (with
discussion).
\newblock \emph{Statistical Science} \textbf{20} 358--387.

\bibitem[Thompson and Geyer(2007)]{thompson-geyer}
Thompson, E.~A. and Geyer, C.~J. (2007).
\newblock Fuzzy P-values in latent variable problems.
\newblock \emph{Biometrika} \textbf{94} 49--60.

\end{thebibliography}

\end{document}