<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<GmsArticle>
  <MetaData>
    <Identifier>mibe000104</Identifier>
    <IdentifierDoi>10.3205/mibe000104</IdentifierDoi>
    <IdentifierUrn>urn:nbn:de:0183-mibe0001041</IdentifierUrn>
    <ArticleType>Research Article</ArticleType>
    <TitleGroup>
      <Title language="en">A SAS&#47;IML algorithm for exact nonparametric paired tests</Title>
      <TitleTranslated language="de">Ein SAS&#47;IML-Algorithmus f&#252;r exakte nichtparametrische Tests f&#252;r gepaarte Beobachtungen</TitleTranslated>
    </TitleGroup>
    <CreatorList>
      <Creator>
        <PersonNames>
          <Lastname>Leuchs</Lastname>
          <LastnameHeading>Leuchs</LastnameHeading>
          <Firstname>Ann-Kristin</Firstname>
          <Initials>AK</Initials>
        </PersonNames>
        <Address>Department of Mathematics and Technique, RheinAhrCampus, Koblenz University of Applied Sciences, S&#252;dallee 2, 53424 Remagen, Germany<Affiliation>Department of Mathematics and Technique, RheinAhrCampus, Koblenz University of Applied Sciences, Remagen, Germany</Affiliation></Address>
        <Email>aleuchs&#64;rheinahrcampus.de</Email>
        <Creatorrole corresponding="yes" presenting="no">author</Creatorrole>
      </Creator>
      <Creator>
        <PersonNames>
          <Lastname>Neuh&#228;user</Lastname>
          <LastnameHeading>Neuh&#228;user</LastnameHeading>
          <Firstname>Markus</Firstname>
          <Initials>M</Initials>
        </PersonNames>
        <Address>
          <Affiliation>Department of Mathematics and Technique, RheinAhrCampus, Koblenz University of Applied Sciences, Remagen, Germany</Affiliation>
        </Address>
        <Creatorrole corresponding="no" presenting="no">author</Creatorrole>
      </Creator>
    </CreatorList>
    <PublisherList>
      <Publisher>
        <Corporation>
          <Corporatename>German Medical Science GMS Publishing House</Corporatename>
        </Corporation>
        <Address>D&#252;sseldorf</Address>
      </Publisher>
    </PublisherList>
    <SubjectGroup>
      <SubjectheadingDDB>610</SubjectheadingDDB>
    </SubjectGroup>
    <DatePublishedList>
      
    <DatePublished>20101215</DatePublished><DateRepublished>20120405</DateRepublished></DatePublishedList>
    <Language>engl</Language>
    <SourceGroup>
      <Journal>
        <ISSN>1860-9171</ISSN>
        <Volume>6</Volume>
        <Issue>1</Issue>
        <JournalTitle>GMS Medizinische Informatik, Biometrie und Epidemiologie</JournalTitle>
        <JournalTitleAbbr>GMS Med Inform Biom Epidemiol</JournalTitleAbbr>
      </Journal>
    </SourceGroup>
    <ArticleNo>04</ArticleNo>
    <Erratum><DateLastErratum>20120405</DateLastErratum><Pgraph>A minus sign has been added in Attach. 2 Long version of the macro, page 2: </Pgraph><Pgraph>shift &#61; j(adshort&#91;k&#93;,1,0) &#47;&#47; prob&#91;1:lng&#43;1</Pgraph><Pgraph><Indentation><Indentation><Indentation>-adshort&#91;k&#93;&#93;;</Indentation></Indentation></Indentation></Pgraph></Erratum>
  </MetaData>
  <OrigData>
    <Abstract language="de" linked="yes"><Pgraph>Bei der Auswertung von Versuchen sind h&#228;ufig zwei abh&#228;ngige Stichproben zu vergleichen. Eine M&#246;glichkeit der Auswertung besteht dann darin, f&#252;r jedes Paar die Differenz zu berechnen und anschlie&#223;end einen Einstichproben-Test auf diese Differenzen anzuwenden. In diesem Artikel werden drei verschiedene nichtparametrische Tests f&#252;r den Vergleich gepaarter Daten (d.h. Einstichproben-Tests) diskutiert. Ein in SAS&#47;IML geschriebenes Macro wird pr&#228;sentiert, das diese Tests als exakte Permutationstests berechnet. Das Macro basiert auf einem von Munzel &#38; Brunner (2002) <TextLink reference="1"></TextLink> vorgestelltem Shift-Algorithmus.</Pgraph></Abstract>
    <Abstract language="en" linked="yes"><Pgraph>A common problem in practice is the comparison of two dependent samples. One possibility to evaluate such data is to compute the difference for each pair and apply a one-sample test. In this paper we discuss three nonparametric tests for the comparison of paired samples (i.e. one-sample tests). We present a macro written in SAS&#47;IML to perform these tests as exact permutation tests. The macro is based on a shift-algorithm presented by Munzel &#38; Brunner (2002) <TextLink reference="1"></TextLink>.</Pgraph></Abstract>
    <TextBlock linked="yes" name="Introduction">
      <MainHeadline>Introduction</MainHeadline><Pgraph>A common problem in practice is the comparison of two dependent samples. Examples are before&#47;after comparisons, samples of the same subjects or samples of matched pairs of related subjects. </Pgraph><Pgraph>Here, we consider an example presented by Buck <TextLink reference="2"></TextLink>. A study was performed to investigate an antibiotic. Its efficacy was measured by means of the number of leucocytes in the urine. The study population consisted of 10 subjects. Each subject was treated with a daily dose of 4 g. For each subject the number of leucocytes was determined before and 4 weeks after the treatment. The observed values for the 10 pairs are listed in Table 1 <ImgLink imgNo="1" imgType="table"/>.</Pgraph><Pgraph>The distribution of the number of leucocytes is unknown. Therefore parametric methods to determine the efficacy of the antibiotic, e.g. a one-sample t-test using the differences (baseline &#8211; after treatment), might not be appropriate. Instead, the differences (Table 1 <ImgLink imgNo="1" imgType="table"/>) can be analysed using nonparametric tests such as the Wilcoxon signed rank test, the modification of Wilcoxon&#8217;s test according to Pratt, or a test based on the original data. In this article, we discuss all three tests and present a macro written in SAS&#47;IML to perform these tests as exact permutation tests. </Pgraph></TextBlock>
    <TextBlock linked="yes" name="Nonparametric tests for paired data">
      <MainHeadline>Nonparametric tests for paired data</MainHeadline><Pgraph>In the following we denote the two paired samples with <Mark2>X</Mark2> &#61; (<Mark2>x</Mark2><Subscript>1</Subscript>, <Mark2>x</Mark2><Subscript>2</Subscript>, &#8230; , <Mark2>x</Mark2><Subscript>n</Subscript>) and <Mark2>Y</Mark2> &#61; (<Mark2>y</Mark2><Subscript>1</Subscript>, <Mark2>y</Mark2><Subscript>2</Subscript>, &#8230;, <Mark2>y</Mark2><Subscript>n</Subscript>), thus the sample size is <Mark2>n</Mark2>. Moreover, let <Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2> &#61; <Mark2>x</Mark2><Mark2><Subscript>i</Subscript></Mark2> &#8211; <Mark2>y</Mark2><Mark2><Subscript>i</Subscript></Mark2> be the difference for the <Mark2>i</Mark2>-th pair (<Mark2>x</Mark2><Mark2><Subscript>i</Subscript></Mark2>, <Mark2>y</Mark2><Mark2><Subscript>i</Subscript></Mark2>), <Mark2>i</Mark2> &#61; 1, &#8230; , <Mark2>n</Mark2>. </Pgraph><Pgraph>One way to handle such paired data is to compute the difference of the two values for each pair. Then a test statistic based on these differences can be calculated. Due to the differences the two-sample problem is reduced to a one-sample problem. </Pgraph><Pgraph>For nonparametric tests there are often two alternative ways to carry out the test: one is to use the asymptotic distribution of the test statistic, the other is to perform an exact test, i.e. to use the exact null distribution of the statistic. In this paper we will focus primarily on the exact tests. To perform an exact nonparametric test, one needs to determine the exact null distribution (i.e. permutation distribution) of the test statistic.</Pgraph><Pgraph>To determine this distribution one needs to consider all possible permutations of the observed data. Since we consider paired data permutations are only possible within the pairs. Therefore the only way to permute is to swap the two values of a pair, i.e. to change the sign of the difference. 
Consequently, there are two possible permutations for one pair and 2<Mark2><Superscript>n</Superscript></Mark2> possible permutations for <Mark2>n</Mark2> pairs. Given all these 2<Mark2><Superscript>n</Superscript></Mark2> permutations the exact null distribution can be obtained by computing the test statistic for each permutation.</Pgraph><Pgraph>To be more precise, an exact nonparametric test for paired data consists of the following four steps, irrespective of the test statistic used <TextLink reference="3"></TextLink>.</Pgraph><Pgraph><OrderedList><ListItem level="1" levelPosition="1" numString="1.">The differences <Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2> have to be computed for the <Mark2>n</Mark2> pairs of data, and the test statistic has to be computed for the observed differences.</ListItem><ListItem level="1" levelPosition="2" numString="2.">For the <Mark2>n</Mark2> pairs all 2<Mark2><Superscript>n</Superscript></Mark2> possible assignments of plus and minus signs to the &#124;<Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2>&#124;&#8217;s have to be obtained.</ListItem><ListItem level="1" levelPosition="3" numString="3.">The test statistic has to be computed for each of the 2<Mark2><Superscript>n</Superscript></Mark2> possible assignments.</ListItem><ListItem level="1" levelPosition="4" numString="4.">Then, the p-value can be computed as the proportion of assignments with a test statistic as or more supportive of the alternative than the observed value.</ListItem></OrderedList></Pgraph><Pgraph>Below we will concentrate on three different tests: Wilc<TextGroup><PlainText>oxo</PlainText></TextGroup>n&#8217;s signed rank test, Pratt&#8217;s modification of Wilcoxon&#8217;s signed rank test and a test based on original data. All three tests share the same null hypothesis, namely, the median <Mark2>&#952;</Mark2> of the differences is zero: <Mark2>H</Mark2><Subscript>0</Subscript> : <Mark2>&#952;</Mark2> &#61; 0. 
The alternative can be either one-sided (<Mark2>H</Mark2><Subscript>1 </Subscript>: <Mark2>&#952;</Mark2> &#62; 0, respectively <TextGroup><Mark2>H</Mark2><Subscript>1</Subscript><PlainText> : </PlainText><Mark2>&#952;</Mark2><PlainText> &#60; 0</PlainText></TextGroup>) or two-sided (<Mark2>H</Mark2><Subscript>1</Subscript> : <Mark2>&#952;</Mark2> &#8800; 0). Moreover, we assume that the differences <Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2> are independent and symmetrically distributed. Note that the difference between two exchangeable random variables has a symmetric distribution <TextLink reference="4"></TextLink>.</Pgraph></TextBlock>
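    <TextBlock linked="no" name="Sketch: enumerating all sign assignments">
      <MainHeadline>Sketch: enumerating all sign assignments</MainHeadline><Pgraph>The four steps listed above can be written down directly for small <Mark2>n</Mark2>. The following Python sketch is only an illustration and is not part of the SAS&#47;IML macro; the function name <Mark2>exact_p_greater</Mark2> is hypothetical. It assumes integer input values (for example signed ranks without ties), so that sums can be compared exactly, and it considers the one-sided alternative <Mark2>&#952;</Mark2> &#62; 0.</Pgraph><Pgraph><![CDATA[
```python
from fractions import Fraction
from itertools import product


def exact_p_greater(d):
    """One-sided exact p-value for the alternative 'median > 0'.

    d holds signed values (differences or signed ranks). Zeros are
    dropped because no sign can be assigned to them.
    """
    abs_values = [abs(v) for v in d if v != 0]
    observed = sum(v for v in d if v > 0)                       # step 1
    hits = 0
    for signs in product((0, 1), repeat=len(abs_values)):      # step 2
        stat = sum(a for a, s in zip(abs_values, signs) if s)  # step 3
        if stat >= observed:
            hits += 1
    return Fraction(hits, 2 ** len(abs_values))                 # step 4


# Small illustrative vector (the one used later for the shift-algorithm).
p = exact_p_greater([2, -4, 4])
print(p)
```
]]></Pgraph><Pgraph>Because the loop visits all 2<Mark2><Superscript>n</Superscript></Mark2> assignments, this brute-force approach is practical only for small samples, which is the motivation for the shift-algorithm described in this paper.</Pgraph></TextBlock>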
    <TextBlock linked="yes" name="Wilcoxon&#39;s signed rank test">
      <MainHeadline>Wilcoxon&#39;s signed rank test</MainHeadline><Pgraph>First assume that none of the differences is zero. The first step is to assign ranks to the absolute values of the differences &#124;<Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2>&#124;: the smallest &#124;<Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2>&#124; gets rank 1, the second smallest &#124;<Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2>&#124; gets rank 2, and so on until the largest &#124;<Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2>&#124; gets rank <Mark2>n</Mark2>. In the presence of ties (i.e. &#124;<Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2>&#124;&#61;&#124;<Mark2>d</Mark2><Mark2><Subscript>j</Subscript></Mark2>&#124; for some <Mark2>i</Mark2> &#8800; <Mark2>j</Mark2>) average ranks are assigned.</Pgraph><Pgraph>The statistic <Mark2>R</Mark2><Superscript>&#43;</Superscript> of the signed rank test is given by the sum of the ranks of the positive differences. One can compute the exact null distribution of this statistic using all 2<Mark2><Superscript>n</Superscript></Mark2> permutations as mentioned above. The exact p-value, either one-sided or two-sided, can be obtained from this null distribution. When the asymptotic distribution of the test statistic is used, one needs the standardised statistic (<Mark2>R</Mark2><Superscript>&#43;</Superscript> &#8211; <Mark2>E</Mark2><Subscript>0</Subscript>(<Mark2>R</Mark2><Superscript>&#43;</Superscript>))&#47;&#8730;<Mark2>Var</Mark2><Subscript>0</Subscript>(<Mark2>R</Mark2><Superscript>&#43;</Superscript>), i.e. the standardised rank sum of the positive differences, which is asymptotically standard normal. Under the null hypothesis the expected value of the statistic is given by <ImgLink imgNo="1" imgType="inlineFigure"/>. If there are no ties the variance under the null hypothesis is given by <ImgLink imgNo="2" imgType="inlineFigure"/>. In the presence of ties this variance changes. 
A formula for the corrected variance is given by <TextLink reference="5"></TextLink>.</Pgraph><Pgraph>Up to now we assumed that none of the differences is zero. If there were differences equal to zero, Wilcoxon suggested discarding all zeros and applying the signed rank test to the reduced sample <TextLink reference="6"></TextLink>. In most applications this suggestion is applied, as it is in our macro for the Wilcoxon signed rank test.</Pgraph><Pgraph>The Wilcoxon signed rank test can be performed using the SAS procedure UNIVARIATE. Even in SAS version 9.2 the test is, by default, carried out based on the asymptotic distribution if the remaining sample size is larger than 20. If the remaining sample size is &#8804;20 an exact test is performed.</Pgraph></TextBlock>
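    <TextBlock linked="no" name="Sketch: signed ranks and the standardised statistic">
      <MainHeadline>Sketch: signed ranks and the standardised statistic</MainHeadline><Pgraph>The ranking rule and the normal approximation described above can be sketched in a few lines of Python. This is an illustration only, not the code of the macro; the function names are hypothetical. The sketch assumes the usual no-ties expressions <Mark2>n</Mark2>(<Mark2>n</Mark2>&#43;1)&#47;4 for the null expectation and <Mark2>n</Mark2>(<Mark2>n</Mark2>&#43;1)(2<Mark2>n</Mark2>&#43;1)&#47;24 for the null variance, where <Mark2>n</Mark2> is the number of non-zero differences; the tie-corrected variance is not implemented.</Pgraph><Pgraph><![CDATA[
```python
import math


def wilcoxon_signed_ranks(d):
    """Signed midranks of the non-zero differences (zeros are discarded)."""
    nonzero = [v for v in d if v != 0]
    ordered = sorted(abs(v) for v in nonzero)
    # Midrank = average of the 1-based positions occupied by a value.
    midrank = {}
    for a in set(ordered):
        first = ordered.index(a) + 1
        count = ordered.count(a)
        midrank[a] = first + (count - 1) / 2
    return [math.copysign(midrank[abs(v)], v) for v in nonzero]


def standardised_statistic(signed_ranks):
    """(R+ - E0(R+)) / sqrt(Var0(R+)), using the no-ties variance."""
    n = len(signed_ranks)
    r_plus = sum(r for r in signed_ranks if r > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (r_plus - mean) / math.sqrt(var)


# Differences of the leucocyte example as entered in the DATA step.
diffs = [0.8, 3, 2.3, 4.3, 4.8, 4.5, 0, 2.8, -2, 0]
ranks = wilcoxon_signed_ranks(diffs)
print(sum(r for r in ranks if r > 0))   # rank sum of the positive differences
print(standardised_statistic(ranks))
```
]]></Pgraph><Pgraph>With only eight non-zero differences the normal approximation is shown purely for illustration; for samples of this size the exact test is the natural choice.</Pgraph></TextBlock>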
    <TextBlock linked="yes" name="Pratt&#39;s modification of Wilcoxon&#39;s signed rank test">
      <MainHeadline>Pratt&#39;s modification of Wilcoxon&#39;s signed rank test</MainHeadline><Pgraph>In contrast to Wilcoxon&#8217;s signed rank test, where all zeros are ignored, the modification according to Pratt <TextLink reference="7"></TextLink> accounts for them. In the following <Mark2>n</Mark2><Subscript>0</Subscript> denotes the number of zeros in the sample of differences. Pratt suggests assigning each zero the rank zero. All non-zero differences are ranked according to their absolute values with values from <Mark2>n</Mark2><Subscript>0</Subscript> &#43; 1 to <Mark2>n</Mark2>, i.e. the smallest non-zero difference &#124;<Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2>&#124; gets rank <Mark2>n</Mark2><Subscript>0</Subscript> &#43; 1, etc., until the largest &#124;<Mark2>d</Mark2><Mark2><Subscript>i</Subscript></Mark2>&#124; gets rank <Mark2>n</Mark2>. </Pgraph><Pgraph>Analogous to Wilcoxon&#8217;s signed rank test, the statistic of Pratt&#8217;s modification is given by the sum of the ranks for positive differences. One can compute the exact null distribution of this statistic using all 2<Mark2><Superscript>n</Superscript></Mark2> permutations as mentioned above. The exact p-value, either one-sided or two-sided, can be obtained from this null distribution.</Pgraph><Pgraph>Note that due to the modification the expectation of the statistic under the null hypothesis changes: <ImgLink imgNo="3" imgType="inlineFigure"/>. The variance is the same as for Wilcoxon&#8217;s test. The standardised test statistic is still asymptotically standard normal <TextLink reference="8"></TextLink>. The test according to Pratt is not implemented in SAS.</Pgraph></TextBlock>
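    <TextBlock linked="no" name="Sketch: ranks according to Pratt">
      <MainHeadline>Sketch: ranks according to Pratt</MainHeadline><Pgraph>Pratt&#8217;s ranking rule can be illustrated with the following Python sketch, which is not taken from the macro; the function names are hypothetical. Using average ranks for tied absolute values is an assumption carried over from Wilcoxon&#8217;s test. The null expectation is computed as half the sum of the absolute ranks, which holds because each non-zero difference is positive with probability 1&#47;2 under the null hypothesis.</Pgraph><Pgraph><![CDATA[
```python
def pratt_signed_ranks(d):
    """Signed ranks after Pratt: zeros get rank 0, the rest n0+1, ..., n."""
    n0 = sum(1 for v in d if v == 0)
    ordered = sorted(abs(v) for v in d if v != 0)
    ranks = []
    for v in d:
        if v == 0:
            ranks.append(0.0)
            continue
        first = ordered.index(abs(v)) + 1
        count = ordered.count(abs(v))
        midrank = n0 + first + (count - 1) / 2   # assumed tie handling
        ranks.append(midrank if v > 0 else -midrank)
    return ranks


def null_expectation(signed_ranks):
    """Expected sum of positive ranks when every sign is a fair coin flip."""
    return sum(abs(r) for r in signed_ranks) / 2


# Differences of the leucocyte example (two zeros, so n0 = 2).
diffs = [0.8, 3, 2.3, 4.3, 4.8, 4.5, 0, 2.8, -2, 0]
ranks = pratt_signed_ranks(diffs)
print(ranks)
print(sum(r for r in ranks if r > 0), null_expectation(ranks))
```
]]></Pgraph><Pgraph>For these data the sketch yields a statistic of 48 and a null expectation of 26.</Pgraph></TextBlock>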
    <TextBlock linked="yes" name="Test based on original data">
      <MainHeadline>Test based on original data</MainHeadline><Pgraph>Wilcoxon&#8217;s signed rank test and the modification according to Pratt are both based on ranks. However, the null hypothesis that the median of the differences is zero can also be tested with a test based on the original data. Instead of assigning ranks and summing up the ranks of positive differences, for this test the statistic is computed by summing up the positive differences. Thus, this test uses the information about the sign of the difference and also the magnitude of the difference. Analogous to Wilcoxon&#8217;s signed rank test and Pratt&#8217;s modification, one can compute the exact null distribution of this statistic using all 2<Mark2><Superscript>n</Superscript></Mark2> permutations as mentioned above. The exact <TextGroup><PlainText>p-value</PlainText></TextGroup>, either one-sided or two-sided, can be read from this null distribution. Note that several alternative forms of the statistic are also possible <TextLink reference="3"></TextLink>.</Pgraph><Pgraph>This permutation test is implemented in StatXact (Cytel Software Corporation, Cambridge, Mass.), but is not available within a SAS procedure. Note that StatXact also offers the Wilcoxon signed rank test.</Pgraph></TextBlock>
    <TextBlock linked="yes" name="Comparison and discussion of the tests">
      <MainHeadline>Comparison and discussion of the tests</MainHeadline><Pgraph>So far we only considered the null hypothesis <Mark2>H</Mark2><Subscript>0</Subscript> : <Mark2>&#952;</Mark2> &#61; 0. Note that all tests are likewise applicable if the null hyp<TextGroup><PlainText>ot</PlainText></TextGroup>hesis reads <Mark2>H</Mark2><Subscript>0</Subscript> : <Mark2>&#952;</Mark2> &#61; <Mark2>&#952;</Mark2><Subscript>0</Subscript>. In this case one only needs to subtract <Mark2>&#952;</Mark2><Subscript>0</Subscript> from the differences and then apply one of the presented tests.</Pgraph><Pgraph>As already mentioned, Wilcoxon&#8217;s signed rank test drops zeros. However, ignoring zeros may lead to contradictory test results <TextLink reference="7"></TextLink>. Namely, one sample may not be sign<TextGroup><PlainText>if</PlainText></TextGroup>icantly positive while a more negative sample (all observations were reduced by the same amount) is sign<TextGroup><PlainText>if</PlainText></TextGroup>icantly positive. For illustration the following sample is considered </Pgraph><Pgraph><Indentation>0, 2, 3, 4, 6, 7, 8, 9, 11, 14, 15, 17, &#8211;18,</Indentation></Pgraph><Pgraph>as presented by Pratt <TextLink reference="7"></TextLink>. The null hypothesis <Mark2>H</Mark2><Subscript>0</Subscript> : <Mark2>&#952;</Mark2> &#61; 0 is tested against the one-sided alternative <Mark2>H</Mark2><Subscript>1</Subscript> : <Mark2>&#952;</Mark2> &#62; 0. To apply Wilcoxon&#8217;s signed rank test the zero is discarded and then ranks are assigned. This leads to the signed ranks 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, &#8211;12. The observed sum of positive ranks is 66, and the p-value of Wilcoxon&#8217;s signed rank test is 70&#47;2<Superscript>12</Superscript> &#61; 0.0171. </Pgraph><Pgraph>Now consider the decreased sample &#8211;0.5, 1.5, 2.5, 3.5, 5.5, 6.5, 7.5, 8.5, 10.5, 13.5, 14.5, 16.5, &#8211;17.5 with signed ranks &#8211;1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, &#8211;13. 
For this sample the sum of positive ranks is 77 and the p-value is 109&#47;2<Superscript>13</Superscript> &#61; 0.0133. Thus the first sample has the larger p-value, indicating that it is less significantly positive. In this example we decreased the sample by an amount of 0.5. One gets the same result for any amount less than 1.</Pgraph><Pgraph>Note that such a contradictory result is possible in the other direction, too. There are examples where one sample is significantly positive while an increased sample is not significantly positive. For the modification according to Pratt, which accounts for zeros, this problem does not occur. In the example mentioned above the p-value of Pratt&#8217;s test is 0.0120 for the first sample. Since the decreased sample contains no zeros, its p-value (0.0133) is the same for Pratt&#8217;s test. For more details we refer to Pratt <TextLink reference="7"></TextLink>. All <TextGroup><PlainText>p-values</PlainText></TextGroup> in this paragraph were calculated with the new SAS-Macro presented in this paper.</Pgraph><Pgraph>To compute the exact null distributions of the three statistics mentioned above we used a shift-algorithm presented by Munzel &#38; Brunner <TextLink reference="1"></TextLink> (originally developed by Streitberg &#38; R&#246;hmel <TextLink reference="9"></TextLink>). The main part of their paper was the presentation of a new nonparametric test for paired ordered categorical data. For such data none of the three tests (Wilcoxon&#8217;s signed rank test, the modification according to Pratt and the test based on original data) should be used since for them differences need to be computed. For more details we refer to Munzel &#38; Brunner <TextLink reference="1"></TextLink>. </Pgraph><Pgraph>For applications the question arises when to use which test. For ordered categorical data the test presented by Munzel &#38; Brunner <TextLink reference="1"></TextLink> can be recommended since the other three tests discussed here cannot be applied. 
However, the sign test <TextLink reference="3"></TextLink> is an alternative which can be applied for ordered categorical data. </Pgraph><Pgraph>For the other tests the null hypothesis states that the population median of the difference is zero. For this test problem ignoring zero differences is not appropriate <TextLink reference="10"></TextLink>. Therefore, we suggest Pratt&#8217;s modification over Wilcoxon&#8217;s signed rank test whenever zero differences are possible. A different question is whether a rank test such as Pratt&#8217;s modification or a test based on original data should be applied. The latter may be preferable when the underlying data are approximately normal. However, for non-normal distributions rank tests are relatively powerful. Therefore, rank permutation tests are still useful although more complicated permutation tests can be carried out with modern PCs <TextLink reference="11"></TextLink>.</Pgraph></TextBlock>
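    <TextBlock linked="no" name="Sketch: checking Pratt&#39;s example">
      <MainHeadline>Sketch: checking Pratt&#39;s example</MainHeadline><Pgraph>The one-sided p-values of the example taken from Pratt can be checked by plain enumeration. The Python sketch below is an illustration and not the SAS&#47;IML macro; the function names and the <Mark2>keep_zeros</Mark2> switch are hypothetical. It relies on the fact that the two samples contain no tied absolute values, so no average ranks are needed.</Pgraph><Pgraph><![CDATA[
```python
from fractions import Fraction
from itertools import product


def signed_ranks(d, keep_zeros):
    """Signed ranks of d; keep_zeros=True gives Pratt's ranking.

    Only valid for samples without tied absolute values.
    """
    n0 = sum(1 for v in d if v == 0) if keep_zeros else 0
    ordered = sorted(abs(v) for v in d if v != 0)
    return [(n0 + ordered.index(abs(v)) + 1) * (1 if v > 0 else -1)
            for v in d if v != 0]


def exact_p_greater(ranks):
    """Proportion of sign assignments with a rank sum >= the observed one."""
    observed = sum(r for r in ranks if r > 0)
    hits = sum(
        1
        for signs in product((0, 1), repeat=len(ranks))
        if sum(abs(r) for r, s in zip(ranks, signs) if s) >= observed
    )
    return Fraction(hits, 2 ** len(ranks))


first = [0, 2, 3, 4, 6, 7, 8, 9, 11, 14, 15, 17, -18]
decreased = [v - 0.5 for v in first]

p_wilcoxon_first = exact_p_greater(signed_ranks(first, keep_zeros=False))
p_wilcoxon_decreased = exact_p_greater(signed_ranks(decreased, keep_zeros=False))
p_pratt_first = exact_p_greater(signed_ranks(first, keep_zeros=True))

for p in (p_wilcoxon_first, p_wilcoxon_decreased, p_pratt_first):
    print(p, round(float(p), 4))
```
]]></Pgraph><Pgraph>The enumeration counts 70 of 2<Superscript>12</Superscript> assignments for Wilcoxon&#8217;s test on the first sample, 109 of 2<Superscript>13</Superscript> for the decreased sample and 49 of 2<Superscript>12</Superscript> for Pratt&#8217;s test on the first sample, which round to the values 0.0171, 0.0133 and 0.0120 reported in the text.</Pgraph></TextBlock>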
    <TextBlock linked="yes" name="Example">
      <MainHeadline>Example</MainHeadline><Pgraph>Consider again the example presented above in Table 1 <ImgLink imgNo="1" imgType="table"/>. To evaluate the data using Wilcoxon&#8217;s signed rank test all zeros are dropped and then ranks need to be assigned. Therefore the absolute values of the differences (last row of Table 1 <ImgLink imgNo="1" imgType="table"/>) need to be ordered and the smallest value gets rank 1, the second smallest value gets rank 2, etc. The ranks including the signs are displayed in Table 2 <ImgLink imgNo="2" imgType="table"/>. The test statistic of Wilcoxon&#8217;s signed rank test is the sum of the positive ranks, which is 34. </Pgraph><Pgraph>To compute the p-values of the exact tests all 2<Superscript>8</Superscript> &#61; 256 possible permutations need to be considered and for each permutation the statistic needs to be computed. We obtain the p-value for the one-sided alternative &#8220;median greater than zero&#8221; as the probability of those statistics which are at least 34 (observed statistic). The observed permutation contains only one negative value, which belongs to the second smallest rank. Consequently, there are overall three permutations with a statistic of at least 34, namely, the observed permutation and the two permutations with only positive ranks except the smallest one, which is either positive or negative. Therefore the one-sided p-value is given by 3&#47;256 &#61; 0.0117.</Pgraph><Pgraph>When considering the two-sided alternative, permutations with a statistic of at most 2 are as supportive of the alternative as permutations with a statistic of at least 34, because 2 and 34 have the same distance to 18, which is the expected value of the rank sum under the null hypothesis. Multiplying the permutations with a statistic of at least 34 by &#8211;1 yields the permutations with a statistic of at most 2. 
Therefore the two-sided p-value is given by 6&#47;256 &#61; 0.0234.</Pgraph><Pgraph>Wilcoxon&#8217;s signed rank test simply ignores the zeros. In contrast, Pratt&#8217;s modification accounts for them. All zeros are assigned the rank zero and for all other differences ranks are assigned as with Wilcoxon, except that the smallest value of the non-zero differences gets the rank <Mark2>n</Mark2><Subscript>0</Subscript> &#43; 1, the next one <Mark2>n</Mark2><Subscript>0</Subscript> &#43; 2, etc., where <Mark2>n</Mark2><Subscript>0</Subscript> is the number of zeros. The ranks according to Pratt&#8217;s modification are displayed in Table 2 <ImgLink imgNo="2" imgType="table"/>. As for the signed rank test, the statistic for Pratt is the sum of the positive ranks, which is 48 in this example. The one-sided p-value (alternative: &#8220;median greater than zero&#8221;) is given by the proba<TextGroup><PlainText>bilit</PlainText></TextGroup>y of the permutations with a statistic of at least 48: 3&#47;256 &#61; 0.0117. </Pgraph><Pgraph>The expected value of the test statistic is 26. Therefore the two-sided p-value is 6&#47;256 &#61; 0.0234 (permutations with statistics that are at least 48 or at most 4). Note that for Pratt&#8217;s modification the number of permutations is 2<Superscript>8</Superscript> as well, because one cannot assign a sign to the zeros.</Pgraph><Pgraph>For the test based on original data the observed differences are used (no ranks need to be assigned). The test statistic, which is the sum of the positive differences, is 22.5. The permutation distribution leads to the one-sided p-value 3&#47;256 &#61; 0.0117 and the two-sided p-value <TextGroup><PlainText>6&#47;256 &#61; 0.0234</PlainText></TextGroup>. Note that, under the null hypothesis, the expected value of the statistic is 12.25.</Pgraph><Pgraph>This example yields the same p-values for all three tests. Note that there are examples where the p-values differ.</Pgraph></TextBlock>
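    <TextBlock linked="no" name="Sketch: two-sided p-values for the example">
      <MainHeadline>Sketch: two-sided p-values for the example</MainHeadline><Pgraph>The two-sided rule used in the example, counting all assignments whose statistic is at least as far from the null expectation as the observed one, can be sketched as follows in Python. The function name is hypothetical and the code is not part of the macro. The signed values are written out by hand from the differences of the example; the original differences are multiplied by 10 so that all comparisons are made on integers.</Pgraph><Pgraph><![CDATA[
```python
from fractions import Fraction
from itertools import product


def exact_p_two_sided(values):
    """Two-sided exact p-value based on the distance to the null expectation.

    With total = sum of |values| the null expectation is total / 2, so
    |T - total/2| >= |t_obs - total/2| is checked in the integer form
    |2T - total| >= |2 t_obs - total|.
    """
    abs_values = [abs(v) for v in values if v != 0]
    total = sum(abs_values)
    observed = sum(v for v in values if v > 0)
    hits = 0
    for signs in product((0, 1), repeat=len(abs_values)):
        stat = sum(a for a, s in zip(abs_values, signs) if s)
        if abs(2 * stat - total) >= abs(2 * observed - total):
            hits += 1
    return Fraction(hits, 2 ** len(abs_values))


# Signed values of the 8 non-zero differences, in the order
# 0.8, 3, 2.3, 4.3, 4.8, 4.5, 2.8, -2 (worked out by hand from the data).
wilcoxon = [1, 5, 3, 6, 8, 7, 4, -2]
pratt = [3, 7, 5, 8, 10, 9, 6, -4]
original_x10 = [8, 30, 23, 43, 48, 45, 28, -20]   # differences times 10

for name, v in [("Wilcoxon", wilcoxon), ("Pratt", pratt),
                ("original data", original_x10)]:
    print(name, exact_p_two_sided(v))
```
]]></Pgraph><Pgraph>Each of the three calls returns 6&#47;256, which Python prints in lowest terms as 3&#47;128, in agreement with the two-sided p-value 0.0234 given above.</Pgraph></TextBlock>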
    <TextBlock linked="yes" name="The SAS-Macro for exact paired tests">
      <MainHeadline>The SAS-Macro for exact paired tests</MainHeadline><Pgraph>We will now present a macro written in SAS&#47;IML using a shift-algorithm <TextLink reference="1"></TextLink> to perform the three presented tests. Note that the test introduced by Munzel &#38; Brunner <TextLink reference="1"></TextLink> can also be carried out with the algorithm presented in <TextLink reference="1"></TextLink>. Besides the computation of ranks, one important part of the program is the generation of the permutation null distribution. For all discussed tests the procedure to compute this distribution is identical. An observed data vector <Mark2>d</Mark2> is given which contains the signed ranks of the differences (for Wilcoxon&#8217;s signed rank test and the modification according to Pratt) or the original observed differences (for the test based on original data). The statistic is the sum of the positive values in this vector irrespective of the test and can be written as <Mark2>s</Mark2>&#39; <Mark2>d</Mark2> where <Mark2>s</Mark2>&#39; is the transpose of the vector <ImgLink imgNo="4" imgType="inlineFigure"/>.</Pgraph><Pgraph>To get the permutation null distribution the probabilities <Mark2>g(t)</Mark2>&#47;2<Mark2><Superscript>n</Superscript></Mark2> for all possible values <Mark2>t</Mark2> of the statistic need to be determined, where <Mark2>g(t)</Mark2> is the number of permutations which lead to a specific statistic <Mark2>t</Mark2> (i.e. <Mark2>g(t)</Mark2> is the number of elements in <ImgLink imgNo="5" imgType="inlineFigure"/>).</Pgraph></TextBlock>
    <TextBlock linked="yes" name="Shift-Algorithm">
      <MainHeadline>Shift-Algorithm</MainHeadline><Pgraph>The shift-algorithm presented by Munzel &#38; Brunner <TextLink reference="1"></TextLink> is an easy way to compute these probabilities or rather the numbers <Mark2>g(t)</Mark2>, but only for a vector <Mark2>d</Mark2> of integer values. Since in our case the vector <Mark2>d</Mark2> can contain decimal numbers (e.g. average ranks), we multiply the data vector <Mark2>d</Mark2> by 10<Mark2><Superscript>k</Superscript></Mark2> for a sufficiently high value <Mark2>k</Mark2>, then we apply the shift-algorithm and finally divide by 10<Mark2><Superscript>k</Superscript></Mark2>. </Pgraph><Pgraph>In the following the shift-algorithm is explained by means of the vector <Mark2>d</Mark2> &#61; (<Mark2>d</Mark2><Subscript>1</Subscript>, <Mark2>d</Mark2><Subscript>2</Subscript>, <Mark2>d</Mark2><Subscript>3</Subscript>) &#61; (2,&#8211;4,4) (i.e. <Mark2>n</Mark2> &#61; 3). First Munzel &#38; Brunner divide the numbers in <Mark2>d</Mark2> by their largest common factor so that the computational effort is reduced. This leads to <Mark2>d</Mark2><Subscript>mod</Subscript> &#61; (1,&#8211;2,2). Table 3 <ImgLink imgNo="3" imgType="table"/> illustrates the shift-algorithm for the modified vector <Mark2>d</Mark2><Subscript>mod</Subscript>. The first column contains all integer values between the lowest possible value of the statistic <Mark2>t</Mark2>, which is 0, and the largest one, which is 1 &#43; 2 &#43; 2 &#61; 5.</Pgraph><Pgraph>Note that only one permutation, namely &#8220;all elements in <Mark2>d</Mark2> are negative&#8221;, yields a statistic <Mark2>t</Mark2> &#61; 0. Therefore the algorithm starts with the vector (1,0,0,0,0,0) which finally will contain the numbers <Mark2>g(t)</Mark2>: i.e. 
at the beginning only the permutation &#8220;all elements are negative&#8221; is considered, which leads to the statistic <Mark2>t</Mark2> &#61; 0.</Pgraph><SubHeadline>Step 1</SubHeadline><Pgraph>This starting vector is now shifted by the absolute value of <Mark2>d</Mark2><Subscript>mod,1</Subscript> &#61; 1. The result (0,1,0,0,0,0) (column 3 of Table 3 <ImgLink imgNo="3" imgType="table"/>) corresponds to the situation that <Mark2>d</Mark2><Subscript>mod,1</Subscript> is positive and all other observations are negative, i.e. <Mark2>t</Mark2> &#61; 1. The outcome of the first step (column 4), which is the sum of the starting vector and the shifted vector, contains the numbers <Mark2>g(t)</Mark2> if only the permutations where <Mark2>d</Mark2><Subscript>mod,2</Subscript> and <Mark2>d</Mark2><Subscript>mod,3</Subscript> are negative are considered, i.e. there is one possibility to get the statistic <Mark2>t</Mark2> &#61; 0, if <Mark2>d</Mark2><Subscript>mod,1</Subscript> is negative, and one to get <Mark2>t</Mark2> &#61; 1, if <Mark2>d</Mark2><Subscript>mod,1</Subscript> is positive.</Pgraph><SubHeadline>Step 2</SubHeadline><Pgraph>The result of step 1 is now shifted by the absolute value &#124;<Mark2>d</Mark2><Subscript>mod,2</Subscript>&#124; &#61; 2. This shifted vector (column 5) corresponds to the case that <Mark2>d</Mark2><Subscript>mod,1</Subscript> has an arbitrary sign, <Mark2>d</Mark2><Subscript>mod,2</Subscript> is positive and <Mark2>d</Mark2><Subscript>mod,3</Subscript> is negative. Therefore the sum of columns 4 and 5 (column 6) contains the numbers <Mark2>g(t)</Mark2> if only the permutations where <Mark2>d</Mark2><Subscript>mod,3</Subscript> is negative are considered.</Pgraph><SubHeadline>Step 3</SubHeadline><Pgraph>Finally the resulting vector of step 2 is shifted by <TextGroup><PlainText>&#124;</PlainText><Mark2>d</Mark2><Subscript>mod,3</Subscript><PlainText>&#124; &#61; 2</PlainText></TextGroup> (column 7). 
The shifted vector contains the numbers <Mark2>g(t)</Mark2> if only the permutations where <Mark2>d</Mark2><Subscript>mod,3</Subscript> is positive are considered. If this vector is added to the result of step 2, it leads to the numbers <Mark2>g(t)</Mark2> to be determined: i.e. (1,1,2,2,1,1) are the numbers of all <TextGroup><PlainText>2</PlainText><Mark2><Superscript>n</Superscript></Mark2><PlainText> &#61; 8</PlainText></TextGroup> permutations which lead to the statistics (0,1,2,3,4,5).</Pgraph><Pgraph>We get the corresponding probabilities by dividing the numbers <Mark2>g(t)</Mark2> by 2<Mark2><Superscript>n</Superscript></Mark2> (column 9).</Pgraph><Pgraph>Table 3 <ImgLink imgNo="3" imgType="table"/> shows the permutation distribution of <Mark2>d</Mark2><Subscript>mod</Subscript> as a result of the shift-algorithm. The aim was to determine the permutation distribution of the original vector <Mark2>d</Mark2>. This can be easily done by multiplying the first column of Table 3 <ImgLink imgNo="3" imgType="table"/> by the largest common factor of the values in <Mark2>d</Mark2>. The resulting distribution of <Mark2>d</Mark2> is displayed in Table 4 <ImgLink imgNo="4" imgType="table"/>.</Pgraph></TextBlock>
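    <TextBlock linked="no" name="Sketch: the shift-algorithm in a few lines">
      <MainHeadline>Sketch: the shift-algorithm in a few lines</MainHeadline><Pgraph>The three steps described above generalise to one shift-and-add operation per non-zero element of <Mark2>d</Mark2>. The following Python sketch mirrors that description, including the scaling by 10<Mark2><Superscript>k</Superscript></Mark2> and the division by the largest common factor. It is an illustration under these assumptions and not a transcription of the SAS&#47;IML macro; the function name <Mark2>shift_distribution</Mark2> and the parameter <Mark2>decimals</Mark2> (playing the role of <Mark2>k</Mark2>) are hypothetical, and at least one non-zero value is assumed.</Pgraph><Pgraph><![CDATA[
```python
from fractions import Fraction
from functools import reduce
from math import gcd


def shift_distribution(d, decimals=0):
    """Permutation null distribution of the sum of the positive values.

    d        : signed values; zeros are skipped (no sign can be assigned)
    decimals : values are multiplied by 10**decimals and rounded, so
               that the algorithm only sees integers
    Returns a dict {value of the statistic: probability}.
    """
    scale = 10 ** decimals
    ints = [abs(round(v * scale)) for v in d]
    ints = [a for a in ints if a != 0]
    factor = reduce(gcd, ints)                # largest common factor
    reduced = [a // factor for a in ints]

    counts = [1] + [0] * sum(reduced)         # g(t) for t = 0, ..., max
    for a in reduced:
        # Shift by a positions and add: the element is either negative
        # (unshifted counts) or positive (shifted counts).
        shifted = [0] * a + counts[:len(counts) - a]
        counts = [c + s for c, s in zip(counts, shifted)]

    total = 2 ** len(reduced)
    return {
        Fraction(t * factor, scale): Fraction(g, total)
        for t, g in enumerate(counts)
        if g > 0
    }


# The vector used in the text: counts (1,1,2,2,1,1) on the values 0, 2, ..., 10.
for t, p in shift_distribution([2, -4, 4]).items():
    print(t, p)
```
]]></Pgraph><Pgraph>The work grows with the number of elements times the largest possible value of the reduced statistic instead of with 2<Mark2><Superscript>n</Superscript></Mark2>, which is why dividing by the largest common factor pays off. A p-value is obtained by summing the probabilities of the relevant values of the statistic.</Pgraph></TextBlock>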
    <TextBlock linked="yes" name="Application of the macro">
      <MainHeadline>Application of the macro</MainHeadline><Pgraph>A short version of the macro, which is explained below, is printed in the appendix (Attachment 1 <AttachmentLink attachmentNo="1"/>) and can be downloaded from the journal&#8217;s homepage. To use the macro properly, the following remarks on the parameters are necessary.</Pgraph><Pgraph>The macro is called <Mark2>signedrank</Mark2>; three parameters must be specified, and a fourth is optional:<LineBreak></LineBreak><LineBreak></LineBreak></Pgraph><Pgraph><Mark1>&#37;MACRO</Mark1><Mark2> signedrank(data, label&#95;diff, test, round&#61;4);</Mark2><LineBreak></LineBreak><LineBreak></LineBreak></Pgraph><Pgraph>The first parameter <Mark2>data</Mark2> specifies the dataset. The second parameter <Mark2>label&#95;diff</Mark2> is the name of the variable in the dataset <Mark2>data</Mark2> that is to be analysed. The third parameter <Mark2>test</Mark2> specifies the test to be performed. You can choose <Mark2>&#39;signed&#39;</Mark2> for Wilcoxon&#8217;s signed rank test, <Mark2>&#39;pratt&#39;</Mark2> for the modification according to Pratt and <Mark2>&#39;original&#39;</Mark2> for the test based on the original data.</Pgraph><Pgraph>The parameter <Mark2>round</Mark2> is optional and is 4 by default. It specifies the number of decimal places retained after rounding. Rounding is only performed if the test based on the original data is computed.</Pgraph><Pgraph>Note that the long version of the macro (<TextGroup><PlainText>Attachment 2 </PlainText></TextGroup><AttachmentLink attachmentNo="2"/>), which can also be downloaded from the journal&#8217;s homepage, has a further parameter <TextGroup><Mark2>alternative</Mark2></TextGroup> that specifies the alternative hypothesis. Choosing <Mark2>alternative &#61; &#39;two&#39;</Mark2> leads to the output of a two-sided p-value. 
Choosing <TextGroup><Mark2>alternative</Mark2></TextGroup><Mark2> &#61; &#39;greater&#39;</Mark2> or <Mark2>&#39;less&#39;</Mark2> leads to the output of a one-sided p-value. Finally, choosing <Mark2>alternative &#61; &#39;all&#39;</Mark2> leads to the output of all three p-values.</Pgraph><Pgraph>Let us again consider the example of Buck <TextLink reference="2"></TextLink>. To use the macro, the 10 differences are saved in a data set.<LineBreak></LineBreak><LineBreak></LineBreak></Pgraph><Pgraph><Mark1>DATA</Mark1><Mark2> example;</Mark2><LineBreak></LineBreak><Mark2>INPUT diff &#64;&#64;;</Mark2><LineBreak></LineBreak><Mark2>CARDS;</Mark2><LineBreak></LineBreak><Mark2>0.8 3 2.3 4.3 4.8 4.5 0 2.8 -2 0</Mark2><LineBreak></LineBreak><Mark2>;</Mark2><LineBreak></LineBreak><Mark1>RUN</Mark1><Mark2>;</Mark2><LineBreak></LineBreak><LineBreak></LineBreak></Pgraph><Pgraph>Then the macro is invoked.<LineBreak></LineBreak><LineBreak></LineBreak></Pgraph><Pgraph><Mark1>&#37;signedrank</Mark1><Mark2>(example, diff, &#39;pratt&#39;);</Mark2><LineBreak></LineBreak><LineBreak></LineBreak></Pgraph><Pgraph>The output is presented here for Pratt&#8217;s test. The total sample size, the number of non-zeros, the observed value of the test statistic, and the p-value(s) are given (<TextGroup><PlainText>Table 5 </PlainText></TextGroup><ImgLink imgNo="5" imgType="table"/>). As you can see, we get the same two-sided p-value (0.0234) as above.</Pgraph></TextBlock>
    <TextBlock linked="yes" name="Notes">
      <MainHeadline>Notes</MainHeadline><SubHeadline>Acknowledgement</SubHeadline><Pgraph>The authors gratefully acknowledge support of this work by the Ministry for Education, Science, Youth, and Culture of Rhineland-Palatinate for a Competence Center in Biomathematics.</Pgraph><SubHeadline>Competing interests</SubHeadline><Pgraph>The authors declare that they have no competing interests.</Pgraph></TextBlock>
    <References linked="yes">
      <Reference refNo="2">
        <RefAuthor>Buck W</RefAuthor>
        <RefTitle>Der Vorzeichen-Rang-Test nach Pratt</RefTitle>
        <RefYear>1975</RefYear>
        <RefJournal>Meth Inf Med</RefJournal>
        <RefPage>224-30</RefPage>
        <RefTotal>Buck W. Der Vorzeichen-Rang-Test nach Pratt. Meth Inf Med. 1975;14(4):224-30.</RefTotal>
      </Reference>
      <Reference refNo="3">
        <RefAuthor>Higgins JJ</RefAuthor>
        <RefTitle></RefTitle>
        <RefYear>2004</RefYear>
        <RefBookTitle>An introduction to modern nonparametric statistics</RefBookTitle>
        <RefPage></RefPage>
        <RefTotal>Higgins JJ. An introduction to modern nonparametric statistics. Pacific Grove: Brooks&#47;Cole; 2004.</RefTotal>
      </Reference>
      <Reference refNo="4">
        <RefAuthor>Randles RH</RefAuthor>
        <RefAuthor>Wolfe DA</RefAuthor>
        <RefTitle></RefTitle>
        <RefYear>1979</RefYear>
        <RefBookTitle>Introduction to the theory of nonparametric statistics</RefBookTitle>
        <RefPage></RefPage>
        <RefTotal>Randles RH, Wolfe DA. Introduction to the theory of nonparametric statistics. New York: Wiley; 1979.</RefTotal>
      </Reference>
      <Reference refNo="5">
        <RefAuthor>Hollander M</RefAuthor>
        <RefAuthor>Wolfe DA</RefAuthor>
        <RefTitle></RefTitle>
        <RefYear>1999</RefYear>
        <RefBookTitle>Nonparametric statistical methods</RefBookTitle>
        <RefPage></RefPage>
        <RefTotal>Hollander M, Wolfe DA. Nonparametric statistical methods. 2nd ed. New York: Wiley; 1999.</RefTotal>
      </Reference>
      <Reference refNo="6">
        <RefAuthor>Wilcoxon F</RefAuthor>
        <RefTitle></RefTitle>
        <RefYear>1949</RefYear>
        <RefBookTitle>Some rapid approximate statistical procedures</RefBookTitle>
        <RefPage></RefPage>
        <RefTotal>Wilcoxon F. Some rapid approximate statistical procedures. New York: American Cyanamid Co; 1949.</RefTotal>
      </Reference>
      <Reference refNo="7">
        <RefAuthor>Pratt JW</RefAuthor>
        <RefTitle>Remarks on zeros and ties in the Wilcoxon signed rank procedures</RefTitle>
        <RefYear>1959</RefYear>
        <RefJournal>J Am Stat Assoc</RefJournal>
        <RefPage>655-67</RefPage>
        <RefTotal>Pratt JW. Remarks on zeros and ties in the Wilcoxon signed rank procedures. J Am Stat Assoc. 1959;54:655-67.</RefTotal>
      </Reference>
      <Reference refNo="8">
        <RefAuthor>Buck W</RefAuthor>
        <RefTitle>Signed-rank tests in the presence of ties (with extended tables)</RefTitle>
        <RefYear>1979</RefYear>
        <RefJournal>Biom J</RefJournal>
        <RefPage>501-26</RefPage>
        <RefTotal>Buck W. Signed-rank tests in the presence of ties (with extended tables). Biom J. 1979;21(6):501-26. DOI: 10.1002&#47;bimj.4710210602</RefTotal>
        <RefLink>http:&#47;&#47;dx.doi.org&#47;10.1002&#47;bimj.4710210602</RefLink>
      </Reference>
      <Reference refNo="1">
        <RefAuthor>Munzel U</RefAuthor>
        <RefAuthor>Brunner E</RefAuthor>
        <RefTitle>An exact paired rank test</RefTitle>
        <RefYear>2002</RefYear>
        <RefJournal>Biom J</RefJournal>
        <RefPage>584-93</RefPage>
        <RefTotal>Munzel U, Brunner E. An exact paired rank test. Biom J. 2002;44(5):584-93. DOI: 10.1002&#47;1521-4036(200207)44:5&#60;584::AID-BIMJ584&#62;3.0.CO;2-9</RefTotal>
        <RefLink>http:&#47;&#47;dx.doi.org&#47;10.1002&#47;1521-4036(200207)44:5&#60;584::AID-BIMJ584&#62;3.0.CO;2-9</RefLink>
      </Reference>
      <Reference refNo="9">
        <RefAuthor>Streitberg B</RefAuthor>
        <RefAuthor>R&#246;hmel J</RefAuthor>
        <RefTitle>Exakte Verteilungen f&#252;r Rang- und Randomisierungstests im allgemeinen c-Stichprobenfall</RefTitle>
        <RefYear>1987</RefYear>
        <RefJournal>EDV Biol Med</RefJournal>
        <RefPage>12-19</RefPage>
        <RefTotal>Streitberg B, R&#246;hmel J. Exakte Verteilungen f&#252;r Rang- und Randomisierungstests im allgemeinen c-Stichprobenfall. EDV Biol Med. 1987;18:12-19.</RefTotal>
      </Reference>
      <Reference refNo="10">
        <RefAuthor>Larocque D</RefAuthor>
        <RefAuthor>Randles RH</RefAuthor>
        <RefTitle>Confidence intervals for a discrete population median</RefTitle>
        <RefYear>2008</RefYear>
        <RefJournal>Am Stat</RefJournal>
        <RefPage>32-9</RefPage>
        <RefTotal>Larocque D, Randles RH. Confidence intervals for a discrete population median. Am Stat. 2008;62(1):32-9. DOI: 10.1198&#47;000313008X269738</RefTotal>
        <RefLink>http:&#47;&#47;dx.doi.org&#47;10.1198&#47;000313008X269738</RefLink>
      </Reference>
      <Reference refNo="11">
        <RefAuthor>Neuh&#228;user M</RefAuthor>
        <RefTitle>Efficiency comparisons of rank and permutation tests</RefTitle>
        <RefYear>2005</RefYear>
        <RefJournal>Stat Med</RefJournal>
        <RefPage>1777-8</RefPage>
        <RefTotal>Neuh&#228;user M. Efficiency comparisons of rank and permutation tests. Stat Med. 2005;24(11):1777-8. DOI: 10.1002&#47;sim.1939</RefTotal>
        <RefLink>http:&#47;&#47;dx.doi.org&#47;10.1002&#47;sim.1939</RefLink>
      </Reference>
    </References>
    <Media>
      <Tables>
        <Table format="png">
          <MediaNo>1</MediaNo>
          <MediaID>1</MediaID>
          <Caption><Pgraph><Mark1>Table 1: Example: number of leucocytes&#47;h</Mark1></Pgraph></Caption>
        </Table>
        <Table format="png">
          <MediaNo>2</MediaNo>
          <MediaID>2</MediaID>
          <Caption><Pgraph><Mark1>Table 2: Example: ranks for Wilcoxon&#8217;s signed rank test and the modification according to Pratt</Mark1></Pgraph></Caption>
        </Table>
        <Table format="png">
          <MediaNo>3</MediaNo>
          <MediaID>3</MediaID>
          <Caption><Pgraph><Mark1>Table 3: Shift-algorithm for (1,&#8211;2,2)</Mark1></Pgraph></Caption>
        </Table>
        <Table format="png">
          <MediaNo>4</MediaNo>
          <MediaID>4</MediaID>
          <Caption><Pgraph><Mark1>Table 4: Permutation distribution for (2,&#8211;4,4)</Mark1></Pgraph></Caption>
        </Table>
        <Table format="png">
          <MediaNo>5</MediaNo>
          <MediaID>5</MediaID>
          <Caption><Pgraph><Mark1>Table 5: Output of the macro for Pratt&#8217;s test</Mark1></Pgraph></Caption>
        </Table>
        <NoOfTables>5</NoOfTables>
      </Tables>
      <Figures>
        <NoOfPictures>0</NoOfPictures>
      </Figures>
      <InlineFigures>
        <Figure format="png" height="36" width="106">
          <MediaNo>1</MediaNo>
          <MediaID>1</MediaID>
          <AltText language="en">Equation 1</AltText>
          <AltText language="de">Formel 1</AltText>
        </Figure>
        <Figure format="png" height="38" width="161">
          <MediaNo>2</MediaNo>
          <MediaID>2</MediaID>
          <AltText>Equation 2</AltText>
        </Figure>
        <Figure format="png" height="38" width="176">
          <MediaNo>3</MediaNo>
          <MediaID>3</MediaID>
          <AltText>Equation 3</AltText>
        </Figure>
        <Figure format="png" height="24" width="114">
          <MediaNo>5</MediaNo>
          <MediaID>5</MediaID>
          <AltText>Equation 5</AltText>
        </Figure>
        <Figure format="png" height="21" width="55">
          <MediaNo>4</MediaNo>
          <MediaID>4</MediaID>
          <AltText>Equation 4</AltText>
        </Figure>
        <NoOfPictures>5</NoOfPictures>
      </InlineFigures>
      <Attachments>
        <Attachment>
          <MediaNo>1</MediaNo>
          <MediaID filename="mibe000104.a1.pdf" mimeType="application/pdf" origFilename="MIBE-Macro-Short-Version.pdf" size="311725" url="">1</MediaID>
          <AttachmentTitle>Appendix: Short version of the macro</AttachmentTitle>
        </Attachment>
        <Attachment>
          <MediaNo>2</MediaNo>
          <MediaID filename="mibe000104.a2.pdf" mimeType="application/pdf" origFilename="MIBE-Macro-Long-Version&#95;korr.pdf" size="286218" url="">2</MediaID>
          <AttachmentTitle>Long version of the macro</AttachmentTitle>
        </Attachment>
        <NoOfAttachments>2</NoOfAttachments>
      </Attachments>
    </Media>
  </OrigData>
</GmsArticle>