THE GOSPEL PRIORITY PROBLEM REEXAMINED:

INVESTIGATING THE VERBAL AGREEMENT BETWEEN PARALLEL TEXTS,
UTILIZING THE FREQUENCY DISTRIBUTION OF STRINGS OF IDENTICAL CONSECUTIVE WORDS

A Paper by James W. Deardorff
Dec. 1997; Revised Aug. 1999, June 2004, Nov. 2004

Abstract:

The strong role that theological commitment played in bringing about the present consensus on Gospel priorities (Mark-Q hypothesis) is reviewed in Part I. A key point against the present consensus, noted by various scholars, is the apparent interdependence of Greek wording between Matthew and Luke's "Q" verses, there being excessive numbers of lengthy strings of consecutive identical duplicated words within parallel passages, up to 27 words long.

A technique, believed new, is described in Part II for assessing when such duplicated strings are due to chance arrangements of words during ordinary editing, or when due to a deliberate desire to make the fact of duplication be evident. In Part III the latter is found to be the case for the Matthew-Luke "Q" verses as well as for the verbal agreement of Matthew-Mark parallels. The technique is first explored against several actual test cases as well as against a synthetic test case.

A modified Augustinian hypothesis, which expands upon studies by Zahn and Vosté, can explain the observations, while being compatible with the external evidence and other internal evidence. It is summarized in Part IV and here, where it is tested against critics' arguments against Matthean priority. It requires that theological commitment be abandoned and the Gospel writers be treated as possessing human motivations over the question of whether Jews or gentiles were more worthy to become disciples.

I. A Summary of the Gospel Priority Problem as Influenced by 19th-Century Theological Commitment
II. Use of Frequency Distributions of Duplicate Strings of Words in Parallel Passages: Theory and Tests of Verbal Agreement
III. Results from Frequency Distributions of Duplicate Word Strings in the Gospel Parallels
IV. A Solution Embracing Realistic Editorial Behavior

For the past three quarters of a century the synoptic gospel priority problem has been considered settled by a scholarly consensus that favors the two-document hypothesis, which views Mark as having been written first, and Matthew and Luke written independently of each other.1 However, numerous arguments disputing this consensus have been raised by scholars of the past century who decry the validity of this outcome, and many of their arguments still stand intact.2 Much of the evidence has been set out by a resurgence of the Griesbach school of thought, and as regards Matthew having come before Mark, it has either not been capably refuted, or has been refuted using only weak or reversible argumentation.3

The same can be said for evidence indicating that Luke came after Matthew and is dependent upon Matthew, not just upon Mark.4 Evidence for this dependence is bolstered by scholars favoring the Augustinian hypothesis (Matthew-Mark-Luke), who have set forth many plausible, non-refuted arguments.5 Traditional Augustinian scholars tended to utilize the now largely outdated assumption that the Gospels were written by the men whose names are attached to them, or that Peter was the source for Mark, or both. However, it must be emphasized that these do not necessarily negate other portions of their argumentation. These particular assumptions will not be utilized here, and the "writer of a Gospel" will be referred to in that manner rather than by the name of a Gospel.

One tool available to the text critic has been to compare the Greek text of certain parallel Gospel passages and note how long a string of successive identical words may occur. The longest extends to 33 words. Such verbal harmony has helped lead to the agreed upon belief that Matthew and Mark are not independent, and so also perhaps Matthew and Luke.6 However, this tool has not heretofore been quantified in terms of frequency distributions that can be compared against a statistically expected distribution for independently edited works. This will be done here, in Part II. The results from Part III will suggest that the two-document hypothesis, and any refutations applicable to a modernized Augustinian hypothesis, need reexamination from within a framework devoid of theological commitment. Hence, a brief review of the Gospel priority problem with this result in mind is presented first.

I. A Summary of the Gospel Priority Problem as Influenced by 19th-Century Theological Commitment

From the first few centuries until the 18th, it was common knowledge that Matthew had come first, followed by Mark and then Luke, due to this having been expressed by Irenaeus, Origen and Augustine.7 In addition, the terse gleanings from Papias support this Augustinian viewpoint in two ways:

1) that Matthew had been first written in the Semitic tongue, as would be expected of the earliest gospel;
2) that parts of Mark were written in improper order, which implies improper order relative to a preexisting, written gospel.8

This earlier gospel, then, if not Luke, must have been Matthew.

In 1783 Johann Jakob Griesbach challenged this tradition, proposing on the basis of a particularized textual criticism that the order had been Matthew-Luke-Mark, with each evangelist having made use of the previous one's works. By this assumption he was able to explain in somewhat acceptable terms the peculiarity that Luke follows Mark's order and content where Mark deviates from Matthew's order, and frequently deviates from both where Mark follows Matthew. He could explain that the writer of Mark "almost never diverged from Matthew in order and seldom in content unless he was following the order and content of Luke."9

The Tübingen school that grew in general support of this hypothesis helped spark both opposition and creativity from less radical scholars. In 1820 both the Augustinian and Griesbach hypotheses were challenged by Johann Gottfried Herder, who, as explained in a near contemporary report by H. U. Meijboom, believed that "among writings of such close affinity as the gospels, the briefest must be considered the earliest document."10 As noted by W. R. Farmer, this idea built upon the still earlier idea that the first gospel had been an oral gospel, which would be expected to give rise to a shorter written gospel rather than to a longer one.11

In 1838 Christian Hermann Weisse applied a more scholarly approach to the same basic idea. He reconstructed an Ur-gospel based upon passages found in all three synoptics, though this guaranteed that it would most closely conform in extent and content to the shortest of the three, namely Mark.12 However, many others have since pointed out the weakness of this argument, since Mark could as easily be an altered abbreviation of Matthew, as was considered to be the case by Augustine. Yet, Weisse took a large step in framing the modern two-document hypothesis by postulating that in addition to the priority of Mark, or of an Ur-Marcus, the Logia referred to by Papias was a second, important source of sayings. With the latter assumption, he was following F.E.D. Schleiermacher.13

In the 1840-1860 period Eduard Reuss's studies largely convinced fellow French scholars of the priority of Mark; he also believed that "when two gospels are mutually dependent the earliest date must be attributed to the shorter one." He seems to have been among the first to attach a theological motivation to this belief, in wondering aloud why the writer of Mark, if he were not the first evangelist, would have omitted many of the discourses he found in Matthew.14 However, as noted by E. P. Sanders, theological commitment ought not play any role in seeking historical truth: "I have been engaged for some time in the effort to free history and exegesis from the control of theology; that is, from being obligated to come to certain conclusions which are predetermined by theological commitment."15 Such commitment on the part of a scholar prevents him from seriously considering more plausible alternative solutions, because his theology causes him to consider as implausible what would otherwise be considered plausible.

In the above instance, an alternative we shall come back to is that the writer of Mark was located in Rome.16 He was then writing his gospel for gentiles, and therefore omitted Judaisms he felt his future readers would not be interested in.17 Another is that this writer did not share Matthew's emphasis upon humility and pacifism, and did not wish to include admonitions and concepts that he did not agree with or understand. Although these may seem like obvious alternatives to consider, theological commitment has dominated in deterring their open discussion, culminating with Canon Streeter's remark in 1924 that the writer of Mark would have had to be a "lunatic," if he had copied from Matthew, for leaving out the Sermon on the Mount and most of the parables.18 Such a pejorative, stemming from a most influential scholar, seems to have caused still later scholars who may have wished to publish discussions of the alternatives to avoid doing so, lest they be branded lunatics.

It should be mentioned that the well known New Testament scholar David Friedrich Strauss, whose influential books came out in 1835-1840, is believed to be indirectly responsible for aiding in the promotion of the Marcan-priority hypothesis over the Griesbach hypothesis.19 He had favored the Griesbach hypothesis, which places Matthew first, but considered much within the gospels to be myths, in particular, the post-crucifixion appearances of the risen Jesus. The backlash that this induced from theologically committed scholars was very strong, thereby helping to promote the opposing hypothesis of Marcan priority.

Around 1850 the influential Heinrich Ewald similarly supported Mark as being primary, and furthermore contended that Luke and Matthew were written independently.20 This latter belief would seem to have arisen out of the theological commitment that the Gospel writers were "divine men," as described by Eusebius,21 and so surely should have had no need to copy from one another's work. Yet the assumption of independence between Luke and Matthew has persisted within the two-document hypothesis to this day. Ewald did not attempt to hide his theological commitment, and felt that Biblical criticism, though like a storm, was necessary in clearing away the haze and darkness caused by the misinterpretations of other, perfidious and satanic, scholars.22 The latter he identified with the Tübingen school of biblical scholarship.

Weisse's form of the two-document hypothesis was endorsed by Heinrich Holtzmann, who rescued it from the likely defeat it would otherwise have suffered from the Tübingen school. Although Holtzmann assumed that an Ur-gospel underlay all four Gospels, he also had an Ur-Marcus in mind here, again in essence because Mark is the shortest of the Gospels.23 Later, Ur-Marcus came to be identified with Mark itself.

For the second document of the two-document hypothesis, Holtzmann drew upon the Logia of Papias, but assumed it contained material common to Luke and Matthew. This gave birth to the modern form of the two-document hypothesis. (Later, this common material was called "Q" and divorced from the testimony of Papias.) Advocates of the two-document hypothesis then learned to compromise a little with theological commitment, to the extent that Matthew and Luke were allowed to be dependent upon Mark, after Ur-Marcus became identified with canonical Mark. This was less radical than the Tübingen school's departure from theological commitment in that the Griesbachians treated Mark as being dependent upon both Luke and Matthew, and Luke also being dependent upon Matthew.

At this point another reason why the Augustinian school of thought was gradually abandoned needs to be mentioned. By the 19th century, if not before, it had been noticed that when parallel passages of Matthew and Mark dealing with the disciples are carefully compared, the Twelve consistently come out looking relatively dumb, fearful and disrespectful to their Lord in Mark as compared to Matthew. Even the Jewish people receive similar treatment. These are called "Mark's harder readings." This comparison is amply documented by Pierson Parker.24 If Matthew were considered primary to Mark, this would mean that the writer of Mark had intentionally made slight alterations in meaning to the Matthean text he copied/translated in order to cast the Jewish disciples and people in a bad light. Such a thought must have been intolerable to theologically committed scholars, and to others who wished to remain in good standing with their Christian colleagues and editors. It is still intolerable to some today, as evidenced by what one scholar said at a Society of Biblical Literature annual meeting, as reported by Daniel B. Wallace.24.1 This scholar confessed, "I cannot hold to Matthean priority because of Mark's decidedly harder readings." The readings are "harder" only because it is hard for theologically committed scholars to believe that the writer of a gospel could have been pro-gentile and anti-Jewish.

The problem could most easily be removed by assuming Mark held priority over Matthew; then it could be argued that the writer of Matthew had improved upon Mark's rough language, and had added reverential touches. I believe the problem was of such grave theological concern that scholars rarely dared to mention it in writing; hence we cannot be certain just when this repressed consideration became paramount.25 Due to its powerful emotional influence, however, it very likely contributed strongly to the growth in favor of the two-document hypothesis, and once that hypothesis dominated, the problem did not exist for its advocates. The most they then ever needed to say is: "Who can deny that Matthew enhances the disciples' and Jesus' images, both by upgrading their status and also by removing the unflattering warts which Mark for whatever reason retained?"26 Though the problem still existed for scholars who supported the Augustinian and Griesbach hypotheses, theological commitment or standards of "good taste" would prevent their opponents from mentioning it, and so no defense was needed.

If such commitment were set aside, however, students of the synoptic problem could then discuss reasons why the writer of Mark would have made such alterations to the Matthean text he incorporated into his gospel. Chief among these is the fact that Matthew denigrates gentiles in at least eight places, which vituperation the writer of Mark quite naturally either refused to include in his gospel or greatly alleviated.27 The problem is enhanced by the real possibility that the key counteracting passages of Mt 12:17-22 and 28:18-20, the latter of which contains a Trinitarian-like formula, were later additions to Matthew.28

If theological commitment and political correctness are both set aside, one could also question why the writer of Matthew would have been anti-gentile in his outlook. With the Jewish background commonly believed of this writer, probably having once been a Pharisee or scribe and possibly even a rabbi, the answer is straight-forward. In being a strict follower of the Torah, he was to treat his own people with kindness (Lv 19:18), while treating gentiles as dire enemies (Dt 7:1-8; Ex 23:22-24) or as slaves (Lv 25:43-46). The God of Israel would be looking out for the welfare of his own people, who dwell in Zion, and not of outsiders (Is 10:24-27). Thus it is not surprising that the writer of Matthew held these anti-gentile beliefs. Jn 4:9 indicates that they persisted at least until the time of writing of that gospel. That this writer could seem anti-Jewish, too (e.g., Mt 27:25), could easily reflect his great disappointment that, nearly a century later, the Jewish people had not come to recognize Jesus as Messiah.

Thus the suspicion is strong, from a fresh Augustinian viewpoint, that the writer of Mark was avenging Matthew's denigration of gentiles by in turn denigrating the Jewish disciples.29 Needless to say, this possibility is studiously avoided in our own era, as it attaches an anti-Semitic attitude to the writer of Mark, and invites the same to any scholar who suggests it. At best, one can only say that by this hypothesis the writer of Mark cast the Jewish disciples in a bad light so as to make gentiles look more capable by comparison and thus promote their discipleship, of which the writer of Matthew evidently did not approve. It would only have been human nature for the writer of Mark to have struck back at the writer of Matthew in some such manner.

Neither is the Griesbach school of thought free from its own theological commitment. It is because the editorial behavior of the writer of Luke seemed so inexcusable, if Luke came third and depended upon Matthew, that Griesbach postulated Mark to have come third; then a consequent editorial strategy underlying Mark could seemingly make sense and restore the editorial reputation of the writer of Luke. As later expressed by Streeter, the writer of Luke would seem to have been a "crank" if he had taken Matthean material not in Mark out of order and placed it into inappropriate or out-of-context places within his own gospel.30 This viewpoint again must stem from a theological conviction that a gospel writer would never exhibit unsavory though realistic psychological behavior, and it tends to close off all debate of alternative explanations. Subsequent scholars did not wish to be labeled as "a person who supports a crank" by authorities within their own professional field.

The key alternative here is that the writer of Luke, like the writer of Mark, must have been appalled at Matthew's denigration of gentiles and statements to the effect that discipleship was reserved for the children of Israel. He would then have much preferred Mark over Matthew, though he apparently felt obliged to write a more universal gospel that required inclusion of much from Matthew that the writer of Mark had omitted (these inclusions would later become known as Q). The writer of Luke could then subtly express his feelings against Matthew by following Mark's order and content where Mark deviates from Matthew's order, and inserting his own special material and Matthean inclusions elsewhere, the latter in improper order and context.31 As a result, there would be relatively few agreements in order between Matthew and Luke, as observed. Although this editorial behavior, for someone with an "axe to grind," may seem quite plausible in hindsight, theological commitment or concerns of "good taste" have until 1992 prevented its discussion in the available literature on New Testament studies.32

Thus the basis for the Griesbach school stems from an attempt to avoid a very serious breach of theological commitment, causing this school to accept what they must have considered to be less serious breaches of same in departing from the Augustinian tradition. Not surprisingly, the school of thought that suffers the least disruption of theological commitment is the one that came to dominate, namely that of the two-document (or two-source) hypothesis. Its scholars could avoid the most serious breaches of faith already discussed by placing Mark first and maintaining that Luke and Matthew were written independently.33 This latter assumption is primarily what will be called into question by the remainder of this paper.

This brief review need not be amplified here or continued further towards the present time, as the key assumptions leading to the present two-document hypothesis were laid down over a century ago. Although modern scholars are less beholden to theological commitment than their counterparts back then, the danger that exists now is the strong desire for the present consensus to signify successful scholastic achievement and progress within the profession. This desire tends to preclude any serious reexamination of the subject, its tenuous assumptions and the alternatives, lest such be recognized as an admission that the consensus could well have been wrong in its essential conclusions over the past century.

II. Use of Frequency Distributions of Duplicate Strings of Words in Parallel Passages: Theory and Tests of Verbal Agreement

There are only a finite number of ways a particular thought or sentence is usually expressed, within a particular language. Thus if two different translators or editors are working independently from the same extensive text, either translating or editing it or both, there will be numerous occasions whereupon the same two or three or more successive words are by chance or of necessity utilized by both parties, within parallel portions of the two texts. Evidently, long strings of duplicated words will occur much less frequently than short strings.

Suppose that at a certain point the two translators or editors of the same basic text, working independently, have by chance or necessity used the same three words in a row in expressing a thought within that particular section of the text. The probability that they would each choose the same fourth word is some fraction less than unity. The probability that their choices for the fifth word would also coincide is then diminished again by some fraction less than unity, and so on as the passage is completed and the next one within the parallel texts commences. Upon continuing this process through the entire text with all its parallel passages, one finds that there is some average value for this fraction of diminishing odds, which we shall call f. The frequency distribution that expresses this reasoning is simply the geometric distribution, or the exponential distribution or curve, as will be explained.

Thus if I is the number of successive duplicated words in a string, and Y(I) is the number of strings of that particular length occurring in the one editor/translator's work, upon comparing it against parallel passages of the other's, we might postulate:

Y(I) = A*exp(-b*I) (1)

where the asterisk denotes multiplication, "exp" denotes e (=2.7183) raised in this instance to the -b*I power, b is the exponential decay coefficient and A is a coefficient of proportionality.

That is, for example, with

Y(4)/Y(3) = Y(5)/Y(4) = ... = Y(I+1)/Y(I) = f

we have, upon using Eq. (1) and dividing Y(I+1)/Y(I),

f = exp[-b*(I+1)]/exp(-b*I) = exp(-b),

which relates f to b and explains why an exponential distribution might be expected to prevail.
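The relation between f and b can be checked numerically. The following is only a minimal sketch; the function names and the values A = 100 and b = 0.5 are illustrative assumptions and do not come from any of the data analyzed here.

```python
import math


def ratio_from_decay(b: float) -> float:
    """Per-word ratio f = exp(-b) implied by the decay coefficient b."""
    return math.exp(-b)


def decay_from_ratio(f: float) -> float:
    """Decay coefficient b = -ln(f), defined for 0 < f < 1."""
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    return -math.log(f)


def expected_count(A: float, b: float, I: int) -> float:
    """Eq. (1): Y(I) = A*exp(-b*I)."""
    return A * math.exp(-b * I)


if __name__ == "__main__":
    A, b = 100.0, 0.5  # illustrative values only, not fitted to any text
    f = ratio_from_decay(b)
    for I in range(3, 7):
        # Each successive ratio Y(I+1)/Y(I) should equal f, whatever I is.
        ratio = expected_count(A, b, I + 1) / expected_count(A, b, I)
        print(I, round(ratio, 6), round(f, 6))
```

Because the ratio does not depend on I, a single parameter (f, or equivalently b) characterizes the whole distribution once A is fixed.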

Even if a second editor/translator is not working independently of an earlier translation, an exponentially shaped distribution is still expected to result, for ordinary editing in any advanced language, though significant deviations may occur for I as small as 2 or possibly 3 when excessive numbers of particular short phrases may exist. However, Greek is an ideal language upon which to utilize the method, since word order in Greek is not particularly important and many different choices of verb tense, voice, mood, etc., as well as word order, lie at the disposal of the editor/translator. In the case of editing rather than translation, the particular value of b within the exponential formula will depend primarily upon the particular style of the editor; if he feels the text is in poor shape and needs much editing, b will be relatively large. In the case of translation by two independent translators, b will be relatively large if the translators tend to use rather different vocabularies, or if one emphasizes meaning while the other emphasizes literalness. The feature to be emphasized here, therefore, is how well the distribution fits an exponential-like curve, especially as I becomes increasingly large, rather than the precise value of b.

The coefficient A is very roughly proportional to the number of duplicated words, S, within the word strings, which is S = Summation[I*Y(I)]. It is of lesser importance, except that the larger S is the better, so as to minimize sampling error (scatter in the derived data). The ratio S/N, where N is the total number of words in the text analyzed, will also be examined. However, N is dependent upon the manner in which the parallel passages to be analyzed are selected: whether simply by pericopes, or by excluding irrelevant verses within pericopes, or by excluding irrelevant halves of verses. Hence no particular importance will be placed upon A or S/N here.

Duplicate "strings" of only one word each, or sometimes two words each, are excluded from the present practice of the method because of their non-representativeness, and S will include the summation from I = 2 or 3 on up.
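As a small illustration of these definitions, S and S/N might be computed from a tally of string lengths as follows. This is only a sketch: the function names are hypothetical, and the counts and word total in the example are invented toy numbers, not data from any text analyzed here.

```python
from typing import Mapping


def duplicated_word_total(counts: Mapping[int, int], min_length: int = 2) -> int:
    """S = sum of I*Y(I) over string lengths I >= min_length."""
    return sum(I * Y for I, Y in counts.items() if I >= min_length)


def duplication_ratio(counts: Mapping[int, int], total_words: int,
                      min_length: int = 2) -> float:
    """S/N, where N (total_words) is the word count of the text analyzed."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return duplicated_word_total(counts, min_length) / total_words


if __name__ == "__main__":
    # Invented toy tally: {string length I: number of strings Y(I)}.
    toy_counts = {1: 50, 2: 10, 3: 4, 4: 1}
    print(duplicated_word_total(toy_counts, min_length=2))
    print(duplicated_word_total(toy_counts, min_length=3))
    print(duplication_ratio(toy_counts, total_words=200, min_length=2))
```

The `min_length` argument corresponds to the choice of starting the summation at I = 2 or at I = 3.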

For normal editing/translating, the frequency distribution of strings of verbal agreement is expected to follow a curve of exponential type for all three of the following cases, which involve two texts, (a) and (b) that can be compared:

1) Original (a) ---> Edited version (b)

2) Original     ---> 1st translator (a)

      "         ---> 2nd translator (b)

3) Same as 2) except that 2nd translator utilizes (a) as well as the original

This expectation follows from the fact that the editor/translator would be making changes of many different kinds at very many different places within his manuscript in his attempt to produce a more accurate version or translation, and/or to utilize better word choices and/or better grammar, and/or to produce a final work that is more understandable or one that better follows a certain philosophy or theology, or one that is written for a different intended audience. The number of reasons for editing or achieving a differently worded translation is then sufficiently large that randomness can dominate the overall process. Since in case 3) the 2nd translator is also an editor, making some use of the 1st translator's work, we shall simply refer in all three cases to the later writer as editor/translator.

There is one vitally important exception, however. If this editor/translator purposely refrained in places from making editorial alterations over lengthy strings of text, purposely copying his source text word for word in a significant number of these places, the resulting frequency distribution would deviate from the exponential, as this would produce an excessive number of duplicated longer strings of words above and beyond the exponential distribution exhibited by the shorter strings. This case of purposeful copying of longer strings without editing them will turn out to be of special interest here, in comparing parallel Greek text of the synoptic gospels where copying of one sort or another is already suspected by most New Testament scholars to have taken place. However, the possibility of non-purposeful copying of sections of text through the editor/translator's inadvertent relaxation of a critical attitude at times during the course of his editing of a lengthy text will also be examined.

Testing the Procedure on Greek Text Translated from Hebrew

I am unaware that the method has been utilized previously, and so some tests are in order. To determine if the distribution actually lies close to the exponential for ordinarily edited/translated works, and to ascertain a rough value for b, the method is here first employed upon the Greek text of 2 Chr 35-36, all of Ezra and Neh 7:73-8:12, which overlap with the apocryphal text of 1 Esdras (with the exception of 1 Esdr 2:16-5:7, which has no parallels).34 These two texts are believed to relate back to different translations from Hebrew text, and whether or not the latter translator/editor borrowed words or phrases from the earlier one's Greek translation does not matter, if this was done in any ordinary manner.

Words are not considered duplicated here unless they are spelled exactly the same, and word strings common to parallel passages do not qualify unless the words are consecutive in both parallel strings and occur in exactly the same order. Punctuation is neglected (since it was not present in the original ancient texts), and also capitalization, if it was used only to signify the start of a sentence. 1 Esdr 5:29-34 and 8:13 were excluded from analysis, as they consist of long repetitive phrases like "the sons of X, the sons of Y...," all strung together, which have no counterpart within the Gospel parallels to be analyzed, and which might bias the result to some small degree. The results for this test case are shown in Table 1:

Table 1. Number of occurrences, Y, of duplicate strings of I words in the Septuagint's text of 2 Chr 35-36, Ezra and Neh 7:73-8:12, relative to the parallel verses in 1 Esdras.

I	Y	Exp. Curve
2	232	147.1
3	114	90.1
4	44	55.2
5	36	33.8
6	24	20.7
7	12	12.7
8	6	7.8
9	1	4.8
10	1	2.9
11	1	1.8
12	1	1.10
13	1	0.67
14	0	0.41
15	0	0.25
16	0	0.15
17	0	0.094
etc.

The column labeled "exp curve" is given by

Y(I) = 392 exp(-0.49*I) (2)

which means that for each word added to a string's length, the probability of occurrence of that string is reduced by the factor f = 0.613, on the average. Thus, these two extensive sets of parallel passages appear to support the method, with b being in the vicinity of 0.5 (0.49 here). For I = 2 the observed count lies well above the fitted exponential curve, owing to so many nouns, both proper and common, being preceded by the article. Hence the I = 2 datum is essentially ignored here.
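The fitted-curve column of Table 1 can be regenerated directly from Eq. (2). The sketch below does only that; the coefficients 392 and 0.49 are those of Eq. (2), while the function and variable names are merely illustrative.

```python
import math

A_FIT = 392.0  # coefficient A from Eq. (2)
B_FIT = 0.49   # decay coefficient b from Eq. (2)


def fitted_count(I: int, A: float = A_FIT, b: float = B_FIT) -> float:
    """Expected number of duplicate strings of length I under Eq. (2)."""
    return A * math.exp(-b * I)


if __name__ == "__main__":
    print("f =", round(math.exp(-B_FIT), 3))
    for I in range(2, 18):
        # Three significant figures are enough to compare against Table 1.
        print(I, f"{fitted_count(I):.3g}")
```

Since the values for I of 14 and above are all below one, they are better read as expected frequencies over many comparable texts than as counts to be found in any single text.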

The observed value for Y(3) lies somewhat above the exponential curve; this might be due to 3-word phrases such as the preceding 2-word examples followed by "kai," as well as 3-word prepositional phrases, tending to survive editorial rearrangements. On the other hand, this did not occur in two cases out of six to be examined, and thus may not be statistically significant. And since the data for small I are the more numerous and contain less sampling error, relative to the mean, than for large I, I have retained the I = 3 data in this study. Hence I have fit the exponential curve within the region I > 2 up to but not including values of I for which Y becomes as small as 2 or less. (It is an accident of sampling error here that five consecutive values of Y = 1 occurred in this latter region, without being interspersed with any zeroes or twos.)

The above exponential was derived as a least-squares linear fit to the data after transforming the Y values to their natural logarithms. Although the values in Table 1 derived from the "exp curve" are not accurate in any absolute sense to the number of digits indicated, the decimal digits are retained so as to avoid any confusion between the fitted-curve values and the Y integers of the raw data.
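A log-linear least-squares fit of this general kind can be sketched as follows. It is an unweighted fit, which is an assumption on my part; the function name is hypothetical, and the demonstration uses synthetic counts generated from known coefficients rather than the data of Table 1. The coefficients obtained from real data depend on which values of I are admitted to the fit, so this sketch should not be expected to reproduce Eq. (2) exactly.

```python
import math
from typing import Sequence, Tuple


def fit_exponential(lengths: Sequence[int],
                    counts: Sequence[float]) -> Tuple[float, float]:
    """Fit Y = A*exp(-b*I) by ordinary least squares on ln(Y) versus I.

    Returns (A, b). All counts must be positive, since ln(Y) is taken.
    """
    if len(lengths) != len(counts) or len(lengths) < 2:
        raise ValueError("need at least two (I, Y) pairs of equal length")
    xs = [float(x) for x in lengths]
    ys = [math.log(y) for y in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic check: counts generated from known A and b are recovered.
    true_A, true_b = 100.0, 0.5
    lengths = list(range(3, 9))
    counts = [true_A * math.exp(-true_b * I) for I in lengths]
    A, b = fit_exponential(lengths, counts)
    print(round(A, 6), round(b, 6))
```

The caller chooses the window of I values, which corresponds to the decision described above to fit only the region I > 2 and to stop before Y falls to 2 or less.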

Since the two texts involved here were written at points well separated in time, we may consider the behavior of only the later translator/editor, as in case 3): if he utilized the earlier translation while undertaking his own, did he translate and edit in an ordinary manner that did not involve purposely copying lengthy strings of text? The results indicate that his translation or editing was indeed ordinary in the sense that there are no lengthy strings of text for I > 13, for which the expected number of strings rapidly diminishes to less than one, i.e., to 0.4 and less from the exponential curve. That is, if there were ten other nearly identical test cases available for analysis, an expectation from the exponential curve of 0.4 for a duplicate word length of 14 words means that in about 4 out of 10 of the cases one such string of words would occur, but none in the other 6 cases.

One value of this kind of analysis is that it indicates that one or two strings of verbal agreement as long as 12 or 13 words is not unexpected in this case, since the expected number of occurrences there has fallen only to 1.10 and 0.67, respectively. However, if two or three duplicate strings of 17 or more words had occurred, this would have been more indicative of copying along with purposeful refraining from editing, as the estimated frequency of occurrence in those instances is only 0.094 or less.

For reference purposes, the value of S here was found to be 1494, while that for N (from parallels within the 1 Esdras text) was approximately 6660, giving a ratio of S/N = 0.22.

Examining the Procedure's Sampling Error

The region of moderate to large I is of greatest interest here, but that is precisely where sampling error becomes relatively greatest. Hence it is of interest to explore the sampling error in a synthetic case with the same frequency distribution. Suppose that you have a mixed bag full of 577 black balls and 423 red balls, 1000 in all. You start drawing them out one at a time, replacing each in the bag and remixing it before drawing the next, and keeping account of how many black balls in a row you remove until drawing a red one. Overall, you expect the odds of drawing a black ball to be f = 0.577, and if you had just drawn two black balls in a row, your odds of drawing a third are still 0.577, etc. Upon repeating the procedure over and over, it must lead to a geometric or exponential probability distribution for the number of strings, Y, of consecutively drawn black balls as a function of the length of string, I. And as may be calculated from the relation between f and b, the associated exponential decay factor should be b = 0.55. (This value, rather than 0.49 for b, was chosen here from a preliminary analysis in which the 1 Esdras data had been fit by eye rather than by the method of least squares, and had then yielded the somewhat greater value of b. However, the difference is quite minor and the results remain of interest.)
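The thought experiment above can be sketched in a few lines of code. This is a minimal illustration, not the program actually used for this study: the function name run_length_counts, the fixed number of draws and the random seed are all assumptions made here. It uses the fact that a geometric distribution f^I equals exp(-b*I) with b = -ln(f), and that each red ball closes a run whose length is I with probability (1-f)*f^I.

```python
import math
import random
from collections import Counter

def run_length_counts(n_draws, f, seed=0):
    """Draw n_draws balls with replacement, each black with probability f.

    Returns a Counter mapping run length I to the number of runs of I
    consecutive black balls that were ended by a red ball (I = 0 means a
    red ball drawn with no black ball since the previous red one).
    A run still open when the drawing stops is discarded.
    """
    rng = random.Random(seed)
    counts = Counter()
    run = 0
    for _ in range(n_draws):
        if rng.random() < f:      # black ball: extend the current run
            run += 1
        else:                     # red ball: close the run and record it
            counts[run] += 1
            run = 0
    return counts

f = 577 / 1000                    # fraction of black balls in the bag
b = -math.log(f)                  # decay factor implied by f, about 0.55
n_draws = 3113                    # illustrative choice (the count quoted for Table 2)
expected_A = n_draws * (1 - f) ** 2   # expected number of zero-length runs

counts = run_length_counts(n_draws, f)
for length in range(8):
    expected = expected_A * math.exp(-b * length)
    print(length, counts[length], round(expected, 1))
```

With these numbers the expected amplitude comes to about 557, close to the coefficient of the averaged curve reported for the synthetic case.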

The computer equivalent to this thought experiment was carried out, until the number of extracted black balls that occurred in strings of two or longer first exceeded S = 1495, approximately as in the 1 Esdras test case above. The results are given in Table 2:

Table 2. Number of occurrences, Y, of strings of length I of consecutively drawn black balls from a mixed bag of 577 black balls and 423 red ones, after 3113 drawings (after S=1497), from a typical realization. Also shown: the associated exponential curve, and estimated frequencies of occurrence of extrema.

I	Y	Exp. Curve	Top 1%	Top 5%	Bottom 5%	Bottom 1%
0	590	559
1	311	322.5
2	174	186.1	219	210	164	152
3	122	107.4	135	123	90	79
4	64	61.9	84	76	49	44
5	33	35.7	54	45	27	22
6	17	20.6	36	28	14	10
7	14	11.9	23	17	6.7	3.6
8	3	6.9	16	11	3.1	1.0
9	3	3.96	11	7.1	1.0	0.07
10	6	2.28	8.1	5.2	0.12	0
11	0	1.32	6.4	3.7	0	0
12	3	0.76	4.8	2.9	0	0
13	0	0.44	3.9	2.1	0	0
14	0	0.25	3.1	1.6	0	0
15	1	0.15	2.5	1.3	0	0
16	0	0.084	2.0	1.1	0	0
17	0	0.049	1.7	0.9	0	0
18	0	0.028
19	0	0.016
20	0	0.009
21	etc.	0.005

The exponential curve in this case could be fit using the data for I = 0, 1 and 2, as well as for larger I, and by utilizing the average from very large numbers of computer generated test cases.35 It is given by

Y(I) = 559 exp(-0.55*I). (3)

Table 2 shows the same general appearance or degree of scattering of values as Table 1, especially if both are presented in graphical form. In particular, it shows that a long string can easily occur (I = 15) through chance where the expected exponential frequency has dropped considerably below unity (to 0.15 in this case). The computer experiment was run another 99 times in order to obtain the firm estimate of the mean and estimates of the magnitude of the more extreme Y(I) values that occur less than or equal to 5% and 1% of the time, respectively. These are also presented in Table 2. Thus a string of 15 is expected somewhat more often than 5 times in 100; a string of 20 is expected to occur about one time out of 100, and cannot be totally ruled out. However, if such occurred in addition to two or three other similarly long strings within the same real data set, this would be cause for rejection of the assumption that the balls had been drawn randomly from the bag (or in the case of word strings of verbal agreement, a rejection of the assumption that ordinary editing of text had occurred without any deliberate copying).
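The repeated-run procedure described above can be sketched as follows. The stopping rule (draw until the black balls lying in strings of two or longer exceed 1495) follows the description of the computer experiment; everything else, including the function names, the seed and the use of plain order statistics for the 5% levels, is an assumption of this sketch. The fractional entries in Table 2 suggest that some interpolation or smoothing was applied there, which is not reproduced here.

```python
import random
from collections import Counter

def one_realization(f, s_target, rng):
    """Draw until the black balls in closed runs of length >= 2 exceed s_target."""
    counts, run, s = Counter(), 0, 0
    while s <= s_target:
        if rng.random() < f:
            run += 1
        else:
            counts[run] += 1
            if run >= 2:
                s += run
            run = 0
    return counts

def extremes(f=0.577, s_target=1495, n_runs=100, max_len=17, seed=0):
    """Return {I: (mean, low 5% level, high 5% level)} of Y(I) over n_runs."""
    rng = random.Random(seed)
    runs = [one_realization(f, s_target, rng) for _ in range(n_runs)]
    table = {}
    for length in range(2, max_len + 1):
        ys = sorted(r[length] for r in runs)
        low = ys[int(0.05 * (n_runs - 1))]     # simple order-statistic estimate
        high = ys[int(0.95 * (n_runs - 1))]
        table[length] = (sum(ys) / n_runs, low, high)
    return table

table = extremes()
for length, (mean, low, high) in table.items():
    print(length, round(mean, 2), low, high)
```

The spread between the low and high columns, relative to the mean, widens steadily as I increases, which is the behavior of sampling error that the table is meant to convey.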

A More Rigorous Computer Analogy

A further set of computer runs was made to test the merit of the computer analogy, in view of the fact that in reality the odds for the editor/translator's choice of his next word, in a developing string, to be a duplicate of the first editor/translator's word in general is different from word to word, and would not always have the same value of f. For example, if at some point the first editor/translator had translated a phrase as "the kingdom of God," the odds are quite high, perhaps 0.9, that having duplicated "kingdom" the second editor/translator will also duplicate "of God"; yet the odds are not unity since he may instead choose to write "of heaven" or "His" (kingdom of Him). On the other hand, at another point, if after having by chance duplicated several words the second editor/translator needs to express some particular verbal action, he may choose a different tense, voice or mood, or different but nearly synonymous verb, than what the first writer chose; or he may even transpose or interject another phrase or thought into that position. So at that point the odds might be only 0.25, say, of his choosing the same word as in the parallel text.

Hence in these further computer runs the odds themselves of choosing each black ball from the bag were allowed to vary randomly on each draw from the bag, between values of 0.254 and 0.90, with a uniform distribution of odds in between. In the long run, then, the odds overall were the average of 0.9 and 0.254, which is the same as the previous value: 0.577.

The results were statistically the very same as before, being well fit by the same exponential distribution with the same values of A and b. (This program and its simpler predecessor were kindly programmed for me, using C++, by Frank Griswold of Corvallis, Oregon.) Thus as long as randomness is involved, the exponential or geometric distribution is to be expected from ordinary editorial alterations, barring word strings that are extremely short. It doesn't matter whether the particular odds for each decision are steady or vary from word to word depending upon their context; a single value of f, and thus of b, that applies overall is obtained when randomness dominates. Hence the exponential frequency distribution is a basic one against which to compare actual frequency distributions of word strings of verbal agreement in the case of ordinary editing and/or translation, when a sufficiently large data sample exists.
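The variable-odds experiment can be sketched in the same spirit. This is not the C++ program mentioned above; the function name, seed and sample size are assumptions. Because the odds on each draw are chosen independently of the draws before it, the chance of a black ball on any draw is simply the mean of the odds, which is why the same geometric distribution results.

```python
import random
from collections import Counter

def variable_odds_counts(n_draws, lo=0.254, hi=0.90, seed=0):
    """Like the fixed-odds experiment, but each draw gets its own probability
    of 'black', chosen uniformly between lo and hi."""
    rng = random.Random(seed)
    counts, run = Counter(), 0
    for _ in range(n_draws):
        p = rng.uniform(lo, hi)       # odds for this particular draw
        if rng.random() < p:
            run += 1
        else:
            counts[run] += 1
            run = 0
    return counts

mean_odds = (0.254 + 0.90) / 2        # 0.577, as stated in the text
counts = variable_odds_counts(500_000)
# For a geometric distribution the ratio Y(I+1)/Y(I) is constant and equal to f.
ratios = [counts[i + 1] / counts[i] for i in range(4)]
print(round(mean_odds, 3), [round(r, 3) for r in ratios])
```

The successive ratios should all lie near 0.577, apart from sampling error.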

Test of the Procedure on English Translations of Greek Text

As a further test using these same parallel passages from 1 Esdras and Ezra et al., their English translations were analyzed in exactly the same manner.36 Results are shown in Table 3:

Table 3. Number of occurrences, Y, of duplicate strings containing I words each in two separate English translations of the Septuagint's text of 2 Chr 35-36, Ezra and Neh 7:73-8:12, relative to parallel verses in 1 Esdras.

I	Y	Exp. Curve
2	335	319.9
3	202	196.0
4	122	120.1
5	63	73.6
6	47	45.1
7	34	27.6
8	13	16.9
9	14	10.4
10	3	6.4
11	6	3.9
12	2	2.38
13	1	1.46
14	4	0.89
15	2	0.55
16	0	0.33
17	1	0.20
18	0	0.12
19	0	0.076
20	0	0.046
21	1	0.029
22	0	0.018
23	0	0.011
24	etc.	0.0067

Here the value of b for the least-squares exponential fit to the Y(I) data happens to be the same: 0.49, leading to

Y(I) = 900 exp(-0.49*I). (4)

Again the exponential or geometric distribution fits well for I up through 15. For I = 17 we do see an occurrence where the expectation is for only 0.2 events on the average, but since this could happen in about 1 case out of 5, it cannot be considered anomalous. Moreover, in this instance, and also in one I=14 and one I=15 case, strings of names and/or numberings were involved, which tend to resist editorial alteration. The one string at I=21 occurs where the odds against its presence given by the exponential curve is about 34:1. This does verge on the appearance of an anomaly. However, because there is only one such event at low odds we cannot conclude that it did not occur through ordinary editing/translation.37 Statistically, isolated rare events do (and must) occasionally occur, as noted from the computer-generated black-ball test cases previously analyzed.

For reference purposes, the total number of words in the parallel text of 1 Esdras in English was approximately N = 8498, yielding S/N = 0.363. This larger value of S/N, with consequent occurrence of greater numbers of duplicate strings in the English translations as compared to the Greek, may be due to the fewer variants of expression available in English than in Greek. Moreover, in the two Greek texts compared, the translators had used vocabularies that were significantly different; however, this seems no cause for disqualification of the data, as the different synoptic evangelists, whose texts will be compared soon, also utilized somewhat distinctive expressions or vocabularies.

Tests involving Two Different English Translations:
A. From a Psalter

The one text in this case consisted of Psalms 92-101 from the RSV Bible. It derives from an English translation of the Jewish Masoretic Text of the 6th to 9th centuries, after the translation had evolved through several intermediate editions of various English Bibles, culminating in the RSV text of 1952. The other text is the Nikitas Psalter, 1981 edition, which is presently used by the Greek Orthodox Church. It stems from the (Greek) LXX text of these psalms.

Table 3.1. Number of occurrences, Y, of duplicate strings containing I words each in two different English translations of Psalms 92-101.

I	Y	Exp. Curve
2	65	68.8
3	45	46.4
4	30	31.4
5	20	21.2
6	19	14.3
7	9	9.6
8	6	6.5
9	1	4.4
10	0	2.97
11	2	2.00
12	0	1.35
13	1	0.91
14	2	0.62
15	0	0.42
16	0	0.28
17	0	0.19
18	1	0.13
19	0	0.086
20	0	0.058

The least-squares derived exponential, fit from the data of I = 3 through 8, is given by

Y(I) = 151 exp(-0.393*I). (4.1)

As can be expected in this case with only 110 verses available for analysis, the scatter becomes large for I > 8. The single replicated phrase of 18 words might be thought to be anomalous; however, if one sums the expected values from the fitted curve from I = 16 onwards, they amount to 0.86, close to unity. This is consistent with the exponential curve retaining validity far out in its tail, and with ordinary, objective editing/translating behavior occurring during the forming of the texts.
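The least-squares procedure used for these fits can be illustrated with the Psalter data. The sketch below fits a straight line to ln(Y) against I over the stated range I = 3 through 8; the function name fit_exponential is introduced here for illustration, and the code is a reconstruction of the method as described rather than the original computation.

```python
import math

def fit_exponential(lengths, counts):
    """Least-squares line through (I, ln Y); returns (A, b) for Y = A*exp(-b*I)."""
    logs = [math.log(y) for y in counts]
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Table 3.1 data for I = 3 through 8 (the fitting range named in the text).
lengths = [3, 4, 5, 6, 7, 8]
counts = [45, 30, 20, 19, 9, 6]
A, b = fit_exponential(lengths, counts)
print(f"A = {A:.0f}, b = {b:.3f}")
```

Applied to these six points, the fit should return coefficients in close agreement with those of equation (4.1).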

B. From the Book of Enoch

An exceptionally lengthy data set is available from two different translations of 1 Enoch, also known as the Ethiopian Enoch. Parts of it date back to very early Old Testament times of Enoch himself, when it had been written in Aramaic, with later transcriptions and editings of it having been made by custodians of the sacred Jewish literature. At some point it is believed to have been translated into Greek, and still later into Ethiopic; its main text was discovered in Ethiopia around 1773. By 1821 its first translation from Ethiopic into English had been done by Richard Laurence. In 1912 another translation of it from Ethiopic into English was made by R. H. Charles. The verbal agreement between these two translations is compared here.

Table 3.2. Number of occurrences, Y, of duplicate strings containing I words each, from parallel passages within two different English translations of 1 Enoch.

I	Y	Exp. Curve
3	823	828.8
4	500	516.9
5	328	322.4
6	208	201.1
7	128	125.5
8	79	78.3
9	47	48.7
10	39	30.4
11	21	19.0
12	12	11.8
13	11	7.4
14	2	4.6
15	1	2.9
16	3	1.8
17	1	1.1
18	0	0.70
19	1	0.44
20	2	0.27
21	1	0.17
22	0	0.11
23	0	0.06
24	0	0.04

When plotted graphically, the scatter does not become noticeable until I = 10. The exponential was fit using values from I = 3 through 9, and is given by:

Y(I) = 3415 exp(-0.472*I). (4.2)

The data do not extend appreciably further than what is predicted by the exponential tail. Although for I = 13 Y is about 4 more than is expected from the exponential, for I = 14 it is about 3 less. For I > 17 a total of 4 long duplicate strings occurred, while the exponential expectation is for 1.9. This is not inconsistent with what is expected from sampling error. Thus these data behave as if the second translation were essentially independent of the first one, although there may have been a tendency for the second translator to avoid duplicating words used by the first,37.1 it being very improbable that Charles was unaware of Laurence's translation. If so, this comparison demonstrates that the exponential curve nevertheless fits the data well, and that some longer strings are to be expected even if the later translator/editor tried to avoid replicating words used by the earlier translator. At the least, it shows that there is no tendency for any great excess of longer strings to occur well past the point where the exponential expectation does not predict them.
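The expectation quoted for the tail can be checked by summing the fitted exponential as a geometric series. The helper name tail_expectation is an assumption of this sketch; the coefficients are those of equation (4.2).

```python
import math

def tail_expectation(A, b, first_length):
    """Sum of A*exp(-b*I) over all integers I >= first_length (geometric series)."""
    return A * math.exp(-b * first_length) / (1.0 - math.exp(-b))

# Equation (4.2) for the 1 Enoch comparison, summed over I > 17.
enoch_tail = tail_expectation(3415, 0.472, 18)
print(round(enoch_tail, 2))
```

The same helper, applied to equation (4.1) from I = 16 onwards, gives the value of roughly 0.86 cited for the Psalter comparison.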

For this case, S/N was approximately 5990/36,000 = 0.17.

Test of the Frequency Distribution Resulting from Non-uniform Editing

A test case became available to this investigator for which the conditions of translation/editing were better known. It involved the TJ document described elsewhere in this web site. Numerous errors and imperfections had been noticed in the 1996 English version of this German text, and so two translators, to be designated here as R&P, both proficient in both German and English, undertook the task of making a thoroughly improved translation. They of course utilized the 1996 English version of the TJ as well as its 1996 German version, and so fall under the category 3) case of translating/editing discussed earlier. This case will be seen to be of particular interest to the synoptic-gospel hypothesis of Matthew having been written first in a Semitic tongue, then utilized (translated) by the writer of Mark as he formed his Greek gospel from it, then translated by the writer of Luke as he formed his Greek gospel from it and from Mark, and finally translated directly into the Greek form of Matthew very close to the form we have it today, with this latter translator making use of Mark and Luke. This hypothesis has been called the modified Augustinian hypothesis.

The first editors/translators in this case (R&P) undertook their task in the usual way, i.e., they did not refrain from revising any portions of the 1996 text out of any purposeful reason, but made improvements everywhere they felt they were needed. The R&P draft version of the TJ was then compared with the 1996 TJ for their verbal identities.38

Table 4. Number of occurrences, Y, of duplicate strings of I words in a row within parallel passages (in English) of the 1996 edition of the TJ and its interim revised version by R&P. Column "Exp. Curve": expected value for a purely exponential frequency of occurrence; column "Hypo-Exp.": expected value for a hypo-exponential frequency distribution.

I	Y	Exp. Curve	Hypo-Exp.	I	Y	Exp. Curve	Hypo-Exp.	I	Y	Exp. Curve	Hypo-Exp.
3	463	269.2	419.8	35	11	7.0	7.9	67	0	0.183	0.72
4	356	240.2	334.0	36	10	6.3	7.2	68	0	0.163	0.67
5	272	214.3	272.1	37	5	5.6	6.6	69	0	0.145	0.63
6	217	191.2	225.3	38	4	5.0	6.1	70	0	0.130	0.59
7	186	170.6	188.9	39	6	4.4	5.6	71	1	0.116	0.55
8	150	152.3	160.1	40	6	4.0	5.2	72	1	0.103	0.52
9	118	135.8	136.8	41	1	3.5	4.8	73	0	0.092	0.49
10	118	121.2	117.8	42	2	3.2	4.4	74	2	0.082	0.46
11	107	108.2	102.0	43	7	2.8	4.1	75	1	0.073	0.43
12	96	96.5	88.8	44	4	2.5	3.7	76	0	0.065	0.40
13	78	86.1	77.7	45	5	2.2	3.5	77	2	0.058	0.38
14	82	76.8	68.2	46	5	2.0	3.2	78	1	0.052	0.35
15	67	68.5	60.2	47	3	1.8	3.0	79	1	0.046	0.33
16	52	61.2	53.3	48	1	1.6	2.7	80	2	0.041	0.31
17	46	54.6	47.3	49	2	1.4	2.5	81	0	0.037	0.29
18	33	48.4	42.1	50	3	1.3	2.4	:	:	:	:
19	29	43.4	37.6	51	1	1.13	2.19	85	1	0.023	0.23
20	30	38.8	33.6	52	2	1.01	2.04	86	0	0.021	0.22
21	29	34.6	30.2	53	0	0.90	1.89	:	:	:	:
22	26	30.9	27.1	54	3	0.80	1.76	90	1	0.013	0.172
23	26	27.5	24.4	55	0	0.72	1.64	91	1	0.0118	0.162
24	24	24.6	22.1	56	0	0.64	1.52	92	0	0.0106	0.153
25	14	21.9	19.9	57	3	0.57	1.42	93	1	0.0094	0.144
26	18	19.6	18.1	58	1	0.51	1.32	94	0	0.0084	0.136
27	21	17.5	16.4	59	0	0.46	1.23	:	:	:	:
28	19	15.6	14.9	60	1	0.41	1.15	104	1	0.0027	0.0777
29	11	13.9	13.5	61	1	0.36	1.07	105	0	0.0024	0.0736
30	6	12.4	12.3	62	1	0.32	1.00	:	:	:	:
31	14	11.1	11.2	63	1	0.29	0.94	156	1	0.000007	0.0057
32	13	9.9	10.3	64	0	0.26	0.88	157	0	0.000006	0.0054
33	10	8.8	9.4	65	2	0.23	0.82	:	:	:	:
34	6	7.9	8.6	66	3	0.21	0.77		end	of data

The column labeled "Exp. Curve" is the least-squares exponential fit using the data from I = 3 through 40. It is given by

Y(I) = 379 exp(-0.114*I) (4a)

However, it is a poor fit to the data. For I = 3 through 7 it produces Y values significantly too small, while for I = 8 through 26 it produces Y values all too large (with only one exception). And for I > 60 it cannot account for more than about 3 occurrences of very long duplicate strings whereas 25 actually occur.

The editing/translating behavior that can account for this is one where the editors at times become disturbed over how poor the previous translation was, and then make very frequent editorial changes for a significant section of their text; then at later times they find other sections of the text to have been translated quite well, and so relax their degree of criticality for a while and make relatively few editorial alterations for some time thereafter. Such non-randomness in criticality causes an excess in Y(I) for small I, relative to the exponential curve, and also at large I, leaving a deficiency at intermediate I if the amount of data is great enough, and the overall degree of criticality not too great, for the effect to be noticeable. In this case these conditions appear to have been met, with the amount of data within duplicate strings greatly exceeding that of previous cases analyzed, except for the 1 Enoch comparison. This TJ case, then, appears to be one in which unintentional verbatim copying of rather lengthy strings of text at times did take place.
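The effect of such alternating criticality can be illustrated with a deliberately simple model. The sketch below assumes just two editing regimes, each with its own steady probability f of retaining the earlier wording; the regime sizes and probabilities are hypothetical numbers chosen for illustration, not estimates from the TJ data.

```python
import math

def mixture_counts(max_len, regimes):
    """Expected Y(I) when the text is a patchwork of editing regimes.

    regimes is a list of (n_words, f) pairs: n_words of text edited with a
    per-word probability f of keeping the earlier wording. Within a regime
    the run lengths are geometric, so Y(I) is about n_words*(1-f)**2 * f**I.
    """
    return [
        sum(n * (1 - f) ** 2 * f ** i for n, f in regimes)
        for i in range(max_len + 1)
    ]

# Hypothetical numbers: a heavily revised portion and a lightly revised one.
regimes = [(20000, 0.70), (20000, 0.95)]
y = mixture_counts(80, regimes)

# Local decay rate b(I) = ln(Y(I)/Y(I+1)); constant for a pure exponential.
local_b = [math.log(y[i] / y[i + 1]) for i in range(80)]
print(round(local_b[3], 3), round(local_b[30], 3), round(local_b[70], 3))
```

In this model the local decay rate falls steadily with increasing I, so a single exponential fit through the middle of the range underestimates the counts at both small and large I, which is the pattern seen in Table 4.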

A key feature to notice, however, is that, except for the inevitable scatter in the data, the Y values do decrease monotonically with increasing I. This suggests that a modified exponential function could fit the data if the I coordinate is shrunk as in the following formula:

Y(I) = A exp(-b*I^n)

with n < 1. Accordingly, a 3-point fit to the data was performed using smoothed Y values at I = 5, 30 and 62. The resulting curve, here called a hypo-exponential, is given by:

Y(I) = 1473 exp(-0.663*I^0.581) (4b)

Its values are listed (under Hypo-Exp.) in the 4th, 8th and 12th columns of Table 4. The fit is seen to be quite satisfactory. In particular, at large I beyond the points of fitting it still yields satisfactory expectations for occurrences, upon lumping together the data in blocks of 5 or 10. Even the final lengthy string of 156 words is not too surprising an occurrence in that the sum of the hypo-exponential expectations for all I > 115 (within which region only the one event occurred) is about 0.92.
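One way to carry out such a 3-point fit is sketched below. The details of the original fitting procedure are not given, so the bisection approach, the function name fit_hypo_exponential and the search limits on n are all assumptions of this sketch. The three input points are the curve values listed in Table 4 at I = 5, 30 and 62.

```python
import math

def fit_hypo_exponential(points, n_lo=1e-3, n_hi=1.0, iters=200):
    """Fit Y = A*exp(-b*I**n) exactly through three (I, Y) points.

    Taking logs and eliminating A and b leaves one equation in n:
        ln(Y1/Y2) / ln(Y2/Y3) = (I2**n - I1**n) / (I3**n - I2**n)
    whose right side decreases as n grows, so bisection applies.
    Returns (A, b, n), or None if no n in [n_lo, n_hi] satisfies it.
    """
    (i1, y1), (i2, y2), (i3, y3) = sorted(points)
    target = math.log(y1 / y2) / math.log(y2 / y3)

    def gap(n):
        return (i2**n - i1**n) / (i3**n - i2**n) - target

    if gap(n_lo) < 0 or gap(n_hi) > 0:
        return None                      # target lies outside the reachable range
    lo, hi = n_lo, n_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid                     # ratio still too large: increase n
        else:
            hi = mid
    n = 0.5 * (lo + hi)
    b = math.log(y1 / y2) / (i2**n - i1**n)
    A = y1 * math.exp(b * i1**n)
    return A, b, n

# Curve values listed in Table 4 at the three fitting points I = 5, 30 and 62.
print(fit_hypo_exponential([(5, 272.1), (30, 12.3), (62, 1.00)]))
```

Because the table values are rounded, the recovered parameters can be expected to match those of equation (4b) only approximately.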

We now have a background within which to place whatever results we find from analysis of the verbal agreement between Gospel parallels. For ordinary translating with editing, an exponential frequency distribution should be found. If inattentive copying of longer strings occurs, a hypo-exponential form for the frequency distribution may occur. If deliberate copying of an excessive number of longer strings should occur, the resulting frequency distribution could be too abnormal to satisfy any exponential form.

III. Frequency Distributions of Duplicate Word Strings in the Gospel Parallels

Results for Mark versus Matthew

We turn first to the extensive parallels between Matthew and Mark.39 Again, words were not counted as duplicates unless spelled exactly the same, only strings of two or more consecutive words were counted, and all other text possessing parallels was included in the analysis. Secondary members of doublets and pleonasms were not included unless they were present as parallels in both texts. Results are shown in Table 5:

Table 5. Number of occurrences, Y, of duplicate strings of I words in a row within parallel passages (in Greek) of Matthew and Mark. Expected frequency of occurrence in last column.

I	Y	Exp. Curve
2	321	264.2
3	195	173.6
4	122	114.1
5	54	74.9
6	52	49.2
7	33	32.4
8	28	21.3
9	13	14.0
10	9	9.2
11	2	6.0
12	5	4.0
13	4	2.6
14	2	1.71
15	2	1.12
16	3	0.74
17	1	0.49
18	2	0.32
19	1	0.21
20	2	0.14
21	0	0.090
22	0	0.059
23	2	0.039
24	0	0.026
25	0	0.017
26	0	0.011
27	0	0.0073
28	0	0.0048
29	1	0.0031
30	0	0.0020
31	0	0.0013
32	0	0.00088
33	1	0.00059
34	0	0.00038
35	0	0.00025
36	etc.	0.00017

The least-squares exponential fit again made use of the data from I=3 up through I=10, since for I=11 Y had fluctuated down to 2. (The sampling error in ln(Y) increases strongly as Y approaches values as small as 2 or 3.) It is given by

Y(I) = 612 exp(-0.42*I) (5)

with the decay coefficient, b, again being in the vicinity of 0.5, namely 0.42. However, for I > 15 we see that the frequency of occurrence consistently exceeds the expected value from the exponential curve, and by increasing ratios, as I increases further. At I = 23, for which Y=2 strings occurred duplicated in Matthew and Mark, their occurrence exceeds the expected number within the surrounding block of I = 22 to 28 by a factor of 9. The very long string of I = 29 exceeds the expected (fractional) number of occurrences for ordinary editing within the surrounding block I = 28 to 32 by a factor of 89. The exceedingly long string of I = 33 exceeds the expected (fractional) number of occurrences for all I > 32 by a factor of 470.40

Anticipating, then, that a hypo-exponential distribution might have resulted through inadvertent copying of longer strings of text, a 3-point hypo-exponential fit was made using the data for I = 4 (Y=122), 10 (Y=9) and 20 (Y=1.0). The resulting curve is given by:

Y(I) = 4.82*10¹⁰ exp(-16.42*I^0.135)

However, the small power to which I is raised (n = 0.135) and consequent huge values for b (16.42) and A (4.82*10¹⁰) indicate a poorly conditioned formula, in which the calculated values of Y are super-sensitive to the precise values of the three parameters. It furthermore yields values of Y at I < 3 that are much too large, while predicting that one or two more lengthy strings of duplicate words should occur for I > 35, which does not occur. This is due to the fact that the data for I < 13 appear to follow a normal exponential quite well, while the Y values for I > 14 abruptly become too large in comparison, requiring an ill-conditioned hypo-exponential to fit.

It appears, then, that the verbal agreement between these two gospels, for the longer strings, did not occur through the ordinary editorial process, and quite likely not through inadvertent copying either. The parameters derived for the hypo-exponential curve of the TJ test case produced a well-conditioned formula, suggesting that non-uniform criticality in editing only moderately alters the exponential shape from that expected when editorial decisions are made at random intervals during editing/translating. Instead, the much stronger distortion of the exponential shape for the frequency distribution of the Matthew-Mark parallels requires us to consider that one editor or translator, when making use of the other's work (or possibly of a common source), purposely duplicated certain lengthy strings of words; i.e., decided not to edit them to his usual taste. That one made use of the other's text is already a common conclusion, since the verbal agreement is striking even from casual inspection and has been long known. Although later scribal assimilation might be responsible for two or three of the longer strings, through inadvertent or purposeful connection of adjacent shorter strings, it seems wholly inadequate in explaining most of the excesses. Of particular interest is that for the smaller values of I (I < 13 here), the Y values do appear to be quite well represented by the exponential curve, as was found to hold for the 1 Esdras-Ezra (in Greek) test case for its larger I also. This suggests that the editor normally cannot refrain from making many minor alterations in wording in order to improve, clarify or more faithfully render his own interpretation or translation of the text, even if he purposely duplicates certain longer strings of text. Thus the purely exponential curve may hold approximately for the shorter strings of duplicate words even if an editor should purposely copy intact a significant number of long strings of words while making alterations on the text as a whole.

For the record, the total number of words in the parallel Matthean verses was N = 8980, while the sum of words duplicated within word strings of two or longer was S = 3419 = 0.38N.
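The counting that underlies Y(I), S and N can be sketched as follows. The manual procedure of pairing parallel passages is not reproduced; as a stand-in, this sketch assumes that Python's difflib alignment is an acceptable way to locate identical consecutive words, and the function name duplicate_string_counts and the toy English sentences are hypothetical. Words count as duplicates only when spelled exactly the same, and only strings of two or more words are tallied, as stated for the Gospel comparisons.

```python
from collections import Counter
from difflib import SequenceMatcher

def duplicate_string_counts(words_a, words_b, min_len=2):
    """Count runs of identical consecutive words shared by two parallel passages.

    Returns (Y, S): Y maps run length I to its number of occurrences, and S is
    the total number of words lying inside runs of at least min_len words.
    Alignment is delegated to difflib, which is an assumption of this sketch.
    """
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    y = Counter()
    for block in matcher.get_matching_blocks():
        if block.size >= min_len:      # single-word agreements are ignored
            y[block.size] += 1
    s = sum(length * n for length, n in y.items())
    return y, s

# Toy English example (hypothetical wording, exact-spelling comparison).
a = "the kingdom of God is at hand repent and believe".split()
b = "the kingdom of heaven is at hand so repent and trust".split()
y, s = duplicate_string_counts(a, b)
n = len(a)                             # N: words in one of the parallel texts
print(dict(y), s, round(s / n, 2))
```

In the toy example the two sentences share two 3-word strings and one 2-word string, so S = 8 out of N = 10 words.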

Results for Luke versus Mark

Here only those verses of Luke were utilized that exhibit a stronger, or much stronger, parallel to Mark than to Matthew. That is, not only were all "Q" verses omitted from Luke here, but so was most of the triple tradition where it is not certain if Luke's version is closer to that of Mark than to that of Matthew, or perhaps is a conflation of the two.41 The results are shown in Table 6:

Table 6. Number of occurrences, Y, of duplicate strings of word length I in parallel passages (in Greek) of Luke and Mark.

I	Y	Exp. Curve
2	92	51.4
3	57	35.2
4	25	24.1
5	11	16.5
6	11	11.3
7	4	7.7
8	5	5.3
9	5	3.6
10	3	2.5
11	2	1.7
12	1	1.15
13	1	0.79
14	0	0.54
15	0	0.37
:
:
26	1	0.0056
27	0	0.0038
28	0	0.0026
29	1	0.00180
30	0	0.00123
31	0	0.00084
32 etc.

The least-squares fit exponential curve, whose values are listed, is in this instance given by

Y(I) = 110 exp(-0.38*I). (6)

The purely exponential fit is seen to be satisfactory for 3 < I < 15, and even somewhat further. However the two long strings encountered at I = 26 and 29 are exceptions. For I > 23 the exponential expectation is that only 0.046 events should occur, not 2, so that their occurrence defies the odds for a normally edited work by a factor of 2/0.046 = 43. They appear anomalous further because in the region 15 < I < 23 no strings occurred while 1 or 2 could be expected. Hence in this case we cannot be certain that they indicate any abnormal editing behavior in view of the possibility that scribal assimilation may have produced their verbal agreement. The string of I=26 occurs at Mk 1:24-25 = Lk 4:34-35, but this might have been separate adjacent strings of 9 and 16 words, or of 9, 12 and 3 words, if the presence of one or two intermediately positioned words of somewhat uncertain reliability, according to Nestle-Aland's 27th edition, had not been judged as reliable.42 The string of I=29 occurs at Mk 10:14-15 = Lk 18:16-17, but might similarly have consisted of two strings of length 12 and 17 words each if a certain word in Luke ("gar" within the witness D and others) had been deemed authentic.

Thus no strong case can be made here that the writer of Luke used abnormal editorial behavior when editing passages from Mark that he utilized, although the consensus of New Testament scholarship is that the writer of Luke did indeed make heavy use of Mark. However, without any motivation being apparent as to why he would choose just two particular lengthy stretches of text to copy without making his usual more frequent alterations, the two anomalies at I=26 and 29 seem best attributable to scribal assimilation. Or, perhaps one of them is due just to the vagaries of chance.

The ratio S/N in this case is 821/2850 = 0.29.

Results for Luke versus Matthew: "Q" verses

The 65 Q passages listed by E. Linnemann are now analyzed in the same manner.43 Results are shown in Table 7:

Table 7. Number of occurrences, Y, of duplicate strings of length I in parallel passages (in Greek) of Luke and Matthew: the "Q" verses.

I	Y	Exp. #1	Exp. #2
2	90	38.1	75.2
3	59	30.9	50.9
4	31	25.0	34.6
5	23	20.3	23.5
6	15	16.4	16.0
7	11	13.3	10.9
8	8	10.8	7.4
9	3	8.7	5.0
10	7	7.1	3.4
11	5	5.7	2.3
12	4	4.6	1.6
13	3	3.8	1.08
14	6	3.0	0.73
15	4	2.5	0.50
16	1	2.0	0.34
17	0	1.6	0.23
18	0	1.3	0.16
19	1	1.1	0.106
20	1	0.86	0.072
21	0	0.70	0.049
22	0	0.57	0.033
23	0	0.46	0.023
24	2	0.37	0.015
25	1	0.30	0.0105
26	2	0.24	0.0072
27	1	0.20	0.0049
28	0	0.16	0.0033
29	0	0.13	0.0022
30 etc.

Values from the exponential curve, Exp. #1, in Table 7 derive from the usual manner of least-squares fit:

Y(I) = 58 exp(-0.21*I) (Exp. #1)

We see that here the value of b, namely 0.21, is significantly smaller than in the other cases for which a pure exponential applied; the value of A = 58 is also unusually small for a data set of this size. As a result, the predicted Y value for I=3 is radically too small, and somewhat too small for I=4 also. On the other hand, the predicted Y value for I = 9 is radically too large. That is, the actual Y data for 2 < I < 10 do appear to decrease in exponential fashion, but the least-squares fit for Exp. #1 does not reflect this because it made use of the deviant Y data for 9 < I < 16. If the criterion for fitting the exponential had been to utilize data only up to but not including the point in I where Y first drops to 3 or less (rather than 2 or less), the previous analyses would remain the same but a much different exponential curve would be obtained in this case. This is the curve labeled Exp. #2 in Table 7, which utilizes the Y data for 2 < I < 9. It is given by:

Y(I) = 164 exp(-0.39*I) (Exp. #2)

We now see that the exponential decay coefficient, b = 0.39, is close to the values previously obtained, and gives a very satisfactory fit to the data for I up to 10, where the deviant values of Y commence.

It may again be wondered if a hypo-exponential curve fit through 3 points would prove satisfactory. The fitting was attempted at the (I, Y) points of (3, 59), (8, 8) and (25, 1). However, no value of n in Y = A*exp(-b*I^n) small enough to provide a fit could even be found; a hypo-exponential curve was even less applicable here than in the case of Matthew-Mark.
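Why no suitable n exists can be seen from the three-point conditions themselves. The following derivation is a sketch added for illustration; the symbols R and g(n) are introduced here and are not used elsewhere in this study.

```latex
\begin{align*}
Y(I) &= A\,e^{-b I^{n}}
  \;\Longrightarrow\;
  \ln\frac{Y_1}{Y_2} = b\,(I_2^{\,n} - I_1^{\,n}),\qquad
  \ln\frac{Y_2}{Y_3} = b\,(I_3^{\,n} - I_2^{\,n}),\\[4pt]
R &\equiv \frac{\ln(Y_1/Y_2)}{\ln(Y_2/Y_3)}
   = \frac{I_2^{\,n} - I_1^{\,n}}{I_3^{\,n} - I_2^{\,n}} \equiv g(n),\\[4pt]
\lim_{n\to 0^{+}} g(n) &= \frac{\ln(I_2/I_1)}{\ln(I_3/I_2)}
   = \frac{\ln(8/3)}{\ln(25/8)} \approx 0.861,\qquad
R = \frac{\ln(59/8)}{\ln(8/1)} \approx 0.961 .
\end{align*}
```

Since g(n) decreases as n increases, its largest value for positive n is the limiting value of about 0.861, which falls short of the required R of about 0.961. Hence no positive n, however small, passes a curve of this form through the three points.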

Thus it appears that we may consider the frequency distribution of curve Exp. #2 as being representative of normal editing, with abnormal editing being responsible here for the occurrence of excessive numbers of duplicate strings of words for values of I of 10 and greater. Although sampling error becomes significant in this range, we find seven adjacent sets of strings that are abnormally long (none shorter than the mean expectation for Exp. #1 or #2). The odds of this happening for a normally edited work are only 1 in 128, or 0.5 raised to the 7th power, since about as many values of Y should fall below the expected curve as above. For the region 17 < I < 24, two strings occur when the expectation with ordinary editing (Exp. #2) is for only 0.6 of a string. For I > 24, where 6 strings occur, the expectation is for only 0.036 of a string altogether.

If one editor/evangelist was using a preexisting Greek Gospel here, this would mean that he purposely copied down intact some 27 strings of length I > 9, avoiding any editing of those strings.44 This appears to be far too many instances to represent the results of later scribal assimilation. The conclusion, then, is that either the writer of Luke edited verses of Matthew in this abnormal manner, or the writer/translator of Matthew edited verses of Luke in this abnormal manner.

The ratio S/N for these Q verses was found to be 1412/4319 = 0.33, larger than might be expected in view of the large amount of recasting of the Q verses by one of the two gospel writers. Although 13 of the 65 passages exhibit only one short string of duplicated words or none at all, this is compensated by others that exhibit many.

It might be thought that the longer duplicated strings in the Matthew-Luke comparison involve the most impressive sayings, and were preserved intact for that reason. However, this is not the case. The longest, 27-word, string combines the praising of God for revealing things only to infants with the declaration that all things had been delivered to Jesus, which is a union of two concepts each difficult to fathom and therefore unlikely to have been previously preserved and known through oral tradition. One of the duplicate 26-word strings involves the cutting asunder of the wicked servant; the 25-word string involves the centurion saying how his soldiers and slaves obey his orders, with its last two words involving narration rather than discourse; the 24-word string involves God raising up children to Abraham out of stones. These are not among the more memorable sayings. On the other hand, many of the more memorable sayings within the Q verses did not survive heavy editing. For example, the three parallel beatitudes (blessed are the poor, the mourning and the hungering) together comprise 28 words in Matthew and 22 in Luke, but contain only two 3-word strings and one 2-word string common to both. The saying concerning "enter by the narrow gate" is some 42 words long in Matthew and 15 in Luke, yet only one string of 3 words is common to both. The "house on the rock" parable contains 95 words in Matthew and 83 in Luke, yet only two 3-word strings are common to both (excluding single-word duplications in all these).

This kind of analysis does not tell us if Luke depends upon Matthew or vice versa, but it does indicate how very unlikely it is that the writer of either gospel made use of a "Q" document, for why would both of them edit their copyings from "Q" in this abnormal manner? If abnormal editing of this kind occurred, it is more plausible to attribute it to one editor/translator than to two. Thus, the present findings support the analyses of those who, for other reasons, find that the Q hypothesis is untenable. B. C. Butler, for example, showed that by the two-document hypothesis (also known as the two-source hypothesis), the writer of Matthew would repeatedly have had to conflate Mark and Q over whole passages, requiring of him a "superhuman but futile virtuosity."45

The question of just how both the parallel verses between Matthew and Mark, and those between Matthew and Luke, could exhibit the same abnormal frequency distribution in their verbal agreements will be deferred to the next section. Presumably only one evangelist or translator would be responsible for such unusual editorial behavior, as already noted. Thus, not only does the two-document hypothesis fail to explain it, but so does the Griesbach hypothesis. The latter fails on other counts as well, including its inability to explain why the Greek grammar of the third gospel, if it were Mark, would be by far the worst of the synoptic gospels.46

After the first version of this study was completed and made known on the web, a similar, earlier study by John Poirier came to my attention.46.1 He did not delve into the frequency distribution, though he also found that the excess of lengthy word-strings in the double tradition's parallels (Matthew-Luke) and in Matthew-Mark parallels pointed to Matthew as the middle term. He concluded from this that the writer of Matthew had made use of both Mark and Luke, but he did not consider the alternative solution presented in Part IV.

It is appropriate now to sketch out a scenario that is consistent with the findings from the above frequency distributions. It will involve alternatives mentioned in Part I and will be consistent with the external evidence.

IV. A Solution Embracing Realistic Editorial Behavior

In keeping with Part I's demonstration of how both the two-document and Griesbach hypotheses evolved out of the theological commitment to avoid impugning the standards of truth held by the gospel writers, we return to the Augustinian hypothesis. We observe that Butler's argumentation in favor of Matthew having first come out in the Semitic tongue, or in Aramaic as he specifically preferred, is quite compelling, as well as agreeing with the external evidence implied by Papias and stated by Irenaeus, Origen and Augustine.47

However, the similarity in Greek wording between Matthew and Mark caused Butler to assume that Mark was not written until after Matthew had been translated into Greek, at which time the writer of Mark is then supposed to have utilized Greek Matthew. Butler did explore a particular Augustinian theory due to J. M. Vosté, however, which postulated that the writer of Mark utilized Aramaic Matthew, and that it was a later translator of Matthew into Greek who duplicated pieces of Mark's Greek text.48

Butler noted a strong argument in favor of Vosté's theory: Mark shows some Aramaisms that are absent from parallel passages in Matthew.49 Yet Butler felt it was too complicated a theory, though his counter arguments seem very weak. His chief one was that "It is psychologically most improbable that the translator of Matthew, with a good Aramaic text to work from, and a perfectly adequate command of Greek, if we may judge from his performance in the non-Marcan sections of his gospel, should nevertheless deliberately adapt, in his rendering of the Aramaic Matthew, the poor Greek of Mark."50

However, Butler did not attempt to demonstrate that Matthew's Marcan parallels systematically exhibit poorer Greek than its non-Marcan sections, and indeed a prime criticism of the Marcan priorists against Matthean precedence is that the writer of Mark, if coming second (or third), would not have "dumbed down" what he copied from Greek Matthew. And Butler failed to consider reasons other than Vosté's as to why the translator would have chosen to replicate some lengthy strings of Greek wording within Mark. Vosté's theory avoids the "dumbing down" criticism, which Butler's does not, and Butler's only recourse was to assume that Mark embodies rough Greek discourse due to having derived from Peter while (John) Mark used Matthew as a guide, when he and Peter were in Rome together in 50-60 A.D.51 Objections here to Butler, however, are:

it is not likely that Peter, the ex-fisherman, knew much Greek, since (John) Mark was his interpreter;
if any of the Gospels had been written that early, surely they would have been referenced long before several decades into the 2nd century, and referenced by name;
Mark does not really seem to have been written from the viewpoint of Peter;
within an Augustinian framework it is especially illogical to posit Peter as an important source of Mark, since so much of Mark follows Matthew's text so closely, excluding Mark's omissions of Matthean material;
if Peter had really been the source of the Gospel of Mark, surely that gospel would have borne Peter's name.52

Moreover, H.-H. Stoldt has supplied a chapter full of reasons why Peter was not an informant for the writer of Mark.53

According to Butler, Vosté had assumed that the translator of Aramaic Matthew into Greek had utilized some of Mark's Greek language either as a guide and convenience in translation, or out of respect.54 However, neither of these possibilities satisfied Butler, and neither will fit the present theory. The translator of Semitic Matthew evidently knew Greek better than did the writer of Mark, and so did not need Mark as a guide. And after reading how the writer of Mark omitted so much from Matthew, how he had made alterations that portrayed the disciples in a bad light and how he had placed many Matthean verses in the wrong order, the translator would have had little or no respect for him, psychologically speaking.55

So Butler seems correct in not following Vosté on this point; however, it was probably subconscious theological commitment that prevented either of them from arriving at a plausible explanation. Before presentation of this explanation, it is important to consider how Luke fits into the picture from a fresh Augustinian viewpoint. Butler did not question whether Luke's dependence upon Matthew was upon Semitic Matthew or upon Greek Matthew. He simply assumed the latter, due to the strong verbal dependence between Matthew and Luke we have quantified in Part III combined with all the other indications that Luke depends upon Matthew. However, it is psychologically inconceivable that the writer of Luke, who preferred the pro-gentile Mark over the anti-gentile Matthew so much as to incorporate the Matthean verses Mark omits in improper context and order, and to follow Mark's order and content where Mark deviates from Matthew's order, would then have turned around and followed Matthew's Greek wording meticulously in some 27 instances, carefully duplicating strings of up to 27 words in a row. This is a most important point.

At this juncture we therefore examine the situation from the point of view of the translator of Matthew, if his translation had been made after Mark and Luke came out. By then it would only have been prudent of those of the provenance where Matthew originated to decide that a Greek version of Matthew was urgently needed. Otherwise, the first and primary Gospel was in grave danger of becoming forgotten or obsolete within gentile Christianity. This translator would naturally have desired that Matthew retain its primacy in authority, but since its Greek translation was being written only after Mark and Luke had already appeared, would Matthew still be regarded as primary? The writers of Mark and Luke had made so many omissions, alterations and additions to Semitic Matthew that future readers might treat them as quite different writings from Greek Matthew, if Greek Matthew did not use familiar wording, and therefore might credit Mark and Luke as being more primary and authoritative than Greek Matthew.

To avoid this, the translator of Matthew had one weapon at his disposal. He could show how similar Mark and Luke were to Matthew by purposely copying significant stretches of their Greek language in those places where they had made no significant alterations in meaning to Semitic Matthew. Then, as it must have been well known at the time that Semitic Matthew had predated Mark and Luke, there would be no question but that Matthew was the primary gospel, to be available from then on in Greek, and that Mark and Luke were little more than alterations of it. It should be kept in mind that at the time the Gospels were written, it could have been no secret which one had come first, which had come second, and which third.56

So also the translator of Matthew would have known he had no chance of pretending that his Greek Matthew had come before Mark and Luke; it would be common knowledge that it was written afterwards. But to recapitulate, with the continued spread of Christianity into gentile lands in the early 2nd century, this translator would likely have realized that Semitic Matthew was doomed to future obscurity. Hope for the primacy of Matthew to be retained by its late-appearing Greek version could best be achieved by making it as clear as possible that Mark and Luke were just altered versions of Matthew, their writers having extracted much from Semitic Matthew. Therefore the translator rendered this fact of copying as evident as possible by returning the favor: he replicated lengthy strings of Greek words wherever feasible from both gospels, which replication has been seen in Part III.

It is interesting to observe, then, that with the present hypothesis, not only do Mark and Luke both depend upon (Semitic) Matthew, but some of the detailed wording in Matthew in turn depends upon Mark and Luke. This behavior on the part of the translator of Matthew, relative to Luke, would have the added advantage of helping draw attention to many of the spots (in the "Q" verses) where Luke takes Matthean verses far out of context and which therefore might be mistaken as original Lucan material. Thus the translator of Matthew was reaffirming the primacy of Matthew, so that its Greek version within gentile lands would maintain the eminence that its Semitic version had held within Jewish quarters.

It is likely that it was during this process of translation of Semitic Matthew into Greek that the translator added the longer ending to Matthew (i.e., "...make disciples of all nations..."), which would offset Matthew's anti-gentile statements elsewhere. It is only common sense that if their gospel was to be promoted amongst gentiles and made available to them by being translated into Greek, it would have to be made less unfriendly to gentiles.

However, the eight or so verses overtly hostile to gentiles and to gentile discipleship in Semitic Matthew could not at that late date be removed in forming Greek Matthew, as their existence was too well known; little more could be done than to add a counteracting pro-gentile passage to the end. Quite likely a few other, more minor, alterations and additions (such as Mt 12:17-22) were made to Matthew by the translator, and these, together with the added pro-gentile ending, provided further incentive to phase out the original Semitic Matthew as quickly as possible in ensuing years. If later Aramaic texts of Matthew were desired, they would have had to be translated from the Greek.

Thus the translator of Matthew, in replicating various strings of words within Mark and Luke, did not do so as a guide or convenience in translating. However, he likely did utilize the LXX for this purpose at places where Semitic Matthew's rendition of a Hebrew scriptural verse did not deviate very much from it.

By the present hypothesis the writer of Luke much preferred Mark over Matthew, but this would not have been cause for him to utilize abnormal editing behavior of the type analyzed here, with respect to Mark. It is true he would follow Mark's order and content most closely where Mark deviates from Matthew's order, but he had good reasons for not desiring to purposely copy lengthy strings of Markan words without editing them. These include:

The writer of Luke needing to improve upon Mark's poor grammar and remove most of its redundancies or pleonasms,
the desire that his gospel appear distinct from Mark and not be regarded as any copy of it (which he achieved additionally by omitting a substantial chunk from Mark, by including his own special material, and by including much Matthean material out of context).57

With the present hypothesis, the writer of Mark is seen as also having made many of his minor deviations from the content of Semitic Matthew as changes for the sake of change. Frequent abbreviation of Matthew and deviation from its order, specifically in the area of Mark's parallels to material within Mt 8-11, and writing in Greek rather than in Aramaic or Hebrew, were not enough to ensure that Mark would not be a replication of Matthew. Addition of pleonasms throughout and touches of apparent vividness were needed in this attempt at creating an aura of priority, in which the writer of Mark was not successful until the 19th century. Thus the translator of Semitic Matthew into Greek was not working within any vacuum in seeing to it that Greek Matthew would retain the authority its Semitic version had held; he could easily discern the attempts that had been made by the writers of Mark and Luke to dislodge this authority.

An important feature of the present theory deserves mention in order to understand how so much of Matthew, if it deserves priority, nevertheless appears to be made up of redactions. Even if many of the criticisms against Matthew's genuineness are incorrectly based upon the assumption of Marcan priority, many more are based upon considerations independent of that assumption.58 So how could Matthew enjoy priority? The key consideration here is that Papias's statement about Matthew can be interpreted as meaning that Semitic Matthew was based upon the Logia, and that the latter was a very extensive manuscript or set of scrolls of narrations and sayings, as emphasized by Linnemann, since "ta logia" can mean "what the Lord said or did," not just sayings.59 The five separate treatises that Papias wrote concerning the Logia support this interpretation of the Logia having been an extensive writing. The large number of redactions in Matthew are unlikely to have come about, then, by anything other than the Logia having been unacceptable or heretical in various respects, requiring much editing before becoming sanctionable. This would also explain why the Logia did not survive, nor any of Papias's five treatises on them except what Eusebius extracted from the latter and presented as examples, intended to make Papias seem foolish.60

With the present scenario, the writer of Matthew was a different person than its later translator into Greek. This writer appears to have been a late holdout for the anti-gentile sentiment characteristic of Pharisees, which he carried over into early Christianity, and would therefore not likely have wished to set his gospel into the Greek language nor to add its longer pro-gentile ending.

In conclusion, this modified Augustinian hypothesis makes use of the external evidence that the two dominant, more recent schools of thought have largely abandoned, as well as being consistent with the internal evidence. It explains the verbal agreements in Greek between the Matthew-Mark and Matthew-Luke gospel parallels in human terms. A drawback for some is that theological commitment is in no way upheld. However, the fact that such commitment dates back to the time of composition of the Gospels themselves explains very much. Among other things, the shameful editorial behavior portrayed here explains why so very little has survived about the origin of the Gospels: when, where, why and by whom they were composed, as this should have been a most glorious time in the growth of the early church about which much should otherwise have been written.61

For detailed discussion of the internal evidence in support of this hypothesis, however, and the ease by which arguments that have been utilized against the Augustinian hypothesis can be reversed, or shown to be inapplicable to its present modified form, the reader must turn to references already cited.

Endnotes

1. E.g., see Burnett Hillman Streeter, The Four Gospels (London: Macmillan, 1964); Werner Georg Kümmel, with P. Feine and J. Behm, Introduction to the New Testament, 14th ed., transl. by A. J. Mattill (Nashville: Abingdon Press, 1966); Donald Guthrie, New Testament Introduction, 3rd ed. (Downers Grove, Illinois: Inter Varsity Press, 1970); W. Barnes Tatum, The Quest of Jesus: A Guidebook (Atlanta, Georgia: John Knox Press, 1982); C. M. Tuckett, The Revival of the Griesbach Hypothesis (New York: Cambridge University Press, 1983). The two-document hypothesis is also called the two-source hypothesis, Mark and Q being the two sources.

2. See Arthur J. Bellinzoni, ed., The Two Source Hypothesis: A Critical Appraisal (Macon, Georgia: Mercer University Press, 1985), pp. 97-217.

3. B. C. Butler, "The Synoptic Problem," in A New Catholic Commentary on Holy Scripture, R. C. Fuller, L. Johnston and C. Kearns, eds. (Nashville: Nelson, 1969), pp. 815-821; Pierson Parker, "The Posteriority of Mark," in New Synoptic Studies (Macon: Mercer University Press, 1983), pp. 67-142; William R. Farmer, The Synoptic Problem: A Critical Analysis (Dillsboro, North Carolina: Western North Carolina Press, 1976); J. W. Deardorff, The Problems of New Testament Gospel Origins (Lewiston, New York: Edwin Mellen Press, 1992), pp. 121-156.

4. B. C. Butler, The Originality of St Matthew: A Critique of the Two Document Hypothesis (Cambridge: University Press, 1951); A. W. Argyle, "Evidence for the view that St Luke used St Matthew's Gospel," JBL 83 (1964), pp. 390-396; R. T. Simpson, "The major agreements of Matthew against Mark," NTS 12 (1965-66), pp. 273-284; E. P. Sanders, "The argument from order and the relationship between Matthew and Luke," NTS 15 (1969), p. 261; Farmer, Synoptic Problem, pp. 220-225; Michael Goulder, Luke: A New Paradigm, vols. 1 and 2 (Sheffield, England: JSOT Press, 1989). Seven other scholars whose studies found Luke to depend upon Matthew are listed in Kümmel, Introduction to the New Testament, p. 50.

5. H. G. Jameson, The Origin of the Synoptic Gospels (Oxford: Basil Blackwell, 1922); John Chapman, Matthew, Mark and Luke: A Study in the Order and Interrelation of the Synoptic Gospels (New York: Longman's, Green and Co., 1937); John Wenham, Redating Matthew, Mark and Luke: A Fresh Assault on the Synoptic Problem (Downers Grove: InterVarsity Press, 1992).

6. Farmer, Synoptic Problem, p. 14 (in discussing what was already common knowledge to Herbert Marsh in 1798); Butler, "Synoptic Problem," p. 821.

7. Irenaeus, Adv. Haer. III.1.1; Eusebius, Ecclesiastical History (EH) V.8.1-5, VI.25.3-6; Augustine, De Consensu Evangelistarum I.3-4.

8. See Papias as quoted by Eusebius in EH III.39.14-16. Papias could not have meant improper order relative to an oral gospel, as the context was one of written gospels; moreover, there is no evidence that any lengthy oral gospel paralleling any of the Gospels ever existed. Mark's lack of order relative to Mt 8-11 stands out as being what Papias referred to.

9. Farmer, Synoptic Problem, p. 8. Farmer's review of the synoptic problem in his chaps. 1-4 is especially comprehensive and relatively impartial, and will be utilized extensively here.

10. Hado Uden Meijboom, A History and Critique of the Marcan Hypothesis 1835-1866 (Groningen, Neth.: University of Groningen, 1866), J. Kiwiet, transl. & ed. (Macon: Mercer University Press, 1993), p. 17. Meijboom will also be referenced frequently here, as his work appears to be quite unbiased and yet contemporary with the emergence of the Marcan hypothesis. Herder's observation that the vividness and freshness in Mark (in some places, at least) points to Marcan priority does have much merit, however. This falls out as a byproduct of the modified Augustinian hypothesis of Deardorff, Problems of Gospel Origins, chap. 5.

11. Farmer, Synoptic Problem, pp. 33-34.

12. Farmer, Synoptic Problem, p. 41.

13. Schleiermacher, "Über die Zeugnisse des Papias von unsern ersten Evangelien," Theologische Studien und Kritiken, 1832, pp. 735-768.

14. Meijboom, History and Critique, pp. 48-49.

15. Sanders, Jesus and Judaism, pp. 333-334.

16. As attested to by Clement of Alexandria and also Irenaeus, and reported by Eusebius, EH V.8.2-3, VI.14.5-7.

17. E.g., see Phillip Segal, "Aspects of Mark Pointing to Matthean Priority," in Farmer, New Synoptic Studies, pp. 187-190.

18. Streeter, Four Gospels, p. 158.

19. Stoldt, Hans-Herbert, History and Criticism of the Marcan Hypothesis, transl. Donald L. Niewyk (Macon, GA: Mercer Univ. Press, 1980), pp. 227-235.

20. Meijboom, History and Critique, p. 40.

21. Eusebius, Theophania, Bk. I, 24.

22. Meijboom, History and Critique, pp. 38-39.

23. Farmer, Synoptic Problem, pp. 40-41.

24. Parker, "Posteriority of Mark," pp. 67-142.

25. It would be naive to assume that no influential 19th century scholars were aware of this problem.

26. Eugene E. Lemcio, The Past of Jesus in the Gospels (Cambridge: Cambridge University Press, 1991), p. 49.

27. Deardorff, Problems of Gospel Origins, pp. 76-77.

28. George Howard, "A note on the short ending of Matthew," Harvard Theol. Rev. 81 (1988), p. 119.

29. The denigration may even extend to Jesus' Jewish "friends" in terms of their supposed belief that Jesus was "beside himself" (Mark 3:21).

30. Streeter, Four Gospels, p. 183.

31. The writer of Luke could furthermore express his disdain for Matthew by frequently contradicting Matthew's content.

32. Deardorff, Problems of Gospel Origins, pp. 93-108.

33. It might be thought that it was the followers of the Augustinian school who most strongly upheld the faith, by maintaining traditional views. This may have been correct within their own minds, but their reasons for brushing aside the problems that came to disturb the other two schools of thought do not hold up.

34. I thank George Howard for suggesting this test case. For the parallels within the 1 Esdras text I used Septuaginta, vol. 1, Alfred Rahlfs, ed. (Stuttgart: Württembergische Bibelanstalt, 1935). For the canonical LXX I used The Septuagint Version: Greek and English, Lancelot C. L. Brenton, ed. (Grand Rapids: Zondervan Publishing House, 1970). The latter source is believed to be two or three centuries more recent than the former; see Charles C. Torrey, The Apocryphal Literature (Hamden, Conn.: Archon Books, 1963).

35. I = 0 corresponds to not drawing any black balls in a row, which is to say, drawing a red ball that is followed by a black. Hence the coefficient A is given by the total number of such red balls drawn, allowing for some sampling error.

36. For the 1 Esdras translation I used The Apocrypha, Edgar J. Goodspeed, transl. (University of Chicago Press, 1938); for the English canonical texts I used the RSV Bible.

37. The single string of 21 words occurs at 1 Esdr 8:54-55 = Ezra 8:24-25. Within the Greek parallels, this area of text is overlapped by shorter duplicate strings of 2, 3, 2 and 11 words.

37.1 In comparing the two translations, one may notice that where Laurence used "saw" Charles often used "beheld," and similarly for: see/behold, went/proceeded, dwelling/habitation, transform/change, trouble or anxiety/affliction, extremities/ends, power/might, top/summit, path/way, conducted/led, exhibited/showed, arrive/come, obtained/acquired, extreme/great, wealth/riches, disclosed/revealed, remembrance/memorial, rise/arise, deeds/works, respecting/regarding, secure/safe, and others. The only difference between words that I ignored, besides capitalization, was the theological English often used by Charles, such as "ye," "thee" or "thou" for "you," "cometh" for "come" and the like.

38. The analysis started on verse 81 of the TJ's first chapter, thereby omitting the lengthy genealogy, which was not at the discretion of the editors/translators to alter, and continued to the end of its 36th chapter, for a total of 1703 verses averaging about 27 words per verse. Breaks between chapters were of course ignored.

39. The parallels utilized here are those listed in Deardorff, Problems of Gospel Origins, App. 2. Or see Section 2 of this website document. Some 468 verses in Matthew and 574 verses in Mark are involved. The Greek text used for this and the other NT analyses herein is the 21st edition of Eberhard and Nestle's Novum Testamentum Graece. The longer duplicate strings of the Matthew-Luke (Q) parallels have been checked against N-A 27, however, with no changes occurring.

40. These two very lengthy strings occur at Mt 16:24-25 = Mk 8:34-35, and at Mt 10:21-23 = Mk 13:12-14, respectively.

41. The Lucan verses utilized (along with their Marcan parallels) are: 4:31-44; 5:12-16,18-28,29b-35; 6:1-2,6-16,17b-19a; 7:36b-37; 8:4-15,17-23,25-43a,45-56; 9:4-5,8-9,13-16,18,23,26-29,33-35,37-38,40,42-43b,46-50; 12:12,41; 17:2; 18:15-22,24,26,29-30,32-33,35,37-43, which encompass 40 pericopes.

42. These words involve "oida" ("I know") and "legon" ("saying") in Mark, the first of which is the more uncertain, according to Nestle-Aland's 27th edition, and a three-word phrase in Luke.

43. Eta Linnemann, "Is there a Gospel of Q?" BR 11 (Aug. 1995), p. 21. These stem from Siegfried Schulz, Griechisch-deutsche Synopse der Q-Überlieferungen (Zürich: Theologischer Verlag, 1972), pp. 5f.

44. 27 is the approximate number of occurrences that lie above the exponential curve #2 for I > 9.

45. Butler, Originality of St Matthew, chap. 1.

46. See Deardorff, Problems of Gospel Origins, chap. 11.

46.1 Poirier, John Christopher, "New Items in Synoptic Interdependence," Master of Theology thesis, The Divinity School, Duke University, Durham, NC, 1993.

47. Butler, Originality of St Matthew, chap. 10. There is an opposing contention of J. Kürzinger, Papias von Hierapolis und die Evangelien des N.T. (Regensburg, 1983), supported by B. Orchard and H. Riley in The Order of the Synoptics (Macon: Mercer University Press, 1987), pp. 198-199, that Papias's "in the Hebrew dialect" or "in the Hebrew language" actually meant "in a Hebrew style," which in turn meant "in Greek but with a Hebrew style." I believe this is reaching much too far to support one's position. In addition it assumes that the knowledge by Irenaeus and others of Matthew having been written in Hebrew or Aramaic depended only upon Papias's testimony instead of being more widely known at the time.

48. Jacobus M. Vosté, De synopticorum mutua relatione et dependentia (Rome: Collegio Angelico, 1928). This theory actually dates back at least to Theodor Zahn, Introduction to the New Testament, v. 2 (Edinburgh: T. & T. Clark, 1909) 570-612, and is supported by H. G. Jameson, The Origin of the Synoptic Gospels (Oxford: Basil Blackwell, 1922) 23-24.

49. Butler, Originality of St Matthew, p. 159.

50. Butler, Originality of St Matthew, p. 160.

51. Butler, Originality of St Matthew, pp. 167-169.

52. However, the statements from Clement of Alexandria about Peter and (John) Mark in Rome, and a document Mark wished to disseminate, should not be ignored; such a document (Peter's "Memoirs") was known also to Justin Martyr of Rome (Dialogue with Trypho, chap. 106). A likely connection between Peter in Rome and the document that (John) Mark possessed there, which accords with the five listed objections, is given by Deardorff, Problems of Gospel Origins, chap. 3.

53. Stoldt, Hans-Herbert, History and Criticism of the Marcan Hypothesis, transl. Donald L. Niewyk (Macon, GA: Mercer Univ. Press, 1980), chap. 11 (pp. 185-200).

54. Butler, Originality of St Matthew, pp. 159-160.

55. Mark's verses that are out of order correspond primarily to the parallels of Mt 8-11. A plausible explanation for this is presented in Deardorff, Problems of Gospel Origins, chaps. 4-5.

56. Hence the tradition that Matthew had come first, written in the Hebrew tongue, must have been based upon fact. There was no theological commitment involved in such an undeniable fact at that time, and thus no motivation for this tradition to have been the result of a false rumor.

57. Luke's "Great Omission" of Mk 6:45-8:26 can also be explained largely on this basis: its writer had followed Mark's order sufficiently long up to that point that to continue doing so would render his gospel too similar to Mark. In addition, it is only a little before Mk 6:45 that Mark's order commences to follow Matthew's order quite rigorously, which the writer of Luke preferred not to do.

58. E.g., see Francis Beare, The Gospel according to Matthew (San Francisco, Harper and Row, 1981).

59. Linnemann, "Is there a Gospel of Q?" BR 11 (Aug. 1995), p. 20.

60. Orchard and Riley, Order of the Synoptics, p. 195.

61. Also, the less than ethical editorial behavior of the various evangelists explains why not one of the synoptic gospels' writers mentions his own role in the writing or his dependence upon preceding gospels (except perhaps in Lk 1:1-4) and/or upon the Logia. Humility on the part of the evangelists does not explain this, as it is not consistent with their editorial actions.
