
Scientific report: "An Experimental Study of Ambiguity and Context"


Ambiguity is the common cold of the pathology of language. The logician recognizes equivocation as a frequent source of fallacious reasoning. The student of propaganda and public opinion sees in ambiguity an enormous obstacle to successful communication.


[Mechanical Translation, vol.2 no.2, November 1955; pp.39-46]

An Experimental Study of Ambiguity and Context*

Abraham Kaplan, Department of Philosophy, University of California, Los Angeles

Ambiguity is the common cold of the pathology of language. The logician recognizes equivocation as a frequent source of fallacious reasoning. The student of propaganda and public opinion sees in ambiguity an enormous obstacle to successful communication. Even the sciences are not altogether free of verbalistic disputes that turn on confused multiple meanings of key terms.

Special importance attaches to ambiguity as a result of the growing interest in the possibilities of mass translation: rapid and routine translation of large bodies of material. The simplest expedient, as a first approximation, is word by word translation: a word for word substitution carried out by essentially clerical methods, very possibly by machine. But word for word substitution is hardly usable when the words of both languages are even moderately ambiguous.

It is a familiar fact that ambiguity of isolated words is reduced by the contexts of their occurrence. The total behavioral situation in which language functions is decisive in determining what will be communicated. For many problems, however (and in particular, that of mass translation), the behavioral situation is not accessible. The 'context' (itself an ambiguous word) must here be taken to consist of the verbal setting in which the word to be interpreted occurs, i.e., the other words with which it is being used.

The problem of this study is to determine to what extent and in what ways verbal setting reduces ambiguity. Is ambiguity primarily a feature of words in isolation, or does it persist to some extent even in context? What part of the context is most effective in reducing ambiguity; for instance, how is the ambiguity of a selected word affected by the words immediately preceding and following it, as compared with the effect of the entire sentence in which it occurs? Does it matter whether the immediate context consists solely of particles? How is the reduction in ambiguity affected by the linguistic sensitivity of the translator? By the multiplicity of senses of the isolated word? By the clarity of the word; that is, the ease with which its multiple senses are identified? These are the questions to which this study is addressed.

Two important restrictions on this study are to be noted.

In the first place, it deals with ambiguity of single words, not homonyms (word types, not word tokens [1]): the four letters "blow" actually may constitute a single word, semantically and grammatically speaking, or may be one of several homonyms: a) to send forth a current of air, b) a wind or gale, c) a blossoming or blooming, or d) a forcible act or effort. There is no doubt that the setting usually allows us to distinguish nouns from verbs, for example, hence among homonyms which are different parts of speech. The problem here will be to distinguish the multiple senses of a single word. For instance, the verb "blow" has several senses: a) producing a noise by blowing, b) panting or puffing, c) talking loudly or boastfully, and so on. These are related senses, and as a group quite distinct from the senses of the homonym "blow" which means "to blossom."

The ambiguity with which this study is concerned is thus more subtle than homonymy. Whatever analysis is to be given of the distinction between homonyms and single words, it is reasonable to suppose that the effect of context on homonym-ambiguity is more marked than that of the single-word-ambiguity here dealt with.

A second restriction on the study is this. It is not concerned with what ambiguity actually occurs in written material. The attempt is to determine the reduction of ambiguity by context, and not the actual frequencies with which ambiguities and their reductions occur. To be sure, the material selected is presumed to be sufficiently representative of actual discourse to make the results of practical relevance. But this presumption is not itself being tested here. All the cases studied are actual cases; the contexts were selected from published texts and were not constructed for the study. Nor were words selected on the basis of the kinds of contexts in which they occurred, except for certain formal requirements described below.

Procedure

A group of "translators" was presented with a set of words, each with a number of possible meanings to be judged applicable or not. The words were first presented in isolation, then in certain standard contexts.

* Reprinted with permission of the Rand Corporation from their report P18, dated November 30, 1950, which has been out of print for several years.

[1] For a discussion of this distinction, and a comprehensive survey of contemporary semantics, see C. W. Morris, Signs, Language, and Behavior, 1946.
The sample was derived entirely from the literature of pure and applied mathematics. This selection was made partly because of the background of the translators used in the experiment, partly because it is commonly supposed that such material involves less ambiguity than non-scientific writing, or even that of some other scientific disciplines. The specific books used are as follows:

No. of
Samples   Source
   15     Alexander, J., Colloid Chemistry, Vol. III, Chemical Catalog Co., 1931
   15     Holmboe, J. et al., Dynamic Meteorology, Wiley, 1945
    9     Lefschetz, S., Introduction to Topology, Princeton, 1949
   15     Moulton, F. R., Introduction to Celestial Mechanics, Macmillan, 1914
   15     v. Neumann, J. and Morgenstern, O., Theory of Games and Economic Behavior, Princeton, 1947
   15     Richter, W., Fundamentals of Industrial Electronic Circuits, McGraw Hill, 1947
   14     Stuhlman, O., Introduction to Biophysics, Wiley, 1948
   12     Weyl, H., Philosophy of Mathematics and Natural Science, Princeton, 1949
   15     Williams, C. D. and Harris, E. C., Structural Design in Metals, Ronald Press, 1949
   15     Zemansky, M. W., Heat and Thermodynamics, McGraw Hill, 1943
  140     Total

The contexts were provided by sentences selected at random from these books, not drawn, for example, solely from prosy introductory chapters. On the other hand, "symbol-heavy" sentences which would require either specialized knowledge or considerable portions of text for their interpretation were omitted. Sentences were selected to vary in length from 15 to 40 words; occasionally, dependent clauses irrelevant to the clause in which the key word occurred were omitted. The distribution of sentence lengths was:

Number of Words   Number of Sentences
15 - 19                   33
20 - 24                   56
25 - 29                   39
30 - 34                    8
35 - 39                    4
Total                    140

The key words selected were limited to nouns, verbs, and adjectives; these are the major carriers of the content of any discourse, and probably more markedly exhibit ambiguities. The position of the word in the sentence was varied at random, to avoid overemphasis on the special contexts constituted by opening and closing phrases. The first and last two words of the sentence were never selected, so that contexts could be restricted to a single sentence. No mark of punctuation was allowed to occur within two words on each side of the key word, so as to simplify the appraisal of the effect of verbal setting. Only words of sufficiently general use to be included in the Fifth Edition of Webster's Collegiate Dictionary were chosen; and it was required that the dictionary distinguish at least three senses of the word.

Although frequency of use was not a criterion of selection, it was afterwards found that all of the 140 words selected appear in The Teacher's Word-Book of 30,000 Words [2]. Seventy-four of the words are among the thousand most frequent words in the English language; of these, forty-four are among the first 500. The following is the frequency of occurrence per million words in the Thorndike-Lorge count:

Frequency   Number of cases
Over 100           76
50 - 99            31
25 - 49            18
2 - 24             15
Total             140

The actual key words used in the sample are listed in Table I.

TABLE I
Key Words Used

appear, approaches, assume, attached, balance, bears, broad, care, case, cells, change, character, class, classical, clear, close, come, compose, conceived, conditions, connections, consideration, contain, contracts, converted, course, current, cycle, deductions, degree, depending, determined, developed, device, diaphragm,
direct, dropped, due, elements, established, eye, field, flow, force, formal, found, free, function, general, generation, given, goes, good, ground, heads, heat, induced, introduced, leading, levels, lies, little, load, lower, maintained, make, mass, material, model, motion,
narrow, nature, new, normal, note, numbers, observed, origin, part, particle, passes, people, period, phase, place, point, position, possesses, power, produce, product, projection, properties, protection, provides, put, raised, reached, reaction, reference, relations, requires, rest, rise, runs,
scale, screen, separated, serve, set, shank, shape, show, skin, slight, solution, spirit, spread, state, strong, study, subject, substance, survey, system, tension, terms, tests, time, tool, transmitting, treated, tubes, types, used, value, view, words, work, world

For each word, a number of possible senses was listed, obtained from the dictionary entry for that word. The fully inflected form of the word was used, e.g., the plural or past tense if this was the form of its occurrence. It was required that the senses listed be clearly distinguishable (in the judgment of the experimenter) from one another; this did not by any means coincide with the numbered senses in the dictionary entry. Obsolete, archaic, colloquial, and highly technical senses were omitted. A maximum of ten senses was selected. Wherever necessary, the total number of senses was made up to ten by adding an appropriate number of "false" senses, obtained from dictionary entries for words of the same part of speech. The average number of "correct" senses of the words in the sample was 5.6, approximately the degree of ambiguity in actual discourse [3]. The distribution of words in the sample with various numbers of senses was:

Number of Senses   Number of Words
        3                 16
        4                 33
        5                 30
        6                 25
        7                  7
        8                 14
        9                  5
       10                 10
Total                    140

[2] By E. L. Thorndike and I. Lorge, Columbia University Press, 1944.

[3] See G. K. Zipf, Human Behavior and the Principle of Least Effort, Addison-Wesley Press, 1949, p. 30.
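The reported average of 5.6 true senses per word can be cross-checked against the distribution of words by number of senses given above. The following is a minimal sketch of that arithmetic; the variable names are illustrative and do not come from the paper.

```python
# Number of key words having each count of true senses,
# copied from the distribution table in the text.
senses_to_words = {3: 16, 4: 33, 5: 30, 6: 25, 7: 7, 8: 14, 9: 5, 10: 10}

total_words = sum(senses_to_words.values())
total_senses = sum(n * count for n, count in senses_to_words.items())

# Weighted mean number of true senses per key word.
mean_senses = total_senses / total_words

print(total_words)            # 140
print(round(mean_senses, 1))  # 5.6
```

The counts sum to the 140 words of the sample, and the weighted mean rounds to the figure quoted in the text.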
Examples of words with the senses listed (including the "false" ones) are given in Table II, below.

The study was carried out with the help of seven "translators", four of whom had considerable training in the mathematical sciences, the other three having only a high school education.

Words were first presented in isolation (the so-called null context). Each translator indicated which of the ten senses for each word appeared to him to be senses in which the word might sometimes be used. In the second phase, seven contexts were employed, derived from the sentence of the actual occurrence of the word. These contexts were:

the word preceding (P1)
the word following (F1)
both of these (B1)
the two words preceding (P2)
the two words following (F2)
both of these (B2)
the entire sentence (S)

Words were presented to the translators in one or another of these contexts, and acceptable senses were again indicated by them. The design used had the properties that each translator was presented with all the words in some context or other; each word appeared in all the contexts; each context had all the words in it; and no person faced the same word in more than one context. Thus each subject made two interpretations of each word: once in the null context, and once in some verbal setting.

TABLE II
Examples of Words and Senses

Starred senses are actual ones. (Of course, no stars were printed in the sheets from which the translators worked.)

appear
   1) shine faintly
  *2) be obvious or manifest
  *3) come before the public
   4) come or go near
   5) be in great plenty
  *6) attend before a tribunal
  *7) seem, look
   8) pass or move suddenly or quickly
  *9) become visible
  10) look steadfastly; meditate

approaches
  *1) approximations
  *2) preliminary steps
   3) summaries, epitomes
   4) suppressions, suspensions
   5) wants, lacks
  *6) ways, passages
   7) posterior sections
   8) dwellings, sojourns
   9) skills
 *10) advances

assume
   1) snatch, seize
   2) derived by reasoning or implication
  *3) suppose
   4) come into possession of
  *5) undertake
  *6) appropriate, usurp
  *7) feign, sham
   8) swallow eagerly
   9) hold in possession or control
 *10) receive, adopt

Results

The accuracy of a translator was measured by the number of his correct characterizations of a listed sense as actually belonging to the word or not: ascriptions of true senses plus denials of false senses. (This measure could be used only for the null context, where the true senses are specified by the dictionary; no such standard is available for occurrences in context.) The seven translators ranged in mean accuracy for all the words from 62% to 84%, around a mean of 75%. The four trained in mathematics averaged 80% accuracy, the other three 70%. Since the isolated words are not distinctively mathematical, the difference is presumably due to general linguistic facility.

The clarity of a word is defined as the mean accuracy attained on it by the seven translators. (Like accuracy, therefore, it applies only to the null context.) The mean clarity for all the words was 75% (being linked to the mean accuracy). The distribution was:
Clarity (%)   No. of cases
40 - 49             1
50 - 59             4
60 - 69            29
70 - 79            57
80 - 89            41
90 - 99             8
Total             140

Unclarity was not due markedly either to a failure to recognize true senses or to a tendency to ascribe false ones. The mean number of true senses was 5.6; of assigned senses, whether true or false, 5.5. Clarity did not show any significant correlation with ambiguity: words with a large number of true senses were, on the whole, neither more nor less clear than those with a small number. Neither was clarity correlated with familiarity, as measured by frequency in the Thorndike-Lorge count. In both cases the correlation was +.1 and not significant.

By the reduction of a context will be meant the ratio of the number of senses assigned to a word occurring in that context to the number assigned to it in the null context by the same translator. The lower this ratio, the more effective is the context in reducing ambiguity. The reduction of the contexts tested was found to be:

Context   Reduction (%)
P1             75
F1             57
B1             47
P2             50
F2             56
B2             44
S              47

The context consisting of one preceding word appears to be least effective in reducing ambiguity, being significantly worse than one word following. One word on each side of the word to be translated is more effective than two preceding or two following. It is noteworthy that two words on each side of the key word are comparable in effect to the entire sentence. The distribution of the various degrees of reduction for each of the contexts is given in the following table.

Reduction (%)   Percent in Context
                 P1    F1    B1    P2    F2    B2     S
0 - 29           37    41    41    38    36    51    60
30 - 59          19    25    28    28    27    27    24
60 - 89          18    14    17    18    22     6     4
99 - 100         11     9     9    10     4     6     4
over 100         15    11     5     6    11    10     8
Total           100   100   100   100   100   100   100

What is the effect of initial ambiguity on its reduction? Do more ambiguous words profit more from context than less ambiguous ones? To answer this question, words of from three to five true senses were separated from those of six to ten: there were 79 cases in the former group, 61 in the latter. The reduction effected by each context for these two groups of words was found to be:

Context   Reduction (%) for        Reduction (%) for
          less ambiguous words     more ambiguous words
P1                65                       88
F1                62                       51
B1                48                       45
P2                56                       43
F2                52                       61
B2                44                       44
S                 47                       47

As can be seen, there was no consistent direction of difference: the mean reduction was 53.4% for the less ambiguous words, 54.1% for the more ambiguous. It is to be noted that P1 again appears as the worst context; B1 as quite good, and B2 comparable in effect to that of the entire sentence.

The same procedure was used to appraise the effect of clarity on reduction of ambiguity. The sample was evenly divided into words of relatively high and low clarity, as defined above, and reduction separately computed:

Context   Reduction (%) for   Reduction (%) for
          clear words         unclear words
P1              88                  62
F1              53                  62
B1              47                  47
P2              49                  52
F2              5?                  59
B2              48                  41
S               58                  36
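The two measures defined in this section, accuracy (from which clarity is derived) and reduction, are simple counting ratios. The sketch below is one reading of those definitions, not the author's own procedure: the function names, the set-based representation of a translator's response, and the sample responses are all assumptions made for illustration. Only the starred senses of "appear" and the per-context figures are taken from the text.

```python
def accuracy(accepted, true_senses, n_listed=10):
    """Percent of listed senses characterized correctly:
    true senses ascribed plus false senses denied (null context only)."""
    listed = set(range(1, n_listed + 1))
    ascribed_true = accepted & true_senses
    denied_false = listed - accepted - true_senses
    return 100.0 * (len(ascribed_true) + len(denied_false)) / n_listed


def reduction(senses_in_context, senses_in_null_context):
    """Senses accepted in context as a percent of those accepted in
    isolation by the same translator; lower means a more effective context."""
    return 100.0 * senses_in_context / senses_in_null_context


# "appear" in Table II: senses 2, 3, 6, 7 and 9 are the starred (true) ones.
true_appear = {2, 3, 6, 7, 9}

# Hypothetical null-context response, invented for this example.
accepted_null = {2, 3, 4, 7, 9}
acc = accuracy(accepted_null, true_appear)  # 4 ascriptions + 4 denials of 10

# Hypothetical: the same translator accepts two senses in some context.
red = reduction(2, len(accepted_null))

# Per-context reductions for less and more ambiguous words, from the text.
less_ambiguous = [65, 62, 48, 56, 52, 44, 47]
more_ambiguous = [88, 51, 45, 43, 61, 44, 47]
mean_less = round(sum(less_ambiguous) / len(less_ambiguous), 1)
mean_more = round(sum(more_ambiguous) / len(more_ambiguous), 1)

print(acc, red, mean_less, mean_more)
```

An unweighted mean over the seven contexts reproduces the reported 53.4% and 54.1%. Because reduction is a ratio to the translator's own null-context count, it exceeds 100% whenever more senses are accepted in context than in isolation, which is what the "over 100" row of the distribution table records.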
  6. 44 abraham kaplan quently appearing words, their mean reduction The effect is again not a consistent one, though being 52.0% as compared with 55.4% for the i t suggests some slight advantage to the initially more frequent ones. It is quite in accord with u nclear words, as profiting more from context. expectation, of course, that the less clear, less T he mean reduction was 56.6% for the clear familiar words should profit more by being put words, and 51.3% for the unclear. in context than those that are clear and familiar to start with. But the results can only be said T he effect of familiarity was appraised in the to be compatible with this expectation, and same way. The seventy-four words which, scarcely to confirm it. a ccording to the Thorndike-Lorge count, are among the thousand most frequent in the English By contrast with these slight effects of doubtful l anguage were separated from the remaining significance are two other factors which appear s ixty-six words in the sample, and reduction to be quite important in reducing ambiguity. The a gain separately computed: first is the semantic content of the context. A context might consist entirely of articles, pre- Context Reduction (%) for Reduction (%) for positions, conjunctions, etc., and could be ex- frequent words infrequent words pected to contribute less to a translation P1 89 59 than one which also contained words not so poor F1 56 59 in semantic content. We may call the first par- B1 49 44 ticle contexts, the second substantive contexts. P2 40 62 A context was classified as "substantive" if at F2 59 52 least one word in it was not a "particle" word. B2 44 45 The full list of words in the sample regarded as S 51 43 "particles" (not grammatically, but from the viewpoint of semantic content) is given in Table Again there is no consistent effect, though again III, below. 
The results were the following: t here is some slight advantage for the less fre- Type of Context Particle Contexts Substantive Contexts No. Cases Reduction (%) No. Cases Reduction (%) P1 89 80 51 66 F1 107 66 33 28 B1 67 54 73 40 P2 56 61 84 43 F2 62 62 78 51 B2 25 45 115 44 S 0 ─ 140 47 The effect is consistent and unmistakable. The each interpreting in the context in question) mean reduction for the particle contexts was were grouped separately, there being sixty 61.3%, for the substantive contexts, 45.6%. How cases for each group. The results were: effective a context is in reducing ambiguity is a function, therefore, of whether it itself has a semantic content or is functioning primarily C ontext R eduction (%) for Reduction (%) for syntactically. It is noteworthy that for the B2 i naccurate a ccurate context there was no significant difference in t ranslators t ranslators reduction; but the small number of cases of B2 P1 109 59 particle contexts (25) makes this result suspect. F1 67 51 A second markedly significant factor in reduc- B1 58 46 tion of ambiguity by context is the accuracy of P2 57 48 the translators. The samples translated by the F2 63 52 three most accurate and those by the three B2 60 36 least accurate (for the words which they were S 76 26
TABLE III
List of "Particles"

a, above, against, all, an, and, are, as, at, be, behind, between, by, can, certain, does, done, during, for,
from, has, if, in, into, is, it, its, just, let, many, may, must, near, no, not, of, on, one,
only, or, other, our, out, over, quite, same, several, shall, since, so, some, than, that, the, their, there, these,
they, this, through, thus, to, under, until, us, very, we, when, which, whose, will, with, within, would

The effect is again unmistakable. The inaccurate translators showed a mean reduction, for the various contexts, of 70.0%, while the accurate translators attained a reduction of 45.5%. In the sentential context, the reduction of the accurate group was about three times as great as that of the inaccurate group.

In terms of these two important factors, an appraisal can be made of the optimal reduction of ambiguity by context, considering only the accurate translators, working with substantive contexts. The results are:

Context   No. Cases   Reduction (%)
P1            24           40
F1            13           35
B1            35           33
P2            38           39
F2            29           42
B2            53           36
S             60           26

Conclusions

1. Even for familiar words, no more than about 3/4 of the possible meanings presented are correctly translated as senses in which the words might sometimes be used.

2. The accuracy of such translation varies significantly from person to person, and shows some relation to educational level. Whether this is due to language ability, intelligence, or some other factor was not investigated.

3. There is no consistent direction of error in translation: false senses are as likely to be ascribed to words as are true senses to be unrecognized.

4. How accurately, on the whole, a word is translated bears no marked relation to the number of its actual senses nor to the frequency (within a fairly wide range) of its occurrence in actual discourse.

5. The verbal setting with least effect on reduction of ambiguity is the one word preceding the word to be translated. The greatest effect is that of the entire sentence in which the word occurs.

6. A context consisting of one or two words on each side of the key word has an effectiveness not markedly different from that of the whole sentence.

7. The most important factors affecting contextual reduction of ambiguity are the accuracy of the translators and whether the verbal setting includes words other than particles. The most practical context is therefore one word on each side, increased to two if one of the context words is a particle.

8. Under optimal conditions (most accurate translators, non-particle contexts, at least one word on each side of the key word) ambiguity is reduced to from 1/4 to 1/3 of the number of senses assigned to the word in isolation. A short verbal setting therefore reduces average ambiguity from about 5 1/2 senses to about 1 1/2 or 2.
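The particle/substantive classification and the selection rule in conclusion 7 lend themselves to a short sketch. What follows is one possible reading, not a procedure stated in the paper: `PARTICLES` is only an excerpt of the Table III list, the function names are hypothetical, the two sample sentences are invented, and "increased to two" is taken to mean two words on each side of the key word.

```python
# Excerpt of the "particle" list from Table III (the full list is longer).
PARTICLES = {"a", "an", "and", "as", "at", "by", "from", "in",
             "is", "of", "on", "the", "to", "with"}


def classify_context(context_words):
    """'substantive' if at least one context word is not a particle,
    otherwise 'particle' (the paper's classification rule)."""
    if any(w.lower() not in PARTICLES for w in context_words):
        return "substantive"
    return "particle"


def practical_context(words, key_index):
    """Return (preceding, following) words around words[key_index]:
    one word on each side, widened to two on each side if either
    of the one-word neighbours is a particle."""
    def window(n):
        before = words[max(0, key_index - n):key_index]
        after = words[key_index + 1:key_index + 1 + n]
        return before, after

    before, after = window(1)
    if any(w.lower() in PARTICLES for w in before + after):
        before, after = window(2)
    return before, after


# Invented sentences; the key words "induced" and "flow" are from Table I.
s1 = "the current induced in the coil is small".split()
s2 = "steady heat flow reached equilibrium quickly".split()

print(practical_context(s1, 2))  # neighbour "in" is a particle: widen to two
print(practical_context(s2, 2))  # no particle neighbours: keep one word each side
```

The sketch widens the window only once, in keeping with the paper's finding that two words on each side are already comparable in effect to the whole sentence.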