Scientific report: "Some Psychological Methods for Evaluating the Quality of Translations"


[Mechanical Translation, vol.3, no.3, December 1956; pp. 73-80]

Some Psychological Methods for Evaluating the Quality of Translations †

George A. Miller and J. G. Beebe-Center, Harvard University, Cambridge, Massachusetts

The excellence of a translation should be measured by the extent to which it preserves the exact meaning of the original. But so long as we have no accepted definition of meaning, much less of exact meaning, it is difficult to use such a measure. As a practical alternative, therefore, we must search for more modest, yet better defined, procedures. The present article attempts to survey some of the possible methods: One can ask the opinion of several competent judges. Or, given a translation of granted excellence, one can compare test translations with this criterion by a variety of statistical indices. Or a person who has read only the translation may be required to answer questions based on the original. The characteristic advantages and disadvantages of each method are illustrated by examples.

† Preparation of this paper was supported under Contract AF 33(038)-14343 between the U.S. Air Force and Harvard University and appears as Report Number AFCRC-TN-56-61, ASTIA Document Number AD 98823. Reproduction for any purpose of the U.S. Government is permitted. We would like to acknowledge the assistance of Peter Aldin, Martha Taylor, Soon Duk Koh and Elizabeth Friedman.

ONE HEARS it said that MT is currently rather crude, but that workers in the field are striving to improve and refine their translations. A brief encounter with the unedited output of an automatic dictionary is sufficient evidence of the tremendous range of quality between the simplest mechanical 'translation' and the product of a skilled, human translator. The question is whether this intuitive judgment of the quality of a translation can be made more precise by any psychological techniques of scale construction.

A scale of the quality of translations should be reliable, valid, objective and easy to use. In addition to these general desiderata for all scaling procedures, there are certain special features that this particular scale should have. For example, it should be applicable to any translation, whether produced by a machine or by a human translator. This feature would enable us to compare the output of a particular machine to the output of a human who had had a known number of years of study in the foreign language. Furthermore, the scale should be applicable to translations from or into any language whatsoever, and so should not take advantage of any characteristics peculiar to a given language, say English. (Whether or not a single scale can apply to all languages and still make linguistic sense is a debatable question.) And, preferably, the scale should be unidimensional, so that different translations could be compared with respect to a single 'figure of merit'. Finally, we would like to have one or more cutoff points indicated along the scale; "completely unusable", "useful for scanning as to subject matter", "useful after post-editing", "immediately readable", and "suitable for publication" are some criteria that we might hope to locate along the scale.

All these features would be desirable, but it is not obvious at present that they can be achieved.

Subjective Scaling

Perhaps the most direct approach is to give both the original passage and the translation to be tested to a person who understands both languages and to ask him to assign a number between 0 and 100 to the translation, where 0 means that it is equivalent to no translation at all and 100 means the best imaginable translation. This method fails the criterion of objectivity, of course, and cannot be applied when a polyglot is not available to judge, but we expected to be able to map out the general territory in this way and to use subjective ratings
as a criterion against which to test various other scaling techniques.

In a short exploratory study, however, we obtained somewhat confusing results. We found much disagreement among different raters. Perhaps we should have used foreign language teachers as our judges, for they probably have skill in grading that ordinary, bilingual persons do not seem to have, but we did not anticipate that the ratings would be so difficult.

For the purposes of this study, we selected four summaries of articles from the journal Acustica, two in German and two in French. The journal also gave an English translation, so we had the work of a theoretically competent translator to use for comparison. (The published translations were not the best possible, but they represent the sort of thing that is available in the current scientific literature.) Then we prepared mechanical translations, simulating by hand the possible operation of an automatic dictionary. Each word of the original text was written on a card. These cards were then alphabetized, and on the reverse side we listed the possible English equivalents in approximately the order of their frequency of occurrence, as well as we could judge it on intuitive grounds. From this pack we then constructed six different translations: (1) the first English alternative was chosen from each card; (2) an editor selected the best of the first two alternatives from each card, making his selection in complete ignorance of the other alternatives or the original passage; (3) an editor selected the best one from all the alternatives on each card, still in complete ignorance of the original passage; (4) an editor rewrote the English passage from a knowledge of only the first alternative on each card; (5) an editor rewrote the English passage from a knowledge of only the first two alternatives on each card; and (6) an editor rewrote the English passage from a knowledge of all the alternatives on each card, but without seeing the original passage. In all cases, these editors were monolingual Americans with no linguistic training. The first three procedures did not lead to grammatical English, of course, so we obtained a fairly wide range of quality by these procedures. These six translations, together with the translation taken from the journal and the original passage, were presented to judges who rated them on a scale from 0 to 100.

As a sample of the sort of materials produced, consider a single sentence taken from a French passage:

Original. Il résulte de ceci qu'une atmosphère stratifiée doit toujours réfléchir et donc produire des échos.
(1) He result of this which a atmosphere stratified must always to think and therefore to produce of the echoes.
(2) It results from this which a atmosphere stratified must always to reflect and therefore to produce of the echoes.
(3) It results from this that a atmosphere stratified must always reflect and therefore produce echoes.
(4) The result of this is that in a stratified atmosphere, one must always think of the echoes that are produced.
(5) It results from this that a stratified atmosphere must always reflect and therefore produce echoes.
(6) It results from this that a stratified atmosphere always reflects and therefore always produces echoes.
Published translation. It follows from this that a stratified atmosphere should reflect sound and produce echoes under all circumstances.

A similar sample taken from one of the German passages is the following:

Original. Bei beliebiger Impulsform ergibt sich das Faltungsprodukt aus Membran- und Impulsform.
(1) By any form of the impulse yields -self the products of the folding out membrane- and form of the impulse.
(2) By any form of the impulse yields the products of the folding out membrane- and form of an impulse.
(3) By any form of the impulse yields the products of the folding out membrane- and form of an impulse.
(4) Any form of the impulse is yielded by the interaction of the bending out of the membrane and the form of the impulse.
(5) The impulse in any form yields the products of the folding-out membrane and the form of an impulse.
(6) Any form of the impulse yields the products of the membrane-folding.
Published translation. With a given impulse form one obtains a resultant effect of the shapes of the impulse and of the disk.
Table I
Mean Ratings of Quality of Seven Translations

Method of                French   French   French   German   German   German
Translation              I        II       Mean     I        II       Mean
(1)                      21.9     28.2     25.1     27.1     22.2     24.7
(2)                      35.5     30.1     32.8     21.6     37.0     29.3
(3)                      47.3     27.7     37.5     13.3     29.0     21.2
(4)                      38.2     70.1     54.2     45.6     31.8     38.7
(5)                      90.5     80.4     85.5     24.0     34.0     29.0
(6)                      75.9     54.3     65.1     45.5     77.5     61.5
Published translation    89.5     80.1     84.8     77.0     75.5     76.3

When the seven translations were given to subjects to judge, of course, no information was supplied as to the method of translation. It is interesting to note that supplying several alternative English equivalents seems to be more useful in translating from French than from German, but this judgment is based upon only these four samples of about 75 words each.

Eleven judges were used for the French passages and ten for the German. The judges were able to speak the language from which the translations came, but had no linguistic training; they were instructed to compare each translation with the original and to take time enough to be sure of their judgments. The means of their ratings are summarized in Table I.

There was so much disagreement among the judges (which was reflected in their bitter comments about the difficulty of their task) that even the means reveal only very general trends. These trends are clearer if we pool the data further, as in Table II.

From Table II we see that far more success is possible with French than with German, and that selective editing helps a little but not so much as complete rewriting. These conclusions are intuitively correct, and it would be disappointing indeed if they failed to appear. The error variance is so large, however, that these conclusions are barely significant.

We were slightly surprised that rewriting made as much difference as it did, since the people who rewrote had essentially the same information about the original passage as was contained in the selectively edited translations. The superiority of the rewritten translations indicated that the judges relied rather heavily upon the grammaticalness of the translation in reaching their decisions. In order to check this notion, we asked another group of subjects to act as judges, giving them the same instructions as before except that they were not shown the original French or German passages. Their ratings correlated closely with the original ratings, especially for the translations from German. It seems, therefore, that people will not regard favorably an ungrammatical translation even though they are able to understand it correctly.

Table II
Mean Ratings for Three MT Procedures for French and German

Method                     French   German
No editing (1)             25.1     24.7
Selective editing (2-3)    35.2     25.3
Rewriting (4-6)            68.3     43.1
Means                      53.4     38.6
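The pooling that leads from Table I to the first three rows of Table II can be checked with a few lines of Python. This is only a sketch: it assumes that each Table II entry is the unweighted average of the per-method "Mean" column of Table I, and the names `TABLE_I_MEANS`, `GROUPS` and `pooled_means` are invented here for illustration.

```python
# Per-method mean ratings copied from the two "Mean" columns of Table I.
TABLE_I_MEANS = {
    "French": {1: 25.1, 2: 32.8, 3: 37.5, 4: 54.2, 5: 85.5, 6: 65.1},
    "German": {1: 24.7, 2: 29.3, 3: 21.2, 4: 38.7, 5: 29.0, 6: 61.5},
}

# Grouping of the six procedures as labelled in Table II.
GROUPS = {
    "No editing": (1,),
    "Selective editing": (2, 3),
    "Rewriting": (4, 5, 6),
}


def pooled_means(language):
    """Average the Table I means within each Table II group (assumed unweighted)."""
    means = TABLE_I_MEANS[language]
    return {
        name: sum(means[m] for m in methods) / len(methods)
        for name, methods in GROUPS.items()
    }


for language in TABLE_I_MEANS:
    for name, value in pooled_means(language).items():
        print(f"{language:7s} {name:18s} {value:6.2f}")
```

Under that assumption the recomputed values agree with the three method rows of Table II to within rounding. The "Means" row is not recomputed here, since the text does not say which ratings it averages.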
We can conclude that a simple word-for-word substitution, method (1), is not satisfactory, but that an automatic dictionary combined with rewriting is a fairly satisfactory solution for translating from French into English. The problems with German are more difficult and seem to require that the machine recognize syntactic features. These conclusions, however, are of less immediate importance to us than the conclusions we can draw about this method of estimating the quality of translations: (a) The method is subjective; (b) Raters dislike the task; (c) There is considerable error variance, so that many judges are needed in order to obtain reliable means; (d) The literary skill of the rewriter is an important factor in the ratings; (e) An attempt should be made to obtain more experienced judges, either language teachers or professional translators.

Word Scores

Another way to approach the problem is to consider what a grader does when he evaluates a pupil's translation. Introspective reports indicate that he looks for two kinds of errors: (1) errors in vocabulary and (2) errors in construction. It is difficult to make these introspections more precise, for vocabulary and syntax are complexly intertwined. Nevertheless, it seems worthwhile to try.

The fact that a grader can recognize errors at all implies that he must have some personal standard against which he compares the student's work. In its most rigid form, this might consist of his own written translation; more often it is probably a rather vague set of translations that would be about equally acceptable. In order to imitate his procedures, therefore, we should have one or more explicit translations, written out in advance, that we will use as criteria. The task is then to obtain some objective measure of the relation between the test translation and the criteria.

Given a test and a criterion translation, the simplest thing to try first is to ask if they use the same words. That is to say, a score can be given by taking the number of words in the test translation which are duplicates of words in the criterion translation and then expressing this number as a fraction of the total number of words in the criterion translation. This method ignores the order in which the words are written. As an illustration:

Original: La maison se trouve à droite.
Criterion: The house is on the right.
Test: The house leans to the right.

From the criterion translation an alphabetical check list of words is prepared and the words in the test translation are checked against it:

house   1   √
is      1
on      1
right   1   √
the     2   √√

Score = 4/6 = 0.67

A number of exploratory experiments have been conducted with this method, using translations produced by students attempting to pass their language examinations in French or German and by competent translators. These studies have explored various possibilities, but none of them has been followed up with large amounts of data. Disregarding levels of significance, the studies can be summarized as follows:

(1) Five subjects with a good knowledge of both languages translated a sentence from German into English. These translations, all assumed subjectively to be 'good', were evaluated against a criterion translation. The scores ranged from 0.73 to 0.86. With students whose knowledge of German ranged from low to high, scores ranged from 0.19 to 0.70. For three persons with little knowledge of German, the mean score was 0.31. Four persons with a relatively good knowledge of German had a mean score of 0.65.

(2) One passage was translated from French into English by a simple word-for-word substitution, taking the first English equivalent that occurred in a French-English dictionary. The score for this translation was 0.40.

(3) One person who knew no Turkish but was familiar with the general subject matter translated a short, technical passage from Turkish into English. No dictionary was used. The score for a language as little related to English as this was 0.20. The fact that the score was not zero is due to the occurrence of common words in the two languages.

(4) In order to study the variability of the score, eleven French sentences were translated with a mean score of 0.65. The standard deviation was found to be 0.12.
(5) Seven translations of two German sentences were made by students. These were scored and the scores were compared with scores given by a grader on a longer passage containing these same sentences and also with scores on an 'objective test' of German language ability and achievement. The three measures of the students' ability were in close agreement.

(6) Since the use of a particular criterion translation may seem rather arbitrary, the check lists from six different criterion translations were combined and used to score the students' translations. With one criterion translation, there was a ceiling of about 0.86 and a mean of 0.50. When six criterion translations were combined, the ceiling rose to about 0.95 and the mean increased to 0.58. No significant changes in the rank order of the test translations resulted from this broader definition of the scoring criterion.

(7) When successive pairs of words, instead of individual words, were used to construct the check list, the scores were lower but were linearly related to the scores for individual words. With sequences of three successive words used to construct the check list, scores were very low and discrimination appeared to be lost.

(8) A word-for-word substitution of Korean equivalents for English words was made with ten sentences totalling 171 words in length. The Korean words, in the English order, were given to three Korean students at Harvard. They were asked to rewrite the sentences in Korean, ignoring as best they could their knowledge of English. Their rewritten sentences were then scored against a criterion prepared by an experienced translator. The three scores averaged 0.49. However, if differences in inflection are ignored and the word is considered correct if the root is identical, the average was 0.75. It is very likely, however, that the subjects' familiarity with English was a considerable aid to them.

(9) These same sentences were then translated again, this time using some simple rules for pre-editing the English. (a) Articles were omitted; (b) Idioms were underlined; (c) When 'of' occurred in a possessive phrase, the order of the words was inverted; and (d) When 'to' occurred in an infinitive construction, it was indicated. With this pre-editing, the word-for-word translation was repeated. The two sets of sentences, translated with and without pre-editing, were given to two groups of 31 students each in the Kyung-Bock High School, Seoul, Korea, and they were asked to rewrite them into intelligible Korean sentences. Their sentences were then scored against the criterion translation. The average score without pre-editing was 0.125; with pre-editing, 0.218. These scores are probably too low; the students were being given instruction during the summer vacation because of their poor school records.

These studies support some general comments. For human translators, a simple measure of correspondence of vocabulary correlates rather well with a subjective evaluation of the quality of the translation; a student who has achieved a given level of competence in vocabulary has probably achieved a corresponding level of competence in grammar, so the vocabulary measure will be correlated with any other measure of quality. For MT, however, the correspondence is not so close. It is possible to imagine a mechanical translation that is completely unintelligible yet contains most of the correct words. That is to say, the vocabulary measure is necessary but not sufficient. Nevertheless, we have been pleasantly surprised that so mechanical and simple a procedure gives us any discrimination at all.

Word-Order Scores

In order to supplement the simple vocabulary score, we would like to have some indicant of the syntactical adequacy of the translation. Before bringing to bear the more sophisticated concepts of modern linguistics, we decided to try the simplest possible comparison with a criterion translation. The simplest method we could think of was to compare the order of the words which were common to the test and the criterion translations. For example:

Criterion: The young boy walked fast.
Test: The fast boy had walked.

From the criterion translation a check list is again prepared, but this time the ordinal position of each word is indicated:

          Position in   Position in
          Criterion     Test
boy       3             3 √
fast      5             2
the       1             1 √
walked    4             5 √
young     2
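The two check-list scores can be sketched in Python as follows. The paper gives only worked examples, so several details here are assumptions rather than the authors' procedure: the tokenisation (lower-cased alphabetic runs), the handling of repeated words (each criterion occurrence can be matched once), and the reading of "correspond as to order" as the longest run of shared words that keeps the criterion order. The function names are hypothetical.

```python
import re
from bisect import bisect_left
from collections import Counter


def tokens(sentence):
    """Lower-case the sentence and keep alphabetic runs (assumed normalisation)."""
    return re.findall(r"[^\W\d_]+", sentence.lower())


def word_score(criterion, test):
    """Criterion words also found in the test, as a fraction of criterion length."""
    crit, tst = Counter(tokens(criterion)), Counter(tokens(test))
    matched = sum(min(n, tst[w]) for w, n in crit.items())
    return matched / sum(crit.values())


def word_order_score(criterion, test):
    """Fraction of shared words whose relative order agrees with the criterion.

    Assumption: agreement is measured by the longest increasing subsequence
    of criterion positions, read off in test order.
    """
    slots = {}
    for position, word in enumerate(tokens(criterion)):
        slots.setdefault(word, []).append(position)
    shared = []
    for word in tokens(test):
        if slots.get(word):
            shared.append(slots[word].pop(0))  # use each criterion slot once
    if not shared:
        return 0.0
    tails = []  # tails[k] = smallest last position of an increasing run of length k + 1
    for position in shared:
        k = bisect_left(tails, position)
        if k == len(tails):
            tails.append(position)
        else:
            tails[k] = position
    return len(tails) / len(shared)


print(word_score("The house is on the right.", "The house leans to the right."))
print(word_score("The young boy walked fast.", "The fast boy had walked."))
print(word_order_score("The young boy walked fast.", "The fast boy had walked."))
```

On the house example this gives the 4/6 shown in the first check list, and on the boy example it gives a word score of 4/5 and a word-order score of 3/4. Other readings of "order" (for instance, exact positional agreement) would score the boy example differently.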
The word score is 4/5 = 0.80, when scored as before. If we consider the four shared words, we find that the three checked words correspond as to order. Thus the word-order score can be stated as 3/4 = 0.75.

Thirteen people, whose knowledge of French varied from low to high, were given four 300-word French passages to translate. These translations were scored by the word-order method and also by a more subjective technique, with a grader scoring errors in words and in phrases. Furthermore, each person took two forms of an objective examination in French language achievement.

The word-order scores ranged from 0.20 to 0.72. The error scores given by the grader ranged from 1.6 to 24.4. The objective examination scores ranged from 252 to 750 (where 250 is chance performance). Thus all three measures discriminated among the translators. The average correlation between word-order scores and error scores was about 0.70, and between the word-order scores and the objective examination scores was about 0.60.

The reliability of the word-order score is reasonably good and could probably be improved by lengthening the passages. The correlation with error scores and objective examinations provides evidence for some degree of validity, at least for human translators. This technique is useful to discriminate against very poor translations, but the present evidence indicates that it may not discriminate accurately in the range that might be labelled 'good' to 'excellent'.

A slightly more sophisticated and less mechanical way to get at the syntactic aspects has been used by Koh in the Korean studies. A scoring key is constructed in advance by noting which words modify other words in the original English passage. If the rewritten Korean translation contains this same relation, one point is given. When the rewritten translations produced by the Korean high school students were scored by such a key, they obtained an average score of 8.5% on the passages without pre-editing and 23.3% with pre-editing. The method is rather arbitrary, inasmuch as the experimenter must select in advance those syntactic relations for which credit will be given, and it is less mechanical than the word-order score, since it requires some intelligent judgment both in constructing the key and in doing the scoring. Nevertheless, it is a technique that deserves further exploration.

These methods involving a statistical comparison of the test translation with a criterion translation are certainly effective at the lower end of the scale. Whether the statistical net can be woven fine enough to catch the subtle shades of meaning that differentiate between 'acceptable' and 'good' or 'excellent', however, is still an open question.

Measures of Transmitted Information

One goal, although an unrealistic one, that we might hope to attain in translation is reversibility. That is to say, we could recover the original passage exactly by translating back again. We do not usually aspire to this goal, because it is not necessary to recover exactly the original passage. Various alternative wordings may be adequate for purposes of communication; so we hope merely to land somewhere inside this set of acceptable alternatives. When we translate we hope that something will remain invariant under translation. This something might be called the meaning or it might be called the information. Since techniques for estimating amounts of information have been developed, this line of thought leads to the suggestion that we should attempt to compare different translations to see how much information they have in common.

The method we have explored is one developed by Claude Shannon [1] for estimating the redundancy of printed texts. Subjects guess repeatedly at successive letters, advancing to letter n + 1 after they have correctly guessed letter n. Shannon has shown how to estimate the amount of information, in bits per letter, from the frequency distribution of correct responses on the first, second, third, etc., guess. In fact, Miller and Friedman [2] have found that it is not necessary to obtain repeated guesses, since the amount of information per letter can be estimated rather closely from the percentage of times the first guess is correct. The relation is H = 5Q, where H is the number of bits per letter, and Q is the probability of being wrong on the first guess.

1. Shannon, C.E., "Prediction and Entropy of Printed English", Bell Syst. Tech. J. 1951, 30, 50-64.
2. Miller, G.A., and Friedman, E.A., "The Reconstruction of Mutilated English Texts", Information and Control, 1957 (in press).
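The first-guess relation H = 5Q quoted from Miller and Friedman is simple enough to state as a small Python function. This is a sketch only: the function name `bits_per_letter` is hypothetical, and the record of guesses in the example is invented for illustration, not taken from the paper.

```python
def bits_per_letter(first_guess_correct):
    """Estimate H in bits per letter from first-guess outcomes, using H = 5Q.

    first_guess_correct: one boolean per letter, True when the subject's
    first guess at that letter was right. Q is the observed proportion of
    letters on which the first guess was wrong.
    """
    outcomes = list(first_guess_correct)
    if not outcomes:
        raise ValueError("at least one guessed letter is required")
    wrong = sum(1 for correct in outcomes if not correct)
    q = wrong / len(outcomes)
    return 5.0 * q


# Invented record: 10 letters, 7 of them guessed correctly on the first try.
record = [True] * 7 + [False] * 3
print(bits_per_letter(record))
```

With 3 wrong first guesses out of 10, Q = 0.3 and the estimate is about 1.5 bits per letter. A subject who is never wrong on the first guess gives 0 bits per letter, and one who is always wrong gives the ceiling of 5.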
The strategy we have used involves an approximation to the information formula,

T = H(x) - H_y(x),

where T is the amount of information common to x and y; H(x) is the amount of information in x; and H_y(x) is the amount of information in x when y is known. Now suppose that x and y are two alternative translations of the same passage. We can estimate H(x) by asking a subject to guess successive letters according to Shannon's technique. Then we can take another subject and show him translation y; with y available to him, he now proceeds to guess successive letters in x, and so gives us an estimate of H_y(x). Assuming the two subjects to have identical guessing habits, the difference between these two measures should give us an estimate of the amount of information common to the two translations. If one translation is a criterion translation, the value of T should be high when the test translation contains essentially the same information, and low when it contains relatively little of the same information as the criterion.

In a preliminary study we found that T averaged 0.8 bits per letter for two 'good' translations of a given sentence and 0.05 bits per letter for one 'good' and one 'poor' translation. Although these results indicate that the method may be feasible, it is laborious and time-consuming; we have not explored a wide variety of conditions in this way and will probably not do so unless it becomes of some further theoretical interest. It does have the slight advantage that the measure is given in bits per letter, which may be more meaningful to computer designers than some more arbitrary scale.

Reading Comprehension Tests

A possible criticism of the methods discussed so far is that they are too much concerned with the small details of a translation and too little concerned with the general purpose of making translations in the first place. The purpose, of course, is communication. The translation should be judged successful if this purpose is achieved.

In ordinary situations outside the psychologist's laboratory, we have a simple check on whether we have communicated successfully. We ask questions. For example, after a series of communicative acts that he calls 'lectures', a teacher will evaluate his success by a procedure that he calls an 'examination'. If the recipients of a message can answer correctly questions which they could not answer before they received the message, we conclude that the communication was successful.

One way to apply this technique is in the form of commands that must be carried out by some gross, bodily behavior. A more convenient way is to ask questions that can be answered verbally. For example, in order to evaluate the readability of a particular passage, psychologists give the reader a few minutes to study it and then ask him a series of questions ranging from very simple to very difficult. Once a set of passages has been standardized for readability on a large sample of readers, it can be used to measure the reading skill of other individuals. Such a set of passages with related questions is called a 'reading comprehension test'. It should be relatively straightforward to apply this same technique to measure the comprehensibility of a translation.

The translation to be tested would be presented to a person along with a list of questions that he must answer about the meaning of the passage. These questions should be simple enough that an intelligent person equipped with a good translation could answer them all, yet difficult enough that a person with no translation could not answer any of them. We have hesitated to adopt this approach because the phrasing of the questions requires much skill and the test should be standardized on relatively large groups of subjects.

For example, the subject might be presented with the following word-for-word translation of a German passage:

The theory the passage of sound through plates is - for even waves and bounded bundle - in such form given that the relation with it the free waves of the plate in appearance steps. Cremer's conception the total number of passages as 'coincidences' the falling in wave with it free waves of the plate, certain exceptions hereof and the influence a final cross section of the wave are discusses. The conclusions are experimental with it ultra-sound on aluminum plate proven.

Then he would be confronted by questions like the following:

1. What does the form of the theory reveal?
2. What was done with the conclusions?
3. What kind of incident sound was studied analytically?
4. What kind of incident sound was studied experimentally?
5. Was Cremer's theory accepted without qualification?
6. What did Cremer think was coinciding?

Although these questions have not been tested in any way, it is hoped that they will be difficult to answer until you have read the following alternative translation:

The theory of transmission of sound - plane waves and laterally bounded beams - through plates is given in a form which reveals the connection with the free waves in plates. Cremer's interpretation of total transmission as 'coincidence' of the incident wave with a free wave in the plate, certain exceptions from that representation, and the influence of the finite cross section of the beam are discussed. The conclusions have been examined experimentally on aluminum plates by ultrasonic waves.

This example should make clear the difficulties involved in formulating good questions. On the one hand, they should not be so specific as to require a particular word in answer, for this reduces to a vocabulary test. On the other hand they should not be so general that it is difficult to decide whether the answer is right or wrong. No doubt special passages would have to be constructed for the purpose; we have not yet undertaken this formidable task.

Syntactic Analyses

All of the scaling procedures discussed above are linguistically naive. We have been much impressed by the elegance of certain theories of grammar. For example, Z. Harris' constituent analysis should certainly yield some kind of measure of agreement between the true analysis and the constituents of the translation to be tested. However, these ideas have been difficult to apply because the translations produced by some of the simpler mechanical procedures are so bad that it is impossible to say what the constituents are. Such analysis is easier if the translation is grammatical.

Ideas concerning the degree of grammaticalness of a passage are suggested in the work of A. N. Chomsky. For example, if words are classified into syntactic categories, we might ask how often ungrammatical sequences of categories occur. As a variable we could examine the degree of precision of the syntactic classification. A very grammatical translation would have only permissible sequences even with the most refined analysis of categories, whereas an ungrammatical translation might not have only permissible sequences until the categories were reduced to something as crude as Noun, Verb, Adjective, and X, where X represents everything else. This is a forbidding task to undertake, however, and does not get at the question of whether the translation, grammatical or not, carries the same meaning as the original. Indeed, much syntactic analysis carefully avoids any contamination with semantics.

We have assumed, therefore, that such analyses are much more important for workers trying to develop translating machines than for those who would like to evaluate the finished product.

Our studies have not explored the closely related problem of measuring the "translatability" of the original passages. We have observed, of course, that with respect to English, French is more translatable than German. But there are many other differences. The literature in any given language is not uniformly translatable, and some schemes for MT may succeed with one author and fail with another. For example, a passage which is well written in the original language will usually be more translatable than a poorly written passage. Or, again, a passage written by a person who knows no English will usually be harder to translate into English than something written in the same language by a person whose first language was English. Only a large sample of different materials in the source language can inform us on this question, and it is impractical to generate such a sample by manual simulation. Thus there are important aspects of the evaluation problem that cannot be studied satisfactorily until the machines are running.