intTypePromotion=1
zunia.vn Tuyển sinh 2024 dành cho Gen-Z zunia.vn zunia.vn
ADSENSE

Báo cáo khoa học: "A Program for the Machine Translation of Natural Languages"

Chia sẻ: Nghetay_1 Nghetay_1 | Ngày: | Loại File: PDF | Số trang:9

51
lượt xem
4
download
 
  Download Vui lòng tải xuống để xem tài liệu đầy đủ

In the following we give an account of a computer program for the translation of natural languages. The program has the following features: (1) it is adaptable to the translation of any two natural languages, not just to some particular pair; (2) it is a self-modifying program—that is, given the information that it has produced an incorrect translation, together with the translation which it should have produced according to the linguistic judgment of an operator, it will modify itself so as to eliminate the cause of the incorrect translation....

Chủ đề:
Lưu

Nội dung Text: Báo cáo khoa học: "A Program for the Machine Translation of Natural Languages"

  1. [Mechanical Translation, Vol.6, November 1961] A Program for the Machine Translation of Natural Languages by W. Smoke and E. Dubinsky*, University of Michigan, Ann Arbor, Michigan In the following we give an account of a computer pro- gram for the translation of natural languages. The program has the following features: (1) it is adaptable to the translation of any two natural languages, not just to some particular pair; (2) it is a self-modifying program—that is, given the information that it has produced an incorrect translation, together with the translation which it should have produced according to the linguistic judgment of an operator, it will modify itself so as to eliminate the cause of the incorrect translation. Before the account of the program itself we give a short sketch of the considerations which led to the program, to- gether with a statement of the reasons why we feel a program of the type presented will be adequate for machine translation. T he naive way to do research in machine transla- w hich is automatically modified as it translates more tion would be to pick a pair of languages, say Russian and more. Now how would one program a machine and English, and to try to discover some sort of trans- so that it would translate and in addition be able to formational rules connecting them, in terms of which a modify its process of translating? computer program might be written. The transforma- Let us try to reach a more precise idea of what a tion rules might be derived from a comparison of the self-modifying translation program would look like. two languages on the basis of old-fashioned grammar, The complete program P w ould consist of two parts, or from the more recent theories developed by struc- a translation program T a nd a master program M . T he tural linguists, or by other means. Most of the effort program T w ould be responsible for the actual trans- in machine translation research so far has gone into lation from one language to another, while M w ould deriving such transformation rules by one method or take care of making the changes in T . T hus suppose another, and making them more explicit; that is to that P , or the part T o f P , is capable of translating the say, putting them into a form in which they can be pro- Russian sentences S 1; . . . , S n c orrectly into English, grammed, and patching up the holes which are apt to but that it translates the sentence S n +1 i ncorrectly. Then appear in such rules when they are applied to an the modification in P w ould take place as follows. actual text. Assuming that this kind of effort were suc- Given S n +1 a nd a correct English translation of S n +1 a s cessful, its result would be a computer program, prob- input, the master program M w ould modify T t o ob- ably haywired together, which would—given a certain tain a translation program T '. T he new complete pro- restricted kind of input material—produce a more-or- gram P ' w ould consist of M a nd T ', a nd would trans- less accurate, more-or-less readable translation. One late S n +1 c orrectly. Furthermore, while we need not would never know exactly when the machine was go- require that P ' b e capable of translating all of S 1 , . . ., S n +1 ing to bog down on some particularly difficult Russian c orrectly, it is necessary that after some limited series P , P', P" . . . P (m) o f modifications to P , a program P (m) passage, and when the program did bog down, no one would know exactly where to put the next piece of b e obtained which is capable of translating all of haywire to make it run again. S 1; . . . , S n +1 c orrectly. That is, while the modifications Sapir said, “All grammars leak.” The same is going can introduce errors, we cannot have a strictly recur- to be true of any computer program for the translation ring series of errors introduced. Finally, the programs P (m) w hich are obtained as of languages: the time will come when it is inadequate —there will always be exceptions. If for no other modifications of P s hould be subject to some kind of reason, this will be true because languages are always regularity. We do not want a program which becomes changing. For this reason, we feel that any computer complicated and uneconomical too fast; that is, the program which deserves the name of a language trans- series of modified programs should converge in some lation program has to be a program which is capable reasonable sense, not diverge. of expansion, in a regular manner, to keep up with the This process suggests to us the familiar kind of be- demands that are made on it. Essentially, what one havior which we call learning behavior. We like to must have is a machine which l earns t o translate, think of a machine which is programmed in the man- ner outlined as a machine which learns to translate. * The authors would like to thank A. Koutsoudas, without whose How does one go about constructing a translation stimulus and support this paper would not have been written. 2
  2. p rogram of the type we have described? It should be language), or as a part of the output states (in the fairly clear by now that this problem is more a com- case of the target language). That is, however we puter problem than a linguistic problem. But it is not represent a text in a language, this representation must a problem in programming techniques. be essentially equivalent to representation by a state, or When we set out to attack the problem, we felt a partial state, of an automaton. If we restrict our that what we needed was a way of discussing lan- thinking to reasonably realistic automata, we may sup- guages, translations, computers, etc., from an abstract pose that an automaton has only a countable number of point of view. That is, the problem in its main fea- cells, each cell having only finitely many states. If we tures is clearly independent of whether we are trans- represent the cell states by a countable alphabet—in lating from Russian into English, or Chinese into fact we will consider only finite alphabets—then a Sanskrit. Furthermore, it will be unimportant whether state of an automaton, and hence a text in a language, we think of using a Univac or an IBM 709 as a vehicle can and must be represented by a sequence from this for the translation program. alphabet. We can observe at this point that a solution to the Thus we are led to the following provisional defini- problem as stated would of necessity have certain tion of a language: a language is, for our purposes, bonus features: it would not just be a solution to the nothing more than a collection of sequences of symbols problem of translating, by machine, Russian into from some finite alphabet. It has turned out to be con- English, but would, in all likelihood, be a solution to venient to study systems with a bit more structure the problem of machine translation for any given pair than this definition would imply. In fact, we have been of languages. primarily interested in studying systems of finite se- But if we do not restrict our use of the term quences with some kind of binary composition. In the ‘language’ to Russian or to English, or to any other case of an associative binary composition, the systems particular, concrete language, then what do we have are equivalent to a special kind of semigroup.* Lately, in mind? And what do we have in mind when we we have become interested in systems with non- discuss a translation, a translation program, or a trans- associative binary composition. The reason for this shift lation program embodied in a machine? of interest will become clear as we go on. Perhaps we should first examine the question of But before we go on to describe our latest efforts, what we mean by a translation program. The idea of let us spend a few moments reviewing the earlier work. a computer program abstracted from any particular First, what is the problem? We can formulate it as computer is not new; it is usually depicted by a flow- follows. We are given two collections of corresponding diagram. When the same thing is studied by those texts, that is, two collections of finite sequences of with a more abstract turn of mind, it is sometimes symbols from two alphabets. The symbols may be called an abstract automaton. Abstract automata, at thought of as letters, words, or any other convenient least the kind we are interested in, can be thought of linguistic unit (which particular unit we use is of little as a collection or matrix of information-retaining cells. importance at this stage). The correspondence is, more The information retained by any particular group of exactly, a function, the translation function, from the cells at any one time may be called the state of this one collection (source language) to the other (target part of the automaton. The state of the entire automa- language). But what kind of function? We must re- ton changes discretely through time, its state at one quire that the function be such as is realizable by an instant completely determining its state at the follow- automaton. But this requirement by itself is not suf- ing instant. In an input state the cells of the automaton ficiently restrictive. In fact, as long as we are dealing are readied with information from the “outside”—the with only a finite number of pairs of corresponding input information. Corresponding to each input state texts, it would always be possible, given sufficiently will be an output state, signaled by a “stop” or some large storage capacity, simply to program a computer such indicator. When the information from the cells to translate each of the source language texts by look- is read off to the “outside”, it becomes the output in- ing it up in a text “dictionary”, where the complete formation. The output state is a function of the input text together with its translation is stored, and feeding state, and correspondingly, the output information is out the translation. a function of the input information. This means that a translation function, defined only on a finite domain, is always realizable in a trivial An automaton, in its capacity as a means for pass- fashion. Therefore, it is reasonable to consider func- ing from input to output, is simply a certain kind of tions defined on infinite domains. In fact, since it realization of a function. In our case, the function seems to be impossible to give any explicit method for which is to be realized is what we have been calling a singling out sequences of symbols which we want to translation. The domain of this translation function is translate from those that we will not be called upon a certain class of texts in some language, and its range to translate (i.e., for separating “meaningful” from “non- is a class of texts in another language. A text might meaningful” sequences of symbols) it is reasonable to be anything from a sentence to a paragraph or an consider functions which are defined on all sequences article. Whatever it is, however, it is clear that it must of symbols from a given alphabet. But now, we clearly be something which can be represented as a part of one of the input states (in the case of the source * See appendix. 3
  3. c an have functions which are not realizable by auto- tion. A sequential function is a function defined on the mata. free, finitely generated semigroup of all sequences of symbols of some finite alphabet. It is a kind of semi- What sorts of functions are realizable by automata? homomorphism. The defining property of a sequential A very simple example of such a function is provided function f i s that if a a nd b a re two elements of the by a homomorphism defined on a free and finitely domain semigroup, then f (ab) = f(a)b', w here b ' is generated semigroup. In fact, a homomorphism is de- s ome element of the semigroup which contains the fined by exploiting the sequential character of the ob- range of f . A homomorphism h i s a special case of a jects in its domain. Each element in its domain is a sequential function, since h (ab) = h(a)h(b), t hat unique sequence of a finite number of symbols, and is, b ' = h(b) i n this case. In general, b ' w ill depend the definition of the homomorphism on the sequence is on a . T hat is, because of the fact that the range semi- accomplished by letting the sequence translate as the group as well as the domain semigroup is free on its sequence (in the same order) of the translations of the generators, the correspondence which assigns to the symbols. The fact that there are only finitely many elements b , c, d, e tc., of the domain, the elements symbols, together with the uniqueness of the repre- b ', c', d', e tc., which occur as well-defined parts of the sentation by sequences of these symbols, guarantees sequences f (ab) = f (a)b', f(ac) = f(a)c', f(ad) = the realization of the homomorphism by an automaton. f(a)d', e tc., is a function which has the same domain An example of a homomorphism is given by a simple and range semigroups as f . We can denote this func- substitution cipher, e.g. tion by f a , so that we have, for any element b o f the THE BOY WENT HOME domain, f (ab) = f (a)f a (b). T hen in order that the sequential function f n ot be a homomorphism, it is translates as sufficient that there be two elements a a nd b, such UIF CPZ XFOU IPNF that for some element c w e have f a(c) ≠ f b (c). T hat is, the translation f a (c) o f c i n the sequence a c i s dif- using the device of translating each letter of the alpha- ferent from the translation f b (c) o f c i n the sequence bet by the following letter, translating space as space, b c. F urthermore, it turns out that this new function and extending the function thus defined to a homo- f a i s again a sequential function. For we can calculate morphism. f a (bc) a s follows. By definition f (abc) = f(a)f a (bc). What is wrong with using this kind of translation B ut also f (abc) = f(ab)f ab (c) = f (a)f a (b)f ab (c). function for Russian to English translation? The diffi- T hus we have f (a)f a (bc) = f(a)f a (b)f ab (c) s o that culty lies partially in the size of the unit that would f a (bc) = f a (b)f ab (c), w hich shows that f a i s a se- be necessary. One would probably need to use a unit quential function. We call f a a d erived function of f . of clause size, because of the ambiguity which would Carrying the above computation a little farther, we arise in dealing with units of lesser length. But this is have f a (bc) = f a(b)(f a ) b (c) ; hence f a (b)(f a ) b (c) = not the only difficulty which might arise. f a (b)f ab (c) , and therefore ( f a ) b (c) = f ab (c). T hat is, Suppose that we have a collection of units U a nd the function derived from fa using b i s the same as the a homomorphism T d efined on sequences of elements function derived from f u sing a b. T hus the corre- of U . I n other words, U i s the set of generators of the spondence ψ w hich associates to an element a o f the free semigroup that is the domain of T . S uppose that semigroup and a sequential function f t he sequential a a nd b a re two of the units of U , a nd that T ( a) = function ψ ( f, a) = f a , has the associativity property T (b) = If then, we encounter the sequence a b, ψ ( ψ ( f , a ), b) = ψ ( f, ab). W hat this means is that a i ts translation will be T (ab) = T(a)T(b) = S up- sequential function f c an be defined on a free semi- pose this is incorrect, that is, we wish to assign an- group by defining the sequential functions derived other translation to the sequence a b. R ecall that in from f o n each of the generators of the semigroup. In particular, then, a sequential function certainly be- this case, we wish to modify the translation function comes realizable by an automaton if it has only finitely T t o obtain a new translation function T ' w ith the prop- many derived functions, and is defined on a finitely erty that T ' t ranslates a b c orrectly, and also translates generated free semigroup. In fact, the realization of a those sequences of elements of U w hich do not contain sequential function of this kind is accomplished in a a b a s did T . B ut now, T ' c annot be a homomorphism. very natural way by the type of automaton known For any homomorphism which agrees with T o n U as a sequential automaton, or a finite state ma- w ill be identical with T . I n particular, then, such a chine. These automata have been extensively studied homomorphism cannot translate a b c orrectly, if T d oes by several authors 3 , 4 , 5 , 6 . To obtain the sequential not. Thus we see that we cannot restrict our choices automaton A c orresponding to a sequential func- of translation functions to homomorphisms, if we wish tion f , w e need merely take, as a set of states F o f A , to be able to modify these functions as we indicated the set of derived functions f a o f f , letting f i tself be earlier. the initial state. The input I o f A i s the semigroup on If homomorphisms do not lend themselves to modi- which f i s defined, and the output O i s the range of f . fication, what kinds of functions, realizable by auto- T he next-state function of A i s the function f d efined mata, do have this property? Perhaps the first such previously, and the output function of A i s the cor- function to consider is what we call a sequential func- 4
  4. r espondence φ w hich associates to an element b o f I from right-to-left instead of from left-to-right, we a nd to a state f a o f A t he element φ ( f a , b ) = f a (b) o f could equally well modify the translation of a unit on O . We thus obtain the sextuple A = ( I, O, F, f, ψ , the basis of following context. In fact it would seem φ ) w ith the requirement ψ ( ψ ( g , a ),b) = ψ ( g,ab) that, by proceeding from left-to-right and “holding- o n ψ a nd a corresponding requirement φ ( g,ab) — up” the translation of a given unit until the machine φ ( g,a) φ ( ψ ( g, a ) ,b ) on φ w here g i s in F , a a nd b senses what follows it, it would be possible to take into a re in I . E xcept for the designation of f a s initial state, account both preceding and following context. That is, the restriction of F t o be finite, and the restriction of we could attempt to construct a sequential machine I a nd O t o be free and finitely generated, this is ex- that would translate b a s i n the context a bc a nd as actly the definition of a sequential machine as given o therwise. This attempt would run into the difficulty by Ginsberg. 3 that b w ould go untranslated in the context a b o ccur- E quivalently, one may begin with a sequential ma- ring at the end of input sequences, since the machine chine with a designated initial state, and define a “waits” to see what comes next before translating b sequential function. It is clear intuitively that an auto- a fter a , a nd in case a b i s a terminal segment nothing maton will realize a sequential function just in case comes next. This difficulty could be avoided by the ad- the output sequence corresponding to an initial seg- dition of a special symbol [] to the input alphabet, ment of some input sequence is an initial segment of having the function of “closing off” input sequences, so the output sequence corresponding to the complete in- that the terminal segment a b w ould become a b [] . put sequence. T his device, however, is awkward. A simple example of a sequential function is given A more serious problem is encountered when we by the translation of examine sequential functions from the point of view of their flexibility with regard to alterations of order be- THE BOY WENT HOME tween input and output. For example, it is impossible as to construct a finite-state sequential automaton which will realize the very simple function which translates TBG IXW TYMG ODQV THE BOY WENT HOME accomplished by using the correspondence between the letters and the numbers from 1 to 26, and assign- as ing to each letter in the first row the letter which cor- EMOH TNEW YOB EHT responds to the sum of the numeral values, modulo 26, of the letters up to and including the one to be trans- i.e., the function which simply reverses the order of lated (except that space always translates as space). the letters in an input sequence. The sequential function thus defined has 26 derived Another difficulty that we run into using sequential functions, f A t hrough f Z = f . Every derived function is functions as translation functions is illustrated by an equal to one of these; e.g., f AB = f C . attempt to construct a sequential function, defined on Let us now return to a consideration of the problem the alphabet ~, ∨ , (,), p 1 , p 2 , p 3 , . . . e tc., which will of modifying a given translation function T , w here we correctly translate well-formed expressions of the pro- now may let the modified function T ' b e a sequential positional calculus, in the primitives ~ and ∨ , into the function. Suppose, for simplicity that T i s the function equivalent expressions in the primitives ∧ a nd ⊃ . Con- considered before, defined as an extension to a homo- sider expressions of the form morphism of some function (we can still call it T ) ∨ p 2 ) ∨ p 3 )...) ∨ p n defined on the set U o f free generators of a free finitely ~(... (~((~p 1 ) generated semigroup. Suppose also that we wish to w hich translate correctly as have T ' a gree with T e xcept on sequences containing ⊃ p 2 ) ⊃ p 3 )...) ⊃ p n . a b, a nd that the proposed modification on a b i s that (...((p 1 b s hould translate as after a , a nd otherwise as = It is intuitively clear that, reading from left-to-right, a sequential machine would translate ∨ a s ⊃ i f it “re- T ( b). T hen we can define T ' b y letting T ' m = T i f m i s a sequence not ending in a , T' a (c) = T (c) i f c ≠ members” that a ~ preceded the opening parenthesis paired with the closing parenthesis preceding the ∨ i n b , T' a (b) = a nd then let T ' b e the extension question. But it is clear that to overtax the “memory” which results by enforcing the associativity condition. of a given sequential machine, it is enough to try using This kind of modification also succeeds in case T is it to translate correctly a proposition of the above form already a sequential function which is not a homo- with sufficiently many “levels”. morphism. This difficulty is related to the objection, voiced by Thus we are able to introduce modifications into Chomsky, 2 t hat arises when one attempts to employ translation functions which are sequential functions, a “finite-state grammar,” which is essentially a sequen- if these modifications are suitably restricted. Essentially, tial automaton without input, as a “sentence generator” we can let preceding context modify the translation of for languages which have sentences of the form “if . . . a particular unit, thereby modifying the translation then . . .”, or “either . . . or . . .”. Again, these sentences function itself. By running the text into the machine 5
  5. m ay be “nested” to a level which overtaxes the capac- t he result is not a sentence. This is reflected in the fact ity of the machine. that the category of the juxtaposition of ( (the boy) Thus, sequential functions would seem to be not went), a n expression of category s , and h ome, a n ex- only awkward, but perhaps even basically inadequate pression of category ( (n \ s) \ ( n \ s), i s undefined. for use as translation functions. This is in accord with An expression may belong to several categories. our intuitive feeling about language. It is not that we Thus h ome c ould also be in category n ; o r in category feel that a language has a God-given structure of some ( n/n), a s in h ome run. S ometimes the context will kind, which it is our task to discover, adopting then a determine that a given expression must be function- type of translation function which fits this structure. ing in a certain capacity within that context, as f lying However, we do feel that a given type of translation i n t hey are flying. T hat is, if it is known that the entire function will necessarily impose a corresponding struc- expression has only the category s , t hen an analysis of ture on the language on which it is defined; and we the assignments resulting from can then appraise our choice on the grounds of econ- T hey are flying omy, our intuitive feelings of neatness and elegance, etc. By these standards, it appears that sequential n ( (n \ s)/n) (n/n) functions do not offer a good choice as translation ( ((n \ s)/n) \ ((n \ s)/n)) functions. We have now reached the point where we shall n begin to describe our recent work. We intend now to discuss a type of translation function which does not s hows that of the three choices of category for f lying have the inadequacies of those that we have described. o nly n c an be correct. However, consider the sentence In fact, the type of translation function which we now T hey are flying planes wish to consider, will lead, at the end of this discus- sion, to what we believe to be a computer program n ( (n \ s)/n)) (n/n) n which is adequate for machine translation. ( ((n \ s)/n) \ ((n \ s)/n)) The origin of the program is a system of notation, pro- posed by Bar-Hillel 1 w hich is designed to denote D epending on whether we read the sentence as the syntactic categories of linguistic expressions. Bar- Hillel’s notation can be built up out of the symbols n , ( They ((are flying) planes)) s, /, \ , (,). Used in conjunction with a natural lan- (They (are (flying planes))) guage, expressions which are commonly called nomi- nals—nouns, pronouns, adjective-noun combinations, o r as noun phrases, etc.—are assigned the category n . Sen- we choose ( (n \ s)/n) \ ( (n \ s)/n)) o r ( n/n) a s a tences are assigned the category s . An expression which produces an expression of category β w hen pre- category for f lying. T his ambiguity occurs not only in sentences, of course, but also in such an expression as fixed to an expression of category a i s assigned the category (β /a). T hus the adjective t he p refixed to the the nominal p urple people eater. I s it ( (purple people) eater) o r is it ( purple (people eater))? noun b oy p roduces the nominal t he boy; hence t he h as the category (n/n) s ince b oy a nd t he boy b oth have W e have observed that the way we associate the category n . S imilarly, an expression which produces an words in a sentence or a phrase can alter the meaning expression of category β w hen affixed to an expression of the expression. It is reasonable to suppose then, that of category a i s assigned the category ( a \ β ). Thus the association of the units in an expression can influ- w ent i n t he boy went i s assigned the category (n \ s), ence its translation. But this means that we should be a nd h ome i s assigned the category ( (n \ s) \ ( n \ s )). studying translation functions defined, not on associa- The parts of the sentence are assigned categories as tive systems such as semigroups, but on non-associa- follows: tive systems. We will not be satisfied, of course, with a computer program which requires that a pre-editor T he boy went home insert parentheses into a Russian sentence before it is ( n/n) n (n \ s) ((n \ s) \ ( n \ s)) given to the machine to be translated. This is not what n ( n \ s) we have in mind, but rather we think it might prove convenient to break our problem into two parts—to s supply parentheses, and to translate. In fact, one way P erhaps we can notice now that this process of cate- of correctly supplying parentheses will be to try trans- gory assignment is in some sense non-associative. That lating all possible associations of a given input se- is, the assignment indicated induces an association of quence, and then to consider that association the cor- the sentence as follows: rect one which has a translation. If there are two associations with differing translations, this means, of ( (The boy) (went home)) course, that we are dealing with an ambiguous se- A ssociated another way, e.g.: quence, just as in the case of a sentence with two ( ((The boy) went) home) meanings corresponding to two different associations. 6
  6. L et us now turn to the program. It will be evident plication table, a given association of the text is suc- how the construction of the program was influenced by cessively reduced. Either the process ends with at least Bar-Hillel’s notation. one category assignment to this association, or some Recall that we have said that a self-modifying pro- derived list is empty because products are undefined. gram P f or machine translation would consist of a In the latter case the association is considered to have translating part T a nd a modifying part M . It will be no translation. In the former case the list correspond- convenient to describe our program in these terms. Let ing to the association is considered to be a possible us first describe T , that is, we will describe T (n) , the translation of the original input text and is printed out. translation program at the n th stage of modification. The output consists of the complete list of all possible The information which is stored in the machine and translations corresponding to all associations. If the forms the reference material for T c onsists of a dic- complete list is empty an indication of this fact re- tionary and a category multiplication table. The input places the translation. to T i s a source language text. The action of T o n t his This completes the description of T . We now de- input text is as follows. scribe M , the modifier program. The program M i s 1. T he units of the input text are referred to the called into action only when T m akes an error, that is, dictionary, and for each unit for which an entry is pre- only when it is decided, by a comparison of the input sent in the dictionary, the entry is extracted and and output texts, that the translation is unsatisfactory. brought to the working space of the machine. For each There are two ways in which the translation can be unit for which a dictionary entry is not present, a spe- unsatisfactory. On the one hand the list of translations cial entry, indicating dictionary blank, substitutes as a may not contain any translation which is correct. On dictionary entry for the unit. A dictionary entry con- the other hand the list of translations may contain sists of a list of pairs of output units and symbols some translations which are incorrect. In the first case designating categories. the necessary modification involves supplying a cor- 2. W e now have stored in the working space of the rect translation, in the second case it involves eliminat- machine a list for each input unit. Together these lists ing the incorrect translations. comprise a sequence of lists in the same order as the We must organize the modification process in such corresponding sequence of input units in the text. This a way that these two kinds of modification do not in- sequence of lists is now processed by a multiplication terfere with one another. What we shall do is to per- operation on all possible associations. form the modifications of the second type, i.e., elimi- For each ordered pair of associated lists, i.e., ( A,B) nating incorrect translations, in such a way that correct i n ( (AB)(CD)) , and each ordered pair ( a,b) o f en- translations are never eliminated. Then an unsatisfac- tries in ( A,B) , i.e., a i n A a nd b i n B , the machine tory translation of the first kind can occur only if the refers to the category multiplication table. The category dictionary is inadequate. That is to say, when there is multiplication table is a square array of the following no correct translation present in the output list, the type: modification amounts to augmenting the dictionary. Thus the first part of M i s a program which makes λ α β γ up new dictionary entry lists and adds to lists already λ λ,λ λ,λ λ,λ λ,λ present in the dictionary. When no correct translation α λ,λ λ,λ γ , α α ,- is present in the output list, one must be supplied by β λ,λ β ,- λ ,- -,- the operator. Corresponding to this translation the γ λ,λ -,α α,β -,β operator will also indicate, for each input unit, which sequence of units in the translation it corresponds to. w here the row refers to the first, the column to the This material then becomes the input of M , which second element of the ordered pair. The two elements locates the unit in the dictionary corresponding to each of ( a,b) e ach consist of a pair, the first element an input unit, or enters it into the dictionary if it does output unit, the second a category. Let us suppose not already appear there, and adds to the dictionary that the category of a is a and that of b i s β . T he ma- entry list thus obtained the corresponding sequence chine then locates the entry corresponding to a a nd β , of output units, assigning them to a special “universal” which in the example is ( γ , α ), and places two entries category. The universal category is defined as that in the derived list A B . One entry consists of the pair unique category, such that its product with any cate- ( γ ) w here and are the output units of a a nd b gory is a pair of universal categories. α ). T he de- r espectively, and the other is the pair ( This completes the first stage of the correction rived list A B c onsists of all such pairs for all choices of process. If T w as the original translation program, the new translation program T ' w hich results from T b y ( a,b) in (A,B) e xcept for the pairs ( -). That is, the modifications described above will yield a transla- if in the example the category of α w ere γ a nd that of tion of the text which is satisfactory on at least the first b w ere α , t hen the multiplication table entry corre- count—the list of translations will contain at least one sponding to this pair would be (-, α ), which indicates which is correct. that the first element of the product is “undefined”. The next problem is to eliminate from the list the In this way, building up derived lists from the basic incorrect translations. As a first step the operator must dictionary entry lists by means of the category multi- 7
  7. i nform the machine exactly in what respect an incor- the example, the new multiplication table will be as rect translation is incorrect. For example, a translation follows. of a sentence might be incorrect if it contains an in- λ α β γ α’ β’ correctly translated phrase; or each phrase within a λ λ,λ λ,λ λ,λ λ,λ λ,λ λ,λ sentence may be correct if considered without refer- α λ,λ λ,λ γ , α α ,- λ,λ γ,α ence to context, but incorrect when considered in con- β λ,λ β ,- λ ,- β ,- λ ,- -,- text; or finally, the translation of each phrase may bo γ λ,λ -,α α,β -,β -,α α,β correct even when considered in context, but the ar- α’ λ,λ λ,λ γ,α α ,- λ,λ -,α rangement of the translation may be incorrect. β’ λ,λ β ,- λ ,- β ,- λ ,- -,- The task of the operator is thus as follows: for each association of the text which leads to an incorrect I f now and are not translations of units, but are translation, he must decide, for every indicated juxta- elements built up out of combinations of units, not position of two associated elements—assuming it has only must the categories of a nd b e changed from already been decided that each of the two elements α a nd β t o α ' a nd β ' w ith the first element of α ' β ' u n- is correctly translated—whether the indicated juxta- defined, but also the categories of the successive seg- position of the elements (in either order) is a correct ments of which a nd a re resulting combinations translation of the corresponding part of the input. That is, he must think of the corresponding part of the input must be correspondingly changed. For example, if = as entirely divorced from its context, and decide h as category γ , h as category δ , then the a nd whether in fact it is correctly translated by the juxta- m ust be changed to γ ’ a nd δ ’ , categories of a nd position (in either order) of the two output units in where γ ’ a nd δ ’ h ave all the properties of γ a nd δ e x- question. Essentially then he must decide this on the cept that the first element of γ ’ δ ’ i s α '. T his procedure same basis on which he decides on the translations of will finally result in changes in the categories of the complete texts: for the purposes of this decision the units of which a nd a re composed. When the cate- part of the input in question is treated as a complete gory of a unit is changed the corresponding dictionary text. In particular, if the translation is considered in- entry is also changed. correct in one association, it must also be considered It is asserted that this procedure will lead to the incorrect in any other association which contains the elimination of all incorrect translations and retain all two elements associated in the same order, as a trans- correct translations. It should be clear, in the first lation of the same part of the input. place, that an incorrect translation is eliminated if and If it is decided that the translation is correct, the only if it is eliminated as a result of every association, two elements are combined to produce a new element and that a correct translation is retained if and only if which is also considered correct. Proceeding in this it is retained as a result of some association. Thus, in way the operator must eventually encounter a pair of order to convince ourselves that the procedure actually elements which are correct, but whose juxtaposition does lead to the desired result, it will be sufficient to is incorrect (he cannot encounter a u nit w hich is in- consider a fixed association, and show that any correct correct since we may suppose the dictionary not to translation which results from this association before contain incorrect entries). the modification will continue to do so after the modi- Suppose then that and are two elements, each fication, and that no incorrect translation will result after correct, but is incorrect. The operator then gives the modification. But it is clear than any pair of output units which enter into at least one correct translation, this information to the machine. That is, he supplies e.g., a nd in , a re such that there is a choice the machine with the part of the input which led to the translation t ogether with the association of the for the other units, i n the example, such that the resulting juxtaposition is a correct translation. There- units in a nd indicates for each unit of the input fore the juxtaposition of these two units is correct, and text to which units of i t corresponds. Since is a their categories are not changed as a result of the permissible combination according to the present cate- modification. gory multiplication table, this means that the first On the other hand, given an incorrect translation it element of the product α β i s defined. In the example must result either from the incorrect juxtaposition of α β = ( γ , α ). The action of M w ill be to change the its two highest order segments, in which case it is t o categories α ’ a nd β ’ s uch that categories of a nd eliminated at this stage, or from one of these two seg- the first element of α ’ β ’ i s not defined, while at the ments being incorrect, etc. Again, inductively one sees same time keeping α ’ δ = α δ f or every category δ ≠ β ’, that there must be two segments of some order whose k eeping δ β ’ = δ β f or every category δ ≠ α ’, a nd keep- juxtaposition is incorrect, causing their categories to ing δ α ’ = δ α a nd β ’ δ = β δ f or every category δ . In be altered and the translation eliminated. other words M w ill change the categories of a nd This completes the description of the modification t o α ’ a nd β ’ and respectively, and will add two rows and program M . I t will probably be helpful at this point to two columns to the category multiplication table (un- consider an example of the use of T a nd M . less these rows and columns are already present). In L et us suppose we are translating from English into German. We will take as our input unit the word, and 8
  8. c onsider the input text t he boy left. L et us suppose the modification program M w ill locate the dictionary entries corresponding to the input units, and will enter also that, corresponding to the three input units, the v erliess i n the list for l eft, a ssigning to it the universal dictionary contains the three entries category λ . THE: DER α B OY: KNABE δ L EFT: LINKS ε Again using T he boy left a s input, the new transla- D AS β tion program will cause the sequence D IE γ DER α K NABE δ L INKS ε a nd that the portion of the category multiplication D AS β V ERLIESS λ table in which we are interested is as follows (only the D IE γ required products are indicated): t o appear in the work space. From the association λ α β γ δ ε µ DER α K NABE δ L INKS ε λ D AS β α λ,λ µ ,- V ERLIESS λ D IE γ β λ,λ - ,- γ λ,λ - ,- w e obtain δ λ,λ -,δ ε DER KNABE µ L INKS ε µ - ,- V ERLIESS λ T he first act of T i s to place the dictionary entries in a nd from this list, the two translations sequence in the work space: DER KNABE VERLIESS λ DER α K NABE δ L INKS ε V ERLIESS DER KNABE γ . D AS β D IE γ From the second association DER α K NABE δ L INKS ε T here are two possible associations from which a D AS β V ERLIESS λ translation might be obtained: D IE γ DER α K NABE δ L INKS ε (1) D AS β w e get D IE γ DER a L INKS KNABE δ DER α ( KNABE δ L INKS ε ) ( 2) D AS β K NABE VERLIESS λ DAS β D IE γ V ERLIESS KNABE λ D IE γ w hich leads to the translations S ince of the products α δ , β δ , and γ δ , only the first DER LINKS KNABE µ element of α δ i s defined, the first association reduces D ER KNABE VERLIESS λ to K NABE VERLIESS DER λ DER KNABE µ L INKS ε D ER VERLIESS KNABE λ b ut, as µ ε i s undefined, no translation results from this V ERLIESS KNABE DER λ association. D AS KNABE VERLIESS λ From the second association we obtain first the de- K NABE VERLIESS DAS λ rived list D AS VERLIESS KNABE λ V ERLIESS KNABE DAS λ DER α L INKS KNABE δ D IE KNABE VERLIESS λ D AS β K NABE VERLIESS DIE λ D IE γ D IE VERLIESS KNABE λ s ince the first element of δ ε i s undefined, and the sec- V ERLIESS KNABE DIE λ ond is δ . This list then reduces to DER LINKS KNABE µ s o that the complete list of translations, from both s o that the entire output consists of this one transla- associations, has fourteen members. D er Knabe verliess tion. r esulting from both associations. Suppose now that it is decided that the correct Suppose now it is decided that only D er Knabe translation of T he boy left i s not D er links Knabe b ut verliess i s correct, and that in fact we wish to retain it D er Knabe verliess. A ssuming that the correspond- only as a result of the first association. That is, we ence between input units and output units is indicated can decide first that l inks Knabe i s incorrect as a trans- as lation of b oy left a nd that so also are K nabe verliess THE—DER a nd v erliess Knabe, a nd finally, that while D er Knabe BOY—KNABE LEFT—VERLIESS 9
  9. p rogram. One could begin with a completely blank a nd v erliess a re correct as translations of t he boy a nd dictionary and a multiplication table of the form l eft, t hat v erliess der Knabe i s incorrect as a transla- tion of T he boy left. I n terms of the categories, this λ means that the dictionary entries are corrected to: λ λ,λ THE: DER α ' B OY: KNABE δ ' L EFT: LINKS ε ' a nd begin translating sentences as texts. It would D AS β V ERLIESS λ ' probably be more reasonable, however, to begin with D IE γ the above multiplication table and a dictionary al- a nd the multiplication table becomes (part of it): ready reasonably large, and begin translating short λ α β γ δ ε µ δ ε λ and more or less unambiguous phrases, thus adding λ gradually to the category system. α λ,λ µ ,- It is of course evident that a text need not be any β λ,λ - ,- one in particular of the standard linguistic units, but γ λ,λ - ,- it might be mentioned that the segment which we have δ λ,λ -,δ been referring to as a unit is similarly unrestricted. The ε only requirement on the system of segmentation of the α’ µ ’,- input text, leading to these units, is that it be such as δ’ -,- -,- to give a free decomposition, that is, that no input µ’ -,- λ ,- -,- text should have two distinct decompositions as a se- quence of units. The obvious choice is of course the ( One notes that it would be possible for a category word, but theoretically one could use letters of the to become empty, all units belonging to it becoming alphabet, syllables, sentences, etc. In fact, if the de- reassigned. Thus it would be reasonable to periodically tails of the decomposition could be worked out, some examine the multiplication table for unnecessary cate- choice of stems, prefixes, and endings might mate- gories.) rially reduce the size of the dictionary (at the cost of We will conclude by offering a few comments on increasing the size of the multiplication table, of methods of using the program. In the first place, it course). There is no restriction at all on the output should be clear that it would be possible to institute units. Thus if the input units were words, the output several different kinds of “training programs” for the units could be, and frequently would be, sequences of two or more words. R eceived July 16, 1959 APPENDIX least one set of generators, namely h(ab) = h(a}h(b) for a and b T itself. In particular, S has a set of in S. Binary Composition and Semigroups generators. A semigroup S is finitely REFERENCES A set S is said to have defined on generated if it has a finite set of gen- it a (not necessarily associative) law 1. Y. Bar-Hillel, “A Quasi-Arithmeti- erators. of binary composition if there exists a cal Notation for Syntactic De- The product of any sequence map S × S → S. The image of a scription,” Language 29 (1953) s1, s2, . . .,.sn of elements of a semi- pair (a, b) of elements of S under 47-58 group S is an element of S defined this map is denoted ab. The map 2. N. Chomsky, Syntactic Structures inductively in terms of the binary S × S → S is associative if for every (The Hague, 1957). composition, and is shown to be in- three elements a, b, c of S we have 3. S. Ginsburg, “Some Remarks on dependent of the association of the Abstract Machines,” Transactions sequence. A set F of elements of S is (ab)c = a(bc) of the American Mathematical said to be free in S if every element A system with an associative binary Society 96 (1960) 400-444. of S is a product of at most one se- composition is called a semigroup. 4. E. Moore, “Gedanken-Experiments quence of elements of F. A semi- A subset T of S is a subsemigroup on Sequential Machines,” Auto- group S is free if it has a free set G of S if the restriction of S × S → S mata Studies (Princeton, 1956). of generators. It is easily shown that maps T × T into T. The intersection 5. M. Rabin and D. Scott, “Finite this is the ease if and only if every of any family of subsemigroups of S Automata and their Decision Prob- element of S is the product of one is again a subsemigroup of S. If G is lems,” IBM Journal of Research and only one sequence of elements any set of elements of S, the sub- and Development 3 (1959) 114- of G. It is shown that if a semigroup semigroup generated by G is the 125. S is free then its set G of free gen- intersection of all subsemigroups 6. G. Raney, “Sequential Functions,” erators is unique. containing G, and G is called a set Journal of the Association for Given two semigroups S and T, a of generators for this subsemigroup. Computing Machinery 5 (1958) homomorphism of S into T is a map Every subsemigroup T of S has at 177-180. h:S → T with the property that 10
ADSENSE

CÓ THỂ BẠN MUỐN DOWNLOAD

 

Đồng bộ tài khoản
2=>2