
The Development Process, Trends and

Authors

Sun Hao, Zhao Liang, Wang Yanli

College of Engineering and Technology, Xi'an Fanyi University, Xi'an, Shaanxi 710105; Xi'an iFLYTEK

Abstract: With the rapid development of artificial intelligence technology, translation technology is also constantly updated. Today, technologies such as machine translation and computer-assisted translation (CAT) have become an important part of the translation industry. This paper introduces the three stages of the development of machine translation technology: rule-based methods, statistical machine translation, and neural machine translation. It analyzes the advantages and disadvantages of each stage of machine translation technology, and elaborates on the market applications and future development trends of machine translation. However, there are still many problems to be solved in machine translation, and it cannot fully replace human translation for the time being. Therefore, there is still a long way to go to achieve high-quality machine translation. In the future, it is necessary to develop new methods that can combine symbolic rules, knowledge, and neural networks to further improve translation quality.

Keywords: Artificial Intelligence; Machine Translation; Enlightenment

1. Introduction

1.1 Definition and application of AI and machine translation

Artificial intelligence, abbreviated as AI, is an important driving force for the new round of scientific and technological revolution and industrial change. It is a new technological science that researches and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. With the rapid development of artificial intelligence technology, translation technology is also constantly updated. Nowadays, machine translation, computer-aided translation (CAT), and other technologies have become an important part of the translation industry.

Machine translation, also known as automatic translation, is the process of using computers to convert one natural language (the source language) into another natural language (the target language). It is a branch of computational linguistics and one of the ultimate goals of artificial intelligence, and it has important scientific research value. At the same time, it has important practical value. With the rapid development of economic globalization, machine translation plays an increasingly important role in promoting political and economic exchange. The Internet has connected global information, and the biggest remaining obstacle is the barrier between different languages. Only by truly solving the communication barriers between different languages can the concept of the "global village" be truly realized.

1.2 Application of encryption and decryption technology in machine translation

Since the birth of computers in the 1940s, one of the first things people thought of was to let computers realize the dream of translating between different languages. The history of machine translation is as long as the history of computers. At the beginning, people regarded translation between languages as encryption and decryption, with the foreign language as ciphertext and the native language as plaintext. Using the most advanced encryption and decryption technology of the time to realize machine translation achieved some very preliminary results, but it was bound to end in failure. The core of encryption and decryption technology requires that ciphertext and plaintext correspond one to one and that the correspondence is unambiguous; the core task is to analyze the codebook. However, the polysemy of words and the differences in grammatical structure between languages make machine translation based on encryption and decryption impossible to succeed.
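The failure described above can be illustrated with a minimal sketch. The three-entry codebook and the `decode` helper below are hypothetical and exist only for this illustration; they treat English as the "ciphertext" and Chinese as the "plaintext", and substitute each word independently, as a one-to-one cipher would.

```python
# Hypothetical one-to-one "codebook": each English word maps to exactly one
# Chinese word, as a substitution cipher would require.
CODEBOOK = {"river": "河", "bank": "银行", "loan": "贷款"}


def decode(words, codebook):
    """Substitute every word independently, keeping the source word order."""
    return [codebook[w] for w in words]


# "bank loan": the financial sense is the one stored in the codebook.
print(decode(["bank", "loan"], CODEBOOK))   # ['银行', '贷款']

# "river bank": the same entry is reused, although the intended sense here
# is "河岸" (riverside). A one-to-one codebook cannot represent polysemy.
print(decode(["river", "bank"], CODEBOOK))  # ['河', '银行']
```

The sketch also keeps the source word order unchanged, so it cannot handle differences in grammatical structure either, which is the second obstacle named above.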

2. Development of machine translation technology

Looking back on history, the development of machine translation technology is usually divided into three stages: rule-based methods, statistical machine translation, and neural machine translation.

2.1 Rule-based machine translation methods

The essence of the rule-based machine translation method is to write down the translation knowledge of experts in the form of rules, and to apply these translation rules in software to carry out the machine translation process. Early rule-based methods achieved great success in the field of expert systems. A rule is equivalent to a conditional statement of the form "if ... then": if the "if" condition is met, the "then" action is executed, which makes it a deterministic decision rule. Its advantage is that it is suitable for computer processing and conforms to the logical thinking of computer software. Its disadvantage is that the cost of communication between linguists and programmers is very high, the overall cost is relatively high, and the cost of maintenance is also very high. Once the translation rules are modified, the program needs to be rewritten.

Later, rule description languages were introduced so that linguists could write highly readable translation rules themselves and provide them to programmers as a knowledge base. Programmers focused on writing programs to interpret and execute the translation rules. This separated the work of linguists and programmers and greatly improved the efficiency of system construction. Most importantly, linguists could modify the translation rules without requiring programmers to modify the program code. This was great progress, and it greatly promoted the application and development of rule-based machine translation methods.
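The separation of rules from program code can be sketched as follows. This is an illustrative toy, not a real rule description language: the rule format (`"if"` holds a part-of-speech pattern, `"then"` holds an output template), the tag names, the lexicon, and the `translate` function are all assumptions made for this example. The point is only that the rules are data, so a linguist could edit `RULES` without touching the interpreter.

```python
# Knowledge base (maintained by linguists): "if" is a tag pattern,
# "then" is an output template mixing source positions and literal words.
RULES = [
    {"if": ("PRON", "VERB", "WH"), "then": [2, "do", 0, 1]},
]
LEXICON = {"你": "you", "去": "go", "哪里": "where"}


def translate(tagged_words, rules, lexicon):
    """Interpreter (maintained by programmers): apply the first matching rule."""
    tags = tuple(tag for _, tag in tagged_words)
    for rule in rules:
        if rule["if"] == tags:
            output = []
            for item in rule["then"]:
                if isinstance(item, int):
                    # An integer refers to a source position; look its word up.
                    output.append(lexicon[tagged_words[item][0]])
                else:
                    # A string is inserted literally.
                    output.append(item)
            return " ".join(output)
    return None  # no rule matches, so no translation can be produced


sentence = [("你", "PRON"), ("去", "VERB"), ("哪里", "WH")]
print(translate(sentence, RULES, LEXICON))        # where do you go
print(translate([("去", "VERB")], RULES, LEXICON))  # None
```

The second call returns `None` because no rule covers the input, which is the typical failure mode of a purely rule-based system when it meets a sentence its rules do not anticipate.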

However, with the gradual deepening of the application of machine translation, rule-based machine translation has also exposed the following problems:

(1) For the translation and structural transformation of many words, even linguists are unable to summarize appropriate translation rules; such knowledge can only be understood intuitively but not expressed explicitly;

(2) The number of translation rules is very large, and the construction of a practical machine translation system needs the cooperation of multiple linguists. Because everyone writes translation rules from a different angle, ensuring that the rules written by different linguists do not conflict is a very hard problem;

(3) Real user input may be colloquial and may not strictly follow traditional linguistic rules, and it is unclear how to construct translation rules for such sentences.

2.2 Statistical machine translation methods

Since IBM put forward the statistical machine translation method in the late 1980s and early 1990s, revolutionary changes have taken place in the field of machine translation. The biggest change is the shift from manually written translation rules to data-driven machine learning. The basic process is as follows. First, prepare bilingual sentence pairs of a certain scale (for example, 10 million Chinese-English bilingual sentence pairs, each consisting of a Chinese sentence and one manually produced English translation), and use machine learning methods to automatically train a translation model from these bilingual training sentence pairs. At the same time, use large-scale monolingual data in the target language (such as English) to automatically train a language model. Finally, build a complete statistical machine translation system through parameter optimization.
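As a rough illustration of how a translation model can be learned from bilingual sentence pairs alone, the sketch below uses IBM Model 1 trained with expectation-maximization. The text above does not say which model is trained, so the choice of IBM Model 1 is an assumption made for illustration; the three-pair corpus and the function names are likewise hypothetical, and the NULL word of the full model is omitted for brevity.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word translation probabilities t(e | f) with EM.

    pairs: list of (source_words, target_words) tuples.
    Returns a dict mapping (target_word, source_word) to a probability.
    """
    src_vocab = {f for src, _ in pairs for f in src}
    tgt_vocab = {e for _, tgt in pairs for e in tgt}
    # Start from a uniform distribution over target words.
    t = {(e, f): 1.0 / len(tgt_vocab) for e in tgt_vocab for f in src_vocab}
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) alignments
        total = defaultdict(float)  # expected count of f being aligned at all
        # E-step: distribute each target word over the source words
        # of the same sentence pair, in proportion to the current t.
        for src, tgt in pairs:
            for e in tgt:
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalize the expected counts into probabilities.
        t = {(e, f): count[(e, f)] / total[f]
             for e in tgt_vocab for f in src_vocab}
    return t


def best_translation(t, source_word):
    """Return the target word with the highest t(e | source_word)."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)


# Toy, pre-segmented Chinese-English corpus (hypothetical data).
corpus = [
    (["红", "房子"], ["red", "house"]),
    (["红", "书"], ["red", "book"]),
    (["蓝", "书"], ["blue", "book"]),
]
table = train_ibm_model1(corpus)
for word in ["红", "房子", "书", "蓝"]:
    print(word, "->", best_translation(table, word))
```

No dictionary or rule is supplied: the word correspondences emerge only from which words co-occur across sentence pairs. A real system would combine such a translation model with a target-language model and tune their weights, as described above.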

2.2.1 Advantages of statistical machine translation methods

The theory and technology of statistical machine translation have been developed for more than ten years and have entered industry with great success. Its advantages mainly include:

(1) There is no need to manually write translation rules; only bilingual sentence pairs and monolingual data need to be prepared manually;

(2) The whole training time can be reduced to a few days, unlike the rule-based method, which requires several person-years;

(3) The whole training process basically does not need manual intervention and is fully automatic;

(4) Machine translation engineers are not required to understand both languages. The machine learning method learns automatically from bilingual sentence pairs. Of course, linguists who understand both languages can join in the final system error analysis and performance optimization, but this is not necessary.

Another great advantage of statistical machine translation is that any sentence can be given a translation (even if the translation quality is not good). This is unmatched by the traditional rule-based method, in which translation fails and no output can be generated if no suitable translation rule exists.

2.2.2 Shortcomings of statistical machine translation methods

With the gradual deepening of the application of statistical machine translation, its shortcomings have also gradually appeared. Experimental results show that statistical machine translation is strong at selecting word and phrase translations, but it is not good enough at adjusting the word order of the translation, especially when the order needs to be adjusted over a long distance (for example, the Chinese question word "where" is at the end of the sentence, while the English question word is at the head of the sentence). Once long-distance reordering is involved, the effect is often not good enough. As a result, when you look at the translation word by word, most words seem to be translated well, but when you read the whole translation, you often feel that the sentences are not smooth; at the same time, the problem of omission is introduced.

You can imagine a sentence whose grammatical structure is unreasonable: even if every word is translated correctly, the translation of a long sentence is very difficult for readers to understand. Therefore, in practice, many human translators are not willing to post-edit statistical machine translation output and feel that it is better to translate from scratch, which means machine translation fails to reduce the cost of human translation.
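The reordering weakness can be made concrete with a small sketch. It assumes a simplified setting that is not taken from the text: the words have already been translated correctly and in source order ("you are where"), a bigram language model with add-one smoothing scores candidate word orders, and a `limit` parameter restricts how far any word may move, loosely mimicking the distortion limit used in phrase-based decoders. The monolingual corpus and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def train_bigram_lm(sentences):
    """Count bigrams in monolingual data; returns (bigrams, contexts, vocab_size)."""
    bigrams, contexts, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for a, b in zip(tokens, tokens[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    return bigrams, contexts, len(vocab)


def log_prob(words, lm):
    """Add-one smoothed bigram log-probability of a word sequence."""
    bigrams, contexts, vocab_size = lm
    tokens = ["<s>"] + list(words) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def best_order(words, lm, limit):
    """Pick the most fluent order in which no word moves more than `limit` positions."""
    candidates = [
        perm for perm in permutations(range(len(words)))
        if all(abs(new - old) <= limit for new, old in enumerate(perm))
    ]
    best = max(candidates, key=lambda perm: log_prob([words[i] for i in perm], lm))
    return " ".join(words[i] for i in best)


lm = train_bigram_lm(["where are you", "where are they", "you are here"])
words = ["you", "are", "where"]          # word-by-word output in Chinese order
print(best_order(words, lm, limit=0))    # you are where
print(best_order(words, lm, limit=2))    # where are you
```

Every word is translated correctly in both outputs; only the order differs. When the search is not allowed to move "where" across the whole sentence, the fluent order is simply out of reach, and the number of candidate orders grows factorially with sentence length, which is why long-distance reordering is expensive to search.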

2.3 Neural machine translation methods

Since neural machine translation theory was proposed in 2013, it has developed very quickly, and the shortcomings of statistical machine translation methods have been well alleviated. The use of neural networks to solve translation problems had been studied in academic circles for a long time, but it was only when a feasible framework was proposed (the encoder-decoder framework is adopted at present) that neural machine translation systems became truly practical, which also benefited from the wide application of deep learning.

Neural machine translation was initially implemented with recurrent neural networks (RNNs). Later, because RNNs parallelize poorly, Facebook proposed using convolutional neural networks (CNNs) to implement neural machine translation. At present, the Transformer model proposed by Google is the mainstream, and most neural machine translation systems in industry are based on this model and its variants.

Compared with statistical machine translation, neural machine translation is characterized by high fluency. For readers, the quality of translation has intuitively made a huge leap. The main reason is that a neural machine translation model is comparable to a target-language language model that carries source-language information, and a language model can build fluent translation output.

Statistical machine translation systems are usually deployed on CPUs, but neural machine translation systems need GPU deployment. From this perspective, the cost of system deployment is much higher, the decoding speed is relatively slow, and the user's hardware investment is greatly increased. In addition, with large-scale training data, the training cost of neural machine translation is larger than that of statistical machine translation. However, the biggest advantage of neural machine translation is that translation quality has been greatly improved, which is what matters most to users.
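For readers unfamiliar with the Transformer, its central operation is scaled dot-product attention, in which each decoder position forms a weighted average over encoder states. The pure-Python sketch below shows that single operation only; the two-dimensional vectors are made-up numbers, the function names are hypothetical, and a real model adds learned projections, multiple heads, and many stacked layers.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [v / total for v in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    # The context vector is the weighted average of the value vectors.
    context = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return context, weights


# Two hypothetical source-word representations (keys) and their values.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
# A decoder state that resembles the first source word.
context, weights = attention([5.0, 0.0], keys, values)
print(weights)
print(context)
```

Because the weights are computed for all source positions at once rather than step by step, this operation parallelizes well, which is one reason it displaced recurrent architectures.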

3. Market applications and future development trends of machine translation

3.1 Market applications of machine translation

With the development of globalization and digitalization, more and more companies need cross-language communication and translation services. The revenue of China's machine translation industry is mainly generated by machine translation hardware products. At present, there are mainly three kinds of hardware products: translation machines, Bluetooth translation headsets, and translation mobile phones. With the enhancement of the functions of intelligent translators and the improvement of user experience, the use of intelligent translators is expected to become more widespread, which will drive the growth of the machine translation market.

This work was supported by the Artificial Intelligence Translation Shaanxi Research Center.

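Author information: Sun Hao, Xi'an Fanyi University, program director, associate professor; fields: communications, computer science; 76802682@qq.com; 17791282219; Department of Electronic Information, School of Information Engineering, Xi'an Fanyi University, Taiyigong Town, Chang'an District, Xi'an, Shaanxi Province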