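Over the past couple of days, feeds across the translation community have been flooded with news of a new technology from Google. It goes by the rather grand name of Google Neural Machine Translation system (GNMT). In short, it trains the machine with machine learning, much as AlphaGo was trained before it defeated Lee Sedol: instead of telling the machine grammar rules or how to choose words and build sentences, you hand it a pile of material and let it learn and improve on its own. The word "neural" in the name does not mean the machine has been wired up to nerves and gained some kind of sensory ability. It simply refers to one method of training a machine to learn, one that imitates the way neurons communicate.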

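The previous generation of translation software used PBMT (Phrase-Based Machine Translation), which roughly means splitting a sentence into individual phrases and then looking up a suitable translation for each phrase. The GNMT technology that Google is now rolling out with great fanfare takes the whole sentence as its input, and after training, the machine outputs a translation of the whole sentence directly. (Neural network technology was also used when AlphaGo was trained.)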


Using human-rated side-by-side comparison as a metric, the GNMT system produces translations that are vastly improved compared to the previous phrase-based production system. GNMT reduces translation errors by more than 55%-85% on several major language pairs measured on sampled sentences from Wikipedia and news websites with the help of bilingual human raters.







First, the network encodes the Chinese words as a list of vectors, where each vector represents the meaning of all words read so far ("Encoder"). Once the entire sentence is read, the decoder begins, generating the English sentence one word at a time ("Decoder"). To generate the translated word at each step, the decoder pays attention to a weighted distribution over the encoded Chinese vectors most relevant to generate the English word ("Attention"; the blue link transparency represents how much the decoder pays attention to an encoded word).
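To make the "weighted distribution" in the passage above a little more concrete, here is a minimal sketch of a single attention step in plain Python. It is a toy illustration only: the vectors are made-up numbers, the names `softmax`, `attend`, `encoder_vectors` and `decoder_state` are hypothetical, and dot-product scoring is used for brevity without any claim that GNMT scores words this way.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_vectors):
    """One attention step: score each encoded source word, then blend them."""
    # Score each source vector by its dot product with the decoder state
    # (an assumed scoring function, chosen only because it is short).
    scores = [
        sum(d * e for d, e in zip(decoder_state, vec))
        for vec in encoder_vectors
    ]
    weights = softmax(scores)
    # The context vector is the weighted average of the encoded source words.
    dim = len(encoder_vectors[0])
    context = [
        sum(w * vec[i] for w, vec in zip(weights, encoder_vectors))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical 2-dimensional encodings for a three-word source sentence.
encoder_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical decoder state at the step where one target word is produced.
decoder_state = [2.0, 0.0]

weights, context = attend(decoder_state, encoder_vectors)
print(weights)
print(context)
```

The weights play the role of the blue link transparency described in the quote: source words whose vectors line up with the current decoder state receive more weight, and the decoder would use the blended context vector when picking the next English word.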



Google Translate: A Hands-On Translation Test


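To test its performance in practice, this editor asked someone to pick a few passages of Chinese at random and feed them to Google Translate, then copied down its output so that everyone can see how Google's artificial intelligence actually performs and whether it can truly replace human translation. For comparison, this editor has also included the output of Youdao Translate, which presumably uses what Google calls the previous-generation phrase-based technology, so you can judge how large the gap between the two really is.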

Eg. 1





Innovation drive and environmental forcing together, vigorously promote innovation and entrepreneurship, and resolutely eliminate backward production capacity, small and medium scattered remediation, promote economic upgrading quality and efficiency upgrade. Deepen the reform of science and technology system, increase investment in science and technology, speed up the science and technology city, dream town and other innovative space construction, summed up the promotion of Xinchang science and technology system reform pilot experience. In the year, 23,000 new invention patents were granted, 1,688 high-tech enterprises, and R & D expenditure increased to 2.33% of the total GDP.


Combine innovation drive reversed transmission with the environment, vigorously promote innovation entrepreneurship, resolutely eliminate backward production capacity, low control small scattered, promote economic quality efficiency upgrades. We will deepen reform of the science and technology system, increase investment in science and technology, and accelerate the construction of innovative science and technology city, dream town space, sum up the experiences of promoting science and technology system reform pilot enjoys. For the authorized invention patent 23000, 1688 high and new technology enterprises, research and development spending accounts for the proportion of GDP to 2.33%.

Eg. 2






The characteristics of Internet governance, like Wuzhen the same water. Water, the first is omnipotent, interoperability; the second is clear and transparent; the third is to co-governance. Therefore, the water is not who can manage, but the need for common governance. Every country, every nation, every culture is different, the governance method is really different.

I listen to a lot of people overseas, the Chinese Internet is not too tight tube? But I think so: Anyway, the country out of 700 million users out of possession of the BAT (Baidu, Alibaba, Tencent) Out of a lot of innovation, there are still many things worth learning and reflection. As we Alibaba, I particularly opposed to control, but today there is a fake, fraud, people say you are not strict management, it still needs some management.


The characteristics of Internet governance, like water of wuzhen. Water, are the first and connectivity; The second is crystal clear; The third is to common governance. So, water is not who can manage, but need to common governance. Each country, each nation, each culture is different, the governance approach is really different. I listen to many people abroad, China s Internet tube too tight? But I think so, anyway, the pipe out of the 700 million users, tube out of the BAT (baidu, alibaba, tencent), tube out a lot of innovation, there are a lot of things worth learning and reflection. Just like our alibaba, I especially against tube before, but today the fake goods, fraud, the somebody else say you, lax management, so still need some of the management.

Eg. 3



據外媒報道,繼GalaxyNote 7因電池過熱存爆炸隱患遭調查後,日前,三星的洗衣機產品也因存在「安全問題」,遭到來自美國消費者產品安全委員會(CPSC:the Consumer Product Safety Commission)調查。






According to foreign media reports, following the GalaxyNote 7 because of the battery overheating explosion hidden after investigation,a few days ago, Samsung washing machine products because of the existence of"security", from the US Consumer Product Safety Commission (CPSC: theConsumer Product Safety Commission )survey.

On Wednesday, the CPSC warned some users of the Samsungwashing machine that "washing machines" have a "securityproblem" for users who use clothing models from the top of the fuselage.The CPSC states that these problematic machines were manufactured between March2011 and April 2016, but the CPSC did not specify specific models.


Samsung said in a statement that the company is working with the US authorities on how to solve potential problems to start a dialogue. "In rare cases, the affected product may be subject to vibration, personal injury or property damage when washing bedding, heavy or water-resistant clothing," Samsung said in a statement.

Samsung added that since 2011, its customers have completed hundreds of millions of laundry tasks, were not accident.


According to foreign media reports, the Galaxy Note 7 for battery overheat deposit was explosion hazard investigation, a few days ago, samsung washing machine products is also due to the existence of "security", were from the United States Consumer Product Safety Commission (the CPSC: the Consumer Product Safety appointed the investigation. Local time, on Wednesday, the CPSC to some users warned samsung washing machine, for those who use clothing models from the fuselage top loaded in the user need to pay attention to, the washing machine products exist "security issues". The CPSC says, the existing problems of machine manufacture date is between March 2011 and April 2016, but the CPSC unspecified specific models.


Samsung said in a statement, the company is working with the us authorities on how to solve the problem of potential dialogue. "In rare cases, the affected products in wash bedding, bulky or waterproof clothing happens when abnormal vibration, could face a risk of personal injury or property damage." Samsung said in a statement. Samsung added, since 2011, the customer has completed hundreds of millions of times laundry task, has not been an accident.


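2. With highly specialized expressions that have distinctly Chinese characteristics, it starts improvising if it has never seen them in its corpus. One example is 「淘汰落後產能、整治低小散」 (eliminating outdated production capacity and cleaning up "low-end, small and scattered" enterprises).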

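3. When a Chinese sentence does not have a complete structure in the traditional sense, it struggles to handle it well. Chinese is a paratactic language: it does not rely on structural completeness to convey meaning. Sometimes saying half a sentence is perfectly acceptable in Chinese and listeners still understand it, but a machine may not grasp it right away. For example: 「我在海外聽不少人講,中國互聯網是不是管得太緊?但我自己這麼認為:不管怎樣,這個國家管出了7億用戶管出了BAT。」 (roughly: "Overseas I hear many people ask whether China's internet is managed too tightly. But this is how I see it: either way, this country's management has produced 700 million users and produced BAT.") At this point the machine is simply lost.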

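4. Out of consideration for Google's feelings, this editor did not throw any classical poetry or classical prose at it. Yet these are precisely the areas in which many great translators lose themselves. Sometimes you find yourself turning over which wording would vividly capture a particular context, and that process is itself a pleasure. This is where the charm of translation lies.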


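1. Treat it as a new tool. Translators can consult the GNMT version of a translation and pick up some of its collocations and expressions, discarding the dross and keeping the essence. Its grammar mistakes and the like can simply be ignored~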

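2. Have confidence in your own relevance. Machine translation is advancing quickly, but it has not yet reached the point of replacing humans. Even an accuracy above 80%, which is already remarkable for a machine, is still far from enough for human use. On important occasions, it still comes down to people. (A digression: during the recent debate between Trump and Hillary Clinton, the live captions on YouTube's stream were not generated automatically by its own speech recognition technology but typed by human stenographers. On an occasion this important, they wanted to avoid the embarrassment of hearing "year of horse" as "year of whores".)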



