A new breakthrough in AI simultaneous interpreting: Sogou Simultaneous Interpreting 3.0 pioneers a context engine, raising PPT content translation accuracy by 40%
2022-05-07 19:55:56【Qingdeng ancient temple】
Machine center editorial department
As the first multimodal AI speech simultaneous interpreting product, Sogou Simultaneous Interpreting 3.0 takes the accuracy of intelligent simultaneous interpreting to a new level.
Last Saturday, Sogou released the industry's first multimodal simultaneous interpreting product: Sogou Simultaneous Interpreting 3.0. Built on Sogou's original 「context engine」, version 3.0 adds visual and reasoning abilities. The machine interpreter no longer only listens; for the first time it can also see, understand, and reason. The technology drew wide attention at the venue when it was first demonstrated.
Sogou Simultaneous Interpreting 3.0 made its debut last Saturday.
Recently, Chen Wei, general manager of Sogou's AI Interactive Technology Department, Zhang Jingjing, product director of Sogou Simultaneous Interpreting, and Zhao Chao, project leader, explained to us the technology behind the product.
Pioneering the 「context engine」: a new breakthrough in Sogou AI simultaneous interpreting
Since its release in 2016, Sogou's simultaneous interpreting technology has been used in thousands of meetings. In practice, the developers found that the mainstream speech simultaneous interpreting systems in the industry cannot deliver stable, high-quality results across a variety of speaking occasions, and that the recognition and translation of specialist terms in a speech often perform poorly.
To address these problems, Sogou added a 「context engine」 in version 3.0, hoping to solve them through a deeper understanding of language. 「The context engine can use a camera to recognize the PPT content on the screen at the venue in real time,」 Chen Wei said. 「Previously, machine simultaneous interpreting could only obtain speech information. With OCR technology, Sogou Simultaneous Interpreting can now obtain speech information plus PPT information, and the context engine can then build personalized knowledge, which greatly improves the interpreting results.」
The following figure shows some results from version 3.0 in use. The second column is what the guest speaker actually said, and the third column is the speech recognition output of the old version. In the past, when a speaker used a rare word such as the Go term 「投子」 (to resign, meaning that one side admits defeat), the AI would usually recognize it as the near-homophone 「投资」 (investment). But when the PPT content mentions the man-versus-machine match between AlphaGo and Lee Sedol, the 3.0 system expands its vocabulary to include Go terms like this one. With the help of the knowledge graph, the AI can make many corrections to the recognition and translation output.
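The correction described above can be pictured as rescoring recognition candidates against vocabulary taken from the slides. The sketch below is only an illustration of that general idea, not Sogou's implementation: the function name `rescore`, the scores, and the `boost` value are all hypothetical, and the two candidates are assumed to be the near-homophones 「投资」 (investment) and 「投子」 (to resign in Go).

```python
# Hypothetical sketch: bias speech-recognition candidates toward slide vocabulary.
# Names, scores, and the boost value are illustrative assumptions only.

def rescore(hypotheses, context_terms, boost=2.0):
    """Return the text of the best hypothesis after context boosting.

    hypotheses: list of (text, base_score) pairs; a higher score is better.
    context_terms: set of terms taken from slide text (for example via OCR).
    """
    def boosted(item):
        text, score = item
        # Add a fixed bonus for each context term that appears in the text.
        hits = sum(1 for term in context_terms if term in text)
        return score + boost * hits

    return max(hypotheses, key=boosted)[0]


# Two near-homophone candidates; the wrong one has the better base score.
candidates = [("李世石选择投资", -1.0), ("李世石选择投子", -1.8)]

without_context = rescore(candidates, context_terms=set())
with_context = rescore(candidates, context_terms={"投子", "围棋"})
```

With an empty context the higher-scoring 「投资」 candidate is chosen; once the Go vocabulary is present, the 「投子」 candidate is chosen instead. A production system would more plausibly bias the decoder itself rather than rescore a finished list, so this should be read as a simplification.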
Beyond proper nouns, how much has the new technology improved performance? Sogou said it deliberately chose a difficult professional conference speech and ran a comparative test between Sogou Simultaneous Interpreting 2.0, version 3.0, and professional human interpreters. The human interpreters scored 4.08 points, version 2.0 scored 3.41, and version 3.0 scored 3.82. Sogou describes this result as a new breakthrough in the field, bringing AI one step closer to the level of professional human interpreters.
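To put the three scores in proportion, the short calculation below works out how much of the distance between version 2.0 and the human interpreters version 3.0 covers. It uses only the figures reported above; the variable names are ours, and the article does not state the scoring scale, so only differences and ratios are computed.

```python
# Scores reported in the article for the same conference speech.
human, v2, v3 = 4.08, 3.41, 3.82

gain = v3 - v2                   # improvement of 3.0 over 2.0
gap_before = human - v2          # distance from 2.0 to the human interpreters
gap_after = human - v3           # distance that remains for 3.0
gap_closed = gain / gap_before   # share of the 2.0-to-human gap closed by 3.0

print(f"gain: {gain:.2f} points")
print(f"remaining gap: {gap_after:.2f} points")
print(f"share of gap closed: {gap_closed:.1%}")
```

On these numbers, version 3.0 gains 0.41 points over version 2.0, closes about 61% of the earlier gap to the human interpreters, and remains 0.26 points behind them.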
Multimodal technology that can both see and listen is not the only highlight of Sogou Simultaneous Interpreting 3.0. According to Sogou, version 3.0 brings improvements in three main directions:
- Closer to natural: moving from speech recognition alone to speech plus images, the new approach mimics the way human interpreters work. It adds vision and a "brain" that expands on knowledge points, giving the system a more complex perception system.
- More professional: earlier AI simultaneous interpreting models relied on general-purpose data. The new model strengthens itself by customizing knowledge in real time. It captures the PPT content at the venue to supplement domain knowledge related to the speech, and customizes a model for each speech to improve the interpreting results.
- More intelligent: model training used to be a passive learning process. The system now learns PPT content automatically and captures large amounts of vocabulary on its own, to keep the interpreting quality high.
Chen Wei summed up: 「Sogou Simultaneous Interpreting 3.0 is a large-scale update from front end to back end. First, it introduces multimodality, adding visual processing capabilities. Second, processing moves up from the perceptual level to the cognitive level: with the help of the 『context engine』, the system can use the knowledge graph to expand on the interpreted content and form contextual information related to the speech. The new tool can also improve recognition and translation in real time, with lower latency.」
「Watching and thinking」 along with the speaker
Compared with earlier systems, multimodal AI simultaneous interpreting is closer to how humans work. 「Able to see」 means that machine interpreting has visual ability for the first time. According to Sogou, version 3.0 can obtain image information in real time through screen capture or an ordinary camera, with no special equipment required. 「Able to understand and reason」 comes from the Sogou context engine, which draws on Sogou's knowledge graph and encyclopedia-based reasoning. The system links the text obtained through OCR to the core knowledge relevant to the speech, and reasons over the 「Sogou Zhilifang」 (Knowledge Cube) knowledge graph in real time to acquire background knowledge. In addition, the system can obtain Chinese-English bilingual pairs from the Chinese-English corpus of Sogou Encyclopedia to optimize recognition and translation in real time.
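The flow described above (slide text, then related entities from a knowledge graph, then bilingual term pairs) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Sogou's system: `TOY_GRAPH`, `TOY_GLOSSARY`, `expand_terms`, and `build_context` are all hypothetical, the OCR step is replaced by a plain string, and entity spotting is naive substring matching.

```python
# Hypothetical sketch of building per-talk context from slide text.
# The toy graph and glossary stand in for a real knowledge graph and a
# bilingual corpus; none of this reflects Sogou's actual data or APIs.

TOY_GRAPH = {
    "AlphaGo": ["Go", "Lee Sedol", "DeepMind"],
    "Go": ["resign", "komi"],
}

TOY_GLOSSARY = {  # English term -> Chinese term
    "Go": "围棋",
    "resign": "投子",
    "Lee Sedol": "李世石",
}


def expand_terms(seed_terms, graph, max_hops=2):
    """Breadth-first expansion of seed terms through related entities."""
    seen = set(seed_terms)
    frontier = list(seed_terms)
    for _ in range(max_hops):
        next_frontier = []
        for term in frontier:
            for neighbor in graph.get(term, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


def build_context(slide_text, graph, glossary):
    """Spot known entities in slide text, expand them, attach translations."""
    # Naive substring matching; a real system would use proper entity linking.
    seeds = [entity for entity in graph if entity in slide_text]
    terms = expand_terms(seeds, graph)
    # Terms without a glossary entry map to None.
    return {term: glossary.get(term) for term in sorted(terms)}


context = build_context(
    "AlphaGo vs. Lee Sedol: the 2016 match", TOY_GRAPH, TOY_GLOSSARY
)
```

Here a slide that mentions only AlphaGo and Lee Sedol still yields the Go term "resign" together with its Chinese counterpart, which is the kind of background vocabulary the article says the context engine supplies to recognition and translation.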
Sogou said that by acquiring information multimodally while also introducing the knowledge graph, Sogou Simultaneous Interpreting 3.0 improved recognition accuracy on PPT content by 21.7% and translation accuracy by 40.3%.
Beyond conference speeches, the technology behind Sogou Simultaneous Interpreting will be deployed in more scenarios. Teleconferencing, press interviews, live video, travel, and even court trial records are all directions for future work.
Since version 1.0 was released in 2016, Sogou's simultaneous interpreting technology has been upgraded continuously. 「In the translation module behind the simultaneous interpreting system, version 1.0 used an RNN model. In version 2.0 we introduced the Transformer model, which solved the gradient explosion problem and can remember longer history. In version 3.0, in addition to the Transformer, we also use context-based streaming decoding and have introduced a knowledge graph based on Sogou Encyclopedia,」 Zhao Chao said.
At the same time, the industry's common problems should not be overlooked: the accuracy of AI simultaneous interpreting is still some distance from the level of human experts. This is partly due to the limits of existing algorithms, and partly because people hold AI to 「higher requirements」. 「After talking with many simultaneous interpreting practitioners, we found that in the normal process, human interpreters need the organizer to provide background materials in advance and have one or two days to prepare,」 Chen Wei explained. 「Machine simultaneous interpreting has no preparation time. And once interpreting begins, humans can also see the PPT content at the venue. So for machine simultaneous interpreting, in addition to handling speech well, visual information is also very important.」
Behind Sogou Simultaneous Interpreting 3.0 is the deepening of the company's 「natural interaction + knowledge computing」 strategy. Sogou CEO Wang Xiaochuan said recently that the core of Sogou's AI technology is to give machines perception through deep learning so that they can interact naturally with humans, and at the same time to further extract the relationships within language so that machines develop human-like 「cognitive」 ability.
From its initial voice interaction to lip-reading recognition, then machine translation and Sogou's synthesized anchor, and now multimodal interaction, Sogou is relying on voice, images, gestures, and other means to make communication between AI and humans more 「natural」.