Proof-Reading & Quality Enhancement Cycle
Historically, one of the key problems with machine translation was that the machine consistently made the same mistakes, so post-editors had to make the same corrections over and over. Asia Online is unique in the Statistical Machine Translation (SMT) market with its rich set of post-editing tools, which contribute to a complete human feedback loop for continued translation quality improvement. These tools enable users to correct the translated output, and the Asia Online translation platform analyzes and leverages those corrections to deliver greater quality on future translations.
High-Level Overview of Improvement Process Based Around Human Feedback
The Post-Editing tools provide a rich environment that allows reviewing translators/post-editors to do any or all of the following:
- Identify the most common kinds of errors, e.g. unknown words
- Get direct and immediate access to any web-based dictionary and glossary resources
- Make edits to the SMT system output to correct high-frequency translation errors
- Chat in real time on specific translations to enable basic collaboration on the cleanup process
- Blog about the process or the material being worked on, again to facilitate communication and collaboration on the translation task
- Compare translations by different editors to determine the best approach
- Track different versions of translations for the same segment so that the system can learn the human variances that occur with language
- Track different opinions on the best translation as there is very rarely complete agreement on the “best” translation
The changes made in the Asia Online Post-Editing tool are fed directly back into the translation platform. Once an error is corrected, the corrected data can be used to permanently modify the statistics that determine the best translation, eliminating future occurrences of the same error.
This feature allows Asia Online’s Enterprise Translation Platform to improve each time users provide feedback and, over time, to raise both the base output quality and the overall productivity of translators and post-editors.
Although raw, unedited machine translation can be useful, Asia Online understands that machine translation alone will not produce high enough quality for many kinds of translation tasks. Our tools have been designed as an integrated post-editing and human collaboration environment that enables translation professionals to focus on the finer details of translation.
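The feedback loop described above can be illustrated with a deliberately simplified sketch. It assumes a toy phrase table in which the "statistics" are plain counts of how often a source phrase was seen with each target phrase. The class `ToyPhraseTable`, its methods, and the correction weight are hypothetical names chosen for illustration and do not describe the actual internals of the Asia Online platform.

```python
from collections import defaultdict


class ToyPhraseTable:
    """Hypothetical toy model: counts of source phrase -> target phrase pairs."""

    def __init__(self):
        # counts[source][target] = number of times this pairing was observed
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, source, target, weight=1):
        """Record evidence that `source` translates to `target`."""
        self.counts[source][target] += weight

    def best(self, source):
        """Return the target phrase with the highest count, or None if unseen."""
        candidates = self.counts.get(source)
        if not candidates:
            return None  # unknown phrase: a gap in coverage
        return max(candidates, key=candidates.get)

    def apply_correction(self, source, corrected, weight=5):
        """Feed a post-editor's correction back in as extra evidence.

        The weight of 5 is an arbitrary assumption made for this sketch.
        """
        self.observe(source, corrected, weight)


if __name__ == "__main__":
    table = ToyPhraseTable()
    table.observe("source phrase", "frequent but wrong rendering", 3)
    table.observe("source phrase", "correct rendering", 1)
    print(table.best("source phrase"))  # the wrong rendering wins on counts

    table.apply_correction("source phrase", "correct rendering")
    print(table.best("source phrase"))  # the correction now outweighs it
```

In this sketch a single correction shifts the counts enough that the same source phrase is translated differently from then on, which is the behavior the paragraph above describes in general terms.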
Integrated Post-Editing and Human Collaboration Environment
Ongoing Use and Rapid Quality Evolution
Unlike other SMT or Rule-Based Machine Translation (RBMT) systems that have a very slow and painful error correction process, the Asia Online approach allows the user to take immediate control of how the initial translation engine will improve and evolve. The greater the post-editing effort invested in the initial engine, the more quickly it will improve.
Asia Online offers a number of ways to use our translation platform. Our Enterprise Translation Portal is available as a Software as a Service (SaaS) offering, and is designed to meet the needs of small to mid-sized organizations. Because Asia Online’s Enterprise Translation Portal is a SaaS solution, a large server farm provides the mass of computing resources behind each key process. This enables the platform to respond to key user requirements much more rapidly and easily than traditional machine translation solutions.
The Rapid Feedback & Quality Evolution Cycle
No automated translation engine can be expected to produce human-quality translations out of the box. In our experience, an engine evolves through a deliberate, structured process that follows a predictable path. Asia Online provides a variety of tools and processes to facilitate the enhancement of a customer translation system. The following process is common to most translation engine development projects at Asia Online.
Quality Assurance Process
Asia Online is able to achieve rapid quality improvement because, unlike our competitors' dirty-data approach, we use only clean data to train our engines. With clean data, when an error occurs it is usually due to gaps in knowledge coverage in the data. Errors in SMT systems built on dirty data are usually due to bad data, which is very difficult and costly to remove. Asia Online provides the tools to understand which data was used to determine a translation, which in turn allows gaps to be seen when they occur. These gaps can quickly be filled, ensuring that future translations do not have the same errors. We have a thorough process of machine and human checks to ensure only clean data is used in our engines.
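As a rough illustration of what a gap in knowledge coverage can look like, the sketch below lists words in new source text that never appear in the training data, most frequent first. The function names `tokenize` and `coverage_gaps` and the simple word-level tokenizer are assumptions made for this example. They are not the tools Asia Online provides, and languages without spaces between words would need a proper segmenter.

```python
import re
from collections import Counter


def tokenize(text):
    """Naive word tokenizer (an assumption; real systems use proper segmentation)."""
    return re.findall(r"\w+", text.lower())


def coverage_gaps(training_segments, new_segments):
    """Return (word, count) pairs for words in new text that training never saw."""
    known = {tok for seg in training_segments for tok in tokenize(seg)}
    gaps = Counter(
        tok
        for seg in new_segments
        for tok in tokenize(seg)
        if tok not in known
    )
    # Most frequent unknown words first: these are the gaps worth filling first.
    return gaps.most_common()


if __name__ == "__main__":
    training = ["The pump is running", "Check the valve"]
    incoming = ["Check the pump gasket", "Replace the gasket"]
    for word, count in coverage_gaps(training, incoming):
        print(word, count)
```

Ranking unknown words by frequency gives reviewers a short list of the gaps that affect the most content.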
The Impact of Targeted Error Corrections
The above graphic illustrates how rapidly the raw output of the automated translation system can improve. As a focused error identification and correction strategy is executed, the system can improve in a matter of weeks. The green indicates phrases that are ready to publish directly from the translation platform.
Crowdsourcing and Other Quality Drivers
Because an Asia Online system is a living, constantly changing system, it is always improving in several ways at once. Part of this comes from the growing corpus of clean data underlying the system, and part from our continual improvements to our spell checking, parsing, segmentation and alignment tools and technologies.
There are three ways that the basic translation engines will continue to improve over time and with use:
- New data from Asia Online publisher partners and general data gathering efforts, which expands the number of domains that we cover and adds vocabulary in specialized fields, vernacular and colloquialisms, all of which boost baseline quality.
- Corrective feedback from linguistic experts and translators within your localization team, who can identify the major error patterns and dramatically enhance the overall quality by cleaning these errors out of the system. Typically, by focusing on 20% of the most-used phrases and language, it is possible to make a large positive impact on the bulk of the content.
- Corrective feedback from crowdsourcing (users who visit the site and make suggestions to improve translations). For very large bodies of content that simply cannot be handled by an internal team in any meaningful time frame, Asia Online can also support a broad web-based collaboration environment. We have processes and procedures in place to manage this feedback and control quality, as shown below.
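The idea of concentrating corrective effort on the most-used phrases can be sketched as follows. This is one possible reading of the 20% figure above: rank distinct phrases by how often they occur, take the top 20%, and measure what share of all phrase occurrences they account for. The function name `top_percent_coverage` and the sample data are hypothetical.

```python
from collections import Counter


def top_percent_coverage(phrase_occurrences, top_percent=20):
    """Return the most frequent `top_percent` of distinct phrases and the
    share of all occurrences that those phrases account for."""
    counts = Counter(phrase_occurrences)
    ranked = counts.most_common()
    if not ranked:
        return [], 0.0
    # Integer ceiling of len(ranked) * top_percent / 100, with at least 1 phrase.
    k = max(1, (len(ranked) * top_percent + 99) // 100)
    head = ranked[:k]
    covered = sum(count for _, count in head)
    total = sum(counts.values())
    return [phrase for phrase, _ in head], covered / total


if __name__ == "__main__":
    # Hypothetical occurrence log: one entry per time a phrase appeared.
    log = ["press ok"] * 6 + ["cancel", "open file", "save as", "print"]
    phrases, share = top_percent_coverage(log)
    print(phrases, share)
```

When phrase usage is heavily skewed, as it often is in repetitive content, correcting the short list returned here touches a large share of the text that will be translated.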