EXPERT: EXPloiting Empirical appRoaches to Translation

EXPERT (EXPloiting Empirical appRoaches to Translation) aims to train young researchers, namely Early Stage Researchers (ESRs) and Experienced Researchers (ERs), to promote the research, development and use of hybrid language translation technologies.

Human and automatic translation are an important part of the policy of multilingualism within Europe and EXPERT brings the two together through the development of next generation technologies to address the needs of both translators and EC policy. EXPERT fits within the EC's 2020 strategic framework to promote (i) a digital agenda for Europe, which proposes to better exploit the potential of ICTs in order to foster innovation, economic growth and progress: EXPERT will improve translation practices and enhance the productivity of relevant actors in the translation market by developing new ICT in the field of translation; and (ii) an agenda for new skills and jobs: EXPERT will help modernize the translation labour market by promoting new job profiles such as human translators and post-editors who will have the skills to make use of the latest ICT and translation technologies, along with automated translation researchers and developers. EXPERT will contribute to the general notion that ICT needs to be language-aware and promote content creation in multiple languages. By training young researchers to become future leaders in this area in Europe, as well as producing training material to be used by a number of other professionals and users, EXPERT will contribute to a strong and effective European leadership in the area.

Translation Memory (TM) and Machine Translation (MT) are the two most common technologies used to support human language translation. TMs are interactive systems which aim to help humans during the translation process, by offering suggestions based on matches with previous translations of similar texts for all or parts of the input text (called segments), leaving the unmatched parts for the human translator. MT systems, on the other hand, aim to fully translate the input texts. Most recent research in the MT field has focused on corpus-based (or empirical) approaches, particularly two variants that build translation systems automatically from examples of translations: Example-Based (EBMT) and Statistical Machine Translation (SMT). These approaches are cheaper and faster to develop than rule-based MT, which requires specifying linguistic rules, a costly process usually done manually by experts in both languages.
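The segment matching that a TM performs can be illustrated with a minimal sketch. This is not the matching method of any particular TM product or of EXPERT itself; it assumes a toy in-memory store, a character-level similarity score from Python's standard `difflib` module, and an arbitrary threshold of 0.7. The names `TRANSLATION_MEMORY` and `best_match`, and the example segments, are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: source segment -> stored translation.
TRANSLATION_MEMORY = {
    "Press the power button to start the device.":
        "Pulse el botón de encendido para iniciar el dispositivo.",
    "Remove the battery before cleaning.":
        "Retire la batería antes de limpiar.",
}


def best_match(segment, memory, threshold=0.7):
    """Return (score, source, translation) for the most similar stored
    segment, or None when no stored segment reaches the threshold.

    The threshold of 0.7 is an illustrative assumption, not a standard value.
    """
    best = None
    for source, target in memory.items():
        # ratio() gives a similarity in [0, 1]; 1.0 means identical strings.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None


if __name__ == "__main__":
    # A near match: the stored translation is offered as a suggestion
    # for the human translator to edit.
    print(best_match("Press the power button to stop the device.",
                     TRANSLATION_MEMORY))
    # No usable match: the segment is left for the translator.
    print(best_match("Xyzzy.", TRANSLATION_MEMORY))
```

A segment that falls below the threshold is returned as `None`, which corresponds to the unmatched parts that are left to the human translator.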

TM technology is mainly used by professional translators who are experts in both the language pair and the text domain, to translate repetitive documents. MT technology is mostly aimed at members of the general public who have little knowledge of the source or target language, who translate texts of general domain and genre, and who are mostly interested in getting the gist of the text or a draft translation.

Recently, a number of developments in EBMT and SMT have shown the potential of corpus-based MT approaches for producing fast and low-cost translations, significantly reducing human effort, time and costs. However, according to Allied Business Intelligence, only 1% of the world's translation demand is covered by MT, while the remainder is covered by human translators. The main reason for this figure is that MT tools are not designed to aid professional translators. Shortcomings of MT technologies include user-unfriendly interfaces and a lack of awareness of translators' feedback. Therefore, their great potential has not yet been appropriately exploited. TM systems also have a number of well-known limitations, mainly their poor performance for texts which have not been translated before. Because the two types of translation approaches are viewed as serving diverse and non-overlapping target users, there has been little research into integrating these technologies to provide better solutions. In EXPERT we advocate that there is no clear boundary between supposedly fully automatic translation (i.e. MT) and semi-automatic translation (i.e. TM). We consider instead that both variations are tools to help humans (professional translators and end-users) to produce high-quality, reliable, fast and cheap translations. EXPERT will accommodate the requirements of different types of users, by prioritising its research according to evidence about the needs and problems encountered in real-life conditions by users of translation technologies, including both professional translators and readers of translations.

  • Start date: 1 October 2012

  • Duration: 48 months

  • Consortium:

    • Research Group in Computational Linguistics, University of Wolverhampton, UK (coordinator)
    • Pangeanic, Spain
    • Universidad de Málaga, Spain
    • University of Sheffield, UK
    • Universitaet des Saarlandes, Germany
    • Translated srl, Italy
    • Dublin City University, Ireland
    • Hermes Traducciones y Servicios Lingüísticos, SL, Spain
    • Universiteit van Amsterdam, Netherlands