Project Description
Automatic translation is an undeniable need in a globalized world where communication across several languages is increasingly relevant. Translation Memory (TM) and Machine Translation (MT) systems are the two most elaborate technologies to support human translation. Recent developments in Example-based and Statistical Machine Translation (EBMT and SMT), in particular, have shown the potential of data-driven approaches for producing fast and low-cost translations. A number of user studies have, however, established shortcomings in the state of the art, including poor-quality translations for low-resource languages and interfaces that do not take user requirements and user feedback into account.
We create an Initial Training Network to train young researchers on ways to improve current data-driven translation technologies (TM, SMT and EBMT) by combining them to exploit their individual strengths and by addressing some of the main limitations of each.
Leading academic and industrial partners in all data-driven translation technologies, along with both professional translators and end-users of translation technologies, will support the young researchers of the network throughout the research and development cycle, providing guidance and training in core and complementary skills, and evaluating the resulting technologies.
A comprehensive set of training materials on core and complementary skills developed during this project will be made freely available to other researchers interested in the field. We expect that training researchers in the new skills required to develop and use technologies that can increase productivity and reduce costs in the translation sector, as well as facilitate reliable communication and content creation in multiple languages, will contribute to several aspects of Europe’s ICT development.
The list of EXPERT research projects is available here:
Fellow | Project Title | Host Institution |
---|---|---|
ESR 1 | Investigation of translators’ requirements from translation technologies | UMA |
ESR 2 | Investigation of an ideal translation workflow for hybrid translation approaches | USAAR |
ESR 3 | Collection and preparation of multilingual data for multiple corpus-based approaches to translation | UMA |
ESR 4 | Use of language technology to improve matching & retrieval in translation memories | UoW |
ESR 5 | Use of terminologies and ontologies to improve corpus-based approaches to translation | USAAR |
ESR 6 | Learning from human feedback on the quality of the translations | USFD |
ESR 7 | Estimating the confidence of corpus-based approaches to translation and the quality of the translated texts | USFD |
ESR 8 | Investigation of how each individual corpus-based translation approach (TM, EBMT and SMT) can benefit from each other | DCU |
ESR 9 | Investigation of the ideal infrastructure for computer-aided translation: pipeline with NLP tools for pre/post-processing, SMT, EBMT and TM techniques – a hybrid CAT tool | DCU |
ESR 10 | Exploiting hierarchical alignments for linguistically-informed SMT models to meet the hybrid approaches that aim at compositional translation | UvA |
ESR 11 | Exploiting hierarchical alignments for a semantically-enriched SMT system that offers an extension to existing TMs to allow incremental, recursive partial match of the input using hierarchical constructions containing variables | UvA |
ESR 12 | Investigation of methodologies to evaluate the improved SMT, EBMT and TM prototypes and new hybrid computer-aided translation technology proposed in EXPERT | UoW |
ER 1 | Investigation of automatic methods for collection & preparation of multilingual data | Translated |
ER 2 | Implementation and evaluation (including user aspects) of the improved SMT, EBMT and TM prototypes proposed in EXPERT | Hermes |
ER 3 | Implementation and evaluation of the new hybrid computer-aided translation technology proposed in EXPERT | Pangeanic |