Government Technology

European Commission Makes Computer-Assisted Translation More Accessible



January 18, 2008

The European Commission has made freely available its collection of about 1 million sentences, and their high-quality translations, in 22 of the 23 official EU languages -- including those of the new member states. It is the largest such collection ever to cover so many languages.

This kind of data is highly sought after by developers of machine translation systems in which automatic translation software "learns" from manually translated texts how words and phrases are correctly and contextually translated. The data can also help the development of other linguistic software tools such as grammar and spell checkers, online dictionaries and multilingual text classification systems.

The European Union institutions have more multilingual texts than any other organization because of the requirement that EU law exist in each of the 23 official languages. Their translation services work with 253 possible language pair combinations and produce around 1.5 million translated pages a year.
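The figure of 253 follows from counting unordered pairs of languages. A minimal sketch of that arithmetic, where the function name `language_pairs` is purely illustrative and each pair is assumed to be counted once regardless of translation direction:

```python
from math import comb


def language_pairs(n_languages: int) -> int:
    """Number of unordered language pairs among n_languages languages."""
    # Choosing 2 languages out of n, ignoring direction: n * (n - 1) / 2.
    return comb(n_languages, 2)


print(language_pairs(23))  # 253, all 23 official languages
print(language_pairs(22))  # 231, the 22 languages covered by the release
```

Counting each translation direction separately would double these totals.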

Whereas translations of English or French texts can be found in large numbers on the Internet, such resources are scarce for languages such as Latvian or Romanian, and they are practically nonexistent for pairs of languages that both have few resources.

Therefore, the Commission, through co-operation between its translators and its in-house scientists, is releasing large collections of sentences from legal documents covering technical, political and social issues that are available in 22 languages. In this translation repository it is possible to find sentences with their equivalent in all other official languages. Only Irish translations are not yet available. This release of language data is a good example of the Commission's open policy of re-use of its information resources and follows the opening of the EU's documentary and terminological databases Eur-Lex and IATE.
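To illustrate what "sentences with their equivalent in all other official languages" makes possible, here is a rough sketch of pulling a bilingual sentence list out of multilingual aligned units. The data layout, the name `extract_pairs` and the example sentences are assumptions made up for illustration; they are not taken from the released data or its actual file format.

```python
from typing import Dict, List, Tuple

# One aligned unit maps a language code to the same sentence in that language.
# These sentences are invented examples, not entries from the released corpus.
TranslationUnit = Dict[str, str]

units: List[TranslationUnit] = [
    {
        "en": "The regulation enters into force.",
        "fr": "Le règlement entre en vigueur.",
        "de": "Die Verordnung tritt in Kraft.",
    },
    {
        "en": "Member States shall inform the Commission.",
        "fr": "Les États membres informent la Commission.",
    },
]


def extract_pairs(
    units: List[TranslationUnit], source: str, target: str
) -> List[Tuple[str, str]]:
    """Return (source, target) sentence pairs for units that have both languages."""
    return [(u[source], u[target]) for u in units if source in u and target in u]


# A pair that does not involve English can be built directly from the same units.
for src, tgt in extract_pairs(units, "fr", "de"):
    print(src, "|", tgt)
```

Because every unit is aligned across languages, the same collection can serve any language pair, including pairs for which little bilingual text otherwise exists.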

Leonard Orban, Commissioner for Multilingualism, says: "By this initiative the European Commission intends to boost human language technologies, support multilingualism and make computer-assisted translation easier, cheaper and more accessible. Citizens belonging to the smaller linguistic communities will have an easier access to documents and Web pages only available in the most used languages."

Janez Potocnik, European Commissioner for Science and Research, says: "This unique collection of language data contributes to the creation of a new generation of software tools for human language processing and helps foster the competitiveness of the language industry, which is already one of the fastest growing industries in the European Union."

The Commission has extensive experience with the development of multilingual text processing tools and is at the forefront of multilingualism, offering publicly accessible news search sites covering up to 35 languages via its European Media Monitoring tool. The 7th Framework program for research and development -- in its Information and Communication Technologies strand -- supports research on machine translation and other language-related technologies.

To find more information on the translation data: http://langtech.jrc.it/DGT-TM.html

