Government Technology

European Commission Makes Computer-Assisted Translation More Accessible



January 18, 2008

The European Commission has made freely available its collection of about 1 million sentences, together with their high-quality translations, in 22 of the 23 official EU languages -- including those of the new member states. It is the largest such collection ever released to cover so many languages.

This kind of data is highly sought after by developers of machine translation systems, in which software "learns" from manually translated texts how words and phrases are correctly translated in context. The data can also help the development of other linguistic software tools such as grammar and spell checkers, online dictionaries and multilingual text classification systems.

The European Union institutions have more multilingual texts than any other organization because of the requirement that EU law exist in each of the EU's 23 official languages. Their translation services work with 253 possible language pair combinations and produce around 1.5 million translated pages a year.
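The figure of 253 is consistent with counting each pair of the 23 official languages once, regardless of translation direction. The article does not say how the pairs were counted, so the short check below is only a sketch under that assumption; the variable names are illustrative.

```python
from math import comb

OFFICIAL_LANGUAGES = 23  # figure given in the article

# Assumption: a "language pair" is unordered, so A<->B is counted once.
unordered_pairs = comb(OFFICIAL_LANGUAGES, 2)

# For comparison: counting each source -> target direction separately.
directed_pairs = OFFICIAL_LANGUAGES * (OFFICIAL_LANGUAGES - 1)

print(unordered_pairs)  # 253
print(directed_pairs)   # 506
```

Counting each translation direction separately would double the total, which is why the unordered reading is the one that matches the article's number.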

Whereas large amounts of translations of English or French texts can be found on the Internet, such resources are scarce for languages such as Latvian or Romanian, and they are practically nonexistent for the combination of two languages for which few resources exist.

Therefore the Commission, through co-operation between its translators and its in-house scientists, is releasing large collections of sentences from legal documents covering technical, political and social issues, available in 22 languages. In this translation repository it is possible to find sentences together with their equivalents in all the other official languages. Only Irish translations are not yet available. This release of language data is a good example of the Commission's open policy of re-use of its information resources and follows the opening of the EU's documentary and terminological databases Eur-Lex and IATE.

Leonard Orban, Commissioner for Multilingualism, says: "By this initiative the European Commission intends to boost human language technologies, support multilingualism and make computer-assisted translation easier, cheaper and more accessible. Citizens belonging to the smaller linguistic communities will have an easier access to documents and Web pages only available in the most used languages."

Janez Potocnik, European Commissioner for Science and Research, says: "This unique collection of language data contributes to the creation of a new generation of software tools for human language processing and helps foster the competitiveness of the language industry, which is already one of the fastest growing industries in the European Union."

The Commission has extensive experience with the development of multilingual text processing tools and is at the forefront of multilingualism, offering publicly accessible news search sites covering up to 35 languages via its European Media Monitoring tool. The 7th Framework Programme for research and development -- in its Information and Communication Technologies strand -- supports research on machine translation and other language-related technologies.

To find more information on the translation data: http://langtech.jrc.it/DGT-TM.html

