UWEE Tech Report Series

Collection of Bilingual Data for Lexicon Transfer Learning


UWEETR-2016-0001

Author(s):
Leanne Rolston, Katrin Kirchhoff

Keywords:
translation, bilingual data, transfer learning

Abstract

This technical report describes the collection and format of a dataset of bilingual lexicons for 50 languages. The dataset was compiled as part of the DARPA LORELEI project on developing language technology for low-resource languages. We describe the data sources, the collection method, and the types of linguistic information included in the lexicons.
