TY - JOUR
AU - Gaspers, Judith
AU - Karanasou, Penny
AU - Chatterjee, Rajen
AD - Amazon, Aachen, Germany (gaspers@amazon.de); Amazon, Cambridge, UK (pkarana@amazon.co.uk); University of Trento, Trento, Italy (chatterjee@unitn.it)
AB - This paper investigates the use of Machine Translation (MT) to bootstrap a Natural Language Understanding (NLU) system for a new language for the use case of a large-scale voice-controlled device. The goal is to decrease the cost and time needed to get an annotated corpus for the new language, while still having a large enough coverage of user requests. Different methods of filtering MT data in order to keep utterances that improve NLU performance and language-specific post- [...] we investigate the use of Machine Translation to translate existing data sources to a new target language and use them to bootstrap an NLU system for this target language. A common procedure for data gathering for a new language starts by some grammar-generated data. Significant time and effort is consumed at this stage by language specialists to build grammars that offer a good coverage needed for a first working system. Once this first system reaches a certain performance threshold, it can be shared with [...]
TI - Selecting Machine-Translated Data for Quick Bootstrapping of a Natural Language Understanding System
JF - Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)
DO - 10.18653/v1/n18-3017
DA - 2018-01-01
UR - https://www.deepdyve.com/lp/unpaywall/selecting-machine-translated-data-for-quick-bootstrapping-of-a-natural-zjEmDdLgq4
DP - DeepDyve
ER -