
Page Manager: Webmaster
Last update: 9/11/2012 3:13 PM


Bootstrapping Language Description: The case of Mpiemo (Bantu A, Central African Republic)

Conference paper
Authors: Harald Hammarström, Christina Thornell, Malin Petzell, Torbjörn Westerlund
Published in: Proceedings of the 6th edition of the Language Resources and Evaluation Conference (LREC 2008), 28-30 May 2008, Marrakech, Morocco
Publication year: 2008
Published at: Department of Computer Science and Engineering, Computing Science (GU); Department of Oriental and African Languages
Language: en
Keywords: Acquisition, Machine Learning, Endangered languages, Language modelling
Subject categories: Computer Science, African languages


Linguists have long been producing grammatical descriptions of as yet undescribed languages. This is a time-consuming process, which has already adapted to improved technology for recording and storage. We present here a novel application of NLP techniques to bootstrap the analysis of collected data and speed up manual selection work. More precisely, we argue that unsupervised induction of morphology and part-of-speech analysis from raw text data is mature enough to produce useful results. Experiments with Latent Semantic Analysis were less fruitful. We exemplify this on Mpiemo, a so far essentially undescribed Bantu language of the Central African Republic, for which raw text data was available.
