Cross-Linguistic Data Format (CLDF) dataset of the comparative numeral data (p. 434) in von Rosenberg's "De Mentawei-Eilanden en Hunne Bewoners" (1853). It is a work in progress and a further exercise in using CLDF to handle and test data from multiple languages.
NB: In this first version (v1.0.0), the word forms are still in the original orthography and have not yet been segmented/tokenised. The next release will attempt to add orthography standardisation and segmentation.
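As a rough illustration of what segmentation involves, the sketch below splits a form into graphemes by greedy longest match against a grapheme inventory. It is not the method used for this dataset: the function name `segment`, the inventory `PROFILE`, and the example form are all hypothetical and are not taken from von Rosenberg's data.

```python
def segment(form, graphemes):
    """Split `form` into segments by greedy longest match against `graphemes`."""
    # Try longer graphemes first so that e.g. "ng" wins over "n" + "g".
    # Empty strings are ignored because they would never advance the position.
    ordered = sorted((g for g in graphemes if g), key=len, reverse=True)
    segments = []
    i = 0
    while i < len(form):
        for g in ordered:
            if form.startswith(g, i):
                segments.append(g)
                i += len(g)
                break
        else:
            # Character not covered by the inventory: keep it as its own
            # segment instead of silently dropping it.
            segments.append(form[i])
            i += 1
    return segments


# Hypothetical grapheme inventory and made-up form, for illustration only.
PROFILE = {"ng", "oe", "a", "e", "n", "g", "b", "t"}
print(" ".join(segment("ngaboe", PROFILE)))  # prints: ng a b oe
```

Joining the segments with spaces mirrors the way segmented forms are usually written out as a space-separated list; a real release would need an inventory built from the forms actually attested in the source.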
Funding
Lexical resources for Enggano, a threatened language of Indonesia