
CLDF dataset derived from Grollemund et al.'s "Bantu expansion shows habitat alters the route and pace of human dispersals" from 2015


lexibank/grollemundbantu


CLDF validation

How to cite

If you use these data please cite

  • the original source

    Grollemund, Rebecca, Branford, Simon, Bostoen, Koen, Meade, Andrew, Venditti, Chris, & Pagel, Mark (2015) Bantu expansion shows habitat alters the route and pace of human dispersals. Proc Natl Acad Sci USA. doi:10.1073/pnas.1503793112.

  • the derived dataset using the DOI of the particular released version you were using

Description

This dataset is licensed under a CC BY-NC 4.0 license (https://creativecommons.org/licenses/by-nc/4.0/).

Available online at https://doi.org/10.1073/pnas.1503793112

Conceptlists in Concepticon:

Notes

Language mapping

From Harald Hammarström:

  • I don't know what Os and Dj stand for in B71a_Teke_Os, B71a_Teke_Dj but if they are B71A the id should be Tegue of the Alima/Gabon [teg].
  • Lega is ambiguous between Lega Shabunda and Lega Mwenda. I have reason to suspect this is Lega Shabunda, because that's the one Stappert worked on and I bet they got their wordlist from there.
  • The D20B_Vamba vocabulary (it's been discussed a couple of times in the literature) probably did not come from a native speaker, but it purports to be Amba [rwm], so I've assigned it that id.
  • The D313_Mbuttu_1919 is often taken to be a variant of Vanuma, but that's based on impressionistic comparison. Their paper puts it closer to Bodo, so I've identified it as Bodo.
  • Nyiha and Emakhua are not specified for region, but those bare names usually mean Central Nyiha and Central Emakhua, so I've identified them as such.
  • Based on Philippson's other publications (the data is from him), JE32_Luyia could only be Masaaba, Isuxa, Logooli, or Saamia. Saamia [lsm] is the largest one and also the one the missionaries tried to use for standardisation so one might as well guess JE32_Luyia is Saamia [lsm].
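The assignments that these notes state with an explicit code can be summarised as a small lookup table. This is only an illustrative sketch: the names `ISO_OVERRIDES` and `iso_for` are hypothetical and not part of the dataset, and only the label/code pairs are taken from the notes. Labels for which the notes give no bracketed code (Lega, D313_Mbuttu_1919, Nyiha, Emakhua) are deliberately left out.

```python
# Source labels mapped to the ISO 639-3 codes given in the notes above.
# ISO_OVERRIDES and iso_for are hypothetical names used for illustration only.
ISO_OVERRIDES = {
    # Conditional in the notes: applies only if Os and Dj are indeed B71A.
    "B71a_Teke_Os": "teg",
    "B71a_Teke_Dj": "teg",
    # Purports to be Amba, probably not from a native speaker.
    "D20B_Vamba": "rwm",
    # A guess among several Luyia varieties, as the note says.
    "JE32_Luyia": "lsm",
}


def iso_for(label, default=None):
    """Return the ISO 639-3 code noted for a source label, or `default`."""
    return ISO_OVERRIDES.get(label, default)


print(iso_for("D20B_Vamba"))
print(iso_for("JE32_Luyia"))
```

Keeping the uncertain cases as comments next to the entries makes it easier to revisit them if better information on the sources turns up.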

Consonant Clusters

The orthography profile is only an approximation. In quite a few cases the pronunciation remains ambiguous and we could not decide on a reading. We left these cases as they are, and kindly ask users to check them before running any analysis in which the phonetic transcriptions of this dataset are important.
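To illustrate the kind of ambiguity meant here, the sketch below segments a form by greedy longest match against a set of graphemes, which is a simplified stand-in for how an orthography profile is applied. Everything in it is an assumption made for illustration: the function `segment`, the two grapheme sets, and the form `mbala` are hypothetical and are not taken from this dataset's profile.

```python
def segment(form, graphemes):
    """Greedy longest-match segmentation of `form` into known graphemes.

    Characters that match no grapheme are wrapped in angle brackets so
    that they stand out for manual checking.
    """
    longest = max(len(g) for g in graphemes)
    segments, i = [], 0
    while i < len(form):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(longest, len(form) - i), 0, -1):
            chunk = form[i:i + size]
            if chunk in graphemes:
                segments.append(chunk)
                i += size
                break
        else:
            # No grapheme matched at this position.
            segments.append("<" + form[i] + ">")
            i += 1
    return segments


# Hypothetical profiles: one treats "mb" as a single unit, the other does not.
with_cluster = {"mb", "m", "b", "a", "l"}
without_cluster = {"m", "b", "a", "l"}

print(segment("mbala", with_cluster))
print(segment("mbala", without_cluster))
```

The same written form yields four segments under the first profile and five under the second, so any analysis that counts or aligns segments depends on how such clusters were resolved.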

Statistics

CLDF validation Glottolog: 100% Concepticon: 100% Source: 100% BIPA: 100% CLTS SoundClass: 100%

  • Varieties: 424 (linked to 333 different Glottocodes)
  • Concepts: 100 (linked to 100 different Concepticon concept sets)
  • Lexemes: 37,730
  • Sources: 217
  • Synonymy: 1.00
  • Cognacy: 37,712 cognates in 3,853 cognate sets (1,794 singletons)
  • Cognate Diversity: 0.10
  • Invalid lexemes: 0
  • Tokens: 183,363
  • Segments: 606 (0 BIPA errors, 0 CLTS sound class errors, 600 CLTS modified)
  • Inventory size (avg): 40.85
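A few ratios follow directly from the counts listed above and can serve as a quick sanity check. The sketch below uses only the reported figures. The variable names are illustrative, and the count of lexemes without a cognate judgement assumes that each cognate judgement belongs to exactly one lexeme.

```python
# Figures copied from the statistics above.
lexemes = 37_730
cognates = 37_712
cognate_sets = 3_853
singletons = 1_794
tokens = 183_363

# Average number of segment tokens per lexeme.
tokens_per_lexeme = tokens / lexemes

# Lexemes without a cognate judgement (assumes one judgement per lexeme).
lexemes_without_cognate = lexemes - cognates

# Share of cognate sets that contain a single member.
singleton_share = singletons / cognate_sets

print(f"tokens per lexeme: {tokens_per_lexeme:.2f}")
print(f"lexemes without cognate judgement: {lexemes_without_cognate}")
print(f"singleton share of cognate sets: {singleton_share:.3f}")
```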

Contributors

| Name | GitHub user | Description | Role |
| --- | --- | --- | --- |
| Robert Forkel | @xrotwang | CLDF conversion | Editor |
| Tiago Tresoldi | @tresoldi | CLDF conversion | Editor |
| Johann-Mattis List | @lingulist | orthography profile | Editor |
| Rebecca Grollemund | | data collection | Distributor |
| Mark Pagel | | data analysis | Distributor |

CLDF Datasets

The following CLDF datasets are available in cldf:
