Language evolution and climate: the case of desiccation and tone

Caleb Everett, Damián E. Blasí, Seán G. Roberts
DOI: http://dx.doi.org/10.1093/jole/lzv004, pp. 33-46. First published online: 22 February 2016


We make the case that, contra standard assumption in linguistic theory, the sound systems of human languages are adapted to their environment. While not conclusive, this plausible case rests on several points discussed in this work: First, human behavior is generally adaptive and the assumption that this characteristic does not extend to linguistic structure is empirically unsubstantiated. Second, animal communication systems are well known to be adaptive within species across a variety of phyla and taxa. Third, research in laryngology demonstrates clearly that ambient desiccation impacts the performance of the human vocal cords. The latter point motivates a clear, testable hypothesis with respect to the synchronic global distribution of language types. Fourth, this hypothesis is supported in our own previous work, and here we discuss new approaches being developed to further explore the hypothesis. We conclude by suggesting that the time has come to more substantively examine the possibility that linguistic sound systems are adapted to their physical ecology.

  • climate
  • tone
  • humidity
  • cultural evolution
  • interaction

1. Introduction

One of the chief distinguishing features of our species is its aptitude for adapting to its environment. Beyond the evidence of our adaptation at the genotypic and phenotypic levels, the reflections of our cultural and behavioral adaptation are pervasive. Humans adapt to their ambient conditions at every observed level in a broader sense, from ways that facilitate the transmission of an individual’s genes, to ways that enable the survival of cultures (Boyd et al. 2011). Perhaps more than any other species, we are in a very real sense adapted for adaptation, and few would question that this overarching capacity for adaptation is a sine qua non of the human condition. This characteristic, brought to the fore by human language and culture, facilitated our genus’ migration from Africa and the subsequent global circum-ambulation and conquest that followed.

Unsurprisingly, climate plays a major role in our intra-species genotypic and phenotypic variation, as it does for other species. Cross-population variations in size, surface area-to-volume ratio, pigmentation, and the like, present advantages associated with particular ecological constraints (e.g., Wells 2012). The same is true of non-latitude contingent variations such as the reduced hemoglobin levels evident in the bloodstream of high-elevation Himalayan populations (Beall et al. 2012).

Even more tellingly for our purposes, human behavior is replete with signals of ecologically motivated behavioral variation. This variation is evident at nearly every observed stratum of human behavior. From sexual practice to subsistence strategy to diet, there are signs of environmental influences on human behavior (e.g., Nettle 2009). These influences yield behavioral changes that foster survival benefits in most cases, even if the immediate motivation for adaptive behavior is often associated with discomfort avoidance (e.g., some cross-group clothing disparities).

While our culturally mediated behavior may be distinctly adaptive, of course other species adapt behaviorally as well. Interestingly, their behavioral adaptation is well known to include intra-species variations in communication strategies, as we note in Section 2. Despite the pervasive adaptation at nearly every inspected level of the human experience, and in the face of the pervasive adaptation evident in the communication of other species, there exists a standard theoretical presumption in linguistics that language is not ecologically adaptive at any meaningful level. In fact, language is presumed to be ecologically autonomous by most language researchers, with statements to that effect occasionally offered in introductory texts, typically without any buttressing data (e.g., Kaye 1989). Prima facie, we submit, this ‘autonomous’ position is actually problematic. As Nettle (2009) notes, it is a matter of theoretical presumption, not a research finding. Theoretical presumptions may be well motivated, and there are understandable reasons why most linguists consider language autonomous, impervious to external constraints. The current autonomous position seems to have been motivated in part by the rejection of quasi-racist, deterministic views espoused in the eighteenth and nineteenth century (e.g., Crowley and Bowern 2010, 13). Such facile rejections do not constitute evidence in their own right, however, that language is actually immune to ecological influences. In short, the autonomous position is, from our perspective, poorly grounded—however understandable it may be. Actual empirical inquiry into the potential ecological adaptability of linguistic form is only now in its nascent stages. We submit that, to truly settle this issue and in so doing have a more comprehensive understanding of both the development of languages and the range of human adaptability, this inquiry must be expanded rather than ignored. 
It is insufficient to dismissively point out, for instance, that ‘any language can be learned anywhere.’ The time for over-simplifying aphorisms defending an autonomous view has passed, and the time for the substantive exploration of relevant linguistic-geographic data is here.

In the next section, we offer crucial background on the ecological adaptability of other animals’ communication, as well as evidence for indirect ecological influences on human communication through population-mediated effects. We then present our case that ecological influences on language actually include direct effects on their sound systems, at least in the case of the tonal systems of languages.

2. Background on adaptive communication

2.1 Ecological influences on animal communication

There is considerable evidence that the signals of animals are adapted for communication in their particular environment. In general, the signals adapt to have high efficacy—a high probability of being transmitted and received effectively (Guilford and Dawkins 1991), and ecological conditions provide a major evolutionary pressure. For example, chemical signals used by insects evaporate over time, and species which live in hotter, more humid climates use chemicals that are more resistant to evaporation in order to ensure longevity of the signal (Alberts 1992). In the case of sound signals, the particular acoustic properties of the environment have a critical bearing on the optimal frequency and bandwidth for transmission. Similarly, some environments (like forests or jungles) present a dense number of obstacles for spreading soundwaves, which results in reverberation. Several animal species, including birds, anurans, spiders, and some mammals, adapt their signals by adjusting their frequencies and their duration in order to overcome these obstacles (Morton 1975; Hunter and Krebs 1979; Wilkins et al. 2013). Noise in the soundscape also leads to strategies such as increases of rate, duration and amplitude of the signal, all of which might be tuned to the noise level—for instance, some fish species with nocturnal behavior display acoustic signals that are markedly nonoverlapping with each other (in contrast to diurnal species) presumably due to a lesser reliance on ancillary visual cues during night (Ruppé et al. 2015). Temperature and humidity also affect acoustic absorption, with the ideal frequency for detectability changing in a complex way according to the local climate. The characteristics of animal signals should adapt to these constraints (e.g., Griffin 1971). 
For example, bats adapt their signals by restricting their frequency to the range least affected by local conditions (including variation between seasons, Snell-Rood 2012).1 The absolute frequencies of bat communication (and echolocation) also differ in accordance with ambient humidity levels, as evidenced by recent findings on the South African Cape horseshoe bats (Guillén et al. 2000; Jiang et al. 2010; Odendaal et al. 2014). Given the ubiquity of ecological adaptation in animal communication alone, its potential existence in human communication merits serious inquiry.

2.2 Previously suggested indirect influences of ecology on human communication

Previous research has offered evidence for indirect external influences on linguistic form (Fig. 1). Climate is known to influence human genetic evolution and population-level factors such as size, density, contact, and migration patterns, which have in turn been claimed to impact the development of languages. Most dramatically, perhaps, during the Miocene (23-5 million years ago), the climate of Eastern Africa changed acutely, becoming cooler and drier as jungle was slowly transformed into savannah. It was during this period that the ancestors of humans diverged from the rest of the apes, with several theories suggesting that climatic influence served as a principal motivator of this divergence. The emergence of bipedalism would have increased travel efficiency in this new climate (Wheeler 1985; Steudel 1996), making it possible for the ancestors of humans to maintain larger group sizes (Isbell and Young 1996). Both of these aspects have been suggested as pre-adaptations for language (Dunbar 1993; MacWhinney 2005), signaling that climate likely played a role in shaping the very conditions for the emergence of language. This conclusion is supported by a growing body of research suggesting that periods of inordinate climate change occurred in Africa around 2.6, 1.8, and 1 million years ago. These climatic oscillations, ultimately due to variations in the earth’s orbit, were likely pivotal to the eventual speciation of Homo sapiens (Shultz and Maslin 2013).

Figure 1

Potential causal connections between climate and language change. The path of causality linking climate and tone is highlighted in red. Boxes represent processes and arrows represent the direction of causality. Processes further to the left of the diagram have a more short-term effect than those further to the right. Climate can affect local carrying capacity, food production, and disease. Following a model from Michaelowa (2001: 212), this has a variety of effects on individuals and populations which eventually lead to differences in demography, migration, and contact, leading to language change. Climate affects the ecology, including the interface of communication (e.g., plant cover or acoustic absorption), affecting perception. It can also directly affect the physical articulators of language. Both of these create a selective pressure on linguistic items which affect their cultural diffusion. The ecology can affect the communicative needs of a community, leading to a selective pressure to express certain semantic distinctions. The selective pressures eventually lead to language change. Climate can also directly affect survival, creating a selection pressure, or population bottlenecks which can lead to genetic and, therefore, physiological changes. These may take place over longer time spans than the linguistic changes. There may also be several feedback loops; for example, the genetic changes may affect production (e.g., adaptations to cold climates affect the morphology of the nose), perception, or survival. Technological innovations may also mediate the effects of climate, as well as lead to climate change, which may have knock-on effects on migration and contact. Production of artifacts may also affect communication needs. In more recent times, technological innovations may also mediate cultural diffusion through communication technologies.

Climate continued to shape evolution within the human lineage, even within H. sapiens (e.g., Cavalli-Sforza et al. 1994: 142–5). For instance, the morphology of the human nasal cavity adapted, so that populations in drier, colder climates have higher and narrower cavities that increase the contact between inspired air and nasal walls, helping to humidify inhaled air (Noback et al. 2011; Evteev et al. 2014, though see Betti et al. 2009). These adaptations could have small effects on nasal sounds used in speech production, though this remains unclear. It is not an unreasonable suggestion, however, particularly in the light of recent work offering evidence for population-level anatomical biasing in the production of some sounds, namely clicks (Moisik and Dediu 2015).

Population contact and movement are essential factors in linguistic change, and these too can be affected by climate (e.g., Jones et al. 2001; Tyson et al. 2002), as can population levels more generally (Tallavaara and Seppä, 2011).2 Researchers have suggested that population level, in turn, correlates negatively with morphological complexity (Lupyan and Dale 2010), and that it correlates positively with size of phonemic inventory (Hay and Bauer 2007). The latter claims remain controversial (Wichmann et al. 2011; Moran et al. 2012) but are at least suggestive of the indirect influence of climate on linguistic form.

Nichols (1992, 1997) suggests that climate is a causal factor in contact phenomena. Contact between groups, which can lead to lower rates of linguistic diversity, is more common in high latitudes and arid continental interiors, and less common in more rugged terrain. In line with this, the range and density of linguistic groups is predicted by latitude and climatic conditions (Mace and Pagel 1995; Currie and Mace 2012). A direct test of the effect of climate on linguistic diversification was carried out in a study of the spread of the Uralic family (Honkola et al. 2013). This study reconstructed phylogenetic trees of descent in Uralic and estimated the relevant time depths of the tree nodes. The largest number of estimated ‘speciation’ events—namely, branching in the genealogical tree representing the shared history of the languages—co-occurred with changes in average temperature, suggesting that climate change caused massive population movement leading to diversification.

More relevant to our study, recent research has suggested that the phoneme inventories of a language may be affected by climate.3

Several studies have claimed that the overall sonority of a language correlates positively with the warmth of its speakers’ natural environment. Fought et al. (2004) and Munroe et al. (2009), inter alia, suggest that, because communities in warmer climates spend more time outdoors, they must communicate over relatively large distances. In these environments, sonorous speech sounds are putatively more adaptive because they carry over longer distances. This hypothesis was initially supported by a very modest sample of diverse languages, and has since been buttressed by analysis of a larger database (Maddieson et al. 2011). It is based on a suggested indirect influence of climate on language, since the direct motivator for the cross-linguistic variance is supposedly variance in inter-speaker distance (and associated acoustic interference) during communication. We note that the potential effect of climate on language may extend to other modalities. For example, Schuit (2012) discusses the possible impact of climate on sign languages. Though the author does not find evidence of such an effect on the specific phonological factors considered, they do observe that climate impacts the language since ‘communication outside tends to be brief’ (Schuit 2012: 202).

More trivially, climate is claimed to affect the relevance of certain conceptual distinctions, motivating the adaptation of semantics, the lexicon or metaphor to fit communicative needs (e.g., the semantics of temperature, Koptjevskaja-Tamm and Rakhilina 2006). Indirectly, Witkowski and Brown (1985) and Brown (2013) suggest climatic factors influence the likelihood that a language lexicalizes a distinction between ‘hand’ and ‘arm’, since languages in cold regions are more likely to be spoken by people wearing long-sleeved clothing that yields a clearer discrete categorization at the wrist.

3. Tone-absence and desiccation

In arguably the most substantive foray into the exploration of the language–climate nexus, we recently published a paper demonstrating a robust statistical association between ambient desiccation and the absence of lexical tone (Everett et al. 2015). Through various strategies, from simple intra-linguistic-family and intra-regional regressions to cross-isolate comparisons to global Monte Carlo analyses, we demonstrated that the association was clear and not the result of confounds, such as language or areal relatedness between particular data points. Furthermore, we offered a brief meta-analysis of relevant studies from laryngology. These studies, previously uncited in the linguistic literature, suggest clearly that ambient air with very reduced specific humidity yields a variety of effects on human phonation. These include increases in phonation threshold pressure and perceived phonation effort, as well as increases in jitter and shimmer (see Leydon et al. 2009 for one review). We refer the reader to Everett et al. (2015) for a more detailed discussion of these factors, but it is worth mentioning here that at least some of the effect of desiccated air is due to the evaporation of the airway surface liquid coating the vocal folds and other parts of the vocal tract, evaporation which can result in increased viscosity of the vocal folds’ surface liquid. Severe ambient dryness can yield dry, relatively inelastic vocal folds that are harder to manipulate. This difficulty of manipulation manifests itself, at least partially, in increased imprecision of fundamental frequency (Hemler et al. 1997).

Given the heightened articulatory effort and imprecision associated with phonation in desiccated contexts, we suggested that the clear avoidance of complex tonality in arid contexts is unlikely to be a matter of coincidence. Since fundamental frequency plays such a prominent role semantically in languages with complex tone, our conjectured causal relationship was, we think, both plausible and investigable via further experimental inquiry. After all, ease of articulation is well known to influence the typological distribution of certain sound patterns. Voiced velar plosives are less frequent than their alveolar counterparts at least in part because it is more difficult to maintain the reduced supralaryngeal air pressure requisite for voicing when air is stopped at the velum rather than at the alveolar ridge. The same could be said for numerous other patterns in the world’s sound systems, and the tradeoff between articulatory difficulty and cross-linguistic frequency is also present in sign languages (Napoli et al. 2014). We have simply suggested [as in Everett’s (2013) study of ejectives and elevation] that characteristics of the air in a given environment likely impact the ease of articulation of particular sounds, namely tonal sequences relying on precise pitch modulation for the construction of meaningful units. Given the laryngology data demonstrating the comparative inelasticity of the vocal folds in dry contexts (both ex vivo and in vivo), this suggestion is ultimately grounded in experimental data.

The functional load of fundamental frequency and pitch is generally higher in tonal languages, particularly those with complex tone. Of course, many non-pitch phenomena are associated with the production of tone, including ancillary laryngealization and duration influences (e.g., Moisik et al. 2014). Yet, the heightened role of F0 (and therefore pitch) in languages with complex tone is evident in the fact that its fine-grained modulation is required on every or almost every syllable, in contrast to pitch accent languages (in which this burden typically affects at most one syllable per word) or non-tonal languages (where pitch modulation is mostly used to convey pragmatic information at the phrasal level). Furthermore, there is evidence that speakers of languages with complex tone exhibit superior performance in pitch-recognition tasks, both in linguistic (Caldwell-Harris et al. 2015) and non-linguistic (Peng et al. 2013) tasks. This too is suggestive of the greater reliance on precise pitch patterns in languages with tone and particularly complex tone.4

In Everett et al. (2015), we offered a variety of statistical tests of two large global databases (ANU’s phonotactics database, Donohue et al. 2013, and The World Atlas of Language Structures, Dryer and Haspelmath 2013), representing over 3700 languages. These ranged from simple intra-linguistic-family regressions to cross-isolate comparisons to global Monte Carlo analyses. The results consistently offered support for our hypothesis that complex tonality should be disfavored in arid contexts, particularly extremely arid regions. The hypothesis and conclusions were widely covered and discussed, and received positive responses from numerous language researchers.5 Unfortunately, many other responses appeared to address claims in media reports of the work, rather than seriously engaging with the work itself. One relatively frequent reaction to the work seemed to be one of simple disbelief, and many linguists suggested the correlation we had drawn attention to was spurious. Other skeptical reactions included references to particular counter-examples or to disagreements about the nature of the databases employed, or even to the quantitative usage of such databases. Many of these responses failed to engage with the general approach: experimental evidence from laryngology motivated a testable hypothesis, which was supported with empirical data (in contrast, the previous study on potential direct influences of ecology on sounds, Everett 2013, relied more heavily on correlational data). These responses illustrate the prevalence of the autonomous position. Additionally, the bulk of linguistic research in the twentieth century abstracted away from the physical world, and focused on language as a formal system where constraints were primarily other aspects of language. The generativist approach also emphasized universal properties of language, rather than variation. This means that studies that investigate variation in language based on differences in variables related to geography, demography, processing, physical morphology, or genetics face resistance to integration into established theory.6 In extreme cases, some language researchers assume domains outside of formal aspects of language are unimportant or uninformative (Hauser et al. 2014). Such an autonomous position is tempting but nevertheless inadequately supported. In fact, it is arguably an empirically impoverished position since there are no clear data demonstrating that language is not ecologically adaptive, and since linguistic theory has not seriously engaged with the possibility of ecological influences on language.

3.1 An evolutionary hypothesis

In this section, we lay out a theory about how desiccation affects the cultural evolution of lexical tone. The hypothesis involves processes at different time scales. A physical mechanism is hypothesized to affect individual aspects of production in the short term, but these accumulate into long-term language change. A range of mechanisms might be involved from production, perception, and interaction to cultural evolution, complicating the construction of associated predictions. Given this scope, the current hypothesis will not be entirely fleshed out, but we hope to give an overview of the process.

Based on the laryngology evidence above, we assume a mechanism whereby ambient dry air causes dehydration of the vocal folds, and that this dehydration reduces the accuracy with which the larynx can be controlled. This increases the production effort and the likelihood of producing a noisy pitch signal, creating a bias against using complex tones in dry areas. Admittedly, the precise effect here needs to be fleshed out.

The observed patterns could be due simply to the inhibition of precise phonation, or to the inhibition of phonation more generally. After all, laryngology studies have shown most clearly that perceived phonation effort and phonation threshold pressure are increased in desiccated contexts. Munroe et al. (2009) suggest that louder sounds (i.e., with more phonation) are less common in colder (typically drier) regions, so it is not unreasonable to question whether the tonal patterns we have documented are associated with a larger pattern of reduced functional load of phonation in desiccated contexts. This could include combinations of tone types in running speech which lead to large changes in tone. Alternatively, desiccation appears to have clearer deleterious effects at extreme pitch ranges (Leydon et al. 2009; Patel et al. 2015), so the possibility of greater influences on particular tone types should be explored. Another possibility is that tones with the maximum range or dynamics would be most affected.7 These are speculative points, but they represent precisely the sort of investigable issue that we hope researchers will begin addressing.

In order to link this effect to wider change, we take the perspective that the locus of language change is the production and perception of individual utterances in conversation (e.g., Croft 2000; Enfield 2014). In this view, linguistic constructions are the units on which selection applies. Units ‘replicate’ by being used in utterances in conversation, making them available for further replication by other speakers. These constructions vary in form and function, and are in competition given limited time resources, a pressure for efficient communication and the potential for roughly the same meaning to be expressed in many ways. These factors set the scene for Darwinian cultural evolution. In the case of pitch, linguistic elements which require careful control of pitch should be less likely to be replicated in dry climates.

For this bias to lead to language change, phonetic effects on individual items need to propagate in a way that affects the whole phonological system. Tonogenesis and the loss of tone are complex processes, and can be influenced by processes outside of lexical tone. For now, we assume that high production effort can lead to tone leveling given enough time. Computational modeling of tone change (e.g., Kirby 2014) could help articulate and test the latter two aspects. This leads to a prediction that languages in dry climates should be statistically less likely to exhibit lexical tone.

3.2 Potential diachronic mechanisms

There are several selectional mechanisms by which the observed patterns could come about. One example is based on the effort of production, discussed above, but an alternative (and not mutually exclusive) pressure may come from the potential for miscommunication. Problems in production or perception in a system where small distinctions in pitch affect the interpreted meaning could lead to confusion between lexical items, leading to a cost either from misunderstandings or from having to take time to repair the misunderstanding. This may be more or less likely given the structure and number of possible tone contours. A prediction would be that errors will be more likely for tone contours which vary most in range, or in dynamics. This depends on how confusable lexical items are, which can be affected by the phonetic density of words, word frequency, and predictability in context. In the case of pitch, items which rely on distinctions in pitch should be less likely to be ‘replicated’ from utterance to utterance in dry conditions.

An alternative selectional mechanism is acquisition based. The effect of climate on tone could make learning phonemic contrasts more difficult in very dry areas. While pitch may be an important cue in learning (Filippi et al. 2014), this mechanism may be difficult to investigate. The physical development of infant articulation no doubt has much larger effects on production than any effects from climate. Additionally, differences in climate are confounded with cultural differences, including social factors shaping learning environments. L2 acquisition may be a more feasible line of investigation. It is certainly not impossible to learn a tone language in a dry environment, but adult acquisition is sensitive to psychological aspects, such as confidence and motivation (e.g., Dörnyei 2006). If sounds are harder to produce or perceive due to dry air, adult learners may find them harder to learn. In theory, this is testable by looking at learning performance over a range of climates, but is also subject to cultural confounds.

A seemingly more plausible potential mechanism, associated with L2 acquisition, is the following: words with complex tonal contrasts are less likely to be adopted by languages without such contrasts, in desiccated contexts. In actuality, there is still debate about how tone has become such a regional phenomenon (Maddieson 2013), and how it is adopted across language families (e.g., Ratliff 2002). Yet, one commonly accepted mechanism for tone transfer is the adoption of words with tonal contrasts into other languages. A recent comprehensive study suggests that, on average, about 25 per cent of words in a given language are borrowed, and in some cases this figure is much higher (Haspelmath and Tadmor 2009). As words with tonal contrasts are ‘borrowed’ into non-tonal languages, there are numerous effects of native phonologies on the borrowed form. Ultimately, this cross-linguistic transfer is dependent on the replication of variants produced by native and non-native speakers, variants which differ in the extent to which they accurately replicate the relevant tonal contrasts. Given that such replication requires precision of pitch at the morphemic level, particularly when words are produced in isolation without tone sandhi and other contextual effects, we might posit that the faithful replication of precise pitch sequences, already a potentially onerous task for speakers of non-tonal languages, is less likely if environmental context places pressures against that replication. In other words, as word transfer takes place iteratively over numerous cross-linguistic contexts, and as tonal patterns are emulated, this interactional emulation could quite well be deleteriously impacted by the heightened difficulty of phonation, and particularly precise phonation, in very desiccated contexts. This suggested ‘inhibited borrowing’ mechanism is in some ways testable, perhaps through iterated learning experiments in which ambient humidity is systematically varied.

Finally, it is possible that tonal languages are less likely to emerge in the first place in dry regions. This is compatible with the synchronic evidence, but requires diachronic evidence to distinguish it from the opposite direction of change.

3.3 Strength of the bias

In the two sections above, we have discussed how climate might exert an evolutionary pressure on the cultural evolution of language. However, some researchers may doubt that a subtle effect on production can yield pervasive global trends. It is easier to believe that the effects of desiccation may apply, but be too weak to cause a difference, or be overridden by other pressures. For example, drier climates may lead to the evolution of physical systems that combat laryngeal dehydration, or to cultural practices, such as specialized breathing techniques, that maintain hydration (though we know of no studies on these issues). Cultural innovations such as permanent shelter, food preservation, and clothing may also shield groups from the pressures of climate. For example, Currie and Mace (2012) find only weak effects of climate on population range and density, but a moderate effect for societies whose primary subsistence method is foraging. Even marked differences between populations do not always imply a strong effect. For example, as mentioned above, there is striking phenotypic variation in the shape of the nasal cavity between populations (Noback et al. 2011), yet we doubt this has an effect on the phonology of the languages. Nevertheless, it might, and it likely impacts the formant structures of nasal consonants and vowels (interestingly, spectral effects have been demonstrated for nasal airflow produced at high elevation, in Oghan et al. 2010). The anatomy of the vocal tract also varies between cultures in ways that could affect production (Esling et al. 2015), including possibly compensation mechanisms.

These factors suggest that the effect of climate should be subtle, as are many of the influences on human behavior and communication. We would argue that evidence of the predicted pattern (tonal languages being rare in dry climates) is enough to suggest that the mechanism has a salient effect and is worth investigating. This is not to say that the effect size of desiccation on production need be very large, nor does it need to apply constantly. As studies of cultural evolution have shown, a small bias can be amplified into a strong trend by repeated application in a cultural system (Kirby et al. 2007).
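
The amplification argument can be illustrated with a deliberately simple sketch. It is not the Bayesian iterated learning model of Kirby et al. (2007); it is a two-state recursion over generations, and every rate in it (`GAIN`, `LOSS_HUMID`, `LOSS_DRY`) is a hypothetical value chosen only for illustration.

```python
def tonal_share(loss_rate, gain_rate, generations, start=0.5):
    """Expected share of languages with complex tone after repeated transmission.

    Each generation, a tonal language loses complex tone with probability
    loss_rate and a non-tonal language gains it with probability gain_rate.
    """
    share = start
    for _ in range(generations):
        share = share * (1.0 - loss_rate) + (1.0 - share) * gain_rate
    return share


# Hypothetical per-generation rates (assumptions, not estimates from data).
GAIN = 0.01
LOSS_HUMID = 0.01
LOSS_DRY = 0.02  # one extra percentage point of loss per generation

humid = tonal_share(LOSS_HUMID, GAIN, generations=500)
dry = tonal_share(LOSS_DRY, GAIN, generations=500)
print(f"humid: {humid:.3f}, dry: {dry:.3f}")
```

Under these assumed rates, a difference of one percentage point in the per-generation chance of losing tone moves the long-run share of tonal languages from one half toward one third, since the recursion settles at gain / (gain + loss).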

The subtlety of these effects also has implications for statistical approaches to finding evidence. Robust evidence is likely only to be found in large, cross-cultural samples of languages. This type of data has its own pitfalls, such as the inflation of correlations due to historical relatedness (Roberts and Winters 2013) or the increased noise-to-signal ratio (Taleb 2012). Advanced statistical techniques can help avoid these problems, such as mixed effects modeling, phylogenetically weighted regression, Monte Carlo tests, and tests which use baselines tailored to the actual data, such as permutation tests or serendipity tests (Roberts et al. 2015).
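
As a rough illustration of a permutation test with a baseline tailored to the data, the sketch below shuffles tone labels only within language families, so that family composition is preserved in every permuted sample. The function names, the toy values for `humidity`, `tonal`, and `family`, and the choice of test statistic are all assumptions made for this example; they are not the procedure or data used in Everett et al. (2015).

```python
import random
from collections import defaultdict


def mean_diff(humidity, tonal):
    """Mean humidity of tonal languages minus that of non-tonal languages."""
    yes = [h for h, t in zip(humidity, tonal) if t]
    no = [h for h, t in zip(humidity, tonal) if not t]
    return sum(yes) / len(yes) - sum(no) / len(no)


def permutation_p_value(humidity, tonal, family, n_perm=2000, seed=0):
    """One-sided p-value; tone labels are shuffled only within families."""
    rng = random.Random(seed)
    observed = mean_diff(humidity, tonal)
    members = defaultdict(list)
    for index, name in enumerate(family):
        members[name].append(index)
    hits = 0
    for _ in range(n_perm):
        shuffled = list(tonal)
        for indices in members.values():
            labels = [tonal[i] for i in indices]
            rng.shuffle(labels)
            for i, label in zip(indices, labels):
                shuffled[i] = label
        if mean_diff(humidity, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): two families of four languages each.
humidity = [0.9, 0.8, 0.3, 0.2, 0.85, 0.75, 0.25, 0.35]
tonal = [True, True, False, False, True, True, False, False]
family = ["A"] * 4 + ["B"] * 4

observed, p_value = permutation_p_value(humidity, tonal, family)
print(f"observed difference: {observed:.2f}, p = {p_value:.3f}")
```

Shuffling within families is one simple way to keep historical relatedness from inflating the apparent association; mixed effects or phylogenetic models address the same concern in a more principled way.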

3.4 Tests of diachronic change

Given the available data, it should be possible to test whether climate affects tone in a dynamic way, factoring in explicitly the role of shared history between languages. However, there are several problems which are particular to looking at how linguistic and climatic factors interact. The way tone systems change over time is unlikely to be the same as how climate changes. Therefore, given that the historical changes in question span enough time for climate change to be a serious issue, one would need separate models of climate change and language change. Also, it is likely that the path of historical influence of tone through populations is not the same as the path of populations through different climatic zones. Therefore, one might need to model the expansion of populations through climatic zones. The so-called geo-phylo techniques can achieve this (e.g., Currie et al. 2013). Combining both of these issues suggests that the ideal study would simulate climate change over thousands of years, then simulate population movement through these climatic zones, while at the same time simulating the linguistic evolution of tone (loss and gain of tone may not occur in a strict sequence, but could be modeled by discrete trait evolution, e.g., Currie et al. 2010).
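
To make the idea of discrete trait evolution concrete, the sketch below treats complex tone as a two-state trait (0 = absent, 1 = present) evolving as a continuous-time Markov process along a single lineage, with a loss rate that depends on the climatic zone the lineage occupies. This is only a minimal illustration: the rates, the zone labels, and the lineage histories are hypothetical, and a real analysis would estimate rates over a full phylogeny rather than fix them by hand.

```python
import math


def transition_probs(gain, loss, t):
    """2x2 transition matrix of a two-state Markov model over time t.

    Row index is the starting state, column index the ending state.
    """
    total = gain + loss
    decay = math.exp(-total * t)
    p_gain = gain / total * (1.0 - decay)
    p_loss = loss / total * (1.0 - decay)
    return [[1.0 - p_gain, p_gain], [p_loss, 1.0 - p_loss]]


def evolve(state, segments, gain, loss_by_zone):
    """Propagate a state distribution through a sequence of (zone, duration)."""
    for zone, duration in segments:
        m = transition_probs(gain, loss_by_zone[zone], duration)
        state = [
            state[0] * m[0][0] + state[1] * m[1][0],
            state[0] * m[0][1] + state[1] * m[1][1],
        ]
    return state


# Hypothetical rates per thousand years (assumptions for illustration only).
GAIN = 0.1
LOSS_BY_ZONE = {"humid": 0.1, "dry": 0.3}

start = [0.0, 1.0]  # the ancestor is assumed to have complex tone
humid_only = evolve(start, [("humid", 4.5)], GAIN, LOSS_BY_ZONE)
with_dry = evolve(
    start, [("humid", 2.0), ("dry", 1.5), ("humid", 1.0)], GAIN, LOSS_BY_ZONE
)
print(f"P(tone) humid only: {humid_only[1]:.3f}")
print(f"P(tone) with dry spell: {with_dry[1]:.3f}")
```

With these assumed rates, a lineage that spends part of its history in a dry zone ends with a lower probability of retaining complex tone than one that stays in humid zones for the same total time.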

This test could be applied within many language families (global analyses would require estimating links between language families). However, the prediction of our hypothesis is not that humidity broadly correlates with tonality. It is simply that desiccation yields subtle diachronic pressures against the usage of complex tonality. Therefore, we would only predict an effect in areas which include very dry climates. That is, the prediction does not apply to language families such as Trans New Guinea, which has variation in tone, but no regions in the lowest 50 per cent of the humidity distribution. This is not to suggest that valid exceptions do not exist; further inquiry may well show that they do. Our prediction is a statistical one, and cultural evolutionary trajectories will vary from family to family (Dunn et al. 2011), so the hypothesis does not require the prediction to be borne out in all language families.

One group of languages which might be ideal for this kind of study would be the Bantu languages. According to many sources, including geo-phylogenetic analyses (Currie et al. 2013), the Bantu population started in the humid rainforests of West Africa, and spread over a huge range, finally out to drier climates in the South and East. A reduction in tonal contrasts would be predicted not only by the desiccation hypothesis, but also by alternative mechanisms such as simplification due to contact. However, crucially there were then movements back into more humid zones on the East coast (Fig. 2). This might provide enough variation in climates to exhibit a pattern of replicated bursts of change across the phylogeny, an important factor in detecting evidence of correlated evolution (Maddison and FitzJohn 2015). Furthermore, several inferred phylogenetic trees of Bantu languages are available (Holden 2002; Holden and Gray 2006; Currie et al. 2013). Trees based on historical linguistics evidence are also available (e.g., Glottolog, Hammarström et al. 2015), though care would have to be taken that splits in the tree were not directly motivated by differences in tone systems.

Figure 2

The spread of Bantu languages through different areas of humidity in Africa. Lighter colors indicate more humid regions. The arrows indicate the spread of languages according to geo-phylogenetic methods from Currie et al. (2013).

The benefit of such an analysis would be to produce evidence of a causal link rather than simply synchronic correlation. The reconstruction of ancestral states allows a diachronic perspective and an analysis of how change in one variable leads to change in another. We would predict that the loss of tone complexity would be more common when moving into very dry regions, although (in line with our previous findings) there would be no such bias in very humid regions.

4. Discussion and conclusion

The diachronic development of human sound systems is an unpredictable, meandering enterprise. For all the regularity of some sound change types, the Neogrammarian vision of sound change ‘laws’ proved ephemeral. Extant historical linguistic methods based on internal and comparative reconstructions, while invaluable to the linguistic enterprise, do not address a range of explananda in the current development of phonologies. Among other lacunae, the patterns gleaned from such methods fail to capture the range of sociolinguistic factors that help motivate the preponderance of certain sound variants at the expense of others (e.g., Labov 2001), frequency-based effects on the reification of certain sound sequences (e.g., Bybee and Hopper 2001), as well as language-external factors such as those we are suggesting. Whatever one’s position on our guiding hypothesis, it must be acknowledged that the machinery of traditional studies of sound change is simply not equipped to consider possibilities such as the ones we are suggesting. Consequently, ecological data have not previously factored into studies of sound change. And perhaps such data can shed light on some of the many mysteries that remain vis-à-vis the progression of sound changes.

Tonogenesis is said to be motivated by myriad factors including pre-vocalic laryngealization, stress patterns, and vowel height variations. Yet, the most prevalent hypothesis for many cases of the origin and subsequent splits of tonemes relates to the voicing contrasts of adjacent consonants (Hombert et al. 1979). Vowels following voiceless consonants tend to have higher fundamental frequencies than those following voiced consonants, and if the consonantal voicing contrast in question is neutralized or elided, and all that remains is the associated F0 shift of the adjacent vowels, this pitch discrepancy may be phonologized. Such tonogenetic accounts are descriptive rather than predictive, since many languages with the relevant voicing contrast do not subsequently develop phonemic tone. Nevertheless, the accounts are well grounded and motivated, and we stress that our account does not contradict them in any way. Yet, bearing in mind the descriptive usefulness of such accounts, consider that they do not explain one of the clearest observations one can make about tonal patterns, in particular complex tone, from a typological perspective: it is very regional, tending to cluster in non-arid areas that frequently have arid borders. It is precisely this sort of finding that our account may help to explain. While we are not proffering an alternate account of tonogenetic mechanisms, we are offering a plausible motivation (and there could be others) for the fact that tone crosses linguistic boundaries readily in many regions, but not in all regions.

A first line of evidence for our hypothesis is to demonstrate that the predicted effects are strong enough to be seen in synchronic typological patterns. This demonstration was offered in Everett et al. (2015), summarized here. A second line of evidence is a detailed investigation of the diachronic co-evolution of the factors claimed to be causally associated, to see if they trend in the predicted directions.8 In this article, we discussed how diachronic evidence of change in tone systems according to changes in climate could be obtained. We also used the quantitative data to identify promising candidates for more in-depth case studies.

At least two aspects remain to be improved. First, while we have suggested a potential ‘inhibited borrowing’ mechanism, the theoretical articulation of how biases in production lead to problems with communication, and then how these lead to language-wide changes in phonology, is admittedly somewhat nebulous. Second, more detailed statistical models of the co-evolution of tone and climate, taking into account the idiosyncrasies of each system, are requisite. Eventually, as more detailed data on the global distribution of tone is produced,9 these two improvable aspects could be refined in order to produce a detailed model of change at both short-term and long-term timescales, at least for particular case studies.

While we believe the findings discussed in this study and in Everett et al. (2015) are consistent with our hypothesis, we recognize that there are limitations to the data we have examined and that this work is not dispositive. We believe it is tantalizingly suggestive of ecological influence on language, however, and are happy to see others actively engaged in discussions and thoughtful criticisms related to our work. In contrast, a less circumspect but frequent retort to our previous study, paraphrased, is that ‘that’s not how sound change works’. But we submit that such a reply amounts to a tautological interpretation of sound change. Of course, one is not going to believe ambient air is relevant to sound production and change if one has ruled out the possibility a priori, or has not considered it, while concomitantly utilizing methods that are not even equipped to incorporate the possibility. Responses to our work have been comparatively mute vis-à-vis the laryngology data we highlighted, and as a result our previous study was received by some as simply another correlational piece. Perhaps this is because many linguists have not heretofore considered the fact that ambient desiccation does indeed impact the vocal folds, or perhaps it is simply unclear to them how such an influence would be factored into their research.

Another common response to the general conclusions we offered in Everett et al. (2015), and which we have underscored with this effort, is to simply cite one or a few counterexamples. Frequently, these are cases we were aware of prior to our initial study, and the purpose of that study was to examine global and regional patterns, not specific cases. Nevertheless, such objections are understandable, since many linguists have a deep knowledge of the mechanisms of change of a particular language or language family. However, many large-scale statistical studies trade this deep knowledge of particular cases for broad coverage, and individual counterexamples do not disprove a statistical tendency (additionally, counterexamples must offer similar phylogenetic and areal controls of the sort we presented in our original study, a point we feel is frequently overlooked). Similarly, errors in individual data points in the databases we utilized are unlikely to change the overall conclusion of our work given the robust nature of the patterns we have uncovered. Also, such contestations are typically easy to incorporate: we can just change or remove the queried data point(s) and rerun the analysis to see if it changes the qualitative conclusions. We have done this for several readers concerned about individual data points or individual families. More perspicacious criticisms should, from our perspective, systematically demonstrate flaws such as clear spatial auto-correlations in our datasets, in a way that leads to a confound in our overall interpretation. Such flaws are possible, but we have yet to encounter any such counter-analysis—perhaps since the linguistic distribution we predict has been observed across numerous regions. We invite these sorts of criticisms. Our goal is not to see this hypothesis of climate–language interaction immediately accepted, but tested and subsequently refined or discarded, thereby truly advancing our understanding of these issues.

With regard to the specific hypothesis about tone and humidity, admittedly there is some way to go before the whole chain of causality from a desiccated larynx to a global distribution of tonal patterns is fleshed out. However, we believe that the observable pattern, a lack of languages with tone (and particularly complex tone) in desiccated regions, warrants an explanation. We look forward to interactions which try to develop the hypothesis or attempt to explain the correlation with alternative mechanisms. We believe that such interactions are pivotal.

We hope that many adherents to both the autonomous and non-autonomous camps will agree that this issue merits further exploration. That exploration has an uncertain destination but, we believe anyhow, it is at the heart of scientific inquiries into the nature of both language and, more generally, H. sapiens, already well established as a uniquely adaptive species. Ultimately, all members of our species live at the bottom of an ocean of air. But we live in different seas that vary in sundry ways. Just as we would when examining the communication of any other species, we should examine carefully whether this ecological variation results in adaptive effects on speech. Our initial investigations offer evocative evidence, we think, that it does.


This publication was made possible in part by a grant to C.E. from the Carnegie Corporation of New York. S.G.R. is supported by a European Research Council Advanced Grant No. 269484 INTERACT to Stephen Levinson.


The statements made and views expressed are solely the responsibility of the authors. We thank the Max Planck Society for additional support.


  • 1 Bird and bat signals are learned to some extent, so there may be gene–culture co-evolution.

  • 2 Nettle (1998) suggests that the mechanism by which this occurs is the ‘carrying capacity’ of the environment. Favorable climates allow a high carrying capacity which limits the need for contact. Harsher environments require more collaboration with others, and so more contact, and less linguistic diversity. In support of this, Nettle finds correlations between linguistic diversity and climatic factors such as temperature and mean growing season. Another mechanism relating populations and climate is disease. Heat and humidity facilitate the contraction of diseases and their subsequent spread through a population, which can affect demography through mortality rates, or migration due to epidemic disease (Michaelowa 2001).

  • 3 Traces of similar ideas, generally anecdotally based, can be found as far back as the eighteenth century, with one author suggesting that the effects of cold weather on the vocal apparatus may cause biases in the phonemes used: ‘But the total want of P and W may be looked on as the grand literal distinction, between the Scandinavian and the German dialects of the Gothic. And this seems a remarkable instance of the effect of climate upon language; for P and W are the most open of the labial letters; and V is the most shut. The former requires an open mouth: the later may be pronounced with mouth almost closed, which rendered it an acceptable substitute in the cold climate of Scandinavia, where the people delighted as they will delight, in gutturals and dentals. The climate rendered their organs rigid and contracted; and cold made them keep their mouths as much shut as possible.’ (Pinkerton 1789: 354)

  • 4 In our previous study and in this one, we relied/rely on Maddieson’s (2013) independent categorization of languages with complex tone as those with three or more tonemic contrasts. Admittedly, the distinction between language types is actually cline-like, yet our categorization choice offered a useful point of departure for the test of our hypothesis.

  • 5 It should be noted that we have also encountered biologists and anthropologists who found our conclusions fairly commonsensical, given the ecological adaptability of communication systems in other species.

  • 6 There is also probably a sociological dimension, since engaging with these factors entails experience with quantitative methods outside the purview of most linguists.

  • 7 In Mandarin, production errors are most likely for the fourth tone (falling, Wan et al. 1998), which is also the one with the greatest pitch range. However, this is also the most frequent tone type, suggesting that there may be a more complex relationship between production effort and selection pressures.

  • 8 In Everett et al. (2015), we did observe that the predicted patterns held within large language families and on a continent-by-continent basis, at the coarse level of simple regressions between humidity and number of phonemic tones.

  • 9 One acknowledged shortcoming of the work at present is that it relies on databases that, for all their elegance and usefulness, simply categorize languages by number of tonemes. Ultimately, we hope that our hypothesis can be tested against phonetic databases as well as databases that might allow for the clearer differentiation of languages according to, for example, the extent to which they actually rely on complex tone in the speech stream.

