Benjamin Fagard

Benjamin Fagard, a researcher at LATTICE, studies the dynamics of language change through the lens of corpus linguistics and mathematical models. He focuses on the analysis of grammaticalization phenomena and on changes in gender marking, drawing on the additional tools provided by mathematical modeling and big data.

The existence of large language models makes it possible to imagine new ways of looking at language change. But much can still be done with existing language corpora and models of language change. For instance, it has been shown that changes in frequency alone make it possible to identify ongoing language change, e.g. the competition between variants, which displays a characteristic ‘S-curve’. This concept is anything but new (Osgood & Sebeok 1954, Kroch 1989), but it has recently been refined theoretically with more explicit models (Blythe & Croft 2012, Feltgen et al. 2017, Feltgen 2024), and used to identify cases of language change in large corpora (Boukhaled et al. 2019). A crucial question in that respect is whether this frequency-based approach can be used not only to identify language change, but also to distinguish between different types of change (viz. lexical innovation, borrowing, calques, grammaticalization). In my talk, I will address this question with data from large diachronic corpora.
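
As a rough illustration of the S-curve idea mentioned above, the sketch below fits a logistic curve to the share of a new variant over time, by drawing a least-squares line through the logit-transformed shares. It is a minimal sketch, not the method of any of the cited works: the function names (`logistic`, `fit_s_curve`), the parameters (`rate`, `midpoint`) and the data are all assumptions made for illustration, and the data are synthetic, not drawn from a corpus.

```python
import math


def logistic(t, rate, midpoint):
    """Share of the new variant at time t under an idealized S-curve."""
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))


def fit_s_curve(years, shares):
    """Fit a logistic curve by ordinary least squares on logit(share).

    Returns (rate, midpoint). Every share must lie strictly between
    0 and 1, otherwise the logit is undefined.
    """
    logits = [math.log(p / (1.0 - p)) for p in shares]
    n = len(years)
    mean_t = sum(years) / n
    mean_y = sum(logits) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(years, logits))
    rate = sxy / sxx                     # slope of the logit line
    intercept = mean_y - rate * mean_t
    midpoint = -intercept / rate         # time at which the share is 0.5
    return rate, midpoint


# Synthetic, noise-free data (hypothetical): a change centered on 1900.
years = list(range(1800, 2001, 20))
shares = [logistic(t, 0.05, 1900) for t in years]

rate, midpoint = fit_s_curve(years, shares)
print(f"rate={rate:.4f}, midpoint={midpoint:.1f}")
```

With real corpus counts, the shares would be noisy and could hit 0 or 1 in sparse periods, so a weighted or maximum-likelihood fit would likely be more appropriate than this plain least-squares line.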