<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.8.25">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>result_en</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
vertical-align: middle;
}
</style>
<script src="result_en_files/libs/clipboard/clipboard.min.js"></script>
<script src="result_en_files/libs/quarto-html/quarto.js" type="module"></script>
<script src="result_en_files/libs/quarto-html/tabsets/tabsets.js" type="module"></script>
<script src="result_en_files/libs/quarto-html/axe/axe-check.js" type="module"></script>
<script src="result_en_files/libs/quarto-html/popper.min.js"></script>
<script src="result_en_files/libs/quarto-html/tippy.umd.min.js"></script>
<script src="result_en_files/libs/quarto-html/anchor.min.js"></script>
<link href="result_en_files/libs/quarto-html/tippy.css" rel="stylesheet">
<link href="result_en_files/libs/quarto-html/quarto-syntax-highlighting-7b89279ff1a6dce999919e0e67d4d9ec.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="result_en_files/libs/bootstrap/bootstrap.min.js"></script>
<link href="result_en_files/libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="result_en_files/libs/bootstrap/bootstrap-44aa975785bef04068f56b03a8931892.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="light">
</head>
<body class="fullcontent quarto-light">
<div id="quarto-content" class="page-columns page-rows-contents page-layout-article">
<main class="content" id="quarto-document-content">
<section id="seminars-of-the-linguistic-convergence-laboratory" class="level2">
<h2 class="anchored" data-anchor-id="seminars-of-the-linguistic-convergence-laboratory">Seminars of the <a href="https://ilcl.hse.ru/en/">Linguistic Convergence Laboratory</a></h2>
<p>If you are interested in participating in the laboratory seminars, please register <a href="https://ilcl.hse.ru/en/polls/420288221.html">here</a>.</p>
<section id="seminar-schedule-2026" class="level3">
<h3 class="anchored" data-anchor-id="seminar-schedule-2026">Seminar schedule 2026</h3>
<div class="columns">
<div class="column" style="width:15%;">
<p><strong>28 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Anastasia Alekseeva (HSE University)</em></p>
<p><strong>Negation in Chicham: preliminary findings</strong></p>
<details>
<summary>
Abstract
</summary>
Chicham (also known as Jivaroan) is a small language family of four languages (Aguaruna, Shuar, Shiwiar and Wampis) spoken in the foothills of the Andes in Peru and Ecuador. Their genealogical relation is not very deep, but at the same time they show some notable differences, which allows for productive comparison. One of these differences lies in the domain of negation. All Chicham languages have two negation markers: -cha and -tsu. They do not differ in semantics, but they are used in different contexts, e.g. -cha in the past tenses and -tsu in the present tense. However, their contextual distribution varies across the Chicham languages. In this talk I will show the distribution of the negation markers across Chicham as it is described in the grammars, provide a more detailed analysis of it, and present some preliminary results based on a small corpus compiled from examples in the grammar.
</details>
</div><div class="column" style="width:15%;">
<p><strong>21 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Anastasia Alekseeva, Akhmed Dugrichilov, Lilya Fayzeeva, Konstantin Filatov, Diana Khayaleeva, Yury Koryakov, Timur Maisak, Maksim Melenchenko, Lena Mironova, Varvara Nikolaeva</em></p>
<p><strong>Field trip report: the Kusur dialect of Avar</strong></p>
<details>
<summary>
Abstract
</summary>
We report on a 2026 field trip to Kutan Kambulat (Rutulsky district, Republic of Daghestan) to document the Kusur dialect of Avar, a highly endangered variety spoken by a small seasonally mobile community. The trip took place in early April (11 days), when most speakers are in the lowland settlement before moving to the mountain village of Kusur for summer pastoral work. The main aim of the trip was to compile a spoken corpus. We recorded spontaneous speech, focusing on narratives and conversation, and collected basic sociolinguistic metadata. A key interest is the dialect’s unusual contact setting, shaped by long-term interaction with Tsakhur and Azerbaijani. We briefly discuss the results of the field trip, our sociolinguistic data, and findings regarding some previously undescribed aspects of grammatical structure. The language corpus resulting from the field trip is intended for future analysis of micro-isoglosses and contact-driven variation.
</details>
</div><div class="column" style="width:15%;">
</div><div class="column" style="width:85%;">
<p><em>George Moroz, Asya Antsupova, Viktoria Zubkova</em></p>
<p><strong>Field trip report: Zilo Andi</strong></p>
<details>
<summary>
Abstract
</summary>
<p>We present a report on a field trip to Zilo (Botlikhskiy district, Republic of Daghestan) conducted in April 2026. We had several documentation-focused goals:</p>
<ul>
<li>expansion of the online dictionary of Zilo Andi</li>
<li>clarification of some morphology-related information</li>
<li>solution of several morphological questions</li>
<li>rewriting of the morphological analyzer (currently, it processes pronouns, numerals, and adjectives). Additionally, we refined some details described in the grammatical description of Zilo Andi in [Kaye et al., to appear].</li>
</ul>
In pursuing the goal of documenting previously undescribed aspects of Zilo Andi, we also addressed several topics, including the syntax and semantics of temporal clauses, verbal actionality, and stress patterns.
</details>
</div><div class="column" style="width:15%;">
<p><strong>14 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Aleksey Starchenko (HSE University)</em></p>
<p><strong>GNMCCs are not always semantically based relativization: Evidence from Northern Khanty</strong></p>
<details>
<summary>
Abstract
</summary>
<p>The general noun-modifying clause construction (GNMCC) is “a single construction covering a wide range of semantic relations between the head noun and the clause” [Matsumoto et al. 2017]. These relations include relativization of various positions and extended NMCCs: modification of valency-bearing nouns and constructions in which the relations are not deduced from any syntactic clues. In addition to meeting the distributional criteria, GNMCCs in the strict sense are claimed to be built on semantic/pragmatic mechanisms and not to be subject to syntactic constraints. This property brings together the GNMCC and semantically based relativization [Comrie 1998]. On the other hand, constructions described as GNMCCs exhibit syntactic restrictions, primarily in terms of island effects [Kornfilt, Vinokurova 2017; Kim, Sells 2017; Nikolaeva 2017].</p>
In the talk, I will focus on the Northern Khanty adnominal modification constructions with non-finite forms in -ti and -əm. Introducing novel data on extended NMCCs, I will show that Northern Khanty non-finite forms meet the distributional criteria of the GNMCC. Despite this fact, in relativization contexts, they show syntactic restrictions of various kinds that can be related to the presence of a gap [Bikina 2019]. On the other hand, extended NMCCs in Khanty show the semantic/pragmatic effects expected from the classical NMCC. I argue that Northern Khanty non-finite forms constitute a single noun-modifying construction, that is, they share the same external syntax. Their difference in syntactic restrictions stems from the presence of a gap, while its absence gives way to semantic/pragmatic mechanisms. The interpretation of the Northern Khanty data presented here indicates that the GNMCC is not equal to semantically based relativization. Taken together with the data on island effects in other languages, this suggests that a purely distributional definition is more favourable for building a typology of GNMCCs.
</details>
</div><div class="column" style="width:15%;">
<p><strong>31 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Anna Panova (HSE University)</em></p>
<p><strong>Towards a typology of nominal coordination systems</strong></p>
<details>
<summary>
Abstract
</summary>
This talk investigates systems of nominal coordination in the languages of the world. Though coordination has been examined from a functional-typological point of view (e.g., Mithun 1988, Stassen 2000, Haspelmath 2007, Mauri 2008), some questions regarding the syntax and semantics of coordinating constructions still remain unanswered. Based on a sample of 336 languages (118 families, 25 isolates, 4 macroareas), we consider factors such as the number of constructions for nominal coordination, other functions of the coordinator, the position of the coordinator in a construction with two conjuncts, and the possibility of using the same coordinator for coordinating verb phrases and clauses. We will discuss the minimum, average, and maximum number of coordinating constructions in one language, the frequency of mono- and bisyndetic constructions, and attested paths of the grammaticalization of a coordinator. We will consider which sets of coordinators are the most common and whether they are “aligned” by syndetism in languages with multiple coordinating constructions; which coordinators can and cannot be used in clausal coordination; and which coordinators are most often mono- and bisyndetic. We tentatively identify five types of coordinating constructions: purely logical; introducing an additional participant; referring to participants as a set; describing change of states; (not) fully covering all participants. Additionally, we will discuss areal tendencies.
</details>
</div><div class="column" style="width:15%;">
<p><strong>24 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Olga Alieva (HSE University)</em></p>
<p><strong>Testing Plato’s Chronology with Phylogenetic Methods</strong></p>
<details>
<summary>
Abstract
</summary>
This project critically reexamines the long-standing stylometric basis for the standard tripartite chronology of Plato’s dialogues (early–middle–late), arguing that core assumptions underpinning this model are methodologically dubious. While stylometric analysis has often been portrayed as a ‘scientific’ foundation for dating dialogues, the clustering patterns it reveals are not reliably correlated with any temporal sequence. Combining insights from Classics with methodologies drawn from evolutionary biology and computational stylometry, I apply modern phylogenetic tools, including tree-based and network-based models, to the entire Platonic corpus, for the first time integrating lesser-studied spuria from the Appendix Platonica. Using high-dimensional distance metrics (e.g., cosine similarity) across most frequent features and incorporating robustness checks via bootstrapping, I demonstrate that only two stylistic groupings emerge as stable under various models: what scholars would traditionally label ‘late’ dialogues (e.g., Laws, Timaeus, Philebus), and Republic (except for book 1). However, no statistically robust cluster corresponds to the so-called ‘early’ and the rest of the ‘middle’ dialogues, while some of the later dubia and spuria exhibit stylometric proximity to allegedly early texts. This suggests that the stylometric features thought to define philosophical ‘youth’ in fact correspond to the ‘Socratic’ genre: stylometry measures style, not time. This critique is of dual interest: first, it underscores the need for philological caution when engaging with statistical claims about authorial development; second, it offers a cautionary tale about interpretability, domain assumptions, and the transfer of methods from bioinformatics to historical linguistics and literary studies.
</details>
</div><div class="column" style="width:15%;">
<p><strong>17 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Aigul Zakirova (University of Potsdam)</em></p>
<p><strong>The expression of necessity in the Volga-Kama area: argument marking and (im)personality</strong></p>
<details>
<summary>
Abstract
</summary>
<p>The Volga–Kama (VK) area encompasses Turkic (Tatar, Bashkir, Chuvash) and Uralic (Moksha, Erzya, Meadow Mari, Hill Mari, Udmurt) languages, all spoken in the Middle Volga region of Russia (Johanson 2000; Helimski 2003).</p>
<p>Drawing on spoken corpora, grammatical descriptions, and literature on modality in the VK area, I will present the types of constructions found in these languages, which include but are not limited to the following:</p>
<ol type="1">
<li><p>‘Need’-predicates, compatible both with infinitives and NPs (Hill Mari keleš, Chuvash kirlë, Tatar kiräk, Moksha er’avi).</p></li>
<li><p>Non-finite future/necessity forms (Chuvash -mAllA, Tatar -As- EXIST, Udmurt -ono).</p></li>
<li><p>Finite verbs from other semantic domains, grammaticalized into necessity meanings (Bashkir tura kilew ‘come straight’, Hill Mari väreštäš, Meadow Mari logalaš ‘end up’, Udmurt lunə̑ ‘be’).</p></li>
<li><p>Personal ‘must’-predicates, often borrowed (Tatar tiješ, Bashkir teješ, cf. also Kipchak-borrowed tijə̑š in Southern Udmurt, Russian-borrowed dolžen in Moksha).</p></li>
</ol>
After discussing the diachronic sources of these constructions, I will focus on argument marking and on argument expression vs. omission in these constructions. Using data from spoken texts, I will show that the majority of the VK necessity constructions, in terms of both type and token frequency, do not express their A- or S-argument and are (largely) impersonal. However, the diachronic tendency appears to be that in the Uralic languages of the area new personal constructions develop from already existing or borrowed material.
</details>
</div><div class="column" style="width:15%;">
<p><strong>10 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Eva Poliakova (HSE University)</em></p>
<p><strong>Reading group: Levshina, N. (2022). Corpus-based typology: applications, challenges and some solutions. Linguistic Typology, 26(1), 129-160.</strong></p>
<details>
<summary>
Abstract
</summary>
Over the last few years, the number of corpora that can be used for language comparison has dramatically increased. The corpora are so diverse in their structure, size and annotation style, that a novice might not know where to start. The present paper charts this new and changing territory, providing a few landmarks, warning signs and safe paths. Although no corpus at present can replace the traditional type of typological data based on language description in reference grammars, corpora can help with diverse tasks, being particularly well suited for investigating probabilistic and gradient properties of languages and for discovering and interpreting cross-linguistic generalizations based on processing and communicative mechanisms. At the same time, the use of corpora for typological purposes has not only advantages and opportunities, but also numerous challenges. This paper also contains an empirical case study addressing two pertinent problems: the role of text types in language comparison and the problem of the word as a comparative concept.
</details>
</div><div class="column" style="width:15%;">
<p><strong>3 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Lena Mironova (HSE University)</em></p>
<p><strong>Verbal plurality in Papuan languages</strong></p>
<details>
<summary>
Abstract
</summary>
<p>Despite its global distribution and increasing interest in recent years, verbal plurality (VPL) remains an inconsistently treated grammatical domain. This talk investigates patterns in the (co)expression and formal encoding of VPL functions in order to clarify its internal structure. The analysis draws on a large convenience sample of Papuan languages (164 languages from 42 families and 19 isolates), that is, the non-Austronesian languages of New Guinea and the surrounding islands, where VPL is widely attested but has not yet been examined systematically.</p>
In this study, VPL is defined as the verbal encoding of plurality independent of other grammatical categories such as person or gender. Four major functional types are distinguished: collective (non-individuated participant plurality), distributive (individuated participant plurality), intRAoccasional (repetitions within a single occasion), and intERoccasional (repetitions across separate occasions). The results highlight VPL as a heterogeneous yet internally structured functional domain. They suggest a continuum of functional adjacency from collective to intERoccasional plurality and reveal distinct tendencies in the coexpression and formal marking across the different functional types.
</details>
</div><div class="column" style="width:15%;">
<p><strong>24 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maksim Melenchenko (HSE University)</em></p>
<p><strong>Omission of the light verb kin- ‘do’ in Shughni complex verbs</strong></p>
<details>
<summary>
Abstract
</summary>
In the Shughni language (‹ Eastern Iranian), spoken in the Pamir mountains, the majority of verbal lexical meanings are expressed with multiword constructions called “complex verbs”. The talk focuses on a puzzling phenomenon: in complex verbs with the light verb kin- ‘do’ (for example, rāng kin- ‘paint’, lit. ‘color-do’) the root of the light verb can sometimes be omitted. In such instances, the subject agreement suffix on the verb attaches to the non-verbal component of the complex verb (for example, rāng kin-um ‘I color [smth]’ → rāng-um). This raises many interesting questions about morphosyntactic properties of the resulting construction (for example, about the phrasal / lexical status of the non-verbal component). In the talk, I will discuss these questions and their relation to the general phenomenon of complex verbs in Shughni, its diachronic development and the role of language contact in this process, as well as draw unexpected typological parallels (for example, with the Lezgic language [‹ East Caucasian]).
</details>
</div><div class="column" style="width:15%;">
<p><strong>17 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Nikita Muravyev (HSE University)</em></p>
<p><strong>Light verbs as a missing link in grammaticalization: a case study of Russian posture verbs and a brief typological overview</strong></p>
<details>
<summary>
Abstract
</summary>
Posture verbs frequently undergo grammaticalization across languages. In the literature, their typical grammaticalization path is described as developing from literal posture use to an aspectual marker and/or a locative or existential copula. However, what is often overlooked is the wide range of light verb constructions (LVCs) — idiomatic expressions consisting of a semantically bleached verb and a predicatively used syntactic constituent (usually NP or PP), e.g. German unter Druck stehen ‘be (lit. stand) under pressure’ or Russian sidet’ na diete ‘be (lit. sit) on a diet’. In this talk, drawing on Russian corpus data, I demonstrate that such uses cannot simply be treated as ordinary copular constructions, as they convey more specific meanings rooted in residual semantic components of posture. I argue that these meanings are quasi-grammatical and pre-grammatical, in that they reflect a lower degree of grammaticalization and possibly represent an intermediate stage in the diachronic development of posture verbs. Finally, I briefly compare posture-based LVCs across twelve Eurasian languages and discuss their typological variation.
</details>
</div><div class="column" style="width:15%;">
<p><strong>10 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Mark Stoneking (Biométrie et Biologie Évolutive, UMR 5558, CNRS & Université de Lyon)</em></p>
<p><strong>The Genetic History of the Caucasus</strong></p>
<details>
<summary>
Abstract
</summary>
To paraphrase Tolstoy, all populations are alike in having interesting histories, but each history is interesting in its own way. In the case of the Caucasus, the interest centers around the extensive linguistic diversity (in particular, the relationship of Caucasian populations speaking Indo-European or Turkic languages to those speaking Caucasian languages), the position of the Caucasus as a potential crossroads for contact between the East and the West, and the impact of the Caucasus Mountains on the genetic diversity and structure of populations living in the mountainous regions. In this presentation, which I shall endeavour to make accessible to non-geneticists, I will take an historical approach: first, I will describe early studies of genetic variation in Caucasian populations that I carried out with my colleague, Ivane (Vano) Nasidze, that focused mostly on analyses of the maternally-inherited mitochondrial DNA (mtDNA) and the paternally-inherited Y chromosome. I will then discuss the more detailed insights into population history provided by analyses of genome-wide variation in modern human populations from the Caucasus, followed by the additional insights arising from the recent studies of ancient DNA. The results to date for this relatively under-studied region indicate a complex history of both contact and continuity, and a major impact of the mountainous regions on the genetic structure of the populations living there.
</details>
</div><div class="column" style="width:15%;">
<p><strong>27 January</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Petr Rossyaykin (Lomonosov Moscow State University)</em></p>
<p><strong>Indefinites, scalar particles, and question semantics</strong></p>
<details>
<summary>
Abstract
</summary>
According to a barely controversial generalization, the licensing of (at least some) negative polarity items (NPIs) is dependent on a particular entailment pattern between the assertion and its alternatives, viz. (entailment) scale reversal or downward entailingness (Fauconnier 1975, 1978; Ladusaw 1979; et seq.). Questions are one of the environments in which NPIs are licensed (e.g. Did you eat anything?), yet there is no obvious entailment relation between questions. This raises the puzzle of why NPIs are acceptable in questions. In this talk I will present and discuss a cross-linguistic dataset concerning the distribution of scalar particles (like English even), showing that indefinite NPIs and NPIs with scalar particles behave differently w.r.t. their acceptability in (polar) questions. In particular, PQs are not a scale reversal environment for scalar particles (contra some earlier proposals). I will discuss the consequences of this observation for the theory of NPI licensing.
</details>
</div><div class="column" style="width:15%;">
<p><strong>20 January</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Anastasia Panova (Stockholm University)</em></p>
<p><strong>Subordination strategies in Gawarbati (Indo-Aryan): an areal-typological perspective</strong></p>
<details>
<summary>
Abstract
</summary>
Gawarbati is an under-described Indo-Aryan language spoken by approximately 20,000 people in the border area between Pakistan and Afghanistan. Since 2021, it has been documented by a Swedish-Pakistani team under the supervision of Henrik Liljegren (Stockholm University). One of the main outputs of the documentation project is a spoken corpus containing more than 20 hours of transcribed, glossed and translated speech from various genres. The focus of this talk will be on the use of finite subordination strategies in the Gawarbati corpus. I will start by presenting an overview of finite subordination strategies in neighboring Indo-Iranian languages. Against this background, I will describe the functions of each of the subordinators attested in the Gawarbati corpus. On the basis of the analysis of the functional distribution of various subordinators, I will try to reconstruct the possible stages in the evolution of subordination strategies in Gawarbati and discuss the role of language contact in this process.
</details>
</div>
</div>
</section>
<section id="seminar-schedule-2025" class="level3">
<h3 class="anchored" data-anchor-id="seminar-schedule-2025">Seminar schedule 2025</h3>
<div class="columns">
<div class="column" style="width:15%;">
<p><strong>23 December</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Leah Finkelberg, Polina Nasledskova, Johanna Nichols (HSE University)</em></p>
<p><strong>Progress on causative alternations and noun synthesis databases</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk, two ongoing projects will be discussed. The causative alternations project is dedicated to collecting and analyzing data on causal-noncausal alternations in verbs: 18 pairs of verbs across 238 languages (from various language families and continents) have been collected and coded. The noun synthesis project is dedicated to collecting and analyzing inflectional noun categories in 172 languages. We will present typological findings and the geographic distribution of features in the collected data, and address three questions: 1) Is there a tradeoff between noun and verb synthesis, or between causativization and decausativization (worldwide or continent by continent)? 2) Is lower morphological complexity a feature of more archaic patterns? 3) Are there resemblances in causal / non-causal patterns or in noun inflectional categories between the languages of Western North America and Australasian languages, on the one hand, and between the languages of Eastern North America and Siberian languages, on the other? We will compare our observations to the conclusions made in [Sokur & Nichols, 2018], [Hartmann & Nichols, forthcoming], and [Nichols 2024].
</details>
</div><div class="column" style="width:15%;">
<p><strong>16 December</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maria Kholodilova (Institute for Linguistic Studies, HSE University)</em></p>
<p><strong>Locative relativization in Slavic languages and beyond</strong></p>
<details>
<summary>
Abstract
</summary>
In most European languages, locative relativization involves competition between at least two relativization strategies, roughly corresponding to English house in which I live vs. house where I live. The former strategy is more explicit, as it specifies the spatial relation, while the latter neutralizes at least the distinction between ‘in’ and ‘on’. Based on my current sample of 12 Slavic and 8 non-Slavic European languages, I will discuss the corpus distribution of these strategies with particular attention to the impact of head noun semantics. I propose that there is a consistent tendency toward greater explicitness of marking along the following hierarchy of head nouns: ‘place’ &lt; ‘house’ &lt; ‘book’, i.e., the marking is more explicit with the nouns that are less likely to appear in locative expressions. These findings align with a broad range of phenomena showing more explicit marking in less frequent configurations, both in relative clauses (Keenan, Comrie 1977; Fox, Thompson 2007; Cristofaro, Ramat 2007) and in non-relative locative expressions (Stolz et al. 2014; Haspelmath 2019).
</details>
</div><div class="column" style="width:15%;">
<p><strong>9 December</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maria Ermolova (HSE University)</em></p>
<p><strong>Gerunds in the Russian language of the 17th century: a transitional period in the history of their grammatical development</strong></p>
<details>
<summary>
Abstract
</summary>
I will present the results of a corpus study on the functioning of gerunds in the Russian language of the 17th century. Comparing these findings with data from the 18th century allows tracing the stages of the evolution of the grammatical meaning of gerunds and making adjustments to existing theories. The situation with the use of gerunds remains relatively stable throughout the 17th century. Continuing the Old Russian tradition, gerunds can be used in the living language as finite forms for both past and present tenses. Compared to the earliest period, gerunds in the 17th century become even more similar to the -л- form, as evidenced by their use with the particle бы in the subjunctive mood, which is not documented in early texts. The 17th century demonstrates how gerunds lost their tense meaning, acquiring a relative one depending on the tense of the main predicate, while still remaining formally autonomous predicates. The establishment of the gerund’s function as a predicate in a dependent adverbial clause occurs in the 18th century.
</details>
</div><div class="column" style="width:15%;">
<p><strong>2 December</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maria Pupynina (Institute for Linguistic Studies)</em></p>
<p><strong>Multilingualism in Northeastern Siberia</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk, I will present the results of the ongoing study of small-scale multilingualism in Northeastern Siberia (2017-present). The study focuses on languages that have been in contact for more than a century and a half: Tundra Yukaghir, Yakut, Chukchi, Even, Naukan and Chaplino Yupik Eskimo. I will discuss small-scale multilingual areas in the north of Yakutia (Lower Kolyma) and Chukotka (Chukchi Peninsula) and touch upon the linguistic outcomes of long-lasting multilingualism. Both multilingual areas involve unrelated languages, and in Lower Kolyma individual language repertoires can consist of five unrelated lects. Possible ways to measure the level of language distance/similarity/convergence between the languages of these areas will be discussed.
</details>
</div><div class="column" style="width:15%;">
<p><strong>25 November</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Zaira Khalilova (Institute of Linguistics, HSE University)</em></p>
<p><strong>Typology of verbal borrowings in Tsezic and beyond</strong></p>
<details>
<summary>
Abstract
</summary>
Khwarshi, which is distant and geographically separated from the other Tsezic languages and surrounded by Avar- and Andic-speaking villages, combines all three strategies identified by Wohlgemuth (2009) for the integration of borrowed verbs: a light verb strategy is used for Russian borrowings, while both direct and indirect insertion are used for borrowings from Avar and Andic. Combining several strategies in a single language is typologically a rare phenomenon; the other Tsezic languages make use of only one integration strategy. The paper explains the factors underlying the distribution of verbal borrowing strategies within the Tsezic languages. The crucial factor accounting for this distribution is the variation found across the family in the degree of bilingualism and language contact with donor languages.
</details>
</div><div class="column" style="width:15%;">
<p><strong>18 November</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Anna Grishanova (HSE University)</em></p>
<p><strong>Stress variation in the speech of L2 Russian and dialectal speakers: the case of verbs in past indicative</strong></p>
<details>
<summary>
Abstract
</summary>
Stress variation in standard and dialectal Russian is an interesting and well-researched phenomenon. While A. Zaliznjak (1985) attributes stress variation to the pragmatic factor, W. Lehfeldt (2006) suggests that frequency of the lexeme plays an important role as well. Data presented in the study by D. Savinov, E. Skachedubova, A. Somova (2020) indicates that sociolinguistic factors like age are crucial to thoroughly describe stress variation in Standard Russian. Differences in stress patterns in various Russian dialects are often explained by the history of the dialect under observation. To our knowledge, the variation of stress in the speech of L2 Russian speakers has not been discussed before. This study aims to identify the factors that influence stress variation in past indicative verbs in the speech of dialectal and L2 Russian speakers. The data comes from dialectal and bilingual corpora of the Linguistic Convergence Laboratory. Specifically, I investigate the verbs that have been previously outlined by D. Savinov, E. Skachedubova, A. Somova (2020).
</details>
</div><div class="column" style="width:15%;">
<p><strong>11 November</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Anna Panova, Yury Lander (HSE University)</em></p>
<p><strong>Competing coordinating constructions in the languages of the North Caucasus</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk we discuss systems displaying several (two to three) constructions for nominal coordination in eleven languages of the North Caucasus. Our sample includes representatives of West Caucasian, East Caucasian, Indo-European and Turkic languages. The data come both from corpora and from elicitation. We tentatively propose a syntactic prototype of nominal coordination based on the collective contexts and certain other formal properties. We suggest that the variation observed among the systems discussed in the talk results from competition between coordinate constructions covering contexts closer to this prototype and additive constructions extending from less prototypical contexts.
</details>
</div><div class="column" style="width:15%;">
<p><strong>28 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Timofei Dedov, Alexander Letuchiy (HSE University)</em></p>
<p><strong>Distant negative concord in Ashkharywa Abaza</strong></p>
<details>
<summary>
Abstract
</summary>
<p>Our talk focuses on the phenomenon of negative concord (NC) in the Ashkharywa dialect of Abaza, a West Caucasian language and a close relative of Abkhaz. The class of negative concord items contains items like aʒ̂-g’ə́ ‘nobody’. Although these elements do not contain a negative marker in the proper sense, they correspond to the definition of NCI, because they are usually licensed by a predicate negation (see, for example, Zeijlstra 2004, Giannakidou 2006 for the general analysis of negative concord).</p>
<p>In the talk, distant negative concord will mostly be discussed: this notion covers cases when a negative concord item is licensed by a predicate negation from a higher clause (‘I do not want to see anyone’), and not by a clausemate one (‘I do not see anyone’). While local negative concord is described in detail in descriptive, typological, and theoretical studies, not all relevant parameters of distant NC organization have been analyzed. The main question to be considered is what factors facilitate the distant NC or make it problematic. In Russian, for instance, finiteness of the embedded verb seems to affect the possibility of this negative concord type: the distant NC is possible in most nonfinite constructions (Я не хочу никого обидеть), but marginal or highly colloquial in finite ones (?Я не хочу, чтобы ты никому звонил).</p>
<p>In Abaza, finiteness and the opposition between finite and nonfinite forms are organized differently from the corresponding opposition in European languages (see a recent paper by Arkadiev (2023) for details). As will be demonstrated in the talk, finiteness itself cannot be regarded as the main factor of the (im)possibility of distant NC, although sometimes different complement types behave differently regarding the distant NC.</p>
The main factor is the semantic type of the matrix verb. It turns out that factive verbs of knowledge and emotional attitude, modal verbs, opinion verbs and so on differ in their (in)ability to license negative concord items in subordinate clauses. In our talk, we will discuss semantic parameters of matrix verbs that can account for these differences.
</details>
</div><div class="column" style="width:15%;">
<p><strong>21 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Irina Politova (HSE University)</em></p>
<p><strong>Reading group: Ploeger, Esther, Wessel Poelman, Miryam de Lhoneux, and Johannes Bjerva. 2024. What is “typological diversity” in NLP? In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp. 5681–5700. Association for Computational Linguistics, Miami, Florida, USA.</strong></p>
<details>
<summary>
Abstract
</summary>
The NLP research community has devoted increased attention to languages beyond English, resulting in considerable improvements for multilingual NLP. However, these improvements only apply to a small subset of the world’s languages. An increasing number of papers aspires to enhance generalizable multilingual performance across languages. To this end, linguistic typology is commonly used to motivate language selection, on the basis that a broad typological sample ought to imply generalization across a broad range of languages. These selections are often described as being ‘typologically diverse’. In this meta-analysis, we systematically investigate NLP research that includes claims regarding typological diversity. We find there are no set definitions or criteria for such claims. We introduce metrics to approximate the diversity of resulting language samples along several axes and find that the results vary considerably across papers. Crucially, we show that skewed language selection can lead to overestimated multilingual performance. We recommend future work to include an operationalization of typological diversity that empirically justifies the diversity of language samples. To help facilitate this, we release the code for our diversity measures.
</details>
</div><div class="column" style="width:15%;">
<p><strong>14 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Konstantin Filatov (HSE University)</em></p>
<p><strong>More evidence for withdrawal effects: the case of Andic future marking</strong></p>
<details>
<summary>
Abstract
</summary>
The talk is a revised and expanded version of the author’s SLE 2025 presentation. It discusses the system of future grams in the Anchiq dialect of Karata (< Andic < Nakh-Daghestanian). The two core future forms can be described as marking future certainty vs. future possibility. While this type of system is not typologically uncommon, the diachrony of similar systems in closely related Andic languages (Godoberi and Bagvalal) poses a challenge for the Bybeean source determination principle. This principle requires attributing all semantic differences between grams to differences between their grammaticalization sources. However, the principle cannot fully account for the emergence of Andic future systems: the same source seems to have developed into the certain future gram in Anchiq, while having developed into the uncertain future gram in Godoberi. The talk presents a proposed diachronic scenario for the three Andic systems and explains their differences using the notion of withdrawal effects (as defined by Reinöhl and Himmelmann, 2017). This notion refers to a situation where the semantic network of an earlier gram is deformed by the “intrusion” of a newer one.
</details>
</div><div class="column" style="width:15%;">
<p><strong>7 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Marina Gasanova (Daghestan State University)</em></p>
<p><strong>On the work of the Center for the Study of Native Languages at Daghestan State University</strong></p>
<details>
<summary>
Abstract
</summary>
The seminar is devoted to the activities of the Center for the Study of Native Languages at Daghestan State University, established in 2016 with the support of the Ministry for National Policy and Religious Affairs of the Republic of Daghestan. The talk will present the main areas of the Center’s work: outreach and educational activities, the organization of seminars for teachers, and the implementation of research projects. Particular attention will be paid to the tasks of preserving and developing the native languages of the peoples of Daghestan.
</details>
</div><div class="column" style="width:15%;">
<p><strong>30 September</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Alina Russkikh (HSE University)</em></p>
<p><strong>From Additivity to Optative? Evidence from the Avar Language</strong></p>
<details>
<summary>
Abstract
</summary>
<p>In Avar, there are two functionally distinct markers that are synchronically homonymous. The first is the multifunctional additive particle =gi. The second is the optative suffix -gi. In [Zhirkov 1936: 157–158], the particle =gi and the optative marker -gi are treated as one and the same marker. Z. Mallaeva notes that these two forms may have originated from a common source, despite their synchronic functional differences. N. Dobrushina discusses additive particles as one of the possible sources of optatives, with material parallels attested in four languages of the Eastern Caucasus, including Avar [Dobrushina 2024]. However, in typological works on additives [Forker 2016; Gast & van der Auwera 2011], the use of additive markers in optative constructions is not attested.</p>
All this suggests, on the one hand, that in several languages of the area additive and optative markers coincide, while on the other hand the semantic link between the optative and the additive is far from straightforward. In this talk, I will present field data and discuss possible motivations for the development of optative meaning from the additive particle, as well as counterarguments to this hypothesis.
</details>
</div><div class="column" style="width:15%;">
<p><strong>24 June</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Polina Nasledskova (HSE)</em></p>
<p><strong>Compatibility of ordinal numerals with nouns of various semantic classes</strong></p>
<details>
<summary>
Abstract
</summary>
In this study, I investigate the compatibility of ordinal numerals with nouns of various semantic classes in 5 languages: Russian, English, Spanish, Indonesian and Rutul. The comparison is based on the parallel translations of the New Testament. Nouns of four semantic classes are present in the data: names of living creatures, inanimate objects, time periods and abstract concepts. Additionally, I analyze constructions of the type “for the X-th time”. According to the data, different semantic classes of nouns are used with ordinal numerals with varying frequency. The talk includes the discussion of the results and their possible theoretical implications.
</details>
</div><div class="column" style="width:15%;">
<p><strong>3 June</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Elena Shvedova, Elizaveta Zabelina, Yuri Koryakov (HSE University)</em></p>
<p><strong>Quantifying lexical distances among Nudiz, Mahmudi, and Verin Dvin Urmi (North-Eastern Neo-Aramaic)</strong></p>
<details>
<summary>
Abstract
</summary>
Our study documents and analyzes lexical data from four Christian North-Eastern Neo-Aramaic varieties: Mahmudi, Nudiz, Verin Dvin Urmi, and Urmia Urmi, focusing on the previously undescribed Mahmudi and Nudiz. We provide correspondences from these lects for an extended 226-item basic vocabulary list collected for this study, with etymologies, cognates from earlier Aramaic, and loanword sources. Cognate share calculations reveal that all four varieties belong to a single language. Notably, Mahmudi and Verin Dvin Urmi, spoken in the same village for 67 years, exhibit stronger convergence with each other than with their genealogical relatives (Nudiz and Urmia Urmi respectively), highlighting contact-driven divergence from inherited patterns.
</details>
</div><div class="column" style="width:15%;">
<p><strong>27 May</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Timur Maisak (Institute of Linguistics RAS & HSE University)</em></p>
<p><strong>Numeral ‘one’ + additive ‘also, even’: one source structure for two Udi particles</strong></p>
<details>
<summary>
Abstract
</summary>
In Udi, a Nakh-Daghestanian language of the Lezgic branch, two function words sal and saal seem to share the same source structure: both appear to represent combinations of the numeral sa ‘one’ with the additive clitic =(a)l ‘and, also, even’. At the same time, synchronically the two are both formally and functionally distinct. The word sal is an emphatic negative polarity item ‘not a (single one)’, ‘(not) at all’. The word saal can be used with the meaning ‘again, one more time’, but even more often, one finds it as a coordinating conjunction ‘and’. Cross-linguistically, the numeral ‘one’ is a common grammaticalization source: for example, the World Lexicon of Grammaticalization lists nine paths leading from ‘one’ to a grammatical marker. Additive markers (‘and, also, even’) are also known to take part in the derivation of various grammatical forms or classes of forms. What makes the two Udi particles unusual is the fact that two very different words go back to one and the same combination of two grammaticalization sources. In the talk, I plan to illustrate the uses of both sal and saal (mainly based on textual data from the Nizh dialect). I will also discuss some structural and functional parallels of the two Udi words found in the languages of the area.
</details>
</div><div class="column" style="width:15%;">
<p><strong>20 May</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Ellie Wren-Hardin (The Ohio State University)</em></p>
<p><strong>Computer-Assisted Differentiation of Loans and Cognates: Possibilities and Pitfalls</strong></p>
<details>
<summary>
Abstract
</summary>
The past several decades have seen a dramatic rise in the creation of computational and computer-assisted approaches to cognate and loanword detection. However, many cognate and loanword detection methods rely on identifying surface lexical similarity, creating a challenge for the differentiation of family-internal loanwords from cognates. While true cognates typically demonstrate higher lexical similarity than non-cognates due to shared genetic inheritance, loanwords also demonstrate higher lexical similarity than non-borrowed words, meaning lexical similarity is an insufficient metric on its own. In this talk, I will discuss how computational and qualitative methods can be combined to tackle the challenge of differentiating cognates from family-internal loanwords in the Northeast Caucasian language family. First, I will discuss the sociolinguistic factors in the Northeast Caucasian language family that make it useful for studies of family-internal contact. Then, I will talk through several computational methods for cognate and borrowing detection and explain why they alone are insufficient for this specific challenge. Lastly, I will demonstrate how utilizing computational methods in conjunction with knowledge of the languages and sociolinguistic factors involved in a contact situation provides improved results over computational methods alone, exemplifying the benefits of a “computer-assisted” approach.
</details>
</div><div class="column" style="width:15%;">
<p><strong>13 May</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maria Ermolova (HSE University)</em></p>
<p><strong>On the grammaticalization of -no/-to-forms in the history of the Polish language in comparison with Middle Russian</strong></p>
<details>
<summary>
Abstract
</summary>
I will examine the history of indefinite personal -no/-to-forms analyzing two Polish texts from different centuries (the Bible of Queen Sophia from the mid-15th century and “Roczne dzieje kościelne” from the early 17th century). I will analyze contexts with past passive participles (PPP) in a predicative position, where actions of the preterit type are described, as well as examples with PPP in the subjunctive mood – those in which contemporary Polish -no/-to-forms are used. Based on the conducted analysis, the following conclusions can be made about the stages of the formation of the -no/-to-form. The first stage in the evolution of PPP was the loss of the copula needed with the participial form: PPP begins to be used as a finite form of the past tense (on zabit instead of on był zabit). Concurrently, by grammaticizing and losing its participial properties, PPP gradually loses agreement with the semantic object, with which it originally agreed, and solidifies in the neuter gender form. Losing its nominal properties and retaining exclusively verbal ones, PPP in the form of -no/-to ceases to agree with the semantic object and begins to govern it (zabito go). The analyzed texts demonstrate how, throughout the 15th century and up to the early 17th century, there is a decrease in the frequency of contexts like on zabit due to an increase in the frequency of contexts like zabito go. During this period, in the paradigm of -no/-to-forms, intransitive verbs are also included, as the former PPP loses the characteristics of passive participles, which could only be formed from transitive verbs.
</details>
</div><div class="column" style="width:15%;">
<p><strong>6 May</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Viktoria Zubkova, George Moroz, Chiara Naccarato (HSE University)</em></p>
<p><strong>Phonological adaptation of Russian borrowings in Avar-Andic languages</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk we will discuss processes of phonological adaptation of Russian borrowings in languages of the Andic branch of East Caucasian. Dictionary data from eight Andic languages are compared to data from Avar, the closest relative of Andic within the family and a major lingua franca in northern Daghestan. We will illustrate the process of data annotation, our qualitative analysis of the correspondences, and their modeling with a mixed-effects logistic regression. As we will show, modeling the probability of loanword adaptation gives a hierarchy of languages that is partially explained by a language’s history of direct contact with Russian and by the authorship of the dictionaries, but does not fully match geographic distances, phylogenetic classifications, or population sizes. Factors known to play a role in processes of loanword adaptation, i.e., time depth and frequency of use, show the expected effect, but their predictive strength is not statistically significant.
</details>
</div><div class="column" style="width:15%;">
<p><strong>29 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>George Moroz, Chiara Naccarato, Natalia Koshelyuk, Maya Artyukh (HSE University)</em></p>
<p><strong>The DiaL2 project: progress and future plans</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk, we will discuss the progress and future plans of the DiaL2 project, which is aimed at studying linguistic variation in spoken corpora of bilingual and dialectal Russian. In particular, we will discuss the following topics: non-standard negative existential constructions in L2 Russian; preposition drop in Khanty and Mansi L2 Russian; and non-standard word order in noun phrases with a genitive modifier in L2 Russian.
</details>
</div><div class="column" style="width:15%;">
<p><strong>22 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Asya Alekseeva (HSE University)</em></p>
<p><strong>Verb system of Aguaruna</strong></p>
<details>
<summary>
Abstract
</summary>
Aguaruna is a language spoken in Peru and Ecuador, in the foothills of the Andes. It belongs to the Jivaroan language family, which consists of only four languages (Shuar, Achuar-Shiwiar, Huambisa, and Aguaruna). These languages are located between two areas: the languages of the Andes and the languages of the Amazon, and they exhibit properties similar to both areas. Aguaruna can be characterized as an agglutinative language with a high degree of fusion and complex morphophonology. In this talk, I will present a work in progress, which is part of a study on verb systems of the Jivaroan languages. I will provide an overview of the Aguaruna verb system and discuss some of its peculiar features, such as the orientation of the verb system to illocutionary force/modality and the discourse-oriented nature of the tense system.
</details>
</div><div class="column" style="width:15%;">
<p><strong>15 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Egor Kashkin, Irina Khomchenkova (Vinogradov Russian Language Institute of the Russian Academy of Sciences)</em></p>
<p><strong>Russian in contact: projects of the Language Contact Group at the Vinogradov Russian Language Institute</strong></p>
<details>
<summary>
Abstract
</summary>
<p>The presentation will outline the research on language contact which is carried out at the Vinogradov Russian Language Institute. We are interested in contact situations where Russian is either the source or the target language.</p>
<p>First, we will discuss the influence of Russian on neighbouring languages, particularly instances of code-switching and borrowings. We will elaborate on corpus-based case studies of Russian conjunctions, which interact with the local system of clause combining.</p>
Second, we will present our projects on non-standard varieties of Russian (mainly those used by the speakers of different Uralic languages, with some parallels drawn from the study of Russian speech in Kyrgyzstan and preliminary typological observations). In addition to the evidence from participant observation, they involve data from the corpus specifically designed to annotate contact-induced features. The main principles of the development of such a corpus will be summarized. Selected corpus-based case studies will be presented (e.g. prepositional phrases in the Russian variety used by Nganasan speakers).
</details>
</div><div class="column" style="width:15%;">
<p><strong>8 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Timofey Vladimirovich Timkin (Institute of Philology, Siberian Branch of the Russian Academy of Sciences)</em></p>
<p><strong>Studying vowel duration in the languages of the peoples of Siberia: new methods and materials</strong></p>
<details>
<summary>
Abstract
</summary>
Duration is one of the key features of vowels involved in the organization of a vocalic system. However, high variability in speech rate, together with syllabic and rhythmic factors, complicates the analysis of this feature from a typological perspective. To account for these factors and to build a dynamic model of vowel duration, the Institute of Philology of the Siberian Branch of the Russian Academy of Sciences is collecting phonetic material on the Turkic and Finno-Ugric varieties of Siberia. The comprehensive approach involves the use of experimental phonetic techniques as well as the collection of somatic data using ultrasound and MRI. The talk will present preliminary results of this work.
</details>
</div><div class="column" style="width:15%;">
<p><strong>1 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Oxana Goncharova (Pyatigorsk State University)</em></p>
<p><strong>Emotion Recognition in Bilingual Speech: A Comprehensive Deep Learning-Based Method</strong></p>
<details>
<summary>
Abstract
</summary>
This study explores emotion recognition in bilingual speech through a comparative analysis of machine learning (ML) and deep learning (DL) techniques. Initially, a hybrid framework was implemented, combining Mel-frequency cepstral coefficients (MFCCs) with prosodic features (e.g., pitch, intensity, speech rate) and conventional ML algorithms. While preliminary results were encouraging, the approach suffered from overfitting and limited robustness to minor data variations. To overcome these limitations, we propose a deep learning architecture that integrates a CNN-based autoencoder with an embedding network. Experimental evaluations demonstrate a significant enhancement in performance metrics compared to traditional methods, highlighting the potential of multimodal frameworks for emotion analysis in bilingual speech.
</details>
</div><div class="column" style="width:15%;">
<p><strong>25 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Eva Poliakova (HSE University)</em></p>
<p><strong>Field notes on Khwarshi: the biabsolutive construction and information structure (PART 2)</strong></p>
<details>
<summary>
Abstract
</summary>
<p>This talk will be dedicated to two topics which were the focus of my research during a field trip to the Khwarshi language (Nakh-Daghestanian) that took place in January of this year. Therefore, the talk will be divided into two parts.</p>
<p>First, I will discuss the biabsolutive construction in Khwarshi. In this construction both arguments of a transitive verb are marked by the absolutive case, and the verb form is restricted to (periphrastic) progressive. I will discuss some of its properties, including ones that were not discussed before (e.g. its behavior in an embedded clause). I will also show that some speakers allow the biabsolutive construction to be formed not only with a progressive form, but also with a resultative one, though in this case some additional restrictions seem to hold.</p>
Second, I will discuss some findings about information structure in Khwarshi. This topic was investigated mostly based on question-answer tasks and tasks involving picture description. I will show that different word orders can be used to mark focus in Khwarshi, including insertion of the focused constituent inside a periphrastic verb form and inversion of lexical verb and auxiliary.
</details>
</div><div class="column" style="width:15%;">
<p><strong>18 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Eva Poliakova (HSE University)</em></p>
<p><strong>Field notes on Khwarshi: the biabsolutive construction and information structure</strong></p>
<details>
<summary>
Abstract
</summary>
<p>This talk will be dedicated to two topics which were the focus of my research during a field trip to the Khwarshi language (Nakh-Daghestanian) that took place in January of this year. Therefore, the talk will be divided into two parts.</p>
<p>First, I will discuss the biabsolutive construction in Khwarshi. In this construction both arguments of a transitive verb are marked by the absolutive case, and the verb form is restricted to (periphrastic) progressive. I will discuss some of its properties, including ones that were not discussed before (e.g. its behavior in an embedded clause). I will also show that some speakers allow the biabsolutive construction to be formed not only with a progressive form, but also with a resultative one, though in this case some additional restrictions seem to hold.</p>
Second, I will discuss some findings about information structure in Khwarshi. This topic was investigated mostly based on question-answer tasks and tasks involving picture description. I will show that different word orders can be used to mark focus in Khwarshi, including insertion of the focused constituent inside a periphrastic verb form and inversion of lexical verb and auxiliary.
</details>
</div><div class="column" style="width:15%;">
<p><strong>11 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Masha Volina (HSE University)</em></p>
<p><strong>Demonstrative Pronouns in Khwarshi</strong></p>
<details>
<summary>
Abstract
</summary>
<p>The Khwarshi language (Nakh-Daghestanian) has a rich system of demonstrative pronouns. Three series of demonstratives can be distinguished: žu — idu, o-CL-žu — a-CL-du and hobo-žu — hobo-du. Each series includes proximal and distal pronouns (which can be used both attributively and substantively), a demonstrative adjective with a meaning close to such and several adverbs. Pronouns from all three series can function deictically and anaphorically, although there is a ‘primarily anaphoric’ series žu — idu and a ‘primarily deictic’ series o-CL-žu — a-CL-du. Also, the paradigmatic structure of the Khwarshi demonstrative system is quite complex.</p>
In this talk, mainly based on my fieldwork data, I will describe the morphological structure of Khwarshi demonstratives, and the syntactic differences in their usage (mostly outlined in ‘A language for Guinness World Records: Fifteen (or more?) reflexive pronouns in Khwarshi’ by Yakov Testelets). I will also briefly discuss spatial and discursive factors that influence the choice of a deictic, as well as my hypotheses regarding the differences in their semantics and, accordingly, their functions.
</details>
</div><div class="column" style="width:15%;">
<p><strong>4 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Timofey Mukhin (HSE University, University of Liège), Michael Daniel (University of Tübingen)</em></p>
<p><strong>From space to anaphora: there and back again</strong></p>
<details>
<summary>
Abstract
</summary>
<p>In this talk, we consider the anaphoric uses of demonstratives in Mehweb Dargwa. The main goal is to explore how the primary spatial deictic meaning of demonstratives is refracted in the textual (anaphoric) dimension.</p>
<p>We found that the choice of demonstrative cannot be fully explained in terms of discourse dimensions such as anaphoric distance (Givón 1983). In narrative uses of elevational demonstratives, the center relative to which the referent’s position is determined shifts as compared to their deictic uses. In deictic uses of the elevational demonstratives, the deictic center is primarily associated with the speaker. In their anaphoric uses, the elevation value is calculated with respect to the most topical/activated referent.</p>
We suggest that the deictic uses of demonstratives defined by spatial relation with the deictic center do not fully convert into the textual dimension of anaphora when the same demonstratives are used in narratives. While this is easily seen with elevational demonstratives, the question remains whether the same factor is not present in elevation-neuter demonstratives in Mehweb and cross-linguistically, where researchers attempt to provide a full account of their use in purely anaphoric terms.
</details>
</div><div class="column" style="width:15%;">
<p><strong>25 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Lena Mironova (HSE University), Yury Lander (HSE University), Shamset Unarokova (Adyghe State University)</em></p>
<p><strong>One or two approaches to West Caucasian demonstratives</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk, we discuss the demonstrative systems in (Temirgoi) West Circassian and (Ashkharawa) Abaza, which represent two branches of the West Caucasian family. Our data come from an experiment based on the questionnaire (Wilkins 2018), which helps to establish the parameters that affect the choice of a demonstrative in its exophoric non-contrastive function. Both West Circassian and Abaza languages have tripartite demonstrative systems. Our data show the relevance of both the distance parameter and the speaker- or addressee-anchoring parameter, as well as the parameter of visibility, the presence of spatial boundaries and the presence of gestures. However, there are significant differences between the systems of the two languages. We will present our experimental results, describe the meanings of each demonstrative, outline the structural differences between the systems, and suggest some generalizations. Finally, we will discuss two possible interpretations of our data: it remains to be determined whether these treatments are complementary or whether they should be strictly differentiated.
</details>
</div><div class="column" style="width:15%;">
<p><strong>18 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maksim Stepanyants (HSE University)</em></p>
<p><strong>An attempt at a comprehensive description of Modern Eastern Armenian additive marker ēl</strong></p>
<details>
<summary>
Abstract
</summary>
Modern Eastern Armenian (MEA) discourse markers have been generally neglected in the typological literature. However, there is one that has been included in the sample of Forker’s (2016) influential paper on additives’ polysemy, namely, ēl (էլ). A closer look at the semantics and morphosyntactic properties of this exponent reveals its broad polysemy, which can contribute to the theory of additive markers, cf. also (Gast & van der Auwera 2013). Its diachronic development also needs to be addressed: it presents a case of divergent development of multiple specialized markers (with different morphosyntactic properties) from a conceivable common source, possibly affected by areal influence. The marker ēl is special among other MEA focus markers (cf. Giorgi & Haroutyunian 2016) due to its almost unique enclitic status. In this talk an attempt will be made to address all these issues in a holistic typologically-anchored approach.
</details>
</div><div class="column" style="width:15%;">
<p><strong>11 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Masha Krivolap, Maksim Melenchenko (HSE University)</em></p>
<p><strong>Predicting Shughni gender with machine learning</strong></p>
<details>
<summary>
Abstract
</summary>
Our study aims to investigate the influence of various factors of gender assignment in the Shughni language (Eastern Iranian) using machine learning. We have trained several models to predict grammatical gender (feminine or masculine) on a dataset of 2,390 nouns from the Shughni-Russian dictionary. For training, we used both semantic features (semantic classes and vectorized Russian definitions) and formal features (word endings and the last vowel of the stem). Our results show that semantics plays a primary role in gender assignment in Shughni, as the proposed semantic features can correctly predict the gender for ≈80% of nouns in our sample. Formal features seem less significant and can correctly predict the gender for only ≈70% of nouns in the dataset. The correlation between these two types of gender predictors is high (especially for feminine gender), so combining them does not yield significantly better results.
</details>
</div><div class="column" style="width:15%;">
<p><strong>4 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Ivan Olkhov (HSE University)</em></p>
<p><strong>Gender agreement slots in East Caucasian verbs: An areal-typological study and a case study of Andic</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk I will discuss the findings of two studies of gender agreement on verbs in East Caucasian languages. In most languages within this language family, gender agreement on verbs is sporadic, meaning that some verb lexemes have an agreement slot in the root while others do not. The first part of my talk will focus on a typological investigation of sporadic agreement on verbs across the East Caucasian family, which was done for a chapter of the Typological Atlas of the Languages of Daghestan. I will provide information on the number of agreeing lexemes for those languages where such data are available. These numbers vary significantly, with some languages like Rutul and Tsakhur having agreement on all verbs, while others like Agul and Lezgian have none. Additionally, I will explore the possible positions of non-root agreement slots. The second part of my presentation will delve into a case study of the Andic branch. I will examine verbs with the same meanings in languages within the sample; for each meaning I check in how many languages it is expressed by agreeing verbs and in how many languages these verbs are cognate. By analyzing this data, we can draw conclusions about how the verb agreement slots in Andic are preserved.
</details>
</div><div class="column" style="width:15%;">
<p><strong>28 January</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Alexander Letuchiy (HSE University)</em></p>
<p><strong>Abaza masdars: what regulates the choice of marking?</strong></p>
<details>
<summary>
Abstract
</summary>
<p>In this talk, I focus on the types and properties of masdars (nominalizations) in Abaza (a language of the West Caucasian family spoken in Russia). A special feature of Abaza is that it has one marker of masdar (the suffix -ra) – however, masdars themselves fall into several types, based on the person marking. Masdars can inherit the argument marking from the verb (the polypersonal agreement with A and DO of transitive verbs, as well as S and IO of intransitive verbs), show possessive agreement with the argument of the masdar, take a definiteness marker a- or remain unmarked in the prefixal part. In this sense, the Abaza system of masdars is rich and poor at the same time.</p>
<p>Each type of masdar marking is in a sense a separate complementation strategy. The four strategies are not freely combined with any matrix verb, but chosen according to semantics of the matrix verb (especially reality- and modality-related properties) and syntactic properties of the construction. Although Abaza has no canonical control structures, some features of masdar constructions are reminiscent of control / restructuring phenomena.</p>
<p>The existence of several masdar types is compatible with the fact that nominalizations, including masdars in Caucasian languages, occupy an intermediate place in the system: on the one hand, they denote a situation and inherit many verbal properties; on the other hand, they get some nominal properties. However, very often, as in English or Arabic, it is the syntactic construction with a nominalization that shows similarities with verbal vs. nominal constructions. In Abaza, this intermediate nature of nominalization is manifested in morphology.</p>
The data considered in the talk were collected during fieldwork organized by HSE University in 2024.
</details>
</div><div class="column" style="width:15%;">
<p><strong>21 January</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Elena Shvedova (HSE University)</em></p>
<p><strong>Lability drift in Neo-Aramaic languages</strong></p>
<details>
<summary>
Abstract
</summary>
<p>In this talk, I examine labile verbs in Neo-Aramaic languages (< Semitic), focusing on diachrony and semantics. Labile verbs, which can be used both transitively and intransitively without morphological change, are widespread in Modern Aramaic languages, in contrast to earlier Aramaic varieties where anticausative or causative marking was more prevalent. The verbal system of Christian Urmi (< North-Eastern Neo-Aramaic) can illustrate this expansion of lability: I analyzed 1811 verbs from the dictionary (Khan 2016) and at least 172 of them are labile.</p>
<p>Neo-Aramaic languages can be divided into two main genealogical groups: Eastern and Western Aramaic, which separated during the first millennium BC. In my study I use data from both branches. I categorize Neo-Aramaic labile verbs into three groups based on their historical development: (1) verbs such as ‘freeze’, ‘fill’, and ‘begin’, which retain lability from earlier stages of Aramaic; (2) verbs such as ‘open’, ‘break’, and ‘close’, which transitioned from anticausative marking in Middle Aramaic to lability in Modern Aramaic, reflecting parallel development in Eastern and Western varieties; and (3) verbs unique to Modern Western Aramaic (MWA), including ‘boil’, ‘dry’, and ‘wake up’. In other Middle and Modern Aramaic languages the meanings from the third group are expressed by causatively marked pairs, so the lability of these verbs in MWA represents a morphological innovation.</p>
I will also propose some explanations for the lability drift in Neo-Aramaic languages of different branches, such as the phonetic loss of the anticausative marker, the expansion of verbs with four root consonants that cannot be causativized, and possible areal factors. The study is still a work in progress, so I would like to discuss some future plans, including the research of corpus data from historical texts and modern corpora to trace the development of labile verbs in more detail.
</details>
</div>
</div>
</section>
<section id="seminar-schedule-2024" class="level3">
<h3 class="anchored" data-anchor-id="seminar-schedule-2024">Seminar schedule 2024</h3>
<div class="columns">
<div class="column" style="width:15%;">
<p><strong>17 December</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Rita Popova (Saarland University)</em></p>
<p><strong>Where have all the humans gone? Gender assignment of human nouns in Bantu</strong></p>
<details>
<summary>
Abstract
</summary>
The Bantu languages (Atlantic-Congo), a group of 400–500 varieties, are spoken in the southern part of the African continent, from Nigeria and Cameroon in the west, to the Kenyan coast in the east, and South Africa in the south. These languages are known for their grammatical gender (or noun class) systems, where nouns are categorized into as many as 19 classes that govern agreement in verbs, nominal modifiers, and other targets (Maho 1999). Unlike the gender systems in Indo-European languages, Bantu noun classes are not based on the sex distinction. Instead, the primary semantic contrast in Bantu gender systems lies between humans and non-humans. In a typical Bantu gender system, most nouns referring to humans are assigned to a single ‘human’ gender value (traditionally labelled as Gender 1/2 in Bantuist notation). In contrast, non-human nouns are distributed across several other gender values, often according to principles that are highly opaque (Corbett 1991, Katamba 2003). Occasionally, nouns denoting humans with unusual characteristics are found in gender values other than 1/2 (Van de Velde 2019). However, the gender assignment of human nouns has not been systematically investigated, and most of the widely accepted generalizations are derived from observations on a few well-studied Bantu languages. In this talk, I will demonstrate that gender assignment of human nouns is a parameter of intra-Bantu variation. My study is based on the investigation of more than 30 Bantu lexicons available in the RefLex database (Segerer & Flavier, 2011-2023). I will show that while some Bantu languages assign most human nouns to Gender 1/2, others have a significant number of human nouns in gender values other than 1/2. In fact, some languages seem to assign most human nouns outside 1/2, ‘scattering’ them across other gender values.
Languages of my sample that exhibit this latter pattern come from the North-Western Bantu region, a zone traditionally recognized as the most diverse within the otherwise relatively homogenous Bantu-speaking world (Nurse & Philippson 2003, p. 165). I will argue that systems where human nouns are dispersed over different gender values challenge the traditional typological account of nominal classification. According to this view, human nouns — being the semantic core of any nominal classification system — are expected to consistently follow transparent semantic rules of gender assignment (Corbett 1991).
</details>
</div><div class="column" style="width:15%;">
<p><strong>10 December</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Martin Haspelmath (Max Planck Institute for Evolutionary Anthropology)</em></p>
<p><strong>Language parameters and construction parameters in the CrossGram database collection</strong></p>
<details>
<summary>
Abstract
</summary>
Replicability of research results minimally relies on data accessibility, but the data should ideally be FAIR: Findable, Accessible, Interoperable, and Reusable. For technical interoperability, the CLDF standard (Forkel et al. 2018) could be used by typologists, though uptake seems to have been slow so far. In this presentation, I discuss the design of the CrossGram database collection, which is specifically designed for typological datasets (it has been public since the summer of 2024: https://crossgram.clld.org/). Here I describe how CrossGram enhances findability and reusability, and I highlight the two different data types that it supports (language parameters and construction parameters). CrossGram makes typological data more findable in that it “brings to light” what is often “hidden away” in supplementary spreadsheet files (or even tables in PDF files, though this is becoming rare). Research papers typically limit themselves to summary tables or graphs and a few small maps, but ideally we want to access all typological datasets with the amenities known from CLLD applications such as WALS Online (Dryer & Haspelmath 2013, wals.info) or Grambank (Skirgård et al. 2023, grambank.clld.org). These provide not only easy exporting in CLDF format, but also easy searching and sorting in data tables, as well as map visualization, and links to references and Glottolog language information. In addition, CrossGram provides glossed example sentences in tabular form, similar to the thousands of examples in the APiCS database (Michaelis et al. 2013). These are a particularly striking case of increased transparency, because it is not uncommon for example sentences to be hidden in PDF supplements (for example, Bugaeva 2022 has a supplement of 80 pages of annotated examples). Interlinear glossed text has a range of applications even independently of the typological claims that the examples illustrate, so this is another obvious improvement in reusability.
CrossGram supports two types of typological data: Language parameters that classify entire languages (i.e. parameters of the type known from the maps of WALS and Grambank), and construction parameters that classify constructions. There are many grammatical meanings or functions that can be rendered by multiple constructions in a given language, and if we only consider language parameters, the language must be classified as “mixed” (or a minor construction must be ignored). For example, Kashmiri has both correlative relative clauses and postnominal relative clauses, so both of these strategies could be included in the database and their properties recorded. Stereotypically, typology consists in classifying languages into types, but in reality, languages often have multiple types coexisting with each other, so the addition of constructions and construction types as a data type makes typological databases more fine-grained. Finally, CrossGram parameters (both language parameters and construction parameters) are not only explained succinctly and clearly, but there is also a sophisticated keyword annotation that allows users to easily find grammatical information on a wide range of topics, and for the future, integration with the envisaged “Grammaticon” reference catalogue is planned (see Haspelmath 2022). This will enhance findability and accessibility even further, and it will facilitate replication and (more generally) cumulative science.
</details>
</div><div class="column" style="width:15%;">
<p><strong>3 December</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Daria Ryzhova (HSE University), Polina Padalka (HSE University)</em></p>
<p><strong>YES and NO answers and their synonyms in Shughni</strong></p>
<details>
<summary>
Abstract
</summary>
Shughni response particles ůn ‘yes’ and nāy ‘no’ may be used in various contexts, including, besides answers to polarity questions, reactions to requests, suggestions, opinions and other types of speech acts. In addition to their usage in a dialogue setting, these particles can function as discourse markers in narratives. In this talk, we will outline the range of their functions and present their synonyms: other response particles (e.g. en ‘yes’, an ‘yes’, nāyo ‘no’, na-a ‘no’) and multiword expressions (discourse formulae). We will show that synonymous items tend to distribute across different discourse functions. For some items, we will trace their presumable pragmaticalization paths.
</details>
</div><div class="column" style="width:15%;">
<p><strong>26 November</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Leah Finkelberg (HSE University)</em></p>
<p><strong>Reading group: Cormac Anderson et al. (2023) Variation in phoneme inventories: quantifying the problem and improving comparability</strong></p>
<details>
<summary>
Abstract
</summary>
For over a century, the phoneme has played a central role in linguistic research. In recent years, collections of phoneme inventories, originally designed for cross-linguistic purposes, have increasingly been used in comparative studies involving neighbouring disciplines. Despite the extended application of this type of data, there has been no research into its comparability or tests of its reliability. In this study, we carry out a systematic comparison of nine popular phoneme inventory collections. We render them comparable by linking them to standardised formats for the handling of cross-linguistic datasets, develop new measures to test both size and similarity, and release the organised data in supplementary material. We find considerable differences in inventories supposedly representing the same language variety, both in terms of size and transcriptional choices. While some of these differences appear to be predictable, reflecting design decisions in the different collections, much of the observed variation is unsystematic. These results should sound a note of caution for comparative studies based on phoneme inventories, which we suggest need to take the question of comparability more seriously. We make a number of proposals for improving the comparability of phoneme inventories.
</details>
</div><div class="column" style="width:15%;">
<p><strong>19 November</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Nina Dobrushina (Laboratoire Dynamique du Langage, CNRS, Lyon), Chiara Naccarato (HSE University) and Samira Verhees (independent researcher)</em></p>
<p><strong>Discourse in contact: an areal study of wish formulas in Daghestan</strong></p>
<details>
<summary>
Abstract
</summary>
Research in the domain of language contact and areal studies so far has focussed on the diffusion of lexical, phonological and grammatical elements, and to a smaller extent, lexical semantics (Koptjevskaja-Tamm et al. 2022). Much less is known about the spread of discourse forms, although the mechanisms through which discourse units spread are presumably different from those characterizing the lexicon, phonology and grammar. In this talk, we will look at the diffusion of three discourse formulas across the East Caucasus (46 languages from four families), using elicitation, grammars and dictionaries as sources: commemorative formulas, farewell wishes and morning greetings. We look at instances of pattern and matter copying across the languages of the East Caucasus, and analyze their areal distribution. The case studies show that the formulas are diverse (there is great variation across the area), and that both matter and pattern copying are abundant. Some formulas cover very large areas and cross genetic boundaries. In all cases the same spread zones influenced by large lingua francas, Avar (in Central Daghestan) and Azerbaijani (in South Daghestan), were attested. The distribution of discourse formulas thus shows a very strong areal signal; we cannot come up with any phonological or grammatical phenomena which are spread in the East Caucasus to the same extent. However, there is also evidence of inheritance, especially for those languages that are spoken outside their genealogical area.
</details>
</div><div class="column" style="width:15%;">
<p><strong>12 November</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Alina Russkikh (HSE University)</em></p>
<p><strong>Adnominal Possessive Constructions in Christian Urmi (Neo-Aramaic) from a Typological Perspective</strong></p>
<details>
<summary>
Abstract
</summary>
The Northeastern Neo-Aramaic (NENA) varieties are a group of dialects or closely related languages of the Semitic branch of the Afro-Asiatic language family. This study aims to establish the inventory of adnominal possessive constructions and to describe the formal and syntactic properties of each construction in a particular variety of NENA, Christian Urmi, from a typological perspective. Data were collected in the village of Urmiya, Krasnodar Region (Russia) and in Verin Dvin, Ararat Province (Armenia) during field trips in 2021, 2022 and 2024. The presentation will also focus on the locus of marking in adnominal possessive constructions and its evolution. Based on the applied syntactic tests, I will propose a new interpretation of the basic construction with the particle ət attached to the head, analyzing its type of marking as detached rather than head marking.
</details>
</div><div class="column" style="width:15%;">
<p><strong>5 November</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Konstantin Filatov (HSE University)</em></p>
<p><strong>Reading group: Alexandre François (2014) Trees, waves and linkages: Models of language diversification</strong></p>
<details>
<summary>
Abstract
</summary>
Contrary to widespread belief, there is no reason to think that language diversification typically follows a tree-like pattern, consisting of a nested series of neat splits. Except for the odd case of language isolation or swift migration and dispersal, the normal situation is for language change to involve multiple events of diffusion across mutually intelligible idiolects in a network, typically distributed into conflicting isoglosses. Insofar as these events of language-internal diffusion are later reflected in descendant languages, the sort of language family they define - a “linkage” (Ross 1988) - is one in which genealogical relations cannot be represented by a tree, but only by a diagram in which subgroups intersect. Non-cladistic models are needed to represent language genealogy, in ways that take into account the common case of linkages and intersecting subgroups. This paper will focus on an approach that combines the precision of the Comparative Method with the realism of the Wave Model. This method, labeled Historical Glottometry, identifies genealogical subgroups in a linkage situation, and assesses their relative strengths based on the distribution of innovations among modern languages. Provided it is applied with the rigour inherent to the Comparative Method, Historical Glottometry should help unravel the genealogical structures of the world’s language families, by acknowledging the role played by linguistic convergence and diffusion in the historical processes of language diversification.
</details>
</div><div class="column" style="width:15%;">
<p><strong>29 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Polina Nasledskova (HSE University)</em></p>
<p><strong>Typology of ways of expressing ordinal meaning: work in progress</strong></p>
<details>
<summary>
Abstract
</summary>
Languages of the world vary with respect to the way they form ordinal numerals: some languages form them using a specialized ordinal marker, some languages form ordinals with a marker that is not specialized, and some languages lack ordinal numerals altogether. In this talk, I am going to present a classification of ordinal markers and other ways of expressing ordinal meaning based on a sample of 100 languages. As this is a work in progress, I am also going to discuss some challenges I am facing while working on this topic.
</details>
</div><div class="column" style="width:15%;">
<p><strong>22 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Alexander Rostovtsev-Popiel (Mainz University)</em></p>
<p><strong>Suppletion and Selectional Restrictions in the Kartvelian Verb</strong></p>
<details>
<summary>
Abstract
</summary>
This talk is about suppletion and selectional restrictions in the Kartvelian verb. This phenomenon, albeit well-known and widely acknowledged, has never been subject to a dedicated study. Suppletion in Kartvelian has a number of facets that are distributed among a number of diverse domains of linguistic structure, viz. morphology, morphosyntax, semantics, and social deixis, as well as cross-sections thereof. This talk thus aims to provide a concise overview of the patterns found in Kartvelian and categorize them in a scalar format.
</details>
</div><div class="column" style="width:15%;">
<p><strong>15 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Akhmed Dugrichilov, Taisia Trenikhina, Maksim Melenchenko (HSE University)</em></p>
<p><strong>The typological database of vigesimal numeral systems</strong></p>
<details>
<summary>
Abstract
</summary>
In our talk, we present the typological database of vigesimal (base-‘20’) numeral systems in languages of the world, which is currently in development. First, we discuss some theoretical problems of numeral systems, the solutions implemented in existing typological databases (WALS and Grambank) and their shortcomings. Then we describe the details of our approach and the preliminary results of the project. We have created a sample of 256 languages which are claimed to have vigesimal systems by Grambank or WALS and annotated 73 so far, focusing on two linguistic areas with a high concentration of base ‘20’: Mesoamerica and Papunesia. We show that the distribution of types of numeral systems differs significantly from the one presented by the Grambank data. This is caused not only by annotation mistakes in Grambank but also by the application of a stricter methodology in our study.
</details>
</div><div class="column" style="width:15%;">
<p><strong>8 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Igor’ Marchenko (University of Groningen) and Roman Ron’ko (HSE University; Vinogradov Russian Language Institute, RAS)</em></p>
<p><strong>Database of the Dialectological Atlas of the Russian language and the classification of Russian dialects</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk, we will provide an overview of the key functions of the Database of the <a href="https://da.ruslang.ru/">Dialectological Atlas of the Russian Language</a>. Additionally, we will present two case studies that utilize this resource. The first case study assesses the stability of dialects using the database alongside dialect corpora, focusing on the dialect vocabulary spoken in a set of villages in the Zapadnodvinsky district of the Tver region. The second is devoted to the classification of Russian dialects using dialectometric methods, specifically multidimensional scaling (MDS). This study draws on data from the Dialectological Atlas of the Russian Language in its entirety, offering four classifications based on individual linguistic levels—morphology, phonetics, syntax, and lexis—as well as a comprehensive classification that accounts for all linguistic features reflected in the atlas. The primary focus of the study will be on distinguishing the differences between eastern and western Russian dialects based on the database materials, as well as offering a historical interpretation of this dialect division.
</details>
</div><div class="column" style="width:15%;">
<p><strong>1 October</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Yury Lander (HSE University)</em></p>
<p><strong>Differential argument (un)marking: A new survey of alignment in West Circassian</strong></p>
<details>
<summary>
Abstract
</summary>
West Circassian, also erroneously known as Adyghe, is usually characterized in the literature as showing ergative alignment in morphology and possibly even in syntax (Gishev 1985; Kumakhov et al. 1996; Kumakhov & Vamling 2006; Kumakhov & Vamling 2009; Lander 2010; Letuchiy 2012; Ershova 2019 inter alia). In this talk I will show that if we consider differential argument marking (see, e.g., Arkadiev & Testelets 2019) and certain syntactic properties of nominals (such as those discussed in Arkadiev et al. 2009; Lander et al. 2021), the actual situation turns out to be much more complex. No novel data will be provided for scholars of Circassian languages, but I am going to discuss various kinds of pressure in the West Circassian alignment system and the related issues (including the distinction between the “canonical” differential object marking and incorporating processes and the alignment preferences displayed by different kinds of nominals).
</details>
</div><div class="column" style="width:15%;">
<p><strong>24 September</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Irina Politova (HSE University)</em></p>
<p><strong>Reading group: Peter W. Smith et al. (2019) Case and number suppletion in pronouns</strong></p>
<details>
<summary>
Abstract
</summary>
Suppletion for case and number in pronominal paradigms shows robust patterns across a large, cross-linguistic survey. These patterns are largely, but not entirely, parallel to patterns described in Bobaljik (2012) for suppletion for adjectival degree. Like adjectival degree suppletion along the dimension positive < comparative < superlative, if some element undergoes suppletion for a category X, that element will also undergo suppletion for any category more marked than X on independently established markedness hierarchies for case and number. We argue that the structural account of adjectival suppletive patterns in Bobaljik (2012) extends to pronominal suppletion, on the assumption that case (Caha 2009) and number (Harbour 2011) hierarchies are structurally encoded. In the course of the investigation, we provide evidence against the common view that suppletion obeys a condition of structural (Bobaljik 2012) and/or linear (Embick 2010) adjacency (cf. Merchant 2015; Moskal and Smith 2016), and argue that the full range of facts requires instead a domain-based approach to locality (cf. Moskal 2015b). In the realm of number, suppletion of pronouns behaves as expected, but a handful of examples for suppletion in nouns show a pattern that is initially unexpected, but which is, however, consistent with the overall view if the Number head is also internally structurally complex. Moreover, variation in suppletive patterns for number converges with independent evidence for variation in the internal complexity and markedness of number across languages.
</details>
</div><div class="column" style="width:15%;">
<p><strong>17 September</strong></p>
</div><div class="column" style="width:85%;">
<p><em>George Moroz (HSE), Olga Gich (FEFU), Anna Grishanova (HSE), Natalia Koshelyuk (HSE), Chiara Naccarato (HSE), Anna Panova (HSE), Anastasia Yakovleva (HSE), Svetlana Zemicheva (HSE)</em></p>
<p><strong>The DiaL2 project: pipeline, results, news and future work</strong></p>
<details>
<summary>
Abstract
</summary>
<p>There are 24 dialectal and 8 bilingual corpora of Russian at the Linguistic Convergence Laboratory (see the <a href="https://lingconlab.ru/">resources page</a>), and more are coming. The DiaL2 project was launched two years ago with the aim of studying the linguistic variation found in these corpora. We applied the UDPipe morphological and syntactic parser, manually annotated a set of linguistic features (sometimes re-listening to the recordings in order to check the transcriptions), and implemented statistical models for each feature that predict the probability of divergence from Standard Russian. During the talk we will discuss our results based on several features:</p>
<ul>
<li><p>non-standard marking in numeral constructions (dva dom [two.M house.SG] ‘two houses’);</p></li>
<li><p>preposition drop (rodilas’ [v] tridcat’ devjatom godu ‘(she) was born (in) nineteen thirty-nine’);</p></li>
<li><p>non-standard marking in negative existential constructions (ranše sadiki ne byli ‘there were no kindergartens before’).</p></li>
</ul>
As possible predictors in the models, we used sociolinguistic features (gender, year of birth, years of education), measures of collocationality, and some relevant linguistic features. During the work we discovered multiple typos, inconsistent and wrong transcriptions, and corrected a lot of them. Therefore, we started a parallel project dedicated to automatic correction of the Lab’s corpora, which will also be discussed during the talk.
</details>
</div><div class="column" style="width:15%;">
<p><strong>25 June</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Peter Arkadiev (Freiburg Institute for Advanced Studies)</em></p>
<p><strong>Towards a typology of passive lability with special reference to Abaza</strong></p>
<details>
<summary>
Abstract
</summary>
Uncoded passive alternations, also known as passive lability, are only rarely mentioned and discussed in the typological and theoretical literature on passive and voice, despite their being prominent in some language families (e.g. Mande) and linguistic areas (e.g. Western Africa). In this talk I start by discussing the peculiar objective resultative construction in Abaza, a polysynthetic Northwest Caucasian language, and investigate the degree of its similarity to the cross-linguistic prototype of the passive. Given that the Abaza resultative is morphologically unmarked, I argue that it can be considered an instance of passive lability. Further, I propose a preliminary typology of uncoded passives on the basis of a small convenience sample of examples I could gather from the literature. I’ll try to show that this phenomenon is somewhat more widespread than is usually believed and that its cross-linguistic variation largely fits within the typology of “canonical” morphologically marked passives and complements it.
</details>
</div><div class="column" style="width:15%;">
<p><strong>18 June</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Ilia Afanasev (HSE University/MTS AI)</em></p>
<p><strong>A new method for genetic language distance measurement between closely related lects</strong></p>
<details>
<summary>
Abstract
</summary>
Measuring the distance between different language varieties (lects) generally relies on extensive linguistic research that includes collecting wordlists and information on the evolution of the phonetic system (Campbell, 2013). However, gathering this kind of data is sometimes impossible due to the lack of material, as researchers are left with only a small sample of surviving texts. This is most often the case with small historical territorial varieties. It rules out a reliable automatic classification, yet a preliminary one remains possible. The talk proposes a new method for measuring language distance between small, closely related historical lects. The method combines frequency-based methods with string similarity measures and introduces a corpus-based string similarity measure intended to imitate more advanced phonetic-based scores. The materials for its evaluation are modern and historical Slavic lects, including the Slovak, Slovenian and Croatian standards, the Belogornoje, Megra and Zialionka dialects, as well as Novgorod, Smolensk and Polack legal texts of the 12th–14th centuries. The key technique used is cross-evaluation with more traditional dialectometry methods, where this is possible. An implementation of the methods is available as a Python package.
</details>
</div><div class="column" style="width:15%;">
<p><strong>11 June</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Alena Muravyova (HSE University)</em></p>
<p><strong>Syllable structure in Andic languages: data-driven approach</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk I will present the results of data-driven research on the syllable structure of Andic languages. Although syllable structure is described in every grammatical description of the Andic languages, in my work I try to obtain the same results using a database of dictionaries (Moroz et al. 2023). To this data, I applied the LexStat method (List 2012) in order to detect cognates automatically and then used a Levenshtein distance search function to compare the syllabic structures of cognates. A pairwise comparison of cognates from different languages reveals the general tendency of the Andic languages to prioritize the CV syllable. What is more, I came to particular conclusions about the beginning of the word (strict openness of the first syllable in Ahvakh, high tolerance of the closed syllable in Botlikh and Andi) and about the end of the word (mainly open syllable in Ahvakh, closed syllable in Botlikh, Chamalal and Bagvalal). In addition, I have separately examined the reduction of the consonants r, l, m, n, j, w, b in Andic languages. The results I have obtained are of scientific interest, as they bear witness to historical processes in the Andic languages.
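<p>The comparison step described in the abstract can be sketched roughly as follows. This is only an illustration, not the author's pipeline: the vowel inventory, the one-character-per-segment assumption, and the word forms are all made up, and cognate detection with LexStat is not reproduced here. The sketch reduces each form to a consonant/vowel skeleton and computes the Levenshtein distance between the skeletons of a word pair.</p>

```python
# Hypothetical, simplified vowel inventory; real Andic data would need
# proper segmentation (long vowels, ejectives, digraphs and so on).
VOWELS = set("aeiou")


def cv_skeleton(word: str) -> str:
    """Map every character to C or V (assumes one character = one segment)."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Made-up forms standing in for a cognate pair; they are not real Andic words.
pair = ("hanta", "hata")
skeletons = [cv_skeleton(w) for w in pair]
print(skeletons, levenshtein(*skeletons))
```

<p>A distance of zero between two skeletons would mean that the two forms share the same syllabic shape. Non-zero distances point to insertions or deletions of segments, such as the loss of a coda consonant in one of the forms.</p>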
</details>
</div><div class="column" style="width:15%;">
<p><strong>4 June</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Dmitry Ganenkov (Leibniz-Zentrum Allgemeine Sprachwissenschaft)</em></p>
<p><strong>Causative in Dargwa infinitival constructions</strong></p>
<details>
<summary>
Abstract
</summary>
<p>In this talk, I report on a work-in-progress concerning the behavior of the morphological causative inside infinitival (obligatory control) constructions in Dargwa, as shown in (1).</p>
<ol type="1">
<li><pre><code>nab [ ħe-zi ʡinc-bi d-iʡ-aq-es ] dig-ul-ra
I(dat) 2sg-loc apple-pl(abs) n.pl-steal:pf-caus-inf want-dur-1</code></pre>
‘I want you to steal apples.’</li>
</ol>
Based on the appearance of example (1), we might expect it to mean ‘I want to make you steal apples.’ However, the presence of the causative construction in the embedded clause is not reflected in the semantics, since the sentence in (1) cannot be understood as expressing a want to cause the stealing. Instead, as the translation of (1) shows, the embedded clause is non-controlled, with the locative “causee” understood as the agent of the embedded event. I present an overview of the phenomenon across Dargwa and concentrate on the details of the construction in Standard/Aqusha Dargwa.
</details>
</div><div class="column" style="width:15%;">
<p><strong>21 May</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maria Kyuseva (University of Surrey), Daria Ryzhova, Ekaterina Rakhilina, Tatiana Reznikova (HSE University)</em></p>
<p><strong>Parts of the body: New insights into cross-linguistic variation</strong></p>
<details>
<summary>
Abstract
</summary>
We discuss cross-linguistic variation in the usage of body part terms within the frame-based approach to lexical typology (Rakhilina & Reznikova 2016). We contribute to the extensive body of research by providing a detailed analysis of one type of context in which body part terms appear, i.e., non-semiotic bodily movements, such as to cross the hands behind the back, to cover the face with the hands, etc. We find that different languages use different body part terms to describe the same bodily movement. The choice of the term depends on a range of conditions, including the inventory of the available terms, the choice of the linguistic construction, and the degree of conventionalisation. We take this as evidence for an additional aspect of the meaning of a body part term, which has been largely ignored in the previous typological literature. This aspect is the set of semantic restrictions on constructions in which the body part term can appear. We argue that it needs to be addressed in order to ensure the exhaustive cross-linguistic description of this semantic domain.
</details>
</div><div class="column" style="width:15%;">
<p><strong>14 May</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maria Ermolova (HSE University)</em></p>
<p><strong>On the genesis of the anti-resultative meaning of the pluperfect in the history of Russian</strong></p>
<details>
<summary>
Abstract
</summary>
As is well known, Russian constructions with the particle было (пошел было, но вернулся) go back to the Old Russian pluperfect with the anti-resultative meaning. Denoting the anti-resultative meaning is one of the main secondary functions of the pluperfect in the languages of the world. V.A. Plungian and D.V. Sitchinava, based on typological data, connect the emergence of the anti-resultative with different pluperfect meanings from the semantic zone of the discontinuous past. However, data from the Russian chronicles, as well as live dialect data, show that the anti-resultative meaning in the history of Russian could develop not from the meaning of the discontinuous past, but from the resultative meaning.
</details>
</div><div class="column" style="width:15%;">
<p><strong>23 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Anna Golovina and Ksenia Dunaeva (HSE University)</em></p>
<p><strong>Using transducers to create morphological parsers and other NLP tools for Nakh-Daghestanian languages</strong></p>
<details>
<summary>
Abstract
</summary>
Our talk is dedicated to creating morphological parsers for low-resource Nakh-Daghestanian languages. Morphological parsers can be built either on a set of grammatical rules of the language or on probabilistic models such as neural networks. The latter are not suitable for languages with only a small collection of annotated texts. A finite-state transducer, the basis of the rule-based parsers discussed here, can be defined as a finite-state automaton with two tapes, an input tape and an output tape. Whereas an ordinary finite-state automaton can merely determine whether a given string belongs to the described regular language, a transducer maps between two sets of symbols: input symbols and output symbols. The transducer establishes a correspondence between a surface word form and a string containing its morphological analysis. Building a two-level rule-based parser requires combining at least two different finite-state transducers: one for storing the lexicon and modeling morphotactics and another for implementing morphophonological rules. In recent years, morphological parsers based on transducers have been implemented for a wide range of East Caucasian languages, including Tsez (Wilson & Howell, 2022), Andi Proper (Buntiakova 2023) and Zilo Andi (Moroz 2022), Bagvalal (Ignatiev 2022), and some others. Our talk will focus on building parsers for Avar and Bezhta Proper. We will discuss in detail the tools that can be used to create a morphological transducer, the difficulties that one may encounter while computationally modeling the morphology and morphophonology of Nakh-Daghestanian languages, the projects that are already being implemented at the Higher School of Economics, and the future prospects for using rule-based morphological parsers.
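<p>To make the idea of a transducer more concrete, here is a minimal sketch of a toy lexical transducer that maps a surface form to an analysis string. Everything in it is an assumption made for illustration: the <code>ToyFST</code> class, the stems <code>kat</code> and <code>mol</code>, the plural suffix <code>-al</code> and the tag format are invented, and the code does not use the API of any real finite-state toolkit. It covers only the lexicon and morphotactics component; a morphophonological rule transducer would be composed with it in a real two-level parser.</p>

```python
from collections import defaultdict
from itertools import count


class ToyFST:
    """A tiny non-deterministic transducer: arcs read one input symbol
    (or nothing, for epsilon arcs) and emit an output string."""

    def __init__(self, start, finals):
        self.start = start
        self.finals = set(finals)
        self.arcs = defaultdict(list)  # state -> [(in_sym, out_str, next_state)]
        self._fresh = count()          # generator of intermediate state ids

    def add_path(self, src, in_str, out_str, dst):
        """Read in_str between src and dst, emitting out_str on the last arc."""
        if not in_str:  # zero morpheme: a single epsilon arc
            self.arcs[src].append(("", out_str, dst))
            return
        state = src
        for i, ch in enumerate(in_str):
            last = i == len(in_str) - 1
            nxt = dst if last else ("q", next(self._fresh))
            self.arcs[state].append((ch, out_str if last else "", nxt))
            state = nxt

    def analyze(self, surface):
        """Return all analyses of a surface form (empty list if none)."""
        results = []
        stack = [(self.start, 0, "")]
        while stack:
            state, pos, out = stack.pop()
            if pos == len(surface) and state in self.finals:
                results.append(out)
            for in_sym, out_str, dst in self.arcs[state]:
                if in_sym == "":
                    stack.append((dst, pos, out + out_str))
                elif pos < len(surface) and surface[pos] == in_sym:
                    stack.append((dst, pos + 1, out + out_str))
        return sorted(results)


# Invented lexicon and morphotactics: root -> stem -> end
fst = ToyFST(start="root", finals={"end"})
fst.add_path("root", "kat", "kat+N", "stem")
fst.add_path("root", "mol", "mol+N", "stem")
fst.add_path("stem", "", "+Sg", "end")    # zero singular marker
fst.add_path("stem", "al", "+Pl", "end")  # hypothetical plural suffix

for form in ("kat", "molal", "kata"):
    print(form, fst.analyze(form))
```

<p>An ordinary automaton over the same states could only accept or reject each form. The output strings on the arcs are what turn recognition into analysis, and an unrecognized form such as the last one simply yields no analysis.</p>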
</details>
</div><div class="column" style="width:15%;">
<p><strong>16 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Konstantin Filatov (HSE University)</em></p>
<p><strong>Reading group: Rik van Gijn & Max Wahlström (2023) Linguistic areas</strong></p>
<details>
<summary>
Abstract
</summary>
Linguistic area research has received ample attention in the last century. Nevertheless, methodology remains somewhat underdeveloped, and there seem to be few, if any, generalizations about the relation between the processes underlying area formation and their outcomes. The main challenge is that, in most cases, the past is not directly accessible and therefore has to be reconstructed. Linguistic area research, therefore, stands to gain immensely from a firm embedding into a framework that includes both other strands of contact linguistics and extra-linguistic disciplines to complete the picture.
</details>
</div><div class="column" style="width:15%;">
<p><strong>9 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Rita Popova (HSE University) and Michael Daniel (Collegium de Lyon / Laboratoire Dynamique du Langage)</em></p>
<p><strong>Number-conditioned stem alternation in adjectives of size (East Caucasian supplement)</strong></p>
<details>
<summary>
Abstract
</summary>
<p>In our previous talk at the Lab, we showed that, across the world, property words denoting size may use different stems depending on the number of the entities they are predicated of; and that, in a cross-linguistic perspective, they do so visibly more often than other property words (Popova and Daniel in prep.; see also Nurmio 2017). This pattern of stem alternation for ‘small’ has also been observed in East Caucasian languages (Yakovlev 1960 on Chechen, Kibrik & Kodzasov 1988 on Lak, Azaev 2000 on Botlikh, Kibrik 1999 on Tsakhur, and arguably Nichols 2011 on Ingush).</p>
<p>Notably, the phenomenon has been reported in areally separated (south, central, northwest) languages belonging to different branches (Lezgic, Lak, Andic and Nakh); but the lexical items involved do not always seem to be cognate. It thus seems another case of a cross-linguistically rare phenomenon that emerged as a result of parallel independent development (cf. Daniel & Maisak 2014 on verificative, Nasledskova & Netkachev, under review on ordinal numerals, Daniel 2017 on “person by other means”). Such phenomena are notoriously challenging for an evolutionary interpretation.</p>
<p>It is not obvious that we are dealing here with suppletion in number. In the languages under study, number agreement is not necessarily present in adjectives, so that the two forms may not belong to one inflectional paradigm. Instead, we will call this pattern number-driven dislexification (cf. François 2022) in the sense that two meanings, ‘small (of one)’ and ‘small (of many)’, commonly expressed by the same lexical item, are split into two different lexical items (cf. a similar approach to verbal number in Durie 1986, Mithun 1988, François 2019). Strikingly, while the phenomenon is attested in individual languages dispersed across different branches, and thus could be suspected of being inherited, the cognacy of the alternating stem is not obvious. In some of these languages ‘small (of many)’ is recruited from ‘fine-grained (of e.g. sand)’ (a component structure adjective in terms of Maiden 2014 and Nurmio 2017, who discuss the same path of emergence of number suppletion). We hypothesized that number dislexification emerged through gradual lexical extension of ‘fine-grained’, from more to less mass-like nouns. To test this claim, we ran an online elicitation test, collecting data from all but two languages of the family, in most cases from several respondents. We investigated lexical preferences for expressing the meaning of ‘small (of many)’, expecting to find cases intermediate between languages that only apply ‘fine-grained’ to masses and languages that apply it to all plural nouns. We used similar number contexts for ‘big’ as fillers.</p>
<p>Our expectation was partially borne out. In addition to this, we also had some less expected results. We noticed that the phenomenon of number dislexification has a wider spread than reported in the literature in terms of languages involved. We noticed that, in some languages, the dislexified adjective for ‘small (of many)’ and the adjective for ‘fine-grained’ may be unrelated, at least synchronically. We noticed that, notwithstanding plural reference of the noun, ‘small (of one)’ may be induced by the use of the singular form in NPs modified by a numeral - a switch more expected under a grammatical (suppletion) rather than lexical (dislexification) perspective on the phenomenon. Finally, we discovered the same phenomenon present, even if more rarely, in adjectives for ‘big’, our intended fillers.</p>
We expected that a more nuanced model distinguishing between – (a) absence of lexical extension of ‘fine-grained’ altogether, (b) presence of such an extension as a preference and (c) a strict dislexification – complemented by historical analysis of cognacy of the items could provide a more feasible explanation of the distribution of the phenomenon of number dislexification across the family in genealogical or contact terms. So far, our results are not conclusive in this respect.
</details>
</div><div class="column" style="width:15%;">
<p><strong>2 April</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Matthew Sung and Jelena Prokić (Leiden University)</em></p>
<p><strong>Recent Developments in Quantitative Approaches to Linguistic Micro-Variation</strong></p>
<details>
<summary>
Abstract
</summary>
Dialectometry is the quantitative branch of dialectology which utilizes computational methods to calculate linguistic distances and generate visualizations which allow us to explore relationships between dialects. Although dialectometry is a growing field with an increasing number of new approaches, some corners in dialectal variation are still rather unexplored. Firstly, dialectometry is a popular method in Europe, but not so much in other parts of the world. It is unclear whether our findings of dialectal variation in Europe (e.g. the existence of a dialect continuum, the specific dynamics of dialect spread) are also found in other corners of the world. Secondly, most of the work on phonetic variation is based on segments, while most of the world’s languages are tonal (Yip 2002). It is unclear how dialects vary on the tonal level. Lastly, the outcome of a dialectometric analysis is usually a classification of dialects, but the features that contribute to this classification, i.e. dialect features which are exclusive to certain groups, are not explored in these classifications. In this talk, we would like to address the issues raised above based on our latest work done in Leiden.
</details>
</div><div class="column" style="width:15%;">
<p><strong>26 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Irene Gorbunova (Russian State University for the Humanities)</em></p>
<p><strong>Nominal Causal Constructions in Khwarshi Proper</strong></p>
<details>
<summary>
Abstract
</summary>
In this talk I will address the various ways of marking nominal cause in Khwarshi Proper (a dialect of Khwarshi < Tsezic < East Caucasian). The data was collected in Daghestan during several field trips in 2022-2023; the research was based on (but not limited to) the NoCaCoDa project questionnaire. The grammar descriptions of Khwarshi mention the causal case, which seems to be a unique feature of Khwarshi as compared to other (West) Tsezic languages. Even more peculiar is the fact that the causal case in Khwarshi, albeit attested, is not the default option for marking the nominal cause: rather, a designated postposition or spatial case forms are used. Furthermore, the designated postposition, as well as the spatial case most commonly used for cause marking, both show unexpected semantic shifts under certain predicates.
</details>
</div><div class="column" style="width:15%;">
<p><strong>19 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Nasib Amirkhan-ogly Iskandarov (junior researcher, Research Centre for Medical Genetics)</em></p>
<p><strong>Origins of the gene pool of the peoples of the Eastern Caucasus: the contribution of the autochthonous Bronze Age population and of migrations from West Asia according to Y-chromosome data</strong></p>
<details>
<summary>
Abstract
</summary>
The gene pool of the Eastern Caucasus, which includes the peoples of Azerbaijan and Daghestan, is systematically characterized using a unified panel of 83 Y-chromosome SNP markers in the context of the populations of the surrounding regions. An analysis of genetic distances between 18 populations (N=2216) of the Nakh-Daghestanian, Altaic and Indo-European language families reveals three components of the gene pool, “steppe”, “Iranian” and “Daghestanian”, which differ in the weight of their contribution to the gene pool and in the period of its formation to which they belong. The “steppe” component is pronounced only in the Kara-Nogais and reflects the chronologically latest wave of migration, that of Turkic-speaking nomads of the Eurasian steppe in the Middle Ages. The “Iranian” component is pronounced in the gene pool of the Azerbaijanis of Azerbaijan and Daghestan, the Tabasarans of Daghestan and all Iranian-speaking peoples of the Caucasus. The “Daghestanian” component predominates in all populations speaking Daghestanian languages (except the Tabasarans) and in the Turkic-speaking Kumyks. Each component is associated with a particular set of Y-haplogroups: the “steppe” set comprises C-M217, N-LLY22g, R1b-M73 and R1a-M198; the “Iranian” set comprises J2-M172(×M67,M12) and R1b-M269; the “Daghestanian” set comprises J1-Y3495. It is hypothesized that haplogroup J1-Y3495 arose 6.5±0.6 kya in an autochthonous ancestral population of central Daghestan. About 6 kya it split into two main lineages, J1-ZS3114 (with a maximum among peoples of the Dargwa, Lak and Lezgic language branches) and J1-CTS1460 (with a maximum among peoples of the Avar-Andic-Tsezic language branch), which branched further about 4-5 kya. The results of the phylogeographic analysis of J1-Y3495, considered in the context of archaeological and ancient DNA data, point to population growth in Daghestan beginning in the Bronze Age, followed by dispersal and further microevolution of the subdivided population.
</details>
</div><div class="column" style="width:15%;">
<p><strong>12 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maksim Melenchenko (HSE University)</em></p>
<p><strong>Reading group: Martin Haspelmath (2023) Inflection and derivation as traditional comparative concepts</strong></p>
<details>
<summary>
Abstract
</summary>
This article revisits the distinction between inflectional and derivational patterns in general grammar and discusses the possibility that this well-known distinction is not rooted in the reality of languages, but in the Western tradition of describing languages, through dictionaries (for words, including derived lexemes) and through grammar books (where we often find tables of exemplary paradigms). This tradition has led to rather different terminological treatments of the two kinds of patterns, but from the perspective of a constructional view of morphology, there is no need to incorporate such differences into formal grammatical descriptions. For practical purposes, we need clear and simple definitions of entrenched terms of general linguistics, so the article proposes semantically based (retro-) definitions of <em>inflection</em>, <em>derivation</em> and <em>lexeme</em> that cover the bulk of the existing usage. Finally, I briefly explain why we need sharp definitions of comparative concepts, and why prototype-based and fuzzy definitions of traditional terms are not helpful.
</details>
</div><div class="column" style="width:15%;">
<p><strong>5 March</strong></p>
</div><div class="column" style="width:85%;">
<p><em>John Mansfield (University of Zurich)</em></p>
<p><strong>When social contact promotes diversification</strong></p>
<details>
<summary>
Abstract
</summary>
In much linguistic literature, small, socially isolated speech communities are the main locus of diversification and grammatical complexity (e.g. Trudgill 2011). Similarly, linguistic differentiation is traditionally viewed as resulting from social separation of groups (Paul 1888), while intensive social contact between groups can lead to structural convergence of their languages (e.g. Gumperz & Wilson 1971; Ross 1996). However, sociolinguistic literature shows that social groups in regular contact use language as a way of developing and maintaining distinct group identities (Eckert 2008), and in regions with many small ethnic groups this can drive diversification (François 2011; Evans 2019; Epps 2020), a kind of ‘sympatric speciation’ in linguistic evolution. In this presentation I consider evidence for contact-driven diversification, paying particular attention to which dimensions of language may be used to index group identity. I present a cross-linguistic database on dialect differentiation, which analyses grammatical variation and dialect differences in 42 languages, drawing on data from reference grammars. The main finding is that grammatical ordering very rarely differentiates dialects in close contact, but the form of grammatical markers (affixes, clitics and function words) frequently <em>does</em> differentiate dialects in close contact.
</details>
</div><div class="column" style="width:15%;">
<p><strong>27 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Silvia Luraghi (University of Pavia) and Chiara Zanchi (University of Pavia)</em></p>
<p><strong>Introducing PaVeDa – The Pavia Verb Database</strong></p>
<details>
<summary>
Abstract
</summary>
<p>PaVeDa – The Pavia Verb Database is the focus of the project “Verbs’ constructional patterns across languages: a multi-dimensional investigation”, a joint enterprise of two teams of researchers from the Universities of Pavia and Naples “Federico II.” PaVeDa is an open-source relational database for investigating verb argument structure across languages (Zanchi et al. 2022), intending to expand and enhance the Valency Patterns Leipzig (ValPaL) database (Hartmann et al. 2013), developed within the Leipzig Valency Classes Project, which carried out a large-scale cross-linguistic comparison of valency classes. The project relied on a group of contributors, who collaborated by providing a consistent set of cross-linguistic data. The online database ValPaL contains data from 36 languages, based on a database questionnaire for a selected sample of 80 verb meanings. Apart from valency frames, contributors provided information about possible alternations, both uncoded and coded.</p>
In spite of the research carried out within the ValPaL project, no systematic comparative study on diachronic developments across languages is available. The PaVeDa project intends to expand and enhance the ValPaL database with more languages and further features and is configured to contrastively display valency patterns simultaneously in different languages. Within this project, the Pavia team cooperates with a number of international partners who provide sets of data for the new languages uploaded to the database. For the time being, the datasets from several ancient languages (Old Latin, Ancient Greek, Gothic, Old English, Classical Armenian, Old High German) and modern languages (Modern Greek) have been uploaded to the database, along with the modern languages stored in the ValPaL database. As for the additional features, an intermediate level of annotation has been added to the original ValPaL: the alternation class, which categorizes language-specific alternations into four cross-linguistic types (valency re-arranging, valency augmenting, valency decreasing, argument identifying) and serves as the initial comparative tool. While the ValPaL database does not allow for contrastive visualization of constructions across the languages it contains, the developers of the PaVeDa database designed a special layer of annotation that allows generalizing over language-specific patterns and makes them visually comparable. Work on ancient languages also led to a redesign of the methodology, as ancient languages can only be studied on the basis of corpus data rather than native speakers’ knowledge. This practice brings about a usage-based methodology that we have started implementing for modern languages too, linking the data on constructional patterns to existing digitized corpora.
In the near future, we aim to further develop both typological and diachronic comparison by adding more languages, both ancient and modern, from language families already represented in the ValPaL database (Indo-European and Afro-Asiatic), as well as from families that are not represented (Uralic and Turkic).
</details>
</div><div class="column" style="width:15%;">
<p><strong>20 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Nina Zdorova (HSE University), Olga Parshina (HSE University), Bela Ogly (HSE University), Irina Bagirokova (HSE, IL RAS), Ekaterina Krasikova (HSE University), Shamset Unarokova (Adyghe State University), Aida Bguasheva (Adyghe State University), Maria Rodina (HSE University), Susanna Makerova (Adyghe State University), Olga Dragoy (HSE, IL RAS)</em></p>
<p><strong>Language processing while reading in Adyghe: evidence from eye-tracking studies</strong></p>
<details>
<summary>
Abstract
</summary>
A large body of psycholinguistic research is dedicated to eye movements during reading, which reflect online language processing. Yet little is known about language processing in polysynthetic languages. The talk will cover the latest eye-tracking studies of language processing during sentence reading in Adyghe, conducted by researchers from the Center for Language and Brain, HSE University, Moscow, together with colleagues from the Laboratory of Experimental Linguistics of the Adyghe State University, Maykop. The experimental studies in focus address two fundamental questions about language processing: (1) What features of language processing are observed when reading in a polysynthetic language (Adyghe) compared to reading in other languages? (2) How does language processing change depending on morphosyntactic features when reading in Adyghe? In addition, some ongoing research projects on text reading in Adyghe will be presented.
</details>
</div><div class="column" style="width:15%;">
<p><strong>13 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Svetlana Zemicheva (HSE University)</em></p>
<p><strong>Reading group: Philipp Stöckle (2023) “Dialect areas and contact dialectology” in Language contact: Bridging the gap between individual interactions and areal patterns</strong></p>
<details>
<summary>
Abstract
</summary>
Spatial variation of language has been researched qualitatively and quantitatively for at least 150 years by different sub-disciplines of linguistics, each defining differently what dialects and dialect areas are. Linguists agree, however, that the concept of dialect is vague and the extent of a dialect is fuzzy. With contact being a crucial driver of linguistic change at sublanguage levels, we attempt to sketch the perspective that contact dialectology and related sub-disciplines can offer on this fuzziness with regard to the spatial variation of dialects and dialect areas. Thus we address contact processes and patterns characterizing individuals, groups, communities, areas and beyond, at temporal scales ranging from everyday contact, through generations, to time depths sufficient for dialects to diverge and disappear.
</details>
</div><div class="column" style="width:15%;">
<p><strong>6 February</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Maria Khachaturyan (University of Helsinki), Maria Konoshenko (University of Helsinki), George Moroz (HSE University) and Valentin Vydrin (INALCO)</em></p>
<p><strong>Valency patterns in Mande: contact vs inheritance</strong></p>
<details>
<summary>
Abstract
</summary>
In our talk, we address valency patterns in seven Mande languages of different genealogical subgroupings. Our study is based on the BivalTyp questionnaire focusing on 130 two-place predicates (Say 2018). We explore areal and genealogical factors in valency expression. While belonging to two distinct genealogical groupings, two languages of the set, i.e. Mano (Southern Mande) and Kpelle (Southwestern Mande), are in intense contact with one another (Khachaturyan & Konoshenko 2021). We investigate to what extent the synchronic patterning of valency expression in the data can be accounted for by contact and/or inheritance. We found that, on the basis of the lexical equivalents for a given predicate, the languages are distributed strictly following the genealogical principle, and Mano clusters together with other Southern Mande languages. Yet the type of construction chosen for a particular predicate, as well as, for intransitive constructions, the postposition introducing the second argument, is subject to areal influence, with Mano clustering together with its Southwestern neighbors, Kpelle and Kono, and not with its closest genealogical Southern Mande relatives, Guro and Dan-Gweetaa. In addition, the structure of complex verbs in Mano more closely resembles that of Kpelle and Kono than that of Guro and Dan-Gweetaa. Thus, although Mano verbal forms are virtually unaffected by contact, the patterns of valency expression, as well as verbal lexical patterns, are strongly influenced by neighboring languages. While this is far from the only study showing the predominance of pattern-borrowing in multilingual settings (Epps 2008; François 2011, inter alia), it showcases argument coding as a useful parameter for a comparative study of both pattern and matter.
</details>
</div><div class="column" style="width:15%;">
<p><strong>30 January</strong></p>
</div><div class="column" style="width:85%;">
<p><em>Polina Padalka (HSE University)</em></p>
<p><strong>Reading group: Blum, F., Barrientos, C., Ingunza, A. et al. Grammars Across Time Analyzed (GATA): a dataset of 52 languages. Sci Data 10, 835 (2023). https://doi.org/10.1038/s41597-023-02659-1</strong></p>
<details>
<summary>
Abstract
</summary>
Grammars Across Time Analyzed (GATA) is a resource capturing two snapshots of the grammatical structure of a diverse range of languages separated in time, aimed at furthering research on historical linguistics, language evolution, and cultural change. GATA comprises grammatical information on 52 diverse languages across all continents, featuring morphological, syntactic, and phonological information based on published grammars of the same language at two different time points. Here we introduce the coding scheme and design features of GATA, and we describe some salient patterns related to language change and the coverage of grammatical descriptions over time.
</details>