
The Linguistic Forum: A Treebank-Driven Exploration of Spoken Language Grammar

Culture and languages

Guest lecture with Kaja Dobrovoljc, researcher in Linguistics and Computational Linguistics at the University of Ljubljana and at the Department for Artificial Intelligence at the Jožef Stefan Institute, Slovenia. Everyone’s warmly welcome to participate in this event organised by the Linguistic Forum!

Lecture
Date: 8 Apr 2025
Time: 13:15 - 15:00
Location: J236, Humanisten, Renströmsgatan 6 and via Zoom

Participants: Kaja Dobrovoljc
Good to know
Language: English
Organizer: Department of Languages and Literatures; Department of Philosophy, Linguistics and Theory of Science; Department of Swedish, Multilingualism, Language Technology

The lecture is co-presented with the Linguistics Seminars and the EUTOPIA Connected Community "Grounding Human-Centred AI on Embodied Multimodal Interaction".
 


Abstract

Based on the unitary approach to language, which views speech and writing as two ends of a continuum that should be described as a whole, the past three decades have seen a surge in spoken language research aimed at capturing speech-specific linguistic phenomena that have been overlooked or insufficiently addressed by traditional grammatical frameworks. Although spoken communication exhibits idiosyncrasies on various levels of linguistic description due to the specific circumstances of its production, its most pronounced differences emerge in syntactic patterns, where features such as disfluencies and ellipsis shape its structure in ways rarely found in writing.

The rise of spoken treebanks — morphosyntactically annotated transcriptions of dialogue and other spoken interactions — has opened new avenues for systematically studying these differences. Drawing on insights from the SPOT project, which investigates the syntactic features of spoken Slovene, this talk presents a bottom-up, data-driven approach to analyzing spoken language by extracting and comparing syntactic patterns across corpora. Specifically, I will share our experience in creating a manually annotated spoken language treebank, developing a new method for systematic, bottom-up treebank comparison, and applying it to a cross-linguistic study of syntactic differences between speech and writing.

This approach not only deepens our understanding of speech-specific syntactic patterns but also demonstrates the broader potential of a fully inductive, treebank-driven analysis for studying structural variation across languages, registers, and genres.

The Linguistic Forum

The Linguistic Forum is an informal meeting place for all linguists working at the Faculty of Humanities. We are a faculty-wide seminar activity, with financial support from the faculty since 2020. Our aim is to promote knowledge exchange and collaboration among the faculty's linguists.

Read more about the Linguistic Forum here (in Swedish)

Mailing list

Sign up for our mailing list to get information on upcoming events. Send an email to anmalan.epostlista@sprak.gu.se and ask to be added to "Språkvetenskapligt forum".