Document: The Limits of Progress: Understanding the Performance of State-of-the-Art Language Models in Argument Mining

Title:

The Limits of Progress: Understanding the Performance of State-of-the-Art Language Models in Argument Mining

URL for bookmarking:

https://docserv.uni-duesseldorf.de/servlets/DocumentServlet?id=73263

URN (NBN):

urn:nbn:de:hbz:061-20260518-112050-3

Collection:

Dissertations

Language:

English

Document type:

Academic theses » Dissertation

Media type:

Text

Author:

Feger, Marc [Author]

Files:

Adobe PDF, 7.08 MB in one file
Files dated 14.05.2026 / modified 14.05.2026

Contributors:

Prof. Dr. Dietze, Stefan [Reviewer]
Prof. Dr. Mauve, Martin [Reviewer]
Prof. Dr. Stein, Benno [Reviewer]

Keywords:

Argument Mining, Natural Language Processing

Dewey Decimal Classification:

000 Computer science, information science, general works » 004 Data processing; computer science

Description:

Identifying arguments is a foundational stage for many forms of manual and automated discourse analysis, including the study of political deliberation, online dialogue, and scientific reasoning. Accordingly, the availability of rich open-access text corpora combined with powerful Language Models (LMs) has sparked broad interest in Argument Mining (AM) research. Nevertheless, two significant shortcomings persist. First, there is an incomplete account of the systemic factors that ostensibly propel progress and validate its acclaimed achievements, and of the enduring constraints that this oversight imposes on the reliable extraction of arguments across heterogeneous domains, particularly in the fast-paced context of social media. Twitter, recently rebranded as X, has become a global agora that shapes public opinion and offers a rich source of conversational data, but systematic investigation of argumentation on this platform remains scant. Second, despite steady methodological advances, most AM systems are validated only on the datasets for which they were designed, leaving their generalizability untested. This thesis tackles these issues with three interlinked studies. At the data level, it introduces and fully documents the creation of the first conversation-based corpus for mining arguments on Twitter. Findings reveal that Twitter debates frequently involve meaningful information exchange and explicit reasoning. Furthermore, although human argument recognition is highly sensitive to conversational context, the state-of-the-art (SOTA) is driven by LMs that appear to attain strong performance when trained on such data. Closer inspection, however, indicates that this performance is largely superficial and does not reflect robust representational alignment of these models with the expected class semantics.
At the representation level, this work addresses the inherently entangled representations of arguments on Twitter and proposes a pre-training strategy for LMs that disentangles and refines them internally. By explicitly modeling argument structure at the component level and capturing distinctions between those components, this approach generates representations that are inherently more interpretable and more faithful to the intended semantics. The resulting model surpasses earlier baselines on Twitter and generalizes robustly across topics. Finally, an extensive re-evaluation and cross-application analysis assesses the most task-relevant approaches and corpora, including those introduced in this thesis. Experiments reveal sharp performance drops caused by dataset-specific artifacts related to topic and content, which current SOTA LMs undesirably exploit when identifying arguments. This finding challenges the informative value of isolated baseline experiments and the transferability of findings in AM. Taken together, the thesis offers both a novel resource and a refined pre-training approach for LMs that enhance the generalization of argument signals on Twitter, alongside a systematic comparative study that highlights the limitations of existing research and sets the stage for future advances in genuine argument identification.

License:

This work is licensed under a Creative Commons Attribution 4.0 International License

Department / Institution:

Faculty of Mathematics and Natural Sciences » Department of Computer Science

Document created on:

18.05.2026

Files modified on:

18.05.2026

Doctoral application submitted on:

22.11.2025

Date of doctorate:

06.05.2026

Heinrich-Heine-Universität Düsseldorf
